Probabilistic graphical models

Artificial Intelligence

Assignment 3

Semester 1 2015

Due 11.59pm on Friday 12th June 2015

1 Probabilistic graphical models

1.1 Inference (undergrads and postgrads)

In lectures, we mainly covered Bayesian networks, one type of probabilistic graphical

model (PGM), which uses directed acyclic graphs. Your task is to write a program to implement

likelihood weighting (or weighted) sampling, as described in lectures, to perform

inference on an arbitrary Bayesian network. You will need to parse an input text file

that encodes the graph and the conditional probability distributions for each variable

(conditioned on its parents). The format of the file is given in section 1.1.1 below. Two

PGM model files have been defined in this format and can be downloaded, along with

two queries each, at:

https://cs.adelaide.edu.au/users/third/ai/15s1-ai-adelaide/files/infer.zip
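As a reminder of how the algorithm works, here is a minimal likelihood-weighting sketch on a hypothetical two-variable network A -> B. The network, its probabilities, and all class/method names are illustrative only and are not part of the assignment files: non-evidence variables are sampled given their parents, and each sample is weighted by the likelihood of the evidence.

```java
import java.util.Random;

// Hypothetical network: A -> B with P(A=true) = 0.5,
// P(B=true | A=true) = 0.8, P(B=true | A=false) = 0.3.
public class LikelihoodWeightingSketch {

    // Estimate P(A=true | B=true) by likelihood weighting: sample the
    // non-evidence variable A from its prior, then weight each sample by
    // the likelihood of the evidence B=true given the sampled A.
    public static double estimateATrueGivenBTrue(int samples, long seed) {
        Random rng = new Random(seed);
        double weightATrue = 0.0, weightTotal = 0.0;
        for (int i = 0; i < samples; i++) {
            boolean a = rng.nextDouble() < 0.5;   // sample A ~ P(A)
            double w = a ? 0.8 : 0.3;             // weight = P(B=true | A)
            weightTotal += w;
            if (a) weightATrue += w;
        }
        return weightATrue / weightTotal;         // normalised estimate
    }

    public static void main(String[] args) {
        // Exact posterior: 0.5*0.8 / (0.5*0.8 + 0.5*0.3) = 8/11 ~ 0.727
        System.out.println(estimateATrueGivenBTrue(200_000, 42L));
    }
}
```

With enough samples the normalised weighted counts converge to the exact posterior, P(A = true | B = true) = 8/11; your implementation does the same thing, but with the network structure and tables read from the model file rather than hard-coded.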

PGM 1: student model The first PGM models a typical university student in the USA. A student with high intelligence has a higher chance of scoring well in the SAT test before entering university. The grade for university courses depends on the difficulty of the courses and the student's intelligence. A good grade increases the chance of getting a good recommendation letter from the teachers. The letter and the SAT result may affect the job that the student gets. Both the job and the grade may affect the student's happiness: typically a student is happy if they get a good grade and a dream job. The PGM's structure is shown in Figure 1.

Figure 1: Student model (nodes: Difficulty (D), Intelligence (I), Grade (G), SAT (S), Letter (L), Job (J), Happy (H))

PGM 2: car problem model The second PGM is for diagnosing problems with a car. Full details are omitted here, but you can work out some of its meaning from its file. Your code should be able to process both PGMs in the above format. Your grade for the assignment depends on your results on both PGMs.

1.1.1 Model file format

The graphical model will be specified by a text file with the format shown in Figure 2 (Left), where:

• N is the number of random variables in the network;

• rv are the random variable names (arbitrary alphanumeric strings);

• Graph matrix: the matrix of zeros and ones specifies the directed arcs in the graph; a one (1) in the (i, j) entry indicates there is an edge from i to j (so the ith variable is a parent of the jth variable);

• mat are two-dimensional arrays of real numbers (in ASCII) that specify the conditional probability table of each random variable conditioned on its parents.

The format of the matrices is a bit subtle, and is illustrated in Figure 2 (Right). If a node has m parents, then the matrix needs to specify the probability of each outcome (false, true) conditioned on the 2^m different combinations of parent values, so the matrix will be 2^m × 2 (rows × columns). Treating false as 0 and true as 1, concatenate the values of the parents in their numerical order, from most significant bit to least significant bit (left to right), to create a row index r. The entry in the first column, rth row is then the probability that the variable is false given the values of its parents (the entry in the corresponding second column is the probability that the variable is true).

Figure 2: Left: file format; Right: the format of the conditional probability matrices. Note the numbers in the figure are for illustration purposes, and might differ from the actual numbers in the file; you should use your code to read the file. The right figure is from the wonderful textbook Probabilistic Graphical Models: Principles and Techniques, MIT Press, 2009, by D. Koller and N. Friedman.

For example, in Figure 2 (Right), assume that the variables are ordered as I, D, G, S, L. G has two parents, I and D. P(G | I, D) is represented by mat2 (the 3rd matrix), which is the 4 × 3 matrix in the figure. Since I comes before D in the ordering, the rows are indexed (i^0, d^0), then (i^0, d^1), and so on, rather than (d^0, i^0). I has only two values (false or true), denoted i^0, i^1. Here G has 3 values, denoted g^1, g^2, g^3 (if you prefer to denote them g^0, g^1, g^2, that is fine too; it does not affect the code).

As another example, if A has parents C and F, where C is the 3rd variable specified and F is the 6th, then the (2, 0) entry of A's matrix represents P(A = false | C = true, F = false), and the (0, 1) entry represents P(A = true | C = false, F = false).
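The row-indexing rule used in the examples above can be sketched as follows (the class and method names are our own, not mandated by the assignment):

```java
// Compute the CPT row index: each parent's value (false = 0, true = 1)
// is one bit, with the parent that comes first in the variable ordering
// as the most significant bit.
public class RowIndex {

    // parentValues[0] corresponds to the earliest parent in the
    // variable ordering, i.e. the most significant bit.
    public static int rowIndex(boolean[] parentValues) {
        int r = 0;
        for (boolean v : parentValues) {
            r = (r << 1) | (v ? 1 : 0);
        }
        return r;
    }

    public static void main(String[] args) {
        // A with parents C (earlier variable, MSB) and F:
        // C=true, F=false encodes binary 10, i.e. row 2.
        System.out.println(rowIndex(new boolean[] { true, false }));
        System.out.println(rowIndex(new boolean[] { false, false }));
    }
}
```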

The format of the query, entered via the console (or read from redirected standard

input) is:

P(rvQ | rvE1=val, rvE2=val, …)

where rvQ is the name of the query variable, and rvEx are the names of the evidence variables with their respective true/false values specified. As is normal in Bayesian inference, variables not included are unobserved and should be marginalised out.

Case  I  D  G  S  L  J  H
  1   1  1  2  1  1  0  1
  2   0  0  1  0  0  1  0
  3   1  0  3  0  0  1  0
  4   1  0  1  0  0  1  1
  ...

Table 1: Samples
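A query line of this shape can be parsed, for example, as follows (the parser and its names are illustrative; only the textual format is fixed by the assignment):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for a query line such as
//   P(rvQ | rvE1=true, rvE2=false)
public class QueryParser {

    // Extract the query variable name (the first identifier after "P(").
    public static String queryVariable(String query) {
        Matcher m = Pattern.compile("P\\(\\s*(\\w+)").matcher(query);
        if (!m.find()) throw new IllegalArgumentException("bad query: " + query);
        return m.group(1);
    }

    // Extract each "name=true/false" evidence assignment, in order.
    public static Map<String, Boolean> evidence(String query) {
        Map<String, Boolean> ev = new LinkedHashMap<>();
        Matcher m = Pattern.compile("(\\w+)\\s*=\\s*(true|false)").matcher(query);
        while (m.find()) {
            ev.put(m.group(1), Boolean.parseBoolean(m.group(2)));
        }
        return ev;
    }

    public static void main(String[] args) {
        String q = "P(Letter | Grade=true, SAT=false)";
        System.out.println(queryVariable(q)); // Letter
        System.out.println(evidence(q));      // {Grade=true, SAT=false}
    }
}
```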

1.2 Learning parameters (postgrads)

This task is for postgrads. Undergrads can choose to do it too for bonus points.

The parameters in a Bayesian network are the conditional probability matrices. If you have samples from the joint distribution represented by the Bayesian network, you can estimate the (conditional) probability of any variable by frequency counting. For example, if you have samples like those in Table 1, you can estimate probabilities as follows:

P(D = 1) = N_{D=1} / N_total

P(G = 1 | D = 1, I = 0) = N_{G=1,D=1,I=0} / N_{D=1,I=0}

...

Here N_{D=1} is the number of samples in which D = 1, and N_total is the total number of samples. N_{G=1,D=1,I=0} is the number of samples in which G = 1, D = 1 and I = 0, and N_{D=1,I=0} is the number of samples in which D = 1 and I = 0.
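These counts are simple to compute. The sketch below estimates P(D = 1) and a conditional probability from the four Table 1 rows shown above (column order I, D, G, S, L, J, H); the conditional chosen, P(J = 1 | D = 0), differs from the one in the formulas only because those four visible rows happen to contain no sample with D = 1, I = 0. The class and helper names are ours.

```java
// Frequency-counting parameter estimation over a small sample matrix.
public class FrequencyEstimate {

    static final int[][] SAMPLES = {
        // I  D  G  S  L  J  H
        {  1, 1, 2, 1, 1, 0, 1 },
        {  0, 0, 1, 0, 0, 1, 0 },
        {  1, 0, 3, 0, 0, 1, 0 },
        {  1, 0, 1, 0, 0, 1, 1 },
    };

    // Count the samples matching every (columnIndex, value) condition.
    static int count(int[][] data, int[][] conditions) {
        int n = 0;
        outer:
        for (int[] row : data) {
            for (int[] c : conditions) {
                if (row[c[0]] != c[1]) continue outer;
            }
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // P(D = 1) = N_{D=1} / N_total  (D is column 1)
        double pD1 = count(SAMPLES, new int[][] { { 1, 1 } })
                / (double) SAMPLES.length;
        // P(J = 1 | D = 0) = N_{J=1,D=0} / N_{D=0}  (J is column 5)
        double pJ1givenD0 = count(SAMPLES, new int[][] { { 1, 0 }, { 5, 1 } })
                / (double) count(SAMPLES, new int[][] { { 1, 0 } });
        System.out.println(pD1);        // 0.25
        System.out.println(pJ1givenD0); // 1.0
    }
}
```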

Your code must be able to read both the PGM model file, and the sample data file,

and use them to produce a new PGM model file. The new PGM model file should be

in the same format as the PGM model file.

The sample data file for learning can be downloaded from

https://cs.adelaide.edu.au/users/third/ai/15s1-ai-adelaide/files/sample.

zip

Note that the Case ID and variables’ names are not included in the sample file.

Assume there are N variables. Each line is a sample vector with N dimensions, in the same order of variables as specified in the model file. Note that the old and new model files should differ only in the entries of the matrices; the rest should not change, as we are only learning (updating) the parameters, not the graph. In principle, learning the parameters does not require the mat entries in the old model file; we allow you to use the entire old model file for your convenience. If you end up with a different number of matrices, or matrices whose sizes differ from those in the old model file, you must have done something wrong.

2 Deliverables

2.1 Inference (undergrads and postgrads)

Write your program in Java or C/C++. In the case of Java, name your program

inference.java. Your program must be able to be compiled and run as follows:

$ javac inference.java

$ java inference modelfile.txt queryfile.txt answerfile.txt

In the case of C/C++, you must supply a makefile (named Makefile) with a rule called inference that compiles your program into a Linux executable binary named inference.bin.

Your program must be able to be compiled and run as follows:

$ make inference

$ ./inference.bin modelfile.txt queryfile.txt answerfile.txt

Output Your code must be able to read the model file and the query file, and produce an answer file. The answer file should be named studentanswer.txt for the student model and caranswer.txt for the car model. Each query file has two queries. The (output) answer file should have 2 lines: the first line is the vector (of the conditional distribution) for the first query, and the second is the vector for the second query. The values within each vector are separated by single spaces.

NOTE1: If a makefile exists, the marking script will assume that the program is written in C/C++; otherwise Java will be assumed.

NOTE2: modelfile.txt, queryfile.txt, and answerfile.txt are the placeholders

for the names of the model file, the query file and the answer file.


2.2 Learning (postgrads)

Write your program in Java or C/C++. In the case of Java, name your program

learn.java. Your program must be able to be compiled and run as follows:

$ javac learn.java

$ java learn modelfile.txt samplefile.txt

In the case of C/C++, you must supply a makefile (named Makefile) with a rule called learn that compiles your program into a Linux executable binary named learn.bin. Your program must be able to be compiled and run as follows:

$ make learn

$ ./learn.bin modelfile.txt samplefile.txt

Output Your code must be able to read the PGM model file and the sample data file, and produce a new PGM model file. The new PGM model file should be named newmodel.txt and follow the same format as the old PGM model file.

NOTE1: If a makefile exists, the marking script will assume that the program is written in C/C++; otherwise Java will be assumed.

NOTE2: modelfile.txt and samplefile.txt are placeholders for the names of the PGM model file and the sample data file.

3 Submission policies

You must submit your program on the Computer Science Web Submission System.

This means you must create the assignment under your own SVN repository to store

the submission files. The SVN key for this submission is 2015/s1/ai/Assignment3.

The link to the Web Submission System used for this assignment is

https://cs.adelaide.edu.au/for/students/automark/

DO NOT use the link: https://cs.adelaide.edu.au/services/websubmission/

4 Grading

For more details on the online submission procedures, including SVN, visit the home page of the school and look for SVN information under the "For Students" tab. Your

code will be compiled and run, testing its outputs for the two networks with two queries

each that have been provided to you. None of the networks will contain more than 16 variables. Since the results are stochastic, the query answers will not be exact and will vary from run to run. The answers will therefore be tested against a tolerance of ±0.01 (i.e. your answers must be within 0.01 of the true values), so you must ensure convergence to this level of precision. If your program passes all tests you will get 15% of the overall course mark. Note that undergrads need to do inference (15%), and postgrads need to do both inference (10%) and learning (5%). The objective of the tests is to check for the correct operation of your implementation. Hence, the basis of the assessment is to compare your results against the expected results. You must also ensure that you have an efficient implementation.

5 Bonus points for Undergrads

Undergrads can choose to do the learning task described in sections 1.2 and 2.2 for a bonus of 3%.

6 Using other source code

You may not use other source code for this assignment. You should implement everything personally and carefully, so that you fully understand the concepts.

7 Due date and late submission policy

This assignment is due at 11.59pm on Friday 12th June 2015. If your submission is late, the maximum mark you can obtain will be reduced by 25% per day (or part thereof) past the due date or any extension you are granted.