Probabilistic graphical models

Artificial Intelligence
Assignment 3
Semester 1 2015
Due 11.59pm on Friday 12th June 2015
1 Probabilistic graphical models
1.1 Inference (undergrads and postgrads)
In lectures, we mainly covered Bayesian networks, one type of probabilistic graphical
model (PGM) using directed acyclic graphs. Your task is to write a program implementing
likelihood weighting (also known as weighted sampling), as described in lectures, to perform
inference on an arbitrary Bayesian network. You will need to parse an input text file
that encodes the graph and the conditional probability distributions for each variable
(conditioned on its parents). The format of the file is given in section 1.1.1 below. Two
PGM model files have been defined in this format and can be downloaded, along with
two queries each, at:
https://cs.adelaide.edu.au/users/third/ai/15s1-ai-adelaide/files/infer.zip
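As a rough illustration of the sampling loop (not the required program), the sketch below runs likelihood weighting on a tiny hand-coded two-node network. All names and probabilities here are invented for the demo and are not taken from the assignment's model files; your program must instead build the network from the parsed model file. Non-evidence variables are sampled from their conditional distributions, while evidence variables are fixed and contribute their likelihood to the sample weight.

```java
import java.util.Random;

// Minimal likelihood-weighting sketch on a hard-coded two-node network
// (Rain -> WetGrass). Variable names and probabilities are illustrative
// only, not from the assignment's model files.
public class LikelihoodWeighting {
    static final double P_RAIN = 0.2;
    // P(WetGrass = true | Rain): index 0 = Rain false, 1 = Rain true.
    static final double[] P_WET_GIVEN_RAIN = {0.1, 0.9};

    // Estimate P(Rain = true | WetGrass = true) by likelihood weighting.
    static double estimate(int samples, long seed) {
        Random rng = new Random(seed);
        double wTrue = 0, wTotal = 0;
        for (int i = 0; i < samples; i++) {
            // Sample the non-evidence variable from its (prior) distribution.
            boolean rain = rng.nextDouble() < P_RAIN;
            // The evidence WetGrass = true is not sampled; it contributes
            // its likelihood given the sampled parents as a weight.
            double w = P_WET_GIVEN_RAIN[rain ? 1 : 0];
            wTotal += w;
            if (rain) wTrue += w;
        }
        return wTrue / wTotal;
    }

    public static void main(String[] args) {
        // Exact answer: 0.2*0.9 / (0.2*0.9 + 0.8*0.1) = 0.18/0.26 ≈ 0.6923
        System.out.println(estimate(200_000, 42));
    }
}
```

For the real assignment the same loop applies, except that every non-evidence variable is sampled in topological order conditioned on its already-sampled parents, and the weight is the product of the evidence variables' conditional probabilities.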
PGM 1: student model The first PGM models a typical university student in the
USA. A student with high intelligence has a higher chance of scoring well in the SAT
test before entering university. The grade for university courses depends on the difficulty of the
courses and the student’s intelligence. A good grade increases the chance of getting a
good recommendation letter from the teachers. The letter and SAT result may affect
the job that the student may get. Both the job and the grade may affect the student’s
happiness — typically a student is happy if they get a good grade and a dream job. The
PGM’s structure is shown in Figure 1.
PGM 2: car problem model The second PGM is for diagnosing the problem of a
car. The full detail is omitted here, but you can figure out some of its meaning based
[Figure: directed graph over the nodes Difficulty (D), Intelligence (I), Grade (G),
SAT (S), Letter (L), Job (J), and Happy (H).]
Figure 1: Student model
on its file. Your code should be able to process both PGMs in the above format. Your
grade for this task depends on your results on both PGMs.
1.1.1 Model file format
The graphical model is specified by a text file whose format is shown in Figure 2 (Left),
where:
• N is the number of random variables in the network;
• rv are the random variable names (arbitrary alphanumeric strings);
• Graph matrix: the matrix of zeros and ones specifies the directed arcs in the graph;
a one (1) in the (i, j) entry indicates there is an edge from i to j (so the ith variable
is a parent of the jth variable);
• mat are two-dimensional arrays of real numbers (in ASCII) that specify the conditional
probability table of each random variable conditioned on its parents.
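Once the graph matrix is read, each variable's parent list can be recovered from it directly. The small sketch below shows one way to do this (the class and method names are my own, and the example matrix is illustrative):

```java
import java.util.ArrayList;
import java.util.List;

// Helper sketch: recover each variable's parents from the 0/1 graph
// matrix (entry (i, j) == 1 means variable i is a parent of variable j).
public class Parents {
    static List<Integer> parentsOf(int[][] graph, int j) {
        List<Integer> parents = new ArrayList<>();
        for (int i = 0; i < graph.length; i++) {
            if (graph[i][j] == 1) parents.add(i);   // edge i -> j
        }
        return parents;   // in numerical (file) order, as the CPT rows expect
    }

    public static void main(String[] args) {
        // Tiny illustrative matrix: edges 0 -> 2 and 1 -> 2
        // (like I -> G and D -> G in the student model).
        int[][] g = {
            {0, 0, 1},
            {0, 0, 1},
            {0, 0, 0},
        };
        System.out.println(parentsOf(g, 2)); // [0, 1]
    }
}
```

Keeping the parents in numerical order matters because the conditional probability matrices index their rows by the parents in that same order, as described next.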
The format of the matrices is a bit subtle, and is illustrated in Figure 2 (Right). If
a node has m parents, then the matrix needs to specify the probability of each outcome
(false, true) conditioned on the 2^m different combinations of parent values, so the matrix
will be 2^m × 2 (rows × columns). Treating false as 0 and true as 1, concatenate the
values of the parents in their numerical order from most significant bit to least significant
Figure 2: Left: File format; Right: the format of the conditional probability matrices.
Note the numbers in the figure are for illustration purposes and might differ from
the actual numbers in the file. You should use your code to read the file. The right
figure is from the wonderful textbook Probabilistic Graphical Models: Principles and
Techniques, MIT Press, 2009, by D. Koller and N. Friedman.
bit (left to right) to create a row index r. The entry in the first column, rth row, is then
the probability that the variable is false given the values of its parents (the entry in the
corresponding 2nd column is the probability that the variable is true).
For example, in Figure 2 (Right), assume that the variables are ordered as I, D, G, S, L.
G has two parents, I and D. P(G|I, D) is represented by mat2 (the 3rd matrix), which
is the 4 × 3 matrix in the figure. Since I's order is before D's, the row indexing is
(i^0, d^0), then (i^0, d^1), and so on, rather than (d^0, i^0). I has only two values
(false or true), denoted i^0, i^1. Here G has 3 values, denoted g^1, g^2, g^3 (if you
prefer to denote them g^0, g^1, g^2, that is fine too; it won't affect the code).
As another example, if A has parents C and F, where C is the 3rd variable specified
and F is the 6th, then the (2,0) entry of A's matrix represents P(A = false | C =
true, F = false), and the (0,1) entry represents P(A = true | C = false, F = false).
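The row-index rule used in both examples can be sketched as a small helper (the names are my own; it assumes binary parents, and the parent values are supplied in their numerical order):

```java
// Row-index sketch: concatenate parent truth values (false = 0, true = 1)
// in their numerical order, most significant bit first, to get the CPT row.
public class CptIndex {
    // parentValues[k] is the value of the k-th parent in numerical order.
    static int rowIndex(boolean[] parentValues) {
        int r = 0;
        for (boolean v : parentValues) {
            r = (r << 1) | (v ? 1 : 0);   // shift in the next bit
        }
        return r;
    }

    public static void main(String[] args) {
        // A with parents C (earlier) and F (later): C=true, F=false -> row 2.
        System.out.println(rowIndex(new boolean[]{true, false}));  // 2
        // C=false, F=false -> row 0.
        System.out.println(rowIndex(new boolean[]{false, false})); // 0
    }
}
```

The first column of that row then holds P(variable = false | parents), and the second column P(variable = true | parents).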
The format of the query, entered via the console (or read from redirected standard
input) is:
P(rvQ | rvE1=val, rvE2=val, …)
where rvQ is the name of the query variable, and rvEx are the names of the evidence
variables with their respective true/false values specified. As is normal in Bayesian
inference, variables not included are unobserved and should be marginalised out.

Case  I  D  G  S  L  J  H
 1    1  1  2  1  1  0  1
 2    0  0  1  0  0  1  0
 3    1  0  3  0  0  1  0
 4    1  0  1  0  0  1  1
 ...

Table 1: Samples
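A minimal sketch of parsing a query of the form P(rvQ | rvE1=val, rvE2=val, ...) might look like this (the class and field names are my own; it assumes evidence values are the literal strings true/false, as stated above):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Query-parsing sketch for strings such as "P(H | G=true, J=false)".
// Only the query syntax comes from the assignment; the names are my own.
public class QueryParser {
    final String queryVar;
    final Map<String, Boolean> evidence = new LinkedHashMap<>();

    QueryParser(String query) {
        // Strip the leading "P(" and the trailing ")".
        String body = query.trim();
        body = body.substring(body.indexOf('(') + 1, body.lastIndexOf(')'));
        // Split the query variable from the evidence list at "|".
        String[] parts = body.split("\\|", 2);
        queryVar = parts[0].trim();
        if (parts.length == 2 && !parts[1].trim().isEmpty()) {
            for (String assign : parts[1].split(",")) {
                String[] kv = assign.split("=", 2);
                evidence.put(kv[0].trim(),
                             kv[1].trim().equalsIgnoreCase("true"));
            }
        }
    }

    public static void main(String[] args) {
        QueryParser q = new QueryParser("P(H | G=true, J=false)");
        System.out.println(q.queryVar + " " + q.evidence);
    }
}
```

A query with no "|" part (no evidence) parses to an empty evidence map, which matches the convention that omitted variables are simply marginalised out.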
1.2 Learning parameters (postgrads)
This task is for postgrads. Undergrads can choose to do it too for bonus points.
The parameters in a Bayesian network are the conditional probability matrices. If
you have samples from the joint distribution represented by the Bayesian network, you
can estimate the (conditional) probability of any variable by frequency counting. For
example, if you have samples like those in Table 1, you can estimate probabilities as follows:
P(D = 1) = N_{D=1} / N_total

P(G = 1 | D = 1, I = 0) = N_{G=1,D=1,I=0} / N_{D=1,I=0}

...
Here N_{D=1} is the number of samples in which D = 1, and N_total is the total number of
samples. N_{G=1,D=1,I=0} is the number of samples in which G = 1, D = 1, I = 0, and
N_{D=1,I=0} is the number of samples in which D = 1, I = 0.
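The counting estimates above can be sketched with a small helper (the names and the toy data are my own; each sample row is assumed to follow the model file's variable order, here I, D, G for brevity):

```java
// Frequency-counting sketch for the estimates above. Each sample is an
// int vector in the model file's variable order; this demo uses toy data
// with columns I, D, G only.
public class Counts {
    // Count samples where every (index, value) condition holds.
    static int count(int[][] samples, int[] idx, int[] val) {
        int n = 0;
        for (int[] s : samples) {
            boolean match = true;
            for (int k = 0; k < idx.length; k++) {
                if (s[idx[k]] != val[k]) { match = false; break; }
            }
            if (match) n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // Toy data, columns: I, D, G (values illustrative only).
        int[][] samples = {
            {1, 1, 2}, {0, 0, 1}, {1, 1, 1}, {0, 1, 1},
        };
        // P(D = 1) = N_{D=1} / N_total
        double pD1 = (double) count(samples, new int[]{1}, new int[]{1})
                   / samples.length;
        // P(G = 1 | D = 1, I = 0) = N_{G=1,D=1,I=0} / N_{D=1,I=0}
        double pG = (double) count(samples, new int[]{2, 1, 0}, new int[]{1, 1, 0})
                  / count(samples, new int[]{1, 0}, new int[]{1, 0});
        System.out.println(pD1 + " " + pG);
    }
}
```

In your learning program, each CPT entry is estimated the same way: the conditioning set is the variable's parents at one combination of values, and the numerator adds the child's value to that set.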
Your code must be able to read both the PGM model file, and the sample data file,
and use them to produce a new PGM model file. The new PGM model file should be
in the same format as the PGM model file.
The sample data file for learning can be downloaded from
https://cs.adelaide.edu.au/users/third/ai/15s1-ai-adelaide/files/sample.zip
Note that the Case ID and variables’ names are not included in the sample file.
Assume there are N variables. Each line is a sample vector with N dimensions in the
same order of variables as specified in the model file. Note that the old model file and the
new model file shall only differ in the entries of matrices. The rest should not change, as
we are only learning (updating) the parameters (not the graph). In principle, learning
the parameters does not require knowing the mat entries in the old model file. We allow
you to use the entire old model file for your convenience — if you end up with a different
number of matrices, or their sizes differ from those in the old model file, you must have
done something wrong.
2 Deliverables
2.1 Inference (undergrads and postgrads)
Write your program in Java or C/C++. In the case of Java, name your program
inference.java. Your program must be able to be compiled and run as follows:
$ javac inference.java
$ java inference modelfile.txt queryfile.txt answerfile.txt
In the case of C/C++, you must supply a makefile Makefile with a rule called
inference to compile your program into a Linux executable binary named inference.bin.
Your program must be able to be compiled and run as follows:
$ make inference
$ ./inference.bin modelfile.txt queryfile.txt answerfile.txt
Output Your code must be able to read the model file and the query file to produce
an answer file. The answer file should be named studentanswer.txt for the student
model, and caranswer.txt for the car model. Each query file has two queries. The
(output) answer file should have 2 lines: the first line is the vector (of the conditional
distribution) for the first query, and the second is the vector for the second query.
Within each vector, values are separated by single spaces.
NOTE1: If a makefile exists, the marking script will assume that the program is
written in C/C++; otherwise Java will be assumed.
NOTE2: modelfile.txt, queryfile.txt, and answerfile.txt are the placeholders
for the names of the model file, the query file and the answer file.
2.2 Learning (postgrads)
Write your program in Java or C/C++. In the case of Java, name your program
learn.java. Your program must be able to be compiled and run as follows:
$ javac learn.java
$ java learn modelfile.txt samplefile.txt
In the case of C/C++, you must supply a makefile Makefile with a rule called
learn to compile your program into a Linux executable binary named learn.bin. Your
program must be able to be compiled and run as follows:
$ make learn
$ ./learn.bin modelfile.txt samplefile.txt
Output Your code must be able to read the PGM model file, and the sample data
file to produce a new PGM model file. The new PGM model file should be named
newmodel.txt, following the same format as the old PGM model file.
NOTE1: If a makefile exists, the marking script will assume that the program is
written in C/C++; otherwise Java will be assumed.
NOTE2: modelfile.txt and samplefile.txt are placeholders for the names of the
PGM model file and the sample data file.
3 Submission policies
You must submit your program on the Computer Science Web Submission System.
This means you must create the assignment under your own SVN repository to store
the submission files. The SVN key for this submission is 2015/s1/ai/Assignment3.
The link to the Web Submission System used for this assignment is
https://cs.adelaide.edu.au/for/students/automark/
DO NOT use the link: https://cs.adelaide.edu.au/services/websubmission/
4 Grading
For more details on the online submission procedures including SVN visit the home
page of the school and look for SVN information under the “For Students” tab. Your
code will be compiled and run, testing its outputs for the two networks with two queries
each that have been provided to you. None of the networks will contain more than
16 variables. Since the results are stochastic, the query answers will not be exact and
will vary from run to run. The answers will therefore be tested against a tolerance of
±0.01 (i.e. each reported probability must be within 0.01 of the true value), so you must
ensure convergence to this level of precision. If your code passes all tests you will get
15% of the overall course mark. Note that undergrads need to do inference (15%), while
postgrads need to do both inference (10%) and learning (5%). The objective of the tests
is to check for
the correct operation of your implementation. Hence, the basis of the assessment is to
compare your results against the expected results. You must also ensure that you have
an efficient implementation.
5 Bonus points for Undergrads
Undergrads can choose to do the learning task described in sections 1.2 and 2.2 for a
bonus of 3%.
6 Using other source code
You may not use other source code for this assignment. You should implement everything
personally and carefully so as to fully understand the concept(s).
7 Due date and late submission policy
This assignment is due 11.59pm on Friday 12th June 2015. If your submission is late
the maximum mark you can obtain will be reduced by 25% per day (or part thereof)
past the due date or any extension you are granted.