Graphical Models:  Assignment 6
Due April 2, 2019

  1. Agent Hydziz I. Dennity's contact classification model includes the PersonType, Gender, and HairLength random variables (as well as other random variables not considered in this assignment).  Agent Dennity defines a completely connected Bayesian network for these three random variables, as shown in the figure below, and assigns a uniform prior distribution to each of these local distributions: PersonType (1 distribution with 4 states), Gender given PersonType (4 distributions with 2 states each), and HairLength given PersonType and Gender (8 distributions with 2 states each).  What is the prior distribution for the probability that a contact is a government agent?
[Figure: DennityBN, the completely connected Bayesian network over PersonType, Gender, and HairLength]
  1. To estimate the probability distributions in his contact classification model, Agent Dennity collected a sample of 200 individuals whose type and features were known.  The sample was collected in such a way that Agent Dennity is confident in treating the observations as a random sample from the population of Depravians. The data he collected is provided here.  
    1. Find the posterior distribution for each of the local distributions in Agent Dennity's Bayesian network.  State the type of distribution and the parameters.
    2. Find the mean and variance of each probability in Agent Dennity's Bayesian network.
    3. Find a 90% interval for each probability.  That is, find the 5th and 95th percentile of each probability.  (Hint: You can do this with Excel's BETAINV function or R's qbeta function.)
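The posterior summaries asked for in the sub-parts above can be sketched in Python as an alternative to BETAINV or qbeta. This is a minimal sketch, not a solution: the counts (12 and 8) are hypothetical and are not taken from Agent Dennity's data, and the function names are illustrative. It assumes a uniform Beta(1, 1) prior on a two-state probability, so the posterior parameters are integers and the Beta CDF can be written as a binomial tail sum, which avoids any third-party library. For the four-state PersonType distribution, each individual probability has a Beta marginal under a Dirichlet distribution, so the same functions should apply to one state versus the rest.

```python
from math import comb


def beta_posterior(prior_a, prior_b, successes, failures):
    """Conjugate update: Beta(a, b) prior plus binomial counts."""
    return prior_a + successes, prior_b + failures


def beta_mean_var(a, b):
    """Mean and variance of a Beta(a, b) distribution."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var


def beta_cdf_int(x, a, b):
    """CDF of Beta(a, b) for positive INTEGER a and b.

    Uses the identity I_x(a, b) = P(Binomial(a + b - 1, x) >= a).
    """
    n = a + b - 1
    return sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(a, n + 1))


def beta_quantile_int(p, a, b, tol=1e-12):
    """Quantile of Beta(a, b) (integer a, b) by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if beta_cdf_int(mid, a, b) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# HYPOTHETICAL counts for one two-state local distribution (not the real data).
a, b = beta_posterior(1, 1, successes=12, failures=8)
mean, var = beta_mean_var(a, b)
q05 = beta_quantile_int(0.05, a, b)
q95 = beta_quantile_int(0.95, a, b)
print(f"posterior: Beta({a}, {b})")
print(f"mean={mean:.4f}  variance={var:.6f}  90% interval=({q05:.4f}, {q95:.4f})")
```

The bisection plays the role of qbeta here; with non-integer prior parameters the binomial identity no longer holds, and a library routine for the incomplete beta function would be needed instead.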
  2. Repeat Problem 3, but assume that (i)  the gender distribution for government agents is the same as the gender distribution for dissidents; (ii) the hair length distribution is the same for all women; (iii) the hair length distributions are the same for male government agents, male government supporters, and apolitical males.  Compare these results to Problem 3 and comment on the differences.  If you were advising Agent Dennity on whether to estimate the probabilities using the method of Problem 2 or Problem 3, what would you recommend?  Justify your recommendation.
  3. Agent Dennity has asked Chief Statistician Ky Square whether the arc from PersonType to Gender could be removed from the Bayesian network. To evaluate this assumption, Dr. Square recommends comparing the K2 score for the two structures. Compare the K2 score for the structure with and without the arc from PersonType to Gender.
  4. Dr. Square is also considering a model in which (i) the hair length distribution is the same for all women; (ii) the hair length distributions are the same for male government agents, male government supporters, and apolitical males. Find the log probability of the data for the fully connected network with these context-specific independence assumptions.
  5. If the three structures from Problems 4 and 5 have equal prior probabilities, what are their posterior probabilities?  Comment on your results.
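The structure comparison in the last three problems can be sketched as follows. This is a hedged illustration, not a solution: the count table is hypothetical (it is not Agent Dennity's data), and the function names are made up for this example. It assumes the usual form of the K2 score, in which each parent configuration of a node with r states and counts N_1, ..., N_r contributes (r - 1)! · ∏ N_k! / (N + r - 1)!, corresponding to uniform Dirichlet priors. Only the Gender family is scored, since the PersonType and HairLength terms are identical with and without the arc and cancel in the comparison.

```python
from math import exp, lgamma


def log_k2_family(counts_by_parent_config):
    """Log K2 term for one node.

    counts_by_parent_config: one row per parent configuration, each row
    holding the observed counts of the node's states in that configuration.
    """
    total = 0.0
    for row in counts_by_parent_config:
        r = len(row)
        # log[(r - 1)! / (N + r - 1)!] for this parent configuration
        total += lgamma(r) - lgamma(sum(row) + r)
        # log of the product of N_k! over the node's states
        total += sum(lgamma(n + 1) for n in row)
    return total


def posterior_from_log_scores(log_scores):
    """Posterior structure probabilities under EQUAL structure priors."""
    m = max(log_scores)  # subtract the max for numerical stability
    weights = [exp(s - m) for s in log_scores]
    z = sum(weights)
    return [w / z for w in weights]


# HYPOTHETICAL Gender counts, one row per PersonType state (not the real data).
gender_by_type = [[10, 5], [20, 22], [30, 41], [35, 37]]

# Without the arc, Gender has no parents: pool the rows into a single row.
gender_pooled = [[sum(col) for col in zip(*gender_by_type)]]

log_with_arc = log_k2_family(gender_by_type)
log_without_arc = log_k2_family(gender_pooled)
post = posterior_from_log_scores([log_with_arc, log_without_arc])
print(f"log K2 (Gender family) with arc:    {log_with_arc:.3f}")
print(f"log K2 (Gender family) without arc: {log_without_arc:.3f}")
print(f"posterior probabilities: {post}")
```

Under the same assumptions, a context-specific independence model would be scored by merging the rows of the parent configurations that share a distribution before calling `log_k2_family`, and a third log score could be appended to the list passed to `posterior_from_log_scores`.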