Tuesday, March 26, 2013

Reading Think Bayes

I have started reading the book Think Bayes (http://www.greenteapress.com/thinkbayes/html) and thought of posting some of the interesting stuff I have read.
Here is a problem from chapter 3:

Suppose I have a box of dice that contains a 4-sided die, a 6-sided die, an 8-sided die, a 12-sided die and a 20-sided die. Suppose I select a die from the box at random, roll it, and get a 6. What is the probability that I rolled each die? 
We first need to define the hypothesis (H), i.e. our guess of which die was rolled. In this case the hypothesis has to be one of the numbers 4, 6, 8, 12 and 20, the number of sides of the selected die.
The data or observation (D) we are given is that the first roll resulted in a 6. We have to find the posterior probability distribution P(H|D=6), i.e. the probability of each hypothesis H=4, H=6, H=8, etc. given the data D=6. Bayes' theorem says that P(A|B) = P(A)·P(B|A)/P(B); in our case P(H|D=6) = P(H)·P(D=6|H)/P(D=6). Since the die is picked at random from the five dice, the prior P(H) is 1/5 for each hypothesis.
Now the likelihood P(D=6|H), the probability of rolling a 6 with each die, works out to:
Hypothesis (H)    Likelihood
P(D=6|H=4)        0.0
P(D=6|H=6)        1/6 = 0.1667
P(D=6|H=8)        1/8 = 0.125
P(D=6|H=12)       1/12 = 0.0833
P(D=6|H=20)       1/20 = 0.05
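To check these numbers, here is a small plain-Python sketch of my own (not the code from the book); the die sizes and the likelihood function are written out directly from the problem statement.

# Probability of rolling a 6 with an n-sided die: 1/n if the die has at
# least 6 sides, otherwise 0 (a 4-sided die can never show a 6).
dice = [4, 6, 8, 12, 20]

def likelihood(roll, sides):
    return 1.0 / sides if roll <= sides else 0.0

for sides in dice:
    print("P(D=6|H=%d) = %.4f" % (sides, likelihood(6, sides)))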
As you can see, these values do not sum to 1, so by themselves they are not a probability distribution. Because the prior P(H) is the same (1/5) for every die, it cancels out in Bayes' theorem, so the posterior is obtained simply by normalizing: add up the likelihoods and divide each one by the sum. The final result is:
Hypothesis (H)    Posterior probability
P(H=4|D=6)        0.0
P(H=6|D=6)        0.392156862745
P(H=8|D=6)        0.294117647059
P(H=12|D=6)       0.196078431373
P(H=20|D=6)       0.117647058824
The first table is called the likelihood, i.e. the likelihood of getting a 6 on a roll with each die, and the normalized table is the posterior distribution. It is obvious that getting a 6 with a 4-sided die is impossible, hence its likelihood (and posterior probability) is 0.0.
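Continuing the sketch above (still my own plain Python, not the code from the book), multiplying each likelihood by the uniform prior of 1/5 and dividing by the total reproduces the posterior table:

# Posterior P(H|D=6): prior times likelihood, divided by the total.
# Because the prior is uniform (1/5), it cancels out, so this is the
# same as normalizing the likelihoods.
dice = [4, 6, 8, 12, 20]
prior = 1.0 / len(dice)

def likelihood(roll, sides):
    return 1.0 / sides if roll <= sides else 0.0

unnormalized = {sides: prior * likelihood(6, sides) for sides in dice}
total = sum(unnormalized.values())
posterior = {sides: p / total for sides, p in unnormalized.items()}

for sides in dice:
    print("P(H=%d|D=6) = %.12f" % (sides, posterior[sides]))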
The main strength of this book is that all the problems are solved computationally; the author gives Python programs for all of them. Now if we get more data from further rolls of the same die, the posterior probabilities will keep changing as we update them with each roll.
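As a rough illustration of that idea (again my own sketch, not the code from the book), each additional roll of the same die just multiplies the current probabilities by the likelihood of that roll and renormalizes; the sequence of rolls below is made-up example data.

# Sequential Bayesian update: start from a uniform prior over the dice and
# fold in one roll at a time. Dice that cannot produce an observed roll
# (e.g. a 4-sided die when an 8 is rolled) drop to probability 0.
dice = [4, 6, 8, 12, 20]

def likelihood(roll, sides):
    return 1.0 / sides if roll <= sides else 0.0

def update(probs, roll):
    unnorm = {sides: p * likelihood(roll, sides) for sides, p in probs.items()}
    total = sum(unnorm.values())
    return {sides: p / total for sides, p in unnorm.items()}

probs = {sides: 1.0 / len(dice) for sides in dice}   # uniform prior
for roll in [6, 8, 7, 7, 5, 4]:                      # example rolls of the selected die
    probs = update(probs, roll)

for sides in dice:
    print("P(H=%d | data) = %.6f" % (sides, probs[sides]))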
