Densities For Proportions: Chapter 7
This chapter introduces two concepts:
- Probability as density. Probabilities move from discrete to continuous.
- Beta densities, which Wikipedia describes as follows:
In probability theory and statistics, the beta distribution is a family of continuous probability distributions defined on the interval (0, 1) parameterized by two positive shape parameters, typically denoted by α and β. The domain of the beta distribution can be viewed as a probability, and in fact the beta distribution is often used to describe the distribution of an unknown probability value — typically, as the prior distribution over a probability parameter, such as the probability of success in a binomial distribution or Bernoulli distribution.
This book describes beta densities as “particular types of prior (and posterior) distributions. There is a different density for each pair of a and b.”
The advantage of using beta densities is that they simplify calculations: a beta(a, b) prior combined with s observed successes and f observed failures gives a beta(a + s, b + f) posterior.
Basic Equations
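The likelihood of the observed data has the same form as a beta density, which is how a and b are identified:
Likelihood = (model)^successes * (1 - model)^failures
Writing p for the model's probability of success, this matches the beta(a, b) form p^(a-1) * (1-p)^(b-1) when a - 1 = successes and b - 1 = failures.
The predictive probability of success = mean of the beta(a, b) density = a / (a + b)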
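As a minimal sketch of how these equations are used, the Python snippet below assumes the standard conjugate update, in which observed successes and failures are simply added to a and b. The function names `update_beta` and `predictive_success` are illustrative and do not come from the book.

```python
def update_beta(a, b, successes, failures):
    """Posterior (a, b) after adding observed counts to a beta(a, b) prior."""
    return a + successes, b + failures


def predictive_success(a, b):
    """Mean of a beta(a, b) density, i.e. the predictive probability of success."""
    return a / (a + b)


# Illustrative counts only: a flat beta(1, 1) prior, then 3 successes and 1 failure.
a, b = update_beta(1, 1, successes=3, failures=1)
print(a, b)                      # 4 2
print(predictive_success(a, b))  # 4 / 6
```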
Process for Using Beta Density
Choose Beta Densities as Priors
Note that beta densities are not always appropriate. For example, a prior distribution may be bimodal, and there is no beta density with this shape.
Selecting a beta density means selecting a and b.
First step: assess the probability of success on the first trial, r.
Second step: imagine the first trial is a success and assess the probability of success on the second trial given this information, r+.
a and b can then be found as follows:
As r = a / (a + b), and
r+ = (a + 1) / (a + b + 1),
then
a = (r (1 - r+)) / (r+ - r)
b = ((1 - r)(1 - r+)) / (r+ - r)
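The two formulas above can be sketched in Python as follows. The function name `beta_from_assessments` and the input check are assumptions made for this illustration; the check reflects the fact that r+ must exceed r for a and b to come out positive.

```python
def beta_from_assessments(r, r_plus):
    """Solve r = a/(a+b) and r+ = (a+1)/(a+b+1) for a and b."""
    if not 0 < r < r_plus < 1:
        raise ValueError("need 0 < r < r_plus < 1")
    gap = r_plus - r
    a = r * (1 - r_plus) / gap
    b = (1 - r) * (1 - r_plus) / gap
    return a, b


# Illustrative assessments: r = 0.5 and r+ = 0.6 give approximately beta(2, 2).
a, b = beta_from_assessments(0.5, 0.6)
print(round(a, 6), round(b, 6))
```

Substituting the result back gives a / (a + b) = 0.5 and (a + 1) / (a + b + 1) = 0.6, which recovers the two assessments.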
The third step is to check the internal consistency of the values of a and b, by assessing the probability of success on the second trial assuming the first trial results in a failure:
r- = a / (a + b + 1)
Once a and b have been calculated using the formulas involving r and r+, the value of r- they imply can be calculated using this formula. If the assessed and calculated values of r- disagree, then r and r+ (and hence a and b) should be reassessed until they do agree.
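A small sketch of this consistency check is given below. The helper names and the `tolerance` parameter are hypothetical: the text does not say how close the two values of r- need to be, so the threshold here is only a placeholder.

```python
def implied_r_minus(a, b):
    """Probability of success on the second trial given a failure on the first."""
    return a / (a + b + 1)


def is_consistent(a, b, assessed_r_minus, tolerance=0.02):
    """True if the assessed r- is within `tolerance` of the value implied by a and b.

    The tolerance is a hypothetical choice for this sketch, not a value from the text.
    """
    return abs(implied_r_minus(a, b) - assessed_r_minus) <= tolerance


# Illustrative values: beta(2, 2) implies r- = 2 / 5 = 0.4.
print(implied_r_minus(2, 2))
print(is_consistent(2, 2, assessed_r_minus=0.45))
```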
The text has two additional consistency checks which I have not noted here.
Smaller values of a and b indicate a more open-minded position, since the prior then carries less weight relative to the data.
Constructing a Probability Interval For a Proportion
The normal curve can be used to approximate the beta(a, b) density when a and b are large.
Calculate r and r+.
Calculate the standard deviation t = √(r (r+ - r)).
A probability interval for p is r +/- z_perc * t.
Key z values are:
95%: z_0.95 = 1.96
99%: z_0.99 = 2.58
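The recipe above can be sketched as a single Python function. The name `probability_interval` and the `Z_VALUES` lookup are assumptions for this illustration, and the beta(40, 60) density is an arbitrary example rather than one taken from the text.

```python
import math

# z values quoted in the text for 95% and 99% intervals.
Z_VALUES = {0.95: 1.96, 0.99: 2.58}


def probability_interval(a, b, level=0.95):
    """Normal-approximation probability interval for p under a beta(a, b) density."""
    r = a / (a + b)
    r_plus = (a + 1) / (a + b + 1)
    t = math.sqrt(r * (r_plus - r))  # standard deviation of the beta(a, b) density
    z = Z_VALUES[level]
    return r - z * t, r + z * t


# Illustrative density only: beta(40, 60), so r = 0.4.
low, high = probability_interval(40, 60)
print(round(low, 3), round(high, 3))
```

Because the interval relies on the normal approximation, it should only be trusted when a and b are large.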
Example
A sample of 663 adults was asked, “Can gays serve effectively in the military if they keep their sexual orientation private?” The responses were:
Yes – 477
No – 186
How good is 72% (477 / 663) as an estimate of the population proportion who would say yes?
Assume a beta(1, 1) prior.
The posterior density is then beta(478, 187).
r = 0.71880
r+ = 0.71922
t = 0.0174
A 95% probability interval is 0.7188 +/- 1.96 * 0.0174
= 0.7188 +/- 0.0341, or from 68.5% to 75.3%.
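The arithmetic in this example can be reproduced with a few lines of Python. This is only a check of the figures above, using unrounded intermediate values; the variable names are illustrative.

```python
import math

yes, no = 477, 186
a, b = 1 + yes, 1 + no          # beta(1, 1) prior -> beta(478, 187) posterior

r = a / (a + b)                 # 478 / 665
r_plus = (a + 1) / (a + b + 1)  # 479 / 666
t = math.sqrt(r * (r_plus - r))

half_width = 1.96 * t
low, high = r - half_width, r + half_width

print(f"r = {r:.5f}, r+ = {r_plus:.5f}, t = {t:.4f}")
print(f"95% interval: {low:.3f} to {high:.3f}")
```

The printed interval rounds to 0.685 to 0.753, which agrees with the 68.5% to 75.3% range given in the example.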
A probability interval can also be used to test a null hypothesis. For example, the null hypothesis of no trend corresponds to a proportion of 50%, which here falls well outside the 95% interval.
The text also shows how to calculate the probability of the next two or more observations.