Chapters 6–9 of Statistics: A Bayesian Perspective dealt with making inferences about proportions. Chapter 10 and the two following chapters deal with inferences about general types of observations. Chapter 10 looks at one sample and one population.
The book approaches inferences about one population by using density both within models and across models. The calculation methods assume that the population of interest is normally distributed, and that as sample size increases, sample proportions tend to population proportions.
As in previous chapters, the book describes a spreadsheet-like approach to calculation and then follows up with a density-based calculation method. Chapter 10 looks at the spreadsheet approach for calculating posterior probabilities based on one sample and one population.
This example is about a study of piglet weight gains. The sample size is 100.
The components of the spreadsheet model are:
Model: in this case, there are 1,001 models for the possible weights, from 0 kg to 100 kg in 0.1 kg steps.
Z-score: because we are assuming the population is normally distributed, we calculate the z-score using the formula z = √n × (mean − model) / std dev
Likelihood: e^(−z²/2), i.e. the formula for the height of the normal density, up to a constant factor
P (Model): Prior probability. In this example, a flat prior has been used.
Product: Likelihood * Prior
P (Model | D): Posterior probability, i.e. Product divided by the sum of the Product column over all models
n | 100 |
x (sample mean) | 30 |
std dev | 10 |
e | 2.718282 |
Model | z-score | Likelihood | P (Model) | Product | P (Model | D) |
30.00 | 0.00 | 1.00000000 | 0.000999 | 0.00099900 | 0.03989423 |
30.10 | -0.10 | 0.99501248 | 0.000999 | 0.00099402 | 0.03969525 |
30.20 | -0.20 | 0.98019867 | 0.000999 | 0.00097922 | 0.03910427 |
30.30 | -0.30 | 0.95599748 | 0.000999 | 0.00095504 | 0.03813878 |
30.40 | -0.40 | 0.92311635 | 0.000999 | 0.00092219 | 0.03682701 |
30.50 | -0.50 | 0.88249690 | 0.000999 | 0.00088162 | 0.03520653 |
30.60 | -0.60 | 0.83527021 | 0.000999 | 0.00083444 | 0.03332246 |
30.70 | -0.70 | 0.78270454 | 0.000999 | 0.00078192 | 0.03122539 |
30.80 | -0.80 | 0.72614904 | 0.000999 | 0.00072542 | 0.02896916 |
30.90 | -0.90 | 0.66697681 | 0.000999 | 0.00066631 | 0.02660852 |
31.00 | -1.00 | 0.60653066 | 0.000999 | 0.00060592 | 0.02419707 |
The table is an extract of the complete model set. The extract is centred around those models that are likely.
The models around 30 are the most likely. The posterior probability that the correct model is exactly 30.0 is quite small (about 0.04), as there are many models around 30 that could account for the weight gains observed.
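The spreadsheet columns described above can be sketched in a few lines of Python. This is only an illustration of the calculation as summarised in this post, not the book's own code. The variable names are made up for the example, and the grid is assumed to include both end points (0 and 100), which is what the prior of 0.000999 in the table suggests.

```python
import math

n = 100          # sample size
x_bar = 30.0     # sample mean (the "x" row in the spreadsheet)
std_dev = 10.0   # standard deviation used in the z-score

# Candidate models: 0.0, 0.1, ..., 100.0 (assumed to include both end points).
models = [i / 10 for i in range(1001)]

# Flat prior: every model gets the same probability.
prior = 1 / len(models)

rows = []
for m in models:
    z = math.sqrt(n) * (x_bar - m) / std_dev   # z-score column
    likelihood = math.exp(-z ** 2 / 2)         # likelihood column
    rows.append((m, z, likelihood, prior, likelihood * prior))

# Posterior = product / sum of all products.
total = sum(row[4] for row in rows)
posterior = [row[4] / total for row in rows]

# Print the same extract as the table: models from 30.0 to 31.0.
for (m, z, lik, p, prod), post in zip(rows, posterior):
    if 30.0 <= m <= 31.0:
        print(f"{m:6.2f} | {z:5.2f} | {lik:.8f} | {p:.6f} | {prod:.8f} | {post:.8f}")
```

Because the prior is flat, it cancels when the products are divided by their sum, so the posterior column is simply the likelihood column rescaled to add up to 1.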
Comments on analysis
- The prior probability is flat; in reality, it is not. We can be sure, for example, that pigs will gain at least some weight, but it is unlikely that they will gain 90 pounds in 20 days.
- Assuming the population is normal leads to a rich theory in which the calculations are not difficult.
- Normal densities are not always appropriate.
- Smooth within models as well as across models.
- Law of large numbers: as sample size increases, sample proportions tend to population proportions.
- Likelihood: the height of the density at the observed data point.
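As a rough check on the last point, the sketch below compares the spreadsheet likelihood e^(−z²/2) with the full height of a normal density at the observed sample mean. It assumes that, under each model, the sample mean is normally distributed around the model value with standard deviation std dev / √n, which is what the z-score formula implies. The function names are hypothetical and exist only for this example.

```python
import math

n, x_bar, std_dev = 100, 30.0, 10.0
std_error = std_dev / math.sqrt(n)   # assumed sd of the sample mean under a model


def normal_pdf(x, mean, sd):
    """Height of the normal density with the given mean and sd at x."""
    return math.exp(-((x - mean) ** 2) / (2 * sd ** 2)) / (sd * math.sqrt(2 * math.pi))


def spreadsheet_likelihood(model):
    """Likelihood column from the spreadsheet: e^(-z^2 / 2)."""
    z = math.sqrt(n) * (x_bar - model) / std_dev
    return math.exp(-z ** 2 / 2)


# The two quantities should differ only by this constant factor.
constant = 1 / (std_error * math.sqrt(2 * math.pi))

ratios = []
for model in (29.5, 30.0, 30.7):
    full = normal_pdf(x_bar, model, std_error)
    short = spreadsheet_likelihood(model)
    ratios.append(full / short)
    print(model, full, short, full / short)
```

The ratio is the same for every model, so dropping the constant makes no difference once the products are normalised into posterior probabilities.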