Chapter 6: Models for Proportions is the first chapter in Statistics: A Bayesian Perspective that deals with statistical analysis proper. The earlier chapters cover statistics more generally: the display and summary of data, experimental design, and probability theory as it relates to Bayes' Rule.
The process of statistical inference is:
1. Specify a set of models.
For a basic proportion model, with proportions expressed to one decimal place, the models are: 0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0. If absolute certainty one way or the other can be ruled out, the models 0.0 and 1.0 can be deleted. A more exhaustive model listing would express the proportions to two or three decimal places.
2. Assign prior probability.
In a traditional frequentist model of statistics, equal prior probability would be assigned to each model (for example, for a set of 11 models, the prior probability of each model would be 1/11). This is called a flat prior.
The strength of Bayesian statistics shows when the prior probabilities take account of existing knowledge about the population (as would be the case, for example, if a pilot study had been undertaken). Alternatively, the prior probability of each model could be derived from a theoretical model of the process being studied.
3. Collect data.
4. Calculate the likelihood of each model.
I will need to read further on this concept, but in the meantime likelihood is defined as P(data | model).
For a model proportion p, the calculation is p^successes * (1 - p)^failures, where successes and failures are the counts observed in the data.
5. Use Bayes' Rule to calculate posterior probabilities.
For each model, posterior probability = (prior * likelihood) / (sum of prior * likelihood over all models); that is, the prior * likelihood products are rescaled so that they sum to 1 (see the code sketch after step 7).

6. Draw inferences
7. Calculate predictive probability.
This is the probability that the next observation will be a success.
The calculation is the weighted average of the model proportions, where the weights are the posterior probabilities of each model.
The predictive probability is therefore the expected value of the proportion under the posterior distribution.
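To make the seven steps concrete, here is a minimal sketch in Python (my own illustration, not from the book; the variable names, the flat prior, and the example data of 1 success and 2 failures are assumptions):

```python
# A discrete-model Bayesian analysis of a proportion: steps 1-7 in miniature.

# Step 1: specify the set of candidate models (proportions to one decimal place,
# dropping 0.0 and 1.0 because neither outcome is assumed impossible).
models = [round(i / 10, 1) for i in range(1, 10)]

# Step 2: assign a prior probability to each model (a flat prior here).
priors = [1 / len(models)] * len(models)

# Step 3: collect data -- hypothetically, 1 success and 2 failures.
successes, failures = 1, 2

# Step 4: likelihood of each model, P(data | model) = p^successes * (1 - p)^failures.
likelihoods = [p**successes * (1 - p)**failures for p in models]

# Step 5: Bayes' Rule -- prior * likelihood, rescaled so the posteriors sum to 1.
products = [pr * lk for pr, lk in zip(priors, likelihoods)]
posteriors = [prod / sum(products) for prod in products]

# Step 6: draw inferences, e.g. the most probable model.
best_model, best_prob = max(zip(models, posteriors), key=lambda pair: pair[1])
print(f"Most probable model: {best_model} (posterior {best_prob:.3f})")

# Step 7: predictive probability of a success on the next observation --
# the posterior-weighted average of the model proportions.
predictive = sum(p * post for p, post in zip(models, posteriors))
print(f"Predictive probability of next success: {predictive:.4f}")
```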
Spreadsheet Example
Model | Prior | Likelihood | Prior * Likelihood | Posterior | Model * Posterior |
0.00 | 0.00 | 0.000000 | 0.00000000 | 0.000000 | 0.000000 |
0.10 | 0.02 | 0.081000 | 0.00162000 | 0.018802 | 0.001880 |
0.20 | 0.03 | 0.128000 | 0.00384000 | 0.044568 | 0.008914 |
0.30 | 0.05 | 0.147000 | 0.00735000 | 0.085306 | 0.025592 |
0.40 | 0.10 | 0.144000 | 0.01440000 | 0.167131 | 0.066852 |
0.50 | 0.15 | 0.125000 | 0.01875000 | 0.217618 | 0.108809 |
0.60 | 0.20 | 0.096000 | 0.01920000 | 0.222841 | 0.133705 |
0.70 | 0.25 | 0.063000 | 0.01575000 | 0.182799 | 0.127960 |
0.80 | 0.15 | 0.032000 | 0.00480000 | 0.055710 | 0.044568 |
0.90 | 0.05 | 0.009000 | 0.00045000 | 0.005223 | 0.004701 |
1.00 | 0.00 | 0.000000 | 0.00000000 | 0.000000 | 0.000000 |
Total | 1.00 | 0.825000 | 0.08616000 | 1.000000 | 0.522981 |
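As a cross-check on the spreadsheet: the likelihood column matches p^1 * (1 - p)^2, so the data behind the example appear to be 1 success and 2 failures. Assuming that, this short sketch (mine, not from the book) reproduces the table's totals:

```python
# Reproduce the spreadsheet example using its priors and the apparent data
# of 1 success and 2 failures.
models = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
priors = [0.00, 0.02, 0.03, 0.05, 0.10, 0.15, 0.20, 0.25, 0.15, 0.05, 0.00]

likelihoods = [p**1 * (1 - p)**2 for p in models]
products = [pr * lk for pr, lk in zip(priors, likelihoods)]
total = sum(products)
posteriors = [prod / total for prod in products]
predictive = sum(p * post for p, post in zip(models, posteriors))

print(f"Sum of prior * likelihood: {total:.5f}")       # 0.08616, as in the table
print(f"Predictive probability:    {predictive:.6f}")  # 0.522981
```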
Comments
- Assumptions: like most statistical procedures, this analysis rests on assumptions which, if not met, can render the results meaningless. With proportions, for example, there is the assumption of exchangeability.
- One proportion may be considerably more likely than another, but the probability of either of them may still be very small.
- Law of large numbers: the proportion of successes in a sample from a population tends to the proportion of successes in the population as the sample size gets larger.
I need to think about this in relation to sample sizes. In my course on statistical procedures, I asked the following question:
This is a general question about the power of a statistical test.
Power is a function of:
- size of effect
- alpha
- sensitivity of the experiment (which includes sample size)
I was wondering whether the size of the population affects power in any way, or whether that is implicit in the effect size. For example, consider sampling from the population of Australia versus the student population of Swinburne: the population size varies from roughly 16 million down to roughly 20,000.
The answer from the instructor was:
No, the population does not impact on the power, so long as the sample size is not more than around 5 to 10% of the population. The following link discusses this issue.
http://www.childrens-mercy.org/stats/size/population.asp
Fairly much, the sampling distribution of the sample mean is not affected by the population size unless the sample size is a large proportion of the population.
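To convince myself of the instructor's point, here is a rough simulation sketch (my own; the true proportion of 0.3, the sample size of 100, and the two population sizes are made-up illustration values). Sampling without replacement from a finite population follows a hypergeometric distribution, so the spread of the sample proportion can be compared directly for a Swinburne-sized and an Australia-sized population:

```python
# Check that population size barely changes the sampling distribution of a
# sample proportion when the sample is a small fraction of the population.
import numpy as np

rng = np.random.default_rng(0)
n_sample, true_p, reps = 100, 0.3, 10_000

for pop_size in (20_000, 16_000_000):
    n_good = int(pop_size * true_p)   # "successes" in the population
    n_bad = pop_size - n_good         # "failures" in the population
    # Drawing n_sample units without replacement is hypergeometric.
    successes = rng.hypergeometric(n_good, n_bad, n_sample, size=reps)
    props = successes / n_sample
    print(f"Population {pop_size:>10,}: mean = {props.mean():.3f}, sd = {props.std():.4f}")
```

The standard deviations come out almost identical, which matches the 5 to 10% rule of thumb above.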
- Statistical Tests for Proportions
Studying this chapter made me realize I needed to freshen up my understanding of the various tests available for proportions:
1. Z score and probability
2. Confidence interval for z score
3. Sign test
4. McNemar Test (variation of sign test)
5. Fisher's Exact Test
6. Chi-square / 2 x 2 contingency table test
7. New test proposed by Taillard, Waelti & Zuber
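As a quick refresher, a couple of the tests on that list can be run in a few lines with scipy (the 2 x 2 table of counts below is invented for illustration):

```python
# Two standard tests of proportions on an invented 2 x 2 table of counts
# (rows = groups, columns = successes / failures).
from scipy.stats import chi2_contingency, fisher_exact

table = [[12, 8],
         [5, 15]]

odds_ratio, p_fisher = fisher_exact(table)              # Fisher's Exact Test
chi2, p_chi2, dof, expected = chi2_contingency(table)   # chi-square contingency test

print(f"Fisher's exact test: p = {p_fisher:.4f}")
print(f"Chi-square test:     chi2 = {chi2:.3f}, p = {p_chi2:.4f}")
```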