- Association between two categorical variables
- In many contingency tables, one variable is a response variable and the other an explanatory variable.
o Then it is informative to construct a separate probability distribution for Y at each level of X
§ Conditional probabilities for Y, given the level of X
· Together these form the conditional distribution of Y at that level of X.
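A minimal sketch of the idea above: divide each row of a contingency table by its row total to get the conditional distribution of Y at each level of X. The counts and the function name `conditional_distributions` are hypothetical and used only for illustration.

```python
def conditional_distributions(table):
    """Return the conditional distribution of Y within each row (level of X)."""
    result = []
    for row in table:
        total = sum(row)  # row total = number of observations at this level of X
        result.append([count / total for count in row])
    return result


# Hypothetical 2x2 counts: rows are levels of X, columns are levels of Y.
counts = [[30, 70],
          [60, 40]]

cond = conditional_distributions(counts)
for level, dist in enumerate(cond, start=1):
    print(f"Level {level} of X: {dist}")
```

Each printed row sums to 1, since it is a probability distribution for Y within that row.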
- Independence
o Statistically independent if conditional distributions of Y are identical at each level of X
§ Independent : probability of any particular column is same for each row.
§ Statistical independence: property that all joint probabilities equal the product of their marginal probabilities
· Joint
· Marginal
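The product rule above can be checked numerically. The sketch below assumes the joint distribution is given as a nested list of probabilities; the two example tables and the helper names are made up for illustration.

```python
def marginals(joint):
    """Row and column marginal probabilities of a joint distribution."""
    row = [sum(r) for r in joint]
    col = [sum(c) for c in zip(*joint)]
    return row, col


def is_independent(joint, tol=1e-9):
    """True if every joint probability equals the product of its marginals."""
    row, col = marginals(joint)
    return all(
        abs(joint[i][j] - row[i] * col[j]) <= tol
        for i in range(len(row))
        for j in range(len(col))
    )


# Hypothetical joint distributions (each sums to 1).
independent_joint = [[0.12, 0.28],   # row marginals 0.4 and 0.6,
                     [0.18, 0.42]]   # column marginals 0.3 and 0.7
dependent_joint = [[0.30, 0.10],
                   [0.20, 0.40]]

print(is_independent(independent_joint))
print(is_independent(dependent_joint))
```

In the first table both rows have the same conditional distribution for Y (0.3, 0.7), which is the other way the notes describe independence.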
- When rows of a contingency table refer to different groups
o Sample sizes for groups often fixed by sampling design
o When marginal totals fixed rather than random
§ Joint distribution for X and Y is no longer meaningful
· But the conditional distributions for Y at each level of X still are
- Difference of Proportions
o Compares the success probabilities in the two rows
o Difference falls between -1 and 1; it equals zero when the two probabilities are equal
o See formula for SE → calculate confidence interval.
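The notes do not reproduce the SE formula. The sketch below assumes it is the usual large-sample (Wald) one for two independent binomial samples, SE = sqrt(p1(1-p1)/n1 + p2(1-p2)/n2). The counts are hypothetical, not the aspirin data, and `z=1.96` gives an approximate 95% interval.

```python
import math


def diff_of_proportions_ci(y1, n1, y2, n2, z=1.96):
    """Difference of two sample proportions with a Wald confidence interval.

    y1, y2: success counts in rows 1 and 2; n1, n2: row sample sizes.
    Assumes the two rows are independent binomial samples.
    """
    p1, p2 = y1 / n1, y2 / n2
    diff = p1 - p2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff, se, (diff - z * se, diff + z * se)


# Hypothetical counts: 40 successes out of 200 vs 20 successes out of 200.
diff, se, (lower, upper) = diff_of_proportions_ci(40, 200, 20, 200)
print(f"difference = {diff:.3f}, SE = {se:.4f}, CI = ({lower:.3f}, {upper:.3f})")
```

If the interval excludes zero, the data suggest the success probabilities differ between the two rows.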
- Aspirin and heart attack example
o Two rows → treated as independent binomial samples
- Relative Risk
o Difference between two proportions of certain fixed size may have greater importance when both proportions are near 0 or 1.
o RR = ratio of success probabilities.
o Any non-negative number
o Complicated CI formula
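One common way to get the interval, sketched here as an assumption about which formula the notes have in mind: build the CI on the log scale, where SE(log RR) = sqrt(1/y1 - 1/n1 + 1/y2 - 1/n2), then exponentiate the endpoints. The counts are hypothetical and all successes counts must be positive.

```python
import math


def relative_risk_ci(y1, n1, y2, n2, z=1.96):
    """Sample relative risk p1/p2 with a large-sample CI built on the log scale."""
    p1, p2 = y1 / n1, y2 / n2
    rr = p1 / p2
    # Standard error of log(RR); requires y1 > 0 and y2 > 0.
    se_log = math.sqrt(1 / y1 - 1 / n1 + 1 / y2 - 1 / n2)
    log_rr = math.log(rr)
    return rr, (math.exp(log_rr - z * se_log), math.exp(log_rr + z * se_log))


# Hypothetical counts: 40 successes out of 200 vs 20 successes out of 200.
rr, (rr_lower, rr_upper) = relative_risk_ci(40, 200, 20, 200)
print(f"RR = {rr:.2f}, CI = ({rr_lower:.2f}, {rr_upper:.2f})")
```

Because the interval is exponentiated, it is not symmetric around the estimate; RR = 1 corresponds to equal success probabilities.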
- Odds Ratio
o Odds = prob of success / prob of failure
o Odds non-negative
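A short sketch of the definitions: odds from a success probability, and the odds ratio as the ratio of the odds in the two rows, which for a 2x2 table of counts reduces to the cross-product ratio. The function names and numbers are illustrative only.

```python
def odds(p):
    """Odds of success for success probability p (requires p < 1)."""
    return p / (1 - p)


def odds_ratio(p1, p2):
    """Ratio of the odds in row 1 to the odds in row 2."""
    return odds(p1) / odds(p2)


def odds_ratio_from_counts(n11, n12, n21, n22):
    """Sample odds ratio as the cross-product ratio of a 2x2 table of counts."""
    return (n11 * n22) / (n12 * n21)


# A success probability of 0.75 means success is three times as likely as failure.
print(odds(0.75))

# Hypothetical table: 40 successes / 160 failures vs 20 successes / 180 failures.
print(odds_ratio(40 / 200, 20 / 200))
print(odds_ratio_from_counts(40, 160, 20, 180))
```

The last two lines give the same value up to rounding, since the row totals cancel in the ratio of odds.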
- Odds Ratio for Aspirin Study
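- Relationship between odds ratio and relative risk
o Odds ratio = relative risk × (1 − p2)/(1 − p1), where p1 and p2 are the success probabilities in the two rows
o When both success probabilities are close to zero → the fraction in the last term is approximately 1
o OR and RR then take similar values.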
o For some data sets calculation of RR is not possible
§ Case-control study → the marginal distribution of disease status is fixed by the sampling design
· Two controls for each case
· Might wish to compare ever-smokers with nonsmokers in terms of the proportions who suffered the disease
o These proportions refer to conditional distribution of disease, given smoking status
§ Cannot estimate such proportions for this data set
§ Study matched each case with two controls
· We can compute proportions in reverse direction
o Conditional distribution of smoking status, given disease status
o Use odds ratio
§ Odds ratio takes the same value whether it is defined using the conditional distribution of X given Y or the conditional distribution of Y given X → it treats the variables symmetrically
§ So the odds ratio can be estimated from the conditional distributions in either direction
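Both points in this section can be checked on a small table: the odds ratio is the same whichever variable is conditioned on, and it is close to the relative risk when the success proportions are small. The table below is hypothetical (not the case-control data from the notes), and the function names are illustrative.

```python
def odds(p):
    return p / (1 - p)


def or_y_given_x(t):
    """Odds ratio from the row-wise conditional distributions (Y given X)."""
    p1 = t[0][0] / (t[0][0] + t[0][1])
    p2 = t[1][0] / (t[1][0] + t[1][1])
    return odds(p1) / odds(p2)


def or_x_given_y(t):
    """Odds ratio from the column-wise conditional distributions (X given Y)."""
    q1 = t[0][0] / (t[0][0] + t[1][0])
    q2 = t[0][1] / (t[0][1] + t[1][1])
    return odds(q1) / odds(q2)


# Hypothetical counts: rows = exposure (yes, no), columns = disease (yes, no).
table = [[20, 980],
         [10, 990]]

p1 = table[0][0] / sum(table[0])   # success proportion in row 1
p2 = table[1][0] / sum(table[1])   # success proportion in row 2
rr = p1 / p2

or_rows = or_y_given_x(table)
or_cols = or_x_given_y(table)

print(f"OR (Y given X) = {or_rows:.4f}")
print(f"OR (X given Y) = {or_cols:.4f}")
print(f"RR = {rr:.4f}, RR * (1 - p2)/(1 - p1) = {rr * (1 - p2) / (1 - p1):.4f}")
```

In a real case-control sample the row proportions `p1` and `p2` would not estimate disease risk, which is why only the odds ratio, computed in the reverse direction, is usable there.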