Sunday, April 3, 2011

Forum Comment - Topic 5

1) The study guide (p. 11) talks about collapsing (or not collapsing) categories to highlight or obscure patterns in the data. I think the key method here would be graphical. Tables and grids are not very good for discerning relationships or patterns.

2) On page 12, the SG says that one reason for rearranging categories was:

o Changing the level of measurement of a variable and thus affecting the methods of analysis that can be applied to the variable.

Can anyone provide an example? I'm not sure what this means.

3) Re statistical imputation of missing data: can the methods be used only if the variables with missing data are interval / ratio / ordinal?

o Nominal variables: sample or group means wouldn't work if the variable with missing values was gender.

4) Re multiple response questions: de Vaus says that the multiple response approach can lead to difficulties with analysis. In the example used in the book, where there are 12 qualities about children and the respondent is asked to pick the top 2 qualities, he says that you would need to look at both variables to obtain information about which qualities were picked. I would have thought it would be relatively easy to gain that information by creating a new variable using a conditional transformation. It's trivial in Excel, so I assume it can be easily done in SPSS or R.
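
A minimal sketch of that conditional transformation, written in Python rather than Excel, SPSS or R. The variable names (choice1, choice2, picked_N) and the four respondents are made up for illustration; they are not de Vaus's data.

```python
# Hypothetical data: each respondent's first and second choice, coded 1-12.
responses = [
    {"choice1": 3, "choice2": 7},
    {"choice1": 7, "choice2": 1},
    {"choice1": 2, "choice2": 3},
    {"choice1": 5, "choice2": 7},
]


def picked(row, quality):
    """Return 1 if the quality appears in either choice variable, else 0."""
    return 1 if quality in (row["choice1"], row["choice2"]) else 0


# Conditional transformation: one new 0/1 variable per quality.
for row in responses:
    for quality in range(1, 13):
        row[f"picked_{quality}"] = picked(row, quality)

# How many respondents picked each quality, in either position.
counts = {q: sum(row[f"picked_{q}"] for row in responses) for q in range(1, 13)}
print(counts[7])  # 3
```

With one 0/1 variable per quality, a single frequency count answers "who picked this quality" without having to look at both choice variables.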

=========================================

I liked the table illustration of the effect of collapsing the categories strongly agree/agree and strongly disagree/disagree for males and females as an example of obscuring a relationship, but I agree that graphical methods might add to the exploration of patterns.
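
To make the obscuring effect concrete, here is a small sketch in Python. The counts are invented for illustration and are not the figures from the study guide's table; they are chosen so that the two groups differ in strength of opinion only.

```python
# Hypothetical counts per 100 respondents (NOT the study guide's figures).
counts = {
    "male": {"strongly agree": 40, "agree": 10,
             "disagree": 10, "strongly disagree": 40},
    "female": {"strongly agree": 10, "agree": 40,
               "disagree": 40, "strongly disagree": 10},
}

# Mapping that collapses four categories into two.
collapse = {
    "strongly agree": "agree",
    "agree": "agree",
    "disagree": "disagree",
    "strongly disagree": "disagree",
}


def collapse_counts(row, mapping):
    """Sum the counts of the original categories into the collapsed ones."""
    out = {}
    for category, n in row.items():
        new_category = mapping[category]
        out[new_category] = out.get(new_category, 0) + n
    return out


collapsed = {sex: collapse_counts(row, collapse) for sex, row in counts.items()}
print(collapsed["male"])    # {'agree': 50, 'disagree': 50}
print(collapsed["female"])  # {'agree': 50, 'disagree': 50}
```

Before collapsing, the two groups clearly differ in how strongly they hold their views; afterwards the two rows are identical and the difference has disappeared.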

I can't think of a new example of the impact of reordering a variable on changing the level of measurement, but the one on page 13, Table 2, changes a nominal variable to an ordinal one, and there is some argument that you could then move from categorical analyses to parametric analyses. An example might be that you code children's strategies for solving a problem (in no particular order), and when you look at them they seem to form a logical sequence from most to least sophisticated. If you recoded them into that order (most sophisticated = code 7, least = code 1), you could argue that you have at least an ordinal level variable that you could use to calculate means, look at variation in average strategy sophistication by age, etc.
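
A rough sketch of that recode in Python. The mapping from the original strategy codes to a sophistication rank, the ages and the data are all hypothetical. The sketch uses the median, which needs only the ordinal assumption; using the mean instead would be the stronger, more arguable step.

```python
from statistics import median

# Hypothetical recode: original arbitrary code -> sophistication rank
# (1 = least sophisticated, 7 = most sophisticated).
recode = {1: 4, 2: 1, 3: 7, 4: 2, 5: 6, 6: 3, 7: 5}

# Hypothetical children with an age and an originally nominal strategy code.
children = [
    {"age": 6, "strategy": 2},
    {"age": 6, "strategy": 4},
    {"age": 6, "strategy": 6},
    {"age": 9, "strategy": 3},
    {"age": 9, "strategy": 5},
    {"age": 9, "strategy": 7},
]

# Rearranging the categories turns the nominal code into an ordinal rank.
for child in children:
    child["sophistication"] = recode[child["strategy"]]


def median_by_age(rows):
    """Median sophistication rank for each age group."""
    groups = {}
    for row in rows:
        groups.setdefault(row["age"], []).append(row["sophistication"])
    return {age: median(ranks) for age, ranks in groups.items()}


result = median_by_age(children)
print(result)  # {6: 2, 9: 6}
```

The same summary on the original codes would be meaningless, because their numeric order carried no information.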

Imputation of missing data for a nominal variable (e.g., gender) would work with random assignment within groups (so long as you chose another background variable to divide the sample by, and the values were missing at random in the file). I don't replace missing values, but you could do it that way if you wanted. Or you could find the proportions of males and females in the sample and assign the missing values randomly in that proportion. I don't like the idea; it seems too much like making up data to me.
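
For anyone who does want to try it, a minimal sketch of random assignment within groups in Python. The function name, the field names ("gender", "school") and the data are assumptions for illustration only. Drawing a random observed value from the same group reproduces that group's proportions on average.

```python
import random


def impute_within_groups(rows, target, group, rng):
    """Fill missing `target` values with a random observed value
    drawn from cases in the same `group`."""
    donors = {}
    for row in rows:
        if row[target] is not None:
            donors.setdefault(row[group], []).append(row[target])

    imputed = []
    for row in rows:
        new_row = dict(row)  # leave the original data untouched
        pool = donors.get(new_row[group])
        if new_row[target] is None and pool:
            new_row[target] = rng.choice(pool)
        imputed.append(new_row)
    return imputed


# Hypothetical file: gender is missing (None) for two cases.
data = [
    {"school": "A", "gender": "F"},
    {"school": "A", "gender": "F"},
    {"school": "A", "gender": None},
    {"school": "B", "gender": "M"},
    {"school": "B", "gender": "F"},
    {"school": "B", "gender": None},
]

rng = random.Random(0)  # fixed seed so the run is repeatable
filled = impute_within_groups(data, "gender", "school", rng)
```

Grouping on a constant gives the simpler version, where missing values are assigned in the overall sample proportions. A group with no observed values is left missing rather than guessed.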
