HMS777: February 2011

Monday, February 28, 2011

Feedback on Quizzes

Quiz 1

- Researcher manipulates independent variable and measures dependent variable

- The essential characteristic of a researchable question is that there is a hypothesis – correct answer is false

o Ask lecturer to indicate what is the essential characteristic

Forum Comment Topic 1

In the textbook (de Vaus) chapter 3, the author describes various research designs. One design is "retrospective experimental design".

The term "retrospective experimental design" seems unusual, in that I had thought a key attribute of the experimental design is that the researcher exercises control over the assignment of treatments to the experimental units through the process of randomization.

It would seem to me that any retrospective analysis could only be observational, as the researcher can in no way assign treatments to subjects.

In fact, the retrospective experimental design seems very similar to the cross-sectional or correlational design.

Also, de Vaus seems to suggest that surveys, because of the method of analysis and form of data, can be used for an understanding of causal relationships or links, whether or not the design is observational or experimental.

One of the points I came away with from HMS771 Analysis of Variance (and other statistics units) is that an experimental design allows inference of causality, whereas observation allows inference of association only.

If you can use de Vaus' article as an example (refer chapter 18, Putting it into Practice: a research example), the link he demonstrates is more of "association" rather than "causality":

In the paper which the chapter refers to, "Gender Differences in Religion: A Test of the Structural Location Theory", he states "the results show that the lower rates of female labor force participation are the major cause of their greater religious commitment" (my emphasis).

In the discussion section at the end of the paper though, he says, "the question remains as to why work force participation affects the religious orientation of females".

That seems to say, there is a correlation, but we still don't know what causes it.

----------------------------------------------------------------------------------

I agree with Graham. I find it difficult to distinguish between a cross-sectional (correlational) survey design, and a retrospective experimental design. Both designs appear to rely on the a posteriori selection of groups. Grouping of the subjects is based on the level of exposure to the independent variable. The only possible distinguishing feature mentioned for the "retrospective experimental design" is in the attempt to match the groups with respect to other independent variables, in order to remove confounding.

Sunday, February 27, 2011

Notes from Basic Statistical Practice 2: Sample Size

Power of a test

- Ability to find a difference in distribution when there really is a difference.

- Increase power of a test

o Increase size of alpha

§ If using 0.05, increase to say 0.10

§ Reduces risk of type 2 error, increases chance of type 1 error

- Reduce beta without increasing alpha

o Be more specific in prediction

o One tailed test more powerful than 2 tailed test

- Reduce overlap between 2 distributions

o Beta is reduced

o Difference between the means

o Size of std deviation.

- Look for strong, rather than subtle effects

- Improve sensitivity of the measure of dependent variable

- Planning sample sizes with power approaches – what sample size is required to achieve given power

- Different formula for factorial sample size
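A rough sketch of how the levers listed above move power, using a normal-approximation z test for the difference between two group means. The function names, the equal group sizes, the known common standard deviation and all example numbers are assumptions made for illustration; none of them come from the course notes.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def power_two_sample_z(mean_diff, sd, n_per_group, alpha=0.05, two_tailed=True):
    """Approximate power of a z test comparing two group means.

    Assumes equal group sizes, a known common sd, and a true difference
    in the predicted (positive) direction.
    """
    # Standard error of the difference between two independent means
    se = sd * sqrt(2.0 / n_per_group)
    shift = mean_diff / se  # how far the alternative sits from the null, in se units
    if two_tailed:
        z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
        # Chance of landing in either rejection region
        return (1 - STD_NORMAL.cdf(z_crit - shift)) + STD_NORMAL.cdf(-z_crit - shift)
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)
    return 1 - STD_NORMAL.cdf(z_crit - shift)


def n_for_power(mean_diff, sd, target_power=0.80, alpha=0.05):
    """Smallest per-group n reaching the target power (two-tailed test)."""
    n = 2
    while power_two_sample_z(mean_diff, sd, n, alpha) < target_power:
        n += 1
    return n


# Hypothetical scenario: true difference 5, sd 10, 50 subjects per group
base = power_two_sample_z(5, 10, 50)
bigger_alpha = power_two_sample_z(5, 10, 50, alpha=0.10)
one_tailed = power_two_sample_z(5, 10, 50, two_tailed=False)
bigger_effect = power_two_sample_z(8, 10, 50)
smaller_sd = power_two_sample_z(5, 6, 50)
needed_n = n_for_power(5, 10)

for label, value in [
    ("baseline (alpha 0.05)", base),
    ("alpha raised to 0.10", bigger_alpha),
    ("one tailed test", one_tailed),
    ("larger mean difference", bigger_effect),
    ("smaller std deviation", smaller_sd),
]:
    print(f"{label:<26}{value:.3f}")
print("n per group for 80% power:", needed_n)
```

Each change relative to the baseline raises power, and beta is simply 1 minus the printed value. The search in n_for_power is a brute-force stand-in for the closed-form sample size formulas covered in the statistics units.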

De Vaus: Chapter 6: Finding a Sample

Some Sampling Concepts

- Fundamental goal of research → generalize from sample to wider population

- Statistical generalization – use probability theory to estimate likelihood that patterns observed in sample will hold in population (relies on random samples)

- Replication – test generalisability – test in different situations

- Census

- Sample

- Sampling frame - list of population elements

- Representative sample – profile of sample same as population

- Weighting

- Sampling error

Types of Probability Samples

- Simple random sampling

o Complete sampling frame

o Requires good sampling frame, and population is geographically concentrated or data collection method does not require travelling

o Cost

- Systematic sampling

o Similar to simple random sampling

o Randomly choose start point, then choose every nth unit

o Problem – periodicity of sampling frame – eg, husband & wife

- Stratified sampling

o Select stratifying variable

o Divide sampling frame into separate lists

o Randomly choose sample from each list

- Multistage cluster sampling

o Final sample involves several different samples

§ Eg

· Divide city into areas (clusters)

· Select sample of areas

· Divide clusters into smaller units – sample

· For each smaller unit – list addresses – sample

· At each selected address – select individual

o Maximize number of initial clusters → increase chance of representativeness – but this increases sampling costs

o Use stratification techniques
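The first three designs above can be sketched in a few lines. This is an illustration only: the function names, the made-up sampling frame of 60 women and 40 men, and the 10% sampling fraction are all assumptions, and the last part reproduces the periodicity problem with a frame listed husband, wife, husband, wife.

```python
import random


def simple_random_sample(frame, n, rng):
    """Every unit in the frame has the same chance of selection."""
    return rng.sample(frame, n)


def systematic_sample(frame, n, rng):
    """Random start point, then every k-th unit, where k = len(frame) // n."""
    k = len(frame) // n
    start = rng.randrange(k)
    return [frame[start + i * k] for i in range(n)]


def stratified_sample(frame, stratum_of, fraction, rng):
    """Split the frame into one list per stratum, then sample each list."""
    strata = {}
    for unit in frame:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for units in strata.values():
        sample.extend(rng.sample(units, round(len(units) * fraction)))
    return sample


rng = random.Random(2011)  # fixed seed so the sketch is repeatable

# Hypothetical frame: 60 women and 40 men stored as (id, gender) pairs
frame = [(i, "F" if i < 60 else "M") for i in range(100)]

srs = simple_random_sample(frame, 10, rng)
systematic = systematic_sample(frame, 10, rng)
stratified = stratified_sample(frame, lambda unit: unit[1], 0.10, rng)
stratified_counts = {
    g: sum(1 for _, gender in stratified if gender == g) for g in ("F", "M")
}

# Periodicity problem: the frame alternates husband, wife, husband, wife, ...
couples = ["husband" if i % 2 == 0 else "wife" for i in range(100)]
periodic = systematic_sample(couples, 10, rng)

print("stratified counts:", stratified_counts)
print("roles picked from the periodic frame:", set(periodic))
```

With a sampling interval of 10 on the alternating frame, every selected unit has the same role, which is exactly the bias the periodicity warning is about. The stratified sample reproduces the frame's gender split by construction, whereas the simple random sample only does so on average.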

Internet Samples

- E-mail surveys

- Web page based surveys

o Pop ups

o Advertising on other sites

o E-mail invitations

o Panels / representative panels

http://statisticsandastudent.blogspot.com/2011/02/aapor-report-on-on-line-panels.html

- Internet samples and representativeness – 3 ways to use internet to gain representative general population samples

o Connect a random sample to internet

o Multimode methods of questionnaire administration

§ Complex

§ Mode effects

o Using quota internet samples

§ Make a sample that is representative in specific respects

o Use of internet samples

Sample sizes

- Degree of accuracy

- Extent of variation in population

- Relationship between sample size and accuracy

o Small samples → small increase in sample size can lead to substantial increase in accuracy

o Size of population from which we draw sample is largely irrelevant (unless the sample is a sizeable proportion of the population)

o Homogeneous population → smaller sample needed, ie 90% voting one way vs 50/50 split

§ But allow flexibility in survey purpose, and go for larger sample sizes

o Sample sizes of sub groups

§ Sample size and variation within each group should determine sample size of each group
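The link between sample size, variation and accuracy can be illustrated with the usual margin of error formula for a proportion. This is a sketch under stated assumptions (simple random sampling, 95% confidence, a population large enough to ignore the finite population correction); the function names and the 5 percentage point target are illustrative and not from de Vaus.

```python
from math import ceil, sqrt

Z_95 = 1.96  # z value for 95% confidence (assumed confidence level)


def margin_of_error(p, n, z=Z_95):
    """Half-width of the confidence interval for a proportion p with sample size n."""
    return z * sqrt(p * (1 - p) / n)


def required_sample_size(p, margin, z=Z_95):
    """Smallest n giving at most the requested margin of error."""
    return ceil(z ** 2 * p * (1 - p) / margin ** 2)


# Accuracy gained by adding 100 cases to a small sample vs a large one
gain_small = margin_of_error(0.5, 100) - margin_of_error(0.5, 200)
gain_large = margin_of_error(0.5, 1000) - margin_of_error(0.5, 1100)

# Sample needed for +/- 5 points: 50/50 split vs 90% voting one way
n_even_split = required_sample_size(0.5, 0.05)
n_lopsided = required_sample_size(0.9, 0.05)

print(f"gain from 100 to 200 cases:   {gain_small:.4f}")
print(f"gain from 1000 to 1100 cases: {gain_large:.4f}")
print("n for 50/50 split:", n_even_split)
print("n for 90/10 split:", n_lopsided)
```

Population size does not appear anywhere in the formula, and a population that is nearly unanimous needs far fewer cases than one that is evenly split for the same accuracy.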

Non Response

- Problems

o Unacceptable reduction of sample size

o Bias

o Techniques to reduce non response may not avoid bias problem

§ Often non responders are different in crucial respects from responders

· Use available information on non responders

· Some sampling frames can provide useful information

o Official records

· Compare with known characteristics of population – ie census balanced

http://statisticsandastudent.blogspot.com/2011/02/chapter-1-introduction-to-survey-errors.html

Weighting Samples

- Due to

o Non response

o Deliberate oversampling

o Inadequate sampling frames

- Purpose – adjust sample so that sample profile on key variables reflects that of population

- Statistically increasing / decreasing 'numbers' of cases with particular characteristics so proportion of cases in sample is adjusted to population proportion.

- How to weight a sample on single characteristic

o Example

§ Select variable – gender

§ Obtain population and sample percentages

§ Calculate weight for each category

· Ie male

o Sample 35%

o Population 50%

· Male weight = 50 / 35 = 1.43

- How to weight a sample on two / more characteristics
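A minimal sketch of the single-characteristic calculation above: weight = population percentage / sample percentage. The male figures are the ones in the example; the female figures (65% of sample, 50% of population) are assumed here as the complements, and the function name is made up for illustration.

```python
def category_weights(population_pct, sample_pct):
    """Weight for each category = population percentage / sample percentage."""
    return {cat: population_pct[cat] / sample_pct[cat] for cat in population_pct}


population = {"male": 50.0, "female": 50.0}
sample = {"male": 35.0, "female": 65.0}  # female share assumed as 100 - 35

weights = category_weights(population, sample)

# Applying the weights brings the sample profile back to the population profile
weighted_profile = {cat: sample[cat] * weights[cat] for cat in sample}

for cat in sample:
    print(f"{cat:<7} weight {weights[cat]:.2f}  weighted share {weighted_profile[cat]:.1f}%")
```

Under-represented categories get a weight above 1 and over-represented categories a weight below 1. For two or more characteristics the same ratio could be applied to each cell of the cross-classification (for example gender by age group), provided population percentages are known for every cell.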

Secondary Analysis

Non Probability Sampling

- Eg, to sample gays

- Purposive sampling – cases selected by researcher

- Quota sampling

- Availability samples

Sampling Checklist

GW Comments

- How does stratified sampling differ from blocking

- Cluster sampling – see article on fathers & unwed births.

http://statisticsandastudent.blogspot.com/2011/02/fragile-families-sample-design.html

- Get notes from Anova course on sample size

- Sources for: Homogeneous population → smaller sample needed, ie 90% voting one way vs 50/50 split

- Non response – see notes

http://statisticsandastudent.blogspot.com/2011/02/chapter-1-introduction-to-survey-errors.html

Saturday, February 26, 2011

HMS777 Study Guide: Topic 1: Introduction To Survey Research

When Do We Use Surveys

- Collecting information from / about subjects

- To compare / predict / describe / explain

What is Survey Research

- Form of data

o Variable by case data grid

§ Case → unit of analysis

- Method of analysis

o Describing groups of cases

o Comparison of cases

o Looking for causal relationships by comparing groups of cases

- Techniques of data collection

o No necessary connection between questionnaire and survey research

o Data collection techniques

§ Questionnaires

§ Structured interviews

§ In depth interviews

§ Observations

§ Content analysis

Steps in Survey Research

1. Setting specific, measurable objectives

a. Determines info to be collected

b. Define terms

c. Convert survey objectives into questions and hypotheses → statement to question

i. How do x & y compare

ii. Null hypothesis : no difference

iii. Research hypothesis : there is a difference

d. Where do survey objectives originate

i. Defined need

ii. Literature review

iii. Experts

iv. Focus groups

1. Trained leader conducts carefully planned discussion to obtain participants' opinions on defined areas of interest

v. Consensus panels

1. Skilled leader in highly structured environment

a. Eg – read docs and rate / rank

2. Selecting research designs

a. Descriptive / observational

i. Describe or compare

1. Time frame of interest

2. Geographical location

b. Explanatory

i. Causes or consequences

c. Type of research design

i. Classic experimental design

1. Experimental & control groups

2. Random allocation

3. Experimental studies can include surveys

ii. Panel design

1. No control group

iii. Quasi panel design

1. Similar to panel design, except two different groups studied at two points in time

2. Cannot fully match samples

iv. Retrospective panel design

1. Used when not feasible to follow group of people over time

2. Problem of memory

v. Retrospective experimental design

1. Two groups

2. Assessed at one time

3. Asked about behaviour at that time and prior time

vi. Cross sectional or correlational design

1. Two groups studied

vii. One group post test only design

1. One group assessed at one point in time.

2. Need to have reference point, otherwise cannot say anything causal

3. Choosing population and sample for study

4. Developing reliable and valid survey instruments

5. Administering the survey

6. Managing and analyzing survey data

7. Interpreting and reporting survey results

Ethics in Data Collection

- Voluntary participation

- Informed consent

o Purpose of research

o Description of likely benefits

o How respondents were selected

o Statement re voluntary participation / free to withdraw

o Anonymous and confidential

o Risks / discomfort

o How data / results will be used

o Identity of researcher and sponsor

- No harm

- Anonymity

- Confidentiality/ privacy

- Ethical responsibilities to colleagues / sponsors / public

Human Research Ethics at Swinburne University

http://www.research.swinburne.edu.au/researchers/ethics/human-research/

GW Comment

- How valid are the claims that surveys can be used to identify causal links?

o Refer to material on experimental design

o Surveys → suggest but not validate claims of causality

o Correlational links vs causal links

- Skill of researcher is in converting research objective into questions

- Focus group – example – focus group conducted by PWC on behalf of Swin University

- Note that survey can be used in experimental design to collect the information

- Match up designs with experimental design types

- Is retrospective experimental design really experimental? → subjects are not randomly assigned to groups

- How does cross sectional design differ from panel design

Chapter 15 : KNNL p642 : Introduction to Design of Experimental and Observational Studies

Experimental Studies

- Investigator exercises control over assignment of treatments to experimental units through process of randomization.

- Clinical trial – prospective intervention study

Observational Studies

- Randomization of treatments to experimental units does not occur

- Not possible to randomly assign levels of predictor variables to subjects

- Comparative observational studies

o Random samples obtained from 2 or more populations

o And observed outcomes are compared across populations

o Populations are defined by levels of one or more explanatory factors (observational factors)

o A cause and effect relationship between explanatory factors and the outcome / response is difficult to establish in observational study

o Usually, evidence external to observational study would be required to rule out possible alternative explanations for cause and effect.

Experimental Studies – Basic Concepts

- Set of explanatory factors included in the study

- Set of treatments included in study

- Set of experimental units included in the study

- Rules & procedures where treatments are randomly assigned to experimental units

- Outcome measurements

- Experimental unit → smallest unit of experimental material to which a treatment can be assigned → thus determined by method of randomization

- Randomization

o Constrained randomization – blocking

- Std experimental designs

o Completely randomized design

o Factorial experiments – crossed multifactor designs

o Randomised complete block designs

o Nested

o Repeated measures
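A small sketch contrasting a completely randomized design with constrained randomization (blocking). The unit labels, the two age blocks, the two treatments and the function names are hypothetical; the point is only that the investigator, not the subject, decides who receives which treatment, and that blocking randomizes separately inside each block.

```python
import random


def completely_randomized(units, treatments, rng):
    """Shuffle the units, then deal the treatments out in rotation."""
    shuffled = list(units)
    rng.shuffle(shuffled)
    return {unit: treatments[i % len(treatments)] for i, unit in enumerate(shuffled)}


def randomized_block(blocks, treatments, rng):
    """Constrained randomization: randomize separately within each block."""
    assignment = {}
    for units in blocks.values():
        assignment.update(completely_randomized(units, treatments, rng))
    return assignment


rng = random.Random(15)  # fixed seed so the sketch is repeatable
treatments = ["A", "B"]

# Hypothetical experimental units grouped by a blocking variable (age group)
blocks = {
    "younger": ["y1", "y2", "y3", "y4"],
    "older": ["o1", "o2", "o3", "o4"],
}
all_units = [unit for units in blocks.values() for unit in units]

crd = completely_randomized(all_units, treatments, rng)
rcbd = randomized_block(blocks, treatments, rng)

for name, units in blocks.items():
    counts = {t: sum(1 for u in units if rcbd[u] == t) for t in treatments}
    print(f"block {name}: {counts}")
```

In the completely randomized version the treatments are balanced overall but one age group could, by chance, receive mostly one treatment; the blocked version guarantees balance within every block. Stratified sampling uses the same grouping idea, but to decide who enters the sample rather than who receives which treatment.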

Design of Observational Studies

- Random assignment of factor levels to experimental units does not occur → therefore designed observational studies do not directly demonstrate cause and effect relationships between explanatory factors and the response.

- Can establish association

- To infer causality, potential confounding variables would need to be identified, and sub group analysis performed to rule out possible alternative causal factors.

- Cross sectional studies

o Measurements taken from one / more populations at single point in time

o Exposure to potential causal factor and response determined simultaneously

o Snapshot

- Prospective studies

o One / more groups formed in non-random manner according to levels of hypothesized causal factor, and those groups observed over time with respect to outcome variable of interest.

- Retrospective studies

- Matching

Friday, February 25, 2011

De Vaus: Chapter 5: Ethics and data collection

Research Participants

- Voluntary participation

- Informed consent

- No harm

- Confidentiality / anonymity

- Privacy

De Vaus: Chapter 3: Formulating and Clarifying Research Questions

Types of Research Questions

- Variables

o Dependent

o Independent

o Intervening

§ Education → job → income

- Descriptive research

o Five questions

§ Time frame

§ Geographical location

§ Broad description / comparing & specifying patterns for subgroups

§ Aspect of topic

§ How abstract is our interest

o Explanation: searching for causes or consequences

§ Why → what caused increased divorce rate?

§ Consequences

§ Literature search

§ Explanation – exploring a simple idea

· Has x led to increased divorce rate

§ Explanation – exploring more complex ideas

· Why should religious decline lead to divorce?

§ Specify links

Using Internet to Review Existing Information and Research

Scope of the Research

- Particular but exhaustive / general but partial

- Idiographic → case study

- Nomothetic → survey

- Units of analysis

Research Design

- Frame of reference

- Descriptive research

o Context – is 9% inflation high or low?

o Compare

§ Other groups

§ Over time

- Explanatory research

o Causal processes

o Structure of the data

o Central point of good research design is that it provides a context in which relatively unambiguous statements can be drawn.

o Classic experimental design

§ Experimental group / control group

§ Random assignment

o Panel design

§ Looks at same group of subjects over period of time

o Quasi panel design

§ Similar to panel design

§ Except different groups studied at two points in time

o Retrospective panel design

§ Obtain information at one point in time

§ Ask about two / more time points

o Retrospective experimental design

De Vaus: Chapter 1: The Nature of Surveys

What Is a Survey?

- Distinguishing feature:

o Form of data

o Method of analysis

- Form of data

o Rectangular set of data

o Variable by case grid

o Technique to generate data can vary

o Unit of analysis need not be people

- Method of Analysis

o Causes of phenomena

o Description

o Locate causes by comparing cases

o Compare with other methods

§ Case study

§ Experimental method

· Creates intervention

· Survey studies naturally occurring variation

- Quantitative and Qualitative Research

o Structured data / unstructured data

§ Systematic data

o Analyzing data → logic of analysis

§ Logic of survey analysis

· Variation in one variable matched with variations in other variables

· Co variation

· Not an inherently statistical concept

· Causal analysis

Thursday, February 24, 2011

AAPOR Report on On-Line Panels

Type of Online Panels

- Vast majority not constructed using probability based recruitment

- Offers to join

- River sampling

Total Survey Error

- Coverage error – major factor when goal is to represent general population

- Approx one third of the US adult population does not use the internet on a regular basis.

- Under coverage

- High level of non response at various stages of building a non-probability panel and delivering respondents to individual studies

- Major differences between surveys using non probability panels and traditional methods (usually phone) → difficult to determine whether mode of administration or sample bias is greater cause of differences.

- Studies suggest probability sampling still more accurate

Adjustments to Reduce Bias

- Standard demographic weighting

- Simple purposive sampling that uses known information about panel members to generate demographically balanced samples

- Standard quota sampling

- Propensity models in post stratification adjustments → augment standard demographic weighting with attitudinal or behavioural measures thought to be predictors of bias

Concerns about Panel Quality

- Industry response

o Guidelines & standards

o Validating panelists → remove duplicates etc

o Research to understand drivers of panelist behaviours / design techniques to reduce impact of those behaviours on survey results.

Background & Purpose of Report

- Types of panels

o Probability based methods

o Non probability approach / volunteer online panels

Overview

- First challenge for all survey modes → development of sample frame

- Online panels have become popular solution to sample frame problem.

- General population panel → serves as a frame from which samples are drawn to meet specific needs of particular studies.

- Census balanced samples

- Specialty panel

- Proprietary panel

- Targeted samples

Non Probability Volunteer Online Panels

- Lower cost

- Faster response

- Ability to build targeted samples of people who would be low incidence in a general population sample

- Five major areas of activity

o Recruitment of members

§ Motivations

· Contingent incentive

· Self expression

· Fun

· Social comparison

· Convenience

o Joining procedures and profiling

§ Double opt-in process

§ Validation procedures

o Specific study sampling

§ Simple random samples from panels are rare → tendency to be highly skewed to particular demographics

§ Purposive sampling

o Incentive programs

o Panel maintenance

Probability Based Recruitment

- Dutch Telepanel – 1986

River Sampling

- Intercept interviewing / real time sampling

- May be on rise as researchers seek larger / more diverse sample pools and less frequently surveyed respondents than provided by online panels

Errors of Non Observation in Online Panel Surveys

- Coverage

- Non response

- More severe with online panels than other types of surveys

- All sampling frames have features which can affect quality

o Under coverage

o Multiple mappings of frame to population

§ Eg, multiple people per home phone line

o Duplication

§ Eg, one person, multiple phone numbers

Online Panel Surveys, Frames, and Coverage Issues

- Volunteer panels do not attempt to build complete sampling frame

- Notion of sample frame is skipped

- Instead, focus is on recruitment and sampling steps.

- Common evaluative criterion of volunteer panel is not full coverage of household population, but sufficient diversity on attributes related to type of surveys supported by panel.

- Online panels can repeatedly sample

Unit Non Response and Non Response Error

- Failure to measure a unit in a sample

- Person selected for a sample does not respond to survey

- Vs item non response → respondent skips a question

- 4 stages in volunteer panels where non response can be an issue

o Recruitment

§ No way of knowing anything precise about size / nature of non response at this stage

§ If online panel members belonging to under represented groups are similar to group members who are not in panel, the risk of bias is diminished under an appropriate adjustment procedure

· Within group homogeneity may be a poor assumption.

o Joining and profiling

§ Just over 6% of those who click through a banner advert to panel registration page eventually become members.

o Specific study sampling

§ Reasons why member may not participate

· Lack of interest / survey length / heavy volume of survey invitations

· Failure to qualify / not completing within required time period

· Technical problems

§ Address differential non-response

· Select at a disproportionately higher rate from groups expected to have low response.

· Pre emptive differential non response adjustment

· No guarantee that non response error eliminated / reduced

o Panel maintenance

§ Forced or normal attrition

§ Forced turnover is not a form of non response

- Response metrics

o Recruitment is constant on-going endeavour

- Coverage errors versus non response bias

o Given absence of sampling frame in online panels, conceptual difference between coverage errors and non response errors gets blurred.

- Measurement error in online panels

o Understand how and why people think, feel and act.

o Measurement error is defined as difference between observed response and underlying true response

§ How concepts are measured

· The questions and the answers

· Questionnaire design effects

§ Mode of interview

§ Respondents

§ Interviewers

o Mode effects

§ Two hypotheses about possible impact of shifting from one mode to another

· Social desirability

· Satisficing

§ Primacy

· Tendency for respondents to select answers offered at beginning of list

§ Recency

· Select answers from last offered

§ Concurrent – predictive validity

o Respondent effects

§ Cognitive capabilities of panel members

§ Motivation to participate

§ Panel conditioning

· Taking too many surveys

§ Topic interest and experience

- Sample adjustments to reduce error and bias

o Purposive sampling

§ Non random selection technique

§ Goal – sample representative of defined target population

§ Anders Kiaer – 19th century

§ Quota sampling

§ Census balanced sample

o Model based methods

§ Small area estimation

§ Epidemiological studies

o Post survey adjustment

§ Weighting techniques

- Panel data cleaning
