Case-Control Design - Sampling at Baseline
Special type of case-control study: case-cohort design
Lead Editor(s): Jeff Martin, MD
Summary of Case-Control Sampling
- A random sample of the cohort at baseline = case-cohort design
- At time each case is diagnosed = incidence density sampling
- From persons without disease at the end of follow-up = prevalent controls design
Definition of Sampling for Case Cohort Design
One way to sample controls within a cohort is to take a random sample of a cohort at baseline.
- This is called a Case-Cohort Design.
It is relatively new and has not been frequently used. It was first described by the statistician Ross Prentice in 1986.
- It seems odd at first to realize that you will likely be sampling future cases as well as controls when you take a random sample of a cohort at its baseline.
- This means that a subject may be included both as a case and a control.
- But this is also true of incidence density sampling since a subject selected as a control at one time point may later become a case.
Diagram of Random Sampling at Baseline
As you can see in the diagram below, controls are chosen at baseline from the hypothetical cohort.
Diagram: Case-cohort design. The control group is sampled randomly from the cohort at baseline.
The case-cohort design is distinguished by taking a random sample of the study cohort at baseline.
- Since everyone is eligible to be in this sample and it is taken independently of exposure, the sample gives an unbiased estimate of the ratio of the number of persons exposed to the number of persons not exposed.
All of the cases are included in the study (although a random sample of the cases could also be used).
- So a case-cohort design yields unbiased estimates of both:
- The ratio of exposed to unexposed cases in the original cohort, and
- The ratio of exposed to unexposed persons in the original cohort.
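A small simulation can make this concrete. The sketch below is illustrative only: the cohort size, exposure prevalence, risks, and sample size are made-up values, and `simulate_cohort` and `exposure_ratio` are hypothetical helper names, not part of any library.

```python
import random

def simulate_cohort(n, p_exposed, risk_unexposed, risk_ratio, rng):
    """Return (exposed, becomes_case) pairs for a hypothetical cohort."""
    cohort = []
    for _ in range(n):
        exposed = rng.random() < p_exposed
        # Assumed risks: exposed persons have risk_ratio times the baseline risk
        risk = risk_unexposed * (risk_ratio if exposed else 1.0)
        cohort.append((exposed, rng.random() < risk))
    return cohort

def exposure_ratio(people):
    """Number exposed divided by number unexposed."""
    n_exposed = sum(1 for exposed, _ in people if exposed)
    return n_exposed / (len(people) - n_exposed)

rng = random.Random(2024)  # fixed seed so the run is repeatable
cohort = simulate_cohort(20000, 0.3, 0.05, 2.0, rng)

# Baseline sample: drawn without regard to exposure or future case status
subcohort = rng.sample(cohort, 5000)
cases = [person for person in cohort if person[1]]

full_ratio = exposure_ratio(cohort)
sample_ratio = exposure_ratio(subcohort)
case_ratio = exposure_ratio(cases)
future_cases_in_sample = sum(1 for _, is_case in subcohort if is_case)

print("exposed:unexposed in whole cohort    ", round(full_ratio, 3))
print("exposed:unexposed in baseline sample ", round(sample_ratio, 3))
print("exposed:unexposed among cases        ", round(case_ratio, 3))
print("future cases inside baseline sample  ", future_cases_in_sample)
```

The baseline sample's ratio should land close to the whole-cohort ratio, and the sample will contain some persons who later become cases, which is expected in this design.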
Sampling Controls at Baseline
Most people think that the best way to sample controls is to wait until the end of follow-up so that the investigator can be sure they will not be cases.
Becoming a case is an artifact of the follow-up period of the cohort.
- The investigator cannot know whether many of the controls will be diagnosed with the study outcome the day after the study ends.
- This is made even clearer by the example of the cohort study that uses death as an outcome. Everyone is eventually a case.
When we are looking for (i.e., sampling) controls, we do not necessarily have to guarantee that these subjects will never become cases.
All that is needed is to be sure that they are not cases at the time of control sampling. In the case-cohort design, the control group is a random sample of the cohort at baseline.
Case-Based Estimate of Exposed and Unexposed
The case-cohort design begins with the cases, so like other types of case-control sampling it is case-based.
- It differs from the other approaches to sampling controls by taking a sample of the cohort, or the study base, at time zero.
This random sample of the baseline population will estimate the proportion exposed and unexposed in the entire study base.
- With that ratio, the risk ratio for the study base experience can be estimated without bias.
Advantages of Case-Cohort Design
- The same control group can be used for more than one outcome, or for additional follow-up at a time later than the current study. (Examining time-varying exposures can be a problem if only baseline data are available on the controls.)
- Data from a population-based survey (questionnaires, perhaps stored biological samples) can be used when that population can be tracked for the incidence of a disease outcome. An example is a large population-based survey from the Netherlands that looked at the relationship between bladder cancer and vitamins.
The case-cohort design becomes a good alternative to the other types of case-control sampling if:
- The study has a primary study base which has baseline information or biological specimens on the persons who would be eligible for sampling.
How OR = Risk Ratio in a Case-Cohort Design
The graphic below shows the notational framework of the 2x2 table to illustrate the properties of the odds ratio.
The table contains:
- The exposed and unexposed cases from the cohort (cells a and c)
- An estimate of the proportions exposed and unexposed in the whole cohort, from the controls (cells b and d)
- EXPOSURE ODDS IN CASES = a/c
- EXPOSURE ODDS IN CONTROLS = b/d
The critical point is that the controls in the case-cohort design estimate the proportions exposed and unexposed in the whole cohort,
- so cells b and d estimate N1 and N0 (the numbers exposed and unexposed in the whole cohort) up to the sampling fraction, and b/d estimates N1/N0.
When the odds ratio of exposure in the cases and controls is formed and the notation from a cohort is substituted for the cells in the 2x2 table, the resulting ratio is the same as the risk ratio.
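The substitution can be written out as a short derivation. This is a sketch that assumes the usual notation: a and c are the exposed and unexposed cases, b and d are the exposed and unexposed controls, and N1 and N0 are the numbers of exposed and unexposed persons in the cohort at baseline. The sampling fraction f is a symbol introduced here for illustration only.

```latex
\[
b \approx f N_1, \qquad d \approx f N_0
\quad\Longrightarrow\quad
\frac{b}{d} \approx \frac{N_1}{N_0}
\]
\[
\mathrm{OR} = \frac{a/c}{b/d}
\approx \frac{a/c}{N_1/N_0}
= \frac{a/N_1}{c/N_0}
= \text{risk ratio}
\]
```

The sampling fraction cancels in b/d, so it does not need to be known to form the ratio.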
The odds ratio is easily formed simply by taking:
- The ratio of the counts in the cells from the case column (a/c), and
- The ratio of the counts in the cells from the control column (b/d).
- The ratio of the two ratios is the odds ratio.
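The arithmetic can be checked with a small worked example. The counts below are hypothetical and chosen so that the control sample matches the cohort's exposure split exactly; in a real sample b/d would only approximate N1/N0. The function names are illustrative, not from any library.

```python
import math

def odds_ratio(a, b, c, d):
    """Exposure odds in cases (a/c) divided by exposure odds in controls (b/d)."""
    return (a / c) / (b / d)

def risk_ratio(a, c, n1, n0):
    """Risk in the exposed (a/n1) divided by risk in the unexposed (c/n0)."""
    return (a / n1) / (c / n0)

# Hypothetical cohort: 3000 exposed and 7000 unexposed persons at baseline
n1, n0 = 3000, 7000
# Hypothetical cases arising during follow-up
a, c = 300, 350
# Hypothetical 10% baseline sample with the same exposure split as the cohort
b, d = 300, 700

rr = risk_ratio(a, c, n1, n0)
or_estimate = odds_ratio(a, b, c, d)

print("risk ratio from full cohort:", rr)
print("odds ratio from case-cohort table:", or_estimate)
print("equal up to rounding:", math.isclose(rr, or_estimate))
```

With these counts the two quantities agree up to floating-point rounding, because b/d equals N1/N0 exactly.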