Case Control Design - Sampling Controls at Follow-up
Lead Author(s): Jeff Martin, MD
Summary of Case-Control Sampling
- A random sample of the cohort baseline = case-cohort design
- At time each case is diagnosed = incidence density sampling
- From persons without disease at the end of follow-up = prevalent controls design
Definition of Prevalent Controls
Sampling only non-cases in a primary or secondary study base
This case-control design uses prevalent controls at follow-up.
Odds ratio approximates risk ratio only if disease occurrence is rare
This sampling is the classic instance of needing the rare disease assumption that many textbooks discuss
- because the OR will approximate the risk ratio only if the incidence is low, that is, the disease is rare.
Diagram Using Prevalent Controls at Follow-up
In the diagram below, one can see prevalent controls drawn from the non-diseased individuals at follow-up.
Problems with Using Prevalent Controls at Follow-up
This is the design that most neophytes are drawn to. It is the least desirable of the three types of control sampling, but it used to be the most common. That may no longer be the case as researchers become more sophisticated about case-control design.
Limitation of Study Controls - FIRST PROBLEM
Even if all the cases are captured, as in the schematic,
- the controls are drawn only from those present at the time the study is conducted.
So, unlike the case-cohort design and the case-control design with incidence density sampling, and because the cases are excluded,
- the control group can no longer represent the entire baseline population of the cohort.
Bias in Waiting until Follow-up - SECOND PROBLEM
One problem with this design is an obvious source of potential bias in waiting until the end of follow-up to select controls
- because factors that influence loss to follow-up will influence the selection of controls.
Furthermore, losses to follow-up and deaths also make this group of controls not very representative of the population that gave rise to the cases.
- Nor can it represent the person-time of the cohort, because controls are sampled at only one time point rather than throughout the study base experience.
If the factors that influence loss to follow-up are associated with both your predictor variable and your outcome, the measure of association will be biased.
Inability to Calculate the Risk Ratio - THIRD PROBLEM
Figure: Case Control Design: OR equals RR.
In a case-control design, the ratio of exposed to unexposed among the cases is known, and this is true of all case-control designs. BUT sampling only non-cases cannot give an unbiased estimate of the ratio of exposed to unexposed in the whole cohort. That ratio
- can only be estimated by a sample of everyone at the beginning of follow-up,
- not just those who remain non-cases at the end of follow-up.
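As a sketch of why this matters, the relationship can be written out. The symbols are assumptions chosen to match the ad/bc notation used in this section: a and c are the exposed and unexposed cases, b and d are the exposed and unexposed non-cases at the end of follow-up, and N1 and N0 are the exposed and unexposed cohort members at baseline. The sketch also assumes complete follow-up with no losses other than to the disease itself.

```latex
\begin{align*}
RR &= \frac{a/N_1}{c/N_0} = \frac{a/c}{N_1/N_0} \\
OR_{\text{prevalent}} &= \frac{ad}{bc} = \frac{a/c}{b/d},
  \qquad b = N_1 - a,\quad d = N_0 - c
\end{align*}
```

Under these assumptions the two quantities agree only when b/d is close to N1/N0, which requires that the cases remove few people from either exposure group.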
Example of Inability to Calculate the Risk Ratio
So using prevalent controls, you get:
- 60 non-cases in the exposed
- 90 non-cases in the unexposed
Example Showing Incorrect Odds Ratio
If you look at the odds calculation:
ad/bc = OR
In this example,
- (40 * 90) / (60 * 10) = 6.0
One quarter of the cohort has been diagnosed with disease during the cohort follow-up,
- leaving only 150 of the original 200 from which to select controls using the prevalent control case-control design.
Since the original cohort was divided 50/50 by exposure and the
- ratio of exposed to unexposed cases is 4 to 1,
- the remaining subjects without disease will have a ratio of 60/90 or 2/3 of exposed to unexposed.
In other words, the odds of exposure in the eligible controls will be 2/3 and the odds ratio will be 4 divided by 2/3 = 6.0.
These numbers use everyone in the cohort, whereas the case-control study will use only a sample of the 150 remaining without disease. But because controls are sampled independently of exposure status, the ratio of 2/3 also applies, in expectation, to any random sample of controls.
Thus the OR of 6.0 in this example is much larger than the risk ratio of 4.0 (40/100 divided by 10/100) and cannot be considered even an approximation of it.
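The arithmetic of this example can be checked with a short sketch. The function names are illustrative, not from the source, and the calculation assumes a cohort of 100 exposed and 100 unexposed with complete follow-up.

```python
def risk_ratio(exposed_cases, unexposed_cases, n_exposed, n_unexposed):
    """Risk ratio from full cohort data: (a / N1) / (c / N0)."""
    return (exposed_cases * n_unexposed) / (unexposed_cases * n_exposed)


def prevalent_control_or(exposed_cases, unexposed_cases, n_exposed, n_unexposed):
    """Odds ratio ad/bc when controls are the non-cases left at end of follow-up."""
    exposed_noncases = n_exposed - exposed_cases        # b
    unexposed_noncases = n_unexposed - unexposed_cases  # d
    return (exposed_cases * unexposed_noncases) / (exposed_noncases * unexposed_cases)


# Numbers from the example: 40 exposed cases, 10 unexposed cases,
# cohort split 100 / 100 by exposure (assumed from the 50/50 split of 200).
rr = risk_ratio(40, 10, 100, 100)
or_prevalent = prevalent_control_or(40, 10, 100, 100)

print(f"risk ratio = {rr:.1f}")
print(f"odds ratio with prevalent controls = {or_prevalent:.1f}")
```

The gap between the two values comes entirely from the non-case ratio of 60/90 standing in for the baseline ratio of 100/100.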
Prevalent Controls - Rare Disease Assumption - FOURTH PROBLEM
If controls are selected among those without disease at the time of the study (+/- prevalent cases), the OR approximates the risk ratio only with the
- rare disease assumption.
Case Control with Low Incidence - PROBLEM
If the disease only removes a few persons from the original cohort,
- the ratio of exposure in those remaining will stay close to the original ratio at baseline.
It follows that estimating N0/N1 by using prevalent controls
- becomes increasingly valid as the number removed by the disease gets smaller.
Example of Case Control with Low Incidence
In this example,
- (4 * 99) / (96 * 1) = 4.13
Assuming that the incidence of disease was 2.5% (5 out of 200 developed disease),
- the OR is only slightly higher than the risk ratio for the simple reason
- that the ratio of exposure in the remaining non-cases is close to 1.0,
- which is what it was in the whole cohort at baseline.
The somewhat arbitrary rule of thumb of incidence below 10% is sometimes given as what is meant by a rare disease.
If the incidence were 10% (16 exposed cases and 4 unexposed cases),
- OR = 4.57 (Do you think that this is a good approximation of 4.0?)
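To see how the approximation degrades as incidence rises, the three scenarios in this section can be compared side by side. This is a minimal sketch: the function and variable names are illustrative, and each scenario assumes 100 exposed and 100 unexposed at baseline, a true risk ratio of 4.0, and complete follow-up.

```python
def prevalent_control_or(exposed_cases, unexposed_cases, n_exposed, n_unexposed):
    """Odds ratio ad/bc when controls are the non-cases left at end of follow-up."""
    exposed_noncases = n_exposed - exposed_cases        # b
    unexposed_noncases = n_unexposed - unexposed_cases  # d
    return (exposed_cases * unexposed_noncases) / (exposed_noncases * unexposed_cases)


# Overall incidence -> (exposed cases, unexposed cases), always in a 4:1 ratio.
scenarios = {"2.5%": (4, 1), "10%": (16, 4), "25%": (40, 10)}

results = {}
for incidence, (a, c) in scenarios.items():
    results[incidence] = prevalent_control_or(a, c, 100, 100)
    print(f"incidence {incidence}: OR = {results[incidence]:.3f} (risk ratio = 4.0)")
```

The OR moves further from 4.0 at each step, which is the pattern the rare disease assumption is meant to guard against.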
CAVEAT: Sampling Non-cases May Introduce Bias - PROBLEM
Disease may remove few from the study base sampled for controls, but other sources of loss to follow-up can bias the control group.
- The rare disease assumption only looks at the effect of removing potential controls who are diagnosed with the outcome, the disease.
Losses to follow-up and deaths among potential controls from the study base giving rise to the cases affect who is available at one point in time.
- Looking at the study base that gave rise to those cases over time, some members of the study base population at time zero will not be in the population of non-cases sampled at the end of the period in which all the cases have been ascertained. Some will have left the study base or died, and these changes in the group of non-cases who are sampled can bias the estimate of exposure in the controls. Since no information is available on who left the study base with the prevalent controls design, the nature of this bias cannot be known. Thus, even though the rare disease assumption is met, the OR from this type of case-control sampling may give a biased estimate of the risk ratio.
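A small numerical sketch can illustrate the caveat. All numbers here are hypothetical and not from the source: a cohort of 10,000 exposed and 10,000 unexposed, 40 and 10 cases (so the disease is rare and the risk ratio is 4.0), all cases captured, and assumed loss fractions among the non-cases. In a real prevalent controls study these loss fractions would be unknown, which is exactly the problem.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio ad/bc: a, c = exposed/unexposed cases; b, d = exposed/unexposed controls."""
    return (a * d) / (b * c)


# Hypothetical cohort (assumed values for illustration only).
n_exposed = n_unexposed = 10_000
exposed_cases, unexposed_cases = 40, 10

# Non-cases available as controls if nobody is lost to follow-up.
b_full = n_exposed - exposed_cases      # 9960 exposed non-cases
d_full = n_unexposed - unexposed_cases  # 9990 unexposed non-cases
or_no_loss = odds_ratio(exposed_cases, b_full, unexposed_cases, d_full)

# Assumed differential loss: 30% of exposed and 10% of unexposed non-cases leave.
b_left = b_full * (1 - 0.30)
d_left = d_full * (1 - 0.10)
or_differential = odds_ratio(exposed_cases, b_left, unexposed_cases, d_left)

# Assumed equal loss: 20% of non-cases leave in both exposure groups.
or_equal_loss = odds_ratio(exposed_cases, b_full * 0.80, unexposed_cases, d_full * 0.80)

print(f"no loss:           OR = {or_no_loss:.3f}")
print(f"differential loss: OR = {or_differential:.3f}")
print(f"equal loss:        OR = {or_equal_loss:.3f}")
```

Under these assumptions the OR stays close to 4.0 when losses are absent or unrelated to exposure, but drifts well away from it when loss depends on exposure, even though the rare disease assumption holds throughout.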