… a study with a sample that is too small will be unable to detect clinically important effects. Such a study may thus be scientifically useless, and hence unethical in its use of subjects and other resources.
He also stated that “Power of 80-90% is recommended”. Challenges to this idea asserted that it had “been rendered untenable by the rising acceptance of amalgamated evidence from many studies”, while also contradicting the “scientifically useless” claim above by asserting that “imprecise results are better than no results at all” (Edwards et al. 1997). An important caveat was that the results of small trials must be made available to future researchers. Halpern et al. (2002) sought to rebut these challenges, restating the original argument in somewhat more detail. They asserted that having too small a sample size “shifts the risk-benefit calculus that helps justify research in an unfavorable direction” and that “the marginal value of narrowing confidence intervals to widths still compatible with both positive and negative results generally is insufficient to justify exposing individuals to the common risks and burdens of research”. They concluded that in order for trials to be ethical one of two conditions must be met:
either enough patients will be enrolled to obtain at least 80% power to detect a clinically important effect or, if this is not possible, the researchers will be able to document a clear and practical plan to integrate the results of their trial with those of future trials.
Bacchetti et al. (2005) performed a detailed analysis of the quantitative claim, made most explicitly by Halpern et al. (2002), that studies with too small a sample size do not have enough value to justify the burdens imposed on participants. They found that statistical power and other measures of a study’s projected value all exhibit diminishing marginal returns as a function of sample size. Increasing the sample size therefore can only worsen the ratio of projected value to total participant burden, because total burden increases linearly with sample size rather than in diminishing increments. They therefore asserted:
Even assuming the controversial premise that a study’s projected value is determined only by its power, with no value from estimates, confidence intervals, or potential meta-analyses, the balance between a study’s value and the burdens accepted by its participants does not improve as the sample size increases. Thus, the argument for ethical condemnation of small studies fails even on its own terms.
Subsequent work made a detailed case for diminishing marginal returns for many other measures of projected study value that have been proposed in the statistical literature for use in sample size planning (Bacchetti et al. 2008) and provided a less technical explanation (Bacchetti 2010) of the threshold myth, arguing that it underlies the argument for ethical condemnation of studies that are “underpowered” and other misconceptions about sample size planning. In an article focused mainly on ethical issues in the analysis rather than the planning of studies, Gelfond et al. (2011) wrote, “Underpowered studies are not likely to yield results with practical translational value; they put subjects at unnecessary risk and waste resources.” They did not acknowledge any controversy or reference any previous work on ethics and sample size. Bacchetti et al. (2012) wrote a letter citing previous work and summarizing the argument that the value-to-risk ratio can only worsen as sample size increases. Gelfond et al. (2012) replied that “several other authors have significant critiques to their [Bacchetti et al. 2005] formulation of ethicality that go beyond mere ‘misconceptions.’”
Below are examinations of those critiques. These are followed by examination of other arguments for why having “too small” a sample size does or does not make a study unethical. Following that is discussion of a proposal from Gelfond et al. (2012) to define the term “underpowered” in terms of optimality and efficiency rather than just sample size.
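The diminishing-returns argument attributed above to Bacchetti et al. (2005) can be illustrated with a small numerical sketch. It is not their analysis, only a simplified instance under stated assumptions: a two-sided, two-sample z-test with known variance, a standardized effect size of 0.5, a significance level of 0.05, equal group sizes, and a burden that is the same for every participant. The function names power_two_sample and power_per_participant are hypothetical and chosen only for this illustration. Following the "controversial premise" in the quotation, projected value is taken to be power alone.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def power_two_sample(n_per_group, effect_size, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    Assumes known, equal variances and equal group sizes; effect_size is
    the difference in means divided by the common standard deviation.
    """
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Expected value of the z statistic under the alternative hypothesis.
    shift = effect_size * (n_per_group / 2) ** 0.5
    # Probability of rejecting in either tail.
    return STD_NORMAL.cdf(shift - z_crit) + STD_NORMAL.cdf(-shift - z_crit)


def power_per_participant(n_per_group, effect_size, alpha=0.05):
    """Power divided by total enrollment (assumes constant burden per person)."""
    return power_two_sample(n_per_group, effect_size, alpha) / (2 * n_per_group)


# Tabulate power and power per participant as enrollment doubles.
for n in (10, 20, 40, 80, 160):
    p = power_two_sample(n, 0.5)
    ratio = power_per_participant(n, 0.5)
    print(f"n per group = {n:3d}  power = {p:.3f}  power per participant = {ratio:.5f}")
```

Under these assumptions, power rises with each doubling of enrollment while power per participant falls, which is the pattern the diminishing-returns argument relies on. The sketch says nothing about the counterarguments that assign value to each participant individually, which rest on different premises about how value accrues.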
the value to a participant from his or her altruistic contribution to a definitive study of an important clinical or public health question is relatively independent of the number of trial participants. More generally, as a function of sample size, one might expect the projected value per participant to start low since there is modest benefit from a trial (in isolation) that is insufficient to affect medical or public health practice, then to be relatively constant over a range of sample sizes that have potential clinical impact, and eventually to decline beyond sample sizes where the research question will have been reliably answered … .
because people commonly participate in research for altruistic reasons, and because additional participants increase the probability that a social benefit is obtained, each participant’s expected individual benefit increases with larger sample sizes. If a new treatment is proven effective, each participant’s altruistic motives are rewarded in full; if it is not, and the study was underpowered, then none are rewarded at all. On the other hand, if an adequately powered trial determines that a clinically important benefit is unlikely (recognizing the impossibility of “proving” the null hypothesis), then altruistic motives are still rewarded. Assuming, as Bacchetti et al. (1) do, that the average burden per participant is constant across all possible sample sizes results in an improved risk-benefit ratio for individual research participants as the sample size increases.
power is determined by the complete study design that includes many factors other than sample size, and one could define underpowered designs as having less power than the optimal feasible design, where the optimal design is determined by some efficiency criterion. Given this definition of underpowered, we could revise our statement in the article (edits in italics) to ‘Underpowered studies are less likely to yield results with practical translational value; they may both put subjects at unnecessary risk and waste resources.’
Bacchetti, P. (2010). Current sample size conventions: Flaws, harms, and alternatives. BMC Medicine 8, 17.
Bacchetti, P., McCulloch, C. E., and Segal, M. R. (2008). Simple, defensible sample sizes based on cost efficiency (with discussion and rejoinder). Biometrics 64, 577-594.
Bacchetti, P., McCulloch, C. E., and Segal, M. R. (2012). Being ‘underpowered’ does not make a study unethical. Statistics in Medicine 31, 4138-4139.
Bacchetti, P., Wolf, L. E., Segal, M. R., and McCulloch, C. E. (2005). Ethics and sample size. American Journal of Epidemiology 161, 105-110.
Edwards, S. J. L., Lilford, R. J., Braunholtz, D., and Jackson, J. (1997). Why “underpowered” trials are not necessarily unethical. Lancet 350, 804-807.
Gelfond, J. A. L., Heitman, E., Pollock, B. H., and Klugman, C. M. (2011). Principles for the ethical analysis of clinical and translational research. Statistics in Medicine 30, 2785-2792.
Gelfond, J. A. L., Heitman, E., Pollock, B. H., and Klugman, C. M. (2012). Power, ethics, and obligation Authors' Reply. Statistics in Medicine 31, 4140-4141.
Halpern, S. D., Karlawish, J. H. T., and Berlin, J. A. (2002). The continuing unethical conduct of underpowered clinical trials. Journal of the American Medical Association 288, 358-362.
Halpern, S. D., Karlawish, J. H. T., and Berlin, J. A. (2005). Re: “Ethics and sample size”. American Journal of Epidemiology 162, 195-196.
Horrobin, D. F. (2003). Are large clinical trials in rapidly lethal diseases usually unethical? Lancet 361, 695-697.
Newell, D. J. (1978). Type II errors and ethics. British Medical Journal 2, 1789.
Prentice, R. L. (2005). Ethics and sample size—Another view. American Journal of Epidemiology 161, 111-112.