# How do you determine the sample size to estimate the proportion?

Updated: 12/13/2022

There is no simple answer: two main factors need to be taken into account. Consider the simple case of a dichotomous (binary) variable.

The first consideration is the consequences of getting the proportion wrong. If you are estimating the proportion of males (and females) going to a cinema in order to provide the right number of toilets, a 5% risk of getting it wrong may be acceptable. You may have some disgruntled customers and, in any case, it may be possible to rebuild or re-designate some toilets. If, instead, you are estimating the proportion of people who have a serious adverse reaction to some medication, a 5% error rate is catastrophic, not just for the patient but for the pharmaceutical company as well.

This risk assessment determines the confidence level that you require from the estimate. Suppose now that, for the study under consideration, a 5% risk of getting it wrong is acceptable. That is, you want to be 95% confident that the true (but unknown) proportion is within 1.96 standard errors of your estimate.
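To make "within 1.96 standard errors" concrete, here is a minimal sketch using the usual normal-approximation standard error for a proportion, sqrt(p(1 - p)/n). The function name `wald_interval` and the example counts (40 out of 100) are made up for illustration, and the approximation is only reasonable when the sample is not too small and the proportion is not too close to 0 or 1.

```python
import math

def wald_interval(successes, n, z=1.96):
    """Approximate confidence interval for a proportion (z=1.96 gives 95%)."""
    p_hat = successes / n                      # sample proportion
    se = math.sqrt(p_hat * (1 - p_hat) / n)    # estimated standard error
    return p_hat - z * se, p_hat + z * se

# Hypothetical sample: 40 of 100 respondents have the characteristic.
low, high = wald_interval(40, 100)
print(f"{low:.3f} to {high:.3f}")  # 0.304 to 0.496
```

With 100 observations the interval is roughly ten percentage points either side of the estimate, which is why a larger sample is needed for a tighter answer.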

The second factor is the size of the proportion itself. If the true proportion is around 50%, then a sample size of just under 100 will suffice. However, if you are trying to estimate the proportion of a rare characteristic, one whose true incidence in the population is 0.5%, then for the same degree of confidence and the same relative precision you will need a sample of over 19,000.
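The two sample sizes quoted above can be reproduced with the standard formula n = z² p(1 - p) / E², under one assumption that the answer does not state: that the acceptable margin of error E is 20% of the true proportion (so ±10 percentage points at 50%, and ±0.1 percentage points at 0.5%). The sketch below is only an illustration under that assumption; `sample_size` and `rel_margin` are hypothetical names.

```python
import math

def sample_size(p, rel_margin=0.2, z=1.96):
    """Sample size needed to estimate proportion p to within rel_margin * p.

    ASSUMPTION: the margin of error is relative (20% of p by default);
    this is inferred from the figures in the text, not stated there.
    """
    abs_margin = rel_margin * p                # margin of error as a proportion
    return math.ceil(z**2 * p * (1 - p) / abs_margin**2)

for p in (0.5, 0.005):
    print(f"p = {p}: n = {sample_size(p)}")
# p = 0.5: n = 97
# p = 0.005: n = 19112
```

In practice the true proportion is unknown in advance, so a rough prior guess (or the conservative value p = 0.5 when an absolute margin is wanted) is plugged in to plan the study.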

