Abstract
Several papers have been published recently in the Journal addressing “best practices” for survey research manuscripts. This paper explores in more detail the effects of the target population size on sample size determination, probability sampling versus census approaches, and response rates and the relationship to potential nonresponse bias. Survey research is a complex methodology requiring expertise in the planning, execution, and analytic stages.
INTRODUCTION
In 2008, editors of the Journal explored the quality of the survey research methods used in papers submitted to AJPE, and subsequently published 2 papers addressing the issue and raising the bar for future research.1–2 The purpose of this paper is to clarify and expand on the specific considerations of target population size, sampling procedures, and response rate. One of the central questions that arose in response to Associate Editor Jack E. Fincham's Viewpoint was why an 80% response rate is required when one wants to generalize results to all colleges/schools of pharmacy, but only a 50%-60% response rate is required for other types of inquiries.2 Answering this question requires consideration of whether a census is required or whether a form of probability (random) sampling is sufficient. Probability sampling occurs when, “…every member of the target population has a known, non-zero probability of being included in the sample. Probability sampling implies the use of random selection.”3(p68) Answering this important question more completely also requires addressing how to minimize the potential sources of error that occur with sampling in survey research.4 This paper does not address non-probability sampling methods such as convenience, judgment, and quota, which introduce additional limitations and preclude generalizing results.5
Coverage, sampling, and nonresponse are sources of error common to all survey research and are a function of sampling procedures. Coverage error occurs when the list (sampling frame) does not include all elements of the target population to be studied. For example, in surveying community pharmacists, if the state pharmacy association membership roster were used rather than the State Board of Pharmacy roster, there would be potential coverage error because only community pharmacists who were members of the state pharmacy association would have had the opportunity to be selected for participation. Sampling error is the discrepancy due to random sampling between the true value of the population parameter and the sample estimate (statistic) of that parameter. The larger the sample, the smaller the expected sampling error.
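The inverse relationship between sample size and sampling error can be illustrated with a short calculation. The sketch below is not taken from the cited sources; it assumes simple random sampling, a proportion as the statistic of interest, the usual normal approximation at 95% confidence, and an optional finite population correction. The function name and defaults are illustrative only.

```python
import math


def margin_of_error(sample_size, proportion=0.5, z=1.96, population_size=None):
    """Approximate margin of error for a proportion under simple random sampling.

    Assumptions (not from the paper): normal approximation, z = 1.96 for 95%
    confidence, and proportion = 0.5 as the most conservative case.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    standard_error = math.sqrt(proportion * (1 - proportion) / sample_size)
    if population_size is not None:
        if population_size <= 1 or sample_size > population_size:
            raise ValueError("population_size must exceed 1 and be >= sample_size")
        # Finite population correction: the error shrinks to zero for a census.
        standard_error *= math.sqrt(
            (population_size - sample_size) / (population_size - 1)
        )
    return z * standard_error


for n in (50, 100, 400):
    print(f"n = {n}: margin of error = {margin_of_error(n):.3f}")
```

Under these assumptions, quadrupling the sample size halves the margin of error, and the margin reaches zero when every member of a finite population is included.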
Coverage, sampling, and nonresponse error are all considerations that must be addressed before any data are collected. The fourth type of error in survey research is measurement error, which occurs during data collection.
Response Rate
Nonresponse error, or bias, occurs if data are not collected from each member of the sample. Nonresponse bias theoretically can occur with anything less than a 100% response rate. A response rate of 50%-60% or greater is optimal because nonresponse bias is thought to be minimal at response rates that high.2 If the response rate is too low, those who responded have a greater chance of being self-selected (ie, there is something inherently different between those who responded and those who did not), and thus not representative of the target population. Nonresponse bias can lead to inaccurate conclusions if data from the non-respondents would have changed the overall results of the survey. Non-respondents can be contacted directly to obtain demographic information and compare it with that of those who did respond. Some researchers suggest that late responders are more similar to non-responders than early responders are, and thus can be used as a proxy. The non-responders may simply have chosen not to participate; however, they also could have refused to participate because of the subject of the research. That is a different consideration, which could affect the interpretation of results. Also “…the effect of nonresponse on one variable can be very different than for others in the same survey.”6(p54) Nonresponse bias can seriously compromise the validity of results. For example, if 50% of respondents answered a specific item in a particular way, the “true” percentage in the full sample could actually range from 45%-55% if the overall response rate was 90%, but from 5%-95% if the overall response rate was only 10%.6
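The ranges in the preceding example follow from simple arithmetic: the lower bound assumes that no non-respondent would have answered in the particular way, and the upper bound assumes that every non-respondent would have. A minimal sketch of that worst-case calculation is given below; the function name is hypothetical and the code is offered only as an illustration of the reasoning, not as a method from the cited source.

```python
def true_proportion_bounds(observed_proportion, response_rate):
    """Worst-case range for the full-sample proportion when some subjects do not respond.

    observed_proportion: share of respondents who answered in a particular way.
    response_rate: share of the sample that responded (0 < rate <= 1).
    """
    if not 0.0 <= observed_proportion <= 1.0:
        raise ValueError("observed_proportion must be between 0 and 1")
    if not 0.0 < response_rate <= 1.0:
        raise ValueError("response_rate must be in (0, 1]")
    # Share of the whole sample known to have answered in the particular way.
    known_share = observed_proportion * response_rate
    # Share of the whole sample whose answers are unknown.
    unknown_share = 1.0 - response_rate
    lower = known_share                  # no non-respondent would have agreed
    upper = known_share + unknown_share  # every non-respondent would have agreed
    return lower, upper


for rate in (0.90, 0.10):
    low, high = true_proportion_bounds(0.50, rate)
    print(f"response rate {rate:.0%}: true value between {low:.0%} and {high:.0%}")
```

These bounds are deliberately extreme; they describe what cannot be ruled out from the responses alone, not what is likely.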
Target Population Size
A census involves collecting the desired information under study from every member of a population.4,6 This is extremely difficult, if not impossible, with large populations, but it is required in extremely small populations. As Salant and Dillman stated, “Occasionally, however, a census is the only way to get accurate information, especially when the population is so small that sampling part of it will not provide accurate estimates of the whole.”4(p6) In pharmacy education, there are cases in which it is necessary to strive for a census and others in which sampling is appropriate. One example of the former is the AACP Faculty Salary Survey, which is incorporated in the Profile of Pharmacy Faculty.7 The 2008–2009 AACP Profile of Pharmacy Faculty achieved a 97% response rate for salary data, creating a high level of confidence in those data. If the response rate had been 20%, 30%, 50%, or even 60%, what level of confidence could users of those data have that the salaries reported were representative and/or generalizable?
The abbreviated table by Krejcie and Morgan, as shown in Table 1, demonstrates the necessity for high response rates in small populations.8 Consider a survey in which each respondent represents 1 school, such as a survey of experiential directors who report their college/school's policy on criminal background checks prior to advanced pharmacy practice experiences. There are 102 AACP regular institutional members. If this is the target population (ie, those you wish to generalize to), the required sample size to represent the population is 80 colleges/schools, which is approximately an 80% response rate. If the number of regular institutional members increased to 130, the number required to be representative would be 97, or a 75% response rate. The required sample size is important in establishing confidence in generalizing results to the entire population. Taking the experiential director survey example further, if the experiential directors at all 102 colleges/schools were surveyed and responses were received from only 42, reporting a variety of background check procedures, it would be inappropriate to generalize those results to all 102 colleges/schools. If 25% of those 42 performed procedure X, 15% performed procedure Y, and 45% performed procedure Z, it would be incorrect to conclude that 45% of the colleges/schools that were institutional members of AACP performed procedure Z, because the 60 colleges/schools that did not respond may all have performed procedure Y. Such a conclusion would be an inaccurate generalization to the target population of 102 institutions.
Table 1. Sample Required from a Given Population to be Representative
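The tabled values can be approximated with the formula generally attributed to Krejcie and Morgan, s = X²NP(1 − P) / [d²(N − 1) + X²P(1 − P)], where N is the population size, X² is the chi-square value for 1 degree of freedom at the 95% confidence level (3.841), P is the assumed population proportion (0.5), and d is the desired margin of error (0.05). The sketch below is an illustration under those assumptions; the function name and the rounding to the nearest whole number are choices made here, not details confirmed by the paper.

```python
def required_sample_size(population_size, chi_square=3.841, proportion=0.5, margin=0.05):
    """Approximate sample size needed to represent a finite population.

    Defaults assume 95% confidence (chi-square = 3.841, 1 degree of freedom),
    maximum variability (proportion = 0.5), and a 5% margin of error.
    """
    if population_size < 1:
        raise ValueError("population_size must be at least 1")
    variability = chi_square * proportion * (1 - proportion)
    numerator = variability * population_size
    denominator = margin ** 2 * (population_size - 1) + variability
    # Rounding to the nearest whole respondent is an assumption of this sketch.
    return round(numerator / denominator)


for size in (100, 130, 1000):
    needed = required_sample_size(size)
    print(f"N = {size}: sample of {needed} ({needed / size:.0%} of the population)")
```

The required fraction falls quickly as the population grows, which is why small populations such as colleges/schools of pharmacy call for response rates near 80%. For population sizes that fall between rows of the table, the formula can differ slightly from the nearest tabled value; for N = 102 it evaluates to about 80.8, whereas the figure of 80 quoted above matches the tabled value for a population of 100.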
Probability Sampling and Response Rate
Having adequate numbers of subjects is one consideration, but the method of obtaining the sample (ie, random sampling) is even more critical. As stated by Dillman, “There is nothing to be gained by surveying all 1000 members of a population in a way that produces only 350 responses (a 35% response rate) versus surveying a sample of only 500 in a way that produces the same number of responses (a 70% response rate). The possibility of non-respondents being different from respondents is likely to be greater when the response rate is lower.”9(p209) In this example, 350 people out of 1000 would not be representative of those 1000 if they were not randomly selected, and therefore the survey would not provide as much confidence in generalizing findings to the original population. The remaining 650 people must have had the same non-zero, independent chance of being selected, which is the defining feature of simple random sampling. The lower the response rate, the greater the probability that those who responded are self-selected rather than randomly selected, since it is not always possible to determine why other subjects did not respond. Another issue is the potential for nonresponse bias, the probability of which decreases as the response rate increases. Returning to the example, if the 500 people in the sample were randomly selected and thus representative, a higher response rate reduces nonresponse bias and increases the ability to generalize findings to the original target population of 1000.
Another example is a college/school administering the AACP Curriculum Quality Perception Alumni Survey. Drawing a sample and planning follow-up strategies, rather than sending survey instruments to all graduates from the past 5 years, is the preferred strategy because alumni surveys have historically had low and variable response rates.10–11 Working with a smaller but representative sample allows for more follow-up.
Kerlinger and Lee described the mail questionnaire as having serious drawbacks especially in the case of low response rates, making the “mail questionnaire worse than useless, except in highly sophisticated hands.”12(p603) This is even more crucial in light of Web-based and e-mail data collection approaches.
Depending on the population that the researcher wants to generalize to, either simple or stratified random sampling can be used to obtain a random sample. Stratified random sampling uses meaningful known characteristics about a population to guide sampling. For example, if a researcher wants to survey faculty members at US colleges and schools of pharmacy, faculty members could be stratified by rank, discipline, and/or public/private institution.
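Stratified random sampling can be sketched in a few lines. The example below assumes proportional allocation (the same sampling fraction in every stratum) and uses an entirely hypothetical roster of 100 faculty members stratified by rank; the function name, roster, and sampling fraction are illustrative assumptions, not data from this paper.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(units, stratum_of, fraction, seed=None):
    """Draw a simple random sample of the same fraction from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for key in sorted(strata):
        members = strata[key]
        # Proportional allocation, keeping at least 1 unit per stratum.
        count = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, count))  # sampling without replacement
    return sample


# Hypothetical roster: (faculty id, rank). These counts are invented for illustration.
roster = (
    [(i, "assistant") for i in range(60)]
    + [(i, "associate") for i in range(60, 90)]
    + [(i, "full") for i in range(90, 100)]
)

selected = stratified_sample(roster, stratum_of=lambda member: member[1], fraction=0.2, seed=1)
print(Counter(rank for _, rank in selected))
```

Because each stratum contributes in proportion to its size, the sample mirrors the population on the stratifying characteristic, while selection within each stratum remains random.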
CONCLUSION
Survey research continues to be complex and requires consideration of whether research questions should be addressed using a census approach or whether a form of probability sampling is sufficient. The possible sources of error in survey research, the response rate, and the potential for nonresponse bias all require critical consideration because they ultimately affect the validity of the results. The population of US colleges and schools of pharmacy is relatively small; therefore, an 80% response rate is required for survey results to be representative.
Footnotes
The ideas expressed in this manuscript are those of the authors and do not represent the position of the American Association of Colleges of Pharmacy.
- Received February 13, 2009.
- Accepted May 16, 2009.
- © 2009 American Journal of Pharmaceutical Education