Statistics LibreTexts

20.4: Conducting surveys

    under construction, missing citations

    Introduction

    What is a survey? A survey is a method of collecting information from a sample of a reference population. Surveys are used in many fields, including biomedical work (see Chapter 5.4). We can design a cross-sectional study, which gathers information at a single point in time, or a longitudinal study, in which information is gathered over a period of time. If the purpose of the survey is to determine association between two (or more) variables, then either a cross-sectional or a longitudinal approach will do. However, if the purpose is to test cause-effect hypotheses, then a longitudinal (e.g., prospective cohort) approach would be better.

    Survey basics

    Steps to conduct a survey include

    1. Identify and clarify the purpose of the survey. If the purpose is to find out how common something is, then the survey is descriptive. If we are interested in why something has occurred, then the survey is analytic.
    2. Define the reference population. It is essential that you know which group the survey applies to. For example, if you wish to study the opinions of undergraduates at your university, then postgraduate students cannot be included in the sample because they are not part of the reference population.
    3. Design the sampling method and determine the sample size. Sampling should yield an unbiased sample that is representative of the reference population. If the size of the population is known, then a target of 10% might be the relevant sample size, and a procedure should be used to obtain a simple random sample. A measure of the success of a survey sample design is the response rate, defined as the number of surveys returned divided by the number of surveys distributed.
    4. What information is needed? Care needs to be taken to make sure that the questions asked actually yield the desired information. For example, if the questionnaire is long, some respondents may skim or skip questions. If too much information is requested, this may lower the response rate.
    5. How will the information be collected? Types and format of questions (closed or open ended). A phone survey? Written survey dropped off as a mailer? Interview?
    6. In thinking about the data to be collected from the questions, you also need some scoring system. Will a dichotomous response (e.g., True/False, or Yes/No) be adequate, or would a Likert-like scale be more appropriate? When scaling responses, one needs to also be concerned with floor and ceiling effects.
    7. Collect the data. What protocol will be employed to achieve a high response rate with unbiased responses?
    8. Analyze the data. Often a chi-square test on a contingency table, or the related logistic regression, would be appropriate.
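
    Step 3 can be sketched in a few lines of code. The example below is a minimal illustration, not a prescribed procedure: the function names, the sampling frame of 2,000 ID numbers, and the count of 130 returned surveys are all hypothetical, and the 10% fraction simply mirrors the target mentioned above.

```python
import random


def simple_random_sample(population_ids, fraction=0.10, seed=None):
    """Draw a simple random sample without replacement.

    `fraction` is the share of the population to sample; 0.10 mirrors the
    10% target mentioned in step 3 and is an assumption, not a rule.
    """
    rng = random.Random(seed)
    n = max(1, round(len(population_ids) * fraction))
    return rng.sample(population_ids, n)


def response_rate(n_returned, n_distributed):
    """Response rate = surveys returned / surveys distributed."""
    if n_distributed <= 0:
        raise ValueError("n_distributed must be positive")
    return n_returned / n_distributed


# Hypothetical sampling frame: ID numbers for 2,000 undergraduates.
frame = list(range(1, 2001))
sample = simple_random_sample(frame, fraction=0.10, seed=1)
print(len(sample))                      # 200 questionnaires to distribute
print(response_rate(130, len(sample)))  # 0.65 if 130 are returned
```

    Fixing the seed makes the draw reproducible, which is useful when the sampling procedure has to be documented.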

    Bias sources in survey research

    When a statistical estimator consistently under- or overestimates the true value, this is called statistical bias, which was introduced and discussed in Chapter 5.3. The potential for biased responses to survey questions is an important constraint on the applicability of the survey. For example, it is well known that adults tend to overestimate their height but underestimate their weight.

    In survey research, different classes of bias have been defined.

    • Information bias occurs when there are systematic errors in how the response is measured. Two forms are (1) recall bias and (2) observer bias.
    • Recall bias occurs when some people are much more likely to remember an event than others.
    • Observer bias can result from differences among observers. If different persons conduct interviews, then it is important that all observers use a standardized method of collecting data.
    • A reality of surveys is that not all targets of the survey will answer the questions, which may result in bias. Non-response bias occurs when those who respond to a questionnaire, the responders, differ in some way from those who don’t, the non-responders.
    • Selection bias results when the sample you have chosen is not representative of the population to which you want to generalize your results. Random sampling can help minimize selection bias, but a stratified sampling approach is needed to guarantee that subgroups, e.g., economic or ethnic groups, are represented.
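
    The stratified approach mentioned under selection bias can be sketched as follows. This is one possible scheme (proportional allocation, with at least one unit taken from every stratum); the function name, the income-group labels, and the group sizes are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, fraction, seed=None):
    """Sample the same fraction from every stratum (proportional allocation).

    units      -- list of sampling units (e.g., student records)
    stratum_of -- function returning the stratum label for a unit
    fraction   -- share of each stratum to sample
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)

    sample = []
    for members in strata.values():
        # Take at least one unit per stratum so small groups are not missed.
        n = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, n))
    return sample


# Hypothetical population: (id, income group) pairs with unequal group sizes.
population = (
    [(i, "low") for i in range(600)]
    + [(i, "middle") for i in range(600, 950)]
    + [(i, "high") for i in range(950, 1000)]
)
chosen = stratified_sample(population, lambda u: u[1], fraction=0.10, seed=42)
counts = {g: sum(1 for _, grp in chosen if grp == g)
          for g in ("low", "middle", "high")}
print(counts)  # {'low': 60, 'middle': 35, 'high': 5}
```

    A simple random sample of 100 from this population could, by chance, contain very few members of the smallest group; sampling within strata fixes each group's share in advance.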

    How to ask questions?

    The goal is to maximize the number of people who respond to the survey while maintaining accuracy and relevance of the responses. This is accomplished by asking the right questions in the right manner, but also by how the questionnaire is presented and administered. To get accurate answers, one should include additional questions to check the consistency of the responses provided by a person. For example, if the study is about smoking, you could ask either of two questions (or both…?):

    Question 1. Do you smoke tobacco cigarettes? Yes / No

    Question 2. How many cigarettes did you smoke yesterday?
    0 …. 1 – 10 …. 11 – 20 …. 21 +

    Note that these questions are CLOSED — the responder must answer using the answers provided rather than making something up. This has the advantage of restricting the possible answers and allows you to test specific hypotheses. An OPEN question might be

    Question 3. Do you smoke a lot of cigarettes in a day?

    As you can imagine, you would expect to get a variety of interpretations of this question, which limits your ability to analyze the responses and test the hypothesis. It would be a poor question to use.
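
    If both Question 1 and Question 2 are asked, the pair can serve as the consistency check described earlier. The sketch below shows one way to flag contradictory answers; the function name, the response records, and the rule itself (a "No" to Question 1 combined with a nonzero answer to Question 2) are assumptions made for illustration.

```python
def is_inconsistent(smokes, cigarettes_yesterday):
    """Flag a response that says 'No' to smoking but reports cigarettes smoked.

    A 'Yes' combined with '0' is not flagged: a smoker may simply not have
    smoked yesterday.
    """
    return smokes == "No" and cigarettes_yesterday != "0"


# Hypothetical responses: (respondent id, Question 1, Question 2).
responses = [
    (1, "Yes", "11-20"),
    (2, "No", "0"),
    (3, "No", "1-10"),   # contradicts the answer to Question 1
    (4, "Yes", "0"),
]
flagged = [rid for rid, q1, q2 in responses if is_inconsistent(q1, q2)]
print(flagged)  # [3]
```

    Flagged responses can then be reviewed or excluded according to a rule decided before the data are collected.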

    Closed or forced format questions can take on a variety of styles.

    Question 4. What is your favorite soft drink? (select one answer only)
    __ water
    __ cola
    __ ice tea
    __ fruit juice
    __ no preference

    Note that Question 4 gives the responder a choice among categories. For another example,

    Question 5. Biostatistics is my favorite subject
    ___Strongly disagree ___Disagree ___Undecided ___Agree ___Strongly agree

    And still another example might employ ranking, requesting the responder to rank from 1 to 8 their favorite subjects from a list of academic topics.

    All of these closed format responses can be easily converted for analyses by contingency table or other nonparametric statistics.
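
    As an illustration of that conversion, the sketch below tallies paired closed-format answers into a contingency table and computes the Pearson chi-square statistic by hand. The data are invented, the function names are hypothetical, and the calculation assumes every expected count is greater than zero. No continuity correction is applied, and a p-value would still have to be obtained from a chi-square table or a statistics library.

```python
from collections import Counter


def contingency_table(pairs, row_levels, col_levels):
    """Count (row, column) response pairs into a table of observed counts."""
    counts = Counter(pairs)
    return [[counts[(r, c)] for c in col_levels] for r in row_levels]


def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a table.

    Assumes every expected count is greater than zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Hypothetical data: (smokes?, favorite drink) for 100 respondents.
pairs = (
    [("Yes", "cola")] * 20 + [("Yes", "other")] * 10
    + [("No", "cola")] * 30 + [("No", "other")] * 40
)
table = contingency_table(pairs, ["Yes", "No"], ["cola", "other"])
stat, df = chi_square_statistic(table)
print(table)               # [[20, 10], [30, 40]]
print(round(stat, 2), df)  # 4.76 1
```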

    Some other general tips for writing a survey:

    • Keep sentences simple and short.
    • Ask for only one piece of information at a time.
    • Ask precise questions.
    • Start the survey with the question(s) most relevant to the subject of the study.
    • Avoid asking personal questions at the start.
    • Write some questions, then conduct a pilot study to test the efficacy of the survey.

    Suggested readings

    Statistics Canada: a site with lots of material about survey methods.

    Wikipedia: Statistical survey


    This page titled 20.4: Conducting surveys is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Michael R Dohm via source content that was edited to the style and standards of the LibreTexts platform.
