What Is an Observational Study? | Guide & Examples

Published on March 31, 2022 by Tegan George. Revised on June 22, 2023.

An observational study is used to answer a research question based purely on what the researcher observes. There is no interference or manipulation of the research subjects, and no control and treatment groups.

These studies are often qualitative in nature and can be used for both exploratory and explanatory research purposes. While quantitative observational studies exist, they are less common.

Observational studies are generally used in hard science, medical, and social science fields. This is often due to ethical or practical concerns that prevent the researcher from conducting a traditional experiment. However, the lack of control and treatment groups means that forming inferences is difficult, and there is a risk of confounding variables and observer bias impacting your analysis.

There are many types of observation, and it can be challenging to tell the difference between them. Here are some of the most common types to help you choose the best one for your observational study.

There are three main types of observational studies: cohort studies, case–control studies, and cross-sectional studies.

Cohort studies

Cohort studies are more longitudinal in nature, as they follow a group of participants over a period of time. Members of the cohort are selected because of a shared characteristic, such as smoking, and they are often observed over a period of years.

Case–control studies

Case–control studies bring together two groups, a case group and a control group. The case group has a particular outcome or condition, while the control group does not. The two groups are then compared to see whether the case group was exposed to a particular factor more often than the control group.

For example, if you compared people with lung disease (the case group) with people without lung disease (the control group), you could examine whether the cases had smoked more often than the controls.
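
A comparison like the smoking and lung disease example can be summarized in a 2×2 table of exposure by outcome, and one common summary of such a table is the odds ratio. The sketch below is illustrative only: the function name and all counts are hypothetical, and an odds ratio above 1 points to an association, not to causation.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds ratio for a 2x2 table of exposure (e.g., smoking) by outcome (e.g., lung disease)."""
    if unexposed_cases == 0 or exposed_controls == 0:
        raise ValueError("odds ratio is undefined when a denominator cell is zero")
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


# Hypothetical counts, invented purely for illustration.
ratio = odds_ratio(
    exposed_cases=40,       # smokers with lung disease
    unexposed_cases=10,     # non-smokers with lung disease
    exposed_controls=20,    # smokers without lung disease
    unexposed_controls=30,  # non-smokers without lung disease
)
print(f"odds ratio = {ratio:.1f}")
```

With these made-up counts, the odds of lung disease are several times higher among smokers than among non-smokers.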

Cross-sectional studies

Cross-sectional studies analyze a study population at a specific point in time.

This often involves narrowing previously collected data to one point in time to test the prevalence of a theory—for example, analyzing how many people were diagnosed with lung disease in March of a given year. It can also be a one-time observation, such as spending one day in the lung disease wing of a hospital.
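
Narrowing previously collected data to one point in time can be as simple as filtering records by date. The following is a minimal sketch, assuming you have a list of diagnosis dates; the function name, the records, and the population size are all hypothetical.

```python
from datetime import date


def diagnosed_in_month(diagnosis_dates, year, month):
    """Count diagnoses recorded in the given calendar month."""
    return sum(1 for d in diagnosis_dates if (d.year, d.month) == (year, month))


# Hypothetical diagnosis dates, invented for illustration.
records = [
    date(2023, 2, 27),
    date(2023, 3, 3),
    date(2023, 3, 18),
    date(2023, 3, 30),
    date(2023, 4, 2),
]
population = 1000  # hypothetical size of the population under study

count = diagnosed_in_month(records, 2023, 3)
print(f"{count} diagnoses in March, {count / population:.1%} of the population")
```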

Observational studies are usually quite straightforward to design and conduct. Sometimes all you need is a notebook and pen! As you design your study, you can follow these steps.

Step 1: Identify your research topic and objectives

The first step is to determine what you’re interested in observing and why. Observational studies are a great fit if you are unable to do an experiment for practical or ethical reasons, or if your research topic hinges on natural behaviors.

Step 2: Choose your observation type and technique

In terms of technique, there are a few things to consider:

  • Are you determining what you want to observe beforehand, or going in open-minded?
  • Is there another research method that would make sense in tandem with an observational study?
  • If yes, make sure you conduct a covert observation.
  • If not, think about whether observing from afar or actively participating in your observation is a better fit.
  • How can you preempt confounding variables that could impact your analysis?
  • You could observe the children playing at the playground in a naturalistic observation.
  • You could spend a month at a day care in your town conducting participant observation, immersing yourself in the day-to-day life of the children.
  • You could conduct covert observation behind a wall or glass, where the children can’t see you.

Overall, it is crucial to stay organized. Devise a shorthand for your notes, or perhaps design templates that you can fill in. Since these observations occur in real time, you won’t get a second chance with the same data.
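
If you take notes digitally, a note template and shorthand could look something like the sketch below. Everything here is an assumption made for illustration: the field names, the shorthand codes, and the helper function are hypothetical and should be adapted to your own study.

```python
from dataclasses import dataclass, field
from datetime import datetime

# Hypothetical shorthand; devise codes that suit your own observations.
SHORTHAND = {"CR": "crying", "PL": "playing", "AL": "alone"}


@dataclass
class ObservationNote:
    """One fill-in template per observed event."""
    observed_at: datetime
    setting: str
    behavior: str
    shorthand_codes: list[str] = field(default_factory=list)
    comments: str = ""


def expand_codes(note):
    """Translate shorthand codes back into full descriptions."""
    return [SHORTHAND.get(code, f"unknown code: {code}") for code in note.shorthand_codes]


note = ObservationNote(
    observed_at=datetime(2024, 1, 8, 8, 45),
    setting="day care entrance",
    behavior="drop-off",
    shorthand_codes=["CR", "AL"],
)
print(expand_codes(note))
```

Recording the time and setting with every note makes it easier to reconstruct the sequence of events later, since you won’t get a second chance with the same data.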

Step 3: Set up your observational study

Before conducting your observations, there are a few things to attend to:

  • Plan ahead: If you’re interested in day cares, you’ll need to call a few in your area to plan a visit. They may not all allow observation, or consent from parents may be needed, so give yourself enough time to set everything up.
  • Determine your note-taking method: Observational studies often rely on note-taking because other methods, like video or audio recording, run the risk of changing participant behavior.
  • Get informed consent from your participants (or their parents) if you want to record: Ultimately, even though it may make your analysis easier, the challenges posed by recording participants often make pen-and-paper a better choice.

Step 4: Conduct your observation

After you’ve chosen a type of observation, decided on your technique, and chosen a time and place, it’s time to conduct your observation.

Here, you can split them into case and control groups. The children with siblings have a characteristic you are interested in (siblings), while the children in the control group do not.

When conducting observational studies, be very careful of confounding or “lurking” variables. In the example above, you observed children as they were dropped off, gauging whether or not they were upset. However, there are a variety of other factors that could be at play here (e.g., illness).
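
One way to check for a lurking variable such as illness is to compare the groups within each level of that variable. The sketch below uses invented counts for the drop-off example; the data layout and function names are hypothetical. In this made-up data, children with and without siblings differ in how often they are upset overall, but the difference disappears once ill and healthy children are looked at separately.

```python
# Each row: (has_siblings, is_ill, number_upset, number_observed).
# All counts are hypothetical and chosen only to illustrate confounding.
counts = [
    (True, False, 2, 20),
    (False, False, 1, 10),
    (True, True, 4, 5),
    (False, True, 12, 15),
]


def upset_rate(rows):
    """Proportion of observed children who were upset at drop-off."""
    upset = sum(row[2] for row in rows)
    observed = sum(row[3] for row in rows)
    return upset / observed


def rates_by_siblings(rows):
    """Upset rate for children with siblings (True) and without (False)."""
    return {
        has_siblings: upset_rate([row for row in rows if row[0] == has_siblings])
        for has_siblings in (True, False)
    }


crude = rates_by_siblings(counts)
among_ill = rates_by_siblings([row for row in counts if row[1]])
among_healthy = rates_by_siblings([row for row in counts if not row[1]])

print("all children:", crude)
print("ill children:", among_ill)
print("healthy children:", among_healthy)
```

Stratifying like this only helps with variables you thought to record, so it does not remove the risk of undetected confounders.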

Step 5: Analyze your data

After you finish your observation, immediately record your initial thoughts and impressions, as well as follow-up questions or any issues you perceived during the observation. If you audio- or video-recorded your observations, you can transcribe them.

Your analysis can take an inductive or deductive approach:

  • If you conducted your observations in a more open-ended way, an inductive approach allows your data to determine your themes.
  • If you had specific hypotheses prior to conducting your observations, a deductive approach analyzes whether your data confirm those themes or ideas you had previously.

Next, you can conduct your thematic or content analysis. Due to the open-ended nature of observational studies, the best fit is likely thematic analysis.

Step 6: Discuss avenues for future research

Observational studies are generally exploratory in nature, and they often aren’t strong enough to yield standalone conclusions due to their very high susceptibility to observer bias and confounding variables. For this reason, observational studies can only show association, not causation.

If you are excited about the preliminary conclusions you’ve drawn and wish to proceed with your topic, you may need to change to a different research method, such as an experiment.

Advantages

  • Observational studies can provide information about difficult-to-analyze topics in a low-cost, efficient manner.
  • They allow you to study subjects that cannot be randomized safely, efficiently, or ethically.
  • They are often quite straightforward to conduct, since you just observe participant behavior as it happens or utilize preexisting data.
  • They’re often invaluable in informing later, larger-scale clinical trials or experimental designs.

Disadvantages

  • Observational studies struggle to stand on their own as a reliable research method. There is a high risk of observer bias and undetected confounding variables or omitted variables.
  • They lack conclusive results, typically are not externally valid or generalizable, and can usually only form a basis for further research.
  • They cannot make statements about the safety or efficacy of the intervention or treatment they study, only observe reactions to it. Therefore, they offer less satisfying results than other methods.

The key difference between observational studies and experiments is that a properly conducted observational study will never attempt to influence responses, while experimental designs by definition have some sort of treatment condition applied to a portion of participants.

However, there may be times when it’s impossible, dangerous, or impractical to influence the behavior of your participants. This can be the case in medical studies, where it is unethical or cruel to withhold potentially life-saving intervention, or in longitudinal analyses where you don’t have the ability to follow your group over the course of their lifetime.

An observational study may be the right fit for your research if random assignment of participants to control and treatment groups is impossible or highly difficult. However, the issues observational studies raise in terms of validity, confounding variables, and conclusiveness can mean that an experiment is more reliable.

If you’re able to randomize your participants safely and your research question is definitely causal in nature, consider using an experiment.

An observational study is a great choice for you if your research question is based purely on observations. If there are ethical, logistical, or practical concerns that prevent you from conducting a traditional experiment, an observational study may be a good choice. In an observational study, there is no interference or manipulation of the research subjects, as well as no control or treatment groups.

The key difference between observational studies and experimental designs is that a well-done observational study does not influence the responses of participants, while experiments do have some sort of treatment condition applied to at least some participants by random assignment.

A quasi-experiment is a type of research design that attempts to establish a cause-and-effect relationship. The main difference with a true experiment is that the groups are not randomly assigned.

Exploratory research aims to explore the main aspects of an under-researched problem, while explanatory research aims to explain the causes and consequences of a well-defined problem.

Experimental design means planning a set of procedures to investigate a relationship between variables. To design a controlled experiment, you need:

  • A testable hypothesis
  • At least one independent variable that can be precisely manipulated
  • At least one dependent variable that can be precisely measured

When designing the experiment, you decide:

  • How you will manipulate the variable(s)
  • How you will control for any potential confounding variables
  • How many subjects or samples will be included in the study
  • How subjects will be assigned to treatment levels

Experimental design is essential to the internal and external validity of your experiment.
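
As a rough illustration of how subjects might be assigned to treatment levels, the sketch below shuffles a list of participants and deals them out to groups in turn. This is one simple scheme among many; the function name, group labels, and participant IDs are hypothetical.

```python
import random


def randomly_assign(subjects, groups=("control", "treatment"), seed=None):
    """Shuffle subjects, then deal them out to the groups in turn."""
    rng = random.Random(seed)  # a fixed seed makes the assignment reproducible
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    assignment = {group: [] for group in groups}
    for index, subject in enumerate(shuffled):
        assignment[groups[index % len(groups)]].append(subject)
    return assignment


# Hypothetical participant IDs.
participants = [f"P{i}" for i in range(1, 9)]
for group, members in randomly_assign(participants, seed=42).items():
    print(group, members)
```

Dealing out a shuffled list keeps the group sizes as equal as possible, which plain coin flips for each subject would not guarantee.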

George, T. (2023, June 22). What Is an Observational Study? | Guide & Examples. Scribbr. Retrieved April 9, 2024, from https://www.scribbr.com/methodology/observational-study/

Case Study Observational Research: A Framework for Conducting Case Study Research Where Observation Data Are the Focus

Affiliation

  • University of Otago, Wellington, New Zealand.
  • PMID: 27217290
  • DOI: 10.1177/1049732316649160

Case study research is a comprehensive method that incorporates multiple sources of data to provide detailed accounts of complex research phenomena in real-life contexts. However, current models of case study research do not particularly distinguish the unique contribution observation data can make. Observation methods have the potential to reach beyond other methods that rely largely or solely on self-report. This article describes the distinctive characteristics of case study observational research, a modified form of Yin's 2014 model of case study research the authors used in a study exploring interprofessional collaboration in primary care. In this approach, observation data are positioned as the central component of the research design. Case study observational research offers a promising approach for researchers in a wide range of health care settings seeking more complete understandings of complex topics, where contextual influences are of primary concern. Future research is needed to refine and evaluate the approach.

Keywords: New Zealand; appreciative inquiry; case studies; case study observational research; health care; interprofessional collaboration; naturalistic inquiry; observation; primary health care; qualitative; research design.

  • Observational Studies as Topic / methods*
  • Observational Studies as Topic / standards
  • Primary Health Care / organization & administration
  • Research Design*
  • Self Report / standards

Social Sci LibreTexts

6.5: Observational Research

Learning Objectives

  • List the various types of observational research methods and distinguish between each
  • Describe the strengths and weaknesses of each observational research method.

What Is Observational Research?

The term observational research is used to refer to several different types of non-experimental studies in which behavior is systematically observed and recorded. The goal of observational research is to describe a variable or set of variables. More generally, the goal is to obtain a snapshot of specific characteristics of an individual, group, or setting. As described previously, observational research is non-experimental because nothing is manipulated or controlled, and as such we cannot arrive at causal conclusions using this approach. The data that are collected in observational research studies are often qualitative in nature but they may also be quantitative or both (mixed-methods). There are several different types of observational research designs that will be described below.

Naturalistic Observation

Naturalistic observation is an observational method that involves observing people’s behavior in the environment in which it typically occurs. Thus naturalistic observation is a type of field research (as opposed to a type of laboratory research). Jane Goodall’s famous research on chimpanzees is a classic example of naturalistic observation. Dr. Goodall spent three decades observing chimpanzees in their natural environment in East Africa. She examined such things as chimpanzees’ social structure, mating patterns, gender roles, family structure, and care of offspring by observing them in the wild. However, naturalistic observation could more simply involve observing shoppers in a grocery store, children on a school playground, or psychiatric inpatients in their wards. Researchers engaged in naturalistic observation usually make their observations as unobtrusively as possible so that participants are not aware that they are being studied. Such an approach is called disguised naturalistic observation. Ethically, this method is considered to be acceptable if the participants remain anonymous and the behavior occurs in a public setting where people would not normally have an expectation of privacy. Grocery shoppers putting items into their shopping carts, for example, are engaged in public behavior that is easily observable by store employees and other shoppers. For this reason, most researchers would consider it ethically acceptable to observe them for a study. On the other hand, one of the arguments against the ethicality of the naturalistic observation of “bathroom behavior” discussed earlier in the book is that people have a reasonable expectation of privacy even in a public restroom and that this expectation was violated.

In cases where it is not ethical or practical to conduct disguised naturalistic observation, researchers can conduct undisguised naturalistic observation, where the participants are made aware of the researcher’s presence and the monitoring of their behavior. However, one concern with undisguised naturalistic observation is reactivity. Reactivity refers to when a measure changes participants’ behavior. In the case of undisguised naturalistic observation, the concern with reactivity is that when people know they are being observed and studied, they may act differently than they normally would. For instance, you may act much differently in a bar if you know that someone is observing you and recording your behaviors, and this would invalidate the study. So disguised observation is less reactive and therefore can have higher validity because people are not aware that their behaviors are being observed and recorded. However, we now know that people often become used to being observed, and with time they begin to behave naturally in the researcher’s presence. In other words, over time people habituate to being observed. Think about reality shows like Big Brother or Survivor where people are constantly being observed and recorded. While they may be on their best behavior at first, in a fairly short amount of time they are flirting, having sex, wearing next to nothing, screaming at each other, and at times acting like complete fools in front of the entire nation.

Participant Observation

Another approach to data collection in observational research is participant observation. In participant observation, researchers become active participants in the group or situation they are studying. Participant observation is very similar to naturalistic observation in that it involves observing people’s behavior in the environment in which it typically occurs. As with naturalistic observation, the data that are collected can include interviews (usually unstructured), notes based on observations and interactions, documents, photographs, and other artifacts. The only difference between naturalistic observation and participant observation is that researchers engaged in participant observation become active members of the group or situations they are studying. The basic rationale for participant observation is that there may be important information that is only accessible to, or can be interpreted only by, someone who is an active participant in the group or situation. Like naturalistic observation, participant observation can be either disguised or undisguised. In disguised participant observation, the researchers pretend to be members of the social group they are observing and conceal their true identity as researchers. In contrast, with undisguised participant observation, the researchers become a part of the group they are studying and disclose their true identity as researchers to the group under investigation. Once again, there are important ethical issues to consider with disguised participant observation. First, no informed consent can be obtained, and second, passive deception is being used. The researcher is passively deceiving the participants by intentionally withholding information about their motivations for being a part of the social group they are studying. But sometimes disguised participation is the only way to access a protective group (like a cult). Further, disguised participant observation is less prone to reactivity than undisguised participant observation.

Rosenhan’s study (1973) [1] of the experience of people in a psychiatric ward would be considered disguised participant observation because Rosenhan and his pseudopatients were admitted into psychiatric hospitals on the pretense of being patients so that they could observe the way that psychiatric patients are treated by staff. The staff and other patients were unaware of their true identities as researchers.

Another example of participant observation comes from a study by sociologist Amy Wilkins (published in Social Psychology Quarterly ) on a university-based religious organization that emphasized how happy its members were (Wilkins, 2008) [2] . Wilkins spent 12 months attending and participating in the group’s meetings and social events, and she interviewed several group members. In her study, Wilkins identified several ways in which the group “enforced” happiness—for example, by continually talking about happiness, discouraging the expression of negative emotions, and using happiness as a way to distinguish themselves from other groups.

One of the primary benefits of participant observation is that the researcher is in a much better position to understand the viewpoint and experiences of the people they are studying when they are a part of the social group. The primary limitation with this approach is that the mere presence of the observer could affect the behavior of the people being observed. While this is also a concern with naturalistic observation, additional concerns arise when researchers become active members of the social group they are studying: they may change the social dynamics and/or influence the behavior of the people they are studying. Similarly, if the researcher acts as a participant observer, there can be concerns with biases resulting from developing relationships with the participants. Concretely, the researcher may become less objective, resulting in more experimenter bias.

Structured Observation

Another observational method is structured observation. Here the investigator makes careful observations of one or more specific behaviors in a particular setting that is more structured than the settings used in naturalistic and participant observation. Often the setting in which the observations are made is not the natural setting, rather the researcher may observe people in the laboratory environment. Alternatively, the researcher may observe people in a natural setting (like a classroom setting) that they have structured some way, for instance by introducing some specific task participants are to engage in or by introducing a specific social situation or manipulation. Structured observation is very similar to naturalistic observation and participant observation in that in all cases researchers are observing naturally occurring behavior, however, the emphasis in structured observation is on gathering quantitative rather than qualitative data. Researchers using this approach are interested in a limited set of behaviors. This allows them to quantify the behaviors they are observing. In other words, structured observation is less global than naturalistic and participant observation because the researcher engaged in structured observations is interested in a small number of specific behaviors. Therefore, rather than recording everything that happens, the researcher only focuses on very specific behaviors of interest.

Researchers Robert Levine and Ara Norenzayan used structured observation to study differences in the “pace of life” across countries (Levine & Norenzayan, 1999) [3] . One of their measures involved observing pedestrians in a large city to see how long it took them to walk 60 feet. They found that people in some countries walked reliably faster than people in other countries. For example, people in Canada and Sweden covered 60 feet in just under 13 seconds on average, while people in Brazil and Romania took close to 17 seconds. When structured observation takes place in the complex and even chaotic “real world,” the questions of when, where, and under what conditions the observations will be made, and who exactly will be observed are important to consider. Levine and Norenzayan described their sampling process as follows:

“Male and female walking speed over a distance of 60 feet was measured in at least two locations in main downtown areas in each city. Measurements were taken during main business hours on clear summer days. All locations were flat, unobstructed, had broad sidewalks, and were sufficiently uncrowded to allow pedestrians to move at potentially maximum speeds. To control for the effects of socializing, only pedestrians walking alone were used. Children, individuals with obvious physical handicaps, and window-shoppers were not timed. Thirty-five men and 35 women were timed in most cities.” (p. 186)

Precise specification of the sampling process in this way makes data collection manageable for the observers, and it also provides some control over important extraneous variables. For example, by making their observations on clear summer days in all countries, Levine and Norenzayan controlled for effects of the weather on people’s walking speeds. In Levine and Norenzayan’s study, measurement was relatively straightforward. They simply measured out a 60-foot distance along a city sidewalk and then used a stopwatch to time participants as they walked over that distance.
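
To make the stopwatch measure concrete, the times reported above can be converted into walking speeds. This is only a sketch: the function name is hypothetical, and 13 and 17 seconds are rounded stand-ins for "just under 13 seconds" and "close to 17 seconds," not the exact values from the study.

```python
FEET_TO_METERS = 0.3048  # exact conversion factor


def walking_speed_m_per_s(distance_feet, seconds):
    """Average walking speed in meters per second over a timed distance."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return distance_feet * FEET_TO_METERS / seconds


# Rounded times taken from the text above; treat the results as approximate.
faster = walking_speed_m_per_s(60, 13)
slower = walking_speed_m_per_s(60, 17)
print(f"about {faster:.2f} m/s versus about {slower:.2f} m/s")
```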

As another example, researchers Robert Kraut and Robert Johnston wanted to study bowlers’ reactions to their shots, both when they were facing the pins and then when they turned toward their companions (Kraut & Johnston, 1979) [4] . But what “reactions” should they observe? Based on previous research and their own pilot testing, Kraut and Johnston created a list of reactions that included “closed smile,” “open smile,” “laugh,” “neutral face,” “look down,” “look away,” and “face cover” (covering one’s face with one’s hands). The observers committed this list to memory and then practiced by coding the reactions of bowlers who had been videotaped. During the actual study, the observers spoke into an audio recorder, describing the reactions they observed. Among the most interesting results of this study was that bowlers rarely smiled while they still faced the pins. They were much more likely to smile after they turned toward their companions, suggesting that smiling is not purely an expression of happiness but also a form of social communication.
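
A list of reactions like this one lends itself to simple tallying. The sketch below uses the seven reaction categories named above, but the observations themselves are invented, and the function names and the "pins"/"companions" labels are hypothetical rather than taken from the study.

```python
from collections import Counter

# The reaction categories listed in the text.
REACTIONS = (
    "closed smile", "open smile", "laugh", "neutral face",
    "look down", "look away", "face cover",
)
SMILES = ("closed smile", "open smile")


def tally(observations):
    """Count (orientation, reaction) pairs, rejecting reactions outside the list."""
    counts = Counter()
    for orientation, reaction in observations:
        if reaction not in REACTIONS:
            raise ValueError(f"not in the coding scheme: {reaction!r}")
        counts[(orientation, reaction)] += 1
    return counts


def smile_rate(counts, orientation):
    """Share of coded reactions in one orientation that were smiles."""
    total = sum(n for (o, _), n in counts.items() if o == orientation)
    smiles = sum(n for (o, r), n in counts.items() if o == orientation and r in SMILES)
    return smiles / total if total else 0.0


# Hypothetical coded observations, invented for illustration.
observed = [
    ("pins", "neutral face"), ("pins", "look down"),
    ("pins", "closed smile"), ("pins", "neutral face"),
    ("companions", "open smile"), ("companions", "laugh"),
    ("companions", "open smile"), ("companions", "neutral face"),
]
counts = tally(observed)
print(smile_rate(counts, "pins"), smile_rate(counts, "companions"))
```

Rejecting anything outside the predefined list mirrors the point of a coding scheme: observers record only the target behaviors that were defined in advance.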

When the observations require a judgment on the part of the observers—as in Kraut and Johnston’s study—this process is often described as coding . Coding generally requires clearly defining a set of target behaviors. The observers then categorize participants individually in terms of which behavior they have engaged in and the number of times they engaged in each behavior. The observers might even record the duration of each behavior. The target behaviors must be defined in such a way that different observers code them in the same way. This difficulty with coding is the issue of interrater reliability, as mentioned in Chapter 4. Researchers are expected to demonstrate the interrater reliability of their coding procedure by having multiple raters code the same behaviors independently and then showing that the different observers are in close agreement. Kraut and Johnston, for example, video recorded a subset of their participants’ reactions and had two observers independently code them. The two observers showed that they agreed on the reactions that were exhibited 97% of the time, indicating good interrater reliability.
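
Agreement between two observers can be computed directly from their codes. The sketch below shows simple percent agreement, the kind of figure reported above, alongside Cohen’s kappa, a common chance-corrected alternative that is added here as an illustration and is not something the study is said to have reported. The codes are invented and the function names are hypothetical.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Proportion of items that two raters coded identically."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of items")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a.keys() | freq_b.keys()) / (n * n)
    if expected == 1:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes from two observers watching the same ten reactions.
rater_a = ["smile", "smile", "neutral", "laugh", "neutral",
           "smile", "neutral", "laugh", "smile", "neutral"]
rater_b = ["smile", "smile", "neutral", "smile", "neutral",
           "smile", "neutral", "laugh", "smile", "neutral"]

print(f"agreement = {percent_agreement(rater_a, rater_b):.2f}")
print(f"kappa = {cohens_kappa(rater_a, rater_b):.2f}")
```

Kappa comes out lower than raw agreement because some matches would occur even if both observers coded at random with the same category frequencies.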

One of the primary benefits of structured observation is that it is far more efficient than naturalistic and participant observation. Since the researchers are focused on specific behaviors, this reduces time and expense. Also, the environment is often structured to encourage the behaviors of interest, which again means that researchers do not have to invest as much time waiting for the behaviors of interest to occur naturally. Finally, researchers using this approach can clearly exert greater control over the environment. However, when researchers exert more control over the environment, it may make the environment less natural, which decreases external validity. It is less clear, for instance, whether structured observations made in a laboratory environment will generalize to a real-world environment. Furthermore, since researchers engaged in structured observation are often not disguised, there may be more concerns with reactivity.

Case Studies

A case study is an in-depth examination of an individual. Sometimes case studies are also completed on social units (e.g., a cult) and events (e.g., a natural disaster). Most commonly in psychology, however, case studies provide a detailed description and analysis of an individual. Often the individual has a rare or unusual condition or disorder or has damage to a specific region of the brain.

Like many observational research methods, case studies tend to be more qualitative in nature. Case study methods involve an in-depth, and often longitudinal, examination of an individual. Depending on the focus of the case study, individuals may or may not be observed in their natural setting. If the natural setting is not what is of interest, then the individual may be brought into a therapist’s office or a researcher’s lab for study. Also, the bulk of the case study report will focus on in-depth descriptions of the person rather than on statistical analyses. With that said, some quantitative data may also be included in the write-up of a case study. For instance, an individual’s depression score may be compared to normative scores, or their scores before and after treatment may be compared. As with other qualitative methods, a variety of different methods and tools can be used to collect information on the case. For instance, interviews, naturalistic observation, structured observation, psychological testing (e.g., IQ test), and/or physiological measurements (e.g., brain scans) may be used to collect information on the individual.

HM is one of the most famous case studies in psychology. HM suffered from intractable and very severe epilepsy. A surgeon localized HM’s epilepsy to his medial temporal lobe and in 1953 removed large sections of his hippocampus in an attempt to stop the seizures. The treatment was a success, in that it resolved his epilepsy and his IQ and personality were unaffected. However, the doctors soon realized that HM exhibited a strange form of amnesia, called anterograde amnesia. HM was able to carry out a conversation and he could remember short strings of letters, digits, and words. Basically, his short-term memory was preserved. However, HM could not commit new events to memory. He lost the ability to transfer information from his short-term memory to his long-term memory, something memory researchers call consolidation. So while he could carry on a conversation with someone, he would completely forget the conversation after it ended. This was an extremely important case study for memory researchers because it suggested that there is a dissociation between short-term memory and long-term memory: these appeared to be two different abilities subserved by different areas of the brain. It also suggested that the temporal lobes are particularly important for consolidating new information (i.e., for transferring information from short-term memory to long-term memory).

The history of psychology is filled with influential case studies, such as Sigmund Freud’s description of “Anna O.” (see below) and John Watson and Rosalie Rayner’s description of Little Albert (Watson & Rayner, 1920) [5], who learned to fear a white rat (along with other furry objects) when the researchers made a loud noise while he was playing with the rat.

The Case of “Anna O.”

Sigmund Freud used the case of a young woman he called “Anna O.” to illustrate many principles of his theory of psychoanalysis (Freud, 1961) [6]. (Her real name was Bertha Pappenheim, and she was an early feminist who went on to make important contributions to the field of social work.) Anna had come to Freud’s colleague Josef Breuer around 1880 with a variety of odd physical and psychological symptoms. One of them was that for several weeks she was unable to drink any fluids. According to Freud,

She would take up the glass of water that she longed for, but as soon as it touched her lips she would push it away like someone suffering from hydrophobia.…She lived only on fruit, such as melons, etc., so as to lessen her tormenting thirst. (p. 9)

But according to Freud, a breakthrough came one day while Anna was under hypnosis.

[S]he grumbled about her English “lady-companion,” whom she did not care for, and went on to describe, with every sign of disgust, how she had once gone into this lady’s room and how her little dog—horrid creature!—had drunk out of a glass there. The patient had said nothing, as she had wanted to be polite. After giving further energetic expression to the anger she had held back, she asked for something to drink, drank a large quantity of water without any difficulty, and awoke from her hypnosis with the glass at her lips; and thereupon the disturbance vanished, never to return. (p.9)

Freud’s interpretation was that Anna had repressed the memory of this incident along with the emotion that it triggered and that this was what had caused her inability to drink. Furthermore, her recollection of the incident, along with her expression of the emotion she had repressed, caused the symptom to go away.

As an illustration of Freud’s theory, the case study of Anna O. is quite effective. As evidence for the theory, however, it is essentially worthless. The description provides no way of knowing whether Anna had really repressed the memory of the dog drinking from the glass, whether this repression had caused her inability to drink, or whether recalling this “trauma” relieved the symptom. It is also unclear from this case study how typical or atypical Anna’s experience was.

Case studies are useful because they provide a level of detailed analysis not found in many other research methods, and greater insights may be gained from this more detailed analysis. As a result of the case study, the researcher may gain a sharpened understanding of what might become important to look at more extensively in future, more controlled research. Case studies are also often the only way to study rare conditions because it may be impossible to find a large enough sample of individuals with the condition to use quantitative methods. Although at first glance a case study of a rare individual might seem to tell us little about ourselves, such studies often do provide insights into normal behavior. The case of HM provided important insights into the role of the hippocampus in memory consolidation. However, it is important to note that while case studies can provide insights into certain areas and variables to study, and can be useful in helping develop theories, they should never be used as evidence for theories. In other words, case studies can be used as inspiration to formulate theories and hypotheses, but those hypotheses and theories then need to be formally tested using more rigorous quantitative methods.

The reason case studies shouldn’t be used to provide support for theories is that they suffer from problems with internal and external validity. Case studies lack the proper controls that true experiments contain. As such, they suffer from problems with internal validity, so they cannot be used to determine causation. For instance, during HM’s surgery, the surgeon may have accidentally lesioned another area of HM’s brain (indeed, questions about the possibility of a separate brain lesion arose after HM’s death and the dissection of his brain), and that lesion may have contributed to his inability to consolidate new information. The fact is, with case studies we cannot rule out these sorts of alternative explanations. So, as with all observational methods, case studies do not permit determination of causation. In addition, because case studies are often of a single individual, and typically a very abnormal individual, researchers cannot generalize their conclusions to other individuals. Recall that with most research designs there is a trade-off between internal and external validity. With case studies, however, there are problems with both internal validity and external validity. So there are limits both to the ability to determine causation and to generalize the results. A final limitation of case studies is that ample opportunity exists for the theoretical biases of the researcher to color or bias the case description. Indeed, the researcher who studied HM has been accused of destroying unpublished data that contradicted her theory about how memories are consolidated. A fascinating New York Times article describes some of the controversies that ensued after HM’s death and the analysis of his brain: https://www.nytimes.com/2016/08/07/m...mber.html?_r=0

Archival Research

Another approach that is often considered observational research is archival research, which involves analyzing data that have already been collected for some other purpose. An example is a study by Brett Pelham and his colleagues on “implicit egotism,” the tendency for people to prefer people, places, and things that are similar to themselves (Pelham, Carvallo, & Jones, 2005) [7]. In one study, they examined Social Security records to show that women with the names Virginia, Georgia, Louise, and Florence were especially likely to have moved to the states of Virginia, Georgia, Louisiana, and Florida, respectively.

As with naturalistic observation, measurement can be more or less straightforward when working with archival data. For example, counting the number of people named Virginia who live in various states based on Social Security records is relatively straightforward. But consider a study by Christopher Peterson and his colleagues on the relationship between optimism and health using data that had been collected many years before for a study on adult development (Peterson, Seligman, & Vaillant, 1988) [8]. In the 1940s, healthy male college students had completed an open-ended questionnaire about difficult wartime experiences. In the late 1980s, Peterson and his colleagues reviewed the men’s questionnaire responses to obtain a measure of explanatory style: their habitual ways of explaining bad events that happen to them. More pessimistic people tend to blame themselves and expect long-term negative consequences that affect many aspects of their lives, while more optimistic people tend to blame outside forces and expect limited negative consequences. To obtain a measure of explanatory style for each participant, the researchers used a procedure in which all negative events mentioned in the questionnaire responses, and any causal explanations for them, were identified and written on index cards. These were given to a separate group of raters who rated each explanation in terms of three separate dimensions of optimism-pessimism. These ratings were then averaged to produce an explanatory style score for each participant. The researchers then assessed the statistical relationship between the men’s explanatory style as undergraduate students and archival measures of their health at approximately 60 years of age. The primary result was that the more optimistic the men were as undergraduate students, the healthier they were as older men. Pearson’s r was +.25.
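The scoring and correlation steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the rating scale, and all numbers are hypothetical and are not taken from Peterson et al.; the sketch simply averages each participant's ratings into one score and then computes Pearson's r between those scores and a second variable.

```python
from math import sqrt
from statistics import mean


def explanatory_style_score(ratings):
    """Average a participant's ratings into a single score.

    `ratings` is a list of rated explanations; each entry holds the three
    dimension ratings for one explanation (hypothetical scale).
    """
    return mean(mean(dimensions) for dimensions in ratings)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mx, my = mean(xs), mean(ys)
    covariation = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mx) ** 2 for x in xs))
    spread_y = sqrt(sum((y - my) ** 2 for y in ys))
    return covariation / (spread_x * spread_y)


# Hypothetical data: three participants, two rated explanations each.
participants = [
    [[2, 3, 2], [3, 3, 2]],
    [[5, 6, 5], [6, 5, 6]],
    [[4, 4, 3], [4, 5, 4]],
]
style_scores = [explanatory_style_score(p) for p in participants]
health_scores = [3.0, 4.5, 4.0]  # hypothetical later-life health ratings
r = pearson_r(style_scores, health_scores)
```

A value of r near +1 or -1 indicates a strong linear relationship, while the +.25 reported in the study indicates a modest positive one.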

This method is an example of content analysis, a family of systematic approaches to measurement using complex archival data. Just as structured observation requires specifying the behaviors of interest and then noting them as they occur, content analysis requires specifying keywords, phrases, or ideas and then finding all occurrences of them in the data. These occurrences can then be counted, timed (e.g., the amount of time devoted to entertainment topics on the nightly news show), or analyzed in a variety of other ways.
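As a rough illustration of the counting step, the sketch below tallies whole-word, case-insensitive occurrences of a set of keywords or phrases in a text. The function name, the sample transcript, and the keyword list are all hypothetical; real content analysis usually also needs human judgment to code ideas that simple string matching cannot detect.

```python
import re
from collections import Counter


def count_keywords(text, keywords):
    """Count whole-word, case-insensitive occurrences of each keyword or phrase."""
    counts = Counter()
    for keyword in keywords:
        # \b anchors the match at word boundaries, so "film" does not match "films".
        pattern = r"\b" + re.escape(keyword) + r"\b"
        counts[keyword] = len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts


# Hypothetical news transcript and coding scheme.
transcript = "The film won. Film critics liked the celebrity film star."
counts = count_keywords(transcript, ["film", "celebrity", "box office"])
```

Keywords that never occur are kept with a count of zero, which makes it easier to compare the same coding scheme across several documents.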

Key Takeaways

  • There are several different approaches to observational research including naturalistic observation, participant observation, structured observation, case studies, and archival research.
  • Naturalistic observation is used to observe people in their natural setting; participant observation involves becoming an active member of the group being observed; structured observation involves coding a small number of behaviors in a quantitative manner; case studies are typically used to collect in-depth information on a single individual; and archival research involves analyzing existing data.

Exercises

  • Describe one problem related to internal validity.
  • Describe one problem related to external validity.
  • Generate one hypothesis suggested by the case study that might be interesting to test in a systematic single-subject or group study.

References

1. Rosenhan, D. L. (1973). On being sane in insane places. Science, 179, 250–258.
2. Wilkins, A. (2008). “Happier than Non-Christians”: Collective emotions and symbolic boundaries among evangelical Christians. Social Psychology Quarterly, 71, 281–301.
3. Levine, R. V., & Norenzayan, A. (1999). The pace of life in 31 countries. Journal of Cross-Cultural Psychology, 30, 178–205.
4. Kraut, R. E., & Johnston, R. E. (1979). Social and emotional messages of smiling: An ethological approach. Journal of Personality and Social Psychology, 37, 1539–1553.
5. Watson, J. B., & Rayner, R. (1920). Conditioned emotional reactions. Journal of Experimental Psychology, 3, 1–14.
6. Freud, S. (1961). Five lectures on psycho-analysis. New York, NY: Norton.
7. Pelham, B. W., Carvallo, M., & Jones, J. T. (2005). Implicit egotism. Current Directions in Psychological Science, 14, 106–110.
8. Peterson, C., Seligman, M. E. P., & Vaillant, G. E. (1988). Pessimistic explanatory style is a risk factor for physical illness: A thirty-five year longitudinal study. Journal of Personality and Social Psychology, 55, 23–27.

Observational Case Studies

  • First Online: 01 January 2014

  • Roel J. Wieringa

An observational case study is a study of a real-world case without performing an intervention. Measurement may influence the measured phenomena, but as in all forms of research, the researcher tries to restrict this to a minimum.

  • Requirement Engineering
  • Population Predicate
  • Knowledge Question
  • Analytical Induction
  • Abductive Inference

Author information

Authors and affiliations.

University of Twente, Enschede, The Netherlands

Roel J. Wieringa

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this chapter

Wieringa, R.J. (2014). Observational Case Studies. In: Design Science Methodology for Information Systems and Software Engineering. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-43839-8_17

DOI : https://doi.org/10.1007/978-3-662-43839-8_17

Published : 20 August 2014

Publisher Name : Springer, Berlin, Heidelberg

Print ISBN : 978-3-662-43838-1

Online ISBN : 978-3-662-43839-8

eBook Packages : Computer Science Computer Science (R0)

Understanding Clinical Research

Chapter 12. Observational Study Designs

Bradley G. Hammill

Observational studies in clinical research can be classified as either analytic or descriptive (Table 12–1). Analytic observational studies are similar to randomized, controlled clinical trials in that the goal is to estimate the causal effect of an exposure on an outcome. Also similar to trials, analytic observational studies always include some type of comparison group, against which the experience of the exposed group is compared. Well-designed analytic studies can generate strong evidence for or against a stated hypothesis. Descriptive studies, on the other hand, aim to describe the characteristics or experiences of a particular patient group. Even well-designed descriptive studies cannot be used to draw strong conclusions about the effect of an exposure on an outcome. Instead, these studies are often used to generate study questions that can then be tested by more rigorous methods.

Although many observational study designs are available to researchers (1), a few are most widely used and will be described below. The analytic study designs presented are the case-control study and the cohort study. The descriptive study designs presented are the ecologic study, the cross-sectional prevalence survey, and case reports or case series.

Case-Control Studies

Observation Method in Psychology: Naturalistic, Participant and Controlled

Saul Mcleod, PhD

Editor-in-Chief for Simply Psychology

BSc (Hons) Psychology, MRes, PhD, University of Manchester

Saul Mcleod, PhD., is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.

Olivia Guy-Evans, MSc

Associate Editor for Simply Psychology

BSc (Hons) Psychology, MSc Psychology of Education

Olivia Guy-Evans is a writer and associate editor for Simply Psychology. She has previously worked in healthcare and educational sectors.

The observation method in psychology involves directly and systematically witnessing and recording measurable behaviors, actions, and responses in natural or contrived settings without attempting to intervene or manipulate what is being observed.

Used to describe phenomena, generate hypotheses, or validate self-reports, psychological observation can be either controlled or naturalistic with varying degrees of structure imposed by the researcher.

There are different types of observational methods, and distinctions need to be made between:

1. Controlled Observations
2. Naturalistic Observations
3. Participant Observations

In addition to the above categories, observations can also be either overt/disclosed (the participants know they are being studied) or covert/undisclosed (the researcher keeps their real identity a secret from the research subjects, acting as a genuine member of the group).

In general, conducting observational research is relatively inexpensive, but it remains highly time-consuming and resource-intensive in data processing and analysis.

The considerable investments needed in terms of coder time commitments for training, maintaining reliability, preventing drift, and coding complex dynamic interactions place practical barriers on observers with limited resources.

Controlled Observation

Controlled observation is a research method for studying behavior in a carefully controlled and structured environment.

The researcher sets specific conditions, variables, and procedures to systematically observe and measure behavior, allowing for greater control and comparison of different conditions or groups.

The researcher decides where the observation will occur, at what time, with which participants, and in what circumstances, and uses a standardized procedure. Participants are randomly allocated to each independent variable group.

Rather than writing a detailed description of all behavior observed, it is often easier to code behavior according to a previously agreed scale using a behavior schedule (i.e., conducting a structured observation).

The researcher systematically classifies the behavior they observe into distinct categories. Coding might involve numbers or letters to describe a characteristic or the use of a scale to measure behavior intensity.

The categories on the schedule are coded so that the data collected can be easily counted and turned into statistics.

For example, Mary Ainsworth used a behavior schedule to study how infants responded to brief periods of separation from their mothers. During the Strange Situation procedure, the infant’s interaction behaviors directed toward the mother were measured, e.g.,

  • Proximity and contact-seeking
  • Contact maintaining
  • Avoidance of proximity and contact
  • Resistance to contact and comforting

The observer noted down the behavior displayed during 15-second intervals and scored the behavior for intensity on a scale of 1 to 7.
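A behavior schedule of this kind can be represented as a small data structure. The sketch below is a hypothetical illustration, not Ainsworth's actual scoring procedure: the category identifiers are adapted from the list above, and the helper names (`record`, `summarize`) and the example scores are assumptions. Each entry stores the intensity (1 to 7) observed for one category in one 15-second interval, and the summary reports the mean intensity per category.

```python
from collections import defaultdict
from statistics import mean

# Categories adapted from the list above; the identifiers are hypothetical.
CATEGORIES = (
    "proximity_and_contact_seeking",
    "contact_maintaining",
    "avoidance_of_proximity_and_contact",
    "resistance_to_contact_and_comforting",
)
INTERVAL_SECONDS = 15  # length of one observation interval


def record(schedule, interval_index, category, intensity):
    """Store the intensity (1-7) seen for a category in a given interval."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category}")
    if not 1 <= intensity <= 7:
        raise ValueError("intensity must be between 1 and 7")
    schedule[category][interval_index] = intensity


def summarize(schedule):
    """Mean intensity per category across all scored intervals."""
    return {cat: mean(scores.values()) for cat, scores in schedule.items() if scores}


# Hypothetical scores for the first 30 seconds of one episode.
schedule = defaultdict(dict)
record(schedule, 0, "proximity_and_contact_seeking", 5)
record(schedule, 1, "proximity_and_contact_seeking", 7)
record(schedule, 1, "contact_maintaining", 2)
summary = summarize(schedule)
```

Because every observation is a number tied to a category and an interval, the resulting data can be counted and compared across infants or observers.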

Sometimes participants’ behavior is observed through a two-way mirror, or they are secretly filmed. Albert Bandura used this method to study aggression in children (the Bobo doll studies ).

A lot of research has been carried out in sleep laboratories as well. Here, electrodes are attached to the scalp of participants, and the researcher observes the changes in electrical activity in the brain during sleep, recorded with an electroencephalograph (EEG).

Controlled observations are usually overt as the researcher explains the research aim to the group so the participants know they are being observed.

Controlled observations are also usually non-participant as the researcher avoids direct contact with the group and keeps a distance (e.g., observing behind a two-way mirror).

Strengths

  • Controlled observations can be easily replicated by other researchers by using the same observation schedule. This means it is easy to test for reliability.
  • The data obtained from structured observations are easier and quicker to analyze because they are quantitative (i.e., numerical), making this a less time-consuming method compared to naturalistic observations.
  • Controlled observations are fairly quick to conduct, which means that many observations can take place within a short amount of time. This allows a large sample to be obtained, which makes the findings more likely to be representative and generalizable to a larger population.

Limitations

  • Controlled observations can lack validity due to the Hawthorne effect or demand characteristics. When participants know they are being watched, they may act differently.

Naturalistic Observation

Naturalistic observation is a research method in which the researcher studies behavior in its natural setting without intervention or manipulation.

It involves observing and recording behavior as it naturally occurs, providing insights into real-life behaviors and interactions in their natural context.

Naturalistic observation is a research method commonly used by psychologists and other social scientists.

This technique involves observing and studying the spontaneous behavior of participants in natural surroundings. The researcher simply records what they see in whatever way they can.

In unstructured observations, the researcher records all relevant behavior without a predetermined coding system. There may be too much to record, and the behaviors recorded may not necessarily be the most important, so the approach is usually used as a pilot study to see what types of behavior should be recorded.

Compared with controlled observations, it is like the difference between studying wild animals in a zoo and studying them in their natural habitat.

With regard to human subjects, Margaret Mead used this method to research the way of life of different tribes living on islands in the South Pacific. Kathy Sylva used it to study children at play by observing their behavior in a playgroup in Oxfordshire.

Collecting Naturalistic Behavioral Data

Technological advances are enabling new, unobtrusive ways of collecting naturalistic behavioral data.

The Electronically Activated Recorder (EAR) is a digital recording device participants can wear to periodically sample ambient sounds, allowing representative sampling of daily experiences (Mehl et al., 2012).

Studies program EARs to record 30-50 second sound snippets multiple times per hour. Although coding the recordings requires extensive resources, EARs can capture spontaneous behaviors like arguments or laughter.

EARs minimize participant reactivity since sampling occurs outside of awareness. This reduces the Hawthorne effect, where people change behavior when observed.
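The periodic sampling idea can be sketched as a schedule of short recording windows placed at unpredictable times. This is a hypothetical illustration only, not the EAR's actual software: the function name, the default of five snippets per hour, and the 30-second snippet length are assumptions chosen to fall within the ranges mentioned above. Each hour is split into equal blocks, and one snippet starts at a random point inside each block so that windows never overlap.

```python
import random


def snippet_schedule(hours, snippets_per_hour=5, snippet_seconds=30, seed=None):
    """Return (start, end) times in seconds for randomly placed recording snippets."""
    rng = random.Random(seed)
    block = 3600 / snippets_per_hour  # seconds available to each snippet
    if snippet_seconds > block:
        raise ValueError("snippets do not fit into the hour")
    schedule = []
    for hour in range(hours):
        for index in range(snippets_per_hour):
            block_start = hour * 3600 + index * block
            # Start anywhere that still lets the snippet finish inside its block.
            start = block_start + rng.uniform(0, block - snippet_seconds)
            schedule.append((start, start + snippet_seconds))
    return schedule


# Hypothetical two-hour recording session with a fixed seed for repeatability.
windows = snippet_schedule(hours=2, seed=1)
```

Randomizing the start time within each block keeps the sampling spread across the day while making it hard for the wearer to anticipate when recording occurs.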

The SenseCam is another wearable device that passively captures images documenting daily activities. Though primarily used in memory research currently (Smith et al., 2014), systematic sampling of environments and behaviors via the SenseCam could enable innovative psychological studies in the future.

Strengths

  • By being able to observe the flow of behavior in its own setting, studies have greater ecological validity.
  • Like case studies, naturalistic observation is often used to generate new ideas. Because it gives the researcher the opportunity to study the total situation, it often suggests avenues of inquiry not thought of before.
  • Naturalistic observation can capture actual behaviors as they unfold in real time, analyze sequential patterns of interactions, measure base rates of behaviors, and examine socially undesirable or complex behaviors that people may not self-report accurately.

Limitations

  • These observations are often conducted on a micro (small) scale and may lack a representative sample (biased in relation to age, gender, social class, or ethnicity). This may result in the findings lacking the ability to generalize to wider society.
  • Natural observations are less reliable as other variables cannot be controlled. This makes it difficult for another researcher to repeat the study in exactly the same way.
  • The data coding phase is highly time-consuming and resource-intensive (e.g., training coders, maintaining inter-rater reliability, preventing judgment drift).
  • Because variables are not manipulated and extraneous variables are not controlled, cause-and-effect relationships cannot be established.

Participant Observation

Participant observation is a variant of the above (natural observations) but here, the researcher joins in and becomes part of the group they are studying to get a deeper insight into their lives.

If it were research on animals , we would now not only be studying them in their natural habitat but be living alongside them as well!

Leon Festinger used this approach in a famous study into a religious cult that believed that the end of the world was about to occur. He joined the cult and studied how they reacted when the prophecy did not come true.

Participant observations can be either covert or overt. Covert is where the study is carried out “undercover.” The researcher’s real identity and purpose are kept concealed from the group being studied.

The researcher takes a false identity and role, usually posing as a genuine member of the group.

On the other hand, overt is where the researcher reveals his or her true identity and purpose to the group and asks permission to observe.

Limitations

  • It can be difficult to get time and privacy for recording. For example, researchers can’t take notes openly during covert observations, as this would blow their cover. This means they must wait until they are alone and rely on their memory. This is a problem because they may forget details and are unlikely to remember direct quotations.
  • If the researcher becomes too involved, they may lose objectivity and become biased. There is always the danger that we will “see” what we expect (or want) to see. The researcher could then selectively report information instead of noting everything they observe, thus reducing the validity of their data.

Recording of Data

With controlled/structured observation studies, an important decision the researcher has to make is how to classify and record the data. Usually, this will involve a method of sampling.

In most coding systems, codes or ratings are made either per behavioral event or per specified time interval (Bakeman & Quera, 2011).

The three main sampling methods are:

Event-based coding involves identifying and segmenting interactions into meaningful events rather than timed units.

For example, parent-child interactions may be segmented into control or teaching events to code.

Interval recording involves dividing interactions into fixed time intervals (e.g., 6-15 seconds) and coding behaviors within each interval (Bakeman & Quera, 2011).

Event recording allows counting event frequency and sequencing while also potentially capturing event duration through timed-event recording. This provides information on time spent on behaviors.
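The difference between the two recording approaches can be illustrated with a short sketch. Everything here is hypothetical: the event codes, the times, and the function names are made up, and the interval rule used (a behavior counts as present if it overlaps the interval at all) is just one possible convention. The first function summarizes timed events by frequency and total duration; the second converts the same events into fixed-interval codes.

```python
import math
from collections import Counter, defaultdict


def event_summary(events):
    """Frequency and total duration per code from (code, onset, offset) events."""
    frequency = Counter(code for code, _, _ in events)
    duration = defaultdict(float)
    for code, onset, offset in events:
        duration[code] += offset - onset
    return frequency, dict(duration)


def interval_codes(events, session_seconds, interval_seconds=10.0):
    """List, for each fixed interval, the codes that occurred at any point in it."""
    n_intervals = math.ceil(session_seconds / interval_seconds)
    coded = []
    for i in range(n_intervals):
        start, end = i * interval_seconds, (i + 1) * interval_seconds
        present = {code for code, onset, offset in events if onset < end and offset > start}
        coded.append(sorted(present))
    return coded


# Hypothetical 30-second parent-child interaction (times in seconds).
events = [("teaching", 0, 12), ("control", 15, 18), ("teaching", 25, 30)]
frequency, duration = event_summary(events)
intervals = interval_codes(events, session_seconds=30, interval_seconds=10)
```

Timed-event data preserve exact durations and order, whereas interval codes trade some of that precision for a simpler record that is easier to collect live.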

Coding Systems

The coding system should focus on behaviors, patterns, individual characteristics, or relationship qualities that are relevant to the theory guiding the study (Wampler & Harper, 2014).

Codes vary in how much inference is required, from concrete observable behaviors like frequency of eye contact to more abstract concepts like degree of rapport between a therapist and client (Hill & Lambert, 2004). More inference may reduce reliability.
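One common way to quantify how reliably two coders apply a coding system is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The source does not name a specific reliability statistic, so the choice of kappa here is an assumption, as are the function name and the example codes. A minimal sketch:

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who each assigned one code per unit."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must rate the same non-empty set of units")
    n = len(coder_a)
    # Observed agreement: share of units given the same code by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: based on how often each coder used each code.
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a.keys() | counts_b.keys()) / n**2
    if expected == 1:
        return 1.0  # both coders used a single identical code throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes assigned by two coders to the same four intervals.
coder_a = ["eye_contact", "eye_contact", "nod", "nod"]
coder_b = ["eye_contact", "nod", "nod", "nod"]
kappa = cohens_kappa(coder_a, coder_b)
```

In this example the coders agree on three of four units (75%), but half of that agreement would be expected by chance, so kappa is 0.5. Codes that demand more inference would typically be expected to produce lower values.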

Macroanalytic coding systems

Macroanalytic coding systems involve rating or summarizing behaviors using larger coding units and broader categories that reflect patterns across longer periods of interaction rather than coding small or discrete behavioral acts. 

For example, a macroanalytic coding system may rate the overall degree of therapist warmth or level of client engagement globally for an entire therapy session, requiring the coders to summarize and infer these constructs across the interaction rather than coding smaller behavioral units.

These systems require observers to make more inferences (more time-consuming) but can better capture contextual factors, stability over time, and the interdependent nature of behaviors (Carlson & Grotevant, 1987).

Microanalytic coding systems

Microanalytic coding systems involve rating behaviors using smaller, more discrete coding units and categories.

For example, a microanalytic system may code each instance of eye contact or head nodding during a therapy session. These systems code specific, molecular behaviors as they occur moment-to-moment rather than summarizing actions over longer periods.

Microanalytic systems require less inference from coders and allow for analysis of behavioral contingencies and sequential interactions between therapist and client. However, they are more time-consuming and expensive to implement than macroanalytic approaches.

Mesoanalytic coding systems

Mesoanalytic coding systems attempt to balance macro- and micro-analytic approaches.

In contrast to macroanalytic systems that summarize behaviors in larger chunks, mesoanalytic systems use medium-sized coding units that target more specific behaviors or interaction sequences (Bakeman & Quera, 2017).

For example, a mesoanalytic system may code each instance of a particular type of therapist statement or client emotional expression. However, mesoanalytic systems still use larger units than microanalytic approaches that code every speech onset/offset.

The goal of balancing specificity and feasibility makes mesoanalytic systems well-suited for many research questions (Morris et al., 2014). Mesoanalytic codes can preserve some sequential information while remaining efficient enough for studies with adequate but limited resources.

For instance, a mesoanalytic couple interaction coding system could target key behavior patterns like validation sequences without coding turn-by-turn speech.

In this way, mesoanalytic coding allows reasonable reliability and specificity without requiring extensive training or observation. The mid-level focus offers a pragmatic compromise between depth and breadth in analyzing interactions.

Preventing Coder Drift

Coder drift is measurement error caused by gradual shifts in how observations are rated against the operational definitions, especially when behavioral codes are not clearly specified.

This type of error creeps in when coders fail to regularly review what precise observations constitute or do not constitute the behaviors being measured.

Preventing drift refers to taking active steps to maintain consistency and minimize changes or deviations in how coders rate or evaluate behaviors over time. Specifically, some key ways to prevent coder drift include:
  • Operationalize codes : It is essential that code definitions unambiguously distinguish what interactions represent instances of each coded behavior. 
  • Ongoing training : Returning to those operational definitions through ongoing training serves to recalibrate coder interpretations and reinforce accurate recognition. Having regular “check-in” sessions where coders practice coding the same interactions allows monitoring that they continue applying codes reliably without gradual shifts in interpretation.
  • Using reference videos : Having coders periodically code the same “gold standard” reference videos anchors their judgments and recalibrates them against the original training. Without periodic anchoring to the original specifications, coder decisions tend to drift away from the reliability established at the start.
  • Assessing inter-rater reliability : Tracking statistically whether coders maintain high levels of agreement throughout a study, not just at the start, flags any decline that indicates drift. Sustaining inter-rater agreement means counteracting the common tendency for observer judgments to change during intensive, long-term coding tasks.
  • Recalibrating through discussion : Meetings in which coders openly discuss disagreements help uncover why judgments may be shifting over time and restore consensus on how codes are applied.
  • Adjusting unclear codes : If reliability issues persist, revisiting and refining ambiguous code definitions or anchors can eliminate inconsistencies arising from coder confusion.

Essentially, the goal of preventing coder drift is maintaining standardization and minimizing unintentional biases that may slowly alter how observational data gets rated over periods of extensive coding.

Through the upkeep of skills, continuing calibration to benchmarks, and monitoring consistency, researchers can notice and correct for any creeping changes in coder decision-making over time.
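
One way to monitor agreement across a study is to compute an agreement statistic for each batch of double-coded material and flag batches that fall below a cut-off. The sketch below uses Cohen's kappa for two coders. The 0.70 cut-off, the batch layout, and the function names are assumptions chosen for illustration, and a real study would set its own criterion.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning one categorical code per unit."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Proportion of units on which the two raters chose the same code.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal code frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical code throughout
    return (observed - expected) / (1 - expected)


def flag_drift(batches, threshold=0.70):
    """Return the indices of batches whose kappa falls below the threshold."""
    return [
        i for i, (a, b) in enumerate(batches)
        if cohens_kappa(a, b) < threshold
    ]


# Hypothetical double-coded batches, in the order they were coded.
batches = [
    (["x", "y", "x", "y"], ["x", "y", "x", "y"]),  # early batch
    (["x", "y", "x", "y"], ["x", "x", "x", "y"]),  # later batch
]
drifting = flag_drift(batches)
```

A flagged batch does not show which coder has drifted, only that agreement has dropped, so it is a prompt for the discussion and retraining steps described above.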

Reducing Observer Bias

Observational research is prone to observer biases resulting from coders’ subjective perspectives shaping the interpretation of complex interactions (Burghardt et al., 2012). When coding, personal expectations may unconsciously influence judgments. However, rigorous methods exist to reduce such bias.

Coding Manual

A detailed coding manual minimizes subjectivity by clearly defining what behaviors and interaction dynamics observers should code (Bakeman & Quera, 2011).

High-quality manuals have strong theoretical and empirical grounding, laying out explicit coding procedures and providing rich behavioral examples to anchor code definitions (Lindahl, 2001).

Clear delineation of the frequency, intensity, duration, and type of behaviors constituting each code facilitates reliable judgments and reduces ambiguity for coders. Without clarity on how codes translate to observable interaction, raters risk applying them inconsistently.

Coder Training

Competent coders require both interpersonal perceptiveness and scientific rigor (Wampler & Harper, 2014). Training thoroughly reviews the theoretical basis for coded constructs and teaches the coding system itself.

Multiple “gold standard” criterion videos demonstrate code ranges that trainees independently apply. Coders then meet weekly to establish reliability of 80% or higher agreement both among themselves and with master criterion coding (Hill & Lambert, 2004).
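
The 80% criterion can be checked with a simple percent-agreement calculation against the master coding. The following is a minimal sketch, assuming one code per observation unit. The function names and the example codes are hypothetical.

```python
def percent_agreement(codes, criterion):
    """Percentage of units on which a coder matches the criterion coding."""
    if len(codes) != len(criterion) or not codes:
        raise ValueError("code lists must be non-empty and of equal length")
    matches = sum(c == g for c, g in zip(codes, criterion))
    return 100.0 * matches / len(codes)


def meets_criterion(codes, criterion, threshold=80.0):
    """True if agreement with the criterion coding reaches the threshold."""
    return percent_agreement(codes, criterion) >= threshold


# Hypothetical master coding of ten units and one trainee's attempt.
master = ["warm", "warm", "cold", "warm", "cold",
          "cold", "warm", "warm", "cold", "warm"]
trainee = ["warm", "warm", "cold", "warm", "cold",
           "cold", "warm", "cold", "warm", "warm"]
ready = meets_criterion(trainee, master)
```

Percent agreement is easy to interpret but does not correct for agreement that would occur by chance, which is one reason chance-corrected statistics are often reported alongside it.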

Ongoing training manages coder drift over time. Revisions to unclear codes may also improve reliability. Both careful selection and investment in rigorous training increase quality control.

Blind Methods

To prevent bias, coders should remain unaware of specific study predictions or participant details (Burghardt et al., 2012). Separate data gathering versus coding teams helps maintain blinding.

In addition, scheduling procedures can prevent coders from rating data collected directly from participants with whom they have had personal contact. Maintaining coder independence and blinding enhances objectivity.

References

Bakeman, R., & Quera, V. (2017). Sequential analysis and observational methods for the behavioral sciences. Cambridge University Press.

Burghardt, G. M., Bartmess-LeVasseur, J. N., Browning, S. A., Morrison, K. E., Stec, C. L., Zachau, C. E., & Freeberg, T. M. (2012). Minimizing observer bias in behavioral studies: A review and recommendations. Ethology, 118 (6), 511-517.

Hill, C. E., & Lambert, M. J. (2004). Methodological issues in studying psychotherapy processes and outcomes. In M. J. Lambert (Ed.), Bergin and Garfield’s handbook of psychotherapy and behavior change (5th ed., pp. 84–135). Wiley.

Lindahl, K. M. (2001). Methodological issues in family observational research. In P. K. Kerig & K. M. Lindahl (Eds.), Family observational coding systems: Resources for systemic research (pp. 23–32). Lawrence Erlbaum Associates.

Mehl, M. R., Robbins, M. L., & Deters, F. G. (2012). Naturalistic observation of health-relevant social processes: The electronically activated recorder methodology in psychosomatics. Psychosomatic Medicine, 74 (4), 410–417.

Morris, A. S., Robinson, L. R., & Eisenberg, N. (2014). Applying a multimethod perspective to the study of developmental psychology. In H. T. Reis & C. M. Judd (Eds.), Handbook of research methods in social and personality psychology (2nd ed., pp. 103–123). Cambridge University Press.

Smith, J. A., Maxwell, S. D., & Johnson, G. (2014). The microstructure of everyday life: Analyzing the complex choreography of daily routines through the automatic capture and processing of wearable sensor data. In B. K. Wiederhold & G. Riva (Eds.), Annual Review of Cybertherapy and Telemedicine 2014: Positive Change with Technology (Vol. 199, pp. 62-64). IOS Press.

Traniello, J. F., & Bakker, T. C. (2015). The integrative study of behavioral interactions across the sciences. In T. K. Shackelford & R. D. Hansen (Eds.), The evolution of sexuality (pp. 119-147). Springer.

Wampler, K. S., & Harper, A. (2014). Observational methods in couple and family assessment. In H. T. Reis & C. M. Judd (Eds.), Handbook of research methods in social and personality psychology (2nd ed., pp. 490–502). Cambridge University Press.

What is Observational Study Design and Types

Most people think of a traditional experimental design when they consider research and published research papers. There is, however, a type of research that is more observational in nature, and it is appropriately referred to as “observational studies.”

There are many valuable reasons to utilize an observational study design. But, just as in research experimental design, different methods can be used when you’re considering this type of study. In this article, we’ll look at the advantages and disadvantages of an observational study design, as well as the 3 types of observational studies.

What is Observational Study Design?

An observational study is when researchers are looking at the effect of some type of intervention, risk, a diagnostic test or treatment, without trying to manipulate who is, or who isn’t, exposed to it.

This differs from an experimental study, where the scientists are manipulating who is exposed to the treatment, intervention, etc., by having a control group, or those who are not exposed, and an experimental group, or those who are exposed to the intervention, treatment, etc. In the best studies, the groups are randomized, or chosen by chance.

Evidence from systematic reviews sits at the top of the hierarchy of evidence, which ranks study types by how reliable they are considered to be. Evidence from randomized controlled trials comes next, followed by cohort studies and then case studies.

Cohort studies and case studies are considered observational in design, whereas the randomized controlled trial would be an experimental study.

Let’s take a closer look at the different types of observational study design.

The 3 types of Observational Studies

The different types of observational studies are used for different reasons. Selecting the best type for your research is critical to a successful outcome. One of the main reasons observational studies are used is that a randomized experiment would be considered unethical, for example when a life-saving medication is being used in a public health emergency. They are also used when looking at aetiology, or the cause of a condition or disease, as well as the treatment of rare conditions.

Case Control Observational Study

Researchers in case control studies identify individuals with an existing health issue or condition, or “cases,” along with a similar group without the condition, or “controls.” These two groups are then compared to identify predictors and outcomes. This type of study is helpful to generate a hypothesis that can then be researched.

Cohort Observational Study

This type of observational study is often used to help understand cause and effect. A cohort observational study looks at causes, incidence and prognosis, for example. A cohort is a group of people who are linked in a particular way, for example, a birth cohort would include people who were born within a specific period of time. Scientists might compare what happens to the members of the cohort who have been exposed to some variable to what occurs with members of the cohort who haven’t been exposed.

Cross Sectional Observational Study

Unlike a cohort observational study, a cross sectional observational study does not explore cause and effect, but instead looks at prevalence. Here you would look at data from a particular group at one very specific period of time. Researchers would simply observe and record information about something present in the population, without manipulating any variables or interventions. These types of studies are commonly used in psychology, education and social science.

Advantages and Disadvantages of Observational Study Design

Observational study designs have the distinct advantage of allowing researchers to explore answers to questions where a randomized controlled trial, or RCT, would be unethical. Additionally, if the study is focused on a rare condition, studying existing cases as compared to non-affected individuals might be the most effective way to identify possible causes of the condition. Likewise, if very little is known about a condition or circumstance, a cohort study would be a good study design choice.

A primary advantage of observational study designs is that they can generally be completed quickly and inexpensively. An RCT can take years before the data are compiled and available. RCTs are more complex and involved, with many more logistics and details to iron out, whereas an observational study can be designed and completed more easily.

The main disadvantage of observational study designs is that they’re more open to dispute than an RCT. Of particular concern is confounding bias, which arises when members of a cohort share other characteristics that affect the outcome, so the association reported in the study may not reflect the true cause. For example, a study might find that people who practice good sleeping habits have less heart disease. But those who sleep well may also, in general, eat better and exercise more.
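
The sleep example can be made concrete with a small stratified comparison. The counts below are invented purely for illustration and are not data from any study. They are constructed so that exercise, not sleep, drives heart disease risk, while good sleepers are more likely to exercise.

```python
# Hypothetical counts, keyed by (exercises, sleeps_well),
# with values (number of people, heart disease cases).
counts = {
    (True, True): (80, 8),
    (True, False): (20, 2),
    (False, True): (20, 6),
    (False, False): (80, 24),
}


def risk(groups):
    """Proportion of people with heart disease across the given groups."""
    people = sum(n for n, _ in groups)
    cases = sum(c for _, c in groups)
    return cases / people


def crude_risk(counts, sleeps_well):
    """Risk by sleep habit, ignoring exercise (the confounder)."""
    return risk([v for (_, sl), v in counts.items() if sl == sleeps_well])


def stratum_risk(counts, exercises, sleeps_well):
    """Risk by sleep habit within one exercise stratum."""
    return risk([counts[(exercises, sleeps_well)]])


crude_good = crude_risk(counts, sleeps_well=True)    # all good sleepers
crude_poor = crude_risk(counts, sleeps_well=False)   # all poor sleepers
```

In these made-up numbers the crude comparison makes good sleep look protective, yet within each exercise stratum good and poor sleepers have the same risk. Comparing like with like in this way is one common approach to checking whether an association is explained by a confounder.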

6.5 Observational Research

Learning objectives.

  • List the various types of observational research methods and distinguish between each
  • Describe the strengths and weaknesses of each observational research method.

What Is Observational Research?

The term observational research is used to refer to several different types of non-experimental studies in which behavior is systematically observed and recorded. The goal of observational research is to describe a variable or set of variables. More generally, the goal is to obtain a snapshot of specific characteristics of an individual, group, or setting. As described previously, observational research is non-experimental because nothing is manipulated or controlled, and as such we cannot arrive at causal conclusions using this approach. The data that are collected in observational research studies are often qualitative in nature but they may also be quantitative or both (mixed-methods). There are several different types of observational research designs that will be described below.

Naturalistic Observation

Naturalistic observation is an observational method that involves observing people’s behavior in the environment in which it typically occurs. Thus naturalistic observation is a type of field research (as opposed to a type of laboratory research). Jane Goodall’s famous research on chimpanzees is a classic example of naturalistic observation. Dr. Goodall spent three decades observing chimpanzees in their natural environment in East Africa. She examined such things as chimpanzees’ social structure, mating patterns, gender roles, family structure, and care of offspring by observing them in the wild. However, naturalistic observation could more simply involve observing shoppers in a grocery store, children on a school playground, or psychiatric inpatients in their wards.

Researchers engaged in naturalistic observation usually make their observations as unobtrusively as possible so that participants are not aware that they are being studied. Such an approach is called disguised naturalistic observation. Ethically, this method is considered to be acceptable if the participants remain anonymous and the behavior occurs in a public setting where people would not normally have an expectation of privacy. Grocery shoppers putting items into their shopping carts, for example, are engaged in public behavior that is easily observable by store employees and other shoppers. For this reason, most researchers would consider it ethically acceptable to observe them for a study. On the other hand, one of the arguments against the ethicality of the naturalistic observation of “bathroom behavior” discussed earlier in the book is that people have a reasonable expectation of privacy even in a public restroom and that this expectation was violated.

In cases where it is not ethical or practical to conduct disguised naturalistic observation, researchers can conduct undisguised naturalistic observation, where the participants are made aware of the researcher’s presence and the monitoring of their behavior. However, one concern with undisguised naturalistic observation is reactivity. Reactivity refers to when a measure changes participants’ behavior. In the case of undisguised naturalistic observation, the concern with reactivity is that when people know they are being observed and studied, they may act differently than they normally would. For instance, you may act much differently in a bar if you know that someone is observing you and recording your behaviors, and this would invalidate the study. So disguised observation is less reactive and therefore can have higher validity because people are not aware that their behaviors are being observed and recorded. However, we now know that people often become used to being observed, and with time they begin to behave naturally in the researcher’s presence. In other words, over time people habituate to being observed. Think about reality shows like Big Brother or Survivor where people are constantly being observed and recorded. While they may be on their best behavior at first, in a fairly short amount of time they are flirting, having sex, wearing next to nothing, screaming at each other, and at times acting like complete fools in front of the entire nation.

Participant Observation

Another approach to data collection in observational research is participant observation. In participant observation, researchers become active participants in the group or situation they are studying. Participant observation is very similar to naturalistic observation in that it involves observing people’s behavior in the environment in which it typically occurs. As with naturalistic observation, the data that are collected can include interviews (usually unstructured), notes based on the researchers’ observations and interactions, documents, photographs, and other artifacts. The only difference between naturalistic observation and participant observation is that researchers engaged in participant observation become active members of the group or situations they are studying. The basic rationale for participant observation is that there may be important information that is only accessible to, or can be interpreted only by, someone who is an active participant in the group or situation. Like naturalistic observation, participant observation can be either disguised or undisguised. In disguised participant observation, the researchers pretend to be members of the social group they are observing and conceal their true identity as researchers. In contrast, with undisguised participant observation, the researchers become a part of the group they are studying and disclose their true identity as researchers to the group under investigation. Once again, there are important ethical issues to consider with disguised participant observation. First, no informed consent can be obtained, and second, passive deception is being used. The researcher is passively deceiving the participants by intentionally withholding information about their motivations for being a part of the social group they are studying. But sometimes disguised participation is the only way to access a protective group (like a cult). Further, disguised participant observation is less prone to reactivity than undisguised participant observation.

Rosenhan’s study (1973) [1]   of the experience of people in a psychiatric ward would be considered disguised participant observation because Rosenhan and his pseudopatients were admitted into psychiatric hospitals on the pretense of being patients so that they could observe the way that psychiatric patients are treated by staff. The staff and other patients were unaware of their true identities as researchers.

Another example of participant observation comes from a study by sociologist Amy Wilkins (published in  Social Psychology Quarterly ) on a university-based religious organization that emphasized how happy its members were (Wilkins, 2008) [2] . Wilkins spent 12 months attending and participating in the group’s meetings and social events, and she interviewed several group members. In her study, Wilkins identified several ways in which the group “enforced” happiness—for example, by continually talking about happiness, discouraging the expression of negative emotions, and using happiness as a way to distinguish themselves from other groups.

One of the primary benefits of participant observation is that the researcher is in a much better position to understand the viewpoint and experiences of the people they are studying when they are a part of the social group. The primary limitation with this approach is that the mere presence of the observer could affect the behavior of the people being observed. While this is also a concern with naturalistic observation, when researchers become active members of the social group they are studying, additional concerns arise that they may change the social dynamics and/or influence the behavior of the people they are studying. Similarly, if the researcher acts as a participant observer, there can be concerns with biases resulting from developing relationships with the participants. Concretely, the researcher may become less objective, resulting in more experimenter bias.

Structured Observation

Another observational method is structured observation. Here the investigator makes careful observations of one or more specific behaviors in a particular setting that is more structured than the settings used in naturalistic and participant observation. Often the setting in which the observations are made is not the natural setting. Rather, the researcher may observe people in the laboratory environment. Alternatively, the researcher may observe people in a natural setting (like a classroom setting) that they have structured in some way, for instance by introducing some specific task participants are to engage in or by introducing a specific social situation or manipulation. Structured observation is very similar to naturalistic observation and participant observation in that in all cases researchers are observing naturally occurring behavior. However, the emphasis in structured observation is on gathering quantitative rather than qualitative data. Researchers using this approach are interested in a limited set of behaviors. This allows them to quantify the behaviors they are observing. In other words, structured observation is less global than naturalistic and participant observation because the researcher engaged in structured observations is interested in a small number of specific behaviors. Therefore, rather than recording everything that happens, the researcher only focuses on very specific behaviors of interest.

Researchers Robert Levine and Ara Norenzayan used structured observation to study differences in the “pace of life” across countries (Levine & Norenzayan, 1999) [3] . One of their measures involved observing pedestrians in a large city to see how long it took them to walk 60 feet. They found that people in some countries walked reliably faster than people in other countries. For example, people in Canada and Sweden covered 60 feet in just under 13 seconds on average, while people in Brazil and Romania took close to 17 seconds. When structured observation  takes place in the complex and even chaotic “real world,” the questions of when, where, and under what conditions the observations will be made, and who exactly will be observed are important to consider. Levine and Norenzayan described their sampling process as follows:

“Male and female walking speed over a distance of 60 feet was measured in at least two locations in main downtown areas in each city. Measurements were taken during main business hours on clear summer days. All locations were flat, unobstructed, had broad sidewalks, and were sufficiently uncrowded to allow pedestrians to move at potentially maximum speeds. To control for the effects of socializing, only pedestrians walking alone were used. Children, individuals with obvious physical handicaps, and window-shoppers were not timed. Thirty-five men and 35 women were timed in most cities.” (p. 186).

Precise specification of the sampling process in this way makes data collection manageable for the observers, and it also provides some control over important extraneous variables. For example, by making their observations on clear summer days in all countries, Levine and Norenzayan controlled for effects of the weather on people’s walking speeds.

In Levine and Norenzayan’s study, measurement was relatively straightforward. They simply measured out a 60-foot distance along a city sidewalk and then used a stopwatch to time participants as they walked over that distance.
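
Turning stopwatch times into walking speeds is simple arithmetic, sketched below in Python. The 60-foot distance comes from the study description above. The function names and the sample timings are assumptions made for illustration and are not the researchers' data or analysis.

```python
from statistics import mean

DISTANCE_FEET = 60.0     # distance timed in the study described above
FEET_TO_METERS = 0.3048  # exact conversion factor


def walking_speed_mps(seconds, distance_feet=DISTANCE_FEET):
    """Walking speed in meters per second for one timed pedestrian."""
    if seconds <= 0:
        raise ValueError("time must be positive")
    return distance_feet * FEET_TO_METERS / seconds


def mean_speed_mps(timings):
    """Average speed across a sample of stopwatch times (seconds)."""
    return mean(walking_speed_mps(t) for t in timings)


# Hypothetical stopwatch times for a handful of pedestrians in one city.
sample_timings = [12.4, 13.1, 12.9, 13.6]
city_speed = mean_speed_mps(sample_timings)
```

On this calculation, covering 60 feet in 13 seconds works out to roughly 1.4 meters per second, and taking 17 seconds works out to roughly 1.1 meters per second.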

As another example, researchers Robert Kraut and Robert Johnston wanted to study bowlers’ reactions to their shots, both when they were facing the pins and then when they turned toward their companions (Kraut & Johnston, 1979) [4] . But what “reactions” should they observe? Based on previous research and their own pilot testing, Kraut and Johnston created a list of reactions that included “closed smile,” “open smile,” “laugh,” “neutral face,” “look down,” “look away,” and “face cover” (covering one’s face with one’s hands). The observers committed this list to memory and then practiced by coding the reactions of bowlers who had been videotaped. During the actual study, the observers spoke into an audio recorder, describing the reactions they observed. Among the most interesting results of this study was that bowlers rarely smiled while they still faced the pins. They were much more likely to smile after they turned toward their companions, suggesting that smiling is not purely an expression of happiness but also a form of social communication.

When the observations require a judgment on the part of the observers—as in Kraut and Johnston’s study—this process is often described as  coding . Coding generally requires clearly defining a set of target behaviors. The observers then categorize participants individually in terms of which behavior they have engaged in and the number of times they engaged in each behavior. The observers might even record the duration of each behavior. The target behaviors must be defined in such a way that different observers code them in the same way. This difficulty with coding is the issue of interrater reliability, as mentioned in Chapter 4. Researchers are expected to demonstrate the interrater reliability of their coding procedure by having multiple raters code the same behaviors independently and then showing that the different observers are in close agreement. Kraut and Johnston, for example, video recorded a subset of their participants’ reactions and had two observers independently code them. The two observers showed that they agreed on the reactions that were exhibited 97% of the time, indicating good interrater reliability.
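
Once observers have coded each reaction, the coded records can be tallied by condition. The sketch below uses the reaction codes listed above for Kraut and Johnston's study, but the observation records, the orientation labels, and the function names are hypothetical and serve only to show the counting step.

```python
from collections import Counter

# Reaction codes from the list described above.
CODES = {
    "closed smile", "open smile", "laugh", "neutral face",
    "look down", "look away", "face cover",
}
SMILES = {"closed smile", "open smile"}


def tally(observations):
    """Count coded reactions per (orientation, code) pair."""
    counts = Counter()
    for orientation, code in observations:
        if code not in CODES:
            raise ValueError(f"unknown code: {code!r}")
        counts[(orientation, code)] += 1
    return counts


def smile_rate(counts, orientation):
    """Proportion of coded reactions in one orientation that were smiles."""
    total = sum(n for (o, _), n in counts.items() if o == orientation)
    smiles = sum(
        n for (o, c), n in counts.items()
        if o == orientation and c in SMILES
    )
    return smiles / total if total else 0.0


# Hypothetical coded records: (orientation, reaction code).
observations = [
    ("pins", "neutral face"), ("pins", "look down"),
    ("pins", "closed smile"), ("pins", "neutral face"),
    ("companions", "open smile"), ("companions", "laugh"),
    ("companions", "closed smile"), ("companions", "neutral face"),
]
counts = tally(observations)
```

Rejecting any code that is not in the predefined list is a small safeguard that keeps observers working from the same set of target behaviors.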

One of the primary benefits of structured observation is that it is far more efficient than naturalistic and participant observation. Since the researchers are focused on specific behaviors, this reduces time and expense. Also, the environment is often structured to encourage the behaviors of interest, which again means that researchers do not have to invest as much time in waiting for the behaviors of interest to naturally occur. Finally, researchers using this approach can clearly exert greater control over the environment. However, when researchers exert more control over the environment, it may make the environment less natural, which decreases external validity. It is less clear, for instance, whether structured observations made in a laboratory environment will generalize to a real world environment. Furthermore, since researchers engaged in structured observation are often not disguised, there may be more concerns with reactivity.

Case Studies

A  case study  is an in-depth examination of an individual. Sometimes case studies are also completed on social units (e.g., a cult) and events (e.g., a natural disaster). Most commonly in psychology, however, case studies provide a detailed description and analysis of an individual. Often the individual has a rare or unusual condition or disorder or has damage to a specific region of the brain.

Like many observational research methods, case studies tend to be more qualitative in nature. Case study methods involve an in-depth, and often a longitudinal, examination of an individual. Depending on the focus of the case study, individuals may or may not be observed in their natural setting. If the natural setting is not what is of interest, then the individual may be brought into a therapist’s office or a researcher’s lab for study. Also, the bulk of the case study report will focus on in-depth descriptions of the person rather than on statistical analyses. With that said, some quantitative data may also be included in the write-up of a case study. For instance, an individual’s depression score may be compared to normative scores, or their scores before and after treatment may be compared. As with other qualitative methods, a variety of different methods and tools can be used to collect information on the case. For instance, interviews, naturalistic observation, structured observation, psychological testing (e.g., IQ test), and/or physiological measurements (e.g., brain scans) may be used to collect information on the individual.

HM is one of the most famous case studies in psychology. HM suffered from intractable and very severe epilepsy. A surgeon localized HM’s epilepsy to his medial temporal lobe and in 1953 removed large sections of his hippocampus in an attempt to stop the seizures. The treatment was a success in that it resolved his epilepsy, and his IQ and personality were unaffected. However, the doctors soon realized that HM exhibited a strange form of amnesia, called anterograde amnesia. HM was able to carry on a conversation, and he could remember short strings of letters, digits, and words. In other words, his short-term memory was preserved. However, HM could not commit new events to memory. He had lost the ability to transfer information from his short-term memory to his long-term memory, something memory researchers call consolidation. So while he could carry on a conversation with someone, he would completely forget the conversation after it ended. This was an extremely important case study for memory researchers because it suggested a dissociation between short-term memory and long-term memory: the two appeared to be different abilities sub-served by different areas of the brain. It also suggested that the temporal lobes are particularly important for consolidating new information (i.e., for transferring information from short-term memory to long-term memory).

www.youtube.com/watch?v=KkaXNvzE4pk

The history of psychology is filled with influential case studies, such as Sigmund Freud’s description of “Anna O.” (see Note 6.1, “The Case of ‘Anna O.’”) and John Watson and Rosalie Rayner’s description of Little Albert (Watson & Rayner, 1920) [5], who learned to fear a white rat, along with other furry objects, when the researchers made a loud noise while he was playing with the rat.

The Case of “Anna O.”

Sigmund Freud used the case of a young woman he called “Anna O.” to illustrate many principles of his theory of psychoanalysis (Freud, 1961) [6] . (Her real name was Bertha Pappenheim, and she was an early feminist who went on to make important contributions to the field of social work.) Anna had come to Freud’s colleague Josef Breuer around 1880 with a variety of odd physical and psychological symptoms. One of them was that for several weeks she was unable to drink any fluids. According to Freud,

She would take up the glass of water that she longed for, but as soon as it touched her lips she would push it away like someone suffering from hydrophobia.…She lived only on fruit, such as melons, etc., so as to lessen her tormenting thirst. (p. 9)

But according to Freud, a breakthrough came one day while Anna was under hypnosis.

[S]he grumbled about her English “lady-companion,” whom she did not care for, and went on to describe, with every sign of disgust, how she had once gone into this lady’s room and how her little dog—horrid creature!—had drunk out of a glass there. The patient had said nothing, as she had wanted to be polite. After giving further energetic expression to the anger she had held back, she asked for something to drink, drank a large quantity of water without any difficulty, and awoke from her hypnosis with the glass at her lips; and thereupon the disturbance vanished, never to return. (p. 9)

Freud’s interpretation was that Anna had repressed the memory of this incident along with the emotion that it triggered and that this was what had caused her inability to drink. Furthermore, her recollection of the incident, along with her expression of the emotion she had repressed, caused the symptom to go away.

As an illustration of Freud’s theory, the case study of Anna O. is quite effective. As evidence for the theory, however, it is essentially worthless. The description provides no way of knowing whether Anna had really repressed the memory of the dog drinking from the glass, whether this repression had caused her inability to drink, or whether recalling this “trauma” relieved the symptom. It is also unclear from this case study how typical or atypical Anna’s experience was.

Figure 10.1 Anna O. “Anna O.” was the subject of a famous case study used by Freud to illustrate the principles of psychoanalysis. Source: http://en.wikipedia.org/wiki/File:Pappenheim_1882.jpg


Case studies are useful because they provide a level of detailed analysis not found in many other research methods, and greater insights may be gained from this more detailed analysis. As a result of the case study, the researcher may gain a sharpened understanding of what might become important to look at more extensively in future, more controlled research. Case studies are also often the only way to study rare conditions, because it may be impossible to find a large enough sample of individuals with the condition to use quantitative methods. Although at first glance a case study of a rare individual might seem to tell us little about ourselves, such studies often do provide insights into normal behavior. The case of HM, for example, provided important insights into the role of the hippocampus in memory consolidation. However, it is important to note that while case studies can provide insights into certain areas and variables to study, and can be useful in helping develop theories, they should never be used as evidence for theories. In other words, case studies can be used as inspiration to formulate theories and hypotheses, but those hypotheses and theories then need to be formally tested using more rigorous quantitative methods.

The reason case studies shouldn’t be used to provide support for theories is that they suffer from problems with both internal and external validity. Case studies lack the controls that true experiments contain, so they cannot be used to determine causation. For instance, during HM’s surgery, the surgeon may have accidentally lesioned another area of HM’s brain (indeed, questions about a possible separate brain lesion arose after HM’s death and the dissection of his brain), and that lesion may have contributed to his inability to consolidate new information. With case studies we cannot rule out these sorts of alternative explanations, so, as with all observational methods, case studies do not permit determination of causation. In addition, because case studies often involve a single individual, and typically a very atypical one, researchers cannot generalize their conclusions to other individuals. Recall that with most research designs there is a trade-off between internal and external validity. With case studies, however, there are problems with both, so there are limits both to the ability to determine causation and to the ability to generalize the results. A final limitation of case studies is that ample opportunity exists for the theoretical biases of the researcher to color the case description. Indeed, there have been accusations that the researcher who studied HM destroyed unpublished data that contradicted her theory about how memories are consolidated. A New York Times article describes some of the controversies that ensued after HM’s death and the analysis of his brain: https://www.nytimes.com/2016/08/07/magazine/the-brain-that-couldnt-remember.html?_r=0

Archival Research

Another approach that is often considered observational research is the use of  archival research  which involves analyzing data that have already been collected for some other purpose. An example is a study by Brett Pelham and his colleagues on “implicit egotism”—the tendency for people to prefer people, places, and things that are similar to themselves (Pelham, Carvallo, & Jones, 2005) [7] . In one study, they examined Social Security records to show that women with the names Virginia, Georgia, Louise, and Florence were especially likely to have moved to the states of Virginia, Georgia, Louisiana, and Florida, respectively.

As with naturalistic observation, measurement can be more or less straightforward when working with archival data. For example, counting the number of people named Virginia who live in various states based on Social Security records is relatively straightforward. But consider a study by Christopher Peterson and his colleagues on the relationship between optimism and health using data that had been collected many years before for a study on adult development (Peterson, Seligman, & Vaillant, 1988) [8] . In the 1940s, healthy male college students had completed an open-ended questionnaire about difficult wartime experiences. In the late 1980s, Peterson and his colleagues reviewed the men’s questionnaire responses to obtain a measure of explanatory style—their habitual ways of explaining bad events that happen to them. More pessimistic people tend to blame themselves and expect long-term negative consequences that affect many aspects of their lives, while more optimistic people tend to blame outside forces and expect limited negative consequences. To obtain a measure of explanatory style for each participant, the researchers used a procedure in which all negative events mentioned in the questionnaire responses, and any causal explanations for them were identified and written on index cards. These were given to a separate group of raters who rated each explanation in terms of three separate dimensions of optimism-pessimism. These ratings were then averaged to produce an explanatory style score for each participant. The researchers then assessed the statistical relationship between the men’s explanatory style as undergraduate students and archival measures of their health at approximately 60 years of age. The primary result was that the more optimistic the men were as undergraduate students, the healthier they were as older men. Pearson’s  r  was +.25.
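The scoring and correlation steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 1 to 7 rating scale, the function names, and all of the numbers are hypothetical and are not taken from Peterson and colleagues' data or procedure. Each causal explanation is assumed to receive one rating on each of three dimensions, the ratings are averaged into a single explanatory style score per participant, and Pearson's r is then computed between those scores and a later health measure.

```python
from math import sqrt
from statistics import mean


def explanatory_style_score(ratings_per_explanation):
    """Average the raters' scores over all of one participant's explanations.

    ratings_per_explanation is a list of lists. Each inner list holds the
    ratings (hypothetical 1-7 scale) given to one causal explanation on
    the three rated dimensions.
    """
    per_explanation = [mean(r) for r in ratings_per_explanation]
    return mean(per_explanation)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


if __name__ == "__main__":
    # Made-up ratings for four hypothetical participants (higher = more optimistic).
    ratings = {
        "P1": [[6, 5, 6], [5, 5, 4]],
        "P2": [[2, 3, 2], [3, 3, 3]],
        "P3": [[4, 4, 5], [4, 3, 4]],
        "P4": [[6, 6, 7], [5, 6, 6]],
    }
    # Made-up later-life health scores (higher = healthier).
    health = {"P1": 4.0, "P2": 3.5, "P3": 3.0, "P4": 4.5}

    ids = sorted(ratings)
    style = [explanatory_style_score(ratings[p]) for p in ids]
    print(pearson_r(style, [health[p] for p in ids]))
```

A positive value of r from such a calculation would indicate that higher optimism scores tend to go with better health scores; by itself it says nothing about whether one causes the other.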

This method is an example of  content analysis —a family of systematic approaches to measurement using complex archival data. Just as structured observation requires specifying the behaviors of interest and then noting them as they occur, content analysis requires specifying keywords, phrases, or ideas and then finding all occurrences of them in the data. These occurrences can then be counted, timed (e.g., the amount of time devoted to entertainment topics on the nightly news show), or analyzed in a variety of other ways.
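As a rough illustration of the counting step in content analysis, the sketch below tallies how often pre-specified keywords occur in a piece of text. The codebook, its category names, and the sample sentence are all invented for this example. Real content analysis usually involves trained human coders and more careful coding rules than simple keyword matching.

```python
import re
from collections import Counter


def count_codes(text, codebook):
    """Count whole-word occurrences of each category's keywords in text.

    codebook maps a category name to a list of keywords or phrases.
    Matching is case-insensitive.
    """
    lowered = text.lower()
    counts = Counter()
    for category, terms in codebook.items():
        for term in terms:
            # \b restricts matches to whole words or phrases.
            pattern = r"\b" + re.escape(term.lower()) + r"\b"
            counts[category] += len(re.findall(pattern, lowered))
    return counts


if __name__ == "__main__":
    # Hypothetical codebook and transcript snippet.
    codebook = {
        "entertainment": ["film", "award"],
        "politics": ["mayor", "budget"],
    }
    transcript = (
        "The film star won an award. Later, the mayor spoke about "
        "the budget and the film festival."
    )
    print(count_codes(transcript, codebook))
```

Specifying the codebook before looking at the data mirrors the way structured observation specifies the behaviors of interest in advance.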

Key Takeaways

  • There are several different approaches to observational research including naturalistic observation, participant observation, structured observation, case studies, and archival research.
  • Naturalistic observation is used to observe people in their natural setting; participant observation involves becoming an active member of the group being observed; structured observation involves coding a small number of behaviors in a quantitative manner; case studies are typically used to collect in-depth information on a single individual; and archival research involves analyzing existing data.

  • Describe one problem related to internal validity.
  • Describe one problem related to external validity.
  • Generate one hypothesis suggested by the case study that might be interesting to test in a systematic single-subject or group study.

  • [1] Rosenhan, D. L. (1973). On being sane in insane places. Science, 179, 250–258.
  • [2] Wilkins, A. (2008). “Happier than Non-Christians”: Collective emotions and symbolic boundaries among evangelical Christians. Social Psychology Quarterly, 71, 281–301.
  • [3] Levine, R. V., & Norenzayan, A. (1999). The pace of life in 31 countries. Journal of Cross-Cultural Psychology, 30, 178–205.
  • [4] Kraut, R. E., & Johnston, R. E. (1979). Social and emotional messages of smiling: An ethological approach. Journal of Personality and Social Psychology, 37, 1539–1553.
  • [5] Watson, J. B., & Rayner, R. (1920). Conditioned emotional reactions. Journal of Experimental Psychology, 3, 1–14.
  • [6] Freud, S. (1961). Five lectures on psycho-analysis. New York, NY: Norton.
  • [7] Pelham, B. W., Carvallo, M., & Jones, J. T. (2005). Implicit egotism. Current Directions in Psychological Science, 14, 106–110.
  • [8] Peterson, C., Seligman, M. E. P., & Vaillant, G. E. (1988). Pessimistic explanatory style is a risk factor for physical illness: A thirty-five year longitudinal study. Journal of Personality and Social Psychology, 55, 23–27.


Ch 2: Psychological Research Methods

Children sit in front of a bank of television screens. A sign on the wall says, “Some content may not be suitable for children.”

Have you ever wondered whether the violence you see on television affects your behavior? Are you more likely to behave aggressively in real life after watching people behave violently in dramatic situations on the screen? Or, could seeing fictional violence actually get aggression out of your system, causing you to be more peaceful? How are children influenced by the media they are exposed to? A psychologist interested in the relationship between behavior and exposure to violent images might ask these very questions.

The topic of violence in the media today is contentious. Since ancient times, humans have been concerned about the effects of new technologies on our behaviors and thinking processes. The Greek philosopher Socrates, for example, worried that writing—a new technology at that time—would diminish people’s ability to remember because they could rely on written records rather than committing information to memory. In our world of quickly changing technologies, questions about the effects of media continue to emerge. Is it okay to talk on a cell phone while driving? Are headphones good to use in a car? What impact does text messaging have on reaction time while driving? These are types of questions that psychologist David Strayer asks in his lab.

Watch this short video to see how Strayer utilizes the scientific method to reach important conclusions regarding technology and driving safety.


How can we go about finding answers that are supported not by mere opinion, but by evidence that we can all agree on? The findings of psychological research can help us navigate issues like this.

Introduction to the Scientific Method

Learning Objectives

  • Explain the steps of the scientific method
  • Describe why the scientific method is important to psychology
  • Summarize the processes of informed consent and debriefing
  • Explain how research involving humans or animals is regulated

photograph of the word "research" from a dictionary with a pen pointing at the word.

Scientists are engaged in explaining and understanding how the world around them works, and they are able to do so by coming up with theories that generate hypotheses that are testable and falsifiable. Theories that stand up to their tests are retained and refined, while those that do not are discarded or modified. In this way, research enables scientists to separate fact from simple opinion. Having good information generated from research aids in making wise decisions both in public policy and in our personal lives. In this section, you’ll see how psychologists use the scientific method to study and understand behavior.

The Scientific Process

A skull has a large hole bored through the forehead.

The goal of all scientists is to better understand the world around them. Psychologists focus their attention on understanding behavior, as well as the cognitive (mental) and physiological (body) processes that underlie behavior. In contrast to other methods that people use to understand the behavior of others, such as intuition and personal experience, the hallmark of scientific research is that there is evidence to support a claim. Scientific knowledge is empirical: it is grounded in objective, tangible evidence that can be observed time and time again, regardless of who is observing.

While behavior is observable, the mind is not. If someone is crying, we can see the behavior. However, the reason for the behavior is more difficult to determine. Is the person crying due to being sad, in pain, or happy? Sometimes we can learn the reason for someone’s behavior by simply asking a question, like “Why are you crying?” However, there are situations in which an individual is either uncomfortable or unwilling to answer the question honestly, or is incapable of answering. For example, infants would not be able to explain why they are crying. In such circumstances, the psychologist must be creative in finding ways to better understand behavior. This module explores how scientific knowledge is generated, and how important that knowledge is in forming decisions in our personal lives and in the public domain.

Process of Scientific Research

Flowchart of the scientific method. It begins with make an observation, then ask a question, form a hypothesis that answers the question, make a prediction based on the hypothesis, do an experiment to test the prediction, analyze the results, prove the hypothesis correct or incorrect, then report the results.

Scientific knowledge is advanced through a process known as the scientific method. Basically, ideas (in the form of theories and hypotheses) are tested against the real world (in the form of empirical observations), and those empirical observations lead to more ideas that are tested against the real world, and so on.

The basic steps in the scientific method are:

  • Observe a natural phenomenon and define a question about it
  • Make a hypothesis, or potential solution to the question
  • Test the hypothesis
  • If the results support the hypothesis, find more evidence or look for counter-evidence
  • If the results do not support the hypothesis, create a new hypothesis or try again
  • Draw conclusions and repeat: the scientific method is never-ending, and no result is ever considered perfect

In order to ask an important question that may improve our understanding of the world, a researcher must first observe natural phenomena. By making observations, a researcher can define a useful question. After finding a question to answer, the researcher can then make a prediction (a hypothesis) about what he or she thinks the answer will be. This prediction is usually a statement about the relationship between two or more variables. After making a hypothesis, the researcher will then design an experiment to test his or her hypothesis and evaluate the data gathered. These data will either support or refute the hypothesis. Based on the conclusions drawn from the data, the researcher will then find more evidence to support the hypothesis, look for counter-evidence to further strengthen the hypothesis, revise the hypothesis and create a new experiment, or continue to incorporate the information gathered to answer the research question.

Basic Principles of the Scientific Method

Two key concepts in the scientific approach are theory and hypothesis. A theory is a well-developed set of ideas that proposes an explanation for observed phenomena and can be used to make predictions about future observations. A hypothesis is a testable prediction that is arrived at logically from a theory. It is often worded as an if-then statement (e.g., if I study all night, I will get a passing grade on the test). The hypothesis is extremely important because it bridges the gap between the realm of ideas and the real world. As specific hypotheses are tested, theories are modified and refined to reflect and incorporate the results of these tests.

A diagram has four boxes: the top is labeled “theory,” the right is labeled “hypothesis,” the bottom is labeled “research,” and the left is labeled “observation.” Arrows flow in the direction from top to right to bottom to left and back to the top, clockwise. The top right arrow is labeled “use the theory to form a hypothesis,” the bottom right arrow is labeled “design a study to test the hypothesis,” the bottom left arrow is labeled “perform the research,” and the top left arrow is labeled “create or modify the theory.”

Other key components in following the scientific method include verifiability, predictability, falsifiability, and fairness. Verifiability means that an experiment must be replicable by another researcher. To achieve verifiability, researchers must make sure to document their methods and clearly explain how their experiment is structured and why it produces certain results.

Predictability in a scientific theory implies that the theory should enable us to make predictions about future events. The precision of these predictions is a measure of the strength of the theory.

Falsifiability refers to whether a hypothesis can be disproved. For a hypothesis to be falsifiable, it must be logically possible to make an observation or do a physical experiment that would show that there is no support for the hypothesis. A hypothesis that has not yet been shown to be false is not thereby proven, because future testing may still disprove it. Falsifiability does not require that a hypothesis actually be shown to be false, only that it can be tested in a way that could contradict it.

To determine whether a hypothesis is supported, psychological researchers conduct hypothesis testing using statistics. Hypothesis testing is a statistical procedure that estimates how likely it is that results like those observed would arise by random chance alone. If hypothesis testing reveals that the results are “statistically significant,” there is support for the hypothesis, and the researchers can be reasonably confident that their result was not due to random chance. If the results are not statistically significant, the researchers’ hypothesis was not supported.
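One simple way to see what “not due to random chance” means is a permutation test, sketched below. This is only one of many hypothesis-testing procedures and is not necessarily the one a given study would use; the function name, the default number of shuffles, and the scores are all hypothetical. The idea is to shuffle the group labels many times and ask how often a difference in means at least as large as the observed one appears by chance alone.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Estimate a two-sided p-value for a difference in group means.

    The scores are pooled and reshuffled repeatedly; the p-value is the
    share of shuffles whose absolute mean difference is at least as large
    as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    return extreme / n_permutations


if __name__ == "__main__":
    # Made-up scores for two hypothetical groups of participants.
    group_a = [12, 15, 11, 14, 13, 16]
    group_b = [9, 10, 8, 12, 7, 11]
    p = permutation_p_value(group_a, group_b)
    # By a common convention, p below .05 is called statistically significant.
    print(p, "significant" if p < 0.05 else "not significant")
```

A small p-value means the observed difference would be unusual if only chance were at work; it is not the probability that the hypothesis itself is true.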

Fairness implies that all data must be considered when evaluating a hypothesis. A researcher cannot pick and choose what data to keep and what to discard or focus specifically on data that support or do not support a particular hypothesis. All data must be accounted for, even if they invalidate the hypothesis.

Applying the Scientific Method

To see how this process works, let’s consider a specific theory and a hypothesis that might be generated from that theory. As you’ll learn in a later module, the James-Lange theory of emotion asserts that emotional experience relies on the physiological arousal associated with the emotional state. If you walked out of your home and discovered a very aggressive snake waiting on your doorstep, your heart would begin to race and your stomach churn. According to the James-Lange theory, these physiological changes would result in your feeling of fear. A hypothesis that could be derived from this theory might be that a person who is unaware of the physiological arousal that the sight of the snake elicits will not feel fear.

Remember that a good scientific hypothesis is falsifiable, or capable of being shown to be incorrect. Recall from the introductory module that Sigmund Freud had lots of interesting ideas to explain various human behaviors (Figure 5). However, a major criticism of Freud’s theories is that many of his ideas are not falsifiable; for example, it is impossible to imagine empirical observations that would disprove the existence of the id, the ego, and the superego, the three elements of personality described in Freud’s theories. Despite this, Freud’s theories are widely taught in introductory psychology texts because of their historical significance for personality psychology and psychotherapy, and they remain at the root of many modern forms of therapy.

(a) A photograph shows Freud holding a cigar. (b) The mind’s conscious and unconscious states are illustrated as an iceberg floating in water. Beneath the water’s surface in the “unconscious” area are the id, ego, and superego. The area just below the water’s surface is labeled “preconscious.” The area above the water’s surface is labeled “conscious.”

In contrast, the James-Lange theory does generate falsifiable hypotheses, such as the one described above. Some individuals who suffer significant injuries to their spinal columns are unable to feel the bodily changes that often accompany emotional experiences. Therefore, we could test the hypothesis by determining how emotional experiences differ between individuals who have the ability to detect these changes in their physiological arousal and those who do not. In fact, this research has been conducted and while the emotional experiences of people deprived of an awareness of their physiological arousal may be less intense, they still experience emotion (Chwalisz, Diener, & Gallagher, 1988).


Why the Scientific Method Is Important for Psychology

The use of the scientific method is one of the main features that separates modern psychology from earlier philosophical inquiries about the mind. Compared to chemistry, physics, and other “natural sciences,” psychology has long been considered one of the “social sciences” because of the subjective nature of the things it seeks to study. Many of the concepts that psychologists are interested in—such as aspects of the human mind, behavior, and emotions—are subjective and cannot be directly measured. Psychologists often rely instead on behavioral observations and self-reported data, which are considered by some to be illegitimate or lacking in methodological rigor. Applying the scientific method to psychology, therefore, helps to standardize the approach to understanding its very different types of information.

The scientific method allows psychological data to be replicated and confirmed in many instances, under different circumstances, and by a variety of researchers. Through replication of experiments, new generations of psychologists can reduce errors and broaden the applicability of theories. It also allows theories to be tested and validated instead of simply being conjectures that could never be verified or falsified. All of this allows psychologists to gain a stronger understanding of how the human mind works.

Scientific articles published in journals and psychology papers written in the style of the American Psychological Association (i.e., in “APA style”) are structured around the scientific method. These papers include an Introduction, which introduces the background information and outlines the hypotheses; a Methods section, which outlines the specifics of how the experiment was conducted to test the hypothesis; a Results section, which includes the statistics that tested the hypothesis and states whether it was supported or not supported; and a Discussion and Conclusion, which state the implications of finding support for, or no support for, the hypothesis. Writing articles and papers that adhere to the scientific method makes it easy for future researchers to repeat the study and attempt to replicate the results.

Ethics in Research

Today, scientists agree that good research is ethical in nature and is guided by a basic respect for human dignity and safety. However, as you will read in the discussion of the Tuskegee Syphilis Study, this has not always been the case. Modern researchers must demonstrate that the research they perform is ethically sound. This section presents how ethical considerations affect the design and implementation of research conducted today.

Research Involving Human Participants

Any experiment involving the participation of human subjects is governed by extensive, strict guidelines designed to ensure that the experiment does not result in harm. Any research institution that receives federal support for research involving human participants must have access to an institutional review board (IRB). The IRB is a committee of individuals often made up of members of the institution’s administration, scientists, and community members (Figure 6). The purpose of the IRB is to review proposals for research that involves human participants. The IRB reviews these proposals with the principles mentioned above in mind, and generally, approval from the IRB is required in order for the experiment to proceed.

A photograph shows a group of people seated around tables in a meeting room.

An institution’s IRB requires several components in any experiment it approves. For one, each participant must sign an informed consent form before they can participate in the experiment. An informed consent form provides a written description of what participants can expect during the experiment, including potential risks and implications of the research. It also lets participants know that their involvement is completely voluntary and can be discontinued without penalty at any time. Furthermore, the informed consent form guarantees that any data collected in the experiment will remain completely confidential. In cases where research participants are under the age of 18, the parents or legal guardians are required to sign the informed consent form.

While the informed consent form should be as honest as possible in describing exactly what participants will be doing, sometimes deception is necessary to prevent participants’ knowledge of the exact research question from affecting the results of the study. Deception involves purposely misleading experiment participants in order to maintain the integrity of the experiment, but not to the point where the deception could be considered harmful. For example, if we are interested in how our opinion of someone is affected by their attire, we might use deception in describing the experiment to prevent that knowledge from affecting participants’ responses. In cases where deception is involved, participants must receive a full debriefing  upon conclusion of the study—complete, honest information about the purpose of the experiment, how the data collected will be used, the reasons why deception was necessary, and information about how to obtain additional information about the study.

Dig Deeper: Ethics and the Tuskegee Syphilis Study

Unfortunately, the ethical guidelines that exist for research today were not always applied in the past. In 1932, poor, rural, black, male sharecroppers from Tuskegee, Alabama, were recruited to participate in an experiment conducted by the U.S. Public Health Service, with the aim of studying syphilis in black men (Figure 7). In exchange for free medical care, meals, and burial insurance, 600 men agreed to participate in the study. About two-thirds of the men (399) tested positive for syphilis, and they served as the experimental group (given that the researchers could not randomly assign participants to groups, this represents a quasi-experiment). The remaining syphilis-free individuals served as the control group. However, the individuals who tested positive for syphilis were never informed that they had the disease.

While there was no treatment for syphilis when the study began, by 1947 penicillin was recognized as an effective treatment for the disease. Despite this, no penicillin was administered to the participants in this study, and the participants were not allowed to seek treatment at any other facilities if they continued in the study. Over the course of 40 years, many of the participants unknowingly spread syphilis to their wives (and subsequently to children born to their wives) and eventually died because they never received treatment for the disease. The study was discontinued in 1972, when it was exposed by the national press (Tuskegee University, n.d.). The resulting outrage over the experiment led directly to the National Research Act of 1974 and the strict ethical guidelines for research on humans described in this chapter. Why is this study unethical? How were the men who participated and their families harmed as a function of this research?

A photograph shows a person administering an injection.

Learn more about the Tuskegee Syphilis Study on the CDC website.

Research Involving Animal Subjects

A photograph shows a rat.

This does not mean that animal researchers are immune to ethical concerns. Indeed, the humane and ethical treatment of animal research subjects is a critical aspect of this type of research. Researchers must design their experiments to minimize any pain or distress experienced by animals serving as research subjects.

Whereas IRBs review research proposals that involve human participants, animal experimental proposals are reviewed by an Institutional Animal Care and Use Committee (IACUC). An IACUC consists of institutional administrators, scientists, veterinarians, and community members. This committee is charged with ensuring that all experimental proposals require the humane treatment of animal research subjects. It also conducts semi-annual inspections of all animal facilities to ensure that the research protocols are being followed. No animal research project can proceed without the committee’s approval.

Introduction to Approaches to Research

  • Differentiate between descriptive, correlational, and experimental research
  • Explain the strengths and weaknesses of case studies, naturalistic observation, and surveys
  • Describe the strengths and weaknesses of archival research
  • Compare longitudinal and cross-sectional approaches to research
  • Explain what a correlation coefficient tells us about the relationship between variables
  • Describe why correlation does not mean causation
  • Describe the experimental process, including ways to control for bias
  • Identify and differentiate between independent and dependent variables

Three researchers review data while talking around a microscope.

Psychologists use descriptive, experimental, and correlational methods to conduct research. Descriptive, or qualitative, methods include the case study, naturalistic observation, surveys, archival research, longitudinal research, and cross-sectional research.

Experiments are conducted in order to determine cause-and-effect relationships. In ideal experimental design, the only difference between the experimental and control groups is whether participants are exposed to the experimental manipulation. Each group goes through all phases of the experiment, but each group will experience a different level of the independent variable: the experimental group is exposed to the experimental manipulation, and the control group is not exposed to the experimental manipulation. The researcher then measures the changes that are produced in the dependent variable in each group. Once data is collected from both groups, it is analyzed statistically to determine if there are meaningful differences between the groups.
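The group comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a full analysis: the participant labels, the seed, and the scores are all hypothetical, and the sketch assumes participants are split into the two groups by random assignment. It stops at the difference between group means; a real study would follow this with a statistical test.

```python
import random
from statistics import mean

def randomly_assign(participants, seed=None):
    """Shuffle participants and split them into two equal-sized groups."""
    rng = random.Random(seed)          # seeded only so the sketch is repeatable
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

def mean_difference(experimental_scores, control_scores):
    """Difference in the dependent variable between the two groups."""
    return mean(experimental_scores) - mean(control_scores)

# Hypothetical participant labels P01..P20.
participants = [f"P{i:02d}" for i in range(1, 21)]
experimental, control = randomly_assign(participants, seed=42)

# Hypothetical dependent-variable scores measured after the manipulation.
experimental_scores = [7, 9, 6, 8, 10, 7, 9, 8, 6, 10]
control_scores = [5, 6, 4, 7, 5, 6, 4, 5, 6, 7]

difference = mean_difference(experimental_scores, control_scores)
print(f"experimental group size: {len(experimental)}")
print(f"control group size:      {len(control)}")
print(f"difference in means:     {difference}")
```

Because chance alone decides who lands in which group, pre-existing differences between participants tend to even out across the groups, which is what lets the difference in means be attributed to the manipulation.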

When scientists passively observe and measure phenomena, it is called correlational research. Here, psychologists do not intervene and change behavior, as they do in experiments. In correlational research, they identify patterns of relationships, but usually cannot infer what causes what. Importantly, a single correlation describes the relationship between only two variables at a time.

Watch It: More on Research

If you enjoy learning through lectures and want an interesting and comprehensive summary of this section, then click on the YouTube link to watch a lecture given by MIT Professor John Gabrieli. Start at the 30:45 minute mark and watch through the end to hear examples of actual psychological studies and how they were analyzed. Listen for references to independent and dependent variables, experimenter bias, and double-blind studies. In the lecture, you’ll learn about breaking social norms, “WEIRD” research, why expectations matter, how a warm cup of coffee might make you nicer, why you should change your answer on a multiple choice test, and why praise for intelligence won’t make you any smarter.

You can view the transcript for “Lec 2 | MIT 9.00SC Introduction to Psychology, Spring 2011” here (opens in new window) .

Descriptive Research

There are many research methods available to psychologists in their efforts to understand, describe, and explain behavior and the cognitive and biological processes that underlie it. Some methods rely on observational techniques. Other approaches involve interactions between the researcher and the individuals who are being studied—ranging from a series of simple questions to extensive, in-depth interviews—to well-controlled experiments.

The three main categories of psychological research are descriptive, correlational, and experimental research. Research studies that do not test specific relationships between variables are called descriptive, or qualitative, studies. These studies are used to describe general or specific behaviors and attributes that are observed and measured. In the early stages of research it might be difficult to form a hypothesis, especially when there is not any existing literature in the area. In these situations designing an experiment would be premature, as the question of interest is not yet clearly defined as a hypothesis. Often a researcher will begin with a non-experimental approach, such as a descriptive study, to gather more information about the topic before designing an experiment or correlational study to address a specific hypothesis. Descriptive research is distinct from correlational research, in which psychologists formally test whether a relationship exists between two or more variables. Experimental research goes a step beyond descriptive and correlational research: it randomly assigns people to different conditions and uses hypothesis testing to make inferences about how these conditions affect behavior. It aims to determine whether one variable directly causes changes in another. Correlational and experimental research both typically use hypothesis testing, whereas descriptive research does not.

Each of these research methods has unique strengths and weaknesses, and each method may only be appropriate for certain types of research questions. For example, studies that rely primarily on observation produce incredible amounts of information, but the ability to apply this information to the larger population is somewhat limited because of small sample sizes. Survey research, on the other hand, allows researchers to easily collect data from relatively large samples. While this allows for results to be generalized to the larger population more easily, the information that can be collected on any given survey is somewhat limited and subject to problems associated with any type of self-reported data. Some researchers conduct archival research by using existing records. While this can be a fairly inexpensive way to collect data that can provide insight into a number of research questions, researchers using this approach have no control on how or what kind of data was collected.

Correlational research can find a relationship between two variables, but the only way a researcher can claim that the relationship between the variables is cause and effect is to perform an experiment. In experimental research, which will be discussed later in the text, there is a tremendous amount of control over variables of interest. While this is a powerful approach, experiments are often conducted in very artificial settings. This calls into question the validity of experimental findings with regard to how they would apply in real-world settings. In addition, many of the questions that psychologists would like to answer cannot be pursued through experimental research because of ethical concerns.

The three main types of descriptive studies are naturalistic observation, case studies, and surveys.

Naturalistic Observation

If you want to understand how behavior occurs, one of the best ways to gain information is to simply observe the behavior in its natural context. However, people might change their behavior in unexpected ways if they know they are being observed. How do researchers obtain accurate information when people tend to hide their natural behavior? As an example, imagine that your professor asks everyone in your class to raise their hand if they always wash their hands after using the restroom. Chances are that almost everyone in the classroom will raise their hand, but do you think hand washing after every trip to the restroom is really that universal?

This is very similar to the phenomenon mentioned earlier in this module: many individuals do not feel comfortable answering a question honestly. But if we are committed to finding out the facts about hand washing, we have other options available to us.

Suppose we send a classmate into the restroom to actually watch whether everyone washes their hands after using the restroom. Will our observer blend into the restroom environment by wearing a white lab coat, sitting with a clipboard, and staring at the sinks? We want our researcher to be inconspicuous—perhaps standing at one of the sinks pretending to put in contact lenses while secretly recording the relevant information. This type of observational study is called naturalistic observation : observing behavior in its natural setting. To better understand peer exclusion, Suzanne Fanger collaborated with colleagues at the University of Texas to observe the behavior of preschool children on a playground. How did the observers remain inconspicuous over the duration of the study? They equipped a few of the children with wireless microphones (which the children quickly forgot about) and observed while taking notes from a distance. Also, the children in that particular preschool (a “laboratory preschool”) were accustomed to having observers on the playground (Fanger, Frankel, & Hazen, 2012).

A photograph shows two police cars driving, one with its lights flashing.

It is critical that the observer be as unobtrusive and as inconspicuous as possible: when people know they are being watched, they are less likely to behave naturally. If you have any doubt about this, ask yourself how your driving behavior might differ in two situations: In the first situation, you are driving down a deserted highway during the middle of the day; in the second situation, you are being followed by a police car down the same deserted highway (Figure 9).

It should be pointed out that naturalistic observation is not limited to research involving humans. Indeed, some of the best-known examples of naturalistic observation involve researchers going into the field to observe various kinds of animals in their own environments. As with human studies, the researchers maintain their distance and avoid interfering with the animal subjects so as not to influence their natural behaviors. Scientists have used this technique to study social hierarchies and interactions among animals ranging from ground squirrels to gorillas. The information provided by these studies is invaluable in understanding how those animals organize socially and communicate with one another. The anthropologist Jane Goodall, for example, spent nearly five decades observing the behavior of chimpanzees in Africa (Figure 10). As an illustration of the types of concerns that a researcher might encounter in naturalistic observation, some scientists criticized Goodall for giving the chimps names instead of referring to them by numbers—using names was thought to undermine the emotional detachment required for the objectivity of the study (McKie, 2010).

(a) A photograph shows Jane Goodall speaking from a lectern. (b) A photograph shows a chimpanzee’s face.

The greatest benefit of naturalistic observation is the validity, or accuracy, of information collected unobtrusively in a natural setting. Having individuals behave as they normally would in a given situation means that we have a higher degree of ecological validity, or realism, than we might achieve with other research approaches. Therefore, our ability to generalize  the findings of the research to real-world situations is enhanced. If done correctly, we need not worry about people or animals modifying their behavior simply because they are being observed. Sometimes, people may assume that reality programs give us a glimpse into authentic human behavior. However, the principle of inconspicuous observation is violated as reality stars are followed by camera crews and are interviewed on camera for personal confessionals. Given that environment, we must doubt how natural and realistic their behaviors are.

The major downside of naturalistic observation is that it is often difficult to set up and control. In our restroom study, what if you stood in the restroom all day prepared to record people’s hand washing behavior and no one came in? Or, what if you have been closely observing a troop of gorillas for weeks only to find that they migrated to a new place while you were sleeping in your tent? The benefit of realistic data comes at a cost. As a researcher you have no control over when (or if) you have behavior to observe. In addition, this type of observational research often requires significant investments of time, money, and a good dose of luck.

Sometimes studies involve structured observation. In these cases, people are observed while engaging in set, specific tasks. An excellent example of structured observation is the Strange Situation, developed by Mary Ainsworth (you will read more about this in the module on lifespan development). The Strange Situation is a procedure used to evaluate attachment styles that exist between an infant and caregiver. In this scenario, caregivers bring their infants into a room filled with toys. The Strange Situation involves a number of phases, including a stranger coming into the room, the caregiver leaving the room, and the caregiver’s return to the room. The infant’s behavior is closely monitored at each phase, but it is the behavior of the infant upon being reunited with the caregiver that is most telling in terms of characterizing the infant’s attachment style with the caregiver.

Another potential problem in observational research is observer bias . Generally, people who act as observers are closely involved in the research project and may unconsciously skew their observations to fit their research goals or expectations. To protect against this type of bias, researchers should have clear criteria established for the types of behaviors recorded and how those behaviors should be classified. In addition, researchers often compare observations of the same event by multiple observers, in order to test inter-rater reliability : a measure of reliability that assesses the consistency of observations by different observers.
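One way to put a number on inter-rater reliability is to compare the codes two observers assigned to the same events. The Python sketch below is illustrative only: the observer codes are hypothetical data modeled on the restroom example, and Cohen's kappa is one common agreement statistic chosen here for illustration, not a method named in this text. It reports simple percent agreement alongside kappa, which corrects for the agreement two observers would reach by chance.

```python
from collections import Counter

def percent_agreement(rater_a, rater_b):
    """Share of observations on which two raters assigned the same code."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of events")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

def cohens_kappa(rater_a, rater_b):
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: probability that both raters independently pick the same code.
    expected = sum(counts_a[code] * counts_b[code] for code in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical code throughout
    return (observed - expected) / (1 - expected)

# Hypothetical codes: did each of ten people wash their hands?
observer_1 = ["wash", "wash", "skip", "wash", "skip", "wash", "wash", "skip", "wash", "wash"]
observer_2 = ["wash", "wash", "skip", "wash", "wash", "wash", "wash", "skip", "skip", "wash"]

agreement = percent_agreement(observer_1, observer_2)
kappa = cohens_kappa(observer_1, observer_2)
print(f"percent agreement: {agreement:.2f}")
print(f"Cohen's kappa:     {kappa:.2f}")
```

In this made-up data the observers agree on 8 of 10 events, but because both code "wash" most of the time, a good share of that agreement would be expected by chance, so kappa comes out noticeably lower than the raw percentage.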

Case Studies

In 2011, the New York Times published a feature story on Krista and Tatiana Hogan, Canadian twin girls. These particular twins are unique because Krista and Tatiana are conjoined twins, connected at the head. There is evidence that the two girls are connected in a part of the brain called the thalamus, which is a major sensory relay center. Most incoming sensory information is sent through the thalamus before reaching higher regions of the cerebral cortex for processing.

The implications of this potential connection mean that it might be possible for one twin to experience the sensations of the other twin. For instance, if Krista is watching a particularly funny television program, Tatiana might smile or laugh even if she is not watching the program. This particular possibility has piqued the interest of many neuroscientists who seek to understand how the brain uses sensory information.

These twins represent an enormous resource in the study of the brain, and since their condition is very rare, it is likely that as long as their family agrees, scientists will follow these girls very closely throughout their lives to gain as much information as possible (Dominus, 2011).

In observational research, scientists are conducting a clinical or case study when they focus on one person or just a few individuals. Indeed, some scientists spend their entire careers studying just 10–20 individuals. Why would they do this? Obviously, when they focus their attention on a very small number of people, they can gain a tremendous amount of insight into those cases. The richness of information that is collected in clinical or case studies is unmatched by any other single research method. This allows the researcher to have a very deep understanding of the individuals and the particular phenomenon being studied.

If clinical or case studies provide so much information, why are they not more frequent among researchers? As it turns out, the major benefit of this particular approach is also a weakness. As mentioned earlier, this approach is often used when studying individuals who are interesting to researchers because they have a rare characteristic. Therefore, the individuals who serve as the focus of case studies are not like most other people. If scientists ultimately want to explain all behavior, focusing attention on such a special group of people can make it difficult to generalize any observations to the larger population as a whole. Generalizing refers to the ability to apply the findings of a particular research project to larger segments of society. Again, case studies provide enormous amounts of information, but since the cases are so specific, the potential to apply what’s learned to the average person may be very limited.

Surveys

Often, psychologists develop surveys as a means of gathering data. Surveys are lists of questions to be answered by research participants, and can be delivered as paper-and-pencil questionnaires, administered electronically, or conducted verbally (Figure 11). Generally, the survey itself can be completed in a short time, and the ease of administering a survey makes it easy to collect data from a large number of people.

Surveys allow researchers to gather data from larger samples than may be afforded by other research methods. A sample is a subset of individuals selected from a population, which is the overall group of individuals that the researchers are interested in. Researchers study the sample and seek to generalize their findings to the population.

A sample online survey reads, “Dear visitor, your opinion is important to us. We would like to invite you to participate in a short survey to gather your opinions and feedback on your news consumption habits. The survey will take approximately 10-15 minutes. Simply click the “Yes” button below to launch the survey. Would you like to participate?” Two buttons are labeled “yes” and “no.”

Surveys have both strengths and weaknesses in comparison to case studies. By using surveys, we can collect information from a larger sample of people. A larger sample is better able to reflect the actual diversity of the population, thus allowing better generalizability. Therefore, if our sample is sufficiently large and diverse, we can assume that the data we collect from the survey can be generalized to the larger population with more certainty than the information collected through a case study. However, given the greater number of people involved, we are not able to collect the same depth of information on each person that would be collected in a case study.

Another potential weakness of surveys is something we touched on earlier in this chapter: people don’t always give accurate responses. They may lie, misremember, or answer questions in a way that they think makes them look good. For example, people may report drinking less alcohol than is actually the case.

Any number of research questions can be answered through the use of surveys. One real-world example is the research conducted by Jenkins, Ruppel, Kizer, Yehl, and Griffin (2012) about the backlash against the US Arab-American community following the terrorist attacks of September 11, 2001. Jenkins and colleagues wanted to determine to what extent these negative attitudes toward Arab-Americans still existed nearly a decade after the attacks occurred. In one study, 140 research participants filled out a survey with 10 questions, including questions asking directly about the participant’s overt prejudicial attitudes toward people of various ethnicities. The survey also asked indirect questions about how likely the participant would be to interact with a person of a given ethnicity in a variety of settings (such as, “How likely do you think it is that you would introduce yourself to a person of Arab-American descent?”). The results of the research suggested that participants were unwilling to report prejudicial attitudes toward any ethnic group. However, there were significant differences between their pattern of responses to questions about social interaction with Arab-Americans compared to other ethnic groups: they indicated less willingness for social interaction with Arab-Americans compared to the other ethnic groups. This suggested that the participants harbored subtle forms of prejudice against Arab-Americans, despite their assertions that this was not the case (Jenkins et al., 2012).

Think It Over

Archival Research

(a) A photograph shows stacks of paper files on shelves. (b) A photograph shows a computer.

In comparing archival research to other research methods, there are several important distinctions. For one, the researcher employing archival research never directly interacts with research participants. Therefore, the investment of time and money to collect data is considerably less with archival research. Additionally, researchers have no control over what information was originally collected. Therefore, research questions have to be tailored so they can be answered within the structure of the existing data sets. There is also no guarantee of consistency between the records from one source to another, which might make comparing and contrasting different data sets problematic.

Longitudinal and Cross-Sectional Research

Sometimes we want to see how people change over time, as in studies of human development and lifespan. When we test the same group of individuals repeatedly over an extended period of time, we are conducting longitudinal research. Longitudinal research  is a research design in which data-gathering is administered repeatedly over an extended period of time. For example, we may survey a group of individuals about their dietary habits at age 20, retest them a decade later at age 30, and then again at age 40.

Another approach is cross-sectional research. In cross-sectional research, a researcher compares multiple segments of the population at the same time. Using the dietary habits example above, the researcher might directly compare different groups of people by age. Instead of observing a group of people for 20 years to see how their dietary habits changed from decade to decade, the researcher would study a group of 20-year-old individuals and compare them to a group of 30-year-old individuals and a group of 40-year-old individuals. While cross-sectional research requires a shorter-term investment, it is also limited by differences that exist between the different generations (or cohorts) that have nothing to do with age per se, but rather reflect the social and cultural experiences that make different generations of individuals different from one another.

To illustrate this concept, consider the following survey findings. In recent years there has been significant growth in the popular support of same-sex marriage. Many studies on this topic break down survey participants into different age groups. In general, younger people are more supportive of same-sex marriage than are those who are older (Jones, 2013). Does this mean that as we age we become less open to the idea of same-sex marriage, or does this mean that older individuals have different perspectives because of the social climates in which they grew up? Longitudinal research is a powerful approach because the same individuals are involved in the research project over time, which means that the researchers need to be less concerned with differences among cohorts affecting the results of their study.

Often longitudinal studies are employed when researching various diseases in an effort to understand particular risk factors. Such studies often involve tens of thousands of individuals who are followed for several decades. Given the enormous number of people involved in these studies, researchers can feel confident that their findings can be generalized to the larger population. The Cancer Prevention Study-3 (CPS-3) is one of a series of longitudinal studies sponsored by the American Cancer Society aimed at determining predictive risk factors associated with cancer. When participants enter the study, they complete a survey about their lives and family histories, providing information on factors that might cause or prevent the development of cancer. Then every few years the participants receive additional surveys to complete. In the end, hundreds of thousands of participants will be tracked over 20 years to determine which of them develop cancer and which do not.

Clearly, this type of research is important and potentially very informative. For instance, earlier longitudinal studies sponsored by the American Cancer Society provided some of the first scientific demonstrations of the now well-established links between increased rates of cancer and smoking (American Cancer Society, n.d.) (Figure 13).

A photograph shows pack of cigarettes and cigarettes in an ashtray. The pack of cigarettes reads, “Surgeon general’s warning: smoking causes lung cancer, heart disease, emphysema, and may complicate pregnancy.”

As with any research strategy, longitudinal research is not without limitations. For one, these studies require an incredible time investment by the researcher and research participants. Given that some longitudinal studies take years, if not decades, to complete, the results will not be known for a considerable period of time. In addition to the time demands, these studies also require a substantial financial investment. Many researchers are unable to commit the resources necessary to see a longitudinal project through to the end.

Research participants must also be willing to continue their participation for an extended period of time, and this can be problematic. People move, get married and take new names, get ill, and eventually die. Even without significant life changes, some people may simply choose to discontinue their participation in the project. As a result, attrition rates (the reduction in the number of research participants due to dropouts) in longitudinal studies are quite high and increase over the course of a project. For this reason, researchers using this approach typically recruit many participants fully expecting that a substantial number will drop out before the end. As the study progresses, they continually check whether the sample still represents the larger population, and make adjustments as necessary.
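The arithmetic behind over-recruiting can be sketched as follows. This Python example is a rough planning illustration, not a method from this text: the starting sample, the dropout rate, and the number of survey waves are hypothetical, and it assumes the same dropout rate at every wave, which is a simplification since real attrition varies from wave to wave.

```python
import math

def expected_retention(initial_sample, dropout_rate_per_wave, waves):
    """Expected number of participants remaining after each follow-up wave."""
    remaining = initial_sample
    history = [remaining]
    for _ in range(waves):
        # Assumption: a constant fraction of the remaining sample drops out each wave.
        remaining = round(remaining * (1 - dropout_rate_per_wave))
        history.append(remaining)
    return history

def required_recruitment(target_final, dropout_rate_per_wave, waves):
    """Participants to recruit so that about target_final remain at the end."""
    retained_fraction = (1 - dropout_rate_per_wave) ** waves
    return math.ceil(target_final / retained_fraction)

# Hypothetical study: 1,000 recruits, 10% dropout per wave, 3 follow-up waves.
retention = expected_retention(1000, 0.10, 3)
needed = required_recruitment(500, 0.10, 3)
print(f"participants remaining per wave: {retention}")
print(f"recruits needed to end with 500: {needed}")
```

Even a modest per-wave dropout rate compounds: under these assumptions more than a quarter of the original sample is gone after three waves.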

Correlational Research

Did you know that as sales of ice cream increase, so does the overall rate of crime? Is it possible that indulging in your favorite flavor of ice cream could send you on a crime spree? Or, after committing a crime, do you think you might decide to treat yourself to a cone? There is no question that a relationship exists between ice cream and crime (e.g., Harper, 2013), but it would be pretty foolish to decide that one thing actually caused the other to occur.

It is much more likely that both ice cream sales and crime rates are related to the temperature outside. When the temperature is warm, there are lots of people out of their houses, interacting with each other, getting annoyed with one another, and sometimes committing crimes. Also, when it is warm outside, we are more likely to seek a cool treat like ice cream. How do we determine if there is indeed a relationship between two things? And when there is a relationship, how can we discern whether it is attributable to coincidence or causation?

Three scatterplots are shown. Scatterplot (a) is labeled “positive correlation” and shows scattered dots forming a rough line from the bottom left to the top right; the x-axis is labeled “weight” and the y-axis is labeled “height.” Scatterplot (b) is labeled “negative correlation” and shows scattered dots forming a rough line from the top left to the bottom right; the x-axis is labeled “tiredness” and the y-axis is labeled “hours of sleep.” Scatterplot (c) is labeled “no correlation” and shows scattered dots having no pattern; the x-axis is labeled “shoe size” and the y-axis is labeled “hours of sleep.”

Correlation Does Not Indicate Causation

Correlational research is useful because it allows us to discover the strength and direction of relationships that exist between two variables. However, correlation is limited because establishing the existence of a relationship tells us little about cause and effect. While variables are sometimes correlated because one does cause the other, it could also be that some other factor, a confounding variable, is actually causing the systematic movement in our variables of interest. In the ice cream/crime rate example mentioned earlier, temperature is a confounding variable that could account for the relationship between the two variables.
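The ice cream and crime example can be made concrete with a small calculation. The Python sketch below uses hypothetical numbers that were deliberately constructed so that temperature drives both variables; it is not real data. It computes the Pearson correlation between ice cream sales and crime, and then the partial correlation after holding temperature constant, which is one standard way to check whether a third variable accounts for a relationship.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / sqrt(var_x * var_y)

def partial_r(x, y, z):
    """Correlation between x and y after statistically controlling for z."""
    r_xy = pearson_r(x, y)
    r_xz = pearson_r(x, z)
    r_yz = pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical weekly observations, constructed for illustration only.
temperature = [10, 15, 20, 25, 30, 35]       # degrees Celsius
ice_cream_sales = [25, 29, 36, 46, 59, 75]   # cones sold (hundreds)
crime_reports = [25, 52, 64, 71, 83, 110]    # incidents reported

raw = pearson_r(ice_cream_sales, crime_reports)
controlled = partial_r(ice_cream_sales, crime_reports, temperature)
print(f"ice cream vs. crime:              r = {raw:.2f}")
print(f"same, controlling for temperature: r = {controlled:.2f}")
```

In this constructed data the raw correlation is strong, yet it vanishes once temperature is held constant, which is the signature of a confounding variable. Real data would rarely be this clean.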

Even when we cannot point to clear confounding variables, we should not assume that a correlation between two variables implies that one variable causes changes in another. This can be frustrating when a cause-and-effect relationship seems clear and intuitive. Think back to our discussion of the research done by the American Cancer Society and how their research projects were some of the first demonstrations of the link between smoking and cancer. It seems reasonable to assume that smoking causes cancer, but if we were limited to correlational research , we would be overstepping our bounds by making this assumption.

A photograph shows a bowl of cereal.

Unfortunately, people mistakenly make claims of causation as a function of correlations all the time. Such claims are especially common in advertisements and news stories. For example, recent research found that people who eat cereal on a regular basis achieve healthier weights than those who rarely eat cereal (Frantzen, Treviño, Echon, Garcia-Dominic, & DiMarco, 2013; Barton et al., 2005). Guess how the cereal companies report this finding. Does eating cereal really cause an individual to maintain a healthy weight, or are there other possible explanations, such as, someone at a healthy weight is more likely to regularly eat a healthy breakfast than someone who is obese or someone who avoids meals in an attempt to diet (Figure 15)? While correlational research is invaluable in identifying relationships among variables, a major limitation is the inability to establish causality. Psychologists want to make statements about cause and effect, but the only way to do that is to conduct an experiment to answer a research question. The next section describes how scientific experiments incorporate methods that eliminate, or control for, alternative explanations, which allow researchers to explore how changes in one variable cause changes in another variable.

Watch this clip from Freakonomics for an example of how correlation does not indicate causation.

You can view the transcript for “Correlation vs. Causality: Freakonomics Movie” here (opens in new window) .

Illusory Correlations

The temptation to make erroneous cause-and-effect statements based on correlational research is not the only way we tend to misinterpret data. We also tend to make the mistake of illusory correlations, especially with unsystematic observations. Illusory correlations , or false correlations, occur when people believe that relationships exist between two things when no such relationship exists. One well-known illusory correlation is the supposed effect that the moon’s phases have on human behavior. Many people passionately assert that human behavior is affected by the phase of the moon, and specifically, that people act strangely when the moon is full (Figure 16).

A photograph shows the moon.

There is no denying that the moon exerts a powerful influence on our planet. The ebb and flow of the ocean’s tides are tightly tied to the gravitational forces of the moon. Many people believe, therefore, that it is logical that we are affected by the moon as well. After all, our bodies are largely made up of water. A meta-analysis of nearly 40 studies consistently demonstrated, however, that the relationship between the moon and our behavior does not exist (Rotton & Kelly, 1985). While we may pay more attention to odd behavior during the full phase of the moon, the rates of odd behavior remain constant throughout the lunar cycle.

Why are we so apt to believe in illusory correlations like this? Often we read or hear about them and simply accept the information as valid. Or, we have a hunch about how something works and then look for evidence to support that hunch, ignoring evidence that would tell us our hunch is false; this is known as confirmation bias. Other times, we find illusory correlations based on the information that comes most easily to mind, even if that information is severely limited. And while we may feel confident that we can use these relationships to better understand and predict the world around us, illusory correlations can have significant drawbacks. For example, research suggests that illusory correlations, in which certain behaviors are inaccurately attributed to certain groups, are involved in the formation of prejudicial attitudes that can ultimately lead to discriminatory behavior (Fiedler, 2004).

We all have a tendency to make illusory correlations from time to time. Try to think of an illusory correlation that is held by you, a family member, or a close friend. How do you think this illusory correlation came about, and what can be done in the future to combat it?

Experiments

Causality: Conducting Experiments and Using the Data

Experimental Hypothesis

In order to conduct an experiment, a researcher must have a specific hypothesis to be tested. As you’ve learned, hypotheses can be formulated either through direct observation of the real world or after careful review of previous research. For example, if you think that children should not be allowed to watch violent programming on television because doing so would cause them to behave more violently, then you have basically formulated a hypothesis—namely, that watching violent television programs causes children to behave more violently. How might you have arrived at this particular hypothesis? You may have younger relatives who watch cartoons featuring characters using martial arts to save the world from evildoers, with an impressive array of punching, kicking, and defensive postures. You notice that after watching these programs for a while, your young relatives mimic the fighting behavior of the characters portrayed in the cartoon (Figure 17).

A photograph shows a child pointing a toy gun.

These sorts of personal observations are what often lead us to formulate a specific hypothesis, but we cannot use limited personal observations and anecdotal evidence to rigorously test our hypothesis. Instead, to find out if real-world data supports our hypothesis, we have to conduct an experiment.

Designing an Experiment

The most basic experimental design involves two groups: the experimental group and the control group. The two groups are designed to be the same except for one difference: experimental manipulation. The experimental group gets the experimental manipulation, that is, the treatment or variable being tested (in this case, violent TV images), and the control group does not. Since experimental manipulation is the only systematic difference between the experimental and control groups, any differences between the two that are too large to attribute to chance can be credited to the experimental manipulation rather than to some other factor.

In our example of how violent television programming might affect violent behavior in children, we have the experimental group view violent television programming for a specified time and then measure their violent behavior. It is important for the control group to be treated similarly to the experimental group, with the exception that the control group does not receive the experimental manipulation. Therefore, we have the control group watch nonviolent television programming for the same amount of time as the experimental group, and then we measure their violent behavior in the same way.

We also need to precisely define, or operationalize, what is considered violent and nonviolent. An operational definition is a description of how we will measure our variables, and it is important in allowing others to understand exactly how and what a researcher measures in a particular experiment. In operationalizing violent behavior, we might choose to count only physical acts like kicking or punching as instances of this behavior, or we also may choose to include angry verbal exchanges. Whatever we determine, it is important that we operationalize violent behavior in such a way that anyone who hears about our study for the first time knows exactly what we mean by violence. This aids people’s ability to interpret our data as well as their capacity to repeat our experiment should they choose to do so.

Once we have operationalized what is considered violent television programming and what is considered violent behavior from our experiment participants, we need to establish how we will run our experiment. In this case, we might have participants watch a 30-minute television program (either violent or nonviolent, depending on their group membership) before sending them out to a playground for an hour where their behavior is observed and the number and type of violent acts are recorded.

Ideally, the people who observe and record the children’s behavior are unaware of who was assigned to the experimental or control group, in order to control for experimenter bias. Experimenter bias refers to the possibility that a researcher’s expectations might skew the results of the study. Remember, conducting an experiment requires a lot of planning, and the people involved in the research project have a vested interest in supporting their hypotheses. If the observers knew which child was in which group, it might influence how much attention they paid to each child’s behavior as well as how they interpreted that behavior. By keeping the observers blind to which child is in which group, we protect against those biases. In a single-blind study, one of the groups involved (typically the participants) is unaware of which group each participant is in (experimental or control), while the researcher who developed the experiment knows which participants are in each group.

A photograph shows three glass bottles of pills labeled as placebos.

In a double-blind study, both the researchers and the participants are blind to group assignments. Why would a researcher want to run a study where no one knows who is in which group? Because by doing so, we can control for both experimenter and participant expectations. If you are familiar with the phrase placebo effect, you already have some idea as to why this is an important consideration. The placebo effect occurs when people’s expectations or beliefs influence or determine their experience in a given situation. In other words, simply expecting something to happen can actually make it happen.

The placebo effect is commonly described in terms of testing the effectiveness of a new medication. Imagine that you work in a pharmaceutical company, and you think you have a new drug that is effective in treating depression. To demonstrate that your medication is effective, you run an experiment with two groups: The experimental group receives the medication, and the control group does not. But you don’t want participants to know whether they received the drug or not.

Why is that? Imagine that you are a participant in this study, and you have just taken a pill that you think will improve your mood. Because you expect the pill to have an effect, you might feel better simply because you took the pill and not because of any drug actually contained in the pill—this is the placebo effect.

To make sure that any effects on mood are due to the drug and not due to expectations, the control group receives a placebo (in this case a sugar pill). Now everyone gets a pill, and once again neither the researcher nor the experimental participants know who got the drug and who got the sugar pill. Any differences in mood between the experimental and control groups can now be attributed to the drug itself rather than to experimenter bias or participant expectations (Figure 18).

Independent and Dependent Variables

In a research experiment, we strive to study whether changes in one thing cause changes in another. To achieve this, we must pay attention to two important variables, or things that can be changed, in any experimental study: the independent variable and the dependent variable. An independent variable is manipulated or controlled by the experimenter. In a well-designed experimental study, the independent variable is the only important difference between the experimental and control groups. In our example of how violent television programs affect children’s display of violent behavior, the independent variable is the type of program—violent or nonviolent—viewed by participants in the study (Figure 19). A dependent variable is what the researcher measures to see how much effect the independent variable had. In our example, the dependent variable is the number of violent acts displayed by the experimental participants.

A box labeled “independent variable: type of television programming viewed” contains a photograph of a person shooting an automatic weapon. An arrow labeled “influences change in the…” leads to a second box. The second box is labeled “dependent variable: violent behavior displayed” and has a photograph of a child pointing a toy gun.

We expect that the dependent variable will change as a function of the independent variable. In other words, the dependent variable depends on the independent variable. A good way to think about the relationship between the independent and dependent variables is with this question: What effect does the independent variable have on the dependent variable? Returning to our example, what effect does watching a half hour of violent television programming or nonviolent television programming have on the number of incidents of physical aggression displayed on the playground?

Selecting and Assigning Experimental Participants

Now that our study is designed, we need to obtain a sample of individuals to include in our experiment. Our study involves human participants, so we need to determine whom to include. Participants are the subjects of psychological research, and as the name implies, individuals who are involved in psychological research actively participate in the process. Often, psychological research projects rely on college students to serve as participants. In fact, the vast majority of research in psychology subfields has historically involved students as research participants (Sears, 1986; Arnett, 2008). But are college students truly representative of the general population? College students tend to be younger, more educated, more liberal, and less diverse than the general population. Although using students as test subjects is an accepted practice, relying on such a limited pool of research participants can be problematic because it is difficult to generalize findings to the larger population.

Our hypothetical experiment involves children, and we must first generate a sample of child participants. Samples are used because populations are usually too large to reasonably involve every member in our particular experiment (Figure 20). If possible, we should use a random sample (there are other types of samples, but for the purposes of this section, we will focus on random samples). A random sample is a subset of a larger population in which every member of the population has an equal chance of being selected. Random samples are preferred because if the sample is large enough we can be reasonably sure that the participating individuals are representative of the larger population. This means that the percentages of characteristics in the sample (sex, ethnicity, socioeconomic level, and any other characteristics that might affect the results) are close to those percentages in the larger population.

In our example, let’s say we decide our population of interest is fourth graders. But all fourth graders is a very large population, so we need to be more specific; instead we might say our population of interest is all fourth graders in a particular city. We should include students from various income brackets, family situations, races, ethnicities, religions, and geographic areas of town. With this more manageable population, we can work with the local schools in selecting a random sample of around 200 fourth graders who we want to participate in our experiment.

In summary, because we cannot test all of the fourth graders in a city, we want to find a group of about 200 that reflects the composition of that city. With a representative group, we can generalize our findings to the larger population without fear of our sample being biased in some way.
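As a minimal sketch of drawing such a random sample, the Python standard library's `random.sample` picks items without replacement so that every member of the list is equally likely to be chosen. The roster below is hypothetical: the population size of 5,000 and the `student_0001`-style IDs are invented stand-ins for a real list of a city's fourth graders.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical roster: one ID per fourth grader in the city (size is assumed).
population = [f"student_{i:04d}" for i in range(1, 5001)]

# Draw 200 students without replacement; each student has an equal
# chance of ending up in the sample.
sample = rng.sample(population, k=200)

print(len(sample), "students selected, e.g.", sample[:3])
```

In practice the hard part is obtaining a complete roster of the population; once that list exists, the draw itself is a single call.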

(a) A photograph shows an aerial view of crowds on a street. (b) A photograph shows a small group of children.

Now that we have a sample, the next step of the experimental process is to split the participants into experimental and control groups through random assignment. With random assignment, all participants have an equal chance of being assigned to either group. There is statistical software that will randomly assign each of the fourth graders in the sample to either the experimental or the control group.

Random assignment is critical for sound experimental design. With sufficiently large samples, random assignment makes it unlikely that there are systematic differences between the groups. So, for instance, it would be very unlikely that we would get one group composed entirely of males, a given ethnic identity, or a given religious ideology. This is important because if the groups were systematically different before the experiment began, we would not know the origin of any differences we find between the groups: Were the differences preexisting, or were they caused by manipulation of the independent variable? Random assignment allows us to assume that any differences observed between experimental and control groups result from the manipulation of the independent variable.
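Random assignment can be sketched in a few lines: shuffle the list of participants and split it in half. The function name `randomly_assign` and the `child_001`-style IDs are hypothetical, and dedicated statistical software may use a different procedure; this is only one simple way to give every participant the same chance of landing in either group.

```python
import random


def randomly_assign(participants, seed=None):
    """Shuffle participants and split them into two equal-sized groups."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the original order is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical sample of 200 fourth graders.
participants = [f"child_{i:03d}" for i in range(1, 201)]
experimental, control = randomly_assign(participants, seed=7)

print(len(experimental), "in the experimental group;", len(control), "in the control group")
```

Because the split depends only on the shuffle, no characteristic of a child (sex, school, family income) can influence which group that child joins.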

Issues to Consider

While experiments allow scientists to make cause-and-effect claims, they are not without problems. True experiments require the experimenter to manipulate an independent variable, and that can complicate many questions that psychologists might want to address. For instance, imagine that you want to know what effect sex (the independent variable) has on spatial memory (the dependent variable). Although you can certainly look for differences between males and females on a task that taps into spatial memory, you cannot directly control a person’s sex. We categorize this type of research approach as quasi-experimental and recognize that we cannot make cause-and-effect claims in these circumstances.

Experimenters are also limited by ethical constraints. For instance, you would not be able to conduct an experiment designed to determine if experiencing abuse as a child leads to lower levels of self-esteem among adults. To conduct such an experiment, you would need to randomly assign some experimental participants to a group that receives abuse, and that experiment would be unethical.

Introduction to Statistical Thinking

Psychologists use statistics to assist them in analyzing data, and also to give more precise measurements to describe whether something is statistically significant. Analyzing data using statistics enables researchers to find patterns, make claims, and share their results with others. In this section, you’ll learn about some of the tools that psychologists use in statistical analysis.

  • Define reliability and validity
  • Describe the importance of distributional thinking and the role of p-values in statistical inference
  • Describe the role of random sampling and random assignment in drawing cause-and-effect conclusions
  • Describe the basic structure of a psychological research article

Interpreting Experimental Findings

Once data is collected from both the experimental and the control groups, a statistical analysis is conducted to find out if there are meaningful differences between the two groups. A statistical analysis determines how likely it is that a difference as large as the one found would arise by chance alone if there were no real effect (in which case the difference would not be meaningful). In psychology, group differences are considered meaningful, or significant, if that probability is 5 percent or less. Stated another way, if there were truly no effect and we repeated this experiment 100 times, we would expect to see a difference this large in no more than about 5 of them.

The greatest strength of experiments is the ability to assert that any significant differences in the findings are caused by the independent variable. This occurs because random selection, random assignment, and a design that limits the effects of both experimenter bias and participant expectancy should create groups that are similar in composition and treatment. Therefore, any difference between the groups is attributable to the independent variable, and now we can finally make a causal statement. If we find that watching a violent television program results in more violent behavior than watching a nonviolent program, we can safely say that watching violent television programs causes an increase in the display of violent behavior.

Reporting Research

When psychologists complete a research project, they generally want to share their findings with other scientists. The American Psychological Association (APA) publishes a manual detailing how to write a paper for submission to scientific journals. Unlike an article that might be published in a magazine like Psychology Today, which targets a general audience with an interest in psychology, scientific journals generally publish peer-reviewed journal articles aimed at an audience of professionals and scholars who are actively involved in research themselves.

A peer-reviewed journal article is read by several other scientists (generally anonymously) with expertise in the subject matter. These peer reviewers provide feedback—to both the author and the journal editor—regarding the quality of the draft. Peer reviewers look for a strong rationale for the research being described, a clear description of how the research was conducted, and evidence that the research was conducted in an ethical manner. They also look for flaws in the study’s design, methods, and statistical analyses. They check that the conclusions drawn by the authors seem reasonable given the observations made during the research. Peer reviewers also comment on how valuable the research is in advancing the discipline’s knowledge. This helps prevent unnecessary duplication of research findings in the scientific literature and, to some extent, ensures that each research article provides new information. Ultimately, the journal editor will compile all of the peer reviewer feedback and determine whether the article will be published in its current state (a rare occurrence), published with revisions, or not accepted for publication.

Peer review provides some degree of quality control for psychological research. Poorly conceived or executed studies can be weeded out, and even well-designed research can be improved by the revisions suggested. Peer review also ensures that the research is described clearly enough to allow other scientists to replicate it, meaning they can repeat the experiment using different samples to determine reliability. Sometimes replications involve additional measures that expand on the original finding. In any case, each successful replication serves to provide more evidence to support the original research findings. Successful replications of published research make scientists more apt to adopt those findings, while repeated failures tend to cast doubt on the legitimacy of the original article and lead scientists to look elsewhere. For example, it would be a major advancement in the medical field if a published study indicated that taking a new drug helped individuals achieve a healthy weight without changing their diet. But if other scientists could not replicate the results, the original study’s claims would be questioned.

Dig Deeper: The Vaccine-Autism Myth and the Retraction of Published Studies

Some scientists have claimed that routine childhood vaccines cause some children to develop autism, and, in fact, several peer-reviewed publications published research making these claims. Since the initial reports, large-scale epidemiological research has suggested that vaccinations are not responsible for causing autism and that it is much safer to have your child vaccinated than not. Furthermore, several of the original studies making this claim have since been retracted.

A published piece of work can be rescinded when data is called into question because of falsification, fabrication, or serious research design problems. Once a work is rescinded, the scientific community is informed that there are serious problems with the original publication. Retractions can be initiated by the researcher who led the study, by research collaborators, by the institution that employed the researcher, or by the editorial board of the journal in which the article was originally published. In the vaccine-autism case, the retraction was made because of a significant conflict of interest in which the leading researcher had a financial interest in establishing a link between childhood vaccines and autism (Offit, 2008). Unfortunately, the initial studies received so much media attention that many parents around the world became hesitant to have their children vaccinated (Figure 21). For more information about how the vaccine/autism story unfolded, as well as the repercussions of this story, take a look at Paul Offit’s book, Autism’s False Prophets: Bad Science, Risky Medicine, and the Search for a Cure.

A photograph shows a child being given an oral vaccine.

Reliability and Validity

Dig Deeper: Everyday Connection: How Valid Is the SAT?

Standardized tests like the SAT are supposed to measure an individual’s aptitude for a college education, but how reliable and valid are such tests? Research conducted by the College Board suggests that scores on the SAT have high predictive validity for first-year college students’ GPA (Kobrin, Patterson, Shaw, Mattern, & Barbuti, 2008). In this context, predictive validity refers to the test’s ability to effectively predict the GPA of college freshmen. Given that many institutions of higher education require the SAT for admission, this high degree of predictive validity might be comforting.

However, the emphasis placed on SAT scores in college admissions has generated some controversy on a number of fronts. For one, some researchers assert that the SAT is a biased test that places minority students at a disadvantage and unfairly reduces the likelihood of being admitted into a college (Santelices & Wilson, 2010). Additionally, some research has suggested that the predictive validity of the SAT is grossly exaggerated in how well it is able to predict the GPA of first-year college students. In fact, it has been suggested that the SAT’s predictive validity may be overestimated by as much as 150% (Rothstein, 2004). Many institutions of higher education are beginning to consider de-emphasizing the significance of SAT scores in making admission decisions (Rimer, 2008).

In 2014, College Board president David Coleman expressed his awareness of these problems, recognizing that college success is more accurately predicted by high school grades than by SAT scores. To address these concerns, he has called for significant changes to the SAT exam (Lewin, 2014).

Statistical Significance

Coffee cup with heart shaped cream inside.

Does drinking coffee actually increase your life expectancy? A recent study (Freedman, Park, Abnet, Hollenbeck, & Sinha, 2012) found that men who drank at least six cups of coffee a day also had a 10% lower chance of dying (women’s chances were 15% lower) than those who drank none. Does this mean you should pick up or increase your own coffee habit? We will explore these results in more depth in the next section about drawing conclusions from statistics. Modern society has become awash in studies such as this; you can read about several such studies in the news every day.

Conducting such a study well and interpreting its results require an understanding of the basic ideas of statistics, the science of gaining insight from data. Key components of a statistical investigation are:

  • Planning the study: Start by asking a testable research question and deciding how to collect data. For example, how long was the study period of the coffee study? How many people were recruited for the study, how were they recruited, and from where? How old were they? What other variables were recorded about the individuals? Were changes made to the participants’ coffee habits during the course of the study?
  • Examining the data: What are appropriate ways to examine the data? What graphs are relevant, and what do they reveal? What descriptive statistics can be calculated to summarize relevant aspects of the data, and what do they reveal? What patterns do you see in the data? Are there any individual observations that deviate from the overall pattern, and what do they reveal? For example, in the coffee study, did the proportions differ when we compared the smokers to the non-smokers?
  • Inferring from the data: What are valid statistical methods for drawing inferences “beyond” the data you collected? In the coffee study, is the 10%–15% reduction in risk of death something that could have happened just by chance?
  • Drawing conclusions: Based on what you learned from your data, what conclusions can you draw? Who do you think these conclusions apply to? (Were the people in the coffee study older? Healthy? Living in cities?) Can you draw a cause-and-effect conclusion about your treatments? (Are scientists now saying that the coffee drinking is the cause of the decreased risk of death?)

Notice that the numerical analysis (“crunching numbers” on the computer) comprises only a small part of overall statistical investigation. In this section, you will see how we can answer some of these questions and what questions you should be asking about any statistical investigation you read about.

Distributional Thinking

When data are collected to address a particular question, an important first step is to think of meaningful ways to organize and examine the data. Let’s take a look at an example.

Example 1 : Researchers investigated whether cancer pamphlets are written at an appropriate level to be read and understood by cancer patients (Short, Moriarty, & Cooley, 1995). Tests of reading ability were given to 63 patients. In addition, readability level was determined for a sample of 30 pamphlets, based on characteristics such as the lengths of words and sentences in the pamphlet. The results, reported in terms of grade levels, are displayed in Figure 23.

Table showing patients' reading levels and pamphlets' reading levels.

  • Data vary. More specifically, values of a variable (such as reading level of a cancer patient or readability level of a cancer pamphlet) vary.
  • Analyzing the pattern of variation, called the distribution of the variable, often reveals insights.

Addressing the research question of whether the cancer pamphlets are written at appropriate levels for the cancer patients requires comparing the two distributions. A naïve comparison might focus only on the centers of the distributions. Both medians turn out to be ninth grade, but considering only medians ignores the variability and the overall distributions of these data. A more illuminating approach is to compare the entire distributions, for example with a graph, as in Figure 24.

Bar graph showing that the reading level of pamphlets is typically higher than the reading level of the patients.

Figure 24 makes clear that the two distributions are not well aligned at all. The most glaring discrepancy is that many patients (17/63, or 27%, to be precise) have a reading level below that of the most readable pamphlet. These patients will need help to understand the information provided in the cancer pamphlets. Notice that this conclusion follows from considering the distributions as a whole, not simply measures of center or variability, and that the graph contrasts those distributions more immediately than the frequency tables.
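The point that equal medians can hide a mismatch between distributions can be checked with a short sketch. The grade levels below are invented for illustration and are NOT the data from Short, Moriarty, and Cooley (1995); they were chosen only so that both lists share a median of 9 while their ranges differ.

```python
from statistics import median

# Invented grade levels for illustration only; not the study's data.
patient_levels = [2, 3, 4, 5, 6, 7, 8, 9, 9, 10, 11, 12, 13, 13, 13]
pamphlet_levels = [6, 7, 8, 8, 9, 9, 9, 10, 12, 14, 16]

# Comparing only the centers suggests the two groups match.
patient_median = median(patient_levels)
pamphlet_median = median(pamphlet_levels)

# Looking at the whole distribution: how many patients read below
# the level of the most readable pamphlet?
easiest_pamphlet = min(pamphlet_levels)
patients_below = sum(level < easiest_pamphlet for level in patient_levels)
share_below = patients_below / len(patient_levels)

print("medians:", patient_median, pamphlet_median)
print(f"{patients_below} of {len(patient_levels)} patients ({share_below:.0%}) "
      f"read below grade {easiest_pamphlet}, the easiest pamphlet")
```

The medians agree, but the count in the last line exposes the gap that the medians conceal, which is the same kind of discrepancy the graph makes visible.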

Finding Significance in Data

Even when we find patterns in data, often there is still uncertainty in various aspects of the data. For example, there may be potential for measurement errors (even your own body temperature can fluctuate by almost 1°F over the course of the day). Or we may only have a “snapshot” of observations from a more long-term process or only a small subset of individuals from the population of interest. In such cases, how can we determine whether patterns we see in our small set of data are convincing evidence of a systematic phenomenon in the larger process or population? Let’s take a look at another example.

Example 2: In a study reported in the November 2007 issue of Nature, researchers investigated whether pre-verbal infants take into account an individual’s actions toward others in evaluating that individual as appealing or aversive (Hamlin, Wynn, & Bloom, 2007). In one component of the study, 10-month-old infants were shown a “climber” character (a piece of wood with “googly” eyes glued onto it) that could not make it up a hill in two tries. Then the infants were shown two scenarios for the climber’s next try, one where the climber was pushed to the top of the hill by another character (“helper”), and one where the climber was pushed back down the hill by another character (“hinderer”). The infant was alternately shown these two scenarios several times. Then the infant was presented with two pieces of wood (representing the helper and the hinderer characters) and asked to pick one to play with.

The researchers found that of the 16 infants who made a clear choice, 14 chose to play with the helper toy. One possible explanation for this clear majority result is that the helping behavior of the one toy increases the infants’ likelihood of choosing that toy. But are there other possible explanations? What about the color of the toy? Well, prior to collecting the data, the researchers arranged it so that each color and shape (red square and blue circle) would be seen by the same number of infants. Or maybe the infants had right-handed tendencies and so picked whichever toy was closer to their right hand?

Well, prior to collecting the data, the researchers arranged it so half the infants saw the helper toy on the right and half on the left. Or, maybe the shapes of these wooden characters (square, triangle, circle) had an effect? Perhaps, but again, the researchers controlled for this by rotating which shape was the helper toy, the hinderer toy, and the climber. When designing experiments, it is important to control for as many of the variables that might affect the responses as possible. It is beginning to appear that the researchers accounted for all the other plausible explanations. But there is one more important consideration that cannot be controlled: if we did the study again with these 16 infants, they might not make the same choices. In other words, there is some randomness inherent in their selection process.

Maybe each infant had no genuine preference at all, and it was simply “random luck” that led to 14 infants picking the helper toy. Although this random component cannot be controlled, we can apply a probability model to investigate the pattern of results that would occur in the long run if random chance were the only factor.

If the infants were equally likely to pick between the two toys, then each infant had a 50% chance of picking the helper toy. It’s like each infant tossed a coin, and if it landed heads, the infant picked the helper toy. So if we tossed a coin 16 times, could it land heads 14 times? Sure, it’s possible, but it turns out to be very unlikely. Getting 14 (or more) heads in 16 tosses is about as likely as tossing a coin and getting 9 heads in a row. This probability is referred to as a p-value. The p-value is the probability of obtaining results at least as extreme as those actually observed if random chance alone were at work. Within psychology, the most common standard for p-values is “p < .05”. What this means is that, if chance were the only factor, results this extreme would occur less than 5% of the time. It does not mean that there is a 95% probability that the results reflect a meaningful pattern in human psychology. A result that meets this standard is said to have statistical significance.

So, in the study above, if we assume that each infant was choosing equally, then the probability that 14 or more out of 16 infants would choose the helper toy is found to be 0.0021. We have only two logical possibilities: either the infants have a genuine preference for the helper toy, or the infants have no preference (50/50) and an outcome that would occur only 2 times in 1,000 iterations happened in this study. Because this p-value of 0.0021 is quite small, we conclude that the study provides very strong evidence that these infants have a genuine preference for the helper toy.
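The 0.0021 figure can be reproduced directly from the coin-tossing model. The following is a minimal Python sketch, assuming each of the 16 infants independently picks the helper toy with probability 0.5. The function name `binomial_upper_tail` is our own label for illustration and does not come from the study.

```python
from math import comb

def binomial_upper_tail(n: int, k: int, p: float = 0.5) -> float:
    """Probability of k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# 14 or more of 16 infants choosing the helper toy under the 50/50 model
p_value = binomial_upper_tail(16, 14)
print(round(p_value, 4))  # 0.0021
```

For comparison, nine heads in a row has probability 0.5**9, or about 0.0020, which is why the two events are described as about equally likely.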

If we compare the p-value to some cut-off value, like 0.05, we see that the p-value is smaller. Because the p-value is smaller than that cut-off value, we reject the hypothesis that only random chance was at play here. In this case, these researchers would conclude that significantly more than half of the infants in the study chose the helper toy, giving strong evidence of a genuine preference for the toy with the helping behavior.

Drawing Conclusions from Statistics

Generalizability

One limitation to the study mentioned previously about the babies choosing the “helper” toy is that the conclusion only applies to the 16 infants in the study. We don’t know much about how those 16 infants were selected. Suppose we want to select a subset of individuals (a sample ) from a much larger group of individuals (the population ) in such a way that conclusions from the sample can be generalized to the larger population. This is the question faced by pollsters every day.

Example 3: The General Social Survey (GSS) is a survey on societal trends conducted every other year in the United States. Based on a sample of about 2,000 adult Americans, researchers make claims about what percentage of the U.S. population consider themselves to be “liberal,” what percentage consider themselves “happy,” what percentage feel “rushed” in their daily lives, and many other issues. The key to making these claims about the larger population of all American adults lies in how the sample is selected. The goal is to select a sample that is representative of the population, and a common way to achieve this goal is to select a random sample that gives every member of the population an equal chance of being selected for the sample. In its simplest form, random sampling involves numbering every member of the population and then using a computer to randomly select the subset to be surveyed. Most polls don’t operate exactly like this, but they do use probability-based sampling methods to select individuals from nationally representative panels.
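The simplest form of random sampling described above can be sketched in a few lines of Python. This is an illustration only: the population size of 10,000 and the function name `simple_random_sample` are hypothetical, and real surveys such as the GSS use more elaborate probability-based designs.

```python
import random

def simple_random_sample(population_size: int, sample_size: int, seed=None) -> list:
    """Number every member 0..population_size-1 and draw a sample without replacement."""
    rng = random.Random(seed)  # passing a seed makes the draw repeatable
    return rng.sample(range(population_size), sample_size)

# Hypothetical population of 10,000 numbered individuals; survey 2,000 of them
sample_ids = simple_random_sample(10_000, 2_000, seed=1)
print(len(sample_ids), len(set(sample_ids)))  # 2000 2000
```

Because the draw is made without replacement, no individual can be selected twice, and every individual has the same chance of ending up in the sample.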

In 2004, the GSS reported that 817 of 977 respondents (or 83.6%) indicated that they always or sometimes feel rushed. This is a clear majority, but we again need to consider variation due to random sampling. Fortunately, we can use the same probability model we did in the previous example to investigate the probable size of this error. (Note, we can use the coin-tossing model when the actual population size is much, much larger than the sample size, as then we can still consider the probability to be the same for every individual in the sample.) This probability model predicts that the sample result will be within 3 percentage points of the population value (roughly 1 over the square root of the sample size); this distance is the margin of error. A statistician would conclude, with 95% confidence, that between 80.6% and 86.6% of all adult Americans in 2004 would have responded that they sometimes or always feel rushed.
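The arithmetic behind these numbers is short enough to check by hand or in code. The sketch below uses the counts quoted above and the rough "1 over the square root of the sample size" rule; the variable names are our own.

```python
from math import sqrt

rushed, n = 817, 977          # 2004 GSS counts quoted above
proportion = rushed / n       # sample proportion, about 0.836
margin = 1 / sqrt(n)          # rough 95% margin of error, about 0.032

low, high = proportion - margin, proportion + margin
print(f"{proportion:.1%} +/- {margin:.1%} -> {low:.1%} to {high:.1%}")
```

Rounding the margin to 3 percentage points, as the text does, gives the interval from 80.6% to 86.6%. The unrounded margin gives a slightly wider interval.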

The key to the margin of error is that when we use a probability sampling method, we can make claims about how often (in the long run, with repeated random sampling) the sample result would fall within a certain distance from the unknown population value by chance (meaning by random sampling variation) alone. Conversely, non-random samples are often susceptible to bias, meaning the sampling method systematically over-represents some segments of the population and under-represents others. We also still need to consider other sources of bias, such as individuals not responding honestly. These sources of error are not measured by the margin of error.

Cause and Effect

In many research studies, the primary question of interest concerns differences between groups. Then the question becomes how were the groups formed (e.g., selecting people who already drink coffee vs. those who don’t). In some studies, the researchers actively form the groups themselves. But then we have a similar question—could any differences we observe in the groups be an artifact of that group-formation process? Or maybe the difference we observe in the groups is so large that we can discount a “fluke” in the group-formation process as a reasonable explanation for what we find?

Example 4: A psychology study investigated whether people tend to display more creativity when they are thinking about intrinsic (internal) or extrinsic (external) motivations (Ramsey & Schafer, 2002, based on a study by Amabile, 1985). The subjects were 47 people with extensive experience with creative writing. Subjects began by answering survey questions about either intrinsic motivations for writing (such as the pleasure of self-expression) or extrinsic motivations (such as public recognition). Then all subjects were instructed to write a haiku, and those poems were evaluated for creativity by a panel of judges. The researchers conjectured beforehand that subjects who were thinking about intrinsic motivations would display more creativity than subjects who were thinking about extrinsic motivations. The creativity scores from the 47 subjects in this study are displayed in Figure 26, where higher scores indicate more creativity.

Figure 26. Dot plots of creativity scores, which vary between 5 and 27, for the extrinsic and intrinsic motivation groups.

In this example, the key question is whether the type of motivation affects creativity scores. In particular, do subjects who were asked about intrinsic motivations tend to have higher creativity scores than subjects who were asked about extrinsic motivations?

Figure 26 reveals that both motivation groups saw considerable variability in creativity scores, and these scores have considerable overlap between the groups. In other words, it’s certainly not always the case that those with intrinsic motivations have higher creativity than those with extrinsic motivations, but there may still be a statistical tendency in this direction. (Psychologist Keith Stanovich (2013) refers to people’s difficulties with thinking about such probabilistic tendencies as “the Achilles heel of human cognition.”)

The mean creativity score is 19.88 for the intrinsic group, compared to 15.74 for the extrinsic group, which supports the researchers’ conjecture. Yet comparing only the means of the two groups fails to consider the variability of creativity scores in the groups. We can measure variability with statistics such as the standard deviation: 5.25 for the extrinsic group and 4.40 for the intrinsic group. The standard deviations tell us that most of the creativity scores are within about 5 points of the mean score in each group. We see that the mean score for the intrinsic group lies within one standard deviation of the mean score for the extrinsic group. So, although there is a tendency for the creativity scores to be higher in the intrinsic group, on average, the difference is not extremely large.

We again want to consider possible explanations for this difference. The study only involved individuals with extensive creative writing experience. Although this limits the population to which we can generalize, it does not explain why the mean creativity score was a bit larger for the intrinsic group than for the extrinsic group. Maybe women tend to receive higher creativity scores? Here is where we need to focus on how the individuals were assigned to the motivation groups. If only women were in the intrinsic motivation group and only men in the extrinsic group, then this would present a problem because we wouldn’t know if the intrinsic group did better because of the different type of motivation or because they were women. However, the researchers guarded against such a problem by randomly assigning the individuals to the motivation groups. Like flipping a coin, each individual was just as likely to be assigned to either type of motivation. Why is this helpful? Because this random assignment  tends to balance out all the variables related to creativity we can think of, and even those we don’t think of in advance, between the two groups. So we should have a similar male/female split between the two groups; we should have a similar age distribution between the two groups; we should have a similar distribution of educational background between the two groups; and so on. Random assignment should produce groups that are as similar as possible except for the type of motivation, which presumably eliminates all those other variables as possible explanations for the observed tendency for higher scores in the intrinsic group.

But does this always work? No, so by “luck of the draw” the groups may be a little different prior to answering the motivation survey. So then the question is, is it possible that an unlucky random assignment is responsible for the observed difference in creativity scores between the groups? In other words, suppose each individual’s poem was going to get the same creativity score no matter which group they were assigned to, that the type of motivation in no way impacted their score. Then how often would the random-assignment process alone lead to a difference in mean creativity scores as large (or larger) than 19.88 – 15.74 = 4.14 points?

We again want to apply a probability model to approximate a p-value, but this time the model will be a bit different. Think of writing everyone’s creativity scores on an index card, shuffling up the index cards, and then dealing out 23 to the extrinsic motivation group and 24 to the intrinsic motivation group, and finding the difference in the group means. We (better yet, the computer) can repeat this process over and over to see how often, when the scores don’t change, random assignment leads to a difference in means at least as large as 4.14. Figure 27 shows the results from 1,000 such hypothetical random assignments for these scores.

Figure 27. Distribution of the differences in group means from the 1,000 simulated random assignments, which follows a typical bell curve.

Only 2 of the 1,000 simulated random assignments produced a difference in group means of 4.14 or larger. In other words, the approximate p-value is 2/1000 = 0.002. This small p-value indicates that it would be very surprising for the random assignment process alone to produce such a large difference in group means. Therefore, as with Example 2, we have strong evidence that focusing on intrinsic motivations tends to increase creativity scores, as compared to thinking about extrinsic motivations.
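The card-shuffling procedure described above can be sketched in Python. The individual creativity scores from the study are not reproduced in this text, so the two short lists below are hypothetical placeholder values used only to show how the function is called. The function name `permutation_p_value` and its parameters are likewise our own.

```python
import random

def mean(values):
    return sum(values) / len(values)

def permutation_p_value(group_a, group_b, n_shuffles=1000, seed=None):
    """Approximate how often random assignment alone yields a difference in
    means (group_a minus group_b) at least as large as the observed one."""
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)                  # shuffle the index cards
        new_a = pooled[:len(group_a)]        # deal the first pile
        new_b = pooled[len(group_a):]        # the rest form the second pile
        # small tolerance guards against floating-point rounding
        if mean(new_a) - mean(new_b) >= observed - 1e-12:
            hits += 1
    return hits / n_shuffles

# Hypothetical placeholder scores, NOT the data from the study
intrinsic = [22, 19, 24, 17, 21, 20]
extrinsic = [15, 18, 12, 16, 14, 19]
print(permutation_p_value(intrinsic, extrinsic, seed=0))
```

With the real 47 scores in place of the placeholders, the returned fraction would play the role of the approximate p-value of 2/1000 reported above, although the exact value varies slightly from one set of shuffles to the next.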

Notice that the previous statement implies a cause-and-effect relationship between motivation and creativity score; is such a strong conclusion justified? Yes, because of the random assignment used in the study. That should have balanced out any other variables between the two groups, so now that the small p-value convinces us that the higher mean in the intrinsic group wasn’t just a coincidence, the only reasonable explanation left is the difference in the type of motivation. Can we generalize this conclusion to everyone? Not necessarily. We could cautiously generalize this conclusion to individuals with extensive experience in creative writing similar to the individuals in this study, but we would still want to know more about how these individuals were selected to participate.

Statistical thinking involves the careful design of a study to collect meaningful data to answer a focused research question, detailed analysis of patterns in the data, and drawing conclusions that go beyond the observed data. Random sampling is paramount to generalizing results from our sample to a larger population, and random assignment is key to drawing cause-and-effect conclusions. With both kinds of randomness, probability models help us assess how much random variation we can expect in our results, in order to determine whether our results could happen by chance alone and to estimate a margin of error.

So where does this leave us with regard to the coffee study mentioned previously? (Freedman, Park, Abnet, Hollenbeck, and Sinha, 2012, found that men who drank at least six cups of coffee a day had a 10% lower chance of dying (women 15% lower) than those who drank none.) We can answer many of the questions:

  • This was a 14-year study conducted by researchers at the National Cancer Institute.
  • The results were published in the June issue of the New England Journal of Medicine, a respected, peer-reviewed journal.
  • The study reviewed coffee habits of more than 402,000 people ages 50 to 71 from six states and two metropolitan areas. Those with cancer, heart disease, and stroke were excluded at the start of the study. Coffee consumption was assessed once at the start of the study.
  • About 52,000 people died during the course of the study.
  • People who drank between two and five cups of coffee daily showed a lower risk as well, but the amount of reduction increased for those drinking six or more cups.
  • The sample sizes were fairly large and so the p-values are quite small, even though percent reduction in risk was not extremely large (dropping from a 12% chance to about 10%–11%).
  • Whether coffee was caffeinated or decaffeinated did not appear to affect the results.
  • This was an observational study, so no cause-and-effect conclusions can be drawn between coffee drinking and increased longevity, contrary to the impression conveyed by many news headlines about this study. In particular, it’s possible that those with chronic diseases don’t tend to drink coffee.

This study needs to be reviewed in the larger context of similar studies and consistency of results across studies, with the constant caution that this was not a randomized experiment. Whereas a statistical analysis can still “adjust” for other potential confounding variables, we are not yet convinced that researchers have identified them all or completely isolated why this decrease in death risk is evident. Researchers can now take the findings of this study and develop more focused studies that address new questions.

Explore these outside resources to learn more about applied statistics:

  • Video about p-values: P-Value Extravaganza
  • Interactive web applets for teaching and learning statistics
  • Inter-university Consortium for Political and Social Research, where you can find and analyze data.
  • The Consortium for the Advancement of Undergraduate Statistics
  • Find a recent research article in your field and answer the following: What was the primary research question? How were individuals selected to participate in the study? Were summary results provided? How strong is the evidence presented in favor or against the research question? Was random assignment used? Summarize the main conclusions from the study, addressing the issues of statistical significance, statistical confidence, generalizability, and cause and effect. Do you agree with the conclusions drawn from this study, based on the study design and the results presented?
  • Is it reasonable to use a random sample of 1,000 individuals to draw conclusions about all U.S. adults? Explain why or why not.

How to Read Research

In this course and throughout your academic career, you’ll be reading journal articles (meaning they were published by experts in a peer-reviewed journal) and reports that explain psychological research. It’s important to understand the format of these articles so that you can read them strategically and understand the information presented. Scientific articles vary in content or structure, depending on the type of journal to which they will be submitted. Psychological articles and many papers in the social sciences follow the writing guidelines and format dictated by the American Psychological Association (APA). In general, the structure follows: abstract, introduction, methods, results, discussion, and references.

  • Abstract: the abstract is the concise summary of the article. It summarizes the most important features of the manuscript, providing the reader with a global first impression of the article. It is generally just one paragraph that explains the experiment as well as a short synopsis of the results.
  • Introduction: this section provides background information about the origin and purpose of performing the experiment or study. It reviews previous research and presents existing theories on the topic.
  • Method: this section covers the methodologies used to investigate the research question, including the identification of participants, procedures, and materials as well as a description of the actual procedure. It should be sufficiently detailed to allow for replication.
  • Results: the results section presents key findings of the research, including reference to indicators of statistical significance.
  • Discussion: this section provides an interpretation of the findings, states their significance for current research, and derives implications for theory and practice. Alternative interpretations for findings are also provided, particularly when it is not possible to draw conclusions about the directionality of the effects. In the discussion, authors also acknowledge the strengths and limitations/weaknesses of the study and offer concrete directions for future research.

Watch this 3-minute video for an explanation on how to read scholarly articles. Look closely at the example article shared just before the two minute mark.

https://digitalcommons.coastal.edu/kimbel-library-instructional-videos/9/

Practice identifying these key components in the following experiment: Food-Induced Emotional Resonance Improves Emotion Recognition.

In this chapter, you learned to

  • define and apply the scientific method to psychology
  • describe the strengths and weaknesses of descriptive, experimental, and correlational research
  • define the basic elements of a statistical investigation

Putting It Together: Psychological Research

Psychologists use the scientific method to examine human behavior and mental processes. Some of the methods you learned about include descriptive, experimental, and correlational research designs.

Watch the CrashCourse video to review the material you learned, then read through the following examples and see if you can come up with your own design for each type of study.

You can view the transcript for “Psychological Research: Crash Course Psychology #2” here (opens in new window).

Case Study: a detailed analysis of a particular person, group, business, event, etc. This approach is commonly used to learn more about rare examples with the goal of describing that particular thing.

  • Ted Bundy was one of America’s most notorious serial killers who murdered at least 30 women and was executed in 1989. Dr. Al Carlisle evaluated Bundy when he was first arrested and conducted a psychological analysis of Bundy’s development of his sexual fantasies merging into reality (Ramsland, 2012). Carlisle believes that there was a gradual evolution of three processes that guided his actions: fantasy, dissociation, and compartmentalization (Ramsland, 2012). Read Imagining Ted Bundy (http://goo.gl/rGqcUv) for more information on this case study.

Naturalistic Observation: a researcher unobtrusively collects information without the participant’s awareness.

  • Drain and Engelhardt (2013) observed the evoked and spontaneous communicative acts of six nonverbal children with autism. Each of the children attended a school for children with autism, and they were in different classes. They were observed for 30 minutes of each school day. By observing these children without their knowledge, the researchers were able to see true communicative acts without any external influences.

Survey: participants are asked to provide information or responses to questions on a survey or structured assessment.

  • Educational psychologists can ask students to report their grade point average and what, if anything, they eat for breakfast on an average day. A healthy breakfast has been associated with better academic performance (Digangi, 1999).
  • Anderson (1987) examined the relationship between uncomfortably hot temperatures and aggressive behavior in two studies of violent and nonviolent crime. Based on previous research by Anderson and Anderson (1984), it was predicted that violent crimes would be more prevalent during the hotter times of year and in hotter years in general. The study confirmed this prediction.

Longitudinal Study: researchers recruit a sample of participants and track them for an extended period of time.

  • In a study of a representative sample of 856 children, Eron and his colleagues (1972) found that a boy’s exposure to media violence at age eight was significantly related to his aggressive behavior ten years later, after he graduated from high school.

Cross-Sectional Study: researchers gather participants from different groups (commonly different ages) and look for differences between the groups.

  • In 1996, Russell surveyed people of varying age groups and found that people in their 20s tend to report being more lonely than people in their 70s.

Correlational Design: two different variables are measured to determine whether there is a relationship between them.

  • Thornhill et al. (2003) had people rate how physically attractive they found other people to be. They then had them separately smell t-shirts those people had worn (without knowing which clothes belonged to whom) and rate how good or bad their body odor was. They found that the more attractive someone was, the more pleasant their body odor was rated to be.
  • Clinical psychologists can test a new pharmaceutical treatment for depression by giving some patients the new pill and others an already-tested one to see which is the more effective treatment.

American Cancer Society. (n.d.). History of the cancer prevention studies. Retrieved from http://www.cancer.org/research/researchtopreventcancer/history-cancer-prevention-study

American Psychological Association. (2009). Publication Manual of the American Psychological Association (6th ed.). Washington, DC: Author.

American Psychological Association. (n.d.). Research with animals in psychology. Retrieved from https://www.apa.org/research/responsible/research-animals.pdf

Arnett, J. (2008). The neglected 95%: Why American psychology needs to become less American. American Psychologist, 63(7), 602–614.

Barton, B. A., Eldridge, A. L., Thompson, D., Affenito, S. G., Striegel-Moore, R. H., Franko, D. L., . . . Crockett, S. J. (2005). The relationship of breakfast and cereal consumption to nutrient intake and body mass index: The national heart, lung, and blood institute growth and health study. Journal of the American Dietetic Association, 105(9), 1383–1389. Retrieved from http://dx.doi.org/10.1016/j.jada.2005.06.003

Chwalisz, K., Diener, E., & Gallagher, D. (1988). Autonomic arousal feedback and emotional experience: Evidence from the spinal cord injured. Journal of Personality and Social Psychology, 54, 820–828.

Dominus, S. (2011, May 25). Could conjoined twins share a mind? New York Times Sunday Magazine. Retrieved from http://www.nytimes.com/2011/05/29/magazine/could-conjoined-twins-share-a-mind.html?_r=5&hp&

Fanger, S. M., Frankel, L. A., & Hazen, N. (2012). Peer exclusion in preschool children’s play: Naturalistic observations in a playground setting. Merrill-Palmer Quarterly, 58, 224–254.

Fiedler, K. (2004). Illusory correlation. In R. F. Pohl (Ed.), Cognitive illusions: A handbook on fallacies and biases in thinking, judgment and memory (pp. 97–114). New York, NY: Psychology Press.

Frantzen, L. B., Treviño, R. P., Echon, R. M., Garcia-Dominic, O., & DiMarco, N. (2013). Association between frequency of ready-to-eat cereal consumption, nutrient intakes, and body mass index in fourth- to sixth-grade low-income minority children. Journal of the Academy of Nutrition and Dietetics, 113(4), 511–519.

Harper, J. (2013, July 5). Ice cream and crime: Where cold cuisine and hot disputes intersect. The Times-Picaune. Retrieved from http://www.nola.com/crime/index.ssf/2013/07/ice_cream_and_crime_where_hot.html

Jenkins, W. J., Ruppel, S. E., Kizer, J. B., Yehl, J. L., & Griffin, J. L. (2012). An examination of post 9-11 attitudes towards Arab Americans. North American Journal of Psychology, 14, 77–84.

Jones, J. M. (2013, May 13). Same-sex marriage support solidifies above 50% in U.S. Gallup Politics. Retrieved from http://www.gallup.com/poll/162398/sex-marriage-support-solidifies-above.aspx

Kobrin, J. L., Patterson, B. F., Shaw, E. J., Mattern, K. D., & Barbuti, S. M. (2008). Validity of the SAT for predicting first-year college grade point average (Research Report No. 2008-5). Retrieved from https://research.collegeboard.org/sites/default/files/publications/2012/7/researchreport-2008-5-validity-sat-predicting-first-year-college-grade-point-average.pdf

Lewin, T. (2014, March 5). A new SAT aims to realign with schoolwork. New York Times. Retreived from http://www.nytimes.com/2014/03/06/education/major-changes-in-sat-announced-by-college-board.html.

Lowry, M., Dean, K., & Manders, K. (2010). The link between sleep quantity and academic performance for the college student. Sentience: The University of Minnesota Undergraduate Journal of Psychology, 3(Spring), 16–19. Retrieved from http://www.psych.umn.edu/sentience/files/SENTIENCE_Vol3.pdf

McKie, R. (2010, June 26). Chimps with everything: Jane Goodall’s 50 years in the jungle. The Guardian. Retrieved from http://www.theguardian.com/science/2010/jun/27/jane-goodall-chimps-africa-interview

Offit, P. (2008). Autism’s false prophets: Bad science, risky medicine, and the search for a cure. New York: Columbia University Press.

Perkins, H. W., Haines, M. P., & Rice, R. (2005). Misperceiving the college drinking norm and related problems: A nationwide study of exposure to prevention information, perceived norms and student alcohol misuse. J. Stud. Alcohol, 66(4), 470–478.

Rimer, S. (2008, September 21). College panel calls for less focus on SATs. The New York Times. Retrieved from http://www.nytimes.com/2008/09/22/education/22admissions.html?_r=0

Rothstein, J. M. (2004). College performance predictions and the SAT. Journal of Econometrics, 121, 297–317.

Rotton, J., & Kelly, I. W. (1985). Much ado about the full moon: A meta-analysis of lunar-lunacy research. Psychological Bulletin, 97(2), 286–306. doi:10.1037/0033-2909.97.2.286

Santelices, M. V., & Wilson, M. (2010). Unfair treatment? The case of Freedle, the SAT, and the standardization approach to differential item functioning. Harvard Education Review, 80, 106–134.

Sears, D. O. (1986). College sophomores in the laboratory: Influences of a narrow data base on social psychology’s view of human nature. Journal of Personality and Social Psychology, 51, 515–530.

Tuskegee University. (n.d.). About the USPHS Syphilis Study. Retrieved from http://www.tuskegee.edu/about_us/centers_of_excellence/bioethics_center/about_the_usphs_syphilis_study.aspx.

CC licensed content, Original

  • Psychological Research Methods. Provided by : Karenna Malavanti. License : CC BY-SA: Attribution ShareAlike

CC licensed content, Shared previously

  • Psychological Research. Provided by : OpenStax College. License : CC BY: Attribution . License Terms : Download for free at https://openstax.org/books/psychology-2e/pages/1-introduction. Located at : https://openstax.org/books/psychology-2e/pages/2-introduction .
  • Why It Matters: Psychological Research. Provided by : Lumen Learning. License : CC BY: Attribution   Located at: https://pressbooks.online.ucf.edu/lumenpsychology/chapter/introduction-15/
  • Introduction to The Scientific Method. Provided by : Lumen Learning. License : CC BY: Attribution   Located at:   https://pressbooks.online.ucf.edu/lumenpsychology/chapter/outcome-the-scientific-method/
  • Research picture. Authored by : Mediterranean Center of Medical Sciences. Provided by : Flickr. License : CC BY: Attribution   Located at : https://www.flickr.com/photos/mcmscience/17664002728 .
  • The Scientific Process. Provided by : Lumen Learning. License : CC BY-SA: Attribution ShareAlike   Located at:  https://pressbooks.online.ucf.edu/lumenpsychology/chapter/reading-the-scientific-process/
  • Ethics in Research. Provided by : Lumen Learning. License : CC BY: Attribution   Located at:  https://pressbooks.online.ucf.edu/lumenpsychology/chapter/ethics/
  • Ethics. Authored by : OpenStax College. Located at : https://openstax.org/books/psychology-2e/pages/2-4-ethics . License : CC BY: Attribution . License Terms : Download for free at https://openstax.org/books/psychology-2e/pages/1-introduction .
  • Introduction to Approaches to Research. Provided by : Lumen Learning. License : CC BY-NC-SA: Attribution NonCommercial ShareAlike   Located at:   https://pressbooks.online.ucf.edu/lumenpsychology/chapter/outcome-approaches-to-research/
  • Lec 2 | MIT 9.00SC Introduction to Psychology, Spring 2011. Authored by : John Gabrieli. Provided by : MIT OpenCourseWare. License : CC BY-NC-SA: Attribution-NonCommercial-ShareAlike Located at : https://www.youtube.com/watch?v=syXplPKQb_o .
  • Paragraph on correlation. Authored by : Christie Napa Scollon. Provided by : Singapore Management University. License : CC BY-NC-SA: Attribution-NonCommercial-ShareAlike Located at : http://nobaproject.com/modules/research-designs?r=MTc0ODYsMjMzNjQ%3D . Project : The Noba Project.
  • Descriptive Research. Provided by : Lumen Learning. License : CC BY-SA: Attribution ShareAlike   Located at: https://pressbooks.online.ucf.edu/lumenpsychology/chapter/reading-clinical-or-case-studies/
  • Approaches to Research. Authored by : OpenStax College.  License : CC BY: Attribution . License Terms : Download for free at https://openstax.org/books/psychology-2e/pages/1-introduction. Located at : https://openstax.org/books/psychology-2e/pages/2-2-approaches-to-research
  • Analyzing Findings. Authored by : OpenStax College. Located at : https://openstax.org/books/psychology-2e/pages/2-3-analyzing-findings . License : CC BY: Attribution . License Terms : Download for free at https://openstax.org/books/psychology-2e/pages/1-introduction.
  • Experiments. Provided by : Lumen Learning. License : CC BY: Attribution   Located at:  https://pressbooks.online.ucf.edu/lumenpsychology/chapter/reading-conducting-experiments/
  • Research Review. Authored by : Jessica Traylor for Lumen Learning. License : CC BY: Attribution Located at:  https://pressbooks.online.ucf.edu/lumenpsychology/chapter/reading-conducting-experiments/
  • Introduction to Statistics. Provided by : Lumen Learning. License : CC BY: Attribution   Located at:  https://pressbooks.online.ucf.edu/lumenpsychology/chapter/outcome-statistical-thinking/
  • histogram. Authored by : Fisher’s Iris flower data set. Provided by : Wikipedia. License : CC BY-SA: Attribution-ShareAlike   Located at : https://en.wikipedia.org/wiki/Wikipedia:Meetup/DC/Statistics_Edit-a-thon#/media/File:Fisher_iris_versicolor_sepalwidth.svg .
  • Statistical Thinking. Authored by : Beth Chance and Allan Rossman . Provided by : California Polytechnic State University, San Luis Obispo. License : CC BY-NC-SA: Attribution-NonCommercial-ShareAlike . License Terms : http://nobaproject.com/license-agreement   Located at : http://nobaproject.com/modules/statistical-thinking . Project : The Noba Project.
  • Drawing Conclusions from Statistics. Authored by: Pat Carroll and Lumen Learning. Provided by : Lumen Learning. License : CC BY: Attribution   Located at: https://pressbooks.online.ucf.edu/lumenpsychology/chapter/reading-drawing-conclusions-from-statistics/
  • Statistical Thinking. Authored by : Beth Chance and Allan Rossman, California Polytechnic State University, San Luis Obispo. Provided by : Noba. License: CC BY-NC-SA: Attribution-NonCommercial-ShareAlike Located at : http://nobaproject.com/modules/statistical-thinking .
  • The Replication Crisis. Authored by : Colin Thomas William. Provided by : Ivy Tech Community College. License: CC BY: Attribution
  • How to Read Research. Provided by : Lumen Learning. License : CC BY: Attribution   Located at:  https://pressbooks.online.ucf.edu/lumenpsychology/chapter/how-to-read-research/
  • What is a Scholarly Article? Kimbel Library First Year Experience Instructional Videos. 9. Authored by:  Joshua Vossler, John Watts, and Tim Hodge.  Provided by : Coastal Carolina University  License :  CC BY NC ND:  Attribution-NonCommercial-NoDerivatives Located at :  https://digitalcommons.coastal.edu/kimbel-library-instructional-videos/9/
  • Putting It Together: Psychological Research. Provided by : Lumen Learning. License : CC BY: Attribution   Located at:  https://pressbooks.online.ucf.edu/lumenpsychology/chapter/putting-it-together-psychological-research/
  • Research. Provided by : Lumen Learning. License : CC BY: Attribution   Located at:

All rights reserved content

  • Understanding Driver Distraction. Provided by : American Psychological Association. License : Other. License Terms: Standard YouTube License Located at : https://www.youtube.com/watch?v=XToWVxS_9lA&list=PLxf85IzktYWJ9MrXwt5GGX3W-16XgrwPW&index=9 .
  • Correlation vs. Causality: Freakonomics Movie. License : Other. License Terms : Standard YouTube License Located at : https://www.youtube.com/watch?v=lbODqslc4Tg.
  • Psychological Research – Crash Course Psychology #2. Authored by : Hank Green. Provided by : Crash Course. License : Other. License Terms : Standard YouTube License Located at : https://www.youtube.com/watch?v=hFV71QPvX2I .

Public domain content

  • Researchers review documents. Authored by : National Cancer Institute. Provided by : Wikimedia. Located at : https://commons.wikimedia.org/wiki/File:Researchers_review_documents.jpg . License : Public Domain: No Known Copyright

grounded in objective, tangible evidence that can be observed time and time again, regardless of who is observing

well-developed set of ideas that propose an explanation for observed phenomena

(plural: hypotheses) tentative and testable statement about the relationship between two or more variables

an experiment must be replicable by another researcher

implies that a theory should enable us to make predictions about future events

able to be disproven by experimental results

implies that all data must be considered when evaluating a hypothesis

committee of administrators, scientists, and community members that reviews proposals for research involving human participants

process of informing a research participant about what to expect during an experiment, any risks involved, and the implications of the research, and then obtaining the person’s consent to participate

purposely misleading experiment participants in order to maintain the integrity of the experiment

when an experiment involved deception, participants are told complete and truthful information about the experiment at its conclusion

committee of administrators, scientists, veterinarians, and community members that reviews proposals for research involving non-human animals

research studies that do not test specific relationships between variables

research investigating the relationship between two or more variables

research method that uses hypothesis testing to make inferences about how one variable impacts and causes another

observation of behavior in its natural setting

inferring that the results for a sample apply to the larger population

when observations may be skewed to align with observer expectations

measure of agreement among observers on how they record and classify a particular event

observational research study focusing on one or a few people

list of questions to be answered by research participants—given as paper-and-pencil questionnaires, administered electronically, or conducted verbally—allowing researchers to collect data from a large number of people

subset of individuals selected from the larger population

overall group of individuals that the researchers are interested in

method of research using past records or data sets to answer various research questions, or to search for interesting patterns or relationships

studies in which the same group of individuals is surveyed or measured repeatedly over an extended period of time

compares multiple segments of a population at a single time

reduction in number of research participants as some drop out of the study over time

relationship between two or more variables; when two variables are correlated, one variable changes as the other does

number from -1 to +1, indicating the strength and direction of the relationship between variables, and usually represented by r

two variables change in the same direction, both becoming either larger or smaller

two variables change in different directions, with one becoming larger as the other becomes smaller; a negative correlation is not the same thing as no correlation

changes in one variable cause the changes in the other variable; can be determined only through an experimental research design

unanticipated outside factor that affects both variables of interest, often giving the false impression that changes in one variable causes changes in the other variable, when, in actuality, the outside factor causes changes in both variables

seeing relationships between two things when in reality no such relationship exists

tendency to ignore evidence that disproves ideas or beliefs

group designed to answer the research question; experimental manipulation is the only difference between the experimental and control groups, so any differences between the two are due to experimental manipulation rather than chance

serves as a basis for comparison and controls for chance factors that might influence the results of the study—by holding such factors constant across groups so that the experimental manipulation is the only difference between groups

description of what actions and operations will be used to measure the dependent variables and manipulate the independent variables

researcher expectations skew the results of the study

experiment in which the researcher knows which participants are in the experimental group and which are in the control group

experiment in which both the researchers and the participants are blind to group assignments

people's expectations or beliefs influencing or determining their experience in a given situation

variable that is influenced or controlled by the experimenter; in a sound experimental study, the independent variable is the only important difference between the experimental and control group

variable that the researcher measures to see how much effect the independent variable had

subjects of psychological research

subset of a larger population in which every member of the population has an equal chance of being selected

method of experimental group assignment in which all participants have an equal chance of being assigned to either group

consistency and reproducibility of a given result

accuracy of a given result in measuring what it is designed to measure

determines how likely any difference between experimental groups is due to chance

statistical probability that represents the likelihood that experimental results happened by chance

Psychological Science is the scientific study of mind, brain, and behavior. We will explore what it means to be human in this class. It has never been more important for us to understand what makes people tick, how to evaluate information critically, and the importance of history. Psychology can also help you in your future career; indeed, there are very few jobs out there with no human interaction!

Because psychology is a science, we analyze human behavior through the scientific method. There are several ways to investigate human phenomena, such as observation, experiments, and more. We will discuss the basics and the pros and cons of each! We will also dig deeper into the important ethical guidelines that psychologists must follow in order to do research. Lastly, we will briefly introduce ourselves to statistics, the language of scientific research. While reading the content in these chapters, try to find examples of material that can fit with the themes of the course.

To get us started:

  • The study of the mind moved away from introspection to reaction-time studies as we learned more about empiricism
  • Psychologists work in careers outside of the typical "clinician" role. We advise in human factors, education, policy, and more!
  • While completing an observational study, psychologists will work to aggregate common themes to explain the behavior of the group (sample) as a whole. In doing so, we still allow for normal variation from the group!
  • The IRB and IACUC are important in ensuring ethics are maintained for both human and animal subjects

Psychological Science: Understanding Human Behavior Copyright © by Karenna Malavanti is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.

  • Glob Health Sci Pract
  • v.6(2); 2018 Jun 27

Observe Before You Leap: Why Observation Provides Critical Insights for Formative Research and Intervention Design That You'll Never Get From Focus Groups, Interviews, or KAP Surveys

Steven A. Harvey

a Social and Behavioral Interventions Program, Department of International Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA.


Four case studies show how observation can uncover issues critical to making a health intervention succeed or, sometimes, reveal reasons why it is likely to fail. Observation can be particularly valuable for interventions that depend on mechanical or clinical skills; service delivery processes; effects of the built environment; and habitual tasks that practitioners find difficult to articulate.

Formative research is essential to designing both study instruments and interventions in global health. While formative research may employ many qualitative methods, focus group discussions and in-depth interviews are the most common. Observation is less common but can generate insights unlikely to emerge from any other method. This article presents 4 case studies in which observation revealed critical insights: corralling domestic poultry to reduce childhood diarrhea, promoting insecticide-treated bed nets (ITNs) to prevent malaria, evaluating skilled birth attendant competency to manage life-threatening obstetric and neonatal complications, and assessing community health worker (CHW) ability to use malaria rapid diagnostic tests (RDTs). Observation of Zambian CHWs to design malaria RDT training materials revealed a need for training on how to take finger-stick blood samples, a procedure second nature to many health workers but one that few CHWs had ever performed. In Lima, Peru, study participants reported keeping their birds corralled “all the time,” but observers frequently found them loose, a difference potentially explained by an alternative interpretation of the phrase “all the time” to mean “all the time (except at some specific seemingly obvious times).” In the Peruvian Amazon, observation revealed a potential limitation of bed net efficacy due to the built environment: In houses constructed on stilts, many people sleep directly on the floor, allowing mosquitoes to bite from below through gaps in the floorboards. Observation forms and checklists from each case study are included as supplemental files; these may serve as models for designing new observation guides. 
The case studies illustrate the value of observation to clearly understanding clinical practices and skills, details about how people carry out certain tasks, routine behaviors people would most likely not think to describe in an interview, and environmental barriers that must be overcome if an intervention is to succeed. Observation provides a way to triangulate for social desirability bias and to measure details that interview or focus group participants are unlikely to recognize, remember, or be able to describe with precision.

INTRODUCTION

Let's play a quick game of word association: If I say “formative research,” what's the first word or phrase that comes to mind? Some of you, thinking of purpose, might say that formative research is what you do before designing a behavior change campaign. Others, thinking of methods, might say “focus groups.” Both would be wrong. Well, at least partially wrong.

Formative research is important to the design of behavior change campaigns, but it serves many other purposes as well. It is essential to developing research instruments and global health interventions of many kinds. 1 – 4 It can provide the basis for assessing clinical practice, determining how to measure intervention outcomes, planning quality improvement initiatives, and understanding many other aspects of global health programming. 5 – 14 As medical anthropologist Margaret Bentley explains 15 :

The purpose [of formative research] is to provide input into the design of a research study or intervention, including the identification of target populations and appropriate recruitment, retention or consent strategies, development of assessment or evaluation measures, and refinement of intervention components. Formative research allows community participation in the design of research and program protocols, which leads to greater community acceptance.

So formative research is about much more than just behavior change interventions.

Now, what about methods? If you want to do formative research, how should you go about it? Formative research can incorporate many methods, both qualitative and quantitative. Focus groups tend to be the most common, perhaps because they are most familiar. Interviews and knowledge, attitude, and practice (KAP) surveys are also popular. However, as you've probably gathered by now, I'm going to argue that those methods are often insufficient. If you're doing formative research, you should also consider observation.


Researchers seem more hesitant about observation than other methods, perhaps because they don't know how to do it, consider it too labor-intensive or costly, feel uncomfortable with the idea of watching other people, or worry about reactivity—the phenomenon where those being observed change their behavior due to the observer's presence. 16 – 18 But observation can generate insights you won't get using any other method. And those insights can often prove critical.

In this article, I present 4 case studies on different global health topics, from corralling domestic poultry to measuring the competency of skilled birth attendants (SBAs). 19 – 21 These examples illustrate some of the scenarios in which observation—both structured and unstructured—can be useful, and they highlight the types of insights it can provide. In each case study, observation yielded critical information that would have been difficult or impossible to obtain any other way. For each case study, I provide a brief description of the research and the context from which it was drawn, then focus more extensively on the observational methods used and the unique insights they generated. Complete descriptions of the original research can be found elsewhere. 22 – 28 I've provided the observation instruments used for each case study as supplemental files.

Ethics Review

The research cited in case studies #1 and #2 was reviewed and approved by the Institutional Review Board of the Johns Hopkins Bloomberg School of Public Health in Baltimore, MD, USA, and by the Ethics Committee of the Asociación Benéfica PRISMA in Lima, Peru. The research cited in case study #3 was reviewed for compliance with the ethics guidelines of the Quality Assurance Project funded by the United States Agency for International Development and approved by Ministry of Health ethics committees or their equivalent in each study country. The research cited in case study #4 received ethics approval from the World Health Organization Special Programme for Research and Training in Tropical Diseases (WHO/TDR) and by the Tropical Disease Research Centre Ethics Committee – Ndola, Zambia.

CASE STUDY #1: CORRALLING DOMESTIC POULTRY TO REDUCE CHILDHOOD DIARRHEA IN LIMA, PERU

Campylobacter jejuni is a common bacterial contributor to diarrheal disease worldwide. 23 , 29 – 31 The bacterium is found almost universally in the intestinal tracts of chickens and can be transmitted to humans through contact with chicken feces or consumption of undercooked chicken. 23 , 32 – 36 In the shanty town outside Lima, Peru, where this study took place, the link between C. jejuni in domestic poultry and childhood diarrhea has been established for decades and confirmed repeatedly. 23 , 32 , 37

Study Context and Observation Methods

The observations described here took place as formative research for a trial to test whether corralling free-range chickens and other domestic poultry would reduce Campylobacter- associated diarrhea by minimizing contact between children and birds. 23 The research team recruited 12 local families raising domestic poultry, built corrals for the poultry at each household, and asked each family to test the corral for 8 weeks. A study team member made weekly visits to each household to complete a 19-item structured observation form ( Supplement 1 ) with space to record variables such as number of birds present; number inside and outside each corral; visual evidence that birds might have been outside the corral recently (e.g., feathers or bird droppings in the yard or inside the house); interaction, if any, between birds and children; cleanliness and structural soundness of each corral; and presence and cleanliness of food and water. The weekly visits were carried out at preselected random times during daylight hours Monday–Saturday. Participants were not notified of visits in advance. This unannounced random schedule made it possible to observe the natural state of each household and corral on different days of the week and at different times of day. In addition, the project sociologist made 3–4 random semi-structured spot checks per household over the 8-week period (30 total across the 12 participating households) noting whether, at the moment of arrival, birds were corralled, children were interacting with birds, birds had adequate food and water, and corrals were in good condition. The sociologist took unstructured notes on anything he judged relevant to feasibility or acceptability of corralling.
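As an illustration of the unannounced random schedule described above, the following minimal sketch draws one visit per household per week at a random day and time. It is not the study team's procedure: the household labels, the 07:00 to 18:00 daylight window, the 15-minute granularity, and the function name `draw_visit_schedule` are all assumptions made for the example.

```python
import random
from datetime import time

# Visits took place Monday-Saturday, per the study description.
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]


def draw_visit_schedule(households, weeks=8, start_hour=7, end_hour=18, seed=None):
    """Return {household: [(week, day, visit_time), ...]} with one random visit per week.

    start_hour/end_hour define an ASSUMED daylight window; the source only
    says "daylight hours".
    """
    rng = random.Random(seed)  # seeded so the schedule can be fixed in advance
    schedule = {}
    for household in households:
        visits = []
        for week in range(1, weeks + 1):
            day = rng.choice(DAYS)
            hour = rng.randrange(start_hour, end_hour)  # end_hour itself is excluded
            minute = rng.choice([0, 15, 30, 45])
            visits.append((week, day, time(hour, minute)))
        schedule[household] = visits
    return schedule


# Hypothetical labels for the 12 participating households.
households = [f"HH{i:02d}" for i in range(1, 13)]
schedule = draw_visit_schedule(households, seed=42)
```

Fixing the seed before fieldwork begins is one way to keep the schedule "preselected": the observer cannot shift a visit toward a convenient time, and the household cannot anticipate it.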

Critical Findings

Extent of corralling.

In interviews, participants stated that they kept their birds corralled “all the time.” However, observers found birds loose during 13% of observation visits and 33% of spot checks. Asked about this difference, participants clarified that they let the birds out at certain times such as while cleaning the corrals or to give them time to play ( recrearse )—an activity owners considered essential to their birds' well-being.

Why did participants say they kept their birds corralled all the time when they really didn't? One possible reason is courtesy bias: The project had built them corrals, and so participants may have felt they would disappoint us or seem ungrateful by admitting they didn't always use them. Another possible reason is that they meant something different than we did by “all the time.” Participants took for granted that—like themselves—everyone would understand the need to let birds loose at certain times for practical or health reasons, a “fact” seemingly so obvious as to be unworthy of mention. “All the time” really meant “all the time except at certain (presumably obvious) specific times.” Had we relied solely on interviews (reported behavior), we might never have known that birds were sometimes loose or might never have thought to ask why. Triangulation between what people told us and what we observed revealed critical information about why the intervention might not work.


Sufficient Food and Water

For the local population, one advantage of raising loose poultry was that the birds could find their own food and water. With a corral, the household needed to provide a constant supply of food and water and maintain hygienic conditions. As shown in Figure 1 , both structured observations and spot checks revealed that over the 8-week surveillance only 46% of corrals had food and only 43% had water. Further, corral floors were often wet after birds overturned their water dishes, and food was often rotting. In earlier interviews, participants had expressed concern that corralling would be unhealthy for their birds. Observations made clear that a corralling intervention might validate these concerns unless participants received training on how to keep corrals clean and corralled birds healthy. The data also showed that corralling took more time and effort since someone had to clean the corrals regularly and ensure availability of food and water.
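Percentages like these come from tallying one yes/no item across all completed observation forms. The sketch below shows that tally under stated assumptions: the records are invented for illustration (they are not study data), and the field names `food_present` and `water_present` are hypothetical stand-ins for items on the structured form.

```python
def percent_true(records, field):
    """Percentage of records where `field` is True, skipping visits where it was not recorded."""
    values = [r[field] for r in records if r.get(field) is not None]
    return 100 * sum(values) / len(values)


# Hypothetical observation records (NOT study data); None marks an item left blank.
records = [
    {"household": "HH01", "food_present": True, "water_present": True},
    {"household": "HH02", "food_present": False, "water_present": False},
    {"household": "HH03", "food_present": True, "water_present": False},
    {"household": "HH04", "food_present": False, "water_present": None},
]

food_pct = percent_true(records, "food_present")
water_pct = percent_true(records, "water_present")
```

Excluding blank items from the denominator is a design choice; counting them as "absent" would lower the percentage, so the rule should be decided before analysis.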

Figure 1. Percentage of Domestic Poultry Corrals Containing Food or Water During Weekly Random Observations, Lima, Peru (N=122 Observations)

Contact Between Poultry and Children

The primary objective of corralling was to break the Campylobacter transmission cycle by separating birds from children. Observations demonstrated that children took a keen interest in the new corrals, often swinging on the doors, sticking their fingers through the mesh, or entering to play with the birds. Attempts to childproof corrals with latches or convince parents to keep children away were largely ineffective: Observers continued to encounter children inside. Parents explained that this was natural and appropriate: They wanted their children to grow up around animals. Children as young as 3 were assigned to collect eggs every day. Instead of isolating children from C. jejuni , observations suggested that corralling actually concentrated exposure. This may help explain the finding from a later study that rates of Campylobacter -associated diarrhea among children under 6 were 2 to 7 times higher in corralling households than non-corralling households with the same number of chickens. 38 Without observation, we might have missed the child-bird contact.

Handling of Poultry Manure

One contributor to child Campylobacter exposure not revealed in interviews was household handling of chicken manure. With manure now concentrated in a smaller space, poultry-raising households began to collect it to use as fertilizer. Observers documented that manure removed from coops was often stored in tin cans or buckets outside the coop within easy reach of children. Uncovered storage also allowed the wind to scatter dried manure around the outside of the living area, thus increasing potential contact.

Contrast Between Human and Bird Habitation

Though not part of formal data collection, observers also noted the contrast between human and animal living space. Residents of this area had settled outside Lima as squatters, often after fleeing rural terrorism in the 1980s. Most worked as casual laborers, domestic servants, or textile piece-workers earning the equivalent of $4.00 to $5.00 per day. Many lived in houses cobbled together from discarded materials, often scavenged from construction sites or garbage dumps. Corrals, though built as cheaply as possible, were made from new material at an average cost of $60.00 per household. Figure 2 shows a project-constructed corral to the left with the human habitation in the center. After receiving their corrals, more than 1 participant joked that their birds now enjoyed a higher standard of living than the human members of the family. Documenting this contrast offered a perspective beyond that likely to be achieved through interviews or focus groups alone.

Figure 2. Contrast Between Human and Animal Living Spaces Documented Through Observation, Las Pampas de San Juan de Miraflores, Peru

Project-constructed poultry corral (left foreground) vs. human habitation (center background). Project participants sometimes joked that the birds in the project enjoyed a better standard of living than the people. © 1999 Steven Harvey.

CASE STUDY #2: BED NETS FOR MALARIA PREVENTION IN THE PERUVIAN AMAZON

Malaria was virtually eliminated from the Peruvian Amazon during the 1970s and 1980s but began to reappear sporadically in the mid-1990s, culminating in an epidemic outbreak in 1997. 39 In response that year, the Peruvian Ministry of Health began distributing ITNs to affected communities. This case study involves observations carried out to evaluate the social acceptability of ITNs and to assess their potential efficacy based on human behavior during the peak biting hours of local malaria-transmitting mosquitoes.

The study took place in 1 peri-urban community and 3 rural villages, all within 30 km of Iquitos, the Peruvian Amazon's largest city. Over 9 months, 4 observers carried out 1 dusk-to-dawn observation in each of 60 households. Upon arrival, the observer used a structured form ( Supplement 2 ) to collect information about the number, ages, and relationships of household occupants; the number and types of sleeping spaces; and the number and types of bed nets. The observer then took unstructured notes at 5-minute intervals throughout the night, recording the location and activities of each household member. Most households consisted of a wooden platform on stilts raised about 2 meters off the ground and covered with a thatched roof. These structures had few rooms or interior dividers, so observers could follow most household activities from a single vantage point. 17 , 40

Net Use During Peak Biting Hours

A key concern about ITN effectiveness in the Americas is whether people are likely to be inside a net during the hours when local malaria-transmitting mosquitoes bite. Observation allowed us to systematically document net use. As shown in Figure 3 , people began to enter their nets for the night as early as 7:00 p.m., but only about half the population was inside a net by 8:30 p.m. and slightly less than 80% by 9:30 p.m., the peak biting hour for Anopheles darlingi , the Amazon's most important malaria vector. 42 This suggests that ITNs might be somewhat effective, but not as effective as in Africa where principal vector species feed later at night. Rather than observing all night, we might have simply asked people what time each member of the household went to bed the previous night, but in a setting where few people had watches or clocks, it would have been hard for them to respond with much precision. Social desirability bias might also have affected people's reports about their own behavior: At the time, the Ministry of Health was running a campaign encouraging people to enter their nets at dusk—a practice unlikely to be feasible in an area near the equator where the sun sets around 6:30 p.m. throughout the year.
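To show how interval notes can be turned into this kind of coverage figure, here is a minimal sketch that computes the share of people inside a net by a given clock time. The entry times are invented for illustration and are not study data. The function name `share_in_net_by` is hypothetical, and the sketch assumes every first entry happens before midnight and ignores later exits.

```python
from datetime import datetime


def share_in_net_by(cutoffs, entry_times):
    """For each cutoff ('HH:MM'), return the fraction of people whose
    first net entry was at or before that time.

    Assumes all times fall on the same evening (before midnight).
    """
    fmt = "%H:%M"
    entries = [datetime.strptime(t, fmt) for t in entry_times]
    shares = {}
    for cutoff in cutoffs:
        limit = datetime.strptime(cutoff, fmt)
        shares[cutoff] = sum(e <= limit for e in entries) / len(entries)
    return shares


# Hypothetical first-entry times for ten people (NOT study data).
entry_times = ["19:00", "19:40", "20:05", "20:20", "20:30",
               "20:50", "21:10", "21:25", "21:45", "22:15"]

shares = share_in_net_by(["20:30", "21:30"], entry_times)
```

With these made-up times, half of the group is under a net by 8:30 p.m. and 8 of 10 by 9:30 p.m.; the value of direct observation is that the inputs are recorded by a clock-watching observer rather than recalled by participants.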

Figure 3. Percentage of the Population in Bed by Half-Hour (N=60 Observations) Compared With Anopheles darlingi Feeding Behavior (a), Department of Loreto, Peru

(a) Data on mosquito feeding behavior come from Vittor (2003). 41

Multiple Entries and Exits

One unanticipated finding was the number of times people enter and exit ITNs during the night. 43 Each time the net is lifted, mosquitoes have an opportunity to enter. Parents who share nets with children may spend considerable time outside the net unprotected after their children have gone to sleep. The Table shows an example of a single sleeping space occupied by a 23-year-old mother and her 2-year-old son. The net was lifted a total of 20 times between 7:00 p.m. and 6:30 a.m. The mother spent 195 minutes outside the net between the first time she entered with her son at 7:00 p.m. and the time both of them got out of bed at 6:30 the next morning.
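Counts and durations like these can be derived from a timestamped entry and exit log. The sketch below is one possible way to do it, not the study's method: the log is invented for illustration, it tracks a single person, it treats each of her entries or exits as one net lift, and it counts time outside only between an exit and the next re-entry (so getting up in the morning is excluded).

```python
def to_minutes(hhmm, night_start_hour=12):
    """Convert 'HH:MM' to minutes since noon so that times after
    midnight sort after evening times."""
    hour, minute = map(int, hhmm.split(":"))
    if hour < night_start_hour:
        hour += 24  # early-morning times belong to the same night
    return (hour - night_start_hour) * 60 + minute


def summarize_net_use(events):
    """events: chronological list of ('HH:MM', 'enter' | 'exit') for one person.

    Returns (net_lifts, minutes_outside_between_exit_and_reentry).
    """
    lifts = len(events)  # assumption: every entry or exit lifts the net once
    outside = 0
    last_exit = None
    for clock, action in events:
        t = to_minutes(clock)
        if action == "exit":
            last_exit = t
        elif action == "enter" and last_exit is not None:
            outside += t - last_exit
            last_exit = None
    return lifts, outside


# Hypothetical log for one adult (NOT the study's data).
log = [("19:00", "enter"), ("19:30", "exit"), ("21:00", "enter"),
       ("02:10", "exit"), ("02:25", "enter"), ("06:30", "exit")]

lifts, minutes_outside = summarize_net_use(log)
```

In this made-up log the net is lifted 6 times and the person spends 105 minutes outside it during the night (90 minutes in the evening plus 15 minutes after 2:00 a.m.).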

Observational Bed Net Entry and Exit Data From a Single Sleeping Space With 2 Occupants, a 23-Year-Old Mother and Her 2-Year-Old Son, Peruvian Amazon

An unanticipated finding from an observational study of bed net use was the number of times people enter and exit their bed nets during the night—as many as 20 times for 1 mother with a young son.

Additional Potential Risk Factors

Observations revealed other phenomena that would have been difficult to capture with interviews or focus groups. For instance, observers took detailed notes on sleeping spaces in participating households. These notes revealed that many people slept directly on cane flooring rather than on a bed. The flooring had gaps between the cane staves. Since many houses were built on stilts, this meant mosquitoes could enter the sleeping space from below. A net alone could not provide adequate protection in this setting: An effective malaria prevention intervention would need to help at-risk individuals find a way to protect themselves from below as well as from above. Observers also documented other practices that might increase exposure risk: attending evening church services during peak biting hours, bathing after dark, running small home-based stores where community members came to buy food or basic necessities in the evening hours, and other nighttime activities such as hunting, fishing, or charcoal production. While study participants reported some of these activities during interviews, direct observation allowed the study team to document them more systematically.

CASE STUDY #3: ASSESSING THE COMPETENCY OF SKILLED BIRTH ATTENDANTS IN 7 COUNTRIES

About 90% of the 300,000–350,000 annual maternal deaths worldwide are caused by 5 common obstetric complications: postpartum hemorrhage, pregnancy-induced hypertension, obstructed labor, perinatal sepsis, and postabortion complications. 44 , 45 Risk for experiencing one of these life-threatening complications cannot be reliably predicted in advance, but most can be treated successfully if the woman experiencing them has access to basic or comprehensive essential obstetric care delivered by an SBA. For this reason, the World Health Organization (WHO) recommends that all pregnant women be assisted by an SBA during labor and delivery. 46 Several international organizations have defined the competencies necessary to manage these complications. The observations described below were carried out as part of developing a method to assess these competencies among practicing SBAs in low- and middle-income countries.

Testing a clinician's competency to manage a complication according to standards requires assessing not only abstract knowledge but also physical or manual ability. Knowledge can be measured using a written exam, but the only way to assess manual skill is by watching someone perform a task to see whether she or he does it correctly. Assessing skills on actual patients, however, is problematic. Ethically, an observer qualified to evaluate clinical competency would need to stop observing and intervene before allowing an insufficiently skilled provider to endanger a patient's life or well-being. Moreover, even common obstetric complications are relatively rare. This makes it impossible to assess the skill of more than a handful of providers using actual patients.


The observations discussed here were designed to test SBA competency at performing 4 critical procedures. The first 3 procedures—active management of the third stage of labor (AMTSL), manual removal of the placenta, and bimanual uterine compression—are performed to prevent or control postpartum hemorrhage in a mother who has just given birth. The fourth, neonatal resuscitation with an Ambu bag, is used to treat neonatal asphyxia. The project, eventually carried out in Benin, Ecuador, Jamaica, Kenya, Nicaragua, Rwanda, and Tanzania, used expert obstetrician/gynecologists and pediatricians from host countries as observers. SBAs being assessed performed each procedure on an anatomical model (Gaumard S500 Advanced Childbirth Simulator and Simulaids Sani-Baby CPR mannequin or Gaumard S320 Newborn Airway Trainer); observers assessed competency using a structured step-by-step checklist ( Supplement 3 ). 27 , 28

Correct hand position and movement are essential to successfully performing all 4 tasks. Controlled cord traction, an elective component of AMTSL, requires exerting a gentle downward pull on the umbilical cord with one hand while using the other to prevent uterine inversion by applying counter-traction just above the pubic bone. 47 In case of retained placenta, manual removal requires inserting the hand through the vaginal canal and using a gentle lateral motion to detach the placenta intact, leaving no fragments that could provoke continued bleeding or cause sepsis. Figure 4 shows an expert observer demonstrating manual removal with the Gaumard Advanced Childbirth Simulator. The open abdominal cavity allows the observer to assess the technique of the SBA being observed. Some SBAs might be able to describe these or similar procedures, but even a precise detailed description would not necessarily indicate ability to perform them.


Demonstration of the Correct Hand Position for Manual Removal of a Retained Placenta on an Anatomical Model

© 2006 Steven Harvey

Observations across the 7 study countries revealed the following:

  • Though AMTSL is commonly included in national standards for managing uncomplicated delivery, most SBAs did not know how to perform controlled cord traction.
  • Similarly, most SBAs could not demonstrate the correct hand positions for carrying out the manual removal of a retained placenta.
  • Although bimanual uterine compression is a relatively simple procedure requiring no instruments or equipment, virtually no SBA was familiar with it.


Neonatal Resuscitation With an Ambu Bag: Correct vs. Incorrect Positioning

Left: Correct positioning of mask, bag, and newborn's head to achieve a good seal, with bag perpendicular to the newborn's body. © 2006 Steven Harvey.

Right: Incorrect positioning, with bag parallel with the newborn's body, making it more difficult to achieve a good seal. © 2002 Steven Harvey.

Using checklists adapted to each country's norms, observation also enabled the study team to assess whether SBAs followed prescribed infection prevention guidelines including handwashing, gloving, and post-procedure decontamination. Participating SBAs were provided with all necessary supplies and equipment. At the beginning of each assessment, the observer instructed each participant to “begin by preparing yourself, the equipment, and the patient,” then noted if the SBA proceeded in accordance with norms. At the end, the observer similarly instructed each participant to “please tell me what more you would do or ask someone else to do once you have finished the procedure.”

It's tempting to classify this research as summative since its initial objective was to assess existing health worker skills. But it was also formative, because the results helped shape interventions: In the short term, observers offered feedback and retraining to each participant and, when many participants had a particular weakness in common, to the entire group. In the longer term, findings have influenced training programs and assessment methods in participating countries and around the globe.

CASE STUDY #4: ASSESSING CHW ABILITY TO USE MALARIA RAPID DIAGNOSTIC TESTS IN ZAMBIA

For decades leading up to the early 2000s, malaria in sub-Saharan Africa was diagnosed presumptively: Anyone with a fever was presumed to have malaria and treated with antimalarials. This practice developed because the supply of both microscopes and trained microscopists was too limited to diagnose more than a tiny fraction of febrile patients. In addition, first-line antimalarial drugs were cheap and adverse effects negligible, so presumptive treatment involved minimal cost and risk. After introduction of artemisinin combination therapy as first-line treatment for malaria starting around 2004, WHO recommended parasite-based diagnosis first for adults and older children, then for all suspected cases of malaria regardless of age. 48 Malaria rapid diagnostic tests (RDTs) make parasite-based diagnosis possible even at health facilities with no laboratory, microscope, or microscopist. In many areas, however, febrile patients seek treatment at the community level without ever visiting a health facility. The observations described in this case study were carried out to determine whether volunteer community health workers (CHWs) could use RDTs safely and accurately and, if so, what sort of training materials they needed.

Based on focus group discussions with Zambian CHWs, the study team designed a job aid and brief training curriculum. We used structured observation to pilot test these materials. Study team members observed 79 CHWs prepare 3 RDTs each and recorded the results on a 16-item checklist ( Supplement 4 ). 24 , 25

  • Malaria RDTs require using a sterile lancet to draw a finger-stick blood sample, a procedure that is second nature to many professional health workers. Due to concerns about HIV and other blood-borne diseases, however, most African CHWs were prohibited from taking finger-stick blood samples. The Zambian Medical Council authorized the practice for this study, but few participating CHWs had ever taken a sample or used a lancet. During training, observers noticed that instead of drawing blood with a quick stab—the preferred approach—many CHWs set the point of the lancet on the patient's fingertip, then pushed it into the skin. Participants explained they were doing this for fear that stabbing would cause the patients too much pain, but the effect was just the opposite: Pushing was more painful. In addition, it often produced too little blood, thus necessitating a second, third, or even fourth finger prick. Observing this made clear that CHWs needed specific training on proper lancet technique. The study team subsequently developed a training module demonstrating how to extract sufficient blood with a single prick. Improved CHW technique reduced patient discomfort and increased testing quality.
  • Watching CHWs transfer blood from fingertip to test cassette yielded a similar revelation. The project RDT came packaged with a loop-shaped blood transfer device designed to collect a 5 μl film of blood across the width of the loop. CHWs did the finger prick with the ball of the patient's finger facing up, then tried to collect the drop from above. This often conveyed too little blood to the test cassette even after multiple tries. Noting this, an experienced observer suggested pricking the finger, rotating the patient's hand 180°, then collecting the drop from underneath with the ball of the finger facing down. In most cases, this made it possible to collect and transfer the precise volume of blood required on the first attempt.
  • A key concern related to blood safety was correct disposal of the blood-contaminated lancet. To minimize danger to patients, CHWs, and the community, the research team distributed sharps boxes to all participating CHWs and instructed them to deposit the used lancet into the sharps box immediately after pricking the patient's finger. Setting down the used lancet prior to disposal heightens risk of finger-stick injuries. Observers noticed that positioning the sharps box appropriately made immediate disposal convenient: For a right-handed CHW, this meant placing the sharps box on the right side of the work space, and vice versa for a left-handed CHW. Placing the box on the opposite side of the CHW's dominant hand forced the CHW to reach across both his or her own body and that of the patient. This made handling the used lancet more risky and immediate disposal more difficult.


Malaria Rapid Diagnostic Test Job Aid

A job aid for community health workers lists at the top all supplies and equipment that the worker needs to assemble prior to conducting a rapid diagnostic test for malaria.

  • Watching CHWs provide services from home led to another observational finding: Many CHW homes lack electricity and thus have poor-quality artificial lighting. This fact can affect the accuracy of test interpretation when RDTs are prepared inside, especially after dusk or during inclement weather. The RDT's positive test line—indicating that a patient is infected with malaria—can often be quite faint. With inadequate artificial lighting inside and insufficient natural light outside, a CHW could easily misread a faint positive result as negative, thus leaving an infected patient untreated. Realizing this led to added emphasis during training that positive lines are sometimes quite faint and that CHWs should read results in the brightest light possible to avoid missing a faint positive.

Observation produced novel insights in the case studies just described, but how do you decide when observation might be valuable or even essential for your intervention or study? To answer this, it's useful to think in terms of categories of events or processes. Among others, these might include mechanical skills, health service delivery processes, effects of the built environment, and habitual practices that people would have difficulty articulating, sometimes known as “tacit knowledge.” 49 , 50


Mechanical Skills

The SBA and RDT case studies both illustrate the value of observation to understanding mechanical skills, including critical details such as the correct hand position needed to effectively carry out a lifesaving obstetric or neonatal intervention. Manual removal of a retained placenta or resuscitation of an asphyxiated newborn are two examples. Although lancet technique, sharps box position, or collecting blood with the fingertip facing up or down might seem like minute details when preparing an RDT, they can make the difference between effective, efficient, safe practices and practices that lead to incorrect results or endanger the patient, the health worker, or the community. Observation in these cases is critical not only to diagnose lapses but also to identify interventions that can address them. Observation thus led to additional practical training for SBAs and to development of specific training modules and revised job aid illustrations for malaria RDTs. Beyond their specific substantive findings, these two studies highlight the value of observation to understanding both health worker and community behavior.

Sequential Processes

Many public health interventions involve sequential processes: Not only must each step be performed properly, it must also be performed in the proper order. Again, the RDT case study offers an illustrative example: The study team identified 16 discrete steps necessary to correctly prepare and interpret the test; performing them in the wrong order (e.g., opening the sterile lancet before cleaning the finger with an alcohol swab) or the wrong way (depositing the blood drop where the buffer solution is supposed to go) could compromise test accuracy or patient or health worker safety. The observation checklist ( Supplement 4 ) enabled the team to determine the proportion of health workers who completed all steps correctly, identify specific steps where health workers had problems, and modify training to address the problems observed. Greenland et al. used a similar approach in Zambia to determine what proportion of caregivers of young children with diarrhea could prepare oral rehydration solution correctly. 51 Hurley et al. used a combination of structured and unstructured observations to track the flow of pregnant women through antenatal care in Mali and better understand why many completed their visits without receiving intermittent preventive treatment for malaria in pregnancy (IPTp) or received it without any information about the purpose of IPTp. 52 Hermida et al. found observation to be more accurate than patient exit interviews or medical record review for assessing facility-based provider adherence to standards of care for acute lower respiratory infection, diarrheal disease, and family planning counseling. 53 For this reason, observation is often a key component of quality improvement research. 53 , 54 In sum, observation can be an invaluable tool for documenting the necessary steps in a process, identifying where breakdowns occur, and thus pinpointing where intervention is needed. 
This type of analysis can be useful at the household, community, and health facility levels.
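As a rough illustration of how structured checklist data of this kind might be summarized, the sketch below computes the proportion of observations in which every step was done correctly and counts which steps were missed most often. The data layout, the function name `summarize`, and the sample records are assumptions made for illustration; they are not the study's data or analysis code.

```python
from collections import Counter

N_STEPS = 16  # the RDT checklist described in the text has 16 items


def summarize(observations, n_steps=N_STEPS):
    """Return (proportion with all steps correct, per-step miss counts).

    observations: list of dicts mapping step number -> True if the step
    was performed correctly (hypothetical layout, not the study's format).
    """
    if not observations:
        raise ValueError("no observations to summarize")
    errors = Counter()
    all_correct = 0
    for obs in observations:
        # A step that was not recorded is treated as not performed correctly.
        missed = [s for s in range(1, n_steps + 1) if not obs.get(s, False)]
        if not missed:
            all_correct += 1
        errors.update(missed)
    return all_correct / len(observations), errors


# Illustrative records only; these are not study results.
perfect = {s: True for s in range(1, N_STEPS + 1)}
missed_one = {**perfect, 7: False}
missed_two = {**perfect, 7: False, 9: False}
observations = [perfect, missed_one, missed_two, perfect]

proportion, errors = summarize(observations)
print(f"All steps correct: {proportion:.0%}")
for step, count in errors.most_common():
    print(f"Step {step}: missed in {count} of {len(observations)} observations")
```

A tally like this only flags which steps fail most often. Checking whether steps were performed in the proper order would require recording the sequence as well, which this sketch does not attempt.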

Understanding the Built Environment

The built environment—and sometimes its relationship to the natural environment—can significantly affect disease risk, health service delivery, and the feasibility of health interventions. The Campylobacter study setting consists of dusty desert hills where water is scarce and rain nonexistent (natural environment). Since the poorest people live at the top of those hills with neither wells nor piped water (built environment), many families struggle to provide water for themselves. Water for corralled birds becomes, at best, a secondary priority. Observing the difficulty of obtaining water helped study team members better understand owners' concerns about the effect of corralling on birds' health. Wind (natural environment) combined with open storage of concentrated chicken manure cleaned from the corrals (built environment) turned out to be one form of continued contact between humans and Campylobacter despite corralling.

The built environment was likewise a critical aspect of the bed net study. The structure of a typical bed in the study setting—no mattress and gaps between the wooden or bamboo slats that allowed mosquitoes to bite from underneath—might never have occurred to public health practitioners, most of whom presumably sleep in beds with mattresses. Even had it occurred to them, they would not have been able to collect systematic data on bed configurations without observation. Thus, observation revealed one potential limitation of bed net efficacy in the study setting. This, in turn, revealed a necessary component of any improvement intervention: figure out how to block the gaps between flooring that allowed mosquitoes to enter.

Systematically observing the built environment can be revealing in many settings. By documenting patient flow at health centers and hospitals, maternal health researchers from the Quality Assurance Project helped explain why women arriving with an obstetric complication might encounter significant, sometimes life-threatening, delays before seeing a clinician. 55 – 58 Observing both the size of rooms in a house and their use for multiple purposes (sleeping at night, running a small retail shop during the day) helped explain why some households in Ghana were reluctant to permanently install bed nets over their sleeping spaces and why, in some cases, residents preferred conical nets to rectangular. 59 Observing the dim lighting in CHWs' houses helped explain why CHWs might miss weak positive RDT results and why training programs needed to emphasize the importance of reading test results under bright light. 25 Many U.S. researchers have used observation to study the relationships between built environment, physical activity, available food choices, and chronic diseases such as obesity and diabetes. 60 – 63 As with the discussion of sequential processes above, it is worth reiterating that observations related to the built and natural environments can be useful at the household, community, and health facility levels.

Habitual Practices and Tacit Knowledge

In any setting, people perform a variety of routine activities, the procedures for which they learned at some point in the past, committed to memory, and carry out automatically, almost as if by instinct. Because these activities are habitual, those who perform them often have difficulty articulating the step-by-step process and even come to think of that process as self-evident. Collecting a finger-stick blood sample is a case in point. A health care provider who has done it many times considers it second nature and wonders why a novice finds it so difficult. Observation reveals that the process involves numerous steps: assemble all the supplies before starting, swab the fingertip with alcohol, wait for it to dry, massage the finger to work the blood up into the fingertip, open the sterile lancet, puncture the fingertip with a quick stab, orient the fingertip with the blood drop in the optimum position for the particular blood collection device being used, etc. The experienced provider has internalized all this and performs it without needing to think. The novice may fail to massage the finger, stab too timidly and thus extract too little blood, or orient the fingertip in a less than optimal position and thus collect too little blood, or too much. Observing both expert and novice helps distinguish the differences and thus determine what training the novice requires.


The Campylobacter study provides additional examples: Interview or focus group participants might fail to mention the many points of contact between children and birds either because they knew the intervention was meant to separate the two (courtesy bias) or because the types of contact were so commonplace as to seem unworthy of mention. Observing children play with birds, feed and water them, collect eggs, and clean corrals provides tangible evidence that those designing public health interventions should take into account both human nature (children like to play with animals) and economic and cultural practices (even a very young child may be assigned household chores; parents may view learning to raise animals as a key life skill). Cumulative findings from these observations contributed to a conclusion that the intervention was unlikely to succeed, a conclusion confirmed by subsequent research demonstrating that corralling, instead of decreasing risk of Campylobacter-associated diarrhea in children, actually doubled it. 38

The bed net study also provides examples: Absent observation, as noted above, public health practitioners might not have thought to ask about bed design. Conversely, mentioning bed design—an aspect of daily existence so routine as to pass virtually unperceived—might never have occurred to a member of the at-risk population. Had interviewers thought to ask, net occupants might also have mentioned that they enter and exit their nets more than once per night, but it is unlikely that they could have reported very precisely the number of entries and exits, the amount of time the net was lifted, or the amount of time different occupants spend outside the net. Observation made it possible to quantify this phenomenon much more systematically. 43
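To make concrete what quantifying net entries and exits could look like, here is a minimal sketch that takes a timestamped log for one occupant and derives the number of times the net was lifted and the minutes spent outside it. The log format, the helper names `at` and `minutes_outside`, the example times, and the rule that every entry or exit counts as one lift are all assumptions for illustration; they do not reproduce the study's instrument or data.

```python
from datetime import datetime, time, timedelta

START_DAY = datetime(2000, 1, 1).date()  # arbitrary date for the evening of observation


def at(hhmm):
    """Parse 'HH:MM'; times before noon are treated as the following morning."""
    hour, minute = map(int, hhmm.split(":"))
    day = START_DAY + timedelta(days=1 if hour < 12 else 0)
    return datetime.combine(day, time(hour, minute))


def minutes_outside(events, end):
    """Sum the minutes one occupant spent outside the net.

    events: chronological list of (timestamp, "enter" or "exit") for one person.
    end: timestamp at which the observation period ended.
    """
    total = timedelta()
    left_at = None
    for when, action in events:
        if action == "exit":
            left_at = when
        elif action == "enter" and left_at is not None:
            total += when - left_at
            left_at = None
    if left_at is not None:  # still outside when observation ended
        total += end - left_at
    return total.total_seconds() / 60


# Hypothetical log; these are not the observations reported in the text.
events = [
    (at("19:00"), "enter"),
    (at("19:30"), "exit"),
    (at("21:00"), "enter"),
    (at("02:00"), "exit"),
    (at("02:15"), "enter"),
]

lifts = len(events)  # assumption: each entry or exit lifts the net once
outside = minutes_outside(events, end=at("06:30"))
print(f"Net lifted {lifts} times; {outside:.0f} minutes spent outside")
```

Recall-based interviews could not supply timestamps at this resolution, which is why a log of observed events is the input assumed here.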

After validating the method, Gittelsohn used structured mealtime observations to estimate differences in caloric and micronutrient intake between men, women, and children in lowland south-central Nepal. 64 – 66 It is unlikely that parents would have been able to provide such detailed information about intra-household food allocation. Bentley et al. used structured observation during formative research to document child feeding practices prior to a nutritional intervention to improve infant growth and development in Andhra Pradesh, India. 10 Brummell used observation to discover tacit knowledge related to the prognosis of patients suffering cardiac arrest and whether to attempt resuscitation in 2 UK hospital emergency departments. 67 Huot and Laliberte Rudman, who used participant observation to learn about the daily routines of refugees in Canada, explain why observation can be so important for understanding habitual phenomena 68 :

The tacit nature of daily occupation can make the details involved in participation difficult to verbalize because respondents may not have reflected upon their occupational engagement in such detail, or may assume that such “minutia” may not be relevant for research.

This statement could be extended to many areas of health at individual, household, community, and facility levels. Often observation, used together with more common methods like interviews or focus groups, is the only way to make such tacit knowledge explicit.

Triangulating Observation Data With Data From Other Methods

In both the case studies described here and many of the examples cited, researchers used observation together with other methods to achieve a more complete picture of a setting, practice, or intervention. Using observation to triangulate information gathered from interviews or focus group discussions can bring to light differences between what people say they do (reported behavior) and what people actually do (observed behavior). In some cases, this may reveal social desirability bias: People over- or under-report a particular behavior because it violates what they perceive to be social norms. Hygiene studies, for instance, have often found that people over-report handwashing at critical times; observation shows much lower levels. 69 , 70


There are no Peruvian data on reported ITN use that we can compare to the case study #2 observation. But there is at least a plausible basis for comparison in Ghana: Nighttime observation of net use in Northern and Upper West Regions found that only 17% of the population used a net at any time during the night. 71 In a malaria indicator survey of the same 2 regions, 51% and 54% of the population reported sleeping under a net. 72 The numbers are not directly comparable for many reasons, so these differences should be interpreted with caution. The observation study is based on a small purposive sample, the survey on a population-based representative sample; the data were collected in different years and at different times of year. But the wide gap suggests a considerable difference between reported and actual net use. Also, for the observation sample, we know when each individual entered and exited his or her net and how long individuals spent protected versus unprotected. All we know from the survey is that the individual reported sleeping under the net at some point during the night; we have no idea for how long.

Triangulation may also reveal that a word, phrase, or concept means something different to participants than to the researcher. The possibility, in the Campylobacter study, that participants who reported keeping their birds in the corral “all the time,” really meant “all the time except for certain specific seemingly obvious times” is one example. Had we employed only interviews in that study, we would likely have concluded—incorrectly—that birds were never loose. Had we employed only observation, we would likely have concluded that birds were loose 20% of the time—more accurate, but not the whole story. Only the combination revealed the differences in meaning and their conflicting unspoken assumptions.

Observation and Reactivity

A key objection to observation is that it leads to reactivity: Those under observation may change their behavior because they know they are being observed. However, this problem is not unique to observation: People also change their behavior when they are being studied in other ways. Survey and interview respondents may answer questions based on what they think society (social desirability bias) or the interviewer (courtesy bias) expect of them. Observer expectancy effect refers to how an observer can shape behavior—deliberately or subconsciously—by providing subtle nonverbal cues such as slight changes in facial expression. The Hawthorne effect was named for a study in which factory workers from both intervention and control groups became more productive because they knew that researchers were testing possible interventions (such as better lighting) to improve productivity. More detailed definitions are beyond the scope of this article but can be found in many social science references. 73 – 76

In one example of reactivity, P.V. Ram and colleagues found evidence of a 35% increase in handwashing when an observer was present compared with when there was no observer and handwashing was detected by a motion sensor hidden within a bar of soap. 77 But while reactivity often does occur, researchers can measure and adjust for it. 17 Reactivity also diminishes with time: The longer or the more often people are observed, the less likely they are to react to an observer's presence. 78 – 80 Ram's team concluded that their findings “call into question the validity of structured observation details because it appears that a majority of participating caregivers substantially altered their behavior in the presence of an observer.” But the study included only 1 observation per household. Had Ram's team observed each household multiple times and waited until household members became accustomed to the observer's presence, their results might have been different.

Ram and her colleagues have a point that in some cases a less invasive technological method might be preferable to observation. For example, studies exploring household use of cleaner cookstoves to reduce indoor air pollution often use temperature sensors (called stove use monitors or SUMs) to track which stove is being used when and for how long. 81 , 82 At least one recent study reports that combining observation and SUMs data provides a more accurate picture than SUMs data alone. 83

Moreover, reactivity is often unrelated to the focus behavior. In the bed net study, we identified 339 instances of reactivity across 60 observations using the broadest possible definition: any interaction whatsoever between the observer and any member of the observed household. Of these 339 instances, only 2 were directly related to the behavior of interest: protecting against mosquito bites. 17 In a similar way, John Schnelle and colleagues found that observations did not change provider treatment of nursing home residents in the United Kingdom. 84

Another way to control reactivity is through unannounced spot checks similar to those we used in case study #1. Nazmul Chaudhury and colleagues used this method to chronicle the degree of health worker and teacher absenteeism in health facilities and primary schools in Bangladesh, Ecuador, India, Indonesia, Peru, and Uganda. 85 In his classic article about nighttime observations among the Samukundi Abelam, Richard Scaglion describes how he used spot checks to document time allocation within this Papua New Guinea ethnic group. 86 Scaglion admits, however, that he was not always able to maintain the element of spontaneity that spot check observations are meant to provide:

… it is not easy for an anthropologist in the field to come upon an Abelam unawares. Since I did not want to record “greeting anthropologist” as a frequent activity when people were first observed, I often had to reconstruct what they were doing immediately before I arrived.

In sum, observation can be an essential tool in formative research. As a stand-alone method, it can measure phenomena not measurable by any other method. In combination with interviews or focus groups, it can suggest questions to be posed through these other methods. It can also triangulate findings from other methods, reveal potential differences between reported and observed behavior, and thus help assess social desirability bias. Given these benefits, observation—either alone or in combination with other methods—is something both investigators and program managers should consider when undertaking formative research.


Acknowledgments.

I am grateful to Marianne Henry for her help with literature review and manuscript preparation. I wish to thank the editor and editorial staff of GHSP as well as the 3 anonymous reviewers, all of whose comments considerably strengthened this manuscript. I also wish to thank the many participants in the 4 studies described here for their time, patience, and willingness to participate. Finally, I am grateful for the comments and suggestions of the many students with whom I have discussed these concepts in formative research classes over nearly a decade and to Drs. Elli Leontsini and Peter Winch for inviting me to do so.

Peer Reviewed

Competing Interests: None declared.

Funding: Funding for case study #1 was provided by the Thrasher Research Fund (award 02813-1). Funding for case study #2 was provided by the US Agency for International Development (USAID) under Grant Number 527G001000070. Case study #3 was supported by the Quality Assurance Project under contracts number HRN-C-00-96-90013 and GPH-C-00-02-00004-00 with the United States Agency for International Development (USAID). Funding for case study #4 was provided by the Australian Agency for International Development (AusAID), the WHO Special Programme for Research and Training in Tropical Diseases (TDR), and the United States Agency for International Development (USAID) under the Quality Assurance and Workforce Development Project at University Research Co., LLC (contract number GPH-C-00-02-00004-00). Conclusions and opinions are the sole responsibility of the author and do not necessarily reflect the views or policies of the funders.

First Published Online: May 23, 2018

Cite this article as: Harvey SA. Observe before you leap: why observation provides critical insights for formative research and intervention design that you'll never get from focus groups, Interviews, or KAP Surveys. Glob Health Sci Pract. 2018;6(2):299-316. https://doi.org/10.9745/GHSP-D-17-00328


Case Study – Methods, Examples and Guide


Case Study Research

A case study is a research method that involves an in-depth examination and analysis of a particular phenomenon or case, such as an individual, organization, community, event, or situation.

It is a qualitative research approach that aims to provide a detailed and comprehensive understanding of the case being studied. Case studies typically involve multiple sources of data, including interviews, observations, documents, and artifacts, which are analyzed using various techniques, such as content analysis, thematic analysis, and grounded theory. The findings of a case study are often used to develop theories, inform policy or practice, or generate new research questions.

Types of Case Study

The main types of case study are as follows:

Single-Case Study

A single-case study is an in-depth analysis of a single case. This type of case study is useful when the researcher wants to understand a specific phenomenon in detail.

For example, a researcher might conduct a single-case study on a particular individual to understand their experiences with a particular health condition, or on a specific organization to explore its management practices. The researcher collects data from multiple sources, such as interviews, observations, and documents, and uses various techniques to analyze the data, such as content analysis or thematic analysis. The findings of a single-case study are often used to generate new research questions, develop theories, or inform policy or practice.

Multiple-Case Study

A multiple-case study involves the analysis of several cases that are similar in nature. This type of case study is useful when the researcher wants to identify similarities and differences between the cases.

For example, a researcher might conduct a multiple-case study on several companies to explore the factors that contribute to their success or failure. The researcher collects data from each case, compares and contrasts the findings, and uses various techniques to analyze the data, such as comparative analysis or pattern-matching. The findings of a multiple-case study can be used to develop theories, inform policy or practice, or generate new research questions.

Exploratory Case Study

An exploratory case study is used to explore a new or understudied phenomenon. This type of case study is useful when the researcher wants to generate hypotheses or theories about the phenomenon.

For example, a researcher might conduct an exploratory case study on a new technology to understand its potential impact on society. The researcher collects data from multiple sources, such as interviews, observations, and documents, and uses various techniques to analyze the data, such as grounded theory or content analysis. The findings of an exploratory case study can be used to generate new research questions, develop theories, or inform policy or practice.

Descriptive Case Study

A descriptive case study is used to describe a particular phenomenon in detail. This type of case study is useful when the researcher wants to provide a comprehensive account of the phenomenon.

For example, a researcher might conduct a descriptive case study on a particular community to understand its social and economic characteristics. The researcher collects data from multiple sources, such as interviews, observations, and documents, and uses various techniques to analyze the data, such as content analysis or thematic analysis. The findings of a descriptive case study can be used to inform policy or practice or generate new research questions.

Instrumental Case Study

An instrumental case study is used to understand a particular phenomenon that is instrumental in achieving a particular goal. This type of case study is useful when the researcher wants to understand the role of the phenomenon in achieving the goal.

For example, a researcher might conduct an instrumental case study on a particular policy to understand its impact on achieving a particular goal, such as reducing poverty. The researcher collects data from multiple sources, such as interviews, observations, and documents, and uses various techniques to analyze the data, such as content analysis or thematic analysis. The findings of an instrumental case study can be used to inform policy or practice or generate new research questions.

Case Study Data Collection Methods

Here are some common data collection methods for case studies:

Interviews

Interviews involve asking questions of individuals who have knowledge or experience relevant to the case study. Interviews can be structured (where the same questions are asked of all participants) or unstructured (where the interviewer follows up on the responses with further questions). Interviews can be conducted in person, over the phone, or through video conferencing.

Observations

Observations involve watching and recording the behavior and activities of individuals or groups relevant to the case study. Observations can be participant (where the researcher actively participates in the activities) or non-participant (where the researcher observes from a distance). Observations can be recorded using notes, audio or video recordings, or photographs.

Documents

Documents can be used as a source of information for case studies. Documents can include reports, memos, emails, letters, and other written materials related to the case study. Documents can be collected from the case study participants or from public sources.

Surveys

Surveys involve asking a set of questions of a sample of individuals relevant to the case study. Surveys can be administered in person, over the phone, through mail or email, or online. Surveys can be used to gather information on attitudes, opinions, or behaviors related to the case study.

Artifacts

Artifacts are physical objects relevant to the case study. Artifacts can include tools, equipment, products, or other objects that provide insights into the case study phenomenon.

How to Conduct Case Study Research

Conducting case study research involves several steps that help ensure the quality and rigor of the study. Here are the steps to conduct case study research:

  • Define the research questions: The first step in conducting case study research is to define the research questions. The research questions should be specific, answerable, and relevant to the case study phenomenon under investigation.
  • Select the case: The next step is to select the case or cases to be studied. The case should be relevant to the research questions and should provide rich and diverse data that can be used to answer the research questions.
  • Collect data: Data can be collected using various methods, such as interviews, observations, documents, surveys, and artifacts. The data collection method should be selected based on the research questions and the nature of the case study phenomenon.
  • Analyze the data: The data collected from the case study should be analyzed using various techniques, such as content analysis, thematic analysis, or grounded theory. The analysis should be guided by the research questions and should aim to provide insights and conclusions relevant to the research questions.
  • Draw conclusions: The conclusions drawn from the case study should be based on the data analysis and should be relevant to the research questions. The conclusions should be supported by evidence and should be clearly stated.
  • Validate the findings: The findings of the case study should be validated by reviewing the data and the analysis with participants or other experts in the field. This helps to ensure the validity and reliability of the findings.
  • Write the report: The final step is to write the report of the case study research. The report should provide a clear description of the case study phenomenon, the research questions, the data collection methods, the data analysis, the findings, and the conclusions. The report should be written in a clear and concise manner and should follow the guidelines for academic writing.

Examples of Case Study

Here are some examples of case study research:

  • The Hawthorne Studies: Conducted between 1924 and 1932, the Hawthorne Studies were a series of studies associated with Elton Mayo and his colleagues that examined the impact of the work environment on employee productivity. The studies were conducted at the Hawthorne Works plant of the Western Electric Company in Cicero, Illinois, near Chicago, and included interviews, observations, and experiments.
  • The Stanford Prison Experiment: Conducted in 1971, the Stanford Prison Experiment was a case study conducted by Philip Zimbardo to examine the psychological effects of power and authority. The study involved simulating a prison environment and assigning participants to the role of guards or prisoners. The study was controversial due to the ethical issues it raised.
  • The Challenger Disaster: The Space Shuttle Challenger explosion in 1986 has been the subject of case studies examining its causes. These studies drew on interviews, observations, and analysis of data to identify the technical, organizational, and cultural factors that contributed to the disaster.
  • The Enron Scandal: The Enron Corporation’s bankruptcy in 2001 has been the subject of case studies examining its causes. These studies drew on interviews, analysis of financial data, and review of documents to identify the accounting practices, corporate culture, and ethical issues that led to the company’s downfall.
  • The Fukushima Nuclear Disaster: The nuclear accident that occurred at the Fukushima Daiichi Nuclear Power Plant in Japan in 2011 has been the subject of case studies examining its causes. These studies drew on interviews, analysis of data, and review of documents to identify the technical, organizational, and cultural factors that contributed to the disaster.

Application of Case Study

Case studies have a wide range of applications across various fields and industries. Here are some examples:

Business and Management

Case studies are widely used in business and management to examine real-life situations and develop problem-solving skills. Case studies can help students and professionals to develop a deep understanding of business concepts, theories, and best practices.

Healthcare

Case studies are used in healthcare to examine patient care, treatment options, and outcomes. Case studies can help healthcare professionals to develop critical thinking skills, diagnose complex medical conditions, and develop effective treatment plans.

Education

Case studies are used in education to examine teaching and learning practices. Case studies can help educators to develop effective teaching strategies, evaluate student progress, and identify areas for improvement.

Social Sciences

Case studies are widely used in social sciences to examine human behavior, social phenomena, and cultural practices. Case studies can help researchers to develop theories, test hypotheses, and gain insights into complex social issues.

Law and Ethics

Case studies are used in law and ethics to examine legal and ethical dilemmas. Case studies can help lawyers, policymakers, and ethical professionals to develop critical thinking skills, analyze complex cases, and make informed decisions.

Purpose of Case Study

The purpose of a case study is to provide a detailed analysis of a specific phenomenon, issue, or problem in its real-life context. A case study is a qualitative research method that involves the in-depth exploration and analysis of a particular case, which can be an individual, group, organization, event, or community.

The primary purpose of a case study is to generate a comprehensive and nuanced understanding of the case, including its history, context, and dynamics. Case studies can help researchers to identify and examine the underlying factors, processes, and mechanisms that contribute to the case and its outcomes. This can help to develop a more accurate and detailed understanding of the case, which can inform future research, practice, or policy.

Case studies can also serve other purposes, including:

  • Illustrating a theory or concept: Case studies can be used to illustrate and explain theoretical concepts and frameworks, providing concrete examples of how they can be applied in real-life situations.
  • Developing hypotheses: Case studies can help to generate hypotheses about the causal relationships between different factors and outcomes, which can be tested through further research.
  • Providing insight into complex issues: Case studies can provide insights into complex and multifaceted issues, which may be difficult to understand through other research methods.
  • Informing practice or policy: Case studies can be used to inform practice or policy by identifying best practices, lessons learned, or areas for improvement.

Advantages of Case Study Research

There are several advantages of case study research, including:

  • In-depth exploration: Case study research allows for a detailed exploration and analysis of a specific phenomenon, issue, or problem in its real-life context. This can provide a comprehensive understanding of the case and its dynamics, which may not be possible through other research methods.
  • Rich data: Case study research can generate rich and detailed data, including qualitative data such as interviews, observations, and documents. This can provide a nuanced understanding of the case and its complexity.
  • Holistic perspective: Case study research allows for a holistic perspective of the case, taking into account the various factors, processes, and mechanisms that contribute to the case and its outcomes. This can help to develop a more accurate and comprehensive understanding of the case.
  • Theory development: Case study research can help to develop and refine theories and concepts by providing empirical evidence and concrete examples of how they can be applied in real-life situations.
  • Practical application: Case study research can inform practice or policy by identifying best practices, lessons learned, or areas for improvement.
  • Contextualization: Case study research takes into account the specific context in which the case is situated, which can help to understand how the case is influenced by the social, cultural, and historical factors of its environment.

Limitations of Case Study Research

There are several limitations of case study research, including:

  • Limited generalizability: Case studies are typically focused on a single case or a small number of cases, which limits the generalizability of the findings. The unique characteristics of the case may not be applicable to other contexts or populations, which may limit the external validity of the research.
  • Biased sampling: Case studies may rely on purposive or convenience sampling, which can introduce bias into the sample selection process. This may limit the representativeness of the sample and the generalizability of the findings.
  • Subjectivity: Case studies rely on the interpretation of the researcher, which can introduce subjectivity into the analysis. The researcher’s own biases, assumptions, and perspectives may influence the findings, which may limit the objectivity of the research.
  • Limited control: Case studies are typically conducted in naturalistic settings, which limits the control that the researcher has over the environment and the variables being studied. This may limit the ability to establish causal relationships between variables.
  • Time-consuming: Case studies can be time-consuming to conduct, as they typically involve a detailed exploration and analysis of a specific case. This may limit the feasibility of conducting multiple case studies or conducting case studies in a timely manner.
  • Resource-intensive: Case studies may require significant resources, including time, funding, and expertise. This may limit the ability of researchers to conduct case studies in resource-constrained settings.

About the author


Muhammad Hassan

Researcher, Academic Writer, Web developer


What Works

OBSERVATION

Methods, Tools & Techniques

Methods, Tools and Techniques are ways of gathering data and collecting the information to learn what changes have happened.

(Adapted from: http://atlasti.com/observational-research )

Observation is a method in which a person observes behaviour to note changes in people or places, typically as the result of an intervention. Most simply it is learning through observing and documenting.

Observation is most common in psychology and other social sciences. It lets the researcher describe situations under study using the five senses.

Using observation in different ways

You can use observational research in different ways. At one end is controlled observation, where the researcher completely manages the environment. At the other is participant observation, where the researcher joins the group to understand behaviour and changes. Both have strengths and weaknesses, which are often linked to how much an evaluator or researcher has influenced the environment and its subjects.

Observation covers a lot of ground. It can involve just watching people, listening to everyday conversations, interviewing individuals and/or groups, and filling in questionnaires and checklists. In short, observing.

Naturalistic (or nonparticipant) observation happens when a researcher doesn’t intervene and studies behaviour that occurs naturally.

In participant observation, the researcher takes a full part. Most commonly, this happens when the researcher joins a group to observe behaviour that otherwise would be inaccessible.

Case Studies as observation

Case studies are a type of observational research that involves a thorough descriptive analysis of a single individual, group, or event. There is no single way to conduct a case study, so researchers use a range of methods, from unstructured interviewing to direct observation.


USEFUL RESOURCES

This Forum for Qualitative Social Research site includes a comprehensive explanation of observation methods, recommendations on what to observe, ethics in observation, and tips to collect useful data.

This slide deck created by Melanie Bryant from Swinburne University in Australia presents the basics of conducting participant observation in applied research projects.

Characteristics of effective observers

  • Having an open, nonjudgmental attitude.
  • Being interested in learning about others.
  • Being a careful observer, recorder and a good listener.
  • Being open to the unexpected.

Advantages

  • Observation allows insight into contexts, relationships and behaviour. By being able to observe the flow of behaviour in its own setting, the evidence gathered can be more credible than, say, surveys, which rely on the participants’ memory, honesty and awareness.
  • Observation is often used to generate new ideas. As it gives the person gathering evidence the opportunity to explore the total situation, it often suggests lines of enquiry and outcomes not thought of before. It can provide new information that is crucial for service improvements, project design, other data collection, and interpretation of other data.

Disadvantages & Limitations

  • Observation usually takes a lot of time compared with other methods.
  • In social services, observation requires a high level of trust between the person collecting information and participants. Sometimes service staff have easy access to the homes, workplaces and social settings that clients are part of. Often however, these settings are not open to observers, so it can be difficult to find authentic environments to observe changes in behaviour.
  • In participant observation it can be difficult to get time/privacy for recording. For example, with participant observations, researchers can’t take notes openly as this would affect their participation. This means they have to wait until they are alone and rely on memory.
  • Observations are often small-scale, so conclusions may not be generalisable. It can also be difficult to claim the intervention was responsible for the changes observed.
  • The researcher needs to be trained or experienced enough to recognise events that are significant and worth further attention.
  • If the researcher becomes too involved they may lose objectivity and become biased. There is always the danger that we will see what we expect, or want, to see. This is a problem for anyone within an organisation doing any evaluation work.


Earth Observation Case Studies

Social scientists at the U.S. Geological Survey (USGS) Fort Collins Science Center – in collaboration with the USGS National Land Imaging Program – conduct Earth observation user case studies using qualitative research methods. This research enables them to investigate the value of Landsat data, and understand the wide variety of Landsat users.

The following illustrated videos highligh

  • Earth Observation User Case Study: Ladies of Landsat
  • Using Landsat and Machine Learning to Map Urban Change
  • Earth Observation User Case: Using Landsat to Connect Space to Village
  • Earth Observation Case Study: Landsat to Map Ag. Yields and Irrigation
  • Earth Observation User Case: Speaking a New Language of Landsat


University of Kentucky

Scheduled for trouble: Studying the unintended consequences of classifying controlled substances

Researchers’ reasons for launching a study are sometimes purely intellectual: They may be intrigued by an inexplicable observation or challenged by a compelling question.

Other times, the spark may be deeply personal.

Such was the case for University of Kentucky (UK) Ph.D. student Kara Cook.

Darren’s Story

Cook’s brother-in-law, Darren Noble, suffered from a chronic autoimmune disease called Sjögren’s syndrome. Symptoms of this rare condition can include dry mouth and eyes, fatigue, and muscle and joint pain.

In Darren’s case, the pain increased to the point where it became nearly intractable. Only opioid analgesics were up to the task of managing his pain.

During the course of Darren’s treatment, stricter quantity limits were placed on opioid prescriptions. As a result, he could no longer receive an adequate supply of pain medicine. Under his doctors’ care, Darren tried steroids and other drug therapies as well as surgical treatment, but nothing could quash his chronic pain.

Eventually, his doctors admitted him to a hospice program. With hospice’s focus on palliative care, Darren could once again receive doses of opioid medications that were safe and effective for him.

Then Darren came down with COVID-19. Given his already compromised health, the COVID-19 symptoms rapidly grew worse. Overcome with fear that a trip to the hospital would make him ineligible for hospice care and the medicines that made his life bearable, Darren deferred in-patient care.

Days later, he died of complications of the COVID-19 infection.

“Drug scheduling,” noted Cook, referring to classifying drugs based on their misuse potential and restricting access depending on a drug’s classification, “is intended to protect people from the most harmful drugs. Scheduling is meant to make our lives safer, our lives better. In my brother-in-law’s case, increased restrictions on opioid prescribing made his life worse and may have contributed to his death.”

And so the seed was sown.

Nonmedical Gabapentin Use in Kentucky

Shortly after, Cook and her adviser, Assistant Professor Rachel Vickers-Smith, Ph.D., of the UK College of Public Health’s Department of Epidemiology and Environmental Health, were looking at data derived from the Social Networks among Appalachian People (SNAP) study regarding the use of the drug gabapentin in Kentucky.

Gabapentin is an anticonvulsant drug also used to treat certain types of pain. According to the federal government, gabapentin isn’t a controlled substance. But because its potential for misuse has become increasingly evident, some states (Kentucky being the first) have listed gabapentin as a Schedule V drug with the intent of restricting access and curtailing use without a prescription.

But that’s not what happened. After scheduling, Cook noticed that nonmedical gabapentin use was still increasing in the Commonwealth. What had changed, however, was that people were no longer getting gabapentin from their doctors and pharmacies but rather from family, friends, and people who sell drugs.

Although the circumstances surrounding Darren’s situation and post-scheduling gabapentin trends in Kentucky are quite different, both point to Cook’s general hypothesis: While tightening access to a drug seems to be an obvious and eminently sensible approach to reducing misuse, scheduling decisions can have unforeseen consequences.

Cook was recently awarded a UK Substance Use Priority Research Area (SUPRA) Graduate Student Grant to explore this hypothesis in her study titled “The Effects of Scheduling on Nonmedical Use of Gabapentin in Kentucky.”

“My study is about gabapentin, but that’s largely because it’s a recently scheduled drug that we can get a lot of data on,” explained Cook. “The study is really about the effect of scheduling.”

Cook’s study has two aims. The first is to determine whether gabapentin-involved overdose deaths changed in Kentucky after it became a Schedule V drug in July 2017. She will accomplish this by comparing data from 12 months before and 12 months after the drug was scheduled.
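As a rough illustration of the first aim, the 12-month before-and-after comparison could be sketched in Python as follows. This is not the study's actual analysis: the function names, the made-up monthly counts, and the choice to treat July 2017 itself as the first "after" month are all assumptions made for this example.

```python
from datetime import date


def months_between(start: date, end: date) -> int:
    """Whole calendar months from start to end (day of month is ignored)."""
    return (end.year - start.year) * 12 + (end.month - start.month)


def pre_post_totals(monthly_counts, cutoff, window=12):
    """Sum counts in the `window` months before and after `cutoff`.

    `monthly_counts` maps the first day of each month to a count.
    Assumption of this sketch: the cutoff month is the first "after" month.
    """
    before = after = 0
    for month, count in monthly_counts.items():
        offset = months_between(cutoff, month)
        if -window <= offset < 0:
            before += count
        elif 0 <= offset < window:
            after += count
    return before, after


# Hypothetical, made-up counts purely for illustration (not real data).
cutoff = date(2017, 7, 1)
counts = {}
for i in range(-12, 12):
    index = cutoff.year * 12 + (cutoff.month - 1) + i
    counts[date(index // 12, index % 12 + 1, 1)] = 10 if i < 0 else 12

before, after = pre_post_totals(counts, cutoff)
print(before, after, round(after / before, 2))  # 120 144 1.2
```

A simple ratio like this only describes the change; on its own it cannot show that scheduling caused it, since other trends may differ between the two periods.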

For the second aim, Cook will collect qualitative data by interviewing 30 Kentucky residents who have taken gabapentin for nonmedical reasons. Using this novel approach of gathering data directly from people who use drugs, she hopes to gauge post-scheduling changes in accessibility, street price, source, frequency and amount of use, and reason for use.

Cook has taught college-level math and statistics for nearly 30 years, and in addition to being a Ph.D. student in the College of Public Health, she is currently a lecturer in the UK College of Arts and Sciences’ Department of Statistics.

With her strong background in statistics, Cook is quite comfortable with the quantitative methods required for her first aim. However, the interview-based approach of the second aim opens new territory for Cook.

“I’ve had to learn a lot about qualitative methods, and it’s exciting to step outside my mathy little world and explore the messy real world. It does seem strange,” Cook joked, “to be doing exactly what I tell my statistics students not to do.”

“But the interviews will be potentially illuminating,” Cook added. “Very few qualitative studies have looked at drug scheduling from the perspective of people who use drugs.”

The findings of her study could — in the best possible way — complicate our picture of the effects of drug scheduling. Ultimately, research such as Cook’s could help reshape scheduling practices to maximize their benefits while minimizing unanticipated and potentially unwelcome consequences.

“You can never tell what the impact of your research will be,” Cook noted. “But I’m hoping — my whole family is hoping — that something good will come from this research. Then we all can feel that the tragedy of Darren’s death served as the genesis for positive change.”

Research reported in this publication was supported by the National Institute on Drug Abuse of the National Institutes of Health under Award Numbers R01DA024598 and R01DA033862. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

The research described here was supported in part by a University of Kentucky Substance Use Priority Research Area Graduate Student Grant. The UK Research Priorities Initiative, funded by the Office of the Vice President for Research, encompasses eight research priority areas: cancer, cardiovascular diseases, diabetes & obesity, equity, energy, materials science, neuroscience, and substance use disorders. These areas were chosen based on local relevance, existing funding strength, sustainability, and disciplinary scholarly diversity. 



COMMENTS

  1. What Is an Observational Study?

    An observational study is used to answer a research question based purely on what the researcher observes. There is no interference or manipulation of the research subjects, and no control and treatment groups. These studies are often qualitative in nature and can be used for both exploratory and explanatory research purposes.

  2. PDF Case Study Observational Research: A Framework for Conducting Case

    characteristics of case study observational research, a modified form of Yin's 2014 model of case study research the authors used in a study exploring interprofessional collaboration in primary care. In this approach, observation data are positioned as the central component of the research design.

  3. Observational studies and their utility for practice

    Observational studies provide critical descriptive data and information on long-term efficacy and safety that clinical trials cannot provide, at generally much less expense. Observational studies include case reports and case series, ecological studies, cross-sectional studies, case-control studies and cohort studies. ...

  4. Observational Studies: Cohort and Case-Control Studies

    Cohort studies and case-control studies are two primary types of observational studies that aid in evaluating associations between diseases and exposures. In this review article, we describe these study designs, methodological issues, and provide examples from the plastic surgery literature. Keywords: observational studies, case-control study ...

  5. Observational Study Designs: Synopsis for Selecting an Appropriate

    Case-control study. A case-control study is an observational analytic retrospective study design. It starts with the outcome of interest (referred to as cases) and looks back in time for exposures that likely caused the outcome of interest [13, 20]. This design compares two groups of participants: those with the outcome of interest and the matched control.

  6. 6.6: Observational Research

    A case study is an in-depth examination of an individual. Sometimes case studies are also completed on social units (e.g., a cult) and events (e.g., a natural disaster). ... So, as with all observational methods, case studies do not permit determination of causation. In addition, because case studies are often of a single individual, and ...

  7. What is an Observational Study: Definition & Examples

    In an observational study, unlike an experiment, the researchers only observe the subjects and do not interfere or try to influence the outcomes. ... Case-Control Study: A retrospective observational study that compares two existing groups—the case group with the condition and the control group without it. Researchers compare ...

  8. Case Study Observational Research: A Framework for Conducting Case

    Observation methods have the potential to reach beyond other methods that rely largely or solely on self-report. This article describes the distinctive characteristics of case study observational research, a modified form of Yin's 2014 model of case study research the authors used in a study exploring interprofessional collaboration in primary ...

  9. Case Study Observational Research: A Framework for Conducting Case

    Case study research is a comprehensive method that incorporates multiple sources of data to provide detailed accounts of complex research phenomena in real-life contexts. However, current models of case study research do not particularly distinguish the unique contribution observation data can make.

  10. 6.5: Observational Research

    Like many observational research methods, case studies tend to be more qualitative in nature. Case study methods involve an in-depth, and often a longitudinal examination of an individual. Depending on the focus of the case study, individuals may or may not be observed in their natural setting. If the natural setting is not what is of interest ...

  11. Observational Case Studies

    An observational case study is a study of a real-world case without performing an intervention. Measurement may influence the measured phenomena, but as in all forms of research, the researcher tries to restrict this to a minimum. The researcher may study a sample of two or even more cases, but the goal of case study ...

  12. Chapter 12. Observational Study Designs

    Observational studies in clinical research can be classified as either analytic or descriptive (Table 12-1). Analytic observational studies are similar to randomized, controlled clinical trials in that the goal is to estimate the causal effect of an exposure on an outcome. ... The analytic study designs presented are the case-control study ...

  13. Observation Methods: Naturalistic, Participant and Controlled

    Like case studies, naturalistic observation is often used to generate new ideas. Because it gives the researcher the opportunity to study the total situation, it often suggests avenues of inquiry not thought of before. The ability to capture actual behaviors as they unfold in real-time, analyze sequential patterns of interactions, measure base ...

  14. Case Study Methodology of Qualitative Research: Key Attributes and

    A case study is one of the most commonly used methodologies of social research. This article attempts to look into the various dimensions of a case study research strategy, the different epistemological strands which determine the particular case study type and approach adopted in the field, discusses the factors which can enhance the effectiveness of a case study research, and the debate ...

  15. 7 Types of Observational Studies (With Examples)

    There are seven types of observational studies. Researchers might choose to use one type of observational study or combine any of these multiple observational study approaches: 1. Cross-sectional studies. Cross-sectional studies happen when researchers observe their chosen subject at one particular point in time.

  16. What is Observational Study Design and What Types

    Case Control Observational Study. Researchers in case control studies identify individuals with an existing health issue or condition, or "cases," along with a similar group without the condition, or "controls." These two groups are then compared to identify predictors and outcomes. This type of study is helpful to generate a hypothesis ...

  17. 6.5 Observational Research

    Like many observational research methods, case studies tend to be more qualitative in nature. Case study methods involve an in-depth, and often a longitudinal examination of an individual. Depending on the focus of the case study, individuals may or may not be observed in their natural setting. If the natural setting is not what is of interest ...

  18. Ch 2: Psychological Research Methods

    The three main types of descriptive studies are naturalistic observation, case studies, and surveys. Naturalistic Observation. If you want to understand how behavior occurs, one of the best ways to gain information is to simply observe the behavior in its natural context. However, people might change their behavior in unexpected ways ...

  19. Observe Before You Leap: Why Observation Provides Critical Insights for

    In each case study, observation yielded critical information that would have been difficult or impossible to obtain any other way. For each case study, I provide a brief description of the research and the context from which it was drawn, then focus more extensively on the observational methods used and the unique insights they generated.

  20. Case Study

    A case study is a research method that involves an in-depth examination and analysis of a particular phenomenon or case, such as an individual, organization, community, event, or situation. It is a qualitative research approach that aims to provide a detailed and comprehensive understanding of the case being studied.

  21. Observation

    Case Studies as observation. Case Studies are a type of observational research that involve a thorough descriptive analysis of a single individual, group, or event. There is no single way to conduct a case study so researchers use a range of methods from unstructured interviewing to direct observation.

  22. Earth Observation Case Studies

    This graphic illustration guides you through an Earth observation user case study and provides the in-depth user experience of Jill Deines - one example of an Earth observation user. Dr. Jill Deines is a postdoctoral scholar at the Center for Food Security and the Environment at Stanford University working jointly with the NASA Harvest ...

  23. Who is the Expert in Change? Exploring the Role of Therapists in Client

    This case study focuses on the therapeutic change process, highlighting the crucial role of therapists in facilitating it. Rooted in constructivist epistemology, we employed an integrative systemic model. Our exploration centers on the presence or absence of system restructuring within the therapeutic process.

  24. Scheduled for trouble: Studying the unintended consequences of

    Researchers' reasons for launching a study are sometimes purely intellectual: They may be intrigued by an inexplicable observation or challenged by a compelling question. Other times, the spark may be deeply personal. Such was the case for University of Kentucky (UK) Ph.D. student Kara Cook. Darren's Story