What Is an Observational Study? | Guide & Examples

Published on March 31, 2022 by Tegan George. Revised on June 22, 2023.

An observational study is used to answer a research question based purely on what the researcher observes. There is no interference or manipulation of the research subjects, and no control and treatment groups.

These studies are often qualitative in nature and can be used for both exploratory and explanatory research purposes. While quantitative observational studies exist, they are less common.

Observational studies are generally used in hard science, medical, and social science fields. This is often due to ethical or practical concerns that prevent the researcher from conducting a traditional experiment. However, the lack of control and treatment groups means that forming inferences is difficult, and there is a risk of confounding variables and observer bias impacting your analysis.

Table of contents

  • Types of observation
  • Types of observational studies
  • Observational study example
  • Advantages and disadvantages of observational studies
  • Observational study vs. experiment
  • Other interesting articles
  • Frequently asked questions

There are many types of observation, and it can be challenging to tell the difference between them. Here are some of the most common types to help you choose the best one for your observational study.

There are three main types of observational studies: cohort studies, case–control studies, and cross-sectional studies.

Cohort studies

Cohort studies are more longitudinal in nature, as they follow a group of participants over a period of time. Members of the cohort are selected because of a shared characteristic, such as smoking, and they are often observed over a period of years.

Case–control studies

Case–control studies bring together two groups, a case group and a control group. The case group has a particular attribute, typically the outcome of interest, while the control group does not. The two groups are then compared to see whether the case group exhibits a particular characteristic more often than the control group.

For example, if you compared people with lung disease (the case group) with people without lung disease (the control group), you could observe whether the cases had a history of smoking more often than the controls.
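
One common way to summarize a comparison like this is an odds ratio: the odds of the characteristic in the case group divided by the odds in the control group. The sketch below is illustrative only. The odds ratio is not named in the text above, and the function name `odds_ratio` and all counts are hypothetical.

```python
def odds_ratio(case_with, case_without, control_with, control_without):
    """Odds of the characteristic among cases divided by the odds among controls."""
    if 0 in (case_without, control_with, control_without):
        raise ValueError("odds ratio is undefined when a denominator count is zero")
    case_odds = case_with / case_without
    control_odds = control_with / control_without
    return case_odds / control_odds


# Hypothetical counts: 30 of 100 people in the case group show the
# characteristic, versus 10 of 100 people in the control group.
ratio = odds_ratio(case_with=30, case_without=70, control_with=10, control_without=90)
print(round(ratio, 2))  # prints 3.86
```

A ratio above 1 suggests the characteristic is more common in the case group. As with any observational comparison, it indicates association rather than causation.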

Cross-sectional studies

Cross-sectional studies analyze a population of study at a specific point in time.

This often involves narrowing previously collected data to one point in time to test the prevalence of a theory—for example, analyzing how many people were diagnosed with lung disease in March of a given year. It can also be a one-time observation, such as spending one day in the lung disease wing of a hospital.

Observational studies are usually quite straightforward to design and conduct. Sometimes all you need is a notebook and pen! As you design your study, you can follow these steps.

Step 1: Identify your research topic and objectives

The first step is to determine what you’re interested in observing and why. Observational studies are a great fit if you are unable to do an experiment for practical or ethical reasons, or if your research topic hinges on natural behaviors.

Step 2: Choose your observation type and technique

In terms of technique, there are a few things to consider:

  • Are you determining what you want to observe beforehand, or going in open-minded?
  • Is there another research method that would make sense in tandem with an observational study?
  • Would participants behave differently if they knew they were being observed? If yes, make sure you conduct a covert observation. If not, think about whether observing from afar or actively participating in your observation is a better fit.
  • How can you preempt confounding variables that could impact your analysis?

For example, if you want to observe children, you have several options:

  • You could observe the children playing at the playground in a naturalistic observation.
  • You could spend a month at a day care in your town conducting participant observation, immersing yourself in the day-to-day life of the children.
  • You could conduct covert observation behind a wall or glass, where the children can’t see you.

Overall, it is crucial to stay organized. Devise a shorthand for your notes, or perhaps design templates that you can fill in. Since these observations occur in real time, you won’t get a second chance with the same data.

Step 3: Set up your observational study

Before conducting your observations, there are a few things to attend to:

  • Plan ahead: If you’re interested in day cares, you’ll need to call a few in your area to plan a visit. They may not all allow observation, or consent from parents may be needed, so give yourself enough time to set everything up.
  • Determine your note-taking method: Observational studies often rely on note-taking because other methods, like video or audio recording, run the risk of changing participant behavior.
  • Get informed consent from your participants (or their parents) if you want to record: Ultimately, even though it may make your analysis easier, the challenges posed by recording participants often make pen-and-paper a better choice.

Step 4: Conduct your observation

After you’ve chosen a type of observation, decided on your technique, and chosen a time and place, it’s time to conduct your observation.

In a study of children at a day care, for instance, you can split the children into case and control groups. The children with siblings have a characteristic you are interested in (siblings), while the children in the control group do not.

When conducting observational studies, be very careful of confounding or “lurking” variables. In the example above, you observed children as they were dropped off, gauging whether or not they were upset. However, there are a variety of other factors that could be at play here (e.g., illness).
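
To see how a lurking variable such as illness can distort a comparison, it can help to compare the groups within each level of that variable. The following is a minimal sketch with entirely hypothetical records. The field names `has_siblings`, `is_ill`, and `was_upset` and both helper functions are assumptions made for illustration.

```python
from collections import defaultdict


def upset_rate_by(records, key):
    """Share of children recorded as upset, grouped by key(record)."""
    counts = defaultdict(lambda: [0, 0])  # group -> [upset, total]
    for rec in records:
        group = key(rec)
        counts[group][0] += int(rec["was_upset"])
        counts[group][1] += 1
    return {group: upset / total for group, (upset, total) in counts.items()}


def make_records(has_siblings, is_ill, n_upset, n_calm):
    """Build hypothetical observation records for one subgroup."""
    base = {"has_siblings": has_siblings, "is_ill": is_ill}
    return ([dict(base, was_upset=True) for _ in range(n_upset)]
            + [dict(base, was_upset=False) for _ in range(n_calm)])


# Hypothetical data: ill children are upset half the time, healthy children
# never, and children with siblings happen to be ill more often.
records = (
    make_records(True, True, n_upset=2, n_calm=2)
    + make_records(True, False, n_upset=0, n_calm=2)
    + make_records(False, True, n_upset=1, n_calm=1)
    + make_records(False, False, n_upset=0, n_calm=4)
)

# Crude comparison: children with siblings look more upset.
overall = upset_rate_by(records, lambda r: r["has_siblings"])
# Stratified comparison: within each illness stratum the rates are equal.
stratified = upset_rate_by(records, lambda r: (r["is_ill"], r["has_siblings"]))
print(overall)
print(stratified)
```

In this made-up data set, the apparent difference between children with and without siblings disappears once illness is taken into account. Stratifying only helps for confounders you thought to record.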

Step 5: Analyze your data

After you finish your observation, immediately record your initial thoughts and impressions, as well as follow-up questions or any issues you perceived during the observation. If you audio- or video-recorded your observations, you can transcribe them.

Your analysis can take an inductive or deductive approach:

  • If you conducted your observations in a more open-ended way, an inductive approach allows your data to determine your themes.
  • If you had specific hypotheses prior to conducting your observations, a deductive approach analyzes whether your data confirm those themes or ideas you had previously.

Next, you can conduct your thematic or content analysis. Due to the open-ended nature of observational studies, the best fit is likely thematic analysis.
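
If you take a deductive approach and tag your notes with predefined themes, a simple tally is often a useful first step. The sketch below assumes a hypothetical codebook and hypothetical field notes. The names `CODEBOOK` and `tally_codes` are illustrative and not part of any standard tool.

```python
from collections import Counter

# Hypothetical themes decided on before the observation (deductive approach).
CODEBOOK = {"crying", "clinging", "calm"}


def tally_codes(coded_notes, codebook):
    """Count how often each predefined code occurs and collect unexpected codes."""
    counts = Counter()
    unexpected = set()
    for _note, codes in coded_notes:
        for code in codes:
            if code in codebook:
                counts[code] += 1
            else:
                unexpected.add(code)  # a candidate for a new, inductive theme
    return counts, unexpected


# Hypothetical field notes, each paired with the codes assigned to it.
coded_notes = [
    ("08:02 child A cries at the door", ["crying"]),
    ("08:05 child B holds on to parent", ["clinging"]),
    ("08:07 child C waves goodbye", ["calm"]),
    ("08:10 child D cries, then hides", ["crying", "hiding"]),
]

counts, unexpected = tally_codes(coded_notes, CODEBOOK)
print(counts.most_common())
print(unexpected)
```

Codes that fall outside the codebook are kept rather than discarded, since they may point to themes you did not anticipate.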

Step 6: Discuss avenues for future research

Observational studies are generally exploratory in nature, and they often aren’t strong enough to yield standalone conclusions due to their very high susceptibility to observer bias and confounding variables. For this reason, observational studies can only show association, not causation.

If you are excited about the preliminary conclusions you’ve drawn and wish to proceed with your topic, you may need to change to a different research method, such as an experiment.

Advantages

  • Observational studies can provide information about difficult-to-analyze topics in a low-cost, efficient manner.
  • They allow you to study subjects that cannot be randomized safely, efficiently, or ethically.
  • They are often quite straightforward to conduct, since you just observe participant behavior as it happens or utilize preexisting data.
  • They’re often invaluable in informing later, larger-scale clinical trials or experimental designs.

Disadvantages

  • Observational studies struggle to stand on their own as a reliable research method. There is a high risk of observer bias and undetected confounding variables or omitted variables.
  • They lack conclusive results, typically are not externally valid or generalizable, and can usually only form a basis for further research.
  • They cannot make statements about the safety or efficacy of the intervention or treatment they study, only observe reactions to it. Therefore, they offer less satisfying results than other methods.

The key difference between observational studies and experiments is that a properly conducted observational study will never attempt to influence responses, while experimental designs by definition have some sort of treatment condition applied to a portion of participants.

However, there may be times when it’s impossible, dangerous, or impractical to influence the behavior of your participants. This can be the case in medical studies, where it is unethical or cruel to withhold potentially life-saving intervention, or in longitudinal analyses where you don’t have the ability to follow your group over the course of their lifetime.

An observational study may be the right fit for your research if random assignment of participants to control and treatment groups is impossible or highly difficult. However, the issues observational studies raise in terms of validity, confounding variables, and conclusiveness can mean that an experiment is more reliable.

If you’re able to randomize your participants safely and your research question is definitely causal in nature, consider using an experiment.

If you want to know more about statistics, methodology, or research bias, make sure to check out some of our other articles with explanations and examples.

  • Student’s t-distribution
  • Normal distribution
  • Null and Alternative Hypotheses
  • Chi square tests
  • Confidence interval
  • Quartiles & Quantiles
  • Cluster sampling
  • Stratified sampling
  • Data cleansing
  • Reproducibility vs Replicability
  • Peer review
  • Prospective cohort study

Research bias

  • Implicit bias
  • Cognitive bias
  • Placebo effect
  • Hawthorne effect
  • Hindsight bias
  • Affect heuristic
  • Social desirability bias

An observational study is a great choice for you if your research question is based purely on observations. If there are ethical, logistical, or practical concerns that prevent you from conducting a traditional experiment, an observational study may be a good choice. In an observational study, there is no interference or manipulation of the research subjects, as well as no control or treatment groups.

The key difference between observational studies and experimental designs is that a well-done observational study does not influence the responses of participants, while experiments do have some sort of treatment condition applied to at least some participants by random assignment.

A quasi-experiment is a type of research design that attempts to establish a cause-and-effect relationship. The main difference with a true experiment is that the groups are not randomly assigned.

Exploratory research aims to explore the main aspects of an under-researched problem, while explanatory research aims to explain the causes and consequences of a well-defined problem.

Experimental design means planning a set of procedures to investigate a relationship between variables. To design a controlled experiment, you need:

  • A testable hypothesis
  • At least one independent variable that can be precisely manipulated
  • At least one dependent variable that can be precisely measured

When designing the experiment, you decide:

  • How you will manipulate the variable(s)
  • How you will control for any potential confounding variables
  • How many subjects or samples will be included in the study
  • How subjects will be assigned to treatment levels

Experimental design is essential to the internal and external validity of your experiment.

Cite this Scribbr article

George, T. (2023, June 22). What Is an Observational Study? | Guide & Examples. Scribbr. Retrieved April 2, 2024, from https://www.scribbr.com/methodology/observational-study/

Logo for Kwantlen Polytechnic University

Non-Experimental Research

32 Observational Research

Learning Objectives

  • List the various types of observational research methods and distinguish between each.
  • Describe the strengths and weaknesses of each observational research method.

What Is Observational Research?

The term observational research is used to refer to several different types of non-experimental studies in which behavior is systematically observed and recorded. The goal of observational research is to describe a variable or set of variables. More generally, the goal is to obtain a snapshot of specific characteristics of an individual, group, or setting. As described previously, observational research is non-experimental because nothing is manipulated or controlled, and as such we cannot arrive at causal conclusions using this approach. The data that are collected in observational research studies are often qualitative in nature but they may also be quantitative or both (mixed-methods). There are several different types of observational methods that will be described below.

Naturalistic Observation

Naturalistic observation is an observational method that involves observing people’s behavior in the environment in which it typically occurs. Thus naturalistic observation is a type of field research (as opposed to a type of laboratory research). Jane Goodall’s famous research on chimpanzees is a classic example of naturalistic observation. Dr. Goodall spent three decades observing chimpanzees in their natural environment in East Africa. She examined such things as chimpanzees’ social structure, mating patterns, gender roles, family structure, and care of offspring by observing them in the wild. However, naturalistic observation could more simply involve observing shoppers in a grocery store, children on a school playground, or psychiatric inpatients in their wards. Researchers engaged in naturalistic observation usually make their observations as unobtrusively as possible so that participants are not aware that they are being studied. Such an approach is called disguised naturalistic observation. Ethically, this method is considered to be acceptable if the participants remain anonymous and the behavior occurs in a public setting where people would not normally have an expectation of privacy. Grocery shoppers putting items into their shopping carts, for example, are engaged in public behavior that is easily observable by store employees and other shoppers. For this reason, most researchers would consider it ethically acceptable to observe them for a study. On the other hand, one of the arguments against the ethicality of the naturalistic observation of “bathroom behavior” discussed earlier in the book is that people have a reasonable expectation of privacy even in a public restroom and that this expectation was violated.

In cases where it is not ethical or practical to conduct disguised naturalistic observation, researchers can conduct undisguised naturalistic observation, where the participants are made aware of the researcher’s presence and of the monitoring of their behavior. However, one concern with undisguised naturalistic observation is reactivity. Reactivity occurs when the act of measurement changes participants’ behavior. In the case of undisguised naturalistic observation, the concern with reactivity is that when people know they are being observed and studied, they may act differently than they normally would. This type of reactivity is known as the Hawthorne effect. For instance, you may act much differently in a bar if you know that someone is observing you and recording your behaviors, which would threaten the validity of the study. So disguised observation is less reactive and therefore can have higher validity because people are not aware that their behaviors are being observed and recorded. However, we now know that people often become used to being observed and with time they begin to behave naturally in the researcher’s presence. In other words, over time people habituate to being observed. Think about reality shows like Big Brother or Survivor where people are constantly being observed and recorded. While they may be on their best behavior at first, in a fairly short amount of time they are flirting, having sex, wearing next to nothing, screaming at each other, and occasionally behaving in ways that are embarrassing.

Participant Observation

Another approach to data collection in observational research is participant observation. In participant observation, researchers become active participants in the group or situation they are studying. Participant observation is very similar to naturalistic observation in that it involves observing people’s behavior in the environment in which it typically occurs. As with naturalistic observation, the data that are collected can include interviews (usually unstructured), notes based on the researchers’ observations and interactions, documents, photographs, and other artifacts. The only difference between naturalistic observation and participant observation is that researchers engaged in participant observation become active members of the group or situations they are studying. The basic rationale for participant observation is that there may be important information that is only accessible to, or can be interpreted only by, someone who is an active participant in the group or situation. Like naturalistic observation, participant observation can be either disguised or undisguised. In disguised participant observation, the researchers pretend to be members of the social group they are observing and conceal their true identity as researchers.

In a famous example of disguised participant observation, Leon Festinger and his colleagues infiltrated a doomsday cult known as the Seekers, whose members believed that the apocalypse would occur on December 21, 1954. Interested in studying how members of the group would cope psychologically when the prophecy inevitably failed, they carefully recorded the events and reactions of the cult members in the days before and after the supposed end of the world. Unsurprisingly, the cult members did not give up their belief but instead convinced themselves that it was their faith and efforts that saved the world from destruction. Festinger and his colleagues later published a book about this experience, which they used to illustrate the theory of cognitive dissonance (Festinger, Riecken, & Schachter, 1956) [1].

In contrast, with undisguised participant observation, the researchers become a part of the group they are studying and they disclose their true identity as researchers to the group under investigation. Once again there are important ethical issues to consider with disguised participant observation. First, no informed consent can be obtained, and second, deception is being used. The researcher is deceiving the participants by intentionally withholding information about their motivations for being a part of the social group they are studying. But sometimes disguised participation is the only way to access a protective group (like a cult). Further, disguised participant observation is less prone to reactivity than undisguised participant observation.

Rosenhan’s study (1973) [2] of the experience of people in a psychiatric ward would be considered disguised participant observation because Rosenhan and his pseudopatients were admitted into psychiatric hospitals on the pretense of being patients so that they could observe the way that psychiatric patients are treated by staff. The staff and other patients were unaware of their true identities as researchers.

Another example of participant observation comes from a study by sociologist Amy Wilkins on a university-based religious organization that emphasized how happy its members were (Wilkins, 2008) [3]. Wilkins spent 12 months attending and participating in the group’s meetings and social events, and she interviewed several group members. In her study, Wilkins identified several ways in which the group “enforced” happiness: for example, by continually talking about happiness, discouraging the expression of negative emotions, and using happiness as a way to distinguish themselves from other groups.

One of the primary benefits of participant observation is that the researchers are in a much better position to understand the viewpoint and experiences of the people they are studying when they are a part of the social group. The primary limitation with this approach is that the mere presence of the observer could affect the behavior of the people being observed. While this is also a concern with naturalistic observation, additional concerns arise when researchers become active members of the social group they are studying because they may change the social dynamics and/or influence the behavior of the people they are studying. Similarly, if the researcher acts as a participant observer, there can be concerns with biases resulting from developing relationships with the participants. Concretely, the researcher may become less objective, resulting in more experimenter bias.

Structured Observation

Another observational method is structured observation. Here the investigator makes careful observations of one or more specific behaviors in a particular setting that is more structured than the settings used in naturalistic or participant observation. Often the setting in which the observations are made is not the natural setting. Instead, the researcher may observe people in the laboratory environment. Alternatively, the researcher may observe people in a natural setting (like a classroom setting) that they have structured in some way, for instance by introducing some specific task participants are to engage in or by introducing a specific social situation or manipulation.

Structured observation is very similar to naturalistic observation and participant observation in that in all three cases researchers are observing naturally occurring behavior; however, the emphasis in structured observation is on gathering quantitative rather than qualitative data. Researchers using this approach are interested in a limited set of behaviors. This allows them to quantify the behaviors they are observing. In other words, structured observation is less global than naturalistic or participant observation because the researcher engaged in structured observations is interested in a small number of specific behaviors. Therefore, rather than recording everything that happens, the researcher only focuses on very specific behaviors of interest.

Researchers Robert Levine and Ara Norenzayan used structured observation to study differences in the “pace of life” across countries (Levine & Norenzayan, 1999) [4]. One of their measures involved observing pedestrians in a large city to see how long it took them to walk 60 feet. They found that people in some countries walked reliably faster than people in other countries. For example, people in Canada and Sweden covered 60 feet in just under 13 seconds on average, while people in Brazil and Romania took close to 17 seconds. When structured observation takes place in the complex and even chaotic “real world,” the questions of when, where, and under what conditions the observations will be made, and who exactly will be observed are important to consider. Levine and Norenzayan described their sampling process as follows:

“Male and female walking speed over a distance of 60 feet was measured in at least two locations in main downtown areas in each city. Measurements were taken during main business hours on clear summer days. All locations were flat, unobstructed, had broad sidewalks, and were sufficiently uncrowded to allow pedestrians to move at potentially maximum speeds. To control for the effects of socializing, only pedestrians walking alone were used. Children, individuals with obvious physical handicaps, and window-shoppers were not timed. Thirty-five men and 35 women were timed in most cities.” (p. 186).

Precise specification of the sampling process in this way makes data collection manageable for the observers, and it also provides some control over important extraneous variables. For example, by making their observations on clear summer days in all countries, Levine and Norenzayan controlled for effects of the weather on people’s walking speeds. In Levine and Norenzayan’s study, measurement was relatively straightforward. They simply measured out a 60-foot distance along a city sidewalk and then used a stopwatch to time participants as they walked over that distance.
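
The arithmetic behind this kind of measurement is simple: speed is distance divided by time. As a rough illustration, the sketch below converts stopwatch times over a 60-foot course into walking speeds. The times of 13 and 17 seconds are rounded from the averages quoted above, the function name `walking_speed` is hypothetical, and the list of sample times is invented.

```python
from statistics import mean

DISTANCE_FEET = 60.0       # course length reported in the study
METERS_PER_FOOT = 0.3048   # exact definition of the international foot


def walking_speed(seconds, distance_feet=DISTANCE_FEET):
    """Return walking speed in feet per second for one timed walk."""
    if seconds <= 0:
        raise ValueError("time must be positive")
    return distance_feet / seconds


# Approximate average times quoted in the text.
print(round(walking_speed(13), 2))  # prints 4.62 (feet per second)
print(round(walking_speed(17), 2))  # prints 3.53 (feet per second)

# Invented stopwatch readings for one location, averaged in meters per second.
sample_times = [12.4, 13.1, 12.9, 13.6]
mean_speed_mps = mean(walking_speed(t) for t in sample_times) * METERS_PER_FOOT
print(round(mean_speed_mps, 2))
```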

As another example, researchers Robert Kraut and Robert Johnston wanted to study bowlers’ reactions to their shots, both when they were facing the pins and then when they turned toward their companions (Kraut & Johnston, 1979) [5]. But what “reactions” should they observe? Based on previous research and their own pilot testing, Kraut and Johnston created a list of reactions that included “closed smile,” “open smile,” “laugh,” “neutral face,” “look down,” “look away,” and “face cover” (covering one’s face with one’s hands). The observers committed this list to memory and then practiced by coding the reactions of bowlers who had been videotaped. During the actual study, the observers spoke into an audio recorder, describing the reactions they observed. Among the most interesting results of this study was that bowlers rarely smiled while they still faced the pins. They were much more likely to smile after they turned toward their companions, suggesting that smiling is not purely an expression of happiness but also a form of social communication.

In yet another example (this one in a laboratory environment), Dov Cohen and his colleagues had observers rate the emotional reactions of participants who had just been deliberately bumped and insulted by a confederate after they dropped off a completed questionnaire at the end of a hallway. The confederate was posing as someone who worked in the same building and who was frustrated by having to close a file drawer twice in order to permit the participants to walk past them (first to drop off the questionnaire at the end of the hallway and once again on their way back to the room where they believed the study they signed up for was taking place). The two observers were positioned at different ends of the hallway so that they could read the participants’ body language and hear anything they might say. Interestingly, the researchers hypothesized that participants from the southern United States, which is one of several places in the world that have a “culture of honor,” would react with more aggression than participants from the northern United States, a prediction that was in fact supported by the observational data (Cohen, Nisbett, Bowdle, & Schwarz, 1996) [6].

When the observations require a judgment on the part of the observers (as in the studies by Kraut and Johnston and by Cohen and his colleagues), a process referred to as coding is typically required. Coding generally requires clearly defining a set of target behaviors. The observers then categorize participants individually in terms of which behavior they have engaged in and the number of times they engaged in each behavior. The observers might even record the duration of each behavior. The target behaviors must be defined in such a way that different observers code them in the same way. This difficulty with coding illustrates the issue of interrater reliability, as mentioned in Chapter 4. Researchers are expected to demonstrate the interrater reliability of their coding procedure by having multiple raters code the same behaviors independently and then showing that the different observers are in close agreement. Kraut and Johnston, for example, video recorded a subset of their participants’ reactions and had two observers independently code them. The two observers agreed on the reactions that were exhibited 97% of the time, indicating good interrater reliability.
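
A percentage agreement figure like the one above can be computed directly from two observers’ codes. The sketch below is a minimal illustration with invented codes for ten reactions. It also computes Cohen’s kappa, a widely used chance-corrected agreement statistic that is not mentioned in the text and is included here only as an assumption about what a researcher might report alongside raw agreement.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of items that both raters coded identically."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of items")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of each rater's marginal proportions per code.
    expected = sum(counts_a[code] * counts_b[code] for code in counts_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Invented codes for ten bowlers' reactions; the raters differ on one item.
rater_a = ["open smile", "closed smile", "laugh", "neutral face", "open smile",
           "look down", "open smile", "neutral face", "laugh", "open smile"]
rater_b = ["open smile", "open smile", "laugh", "neutral face", "open smile",
           "look down", "open smile", "neutral face", "laugh", "open smile"]

print(percent_agreement(rater_a, rater_b))       # prints 0.9
print(round(cohens_kappa(rater_a, rater_b), 2))  # prints 0.86
```

Raw agreement is easy to interpret but can look high when one code dominates, which is why a chance-corrected statistic is sometimes reported as well.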

One of the primary benefits of structured observation is that it is far more efficient than naturalistic and participant observation. Since the researchers are focused on specific behaviors, this reduces time and expense. Also, oftentimes the environment is structured to encourage the behaviors of interest, which again means that researchers do not have to invest as much time in waiting for the behaviors of interest to naturally occur. Finally, researchers using this approach can clearly exert greater control over the environment. However, when researchers exert more control over the environment, it may make the environment less natural, which decreases external validity. It is less clear, for instance, whether structured observations made in a laboratory environment will generalize to a real-world environment. Furthermore, since researchers engaged in structured observation are often not disguised, there may be more concerns with reactivity.

Case Studies

A case study is an in-depth examination of an individual. Sometimes case studies are also completed on social units (e.g., a cult) and events (e.g., a natural disaster). Most commonly in psychology, however, case studies provide a detailed description and analysis of an individual. Often the individual has a rare or unusual condition or disorder or has damage to a specific region of the brain.

Like many observational research methods, case studies tend to be more qualitative in nature. Case study methods involve an in-depth, and often longitudinal, examination of an individual. Depending on the focus of the case study, individuals may or may not be observed in their natural setting. If the natural setting is not what is of interest, then the individual may be brought into a therapist’s office or a researcher’s lab for study. Also, the bulk of the case study report will focus on in-depth descriptions of the person rather than on statistical analyses. With that said, some quantitative data may also be included in the write-up of a case study. For instance, an individual’s depression score may be compared to normative scores, or their scores before and after treatment may be compared. As with other qualitative methods, a variety of different methods and tools can be used to collect information on the case. For instance, interviews, naturalistic observation, structured observation, psychological testing (e.g., IQ test), and/or physiological measurements (e.g., brain scans) may be used to collect information on the individual.
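
One simple way to compare an individual’s score with normative scores is a z-score, which expresses the score as a number of standard deviations from the norm group’s mean. The sketch below is illustrative only. The normative mean and standard deviation and the client’s scores are invented and do not correspond to any real depression inventory.

```python
def z_score(score, norm_mean, norm_sd):
    """Distance of a score from the normative mean, in standard deviations."""
    if norm_sd <= 0:
        raise ValueError("the normative standard deviation must be positive")
    return (score - norm_mean) / norm_sd


# Invented norms and scores for a single case.
NORM_MEAN = 10
NORM_SD = 7

before = z_score(31, NORM_MEAN, NORM_SD)
after = z_score(17, NORM_MEAN, NORM_SD)
print(before)  # prints 3.0
print(after)   # prints 1.0
```

In this made-up case the score moves from three standard deviations above the norm to one after treatment. With a single individual and no comparison condition, such a change describes the case but cannot establish that the treatment caused it.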

HM is one of the most famous case studies in psychology. HM suffered from intractable and very severe epilepsy. A surgeon localized HM’s epilepsy to his medial temporal lobe and in 1953 removed large sections of his hippocampus in an attempt to stop the seizures. The treatment was a success, in that it resolved his epilepsy and his IQ and personality were unaffected. However, the doctors soon realized that HM exhibited a strange form of amnesia, called anterograde amnesia. HM was able to carry on a conversation and he could remember short strings of letters, digits, and words. Basically, his short-term memory was preserved. However, HM could not commit new events to memory. He lost the ability to transfer information from his short-term memory to his long-term memory, something memory researchers call consolidation. So while he could carry on a conversation with someone, he would completely forget the conversation after it ended. This was an extremely important case study for memory researchers because it suggested that there is a dissociation between short-term memory and long-term memory; that is, it suggested that these are two different abilities subserved by different areas of the brain. It also suggested that the temporal lobes are particularly important for consolidating new information (i.e., for transferring information from short-term memory to long-term memory).

The history of psychology is filled with influential case studies, such as Sigmund Freud’s description of “Anna O.” (see Note 6.1 “The Case of ‘Anna O.’”) and John Watson and Rosalie Rayner’s description of Little Albert (Watson & Rayner, 1920) [7], who allegedly learned to fear a white rat, along with other furry objects, when the researchers repeatedly made a loud noise every time the rat approached him.

The Case of “Anna O.”

Sigmund Freud used the case of a young woman he called “Anna O.” to illustrate many principles of his theory of psychoanalysis (Freud, 1961) [8] . (Her real name was Bertha Pappenheim, and she was an early feminist who went on to make important contributions to the field of social work.) Anna had come to Freud’s colleague Josef Breuer around 1880 with a variety of odd physical and psychological symptoms. One of them was that for several weeks she was unable to drink any fluids. According to Freud,

She would take up the glass of water that she longed for, but as soon as it touched her lips she would push it away like someone suffering from hydrophobia.…She lived only on fruit, such as melons, etc., so as to lessen her tormenting thirst. (p. 9)

But according to Freud, a breakthrough came one day while Anna was under hypnosis.

[S]he grumbled about her English “lady-companion,” whom she did not care for, and went on to describe, with every sign of disgust, how she had once gone into this lady’s room and how her little dog—horrid creature!—had drunk out of a glass there. The patient had said nothing, as she had wanted to be polite. After giving further energetic expression to the anger she had held back, she asked for something to drink, drank a large quantity of water without any difficulty, and awoke from her hypnosis with the glass at her lips; and thereupon the disturbance vanished, never to return. (p.9)

Freud’s interpretation was that Anna had repressed the memory of this incident along with the emotion that it triggered and that this was what had caused her inability to drink. Furthermore, he believed that her recollection of the incident, along with her expression of the emotion she had repressed, caused the symptom to go away.

As an illustration of Freud’s theory, the case study of Anna O. is quite effective. As evidence for the theory, however, it is essentially worthless. The description provides no way of knowing whether Anna had really repressed the memory of the dog drinking from the glass, whether this repression had caused her inability to drink, or whether recalling this “trauma” relieved the symptom. It is also unclear from this case study how typical or atypical Anna’s experience was.

Figure 6.8 Anna O. “Anna O.” was the subject of a famous case study used by Freud to illustrate the principles of psychoanalysis. Source: http://en.wikipedia.org/wiki/File:Pappenheim_1882.jpg

Case studies are useful because they provide a level of detailed analysis not found in many other research methods, and greater insights may be gained from this more detailed analysis. As a result of the case study, the researcher may gain a sharpened understanding of what might become important to look at more extensively in future, more controlled research. Case studies are also often the only way to study rare conditions because it may be impossible to find a large enough sample of individuals with the condition to use quantitative methods. Although at first glance a case study of a rare individual might seem to tell us little about ourselves, such studies often do provide insights into normal behavior. For example, the case of HM provided important insights into the role of the hippocampus in memory consolidation.

However, it is important to note that while case studies can provide insights into certain areas and variables to study, and can be useful in helping develop theories, they should never be used as evidence for theories. In other words, case studies can be used as inspiration to formulate theories and hypotheses, but those hypotheses and theories then need to be formally tested using more rigorous quantitative methods. The reason case studies shouldn’t be used to provide support for theories is that they suffer from problems with both internal and external validity. Case studies lack the proper controls that true experiments contain. As such, they suffer from problems with internal validity, so they cannot be used to determine causation. For instance, during HM’s surgery, the surgeon may have accidentally lesioned another area of HM’s brain (a possibility suggested by the dissection of HM’s brain following his death) and that lesion may have contributed to his inability to consolidate new information. The fact is, with case studies we cannot rule out these sorts of alternative explanations. So, as with all observational methods, case studies do not permit determination of causation. In addition, because case studies are often of a single individual, and typically an abnormal individual, researchers cannot generalize their conclusions to other individuals. Recall that with most research designs there is a trade-off between internal and external validity. With case studies, however, there are problems with both internal validity and external validity. So there are limits both to the ability to determine causation and to generalize the results. A final limitation of case studies is that ample opportunity exists for the theoretical biases of the researcher to color or bias the case description. 
Indeed, the researcher who studied HM has been accused of destroying a large amount of her unpublished data, including data that contradicted her theory about how memories are consolidated. A New York Times article describes some of the controversies that followed HM’s death and the analysis of his brain: https://www.nytimes.com/2016/08/07/magazine/the-brain-that-couldnt-remember.html?_r=0

Archival Research

Another approach that is often considered observational research involves analyzing archival data that have already been collected for some other purpose. An example is a study by Brett Pelham and his colleagues on “implicit egotism”—the tendency for people to prefer people, places, and things that are similar to themselves (Pelham, Carvallo, & Jones, 2005) [9] . In one study, they examined Social Security records to show that women with the names Virginia, Georgia, Louise, and Florence were especially likely to have moved to the states of Virginia, Georgia, Louisiana, and Florida, respectively.

As with naturalistic observation, measurement can be more or less straightforward when working with archival data. For example, counting the number of people named Virginia who live in various states based on Social Security records is relatively straightforward. But consider a study by Christopher Peterson and his colleagues on the relationship between optimism and health using data that had been collected many years before for a study on adult development (Peterson, Seligman, & Vaillant, 1988) [10]. In the 1940s, healthy male college students had completed an open-ended questionnaire about difficult wartime experiences. In the late 1980s, Peterson and his colleagues reviewed the men’s questionnaire responses to obtain a measure of explanatory style, that is, their habitual ways of explaining bad events that happen to them. More pessimistic people tend to blame themselves and expect long-term negative consequences that affect many aspects of their lives, while more optimistic people tend to blame outside forces and expect limited negative consequences. To obtain a measure of explanatory style for each participant, the researchers used a procedure in which all negative events mentioned in the questionnaire responses, and any causal explanations for them, were identified and written on index cards. These were given to a separate group of raters who rated each explanation in terms of three separate dimensions of optimism-pessimism. These ratings were then averaged to produce an explanatory style score for each participant. The researchers then assessed the statistical relationship between the men’s explanatory style as undergraduate students and archival measures of their health at approximately 60 years of age. The primary result was that the more optimistic the men were as undergraduate students, the healthier they were as older men. Pearson’s r was +.25.
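The scoring procedure described above (average the raters' ratings for each participant, then correlate the resulting scores with a health measure) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ratings, the 1 to 7 scale, the direction of the scale (higher means more optimistic), the health values, and the function names are all made up here and are not taken from Peterson and colleagues' study.

```python
from math import sqrt
from statistics import mean

# Invented ratings for illustration only (not data from the study).
# Each participant has several rated explanations; each rating is a tuple of
# three dimension ratings on an assumed 1-7 scale, higher = more optimistic.
ratings = {
    "p1": [(6, 5, 6), (5, 6, 5)],
    "p2": [(2, 3, 2), (3, 2, 3)],
    "p3": [(4, 4, 5), (5, 4, 4)],
    "p4": [(3, 2, 2), (2, 3, 3)],
}
# Invented later-life health measure, higher = healthier.
health = {"p1": 8.0, "p2": 6.0, "p3": 5.0, "p4": 4.0}


def explanatory_style_score(participant_ratings):
    """Average every dimension rating across all of one participant's explanations."""
    return mean(value for rating in participant_ratings for value in rating)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


scores = {pid: explanatory_style_score(r) for pid, r in ratings.items()}
ids = sorted(scores)
r = pearson_r([scores[i] for i in ids], [health[i] for i in ids])
print(f"Pearson's r = {r:+.2f}")
```

A positive r in this toy data set only mirrors the direction of the published finding; the size of the coefficient here says nothing about the real study.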

This method is an example of content analysis, a family of systematic approaches to measurement using complex archival data. Just as structured observation requires specifying the behaviors of interest and then noting them as they occur, content analysis requires specifying keywords, phrases, or ideas and then finding all occurrences of them in the data. These occurrences can then be counted, timed (e.g., the amount of time devoted to entertainment topics on the nightly news show), or analyzed in a variety of other ways.
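As a rough illustration of the counting step in content analysis, the Python sketch below tallies how often a set of prespecified keywords occurs in a text. The sample transcript, the keyword list, and the function name `count_keywords` are invented for this example, and real content analyses typically rely on trained human coders and much richer coding rules than whole-word matching.

```python
import re
from collections import Counter


def count_keywords(text, keywords):
    """Count whole-word, case-insensitive occurrences of each keyword in text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    return {kw: counts[kw.lower()] for kw in keywords}


# Invented snippet standing in for an archival document.
transcript = (
    "The storm caused damage. Officials blamed the storm; "
    "residents blamed officials."
)
result = count_keywords(transcript, ["storm", "blamed", "hope"])
print(result)
```

Keywords that never occur are reported with a count of zero, which keeps the output aligned with the list of categories that was specified in advance.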

Media Attributions

  • What happens when you remove the hippocampus? – Sam Kean by TED-Ed licensed under a standard YouTube License
  • Pappenheim 1882 by unknown is in the Public Domain.

References

  1. Festinger, L., Riecken, H., & Schachter, S. (1956). When prophecy fails: A social and psychological study of a modern group that predicted the destruction of the world. University of Minnesota Press.
  2. Rosenhan, D. L. (1973). On being sane in insane places. Science, 179, 250–258.
  3. Wilkins, A. (2008). “Happier than Non-Christians”: Collective emotions and symbolic boundaries among evangelical Christians. Social Psychology Quarterly, 71, 281–301.
  4. Levine, R. V., & Norenzayan, A. (1999). The pace of life in 31 countries. Journal of Cross-Cultural Psychology, 30, 178–205.
  5. Kraut, R. E., & Johnston, R. E. (1979). Social and emotional messages of smiling: An ethological approach. Journal of Personality and Social Psychology, 37, 1539–1553.
  6. Cohen, D., Nisbett, R. E., Bowdle, B. F., & Schwarz, N. (1996). Insult, aggression, and the southern culture of honor: An "experimental ethnography." Journal of Personality and Social Psychology, 70(5), 945–960.
  7. Watson, J. B., & Rayner, R. (1920). Conditioned emotional reactions. Journal of Experimental Psychology, 3, 1–14.
  8. Freud, S. (1961). Five lectures on psycho-analysis. New York, NY: Norton.
  9. Pelham, B. W., Carvallo, M., & Jones, J. T. (2005). Implicit egotism. Current Directions in Psychological Science, 14, 106–110.
  10. Peterson, C., Seligman, M. E. P., & Vaillant, G. E. (1988). Pessimistic explanatory style is a risk factor for physical illness: A thirty-five year longitudinal study. Journal of Personality and Social Psychology, 55, 23–27.

Observational research: Research that is non-experimental because it focuses on recording systematic observations of behavior in a natural or laboratory setting without manipulating anything.

Naturalistic observation: An observational method that involves observing people’s behavior in the environment in which it typically occurs.

Disguised naturalistic observation: Naturalistic observation in which researchers make their observations as unobtrusively as possible so that participants are not aware that they are being studied.

Undisguised naturalistic observation: Naturalistic observation in which the participants are made aware of the researcher’s presence and the monitoring of their behavior.

Reactivity: When a measure changes participants’ behavior.

Hawthorne effect: A type of reactivity, seen in undisguised naturalistic observation, in which people who know they are being observed and studied act differently than they normally would.

Participant observation: Researchers become active participants in the group or situation they are studying.

Disguised participant observation: Researchers pretend to be members of the social group they are observing and conceal their true identity as researchers.

Undisguised participant observation: Researchers become a part of the group they are studying and disclose their true identity as researchers to the group under investigation.

Structured observation: When a researcher makes careful observations of one or more specific behaviors in a particular setting that is more structured than the settings used in naturalistic or participant observation.

Coding: A part of structured observation whereby the observers use a clearly defined set of guidelines to "code" behaviors (assigning the specific behaviors they observe to a category) and count the number of times or the duration that the behavior occurs.

Case study: An in-depth examination of an individual.

Content analysis: A family of systematic approaches to measurement using qualitative methods to analyze complex archival data.

Research Methods in Psychology Copyright © 2019 by Rajiv S. Jhangiani, I-Chant A. Chiang, Carrie Cuttler, & Dana C. Leighton is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.

What is Observational Study Design and Types

Most people think of a traditional experimental design when they consider research and published research papers. There is, however, a type of research that is more observational in nature, and it is appropriately referred to as “observational studies.”

There are many valuable reasons to use an observational study design. But, just as with experimental research design, different methods are available when you’re considering this type of study. In this article, we’ll look at the 3 types of observational studies, as well as the advantages and disadvantages of an observational study design.

What is Observational Study Design?

In an observational study, researchers look at the effect of some type of intervention, risk, diagnostic test, or treatment without trying to manipulate who is, or who isn’t, exposed to it.

This differs from an experimental study, where the scientists are manipulating who is exposed to the treatment, intervention, etc., by having a control group, or those who are not exposed, and an experimental group, or those who are exposed to the intervention, treatment, etc. In the best studies, the groups are randomized, or chosen by chance.

In the hierarchy of evidence, which ranks study types by how reliable they are deemed to be, evidence derived from systematic reviews is considered the best. Next comes evidence from randomized controlled trials. Cohort studies and case studies follow, in that order.

Cohort studies and case studies are considered observational in design, whereas the randomized controlled trial would be an experimental study.

Let’s take a closer look at the different types of observational study design.

The 3 types of Observational Studies

The different types of observational studies are used for different reasons. Selecting the best type for your research is critical to a successful outcome. One of the main reasons observational studies are used is that a randomized experiment would be considered unethical. For example, it would be unethical to withhold a life-saving medication from a control group during a public health emergency. Observational studies are also used when looking at aetiology, or the cause of a condition or disease, as well as the treatment of rare conditions.

Case Control Observational Study

Researchers in case control studies identify individuals with an existing health issue or condition, or “cases,” along with a similar group without the condition, or “controls.” These two groups are then compared to identify predictors and outcomes. This type of study is helpful to generate a hypothesis that can then be researched.

Cohort Observational Study

This type of observational study is often used to help understand cause and effect. A cohort observational study looks at causes, incidence and prognosis, for example. A cohort is a group of people who are linked in a particular way, for example, a birth cohort would include people who were born within a specific period of time. Scientists might compare what happens to the members of the cohort who have been exposed to some variable to what occurs with members of the cohort who haven’t been exposed.
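The comparison between exposed and unexposed cohort members can be sketched numerically. The Python example below computes the incidence of an outcome in each group and their ratio. The risk ratio is one common summary and is used here as an assumption, since the article does not name a specific statistic; the counts and function names are invented for illustration.

```python
def incidence(cases, total):
    """Proportion of a group that developed the outcome during follow-up."""
    if total <= 0:
        raise ValueError("total must be positive")
    return cases / total


def risk_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Incidence among exposed members divided by incidence among unexposed members."""
    unexposed_incidence = incidence(unexposed_cases, unexposed_total)
    if unexposed_incidence == 0:
        raise ValueError("risk ratio is undefined when no unexposed member has the outcome")
    return incidence(exposed_cases, exposed_total) / unexposed_incidence


# Invented cohort: 100 exposed members (30 develop the outcome)
# and 100 unexposed members (10 develop the outcome).
rr = risk_ratio(30, 100, 10, 100)
print(f"Risk ratio: {rr:.1f}")
```

A ratio above 1 means the outcome was more frequent among exposed members, but in an observational design it does not by itself show that the exposure caused the difference.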

Cross Sectional Observational Study

Unlike a cohort observational study, a cross sectional observational study does not explore cause and effect, but instead looks at prevalence. Here you would look at data from a particular group at one very specific period of time. Researchers would simply observe and record information about something present in the population, without manipulating any variables or interventions. These types of studies are commonly used in psychology, education and social science.

Advantages and Disadvantages of Observational Study Design

Observational study designs have the distinct advantage of allowing researchers to explore answers to questions where a randomized controlled trial, or RCT, would be unethical. Additionally, if the study is focused on a rare condition, studying existing cases as compared to non-affected individuals might be the most effective way to identify possible causes of the condition. Likewise, if very little is known about a condition or circumstance, a cohort study would be a good study design choice.

A primary advantage of observational study designs is that they can generally be completed quickly and inexpensively. An RCT can take years before the data are compiled and available. RCTs are more complex and involved, with many more logistics and details to iron out, whereas an observational study can be more easily designed and completed.

The main disadvantage of observational study designs is that they’re more open to dispute than an RCT. Of particular concern is confounding bias, which arises when members of a cohort share other characteristics that affect the outcome, apart from the exposure the study set out to examine. For example, a study might find that people who practice good sleeping habits have less heart disease. But maybe those who practice good sleeping habits also, in general, eat better and exercise more.
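The sleep and heart disease example can be made concrete with a small stratified comparison. In the Python sketch below, all counts are invented and the single "lifestyle" variable is a simplifying assumption standing in for diet and exercise. The crude comparison suggests that good sleepers have less heart disease, yet within each lifestyle group the rates are identical.

```python
# (lifestyle, sleep) -> (people with heart disease, total people); invented counts.
counts = {
    ("healthy", "good"): (8, 80),
    ("healthy", "poor"): (2, 20),
    ("unhealthy", "good"): (6, 20),
    ("unhealthy", "poor"): (24, 80),
}


def rate(sleep, lifestyle=None):
    """Heart disease rate for a sleep group, optionally within one lifestyle stratum."""
    cells = [
        value
        for (life, slp), value in counts.items()
        if slp == sleep and (lifestyle is None or life == lifestyle)
    ]
    cases = sum(c for c, _ in cells)
    total = sum(t for _, t in cells)
    return cases / total


print("Crude:", rate("good"), "vs", rate("poor"))
for life in ("healthy", "unhealthy"):
    print(life, ":", rate("good", life), "vs", rate("poor", life))
```

In this toy data set the apparent benefit of good sleep comes entirely from the fact that good sleepers are more likely to have a healthy lifestyle, which is the pattern a confounder produces.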

6.6: Observational Research

  • Rajiv S. Jhangiani, I-Chant A. Chiang, Carrie Cuttler, & Dana C. Leighton
  • Kwantlen Polytechnic U., Washington State U., & Texas A&M U.—Texarkana

Learning Objectives

  • List the various types of observational research methods and distinguish between each.
  • Describe the strengths and weaknesses of each observational research method.

What Is Observational Research?

The term observational research is used to refer to several different types of non-experimental studies in which behavior is systematically observed and recorded. The goal of observational research is to describe a variable or set of variables. More generally, the goal is to obtain a snapshot of specific characteristics of an individual, group, or setting. As described previously, observational research is non-experimental because nothing is manipulated or controlled, and as such we cannot arrive at causal conclusions using this approach. The data that are collected in observational research studies are often qualitative in nature but they may also be quantitative or both (mixed-methods). There are several different types of observational methods that will be described below.

Naturalistic Observation

Naturalistic observation is an observational method that involves observing people’s behavior in the environment in which it typically occurs. Thus naturalistic observation is a type of field research (as opposed to a type of laboratory research). Jane Goodall’s famous research on chimpanzees is a classic example of naturalistic observation. Dr. Goodall spent three decades observing chimpanzees in their natural environment in East Africa. She examined such things as chimpanzees’ social structure, mating patterns, gender roles, family structure, and care of offspring by observing them in the wild. However, naturalistic observation could more simply involve observing shoppers in a grocery store, children on a school playground, or psychiatric inpatients in their wards. Researchers engaged in naturalistic observation usually make their observations as unobtrusively as possible so that participants are not aware that they are being studied. Such an approach is called disguised naturalistic observation. Ethically, this method is considered to be acceptable if the participants remain anonymous and the behavior occurs in a public setting where people would not normally have an expectation of privacy. Grocery shoppers putting items into their shopping carts, for example, are engaged in public behavior that is easily observable by store employees and other shoppers. For this reason, most researchers would consider it ethically acceptable to observe them for a study. On the other hand, one of the arguments against the ethicality of the naturalistic observation of “bathroom behavior” discussed earlier in the book is that people have a reasonable expectation of privacy even in a public restroom and that this expectation was violated.

In cases where it is not ethical or practical to conduct disguised naturalistic observation, researchers can conduct undisguised naturalistic observation, in which the participants are made aware of the researcher’s presence and the monitoring of their behavior. However, one concern with undisguised naturalistic observation is reactivity. Reactivity refers to when a measure changes participants’ behavior. In the case of undisguised naturalistic observation, the concern with reactivity is that when people know they are being observed and studied, they may act differently than they normally would. This type of reactivity is known as the Hawthorne effect. For instance, you may act much differently in a bar if you know that someone is observing you and recording your behaviors, which would threaten the validity of the study. So disguised observation is less reactive and therefore can have higher validity because people are not aware that their behaviors are being observed and recorded. However, we now know that people often become used to being observed, and with time they begin to behave naturally in the researcher’s presence. In other words, over time people habituate to being observed. Think about reality shows like Big Brother or Survivor where people are constantly being observed and recorded. While they may be on their best behavior at first, in a fairly short amount of time they are flirting, having sex, wearing next to nothing, screaming at each other, and occasionally behaving in ways that are embarrassing.

Participant Observation

Another approach to data collection in observational research is participant observation. In participant observation , researchers become active participants in the group or situation they are studying. Participant observation is very similar to naturalistic observation in that it involves observing people’s behavior in the environment in which it typically occurs. As with naturalistic observation, the data that are collected can include interviews (usually unstructured), notes based on their observations and interactions, documents, photographs, and other artifacts. The only difference between naturalistic observation and participant observation is that researchers engaged in participant observation become active members of the group or situations they are studying. The basic rationale for participant observation is that there may be important information that is only accessible to, or can be interpreted only by, someone who is an active participant in the group or situation. Like naturalistic observation, participant observation can be either disguised or undisguised. In disguised participant observation, the researchers pretend to be members of the social group they are observing and conceal their true identity as researchers.

In a famous example of disguised participant observation, Leon Festinger and his colleagues infiltrated a doomsday cult known as the Seekers, whose members believed that the apocalypse would occur on December 21, 1954. Interested in studying how members of the group would cope psychologically when the prophecy inevitably failed, they carefully recorded the events and reactions of the cult members in the days before and after the supposed end of the world. Unsurprisingly, the cult members did not give up their belief but instead convinced themselves that it was their faith and efforts that saved the world from destruction. Festinger and his colleagues later published a book about this experience, which they used to illustrate the theory of cognitive dissonance (Festinger, Riecken, & Schachter, 1956) [1] .

In contrast, with undisguised participant observation, the researchers become a part of the group they are studying and disclose their true identity as researchers to the group under investigation. Once again there are important ethical issues to consider with disguised participant observation. First, no informed consent can be obtained, and second, deception is being used. The researcher is deceiving the participants by intentionally withholding information about their motivations for being a part of the social group they are studying. But sometimes disguised participation is the only way to access a protective group (like a cult). Further, disguised participant observation is less prone to reactivity than undisguised participant observation.

Rosenhan’s study (1973) [2] of the experience of people in a psychiatric ward would be considered disguised participant observation because Rosenhan and his pseudopatients were admitted into psychiatric hospitals on the pretense of being patients so that they could observe the way that psychiatric patients are treated by staff. The staff and other patients were unaware of their true identities as researchers.

Another example of participant observation comes from a study by sociologist Amy Wilkins on a university-based religious organization that emphasized how happy its members were (Wilkins, 2008) [3] . Wilkins spent 12 months attending and participating in the group’s meetings and social events, and she interviewed several group members. In her study, Wilkins identified several ways in which the group “enforced” happiness—for example, by continually talking about happiness, discouraging the expression of negative emotions, and using happiness as a way to distinguish themselves from other groups.

One of the primary benefits of participant observation is that the researchers are in a much better position to understand the viewpoint and experiences of the people they are studying when they are a part of the social group. The primary limitation with this approach is that the mere presence of the observer could affect the behavior of the people being observed. While this is also a concern with naturalistic observation, additional concerns arise when researchers become active members of the social group they are studying because they may change the social dynamics and/or influence the behavior of the people they are studying. Similarly, if the researcher acts as a participant observer, there can be concerns with biases resulting from developing relationships with the participants. Concretely, the researcher may become less objective, resulting in more experimenter bias.

Structured Observation

Another observational method is structured observation. Here the investigator makes careful observations of one or more specific behaviors in a particular setting that is more structured than the settings used in naturalistic or participant observation. Often the setting in which the observations are made is not the natural setting. Instead, the researcher may observe people in the laboratory environment. Alternatively, the researcher may observe people in a natural setting (like a classroom setting) that they have structured in some way, for instance by introducing some specific task participants are to engage in or by introducing a specific social situation or manipulation.

Structured observation is very similar to naturalistic observation and participant observation in that in all three cases researchers are observing naturally occurring behavior; however, the emphasis in structured observation is on gathering quantitative rather than qualitative data. Researchers using this approach are interested in a limited set of behaviors. This allows them to quantify the behaviors they are observing. In other words, structured observation is less global than naturalistic or participant observation because the researcher engaged in structured observations is interested in a small number of specific behaviors. Therefore, rather than recording everything that happens, the researcher only focuses on very specific behaviors of interest.

Researchers Robert Levine and Ara Norenzayan used structured observation to study differences in the “pace of life” across countries (Levine & Norenzayan, 1999) [4] . One of their measures involved observing pedestrians in a large city to see how long it took them to walk 60 feet. They found that people in some countries walked reliably faster than people in other countries. For example, people in Canada and Sweden covered 60 feet in just under 13 seconds on average, while people in Brazil and Romania took close to 17 seconds. When structured observation takes place in the complex and even chaotic “real world,” the questions of when, where, and under what conditions the observations will be made, and who exactly will be observed are important to consider. Levine and Norenzayan described their sampling process as follows:

“Male and female walking speed over a distance of 60 feet was measured in at least two locations in main downtown areas in each city. Measurements were taken during main business hours on clear summer days. All locations were flat, unobstructed, had broad sidewalks, and were sufficiently uncrowded to allow pedestrians to move at potentially maximum speeds. To control for the effects of socializing, only pedestrians walking alone were used. Children, individuals with obvious physical handicaps, and window-shoppers were not timed. Thirty-five men and 35 women were timed in most cities.” (p. 186).

Precise specification of the sampling process in this way makes data collection manageable for the observers, and it also provides some control over important extraneous variables. For example, by making their observations on clear summer days in all countries, Levine and Norenzayan controlled for effects of the weather on people’s walking speeds. In Levine and Norenzayan’s study, measurement was relatively straightforward. They simply measured out a 60-foot distance along a city sidewalk and then used a stopwatch to time participants as they walked over that distance.
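
To make the measurement concrete, the timings can be converted into walking speeds. The sketch below uses only the 60-foot distance and the rounded averages quoted above (about 13 and about 17 seconds). The function name `walking_speed` and the rounded inputs are illustrative assumptions, not values taken from Levine and Norenzayan's dataset.

```python
DISTANCE_FEET = 60.0       # distance the observers measured out on the sidewalk
FEET_PER_METER = 3.28084   # unit conversion factor

def walking_speed(seconds, distance_feet=DISTANCE_FEET):
    """Return (feet per second, meters per second) for one timed walk."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    feet_per_second = distance_feet / seconds
    return feet_per_second, feet_per_second / FEET_PER_METER

# Rounded averages quoted in the text, used here for illustration only.
for label, seconds in [("about 13 s", 13.0), ("about 17 s", 17.0)]:
    fps, mps = walking_speed(seconds)
    print(f"{label}: {fps:.2f} ft/s ({mps:.2f} m/s)")
```

Because every pedestrian covers the same distance, a shorter time always means a higher speed, which is why a stopwatch alone was enough for this measure.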

As another example, researchers Robert Kraut and Robert Johnston wanted to study bowlers’ reactions to their shots, both when they were facing the pins and then when they turned toward their companions (Kraut & Johnston, 1979) [5] . But what “reactions” should they observe? Based on previous research and their own pilot testing, Kraut and Johnston created a list of reactions that included “closed smile,” “open smile,” “laugh,” “neutral face,” “look down,” “look away,” and “face cover” (covering one’s face with one’s hands). The observers committed this list to memory and then practiced by coding the reactions of bowlers who had been videotaped. During the actual study, the observers spoke into an audio recorder, describing the reactions they observed. Among the most interesting results of this study was that bowlers rarely smiled while they still faced the pins. They were much more likely to smile after they turned toward their companions, suggesting that smiling is not purely an expression of happiness but also a form of social communication.

In yet another example (this one in a laboratory environment), Dov Cohen and his colleagues had observers rate the emotional reactions of participants who had just been deliberately bumped and insulted by a confederate after they dropped off a completed questionnaire at the end of a hallway. The confederate was posing as someone who worked in the same building and who was frustrated by having to close a file drawer twice in order to permit the participants to walk past them (first to drop off the questionnaire at the end of the hallway and once again on their way back to the room where they believed the study they signed up for was taking place). The two observers were positioned at different ends of the hallway so that they could read the participants’ body language and hear anything they might say. Interestingly, the researchers hypothesized that participants from the southern United States, which is one of several places in the world that has a “culture of honor,” would react with more aggression than participants from the northern United States, a prediction that was in fact supported by the observational data (Cohen, Nisbett, Bowdle, & Schwarz, 1996) [6] .

When the observations require a judgment on the part of the observers, as in the studies by Kraut and Johnston and by Cohen and his colleagues, a process referred to as coding is typically required. Coding generally requires clearly defining a set of target behaviors. The observers then categorize participants individually in terms of which behavior they have engaged in and the number of times they engaged in each behavior. The observers might even record the duration of each behavior. The target behaviors must be defined in a way that leads different observers to code them in the same way. This difficulty with coding illustrates the issue of interrater reliability, as mentioned in Chapter 4. Researchers are expected to demonstrate the interrater reliability of their coding procedure by having multiple raters code the same behaviors independently and then showing that the different observers are in close agreement. Kraut and Johnston, for example, video recorded a subset of their participants’ reactions and had two observers independently code them. The two observers agreed on the reactions that were exhibited 97% of the time, indicating good interrater reliability.
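
The agreement check described above can be sketched in a few lines. The example below computes simple percent agreement, the kind of figure Kraut and Johnston reported, and also Cohen's kappa, a chance-corrected statistic that is added here for illustration and is not something the source says the authors used. The category labels come from their coding list, but the two sequences of codes are invented.

```python
from collections import Counter

def percent_agreement(codes_a, codes_b):
    """Proportion of observations that two raters coded identically."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both raters must code the same non-empty set of observations")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)

def cohens_kappa(codes_a, codes_b):
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(codes_a)
    observed = percent_agreement(codes_a, codes_b)
    counts_a, counts_b = Counter(codes_a), Counter(codes_b)
    # Chance agreement: product of each rater's marginal proportions per category.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1.0 - expected)

# Hypothetical codes from two observers watching the same eight bowlers.
observer_1 = ["closed smile", "neutral face", "laugh", "look down",
              "open smile", "neutral face", "look away", "neutral face"]
observer_2 = ["closed smile", "neutral face", "laugh", "look away",
              "open smile", "neutral face", "look away", "neutral face"]

print(f"percent agreement: {percent_agreement(observer_1, observer_2):.3f}")
print(f"Cohen's kappa:     {cohens_kappa(observer_1, observer_2):.3f}")
```

Kappa is usually lower than raw agreement because some matches would occur even if the raters coded at random, especially when one category (here "neutral face") is very common.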

One of the primary benefits of structured observation is that it is far more efficient than naturalistic and participant observation. Because the researchers focus on specific behaviors, time and expense are reduced. In addition, the environment is often structured to encourage the behaviors of interest, so researchers do not have to spend as much time waiting for those behaviors to occur naturally. Finally, researchers using this approach can exert greater control over the environment. However, exerting more control may make the environment less natural, which decreases external validity. It is less clear, for instance, whether structured observations made in a laboratory environment will generalize to a real-world environment. Furthermore, because researchers engaged in structured observation are often not disguised, there may be more concerns about reactivity.

Case Studies

A case study is an in-depth examination of an individual. Sometimes case studies are also completed on social units (e.g., a cult) and events (e.g., a natural disaster). Most commonly in psychology, however, case studies provide a detailed description and analysis of an individual. Often the individual has a rare or unusual condition or disorder or has damage to a specific region of the brain.

Like many observational research methods, case studies tend to be more qualitative in nature. Case study methods involve an in-depth, and often longitudinal, examination of an individual. Depending on the focus of the case study, individuals may or may not be observed in their natural setting. If the natural setting is not what is of interest, then the individual may be brought into a therapist’s office or a researcher’s lab for study. Also, the bulk of the case study report will focus on in-depth descriptions of the person rather than on statistical analyses. That said, some quantitative data may also be included in the write-up of a case study. For instance, an individual’s depression score may be compared to normative scores, or their scores before and after treatment may be compared. As with other qualitative methods, a variety of methods and tools can be used to collect information on the case. For instance, interviews, naturalistic observation, structured observation, psychological testing (e.g., IQ test), and/or physiological measurements (e.g., brain scans) may be used to collect information on the individual.

HM is one of the most famous case studies in psychology. HM suffered from intractable and very severe epilepsy. A surgeon localized HM’s epilepsy to his medial temporal lobe and in 1953 removed large sections of his hippocampus in an attempt to stop the seizures. The treatment was a success in that it resolved his epilepsy, and his IQ and personality were unaffected. However, the doctors soon realized that HM exhibited a strange form of amnesia called anterograde amnesia. HM was able to carry on a conversation, and he could remember short strings of letters, digits, and words. In other words, his short-term memory was preserved. However, HM could not commit new events to memory. He lost the ability to transfer information from his short-term memory to his long-term memory, something memory researchers call consolidation. So while he could carry on a conversation with someone, he would completely forget the conversation after it ended. This was an extremely important case study for memory researchers because it suggested a dissociation between short-term memory and long-term memory: the two appeared to be different abilities subserved by different areas of the brain. It also suggested that the temporal lobes are particularly important for consolidating new information (i.e., for transferring information from short-term memory to long-term memory).

The history of psychology is filled with influential case studies, such as Sigmund Freud’s description of “Anna O.” (see Note 6.1 “The Case of “Anna O.””) and John Watson and Rosalie Rayner’s description of Little Albert (Watson & Rayner, 1920) [7], who allegedly learned to fear a white rat, along with other furry objects, when the researchers repeatedly made a loud noise every time the rat approached him.

The Case of “Anna O.”

Sigmund Freud used the case of a young woman he called “Anna O.” to illustrate many principles of his theory of psychoanalysis (Freud, 1961) [8] . (Her real name was Bertha Pappenheim, and she was an early feminist who went on to make important contributions to the field of social work.) Anna had come to Freud’s colleague Josef Breuer around 1880 with a variety of odd physical and psychological symptoms. One of them was that for several weeks she was unable to drink any fluids. According to Freud,

She would take up the glass of water that she longed for, but as soon as it touched her lips she would push it away like someone suffering from hydrophobia.…She lived only on fruit, such as melons, etc., so as to lessen her tormenting thirst. (p. 9)

But according to Freud, a breakthrough came one day while Anna was under hypnosis.

[S]he grumbled about her English “lady-companion,” whom she did not care for, and went on to describe, with every sign of disgust, how she had once gone into this lady’s room and how her little dog—horrid creature!—had drunk out of a glass there. The patient had said nothing, as she had wanted to be polite. After giving further energetic expression to the anger she had held back, she asked for something to drink, drank a large quantity of water without any difficulty, and awoke from her hypnosis with the glass at her lips; and thereupon the disturbance vanished, never to return. (p. 9)

Freud’s interpretation was that Anna had repressed the memory of this incident along with the emotion that it triggered and that this was what had caused her inability to drink. Furthermore, he believed that her recollection of the incident, along with her expression of the emotion she had repressed, caused the symptom to go away.

As an illustration of Freud’s theory, the case study of Anna O. is quite effective. As evidence for the theory, however, it is essentially worthless. The description provides no way of knowing whether Anna had really repressed the memory of the dog drinking from the glass, whether this repression had caused her inability to drink, or whether recalling this “trauma” relieved the symptom. It is also unclear from this case study how typical or atypical Anna’s experience was.

Case studies are useful because they provide a level of detailed analysis not found in many other research methods, and greater insights may be gained from this more detailed analysis. As a result of the case study, the researcher may gain a sharpened understanding of what might become important to examine more extensively in future, more controlled research. Case studies are also often the only way to study rare conditions because it may be impossible to find a large enough sample of individuals with the condition to use quantitative methods. Although at first glance a case study of a rare individual might seem to tell us little about ourselves, such studies often do provide insights into normal behavior. The case of HM, for example, provided important insights into the role of the hippocampus in memory consolidation.

However, it is important to note that while case studies can provide insights into certain areas and variables to study, and can be useful in helping develop theories, they should never be used as evidence for theories. In other words, case studies can be used as inspiration to formulate theories and hypotheses, but those hypotheses and theories then need to be formally tested using more rigorous quantitative methods. The reason case studies shouldn’t be used to provide support for theories is that they suffer from problems with both internal and external validity. Case studies lack the proper controls that true experiments contain. As such, they suffer from problems with internal validity, so they cannot be used to determine causation. For instance, during HM’s surgery, the surgeon may have accidentally lesioned another area of HM’s brain (a possibility suggested by the dissection of HM’s brain following his death) and that lesion may have contributed to his inability to consolidate new information. The fact is, with case studies we cannot rule out these sorts of alternative explanations. So, as with all observational methods, case studies do not permit determination of causation. In addition, because case studies are often of a single individual, and typically an abnormal individual, researchers cannot generalize their conclusions to other individuals. Recall that with most research designs there is a trade-off between internal and external validity. With case studies, however, there are problems with both internal validity and external validity. So there are limits both to the ability to determine causation and to generalize the results. A final limitation of case studies is that ample opportunity exists for the theoretical biases of the researcher to color or bias the case description. 
Indeed, there have been accusations that the researcher who studied HM destroyed unpublished data that contradicted her theory of how memories are consolidated. A New York Times article describes some of the controversies that followed HM’s death and the analysis of his brain: https://www.nytimes.com/2016/08/07/magazine/the-brain-that-couldnt-remember.html?_r=0

Archival Research

Another approach that is often considered observational research involves analyzing archival data that have already been collected for some other purpose. An example is a study by Brett Pelham and his colleagues on “implicit egotism”—the tendency for people to prefer people, places, and things that are similar to themselves (Pelham, Carvallo, & Jones, 2005) [9] . In one study, they examined Social Security records to show that women with the names Virginia, Georgia, Louise, and Florence were especially likely to have moved to the states of Virginia, Georgia, Louisiana, and Florida, respectively.

As with naturalistic observation, measurement can be more or less straightforward when working with archival data. For example, counting the number of people named Virginia who live in various states based on Social Security records is relatively straightforward. But consider a study by Christopher Peterson and his colleagues on the relationship between optimism and health using data that had been collected many years before for a study on adult development (Peterson, Seligman, & Vaillant, 1988) [10] . In the 1940s, healthy male college students had completed an open-ended questionnaire about difficult wartime experiences. In the late 1980s, Peterson and his colleagues reviewed the men’s questionnaire responses to obtain a measure of explanatory style—their habitual ways of explaining bad events that happen to them. More pessimistic people tend to blame themselves and expect long-term negative consequences that affect many aspects of their lives, while more optimistic people tend to blame outside forces and expect limited negative consequences. To obtain a measure of explanatory style for each participant, the researchers used a procedure in which all negative events mentioned in the questionnaire responses, and any causal explanations for them were identified and written on index cards. These were given to a separate group of raters who rated each explanation in terms of three separate dimensions of optimism-pessimism. These ratings were then averaged to produce an explanatory style score for each participant. The researchers then assessed the statistical relationship between the men’s explanatory style as undergraduate students and archival measures of their health at approximately 60 years of age. The primary result was that the more optimistic the men were as undergraduate students, the healthier they were as older men. Pearson’s r was +.25.
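
The scoring and correlation steps described above can be sketched as follows. This is a simplified illustration with invented numbers: each participant's rater scores are averaged into one explanatory style score, and Pearson's r is then computed between those scores and a later health measure. The participant IDs, the rating scale, and the health values are all hypothetical, and the output is not meant to reproduce the reported r of +.25.

```python
import math

def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x, mean_y = mean(x), mean(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a variable has no variance")
    return covariation / (spread_x * spread_y)

# Hypothetical rater scores per participant (higher = more optimistic).
rater_scores = {
    "P1": [2, 3, 2],
    "P2": [5, 4, 5],
    "P3": [3, 3, 4],
    "P4": [6, 5, 6],
}
# Hypothetical health ratings decades later (higher = healthier).
health = {"P1": 3, "P2": 4, "P3": 2, "P4": 5}

participants = sorted(rater_scores)
style = [mean(rater_scores[p]) for p in participants]   # one score per person
later_health = [health[p] for p in participants]
r = pearson_r(style, later_health)
print(f"r = {r:+.2f}")
```

A positive r in this setup would mean that participants scored as more optimistic tended to have better health ratings later, which is the direction of the relationship the researchers reported.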

This method is an example of content analysis, a family of systematic approaches to measurement using complex archival data. Just as structured observation requires specifying the behaviors of interest and then noting them as they occur, content analysis requires specifying keywords, phrases, or ideas and then finding all occurrences of them in the data. These occurrences can then be counted, timed (e.g., the amount of time devoted to entertainment topics on a nightly news show), or analyzed in a variety of other ways.
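
The counting form of content analysis can be sketched as below. The function `count_keywords`, the three sample responses, and the keyword list are all invented for illustration. Real content analyses typically rely on trained human coders or more elaborate coding schemes than literal phrase matching.

```python
import re
from collections import Counter

def count_keywords(documents, keywords):
    """Count whole-word, case-insensitive occurrences of each keyword or phrase."""
    totals = Counter({keyword: 0 for keyword in keywords})
    for text in documents:
        for keyword in keywords:
            pattern = r"\b" + re.escape(keyword) + r"\b"
            totals[keyword] += len(re.findall(pattern, text, flags=re.IGNORECASE))
    return totals

# Hypothetical open-ended responses and phrases of interest.
documents = [
    "I blamed myself for the failure and expected it to last.",
    "The weather caused the delay; it was bad luck, not my fault.",
    "It was my fault. I always fail.",
]
keywords = ["my fault", "bad luck", "always"]

for keyword, count in count_keywords(documents, keywords).items():
    print(f"{keyword!r}: {count}")
```

Note that the second response contains "not my fault" and is still counted under "my fault". Literal matching ignores negation and context, which is one reason human judgment and interrater checks remain important.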

  • Festinger, L., Riecken, H., & Schachter, S. (1956). When prophecy fails: A social and psychological study of a modern group that predicted the destruction of the world. University of Minnesota Press.
  • Rosenhan, D. L. (1973). On being sane in insane places. Science, 179, 250–258.
  • Wilkins, A. (2008). “Happier than Non-Christians”: Collective emotions and symbolic boundaries among evangelical Christians. Social Psychology Quarterly, 71, 281–301.
  • Levine, R. V., & Norenzayan, A. (1999). The pace of life in 31 countries. Journal of Cross-Cultural Psychology, 30, 178–205.
  • Kraut, R. E., & Johnston, R. E. (1979). Social and emotional messages of smiling: An ethological approach. Journal of Personality and Social Psychology, 37, 1539–1553.
  • Cohen, D., Nisbett, R. E., Bowdle, B. F., & Schwarz, N. (1996). Insult, aggression, and the southern culture of honor: An "experimental ethnography." Journal of Personality and Social Psychology, 70(5), 945–960.
  • Watson, J. B., & Rayner, R. (1920). Conditioned emotional reactions. Journal of Experimental Psychology, 3, 1–14.
  • Freud, S. (1961). Five lectures on psycho-analysis. New York, NY: Norton.
  • Pelham, B. W., Carvallo, M., & Jones, J. T. (2005). Implicit egotism. Current Directions in Psychological Science, 14, 106–110.
  • Peterson, C., Seligman, M. E. P., & Vaillant, G. E. (1988). Pessimistic explanatory style is a risk factor for physical illness: A thirty-five year longitudinal study. Journal of Personality and Social Psychology, 55, 23–27.

Observational Research – Methods and Guide

Observational Research

Definition:

Observational research is a type of research method where the researcher observes and records the behavior of individuals or groups in their natural environment. In other words, the researcher does not intervene or manipulate any variables but simply observes and describes what is happening.

Observation

Observation is the process of collecting and recording data by observing and noting events, behaviors, or phenomena in a systematic and objective manner. It is a fundamental method used in research, scientific inquiry, and everyday life to gain an understanding of the world around us.

Types of Observational Research

Observational research can be categorized into different types based on the level of control and the degree of involvement of the researcher in the study. Some of the common types of observational research are:

Naturalistic Observation

In naturalistic observation, the researcher observes and records the behavior of individuals or groups in their natural environment without any interference or manipulation of variables.

Controlled Observation

In controlled observation, the researcher controls the environment in which the observation is taking place. This type of observation is often used in laboratory settings.

Participant Observation

In participant observation, the researcher becomes an active participant in the group or situation being observed. The researcher may interact with the individuals being observed and gather data on their behavior, attitudes, and experiences.

Structured Observation

In structured observation, the researcher defines a set of behaviors or events to be observed and records their occurrence.

Unstructured Observation

In unstructured observation, the researcher observes and records any behaviors or events that occur without predetermined categories.

Cross-Sectional Observation

In cross-sectional observation, the researcher observes and records the behavior of different individuals or groups at a single point in time.

Longitudinal Observation

In longitudinal observation, the researcher observes and records the behavior of the same individuals or groups over an extended period of time.

Data Collection Methods

Observational research uses various data collection methods to gather information about the behaviors and experiences of individuals or groups being observed. Some common data collection methods used in observational research include:

Field Notes

This method involves recording detailed notes of the observed behavior, events, and interactions. These notes are usually written in real-time during the observation process.

Audio and Video Recordings

Audio and video recordings can be used to capture the observed behavior and interactions. These recordings can be later analyzed to extract relevant information.

Surveys and Questionnaires

Surveys and questionnaires can be used to gather additional information from the individuals or groups being observed. This method can be used to validate or supplement the observational data.

Time Sampling

This method involves taking a snapshot of the observed behavior at pre-determined time intervals. This method helps to identify the frequency and duration of the observed behavior.

Event Sampling

This method involves recording specific events or behaviors that are of interest to the researcher. This method helps to provide detailed information about specific behaviors or events.
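
The difference between time sampling and event sampling can be sketched with a small observation log. Everything here is hypothetical: the log format (start second, end second, behavior label), the behavior names, and the helper functions `time_sample` and `event_sample`. The time-sampling variant shown records whatever is happening at each checkpoint, which is only one of several ways to sample by time.

```python
# Hypothetical continuous log of one child's behavior: (start_s, end_s, behavior).
log = [
    (0, 20, "playing alone"),
    (20, 50, "talking"),
    (50, 65, "playing alone"),
    (65, 120, "group play"),
]

def time_sample(log, interval, session_length):
    """Record the behavior in progress at each pre-determined checkpoint."""
    samples = []
    for t in range(0, session_length, interval):
        current = next((b for start, end, b in log if start <= t < end), None)
        samples.append((t, current))
    return samples

def event_sample(log, target):
    """Record every occurrence of one target behavior as (start time, duration)."""
    return [(start, end - start) for start, end, b in log if b == target]

print(time_sample(log, interval=30, session_length=120))
print(event_sample(log, target="playing alone"))
```

Time sampling gives a regular but coarse picture and can miss brief behaviors that fall between checkpoints, while event sampling captures every instance of the chosen behavior but ignores everything else.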

Checklists and Rating Scales

Checklists and rating scales can be used to record the occurrence and frequency of specific behaviors or events. This method helps to simplify and standardize the data collection process.

Observational Data Analysis Methods

Common methods for analyzing observational data include:

Descriptive Statistics

This method involves using statistical techniques such as frequency distributions, means, and standard deviations to summarize the observed behaviors, events, or interactions.
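
As a minimal sketch of these summaries, the example below builds a frequency distribution for coded behaviors and computes a mean and sample standard deviation for a numeric measure, using Python's standard library. The behavior codes and timing values are invented for illustration.

```python
from collections import Counter
import statistics

def frequency_distribution(codes):
    """Map each code to (count, proportion of all observations)."""
    counts = Counter(codes)
    total = len(codes)
    return {code: (n, n / total) for code, n in counts.items()}

def summarize(values):
    """Return (mean, sample standard deviation) for numeric observations."""
    return statistics.mean(values), statistics.stdev(values)

# Hypothetical coded behaviors and hypothetical timings in seconds.
codes = ["smile", "neutral", "smile", "laugh", "neutral", "smile"]
timings = [12.8, 13.4, 12.5, 13.1, 12.7]

for code, (count, proportion) in frequency_distribution(codes).items():
    print(f"{code}: {count} ({proportion:.0%})")

average, spread = summarize(timings)
print(f"mean = {average:.2f} s, SD = {spread:.2f} s")
```

`statistics.stdev` computes the sample standard deviation (dividing by n - 1), which is the usual choice when the observations are a sample from a larger population.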

Qualitative Analysis

Qualitative analysis involves identifying patterns and themes in the observed behaviors or interactions. This analysis can be done manually or with the help of software tools.

Content Analysis

Content analysis involves categorizing and counting the occurrences of specific behaviors or events. This analysis can be done manually or with the help of software tools.

Time-series Analysis

Time-series analysis involves analyzing the changes in behavior or interactions over time. This analysis can help identify trends and patterns in the observed data.

Inter-observer Reliability Analysis

Inter-observer reliability analysis involves comparing the observations made by multiple observers to ensure the consistency and reliability of the data.

Multivariate Analysis

Multivariate analysis involves analyzing multiple variables simultaneously to identify the relationships between the observed behaviors, events, or interactions.

Event Coding

This method involves coding observed behaviors or events into specific categories and then analyzing the frequency and duration of each category.

Cluster Analysis

Cluster analysis involves grouping similar behaviors or events into clusters based on their characteristics or patterns.

Latent Class Analysis

Latent class analysis involves identifying subgroups of individuals or groups based on their observed behaviors or interactions.

Social Network Analysis

Social network analysis involves mapping the social relationships and interactions between individuals or groups based on their observed behaviors.
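
A very small sketch of this idea: observed interactions are stored as pairs, turned into an undirected network, and each person's number of distinct contacts (their degree) is counted. The names, the interaction list, and the helper functions are hypothetical. Dedicated libraries offer far richer measures than this.

```python
from collections import defaultdict

def build_network(interactions):
    """Build an undirected network: person -> set of people they interacted with."""
    network = defaultdict(set)
    for a, b in interactions:
        if a == b:
            continue  # ignore self-interactions
        network[a].add(b)
        network[b].add(a)
    return network

def degree(network):
    """Number of distinct interaction partners for each person."""
    return {person: len(partners) for person, partners in network.items()}

# Hypothetical interactions observed on a playground.
interactions = [
    ("Ana", "Ben"),
    ("Ana", "Cal"),
    ("Ben", "Cal"),
    ("Ana", "Dee"),
    ("Ana", "Ben"),  # repeated contact, counted once in the degree
]

network = build_network(interactions)
for person, n in sorted(degree(network).items()):
    print(f"{person}: {n} contact(s)")
```

Using sets means repeated interactions between the same pair do not inflate the degree. A weighted version would instead count how often each pair interacted.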

The choice of data analysis method depends on the research question, the type of data collected, and the available resources. Researchers should choose the appropriate method that best fits their research question and objectives. It is also important to ensure the validity and reliability of the data analysis by using appropriate statistical tests and measures.

Applications of Observational Research

Observational research is a versatile research method that can be used in a variety of fields to explore and understand human behavior, attitudes, and preferences. Here are some common applications of observational research:

  • Psychology: Observational research is commonly used in psychology to study human behavior in natural settings. This can include observing children at play to understand their social development or observing people’s reactions to stress to better understand how stress affects behavior.
  • Marketing: Observational research is used in marketing to understand consumer behavior and preferences. This can include observing shoppers in stores to understand how they make purchase decisions or observing how people interact with advertisements to determine their effectiveness.
  • Education: Observational research is used in education to study teaching and learning in natural settings. This can include observing classrooms to understand how teachers interact with students or observing students to understand how they learn.
  • Anthropology: Observational research is commonly used in anthropology to understand cultural practices and beliefs. This can include observing people’s daily routines to understand their culture or observing rituals and ceremonies to better understand their significance.
  • Healthcare: Observational research is used in healthcare to understand patient behavior and preferences. This can include observing patients in hospitals to understand how they interact with healthcare professionals or observing patients with chronic illnesses to better understand their daily routines and needs.
  • Sociology: Observational research is used in sociology to understand social interactions and relationships. This can include observing people in public spaces to understand how they interact with others or observing groups to understand how they function.
  • Ecology: Observational research is used in ecology to understand the behavior and interactions of animals and plants in their natural habitats. This can include observing animal behavior to understand their social structures or observing plant growth to understand their response to environmental factors.
  • Criminology: Observational research is used in criminology to understand criminal behavior and the factors that contribute to it. This can include observing criminal activity in a particular area to identify patterns or observing the behavior of inmates to understand their experience in the criminal justice system.

Observational Research Examples

Here are some examples of observational research in practice:

  • A researcher observes and records the behaviors of a group of children on a playground to study their social interactions and play patterns.
  • A researcher observes the buying behaviors of customers in a retail store to study the impact of store layout and product placement on purchase decisions.
  • A researcher observes the behavior of drivers at a busy intersection to study the effectiveness of traffic signs and signals.
  • A researcher observes the behavior of patients in a hospital to study the impact of staff communication and interaction on patient satisfaction and recovery.
  • A researcher observes the behavior of employees in a workplace to study the impact of the work environment on productivity and job satisfaction.
  • A researcher observes the behavior of shoppers in a mall to study the impact of music and lighting on consumer behavior.
  • A researcher observes the behavior of animals in their natural habitat to study their social and feeding behaviors.
  • A researcher observes the behavior of students in a classroom to study the effectiveness of teaching methods and student engagement.
  • A researcher observes the behavior of pedestrians and cyclists on a city street to study the impact of infrastructure and traffic regulations on safety.

How to Conduct Observational Research

Here are some general steps for conducting observational research:

  • Define the Research Question: Determine the research question and objectives to guide the observational research study. The research question should be specific, clear, and relevant to the area of study.
  • Choose the appropriate observational method: Choose the appropriate observational method based on the research question, the type of data required, and the available resources.
  • Plan the observation: Plan the observation by selecting the observation location, duration, and sampling technique. Identify the population or sample to be observed and the characteristics to be recorded.
  • Train observers: Train the observers on the observational method, data collection tools, and techniques. Ensure that the observers understand the research question and objectives and can accurately record the observed behaviors or events.
  • Conduct the observation: Conduct the observation by recording the observed behaviors or events using the data collection tools and techniques. Ensure that the observation is conducted in a consistent and unbiased manner.
  • Analyze the data: Analyze the observed data using appropriate data analysis methods such as descriptive statistics, qualitative analysis, or content analysis. Validate the data by checking the inter-observer reliability and conducting statistical tests.
  • Interpret the results: Interpret the results by answering the research question and objectives. Identify the patterns, trends, or relationships in the observed data and draw conclusions based on the analysis.
  • Report the findings: Report the findings in a clear and concise manner, using appropriate visual aids and tables. Discuss the implications of the results and the limitations of the study.

When to use Observational Research

Here are some situations where observational research can be useful:

  • Exploratory Research: Observational research can be used in exploratory studies to gain insights into new phenomena or areas of interest.
  • Hypothesis Generation: Observational research can be used to generate hypotheses about the relationships between variables, which can be tested using experimental research.
  • Naturalistic Settings: Observational research is useful in naturalistic settings where it is difficult or unethical to manipulate the environment or variables.
  • Human Behavior: Observational research is useful in studying human behavior, such as social interactions, decision-making, and communication patterns.
  • Animal Behavior: Observational research is useful in studying animal behavior in their natural habitats, such as social and feeding behaviors.
  • Longitudinal Studies: Observational research can be used in longitudinal studies to observe changes in behavior over time.
  • Ethical Considerations: Observational research can be used in situations where manipulating the environment or variables would be unethical or impractical.

Purpose of Observational Research

Observational research is a method of collecting and analyzing data by observing individuals or phenomena in their natural settings, without manipulating them in any way. The purpose of observational research is to gain insights into human behavior, attitudes, and preferences, as well as to identify patterns, trends, and relationships that may exist between variables.

The primary purpose of observational research is to generate hypotheses that can be tested through more rigorous experimental methods. By observing behavior and identifying patterns, researchers can develop a better understanding of the factors that influence human behavior, and use this knowledge to design experiments that test specific hypotheses.

Observational research is also used to generate descriptive data about a population or phenomenon. For example, an observational study of shoppers in a grocery store might reveal that women are more likely than men to buy organic produce. This type of information can be useful for marketers or policy-makers who want to understand consumer preferences and behavior.

In addition, observational research can be used to monitor changes over time. By observing behavior at different points in time, researchers can identify trends and changes that may be indicative of broader social or cultural shifts.

Overall, the purpose of observational research is to provide insights into human behavior and to generate hypotheses that can be tested through further research.

Advantages of Observational Research

There are several advantages to using observational research in different fields, including:

  • Naturalistic observation: Observational research allows researchers to observe behavior in a naturalistic setting, which means that people are observed in their natural environment without the constraints of a laboratory. This helps to ensure that the behavior observed is more representative of the real-world situation.
  • Unobtrusive: Observational research is often unobtrusive, which means that the researcher does not interfere with the behavior being observed. This can reduce the likelihood that the findings are affected by the observer’s presence, as in the Hawthorne effect, where people modify their behavior when they know they are being observed.
  • Cost-effective: Observational research can be less expensive than other research methods, such as experiments or surveys. Researchers often do not need to recruit participants or pay for expensive equipment, which can make it a more cost-effective research method.
  • Flexibility: Observational research is a flexible research method that can be used in a variety of settings and for a range of research questions. Observational research can be used to generate hypotheses, to collect data on behavior, or to monitor changes over time.
  • Rich data: Observational research provides rich data that can be analyzed to identify patterns and relationships between variables. It can also provide context for behaviors, helping to explain why people behave in a certain way.
  • Validity: Observational research can provide high levels of validity, meaning that the results accurately reflect the behavior being studied. This is because the behavior is being observed in a natural setting without interference from the researcher.

Disadvantages of Observational Research

While observational research has many advantages, it also has some limitations and disadvantages. Here are some of the disadvantages of observational research:

  • Observer bias: Observational research is prone to observer bias, which is when the observer’s own beliefs and assumptions affect the way they interpret and record behavior. This can lead to inaccurate or unreliable data.
  • Limited generalizability: The behavior observed in a specific setting may not be representative of the behavior in other settings. This can limit the generalizability of the findings from observational research.
  • Difficulty in establishing causality: Observational research is often correlational, which means that it identifies relationships between variables but does not establish causality. This can make it difficult to determine if a particular behavior is causing an outcome or if the relationship is due to other factors.
  • Ethical concerns: Observational research can raise ethical concerns if the participants being observed are unaware that they are being observed or if the observations invade their privacy.
  • Time-consuming: Observational research can be time-consuming, especially if the behavior being observed is infrequent or occurs over a long period of time. This can make it difficult to collect enough data to draw valid conclusions.
  • Difficulty in measuring internal processes: Observational research may not be effective in measuring internal processes, such as thoughts, feelings, and attitudes. This can limit the ability to understand the reasons behind behavior.

About the author

Muhammad Hassan

Researcher, Academic Writer, Web developer

The Oxford Handbook of Quantitative Methods in Psychology, Vol. 1

15 Observational Methods

Jamie M. Ostrov, Department of Psychology, University at Buffalo, The State University of New York, Buffalo, NY

Emily J. Hart, Department of Psychology, University at Buffalo, The State University of New York, Buffalo, NY

  • Published: 01 October 2013

Systematic observational methods require clearly defined codes and structured sampling and recording procedures, and they are subject to rigorous psychometric analysis. We review best practices in each of these areas with attention to the application of these methods for addressing empirical questions that quantitative researchers may posit. Special focus is placed on the selection of appropriate observational methods and coding systems as well as on the analysis of reliability and validity. The use of technology to facilitate the collection and analysis of observational data is discussed. Ethical considerations and future directions are raised.

Introduction

Systematic observational methods have been a common technique employed by psychologists studying human and animal behavior since the inception of our field, and yet best practices for the use of observational instruments (see Table 15.1) are often not known or adopted by researchers in our field. As such, the quality of observational research varies widely, and thus, it is our goal in the present chapter to review and explicitly define the standards of practice for this important methodological tool in the psychological sciences. Bakeman and Gottman (1987) have previously defined observational methods to include the a priori use of operationally defined behavioral codes by observers who have achieved interobserver reliability. Importantly, the setting or context is not what defines a method as being systematic (Pellegrini, 2004). That is, systematic observations may be conducted in the laboratory, schools, workplace, or public spaces and coded live or via recordings/transcripts. Therefore, having clear definitions and sampling/recording rules as well as reliable codes delineates informal, unsystematic observation from systematic observation. We also distinguish systematic observation from the nonsystematic field notes and other data collection techniques that are often used in qualitative studies by ethologists and educational practitioners in naturalistic contexts, and we review and analyze only systematic observational methods (Pellegrini, Ostrov, Roseth, Solberg, & Dupuis, in press).

Nonsystematic sampling techniques such as ad libitum (i.e., ad lib) sampling, in which there are no a priori systematic sampling or recording rules, are often used by researchers as a part of pilot testing and help to inform the development of systematic observational coding systems (Pellegrini, 2004). Thus, ad lib sampling approaches are important for understanding the context and nature of the behaviors under study, but they will not be discussed further in this review. Observational methods may be used in a variety of designs, from correlational and quasi-experimental to experimental and even randomized trial designs (Bakeman & Gnisci, 2006). However, it is more typical to find systematic observational methods used outside the laboratory to maximize ecological validity, and thus they are less likely to be used as part of experimental manipulations (Bakeman & Gnisci, 2006). The current review will be relevant to all research designs, with a focus on those methods that are well designed for quantitative data analysis.

History of Observational Methods

Systematic observational methods have been used extensively by psychologists throughout the history of our field to examine various empirical questions (see Langfeld, 1913). One of the first documented cases of systematic observational methods in the extant literature was a study by Goodenough (1930), which was part of an increasing trend in the systematic study of young children as part of the Child Welfare Movement in the United States, supported by the National Research Council (for review, see Arrington, 1943). In fact, her seminal work was also one of the first studies in psychology to be published using time sampling (see Sampling section below) observational procedures (Arrington, 1943). In her classic work (appearing in the first issue of Child Development), Florence L. Goodenough reported on several observational studies conducted in her laboratory at the Institute of Child Welfare (now Institute of Child Development) of the University of Minnesota. This study highlights several best practices that are still endorsed today. For example, careful pilot testing of the observational codes was conducted, and revisions were made to generate mutually exclusive codes (see Coding section below) and reliable distinctions between the categories. In addition, observations of each child’s physical activity were conducted only once per day and only by one observer at a time so that observations of behavior were conducted independent of one another. Goodenough (1930) carefully defined the a priori categories or observational codes and demonstrated interobserver reliability for each of these codes. Finally, Goodenough (1930) described the justification for her observational procedures and discussed alternative techniques (e.g., the optimum duration for an interval within a time-sampling procedure).

There are other well-known examples of systematic observation conducted by contemporaries of Goodenough, including Parten’s (1932) study of young children’s play behavior, which also illustrate best practices (e.g., clearly defined, mutually exclusive observational codes; rules designed to maintain independence of sampling and decrease observer error). Some of the earliest observational studies focused on either children or non-human animals (e.g., Crawford, 1942), as other techniques for studying behavior (and often social domains of study) were either not as well suited for the research questions or not available at the time. Today, systematic observational methods are used in research and applied settings (Pellegrini, 2001) and are relevant for training in all domains and subdisciplines of the social and behavioral sciences (Krehbiel & Lewis, 1994).

Sampling and Recording Rules

Systematic observational systems follow various sampling and recording rules that are designed for different contexts and research questions. The following section includes a review of the central sampling and recording rules that quantitative scholars would use for conducting systematic observations ( see Table 15.2 for a summary of the strengths and weaknesses of each approach). Recently adopted best practices for direct systematic observation are relevant for each of these types of observational methods, and they are briefly reviewed here. These practices, which were first introduced by Hintze, Volpe, and Shapiro (2002) , include (1) the observational system is designed to measure well-defined behaviors; (2) the behaviors are operationally defined a priori ; (3) observations are recorded using objective, standardized (i.e., manualized training protocols) sampling procedures and recording rules; (4) the context and timing of sampling is explicitly determined; and (5) scoring and coding of data are conducted in a standardized fashion ( see Leff & Lakin, 2005 , p. 476).

Time Sampling

Time sampling is a time-dependent observational procedure in which the researcher a priori divides the behavior stream into discrete intervals and scores each interval for the presence or absence of the behavior in question. That is, the time interval is the unit coded (Bakeman & Gottman, 1987). Time sampling procedures may be conceptualized as either 0/1 (i.e., absent/present or nonoccurrence/occurrence) or continuous in nature. A time sampling procedure is an efficient method of sampling, as multiple data points may be collected from a single participant in a short period of time. Time sampling is well suited for measuring rather discrete behaviors, such as overt behaviors (e.g., on-task and off-task behavior in classrooms), or behaviors that occur frequently. For example, a recent study of the frequency of various behaviors (e.g., off-task behavior, noncompliance) during several naturalistic activities in 30 children with various psychiatric diagnoses used a reliable 0/1 time sampling approach with a 15-second interval (Quake-Rapp, Miller, Ananthan, & Chiu, 2008). Conversely, time sampling is not well suited for infrequently occurring events or events that are long in duration (Slee, 1987). A clear advantage is that time sampling is relatively inexpensive because it is an efficient use of the research assistant’s time (Bakeman & Gottman, 1987). Further, 0/1 sampling is also easier for the observer than alternatives such as instantaneous sampling, in which the research assistant notes whether the behavior is present at a precise moment in time rather than whether it occurs during a larger interval of time. A major disadvantage of the time sampling approach is that the researcher delineates the particular time interval and therefore arbitrarily categorizes the behavior into discrete artificial units of time that may or may not be meaningful (Slee, 1987).
Moreover, some behaviors may exceed the often brief interval of time that is selected for the sampling. Thus, it is crucial to carefully justify the interval that is selected. The intervals are often brief and the behaviors in question should be readily apparent and easily observable by trained research assistants. If frequency estimates are to be obtained, then the interval in question needs to be sufficiently brief so that an accurate assessment can be made. That is, typically with an interval approach, a maximum of one behavior is recorded during an interval even if the behavior independently occurs more frequently during this interval ( Slee, 1987 ). Thus, special attention needs to be given to the pilot testing of the observational scheme and various durations of the interval if frequency assessments are desired.
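The effect of interval length on frequency estimates can be illustrated with a minimal sketch. The onset times below are hypothetical, and the function name `zero_one_sample` is our own label rather than part of any published observational system; the sketch simply scores each interval 1 if at least one behavior began in it and 0 otherwise.

```python
def zero_one_sample(event_times, session_length, interval):
    """Score each interval 1 if any behavior onset falls in it, else 0."""
    n_intervals = int(session_length // interval)
    scores = [0] * n_intervals
    for t in event_times:
        idx = int(t // interval)
        if 0 <= idx < n_intervals:
            scores[idx] = 1  # at most one point per interval, however many onsets
    return scores


# Hypothetical onset times (in seconds) of an off-task behavior in a 60-second session.
onsets = [2, 5, 11, 31, 33, 58]

coarse = zero_one_sample(onsets, session_length=60, interval=15)
fine = zero_one_sample(onsets, session_length=60, interval=1)

print("15-second intervals:", coarse, "-> scored", sum(coarse), "of", len(onsets), "behaviors")
print("1-second intervals: scored", sum(fine), "of", len(onsets), "behaviors")
```

With the 15-second interval, the six hypothetical behaviors collapse into three scored intervals, whereas the much briefer interval recovers all six. This is the reason pilot testing of several interval durations is advisable when frequency estimates are desired.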

Time sampling procedures are used in a range of settings and studies to test various empirical questions that often have applied significance. For example, Macintosh and Dissanayake (2006) adopted a 0/1 time sampling technique to assess spontaneous social interactions in school-aged children with high-functioning autism or Asperger’s disorder as well as typically developing children. Observations were conducted in the schoolyard. For each timed interval of 30 seconds, one type of behavior (e.g., parallel play) from a particular behavioral domain (e.g., social participation) was coded. For reliability purposes, a second observer made independent ratings for 20% of the entire sample. Intraclass correlation reliability coefficients were all acceptable for each type of behavior (0.78–0.99) with the exception of nonverbal interaction (i.e., gestures; 0.58), which are often difficult to reliably assess in live settings ( see also   Ostrov & Keating, 2004 ). Results meaningfully distinguished between the typically developing children and the clinical groups and revealed few differences between the two clinical groups, supporting the use of time sampling as a means to discriminate between clinical and nonclinical groups ( Macintosh & Dissanayake, 2006 ). Time sampling procedures have several other applications and clinical considerations. For example, time sampling methods may differentially affect how treatment effects are interpreted ( Meany-Daboul, Roscoe, Bourret, & Ahearn, 2007 ) and may be appropriate for classroom-based research that tests adherence to educational policies intended to aid students with special needs ( Jackson & Neel, 2006 ; Soukup, Wehmeyer, Bashinski, & Boyaird, 2007 ).

Event Sampling

Event-based sampling is also known as behavior sampling and permits a researcher to study the frequency, duration, latency, and intensity of the behavior under study ( Pellegrini, 2004 ). Essentially, unlike time sampling, event sampling is a type of observational sampling in which the events are time-independent and the behavior is the unit of analysis ( Bakeman & Gottman, 1987 ). Event sampling allows the behavior to remain as part of the naturally occurring phenomenon and may unfold in a manner generally consistent with the timing of the behavior in the natural setting. This type of sampling also can be efficient in terms of the total amount of time needed for observations. Unlike other sampling techniques (e.g., time sampling), a third advantage is that event sampling may be used when the construct under study is either frequently or infrequently occurring ( Slee, 1987 ). There are some clear disadvantages to event-based sampling procedures, and this may be a reason that it is less commonly seen in the literature. First, it is sometimes challenging to delineate the independence of events—that is, the researcher must specify when one event ends and the next event begins. Second, event sampling does not lend itself well to coding of dyadic interactions such as parent–child or romantic partner relations in which there is a fair amount of interdependence between the participants ( Slee, 1987 ).
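As a rough illustration of the measures that event sampling makes available, the following sketch summarizes a set of recorded events. The `Event` record, the 1 to 3 intensity scale, and the definition of latency as the time from the start of the observation to the first onset are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Event:
    onset: float     # seconds from the start of the observation
    offset: float    # seconds from the start of the observation
    intensity: int   # hypothetical rating on a 1-3 scale


def summarize(events):
    """Return frequency, duration, latency, and intensity summaries for one participant."""
    if not events:
        return {"frequency": 0, "total_duration": 0.0, "mean_duration": None,
                "latency_to_first": None, "mean_intensity": None}
    ordered = sorted(events, key=lambda e: e.onset)
    durations = [e.offset - e.onset for e in ordered]
    return {
        "frequency": len(ordered),
        "total_duration": sum(durations),
        "mean_duration": sum(durations) / len(ordered),
        "latency_to_first": ordered[0].onset,  # assumed definition of latency
        "mean_intensity": sum(e.intensity for e in ordered) / len(ordered),
    }


# Hypothetical events recorded for one participant.
events = [Event(40, 52, 2), Event(12, 15, 1), Event(90, 120, 3)]
print(summarize(events))
```

Note that the sketch presupposes that the observer has already decided where each event begins and ends, which is exactly the delineation problem described above.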

Event sampling also has wide applicability and has even been used to understand the propensity to violence at sporting events. For example, Bowker et al. (2009) used an event-sampling approach to examine spectator comments at youth hockey games in a large Canadian city. A group of five observers attended 69 hockey games played by youth in two age groups: 11–12 years and 13–14 years. Verbal comments were coded as positive, negative, corrective, or neutral and rated for intensity. Most of the comments elicited by spectators were positively toned. The valence of spectator comments was influenced by gender (i.e., the gender of the children playing) and the purpose for which the game was being played (i.e., competitive or recreational). These results support the utility of event sampling at social and athletic events, where particular behaviors are likely to occur during a finite period of time. Time sampling may not be appropriate in such circumstances because of the presence of a high concentration of individuals in a single setting and many potential interruptions arising from the nature of the activity.

Participant Observation

Although participant observation has been more frequently used with nonsystematic field observation and in disciplines that focus on qualitative methods, it is possible to conduct systematic participant observation as part of quantitative studies. Systematic participant observation has been the method of choice for behaviors of interest that require “an insider’s perspective” (Pellegrini, 2004, p. 288) or for contexts in which the sampling period may be long and informal. Moreover, this method is well suited for the use of more global observational ratings that sample events. This procedure has wide applicability, and participant observation has an extensive history of successful use from studies of children with behavioral problems at summer camps in clinical psychology (e.g., Newcomb, 1931; Pelham et al., 2000) to worker stress in organizational psychology (e.g., Länsisalmi, Peiró, & Kivimäki, 2000). For example, a recent study of children diagnosed with disruptive behavioral disorders and enrolled in a summer treatment program used staff counselors to complete daily participant observations of social behaviors of the children while they engaged in various camp activities (Lopez-Williams et al., 2005). A second study of social competence among reunited adolescents ($M_{\text{age}} = 15.5$ years) who had attended a research-based summer camp when they were 10 years old revealed the predictive validity of participant observer (i.e., camp counselor) ratings of social skills (Englund, Levy, Hyson, & Sroufe, 2000). The validity of the participant observations of social competence when the participants were 10 years old was determined by revealing significant prospective correlations with a group-problem solving task that was videotaped and coded by two independent raters along several dimensions (e.g., self-confidence, agency, overall social competence) when the participants were 15 years old.
The results support the use of participant observations in studying the development and stability of complex, multifaceted constructs like social competence.

Focal Sampling

Focal person sampling involves selecting (typically at random from a roster of participants) one participant and observing the individual for a defined time period. For each sampling interval (ranges vary depending on the question of interest), the observer records all relevant behaviors of the focal person. As we have previously discussed ( see Pellegrini et al., in press), for studies of dyads or small groups, the sampling interval should be as long as the typical interaction or displayed behavior of interest. For example, in our work, we study the display of relational aggression (i.e., the use of the relationship as the means of harm via social exclusion, withdrawing friendship, spreading malicious rumors), and given the nature of these behaviors, we have found that an interval of 10 minutes is a reasonable interval for assessing the intent for harm as well as the subtle nature of these peer interactions ( Ostrov, 2008 ; Ostrov & Keating, 2004 ).

Focal sampling may technically use continuous (e.g., Fagot & Hagan, 1985 ; Laursen & Hartup, 1989 ), 0/1 (e.g., Hall & McGregor, 2000 ; Harrist & Bradley, 2003 ), or instantaneous recording rules ( see   Pellegrini, 2004 ). However, focal sampling often uses continuous recording procedures because it permits the simultaneous coding of various behaviors, sequences of behaviors, and interactions with multiple partners in a live setting (e.g., Arsenio & Lover, 1997 ; Keating & Heltman, 1994 ). For example, in our observational studies of relational aggression among young children, we always have used focal sampling with continuous recording given the somewhat covert nature of the behaviors we have targeted for observation, which require a longer period of direct assessment to decipher and appropriately record the behaviors ( Ostrov & Keating, 2004 ). Focal participant sampling is often conducted across multiple days and contexts to better capture the true nature of the behavior rather than any state-dependent artifacts. Given the amount of time and the continuous nature of the recordings, this technique permits the recording of behavior that is a close approximation to real-time recording, and a researcher may recreate the behavior of the focal participants with a high degree of accuracy (Pellegrini et al., in press). For example, we observe children in their naturally occurring play contexts on 8 separate days, and they are only ever observed once per day to maintain independence of the data. Thus, in our work, each participant is observed for 80 minutes (8 sessions at 10 minutes each session). More specifically, a study of 120 children resulted in more than 370 hours of observation across the two time-points of the short-term longitudinal study ( Ostrov, 2008 ). Therefore, time is a major cost of focal sampling because of the large number of independent observations typically conducted with this approach. 
Focal sampling may also be used with 0/1 or instantaneous sampling as recording procedures, but this is rarely done. As previously mentioned, both of these recording procedures require an a priori specified time interval, which is usually relatively brief (i.e., 1–10 seconds). Instantaneous recording is typically used only with scan sampling procedures ( see Scan Sampling section below). 0/1 time sampling is not usually used with focal sampling because we are often interested in assessing the true frequency of behaviors that may not be obtained with this procedure (i.e., an independent behavior could occur once or more than once during a set interval, but with 0/1 coding only one point is scored).
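The scheduling logic described above (random selection from a roster, one observation per participant per day, and a fixed session length) can be sketched as follows. The roster, the function names, and the use of a seeded random generator are assumptions for illustration and do not reproduce the procedures of any particular study.

```python
import random


def focal_schedule(roster, n_days, seed=0):
    """For each day, return a random observation order in which every
    participant appears exactly once (one focal session per day)."""
    rng = random.Random(seed)
    schedule = []
    for day in range(1, n_days + 1):
        order = list(roster)
        rng.shuffle(order)
        schedule.append((day, order))
    return schedule


def minutes_per_participant(n_sessions, session_minutes):
    """Total focal observation time accumulated by one participant."""
    return n_sessions * session_minutes


roster = ["child_01", "child_02", "child_03", "child_04"]  # hypothetical IDs
for day, order in focal_schedule(roster, n_days=8):
    print(day, order)

# 8 sessions of 10 minutes each yield 80 minutes per participant.
print(minutes_per_participant(8, 10))
```

Shuffling the roster anew each day is one simple way to avoid always observing the same child at the same point in the session; other randomization schemes would serve equally well.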

Despite the emphasis on the use of these methods for studying basic social behavior, focal sampling procedures may be used in a wide range of studies. It is common in the literature to find focal participant sampling studies on a range of social behavior topics: social dominance in children ( Keating & Heltman, 1994 ) and adults ( Ostrov & Collins, 2007 ), play behavior ( Pellegrini, 1989 ), emotion and aggression ( Arsenio & Lover, 1997 ), conflict ( Laursen & Hartup, 1989 ), and peer relations with young children and non-human primates (e.g., Hinde, Easton, & Meller, 1984 ; Silk, Cheney, & Seyfarth, 1996 ). However, there are many practical applications of focal participant sampling ( see   Leff & Lakin, 2005 ; Pellegrini, 2001 ). For example, applied studies have been conducted that have used these observational techniques for examining the adjustment of children with special needs in elementary schools ( Hall & McGregor, 2000 ), peer victimization in early adolescence ( Pellegrini & Bartini, 2000 ), and for testing the efficacy of randomized behavioral interventions (e.g., Harrist & Bradley, 2003 ; Ostrov et al., 2009 ).

Scan Sampling

Instantaneous or scan sampling is a more efficient observational procedure than focal sampling. Scan sampling exclusively relies on instantaneous recording rules ( Pellegrini, 2001 ). With this procedure the observer scans the entire observation field for a possible behavior or event for a particular period of time. If an event is noted during that scan, then it is recorded. Typically, a number of discrete scans occur across a number of days to maximize the independence of the data. A participant’s data is usually summed across the scans to yield a behavioral score for the construct of interest. A concern with this approach is that it may not accurately assess the true frequency of behaviors if spacing is not adequate between the scans ( Pellegrini, 2004 ). Moreover, given the typical approach in which scans are conducted on an entire reference group in their natural context, behaviors that are selected for this approach must be readily apparent, discrete, and overt behaviors that require typically only a few seconds to observe. In our own field, McNeilly-Choque, Hart, Robinson, Nelson, and Olsen (1996) conducted a study of young children’s aggressive behavior in which they used a random scan sampling method that yielded 100 five-second scans during a 5- to 7-week period, resulting in 8 minutes of total observation per participant ( McNeilly-Choque, Hart, Robinson, Nelson, & Olsen, 1996 ). Thus, this study demonstrated the feasibility and efficiency of systematic scan sampling observations of aggressive behavior on the playground.
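A minimal sketch of scan-sampling scoring follows, assuming each scan is stored as the set of participants seen displaying the behavior during that scan. The participant labels and function names are hypothetical. The second function reproduces the arithmetic behind the figure cited above: 100 scans of 5 seconds each amount to roughly 8 minutes of observation.

```python
from collections import Counter


def score_scans(scans):
    """Sum, per participant, the number of scans in which the behavior was noted.
    `scans` is a list of sets of participant IDs, one set per scan."""
    totals = Counter()
    for observed in scans:
        totals.update(observed)
    return totals


def total_observation_minutes(n_scans, seconds_per_scan):
    """Total observation time implied by a series of brief scans."""
    return n_scans * seconds_per_scan / 60


# Hypothetical results of four scans of a playground.
scans = [{"A", "B"}, set(), {"A"}, {"A", "C"}]
print(score_scans(scans))

print(round(total_observation_minutes(100, 5), 1), "minutes")
```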

Semi-Structured Observations

Analog tasks or semi-structured observations, involving controlled simulations or analog situations, are observational tasks designed to mimic naturalistic conditions. Semi-structured observational procedures are another observational paradigm well suited for low base rate events. The recording and coding procedures are often identical to the procedures an observer would use in a naturalistic setting; however, the context in which the behaviors emerge is different. Often analog tasks are completed in a laboratory or similarly controlled setting and are videotaped for subsequent coding by unaware observers. Thus, analog observational paradigms permit a great deal of experimental control/standardization of procedures, and with the use of videotapes, observers are able to objectively code the session using the same recording rules as permitted in other contexts. A clear advantage of these procedures is that they are efficient and require less cost and less time spent observing participants. A major disadvantage, if the study is not designed well, is a lack of ecological validity (i.e., degree to which the context in which the research is conducted parallels the real-life experience of the participants), which may result in poor generalizability of the findings. Moreover, a relatively small sampling of behavior does not provide for a true frequency of behavior or for a representative sample of behavior with many interaction partners (i.e., the researcher is not able to examine individual–partner interactions). Other researchers have addressed this concern by using a “round robin” approach in which each participant completes an analog session with several (or all) other members of the reference group, which may improve the validity of the approach but, of course, adds a great deal of time and expense (see Hawley & Little, 1999).

In our own research we have used a semi-structured observational paradigm to provide an efficient estimate of young children’s aggressive behavior. To this end, we created a brief (9-minute) analog situation to observe various aggressive and prosocial behaviors (i.e., within dyads or triads) in early childhood ( Ostrov & Keating, 2004 ; Ostrov, Woods, Jansen, Casas, & Crick, 2004 ). The procedures and a review of the psychometric findings are described extensively elsewhere (e.g., Ostrov & Godleski, 2007 ), but essentially, each assessment includes three trials of 3 minutes each. For each trial, the children are given the same developmentally appropriate picture to color (e.g., Winnie the Pooh). For triads, three crayons are placed on the table equidistant from all participants, and only one crayon is the functional instrument (e.g., orange crayon for Winnie the Pooh) and two are functionally useless white crayons. At the end of the trial, a new picture and new crayons are placed on the table. This procedure is designed to produce mild conflict among the children and was developed to permit the children to engage in a variety of behaviors: prosocial behavior (e.g., sharing the one functional crayon or breaking into pieces to share), relational aggression (e.g., telling the child they will not be their friend anymore unless they give them the crayon), and physical aggression (e.g., taking the crayon away from someone else). The analog task was designed to be developmentally appropriate and resemble everyday conflict interactions concerning limited resources that young children experience in their typical preschool classroom. Highly trained research assistants monitored the entire session and intervened if needed to guarantee the safety of all participants and reduce the likelihood of participant distress. 
Moreover, at the end of the session, the children were each individually given access to a full box of crayons to diminish any distress and they were praised for their performance ( see   Ostrov et al., 2004 ). This paradigm is thus designed to elicit the behavioral constructs of interest in a more controlled environment than free play yet ensures the ethical treatment of participants.

One way to demonstrate the ecological validity of semi-structured observations is to correlate behaviors observed in a semi-structured context with behaviors observed in a more naturalistic context. For example, Coie and Kupersmidt (1983) found that social status in experimentally contrived playgroups comprised of unfamiliar peers matched social status in the classroom, supporting the validity of a contrived playgroup paradigm for studying social development (see also Dodge, 1983). Similarly, our own brief semi-structured observational paradigm (i.e., coloring task) has been shown to significantly predict observational scores collected from concurrently assessed naturalistic (i.e., classroom and playground free play) focal child observations with continuous recording (r = 0.48) and to predict future (i.e., 12 months later) behavior in naturalistic contexts at moderate levels (see Ostrov et al., 2004).

Methods of Recording

Methods of recording (i.e., checklists, detailed records, or observation forms) vary widely, and the choice should be based on the type of recording procedures that a researcher adopts. For example, time sampling (i.e., 0/1) and instantaneous or scan sampling procedures are well suited for checklist forms in which the prescribed intervals simply receive a check or a precise code indicating the occurrence or absence of the behavior in question. However, focal participant sampling often requires observation forms that permit greater detail and several codes that are recorded either simultaneously or in close temporal proximity, and, as such, a form that includes the behaviors or events of interest with space for recording the behavior in detail may be needed (for example forms and templates, see Pellegrini et al., in press). A general concern here is that the more time an observer spends writing details about the behavior/event, the less attention is paid to the participants, and important details may be lost. Some observational procedures like time sampling provide the observer with a set period of time after the interval for recording behavior. In general, the easier the observation form is to complete, the less room there is for error. With that said, checklists often do not permit systematic reviews for accuracy of codes by the master trainer. For example, observers who are observing the same participant as part of a reliability check could both code a behavior as “PA” for physical aggression when in fact one research assistant observed a “hit” and the other observed a “kick,” which, depending on the observational system, may be different and might not warrant a positive match or agreement. Thus, depending on the coding scheme and intentions of the researcher, these codes may artificially match for reliability purposes when in fact the observers recorded closely related but discrete behaviors.
Finally, if observers record some written details about the event, they may inform subsequent decision rules concerning whether a recorded behavior from observer 1 matches or does not match observer 2 for reliability assessments.
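The “hit” versus “kick” example can be made concrete with a short sketch that compares two observers’ records at the checklist level and at the detailed level. The mapping from detailed behaviors to codes (with “RA” standing for relational aggression) and the records themselves are hypothetical, and the sketch assumes both observers recorded the same number of events in the same order.

```python
# Hypothetical mapping from detailed behaviors to broad checklist codes.
DETAIL_TO_CODE = {"hit": "PA", "kick": "PA", "exclude": "RA"}


def agreement(records_1, records_2, detailed):
    """Proportion of events on which two observers match, either on the
    detailed behavior (detailed=True) or only on the broad code."""
    matches = 0
    for a, b in zip(records_1, records_2):
        if detailed:
            matches += a == b
        else:
            matches += DETAIL_TO_CODE[a] == DETAIL_TO_CODE[b]
    return matches / len(records_1)


observer_1 = ["hit", "exclude", "kick"]
observer_2 = ["kick", "exclude", "kick"]

print("checklist level:", agreement(observer_1, observer_2, detailed=False))
print("detailed level: ", round(agreement(observer_1, observer_2, detailed=True), 2))
```

At the checklist level the two observers appear to agree on every event, whereas the written details reveal a mismatch on the first event. Whether that mismatch should count as a disagreement depends on the decision rules of the observational system.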

Coding Considerations

The development of a reliable coding scheme is crucial for appropriately capturing the behaviors in question and testing the experimenter’s a priori hypotheses ( Bakeman & Gottman, 1987 ). There are three types of coding categories that are often included in observational systems: physical description codes, consequence codes, and relational or environmental relations codes ( Pellegrini, 2004 ). Physical description is believed to be the most “objective” type of codes because these describe “muscle contraction” ( Pellegrini, 2004 , p. 108) and might, for example, be involved in recording a participant’s social dominance or submissiveness (e.g., direct eye contact, rigid posture, arms akimbo; see   Ostrov & Collins, 2007 ). The second type of codes is for those of consequence in which a constellation of behaviors are part of a single code if they lead to the same outcome ( Pellegrini, 2004 ). For example, if we were interested in studying social dominance, then we might code taking objects away from others that result in a submissive posture on the part of the nonfocal participant to be an indicator of social dominance ( Ostrov & Collins, 2007 ). The third type of codes includes categories in which participants are described in relation to the context in which they are observed ( Pellegrini, 2004 ). An example of a relational observational category would be a coding scheme that accounted for where and with whom an individual was socially dominant. In terms of costs and benefits, it is clear that physical description codes are often easier to train and therefore potentially more reliable. It is possible that consequence codes may be unreliable given a misunderstanding of the sequence of events ( Pellegrini, 2004 ). Relational codes involve the appropriate documentation of multiple factors and therefore create more possibilities of error (for discussion, see   Pellegrini, 2004 ; Bakeman & Gottman, 1987 ). 
Overall, the level of analysis from micro- to macro-coding schemes is important to consider and the most objective and reliable system for addressing a researcher’s particular research question should be adopted.

A second consideration is the determination of whether to use mutually exclusive and exhaustive codes. Mutually exclusive codes are used when a single behavior may be recorded under one and only one code. In our observational studies, our coding scheme includes mutually exclusive codes such that a single behavior may be coded as either physical aggression or relational aggression, but not both. Exhaustive coding schemes are designed such that for any given behavior of a theoretical construct, there is an appropriate code for that behavior. For example, in our work we have codes for physical, relational, verbal, or nonverbal aggression as well as aggression not otherwise specified. Thus, if we determine a behavior is an act of aggression, then it may be coded as one of our behaviors in our scheme. Often schemes include mutually exclusive and exhaustive codes because there are several benefits to this approach ( see   Bakeman & Gottman, 1987 ). Having mutually exclusive codes means that researchers are not violating assumptions of independence, which are often needed for parametric statistics. For example, if a single behavior may be coded as both physical and relational aggression, then that may violate our assumption that the data are independent and come from independent behavioral interactions ( Pellegrini, 2004 ). Having exhaustive codes also speaks to the content validity of a coding scheme. That is, if the overall construct appropriately measures all facets of that construct, then the behavior in question should be included in the observational system, and exhaustive schemes guarantee this occurrence. It is important to recall that the larger the coding scheme, the more taxing the observational procedures will be for observers and the greater the possibility of observer error.
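One way to audit a set of coded behaviors against these two properties is sketched below. The code labels follow the aggression categories mentioned above, but the data structure (a mapping from each behavior to the set of codes it received) and the function name are assumptions for illustration.

```python
# Code labels based on the categories named in the text; "aggression_nos"
# stands for aggression not otherwise specified.
CODES = {"physical", "relational", "verbal", "nonverbal", "aggression_nos"}


def check_coding(assignments):
    """Flag behaviors that violate mutual exclusivity or exhaustiveness.
    `assignments` maps a behavior ID to the set of codes it received."""
    problems = {}
    for behavior, codes in assignments.items():
        if not codes or codes - CODES:
            # No code, or a code outside the scheme: the scheme did not cover it.
            problems[behavior] = "not covered by scheme (exhaustiveness)"
        elif len(codes) > 1:
            problems[behavior] = "more than one code (mutual exclusivity)"
    return problems


# Hypothetical coded behaviors.
coded = {
    "b1": {"physical"},
    "b2": {"physical", "relational"},
    "b3": set(),
}
print(check_coding(coded))
```

A check of this kind is most useful during pilot testing, when the definitions of the codes can still be revised.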

Scoring of observational data is similar to the scoring of any quantitative data within the social and behavioral sciences, and it often depends on the convention within a particular field and the type of observational sampling and recording techniques that are adopted. For example, for focal participant sampling with continuous recording, frequency counts are often generated by summing each independently recorded behavior across the various sessions. In our own research, that would mean that an individual participant would get a score for each of the constructs (i.e., physical aggression, relational aggression, verbal aggression, etc.) by summing all the behaviors within a construct (e.g., all physical aggression behaviors) across all eight sessions ( Ostrov & Keating, 2004 ). If the number of sessions is different for each participant because of missing data, then it is often common practice to divide by the number of sessions completed to generate an average rate of behavior per session ( see   Crick, Ostrov, Burr et al., 2006 ). Occasionally it is apparent that an error was made in the original coding of behaviors. Best practices have not been established for addressing these concerns, but as long as these errors are not systematic, the adopted solutions are often not a concern. To avoid problems with potential scoring biases, the observers and coders should always be unaware of the participant’s condition and/or past history. In addition, whenever possible, observers and coders should be unaware of the study hypotheses.
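The scoring rule described above (summing behaviors across sessions and, when sessions are missing, dividing by the number of sessions completed) can be sketched as follows. The counts are hypothetical, and representing a missing session as `None` is an assumption of this example.

```python
def total_score(session_counts):
    """Sum the recorded behaviors across all completed sessions."""
    return sum(c for c in session_counts if c is not None)


def rate_per_session(session_counts):
    """Average number of behaviors per completed session.
    Missing sessions are represented as None and excluded."""
    completed = [c for c in session_counts if c is not None]
    if not completed:
        return None
    return sum(completed) / len(completed)


# Hypothetical physical aggression counts for one child across eight
# scheduled sessions, two of which were missed.
physical_aggression = [2, 0, 1, None, 3, 0, None, 0]

print(total_score(physical_aggression))
print(rate_per_session(physical_aggression))
```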

Psychometric Properties

Reliability.

Reliability is often conceptualized as consistency within or between individuals (i.e., intra-observer or inter-observer), within measures (internal consistency), or across time (i.e., test–retest). Arguably, for observational methods, the most important measure of consistency is inter-observer reliability, or the degree to which two sets of observations from two independent observers agree ( Stangor, 2011 ). In the present review, we will first address intra-observer reliability and then focus on the assessment of inter-observer reliability.

Intra-observer, or within-observer, reliability is defined as a situation in which two sets of observations by the same research assistant agree or are consistent. Essentially, intra-observer reliability is assessing how consistent a particular observer is when coding specific behaviors either between sessions (i.e., across time) or within a single session. As Pellegrini (2004) has discussed in more detail, we may conceptualize and test (e.g., Pearson’s Product-Moment Correlation Coefficient) intra-observer reliability in ways similar to test–retest reliability, and thus, intra-observer reliability is essentially the temporal stability of the observational measure for a given observer between testing sessions. We might desire to know the degree to which the observational score on a given behavioral construct for the same observer is stable across time to test for observer drift (a threat to the validity of the observational data), or the likelihood that observers are deviating from initial training procedures over time and modifying the definitions of the constructs under study ( Smith, 1986 ). Intra-observer reliability or consistency within an observer may also be conceptualized as the reliability of an observer’s scores within a single session, and in this case the test is analogous to assessments of internal consistency (e.g., Cronbach’s α ). As Pellegrini (2004) has stated, we assume an observer is first reliable or consistent in their scoring/recording by themselves prior to testing if they agree with an independent observer (i.e., inter-observer reliability).
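As a sketch of the test–retest approach to intra-observer reliability, the following computes Pearson’s product-moment correlation between the scores one observer assigned when coding the same videotaped sessions on two occasions. The scores are hypothetical, and the correlation is implemented directly from its textbook definition rather than taken from any statistical package.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical aggression scores one observer assigned to the same five
# videotaped sessions at the start of the study and again two months later.
first_coding = [3, 0, 5, 2, 1]
second_coding = [4, 0, 5, 1, 1]

print(round(pearson_r(first_coding, second_coding), 2))
```

A marked drop in this correlation over the course of a study would be one signal of possible observer drift, although it would not by itself indicate which definitions the observer had begun to apply differently.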

As mentioned, inter-observer reliability or consistency between observers is the gold standard for observational research. Essentially, inter-observer reliability involves comparing the independent codes of the observers with other trained observers. There are several ways to assess this psychometric property (see Pellegrini, 2004), but the key task is comparing agreement across all of the observers. An important best practice for inter-observer reliability procedures is to ensure that observers are sampling/recording the same behaviors independently. Independent coding may be conducted with the use of video and private coding sessions without discussion until all codes have been completed. Inter-observer reliability may be assessed live in the field if the observers take precautions to avoid conveying to their partner how (and, in some cases, when) they are recording the behavior in question. A second best practice is to assess for reliability across the study to help avoid various biases (e.g., observer drift) and coding/recording errors from corrupting the integrity of the data. That is, observers should be checked against a master coder at the start of the study just after training ends, and each observer should pass an a priori reliability threshold (e.g., Cohen’s κ > 0.70). Next, their observations should be compared against other independent reliable observers throughout the duration of the study, and the trainer should provide constructive feedback for any deviations from the training protocol. Finally, an important consideration is for what percentage of time inter-observer reliability will be checked. This percentage should be a function of the number of cases or possible events that will be recorded, but typically 15% to 30% of a randomly selected sample of the possible sessions is coded by more than one observer for assessing inter-observer reliability.
To avoid potential biases, a best practice is for each observer to conduct reliability observations with all other observers in a round-robin format.
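The scheduling advice above (a random 15% to 30% of sessions, double-coded by observer pairs in round-robin fashion) can be sketched in a few lines. This is only an illustration: the function name `plan_reliability_checks`, the observer labels, the session numbering, and the 20% default are assumptions made for the example, not part of any published protocol.

```python
import itertools
import random


def plan_reliability_checks(observers, session_ids, fraction=0.20, seed=0):
    """Pick a random subset of sessions and assign each to an observer pair.

    Hypothetical helper: every observer is paired with every other observer
    (round-robin), and the selected sessions are spread evenly across pairs.
    """
    if len(observers) < 2:
        raise ValueError("need at least two observers")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)  # fixed seed so the plan is reproducible
    sessions = list(session_ids)
    n_checked = max(1, round(len(sessions) * fraction))
    checked = rng.sample(sessions, n_checked)
    # All unordered pairs of observers, e.g. (A, B), (A, C), ...
    pairs = list(itertools.combinations(observers, 2))
    # Cycle through the pairs so no pair is over-represented.
    return {s: pairs[i % len(pairs)] for i, s in enumerate(checked)}


# Illustrative study: four observers, 60 sessions, 20% double-coded.
plan = plan_reliability_checks(["A", "B", "C", "D"], range(1, 61), fraction=0.20)
for session, pair in sorted(plan.items()):
    print(session, pair)
```

Fixing the random seed keeps the plan auditable; in practice the observers themselves should not know in advance which sessions will be checked.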

There are several ways to statistically measure inter-observer reliability. In the past, authors relied on zero-order correlations (Pearson’s r), but that problematic practice is seen less often in the recent literature. A second statistical method that is still reported in peer-reviewed journals is percent agreement. Percent agreement may be expressed as in Equation 1:

$$P_{obs} = \frac{N_A}{N_A + N_D} \tag{1}$$

where $P_{obs}$ is the proportion of agreement observed (multiplied by 100 when reported as a percentage), $N_A$ is the total number of agreements, and $N_D$ is the total number of disagreements. Percent agreement is not currently best practice, as it is influenced by the number of cases (i.e., it may be biased by relatively few cases) and because it is not compared against a standard threshold (Bakeman & Gottman, 1987). Finally, one of the central concerns with percent agreement (as well as Pearson’s r) as a measure of inter-observer reliability is that it does not control for chance agreement (Bakeman & Gottman, 1987).
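As a minimal sketch, percent agreement (agreements divided by agreements plus disagreements) can be computed from two observers’ interval-by-interval codes. The function name and the data are hypothetical and chosen to show why the statistic can mislead: the behavior is rare, one observer never records it, and agreement still looks high.

```python
def percent_agreement(codes_a, codes_b):
    """Proportion of intervals on which two observers recorded the same code."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("code lists must be non-empty and of equal length")
    n_agree = sum(a == b for a, b in zip(codes_a, codes_b))
    n_disagree = len(codes_a) - n_agree
    return n_agree / (n_agree + n_disagree)


# Hypothetical data: 20 intervals, 1 = behavior occurred, 0 = did not occur.
observer_a = [1, 1] + [0] * 18   # records the behavior twice
observer_b = [0] * 20            # never records the behavior

p_obs = percent_agreement(observer_a, observer_b)
print(f"Percent agreement: {p_obs * 100:.0f}%")
```

Here the observers never agree on a single occurrence of the behavior, yet the statistic is high because nonoccurrence dominates the record, which is the chance-agreement problem described above.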

Cohen’s (1960)   κ is a preferred statistic for inter-observer reliability because it does control for chance agreements and is a more “stringent statistic,” allowing greater precision in assessing reliability at a specific moment in time or for particular events rather than overall summaries of association ( Bakeman & Gottman, 1987 , p. 836). Importantly, κ may only be used when coders use a categorical scale ( Bakeman & Gottman, 1987 ) and when a 2 x 2 matrix may be created to depict the proportion of agreements/disagreements for occurrences/nonoccurrences of behavior for any two observers ( Pellegrini, 2004 ). When calculating the rate of agreement, it is important to a priori indicate any time parameters (i.e., within what period of time must both observers note the occurrence of a behavior, also known as the tolerance interval). Some experts caution that extremely short tolerance intervals (e.g., 1 sec) may be overly stringent and artificially reduce the degree of agreement given typical reaction times of observers ( see   Bakeman & Gnisci, 2006 ). If time sampling is being used, then observers should be signaled by an external source (e.g., audible tone from an electronic device) to indicate when they should record the behavior ( see   Pellegrini, 2004 ). κ may be expressed in Equation 2 :

$$\kappa = \frac{P_{obs} - P_{exp}}{1 - P_{exp}} \tag{2}$$

where $P_{obs}$ is the proportion of agreement observed and $P_{exp}$ is the proportion of agreement expected by chance (Bakeman & Gnisci, 2006). Equation 2 indicates that agreement anticipated as a result of chance is subtracted from both the numerator and denominator; thus, κ provides the proportion of agreement corrected for chance agreements (Bakeman & Gnisci, 2006). The range for κ is from −1.00 to +1.00, with a value of 0 indicating that obtained agreement is equivalent to agreement anticipated by chance; greater-than-chance agreement yields positive values, with +1.00 equal to perfect agreement between the observers (Cohen, 1960). Interestingly, Cohen (1960) noted that negative values (less than 0) were rare and indicated agreement at less than chance levels. It is possible to test if κ is significantly different from 0, but statistical significance is often not used as a threshold for determining an “adequate” or “good” criterion (Bakeman & Gottman, 1987). Initially, Landis and Koch (1977) provided an index of the strength of agreement or “benchmarks” and reported the following standards: κ of < 0.00 was “poor,” 0.00–0.20 was “slight,” 0.21–0.40 was “fair,” 0.41–0.60 was “moderate,” 0.61–0.80 was “substantial,” and 0.81–1.00 was “almost perfect” (p. 165). However, Bakeman and Gottman (1987) reported that a significant κ of less than 0.70 may be a reason for concern. Other scholars have noted that the conservative nature of κ permits one to use a slightly lower threshold for adequate levels of reliability than the typical convention of 0.70 and suggest that a κ coefficient of 0.60 or higher is “acceptable” and 0.80 or above is considered “good” (Pellegrini, 2001).
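To make the chance correction concrete, the sketch below computes κ as (P_obs − P_exp) / (1 − P_exp), with P_exp derived from each observer’s marginal code frequencies (Cohen, 1960), and attaches the Landis and Koch (1977) label quoted above. The function names and both data sets are hypothetical.

```python
from collections import Counter


def cohens_kappa(codes_a, codes_b):
    """Chance-corrected agreement between two observers' categorical codes."""
    n = len(codes_a)
    if n == 0 or n != len(codes_b):
        raise ValueError("code lists must be non-empty and of equal length")
    p_obs = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    # Chance agreement: product of the two observers' marginal proportions,
    # summed over categories (a missing category counts as 0 in a Counter).
    p_exp = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    if p_exp == 1.0:
        raise ValueError("kappa is undefined when both observers use one code")
    return (p_obs - p_exp) / (1 - p_exp)


def landis_koch_label(kappa):
    """Benchmark label from Landis and Koch (1977)."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical 2 x 2 table over 100 intervals: both record the behavior (20),
# only A records it (10), only B records it (10), neither records it (60).
observer_a = [1] * 20 + [1] * 10 + [0] * 10 + [0] * 60
observer_b = [1] * 20 + [0] * 10 + [1] * 10 + [0] * 60
kappa = cohens_kappa(observer_a, observer_b)
print(f"kappa = {kappa:.2f} ({landis_koch_label(kappa)})")

# Rare behavior that one observer never records: 90% raw agreement,
# but no agreement beyond what chance alone would produce.
rare_kappa = cohens_kappa([1, 1] + [0] * 18, [0] * 20)
print(f"kappa = {rare_kappa:.2f}")
```

In the first data set raw agreement is 80%, but κ falls below the 0.60 and 0.70 thresholds discussed above, so these two observers would need further training under either convention.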

Under circumstances when a κ coefficient may not be calculated (e.g., when noncategorical data are used or quadrants of the aforementioned occurrence matrix are not available given the recording rules of the adopted observational procedure), scholars have suggested that an intraclass correlation coefficient (ICC) be computed between independent raters on the continuous data (Bartko, 1976; McGraw & Wong, 1996; Shrout & Fleiss, 1979). There are several possible ICC formulas that are beyond the scope of the present review, and the interested reader is referred to the prior literature on this topic (Shrout & Fleiss, 1979; McGraw & Wong, 1996). Intraclass correlation coefficients may be expressed as a function of either the reliability for a single rating (i.e., the reliability of a typical, single observer compared to another observer) or the average rating of the observations across all the raters (McGraw & Wong, 1996). The average rating ICC uses the Spearman-Brown correction to indicate the reliability for all the observers averaged together (Bartko, 1976). The absolute value of an ICC assessing average ratings will be greater than or equal to the ICC for a single rater (Bartko, 1976). Intraclass correlation coefficients may also be calculated as an index of “consistency” or as a measure of “absolute agreement.” Essentially, if systematic differences among observers are of interest, then the “absolute agreement” formula accounts for observer variability in the denominator of the ICC estimate, and this is not included for ICCs that measure “consistency” (for further detail, see McGraw & Wong, 1996). Intraclass correlation coefficients range from –1.00 to +1.00, where negative values indicate a lack of reliability and +1.00 would indicate perfect agreement (Bartko, 1976). An advantage to ICCs is that confidence intervals may be calculated (see McGraw & Wong, 1996).
Typically, acceptable levels of reliability for ICCs are similar to other criteria in the field, and as such, levels greater than or equal to 0.70 are considered “acceptable” (e.g., Ostrov, 2008 ; NICHD Early Child Care Research Network, 2004 ).
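As one illustration among the several ICC formulations, the sketch below computes the one-way random-effects ICC for a single rating and for the average of k ratings from between-target and within-target mean squares (the ICC(1) and ICC(k) forms in McGraw & Wong, 1996). The choice of the one-way model, the function name, and the ratings are assumptions for the example; a real study should select the ICC form that matches its design.

```python
def one_way_icc(ratings):
    """One-way random-effects ICC for single and averaged ratings.

    `ratings` is a list of rows, one row per observed target (e.g., child),
    each holding the k continuous scores given by k observers.
    """
    n = len(ratings)
    k = len(ratings[0])
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need >= 2 targets, each with the same k >= 2 ratings")
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    # Mean square between targets and mean square within targets.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum((x - m) ** 2
                    for row, m in zip(ratings, row_means)
                    for x in row) / (n * (k - 1))
    if ms_between == 0:
        raise ValueError("ICC is undefined when all target means are identical")
    icc_single = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
    icc_average = (ms_between - ms_within) / ms_between
    return icc_single, icc_average


# Hypothetical aggression scores for six children from two observers.
scores = [[9, 8], [6, 5], [8, 9], [7, 6], [10, 9], [6, 7]]
icc_single, icc_average = one_way_icc(scores)
print(f"single-rating ICC = {icc_single:.2f}, average-rating ICC = {icc_average:.2f}")
```

The average-rating value equals the Spearman-Brown correction applied to the single-rating value, k·ICC / (1 + (k − 1)·ICC), which is why it is at least as large whenever the single-rating ICC is nonnegative.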

In using observational research methods, an assessment of validity is as important as an assessment of reliability. Different types of validity should be considered to strengthen the inferences drawn from a particular method, with construct validity being most fundamental to any empirical inquiry. Construct validity is the degree to which a measure actually assesses the concept that a researcher intends to study (Stangor, 2011). Construct validity is often established through assessments designed to measure convergent and discriminant validity. Convergent validity rests on the assumption that if a construct is truly being measured, then alternative assessments of the same construct should be correlated with each other (Stangor, 2011). For example, an observational method intended to measure disruptive behaviors in the classroom should be correlated with teacher reports of disruptive behaviors. Alternatively, discriminant validity suggests that the construct being studied should not be correlated with other variables unrelated to the construct (Stangor, 2011). Should the expected convergent and discriminant associations not be observed, then it is unclear what an instrument or observational system is measuring.

Other types of validity that are secondary yet still important to the establishment of a psychometrically sound observational system include content validity and criterion validity. Content validity refers to the extent to which a measure adequately assesses the full breadth of the construct being studied (Stangor, 2011). For example, an observational study of children’s play behavior should code for different types of play, given that it is a diverse construct. To ensure that all facets of a construct are included in an observational system, correspondence with experts and focus groups/review panels may be used. Criterion validity involves an assessment of whether a study variable is associated with a theoretically relevant outcome measure. If observations are associated with an outcome that is measured at the same point in time at which observations are conducted, then concurrent validity is demonstrated. If observations are associated with an outcome that is measured at a future point in time, then predictive validity is demonstrated. For example, concurrent validity would be supported by associations between classroom observations of disruptive behavior and teacher reports of rejection by peers, and predictive validity would be supported by associations between classroom observations of disruptive behavior and future parent reports of academic performance.

Threats to Validity: Sources of Bias and Error

There are numerous biases to which observational methods are susceptible. A key bias is the aforementioned observer drift, and it is paramount that investigators monitor for this threat to the validity of the data by carefully assessing observational records and calculating reliability coefficients for the duration of the study. Importantly, in addition to the aforementioned discussion about intra-observer reliability, observer drift may also be indicated if there is a drop in inter-observer reliability between the training and data collection phases (Smith, 1986). A second strategy to mitigate observer drift is to regularly retrain observers. In instances where particular observers demonstrate problematic coding patterns, retraining should be individualized and should target the particular area of concern. In general, retraining is a practice that is beneficial for every observer because it reinforces proper coding procedures and observer behavior, thereby helping to protect the integrity of the study.
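One way to monitor for drift, sketched here under stated assumptions, is to recompute each observer’s κ against a master coder separately for each phase of the study and flag any phase that falls below the a priori threshold. The phase names, the 0.70 threshold, the helper `flag_drift`, and all codes below are hypothetical.

```python
from collections import Counter


def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa for two equal-length lists of categorical codes."""
    n = len(codes_a)
    if n == 0 or n != len(codes_b):
        raise ValueError("code lists must be non-empty and of equal length")
    p_obs = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    p_exp = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    if p_exp == 1.0:
        raise ValueError("kappa is undefined when both observers use one code")
    return (p_obs - p_exp) / (1 - p_exp)


def flag_drift(kappa_by_phase, threshold=0.70):
    """Return the phases whose kappa falls below the reliability threshold."""
    return [phase for phase, k in kappa_by_phase.items() if k < threshold]


# Hypothetical codes (1 = behavior occurred) for the same ten clips,
# scored once by the master coder and twice by one observer.
master = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
observer_by_phase = {
    "post-training": [1, 0, 1, 0, 1, 0, 1, 0, 1, 1],
    "late": [1, 0, 0, 0, 1, 1, 1, 0, 0, 0],
}

kappa_by_phase = {phase: cohens_kappa(master, codes)
                  for phase, codes in observer_by_phase.items()}
needs_retraining = flag_drift(kappa_by_phase)
print(kappa_by_phase, needs_retraining)
```

A flagged phase is a prompt to inspect the observer’s records and retrain on the specific codes in question, not proof of drift by itself, since a handful of clips gives an imprecise κ.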

A second type of distortion that must be considered results from participant reactivity, which is also a threat to the validity of the observational data. Reactivity occurs when the individuals under study alter their behavior because of the presence or influence of an observer. Consequently, the behavior observed does not provide a true representation of the construct being measured. If participants avoid a particular location within a setting or modify their behavior because they know they are being recorded, this is a major concern for the validity of the data ( Stangor, 2011 ). Depending on the nature of the study, reactivity may be more probable. For example, when observers need to remain within earshot of a focal participant to hear and see the behavioral interactions, it is crucial that the observers remain unobtrusive (e.g., Pellegrini, 1989 ). Researchers should explicitly address reactivity by training observers in the field to have a minimally responsive manner ( Pellegrini, 2004 ). Essentially, observers should use neutral facial expressions and control their nonverbal behavior, posture, movement, and reactions to events during live coding. It is also possible that participants may be reactive to cameras and other recording devices, and efforts should be made to habituate participants to this equipment ( see Use of Technology and Software section below) and monitor for this occurrence. Thus, this habituation process should occur prior to the actual collection of data ( Pellegrini, 2004 ). In our studies, we spend a minimum of several days in the observational environment (and will do so for as long as needed) simulating our observations, which provide the participants an opportunity to habituate to our presence and reduce reactivity prior to actual data collection. Therefore, regardless of live or videotaped coding, researchers should observe for participant reactivity and report the degree of reactivity in their studies (e.g., Atlas & Pepler, 1998 ). 
We define participant reactivity as any direct eye contact between the focal participant and observer, comments from the focal participant to the observer about our presence, or comments about our presence to others in the environment (Ostrov, 2008). Our training procedures and careful monitoring have resulted in relatively low levels of reactivity in several studies (e.g., 1.5–2.5 times per focal participant during 80 min of observation; Crick, Ostrov, Burr, et al., 2006).

Observer expectancy effects are a third bias (Hartmann & Pelzel, 2005). These effects occur when observers form expectations about the nature of the data based on their knowledge or assumptions about the study goals and hypotheses. For this reason, best practice is to use unaware observers when possible and, at a minimum, to use unaware observers for reliability purposes.

A final source of bias that we will discuss is gender bias, as this is a well-documented concern with observational methods (Ostrov, Crick, & Keating, 2005). Past research has documented that untrained observers maintain gender biases when observing, for example, physical aggression (Lyons & Serbin, 1986; see also Condry & Ross, 1985; Susser & Keating, 1990). That is, men tend to rate boys as more physically aggressive than girls, even when boys and girls are displaying comparable levels of aggression (Lyons & Serbin, 1986). Moreover, male and female college students have shown gender biases based on knowledge about the gender of young children in past experimental studies (Gurwitz & Dodge, 1975). Finally, in our own research, we have documented that male college students are less likely to correctly identify relational aggression or prosocial behavior than their female peers (Ostrov et al., 2005). Please note that although the examples were related to our field of study (i.e., aggression), gender biases may be present for a variety of topics of study. Importantly, it may be that when individuals are trained to recognize potential biases, they are more likely to be objective in their coding of behavior (Lyons & Serbin, 1986).

Use of Technology and Software

Excellent detailed reviews of computer-assisted recording devices and observational software programs are available ( see   Hoch & Symons, 2004 ), and thus, the present goal of this section is to briefly review the current state of technology and software for assisting in systematic observations in the laboratory and field. The following will include a review of the three most common observational software programs as well as the use of handheld devices and remote audiovisual equipment. The commercially available programs vary widely in function and cost, but most permit the observer to define a coding scheme and corresponding letter or number codes that observers can quickly use when making observations live or when coding digital media in the laboratory. Overall, advances in technology have made observational methods more efficient (e.g., flexible data reduction procedures and automatic statistical analyses), accurate (i.e., automatic rewind and playback functions reduce errors in coding), and applicable to a wider range of settings and topics of study ( Bakeman & Gnisci, 2006 , p. 140).

The first software program and associated computer-assisted recording devices that we will discuss is the Observer® system by Noldus Inc. (Noldus, Trienes, Hendriksen, Jansen, & Jansen, 2000). The current version is Observer XT, which permits both time sampling and continuous event-based observational systems and has been used in both human and animal research (see http://www.noldus.com/the-observer-xt/observer-xt-research ). A notable feature is that this software permits an assessment of response latency, that is, the time between the onset of a stimulus and the initiation of the response, which facilitates consequence coding (see Coding Considerations section above). The software also permits the linking of data from multiple modalities (e.g., observational reports, physiological responses) with a continuous time synch. The software may be used in the field with durable handheld devices or in the laboratory with live streaming video linked directly with the coding program (Noldus et al., 2000). Finally, the new version of the software permits searches of the data for particular comments, events, or behaviors, and data may be exported to various statistical software packages (Noldus et al., 2000). Jonge, Kemner, Naber, and van Engeland (2009) used an earlier version of the Observer software to code data from a study on block design reconstruction in children with autism spectrum disorders and a group of comparison participants. The use of the videotaped sessions and later coding by unaware observers meant that the coders using the software were unaware of the child’s group status. The software permitted the coders to record the amount of time the children took to reconstruct the block design pattern as well as a range of errors (Jonge et al., 2009). The program was used to calculate Cohen’s κ based on two independent coders (Jonge et al., 2009), who could make independent evaluations of the behavior without biasing their coding partner.

The second observational software program that we examine is the Multi-Option Observation System for Experimental Studies (MOOSES; Tapp, Wehby, & Ellis, 1995) and the associated Procoder for Digital Video (PCDV; Tapp & Walden, 1993), which permits viewing and coding of digital media (see http://mooses.vueinnovations.com/overview ). The MOOSES and PCDV programs also permit event and time sampling as well as the coding of real-time digital media files or verbatim transcripts of observational sessions (Tapp & Walden, 1993; Tapp et al., 1995). In fact, data files may be exported to MOOSES for event coding or to another format known as the Systematic Analysis of Language Transcripts (SALT) for transcription data coding. MOOSES automatically timestamps events and may provide frequency and duration codes as well as basic reliability statistics (e.g., Cohen’s κ), and MOOSES is designed for sequential analysis (Tapp et al., 1995). A handheld version of MOOSES is available. MOOSES/PCDV has been described as a lower cost alternative to The Observer (Hoch & Symons, 2004).

The third system we review is the Behavior Evaluation Strategies and Taxonomies (BEST; Sharpe & Koperwas, 2003). This computer system includes both the BEST Collection for capturing digital media files and the BEST Analysis program for both qualitative and quantitative analysis of the observational data (Sidener, Shabani, & Carr, 2004). The BEST program may be used for examining the frequency or duration of events, and sophisticated sequential analysis may be conducted. Much like the more expensive alternatives, this program will calculate reliability statistics (e.g., Cohen’s κ) and will summarize data in table or various graph formats. A review of this program suggests that BEST does not handle the collection of interval-based data well, but the BEST Analysis program will allow a researcher to analyze this type of observational data (Sidener et al., 2004). A new platform permits video display for captured data from video files, and although the program was initially written for Windows®, there are inexpensive Apple® iPhone® and iPod Touch® applications available for data collection (see http://www.skware.com ).

Various types of technology (e.g., audio and video recordings) have an extensive history in the field and laboratory to assist researchers in better capturing verbal and nonverbal interactions (e.g., Abramovitch, Corter, Pepler, & Stanhope, 1986; Stauffacher & DeHart, 2005). Remote audiovisual recordings provided an opportunity to combine the benefits of both audio and video recording while also reducing reactivity to typical recording devices when participants were observed in naturally occurring settings (Asher & Gabriel, 1993; Atlas & Pepler, 1998; Pellegrini, 2004; Pepler & Craig, 1995; Pepler, Craig, & Roberts, 1998). That is, videotaping with a telephoto zoom lens from an unobtrusive location in the natural setting and recording audio via a system of wireless microphones provides an externally valid way to record behavior and a time-synched verbal record of the interaction (Pepler & Craig, 1995). Thus, remote audiovisual observational recordings provide all the benefits of having a video for subsequent coding by unaware observers (i.e., the ability to pause, rewind, and analyze subtle nonverbal behaviors) as well as a complete verbal transcript, which helps to put the video data in proper context (Asher & Gabriel, 1993; Pepler & Craig, 1995). Wireless microphones typically are housed within small vests or waist pouches that participants wear, and often only the focal participant has an active or live microphone, and others in the reference group have “dummy” microphones that resemble the weight and look of the real microphone. Importantly, observational codes made with the remote audiovisual equipment have demonstrated acceptable inter-observer reliability coefficients (e.g., κ = 0.76; Pepler & Craig, 1995). Moreover, this procedure as well as sufficient exposure to the equipment by the participants has been found to produce low levels of participant reactivity (e.g., <5%, Atlas & Pepler, 1998; see also Asher & Gabriel, 1993).
The benefits of a rich observational record with low levels of reactivity within settings of high ecological validity seem to outweigh the costs, which include additional training, equipment costs, and some ethical considerations. A central ethical consideration is that individuals without consent may be recorded indirectly. A possible solution is to temporarily store and then, after processing, discard film clips of individuals without consent ( Pepler & Craig, 1995 ), but this solution may violate the rights of nonparticipants. Alternatively, a researcher could restrict access to the observational setting to only those with consent, but this second approach is a threat to the ecological validity of the procedures ( Pepler & Craig, 1995 ). An additional concern is that third parties may wish to use the data as surveillance, which might limit the rights of participants being recorded. As such, policies related to confidentiality and any possible limits of confidentiality should be discussed with the participants and any other possible party that may desire access to the data ( see   Pepler & Craig, 1995 ). Importantly, to our knowledge, remote audiovisual observational methodology has only been used with school-aged children in the classroom ( Atlas & Pepler, 1998 ) and typically on the playground (e.g., Asher & Gabriel, 1993 ; Pepler, Craig, & Roberts, 1998 ); thus, it is not clear if older individuals would be more aware and reactive to the procedure and equipment ( Pepler & Craig, 1995 ).

Ethical Considerations

There are several ethical considerations with observational research. With naturally occurring phenomena, there may be a temptation to observe social interactions and behavior without obtaining informed consent. Although this practice may technically be exempt from most Institutional Review Board (IRB) review (i.e., if identifying information is not collected and video or audio recordings of the public behavior are not made), we strongly encourage researchers to obtain informed consent from participants and assent from legal minors to support their right to autonomy but also so that all risks (e.g., breaches of confidentiality) may be appropriately conveyed. To avoid these breaches of confidentiality, researchers conducting live observations typically use identification codes rather than identifying information about the participants on all observation forms and in data files. Access to video or audio recordings of observational sessions is typically restricted to only those individuals (e.g., coders) who must have access as part of the research study. Participants should be fully informed of how long the observational recordings will be maintained and when they will be destroyed. A final ethical consideration concerns intervention efforts, or at what point the researcher or observers will intervene (for a discussion of duty to warn with observational methods, see Pepler & Craig, 1995) and directly or indirectly act on behalf of the participants. For example, in our observational studies, we have clearly established procedures for when we will notify a teacher that a child in the observation setting is in danger or in need of help (e.g., leaving the controlled area, serious injury). These procedures are discussed at the start of the study with school officials and are part of our consent process, which we believe is a best practice.

An Overview of Procedures for a High-Quality Systematic Observational Study

The researcher begins by a priori selecting and operationally defining behaviors of interest. Next, the researcher adopts a coding scheme by selecting the most appropriate sampling and recording procedures given the nature of the behavior under study and the observational context ( see Table 15.2 ). Ethical considerations should be addressed during this development stage of the observational method and should be evaluated for the duration of the study. If the observational scheme is newly developed for the study, then it is imperative that pilot testing occur within a similar context and with a sample representing the target population. If it is not a new scheme or if pilot testing does not indicate any problems, then the investigator may begin training observers. If there are problems noted, then it is important to rectify these issues as quickly as possible to avoid further errors in the study. It is possible that modifications will be needed regarding the operational definition of the observed constructs or changes may be needed to the procedures and coding scheme given the nature of the context or sample under study. Once these changes are adopted, additional checks should be made to verify the solution has worked to ameliorate the original concerns. Training involves the use of a standardized manual, and initial reliability training assessments are conducted prior to the collection of data. Behavior is sampled in the lab or in the field in accordance with the adopted sampling and recording rules, and inter-observer reliability is collected for the duration of the study. Validity assessments are also conducted using alternative informants and methods. If reliability or validity problems are detected, then this may also yield further modifications to the coding scheme to address the problems. If no psychometric problems are noted, then coding and scoring of the observational data occurs using standardized procedures. 
Finally, the data are analyzed and reported, which concludes the systematic observational study ( see Fig. 15.1 ).

Systematic observational methods provide an opportunity to record the behavior of humans and animals in a relatively objective manner, without sacrificing ecological validity. In the present chapter, we have attempted to identify best practices as well as benefits and costs of various sampling and recording techniques. Quantitative researchers should be guided by a priori research questions and hypotheses when selecting the most appropriate sampling and recording procedure for the specific research setting. Systematic observations require careful attention to coding and scoring decisions and a focus on achieving acceptable levels of reliability and validity. As a field, we must work to establish more stringent standards of reliability (i.e., inter-observer) and validity (i.e., construct) for observational methods. Moreover, we must continue to address and reduce various sources of bias and error. The use of computer-assisted software and digital analysis technology provides some promising options for increasing the efficiency and appeal of systematic observations in the field. Attention must also be given to key ethical considerations to guide appropriate conduct as an observational researcher. Careful consideration of these issues may inform quality research in a wide variety of basic, clinical, and educational contexts.

Figure 15.1. Procedures for a high-quality systematic observational study.

Future Directions

Observational methods have been a part of the social and behavioral sciences since the early years of our field, and we anticipate that there is a bright future for observational methods within the quantitative scholar’s toolbox. We have defined seven questions and two remaining issues that we believe the field should work to address. This list is not exhaustive, but we hope these questions will generate future work using systematic observational methods.

1. What is the utility of observational methods above and beyond additional informants? Given the time and cost of observational methods, it is necessary to continue to demonstrate that observational methods have incremental predictive utility or may explain unique amounts of variance in relevant outcomes, above and beyond other informants and measures ( Doctoroff & Arnold, 2004 ; Shaw et al., 1998 ). For example, we have demonstrated that observations of relational and physical aggression account for a significant amount of unique variance above and beyond teacher reports of relational and physical aggression in the prediction of teacher-reported deceptive and lying behaviors ( Ostrov, Ries, Stauffacher, Godleski, & Mullins, 2008 ).

2. How does one best examine the construct validity of observational methods? To date, there is not wide consensus on the best approach for demonstrating the construct validity of observational systems. The typical approach is to compare observational data to other “gold standard” methods. For example, convergent evidence is achieved when high levels of association are found across methods such as between observations of aggression subtypes in classrooms, observations of aggression subtypes via semi-structured observations, and with various informants including teacher reports and parent reports of aggression subtypes (e.g., Crick, Ostrov, Burr, et al., 2006 ; Hinde et al., 1984 ; Ostrov & Bishop, 2008 ; Ostrov & Keating, 2004 ; Pellegrini & Bartini, 2000 ).

3. How do we detect observer biases? We believe the field has only begun to address the important issue of how to assess and identify observer biases. Much further work is needed to examine a host of possible biases from observer drift and observer expectancy effects to gender biases as well as other possible sources of distortion such as halo effects and potential expectancy biases derived from prior knowledge of participants in longitudinal studies ( Hartmann & Pelzel, 2005 ). In addition, more focus should be placed on assessing participant reactivity. Few studies report this source of error and threat to validity, and we encourage observational researchers to quantify the degree to which their participants are reactive to the observational procedures.

4. How do we eliminate observer biases and other sources of error? Once we identify observer biases, we need more evidence-based information on how to appropriately eliminate these biases and sources of error. The literature has indicated few possible solutions (e.g., increased training for individuals with identified biases). In addition, more emphasis should be placed on identifying best practices for reducing reactivity. It is clear that minimally responsive procedures and habituation practices have worked effectively to reduce reactivity to low levels (e.g., <5% of time), but our goal should be to eliminate this source of error from our data.

5. What is the sufficient amount of time for observational sampling? Too often the time interval for time sampling as well as the total duration of observed time for event-based coding systems is decided without sufficient justification, and greater work is needed to establish parameters and strategies for determining the most efficient and useful time intervals for various behaviors and settings.

6. How do we reduce the cost of observational methods? One of the biggest obstacles to greater adoption of systematic observational methods is the cost of observational procedures. Typically, large staffs of highly trained individuals are needed for observational work, and although volunteer research assistants may be used to address this concern, this is still a significant barrier to further work in this area. Moreover, the overall amount of time to conduct an observational study is potentially longer than comparable studies with other methods, and thus we must work to make training procedures, data collection, and coding processes more efficient. The use of computer-assisted software and coding technology will continue to greatly help in this regard.

7. How do we refine and create observational software so that it is compatible with all types of observational systems and more flexible as well as affordable? Although observational software and recording devices have advanced a great deal in recent years (see Hoch & Symons, 2004), the software must become more flexible to accommodate a greater range of observational sampling and recording procedures. Moreover, the financial cost of these programs and licenses is often prohibitive, and efforts must be made to develop high-quality, affordable, and flexible computer-assisted observational software programs.

8. A key remaining issue is that as a field we need to move away from the use of Pearson product moment correlations and percent agreement as a standard measure of assessing inter-observer reliability. Given what we know about the role of chance agreement from classic (e.g., Cohen, 1960) and modern sources (Bakeman & Gottman, 1987; Pellegrini, 2004), it is not clear why some peer-reviewed manuscripts continue to only present either Pearson product moment correlations or percent agreement as strong evidence of inter-observer reliability.

9. A second remaining concern is that greater discussion of the ethical issues involved in observational methods is needed. For example, as we have discussed, it is not always clear when intervention is needed by observers in the field. Further, greater work needs to be conducted to examine how we may best ensure confidentiality of data with detailed observational records. Finally, we must focus on how we ensure confidentiality with the transfer of electronic observational data via handheld devices and other electronic technology.
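
The concern raised in point 8 about chance agreement can be illustrated with a small calculation. The sketch below compares raw percent agreement with Cohen's (1960) kappa for two observers. The function names and the interval codes are hypothetical and invented for illustration only; they are not data from any study cited in this chapter.

```python
from collections import Counter


def percent_agreement(codes_a, codes_b):
    """Proportion of intervals on which two observers assigned the same code."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both observers must code the same non-empty set of intervals")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa: observed agreement corrected for agreement expected by chance."""
    n = len(codes_a)
    p_observed = percent_agreement(codes_a, codes_b)
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    # Chance agreement: product of each observer's marginal proportions, summed over codes.
    p_chance = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical data: 20 intervals in which aggression is rare.
# Each observer codes "aggression" once, but never on the same interval.
observer_a = ["none"] * 18 + ["aggression", "none"]
observer_b = ["none"] * 18 + ["none", "aggression"]

agreement = percent_agreement(observer_a, observer_b)
kappa = cohens_kappa(observer_a, observer_b)
print(f"percent agreement = {agreement:.2f}, kappa = {kappa:.3f}")
```

In this invented case the observers agree on 90% of intervals, yet kappa is slightly below zero, because almost all of that agreement is what two observers would reach by chance when one code dominates. This is the kind of situation in which percent agreement alone may overstate inter-observer reliability.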

Author Note

We wish to thank Jennifer Kane and members of the UB Social Development Laboratory for their assistance with the preparation of this chapter. Thanks to Dr. Leonard J. Simms for comments on an earlier draft. Special thanks to Dr. Anthony D. Pellegrini, who has greatly influenced the way we conceptualize systematic observational methods. The authors are affiliated with the Department of Psychology, University at Buffalo, The State University of New York. Please direct correspondence to the first author at [email protected] or 716-645-3680.

Abramovitch, R., Corter, C., Pepler, D. J., & Stanhope, L. (1986). Sibling and peer interaction: A final follow-up and a comparison. Child Development, 57, 217–229.

Arrington, R. E. (1943). Time sampling in studies of social behavior: A critical review of techniques and results with research suggestions. Psychological Bulletin, 40, 81–124.

Arsenio, W. F., & Lover, A. (1997). Emotions, conflicts and aggression during preschoolers’ free play. British Journal of Developmental Psychology, 15, 531–542.

Asher, S. R., & Gabriel, S. W. (1993). Using a wireless transmission system to observe conversation and social interaction on the playground. In C. H. Hart (Ed.). Children on playgrounds: Research perspectives and applications (pp. 184–209). Albany, NY: SUNY Press.

Atlas, R. S., & Pepler, D. J. (1998). Observations of bullying in the classroom. Journal of Educational Research, 92, 86–99.

Bakeman, R., & Gnisci, A. (2006). Sequential observational methods. In M. Eid & E. Diener (Eds.). Handbook of multimethod measurement in psychology (pp. 127–140). Washington DC: American Psychological Association.

Bakeman, R., & Gottman, J. M. (1987). Applying observational methods: A systematic view. In J. D. Osofsky (Ed.). Handbook of infant development (2nd ed., pp. 818–854). New York: John Wiley.

Bartko, J. J. (1976). On various intraclass correlation reliability coefficients. Psychological Bulletin, 83, 762–765.

Bowker, A., Boekhoven, B., Nolan, A., Bauhaus, S., Glover, P., Powell, T., & Taylor, S. (2009). Naturalistic observations of spectator behavior at youth hockey games. Applied Research, 23, 301–316.

Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20, 37–46.

Coie, J. D., & Kupersmidt, J. B. (1983). A behavioral analysis of emerging social status in boys’ groups. Child Development, 54, 1400–1416.

Condry, J. C., & Ross, D. F. (1985). Sex and aggression: The influence of gender label on the perception of aggression in children. Child Development, 56, 225–233.

Crawford, M. P. (1942). Dominance and social behavior, for chimpanzees, in a non-competitive situation. Journal of Comparative Psychology, 33, 267–277.

Crick, N. R., Ostrov, J. M., Burr, J. E., Jansen-Yeh, E. A., Cullerton-Sen, C., & Ralston, P. (2006). A longitudinal study of relational and physical aggression in preschool. Journal of Applied Developmental Psychology, 27, 254–268.

Doctoroff, G. L., & Arnold, D. H. (2004). Parent-rated externalizing behavior in preschoolers: The predictive utility of structured interviews, teacher reports, and classroom observations. Journal of Clinical Child and Adolescent Psychology, 4, 813–818.

Dodge, K. A. (1983). Behavioral antecedents of peer social status. Child Development, 54, 1386–1399.

Englund, M. M., Levy, A. K., Hyson, D. M., & Sroufe, L. A. (2000). Adolescent social competence: Effectiveness in a group setting. Child Development, 71, 1049–1060.

Fagot, B. T., & Hagan, R. (1985). Aggression in toddlers: Responses to the assertive acts of boys and girls. Sex Roles, 12, 341–351.

Goodenough, F. L. (1930). Inter-relationships in the behavior of young children. Child Development, 1, 29–47.

Gurwitz, S. B., & Dodge, K. A. (1975). Adults’ evaluations of a child as a function of sex of adult and sex of child. Journal of Personality and Social Psychology, 32, 822–828.

Hall, L. J., & McGregor, J. A. (2000). A follow-up study of the peer relationships of children with disabilities in an inclusive school. The Journal of Special Education, 34, 114–126.

Harrist, A. W., & Bradley, K. D. (2003). You can’t say you can’t play: Intervening in the process of social exclusion in the kindergarten classroom. Early Childhood Research Quarterly, 18, 185–205.

Hartmann, D. P., & Pelzel, K. E. (2005). Design, measurement, and analysis in developmental research. In M. H. Bornstein & M. E. Lamb (Eds.). Developmental science: An advanced textbook (5th ed., pp. 103–184). Mahwah, NJ: Lawrence Erlbaum Associates.

Hawley, P. H., & Little, T. D. (1999). On winning some and losing some: A social relations approach to social dominance in toddlers. Merrill-Palmer Quarterly, 45, 185–214.

Hinde, R. A., Easton, D. F., & Meller, R. E. (1984). Teacher questionnaire compared with observational data on effects of sex and sibling status on preschool behavior. Journal of Child Psychology and Psychiatry, 25, 285–303.

Hintze, J. M., Volpe, R. J., & Shapiro, E. S. (2002). Best practices in the systematic direct observation of student behavior. In A. Thomas & J. Grimes (Eds.). Best practices in school psychology-IV (pp. 993–1006). Bethesda, MD: National Association of School Psychologists.

Hoch, J., & Symons, F. J. (2004). Computer-assisted recording and observational software programs. In A. D. Pellegrini’s Observing children in their natural worlds: A methodological primer (2nd ed., pp. 214–222). Mahwah, NJ: Lawrence Erlbaum Associates.

Jackson, H. G., & Neel, R. S. (2006). Observing mathematics: Do students with EBD have access to standards-based mathematics instruction? Education and Treatment of Children, 29, 593–614.

Jonge, M. de, Kemner, C., Naber, F., & Engeland, H. van (2009). Block design reconstruction skills: not a good candidate for an endophenotypic marker in autism research. European Child & Adolescent Psychiatry, 18, 197–205.

Keating, C. F., & Heltman, K. R. (1994). Dominance and deception in children and adults: Are leaders the best misleaders? Personality and Social Psychology Bulletin, 20, 312–321.

Krehbiel, D., & Lewis, P. T. (1994). An observational emphasis in undergraduate psychology laboratories. Teaching of Psychology, 21, 45–48.

Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33, 159–174.

Langfeld, H. S. (1913). Text-books and general treatises. Psychological Bulletin, 10, 25–32.

Länsisalmi, H., Peiró, J. M., & Kivimäki, M. (2000). Collective stress and coping in the context of organizational culture. European Journal of Work and Organizational Psychology, 9, 527–559.

Laursen, B., & Hartup, W. W. (1989). The dynamics of preschool children’s conflicts. Merrill-Palmer Quarterly, 35, 281–297.

Leff, S. S., & Lakin, R. (2005). Playground-based observational systems: A review and implications for practitioners and researchers. School Psychology Review, 34(4), 475–489.

Lopez-Williams, A., Chacko, A., Wymbs, B. T., Fabiano, G. A., Seymour, K. E., Gnagy, E. M., et al. (2005). Athletic performance and social behavior as predictors of peer acceptance in children diagnosed with attention-deficit/hyperactivity disorder. Journal of Emotional and Behavioral Disorders, 13, 172–180.

Lyons, J. A., & Serbin, L. A. (1986). Observer bias in scoring boys’ and girls’ aggression. Sex Roles, 14, 301–313.

Macintosh, K., & Dissanayake, C. (2006). A comparative study of the spontaneous social interactions of children with high-functioning autism and children with Asperger’s disorder. Autism, 10, 199–220.

McGraw, K. O., & Wong, S. P. (1996). Forming inferences about some intra-class correlation coefficients. Psychological Methods, 1, 30–46.

McNeilly-Choque, M. K., Hart, C. H., Robinson, C. C., Nelson, L., & Olsen, S. F. (1996). Overt and relational aggression on the playground: Correspondence among different informants. Journal of Research in Childhood Education, 11, 47–67.

Meany-Daboul, M. G., Roscoe, E. M., Bourret, J. C., & Ahearn, W. H. (2007). A comparison of momentary time sampling and partial-interval recording for evaluating functional relations. Journal of Applied Behavior Analysis, 40, 501–514.

Newcomb, T. (1931). An experiment designed to test the validity of a rating technique. Journal of Educational Psychology, 22, 279–289.

NICHD Early Child Care Research Network. (2004). Trajectories of physical aggression from toddlerhood to middle childhood. Monographs of the Society for Research in Child Development, 69 (Serial No. 278).

Noldus, L. P., Trienes, R. J., Hendriksen, A. H., Jansen, H., & Jansen, R. G. (2000). The observer video-pro: New software for the collection, management, and presentation of time-structured data from videotapes and digital media files. Behavior Research Methods, Instruments, and Computers, 32, 197–206.

Ostrov, J. M. (2008). Forms of aggression and peer victimization during early childhood: A short-term longitudinal study. Journal of Abnormal Child Psychology, 36, 311–322.

Ostrov, J. M., & Bishop, C. M. (2008). Preschoolers’ aggression and parent-child conflict: A multiinformant and multimethod study. Journal of Experimental Child Psychology, 99, 309–322.

Ostrov, J. M., & Collins, W. A. (2007). Social dominance in romantic relationships: A prospective longitudinal study of non-verbal processes. Social Development, 16, 580–595.

Ostrov, J. M., Crick, N. R., & Keating, C. F. (2005). Gender-biased perceptions of preschoolers’ behavior: How much is aggression and prosocial behavior in the eye of the beholder? Sex Roles, 52, 393–398.

Ostrov, J. M., & Godleski, S. A. (2007). Relational aggression, victimization, and language development: Implications for practice. Topics in Language Disorders, 27, 146–166.

Ostrov, J. M., & Keating, C. F. (2004). Gender differences in preschool aggression during free play and structured interactions: An observational study. Social Development, 13, 255–277.

Ostrov, J. M., Massetti, G. M., Stauffacher, K., Godleski, S. A., Hart, K. C., Karch, K. M., Mullins, A. D., et al. (2009). An intervention for relational and physical aggression in early childhood: A preliminary study. Early Childhood Research Quarterly, 24, 15–28.

Ostrov, J. M., Ries, E. E., Stauffacher, K., Godleski, S. A., & Mullins, A. D. (2008). Relational aggression, physical aggression and deception during early childhood: A multi-method, multi-informant short-term longitudinal study. Journal of Clinical Child and Adolescent Psychology, 37, 664–675.

Ostrov, J. M., Woods, K. E., Jansen, E. A., Casas, J. F., & Crick, N. R. (2004). An observational study of delivered and received aggression, gender, and social-psychological adjustment in preschool: “This white crayon doesn’t work …”. Early Childhood Research Quarterly, 19, 355–371.

Parten, M. B. (1932). Social participation among pre-school children. The Journal of Abnormal and Social Psychology, 27, 243–269.

Pelham, W. E. Jr., Gnagy, E. M., Greiner, A. R., Hoza, B., Hinshaw, S. P., Swanson, J. M., et al. (2000). Behavioral versus behavioral and pharmacological treatment in ADHD children attending a summer treatment program. Journal of Abnormal Child Psychology, 28, 507–525.

Pellegrini, A. D. (1989). Categorizing children’s rough-and-tumble play. Play & Culture, 2, 48–51.

Pellegrini, A. D. (2001). Practitioner review: The role of direct observation in the assessment of young children. Journal of Child Psychology and Psychiatry, 42, 861–869.

Pellegrini, A. D. (2004). Observing children in their natural worlds: A methodological primer (2nd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.

Pellegrini, A. D., & Bartini, M. (2000). An empirical comparison of methods of sampling aggression and victimization in school settings. Journal of Educational Psychology, 92, 360–366.

Pellegrini, A. D., Ostrov, J. M., Roseth, C., Solberg, D., & Dupuis, D. (in press). Using observational methods to study children’s and adolescents’ development. In G. Melton, A. Ben-Arieh, & J. Cashmore (Eds.). Handbook of child research. Beverly Hills, CA: Sage.

Pepler, D. J., & Craig, W. M. (1995). A peek behind the fence: Naturalistic observations of aggressive children with remote audiovisual recording. Developmental Psychology, 31, 548–553.

Pepler, D. J., Craig, W. M., & Roberts, W. L. (1998). Observations of aggressive and nonaggressive children on the school playground. Merrill-Palmer Quarterly, 44, 55–76.

Quake-Rapp, C., Miller, B., Ananthan, G., & Chiu, E-C. (2008). Direct observation as a means of assessing frequency of maladaptive behavior in youths with severe emotional and behavioral disorder. The American Journal of Occupational Therapy, 62, 206–211.

Sharpe, T. L., & Koperwas, J. (2003). Behavior and sequential analyses: Principles and practice. Thousand Oaks, CA: Sage Publications.

Shaw, D. S., Winslow, E. B., Owens, E. B., Vondra, J. I., Cohn, J. F., & Bell, R. Q. (1998). The development of early externalizing problems among children from low-income families: A transformational perspective. Journal of Abnormal Child Psychology, 26, 95–107.

Shrout, P. E., & Fleiss, J. L. (1979). Intraclass correlations: Uses in assessing rater reliability. Psychological Bulletin, 86, 420–428.

Sidener, T. M., Shabani, D. B., & Carr, J. E. (2004). A review of the Behavioral Evaluation Strategy and Taxonomy (BEST) Software Application. Behavioral Interventions, 19, 275–285.

Silk, J. B., Cheney, D. L., & Seyfarth, R. M. (1996). The form and function of post-conflict interactions between female baboons. Animal Behaviour, 52, 259–268.

Slee, P. T. (1987). Child observation skills. London, UK: Croom Helm.

Smith, G. A. (1986). Observer drift: A drifting definition. The Behavior Analyst, 9, 127–128.

Soukup, J. H., Wehmeyer, M. L., Bashinski, S. M., & Bovaird, J. A. (2007). Classroom variables and access to the general curriculum for students with disabilities. Exceptional Children, 24, 101–120.

Stangor, C. (2011). Research methods for the behavioral sciences (4th ed.). Belmont, CA: Wadsworth.

Stauffacher, K., & DeHart, G. (2005). Preschoolers’ relational aggression with siblings and friends. Early Education and Development, 16, 185–206.

Susser, S. A., & Keating, C. F. (1990). Adult sex role orientation and perceptions of aggressive interactions between boys and girls. Sex Roles, 23, 147–155.

Tapp, J. T., & Walden, T. (1993). PROCODER: A professional tape control, coding, and analysis system for behavioral research using videotape. Behavior Research Methods, Instruments, & Computers, 25, 53–56.

Tapp, J. T., Wehby, J. H., & Ellis, D. (1995). A Multi-option observation system for experimental studies: MOOSES. Behavior Research Methods, Instruments, & Computers, 27, 25–31.

Observational Research: What is, Types, Pros & Cons + Example

Observational research is a qualitative, non-experimental examination of behavior. This helps researchers understand their customers' behavior.

Researchers can gather customer data in a variety of ways, including surveys, interviews, and research. But not all data can be collected by asking questions because customers might not be conscious of their behaviors. 

This is where observational research comes in. It is a way to learn about people by observing them in their natural environment. This kind of research helps researchers figure out how people act in different situations and what things in the environment affect their actions.

This blog will teach you about observational research, including types and observation methods. Let’s get started.

What is observational research?

Observational research is a broad term for various non-experimental studies in which behavior is carefully watched and recorded.

The goal of this research is to describe a variable or a set of variables. More broadly, the goal is to capture specific individual, group, or setting characteristics.

Since it is non-experimental and uncontrolled, we cannot draw causal conclusions from it. The data collected in observational studies is frequently qualitative, but it can also be quantitative or both (mixed methods).

Types of observational research

Conducting observational research can take many different forms. There are various types of this research. These types are classified below according to how much a researcher interferes with or controls the environment.

Naturalistic observation

Taking notes on what is seen is the simplest form of observational research. A researcher makes no interference in naturalistic observation. It’s just watching how people act in their natural environments. 

Importantly, there is no attempt to modify factors in naturalistic observation, as there would be when comparing data between a control group and an experimental group.

Case studies

A case study is a sort of observational research that focuses on a single phenomenon. It is a form of naturalistic observation because it captures data in the field. But case studies focus on a specific point of reference, like a person or event, while other studies may have a wider scope and try to record everything the researcher sees.

For example, a case study of a single businessman might try to find out how that person deals with the ups and downs of a certain disease or with a loss.

Participant observation

Participant observation is similar to naturalistic observation, except that the researcher is a part of the natural environment they are studying. In such research, the researcher is also interested in rituals or cultural practices that can only be evaluated by sharing experiences. 

For example, anyone can learn the basic rules of table tennis by going to a game or following a team. Participant observation, on the other hand, lets people take part directly to learn more about how the team works and how the players relate to each other.

It usually includes the researcher joining a group to watch behavior they couldn’t see from afar. Participant observation can gather much information, from the interactions with the people being observed to the researchers’ thoughts.

Controlled observation

Controlled observation is a more systematic, structured approach that entails recording the behaviors of research participants in a remote place. Case-control studies are more like experiments than other types of observational research, but they still use observational methods. When researchers want to find out what caused a certain event, they might use a case-control study.

Longitudinal observation

This observational research is one of the most difficult and time-consuming because it requires watching people or events for a long time. Researchers should consider longitudinal observations when their research involves variables that can only be seen over time. 

After all, you can’t get a complete picture of things like learning to read or losing weight in a single observation. Longitudinal studies keep an eye on the same people or events over a long period of time and look for changes or patterns in behavior.

Observational research methods

When doing this research, there are a few observational methods to keep in mind to ensure that the research is done correctly. Let’s look at the key ones:

Have a clear objective

For an observational study to be helpful, it needs to have a clear goal. A clear goal guides the observations and ensures they focus on the right things.

Get permission

Getting explicit permission from the people you will be watching is essential. This means letting them know that they will be watched, what the goal of the observation is, and how their data will be used.

Unbiased observation

It is important to make sure the observations are fair and unbiased. This can be done by keeping detailed notes of what is seen and not imposing any personal interpretation on the data.

Hide your observers

In the observation method, keep your observers hidden. The participants should be unaware of the observers to avoid potential bias in their actions.

Documentation

It is important to document the observations clearly and straightforwardly. This allows others to examine the information and confirm the observational research findings.

Data analysis

Data analysis is the last method. The researcher will analyze the collected data to draw conclusions or confirm a hypothesis.

Pros and cons of observational research

Observational studies are a great way to learn more about how your customers use different parts of your business. Like any method, observational research has pros and cons. Let’s have a look at them.

Pros:

  • It provides a practical application for a hypothesis. In other words, it can help make research more complete.
  • You can see people acting alone or in groups, such as customers. So, you can answer a number of questions about how people act as customers.

Cons:

  • There is a chance of researcher bias in observational research. Experts say that this can be a very big problem.
  • Some human activities and behaviors can be difficult to understand. We are unable to see memories or attitudes. In other words, there are numerous situations in which observation alone is inadequate.

Example of observational research

The researcher observes customers buying products in a mall. Assuming the product is soap, the researcher will observe how long the customer takes to decide: whether they stop to assess the packaging or come to the mall with their decision already made based on advertisements.

If the customer takes their time making a decision, the researcher may infer that packaging and information on the package affect purchase behavior. If a customer makes a quick decision, the decision was likely made in advance.

As a result, the researcher will recommend more and better advertisements in this case. All of these findings were obtained through simple observational research.
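
The decision rule in this example can be sketched in a few lines of code. The 30-second cut-off, the function names, and the observed times below are all assumptions made purely for illustration; the example above does not say how long counts as "taking their time". The output is a descriptive tally, not evidence of cause.

```python
from statistics import median

# Assumed cut-off in seconds; the article does not specify one.
DELIBERATION_THRESHOLD_S = 30.0


def classify_decision(seconds_at_shelf, threshold=DELIBERATION_THRESHOLD_S):
    """Label one observed purchase by how long the customer spent deciding."""
    return "deliberated" if seconds_at_shelf >= threshold else "predetermined"


def summarize(times):
    """Describe a set of observed decision times."""
    labels = [classify_decision(t) for t in times]
    return {
        "n": len(times),
        "median_seconds": median(times),
        "share_deliberated": labels.count("deliberated") / len(labels),
    }


# Hypothetical decision times (in seconds) noted by an observer at the soap shelf.
observed_times = [5.0, 12.0, 48.0, 75.0, 8.0, 33.0]
summary = summarize(observed_times)
print(summary)
```

A summary like this tells the researcher how common each pattern was among the customers observed, which is a starting point for the recommendation rather than proof of what drove each purchase.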

How to conduct observational research with QuestionPro?

QuestionPro can help with observational research by providing tools to collect and analyze data. It can help in the following ways:

Define the research goals and question types you want to answer with your observational study . Use QuestionPro’s customizable survey templates and questions to do a survey that fits your research goals and gets the necessary information. 

You can distribute the survey to your target audience using QuestionPro’s online platform or by sending a link to the survey. 

With QuestionPro’s real-time data analysis and reporting features, you can collect and look at the data as people fill out the survey. Use the advanced analytics tools in QuestionPro to see and understand the data and find insights and trends. 

If you need to, you can export the data from QuestionPro into the analysis tools you like to use. Draw conclusions from the collected and analyzed data and answer the research questions that were asked at the beginning of the research.

For a deeper understanding of human behaviors and decision-making processes, explore the realm of Behavioral Research .

To summarize, observational research is an effective strategy for collecting data and getting insights into real-world phenomena. When done right, this research can give helpful information and help people make decisions. 

QuestionPro is a valuable tool that can help with observational research by letting you create online surveys, analyze data in real time, make surveys your own, keep your data safe, and use advanced analytics tools.

To do this research with QuestionPro, researchers need to define their research goals, do a survey that matches their goals, send the survey to participants, collect and analyze the data, visualize and explain the results, export data if needed, and draw conclusions from the data collected.

By keeping in mind what has been said above, researchers can use QuestionPro to help with their observational research and gain valuable data. Try out QuestionPro today!

Frequently Asked Questions (FAQ)

Observational research is a method in which researchers observe and systematically record behaviors, events, or phenomena without directly manipulating them.

There are three main types of observational research: naturalistic observation, participant observation, and structured observation.

Naturalistic observation involves observing subjects in their natural environment without any interference.

Sociology Group: Welcome to Social Sciences Blog

What is an Observational Research: Steps, Types, Pros and Cons

Observational research refers to qualitative and non-experimental studies that seek to systematically observe, record, and analyse a particular society, culture, behaviours and attitudes. It is non-experimental in its observation as it does not manipulate any variables.

The steps undertaken in conducting observational research usually include:

  • Deciding upon the goals of the study
  • Deciding upon the group to be observed
  • Choosing a type of observation method to employ
  • Earning access to the group
  • Establishing a rapport with participants
  • Conducting the study by observing and recording behaviour, attitudes and beliefs over a specific time period
  • Exiting the observation research setting
  • Analysing data that was recorded
  • Writing the report and presenting the findings

(Bailey, 1994)

Observational research is typically dichotomized on the basis of:

Degree of Structure of the Environment:

  • Participant observation: This type of observation research falls under naturalistic observation, as it is employed in a natural setting. It calls for a researcher to participate (either covertly or overtly) or immerse themselves in the setting they are studying, becoming a part of the community they are observing and making inferences through their experience. Overt participation implies that the participants are aware that they are being observed and studied, while covert participation implies that a researcher will act as a member of the group they are studying without others knowing that they are a researcher. One of the first recorded uses of this method was in the early 19th century by Joseph Marie, who sought to understand Native Americans by becoming one of them, stating that “it is by learning their language that we will become their fellow citizens” (Gaille, 2020).
  • Controlled Observation: When an environment needs to be confined to a structure, a researcher will use a controlled observation design. This is a type of observation research that is employed mostly in psychological research and in the field of marketing. Controlled observation serves as an exception to the non-experimental criterion of observational research, because this method observes behaviour in a controlled laboratory setting. An ‘uncontrolled observation’ simply implies a naturalistic observation employed in an unstructured environment.

Degree of Structure imposed upon the environment by the researcher

  • Structured observation: This type of observation employs a specific framework for observing, categorizing, and recording behaviour.
  • Unstructured observation: This type of observation has no specific framework that dictates when, how and what behaviours must be recorded. It is an open-ended approach in which the researcher records almost everything they observe and sifts through the data at a later stage.
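The difference between the two approaches can be made concrete with a small sketch. The example below is only an illustration under stated assumptions: the category names, the timestamps and the `tally_structured` helper are all hypothetical and do not come from any cited source. It shows how a structured observer counts only behaviours that match a coding scheme agreed in advance, while anything outside the scheme is set aside (an unstructured observer would keep everything for later sifting).

```python
from collections import Counter

# Hypothetical coding scheme: behaviour categories fixed before the observation starts.
CODING_SCHEME = {"talks_to_peer", "uses_phone", "works_alone"}


def tally_structured(events):
    """Count events that match the predefined scheme; collect the rest separately."""
    counts = Counter()
    uncoded = []
    for timestamp, behaviour in events:
        if behaviour in CODING_SCHEME:
            counts[behaviour] += 1
        else:
            # Outside the framework: a structured design would not record this.
            uncoded.append((timestamp, behaviour))
    return counts, uncoded


# Invented field notes, for illustration only.
events = [
    ("09:00", "talks_to_peer"),
    ("09:02", "uses_phone"),
    ("09:05", "talks_to_peer"),
    ("09:07", "stares_out_window"),
]

counts, uncoded = tally_structured(events)
print(dict(counts))
print(uncoded)
```

Keeping the uncoded events in a separate list is a design choice made for this sketch; it makes visible what a strictly structured design would leave out.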

Other types of Observation research include:

  • Indirect observation: When the researcher isn’t able to conduct their research in the natural setting of their subject, they must resort to indirect observation. This is the least invasive method of research, in which the researcher gathers primary information through physical tracing techniques such as erosion measures. Erosion measures involve studying materials and their condition in order to draw conclusions; for instance, a researcher may study the wear on the floor of a museum to see which exhibits are most popular. Social anthropologists and archaeologists may employ this type of observational research to draw conclusions about historical societies.
  • Direct Observation: The opposite of indirect observation, direct observation, would encompass many of the aforementioned types of observation research including naturalistic and participant observation.

Observational research, irrespective of type, comes with a range of advantages and disadvantages, as described below.

Advantages :

  • Inexpensive: Observational research is relatively inexpensive to conduct, as researchers need minimal resources to carry out their observations and no variables have to be controlled or manipulated. The sociologist simply observes an already existing phenomenon, such as interest group dynamics or a particular society, within its natural setting, and so does not have to allocate many resources to the observation.
  • Flexible: Observational research offers several types within the natural/participant dichotomy, such as overt/covert participation, case studies and archival research, allowing one to choose the design that best suits their research question.
  • Greater ecological validity: Ecological validity refers to how well a study’s findings apply to the real world. Since observational research takes place in a natural setting, all observations are made ‘from the real world’, leading to greater ecological validity than research methods conducted in experimental or laboratory set-ups, where participants may provide inaccurate self-reports of their own behaviours. Observational research avoids the discrepancy between reported and enacted behaviours by observing the actual behaviour in action.
  • Allows change to be recorded: Observing a subject within its natural setting can help researchers capture the changing attitudes and shifting dynamics of the subject. For example, if a sociologist is studying the dynamics within a multi-ethnic society, they have the opportunity to watch the opinions and attitudes of each social group evolve and change. The researcher will be able to identify recurrent behaviours as well as ones that occur by chance.
  • Open-ended: Observational research is usually semi-structured, allowing researchers to work freely within a larger framework. Researchers are free to observe and analyse many things, and this open-endedness lets them adapt their research to accommodate further observations of value, such as interesting phenomena that complement a group’s behaviour and that they did not originally intend to study. This would allow a sociologist, for example, to record a new behaviour or attitude of the social group they are studying and include it in the study if it adds to the research.
  • Some advantages exclusive to Participant Observation include:
  • Options: Researchers can choose to participate either covertly or overtly, choosing the role best suited to the nature of their work. A researcher may choose to study a tribe in the Amazon overtly, but may choose to study group dynamics within an intersectional environmentalist group by participating in that group disguised as a member. Both options yield a sufficient amount of descriptive information for analysis.
  • An inside look into a society or phenomenon: Observational research permits researchers to study people in their native environment in order to grasp the subject of their study in a manner that would not be possible otherwise. Non-verbal cues and unfiltered responses are recorded in great detail in a covert participant observation. For instance, if a researcher is observing gender dynamics in a co-ed high school, they will be able to gather information about attitudes and perceptions through gossip, glances and other non-verbal cues which may not come out naturally if the participants were in a laboratory setting.
  • More detailed observations: Spending a great deal of time immersed in a community, social group or culture yields highly specific and ethnographic information. Some researchers spend years living with a society or involved in a social group and this allows them to record, with significant detail, the intricacies of that society or social group.


Disadvantages

  • Small scale: Observational research is most often conducted on a small scale and hence may lack a representative sample which consequently compromises the generalisability of the observations. Researchers may adopt a longitudinal focus while studying, for example, one particular community college. The information found on the students in this community college will be highly specific to that college and not generalizable to all.
  • Less reliable: Since observing a phenomenon in its natural setting comes with innumerable extraneous variables which cannot be controlled, the study is not easily replicable and is less reliable than other methods of research, including controlled observations. For example, field research may come with a number of variables outside the researcher’s control, including the weather, a group dispute or a conflict.
  • Cannot establish cause-and-effect relationships: The researcher’s lack of control over the phenomenon being observed makes it very difficult to establish causal factors and resulting behaviours. The observations become descriptive rather than analytical, and no significant inferences can be made that would allow for prediction.
  • Researchers must be highly skilled and knowledgeable: Researchers must be skilled and trained to recognise facets of a situation that are sociologically significant. Depending on the nature of the research, the observation method calls for the researcher to adopt one or more roles and to rely on several techniques, such as observing with all five senses, in order to gain a comprehensive understanding of the subject being studied. Researchers often spend years doing secondary research, learning a new language and familiarising themselves with a culture in order to participate.
  • Time: Observational research is most often time-consuming. Researchers may spend several months, sometimes years, observing the subject of their research in order to gain a comprehensive understanding of the phenomenon they are studying. As mentioned above, researchers also spend several months planning, researching and preparing for field research.
  • Observer bias: One of the biggest and most recurrent issues in observational research is observer bias. Since social reality is relative, observations may end up reflecting biases held by the researcher. Personal beliefs and preferences can cloud a researcher’s perception, and their observations may reflect those biases. Researchers often have a hypothesis, which may cloud their judgement and make them see only what they want to see in order to confirm it. Observer bias is hence extremely dangerous, as it greatly compromises the validity of the results.
  • Hawthorne effect: The behaviour of those being studied is often influenced by the presence of the researcher, which is why covert participation is often preferred. Consequently, observations and inferences may misrepresent the actual phenomena, leading to inaccurate results.
  • External and distanced observation: Observations are often made from a distance, and this can hinder a comprehensive understanding of the subject being studied, as the researcher may not be able to see or hear significant events or exchanges. In naturalistic observations, the researcher cannot clarify or inquire into anything being observed; they may only record and subjectively analyse their observations. This leads to non-objective inferences. Furthermore, the only way a researcher conducting a naturalistic observation can get the whole picture is to record data through images, videos and audio recordings, which raise a multitude of ethical concerns.
  • Difficulty recording: In participant observation, the researcher may have trouble taking notes and producing written accounts of their observations, which often leads them to rely on memory to reproduce an observation on paper. This can lead to inaccurate observations that may reflect the observer’s bias.
  • Access: Gaining access to a particular community or social group is challenging if those comprising those groups are not willing to be studied. Several societies, tribes and social groups are physically inaccessible and may be closed off to outsiders.
  • Ethical issues: Covert participation raises a wide variety of ethical complications, as participants in the study are unaware that they are being observed and that their behaviours are being recorded and analysed, and hence are unable to give consent. Since informed consent is one of the most important aspects of any study, covert participant observation raises a multitude of ethical questions.
  • Microscopic: Most observational research gathers only situation-specific data and is of relatively limited use (because of its low generalisability and small samples) to the greater body of sociological research. Studying the Native Hawaiian approach to gender will yield inferences that are exclusive to the indigenous community of Hawaii and won’t add greatly to existing literature. If, however, a sociologist or researcher chooses to spend more time studying the approach to gender of other colonized indigenous communities, such as the Maori or Native Americans, this can contribute greatly to sociological research, but again calls for a significant amount of time to be spent on observation.
  • No statistical representation of data: Unless the research design employs a mixed-methods approach, most observational research is entirely qualitative and its results cannot be represented statistically. Observational research does not use questionnaires or surveys and hence cannot produce any quantitative data.


Observational research comes with many advantages and disadvantages. Not all of the pros and cons listed above apply to every research project, but several do, and it is important to note that this research method must be tailored to the phenomenon that you want to study. Each research question will call for a different approach, and the style of observational research can be moulded to satisfy the study’s research objectives.

References:

Gaille, Louise. “21 Advantages and Disadvantages of a Participant Observation.” Vittana.org , 3 Feb. 2020, vittana.org/21-advantages-and-disadvantages-of-a-participant-observation. 

McLeod, S. A. (2015, June 06). Observation methods. Simply Psychology. https://www.simplypsychology.org/observation.html

Ciesielska, Malgorzata, et al. “Observation Methods.” Qualitative Methodologies in Organization Studies , 2017, pp. 33–52., doi:10.1007/978-3-319-65442-3_2.

Baker, Lynda M. “Observation: A Complex Research Method.” Library Trends , vol. 55, no. 1, 2006, pp. 171–189., doi:10.1353/lib.2006.0045.

Bailey, K. (1994). Observation. In Methods of Social Research, 4th ed. The Free Press (Simon and Schuster), New York, NY 10020. Ch. 10, pp. 241-273.


Shivanka Gautam

Shivanka Gautam is a student at FLAME University, studying Psychology and Literary & Cultural studies. She has a passion for Critical theory, Cultural Affairs, Political Philosophy and Academia.


National Institute of Dental and Craniofacial Research


Observational Studies - Planning & Startup


If you are an NIDCR grant applicant or awardee planning to conduct clinical research, you may need the following documents for your clinical studies.

Questions? Contact [email protected]  or your NIH Program Official.

NIDCR Clinical Study Oversight Committee (CSOC)

NIDCR Medical Monitor, NIDCR Independent Safety Monitor

Research-Methodology

Observation

Observation, as the name implies, is a way of collecting data through observing. This data collection method is classified as a participatory study, because the researcher has to immerse herself in the setting where her respondents are, while taking notes and/or recording. Observation data collection method may involve watching, listening, reading, touching, and recording behavior and characteristics of phenomena.

Observation as a data collection method can be structured or unstructured. In structured or systematic observation, data collection is conducted using specific variables and according to a pre-defined schedule. Unstructured observation, on the other hand, is conducted in an open and free manner in a sense that there would be no pre-determined variables or objectives.

Moreover, this data collection method can be divided into overt and covert categories. In overt observation, research subjects are aware that they are being observed. In covert observation, on the other hand, the observer is concealed and sample group members are not aware that they are being observed. Covert observation is considered to be more effective because sample group members are likely to behave naturally, with positive implications for the authenticity of research findings.

Advantages of observation data collection method include direct access to research phenomena, high levels of flexibility in terms of application and generating a permanent record of phenomena to be referred to later. At the same time, this method is disadvantaged with longer time requirements, high levels of observer bias, and impact of observer on primary data, in a way that presence of observer may influence the behaviour of sample group elements.

It is important to note that the observation data collection method may be associated with certain ethical issues. The fully informed consent of research participants is one of the basic ethical considerations to be adhered to by researchers. At the same time, the behaviour of sample group members may change if they are notified about the presence of the observer, with negative implications for research validity.

This delicate matter needs to be addressed by consulting with your dissertation supervisor and commencing the primary data collection process only after the supervisor has approved the ethical aspects of the study.


John Dudovskiy


Observational Research Method explained


Observational Research Method: this article explains the concept of Observational Research Method in a practical way. The article begins with an introduction and the general definition of the term, followed by an explanation of why observational research is important, its advantages and disadvantages, and a practical example. Enjoy reading!

What is observational research?

Observational research is a method of collecting data by simply observing and recording the behavior of individuals, animals or objects in their natural environment.

It offers researchers insights into human and animal behavior, revealing patterns and dynamics that would otherwise go unnoticed.


This article explores the definition, types, advantages, and disadvantages of observational research. Several examples, including its application in market research, will show you how this approach improves our human understanding of the world.

Observational research: collecting insights unobtrusively

Definition of observational research

Observational studies serve as a means of answering research questions through careful observation of subjects, without any interference or manipulation by the researcher.

Unlike traditional experiments, these studies lack control and treatment groups, allowing researchers to collect data in a natural setting without imposing predetermined conditions.

Observational studies are generally of a qualitative nature, with both exploratory and explanatory purposes, providing insight into the complexity of particular phenomena.

While quantitative observational studies also exist, they are less common compared to the qualitative studies.

Observational research is widely used in disciplines such as the exact sciences, medicine and social sciences.

Often, ethical or practical considerations prohibit researchers from conducting controlled experiments, leading them to opt for observational studies instead.

The lack of control and treatment groups can pose challenges in drawing conclusions. The risk of confounding variables and observer bias affecting the analysis is high, highlighting the importance of careful interpretation.

Types of observational research


Figure 1 – Types of observational research

Some common types of observational research are:

Naturalistic observation

In naturalistic observation, researchers observe participants in their natural environment, without any interference or disturbance. The aim is to study the behavior and interactions of individuals or groups as they occur in their natural environment.

Structured observation

In structured observation, a predetermined set of behaviors or variables is observed and systematically recorded.

The researchers use specific behavioral categories or measurement tools to collect data .

Participant observation

Participant observation means that the researcher actively participates in the activities or interactions of the participants while they are being observed.

This gives the researcher a deeper insight into the experience and perspectives of the participants.

Covert observation

In the case of a covert observational study, the researcher tries to make himself known to the participants as little as possible.

They observe and record behavior without the participants being aware of the observation. This minimizes the risk that participants change their behavior because they know they are being watched.

Cross-sectional study

In cross-sectional studies, data is collected at a single point in time or over a short period of time.

The goal is to get a snapshot of the behavior or phenomenon being studied.

Longitudinal study

Longitudinal studies involve following and observing participants over a longer period of time. This makes it possible to identify and analyze changes in behavior or patterns over time.

Choosing the right type of observational study depends on the research question, the aim of the study and the available resources and time. Each type has its own strengths and weaknesses and can be adapted to the specific needs of the research.


Steps in observational research

Below you will find the steps that are followed when setting up an observational research.

Step 1: determine research topic and objectives

The first step involves determining the phenomenon to be observed and the reasons why it is important. Observational studies are especially suitable when an experiment is not an option for practical or ethical reasons. The research topic may also depend on natural behaviour.

As an example, let’s consider a researcher who is interested in the interactions of teens in their social situations. The researcher wants to investigate whether having a smartphone influences the social interactions of the teenagers. Conducting an experiment can be tricky because smartphone use should not be manipulated.

Step 2: choose the type of observation and techniques

Think about what needs to be observed. Does the researcher go in without preconceived notions? Is there another research method that makes more sense to use? Is it important for the analysis that the researcher is present during the observation? If so, a covert observation is already ruled out.

In the example described earlier, several options are possible. The researcher could observe the teens in different situations. The observer could also join a social group and actively participate in its interactions while the group is being observed. Hidden cameras could also be used to record the teens’ social interactions in a controlled environment.

Step 3: set up the observational study

There are a number of things to consider before starting the observation.

First, you need to plan ahead. If the participants are observed in a social setting such as community centers or schools, clear agreements should be made and permission should be given. Informed consent might be required. Decide in advance the observational research methods you will use for data collection. Are notes taken? Or video images or audio recordings?

Step 4: conduct the observation

Once the type of observation has been chosen, the research technique has been decided on and the correct time and place have been determined, it is time to conduct the observation.

In the example, the researcher could observe two situations: one with smartphones and one without. When conducting the observation, it is important to take confounding variables into account.

Step 5: analyzing data

After completing the observation, it is important to immediately record the first clues, thoughts and impressions. If the observation has been recorded, this recording must be transcribed. Subsequently, a thematic or content analysis must be carried out.

Observations are often exploratory and have an open character. That is why this analysis fits well with this method.

Step 6: discuss next steps

Observational studies are generally exploratory in nature and therefore usually do not immediately yield definitive conclusions. This is mainly because of the risk of observer bias and confounding variables. If the researcher is satisfied with the conclusions that have been reached, it may be useful to follow up with another research method, such as an experiment.

Examples of observational research

Observational research has led to several revolutionary results that have forever changed our understanding of the world and human behavior.

Some examples of this are:

Development of Darwin’s theory of evolution

Charles Darwin used observational research during his travels on the ship HMS Beagle. Observations of various animal species in their natural environment, such as birds in the Galapagos Islands, allowed Darwin to gather evidence for his theory of evolution.

This revolutionary theory has completely changed the understanding of the origin and diversity of species of creatures.

Discovery of penicillin

Sir Alexander Fleming accidentally discovered the effect of penicillin, a revolutionary antibiotic, through observational research.

He observed that a fungus called Penicillium notatum destroyed bacteria in a petri dish.

This discovery laid the foundation for the development of modern antibiotics and has had an enormous impact on medicine and the treatment of infectious diseases.

Confirmation of Einstein’s theory of relativity

During a solar eclipse in 1919, Arthur Eddington and his team conducted observational research to test the predictions of Einstein’s general theory of relativity.

By observing the positions of stars during the eclipse, they were able to confirm the deflection of light by the sun’s gravity. This observational evidence supported Einstein’s theory and marked a revolutionary breakthrough in physics.

Research into the effects of smoking on health

One of the most influential observational studies was the study of the relationship between smoking and health problems, particularly lung cancer.

By observing large groups of smokers over a long period of time and collecting data on their smoking behavior and health outcomes, it was shown that there is a strong association between smoking and the risk of lung cancer.

These findings have led to a better understanding of the harmful effects of smoking and have contributed to the promotion of anti-smoking measures and health education.
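The strength of an association like this is commonly summarised as a relative risk: the proportion of people who develop the outcome in the exposed group divided by the proportion in the unexposed group. The sketch below is only an illustration. The `relative_risk` helper is a hypothetical name and the counts are invented, not taken from any actual smoking study.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk of the outcome among the exposed divided by risk among the unexposed."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return risk_exposed / risk_unexposed


# Invented cohort counts, for illustration only:
# 30 of 1,000 smokers and 3 of 1,000 non-smokers develop the disease.
rr = relative_risk(exposed_cases=30, exposed_total=1000,
                   unexposed_cases=3, unexposed_total=1000)
print(round(rr, 2))
```

A relative risk well above 1 indicates a strong association, but on its own it does not rule out confounding variables, which is why observational findings of this kind are interpreted with care.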

Pros and cons

Observational research has several advantages and disadvantages that need to be considered before choosing the right research approach.

Advantages of observational research

Authentic behaviour

By observing people, animals or objects in their natural environment, researchers can study authentic behavior.

That means that the observations take place in real situations and not artificial laboratory conditions.

This allows researchers to study behavior as it actually occurs. This increases scientific validity.

Detailed information

Observational research offers the opportunity to collect detailed information about behaviour, interactions and context.

Researchers can observe specific behaviors such as nonverbal cues, responses to stimuli, and social dynamics. This leads to a deep understanding of the phenomenon being studied.

Flexibility

Observational research can be adapted to different research questions and contexts. Researchers can tailor the observations to the specific situations and variables they want to study. This gives them the flexibility to focus on specific aspects of behaviour, for example.

Disadvantages of observational research

Limited control

In observational research, researchers have limited control over the conditions and variables they observe. They cannot perform experimental manipulations or control specific environmental factors.

Observer bias

Observer bias refers to the subjective interpretation of the observations by the researcher. Researchers may unconsciously project their own biases, expectations, or interpretations onto the observed behaviors. This could jeopardize the objectivity of the investigation.

Time consuming


How to cite this article: Janse, B. (2023). Observational Research Method . Retrieved [insert date] from Toolshero: https://www.toolshero.com/research/observational-research/

Original publication date: 10/17/2023 | Last update: 01/02/2024


Ben Janse

Ben Janse is a young professional working at ToolsHero as Content Manager. He is also an International Business student at Rotterdam Business School where he focusses on analyzing and developing management models. Thanks to his theoretical and practical knowledge, he knows how to distinguish main- and side issues and to make the essence of each article clearly visible.



Open Access

Peer-reviewed

Research Article

Assessing fall risk and equilibrium function in patients with age-related macular degeneration and glaucoma: An observational study

Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Resources, Software, Validation, Visualization, Writing – original draft, Writing – review & editing

* E-mail: [email protected]

Affiliation Otorhinolaryngology, Shinseikai Toyama Hospital, Imizu, Japan

Roles Data curation, Investigation, Resources, Writing – review & editing

Roles Conceptualization, Methodology, Resources, Validation, Writing – review & editing

Affiliation Ophthalmology, Shinseikai Toyama Hospital, Imizu, Japan

Roles Conceptualization, Methodology, Project administration, Resources, Writing – review & editing

Roles Conceptualization, Methodology, Resources, Supervision, Validation, Writing – review & editing

Affiliation Otolaryngology, Mejiro University Ear Institute Clinic, Saitama, Japan

  • Takahiro Tokunaga, 
  • Rinako Takegawa, 
  • Yoshiki Ueta, 
  • Yasuhiro Manabe, 
  • Hiroaki Fushiki

PLOS

  • Published: April 1, 2024
  • https://doi.org/10.1371/journal.pone.0301377

Falls in older adults are a significant public health concern, and age-related macular degeneration (AMD) and glaucoma have been identified as potential visual risk factors. This study was designed to assess equilibrium function, fall risk, and fall-related self-efficacy (an individual’s belief in their capacity to act in ways necessary to reach specific goals) in patients with AMD and glaucoma.

This observational study was performed at the Otorhinolaryngology Department of Shinseikai Toyama Hospital. The cohort comprised 60 participants (AMD; n = 30; median age, 76.0 years; and glaucoma; n = 30; median age, 64.5 years). Visual acuity and visual fields were assessed using the decimal best-corrected visual acuity and Humphrey visual field tests, respectively. The evaluation metrics included pathological eye movement analysis, bedside head impulse test, single-leg upright test, eye-tracking test, optokinetic nystagmus, and posturography. Furthermore, we administered questionnaires for fall risk determinants including the Dizziness Handicap Inventory, Activities-Specific Balance Confidence Scale, Falls Efficacy Scale-International, and Hospital Anxiety and Depression Scale. The collected data were analyzed using descriptive statistics, and Spearman’s correlation analysis was employed to examine the interrelations among the equilibrium function, fall risk, and other pertinent variables.

Most participants exhibited standard outcomes in equilibrium function evaluations. Visual acuity and field deficits had a minimal impact on subjective dizziness manifestations, degree of disability, and fall-related self-efficacy. Both groups predominantly showed high self-efficacy. No significant correlation was observed between visual acuity or field deficits and body equilibrium function or fall risk. However, greater peripheral visual field impairment was associated with a tendency for sensory reweighting from visual to somatosensory.

Self-efficacy was higher and fall risk was relatively lower among patients with mild-to-moderate visual impairment, with a tendency for sensory reweighting from visual to somatosensory in those with greater peripheral visual field impairment. Further studies are required to validate these findings.

Citation: Tokunaga T, Takegawa R, Ueta Y, Manabe Y, Fushiki H (2024) Assessing fall risk and equilibrium function in patients with age-related macular degeneration and glaucoma: An observational study. PLoS ONE 19(4): e0301377. https://doi.org/10.1371/journal.pone.0301377

Editor: Renato S. Melo, UFPE: Universidade Federal de Pernambuco, BRAZIL

Received: December 1, 2023; Accepted: March 14, 2024; Published: April 1, 2024

Copyright: © 2024 Tokunaga et al. This is an open access article distributed under the terms of the Creative Commons Attribution License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the paper and its Supporting information files.

Funding: The author(s) received no specific funding for this work.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Falls are a pressing public health issue, particularly among older adults. Data from the Centers for Disease Control and Prevention in 2018 indicated that 27.5% of adults aged ≥65 years in the United States had experienced at least one fall in the preceding year, with 10.2% sustaining injuries due to these falls [ 1 ]. The risk of falls in older adults is caused by a complex combination of factors. For example, as older adults spend most of their time at home, fall risk factors in the home environment, including inadequate lighting, slippery surfaces, worn carpets, and stairs without handrails, become important. Additionally, visual deficits, which are prevalent in older adults, have been identified as potential risk factors of falls.

The integration of visual, vestibular, and somatosensory inputs is essential for postural regulation. Any deficit in these sensory modalities can jeopardize postural steadiness and increase the fall risk [ 2 ]. Although visual deficits are not the only cause of falls, discerning their specific risk factors can facilitate the formulation of robust preventive measures.

Previous studies have provided insights into the relationships among visual acuity, field deficits, and the incidence of falls. For instance, one study revealed that first-eye cataract surgery significantly reduces the rate of falls and enhances overall visual function and health status [ 3 ]. Another study suggested that while baseline bilateral visual impairment was associated with a nearly two-fold increase in the risk of falls, even mild unilateral visual impairment was significantly associated with frequent falls due to both correctable and uncorrectable conditions [ 4 ]. Additionally, a cohort study highlighted that recently developed visual impairments in older adults increased the likelihood of subsequent falls and fractures within a 5-year period [ 5 ]. In contrast, research focusing on patients with glaucoma found that deficits in postural control, although affected by visual field deficit severity, were not solely attributable to impaired peripheral visual input [ 6 ].

The exact correlation between visual deficits and falls remains unclear and, at times, is contradictory. The precise risk factors that lead to falls in individuals with visual challenges are not comprehensively understood because of research limitations. This knowledge gap impedes the creation of specialized fall prevention strategies for the visually impaired.

In this study, we included patients with age-related macular degeneration (AMD) and glaucoma with moderate or low severity to distinguish between visual acuity impairment and visual field impairment, and determine which of the two contributes to postural control. AMD is a degenerative condition that affects the macula and the central region of the retina, leading to visual acuity impairment. Globally, AMD causes irreversible visual impairment, which is sometimes bilateral and can lead to blindness in severe cases [ 7 ]. As the Japanese population ages, the number of AMD cases has surged. Glaucoma, which is characterized by optic nerve damage primarily due to elevated intraocular pressure, results in visual field narrowing and loss. Over 10% of individuals aged ≥60 years in Japan have glaucoma, making it the primary cause of blindness in the country [ 8 ].

This study examined the equilibrium function and fall susceptibility in patients with AMD and glaucoma who were attending a standard ophthalmology outpatient clinic and had no subjective vertigo symptoms. By measuring the fall risk in a population of individuals with mild-to-moderate visual impairment, this research may help develop fall prevention strategies for such individuals.

Materials and methods

Study design and participants.

This observational study was conducted at the Otorhinolaryngology Department of Shinseikai Toyama Hospital between April 1, 2021, and March 31, 2022. Patients diagnosed with AMD and glaucoma, representing visual acuity and visual field deficits, respectively, were enrolled from the Ophthalmology Department of our hospital. Thirty patients with AMD (median age, 76.0 years) and 30 patients with glaucoma (median age, 64.5 years) participated in this study ( Table 1 ). All patients were current attendees of the hospital, and their medical records were used to investigate their medical history and cognitive function. The mean left–right difference in hearing of all study participants was <15 dB. Exclusion parameters included any neurological or musculoskeletal alignment history that could affect balance and gait, potentially confounding the visual impairment-related fall risk. Patients with a history of vertigo or a diagnosis of vestibular dysfunction by a specialist were excluded. Patients diagnosed with cognitive impairment were excluded because cognitive impairment can affect the results of the questionnaire.

Table 1. https://doi.org/10.1371/journal.pone.0301377.t001

Visual function assessment

Monocular visual acuity was assessed in patients with AMD using decimal best-corrected visual acuity (BCVA) with a Landolt ring. In contrast, the monocular visual fields of glaucoma patients were assessed using the mean deviation (MD) value (dB) from the Humphrey visual field test. BCVA values were converted to logMAR for arithmetic computations and statistical analysis.

High logMAR values signify low visual acuity, whereas low MD values indicate a compromised visual field. The better eye’s values were used for both visual acuity and visual field evaluations. Because human vision is binocular with partially crossed optic pathways, people rely on information from the better eye to discriminate objects and to control eye movements and the postural reflexes derived from peripheral vision. The better eye was therefore the main object of the analysis.
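The conversion and the better-eye selection described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the standard relation logMAR = −log10(decimal acuity), and the function names are hypothetical.

```python
import math


def decimal_to_logmar(decimal_acuity: float) -> float:
    """Convert decimal visual acuity to logMAR using logMAR = -log10(decimal)."""
    if decimal_acuity <= 0:
        raise ValueError("decimal acuity must be positive")
    return -math.log10(decimal_acuity)


def better_eye_logmar(right_decimal: float, left_decimal: float) -> float:
    """Return the logMAR of the better eye (lower logMAR means better acuity)."""
    return min(decimal_to_logmar(right_decimal), decimal_to_logmar(left_decimal))


def better_eye_md(right_md_db: float, left_md_db: float) -> float:
    """Return the MD (dB) of the better eye (higher MD means a less impaired field)."""
    return max(right_md_db, left_md_db)
```

For example, a decimal acuity of 1.0 maps to logMAR 0, and 0.1 maps to logMAR 1, so poorer acuity gives a higher logMAR value.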

We also measured the visual fields extensively using Goldmann visual field meters; however, we did not analyze these measurements because they are not routinely performed in clinical practice and may not reflect the visual fields at the time of enrollment in this study.

Outcome measures

The outcome measures comprised the equilibrium function and fall risk evaluations. Equilibrium function evaluation included pathological eye movement analysis, bedside head impulse test (HIT), single-leg upright test, eye-tracking test (ETT), optokinetic nystagmus (OKN), and posturography. The HIT is a technique used to evaluate semicircular canal dysfunction by observing the vestibulo–ocular reflex. The examiner sat facing the participants and asked them to look at the tip of the examiner’s nose. The examiner then firmly grasped the participant’s temporal region with both hands and applied a fast, small rotation of the head for impulse stimulation. In cases of semicircular canal dysfunction, head impulse stimulation in the direction of the affected side produces a “catch-up saccade.” The head impulse was applied three times each to the left and right sides, and the result was considered positive when the catch-up saccade was observed two or more times [ 9 ]. The ETT measures eye movements by fixing the participant’s head and having the eyes track a smoothly moving optotype (amplitude, 40°; frequency, 0.3 Hz) in front of the eye. Smooth eye movements were considered normal, and saccadic and ataxic patterns were considered abnormal [ 10 ]. The OKN test measures eye movement while the participant looks at an object moving at a constant angular velocity in front of the eye. Stimuli of 60°/s were applied in the left and right directions. Saccadic patterns were considered abnormal [ 11 ]. Eye movements were recorded using the yVOG (Daiichi Medical Co., Ltd., Tokyo, Japan), and posturography was performed using the Gravicoda GW-31 (Anima Co., Ltd., Tokyo, Japan). Posturography was performed on solid or rubber foam surfaces, either while observing a small spot in front of a white wall or with the eyes closed under an eye mask. The foam ratio (posturography with/without foam) with eyes closed served as a somatosensory-dependent postural control metric, whereas the Romberg foam ratio served as a visual-dependent metric. The quotient of (Romberg ratio on foam)/(foam ratio with closed eyes) was computed as a visual postural control dependence index, named the “visual/somatosensory ratio” [ 12 ].
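The three ratios defined above can be sketched in a few lines. This is an illustration under stated assumptions: the Romberg ratio is taken as eyes-closed sway divided by eyes-open sway (the usual convention, not spelled out in the text), the sway measure itself (for example, area or velocity) is left unspecified, and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SwayMeasurements:
    """One sway value per posturography condition (units depend on the device)."""
    firm_eyes_open: float
    firm_eyes_closed: float
    foam_eyes_open: float
    foam_eyes_closed: float


def romberg_ratio_on_foam(s: SwayMeasurements) -> float:
    # Assumed convention: eyes closed / eyes open, both measured on foam.
    return s.foam_eyes_closed / s.foam_eyes_open


def foam_ratio_eyes_closed(s: SwayMeasurements) -> float:
    # With foam / without foam, both measured with eyes closed.
    return s.foam_eyes_closed / s.firm_eyes_closed


def visual_somatosensory_ratio(s: SwayMeasurements) -> float:
    # (Romberg ratio on foam) / (foam ratio with closed eyes).
    return romberg_ratio_on_foam(s) / foam_ratio_eyes_closed(s)
```

Under these definitions, a lower value of `visual_somatosensory_ratio` corresponds to less visual dependence relative to somatosensory dependence.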

Patients completed the Dizziness Handicap Inventory (DHI) [ 13 ], Activities-specific Balance Confidence (ABC) scale [ 14 ], Falls Efficacy Scale-International (FES-I) [ 15 ], and the Hospital Anxiety and Depression Scale (HADS) [ 16 ]. They also documented falls over the preceding six months. The DHI is designed to quantify the magnitude of subjective physical, emotional, and functional impairments. It measures the extent of impairment in daily activities caused by dizziness. Scores range between 0 and 100 points, with categorizations of mild (0–30 points), moderate (31–60 points), and severe (60–100 points) to denote severity [ 17 ]. The ABC scale employs a 10-point scale ranging 0–100% across 16 activities of daily living. An average score of 67% or lower across these 16 items indicates an increased fall risk [ 18 ]. The FES-I is a questionnaire focused on the fear of falling. It quantifies self-efficacy related to falls. Self-efficacy is an individual’s belief in their capacity to act in the manner necessary to reach specific goals [ 19 ]. Participants rate their level of caution to avoid falling during 16 indoor and outdoor activities on a four-point scale. Cumulatively, higher scores signify lower self-efficacy regarding falls. Moreira et al. established 23 points as a threshold score [ 20 ]. The HADS is a straightforward, 14-item questionnaire designed to evaluate anxiety and depression. Subscale scores of ≥12 for either anxiety or depression are reported to have a sensitivity of 92% and a specificity of 90% in diagnosing psychiatric morbidity [ 21 ]. All fall risk evaluations were conducted using self-report questionnaires.
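The cut-offs quoted above can be collected into a small scoring sketch. It is illustrative only and rests on several assumptions: a DHI score of exactly 60, which the text lists in both the moderate and severe bands, is assigned to moderate here; FES-I items are assumed to be scored 1 to 4; and a total at or above 23 is treated as reaching the threshold. The function names are hypothetical.

```python
def dhi_severity(score: int) -> str:
    """Classify a DHI total (0-100). A score of 60 is treated as moderate (assumption)."""
    if not 0 <= score <= 100:
        raise ValueError("DHI score must be between 0 and 100")
    if score <= 30:
        return "mild"
    if score <= 60:
        return "moderate"
    return "severe"


def abc_increased_fall_risk(item_percentages: list[float]) -> bool:
    """True when the mean of the 16 ABC items is 67% or lower."""
    if len(item_percentages) != 16:
        raise ValueError("the ABC scale has 16 items")
    return sum(item_percentages) / 16 <= 67.0


def fes_i_total(item_scores: list[int]) -> int:
    """Sum the 16 FES-I items, each assumed to be scored 1-4 (higher = more concern)."""
    if len(item_scores) != 16 or any(not 1 <= x <= 4 for x in item_scores):
        raise ValueError("FES-I expects 16 items scored 1-4")
    return sum(item_scores)


def fes_i_reaches_threshold(total: int, threshold: int = 23) -> bool:
    """True when the FES-I total is at or above the threshold (direction assumed)."""
    return total >= threshold


def hads_subscale_flag(subscale_score: int) -> bool:
    """True when an anxiety or depression subscale score is 12 or higher."""
    return subscale_score >= 12
```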

Ethical considerations

This study was approved by the Research Ethics Committee of the Shinseikai Toyama Hospital (Approval No.: 210309–1). All participants provided written informed consent and the study strictly adhered to the tenets of the Declaration of Helsinki.

Statistical analysis

Descriptive statistical methods were used for data analysis. Categorical variables are presented as frequencies and percentages, whereas continuous variables are presented as medians and interquartile ranges (IQR). Spearman’s correlation analysis examined the correlations among equilibrium function, fall risk, and other variables.
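For readers unfamiliar with Spearman’s correlation, the coefficient is the Pearson correlation of the rank-transformed data, with tied values given their average rank. The sketch below uses only the Python standard library and is not the Stata procedure used in the study; it returns the coefficient only, without a P value, and the function names are hypothetical.

```python
import math


def average_ranks(values: list[float]) -> list[float]:
    """Return 1-based ranks, assigning tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)


def spearman_rho(x: list[float], y: list[float]) -> float:
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    return pearson(average_ranks(x), average_ranks(y))
```

Because only ranks enter the calculation, any monotonic relationship, linear or not, yields a coefficient of 1 or −1.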

Although this research was preliminary and did not include validation testing, the sample size was deduced from previous studies. In one study, approximately 30% of patients with vestibular dysfunction had an ABC scale score of <67%, indicating an elevated fall risk [ 22 ]. Assuming a marginally reduced fall risk (20%) in our study cohort, a minimum sample size of 44 patients was deduced to achieve a 95% confidence interval (CI) ranging from 8% to 32% (60% relative accuracy). Analyses were performed using Stata software v18.0 (StataCorp LP, Texas, USA). Statistical significance was set at P<0.05.
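The precision target can be checked with a normal-approximation (Wald) interval for a proportion. The text does not state which interval method the authors used, so this is an assumption made for illustration, and the function names are hypothetical.

```python
import math


def wald_ci(p: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Normal-approximation confidence interval for a proportion p with sample size n."""
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


def min_n_for_half_width(p: float, half_width: float, z: float = 1.96) -> int:
    """Smallest n whose Wald interval half-width does not exceed half_width."""
    return math.ceil(z ** 2 * p * (1 - p) / half_width ** 2)


# Expected proportion of 20% with the stated sample size of 44.
low, high = wald_ci(0.20, 44)
print(f"95% CI at n = 44: {low:.3f} to {high:.3f}")
print("Wald minimum n for a half-width of 0.12:", min_n_for_half_width(0.20, 0.12))
```

With n = 44 the Wald interval is roughly 8% to 32%, consistent with the stated range. The Wald formula itself gives a minimum of 43, so the reported 44 may reflect a different interval method or rounding.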

Demographics and visual impairment characteristics

The cohort included 30 patients with AMD and 30 patients with glaucoma, with females constituting 35% of the total. The participants’ ages ranged from 40 to 84 years, with a median age of 70.0 years. Table 1 shows the demographic details and the extent of visual deficits among the participants.

Evaluation of fall risk and self-efficacy pertaining to falls

In the previous six months, 7% of patients with AMD and glaucoma reported falls. The median FES-I scores were comparable between patients with AMD (23.0, IQR: 20.0 to 29.0) and glaucoma (23.5, IQR: 20.0 to 27.0). Similarly, the median ABC scale scores for AMD (95.3, IQR: 83.8 to 98.8) and glaucoma (96.3, IQR: 86.9 to 98.8) patients showed no significant difference (p = 0.76). Regarding fall-related self-efficacy, 3.3% of patients with AMD and 10.0% of patients with glaucoma registered an ABC scale score of <67%, a threshold indicative of an increased fall risk in older individuals [ 22 ]. In the single-leg upright test, 20% of patients with AMD and 3% of patients with glaucoma maintained their stance for <5 s, a duration associated with an elevated fall risk [ 23 ]. However, this difference was not statistically significant (p = 0.103).
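The single-leg upright comparison can be illustrated with a two-sided Fisher’s exact test. Two assumptions here are not stated in the text: that the percentages correspond to 6 of 30 patients with AMD and 1 of 30 patients with glaucoma, and that Fisher’s exact test was the method used. The function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test P value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same margins
    that is no more probable than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def prob(k: int) -> float:
        # Probability that the first row contains k of the col1 events.
        return comb(row1, k) * comb(row2, col1 - k) / comb(total, col1)

    p_observed = prob(a)
    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    return sum(
        prob(k)
        for k in range(k_min, k_max + 1)
        if prob(k) <= p_observed * (1 + 1e-9)  # small tolerance for float ties
    )


# Assumed counts: 6/30 (AMD) and 1/30 (glaucoma) held the stance for <5 s.
p_value = fisher_exact_two_sided(6, 24, 1, 29)
print(f"two-sided P = {p_value:.3f}")
```

Under these assumptions the result is approximately 0.103, which agrees with the reported value.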

Oculomotor and vestibular function assessment

Table 2 shows the results of the balance function tests and fall-related self-efficacy metrics. Most participants in both groups demonstrated normal HIT, ETT, and OKN results. Specifically, 80% of the patients with AMD and 73% of the patients with glaucoma had standard ETT results. All patients with AMD and 90–97% of patients with glaucoma exhibited normal OKN results. Positive HIT results were observed in 10–13% of patients with AMD and 3% of those with glaucoma. In terms of subjective dizziness and associated disability, only a single glaucoma patient registered a DHI score ≥60.

Table 2. https://doi.org/10.1371/journal.pone.0301377.t002

Balance and its association with visual deficits

Fig 1 shows the relationship between visual impairments (visual acuity and visual field deficits) and the visual/somatosensory ratio. Neither AMD nor glaucoma patients exhibited significant correlations with Spearman’s correlation analysis. Among patients with glaucoma, more pronounced visual field impairment in the better-performing eye corresponded to reduced visual dependence. No association was identified between visual deficits and body equilibrium function or fall susceptibility across the other metrics ( Table 3 ).

Fig 1. (A) Relationship between visual acuity and the visual/somatosensory ratio. (B) Relationship between visual field impairment and the visual/somatosensory ratio. The better eye’s results were used for both visual acuity and visual field evaluations.

https://doi.org/10.1371/journal.pone.0301377.g001

Table 3. https://doi.org/10.1371/journal.pone.0301377.t003

Our investigation aimed to evaluate the equilibrium function and susceptibility to falls in patients with AMD and glaucoma. The patients exhibited standard outcomes in the HIT, ETT, and OKN evaluations. The extent of visual acuity and visual field deficits minimally influenced subjective dizziness, degree of disability, and fall-related self-efficacy. Patients with AMD as well as those with glaucoma generally demonstrated high self-efficacy. Notably, the proportion of patients with a heightened fall risk was lower than that of patients with peripheral vestibular disorders. Of the 52 patients with peripheral vestibular disorders, 18 (34.6%) were assessed as being at risk for falls using the ABC scale [ 22 ]. This indicates that individuals with mild-to-moderate visual acuity and visual field deficits with normal vestibular and somatic balance functions might exhibit fewer dizziness manifestations and potentially a diminished fall risk compared to those with vestibular deficits.

The triad of visual, vestibular, and somatosensory inputs is crucial for postural regulation. While visual anomalies can compromise postural stability, existing literature suggests potential compensatory mechanisms via other sensory modalities [ 24 ]. Individuals with visual deficits often exhibit irregular postural reflexes and motor patterns, resulting in asymmetric muscle distribution. Nevertheless, they can acclimatize to these anomalies by harnessing non-visual feedback mechanisms such as auditory cues.

The visual/somatosensory ratio was used as an indicator of visual dependence of postural control. This ratio decreased in patients with impaired visual fields. This indicates potential sensory reweighting in patients with compromised visual fields, suggesting that other senses might compensate for the deteriorated visual fields. In contrast, these patterns were absent in patients with visual acuity deficits. Because visual acuity measures central vision, it may not be relevant to postural control. The peripheral retina is primarily responsible for the induction of self-motion perception [ 25 , 26 ]. Most studies have assigned the most important role in postural control to the peripheral retina [ 27 ].

The existing research on equilibrium function and postural stability in visually impaired individuals is equivocal. For instance, one study reported a correlation between glaucoma severity and equilibrium, noting that pronounced visual field loss in the better eye, but not in the worse eye, correlated with augmented standing sway rates. This sway persists even with eyes closed, suggesting that postural control deficits are not solely attributable to compromised peripheral visual input [ 6 ]. Another study highlighted that visually impaired individuals predominantly employ hip-centric strategies to maintain postural stability, indicating that visual deficits adversely affect postural stability [ 28 ].

The ramifications of visual deficits on postural control are contingent on the nature (congenital vs. acquired) and severity of the impairment. Acquired visual deficits tend to have a more pronounced effect on postural control than congenital ones [ 29 ]. Furthermore, an exploration of visually impaired athletes across diverse sports revealed that postural control disparities were linked to the extent of visual loss and sports-specific modalities [ 30 ]. The cohort in our study predominantly comprised individuals with mild-to-moderate acquired visual deficits. In this demographic group, the influence of visual impairment on the fall risk was minimal.

Nevertheless, this does not reduce the importance of balance interventions in this cohort. Visual acuity and field deficits may develop gradually without the patient being aware. Four cardinal interventions (education, medical assessment, physical exercise, and environmental modifications) have been identified for fall prevention among visually impaired older individuals [ 31 ]. Further empirical investigations are required to ascertain the necessity for, and evaluate the efficacy of, specialized interdisciplinary fall prevention initiatives. Vestibular stimulation rehabilitation reportedly improves the postural stability of visually impaired individuals, bringing it on par with that of individuals with normal vision [ 32 ]. A systematic review showed that a minimum of six weeks of balance and core stability training could beneficially influence fall risk in visually impaired individuals, irrespective of age, sex, or visual impairment severity [ 33 ]. Hence, exercises that stimulate the vestibular system through head and body movements are recommended to improve balance in visually impaired individuals.

The robustness of this study derives from the inclusion of patients with mild-to-moderate visual acuity and visual field deficits, particularly those with acquired conditions such as AMD and glaucoma, who might be oblivious to their gradual visual deterioration. Determining whether such latent visual deficits augment fall risk is pivotal for falls prevention in older individuals. Subjective symptoms were juxtaposed with empirical findings by combining subjective questionnaire surveys and objective balance function tests.

However, this study had some limitations. Age matching and control for other characteristics that could potentially influence the outcomes were not performed. This may have introduced confounding factors. Owing to its observational nature, there may have been a selection bias in patient recruitment. Specifically, the patients were asked to participate as volunteers, which may have discouraged the recruitment of patients with severe visual impairment. The study design inherently precludes causal inferences. Moreover, this study did not account for other potential postural control and fall risk factors such as muscle strength, walking speed, and environmental hazards. The relatively modest sample size may limit the generalizability of the findings.

In summary, our findings underscore that most patients with AMD and glaucoma attending an ophthalmology clinic exhibit standard postural control and have a lower fall risk than patients with vestibular disorders. This indicates that visual deficits did not substantially undermine postural stability in this cohort. Nonetheless, medical practitioners and policymakers should consider vestibular function evaluations in visually impaired patients with a heightened fall risk due to other underlying conditions. Prospective studies with matched cohorts and large sample sizes are required to corroborate our findings and elucidate the underlying mechanisms.

Supporting information

https://doi.org/10.1371/journal.pone.0301377.s001

Acknowledgments

The authors extend their gratitude to the certified orthoptists of Shinseikai Toyama Hospital for patient recruitment, and to the Department of Otolaryngology staff and clinical laboratory technicians of Shinseikai Toyama Hospital for their assistance with the body balance function tests. We would like to thank Honyaku Center Inc. for English language editing. ChatGPT was used only for syntax proofreading and checking spelling errors and grammar.

  • 19. Bandura A. Self-efficacy: The exercise of control. New York, NY: W. H. Freeman; 1997.


Observational Research Manager - US Remote

HOW MIGHT YOU DEFY IMAGINATION?

You’ve worked hard to become the professional you are today and are now ready to take the next step in your career. How will you put your skills, experience and passion to work toward your goals? At Amgen, our shared mission—to serve patients—drives all that we do. It is key to our becoming one of the world’s leading biotechnology companies, reaching over 10 million patients worldwide. Come do your best work alongside other innovative, driven professionals in this meaningful role.

Observational Research Manager

What you will do

Let’s do this. Let’s change the world. In this vital role you will be responsible for research activities on therapeutic/product or Data and Analytics Center (DAC) functionally aligned teams.

With a constantly growing demand for information from regulatory and reimbursement agencies, Observational Research (OR) has become a critical component in drug development and commercialization. Amgen’s Center for Observational Research (CfOR) partners with internal and external teams to generate real world evidence for multiple partners across the product lifecycle. The CfOR gives evidence regarding the frequency and distribution of disease or the clinical burden of disease, the natural history or clinical course of disease, the design of clinical trials, cost and utilization patterns and the safety and efficiency of interventions.

The CfOR Manager is recognized as a strong scientific contributor and a first or contributing author for papers in peer-reviewed journals, abstracts for scientific congresses and internal reports that improve the company’s mission. Preferably, CfOR Managers will be located on the Amgen corporate campus in Thousand Oaks, CA, but remote working opportunities are also available.

Key activities of a CfOR Manager include:

  • Providing support in the design and execution of RWE studies across the product lifecycle
  • Performing and handling research projects involving the analysis of multiple types of data including medical claims, electronic health records and prospective observational cohort studies.
  • Contributing to the development and implementation of innovative analytic methods, capabilities and tools to enable rapid, scalable and reproducible RWE.
  • Staying current on the latest developments in the field of observational research and drug development.
  • Promoting awareness, understanding and use of OR methods.
  • Communicating significant scientific information to a variety of audiences.

What we expect of you

We are all different, yet we all use our unique contributions to serve patients. The research professional we seek is driven with these qualifications.

Basic Qualifications:

Doctorate degree

Master's degree and 3 years of related research and scientific experience

Bachelor's degree and 5 years of related research and scientific experience

Associate's degree and 10 years of related research and scientific experience

High School diploma / GED and 12 years of related research and scientific experience

Preferred Qualifications:

  • Doctorate in Epidemiology or other subject with high observational research content
  • Experience communicating observational research information (written and oral)
  • Experience working in multi-disciplinary teams

What you can expect of us

As we work to develop treatments that take care of others, we also work to care for our teammates’ professional and personal growth and well-being.

The expected annual salary range for this role in the U.S. (excluding Puerto Rico) is posted. Actual salary will vary based on several factors including but not limited to, relevant skills, experience, and qualifications.

Amgen offers a Total Rewards Plan comprising health and welfare plans for staff and eligible dependents, financial plans with opportunities to save towards retirement or other goals, work/life balance, and career development opportunities including:

  • Comprehensive employee benefits package, including a Retirement and Savings Plan with generous company contributions, group medical, dental and vision coverage, life and disability insurance, and flexible spending accounts.
  • A discretionary annual bonus program, or for field sales representatives, a sales-based incentive plan
  • Stock-based long-term incentives
  • Award-winning time-off plans and bi-annual company-wide shutdowns
  • Flexible work models, including remote work arrangements, where possible

for a career that defies imagination

Objects in your future are closer than they appear. Join us.

careers.amgen.com

Amgen is an Equal Opportunity employer and will consider you without regard to your race, color, religion, sex, sexual orientation, gender identity, national origin, protected veteran status, or disability status.

We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request accommodation.

Prognostic utility and characteristics of MIB-1 labeling index as a proliferative activity marker in childhood low-grade glioma: a retrospective observational study

  • Open access
  • Published: 05 April 2024
  • Volume 150 , article number  178 , ( 2024 )

  • David Gorodezki 1 ,
  • Julian Zipfel 2 ,
  • Andrea Bevot 3 ,
  • Thomas Nägele 4 ,
  • Martin Ebinger 1 ,
  • Martin U. Schuhmann 2 &
  • Jens Schittenhelm 5  

The prognostic utility of MIB-1 labeling index (LI) in pediatric low-grade glioma (PLGG) has not yet conclusively been described. We assess the correlation of MIB-1 LI and tumor growth velocity (TGV), aiming to contribute to the understanding of clinical implications and the predictive value of MIB-1 LI as an indicator of proliferative activity and progression-free survival (PFS) in PLGG.

MIB-1 LI of a cohort of 172 nonependymal PLGGs were comprehensively characterized. Correlation to TGV, assessed by sequential MRI-based three-dimensional volumetry, and PFS was analyzed.

Mean MIB-1 LI accounted for 2.7% (range: < 1–10) and showed a significant decrease to 1.5% at secondary surgery ( p  = .0013). A significant difference of MIB-1 LI in different histopathological types and a correlation to tumor volume at diagnosis could be shown. Linear regression analysis showed a correlation between MIB-1 LI and preoperative TGV (R² = .55, p  < .0001), while correlation to TGV remarkably decreased after incomplete resection (R² = .08, p  = .013). Log-rank test showed no association of MIB-1 LI and 5-year PFS after incomplete (MIB-1 LI > 1 vs ≤ 1%: 48 vs 46%, p  = .73) and gross-total resection (MIB-1 LI > 1 vs ≤ 1%: 89 vs 95%, p  = .75).

These data confirm a correlation of MIB-1 LI and radiologically detectable TGV in PLGG for the first time. Compared with preoperative TGV, a crucially decreasing correlation of MIB-1 LI and TGV after surgery may result in limited prognostic capability of MIB-1 LI in PLGG.

Introduction

Pediatric low-grade gliomas (PLGG) comprise a heterogeneous set of central nervous system (CNS) tumors of glial and mixed glioneural histology, representing the most common brain tumor types in childhood and adolescence, and are strongly associated with alterations of the RAS/MAPK pathway (Sievert and Fisher 2009 ; Rickert and Paulus 2001 ; Ryall et al. 2020b ). These tumors are commonly characterized by a benign clinical course, mostly associated with favorable overall-survival rates and a low risk of malignant transformation or metastatic dissemination (Greuter et al. 2021 ; Sievert and Fisher 2009 ; Avinash et al. 2019 ; Chamdine et al. 2016 ; Krishnatry et al. 2016 ). However, depending on the tumor location, complete tumor resection can be crucially limited, as constantly high rates of incompletely resected PLGG have been reported over the past decades despite continuous advances in neurosurgical technology (Wisoff et al. 2011 ; Bandopadhayay et al. 2014 ; Stokland et al. 2010 ; Gnekow et al. 2012 ). Particularly in patients with tumors of limited resectability in deep-seated midline locations, recurrent progressions, treatment sequelae and unsatisfactory tumor control by subsequent adjuvant therapies frequently provoke significant morbidity (Krishnatry et al. 2016 ; Sadighi et al. 2018 ; Armstrong et al. 2009 ; van Iersel et al. 2020 ).

Ambivalent and barely predictable progression patterns of PLGG complicate treatment decisions and pose a challenge in postoperative care and adjuvant management after incomplete resection, as previous studies report progression-free survival (PFS) rates of 45–65% after incomplete resection (Sievert and Fisher 2009 ; Wisoff et al. 2011 ; Stokland et al. 2010 ; Gnekow et al. 2012 ; Benesch et al. 2006 ; Jones et al. 2018 ; Ryall et al. 2020a ; Fisher et al. 2001 ; Shaw and Wisoff 2003 ). Evaluation of reliable histological, molecular or imaging-based biomarkers for risk stratification is the objective of current research and ongoing clinical trials.

Molecular Immunology Borstel (MIB-1) is a monoclonal antibody used for qualitative detection of KI-67 in paraffin-embedded sections. KI-67 is a nonhistone DNA-binding nuclear protein that is expressed exclusively during active cell cycle phases and is absent during the resting phase and senescence (Gerdes et al. 1983 ; McCormick et al. 1993 ). As in distinct CNS and solid non-CNS malignancies, the proportion of KI-67 expressing cells has repeatedly been described as a marker of the proliferative potential in glial tumors. Owing to its straightforward application by immunohistochemistry in formalin-fixed, paraffin-embedded (FFPE) tissues, it nowadays plays an integral role in histopathological routine diagnostics (Schröder et al. 1991 ; Kałuza et al. 1997 ; Thotakura et al. 2014 ).

A correlation to histological grade in pediatric glioma has repeatedly been reported, and several studies furthermore consistently show an association of KI-67/MIB-1 LI and PFS in pediatric high-grade glioma (Yao et al. 2023 ; Matsumoto et al. 1998 ; Pollack et al. 2002 , 1997 ; Ho et al. 1998 ).

The prognostic significance of MIB-1 labeling index (LI) in PLGG, however, remains unclear, as previously published data of smaller case series partly show contradictory results and draw conflicting conclusions (Bowers et al. 2002 , 2003 ; Fisher et al. 2002 ; Dorward et al. 2010 ; Margraf et al. 2011 ; Tu et al. 2018 ; Horbinski et al. 2010 ; Cherlow et al. 2019 ; Cler et al. 2022 ). In 2003, Bowers et al. reported a study including 118 pilocytic astrocytoma (PA) patients, showing a shortened PFS in patients with tumors bearing a MIB-1 LI > 2%. These data are in line with four further studies on PAs including 35–80 cases, reporting a significant negative correlation of KI-67/MIB-1 LI and PFS (Fisher et al. 2002 ; Dorward et al. 2010 ; Margraf et al. 2011 ; Tu et al. 2018 ). In contrast, Horbinski et al. ( 2010 ) published a case series including 118 patients, showing no correlation of MIB-1 LI and adverse treatment outcome including recurrence, progression, metastatic spread or death. Similar results were reported by Cherlow et al. ( 2019 ) for 85 PLGG patients included in the ACNS0221 trial, as well as by Cler et al. ( 2022 ), showing no correlation of KI-67 with PFS in a series of PAs. Comparable results were shown by Bowers et al. ( 2002 ) in a small series of pediatric low-grade oligodendrogliomas. It should be pointed out that the previously published datasets are almost exclusively confined to distinct PLGG types, in most cases to PA, and that follow-up data are often very limited. Despite its integral part in diagnostic routine, the overall predictive value of MIB-1 LI in PLGG remains unclear.

In this study, we aim to contribute data of a large representative single center PLGG cohort comprising various PLGG types. Beyond a characterization of MIB-1 LI values and analysis of a possible association with PFS and molecular data, we assess the correlation of MIB-1 LI values and pre- and postoperative tumor growth velocity, aiming to contribute to the understanding of the clinical implication and predictive value of MIB-1 LI as a surrogate marker of the proliferative activity in PLGG on progression-free survival.

Despite a growing understanding of distinct molecular features of subordinate tumor types, PLGG are, based on similar clinical characteristics, mainly managed by uniform diagnostic pathways and treatment algorithms (Gnekow et al. 2019 ). Therefore, an integrated analysis of the prognostic value of MIB-1 LI in PLGG comprising several tumor types may provide the highest comprehensive value for clinical practice. To take account of the distinct biology of various tumor types included in this cohort, however, we moreover compare tumor type-specific MIB-1 LI values and analyze the correlation of MIB-1 LI and PFS of various tumor types.

Patients and methods

Patient selection criteria.

Patients < 18 years of age treated for histologically confirmed PLGG between 2006 and 2022 at University Children’s Hospital Tuebingen, a tertiary care referral center for pediatric neurosurgery and neuro-oncology, were identified by a search of the medical center database and included in the study. Eligible diagnoses included glial and glioneural tumors CNS WHO grade 1 or 2 according to the 5th edition of the WHO classification of central nervous system tumors of 2021 (Louis et al. 2021 ) with exclusion of ependymomas. As feasible, pre-2021 tumor diagnoses were adapted to the currently valid classification based on available molecular genetic data by individually reviewing potentially relevant cases. Patients who did not undergo surgical treatment or received surgery at a foreign institution had to be excluded from subsequent analyses owing to unavailable histopathological data.

Histopathology reports were contributed by the Department of Neuropathology at University Hospital Tuebingen, while diagnoses were routinely confirmed by the German Brain Tumor Reference Center (Institute of Neuropathology, University of Bonn Medical Center, Bonn, Germany). Patients diagnosed with PLGG associated with Neurofibromatosis Type 1 (NF-1) and other phacomatoses were equally included into the study.

Demographical and clinical data of eligible patients including age, sex, histopathological diagnosis, molecular BRAF status, tumor site and chronological sequence of events were extracted from the Pediatric Neuro-oncology center database.

Analysis of pre- and postoperative tumor growth velocity was performed using serial quantification of tumor burden on sequentially acquired T2-weighted MRI scans, in most cases from 1.5/3 T MRI scanners. Based on previous experience in distinct intracranial malignancies, three-dimensional volumetry was preferred to linear assessment of lesion extension, as comparative analysis has shown superior sensitivity of three-dimensional assessment of tumor expansion (Harris et al. 2008 ). Therefore, semiautomated calculation of tumor volume was conducted following manual determination of tumor margins in 1–3 mm axial MRI slices. Image-based volumetric segmentation was applied using BrainLab Elements (version 3.0, BrainLab, Munich, Germany), a specialized and broadly applied software for image-guided therapy in surgical and radiation oncology. Tumor growth velocity was quantified by calculation of tumor expansion over time. Repeated volumetry by various investigators showed negligible variation of tumor volumes.
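The text states only that growth velocity was calculated as tumor expansion over time and does not give the exact formula. As a minimal sketch, assuming velocity is taken as the least-squares slope of serial volumes against time, the calculation could look as follows. The function name `growth_velocity`, the units (months, cm³) and the example values are hypothetical and are not taken from the study.

```python
def growth_velocity(scans):
    """Least-squares slope of tumor volume against time.

    `scans` is a list of (time_in_months, volume_in_cm3) pairs, one per MRI.
    Returns the growth velocity in cm^3 per month (assumed definition).
    """
    if len(scans) < 2:
        raise ValueError("at least two scans are required")
    times = [t for t, _ in scans]
    volumes = [v for _, v in scans]
    mean_t = sum(times) / len(times)
    mean_v = sum(volumes) / len(volumes)
    # Sum of squared deviations in time, and time-volume cross products.
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, volumes))
    if sxx == 0:
        raise ValueError("scans must be acquired at different time points")
    return sxy / sxx


# Hypothetical patient: three scans over 12 months.
example_scans = [(0, 10.0), (6, 11.5), (12, 13.0)]
print(growth_velocity(example_scans))
```

Fitting a slope through all available scans, rather than using only the first and last scan, makes the estimate less sensitive to the measurement error of a single volumetry.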

MIB-1 LI was routinely assessed at the time of histopathological diagnosis following automated immunohistochemical staining of slides from paraffin-embedded tumor samples using a well-established KI-67 antibody (DakoCytomation, Glostrup, Denmark, clone MIB-1, dilution 1:200). For this purpose, an automated staining application (BenchMark®, Ventana Medical Systems, Tucson, AZ, USA) was used, applying cell condition pretreatment (CC1 for 40 min, antibody incubation at 42 °C for 20 min) and using a universal biotinylated immunoglobulin secondary antibody, combined with diaminobenzidine as a substrate. Nuclear counterstaining with hemalaun was applied. Evaluation was performed by various experienced neuropathologists at the time of initial diagnosis. To achieve maximal homogeneity, MIB-1 LI assessment was carried out by calculating the overall percentage of MIB-1 positive tumor nuclei on large-scale full slide sections including proliferation hotspots, without confining the count to areas of high proliferative activity. Faintly stained nuclei were categorically counted as positive. In case of significant disparities between estimated MIB-1 LI values, individual discussion and reassessment were conducted. MIB-1 LI values were later retrieved from pathological records during data acquisition for the present study.
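The labeling index described above is the percentage of MIB-1 positive tumor nuclei among all tumor nuclei counted. A short illustrative sketch of that arithmetic is given below; the function name `labeling_index` and the nucleus counts are hypothetical.

```python
def labeling_index(positive_nuclei, total_nuclei):
    """Percentage of MIB-1 positive tumor nuclei among all counted nuclei."""
    if total_nuclei <= 0:
        raise ValueError("total_nuclei must be positive")
    if not 0 <= positive_nuclei <= total_nuclei:
        raise ValueError("positive_nuclei must lie between 0 and total_nuclei")
    return 100.0 * positive_nuclei / total_nuclei


# Hypothetical count: 27 positive nuclei among 1000 tumor nuclei.
print(labeling_index(27, 1000))
```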

Testing for BRAF V600E mutation was performed at time of diagnosis via pyrosequencing following PCR amplification of the BRAF gene from extracted tumor DNA. Testing for BRAF-KIAA1549 fusion was performed at time of diagnosis after extraction of RNA from tumor material and subsequent reverse transcription into cDNA by use of fusion transcript specific primers and electrophoretic segregation.

Statistical analyses

Statistical analysis of the reported data was conducted using JMP 15.2.0 (SAS Institute Inc., Cary, North Carolina, USA) and GraphPad Prism 8.0 (GraphPad Software, Inc., California, USA). The Anderson–Darling test was used to study the distribution of MIB-1 LI values and pre- and postoperative tumor growth rates. Because the data were not normally distributed, nonparametric testing was conducted for further statistical analysis using the Mann–Whitney rank sum test and the Kruskal–Wallis test. The log-rank (Mantel-Cox) test was performed for PFS curve comparison. P values < 0.05 were considered statistically significant.
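The analyses in this study were run in JMP and GraphPad Prism. Purely to illustrate how the Mann–Whitney rank sum test compares two groups, the following sketch computes the U statistic and a two-sided p value from the normal approximation. It omits tie and continuity corrections, so it is not a reproduction of the software output, and the helper names and sample values are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U, two-sided p) using the normal approximation.

    No tie or continuity correction is applied (simplifying assumption).
    """
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:n1])
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    u = min(u1, u2)
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))
    return u, p


# Hypothetical labeling index values (%) for two groups.
u_stat, p_value = mann_whitney_u([1, 2, 3], [4, 5, 6])
print(u_stat, p_value)
```

For small groups, an exact p value is preferable to the normal approximation used in this sketch.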

During the observation period, 191 patients were treated for PLGG at the stated institution. Patient age ranged from 2 to 17 years (mean: 7.9 years). Diagnoses included Pilocytic astrocytoma °1 (139 cases), Ganglioglioma °1 (36 cases), Pediatric-type diffuse low-grade glioma °2 (14 cases), Oligodendroglioma °2, IDH-mutant, 1p/19q codeleted (2 cases), Pleomorphic xanthoastrocytoma °2 (1 case), Rosette-forming glioneural tumor °1 (1 case) and Subependymal giant cell astrocytoma °1 (1 case). As several cases including diffuse low-grade glioma °2 were diagnosed up to seventeen years ago, a precise re-classification in accordance with the latest edition of the WHO classification appeared unfeasible in several cases. Among these tumors, IDH1/2 mutations and MAPK alterations were each found in three cases. Reviewing the available molecular data, these cases presumably include MAPK altered pediatric-type diffuse low-grade gliomas °2 and MAPK altered and MYB-/MYBL-altered pediatric-type diffuse astrocytoma °2. Detailed results of the molecular BRAF analyses of this cohort have previously been published in a comprehensive analysis of the tumor growth velocity of the reported cohort (Gorodezki et al. 2022 ). Association with NF-1 was present in 23 cases. Tumor locations included the posterior fossa (PF, 80 cases), the supratentorial midline and optic nerve (SML and OG, 55 cases), the cerebral hemispheres (CH, 46 cases), the spinal cord (SC, 8 cases) and the lateral ventricles (LV, 2 cases). Of 172 patients (90.1%) receiving surgery during the observation period, gross-total resection (GTR) could be achieved in 65 cases (37.7%), while incomplete resection (IR) was realized in 100 cases (58.1%). Biopsy was carried out in 7 cases (4.1%). A total of 38 patients (22.1%) received repeated surgery during the observation period, while adjuvant treatment including chemotherapy, radiation or targeted therapy was applied in 23 patients (12%).
Because our institution serves as a referral center for pediatric neurosurgery, in 18 cases (10.5%) first surgery was performed at a foreign institution, hence detailed histopathological records including MIB-1 LI values on the first tumor were not available for analysis. Distribution of diagnoses, tumor sites and treatment patterns showed congruency to previously published population based PLGG cohorts (Bandopadhayay et al. 2014 ; Stokland et al. 2010 ; Gnekow et al. 2012 ).

MIB-1 labeling index: distribution, characterization and treatment-dependent sequence

Distribution of MIB-1 LI values at first and second surgery is illustrated in Fig.  1 A. MIB-1 LI values showed a mean of 2.7% at first surgery (range: < 1–10%, n  = 154), while mean MIB-1 LI at second surgery accounted for 1.5% (< 1–5%, n  = 38). Comparison of median values showed a significant difference (Mann–Whitney U test, p  = 0.0013, see Fig.  1 B). In 27 patients who underwent repeated surgeries, available MIB-1 LI values at the time of 1st and 2nd surgery allowed for individual chronological outline of MIB-1 LI during individual treatment periods. In 19 patients (70.4%), a decrease of individual MIB-1 LI values could be observed, while an increase of MIB-1 LI could only be detected in 4 cases (14.8%). Individual chronological sequences of MIB-1 LI values are illustrated in Fig.  1 C.

figure 1

A Distribution of MIB-1 LI values at 1st and 2nd surgery of a large single center PLGG cohort B Comparison of mean MIB-1 values at 1st and 2nd surgery showed a significant difference (2.7 vs 1.5%, respectively, Mann − Whitney U test, p  = 0.0013) C Individual chronological sequences of MIB-1 LI values of 27 patients receiving two consecutive surgeries. In 19 patients (70.4%), a decrease of individual MIB-1 LI values could be observed, while an increase of MIB-1 LI could only be detected in 4 cases (14.8%)

figure 2

Comparative illustration of two cases of Pilocytic astrocytoma °1 showing a varying fraction of KI-67 expressing nuclei after immunohistochemical MIB-1 staining ( A : MIB-1 LI = 1%; B : MIB-1 LI = 3%; Ki67, clone MIB1, Dako Glostrup, 1:200 magnification, Ventana immunohistochemistry system, diaminobenzidine as brown chromogen)

We furthermore compared mean MIB-1 LI values of pretreated vs treatment-naïve PLGG, aiming to characterize the impact of neoadjuvant treatment on MIB-1 LI. Prior to first surgical intervention, a total of five patients received neoadjuvant radio-/chemotherapy, while no pretreatment was applied in 140 patients. Compared with treatment-naïve PLGG, pretreated tumors showed a significantly lower mean MIB-1 LI value (1.0 vs 2.8%, Mann − Whitney U test, p  = 0.035, see Fig.  3 A). At 2nd surgery, comparison of MIB-1 LI values of pretreated vs treatment naïve tumors showed no statistically significant difference (0.9% vs 1.7%, Mann − Whitney U test, p  = 0.11, see Fig.  3 B).

figure 3

A Comparative analysis of MIB-1 LI values of pretreated vs treatment naïve PLGG at 1st surgery showed significantly lower mean MIB-1 LI values in patients pretreated with neoadjuvant radio-/chemotherapy (1.0 vs 2.8%, Mann − Whitney U test, p  = 0.035) B At 2nd surgery, comparison of MIB-1 LI values of pretreated vs treatment naïve tumors showed no statistically significant difference (0.9% vs 1.7%, Mann − Whitney U test, p  = 0.11)

Studying age dependence of MIB-1 LI revealed significantly higher MIB-1 LI values in younger patients (3.1 vs 2.8 vs 2.4% in patients aged 0–5 vs 6–11 vs 12–18 years at time of diagnosis, respectively, p  = 0.04, Kruskal − Wallis test). Comparison of tumor type-specific MIB-1 LI values showed a significant difference, as Pilocytic astrocytomas °1 were characterized by the highest mean MIB-1 LI values, followed by pediatric-type diffuse low-grade gliomas (including diffuse astrocytomas °2, MYB or MYBL1 -altered; and diffuse low-grade gliomas °2, MAPK -pathway altered) and Gangliogliomas °1 (2.9 vs 2.6 vs 2.1%, p  = 0.04, Kruskal − Wallis test). Remarkably, tumors characterized by an initial volume of > 20 cm³ at diagnosis showed a significantly higher mean MIB-1 LI as compared to tumors showing a volume of ≤ 20 cm³ (3.6 vs 2%, p  = 0.002, Mann − Whitney U test).

No significant differences of mean MIB-1 LI values could be shown with regard to patient sex, tumor location, WHO grade or the most frequent molecular aberrations of BRAF in PLGG (BRAF-KIAA1549 fusion and BRAF V600E-mutation) compared to BRAF wild-type tumors. Patient characteristics and corresponding MIB-1 LI values are illustrated in Table  1 .

Correlation of MIB-1 labeling index on pre- and postoperative tumor growth velocity

To study the clinical significance of MIB-1 LI as a proliferative activity marker in PLGG, we analyzed the implication of MIB-1 LI on pre- and postoperative tumor growth velocity within our cohort.

In 31 patients, comparable MRI sequences over a surveillance period of ≥ 6 months prior to surgical resection or biopsy allowed for calculation of preoperative tumor growth rates. Linear regression analysis showed a significant correlation between MIB-1 LI values and preoperative tumor growth velocity ( n  = 31, R  = 0.128, R² = 0.546, p  < 0.001, see Fig.  4 A).

figure 4

A Linear regression analysis showed a significant correlation between MIB-1 LI values and preoperative tumor growth velocity ( n  = 31, R  = 0.128, R² = 0.546, p  < 0.001) B A significantly decreasing correlation of MIB-1 LI and postoperative tumor growth velocity could be shown ( n  = 76, R  = 0.014, R² = 0.08, p  = 0.013)

For calculation of postoperative tumor growth rates, comparable sequential postoperative MRI data over a surveillance period of ≥ 6 months with corresponding MIB-1 LI values from a total of 76 patients could be included in the subsequent analyses. Compared to preoperative tumor growth rates, a crucially decreasing correlation of MIB-1 LI values and postoperative tumor growth velocity could be shown after IR (linear regression analysis, n  = 76, R  = 0.014, R² = 0.08, p  = 0.013, see Fig.  4 B).
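To illustrate what a simple linear regression of growth velocity on MIB-1 LI reports, the following sketch fits a least-squares line and returns the slope, the intercept and the coefficient of determination, which in a regression with a single predictor equals the squared Pearson correlation. The function name `linear_fit` and the data points are hypothetical and do not reproduce the study data.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit of y on x.

    Returns (slope, intercept, r_squared) for a single predictor.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y need the same length of at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each contain varying values")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical pairs of labeling index (%) and growth velocity.
li_values = [1.0, 2.0, 3.0]
velocities = [1.0, 3.0, 2.0]
print(linear_fit(li_values, velocities))
```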

Correlation of MIB-1 labeling index and progression-free survival (PFS)

We furthermore studied the correlation of MIB-1 LI values and PFS within the reported cohort after IR and GTR. Including all patients, 5- and 10-year PFS after IR accounted for 63% and 46%, respectively, while 5- and 10-year PFS after GTR accounted for 91%.

Within the subgroup of patients who had undergone IR, comparison of 5- and 10-year PFS in cases with MIB-1 LI ≤ 1 vs > 1% showed no significant disparity, as 5-year PFS accounted for 63.5 vs 55.6%, respectively, while 10-year PFS accounted for 46.2 vs 47.1% ( n  = 83, log rank test, Chi square = 0.63, p  = 0.625, see Fig.  5 A).

figure 5

A Comparison of 5- and 10-year-PFS in incompletely resected PLGG with MIB-1 LI ≤ 1 vs > 1% showed no significant difference (63.5 vs 55.6% and 46.2 vs 47.1%, respectively, log rank test, Chi square 0.63, p  = 0.625) B No significant difference of 5- and 10-year-progression-free survival in gross-totally resected PLGG with MIB-1 LI ≤ 1 vs > 1% could be shown (95.5 vs 89.0%, respectively, log rank test, Chi square 0.58, p  = 0.75)

After GTR, equally no significant difference in 5- and 10-year PFS in patients with MIB-1 LI ≤ 1 vs > 1% could be observed, as 5- and 10-year PFS accounted for 95.5 vs 89.0%, respectively ( n  = 64, log rank test, Chi square = 0.58, p  = 0.75, see Fig.  5 B).
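The 5- and 10-year PFS rates reported here are read from Kaplan–Meier curves. As a minimal sketch of how such an estimate is obtained from follow-up times with censoring, the product-limit calculation can be written as below. The function names and the five example patients are hypothetical, and the log-rank comparison itself is not shown.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of progression-free survival.

    `times` are follow-up durations (e.g. years); `events` flags whether
    progression occurred (True) or the patient was censored (False).
    Returns a list of (time, survival_probability) steps at event times.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        progressed = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            progressed += int(data[i][1])
            leaving += 1
            i += 1
        if progressed:
            survival *= 1 - progressed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def survival_at(curve, t):
    """Read the estimated survival probability at time t from the curve."""
    value = 1.0
    for step_time, step_value in curve:
        if step_time > t:
            break
        value = step_value
    return value


# Hypothetical follow-up (years) of five patients.
curve = kaplan_meier([1, 2, 3, 4, 5], [True, False, True, False, True])
print(survival_at(curve, 3.5))
```

Censored patients leave the risk set without lowering the curve, which is why the estimate differs from the raw proportion of patients without progression.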

Tumor type specific comparison of PFS of tumors bearing a MIB-1 LI > 1 vs ≤ 1% after incomplete resection showed no significant difference in Pilocytic astrocytomas °1 (5-year PFS 45 vs 45%, respectively, log-rank test, Chi square = 0.32,  p  = 0.57, n  = 60), pediatric-type diffuse low-grade gliomas (5-year PFS 33 vs 50%, respectively, log-rank test, Chi square = 0.08,  p  = 0.78, n = 8) or ganglioglioma °1 (5-year PFS 85 vs 66%, respectively, log-rank test, Chi square = 0.04,  p  = 0.84, n  = 15). Respective Kaplan − Meier curves are illustrated in Fig.  6 A.

figure 6

A Tumor type specific comparison of PFS of tumors bearing a MIB-1 LI > 1 vs ≤ 1% after incomplete resection showed no significant difference in Pilocytic astrocytomas °1, pediatric-type diffuse low-grade gliomas or ganglioglioma °1. B In patients who received gross-total resection, tumor type specific comparison of PFS of tumors bearing a MIB-1 LI > 1 vs ≤ 1% after gross-total resection likewise showed no significant difference in Pilocytic astrocytomas °1, pediatric-type diffuse low-grade gliomas or ganglioglioma °1

Comparison of PFS of tumors bearing a MIB-1 LI > 1 vs ≤ 1% after gross-total resection likewise showed no significant difference in Pilocytic astrocytomas °1 (5-year PFS 96 vs 90%, respectively, log-rank test, Chi square = 0.35,  p  = 0.56, n  = 34), pediatric-type diffuse low-grade gliomas (5-year PFS 100% in both groups, log-rank test, Chi square = 0.40,  p  = 0.53, n  = 9) or ganglioglioma °1 (5-year PFS 89 vs 100%, respectively, log-rank test, Chi square = 0.89,  p  = 0.35, n  = 21). Respective Kaplan–Meier curves are illustrated in Fig.  6 B.

In the present work, beyond studying its prognostic utility on a large representative single-center PLGG cohort, we aim to contribute to a more nuanced understanding of the clinical implications of MIB-1 LI as a potential surrogate marker for the proliferative activity of PLGG. For this purpose, a potential association of MIB-1 LI values and pre- and postoperative tumor growth behavior has been analyzed including all potential confounding factors. Quantification of tumor growth velocity has been conducted using sequential three-dimensional MRI based tumor volumetry, as this method has shown a superior sensitivity in growth tracking as compared to linear diameter measurements in intracranial tumors (Harris et al. 2008 ).

While nowadays forming an integral part of histopathological routine diagnostics in CNS tumors, assessment of MIB-1 LI has previously been shown to contribute to differentiation of the degree of malignancy in tumors of the central nervous system, and a significant correlation to WHO grade in human glioma has been reported (Matsumoto et al. 1998 ; Pollack et al. 2002 ; Skjulsvik et al. 2014 ; Hsu et al. 1997 ; Krishnan et al. 2019 ). However, in contrast to distinct solid CNS and non-CNS malignancies, previous analyses of smaller PLGG cohorts addressing the prognostic value of KI-67/MIB-1 LI in PLGG, many of them from the pre-molecular era, draw conflicting conclusions and necessitate further assessment (Bowers et al. 2002 , 2003 ; Fisher et al. 2002 ; Dorward et al. 2010 ; Margraf et al. 2011 ; Tu et al. 2018 ; Horbinski et al. 2010 ; Cherlow et al. 2019 ; Cler et al. 2022 ).

Analysis of a potential association between the fraction of MIB-1 positive cells and the growth velocity within the analyzed cohort of PLGG notably showed a significant correlation of MIB-1 LI and both pre- and postoperative radiologically assessed tumor growth rates, as illustrated in Fig.  4 . This observation may confirm a clinical significance of KI-67/MIB-1 LI as a surrogate marker for the proliferative activity of glioma cells, as originally shown in comprehensive analyses on a cellular level, leading to the establishment of MIB-1 LI as the commonly used method for measuring the proliferative potential in human gliomas (Schröder et al. 1991 ; Kałuza et al. 1997 ; Thotakura et al. 2014 ). In the current work, a significant correlation of MIB-1 LI and radiologically quantifiable tumor growth velocity could be shown for the first time.

A clinically significant correlation of MIB-1 LI and tumor growth may also be supported by the observation of a significant association of MIB-1 LI and tumor volume at diagnosis within the analyzed cohort, as gliomas showing a volume of > 20 cm³ at time of detection bore a higher mean fraction of MIB-1 positive tumor cells compared to tumors measuring ≤ 20 cm³ at time of diagnosis, possibly indicating faster preoperative tumor growth velocity of PLGG bearing a higher MIB-1 LI.

Remarkably, however, compared to preoperative tumor growth rates, a crucially decreasing correlation of MIB-1 LI values at time of incomplete resection with postoperative tumor growth velocity could be shown, as illustrated in Fig.  4 . This may be explained by a significant alteration of tumor growth velocity caused by surgical intervention, as recently published data indicate a significant deceleration of tumor growth in PLGG after surgical intervention, predominantly determined by the extent of resection (Gorodezki et al. 2022 ). Presumably, this significant alteration of tumor growth behavior by surgical intervention may be the cause for a limited predictive value of MIB-1 LI at time of incomplete resection regarding postoperative growth velocity, subsequently resulting in a limited prognostic utility of MIB-1 LI regarding progression-free survival in PLGG. Ultimately, no significant association of MIB-1 LI and long-term PFS could be shown both after incomplete and gross-total resection within the analyzed cohort. For this reason, we do not advocate the use of a MIB-1 LI cutoff outside of the neurosurgical context for tumor risk stratification.

In the context of the previously published observation of growth deceleration after surgical intervention in this cohort, the significantly lower mean MIB-1 LI at the time of secondary surgery shown in this analysis may reflect an association between growth deceleration and a decrease in mean MIB-1 LI values (Gorodezki et al. 2022).

Characterization of MIB-1 LI in PLGG furthermore showed an age dependence of mean MIB-1 LI values: patients 0–5 years of age showed the highest mean MIB-1 LI, with a decreasing tendency in sub-teenage and adolescent patients. An age dependence with a tendency towards higher MIB-1 LI values in PLGG, particularly in infants, has previously been described and may reflect young age being a risk factor for significantly higher progression rates and worse treatment outcomes in PLGG (Bandopadhayay et al. 2014; Stokland et al. 2010; Gnekow et al. 2012; Fisher et al. 2002; Tu et al. 2018). The association of MIB-1 LI with age may also explain contradictory results in previous studies, as the age composition of the respective cohorts may differ significantly.

Comparing mean MIB-1 LI values of distinct histologic tumor types showed a minor yet significant difference: PAs °1 showed the highest mean MIB-1 LI, while the lowest mean value was detected in ganglioglioma °1. Given that PA did not show a significantly unfavorable long-term PFS in recent population-based cohort studies, the described differences in mean MIB-1 LI between histological diagnoses should presumably not be seen as prognostically relevant (Krishnatry et al. 2016; Wisoff et al. 2011; Bandopadhayay et al. 2014; Stokland et al. 2010; Gnekow et al. 2012).

Further analysis showed no significant association of patient sex, tumor location, WHO grade, or detection of the most frequent molecular aberrations of BRAF in PLGG (BRAF-KIAA1549 fusion and BRAF V600E mutation) with MIB-1 LI values. Consistent with this, a previously mentioned smaller case series of 70 PAs showed no significant difference in mean MIB-1 LI values between tumors of various locations (Tu et al. 2018).

There are, however, limitations to this study that need to be addressed. First, nonautomated assessment of MIB-1 LI values was applied, which potentially introduces interobserver variability and limits the accuracy of the analyzed data. Although a standardized protocol for immunohistochemical staining of MIB-1 and counting of LI was applied, and estimation of MIB-1 LI was carried out by several experienced neuropathologists, some degree of interobserver variability in the MIB-1 LI values within the presented data should be presumed. Previous analyses pointed out significant interobserver variability of MIB-1 LI assessment in primary brain tumors, depending on the counting method applied and the height of LI values, which subsequently led to the development of automated counting systems (Hsu et al. 2003; Grzybicki et al. 2001). Nevertheless, with nonautomated measurement of MIB-1 LI remaining the most prevalent method to this day, manual MIB-1 LI assessment may contribute to the transferability and viability of the analyzed data in neuropathology practice. Moreover, potentially limited accuracy and interobserver variability should be considered a possible explanation for the partially conflicting results of previous studies on the prognostic utility of MIB-1 LI in PLGG (Bowers et al. 2002, 2003; Fisher et al. 2002; Dorward et al. 2010; Margraf et al. 2011; Tu et al. 2018; Horbinski et al. 2010; Cherlow et al. 2019; Cler et al. 2022).

Further limitations include the retrospective nature of the study, as a significant number of patients had to be excluded from the analyses due to limited availability of comparable MRI sequences for quantification of tumor growth within the follow-up period. However, the importance of solid reproducibility for a potentially subjective variable like the MIB-1 LI, which may be significantly influenced by inter-laboratory deviations, supports the single-center approach of the current study.

Although the single-center approach of the study should be considered another limitation, the distribution of diagnoses, tumor sites and treatment patterns was congruent with previously published population-based PLGG studies, underlining the representativeness of the reported cohort (Bandopadhayay et al. 2014; Stokland et al. 2010; Gnekow et al. 2012).

These data may confirm, for the first time, a significant correlation of MIB-1 LI with radiologically detectable tumor growth velocity in PLGG. However, compared with preoperative tumor growth rates, a markedly weaker correlation of MIB-1 LI values with tumor growth rates after surgical intervention, as well as an age dependence of MIB-1 LI, could be shown, resulting in a limited prognostic value of MIB-1 LI cutoffs regarding PFS in PLGG.

Data availability

The datasets generated and analyzed during the current study are available from the corresponding author upon reasonable request.

Armstrong GT, Liu Q, Yasui Y, Huang S, Ness KK, Leisenring W, Hudson MM, Donaldson SS, King AA, Stovall M, Krull KR, Robison LL, Packer RJ (2009) Long-term outcomes among adult survivors of childhood central nervous system malignancies in the Childhood Cancer Survivor Study. J Natl Cancer Inst 101(13):946–958. https://doi.org/10.1093/jnci/djp148

Avinash KS, Thakar S, Aryan S, Ghosal N, Hegde AS (2019) Malignant transformation of pediatric low-grade gliomas: report of two cases and review of a rare pathological phenomenon. Neurol India 67(4):1100–1106. https://doi.org/10.4103/0028-3886.266259

Bandopadhayay P, Bergthold G, London WB, Goumnerova LC, Morales La Madrid A, Marcus KJ, Guo D, Ullrich NJ, Robison NJ, Chi SN, Beroukhim R, Kieran MW, Manley PE (2014) Long-term outcome of 4040 children diagnosed with pediatric low-grade gliomas: an analysis of the Surveillance Epidemiology and End Results (SEER) database. Pediatr Blood Cancer 61(7):1173–1179. https://doi.org/10.1002/pbc.24958

Benesch M, Eder HG, Sovinz P, Raith J, Lackner H, Moser A, Urban C (2006) Residual or recurrent cerebellar low-grade glioma in children after tumor resection: is re-treatment needed? A single center experience from 1983 to 2003. Pediatr Neurosurg 42(3):159–164. https://doi.org/10.1159/000091859

Bowers DC, Mulne AF, Weprin B, Bruce DA, Shapiro K, Margraf LR (2002) Prognostic factors in children and adolescents with low-grade oligodendrogliomas. Pediatr Neurosurg 37(2):57–63. https://doi.org/10.1159/000065106

Bowers DC, Gargan L, Kapur P, Reisch JS, Mulne AF, Shapiro KN, Elterman RD, Winick NJ, Margraf LR (2003) Study of the MIB-1 labeling index as a predictor of tumor progression in pilocytic astrocytomas in children and adolescents. J Clin Oncol 21(15):2968–2973. https://doi.org/10.1200/JCO.2003.01.017

Chamdine O, Broniscer A, Wu S, Gajjar A, Qaddoumi I (2016) Metastatic low-grade gliomas in children: 20 years’ experience at St. Jude children’s research hospital. Pediatr Blood Cancer 63(1):62–70. https://doi.org/10.1002/pbc.25731

Cherlow JM, Shaw DWW, Margraf LR, Bowers DC, Huang J, Fouladi M, Onar-Thomas A, Zhou T, Pollack IF, Gajjar A, Kessel SK, Cullen PL, McMullen K, Wellons JC, Merchant TE (2019) Conformal radiation therapy for pediatric patients with low-grade glioma: results from the children’s oncology group phase 2 study ACNS0221. Int J Radiat Oncol Biol Phys 103(4):861–868. https://doi.org/10.1016/j.ijrobp.2018.11.004

Cler SJ, Skidmore A, Yahanda AT, Mackey K, Rubin JB, Cluster A, Perkins S, Gauvain K, King AA, Limbrick DD, McEvoy S, Park TS, Smyth MD, Mian AY, Chicoine MR, Dahiya S, Strahle JM (2022) Genetic and histopathological associations with outcome in pediatric pilocytic astrocytoma. J Neurosurg Pediatr 29(5):504–512. https://doi.org/10.3171/2021.9.PEDS21405

Dorward IG, Luo J, Perry A, Gutmann DH, Mansur DB, Rubin JB, Leonard JR (2010) Postoperative imaging surveillance in pediatric pilocytic astrocytomas. J Neurosurg Pediatr 6(4):346–352. https://doi.org/10.3171/2010.7.PEDS10129

Fisher BJ, Leighton CC, Vujovic O, Macdonald DR, Stitt L (2001) Results of a policy of surveillance alone after surgical management of pediatric low grade gliomas. Int J Radiat Oncol Biol Phys 51(3):704–710. https://doi.org/10.1016/s0360-3016(01)01705-9

Fisher BJ, Naumova E, Leighton CC, Naumov GN, Kerklviet N, Fortin D, Macdonald DR, Cairncross JG, Bauman GS, Stitt L (2002) Ki-67: a prognostic factor for low-grade glioma? Int J Radiat Oncol Biol Phys 52(4):996–1001. https://doi.org/10.1016/s0360-3016(01)02720-1

Gerdes J, Schwab U, Lemke H, Stein H (1983) Production of a mouse monoclonal antibody reactive with a human nuclear antigen associated with cell proliferation. Int J Cancer 31(1):13–20. https://doi.org/10.1002/ijc.2910310104

Gnekow AK, Falkenstein F, von Hornstein S, Zwiener I, Berkefeld S, Bison B, Warmuth-Metz M, Driever PH, Soerensen N, Kortmann RD, Pietsch T, Faldum A (2012) Long-term follow-up of the multicenter, multidisciplinary treatment study HIT-LGG-1996 for low-grade glioma in children and adolescents of the German speaking society of pediatric oncology and hematology. Neuro Oncol 14(10):1265–1284. https://doi.org/10.1093/neuonc/nos202

Gnekow AK, Kandels D, Tilburg CV, Azizi AA, Opocher E, Stokland T, Driever PH, Schouten-van Meeteren AYN, Thomale UW, Schuhmann MU, Czech T, Goodden JR, Warmuth-Metz M, Bison B, Avula S, Kortmann RD, Timmermann B, Pietsch T, Witt O (2019) SIOP-E-BTG and GPOH guidelines for diagnosis and treatment of children and adolescents with low grade glioma. Klin Padiatr. 231(3):107–135. https://doi.org/10.1055/a-0889-8256

Gorodezki D, Zipfel J, Queudeville M, Sosa J, Holzer U, Kern J, Bevot A, Schittenhelm J, Nägele T, Ebinger M, Schuhmann MU (2022) Resection extent and BRAF V600E mutation status determine postoperative tumor growth velocity in pediatric low-grade glioma: results from a single-center cohort analysis. J Neurooncol 160(3):567–576. https://doi.org/10.1007/s11060-022-04176-4

Greuter L, Guzman R, Soleman J (2021) Pediatric and adult low-grade gliomas: where do the differences lie? Children (Basel) 8(11):1075. https://doi.org/10.3390/children8111075

Grzybicki DM, Liu Y, Moore SA, Brown HG, Silverman JF, D’Amico F, Raab SS (2001) Interobserver variability associated with the MIB-1 labeling index: high levels suggest limited prognostic usefulness for patients with primary brain tumors. Cancer 92(10):2720–2726. https://doi.org/10.1002/1097-0142(20011115)92:10%3c2720::aid-cncr1626%3e3.0.co;2-z

Harris GJ, Plotkin SR, Maccollin M, Bhat S, Urban T, Lev MH, Slattery WH (2008) Three-dimensional volumetrics for tracking vestibular schwannoma growth in neurofibromatosis type II. Neurosurgery 62(6):1314–1320. https://doi.org/10.1227/01.neu.0000333303.79931.83

Ho DM, Wong TT, Hsu CY, Ting LT, Chiang H (1998) MIB-1 labeling index in nonpilocytic astrocytoma of childhood: a study of 101 cases. Cancer 82(12):2459–2466. https://doi.org/10.1002/(sici)1097-0142(19980615)82:12<2459::aid-cncr21>3.0.co;2-n

Horbinski C, Hamilton RL, Lovell C, Burnham J, Pollack IF (2010) Impact of morphology, MIB-1, p53 and MGMT on outcome in pilocytic astrocytomas. Brain Pathol 20(3):581–588. https://doi.org/10.1111/j.1750-3639.2009.00336.x

Hsu DW, Louis DN, Efird JT, Hedley-Whyte ET (1997) Use of MIB-1 (Ki-67) immunoreactivity in differentiating grade II and grade III gliomas. J Neuropathol Exp Neurol 56(8):857–865. https://doi.org/10.1097/00005072-199708000-00003

Hsu CY, Ho DM, Yang CF, Chiang H (2003) Interobserver reproducibility of MIB-1 labeling index in astrocytic tumors using different counting methods. Mod Pathol 16(9):951–957. https://doi.org/10.1097/01.MP.0000084631.64279.BC

Jones DTW, Kieran MW, Bouffet E, Alexandrescu S, Bandopadhayay P, Bornhorst M, Ellison D, Fangusaro J, Fisher MJ, Foreman N, Fouladi M, Hargrave D, Hawkins C, Jabado N, Massimino M, Mueller S, Perilongo G, Schouten van Meeteren AYN, Tabori U, Warren K, Waanders AJ, Walker D, Weiss W, Witt O, Wright K, Zhu Y, Bowers DC, Pfister SM, Packer RJ (2018) Pediatric low-grade gliomas: next biologically driven steps. Neuro Oncol 20(2):160–173. https://doi.org/10.1093/neuonc/nox141

Kałuza J, Adamek D, Pyrich M (1997) Ki-67 as a marker of proliferation activity in tumor progression of recurrent gliomas of supratentorial localization. Immunocytochem Quantit Stud Pol J Pathol 48(1):31–36

Krishnan SS, Muthiah S, Rao S, Salem SS, Madabhushi VC, Mahadevan A (2019) Mindbomb homolog-1 index in the prognosis of high-grade glioma and its clinicopathological correlation. J Neurosci Rural Pract 10(2):185–193. https://doi.org/10.4103/jnrp.jnrp_374_18

Krishnatry R, Zhukova N, Guerreiro Stucklin AS, Pole JD, Mistry M, Fried I, Ramaswamy V, Bartels U, Huang A, Laperriere N, Dirks P, Nathan PC, Greenberg M, Malkin D, Hawkins C, Bandopadhayay P, Kieran MW, Manley PE, Bouffet E, Tabori U (2016) Clinical and treatment factors determining long-term outcomes for adult survivors of childhood low-grade glioma: a population-based study. Cancer 122(8):1261–1269. https://doi.org/10.1002/cncr.29907

Louis DN, Perry A, Wesseling P, Brat DJ, Cree IA, Figarella-Branger D, Hawkins C, Ng HK, Pfister SM, Reifenberger G, Soffietti R, von Deimling A, Ellison DW (2021) The 2021 WHO classification of tumors of the central nervous system: a summary. Neuro Oncol 23(8):1231–1251. https://doi.org/10.1093/neuonc/noab106

Margraf LR, Gargan L, Butt Y, Raghunathan N, Bowers DC (2011) Proliferative and metabolic markers in incompletely excised pediatric pilocytic astrocytomas–an assessment of 3 new variables in predicting clinical outcome. Neuro Oncol 13(7):767–774. https://doi.org/10.1093/neuonc/nor041

Matsumoto T, Fujii T, Yabe M, Oka K, Hoshi T, Sato K (1998) MIB-1 and p53 immunocytochemistry for differentiating pilocytic astrocytomas and astrocytomas from anaplastic astrocytomas and glioblastomas in children and young adults. Histopathology 33(5):446–452. https://doi.org/10.1046/j.1365-2559.1998.00503.x

McCormick D, Chong H, Hobbs C, Datta C, Hall PA (1993) Detection of the Ki-67 antigen in fixed and wax-embedded sections with the monoclonal antibody MIB1. Histopathology 22(4):355–360. https://doi.org/10.1111/j.1365-2559.1993.tb00135.x

Pollack IF, Campbell JW, Hamilton RL, Martinez AJ, Bozik ME (1997) Proliferation index as a predictor of prognosis in malignant gliomas of childhood. Cancer 79(4):849–856

Pollack IF, Hamilton RL, Burnham J, Holmes EJ, Finkelstein SD, Sposto R, Yates AJ, Boyett JM, Finlay JL (2002) Impact of proliferation index on outcome in childhood malignant gliomas: results in a multi-institutional cohort. Neurosurgery 50(6):1238–1245. https://doi.org/10.1097/00006123-200206000-00011

Rickert CH, Paulus W (2001) Epidemiology of central nervous system tumors in childhood and adolescence based on the new WHO classification. Childs Nerv Syst 17(9):503–511. https://doi.org/10.1007/s003810100496

Ryall S, Tabori U, Hawkins C (2020a) Pediatric low-grade glioma in the era of molecular diagnostics. Acta Neuropathol Commun 8(1):30. https://doi.org/10.1186/s40478-020-00902-z

Ryall S, Zapotocky M, Fukuoka K, Nobre L, Guerreiro Stucklin A, Bennett J, Siddaway R, Li C, Pajovic S, Arnoldo A, Kowalski PE, Johnson M, Sheth J, Lassaletta A, Tatevossian RG, Orisme W, Qaddoumi I, Surrey LF, Li MM, Waanders AJ, Gilheeney S, Rosenblum M, Bale T, Tsang DS, Laperriere N, Kulkarni A, Ibrahim GM, Drake J, Dirks P, Taylor MD, Rutka JT, Laughlin S, Shroff M, Shago M, Hazrati LN, D’Arcy C, Ramaswamy V, Bartels U, Huang A, Bouffet E, Karajannis MA, Santi M, Ellison DW, Tabori U, Hawkins C (2020b) Integrated molecular and clinical analysis of 1000 pediatric low-grade gliomas. Cancer Cell 37(4):569-583.e5. https://doi.org/10.1016/j.ccell.2020.03.011

Sadighi ZS, Curtis E, Zabrowksi J, Billups C, Gajjar A, Khan R, Qaddoumi I (2018) Neurologic impairments from pediatric low-grade glioma by tumor location and timing of diagnosis. Pediatr Blood Cancer 65(8):e27063. https://doi.org/10.1002/pbc.27063

Schröder R, Bien K, Kott R, Meyers I, Vössing R (1991) The relationship between Ki-67 labeling and mitotic index in gliomas and meningiomas: demonstration of the variability of the intermitotic cycle time. Acta Neuropathol 82(5):389–394. https://doi.org/10.1007/BF00296550

Shaw EG, Wisoff JH (2003) Prospective clinical trials of intracranial low-grade glioma in adults and children. Neuro Oncol 5(3):153–160. https://doi.org/10.1215/S1152851702000601

Sievert AJ, Fisher MJ (2009) Pediatric low-grade gliomas. J Child Neurol 24(11):1397–1408. https://doi.org/10.1177/0883073809342005

Skjulsvik AJ, Mørk JN, Torp MO, Torp SH (2014) Ki-67/MIB-1 immunostaining in a cohort of human gliomas. Int J Clin Exp Pathol 7(12):8905–8910

Stokland T, Liu JF, Ironside JW, Ellison DW, Taylor R, Robinson KJ, Picton SV, Walker DA (2010) A multivariate analysis of factors determining tumor progression in childhood low-grade glioma: a population-based cohort study (CCLG CNS9702). Neuro Oncol 12(12):1257–1268. https://doi.org/10.1093/neuonc/noq092

Thotakura M, Tirumalasetti N, Krishna R (2014) Role of Ki-67 labeling index as an adjunct to the histopathological diagnosis and grading of astrocytomas. J Cancer Res Ther 10(3):641–645. https://doi.org/10.4103/0973-1482.139154

Tu A, Robison A, Melamed E, Buchanan I, Hariri O, Babu H, Szymanski L, Krieger M (2018) Proliferative index in pediatric pilocytic astrocytoma by region of origin and prediction of clinical behavior. Pediatr Neurosurg 53(6):395–400. https://doi.org/10.1159/000490466

van Iersel L, van Santen HM, Potter B, Li Z, Conklin HM, Zhang H, Chemaitilly W, Merchant TE (2020) Clinical impact of hypothalamic-pituitary disorders after conformal radiation therapy for pediatric low-grade glioma or ependymoma. Pediatr Blood Cancer 67(12):e28723. https://doi.org/10.1002/pbc.28723

Wisoff JH, Sanford RA, Heier LA, Sposto R, Burger PC, Yates AJ, Holmes EJ, Kun LE (2011) Primary neurosurgery for pediatric low-grade gliomas: a prospective multi-institutional study from the Children’s Oncology Group. Neurosurgery 68(6):1548–1555. https://doi.org/10.1227/NEU.0b013e318214a66e

Yao R, Cheng A, Zhang Z, Jin B, Yu H (2023) Correlation between apparent diffusion coefficient and the Ki-67 proliferation index in grading pediatric glioma. J Comput Assist Tomogr 47(2):322–328. https://doi.org/10.1097/RCT.0000000000001400

Open Access funding enabled and organized by Projekt DEAL. The authors declare that no funds, grants, or other support were received during the preparation of this manuscript.

Author information

Authors and affiliations.

Department of Hematology and Oncology, University Children’s Hospital Tübingen, Tübingen, Germany

David Gorodezki & Martin Ebinger

Department of Neurosurgery, Section of Pediatric Neurosurgery, University Hospital Tübingen, Tübingen, Germany

Julian Zipfel & Martin U. Schuhmann

Department of Neuropediatrics and Developmental Neurology, University Hospital Tübingen, Tübingen, Germany

Andrea Bevot

Department of Neuroradiology, University Hospital Tübingen, Tübingen, Germany

Thomas Nägele

Institute of Pathology, Department of Neuropathology, University Hospital Tübingen, Tübingen, Germany

Jens Schittenhelm

Contributions

All authors contributed to the study’s conception and design. Material preparation and data collection were performed by DG, MS, ME, JS, AB and JZ; data analysis was performed by DG. The first draft of the manuscript was written by DG. JS supervised the study and edited the manuscript. All authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to David Gorodezki .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Conflicts of interest

No potential financial or nonfinancial conflict of interest was reported by the authors.

Ethical approval

This study was performed in line with the principles of the Declaration of Helsinki. Approval was granted by the Ethics Committee of the Medical Faculty and University Hospital of Tübingen (NO 762/2021B02). Individual consent was waived.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Gorodezki, D., Zipfel, J., Bevot, A. et al. Prognostic utility and characteristics of MIB-1 labeling index as a proliferative activity marker in childhood low-grade glioma: a retrospective observational study. J Cancer Res Clin Oncol 150 , 178 (2024). https://doi.org/10.1007/s00432-024-05701-w

Received : 26 July 2023

Accepted : 13 March 2024

Published : 05 April 2024

DOI : https://doi.org/10.1007/s00432-024-05701-w

  • Low-grade glioma
  • Proliferation index

Homework 1. Observational Research Design. Pia Vegas Smith.

  • Health Science

Social, Behavioral, and Metabolic Risk Factors and Racial Disparities in Cardiovascular Disease Mortality in U.S. Adults: An Observational Study

Affiliations.

  • 1 Department of Epidemiology, Tulane University School of Public Health and Tropical Medicine; Tulane University Translational Science Institute; and Department of Medicine, Tulane University School of Medicine, New Orleans, Louisiana (J.H.).
  • 2 Department of Epidemiology, Tulane University School of Public Health and Tropical Medicine, and Tulane University Translational Science Institute, New Orleans, Louisiana (J.D.B., S.G., L.T., H.H., A.H.A., K.S.D., K.T.M.).
  • 3 Department of Biostatistics and Data Science, School of Public Health, The University of Texas Health Science Center at Houston, Houston, Texas (X.L.).
  • 4 Department of Medicine, Tulane University School of Medicine, New Orleans, Louisiana (K.C.F.).
  • 5 University of Texas School of Public Health San Antonio, San Antonio, Texas (R.S.V.).
  • 6 Department of Medicine, Tulane University School of Medicine; Department of Epidemiology, Tulane University School of Public Health and Tropical Medicine; and Tulane University Translational Science Institute, New Orleans, Louisiana (J.C.).
  • PMID: 37579311
  • DOI: 10.7326/M23-0507

Background: Cardiovascular disease (CVD) mortality is persistently higher in the Black population than in other racial and ethnic groups in the United States.

Objective: To examine the degree to which social, behavioral, and metabolic risk factors are associated with CVD mortality and the extent to which racial differences in CVD mortality persist after these factors are accounted for.

Design: Prospective cohort study.

Setting: NHANES (National Health and Nutrition Examination Survey) 1999 to 2018.

Participants: A nationally representative sample of 50 808 persons aged 20 years or older.

Measurements: Data on social, behavioral, and metabolic factors were collected in each NHANES survey using standard methods. Deaths from CVD were ascertained from linkage to the National Death Index with follow-up through 2019.

Results: Over an average of 9.4 years of follow-up, 2589 CVD deaths were confirmed. The age- and sex-standardized rates of CVD mortality were 484.7 deaths per 100 000 person-years in Black participants, 384.5 deaths per 100 000 person-years in White participants, 292.4 deaths per 100 000 person-years in Hispanic participants, and 255.1 deaths per 100 000 person-years in other race groups. In a multiple Cox regression analysis adjusted for all measured risk factors simultaneously, several social (unemployment, low family income, food insecurity, lack of home ownership, and unpartnered status), behavioral (current smoking, lack of leisure-time physical activity, and sleep <6 or >8 h/d), and metabolic (obesity, hypertension, and diabetes) risk factors were associated with a significantly higher risk for CVD death. After adjustment for these metabolic, behavioral, and social risk factors separately, hazard ratios of CVD mortality for Black compared with White participants were attenuated from 1.54 (95% CI, 1.34 to 1.77) to 1.34 (CI, 1.16 to 1.55), 1.31 (CI, 1.15 to 1.50), and 1.04 (CI, 0.90 to 1.21), respectively.

Limitation: Causal contributions of social, behavioral, and metabolic risk factors to racial and ethnic disparities in CVD mortality could not be established.

Conclusion: The Black-White difference in CVD mortality diminished after adjustment for behavioral and metabolic risk factors and completely dissipated with adjustment for social determinants of health in the U.S. population.

Primary funding source: National Institutes of Health.
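
As a rough illustration of how the attenuation reported in the Results can be quantified, the sketch below computes the share of the excess hazard that disappears after each adjustment, using (HR_baseline − HR_adjusted) / (HR_baseline − 1). This formula and the names `percent_excess_explained`, `HR_BASELINE`, and `ADJUSTED` are assumptions made for illustration; the abstract does not state that the authors used this measure. Only the hazard ratios themselves are taken from the abstract.

```python
def percent_excess_explained(hr_baseline: float, hr_adjusted: float) -> float:
    """Percentage of the excess hazard (HR - 1) removed by further adjustment.

    Illustrative measure only; not stated to be the authors' method.
    """
    if hr_baseline <= 1.0:
        raise ValueError("defined only when the baseline HR exceeds 1")
    return 100.0 * (hr_baseline - hr_adjusted) / (hr_baseline - 1.0)


# Hazard ratios for Black vs. White participants, as reported in the abstract.
HR_BASELINE = 1.54
ADJUSTED = {"metabolic": 1.34, "behavioral": 1.31, "social": 1.04}

# Attenuation per adjustment set, rounded to one decimal place.
attenuation = {
    name: round(percent_excess_explained(HR_BASELINE, hr), 1)
    for name, hr in ADJUSTED.items()
}

for name, pct in attenuation.items():
    print(f"{name}: {pct}% of the excess hazard attenuated")
```

Because the three adjustments were made separately, the percentages are not additive and should not be read as causal contributions, in line with the stated Limitation.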

Publication types

  • Observational Study
  • Research Support, N.I.H., Extramural
  • Cardiovascular Diseases*
  • Nutrition Surveys
  • Prospective Studies
  • Racial Groups
  • Risk Factors
  • United States / epidemiology

Grants and funding

  • R01 HL133790/HL/NHLBI NIH HHS/United States
  • UG3 HL151309/HL/NHLBI NIH HHS/United States
  • P20 GM109036/GM/NIGMS NIH HHS/United States
  • R01 MD018193/MD/NIMHD NIH HHS/United States

Observational Research Opportunities and Limitations

Edward J. Boyko

Epidemiologic Research and Information Center, VA Puget Sound Health Care System, Seattle, WA USA. University of Washington School of Medicine, Seattle, WA

Medical research continues to progress in its ability to identify treatments and characteristics associated with benefits and adverse outcomes. The principal engine for the evaluation of treatment efficacy is the randomized controlled trial (RCT). Due to cost and other considerations, RCTs cannot address all clinically important decisions. Observational research often is used to address issues not addressed or not addressable by RCTs. This article provides an overview of the benefits and limitations of observational research to serve as a guide to the interpretation of this category of research designs in diabetes investigations. The potential for bias is higher in observational research, but there are design and analysis features that can address these concerns, although not completely eliminate them. Pharmacoepidemiologic research may provide important information regarding relative safety and effectiveness of diabetes pharmaceuticals. Such research must effectively address the important issue of confounding by indication in order to produce clinically meaningful results. Other methods such as instrumental variable analysis are being employed to enable stronger causal inference, but these methods also require fulfillment of several key assumptions that may or may not be realistic. Nearly all clinical decisions involve probabilistic reasoning and confronting uncertainty, so a realistic goal for observational research may not be the high standard set by RCTs but instead the level of certainty needed to influence a diagnostic or treatment decision.

A major focus of medical research is the identification of causes of health outcomes, good and bad. The current gold standard method to accomplish this aim is the randomized controlled trial (RCT) (Meldrum, 2000). The performance of an RCT requires strict specification of study conditions related to all aspects of its conduct, such as participant selection, treatment and control assignment arms, inclusion/exclusion criteria, randomization method, outcome measurement, and many other considerations. Such trials are difficult to mount due to the expense in terms of both time and money, and often lead to results that may be difficult to apply to a real-world setting due to either the rigor or complexity of the intervention or the selection process for participants that yields a population dissimilar from that seen in general clinical practice. A randomized controlled trial focuses on an assessment of the validity of its results at the expense of generalizability. For example, the Diabetes Prevention Program screened 158,177 subjects to yield 3,819 subjects who were eventually randomized to one of the four original arms (Rubin et al., 2002). Other limitations of RCTs include a focus on treatment effects rather than on the detection of rarer adverse reactions; restrictions on diabetes duration at the time of trial entry, thereby yielding results that may not apply to persons with a different diabetes duration at the initiation of treatment; and high costs that limit the number of therapeutic comparisons. Regarding this last point, assessment of a new treatment for hyperglycemia requires comparison to existing accepted treatments, but the control population usually is restricted to fewer treatments than in current use, thereby limiting the ability to compare the new treatment to all existing treatments.

Given these considerations, observational research is often used to address important clinical questions in the absence of randomized clinical trial data, but may also make important contributions even when RCTs have been conducted. Examples include monitoring for long-term adverse events that did not appear during the time interval over which the RCT was conducted, or assessing whether the trial findings apply to a different population excluded from the trial due to younger or older age, gender, presence of comorbid conditions, or other factors. Observational research often also addresses other questions not suitable for randomized clinical trials, such as an exposure known to be harmful or in other ways unacceptable to participants or whose administration is inconsistent with ethical principles. Also, observational research can address other exposures that are not under the control of the investigator, such as eye color, blood type, presence of a specific genetic marker, or elevations of blood pressure or plasma glucose concentration. Observational research may also provide preliminary data to justify the performance of a clinical trial, which might not have received sufficient funding support without the existence of such results.

This paper will review observational research methods applied to addressing questions of causation in diabetes research, with a particular focus on pharmacoepidemiology as an area of research where many important questions may be addressed regarding the relative merits of multiple pharmaceuticals for a given condition. There have been an increasing number of observational studies of the association between diabetes treatments and hard outcomes, such as death or CVD events. The increase in such studies likely has been facilitated by the availability of big data in general and specifically large pharmaceutical databases created by national health plans, large health care systems, or mail-order pharmacy providers ( Sobek et al., 2011 ). In addition, the ongoing development of diabetes pharmacotherapies approved based on ability to achieve an improvement in glycemic control but without data on hard outcomes may also provide the impetus to use such large databases for research on comparative safety and efficacy.

Observational Research Study Designs

Cohort and case-control studies.

The two most popular designs for investigating causal hypotheses are the cohort and case-control studies. Features are shown in Table 1 . The major difference between the two is that the cohort study begins with identification of the exposure status, whereas the case-control study begins with the identification of the outcome. A cohort study can be prospective, where exposed and non-exposed subjects are followed for the development of the outcome, or retrospective, where collected data can be used to identify both the exposure status at some past time point and the subsequent development of the outcome. A case-control study, on the other hand, can only look back in time for occurrence of the exposure. There are of course exceptions to these general statements. It is possible in some case-control studies to measure the exposure after the outcome in time if the exposure is invariant and if it is not related to a greater loss to follow-up among persons with the outcome due to mortality or other reasons. Examples of such exposures include genetic markers or an unchanging characteristic of adults such as femur length, eye color, or red blood cell type. Variations in these study designs include the case-cohort and case-only studies, which are described in detail elsewhere and will not be covered here ( DiPietro, 2010 ). The relative merits of these study designs likewise will not be discussed here but are covered in standard epidemiology texts.
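The practical consequence of starting from exposure (cohort) versus starting from outcome (case-control) can be illustrated with a small calculation. The sketch below uses hypothetical counts that do not come from any study cited here, and the class and variable names are illustrative assumptions. It shows that a cohort supports a direct risk ratio, whereas sampling non-cases in a case-control design leaves the odds ratio unchanged but makes a naively computed risk ratio meaningless.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts from a 2x2 exposure-by-outcome table (hypothetical data)."""
    exposed_cases: int
    exposed_noncases: int
    unexposed_cases: int
    unexposed_noncases: int

    def risk_ratio(self) -> float:
        """Risk in the exposed divided by risk in the unexposed."""
        risk_exposed = self.exposed_cases / (self.exposed_cases + self.exposed_noncases)
        risk_unexposed = self.unexposed_cases / (self.unexposed_cases + self.unexposed_noncases)
        return risk_exposed / risk_unexposed

    def odds_ratio(self) -> float:
        """Cross-product ratio of the table."""
        return (self.exposed_cases * self.unexposed_noncases) / (
            self.exposed_noncases * self.unexposed_cases
        )


# Hypothetical full cohort: 1,000 exposed and 1,000 unexposed persons.
cohort = TwoByTwo(exposed_cases=30, exposed_noncases=970,
                  unexposed_cases=10, unexposed_noncases=990)

# Hypothetical case-control sample from the same population:
# all 40 cases, but only a 10% sample of the non-cases as controls.
case_control = TwoByTwo(exposed_cases=30, exposed_noncases=97,
                        unexposed_cases=10, unexposed_noncases=99)

print(f"cohort risk ratio:        {cohort.risk_ratio():.2f}")
print(f"cohort odds ratio:        {cohort.odds_ratio():.2f}")
print(f"case-control odds ratio:  {case_control.odds_ratio():.2f}")
print(f"naive case-control 'risk ratio' (not interpretable): {case_control.risk_ratio():.2f}")
```

In this sketch the odds ratio is the same in both designs because the controls were sampled without regard to exposure, which is the assumption a real case-control study must try to satisfy.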

Table 1. Observational Study Designs for Assessment of Causal Relationships

Weaker Observational Research Designs

Other research designs are often used in studies reported in the medical literature. These include cross-sectional studies, case series, and case reports. The cross-sectional study has limited value in assessing a potential causal relationship since it may not be possible to determine whether the potential exposure preceded the outcome, except when the exposure does not vary over one’s life history, such as in the case of a genotype, ABO blood group, or eye color. Case series and case reports are even more limited since it is not possible to assess whether the outcome occurred more frequently among the persons included than in a control population. Case reports do, however, have potential value in pharmaceutical safety research by generating signals of unexpected adverse events. Such monitoring is employed in the Food and Drug Administration’s Adverse Event Reporting System, and has led to changes in product labeling as well as restriction or outright removal of pharmaceuticals from the market due to safety concerns ( Wysowski & Swartz, 2005 ). Over 2 million case reports of adverse reactions were submitted between 1969 and 2002, resulting in only about 1% of marketed drugs being withdrawn or restricted. The noise-to-signal ratio for this method of surveillance is therefore exceedingly high, which presents an opportunity for other observational methods to better address this issue.

Observational Research for Causal Inference

Causal associations will always involve correlation, but the presence of a correlation does not imply causation. The challenge of observational research is to assess whether a correlation is present and then determine whether it may be due to a causal association. A list of criteria was developed by Dr. Austin Bradford Hill decades ago that is still referred to frequently today ( Hill, 1965 ), although reexamination of these criteria more recently has led to the conclusion that only one of the nine original features is really necessary for a causal relationship in an observational study ( Phillips & Goodman, 2004 ; Rothman & Greenland, 2005 ). The magnitude of the observed association, another Hill criterion, often figures into determinations about the presence of bias, with associations of greater magnitude considered less likely to be due to bias and more likely due to a causal process ( Grimes & Schulz, 2002 ).

Examination of the features of an RCT provides some insight into the limitations of observational research in assessing causal associations. The randomization process provides the opportunity for equal distribution of risk factors for the outcome among persons assigned to the treatment and control groups. Thus any difference in the outcome between these two groups is unlikely to be due to unequal distribution of risk factors by treatment assignment. The use of randomization provides a way to approach the problem of not having complete knowledge about predictors of all clinically important outcomes. If we did have such knowledge, then groups with exactly equal risks of the outcome could be assembled by the investigator. As we do not have such knowledge, the process of randomization utilizes chance to distribute both known and, more importantly, unknown risk factors for the outcome, and is most likely to achieve this aim with larger sample size ( Efird, 2011 ). Randomization, though, does not guarantee that the treatment and control groups will have the same risk of the outcome. Accidents of randomization have occurred for known risk factors for outcomes, as in the University Group Diabetes Program (UGDP), where older subjects with a higher prevalence of cardiovascular disease risk factors were disproportionately assigned to the tolbutamide treatment arm ( Leibel, 1971 ). Such accidents also must occur for the unknown risk factors, although these would not be apparent to the investigator.
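The dependence of chance imbalance on sample size can be made concrete with a short calculation. The following is a minimal sketch, assuming simple 1:1 randomization from a large source population and a single binary risk factor; the 30% prevalence and the arm sizes are hypothetical values chosen only for illustration, and the function name is an assumption.

```python
import math


def imbalance_standard_error(prevalence: float, n_per_arm: int) -> float:
    """Standard error of the between-arm difference in the proportion of
    participants carrying a binary risk factor, under 1:1 randomization.

    Each arm's proportion has variance p(1-p)/n, so the difference of two
    independent arms has variance 2p(1-p)/n.
    """
    return math.sqrt(2 * prevalence * (1 - prevalence) / n_per_arm)


# Hypothetical risk factor present in 30% of the source population.
prevalence = 0.30
standard_errors = {n: imbalance_standard_error(prevalence, n) for n in (50, 500, 5000)}

for n, se in standard_errors.items():
    # Under a normal approximation, about 95% of randomizations produce an
    # imbalance smaller in absolute value than 1.96 standard errors.
    print(f"n per arm = {n:>4}: 95% of chance imbalances within +/- {1.96 * se:.3f}")
```

The expected imbalance shrinks with the square root of the arm size but never reaches zero, which is consistent with the observation that accidents of randomization remain possible in any finite trial.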

Bias in Observational Research

Confounding bias.

Observational research does not have the benefit of randomization to allocate by chance risk factors for an outcome of interest. Exposures to risk factors occur through self-selection, medical provider prescription, occupation, and for other reasons. When an exposure of interest is strongly associated with another exposure that is also related to the outcome, confounding bias is present, but methods exist to obtain an unbiased estimate of the exposure-disease association as long as the confounding factor is identified and measured accurately.

A cross-sectional study of a genetic marker (Gm haplotype Gm 3;5,13,14 ) and diabetes prevalence provides an example of confounding bias. Subjects included members of the Pima and Papago tribes of the Gila River Indian Community in Southern Arizona who underwent a medical history and examination every two years including assessment of diabetes status through oral glucose tolerance testing ( Knowler, Williams, Pettitt & Steinberg, 1988 ). Subjects were further characterized by degree of Indian heritage measured in eighths and referred to as “quantum.” A total of 4,640 subjects of 0/8, 4/8, or 8/8 quantum were included in this analysis. There were 1,336 persons with and 3,304 persons without diabetes available for analysis, yielding a crude (unadjusted) overall odds ratio of 0.24 for the association between haplotype Gm 3;5,13,14 and diabetes prevalence ( Figure 1 , Panel A). This result supports a lower prevalence of diabetes in association with haplotype Gm 3;5,13,14 , but the unadjusted result substantially overestimates the strength of the association due to confounding by Quantum. In Figure 1 panel B, subjects were divided by the three Quantum categories found in the sample, and within each of these the odds ratio is closer to 1.0 and therefore of smaller magnitude than the crude result. Note that collapsing the three tables in Panel B by summing the cells yields the single overall table shown in Panel A. Adjustment for these Quantum categories yields an odds ratio of 0.59, which is of smaller magnitude than the result seen in the unadjusted analysis ( Figure 1 , Panel B). Although the odds ratios vary across Quantum categories, a test for heterogeneity across these strata was non-significant (p=0.295). Therefore the null hypothesis that the odds ratios were the same across Quantum strata could not be rejected.

Figure 1. Cross-sectional study of Native Americans of the Pima and Papago Indian tribes in Southern Arizona on the associations between the haplotype Gm 3;5,13,14 , native quantum, and diabetes mellitus prevalence. Panel A displays all participants combined with Native quantum of either 0/8, 4/8 or 8/8 by presence of diabetes mellitus in relation to Gm 3;5,13,14 presence or absence. The overall (crude) odds ratio for the association is shown. Panel B displays all participants from Panel A stratified by Native quantum, demonstrating confounding by Native quantum as judged by the discordance between the crude and stratified or Quantum-adjusted results. Panel C demonstrates that Quantum meets the criterion as a confounding variable due to its negative association with Gm 3;5,13,14 and positive association with diabetes prevalence.

Examination of the frequency of haplotype Gm 3;5,13,14 and diabetes prevalence across Indian heritage Quantum reveals the reason for the overestimation of the association in the unadjusted analysis. Diabetes occurred more frequently while the haplotype Gm 3;5,13,14 occurred less frequently among subjects with greater Indian heritage ( Figure 1 , Panel C). Adjustment for the imbalance in Quantum by haplotype Gm 3;5,13,14 in this example, and more generally for any accurately measured confounding factor, yields a less biased odds ratio that is closer to the true magnitude of the association between the exposure and the outcome.
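The mechanics of a crude versus a stratum-adjusted odds ratio can be sketched in a few lines. The example below uses the Mantel-Haenszel pooled odds ratio, which is one common stratified estimator and is an assumption here, since the text does not state which adjustment method produced the value of 0.59. The counts are hypothetical, are NOT the Figure 1 data, and were chosen only so that the confounder is negatively associated with the exposure and positively associated with the outcome.

```python
# Each stratum: (exposed cases, exposed non-cases, unexposed cases, unexposed non-cases).
# Hypothetical counts for illustration only; they are NOT the Figure 1 data.
strata = [
    (10, 190, 10, 90),    # confounder level 1: exposure common, outcome rare
    (20, 80, 60, 140),    # confounder level 2: intermediate
    (5, 15, 400, 600),    # confounder level 3: exposure rare, outcome common
]


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Cross-product odds ratio of a single 2x2 table."""
    return (a * d) / (b * c)


def crude_odds_ratio(tables) -> float:
    """Collapse all strata into one table by summing cells, then take the odds ratio."""
    a, b, c, d = (sum(cells) for cells in zip(*tables))
    return odds_ratio(a, b, c, d)


def mantel_haenszel_odds_ratio(tables) -> float:
    """Mantel-Haenszel pooled odds ratio: sum(a*d/n) / sum(b*c/n) over strata."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator


stratum_ors = [odds_ratio(*table) for table in strata]
crude = crude_odds_ratio(strata)
adjusted = mantel_haenszel_odds_ratio(strata)

print("stratum-specific odds ratios:", [round(x, 2) for x in stratum_ors])
print(f"crude odds ratio:    {crude:.2f}")
print(f"adjusted odds ratio: {adjusted:.2f}")
```

As in the example from the text, every stratum-specific odds ratio in this sketch lies closer to 1.0 than the crude value, and the pooled estimate falls among the stratum-specific values rather than near the crude one.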

Another more recent example of confounding can be seen in a European case-cohort study of the association between artificially sweetened soft drinks and the risk of developing type 2 diabetes ( EPIC-InterAct, 2013 ). The unadjusted hazard ratio for the daily consumption of ≥ 250 g of this beverage type was 1.84 (95% CI 1.52 to 2.23), representing a statistically significant elevation in risk. After adjustment for daily energy intake and BMI, the hazard ratio diminished to 1.13 (95% CI 0.85 to 1.52) and was no longer statistically significant (p=0.24). The investigators concluded that consumption of artificially sweetened soft drinks was not associated with type 2 diabetes risk in their population.

Multiple methods exist to remove the bias from recognized, accurately measured confounding factors, but unfortunately there is no widely accepted option for handling unmeasured confounding factors and adjusting for this bias. In this regard observational research is unable to match the ability of an RCT to account for this potential bias. Methods developed to better assess whether associations represent causal pathways are described later in this paper.

Information Bias

Observational research can be susceptible to other types of bias. Information bias refers to inaccurate assessment of the outcome, the exposure, or potential confounding variables. An example is the measurement of nutritional intake, which is often assessed by research subjects completing a food frequency survey or 24-hour dietary recall. Even if subjects report these intakes correctly, the likelihood is low that this will reflect long-term dietary intake exactly. Attempts have been made to reduce the error of these measurements through biomarker calibration, which in one study was based on a urinary nitrogen protocol to estimate daily protein consumption over a 24-hour period ( Tinker et al., 2011 ). The analysis based on self-report revealed a slight increase in risk of incident diabetes in association with a 20% higher protein intake in grams (Hazard Ratio 1.05, 95% CI 1.03–1.07). Recalibrated results based on the urinary nitrogen protocol yielded a substantially higher diabetes hazard ratio of 1.82 (95% CI 1.56–2.12) that after adjustment for BMI was reduced to 1.16 (95% CI 1.05–1.28). In this example, reduction of measurement error yielded a difference of greater magnitude than seen in the analysis based on dietary self-reports without objective validation, although in principle more accurate measurements may also yield smaller differences, depending on the type and magnitude of measurement error.

Selection Bias

Selection bias may produce factitious exposure-disease associations if the study population fails to mirror the target population of interest. For example, selection of control subjects from among hospitalized patients, as might be the case in a study based on administrative data, may not accurately depict smoking prevalence among controls, given that smoking is related to multiple diseases that would increase the risk for hospitalization. Effective observational research must recognize the potential for bias, attempt to minimize it in both the design and the analysis, and accurately describe the limitations of the data and their implications for study validity in reports of results.

Agreement and Discrepancies between Observational and Clinical Trial Research

One way to assess whether the potential biases of observational studies result in failure to detect true associations is by comparison of observational versus RCT results on the same questions. Since observational studies of treatments often precede definitive clinical trials, several authors have assessed agreement between similar hypotheses tested using the gold standard compared to observational designs, concluding that agreement between the two is high. A comparison of 136 reports published between 1985 and 1998 on 19 different treatments found excellent agreement, with the combined magnitude of the effect in observational studies lying within the 95% confidence intervals of the combined magnitude of the effect in RCTs for 17 of the 19 hypotheses tested ( Benson & Hartz, 2000 ). Another comparison focused on the results of meta-analyses of observational and clinical trial research on five clinical questions that were identified through a search of five major medical journals from 1991 to 1995 ( Concato, Shah & Horwitz, 2000 ). These investigators concluded that average results of these studies were “remarkably similar.”

In contrast, other research has demonstrated discrepancies between RCT and observational designs. The Women’s Health Initiative (WHI) was an RCT of dietary and menopausal hormone interventions to assess their effects on mortality, cardiovascular disease, and cancer risk ( Prentice et al., 2005 ). Perhaps unique to this study was the establishment of a concurrent observational study accompanying the randomized clinical trial, thereby permitting direct comparison of reported associations by type of research design within the same study framework. In the trial/observational study of estrogen plus progestin for menopausal hormone replacement, marked differences were seen between the treatment and control groups by participation in the RCT or observational study ( Table 2 ). In the RCT, no important differences were seen by treatment assignment for race, educational level, BMI, or current smoking status. This was not true by estrogen-progestin exposure in the observational study, where exposed women were more likely to be White and to have completed a college degree or higher, and less likely to be current smokers or obese. Outcomes occurred more frequently in the estrogen-progestin arm of the RCT, but less frequently in the corresponding arm of the observational study, except for venous thromboembolism ( Table 2 ). Hazard ratios for these comparisons, adjusted for imbalances in baseline potential confounding factors, show a harmful effect of estrogen-progestin use in the RCT that is statistically significantly elevated for 2 of 3 outcomes. The observational results are discordant, showing null or somewhat protective hazard ratios or, in the case of venous thromboembolism, an elevated hazard ratio of considerably smaller magnitude than in the clinical trial. Although good agreement between clinical trials and observational research occurs often, the example of the WHI prevents having complete confidence in the results of observational studies.

Table 2. Comparison of baseline characteristics and outcomes in the randomized controlled trial and observational study of estrogen-progestin treatment in the Women’s Health Initiative (1994–2002).

Achievements of Observational Research

Despite the limitations of observational research design, many well-accepted causal associations in medicine are supported entirely or in part by this type of investigation. Examples include the association between hyperglycemia and diabetes complications including retinopathy, nephropathy, peripheral neuropathy, and ischemic heart disease ( Diagnosis and classification of diabetes mellitus, 2013 ). Other well-known examples include hypertension and stroke, smoking and lung cancer, asbestos and mesothelioma, and LDL and HDL cholesterol concentrations and risk of ischemic heart disease ( Churg, 1988 ; Gordon, Kannel, Castelli & Dawber, 1981 ; Kannel, Wolf, Verter & McNamara, 1970 ; Pirie, Peto, Reeves, Green & Beral, 2013 ). In the case of complications due to hyperglycemia, high LDL-cholesterol concentration, and hypertension, clinical trials to reduce these levels have resulted in reductions in the rate of these outcomes, further supporting a causal association ( SHEP Cooperative Research Group, 1991 ; Scandinavian Simvastatin Survival Study (4S), 1994 ; UKPDS 33, 1998 ; UKPDS 34, 1998 ). For many associations that involve an exposure that cannot be controlled by the investigator or should not be modified for ethical reasons, observational research may be the only avenue for direct testing of these associations in humans.

Causal Inference from Observational Research

The results of an observational research study are never interpreted in an information vacuum. Given the potential for bias with this study design, a number of other factors should be considered when weighing the strength of this evidence. First and foremost would be the replication of the finding in other observational research studies. Additional evidence to bolster the potential causal association would be support from the biological understanding of underlying mechanisms, animal experiments confirming that the exposure results in a similar outcome, and trend data in disease incidence following changes in exposure prevalence. For example, in the UK Million Women Study, where median age was reported at 55 years, women who quit smoking completely at ages 25–34 or 35–44 years had only 3% and 10% of the excess mortality, respectively, seen among women who were continuing smokers ( Pirie, Peto, Reeves, Green & Beral, 2013 ). Coronary heart disease deaths in the U.S. declined by approximately 50% between 1980 and 2000. One analysis that addressed the reasons for this decline concluded that change in risk factors (reductions in total cholesterol concentration, systolic blood pressure, smoking, and physical inactivity) accounted for approximately 47% of this decrease ( Ford et al., 2007 ). These trends provide support for a causal association between smoking and lung cancer, and between multiple cardiovascular disease risk factors and coronary death risk.

Pharmacoepidemiology

Many questions regarding the use of pharmaceuticals may never be answered through use of RCTs, thereby creating a need to address knowledge gaps using observational research. The specialized field of pharmacoepidemiology directly addresses these needs. The earliest appearance of the term “pharmacoepidemiology” on PubMed.com is in an article written in 1984 ( Lawson, 1984 ). The field of pharmacoepidemiology encompasses the use of observational research to assess pharmaceutical safety and effectiveness. For example, diabetes pharmaceuticals have received FDA approval based on efficacy at lowering glucose and safety, without the need to prove efficacy at preventing long-term complications. The sulfonylurea hypoglycemic agents glyburide and glipizide are in widespread use to manage the hyperglycemia of diabetes, but it is not clear whether one is associated with a greater reduction in hard outcomes such as mortality or diabetes complications, as this has not been tested in a clinical trial. Use of such surrogate endpoints as opposed to the hard outcomes one wishes to prevent has been criticized as an ineffective and potentially harmful approach to medication approval ( Fleming & DeMets, 1996 ; Psaty et al., 1999 ). Design of clinical trials to address hard as opposed to surrogate endpoints typically requires larger sample size, longer follow-up, and greater costs.

Observational research may also identify adverse effects associated with the use of pharmaceuticals that were not anticipated based on research conducted in support of the drug approval process. The withdrawal of the thiazolidinedione agent troglitazone from the U.S. market in 2000 followed reports of cases of severe liver toxicity during post-marketing surveillance. Similar data on a high number of reported cases of severe myopathies in cerivastatin users led to its withdrawal from the worldwide market in 2001 ( Furberg & Pitt, 2001 ). An observational study using administrative claims databases to assess the relative safety of lipid lowering medications in the U.S. between 2000 and 2004 reported a much higher risk of hospitalization for treatment of myopathy among cerivastatin users compared to users of other statin and non-statin lipid lowering agents ( Cziraky et al., 2006 ).

Confounding by Indication

As with other observational research designs, pharmacoepidemiology is limited by the biases previously described, but in addition to these is the vexing phenomenon of confounding by indication, also referred to as channeling bias ( McMahon & MacDonald, 2000 ; Petri & Urquhart, 1991 ). This refers to an observed benefit (or harm) associated with a pharmaceutical that is due to the indications for treatment with it and not to a medication effect. A hypothetical example of how confounding by indication results in outcome differences not due to medication effect is shown in Figure 2 , which illustrates how the choice of a diabetes pharmaceutical may depend on the existence of a condition (higher serum creatinine reflecting lower GFR) associated with higher mortality risk ( Fox et al., 2012 ).

Figure 2. A hypothetical population of 2000 identical persons with type 2 diabetes differing only by renal function as measured by serum creatinine and assigned to either metformin or glipizide based on the serum creatinine level. The active treatment, though, is never dispensed, and is instead replaced with an identical placebo. An expected difference in mortality is seen between the two groups given the association between poorer renal function and mortality in the glipizide group. This difference cannot be explained by the effect of the active pharmaceutical (since there was none) and therefore represents an example of confounding by indication.

Several approaches exist to the problem of confounding by indication. If there is no association between the indication for the pharmaceutical and the outcome of interest, then no bias will occur, since the indication must be associated with both the treatment and the outcome to yield a biased result. This same principle applies to all confounding factors ( van Stralen, Dekker, Zoccali & Jager, 2010 ). If the conditions for confounding are fulfilled, then statistical adjustment techniques are available to produce unbiased estimates of effect. Commonly used methods in biomedical research include linear regression analysis for continuous outcomes, logistic regression for categorical outcomes, and the Cox proportional hazards model for time-to-event outcomes. In addition, propensity scores have risen in popularity over the past decade. An “all fields” search of PubMed conducted January 15, 2012 using the search term “propensity score” yielded 2,895 hits for the immediate past 5 years, and only 715 hits for the previous 5 years. The propensity score method models the probability of exposure in relation to predictor variables, and therefore estimates the likelihood, in the case of a pharmacoepidemiology study, of a subject receiving a particular pharmaceutical based on his or her characteristics ( Rubin, 2010 ). An additional step is required, which uses the standard adjustment methods mentioned previously to remove the bias associated with varying likelihood of receiving the pharmaceutical. Despite the rising popularity of this method, it has been demonstrated to be merely equivalent and sometimes inferior to standard multivariate adjustment methods ( Shah, Laupacis, Hux & Austin, 2005 ; Sturmer et al., 2006 ). Furthermore, propensity scores cannot address the issue of unmeasured confounding ( Cummings, 2008 ). If the indications for the pharmaceutical cannot be determined from the other measured factors, neither multivariate adjustment nor propensity scores will allow for adjustment and removal of bias.
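A small worked example may help show what a propensity score does when the indication is measured. The sketch below assumes a scenario in the spirit of Figure 2: serum creatinine category drives the choice between glipizide and metformin, and neither drug has any effect on mortality. All counts are hypothetical and are not taken from the figure. The propensity score is estimated here simply as the proportion receiving glipizide within each creatinine category, and the second adjustment step uses inverse probability weighting, which is one of several possible choices and is an assumption of this sketch.

```python
# Hypothetical counts (not the Figure 2 data): for each creatinine category,
# (number treated, deaths) for glipizide and for metformin. Mortality depends
# only on creatinine: 5% when normal, 15% when elevated, in both drug groups.
strata = [
    {"creatinine": "normal", "glipizide": (100, 5), "metformin": (900, 45)},
    {"creatinine": "elevated", "glipizide": (800, 120), "metformin": (200, 30)},
]


def crude_risk_difference(strata) -> float:
    """Mortality risk in glipizide users minus risk in metformin users, unadjusted."""
    n_g = sum(s["glipizide"][0] for s in strata)
    d_g = sum(s["glipizide"][1] for s in strata)
    n_m = sum(s["metformin"][0] for s in strata)
    d_m = sum(s["metformin"][1] for s in strata)
    return d_g / n_g - d_m / n_m


def ipw_risk_difference(strata) -> float:
    """Risk difference after inverse-probability-of-treatment weighting.

    The propensity score is the within-stratum probability of receiving
    glipizide. Glipizide users get weight 1/ps, metformin users 1/(1 - ps).
    """
    wd_g = wn_g = wd_m = wn_m = 0.0
    for s in strata:
        n_g, d_g = s["glipizide"]
        n_m, d_m = s["metformin"]
        ps = n_g / (n_g + n_m)  # estimated propensity score for this stratum
        wd_g += d_g / ps
        wn_g += n_g / ps
        wd_m += d_m / (1 - ps)
        wn_m += n_m / (1 - ps)
    return wd_g / wn_g - wd_m / wn_m


crude = crude_risk_difference(strata)
weighted = ipw_risk_difference(strata)
print(f"crude risk difference:    {crude:.4f}")
print(f"weighted risk difference: {weighted:.4f}")
```

The weighting removes the spurious mortality difference only because creatinine was recorded. If the variable that drives prescribing were unmeasured, the propensity score could not be estimated from it and the crude difference would remain.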

Several design features of observational studies may increase the likelihood of confounding by indication but, if recognized, may be amenable to correction in the design or analysis phases of a study. Assessing outcomes for pharmaceuticals prescribed for different indications, or comparing populations who differ with regard to the presence of medication contraindications, may introduce bias into comparisons. An assessment of the mortality risk associated with beta-blocker use compared to other antihypertensive medications should exclude participants in whom beta blockers but not other antihypertensive medications are prescribed for other indications, such as migraine headache or stage fright prophylaxis, as these conditions may be associated with better outcomes and lead to an over-optimistic estimate of survival benefit. Also, failure to consider medication contraindications may lead to risk of the outcome differing by medication used, as seen in the example in Figure 2 , where contraindications would lead to a higher frequency of subjects with renal insufficiency in the glipizide treatment group for hyperglycemia. To account for this potential bias, subjects with contraindications for use of any of the pharmaceuticals of interest in the comparison should be eliminated from the study. For example, recent studies of mortality and cardiovascular events among users of sulfonylurea or metformin monotherapy for treatment of diabetes in the Veterans Health Administration system excluded patients with serious medical conditions at baseline that might influence the prescription of diabetes medication ( Roumie et al., 2012 ; Wheeler et al., 2013 ). Items on the list of exclusions included congestive heart failure, serum creatinine concentration of 1.5 mg/dl or greater, HIV, and other conditions described in the cited publications. Despite these design features and adjustment methods to correct for factors associated with a particular prescription that may also be associated with a different outcome risk, there will always be some uncertainty about the presence of bias due to residual confounding by indication.

Methods to Improve Causal Inference from Observational Research

Instrumental variables analysis has been promoted as a method to overcome the inability to exclude undetected confounding in observational research. This method involves identification of a factor that strongly predicts treatment (or exposure in an epidemiologic study not involving a pharmaceutical). This factor is referred to as an “instrument,” and it is used in a manner analogous to the intention to treat analysis employed in RCTs ( Thomas & Conti, 2004 ). A Mendelian Randomization study is a type of instrumental variable analysis that uses a genetic marker as the instrument ( Thomas & Conti, 2004 ). Although intriguing in concept, the difficulty is in the application, as this relies on finding an “instrument” that (1) is causally related to treatment but not to unobserved risk factors for the outcome, and (2) influences the outcome only through its effect on treatment ( Hernan & Robins, 2006 ). This method is being explored in pharmacoepidemiologic investigations, with one example being use of physician prescribing preference for types of NSAIDs in the evaluation of the gastrointestinal toxicity of COX-2 inhibitors versus non-COX-2 inhibitor NSAIDs ( Brookhart, Wang, Solomon & Schneeweiss, 2006 ). This analysis reported a protective association with COX-2 inhibitors only in the instrumental variable analysis, leading the authors to conclude that this analysis resulted in a reduction in unmeasured confounding. Examples can also be found in the diabetes epidemiology literature, such as the lack of association between serum uric acid level and type 2 diabetes risk ( Pfister et al., 2011 ), and higher risk associated with lower sex hormone-binding globulin concentration ( Ding et al., 2009 ).
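The analogy to an intention to treat analysis can be illustrated with the simplest instrumental variable estimator for a binary instrument, the Wald (ratio) estimator. This is offered as a minimal sketch, not as the method used in the studies cited above. The instrument here is a hypothetical indicator of physician prescribing preference, all counts are invented for illustration, and the estimate is only interpretable if the two instrument conditions listed in the text hold.

```python
from dataclasses import dataclass


@dataclass
class InstrumentGroup:
    """Summary counts for patients sharing one value of a binary instrument
    (hypothetical data, e.g. physicians who do or do not prefer the drug)."""
    n_patients: int
    n_treated: int
    n_events: int

    @property
    def treated_fraction(self) -> float:
        return self.n_treated / self.n_patients

    @property
    def event_risk(self) -> float:
        return self.n_events / self.n_patients


def wald_estimate(prefers: InstrumentGroup, does_not_prefer: InstrumentGroup) -> float:
    """Wald instrumental variable estimate of the treatment effect (risk difference).

    Numerator: outcome difference by instrument, compared as assigned by the
    instrument regardless of treatment actually received.
    Denominator: difference in the fraction actually treated.
    """
    outcome_difference = prefers.event_risk - does_not_prefer.event_risk
    uptake_difference = prefers.treated_fraction - does_not_prefer.treated_fraction
    return outcome_difference / uptake_difference


# Hypothetical: 70% of patients are treated when the physician prefers the drug,
# 30% otherwise; event risks are 8% and 10% respectively.
prefers = InstrumentGroup(n_patients=1000, n_treated=700, n_events=80)
does_not_prefer = InstrumentGroup(n_patients=1000, n_treated=300, n_events=100)

effect = wald_estimate(prefers, does_not_prefer)
print(f"instrumental variable estimate of risk difference: {effect:.3f}")
```

Because the denominator is the difference in treatment uptake, a weak instrument (uptake fractions that are nearly equal) inflates both the estimate and its uncertainty, which is one reason the text stresses that the instrument must strongly predict treatment.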

Conclusions

As it will not be possible to assess efficacy of all possible treatment comparisons in all possible groups of interest, or identify adverse (or unexpected beneficial) outcomes requiring longer follow-up or greater sample size using RCTs, observational research stands prepared to step forward to address these knowledge gaps. Much medical knowledge and practice currently rests on a foundation of observational research. Perhaps this is not noticed due to the gloss and novelty of recently completed RCTs. Little research has been conducted comparing results from observational and clinical trial designs, but that which has been completed finds generally good agreement in these findings. With any observational research finding, though, comes less certainty due to the inability to completely exclude the possibility of residual confounding or, in the case of a pharmaceutical, confounding by indication. However, the expectation of absolute certainty is unrealistic and inconsistent with the current practice of medicine, where decisions are made probabilistically, with the threshold for actions such as further testing or treatment varying widely depending on the comparative costs and benefits of true and false positive and negative decisions ( Boland & Lehmann, 2010 ; Pauker & Kassirer, 1980 ; Plasencia, Alderman, Baron, Rolfs & Boyko, 1992 ). Observational research has had and will continue to have an important role in providing the information needed to improve medical decision-making. There is always room for improvement, and the hope is that the future will bring better methods to further reduce the uncertainty surrounding the validity of its results.

Acknowledgments

Grant Support: VA Epidemiologic Research and Information Center; the Diabetes Research Center at the University of Washington (DK-017047)

Thanks to James S. Floyd, MD, for his careful review of this manuscript. The work was supported by the VA Epidemiologic Research and Information Center; the Diabetes Research Center at the University of Washington (DK-017047); and VA Puget Sound Health Care System.


  • Prevention of stroke by antihypertensive drug treatment in older persons with isolated systolic hypertension. Final results of the Systolic Hypertension in the Elderly Program (SHEP). SHEP Cooperative Research Group. JAMA. 1991; 265 :3255–3264. [ PubMed ] [ Google Scholar ]
  • Randomised trial of cholesterol lowering in 4444 patients with coronary heart disease: the Scandinavian Simvastatin Survival Study (4S). Lancet. 1994; 344 :1383–1389. [ PubMed ] [ Google Scholar ]
  • Effect of intensive blood-glucose control with metformin on complications in overweight patients with type 2 diabetes (UKPDS 34). UK Prospective Diabetes Study (UKPDS) Group. Lancet. 1998; 352 :854–865. [ PubMed ] [ Google Scholar ]
  • Intensive blood-glucose control with sulphonylureas or insulin compared with conventional treatment and risk of complications in patients with type 2 diabetes (UKPDS 33). UK Prospective Diabetes Study (UKPDS) Group. Lancet. 1998; 352 :837–853. [ PubMed ] [ Google Scholar ]
  • Consumption of sweet beverages and type 2 diabetes incidence in European adults: results from EPIC-InterAct. Diabetologia. 2013; 56 :1520–1530. [ PubMed ] [ Google Scholar ]
  • Diagnosis and classification of diabetes mellitus. Diabetes Care. 2013; 36 (Suppl 1):S67–74. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Benson K, Hartz AJ. A comparison of observational studies and randomized, controlled trials. N Engl J Med. 2000; 342 :1878–1886. [ PubMed ] [ Google Scholar ]
  • Boland MV, Lehmann HP. A new method for determining physician decision thresholds using empiric, uncertain recommendations. BMC Med Inform Decis Mak. 2010; 10 :20. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Brookhart MA, Wang PS, Solomon DH, Schneeweiss S. Evaluating short-term drug effects using a physician-specific prescribing preference as an instrumental variable. Epidemiology. 2006; 17 :268–275. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Churg A. Chrysotile, tremolite, and malignant mesothelioma in man. Chest. 1988; 93 :621–628. [ PubMed ] [ Google Scholar ]
  • Concato J, Shah N, Horwitz RI. Randomized, controlled trials, observational studies, and the hierarchy of research designs. N Engl J Med. 2000; 342 :1887–1892. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Cummings P. Propensity scores. Arch Pediatr Adolesc Med. 2008; 162 :734–737. [ PubMed ] [ Google Scholar ]
  • Cziraky MJ, Willey VJ, McKenney JM, Kamat SA, Fisher MD, Guyton JR, et al. Statin safety: an assessment using an administrative claims database. Am J Cardiol. 2006; 97 :61C–68C. [ PubMed ] [ Google Scholar ]
  • Ding EL, Song Y, Manson JE, Hunter DJ, Lee CC, Rifai N, et al. Sex hormone-binding globulin and risk of type 2 diabetes in women and men. N Engl J Med. 2009; 361 :1152–1163. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • DiPietro NA. Methods in epidemiology: observational study designs. Pharmacotherapy. 2010; 30 :973–984. [ PubMed ] [ Google Scholar ]
  • Efird J. Blocked randomization with randomly selected block sizes. Int J Environ Res Public Health. 2011; 8 :15–20. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Fleming TR, DeMets DL. Surrogate end points in clinical trials: are we being misled? Ann Intern Med. 1996; 125 :605–613. [ PubMed ] [ Google Scholar ]
  • Ford ES, Ajani UA, Croft JB, Critchley JA, Labarthe DR, Kottke TE, et al. Explaining the decrease in U.S. deaths from coronary disease, 1980–2000. N Engl J Med. 2007; 356 :2388–2398. [ PubMed ] [ Google Scholar ]
  • Fox CS, Matsushita K, Woodward M, Bilo HJ, Chalmers J, Heerspink HJ, et al. Associations of kidney disease measures with mortality and end-stage renal disease in individuals with and without diabetes: a meta-analysis. Lancet. 2012; 380 :1662–1673. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Furberg CD, Pitt B. Withdrawal of cerivastatin from the world market. Curr Control Trials Cardiovasc Med. 2001; 2 :205–207. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Gordon T, Kannel WB, Castelli WP, Dawber TR. Lipoproteins, cardiovascular disease, and death. The Framingham study. Arch Intern Med. 1981; 141 :1128–1131. [ PubMed ] [ Google Scholar ]
  • Grimes DA, Schulz KF. Bias and causal associations in observational research. Lancet. 2002; 359 :248–252. [ PubMed ] [ Google Scholar ]
  • Hernan MA, Robins JM. Instruments for causal inference: an epidemiologist’s dream? Epidemiology. 2006; 17 :360–372. [ PubMed ] [ Google Scholar ]
  • Hill AB. The Environment and Disease: Association or Causation? Proceedings of the Royal Society of Medicine. 1965; 58 :295–300. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Kannel WB, Wolf PA, Verter J, McNamara PM. Epidemiologic assessment of the role of blood pressure in stroke. The Framingham study. JAMA. 1970; 214 :301–310. [ PubMed ] [ Google Scholar ]
  • Knowler WC, Williams RC, Pettitt DJ, Steinberg AG. Gm3;5,13,14 and type 2 diabetes mellitus: an association in American Indians with genetic admixture. American journal of human genetics. 1988; 43 :520–526. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Lawson DH. Pharmacoepidemiology: a new discipline. Br Med J (Clin Res Ed) 1984; 289 :940–941. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Leibel B. An analysis of the University Group Diabetes Study Program: data results and conslusions. Can Med Assoc J. 1971; 105 :292–294. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • McMahon AD, MacDonald TM. Design issues for drug epidemiology. Br J Clin Pharmacol. 2000; 50 :419–425. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Meldrum ML. A brief history of the randomized controlled trial. From oranges and lemons to the gold standard. Hematol Oncol Clin North Am. 2000; 14 :745–760. vii. [ PubMed ] [ Google Scholar ]
  • Pauker SG, Kassirer JP. The threshold approach to clinical decision making. N Engl J Med. 1980; 302 :1109–1117. [ PubMed ] [ Google Scholar ]
  • Petri H, Urquhart J. Channeling bias in the interpretation of drug effects. Stat Med. 1991; 10 :577–581. [ PubMed ] [ Google Scholar ]
  • Pfister R, Barnes D, Luben R, Forouhi NG, Bochud M, Khaw KT, et al. No evidence for a causal link between uric acid and type 2 diabetes: a Mendelian randomisation approach. Diabetologia. 2011; 54 :2561–2569. [ PubMed ] [ Google Scholar ]
  • Phillips CV, Goodman KJ. The missed lessons of Sir Austin Bradford Hill. Epidemiol Perspect Innov. 2004; 1 :3. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Pirie K, Peto R, Reeves GK, Green J, Beral V. The 21st century hazards of smoking and benefits of stopping: a prospective study of one million women in the UK. Lancet. 2013; 381 :133–141. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Plasencia CM, Alderman BW, Baron AE, Rolfs RT, Boyko EJ. A method to describe physician decision thresholds and its application in examining the diagnosis of coronary artery disease based on exercise treadmill testing. Med Decis Making. 1992; 12 :204–212. [ PubMed ] [ Google Scholar ]
  • Prentice RL, Langer R, Stefanick ML, Howard BV, Pettinger M, Anderson G, et al. Combined postmenopausal hormone therapy and cardiovascular disease: toward resolving the discrepancy between observational studies and the Women’s Health Initiative clinical trial. Am J Epidemiol. 2005; 162 :404–414. [ PubMed ] [ Google Scholar ]
  • Psaty BM, Weiss NS, Furberg CD, Koepsell TD, Siscovick DS, Rosendaal FR, et al. Surrogate end points, health outcomes, and the drug-approval process for the treatment of risk factors for cardiovascular disease. JAMA. 1999; 282 :786–790. [ PubMed ] [ Google Scholar ]
  • Rothman KJ, Greenland S. Causation and causal inference in epidemiology. Am J Public Health. 2005; 95 (Suppl 1):S144–150. [ PubMed ] [ Google Scholar ]
  • Roumie CL, Hung AM, Greevy RA, Grijalva CG, Liu X, Murff HJ, et al. Comparative effectiveness of sulfonylurea and metformin monotherapy on cardiovascular events in type 2 diabetes mellitus: a cohort study. Ann Intern Med. 2012; 157 :601–610. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Rubin DB. Propensity score methods. Am J Ophthalmol. 2010; 149 :7–9. [ PubMed ] [ Google Scholar ]
  • Rubin RR, Fujimoto WY, Marrero DG, Brenneman T, Charleston JB, Edelstein SL, et al. The Diabetes Prevention Program: recruitment methods and results. Control Clin Trials. 2002; 23 :157–171. [ PubMed ] [ Google Scholar ]
  • Shah BR, Laupacis A, Hux JE, Austin PC. Propensity score methods gave similar results to traditional regression modeling in observational studies: a systematic review. J Clin Epidemiol. 2005; 58 :550–559. [ PubMed ] [ Google Scholar ]
  • Sobek M, Cleveland L, Flood S, Hall PK, King ML, Ruggles S, et al. Big Data: Large-Scale Historical Infrastructure from the Minnesota Population Center. Hist Methods. 2011; 44 :61–68. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Sturmer T, Joshi M, Glynn RJ, Avorn J, Rothman KJ, Schneeweiss S. A review of the application of propensity score methods yielded increasing use, advantages in specific settings, but not substantially different estimates compared with conventional multivariable methods. J Clin Epidemiol. 2006; 59 :437–447. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Thomas DC, Conti DV. Commentary: the concept of ‘Mendelian Randomization’ Int J Epidemiol. 2004; 33 :21–25. [ PubMed ] [ Google Scholar ]
  • Tinker LF, Sarto GE, Howard BV, Huang Y, Neuhouser ML, Mossavar-Rahmani Y, et al. Biomarker-calibrated dietary energy and protein intake associations with diabetes risk among postmenopausal women from the Women’s Health Initiative. Am J Clin Nutr. 2011; 94 :1600–1606. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • van Stralen KJ, Dekker FW, Zoccali C, Jager KJ. Confounding. Nephron Clin Pract. 2010; 116 :c143–147. [ PubMed ] [ Google Scholar ]
  • Wheeler S, Moore K, Forsberg CW, Riley K, Floyd JS, Smith NL, et al. Mortality among veterans with type 2 diabetes initiating metformin, sulfonylurea or rosiglitazone monotherapy. Diabetologia 2013 [ PubMed ] [ Google Scholar ]
  • Wysowski DK, Swartz L. Adverse drug event surveillance and drug withdrawals in the United States, 1969–2002: the importance of reporting suspected reactions. Arch Intern Med. 2005; 165 :1363–1369. [ PubMed ] [ Google Scholar ]


Published on 4.4.2024 in Vol 10 (2024)

Comparison of the Real-World Reporting of Symptoms and Well-Being for the HER2-Directed Trastuzumab Biosimilar Ogivri With Registry Data for Herceptin in the Treatment of Breast Cancer: Prospective Observational Study (OGIPRO) of Electronic Patient-Reported Outcomes


Original Paper

  • Andreas Trojan 1, 2 , MD   ; 
  • Sven Roth 3 , BSc   ; 
  • Ziad Atassi 2 , MD   ; 
  • Michael Kiessling 2 , MD, PhD   ; 
  • Reinhard Zenhaeusern 4 , MD   ; 
  • Yannick Kadvany 5 , MA   ; 
  • Johannes Schumacher 6 , PhD   ; 
  • Gerd A Kullak-Ublick 1 , Prof Dr Med   ; 
  • Matti Aapro 7 , MD   ; 
  • Alexandru Eniu 8 , MD  

1 Department of Clinical Pharmacology and Toxicology, University Hospital Zurich, University of Zurich, Zurich, Switzerland

2 BrustZentrum Zürichsee, Horgen, Switzerland

3 Faculty of Medicine, University of Zurich, Zurich, Switzerland

4 Onkologie, Spital Oberwallis, Brig, Switzerland

5 Mobile Health AG, Zurich, Switzerland

6 Palleos Healthcare, Wiesbaden, Germany

7 Cancer Center, Clinique de Genolier, Genolier, Switzerland

8 Hôpital Riviera-Chablais, Rennaz, Switzerland

Corresponding Author:

Andreas Trojan, MD

Department of Clinical Pharmacology and Toxicology

University Hospital Zurich

University of Zurich

Rämistrasse 100

Zurich, 8091

Switzerland

Phone: 41 76 34 30 200

Email: [email protected]

Background: Trastuzumab has had a major impact on the treatment of human epidermal growth factor receptor 2 (HER2)-positive breast cancer (BC). Anti-HER2 biosimilars such as Ogivri have demonstrated safety and clinical equivalence to trastuzumab (using Herceptin as the reference product) in clinical trials. To our knowledge, there has been no real-world report of the side effects and quality of life (QoL) in patients treated with biosimilars using electronic patient-reported outcomes (ePROs).

Objective: The primary objective of this prospective observational study (OGIPRO study) was to compare the ePRO data related to treatment side effects collected with the medidux app in patients with HER2-positive BC treated with the trastuzumab biosimilar Ogivri (prospective cohort) to those obtained from historical cohorts treated with Herceptin alone or combined with pertuzumab and/or chemotherapy (ClinicalTrials.gov NCT02004496 and NCT03578731).

Methods: Patients were treated with Ogivri alone or combined with pertuzumab and/or chemotherapy and hormone therapy in (neo)adjuvant and palliative settings. Patients used the medidux app to dynamically record symptoms (according to the Common Terminology Criteria for Adverse Events [CTCAE]), well-being (according to the Eastern Cooperative Oncology Group Performance Status scale), QoL (using the EQ-5D-5L questionnaire), cognitive capabilities, and vital parameters over 6 weeks. The primary endpoint was the mean CTCAE score. Key secondary endpoints included the mean well-being score. Data of this prospective cohort were compared with those of the historical cohorts (n=38 patients; median age 51, range 31-78 years).

Results: Overall, 53 female patients with a median age of 54 years (range 31-87 years) were enrolled in the OGIPRO study. The mean CTCAE score was analyzed in 50 patients with available data on symptoms, while the mean well-being score was evaluated in 52 patients with available data. The most common symptoms reported in both cohorts included fatigue, taste disorder, nausea, diarrhea, dry mucosa, joint discomfort, tingling, sleep disorder, headache, and appetite loss. Most patients experienced minimal (grade 0) or mild (grade 1) toxicities in both cohorts. The mean CTCAE score was comparable between the prospective and historical cohorts (29.0 and 30.3, respectively; mean difference –1.27, 95% CI –7.24 to 4.70; P =.68). Similarly, no significant difference was found for the mean well-being score between the groups treated with the trastuzumab biosimilar Ogivri and Herceptin (74.3 and 69.8, respectively; mean difference 4.45, 95% CI –3.53 to 12.44; P =.28).

Conclusions: Treatment of patients with HER2-positive BC with the trastuzumab biosimilar Ogivri resulted in equivalent symptoms, adverse events, and well-being as found for patients treated with Herceptin as determined by ePRO data. Hence, integration of an ePRO system into research and clinical practice can provide reliable information when investigating the real-world tolerability and outcomes of similar therapeutic compounds.

Trial Registration: ClinicalTrials.gov NCT05234021; https://clinicaltrials.gov/study/NCT05234021

Introduction

Biosimilars and reference biologics play a key role in the treatment of cancer and account for approximately 70% of the growth in costs of drugs from 2010 to 2015 [ 1 ]. Therefore, pricing is an important challenge for the medical society and biosimilars offer an attractive option for a value-based care environment with cost-saving potential [ 2 ].

Trastuzumab (Herceptin), a human epidermal growth factor receptor 2 (HER2) antibody, has had a major impact on the treatment of patients with HER2-positive breast cancer (BC) worldwide and now has indications for the treatment of small tumors in both (neo)adjuvant and palliative settings [ 3 , 4 ]. This provides a good opportunity to compare the efficacy and safety of trastuzumab biosimilars to those of trastuzumab in clinical trials. For several anti-HER2 biosimilars, safety and clinical equivalence to the reference product have been demonstrated [ 2 , 5 ]. In a randomized, parallel-group phase 3 equivalence study of patients with HER2-positive metastatic BC, Rugo et al [ 6 ] demonstrated equivalent efficacy and similar safety profiles between the trastuzumab biosimilar Ogivri (MYL-1401O) and trastuzumab (Herceptin).

The enhanced assessment of electronic patient-reported outcomes (ePROs) in clinical routine and cancer trials is of growing interest [ 7 - 9 ]. Several studies indicate that the proactive use of PROs can identify otherwise undetected symptoms and improve symptom management for patients with various types of cancer [ 9 ] as well as offer improvements in well-being and awareness of adverse events (AEs) between outpatient visits. Using a mobile app, especially in collaboration with the treating physician, might improve clinical care in patients with early or advanced disease [ 10 - 13 ]. In addition, the benefits of digital patient monitoring have been demonstrated during immune and targeted cancer therapies in terms of more efficient symptom assessment and patient-physician communication as well as a reduced need for telephone consultations [ 14 ].

Medidux is an interactive patient empowerment app that enables physicians, especially oncologists, to monitor the progression of well-being and symptoms of patients undergoing cancer treatment. Based on the documented symptom progression, the software notifies patients to contact the treatment team if symptoms defined according to the Common Terminology Criteria for Adverse Events (CTCAE) standards are outside the acceptable range. More than 110 available symptoms and severity classifications (according to the CTCAE), as well as high numbers of standardized symptom reports from patients, contribute to the collection of high-quality ePRO data for the timely management of treatment-related AEs and toxicities and their communication to treatment teams [ 11 , 13 ]. Thus, the medidux smartphone app is helpful to stabilize daily functional activities and leads to more frequent reporting of AEs and more precise entries regarding symptoms [ 11 ]. The continuous measurement of ePROs enables structured and standardized data recording of patients’ daily health state.

An increased level of concordance (κ=0.68) for common symptoms, including pain, fever, diarrhea, constipation, nausea, vomiting, and stomatitis, between the patient and treating physician was recently demonstrated for the medidux platform [ 13 ].

However, to the best of our knowledge, no real-world observation of side effects, tolerability, and quality of life (QoL) has been performed using ePRO data collected from patients treated with anti-HER2 biosimilars. Thus, the aim of this observational study was to investigate real-world data on daily functional activity, symptoms, and therapy side effects recorded with the medidux smartphone app in patients undergoing Ogivri antibody therapy. In addition, historical ePRO data of 38 patients with HER2-positive BC treated with Herceptin from two previous studies [ 7 , 13 ] were used for comparative analysis.

Study Design

OGIPRO was a noninterventional, multicenter, prospective, observational study conducted at 5 study sites in Switzerland over a duration of 20 months.

Patients 18 years and older with a histologically or cytologically proven diagnosis of HER2-positive primary, locally advanced, or metastatic BC were eligible to participate after providing written informed consent. In addition, patients had to own a personal iOS or Android smartphone.

Eligible patients received anti-HER2 treatment containing the trastuzumab biosimilar Ogivri (initial dose of 8 mg/kg body weight [BW] intravenously, followed by 6 mg/kg BW) with or without pertuzumab and/or chemotherapy and hormone therapy in (neo)adjuvant and palliative settings. At the beginning of the study, patients were provided with the medidux app and were prompted to record their symptoms, well-being, EQ-5D-5L questionnaires, cognitive capabilities, and vital parameters every day. Patients underwent 3 regular study visits scheduled on days 1, 21, and 42 during their 3 weekly chemotherapeutic interventions. All anticancer treatments used in this study were approved drugs, and the therapy was compliant with national treatment guidelines.

The observational period for each patient was 6 weeks. At the end of the observational period, patients decided whether to continue their therapy with the biosimilar Ogivri or with the reference substance Herceptin.

After the study observational period, prospectively collected data of patients treated with Ogivri (prospective cohort) were compared to historical ePRO data of patients treated with Herceptin (historical cohort) in two previous studies: a prospective randomized controlled trial (PRO1 study; ClinicalTrials.gov NCT02004496) of 139 patients with early stage BC who underwent chemotherapy [ 7 ] and an observational noninterventional trial (Consilium1 study; ClinicalTrials.gov NCT03578731) of patients with breast, colon, prostate, or lung cancer undergoing cancer treatment [ 13 ]. In both studies, patients were encouraged to document data on well-being and standardized symptoms using earlier versions of the medidux app during the course of their therapies. More than 5000 continuously measured data entries from 38 patients overall (14 from Consilium1 and 24 from PRO1) were available for the comparative analysis [ 7 , 13 ]. The historical ePRO data were recorded in the same manner using the earlier versions of the mobile app [ 11 ] and were therefore comparable to the prospective ePRO data.

Ethical Considerations

This study was approved by the Swiss Institutional Review Board (KEK-ZH: 2021-D0051) and was conducted in accordance with the principles of the Declaration of Helsinki (current version). The study was also registered on ClinicalTrials.gov (NCT05234021). All patients in the prospective and historical cohorts provided written informed consent prior to enrollment and were informed that participation in the study is voluntary and can be revoked at any time. All study documents were deidentified by assigning a unique ID to each patient. Functional data security was ensured by identification only made possible via the patient’s ID. The data on the patients’ devices were encapsulated in the app and the data exchange was encrypted with the patient’s ID. There was no compensation provided to participants.

Primary Objective

The primary objective of the study was to evaluate ePRO data reported in the medidux app by patients with HER2-positive BC treated with the trastuzumab biosimilar Ogivri with respect to their treatment side effects and to compare these data with ePRO data obtained from a historical cohort of 38 patients treated with Herceptin in two previous studies (NCT02004496, NCT03578731) [ 7 , 11 , 13 ]. No difference was expected for the CTCAE score between the two cohorts. The aims of the study were therefore to confirm that the average CTCAE scores were similar in both cohorts and that the recording of side effects with the app was reliable.

Secondary Objectives

Secondary objectives included well-being in both cohorts as well as electronically reported symptoms with respect to the therapy regimen and demographic characteristics only in the prospective cohort.

The medidux app (version 3.2) used in the study is a patient-centered, therapy-accompanying app that supports the structured, standardized, and dynamic documentation of symptoms and therapy side effects. Use of this tool does not represent an invasive intervention on the patient and consequently did not pose any specific risks of physical injury.

Data Collection

The app has two basic components: (1) a browser-based app for the treatment team (web app) and (2) a mobile app for cancer patients. There was no need for 24-hour monitoring by medical staff in connection with use of the app.

The medidux app for patients enabled recording symptoms, vital signs, and well-being in a structured and standardized manner. Patients were encouraged to regularly enter data on symptoms according to the CTCAE (version 4.0), general well-being according to the Eastern Cooperative Oncology Group Performance Status (ECOG PS), EQ-5D-5L questionnaire (weekly), vital signs (weight, blood glucose, blood pressure, and pulse), and optionally a neuropsychological cognitive test (Trail Making Test [TMT]), concomitant medications, and private notes. Patients were asked daily about their general well-being and symptoms using a visual analog scale. Recording usually started on the day of therapy initiation (or the change in therapy) and continued through an observational period of 6 weeks. The frequency of app use and data entry was logged throughout the course of the study treatment, which served as an indicator of patients’ active participation in the study and as a relevant process parameter for evaluating the usability of the app itself.

The mobile app also recommended contacting the investigator or treatment site in case of high intensity of symptoms (ie, treatment side effects). Furthermore, the app provided patients with self-efficacious recommendations and tips on how to treat and reduce treatment side effects.

Recording of AEs

AEs in the app were classified according to the CTCAE (version 4.0). For the app, grade 5 “Death related to AE” had been removed. Instead, category 0 was added, representing no or very mild symptoms. The 5 severity levels were translated into a visual analog scale from 0.1 to 10, with 0.1 representing the weakest possible symptom and 10 representing the strongest possible symptom. Scores 0.1-2.0 corresponded to grade 0, scores 2.1-4.0 corresponded to grade 1, scores 4.1-6 corresponded to grade 2, scores 6.1-8 corresponded to grade 3, and scores 8.1-10 corresponded to grade 4 AEs. When patients selected a score between 0.1 and 10, they received a summary and information for the selected range, which was displayed in the app. Classification into adapted grades based on the CTCAE resulted in the following categories: minimal symptoms (0), mild symptoms (1), moderate symptoms (2), severe symptoms (3), and very severe symptoms (4).
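The score-to-grade mapping described above can be sketched in a few lines of Python. This is only an illustration of the stated cut points, not the app's actual implementation; the function name `vas_to_grade` and the label table are hypothetical, and scores are assumed to be entered in steps of 0.1.

```python
from bisect import bisect_left

# Upper bound of each adapted grade on the 0.1-10 visual analog scale,
# as stated in the text: 0.1-2.0 -> 0, 2.1-4.0 -> 1, ..., 8.1-10 -> 4.
UPPER_BOUNDS = [2.0, 4.0, 6.0, 8.0, 10.0]

# Category names for the adapted grades (grade 5 is not used in the app).
GRADE_LABELS = {
    0: "minimal symptoms",
    1: "mild symptoms",
    2: "moderate symptoms",
    3: "severe symptoms",
    4: "very severe symptoms",
}


def vas_to_grade(score: float) -> int:
    """Map a visual analog scale score (0.1-10) to an adapted grade (0-4)."""
    if not 0.1 <= score <= 10:
        raise ValueError("score must lie between 0.1 and 10")
    # bisect_left returns the index of the first upper bound >= score,
    # so a score exactly on a boundary stays in the lower grade.
    return bisect_left(UPPER_BOUNDS, score)


if __name__ == "__main__":
    for s in (0.1, 2.0, 2.1, 6.0, 8.1, 10):
        g = vas_to_grade(s)
        print(f"score {s}: grade {g} ({GRADE_LABELS[g]})")
```

Keeping the cut points in one list makes the boundary behavior explicit: a score of 2.0 still counts as grade 0, while 2.1 is the first grade 1 value.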

Well-Being Assessment

Self-assessment of well-being was carried out in the medidux app with the help of a slider on a visual analog scale that allows for the continuous selection from 0 to 100. At the same time, short definitions appear for the standardized and structured reporting of the gradings, which should help the patient to correctly categorize their well-being. Selected values between 81 and 100 correspond to an ECOG PS of 0, values of 61-80 correspond to ECOG PS 1, values of 41-60 correspond to ECOG PS 2, values of 21-40 correspond to ECOG PS 3, and values of 0-20 correspond to ECOG PS 4. As mentioned above, grade 5 “Dead” was removed for the app.
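As a rough illustration, the slider-to-ECOG PS conversion stated above could look as follows in Python. The function name `wellbeing_to_ecog` is hypothetical, and the handling of non-integer slider values (for example 80.5, which the text does not assign to a band) is an assumption: here such values fall into the next higher band.

```python
from bisect import bisect_left

# Upper bound of each well-being band, from worst to best, as stated in the
# text: 0-20 -> ECOG PS 4, 21-40 -> 3, 41-60 -> 2, 61-80 -> 1, 81-100 -> 0.
BAND_UPPER_BOUNDS = [20, 40, 60, 80]


def wellbeing_to_ecog(value: float) -> int:
    """Map a well-being slider value (0-100) to an ECOG PS grade (0-4)."""
    if not 0 <= value <= 100:
        raise ValueError("well-being value must lie between 0 and 100")
    # Higher well-being means a lower ECOG PS, so the band index is inverted.
    return 4 - bisect_left(BAND_UPPER_BOUNDS, value)


if __name__ == "__main__":
    for v in (0, 20, 21, 60, 80, 81, 100):
        print(f"well-being {v}: ECOG PS {wellbeing_to_ecog(v)}")
```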

Statistical Analyses

Sample size calculation.

The research objective was to investigate the difference between prospective and historical cohorts regarding patient-reported side effects, operationalized by the CTCAE score over a period of 6 weeks. To assess the hypothesis of equal CTCAE scores in both cohorts, the method of interval estimation was selected using the 95% CI for the mean difference between cohorts. A statistical analysis plan (SAP) prospectively determined the required sample size for a prospective cohort based on the data from the historical cohort (as available from the previous studies NCT02004496 and NCT03578731 [ 7 , 11 , 13 ]; see the Study Design section above for further details). First, the SD for the CTCAE scores of the 38 patients in the historical cohort was calculated retrospectively as 9.7 and the assumption of an equal SD in the prospective cohort was made. Second, the sample size for the prospective cohort was chosen to achieve a certain minimum precision in estimating the mean difference between cohorts (width of the 95% CI). For a range of feasible sample sizes, the SAP reported 95% CI precisions based on the t distribution (calculated using the R package presize [ 15 ]), assuming an equal SD of 9.7 in both cohorts and using a pooled variance estimate. From this range, a sample size of 60 patients was prospectively selected in the SAP to achieve 51 evaluable patients, given an expected dropout rate of 15%. The corresponding 95% CI for the mean difference between the historical and prospective cohorts was estimated to have a precision of 8.3, which was deemed acceptable for the planned assessment in the given study context.
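The arithmetic behind the planned precision can be checked with a short Python sketch. The study itself used the R package presize; the code below is not that package's implementation but a standard-library approximation, in which the t quantile is obtained from a Cornish-Fisher expansion around the normal quantile. The function names are hypothetical, and the inputs (SD 9.7, 38 historical patients, 60 planned patients, 15% dropout) are taken from the text.

```python
from math import sqrt
from statistics import NormalDist


def t_quantile(p: float, df: int) -> float:
    """Approximate Student t quantile (Cornish-Fisher expansion in 1/df)."""
    z = NormalDist().inv_cdf(p)
    return (
        z
        + (z**3 + z) / (4 * df)
        + (5 * z**5 + 16 * z**3 + 3 * z) / (96 * df**2)
    )


def ci_width_mean_difference(sd: float, n1: int, n2: int, level: float = 0.95) -> float:
    """Full width of the CI for a difference in means, assuming a common SD
    in both groups and a pooled variance estimate."""
    df = n1 + n2 - 2
    standard_error = sd * sqrt(1 / n1 + 1 / n2)
    return 2 * t_quantile(1 - (1 - level) / 2, df) * standard_error


planned_patients = 60
expected_dropout = 0.15
evaluable = round(planned_patients * (1 - expected_dropout))  # 51 evaluable patients

width = ci_width_mean_difference(sd=9.7, n1=evaluable, n2=38)

if __name__ == "__main__":
    print(f"evaluable patients: {evaluable}")
    print(f"95% CI width: {width:.1f}")
```

Under these assumptions the computed width rounds to the precision of 8.3 reported in the SAP, which suggests that the reported precision refers to the full width of the interval and not to its half-width.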

Statistical Methods

All analyses of the primary and secondary endpoints (CTCAE score, well-being score) were performed using univariate analyses, followed by multivariate linear regression to report (adjusted) mean differences between historical and prospective cohorts, with the P values based on t tests and corresponding 95% CIs. All multivariate models extended the respective univariate models in a supplementary fashion to adjust for potential imbalances in patient age, tumor stage, and therapy setting. These covariates were prospectively defined in the study’s SAP; no model selection procedures were employed. All analyses were performed using R version 4.2.0 (The R Foundation for Statistical Computing) [ 16 ]. Two-sided P values ≤.05 were considered statistically significant.

Primary Endpoint

The primary endpoint, a CTCAE score based on the severity grades of the 10 most relevant side effects (sensory disturbance, diarrhea, fatigue, nausea, vomiting, headache, fever, edema of the limbs, joint pain, and loss of appetite) after 6 weeks, was compared between the prospective and historical cohorts. The CTCAE score was calculated by averaging the score per patient and symptom and then averaging the score per patient over all symptoms. The mean difference in the CTCAE scores between cohorts was estimated using univariate linear regression with 95% CIs. To adjust for potential differences between the two cohorts in covariates relevant for the primary outcome, a supplementary multivariate analysis was performed including the additional covariates patient age, tumor stage, and therapy setting.
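The two-stage averaging described above (first per patient and symptom, then per patient across symptoms) can be sketched in Python as follows. The function name `ctcae_scores`, the tuple layout of the entries, and the sample data are hypothetical; the sketch also assumes that only symptoms a patient actually reported enter that patient's average, which the text does not specify.

```python
from collections import defaultdict
from statistics import mean


def ctcae_scores(entries):
    """Compute one score per patient from (patient_id, symptom, score) entries.

    Stage 1: average all entries of the same patient and symptom.
    Stage 2: average those symptom means within each patient.
    """
    by_patient_symptom = defaultdict(list)
    for patient_id, symptom, score in entries:
        by_patient_symptom[(patient_id, symptom)].append(score)

    symptom_means = defaultdict(list)
    for (patient_id, _symptom), scores in by_patient_symptom.items():
        symptom_means[patient_id].append(mean(scores))

    return {patient_id: mean(values) for patient_id, values in symptom_means.items()}


# Hypothetical entries: patient "A" reports fatigue twice and nausea once.
example_entries = [
    ("A", "fatigue", 2.0),
    ("A", "fatigue", 4.0),
    ("A", "nausea", 1.0),
    ("B", "headache", 3.0),
]

if __name__ == "__main__":
    print(ctcae_scores(example_entries))
```

Averaging in two stages keeps a frequently reported symptom from dominating the score: for patient "A" the result is (3.0 + 1.0) / 2 = 2.0, whereas a plain mean over all three entries would give about 2.33.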

Secondary Endpoint

The well-being score according to the ECOG PS was collected continuously using a visual analog scale (range 0-100) implemented in the medidux app and averaged across measurements during the observational period. The analysis protocol was analogous to that described above for the primary endpoint.

Additional Analysis

Cognitive tests in the prospective cohort were collected continuously throughout the observation period and descriptively assessed by administering a modified version of the TMT. The time (in seconds) to complete each task (execution time) was used in the analysis.

Baseline Characteristics

Overall, 53 female patients were enrolled in the OGIPRO study. The median age was 57 (range 34-87) years in the prospective cohort and 51 (range 31-78) years in the historical cohort. The largest group of patients (38.9%) had tumor stage 2 ( Table 1 ). With regard to the treatment setting, relatively fewer patients (22.2%) received palliative treatment than neoadjuvant or adjuvant treatment. More than half of the patients (59.3%) received dual anti-HER2 blockade with trastuzumab and pertuzumab ( Table 1 ).

a Student t test.

b Data missing for 1 participant.

c χ 2 test.

d ECOG PS: Eastern Cooperative Oncology Group Performance Status.

In the prospective cohort, 84 of the 92 available different symptoms were entered (average >4 symptoms/day), resulting in a total of 9680 symptoms, whereas 54 of the 82 different symptoms were reported in the historical cohort (average >3 symptoms/day), resulting in a total of 6904 symptom entries. The most common symptoms reported in both groups included fatigue, taste disorder, nausea, diarrhea, dry mucosa, joint discomfort, tingling, sleep disorder, headache, and appetite loss ( Figure 1 ).


Overall, the distribution of symptom grades in the Ogivri cohort revealed that most patients experienced minimal (grade 0) and mild (grade 1) toxicities, followed by grade 2, grade 3, and grade 4 toxicities (Table 2). The results for QoL (based on the EQ-5D-5L questionnaire), which was also assessed in this study, will be reported elsewhere.

CTCAE Score

The primary endpoint was analyzed in 50 patients (3 patients were excluded due to missing data on symptoms) treated with Ogivri (prospective cohort) and in all 38 patients treated with Herceptin (historical cohort). The mean CTCAE scores were comparable between the two cohorts (Table 3) with a mean difference of –1.27 (95% CI –7.24 to 4.70; P =.68) ( Figure 2 ). In the multivariate analysis, the adjusted mean CTCAE scores also did not differ between the two cohorts (2.51, 95% CI –3.27 to 8.29) (Table S1 in Multimedia Appendix 1 ).

a Reported P values correspond to mean differences between cohorts.

b Missing scores for 3 participants in the prospective cohort.

c Missing score for 1 participant in the prospective cohort.

Well-Being Score

The secondary endpoint, the well-being score, was analyzed in 52 patients (one patient was excluded due to missing data on well-being) from the OGIPRO study and in all 38 patients from the historical cohort. The mean well-being score did not differ significantly between patients treated with Ogivri and those treated with Herceptin (Table 3), with a mean difference of 4.45 (95% CI –3.53 to 12.44; P =.28). The adjusted mean well-being scores also did not differ between the two cohorts (3.78, 95% CI –4.64 to 12.19) (Table S2 in Multimedia Appendix 1 ).
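The reported intervals and P values can be cross-checked against each other. The following is a minimal sketch, assuming a normal approximation to the sampling distribution of the mean difference; the study itself may have used a t-based or other procedure, and the function name `p_from_ci` is hypothetical. It takes only the confidence limits quoted in the text and recovers the point estimate, the implied standard error, and an approximate two-sided P value.

```python
import math


def p_from_ci(lower, upper, z_crit=1.96):
    """Recover estimate, standard error and two-sided P from a 95% CI.

    Assumes a symmetric, normal-approximation interval (an assumption,
    not the study's documented method).
    """
    estimate = (lower + upper) / 2.0            # midpoint of the interval
    std_err = (upper - lower) / (2.0 * z_crit)  # half-width divided by z
    z = estimate / std_err
    # Two-sided P value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return estimate, std_err, p_value


# Limits as quoted in the text for the two endpoints.
ctcae = p_from_ci(-7.24, 4.70)        # primary endpoint, CTCAE score
well_being = p_from_ci(-3.53, 12.44)  # secondary endpoint, well-being

for name, (est, se, p) in (("CTCAE", ctcae), ("well-being", well_being)):
    print(f"{name}: difference={est:.2f}, SE={se:.2f}, P={p:.2f}")
```

Under this approximation, the recovered P values land close to the reported .68 and .28, which suggests the quoted intervals and P values are mutually consistent. Small discrepancies are expected from rounding and from the normal (rather than t) reference distribution.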

Cognitive Abilities in the Prospective Cohort

A total of 767 cognitive tests were entered, and the data of the 37 patients (70%) who had performed at least one test were included in the analysis (see Figure S1 in Multimedia Appendix 1 ). Overall, the mean execution time was 42.9 (SD 26.3) seconds, with an absolute difference between the maximum and minimum execution times of 197 seconds. Because of the low sample size and limited number of cognitive tests recorded, no correlation analysis between cognitive abilities and treatment was performed.

The treatment of patients with HER2-positive BC with the trastuzumab biosimilar Ogivri resulted in equivalent symptoms, AEs, and well-being to those experienced under treatment with Herceptin as determined by ePROs. Ogivri treatment in the context of HER2-positive BC was well tolerated and no new important safety risks were observed. The results of this study are consistent with previously reported evidence on the safety comparability of the trastuzumab biosimilar Ogivri to the reference product Herceptin for the treatment of HER2-positive BC [ 6 , 17 ].

The use of biosimilars in oncology could reduce health care costs and thus expand access to drugs worldwide. The European Medicines Agency as well as the US Food and Drug Administration have developed guidelines requiring biosimilars to demonstrate comparable results in relevant clinical trials to those obtained using the original product [ 18 ]. Recent studies have demonstrated that anti-HER2 therapy can be switched safely to trastuzumab biosimilars and successfully implemented in clinical practice [ 19 ].

In our study, the incidence and distribution of symptoms associated with Ogivri were similar to those reported with Herceptin. However, the slightly lower mean symptom score related to Ogivri might be attributed to the higher number of treatments in this cohort for advanced cancer stages, including antihormone treatments and dual HER2 blockade.

To our knowledge, this study represents the first real-world evaluation of efficacy and safety in patients treated with HER2 biosimilars using ePRO data. Use of the app in this study was intended to help patients gain a better overview of their disease history and improve their symptom management. Our analysis of ePRO data demonstrated comparable CTCAE scores between the prospective Ogivri cohort and the historical Herceptin cohort. These findings further support the previously reported similar safety profiles between the trastuzumab biosimilar and the corresponding reference product [ 6 , 17 ] with no new safety concerns observed.

Importantly, the well-being score based on the ECOG PS did not differ between the two cohorts. In a pooled analysis of data from three randomized clinical trials including patients with HER2-positive advanced BC, PROs were identified as an independent prognostic factor for both survival and toxicity outcomes. In addition, patient-reported physical well-being and clinically interpreted ECOG PS provided independent prognostic information [ 20 ]. In our prospective Ogivri cohort, we did not focus on the prognostic value of the ePRO with regard to clinical outcomes, but we were able to demonstrate that an eHealth patient empowerment app can provide reliable information on side effects and well-being when comparing a biosimilar with reference treatments. Hence, the use of continuous eHealth-based symptom reporting together with biosimilars can result in a potential economic benefit by reducing the cost of drug treatment and hospitalization. Further detailed analyses of randomized trials with biosimilars will help to quantify these resources more comprehensively.

In general, the diary characteristic of apps might appear helpful to capture and recall disease-related information such as cognitive impairments [ 21 ]. In the OGIPRO study, patients had the possibility to complete a TMT, which is one of the most widely used neuropsychological tests in clinical practice; this test is perceptive, easy to understand for patients, has a short administration time, and has shown consistent results in multiple clinical populations [ 22 - 24 ]. A study investigating the impact of chemotherapy on cognitive functions of patients with BC demonstrated increased cognitive impairment throughout chemotherapy treatment, which did not recover 2 months after chemotherapy was completed [ 25 ]. In contrast, in the OGIPRO study, the cognitive performance of the patients receiving Ogivri showed potential improvement throughout the study treatment. However, due to the low number of cognitive tests recorded during app use, the cognitive abilities were analyzed descriptively and no association could be made with regard to the trastuzumab biosimilar treatment. Further analyses are needed to evaluate the electronically collected cognitive test results in patients treated with biosimilars and corresponding reference products.

Our study has several strengths and limitations. The main limitation is the design: the study lacked a prospective control group and was therefore not randomized. However, the comparison between prospectively collected data of patients treated with Ogivri and the historical ePRO data of patients treated with Herceptin in two previous studies [ 7 , 13 ] demonstrated no difference with regard to symptoms, well-being, and AEs. The earlier versions of the mobile app used in the historical cohort were developed to record symptoms and treatment side effects continuously and according to the CTCAE in patients with cancer, but were not designed to send questionnaires to patients. Nevertheless, the ePRO data of the historical cohort were recorded in the same way in the earlier versions of the mobile app [ 11 ] and are thus comparable to those of the prospective cohort. An exploratory analysis of cognitive abilities was performed only in the prospective cohort, as these data were not available in the historical cohort. Further studies that are randomized and sufficiently powered to evaluate real-world cognitive function in patients with HER2-positive BC treated with anti-HER2 biosimilars are needed.

The major strength of our proof-of-concept study is that it was able to provide the first evidence that data collected via an autonomous eHealth app can also be used longitudinally to determine the similarity of a trastuzumab biosimilar to the reference product for the treatment of HER2-positive BC. Furthermore, our study has reached its primary endpoint, showing a similar average CTCAE score between patients treated with the trastuzumab biosimilar Ogivri and those treated with the reference drug Herceptin. Our results suggest that the use of a patient empowerment eHealth app in patients treated with anti-HER2 biosimilars is reliable and can support therapy management.

In conclusion, in patients with HER2-positive BC, treatments with the trastuzumab biosimilar Ogivri and the reference drug Herceptin resulted in equivalent symptoms, AEs, and well-being reported by ePRO. Hence, the integration of an ePRO tool into research and clinical practice can provide reliable information when investigating the real-world tolerability and safety outcomes of similar therapeutic compounds.

Acknowledgments

The authors thank all patients who participated in this study along with the investigators and their teams. We also thank Palleos Healthcare for the continued support of the trial; Dr. Stefanie von Felten at University of Zurich, Epidemiology, Biostatistics and Prevention Institute for the assistance with data analysis; and Swiss Tumor Institute, Zurich, Switzerland for the financial support for the trial.

Data Availability

The data sets generated and/or analyzed during this study are available from the corresponding author on reasonable request.

Conflicts of Interest

AT received medical writing support from Palleos Healthcare, funding from the Swiss Tumor Institute, payment or honoraria for presentations from Viatris, support for attending ESMO 2023 from Viatris, and is the founder and stock owner of Mobile Health AG. YK reports stock or stock options from Viatris and is the Head of Project Management at Mobile Health AG. GAKU reports stock or stock options from Novartis. MA reports consulting fees from Aptar. AE reports consulting fees from Daiichi-Sankyo, Gilead, Merck, Novartis, and Seagen, and institutional financial support for clinical trials from AstraZeneca, Roche, Pfizer, and Novartis. All other authors have declared no conflicts of interest.

Multivariate analyses of Common Terminology Criteria for Adverse Events (CTCAE) scores (Table S1) and well-being scores (Table S2); distribution of cognitive performance scores (Figure S1).

  1. Green AK, Ohn JA, Bach PB. Review of current policy strategies to reduce US cancer drug costs. J Clin Oncol. Feb 01, 2020;38(4):372-379.
  2. Nabhan C, Parsad S, Mato AR, Feinberg BA. Biosimilars in oncology in the United States: a review. JAMA Oncol. Feb 01, 2018;4(2):241-247.
  3. Mejri N, Boussen H, Labidi S, Benna F, Afrit M, Rahal K. Relapse profile of early breast cancer according to immunohistochemical subtypes: guidance for patient's follow up? Ther Adv Med Oncol. May 15, 2015;7(3):144-152.
  4. Stocker A, Hilbers M, Gauthier C, Grogg J, Kullak-Ublick GA, Seifert B, et al. HER2/CEP17 ratios and clinical outcome in HER2-positive early breast cancer undergoing trastuzumab-containing therapy. PLoS One. Jul 27, 2016;11(7):e0159176.
  5. Cargnin S, Shin JI, Genazzani AA, Nottegar A, Terrazzino S. Comparative efficacy and safety of trastuzumab biosimilars to the reference drug: a systematic review and meta-analysis of randomized clinical trials. Cancer Chemother Pharmacol. Nov 01, 2020;86(5):577-588.
  6. Rugo HS, Barve A, Waller CF, Hernandez-Bronchud M, Herson J, Yuan J, et al. Heritage Study Investigators. Effect of a proposed trastuzumab biosimilar compared with trastuzumab on overall response rate in patients with ERBB2 (HER2)-positive metastatic breast cancer: a randomized clinical trial. JAMA. Jan 03, 2017;317(1):37-47.
  7. Egbring M, Far E, Roos M, Dietrich M, Brauchbar M, Kullak-Ublick GA, et al. A mobile app to stabilize daily functional activity of breast cancer patients in collaboration with the physician: a randomized controlled clinical trial. J Med Internet Res. Sep 06, 2016;18(9):e238.
  8. Basch E, Deal AM, Kris MG, Scher HI, Hudis CA, Sabbatini P, et al. Symptom monitoring with patient-reported outcomes during routine cancer treatment: a randomized controlled trial. J Clin Oncol. Feb 20, 2016;34(6):557-565.
  9. Basch E, Deal AM, Dueck AC, Scher HI, Kris MG, Hudis C, et al. Overall survival results of a trial assessing patient-reported outcomes for symptom monitoring during routine cancer treatment. JAMA. Jul 11, 2017;318(2):197-198.
  10. Trojan A, Huber U, Brauchbar M, Petrausch U. Consilium smartphone app for real-world electronically captured patient-reported outcome monitoring in cancer patients undergoing anti-PD-L1-directed treatment. Case Rep Oncol. May 12, 2020;13(2):491-496.
  11. Trojan A, Bättig B, Mannhart M, Seifert B, Brauchbar MN, Egbring M. Effect of collaborative review of electronic patient-reported outcomes for shared reporting in breast cancer patients: descriptive comparative study. JMIR Cancer. Mar 17, 2021;7(1):e26950.
  12. Pircher M, Winder T, Trojan A. Response to vemurafenib in metastatic triple-negative breast cancer harbouring a BRAF V600E mutation: a case report and electronically captured patient-reported outcome. Case Rep Oncol. Mar 29, 2021;14(1):616-621.
  13. Trojan A, Leuthold N, Thomssen C, Rody A, Winder T, Jakob A, et al. The effect of collaborative reviews of electronic patient-reported outcomes on the congruence of patient- and clinician-reported toxicity in cancer patients receiving systemic therapy: prospective, multicenter, observational clinical trial. J Med Internet Res. Aug 05, 2021;23(8):e29271.
  14. Schmalz O, Jacob C, Ammann J, Liss B, Iivanainen S, Kammermann M, et al. Digital monitoring and management of patients with advanced or metastatic non-small cell lung cancer treated with cancer immunotherapy and its impact on quality of clinical care: interview and survey study among health care professionals and patients. J Med Internet Res. Dec 21, 2020;22(12):e18655.
  15. Haynes A, Lenz A, Stalder O, Limacher A. presize: An R-package for precision-based sample size calculation in clinical research. J Open Source Soft. Apr 2021;6(60):3118.
  16. R Foundation for Statistical Computing. URL: https://www.R-project.org/ [accessed 2024-03-08]
  17. Rugo HS, Pennella EJ, Gopalakrishnan U, Hernandez-Bronchud M, Herson J, Koch HF, et al. Final overall survival analysis of the phase 3 HERITAGE study demonstrates equivalence of trastuzumab-dkst to trastuzumab in HER2-positive metastatic breast cancer. Breast Cancer Res Treat. Jul 2021;188(2):369-377.
  18. Triantafyllidi E, Triantafillidis JK. Systematic review on the use of biosimilars of trastuzumab in HER2+ breast cancer. Biomedicines. Aug 21, 2022;10(8):2045.
  19. Hester A, Gaß P, Fasching PA, Krämer AK, Ettl J, Diessner J, et al. Trastuzumab biosimilars in the therapy of breast cancer - "real world" experiences from four Bavarian university breast centres. Geburtshilfe Frauenheilkd. Sep 02, 2020;80(9):924-931.
  20. Modi N, Danell N, Perry R, Abuhelwa A, Rathod A, Badaoui S, et al. Patient-reported outcomes predict survival and adverse events following anticancer treatment initiation in advanced HER2-positive breast cancer. ESMO Open. Jun 2022;7(3):100475.
  21. Reeve BB, McFatrich M, Pinheiro LC, Freyer DR, Basch EM, Baker JN, et al. Cognitive interview-based validation of the patient-reported outcomes version of the common terminology criteria for adverse events in adolescents with cancer. J Pain Symptom Manage. Apr 2017;53(4):759-766.
  22. Ashendorf L, Jefferson AL, O'Connor MK, Chaisson C, Green RC, Stern RA. Trail Making Test errors in normal aging, mild cognitive impairment, and dementia. Arch Clin Neuropsychol. Mar 21, 2008;23(2):129-137.
  23. Periáñez JA, Ríos-Lago M, Rodríguez-Sánchez JM, Adrover-Roig D, Sánchez-Cubillo I, Crespo-Facorro B, et al. Trail Making Test in traumatic brain injury, schizophrenia, and normal ageing: sample comparisons and normative data. Arch Clin Neuropsychol. May 2007;22(4):433-447.
  24. Linari I, Juantorena GE, Ibáñez A, Petroni A, Kamienkowski JE. Unveiling Trail Making Test: visual and manual trajectories indexing multiple executive processes. Sci Rep. Aug 22, 2022;12(1):14265.
  25. Rodríguez Martín B, Fernández Rodríguez EJ, Rihuete Galve MI, Cruz Hernández JJ. Study of chemotherapy-induced cognitive impairment in women with breast cancer. Int J Environ Res Public Health. Nov 30, 2020;17(23):8896.

Edited by T de Azevedo Cardoso; submitted 01.11.23; peer-reviewed by HC Kohlberg, E Fiorio; comments to author 18.12.23; revised version received 22.12.23; accepted 27.02.24; published 04.04.24.

©Andreas Trojan, Sven Roth, Ziad Atassi, Michael Kiessling, Reinhard Zenhaeusern, Yannick Kadvany, Johannes Schumacher, Gerd A Kullak-Ublick, Matti Aapro, Alexandru Eniu. Originally published in JMIR Cancer (https://cancer.jmir.org), 04.04.2024.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Cancer, is properly cited. The complete bibliographic information, a link to the original publication on https://cancer.jmir.org/, as well as this copyright and license information must be included.

  • Open access
  • Published: 04 April 2024

Observation of Kekulé vortices around hydrogen adatoms in graphene

  • Yifei Guan   ORCID: orcid.org/0000-0001-7778-4083 1 ,
  • Clement Dutreix   ORCID: orcid.org/0000-0002-7557-7838 2 ,
  • Héctor González-Herrero   ORCID: orcid.org/0000-0002-3028-9875 3 , 4 ,
  • Miguel M. Ugeda   ORCID: orcid.org/0000-0001-7913-1617 5 , 6 , 7 ,
  • Ivan Brihuega   ORCID: orcid.org/0000-0001-5032-9304 3 , 4 , 8 ,
  • Mikhail I. Katsnelson   ORCID: orcid.org/0000-0001-5165-7553 9 ,
  • Oleg V. Yazyev   ORCID: orcid.org/0000-0001-7281-3199 1 &
  • Vincent T. Renard   ORCID: orcid.org/0000-0001-6242-9468 10  

Nature Communications volume 15, Article number: 2927 (2024)

  • Scanning probe microscopy
  • Topological defects
  • Two-dimensional materials

Fractional charges are one of the wonders of the fractional quantum Hall effect. Such objects are also anticipated in two-dimensional hexagonal lattices under time reversal symmetry—emerging as bound states of a rotating bond texture called a Kekulé vortex. However, the physical mechanisms inducing such topological defects remain elusive, preventing experimental realization. Here, we report the observation of Kekulé vortices in the local density of states of graphene under time reversal symmetry. The vortices result from intervalley scattering on chemisorbed hydrogen adatoms. We uncover that their 2 π winding is reminiscent of the Berry phase π of the massless Dirac electrons. We can also induce a Kekulé pattern without vortices by creating point scatterers such as divacancies, which break different point symmetries. Our local-probe study thus confirms point defects as versatile building blocks for Kekulé engineering of graphene’s electronic structure.

Introduction

Real-space topological defects in crystals exhibit exotic electronic properties 1 , 2 , especially when combined with the reciprocal-space topological phase hosted by the bulk 3 , 4 . In two-dimensional hexagonal lattices, a vortex in the Kekulé order parameter is of particular interest for charge fractionalization without breaking time-reversal symmetry 5 , 6 . The Kekulé pattern in graphene corresponds to the \(\sqrt{3}\times \sqrt{3}\,R30^{\circ}\) unit cell tripling, with a distinct bond order within one of the three equivalent hexagonal rings. These three degenerate states define an angular order parameter space as shown in Fig.  1 a. A Kekulé vortex of winding 2 π corresponds to the alternation of the three Kekulé domains upon encircling a singularity, which could be implemented in optical 7 , 8 and acoustical 9 , 10 metamaterials. In graphene, recent experiments have demonstrated that a Kekulé bond texture emerges from the intervalley coherent quantum Hall ferromagnet states in the zeroth Landau level 11 , 12 , 13 . These states can host skyrmionic topological excitations appearing as Kekulé vortices 13 , 14 , 15 . However, these schemes require a strong magnetic field and the Kekulé vortex, without breaking time-reversal symmetry, remains out of reach. At zero magnetic field, theory shows that a missing electronic site provides a fractionalization mechanism analogous to that of the Kekulé vortex 16 , 17 . Such point defects can scatter electrons from one valley to another, thereby connecting the two valleys and offering a way to stabilize the Kekulé bond texture 18 , 19 , 20 , 21 , 22 , 23 , 24 . We now demonstrate that they also induce Kekulé vortices in graphene.

figure 1

a Schematic illustration showing three distinct Kekulé orders in graphene. b STM image of graphene with a hydrogen adatom in the center for a tip bias V_b = 400 mV and tunneling current i_t = 45.5 pA. The scale bar is 1 nm. The colored tiling evidences three domains, each corresponding to one of the three Kekulé orders (bold hexagons) and separated from one another by domain walls along the armchair direction. The pseudo-spin of incoming electrons scattering off the H atom is locked on the azimuthal coordinate θ_r of the STM tip 25 (upper right inset). c Kekulé bond-texture signal extracted according to the methodology described in the main text and the Supplementary Information. The graphene lattice signal is included to highlight the donut-like patterns in each domain. The bare signal is plotted in Fig.  2 a. d Phase of the Kekulé signal shown in c with the Kekulé order parameter defined by the color wheel in a . The corresponding signal is overlaid as a guide for the eyes. The scale bars in panels c and d are 1 nm. e Clar’s sextet configurations of graphene in the presence of a hydrogen adatom (black dot) illustrating the emergence of a Kekulé vortex.

Observation of a Kekulé vortex near an H atom on graphene

Figure  1 b shows a scanning tunneling microscopy (STM) image of graphene with a single chemisorbed hydrogen adatom (see Methods for experimental details and Supplementary Fig.  1 for another example). Near the H atom, a strong quasi-particle interference (QPI) signal is seen. It reveals a hexagonal superlattice commensurate with that of graphene but with a unit cell three times larger, as emphasised by the three-colour tiling (see Supplementary Fig.  2 for details). The pattern is dominated by the onsite (i.e. on-carbon atom) bright dots on sublattice B (assuming the H adatom is on sublattice A), which alternate between the three nonequivalent B sites. The coloured dots in Fig.  1 b highlight that this onsite signal winds 4 π when circling around the adatom, consistent with previous studies 25 , 26 , 27 . A closer inspection of the interference pattern also reveals a strong donut-like signal on some benzene rings. We highlight them with bold-coloured hexagons in Fig.  1 b. This colour tiling allows us to identify three distinct Kekulé domains separated from one another by domain walls along the armchair directions. This indicates that the Kekulé bond texture winds 2 π when circling around the adatom.

To further confirm the existence of the 2 π vortex, we extract the bond-centered signal directly from the STM data, following the approach of ref. 28 . In particular, we use a modified Geometrical Phase Analysis 29 , in which we exploit the fact that for intervalley scattering, the sublattice A, sublattice B, and bond-centered contributions each transform under different irreducible representations of the threefold symmetry (see Supplementary Information). The result is shown in Fig.  1 c, d. The images respectively present the magnitude and the phase of the bond signal for intervalley scattering, which defines the Kekulé order parameter. While the intensity shows a Kekulé texture, the phase exhibits a vortex that winds 2 π around the H adatom. This data analysis demonstrates that the chemisorbed H adatom induces a Kekulé vortex on the surrounding graphene bonds.
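To make the notion of a phase that "winds 2 π" concrete, the sketch below counts how many times the phase of a complex order parameter wraps around when a closed loop is traversed around a singularity. It is an illustrative toy, not the modified Geometrical Phase Analysis used by the authors: the field `toy_vortex` and the function `winding_number` are hypothetical names, and the field is a synthetic exp(i m θ) texture rather than STM data.

```python
import cmath
import math


def winding_number(order_parameter, radius=1.0, n_points=720):
    """Count phase windings of a complex field along a circle around the origin.

    `order_parameter(x, y)` must return a nonzero complex number on the loop.
    """
    total = 0.0
    previous = order_parameter(radius, 0.0)
    for k in range(1, n_points + 1):
        angle = 2.0 * math.pi * k / n_points
        current = order_parameter(radius * math.cos(angle),
                                  radius * math.sin(angle))
        # Phase increment between neighbouring points, wrapped to (-pi, pi].
        total += cmath.phase(current * previous.conjugate())
        previous = current
    return round(total / (2.0 * math.pi))


def toy_vortex(m):
    """Synthetic unit-modulus field exp(i*m*theta) (illustrative assumption)."""
    def field(x, y):
        z = complex(x, y)
        return (z / abs(z)) ** m
    return field


print(winding_number(toy_vortex(1)))  # phase winds 2*pi once around the loop
print(winding_number(toy_vortex(2)))  # phase winds 4*pi once around the loop
```

In this toy picture, a single winding corresponds to the 2 π vortex reported for the bond signal, while a double winding corresponds to the 4 π winding described for the onsite signal.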

Interestingly, the Kekulé vortex supports an interpretation in terms of Clar’s sextet theory 30 , a set of rules explaining the stability of aromatic molecules in chemistry. A sextet represents the six resonantly delocalized π electrons by a circle. Clar’s rules state that adjacent hexagons cannot be aromatic sextets simultaneously, and that the most stable bond configuration maximizes the number of Clar’s sextets. Graphene admits three equivalent resonant Clar configurations corresponding to three Kekulé orders (Fig.  1 a) 31 . The hydrogen adatom effectively removes one site from the lattice, and the circulation of delocalized electrons in the adjacent benzene rings is obstructed. This lifts the degeneracy of Clar’s resonant configurations allowed in its surrounding. We then find two possible configurations shown in Fig.  1 e. The only freedom lies in the positions of the double covalent bonds along the boundaries between two Kekulé domains. Thus, Clar’s sextet representation is consistent with Fig.  1 c and supports our observation of the Kekulé vortex in graphene. In Supplementary Information, we formalize the relation between these two pictures.

Establishing the electronic origin of the Kekulé vortex

We now show that the Kekulé vortex has a purely electronic origin reminiscent of graphene’s band topology. The bond texture of the Kekulé type relates to the electron density between carbon atoms. Intervalley scattering resolved at the tip position r yields the following bond contribution to the local density of states (LDOS):

where ϕ_n = (2n + 1)π/3 and ΔK_n is a scattering wavevector between the valleys responsible for the hexagonal superlattice pattern (see Supplementary Information). This bond contribution can be understood as a two-path loop interference allowed by the overlap of neighbouring p_z orbitals of carbon atoms.

The Kekulé bond texture originates from the polar angle θ_r indexing the STM tip orientation around the adatom. Indeed, in such a QPI signal, which is dominated by backscattering, the signal measured at the tip position is determined by the interference of incoming electrons and electrons scattered on the adatom, whose wavevector orientation ± θ_q is determined by the tip position through θ_r = θ_q + π (see inset in Fig.  1 b and ref. 25 ). The wavevector orientation also determines the wavefunction of the massless relativistic electrons around a Dirac point in momentum space. We assume an incoming wavefunction \(\left\vert \psi \right\rangle\) of the form \(\sqrt{2}\left\vert \psi \right\rangle=\left\vert A\right\rangle+e^{i\theta_{\mathbf{q}}}\left\vert B\right\rangle\). When cycling along a closed path \(\mathcal{C}\) enclosing a Dirac point, the wavefunction gains the Berry phase \(\pi\) (defined modulo \(2\pi\)).

Due to the lock-in relation θ r  =  θ q  +  π , cycling the STM tip around the adatom is equivalent to varying the wavevector around a Dirac point in momentum space. Thus, the 2 π vortex on the bonds derives from the topological Berry phase of the scattering wavefunctions.
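The Berry phase of the spinor quoted above can be checked numerically. The following is a minimal sketch, assuming the standard discretized definition of the Berry phase as minus the accumulated phase of overlaps between neighbouring states along a closed loop; the function names are hypothetical, and the loop is parametrized directly by the wavevector angle θ_q going once around the Dirac point.

```python
import cmath
import math


def dirac_spinor(theta_q):
    """Spinor (|A> + exp(i*theta_q)|B>) / sqrt(2) as an (A, B) tuple."""
    norm = 1.0 / math.sqrt(2.0)
    return (complex(norm), norm * cmath.exp(1j * theta_q))


def berry_phase(n_steps=400):
    """Discretized Berry phase for theta_q running from 0 to 2*pi."""
    accumulated = 0.0
    for k in range(n_steps):
        a = dirac_spinor(2.0 * math.pi * k / n_steps)
        b = dirac_spinor(2.0 * math.pi * (k + 1) / n_steps)
        # Overlap <a|b> between neighbouring states on the loop.
        overlap = a[0].conjugate() * b[0] + a[1].conjugate() * b[1]
        accumulated += cmath.phase(overlap)
    # Sign convention: gamma = -sum of overlap phases (an assumed convention).
    return -accumulated


gamma = berry_phase()
print(gamma / math.pi)  # magnitude 1, i.e. a Berry phase of pi modulo 2*pi
```

With this sign convention the result is −π, which is the same as π modulo 2 π. Each small step in θ_q contributes half of that step to the overlap phase, so one full turn of the wavevector yields half a turn of phase, which is the factor of one half that distinguishes the Berry phase from the 2 π rotation of the wavevector itself.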

In addition to the bond contribution above, the usual onsite LDOS modulations Δρ_A(r) and Δρ_B(r) also contribute to the STM signal 25 . All these contributions are compared to the experimental signals in Fig.  2 , which shows very good agreement between theory and experiments (see also Supplementary Information and Supplementary Fig.  3 for more details). The experimentally observed vortex is also reproduced by both our tight-binding and density functional theory (DFT) calculations (see Methods, Fig.  3 and Supplementary Fig.  4). We note that the vortices are not affected by the energy integration of the LDOS, since the scattering wavevectors ΔK_n are energy independent. The energy-resolved STM images shown in Supplementary Fig.  5 confirm this property. Thus, our theoretical studies show that the Kekulé vortex we observe in Fig.  1 b derives from an intrinsic topological property of the massless Dirac (that is, chiral in pseudospin 32 ) wavefunctions scattering on the adatom.

figure 2

Experimental images of Δρ_bond(r) ( a ), Δρ_A(r) ( b ) and Δρ_B(r) ( c ) from Fig.  1 b. The corresponding contributions calculated by the T-matrix approach are shown in d, e and f . The contributions are normalized by their prefactors for a straightforward comparison (see Supplementary Information). The scale bars are 1 nm.

figure 3

a STM image simulated using DFT with a tiling defined in the same way as in Fig.  1 b. The calculated LDOS was integrated between 0 and 400 meV as in the experiment. The scale bar is 0.5 nm. b Relaxed atomic structure of graphene with a hydrogen adatom. The bond lengths are coded as the color and thickness of the bonds. The colored tiling is superposed on the graphene lattice to show the winding of the bond length. The scale bar is 0.5 nm. c Calculated bond length distribution as a function of distance to the hydrogen adatom and divacancy defect.

The Kekulé vortices proposed by Hou et al. result from a structural distortion of the bonds and host zero-energy bound states in an excitation gap that are compatible with charge fractionalization scenarios 5 , 6 . This raises the question of whether the Kekulé vortex reported here is also accompanied i) by a structural Kekulé distortion and ii) by zero-energy quasi-bound states compatible with fractional excitations.

Structural relaxation

To investigate whether the Kekulé vortex also exhibits a Kekulé distortion, we perform DFT calculations that include the lattice relaxation effects (see Methods). The simulated STM image (Fig.  3 a) accurately reproduces the experiment and, in particular, the 2 π vortex. While the presence of the hydrogen adatom does induce minor lattice distortions that are consistent with the winding of the structural Kekulé distortion (Fig.  3 b, c), the results are essentially the same as those provided by our analytical description in Eq. ( 1 ) and tight-binding calculations (Supplementary Fig.  4), neither of which includes any lattice distortion effects. Furthermore, performing DFT calculations without lattice relaxation does not affect the presence of the vortex (Supplementary Fig.  6). This further confirms the electronic origin of the Kekulé vortex.

Associated bound state

We now discuss the existence of quasi-bound states associated with the Kekulé vortex. The hydrogen adatom forms a covalent bond with a carbon atom of the graphene lattice. This changes the hybridization of the hydrogenated carbon atom from \(sp^{2}\) to \(sp^{3}\), as in diamond 33,34. Thus, the carbon atom is essentially unavailable for the π electrons and resembles an ideal single-atom vacancy (i.e. one without atomic reconstruction). The simplest description consists of removing the hydrogenated carbon atom and neglecting the structural reconstructions. Such descriptions preserve particle-hole symmetry and lead to the appearance of a quasi-bound state at the Dirac point. The zero-energy state is fully polarized on the sublattice opposite to that of the removed \(p_{z}\) orbital and decays algebraically because of the gapless relativistic spectrum 35,36,37,38.

In the spinless description, the zero-energy state exhibits a fractional charge Q = −e/2 (ref. 16). The quasi-bound state results from the promotion of half a state from the valence band and half a state from the conduction band. This is a two-dimensional analogue of the charge fractionalization at domain-wall solitons in the one-dimensional spinless model of polyacetylene 39,40, similar to that of the structural Kekulé vortex 5. In contrast, the fractional excitations no longer exist in the spinful description, which doubles the number of zero-energy states. At half-filling, one of the two spin-polarized states is fully occupied. Since each bound state is a hybrid superposition of valence and conduction half-states carrying fractional charges −e/2 and +e/2, the quasi-bound state around a vacancy must be neutral, Q = 0, with spin S = 1/2. Doping the sample then leads to a charged bound state, Q = ±e, with zero spin, S = 0.

In realistic graphene systems with H adatoms, particle-hole symmetry is not exactly respected. Thus, the bound state in real graphene is slightly shifted from zero energy and its charge should deviate from e/2. However, this deviation is very small. According to first-principles calculations 34, the shift is approximately t/16 (t is the nearest-neighbour hopping), which is just 0.01 of the total \(p_{z}\) bandwidth.
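
As a rough check of this estimate, one can assume the nearest-neighbour tight-binding picture, in which the \(p_{z}\) bands extend from \(-3|t|\) to \(+3|t|\). The symbols \(\delta E\) (the shift) and \(W\) (the total bandwidth) are introduced here for illustration only and are not notation from the text above:

\[
W = 6|t|, \qquad \frac{\delta E}{W} \approx \frac{|t|/16}{6|t|} = \frac{1}{96} \approx 0.01 .
\]

With \(|t| = 2.7\) eV, the value quoted in the Methods, this corresponds to \(\delta E \approx 0.17\) eV on a bandwidth of about 16 eV.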

It is therefore particularly interesting that previous spectroscopic measurements apparently provide evidence of such spin-charge relations for the quasi-bound states associated with the Kekulé vortex texture in our sample 41. In this earlier work, the state occupancy was tuned by doping, and DOS measurements revealed the formation of quasi-bound S = 1/2 magnetic moments at half-filling and S = 0 in doped graphene. Furthermore, the small shift of the zero-energy bound state associated with particle-hole symmetry breaking (see ref. 5 and Supplementary Information) could not be resolved 41. These measurements are also consistent with previous DFT studies, in which hydrogen chemisorption leads to charge-neutral, 1 \(\mu_{B}\) spin-polarized quasi-bound states 36. These observations do not constitute direct evidence of the charge fractionalization mechanism near H adatoms, but they are sufficiently consistent with it to stimulate further experimental efforts to address the spin and charge state of the bound state, as well as theoretical work to link the Berry phase with the charge fractionalization mechanism.

Kekulé pattern near divacancies

The existence of the Kekulé vortex and the quasi-bound states induced by the hydrogen adatom defect is related to the symmetries of this scattering center. While a hydrogen adatom breaks the \(C_{2}\) symmetry and sublattice balance of graphene, a divacancy preserves these symmetries. Figure 4a shows an STM image of such a divacancy in graphene. The divacancy also locally induces a Kekulé pattern. Importantly, this Kekulé pattern is free of any winding, with the Kekulé domain being defined by one of the three orientations of the chemical bond linking the two removed atoms. This is confirmed by DFT calculations (Fig. 4c), which show that the defect also creates significant lattice distortion originating from structural reconstruction due to the formation of two pentagonal rings (Figs. 3c and 4d).

figure 4

a An STM image of a divacancy in graphene (\(V_{b}\) = 500 mV, \(i_{t}\) = 400 pA). The scale bar is 1 nm. b Fourier-filtered image highlighting the Kekulé bond texture. The Fourier filter is shown in the inset. Selected harmonics are those of graphene (black circles) and \(\sqrt{3}\times \sqrt{3}R30^{\circ}\) (red circles). c DFT-simulated STM image. The scale bar is 1 nm. d Atomic structure of the divacancy defect in graphene from DFT calculations. The bond lengths were calculated from relaxed atomic positions.

A local Kekulé pattern, with or without a vortex, can therefore be induced in graphene by the specific distribution of atomic defects. Experimentally harnessing these building blocks could lead to the long-awaited macroscopic Kekulé engineering from the cooperative effect of atomic defects 18,19.

Samples and STM measurements

The samples were grown by thermal decomposition of carbon-face SiC at temperatures close to 1150 °C in ultrahigh vacuum 42. Silicon evaporation results in several graphene layers decoupled by rotational disorder. Hydrogen atoms were then deposited by thermal dissociation of hydrogen gas in a custom atomic hydrogen source, as described previously 41. STM images were obtained in constant-current mode in a custom ultrahigh vacuum setup at 5 K.

Tight-binding calculations

The nearest-neighbor tight-binding Hamiltonian of graphene is expressed as

where the nearest-neighbor hopping integral is t = −2.7 eV. The TB calculation is carried out with periodic boundary conditions, with the hydrogen adatom modelled as a large on-site potential (V = 100|t| ≈ 270 eV). A supercell of 27 × 27 graphene unit cells was used to reduce the spurious effects of mutual interference between the periodic images of the adatom.
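
The setup described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code. It assumes the Hamiltonian has the common form \(H = t\sum_{\langle i,j\rangle} c_{i}^{\dagger} c_{j} + \mathrm{h.c.}\) (the sign convention is an assumption), and the function name `graphene_tb_hamiltonian`, the site-indexing scheme and the choice of site 0 for the adatom are hypothetical.

```python
import numpy as np


def graphene_tb_hamiltonian(n_cells, t=-2.7, onsite=None):
    """Dense nearest-neighbour TB matrix (eV) for an n_cells x n_cells
    graphene supercell with periodic boundary conditions.

    Site index = 2 * (ix * n_cells + iy) + s, with sublattice s = 0 (A), 1 (B).
    `onsite` is an optional dict {site_index: potential_in_eV}.
    """
    n_sites = 2 * n_cells * n_cells
    h = np.zeros((n_sites, n_sites))

    def site(ix, iy, sublattice):
        # The modulo wraps cell indices around, which imposes the periodicity.
        return 2 * ((ix % n_cells) * n_cells + (iy % n_cells)) + sublattice

    for ix in range(n_cells):
        for iy in range(n_cells):
            a = site(ix, iy, 0)
            # Each A atom bonds to the B atom of its own cell and to the
            # B atoms of the two neighbouring cells (ix-1, iy) and (ix, iy-1).
            for dx, dy in ((0, 0), (-1, 0), (0, -1)):
                b = site(ix + dx, iy + dy, 1)
                h[a, b] += t
                h[b, a] += t

    # Large on-site potentials mimic adatoms (assumed placement).
    for index, potential in (onsite or {}).items():
        h[index, index] += potential
    return h


if __name__ == "__main__":
    t = -2.7
    # 27 x 27 supercell with V = 100|t| on one A-sublattice site.
    h = graphene_tb_hamiltonian(27, t=t, onsite={0: 100 * abs(t)})
    energies = np.linalg.eigvalsh(h)
    print(h.shape, energies.min(), energies.max())
```

A dense matrix is used here only for brevity. The 27 × 27 supercell has 1458 sites, which is still small enough for direct diagonalization.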

The information about the phase difference between wavefunction amplitudes on nearest-neighbor sites, that is, the bond order, is needed to describe the Kekulé texture. We define the bond parameter as the LDOS between two neighboring atoms i, j. On the basis of the TB model, we consider the wave function as the product of the TB eigenvector ψ and an envelope function f(r), \(\phi (r)={\sum }_{n}{\sum }_{i}{\phi }_{i}^{n}f(r-{r}_{i})\). Taking the middle point between the atoms, \(r_{ij}=(r_{i}+r_{j})/2\), the LDOS reads

in which the inner product \(\langle \psi_{i}|\psi_{j}\rangle\) governs the bond texture. Therefore, we use the orbital overlap \(\mathrm{Re}\,\langle \psi_{i}|\psi_{j}\rangle\) as the bond-order operator in the TB calculation. In the presence of a scattering center, \(\langle \psi_{i}|\psi_{j}\rangle\) is perturbed by the inter-sublattice Green's functions \(G_{AB}\) and \(G_{BA}\) (see Supplementary Information), since the atoms connected by the bond are on different sublattices. The bond order is plotted by integrating the LDOS from 0 to 300 meV.
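
One way to evaluate such a bond order numerically is sketched below. It is an illustration under stated assumptions, not the authors' implementation: the energy integration is approximated by a sum over the eigenstates that fall inside the window, the helper name `bond_orders` is hypothetical, and the check uses a six-site ring with t = −1 (a benzene-like toy model, not graphene), for which the bond order per spin summed over the negative-energy states is 1/3.

```python
import numpy as np


def bond_orders(h, bonds, e_min, e_max):
    """Return {(i, j): sum_n Re(conj(psi_i^n) * psi_j^n)} over all
    eigenstates n of `h` with e_min <= E_n <= e_max."""
    energies, states = np.linalg.eigh(h)      # columns are eigenvectors
    in_window = (energies >= e_min) & (energies <= e_max)
    psi = states[:, in_window]                # shape (n_sites, n_selected)
    # Row psi[i] holds the amplitudes of every selected state on site i;
    # np.vdot conjugates its first argument.
    return {(i, j): float(np.real(np.vdot(psi[i], psi[j]))) for i, j in bonds}


# Toy check: six-site ring with nearest-neighbour hopping t = -1.
t = -1.0
bonds = [(i, (i + 1) % 6) for i in range(6)]
ring = np.zeros((6, 6))
for i, j in bonds:
    ring[i, j] = ring[j, i] = t

# Eigenvalues are -2, -1, -1, 1, 1, 2; the window selects the three negative ones.
orders = bond_orders(ring, bonds, e_min=-3.0, e_max=0.0)
for bond, value in orders.items():
    print(bond, round(value, 6))
```

For a graphene supercell Hamiltonian expressed in eV, the window quoted in the text would correspond to `e_min=0.0` and `e_max=0.3`, with `bonds` listing the nearest-neighbour pairs.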

DFT calculations

First-principles calculations were performed using the SIESTA code 43. We use the double-ζ plus polarization localized orbital basis set combined with the local density approximation exchange-correlation functional 44. The energy shift for constructing the localized orbital basis functions was set to 275 meV, and the real-space cutoff to 250 Ry. The structural relaxation was performed using the conjugate gradient method.

The simulated STM images were produced using the plstm module of the SIESTA package, as a postprocessing step following the DFT calculations. The images were simulated from LDOS at a constant height of 2.5 Bohrs above the graphene plane.

Data availability

The data that support the findings of this study are available from the corresponding author upon request.

References

1. Mermin, N. D. The topological theory of defects in ordered media. Rev. Mod. Phys. 51, 591–648 (1979).

2. Nelson, D. R. Defects and Geometry in Condensed Matter Physics (Cambridge University Press, 2002).

3. Juričić, V., Mesaros, A., Slager, R.-J. & Zaanen, J. Universal Probes of Two-Dimensional Topological Insulators: Dislocation and π Flux. Phys. Rev. Lett. 108, 106403 (2012).

4. Teo, J. C. & Hughes, T. L. Topological defects in symmetry-protected topological phases. Ann. Rev. Condens. Matter Phys. 8, 211–237 (2017).

5. Hou, C.-Y., Chamon, C. & Mudry, C. Electron Fractionalization in Two-Dimensional Graphenelike Structures. Phys. Rev. Lett. 98, 186809 (2007).

6. Seradjeh, B. & Franz, M. Fractional Statistics of Topological Defects in Graphene and Related Structures. Phys. Rev. Lett. 101, 146401 (2008).

7. Menssen, A. J., Guan, J., Felce, D., Booth, M. J. & Walmsley, I. A. Photonic Topological Mode Bound to a Vortex. Phys. Rev. Lett. 125, 117401 (2020).

8. Gao, X., Yang, L. & Lin, H. Dirac-vortex topological cavities. Nat. Nanotechnol. 15, 1012–1018 (2020).

9. Gao, P. et al. Majorana-like Zero Modes in Kekulé Distorted Sonic Lattices. Phys. Rev. Lett. 123, 196601 (2019).

10. Ma, J., Xi, X. & Li, Y. T. Nanomechanical topological insulators with an auxiliary orbital degree of freedom. Nat. Nanotechnol. 15, 576–583 (2020).

11. Li, S.-Y., Zhang, Y., Yin, L.-J. & He, L. Scanning tunneling microscope study of quantum Hall isospin ferromagnetic states in the zero Landau level in a graphene monolayer. Phys. Rev. B 100, 085437 (2019).

12. Coissard, A. Imaging tunable quantum Hall broken-symmetry orders in graphene. Nature 605, 51–56 (2022).

13. Liu, X. et al. Visualizing broken symmetry and topological defects in a quantum Hall ferromagnet. Science 375, 321–326 (2022).

14. Lian, Y. & Goerbig, M. O. Spin-valley skyrmions in graphene at filling factor ν = −1. Phys. Rev. B 95, 245428 (2017).

15. Atteia, J., Lian, Y. & Goerbig, M. O. Skyrmion zoo in graphene at charge neutrality in a strong magnetic field. Phys. Rev. B 103, 035403 (2021).

16. Ovdat, O., Don, Y. & Akkermans, E. Vacancies in graphene: Dirac physics and fractional vacuum charges. Phys. Rev. B 102, 075109 (2020).

17. Ducastelle, F. Electronic structure of vacancy resonant states in graphene: a critical review of the single-vacancy case. Phys. Rev. B 88, 075413 (2013).

18. Cheianov, V., Fal’ko, V., Syljuåsen, O. & Altshuler, B. Hidden Kekulé ordering of adatoms on graphene. Solid State Commun. 149, 1499–1501 (2009).

19. Cheianov, V. V., Syljuåsen, O., Altshuler, B. L. & Fal’ko, V. Ordered states of adatoms on graphene. Phys. Rev. B 80, 233409 (2009).

20. Gutierrez, C. et al. Ubiquitous defect-induced density wave instability in monolayer graphene. Nat. Phys. 12, 950 (2016).

21. Bao, C. et al. Experimental Evidence of Chiral Symmetry Breaking in Kekulé-Ordered Graphene. Phys. Rev. Lett. 126, 206804 (2021).

22. Qu, A. C. Ubiquitous defect-induced density wave instability in monolayer graphene. Sci. Adv. 8, eabm5180 (2022).

23. Goft, A., Abulafia, Y., Orion, N., Schochet, C. L. & Akkermans, E. Defects in graphene: A topological description. Phys. Rev. B 108, 054101 (2023).

24. Abulafia, Y., Goft, A., Orion, N. & Akkermans, E. Wavefronts Dislocations Measure Topology in Graphene with Defects. arXiv:2307.05185 (2023).

25. Dutreix, C. et al. Measuring the Berry phase of graphene from wavefront dislocations in Friedel oscillations. Nature 574, 219–222 (2019).

26. Dutreix, C. & Katsnelson, M. I. Friedel oscillations at the surfaces of rhombohedral N-layer graphene. Phys. Rev. B 93, 035413 (2016).

27. Dutreix, C. et al. Measuring graphene’s Berry phase at B = 0 T. Comptes Rendus. Physique 22, 133–143 (2021).

28. Nuckolls, K. P. et al. Quantum textures of the many-body wavefunctions in magic-angle graphene. Nature 620, 525–532 (2023).

29. Hÿtch, M., Snoeck, E. & Kilaas, R. Quantitative measurement of displacement and strain fields from HREM micrographs. Ultramicroscopy 74, 131–146 (1998).

30. Clar, E. The Aromatic Sextet (Springer Netherlands, Dordrecht, 1983).

31. Balaban, A. & Klein, D. J. Claromatic Carbon Nanostructures. J. Phys. Chem. C 100, 19123–19133 (2009).

32. Katsnelson, M. I. The Physics of Graphene, 2nd edn (Cambridge University Press, 2020).

33. Boukhvalov, D. W., Katsnelson, M. I. & Lichtenstein, A. I. Hydrogen on graphene: Electronic structure, total energy, structural distortions and magnetism from first-principles calculations. Phys. Rev. B 77, 035427 (2008).

34. Wehling, T. O., Yuan, S., Lichtenstein, A. I., Geim, A. K. & Katsnelson, M. I. Resonant Scattering by Realistic Impurities in Graphene. Phys. Rev. Lett. 105, 056802 (2010).

35. Pereira, V. M., Guinea, F., Lopes dos Santos, J. M. B., Peres, N. M. R. & Castro Neto, A. H. Disorder Induced Localized States in Graphene. Phys. Rev. Lett. 96, 036801 (2006).

36. Yazyev, O. V. & Helm, L. Defect-induced magnetism in graphene. Phys. Rev. B 75, 125408 (2007).

37. Wehling, T. O. et al. Local electronic signatures of impurity states in graphene. Phys. Rev. B 75, 125425 (2007).

38. Dutreix, C., Bilteanu, L., Jagannathan, A. & Bena, C. Friedel oscillations at the Dirac cone merging point in anisotropic graphene and graphenelike materials. Phys. Rev. B 87, 245413 (2013).

39. Su, W. P., Schrieffer, J. R. & Heeger, A. J. Solitons in Polyacetylene. Phys. Rev. Lett. 42, 1698–1701 (1979).

40. Su, W. P., Schrieffer, J. R. & Heeger, A. J. Soliton excitations in polyacetylene. Phys. Rev. B 22, 2099–2111 (1980).

41. González-Herrero, H. et al. Atomic-scale control of graphene magnetism by using hydrogen atoms. Science 352, 437–441 (2016).

42. Varchon, F., Mallet, P., Magaud, L. & Veuillen, J.-Y. Rotational disorder in few-layer graphene films on 6H-SiC(000-1): A scanning tunneling microscopy study. Phys. Rev. B 77, 165415 (2008).

43. Soler, J. M. et al. The SIESTA method for ab initio order-N materials simulation. J. Phys.: Condens. Matter 14, 2745 (2002).

44. Perdew, J. P. & Zunger, A. Self-interaction correction to density-functional approximations for many-electron systems. Phys. Rev. B 23, 5048–5079 (1981).

Acknowledgements

V.T.R. acknowledges the support from the ANR Flatmoi project (ANR-21-CE30-0029). Y.G. and O.V.Y. acknowledge support from the Swiss National Science Foundation (grant No. 204254). Computations were performed at the Swiss National Supercomputing Centre (CSCS) under project No. s1146 and at the facilities of the Scientific IT and Application Support Center of EPFL. C.D. acknowledges support from the projects TED, CDS-QM, and TopoMat (ANR-23-CE30-0029), respectively funded by Quantum Matter Bordeaux, the SMR department of Bordeaux University, and the French Research National Agency. M.M.U. acknowledges support by the European Research Council Consolidator Grant (No. 101087014) mKoire. I.B. acknowledges the support from the “(MAD2D-CM)-UAM” project funded by Comunidad de Madrid, by the Recovery, Transformation and Resilience Plan, and by NextGenerationEU from the European Union, the Spanish Ministry of Science and Innovation (Grant PID2020-115171GB-I00) and the Comunidad de Madrid NMAT2D-CM program under grant S2018/NMT-4511. H.G.-H. acknowledges financial support from the Spanish State Research Agency under the Ramón y Cajal fellowship RYC2021-031050-I.

Author information

Authors and affiliations.

Institute of Physics, École Polytechnique Fédérale de Lausanne (EPFL), CH-1015, Lausanne, Switzerland

Yifei Guan & Oleg V. Yazyev

Univ. Bordeaux, CNRS, LOMA, UMR 5798, F-33400, Talence, France

Clement Dutreix

Departamento de Física de la Materia Condensada, Universidad Autónoma de Madrid, E-28049, Madrid, Spain

Héctor González-Herrero & Ivan Brihuega

Condensed Matter Physics Center (IFIMAC), Universidad Autónoma de Madrid, E-28049, Madrid, Spain

Donostia International Physics Center (DIPC), Paseo Manuel de Lardizábal 4, 20018, San Sebastián, Spain

Miguel M. Ugeda

Centro de Física de Materiales (CSIC-UPV-EHU), Paseo Manuel de Lardizábal 5, 20018, San Sebastián, Spain

Ikerbasque, Basque Foundation for Science, 48013, Bilbao, Spain

Instituto Nicolás Cabrera, Universidad Autónoma de Madrid, E-28049, Madrid, Spain

Ivan Brihuega

Institute for Molecules and Materials, Radboud University, Heijendaalseweg 135, 6525AJ, Nijmegen, The Netherlands

Mikhail I. Katsnelson

Univ. Grenoble Alpes, CEA, Grenoble INP, IRIG, PHELIQS, 38000, Grenoble, France

Vincent T. Renard

Contributions

H.G.-H., M.M.U. and I.B. performed the experiments under the supervision of I.B. V.T.R. discovered the Kekulé vortex. Y.G. performed DFT and TB calculations under the supervision of O.V.Y. Y.G. and C.D. performed Green’s function calculations. M.I.K. gave technical support and conceptual advice. Y.G., C.D., O.V.Y. and V.T.R. wrote the manuscript with input from all authors. V.T.R. coordinated the collaboration.

Corresponding authors

Correspondence to Ivan Brihuega , Oleg V. Yazyev or Vincent T. Renard .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Peer review

Peer review information.

Nature Communications thanks Marcel Franz and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article.

Guan, Y., Dutreix, C., González-Herrero, H. et al. Observation of Kekulé vortices around hydrogen adatoms in graphene. Nat Commun 15 , 2927 (2024). https://doi.org/10.1038/s41467-024-47267-8

Received : 12 June 2023

Accepted : 26 March 2024

Published : 04 April 2024

DOI : https://doi.org/10.1038/s41467-024-47267-8

observational research

COMMENTS

  1. What Is an Observational Study?

    An observational study is a research method that observes what the researcher sees without interfering or manipulating the participants. Learn about the types, advantages, disadvantages, and examples of observational studies in different fields and contexts.

  2. Observational Study Designs: Synopsis for Selecting an Appropriate

    The observational design is subdivided into descriptive, including cross-sectional, case report or case series, and correlational, and analytic which includes cross-section, case-control, and cohort studies. Each research design has its uses and points of strength and limitations. The aim of this article to provide a simplified approach for the ...

  3. Observational study

    Observational study. In fields such as epidemiology, social sciences, psychology and statistics, an observational study draws inferences from a sample to a population where the independent variable is not under the control of the researcher because of ethical concerns or logistical constraints. One common observational study is about the ...

  4. Observational Research

    Naturalistic observation is an observational method that involves observing people's behavior in the environment in which it typically occurs. Thus naturalistic observation is a type of field research (as opposed to a type of laboratory research). Jane Goodall's famous research on chimpanzees is a classic example of naturalistic observation ...

  5. What is an Observational Study: Definition & Examples

    Learn what an observational study is, how it differs from an experiment, and why it is useful for some research questions. Explore the types of observational studies, such as cohort, case-control, and cross-sectional, and their advantages and drawbacks.

  6. Observational studies and their utility for practice

    Introduction. Observational studies involve the study of participants without any forced change to their circumstances, that is, without any intervention.1 Although the participants' behaviour may change under observation, the intent of observational studies is to investigate the 'natural' state of risk factors, diseases or outcomes. For drug therapy, a group of people taking the drug ...

  7. What is Observational Study Design and What Types

    Learn what observational study design is and how it differs from experimental study design. Explore the advantages and disadvantages of the three types of observational studies: case control, cohort and cross sectional.

  8. Observation Methods: Naturalistic, Participant and Controlled

    Learn about different types of observation methods in psychology, such as controlled, naturalistic and participant observation. Find out how they are used to study behavior in various settings, and what are their advantages and disadvantages.

  9. 6.6: Observational Research

    The term observational research is used to refer to several different types of non-experimental studies in which behavior is systematically observed and recorded. The goal of observational research is to describe a variable or set of variables. More generally, the goal is to obtain a snapshot of specific characteristics of an individual, group ...

  10. Observational Research

    Learn what observational research is, how it works, and its types, data collection methods, data analysis methods, and applications. Find out the advantages and disadvantages of this research method and the challenges of data collection and analysis.

  11. Observational designs for real-world evidence studies

    The generated RWE needs to have internal as well as external validity to be actionable. The "fit-for-purpose" observational study designs include descriptive, case-control, cross-sectional, and cohort. This article focuses on the advantages and disadvantages including the inherent bias of each study design.

  12. Observational Studies: Uses and Limitations

    Observational epidemiologic studies are a type of nonexperimental research in which exposure is not controlled by the investigator. Observational studies are by far the most common form of clinical research because of their relatively low complexity, cost, and ethical constraints compared to randomized trials or other forms of clinical experimentation.

  13. Observational Methods

    Systematic observational methods require clearly defined codes, structured sampling and recording procedures, and are subject to rigorous psychometric analysis. We review best practices in each of these areas with attention to the application of these methods for addressing empirical questions that quantitative researchers may posit.

  14. Observational Research: What is, Types, Pros & Cons + Example

    Observational research is a broad term for various non-experimental studies in which behavior is carefully watched and recorded. The goal of this research is to describe a variable or a set of variables. More broadly, the goal is to capture specific individual, group, or setting characteristics. Since it is non-experimental and uncontrolled, we ...

  15. Experimental Studies and Observational Studies

    Observational studies dominate most fields of aging research because many research questions can be answered with these studies, chronological age cannot be experimentally manipulated (Cavanaugh and Blanchard-Fields 2019), and many social/societal conditions would be difficult to manipulate (Weil 2017). Among the observational studies, large-scale aging surveys are particularly valuable because ...

  16. What is an Observational Research: Steps, Types, Pros and Cons

    Observational research is a non-experimental method of studying a society, culture, behaviours and attitudes. It can be classified by the degree of structure of the environment (naturalistic, participant, controlled or structured) and the degree of structure imposed by the researcher (indirect, direct or erosion measures). Learn about the advantages and disadvantages of observational research and how to conduct it.

  17. Observational Studies

    Clinical Study (Observational) Protocol. Description. Provides a recommended structure for developing an NIDCR-funded protocol for an observational study that collects biospecimens, images, and other data. Sample and suggested text are offered in this template.

  18. Observation

    Observation is a way of collecting data through observing phenomena in a setting. It can be structured or unstructured, overt or covert, and has advantages and disadvantages. Learn more about the advantages, disadvantages, ethical issues, and examples of observation research.

  19. Observational Research Method explained

    Observational research is a method of collecting data by simply observing and recording the behavior of individuals, animals or objects in their natural environment. It offers researchers insights into human and animal behavior, revealing patterns and dynamics that would otherwise go unnoticed. This article explores the definition, types ...

  20. What is observational research?

    Observational research is a research technique where you observe participants and phenomena in their most natural settings. This enables researchers to see their subjects make choices and react to situations in their natural setting, as opposed to structured settings like research labs or focus groups.

  21. Assessing fall risk and equilibrium function in patients with age

    This study was designed to assess equilibrium function, fall risk, and fall-related self-efficacy (an individual's belief in their capacity to act in ways necessary to reach specific goals) in patients with AMD and glaucoma. Methods This observational study was performed at the Otorhinolaryngology Department of Shinseikai Toyama Hospital.

  22. Apprenticeship of Observation in Kinesiology: Undergraduates

    ABSTRACT. While kinesiology scholars have focused on how future faculty members are socialized, recruited into, and prepared for academia, limited attention has been given to the apprenticeship of observation for faculty roles when college students first develop impressions and initial understandings of faculty work.

  23. Observational Studies: Cohort and Case-Control Studies

    Cohort studies and case-control studies are two primary types of observational studies that aid in evaluating associations between diseases and exposures. In this review article, we describe these study designs, methodological issues, and provide examples from the plastic surgery literature. Keywords: observational studies, case-control study ...

  24. Observational Research Manager

    Performing and handling research projects involving the analysis of multiple types of data including medical claims, electronic health records and prospective observational cohort studies. Contributing to the development and implementation of innovative analytic methods, capabilities and tools to enable rapid, scalable and reproducible RWE.

  25. Prognostic utility and characteristics of MIB-1 labeling ...

    This observation may possibly confirm a clinical significance of KI-67/MIB-1 LI as a surrogate marker for the proliferative activity of glioma cells, as originally shown in comprehensive analyses on a cellular level, leading to the establishment of MIB-1 LI as the commonly used method for measuring the proliferative potential in human gliomas ...

  26. Homework 1. Observational Research Design. Pia Vegas Smith.

    HOMEWORK 1 2 Observational Research Design The question I came up with for the observational research design is, How is employee productivity and job satisfaction, over a six-month period, changed and impacted when there is an implementation of a wellness program influence in the workplace? We will be using an observational study design with a pre-post intervention.

  27. Social, Behavioral, and Metabolic Risk Factors and Racial ...

    Background: Cardiovascular disease (CVD) mortality is persistently higher in the Black population than in other racial and ethnic groups in the United States. Objective: To examine the degree to which social, behavioral, and metabolic risk factors are associated with CVD mortality and the extent to which racial differences in CVD mortality persist after these factors are accounted for.

  28. Observational Research Opportunities and Limitations

    Observational research may also provide preliminary data to justify the performance of a clinical trial, which might not have received sufficient funding support without the existence of such results. This paper will review observational research methods applied to addressing questions of causation in diabetes research, with a particular focus ...

  29. JMIR Cancer

    Objective: The primary objective of this prospective observational study (OGIPRO study) was to compare the ePRO data related to treatment side effects collected with the medidux app in patients with HER2-positive BC treated with the trastuzumab biosimilar Ogivri (prospective cohort) to those obtained from historical cohorts treated with ...

  30. Observation of Kekulé vortices around hydrogen adatoms in ...

    Observation of a Kekulé vortex near an H atom on graphene. Figure 1b shows a scanning tunneling microscopy (STM) image of graphene with a single chemisorbed hydrogen adatom (see Methods for ...