Formative vs. Summative Evaluations

July 28, 2019

In the user-experience profession, we preach iteration and evaluation. There are two types of evaluation, formative and summative, and where you are in the design process determines what type of evaluation you should conduct.

Formative evaluations focus on determining which aspects of the design work well or not, and why. These evaluations occur throughout a redesign and provide information to incrementally improve the interface.

Let’s say we’re designing the onboarding experience for a new, completely redesigned version of our mobile app. In the design process, we prototype a solution and then test it with (usually a few) users to see how usable it is. The study identifies several issues with our prototype, which we then fix in a revised design. This test is an example of formative evaluation: it helps designers identify what needs to be changed to improve the interface.

Formative evaluations of interfaces involve testing and changing the product, usually multiple times, and therefore are well-suited for the redesign process or while creating a new product.

In both cases, you iterate through the prototyping and testing steps until you are as ready for production as you’ll get (even more iterations would form an even better design, but you have to ship at some point). Thus, formative evaluations are meant to steer the design on the right path.

Summative evaluations describe how well a design performs, often compared to a benchmark such as a prior version of the design or a competitor. Unlike formative evaluations, whose goal is to inform the design process, summative evaluations involve getting the big picture and assessing the overall experience of a finished product. Summative evaluations occur less frequently than formative evaluations, usually right before or right after a redesign.

Let’s go back to our mobile-app example. Now that we’ve shipped the new mobile app, it is time to run a study and see how it compares with the previous version. We can gather time on task and success rates for the core app functionalities, then compare these metrics against those obtained with the previous version to see whether there was any improvement. We will also save the results of this study to evaluate subsequent major versions of the app. This type of study is a summative evaluation, since it assesses the shipped product with the goal of tracking performance over time and ultimately calculating our return on investment. However, during this study we might uncover some usability issues; we should make note of them and address them during our next design iteration.
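
To make that comparison concrete, here is a minimal sketch of how such benchmark metrics might be analyzed, assuming hypothetical task-level measurements collected from the old and new versions of the app. The numbers, sample sizes, and variable names are illustrative assumptions, not data from the article.

```python
# Minimal sketch of a summative benchmark comparison (illustrative data only).
from scipy import stats

# Hypothetical time-on-task measurements, in seconds, for the same core task
time_old = [48.2, 52.1, 61.0, 45.3, 58.7, 50.9, 63.4, 47.8, 55.2, 60.1]
time_new = [39.5, 41.2, 37.8, 44.0, 40.6, 38.9, 45.1, 36.7, 42.3, 39.0]

# Hypothetical task success counts: successes out of attempts
success_old, n_old = 14, 20   # 70% success on the previous version
success_new, n_new = 18, 20   # 90% success on the new version

# Compare mean time on task with Welch's t-test (no equal-variance assumption)
t_stat, p_time = stats.ttest_ind(time_new, time_old, equal_var=False)
print(f"Time on task: old mean={sum(time_old)/len(time_old):.1f}s, "
      f"new mean={sum(time_new)/len(time_new):.1f}s (t={t_stat:.2f}, p={p_time:.3f})")

# Compare success rates with a two-proportion z-test
p_pooled = (success_old + success_new) / (n_old + n_new)
se = (p_pooled * (1 - p_pooled) * (1 / n_old + 1 / n_new)) ** 0.5
z = (success_new / n_new - success_old / n_old) / se
p_success = 2 * stats.norm.sf(abs(z))
print(f"Success rate: old={success_old/n_old:.0%}, new={success_new/n_new:.0%} "
      f"(z={z:.2f}, p={p_success:.3f})")
```

Welch's t-test is used here because time-on-task samples from two separate studies rarely have equal variances.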

Alternatively, another type of summative evaluation could compare our results with those obtained with one or more competitor apps or with known industry-wide data.

All summative evaluations paint an overview picture of the usability of a system. They are intended to serve as reference points so that you can determine whether you’re improving your own designs over time or beating out a competitor.

The ultimate summative evaluation is the go/no-go decision of whether to release a product. After all is said and done, is the design good enough to be inflicted on the public, or will it harm the brand so badly that it should never see the light of day? It’s actually rare for companies to have a formal process to kill off bad designs, which may be why we encounter many releases that do more harm than good for a brand. If you truly embrace our proposition that brand is experience in the digital age, then consider a final summative evaluation before release.

In This Article:

  • Origin of the terms
  • When each type of evaluation is used
  • Research methods for formative vs. summative evaluations

The terms ‘formative’ and ‘summative’ evaluation were coined by Michael Scriven in 1967. These terms were presented in the context of instructional design and education theory, but they are just as valuable in any field that relies on evaluation.

In the educational context, formative evaluations are ongoing and occur throughout the development of the course, while summative evaluations occur less frequently and are used to determine whether the program met its intended goals. The formative evaluations are used to steer the teaching, by testing whether content was understood or needs to be revisited, while summative evaluations assess the student’s mastery of the material.

Recall that formative and summative evaluations align with your place in the design process. Formative evaluations go with prototype and testing iterations throughout a redesign project, while summative evaluations are best for right before or right after a major redesign.

Great researchers begin their study by determining what question they’re trying to answer. Essentially, your research question determines the type of evaluation: questions about what to fix and why call for a formative evaluation, while questions about how well the design performs overall call for a summative one. This mapping is descriptive, not prescriptive.

After it is clear which type of evaluation you will conduct, you have to determine which research method you should use. There is a common misconception that summative equals quantitative and formative equals qualitative; this is not the case.

Summative evaluations can be either qualitative or quantitative. The same is true for formative evaluations.

Although summative evaluations are often quantitative, they can be qualitative studies, too. For example, you might like to know where your product stands compared with your competition. You could hire a UX expert to do an expert review of your interface and a competitor’s. The expert review would use the 10 usability heuristics as well as the reviewer’s knowledge of UI and human behavior to produce a list of strengths and weaknesses for both your interface and your competitor’s. The study is summative because the overall interface is being evaluated with the goal of understanding whether the UX of your product stands up to the competition and whether a major redesign is warranted.

Additionally, formative evaluations aren’t always qualitative, although that is often the case. (Since it’s recommended to run an extended series of formative evaluations, it makes financial sense to use a cheaper qualitative study for each of them.) But sometimes big companies with large UX budgets and a high level of UX maturity might use quantitative studies for formative purposes in order to ensure that a change to one of their essential features will perform satisfactorily. For instance, before launching a new homepage design, a large company may want to run a quantitative test on the prototype to make sure that the number of people who will scroll below the fold is high enough.
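
As a rough sketch of what such a quantitative check might look like, the example below computes the observed scroll-through rate from a prototype test along with a 95% confidence interval and compares it with a target threshold. The participant counts and the threshold are invented for illustration.

```python
# Illustrative check: did enough prototype testers scroll below the fold?
import math

scrolled = 82          # hypothetical participants who scrolled past the fold
participants = 100     # hypothetical total participants in the quantitative test
target_rate = 0.75     # hypothetical minimum acceptable scroll-through rate

p_hat = scrolled / participants
# 95% confidence interval using the normal approximation
margin = 1.96 * math.sqrt(p_hat * (1 - p_hat) / participants)
low, high = p_hat - margin, p_hat + margin

print(f"Observed scroll-through rate: {p_hat:.0%} (95% CI {low:.0%} to {high:.0%})")
if low >= target_rate:
    print("Even the lower bound clears the target; the change looks safe to ship.")
elif high < target_rate:
    print("Even the upper bound misses the target; revisit the design.")
else:
    print("Inconclusive at this sample size; consider testing more participants.")
```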

Formative and summative evaluations correspond to different research goals. Formative evaluations are meant to steer the design on the correct path so that the final product has a satisfactory user experience. They are a natural part of any iterative user-centered design process. Summative evaluations assess the overall usability of a product and are instrumental in tracking its usability over time and in comparing it with competitors.


Understanding Summative Evaluation: Definition, Benefits, and Best Practices

What is Summative Evaluation?

This article provides an overview of summative evaluation, including its definition, benefits, and best practices. Discover how summative evaluation can help you assess the effectiveness of your program or project, identify areas for improvement, and promote evidence-based decision-making. Learn about best practices for conducting summative evaluation and how to address common challenges and limitations.

Table of Contents

  • What is Summative Evaluation and Why is it Important?
  • Purpose and Goals of Summative Evaluation
  • Benefits of Summative Evaluation
  • Types of Summative Evaluation
  • Best Practices for Conducting Summative Evaluation
  • Examples of Summative Evaluation in Practice
  • Examples of Summative Evaluation Questions
  • Challenges and Limitations of Summative Evaluation
  • Ensuring Ethical Considerations in Summative Evaluation
  • Future Directions for Summative Evaluation Research and Practice


Summative evaluation is a type of evaluation that is conducted at the end of a program or project, with the goal of assessing its overall effectiveness. The primary focus of summative evaluation is to determine whether the program or project achieved its goals and objectives. Summative evaluation is often used to inform decisions about future program or project development, as well as to determine whether or not to continue funding a particular program or project.

Summative evaluation is important for several reasons. First, it provides a comprehensive assessment of the overall effectiveness of a program or project, which can help to inform decisions about future development and implementation. Second, it can help to identify areas where improvements can be made in program delivery, such as in program design or implementation. Third, it can help to determine whether the program or project is a worthwhile investment, and whether it is meeting the needs of stakeholders.

In addition to these benefits, summative evaluation can also help to promote accountability and transparency in program or project implementation. By conducting a thorough evaluation of the program or project, stakeholders can be assured that their resources are being used effectively and that the program or project is achieving its intended outcomes.

Summative evaluation plays an important role in assessing the overall effectiveness of a program or project, and in informing decisions about future development and implementation. It is an essential tool for promoting accountability, transparency, and effectiveness in program or project implementation.

Summative evaluation is an approach to program evaluation that is conducted at the end of a program or project, with the goal of assessing its overall effectiveness. Here are some of the key purposes and goals of summative evaluation.

Purpose of Summative Evaluation

  • Assess effectiveness: Summative evaluation is focused on assessing the overall effectiveness of a program or project in achieving its intended goals and objectives.
  • Determine impact: Summative evaluation is used to determine the impact of a program or project on its intended audience or stakeholders, as well as on the broader community or environment.
  • Inform decision-making: Summative evaluation is used to inform decision-making about future program or project development, as well as resource allocation.

Goals of Summative Evaluation

  • Measure program outcomes: Summative evaluation is used to measure program outcomes, including the extent to which the program achieved its intended goals and objectives, and the impact of the program on its intended audience or stakeholders.
  • Assess program effectiveness: Summative evaluation is used to assess the overall effectiveness of a program, by comparing program outcomes to its intended goals and objectives, as well as to similar programs or initiatives.
  • Inform program improvement: Summative evaluation is used to inform program improvement by identifying areas where the program could be modified or improved in order to enhance its effectiveness.

Summative evaluation is a critical tool for assessing the overall effectiveness and impact of programs or projects, and for informing decision-making about future program or project development. By measuring program outcomes, assessing program effectiveness, and identifying areas for program improvement, summative evaluation can help to ensure that programs and projects are meeting their intended goals and making a positive impact on their intended audience or stakeholders.

Benefits of Summative Evaluation

Summative evaluation is an important tool for assessing the overall effectiveness of a program or project. Here are some of the benefits of conducting summative evaluation:

  • Provides a Comprehensive Assessment: Summative evaluation provides a comprehensive assessment of the overall effectiveness of a program or project, which can help to inform decisions about future development and implementation.
  • Identifies Areas for Improvement: Summative evaluation can help to identify areas where improvements can be made in program delivery, such as in program design or implementation.
  • Promotes Accountability and Transparency: Summative evaluation can help to promote accountability and transparency in program or project implementation, by ensuring that resources are being used effectively and that the program or project is achieving its intended outcomes.
  • Supports Evidence-Based Decision-Making: Summative evaluation provides evidence-based data and insights that can inform decisions about future development and implementation.
  • Demonstrates Impact: Summative evaluation can help to demonstrate the impact of a program or project, which can be useful for securing funding or support for future initiatives.
  • Increases Stakeholder Engagement: Summative evaluation can increase stakeholder engagement and ownership of the program or project being evaluated, by involving stakeholders in the evaluation process and soliciting their feedback.

Summative evaluation is an essential tool for assessing the overall effectiveness of a program or project, and for informing decisions about future development and implementation. It provides a comprehensive assessment of the program or project, identifies areas for improvement, promotes accountability and transparency, and supports evidence-based decision-making.

Types of Summative Evaluation

There are different types of summative evaluation that can be used to assess the overall effectiveness of a program or project. Here are some of the most common types of summative evaluation:

  • Outcome Evaluation: This type of evaluation focuses on the outcomes or results of the program or project, such as changes in behavior, knowledge, or attitudes. Outcome evaluation is often used to determine the effectiveness of an intervention or program in achieving its intended outcomes.
  • Impact Evaluation: This type of evaluation focuses on the broader impact of the program or project, such as changes in the community or society. Impact evaluation is often used to assess the overall impact of a program or project on the target population or community.
  • Cost-Benefit Evaluation: This type of evaluation focuses on the costs and benefits of the program or project, and is often used to determine whether the program or project is a worthwhile investment. Cost-benefit evaluation can help to determine whether the benefits of the program or project outweigh the costs.

The type of summative evaluation used will depend on the specific goals and objectives of the program or project being evaluated, as well as the resources and data available for evaluation. Each type of summative evaluation serves a specific purpose in assessing the overall effectiveness of a program or project, and should be tailored to the specific needs of the program or project being evaluated.

Best Practices for Conducting Summative Evaluation

Conducting a successful summative evaluation requires careful planning and attention to best practices. Here are some best practices for conducting summative evaluation:

  • Clearly Define Goals and Objectives: Before conducting a summative evaluation, it is important to clearly define the goals and objectives of the program or project being evaluated. This will help to ensure that the evaluation is focused and relevant to the needs of stakeholders.
  • Use Valid and Reliable Measures: The measures used in a summative evaluation should be valid and reliable, in order to ensure that the results are accurate and meaningful. This may involve selecting or developing appropriate evaluation tools, such as surveys or assessments, and ensuring that they are properly administered.
  • Collect Data from Multiple Sources: Data for a summative evaluation should be collected from multiple sources, in order to ensure that the results are comprehensive and representative. This may involve collecting data from program participants, stakeholders, and other relevant sources.
  • Analyze and Interpret Results: Once the data has been collected, it is important to analyze and interpret the results in order to determine the overall effectiveness of the program or project. This may involve using statistical analysis or other techniques to identify patterns or trends in the data.
  • Use Results to Inform Future Development: The results of a summative evaluation should be used to inform future program or project development, in order to improve the effectiveness of the program or project. This may involve making changes to program design or delivery, or identifying areas where additional resources or support may be needed.

Conducting a successful summative evaluation requires careful planning, attention to detail, and a commitment to using the results to inform future development and improvement. By following best practices for conducting summative evaluation, stakeholders can ensure that their programs and projects are effective and relevant to the needs of their communities.

Examples of Summative Evaluation in Practice

Summative evaluation is an important tool for assessing the overall effectiveness of a program or project. Here are some examples of summative evaluation in practice:

  • Educational Programs: A school district may conduct a summative evaluation of a new educational program, such as a reading intervention program. The evaluation may focus on the program’s outcomes, such as improvements in reading skills, and may involve collecting data from multiple sources, such as teacher assessments, student tests, and parent surveys.
  • Health Interventions: A public health agency may conduct a summative evaluation of a health intervention, such as a vaccination campaign. The evaluation may focus on the impact of the intervention on health outcomes, such as reductions in disease incidence, and may involve collecting data from multiple sources, such as healthcare providers, patients, and community members.
  • Social Service Programs: A non-profit organization may conduct a summative evaluation of a social service program, such as a job training program for disadvantaged youth. The evaluation may focus on the impact of the program on outcomes such as employment rates and job retention, and may involve collecting data from multiple sources, such as program participants, employers, and community partners.
  • Technology Products: A software company may conduct a summative evaluation of a new technology product, such as a mobile app. The evaluation may focus on user satisfaction and effectiveness, and may involve collecting data from multiple sources, such as user surveys, user testing, and usage data.
  • Environmental Programs: An environmental organization may conduct a summative evaluation of a conservation program, such as a land protection initiative. The evaluation may focus on the impact of the program on environmental outcomes, such as the protection of natural habitats or the reduction of greenhouse gas emissions, and may involve collecting data from multiple sources, such as program participants, community members, and scientific data.

Summative evaluation can be used in a wide range of programs and initiatives to assess their overall effectiveness and inform future development and improvement.

Examples of Summative Evaluation Questions

Summative evaluation is an important tool for assessing the overall effectiveness of a program or project. Here are some examples of summative evaluation questions that can be used to guide the evaluation process:

  • Did the program or project achieve its intended outcomes and goals?
  • To what extent did the program or project meet the needs of its intended audience or stakeholders?
  • What were the most effective components of the program or project, and what areas could be improved?
  • What impact did the program or project have on its intended audience or stakeholders?
  • Was the program or project implemented effectively, and were resources used efficiently?
  • What unintended consequences or challenges arose during the program or project, and how were they addressed?
  • How does the program or project compare to similar initiatives or interventions in terms of effectiveness and impact?
  • What were the costs and benefits of the program or project, and were they reasonable given the outcomes achieved?
  • What lessons can be learned from the program or project, and how can they inform future development and improvement?

The questions asked during a summative evaluation are designed to provide a comprehensive understanding of the impact and effectiveness of the program or project. The answers to these questions can inform future programming and resource allocation decisions and help to identify areas for improvement. Overall, summative evaluation is an essential tool for assessing the overall impact and effectiveness of a program or project.

Challenges and Limitations of Summative Evaluation

Summative evaluation is an important tool for assessing the overall effectiveness of a program or project. However, there are several challenges and limitations that should be considered when conducting summative evaluation. Here are some of the most common challenges and limitations of summative evaluation:

  • Timing: Summative evaluation is typically conducted at the end of a program or project, which may limit the ability to make real-time improvements during the implementation phase.
  • Resource Constraints: Summative evaluation can be resource-intensive, requiring significant time, effort, and funding to collect and analyze data.
  • Bias: The data collected during summative evaluation may be subject to bias, such as social desirability bias, which can affect the accuracy and reliability of the evaluation results.
  • Difficulty of Measurement: Some outcomes of a program or project may be difficult to measure, which can make it challenging to assess the overall effectiveness of the program or project.
  • Difficulty of Generalization: The results of a summative evaluation may not be generalizable to other contexts or settings, which can limit the broader applicability of the evaluation findings.
  • Limited Stakeholder Involvement: Summative evaluation may not involve all stakeholders, which can limit the representation of diverse perspectives and lead to incomplete evaluation findings.
  • Limited Focus on Process: Summative evaluation typically focuses on outcomes and impact, which may not provide a full understanding of the program or project’s implementation process and effectiveness.

These challenges and limitations of summative evaluation should be considered when planning and conducting evaluations. By understanding these limitations, evaluators can work to mitigate potential biases and limitations and ensure that the evaluation results are accurate, reliable, and useful for program or project improvement.

Ensuring Ethical Considerations in Summative Evaluation

While conducting summative evaluation, it’s imperative to uphold ethical principles to ensure the integrity and fairness of the evaluation process. Ethical considerations are essential for maintaining trust with stakeholders, respecting the rights of participants, and safeguarding the integrity of evaluation findings. Here are key ethical considerations to integrate into summative evaluation:

Informed Consent: Ensure that participants are fully informed about the purpose, procedures, risks, and benefits of the evaluation before consenting to participate. Provide clear and accessible information, allowing participants to make voluntary and informed decisions about their involvement.

Confidentiality and Privacy: Safeguard the confidentiality and privacy of participants’ information throughout the evaluation process. Implement secure data management practices, anonymize data whenever possible, and only share findings in aggregate or de-identified formats to protect participants’ identities.

Respect for Diversity and Inclusion: Respect and embrace the diversity of participants, acknowledging their unique perspectives, backgrounds, and experiences. Ensure that evaluation methods are culturally sensitive and inclusive, avoiding biases and stereotypes, and accommodating diverse communication styles and preferences.

Avoiding Harm: Take proactive measures to minimize the risk of harm to participants and stakeholders throughout the evaluation process. Anticipate potential risks and vulnerabilities, mitigate them through appropriate safeguards and protocols, and prioritize the well-being and dignity of all involved.

Beneficence and Non-Maleficence: Strive to maximize the benefits of the evaluation while minimizing any potential harm or adverse effects. Ensure that evaluation activities contribute to the improvement of programs or projects, enhance stakeholders’ understanding and decision-making, and do not cause undue stress, discomfort, or harm.

Transparency and Accountability: Maintain transparency and accountability in all aspects of the evaluation, including its design, implementation, analysis, and reporting. Clearly communicate the evaluation’s objectives, methodologies, findings, and limitations, allowing stakeholders to assess its credibility and relevance.

Equitable Participation and Representation: Foster equitable participation and representation of diverse stakeholders throughout the evaluation process. Engage stakeholders in meaningful ways, valuing their input, perspectives, and contributions, and address power differentials to ensure inclusive decision-making and ownership of evaluation outcomes.

Continuous Reflection and Improvement: Continuously reflect on ethical considerations throughout the evaluation process, remaining responsive to emerging issues, challenges, and ethical dilemmas. Seek feedback from stakeholders, engage in dialogue about ethical concerns, and adapt evaluation approaches accordingly to uphold ethical standards.

By integrating these ethical considerations into summative evaluation practices, evaluators can uphold principles of integrity, respect, fairness, and accountability, promoting trust, credibility, and meaningful impact in program assessment and improvement. Ethical evaluation practices not only ensure compliance with professional standards and legal requirements but also uphold fundamental values of respect for human dignity, justice, and social responsibility.

Future Directions for Summative Evaluation Research and Practice

Summative evaluation is an important tool for assessing the overall effectiveness of a program or project. Here are some potential future directions for summative evaluation research and practice:

  • Incorporating Technology: Advances in technology have the potential to improve the efficiency and accuracy of summative evaluation. Future research could explore the use of artificial intelligence, machine learning, and other technologies to streamline data collection and analysis.
  • Enhancing Stakeholder Engagement: Future research could explore ways to enhance stakeholder engagement in summative evaluation, such as by involving stakeholders in the evaluation planning and implementation process.
  • Increasing Use of Mixed Methods: Future research could explore the use of mixed methods approaches in summative evaluation, such as combining qualitative and quantitative methods to gain a more comprehensive understanding of program or project effectiveness.
  • Addressing Equity and Inclusion: Future research could focus on addressing issues of equity and inclusion in summative evaluation, such as by ensuring that evaluation methods are sensitive to the needs and experiences of diverse stakeholders.
  • Addressing Complexity: Many programs and projects operate in complex and dynamic environments. Future research could explore ways to address this complexity in summative evaluation, such as by developing more adaptive and flexible evaluation methods.
  • Improving Integration with Formative Evaluation: Summative evaluation is typically conducted after a program or project has been completed, while formative evaluation is conducted during program or project implementation. Future research could explore ways to better integrate summative and formative evaluation, in order to promote continuous program improvement.

These future directions for summative evaluation research and practice have the potential to improve the effectiveness and relevance of summative evaluation, and to enhance its value as a tool for program and project assessment and improvement.


Evaluative research: Key methods, types, and examples

In the last chapter, we learned what generative research means and how it prepares you to build an informed solution for users. Now, let’s look at evaluative research for design and user experience (UX).


What is evaluative research?

Evaluative research is a research method used to evaluate a product or concept and collect data to help improve your solution. It offers many benefits, including identifying whether a product works as intended and uncovering areas for improvement.

Also known as evaluation research or program evaluation, this kind of research is typically introduced in the early phases of the design process to test existing or new solutions. It continues to be employed in an iterative way until the product becomes ‘final’. “With evaluation research, we’re making sure the value is there so that effort and resources aren’t wasted,” explains Nannearl LeKesia Brown, Product Researcher at Figma.

According to Mithila Fox, Senior UX Researcher at Stack Overflow, the evaluation research process includes various activities, like content testing and assessing accessibility or desirability. During UX research, evaluation can also be conducted on competitor products to understand what solutions work well in the current market before you start building your own.

“Even before you have your own mockups, you can start by testing competitors or similar products,” says Mithila. “There’s a lot we can learn from what is and isn't working about other products in the market.”

However, evaluation research doesn’t stop when a new product is launched. For the best user experience, solutions need to be monitored after release and improved based on customer feedback.


Why is evaluative research important?

Evaluative research is crucial in UX design and research, providing insights to enhance user experiences, identify usability issues, and inform iterative design improvements. It helps you:

  • Refine and improve UX: Evaluative research allows you to test a solution and collect valuable feedback to refine and improve the user experience. For example, you can A/B test the copy on your site to maximize engagement with users.
  • Identify areas of improvement: Findings from evaluative research are key to assessing what works and what doesn't. You might, for instance, run usability testing to observe how users navigate your website and identify pain points or areas of confusion.
  • Align your ideas with users: Research should always be a part of the design and product development process. By allowing users to evaluate your product early and often, you'll know whether you're building the right solution for your audience.
  • Get buy-in: The insights you get from this type of research can demonstrate the effectiveness and impact of your project. Show this information to stakeholders to get buy-in for future projects.

Evaluative vs. Generative research

The difference between generative research and evaluative research lies in their focus: generative methods investigate user needs for new solutions, while evaluative research assesses and validates existing designs for improvements.

Generative and evaluative research are both valuable decision-making tools in the arsenal of a researcher. They should be similarly employed throughout the product development process as they both help you get the evidence you need.

When creating the research plan, study the competitive landscape, target audience, needs of the people you’re building for, and any existing solutions. Depending on what you need to find out, you’ll be able to determine if you should run generative or evaluative research.

Mithila explains the benefits of using both research methodologies: “Generative research helps us deeply understand our users and learn their needs, wants, and challenges. On the other hand, evaluative research helps us test whether the solutions we've come up with address those needs, wants, and challenges.”

Use generative research to bring forth new ideas during the discovery phase. And use evaluation research to test and monitor the product before and after launch.

The two types of evaluative research

There are two types of evaluative studies you can tap into: summative and formative research. Although summative evaluations are often quantitative, they can also be part of qualitative research.

Summative evaluation research

A summative evaluation helps you understand how a design performs overall. It’s usually done at the end of the design process to evaluate its usability or detect overlooked issues. You can also use a summative evaluation to benchmark your new solution against a prior one or a competitor’s, and understand whether the final product needs further improvement. Summative evaluation can be used for outcome-focused evaluation to assess impact and effectiveness for specific outcomes—for example, how design influences conversion.

Formative evaluation research

On the other hand, formative research is conducted early and often during the design process to test and improve a solution before arriving at the final design. Running a formative evaluation allows you to test and identify issues in the solutions as you’re creating them, and improve them based on user feedback.

TL;DR: Run formative research to test and evaluate solutions during the design process, and conduct a summative evaluation at the end to evaluate the final product.

Looking to conduct UX research? Check out our list of the top UX research tools to run an effective research study.

5 Key evaluative research methods

“Evaluation research can start as soon as you understand your user’s needs,” says Mithila. Here are five typical UX research methods to include in your evaluation research process:

User surveys

User surveys can provide valuable quantitative insights into user preferences, satisfaction levels, and attitudes toward a design or product. By gathering a large amount of data efficiently, surveys can identify trends, patterns, and user demographics to make informed decisions and prioritize design improvements.
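
As a small illustration of turning raw survey responses into the kind of quantitative summary described above, the sketch below computes a mean score and a top-2-box share from invented satisfaction ratings on a 1-to-5 scale.

```python
# Illustrative summary of 1-to-5 satisfaction ratings from a survey (invented data).
ratings = [5, 4, 4, 3, 5, 2, 4, 5, 3, 4, 5, 4, 1, 4, 5]

mean_score = sum(ratings) / len(ratings)
top_two_box = sum(1 for r in ratings if r >= 4) / len(ratings)  # share of 4s and 5s

print(f"Responses: {len(ratings)}")
print(f"Mean satisfaction: {mean_score:.2f} / 5")
print(f"Top-2-box score: {top_two_box:.0%}")
```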

Closed card sorting

Closed card sorting helps evaluate the effectiveness and intuitiveness of an existing or proposed navigation structure. By analyzing how participants group and categorize information, researchers can identify potential issues, inconsistencies, or gaps in the design's information architecture, leading to improved navigation and findability.

Tree testing

Tree testing, also known as reverse card sorting, is a research method used to evaluate the findability and effectiveness of information architecture. Participants are given a text-based representation of the website's navigation structure (without visual design elements) and are asked to locate specific items or perform specific tasks by navigating through the tree structure. This method helps identify potential issues such as confusing labels, unclear hierarchy, or navigation paths that hinder users' ability to find information.

Usability testing

Usability testing involves observing and collecting qualitative and/or quantitative data on how users interact with a design or product. Participants are given specific tasks to perform while their interactions, feedback, and difficulties are recorded. This approach helps identify usability issues, areas of confusion, or pain points in the user experience.

A/B testing

A/B testing, also known as split testing, is an evaluative research approach that involves comparing two or more versions of a design or feature to determine which one performs better in achieving a specific objective. Users are randomly assigned to different variants, and their interactions, behavior, or conversion rates are measured and analyzed. A/B testing allows researchers to make data-driven decisions by quantitatively assessing the impact of design changes on user behavior, engagement, or conversion metrics.
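
As a sketch of the arithmetic behind such a comparison, the example below runs a two-proportion z-test on invented conversion counts for two variants. The counts and the significance threshold are assumptions for illustration; in practice an experimentation platform or a statistics library would usually handle this step.

```python
# Illustrative A/B test: compare conversion rates of two variants (invented data).
import math

conversions_a, visitors_a = 480, 10_000   # variant A: 4.8% conversion
conversions_b, visitors_b = 600, 10_000   # variant B: 6.0% conversion

rate_a = conversions_a / visitors_a
rate_b = conversions_b / visitors_b

# Two-proportion z-test using the pooled conversion rate
pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
se = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
z = (rate_b - rate_a) / se
p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value

print(f"Variant A: {rate_a:.2%}   Variant B: {rate_b:.2%}")
print(f"z = {z:.2f}, two-sided p = {p_value:.4f}")
print("Statistically significant at the 5% level." if p_value < 0.05
      else "No significant difference detected.")
```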

This is the value of having a UX research plan before diving into the research approach itself. If we were able to answer the evaluative questions we had, in addition to figuring out if our hypotheses were valid (or not), I’d count that as a successful evaluation study. Ultimately, research is about learning in order to make more informed decisions—if we learned, we were successful.

Nannearl LeKesia Brown, Product Researcher at Figma


Evaluative research question examples

To gather valuable data and make better design decisions, you need to ask the right research questions. Here are some examples of evaluative research questions:

Usability questions

  • How would you go about performing [task]?
  • How was your experience completing [task]?
  • How did you find navigating to [X] page?
  • Based on the previous task, how would you prefer to do this action instead?

Get inspired by real-life usability test examples and discover more usability testing questions in our guide to usability testing.

Product survey questions

  • How often do you use the product/feature?
  • How satisfied are you with the product/feature?
  • Does the product/feature help you achieve your goals?
  • How easy is the product/feature to use?

Discover more examples of product survey questions in our article on product surveys.

Closed card sorting questions

  • Were there any categories you were unsure about?
  • Which categories were you unsure about?
  • Why were you unsure about the [X] category?

Find out more in our complete card sorting guide.

Evaluation research examples

Across UX design, research, and product testing, evaluative research can take several forms. Here are some ways you can conduct evaluative research:

Comparative usability testing

This example of evaluative research involves conducting usability tests with participants to compare the performance and user satisfaction of two or more competing design variations or prototypes.

You’ll gather qualitative and quantitative data on task completion rates, errors, user preferences, and feedback to identify the most effective design option. You can then use the insights gained from comparative usability testing to inform design decisions and prioritize improvements based on user-centered feedback.

Cognitive walkthroughs

Cognitive walkthroughs assess the usability and effectiveness of a design from a user's perspective.

You’ll have evaluators step through the design task by task from a user’s perspective to identify potential points of confusion, decision-making challenges, or errors. You can then gather insights on user expectations, mental models, and information processing to improve the clarity and intuitiveness of the design.

Diary studies

Conducting diary studies gives you insights into users' experiences and behaviors over an extended period of time.

You provide participants with diaries or digital tools to record their interactions, thoughts, frustrations, and successes related to a product or service. You can then analyze the collected data to identify usage patterns, uncover pain points, and understand the factors influencing the user experience.

In the next chapters, we'll learn more about quantitative and qualitative research, as well as the most common UX research methods . We’ll also share some practical applications of how UX researchers use these methods to conduct effective research.


Frequently asked questions

What is evaluative research?

Evaluative research, also known as evaluation research or program evaluation, is a type of research you can use to evaluate a product or concept and collect data that helps improve your solution.



Chapter 2. Research Design

Getting Started

When I teach undergraduates qualitative research methods, the final product of the course is a “research proposal” that incorporates all they have learned and enlists the knowledge they have learned about qualitative research methods in an original design that addresses a particular research question. I highly recommend you think about designing your own research study as you progress through this textbook. Even if you don’t have a study in mind yet, it can be a helpful exercise as you progress through the course. But how to start? How can one design a research study before they even know what research looks like? This chapter will serve as a brief overview of the research design process to orient you to what will be coming in later chapters. Think of it as a “skeleton” of what you will read in more detail in later chapters. Ideally, you will read this chapter both now (in sequence) and later during your reading of the remainder of the text. Do not worry if you have questions the first time you read this chapter. Many things will become clearer as the text advances and as you gain a deeper understanding of all the components of good qualitative research. This is just a preliminary map to get you on the right road.


Research Design Steps

Before you even get started, you will need to have a broad topic of interest in mind. [1] In my experience, students can confuse this broad topic with the actual research question, so it is important to clearly distinguish the two. And the place to start is the broad topic. It might be, as was the case with me, working-class college students. But what about working-class college students? What’s it like to be one? Why are there so few compared to others? How do colleges assist (or fail to assist) them? What interested me was something I could barely articulate at first and went something like this: “Why was it so difficult and lonely to be me?” And by extension, “Did others share this experience?”

Once you have a general topic, reflect on why this is important to you. Sometimes we connect with a topic and we don’t really know why. Even if you are not willing to share the real underlying reason you are interested in a topic, it is important that you know the deeper reasons that motivate you. Otherwise, it is quite possible that at some point during the research, you will find yourself turned around facing the wrong direction. I have seen it happen many times. The reason is that the research question is not the same thing as the general topic of interest, and if you don’t know the reasons for your interest, you are likely to design a study answering a research question that is beside the point—to you, at least. And this means you will be much less motivated to carry your research to completion.

Researcher Note

Why do you employ qualitative research methods in your area of study? What are the advantages of qualitative research methods for studying mentorship?

Qualitative research methods are a huge opportunity to increase access, equity, inclusion, and social justice. Qualitative research allows us to engage and examine the uniquenesses/nuances within minoritized and dominant identities and our experiences with these identities. Qualitative research allows us to explore a specific topic, and through that exploration, we can link history to experiences and look for patterns or offer up a unique phenomenon. There’s such beauty in being able to tell a particular story, and qualitative research is a great mode for that! For our work, we examined the relationships we typically use the term mentorship for but didn’t feel that was quite the right word. Qualitative research allowed us to pick apart what we did and how we engaged in our relationships, which then allowed us to more accurately describe what was unique about our mentorship relationships, which we ultimately named liberationships (McAloney and Long 2021). Qualitative research gave us the means to explore, process, and name our experiences; what a powerful tool!

How do you come up with ideas for what to study (and how to study it)? Where did you get the idea for studying mentorship?

Coming up with ideas for research, for me, is kind of like Googling a question I have, not finding enough information, and then deciding to dig a little deeper to get the answer. The idea to study mentorship actually came up in conversation with my mentorship triad. We were talking in one of our meetings about our relationship—kind of meta, huh? We discussed how we felt that mentorship was not quite the right term for the relationships we had built. One of us asked what was different about our relationships and mentorship. This all happened when I was taking an ethnography course. During the next session of class, we were discussing auto- and duoethnography, and it hit me—let’s explore our version of mentorship, which we later went on to name liberationships (McAloney and Long 2021). The idea and questions came out of being curious and wanting to find an answer. As I continue to research, I see opportunities in questions I have about my work or during conversations that, in our search for answers, end up exposing gaps in the literature. If I can’t find the answer already out there, I can study it.

—Kim McAloney, PhD, College Student Services Administration Ecampus coordinator and instructor

When you have a better idea of why you are interested in what it is that interests you, you may be surprised to learn that the obvious approaches to the topic are not the only ones. For example, let’s say you think you are interested in preserving coastal wildlife. And as a social scientist, you are interested in policies and practices that affect the long-term viability of coastal wildlife, especially around fishing communities. It would be natural then to consider designing a research study around fishing communities and how they manage their ecosystems. But when you really think about it, you realize that what interests you the most is how people whose livelihoods depend on a particular resource act in ways that deplete that resource. Or, even deeper, you contemplate the puzzle, “How do people justify actions that damage their surroundings?” Now, there are many ways to design a study that gets at that broader question, and not all of them are about fishing communities, although that is certainly one way to go. Maybe you could design an interview-based study that includes and compares loggers, fishers, and desert golfers (those who golf in arid lands that require a great deal of wasteful irrigation). Or design a case study around one particular example where resources were completely used up by a community. Without knowing what it is you are really interested in, what motivates your interest in a surface phenomenon, you are unlikely to come up with the appropriate research design.

These first stages of research design are often the most difficult, but have patience. Taking the time to consider why you are going to go through a lot of trouble to get answers will prevent a lot of wasted energy in the future.

There are distinct reasons for pursuing particular research questions, and it is helpful to distinguish between them.  First, you may be personally motivated.  This is probably the most important and the most often overlooked.   What is it about the social world that sparks your curiosity? What bothers you? What answers do you need in order to keep living? For me, I knew I needed to get a handle on what higher education was for before I kept going at it. I needed to understand why I felt so different from my peers and whether this whole “higher education” thing was “for the likes of me” before I could complete my degree. That is the personal motivation question. Your personal motivation might also be political in nature, in that you want to change the world in a particular way. It’s all right to acknowledge this. In fact, it is better to acknowledge it than to hide it.

There are also academic and professional motivations for a particular study.  If you are an absolute beginner, these may be difficult to find. We’ll talk more about this when we discuss reviewing the literature. Simply put, you are probably not the only person in the world to have thought about this question or issue and those related to it. So how does your interest area fit into what others have studied? Perhaps there is a good study out there of fishing communities, but no one has quite asked the “justification” question. You are motivated to address this to “fill the gap” in our collective knowledge. And maybe you are really not at all sure of what interests you, but you do know that [insert your topic] interests a lot of people, so you would like to work in this area too. You want to be involved in the academic conversation. That is a professional motivation and a very important one to articulate.

Practical and strategic motivations are a third kind. Perhaps you want to encourage people to take better care of the natural resources around them. If this is also part of your motivation, you will want to design your research project in a way that might have an impact on how people behave in the future. There are many ways to do this, one of which is using qualitative research methods rather than quantitative research methods, as the findings of qualitative research are often easier to communicate to a broader audience than the results of quantitative research. You might even be able to engage the community you are studying in the collecting and analyzing of data, something taboo in quantitative research but actively embraced and encouraged by qualitative researchers. But there are other practical reasons, such as getting “done” with your research in a certain amount of time or having access (or no access) to certain information. There is nothing wrong with considering constraints and opportunities when designing your study. Or maybe one of the practical or strategic goals is about learning competence in this area so that you can demonstrate the ability to conduct interviews and focus groups with future employers. Keeping that in mind will help shape your study and prevent you from getting sidetracked using a technique that you are less invested in learning about.

STOP HERE for a moment

I recommend you write a paragraph (at least) explaining your aims and goals. Include a sentence about each of the following: personal/political goals, professional/academic goals, and practical/strategic goals. Think through how all of the goals are related and can be achieved by this particular research study. If they can’t, have a rethink. Perhaps this is not the best way to go about it.

You will also want to be clear about the purpose of your study. “Wait, didn’t we just do this?” you might ask. No! Your goals are not the same as the purpose of the study, although they are related. You can think about purpose lying on a continuum from “theory” to “action” (figure 2.1). Sometimes you are doing research to discover new knowledge about the world, while other times you are doing a study because you want to measure an impact or make a difference in the world.

Figure 2.1. Purpose types: Basic Research, Applied Research, Summative Evaluation, Formative Evaluation, Action Research

Basic research involves research that is done for the sake of “pure” knowledge—that is, knowledge that, at least at this moment in time, may not have any apparent use or application. Often, and this is very important, knowledge of this kind is later found to be extremely helpful in solving problems. So one way of thinking about basic research is that it is knowledge for which no use is yet known but will probably one day prove to be extremely useful. If you are doing basic research, you do not need to argue its usefulness, as the whole point is that we just don’t know yet what this might be.

Researchers engaged in basic research want to understand how the world operates. They are interested in investigating a phenomenon to get at the nature of reality with regard to that phenomenon. The basic researcher’s purpose is to understand and explain ( Patton 2002:215 ).

Basic research is interested in generating and testing hypotheses about how the world works. Grounded Theory is one approach to qualitative research methods that exemplifies basic research (see chapter 4). Most academic journal articles publish basic research findings. If you are working in academia (e.g., writing your dissertation), the default expectation is that you are conducting basic research.

Applied research in the social sciences is research that addresses human and social problems. Unlike basic research, the researcher has expectations that the research will help contribute to resolving a problem, if only by identifying its contours, history, or context. From my experience, most students have this as their baseline assumption about research. Why do a study if not to make things better? But this is a common mistake. Students and their committee members are often working with default assumptions here—the former thinking about applied research as their purpose, the latter thinking about basic research: “The purpose of applied research is to contribute knowledge that will help people to understand the nature of a problem in order to intervene, thereby allowing human beings to more effectively control their environment. While in basic research the source of questions is the tradition within a scholarly discipline, in applied research the source of questions is in the problems and concerns experienced by people and by policymakers” ( Patton 2002:217 ).

Applied research is less geared toward theory in two ways. First, its questions do not derive from previous literature. For this reason, applied research studies have much more limited literature reviews than those found in basic research (although they make up for this by having much more “background” about the problem). Second, it does not generate theory in the same way as basic research does. The findings of an applied research project may not be generalizable beyond the boundaries of this particular problem or context. The findings are more limited. They are useful now but may be less useful later. This is why basic research remains the default “gold standard” of academic research.

Evaluation research is research that is designed to evaluate or test the effectiveness of specific solutions and programs addressing specific social problems. We already know the problems, and someone has already come up with solutions. There might be a program, say, for first-generation college students on your campus. Does this program work? Are first-generation students who participate in the program more likely to graduate than those who do not? These are the types of questions addressed by evaluation research. There are two types of research within this broader frame, one more action-oriented than the other. In summative evaluation, an overall judgment about the effectiveness of a program or policy is made. Should we continue our first-gen program? Is it a good model for other campuses? Because the purpose of such summative evaluation is to measure success and to determine whether this success is scalable (capable of being generalized beyond the specific case), quantitative data is more often used than qualitative data. In our example, we might have "outcomes" data for thousands of students, and we might run various tests to determine if the better outcomes of those in the program are statistically significant so that we can generalize the findings and recommend similar programs elsewhere. Qualitative data in the form of focus groups or interviews can then be used for illustrative purposes, providing more depth to the quantitative analyses. In contrast, formative evaluation attempts to improve a program or policy (to help "form" or shape its effectiveness). Formative evaluations rely more heavily on qualitative data—case studies, interviews, focus groups. The findings are meant not to generalize beyond the particular but to improve this program. If you are a student seeking to improve your qualitative research skills and you do not care about generating basic research, formative evaluation studies might be an attractive option for you to pursue, as there are always local programs that need evaluation and suggestions for improvement. Again, be very clear about your purpose when talking through your research proposal with your committee.
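To make the statistical step above concrete, here is a minimal sketch, with invented numbers and written only against the Python standard library, of the kind of test a summative evaluator might run: comparing graduation rates for students in a hypothetical first-generation program against comparable non-participants using a two-proportion z-test. It illustrates the logic only; a real evaluation would use a proper statistical package and a carefully constructed comparison group.

```python
# Hedged sketch: two-proportion z-test for a summative "does the program work?" question.
# All numbers below are hypothetical illustrations, not data from any real program.
from math import sqrt, erfc

def two_proportion_ztest(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for a difference between two independent proportions."""
    p_a, p_b = successes_a / n_a, successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)      # pooled proportion under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))  # standard error under H0
    z = (p_a - p_b) / se
    p_value = erfc(abs(z) / sqrt(2))                        # two-sided p-value from the normal CDF
    return z, p_value

# Hypothetical outcomes data: 410 of 520 program participants graduated,
# versus 1,450 of 2,080 comparable non-participants.
z, p = two_proportion_ztest(410, 520, 1450, 2080)
print(f"z = {z:.2f}, p = {p:.4f}")  # a small p suggests the difference is unlikely to be chance alone
```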

Action research takes a further step beyond evaluation, even formative evaluation, to being part of the solution itself. This is about as far from basic research as one could get and definitely falls beyond the scope of “science,” as conventionally defined. The distinction between action and research is blurry, the research methods are often in constant flux, and the only “findings” are specific to the problem or case at hand and often are findings about the process of intervention itself. Rather than evaluate a program as a whole, action research often seeks to change and improve some particular aspect that may not be working—maybe there is not enough diversity in an organization or maybe women’s voices are muted during meetings and the organization wonders why and would like to change this. In a further step, participatory action research , those women would become part of the research team, attempting to amplify their voices in the organization through participation in the action research. As action research employs methods that involve people in the process, focus groups are quite common.

If you are working on a thesis or dissertation, chances are your committee will expect you to be contributing to fundamental knowledge and theory ( basic research ). If your interests lie more toward the action end of the continuum, however, it is helpful to talk to your committee about this before you get started. Knowing your purpose in advance will help avoid misunderstandings during the later stages of the research process!

The Research Question

Once you have written your paragraph and clarified your purpose and truly know that this study is the best study for you to be doing right now , you are ready to write and refine your actual research question. Know that research questions are often moving targets in qualitative research, that they can be refined up to the very end of data collection and analysis. But you do have to have a working research question at all stages. This is your “anchor” when you get lost in the data. What are you addressing? What are you looking at and why? Your research question guides you through the thicket. It is common to have a whole host of questions about a phenomenon or case, both at the outset and throughout the study, but you should be able to pare it down to no more than two or three sentences when asked. These sentences should both clarify the intent of the research and explain why this is an important question to answer. More on refining your research question can be found in chapter 4.

Chances are, you will have already done some prior reading before coming up with your interest and your questions, but you may not have conducted a systematic literature review. This is the next crucial stage to be completed before venturing further. You don’t want to start collecting data and then realize that someone has already beaten you to the punch. A review of the literature that is already out there will let you know (1) if others have already done the study you are envisioning; (2) if others have done similar studies, which can help you out; and (3) what ideas or concepts are out there that can help you frame your study and make sense of your findings. More on literature reviews can be found in chapter 9.

In addition to reviewing the literature for similar studies to what you are proposing, it can be extremely helpful to find a study that inspires you. This may have absolutely nothing to do with the topic you are interested in but is written so beautifully or organized so interestingly or otherwise speaks to you in such a way that you want to post it somewhere to remind you of what you want to be doing. You might not understand this in the early stages—why would you find a study that has nothing to do with the one you are doing helpful? But trust me, when you are deep into analysis and writing, having an inspirational model in view can help you push through. If you are motivated to do something that might change the world, you probably have read something somewhere that inspired you. Go back to that original inspiration and read it carefully and see how they managed to convey the passion that you so appreciate.

At this stage, you are still just getting started. There are a lot of things to do before setting forth to collect data! You’ll want to consider and choose a research tradition and a set of data-collection techniques that both help you answer your research question and match all your aims and goals. For example, if you really want to help migrant workers speak for themselves, you might draw on feminist theory and participatory action research models. Chapters 3 and 4 will provide you with more information on epistemologies and approaches.

Next, you have to clarify your “units of analysis.” What is the level at which you are focusing your study? Often, the unit in qualitative research methods is individual people, or “human subjects.” But your units of analysis could just as well be organizations (colleges, hospitals) or programs or even whole nations. Think about what it is you want to be saying at the end of your study—are the insights you are hoping to make about people or about organizations or about something else entirely? A unit of analysis can even be a historical period! Every unit of analysis will call for a different kind of data collection and analysis and will produce different kinds of “findings” at the conclusion of your study. [2]

Regardless of what unit of analysis you select, you will probably have to consider the “human subjects” involved in your research. [3] Who are they? What interactions will you have with them—that is, what kind of data will you be collecting? Before answering these questions, define your population of interest and your research setting. Use your research question to help guide you.

Let’s use an example from a real study. In Geographies of Campus Inequality , Benson and Lee ( 2020 ) list three related research questions: “(1) What are the different ways that first-generation students organize their social, extracurricular, and academic activities at selective and highly selective colleges? (2) how do first-generation students sort themselves and get sorted into these different types of campus lives; and (3) how do these different patterns of campus engagement prepare first-generation students for their post-college lives?” (3).

Note that we are jumping into this a bit late, after Benson and Lee have described previous studies (the literature review) and what is known about first-generation college students and what is not known. They want to know about differences within this group, and they are interested in ones attending certain kinds of colleges because those colleges will be sites where academic and extracurricular pressures compete. That is the context for their three related research questions. What is the population of interest here? First-generation college students . What is the research setting? Selective and highly selective colleges . But a host of questions remain. Which students in the real world, which colleges? What about gender, race, and other identity markers? Will the students be asked questions? Are the students still in college, or will they be asked about what college was like for them? Will they be observed? Will they be shadowed? Will they be surveyed? Will they be asked to keep diaries of their time in college? How many students? How many colleges? For how long will they be observed?

Recommendation

Take a moment and write down suggestions for Benson and Lee before continuing on to what they actually did.

Have you written down your own suggestions? Good. Now let’s compare those with what they actually did. Benson and Lee drew on two sources of data: in-depth interviews with sixty-four first-generation students and survey data from a preexisting national survey of students at twenty-eight selective colleges. Let’s ignore the survey for our purposes here and focus on those interviews. The interviews were conducted between 2014 and 2016 at a single selective college, “Hilltop” (a pseudonym ). They employed a “purposive” sampling strategy to ensure an equal number of male-identifying and female-identifying students as well as equal numbers of White, Black, and Latinx students. Each student was interviewed once. Hilltop is a selective liberal arts college in the northeast that enrolls about three thousand students.

How did your suggestions match up to those actually used by the researchers in this study? Is it possible your suggestions were too ambitious? Beginning qualitative researchers often make that mistake. You want a research design that is both effective (it matches your question and goals) and doable. You will never be able to collect data from your entire population of interest (unless your research question is really so narrow as to be relevant to very few people!), so you will need to come up with a good sample. Define the criteria for this sample, as Benson and Lee did when deciding to interview an equal number of students by gender and race categories. Define the criteria for your sample setting too. Hilltop is typical of selective colleges. That was a research choice made by Benson and Lee. For more on sampling and sampling choices, see chapter 5.
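As an illustration of turning sampling criteria into a concrete recruitment plan, here is a small hypothetical Python sketch. It is not Benson and Lee's procedure; the volunteer pool, the category labels, and the one-per-cell quota are invented, but the idea of drawing equal numbers from each gender-by-race cell mirrors the purposive strategy described above.

```python
# Hedged sketch of a purposive, equal-cells draw from a hypothetical volunteer pool.
import random
from collections import defaultdict

candidates = [
    # (student_id, gender, race) -- invented recruitment records
    ("s001", "female", "White"), ("s002", "male", "Black"),
    ("s003", "female", "Latinx"), ("s004", "male", "White"),
    ("s005", "female", "Black"), ("s006", "male", "Latinx"),
    # ... many more rows in a real recruitment spreadsheet
]

def purposive_sample(pool, per_cell, seed=42):
    """Return up to `per_cell` students from every gender-by-race cell."""
    rng = random.Random(seed)                  # fixed seed so the draw is reproducible
    cells = defaultdict(list)
    for student_id, gender, race in pool:
        cells[(gender, race)].append(student_id)
    sample = {}
    for cell, ids in cells.items():
        rng.shuffle(ids)
        sample[cell] = ids[:per_cell]          # short cells flag where more recruiting is needed
    return sample

print(purposive_sample(candidates, per_cell=1))
```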

Benson and Lee chose to employ interviews. If you also would like to include interviews, you have to think about what will be asked in them. Most interview-based research involves an interview guide, a set of questions or question areas that will be asked of each participant. The research question helps you create a relevant interview guide. You want to ask questions whose answers will provide insight into your research question. Again, your research question is the anchor you will continually come back to as you plan for and conduct your study. It may be that once you begin interviewing, you find that people are telling you something totally unexpected, and this makes you rethink your research question. That is fine. Then you have a new anchor. But you always have an anchor. More on interviewing can be found in chapter 11.

Let’s imagine Benson and Lee also observed college students as they went about doing the things college students do, both in the classroom and in the clubs and social activities in which they participate. They would have needed a plan for this. Would they sit in on classes? Which ones and how many? Would they attend club meetings and sports events? Which ones and how many? Would they participate themselves? How would they record their observations? More on observation techniques can be found in both chapters 13 and 14.

At this point, the design is almost complete. You know why you are doing this study, you have a clear research question to guide you, you have identified your population of interest and research setting, and you have a reasonable sample of each. You also have put together a plan for data collection, which might include drafting an interview guide or making plans for observations. And so you know exactly what you will be doing for the next several months (or years!). To put the project into action, there are a few more things necessary before actually going into the field.

First, you will need to make sure you have any necessary supplies, including recording technology. These days, many researchers use their phones to record interviews. Second, you will need to draft a few documents for your participants. These include informed consent forms and recruiting materials, such as posters or email texts, that explain what this study is in clear language. Third, you will draft a research protocol to submit to your institutional review board (IRB) ; this research protocol will include the interview guide (if you are using one), the consent form template, and all examples of recruiting material. Depending on your institution and the details of your study design, it may take weeks or even, in some unfortunate cases, months before you secure IRB approval. Make sure you plan on this time in your project timeline. While you wait, you can continue to review the literature and possibly begin drafting a section on the literature review for your eventual presentation/publication. More on IRB procedures can be found in chapter 8 and more general ethical considerations in chapter 7.

Once you have approval, you can begin!

Research Design Checklist

Before data collection begins, do the following:

  • Write a paragraph explaining your aims and goals (personal/political, practical/strategic, professional/academic).
  • Define your research question; write two to three sentences that clarify the intent of the research and why this is an important question to answer.
  • Review the literature for similar studies that address your research question or similar research questions; think laterally about some literature that might be helpful or illuminating but is not exactly about the same topic.
  • Find a written study that inspires you—it may or may not be on the research question you have chosen.
  • Consider and choose a research tradition and set of data-collection techniques that (1) help answer your research question and (2) match your aims and goals.
  • Define your population of interest and your research setting.
  • Define the criteria for your sample (How many? Why these? How will you find them, gain access, and acquire consent?).
  • If you are conducting interviews, draft an interview guide.
  •  If you are making observations, create a plan for observations (sites, times, recording, access).
  • Acquire any necessary technology (recording devices/software).
  • Draft consent forms that clearly identify the research focus and selection process.
  • Create recruiting materials (posters, email, texts).
  • Apply for IRB approval (proposal plus consent form plus recruiting materials).
  • Block out time for collecting data.
Notes

1. At the end of the chapter, you will find a "Research Design Checklist" that summarizes the main recommendations made here.
2. For example, if your focus is society and culture, you might collect data through observation or a case study. If your focus is individual lived experience, you are probably going to be interviewing some people. And if your focus is language and communication, you will probably be analyzing text (written or visual) (Marshall and Rossman 2016:16).
3. You may not have any "live" human subjects. There are qualitative research methods that do not require interactions with live human beings - see chapter 16, "Archival and Historical Sources." But for the most part, you are probably reading this textbook because you are interested in doing research with people. The rest of the chapter will assume this is the case.

One of the primary methodological traditions of inquiry in qualitative research, ethnography is the study of a group or group culture, largely through observational fieldwork supplemented by interviews. It is a form of fieldwork that may include participant-observation data collection. See chapter 14 for a discussion of deep ethnography. 

A methodological tradition of inquiry and research design that focuses on an individual case (e.g., setting, institution, or sometimes an individual) in order to explore its complexity, history, and interactive parts.  As an approach, it is particularly useful for obtaining a deep appreciation of an issue, event, or phenomenon of interest in its particular context.

The controlling force in research; can be understood as lying on a continuum from basic research (knowledge production) to action research (effecting change).

In its most basic sense, a theory is a story we tell about how the world works that can be tested with empirical evidence.  In qualitative research, we use the term in a variety of ways, many of which are different from how they are used by quantitative researchers.  Although some qualitative research can be described as “testing theory,” it is more common to “build theory” from the data using inductive reasoning , as done in Grounded Theory .  There are so-called “grand theories” that seek to integrate a whole series of findings and stories into an overarching paradigm about how the world works, and much smaller theories or concepts about particular processes and relationships.  Theory can even be used to explain particular methodological perspectives or approaches, as in Institutional Ethnography , which is both a way of doing research and a theory about how the world works.

Research that is interested in generating and testing hypotheses about how the world works.

A methodological tradition of inquiry and approach to analyzing qualitative data in which theories emerge from a rigorous and systematic process of induction.  This approach was pioneered by the sociologists Glaser and Strauss (1967).  The elements of theory generated from comparative analysis of data are, first, conceptual categories and their properties and, second, hypotheses or generalized relations among the categories and their properties – “The constant comparing of many groups draws the [researcher’s] attention to their many similarities and differences.  Considering these leads [the researcher] to generate abstract categories and their properties, which, since they emerge from the data, will clearly be important to a theory explaining the kind of behavior under observation.” (36).

An approach to research that is “multimethod in focus, involving an interpretative, naturalistic approach to its subject matter.  This means that qualitative researchers study things in their natural settings, attempting to make sense of, or interpret, phenomena in terms of the meanings people bring to them.  Qualitative research involves the studied use and collection of a variety of empirical materials – case study, personal experience, introspective, life story, interview, observational, historical, interactional, and visual texts – that describe routine and problematic moments and meanings in individuals’ lives." ( Denzin and Lincoln 2005:2 ). Contrast with quantitative research .

Research that contributes knowledge that will help people to understand the nature of a problem in order to intervene, thereby allowing human beings to more effectively control their environment.

Research that is designed to evaluate or test the effectiveness of specific solutions and programs addressing specific social problems.  There are two kinds: summative and formative .

Research in which an overall judgment about the effectiveness of a program or policy is made, often for the purpose of generalizing to other cases or programs.  Generally uses qualitative research as a supplement to primary quantitative data analyses.  Contrast formative evaluation research .

Research designed to improve a program or policy (to help "form" or shape its effectiveness); relies heavily on qualitative research methods.  Contrast summative evaluation research.

Research carried out at a particular organizational or community site with the intention of affecting change; often involves research subjects as participants of the study.  See also participatory action research .

Research in which both researchers and participants work together to understand a problematic situation and change it for the better.

The level of the focus of analysis (e.g., individual people, organizations, programs, neighborhoods).

The large group of interest to the researcher.  Although it will likely be impossible to design a study that incorporates or reaches all members of the population of interest, this should be clearly defined at the outset of a study so that a reasonable sample of the population can be taken.  For example, if one is studying working-class college students, the sample may include twenty such students attending a particular college, while the population is “working-class college students.”  In quantitative research, clearly defining the general population of interest is a necessary step in generalizing results from a sample.  In qualitative research, defining the population is conceptually important for clarity.

A fictional name assigned to give anonymity to a person, group, or place.  Pseudonyms are important ways of protecting the identity of research participants while still providing a “human element” in the presentation of qualitative data.  There are ethical considerations to be made in selecting pseudonyms; some researchers allow research participants to choose their own.

A requirement for research involving human participants; the documentation of informed consent.  In some cases, oral consent or assent may be sufficient, but the default standard is a single-page easy-to-understand form that both the researcher and the participant sign and date.   Under federal guidelines, all researchers "shall seek such consent only under circumstances that provide the prospective subject or the representative sufficient opportunity to consider whether or not to participate and that minimize the possibility of coercion or undue influence. The information that is given to the subject or the representative shall be in language understandable to the subject or the representative.  No informed consent, whether oral or written, may include any exculpatory language through which the subject or the representative is made to waive or appear to waive any of the subject's rights or releases or appears to release the investigator, the sponsor, the institution, or its agents from liability for negligence" (21 CFR 50.20).  Your IRB office will be able to provide a template for use in your study .

An administrative body established to protect the rights and welfare of human research subjects recruited to participate in research activities conducted under the auspices of the institution with which it is affiliated. The IRB is charged with the responsibility of reviewing all research involving human participants. The IRB is concerned with protecting the welfare, rights, and privacy of human subjects. The IRB has the authority to approve, disapprove, monitor, and require modifications in all research activities that fall within its jurisdiction as specified by both the federal regulations and institutional policy.

Introduction to Qualitative Research Methods Copyright © 2023 by Allison Hurst is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License , except where otherwise noted.

Evaluation Research Design: Examples, Methods & Types

busayo.longe

As you engage in tasks, you will need to take intermittent breaks to determine how much progress has been made and if any changes need to be effected along the way. This is very similar to what organizations do when they carry out  evaluation research.  

The evaluation research methodology has become one of the most important approaches for organizations as they strive to create products, services, and processes that speak to the needs of target users. In this article, we will show you how your organization can conduct successful evaluation research using Formplus .

What is Evaluation Research?

Also known as program evaluation, evaluation research is a common research design that entails carrying out a structured assessment of the value of resources committed to a project or specific goal. It often adopts social research methods to gather and analyze useful information about organizational processes and products.  

As a type of applied research , evaluation research is typically associated with real-life scenarios within organizational contexts. This means that the researcher will need to leverage common workplace skills, including interpersonal skills and teamwork, to arrive at objective research findings that will be useful to stakeholders. 

Characteristics of Evaluation Research

  • Research Environment: Evaluation research is conducted in the real world; that is, within the context of an organization. 
  • Research Focus: Evaluation research is primarily concerned with measuring the outcomes of a process rather than the process itself. 
  • Research Outcome: Evaluation research is employed for strategic decision making in organizations. 
  • Research Goal: The goal of program evaluation is to determine whether a process has yielded the desired result(s). 
  • This type of research protects the interests of stakeholders in the organization. 
  • It often represents a middle-ground between pure and applied research. 
  • Evaluation research is both detailed and continuous. It pays attention to performative processes rather than descriptions. 
  • Research Process: This research design utilizes qualitative and quantitative research methods to gather relevant data about a product or action-based strategy. These methods include observation, tests, and surveys.

Types of Evaluation Research

The Encyclopedia of Evaluation (Mathison, 2004) treats forty-two different evaluation approaches and models ranging from “appreciative inquiry” to “connoisseurship” to “transformative evaluation”. Common types of evaluation research include the following: 

  • Formative Evaluation

Formative evaluation or baseline survey is a type of evaluation research that involves assessing the needs of the users or target market before embarking on a project.  Formative evaluation is the starting point of evaluation research because it sets the tone of the organization’s project and provides useful insights for other types of evaluation.  

  • Mid-term Evaluation

Mid-term evaluation entails assessing how far a project has come and determining if it is in line with the set goals and objectives. Mid-term reviews allow the organization to determine if a change or modification of the implementation strategy is necessary, and they also serve to track the project's progress. 

  • Summative Evaluation

This type of evaluation is also known as end-term evaluation or project-completion evaluation, and it is conducted immediately after the completion of a project. Here, the researcher examines the value and outputs of the program within the context of the projected results. 

Summative evaluation allows the organization to measure the degree of success of a project. Such results can be shared with stakeholders, target markets, and prospective investors. 

  • Outcome Evaluation

Outcome evaluation is primarily target-audience oriented because it measures the effects of the project, program, or product on the users. This type of evaluation views the outcomes of the project through the lens of the target audience and it often measures changes such as knowledge-improvement, skill acquisition, and increased job efficiency. 

  • Appreciative Enquiry

Appreciative inquiry is a type of evaluation research that pays attention to result-producing approaches. It is predicated on the belief that an organization will grow in whatever direction its stakeholders pay primary attention to, such that if all the attention is focused on problems, identifying them would be easy. 

In carrying out appreciative inquiry, the researcher identifies the factors directly responsible for the positive results realized in the course of a project, analyzes the reasons for these results, and intensifies the utilization of these factors. 

Evaluation Research Methodology 

There are four major evaluation research methods, namely: output measurement, input measurement, impact assessment, and service quality.

  • Output/Performance Measurement

Output measurement is a method employed in evaluative research that shows the results of an activity undertaken by an organization. In other words, performance measurement pays attention to the results achieved by the resources invested in a specific activity or organizational process. 

More than investing resources in a project, organizations must be able to track the extent to which these resources have yielded results, and this is where performance measurement comes in. Output measurement allows organizations to pay attention to the effectiveness and impact of a process rather than just the process itself. 

Other key indicators of performance measurement include user-satisfaction, organizational capacity, market penetration, and facility utilization. In carrying out performance measurement, organizations must identify the parameters that are relevant to the process in question, their industry, and the target markets. 

5 Performance Evaluation Research Questions Examples

  • What is the cost-effectiveness of this project?
  • What is the overall reach of this project?
  • How would you rate the market penetration of this project?
  • How accessible is the project? 
  • Is this project time-efficient? 


  • Input Measurement

In evaluation research, input measurement entails assessing the resources committed to a project or goal in any organization. This is one of the most common indicators in evaluation research because it allows organizations to track their investments. 

The most common indicator of input measurement is the budget, which allows organizations to evaluate and limit expenditure for a project. It is also important to measure non-monetary investments, such as human capital (the number of people needed for successful project execution) and production capital. 

5 Input Evaluation Research Questions Examples

  • What is the budget for this project?
  • What is the timeline of this process?
  • How many employees have been assigned to this project? 
  • Do we need to purchase new machinery for this project? 
  • How many third-parties are collaborators in this project? 


  • Impact/Outcomes Assessment

In impact assessment, the evaluation researcher focuses on how the product or project affects target markets, both directly and indirectly. Outcomes assessment is somewhat challenging because many times, it is difficult to measure the real-time value and benefits of a project for the users. 

In assessing the impact of a process, the evaluation researcher must pay attention to the improvement recorded by the users as a result of the process or project in question. Hence, it makes sense to focus on cognitive and affective changes, expectation-satisfaction, and similar accomplishments of the users. 

5 Impact Evaluation Research Questions Examples

  • How has this project affected you? 
  • Has this process affected you positively or negatively?
  • What role did this project play in improving your earning power? 
  • On a scale of 1-10, how excited are you about this project?
  • How has this project improved your mental health? 


  • Service Quality

Service quality is the evaluation research method that accounts for any differences between the expectations of the target markets and their impression of the undertaken project. Hence, it pays attention to the overall service quality assessment carried out by the users. 

It is not uncommon for organizations to build the expectations of target markets as they embark on specific projects. Service quality evaluation allows these organizations to track the extent to which the actual product or service delivery fulfils the expectations. 

5 Service Quality Evaluation Questions

  • On a scale of 1-10, how satisfied are you with the product?
  • How helpful was our customer service representative?
  • How satisfied are you with the quality of service?
  • How long did it take to resolve the issue at hand?
  • How likely are you to recommend us to your network?


Uses of Evaluation Research 

  • Evaluation research is used by organizations to measure the effectiveness of activities and identify areas needing improvement. Findings from evaluation research are key to project and product advancements and are very influential in helping organizations realize their goals efficiently.     
  • The findings arrived at from evaluation research serve as evidence of the impact of the project embarked on by an organization. This information can be presented to stakeholders, customers, and can also help your organization secure investments for future projects. 
  • Evaluation research helps organizations to justify their use of limited resources and choose the best alternatives. 
  •  It is also useful in pragmatic goal setting and realization. 
  • Evaluation research provides detailed insights into projects embarked on by an organization. Essentially, it allows all stakeholders to understand multiple dimensions of a process, and to determine strengths and weaknesses. 
  • Evaluation research also plays a major role in helping organizations to improve their overall practice and service delivery. This research design allows organizations to weigh existing processes through feedback provided by stakeholders, and this informs better decision making. 
  • Evaluation research is also instrumental to sustainable capacity building. It helps you to analyze demand patterns and determine whether your organization requires more funds, upskilling or improved operations.

Data Collection Techniques Used in Evaluation Research

In gathering useful data for evaluation research, the researcher often combines quantitative and qualitative research methods . Qualitative research methods allow the researcher to gather information relating to intangible values such as market satisfaction and perception. 

On the other hand, quantitative methods are used by the evaluation researcher to assess numerical patterns, that is, quantifiable data. These methods help you measure impact and results; although they may not serve for understanding the context of the process. 

Quantitative Methods for Evaluation Research

  • Surveys

A survey is a quantitative method that allows you to gather information about a project from a specific group of people. Surveys are largely context-based and limited to target groups who are asked a set of structured questions in line with the predetermined context.

Surveys usually consist of close-ended questions that allow the evaluative researcher to gain insight into several  variables including market coverage and customer preferences. Surveys can be carried out physically using paper forms or online through data-gathering platforms like Formplus . 

  • Questionnaires

A questionnaire is a common quantitative research instrument deployed in evaluation research. Typically, it is an aggregation of different types of questions or prompts which help the researcher to obtain valuable information from respondents. 

  • Polls

A poll is a common method of opinion-sampling that allows you to weigh the perception of the public about issues that affect them. The best way to achieve accuracy in polling is by conducting them online using platforms like Formplus. 

Polls are often structured as Likert questions and the options provided always account for neutrality or indecision. Conducting a poll allows the evaluation researcher to understand the extent to which the product or service satisfies the needs of the users. 

Qualitative Methods for Evaluation Research

  • One-on-One Interview

An interview is a structured conversation involving two participants; usually the researcher and the user or a member of the target market. One-on-One interviews can be conducted physically, via the telephone and through video conferencing apps like Zoom and Google Meet. 

  • Focus Groups

A focus group is a research method that involves interacting with a limited number of persons within your target market, who can provide insights on market perceptions and new products. 

  • Qualitative Observation

Qualitative observation is a research method that allows the evaluation researcher to gather useful information from the target audience through a variety of subjective approaches. This method is more extensive than quantitative observation because it deals with a smaller sample size, and it also utilizes inductive analysis. 

  • Case Studies

A case study is a research method that helps the researcher to gain a better understanding of a subject or process. Case studies involve in-depth research into a given subject, to understand its functionalities and successes. 

How to Use the Formplus Online Form Builder for an Evaluation Survey 

  • Sign into Formplus

In the Formplus builder, you can easily create your evaluation survey by dragging and dropping preferred fields into your form. To access the Formplus builder, you will need to create an account on Formplus. 

Once you do this, sign in to your account and click on “Create Form ” to begin. 


  • Edit Form Title

Click on the field provided to input your form title, for example, “Evaluation Research Survey”.


Click on the edit button to edit the form.

Add Fields: Drag and drop preferred form fields into your form in the Formplus builder inputs column. There are several field input options for surveys in the Formplus builder. 


Edit fields

Click on “Save”

Preview form.

  • Form Customization

With the form customization options in the form builder, you can easily change the outlook of your form and make it more unique and personalized. Formplus allows you to change your form theme, add background images, and even change the font according to your needs. 


  • Multiple Sharing Options

Formplus offers multiple form-sharing options, which enable you to easily share your evaluation survey with survey respondents. You can use the direct social media sharing buttons to share your form link to your organization's social media pages. 

You can send out your survey form as email invitations to your research subjects too. If you wish, you can share your form’s QR code or embed it on your organization’s website for easy access. 

Conclusion  

Conducting evaluation research allows organizations to determine the effectiveness of their activities at different phases. This type of research can be carried out using qualitative and quantitative data collection methods including focus groups, observation, telephone and one-on-one interviews, and surveys. 

Online surveys created and administered via data collection platforms like Formplus make it easier for you to gather and process information during evaluation research. With Formplus multiple form sharing options, it is even easier for you to gather useful data from target markets.


Evaluative research: Methods, types, and examples (2024)

Master evaluative research with our guide, offering a detailed look at methods, types, and real-life examples for a complete understanding.

Product owners and user researchers often grapple with the challenge of gauging the success and impact of their products. 

The struggle lies in understanding what methods and types of evaluative research can provide meaningful insights. 

Empathy is crucial in this process, as identifying user needs and preferences requires a deep understanding of their experiences. 

In this article, we present a concise guide to evaluative research, offering practical methods, highlighting various types, and providing real-world examples. 

By delving into the realm of evaluative research, product owners and user researchers can navigate the complexities of product assessment with clarity and effectiveness.

What is evaluative research?

Evaluative research assesses the effectiveness and usability of products or services. It involves gathering user feedback to measure performance and identify areas for improvement. 

Product owners and user researchers employ evaluative research to make informed decisions. Users' experiences and preferences are actively observed and analyzed to enhance the overall quality of a product. 


This research method aids in identifying strengths and weaknesses, enabling iterative refinement. Through surveys, usability testing, and direct user interaction, evaluative research provides valuable insights. 

It guides product development, ensuring that user needs are met and expectations exceeded. For product owners and user researchers, embracing evaluative research is pivotal in creating successful, user-centric solutions.

Now that we understand what evaluative research entails, let's explore why it holds a pivotal role in product development and user research.

Why is evaluative research important?

Evaluative research holds immense importance for product owners and user researchers as it offers concrete data and feedback to gauge the success of a product or service. 

By identifying strengths and weaknesses, it becomes a powerful tool for informed decision-making, leading to product improvements and enhanced user experiences:

1) Unlocking product potential

Evaluative research stands as a crucial pillar in product development, offering invaluable insights into a product's effectiveness. By actively assessing user experiences, product owners gain a clearer understanding of what works and what needs improvement. 

This process facilitates targeted enhancements, ensuring that products align with user expectations and preferences. In essence, evaluative research empowers product owners to unlock their product's full potential, resulting in more satisfied users and increased market success.

2) Mitigating risk and reducing iteration cycles

For product owners navigating the competitive landscape, mitigating risks is paramount. Evaluative research serves as a proactive measure, identifying potential issues before they escalate. Through systematic testing and user feedback, product owners can pinpoint weaknesses, allowing for timely adjustments. 

This not only reduces the likelihood of costly post-launch issues but also streamlines iteration cycles. By addressing concerns early in the development phase, product owners can refine their offerings efficiently, staying agile in response to user needs and industry dynamics.

3) Enhancing user-centric design

User researchers play a pivotal role in shaping products that resonate with their intended audience. Evaluative research is the compass guiding user-centric design, ensuring that every iteration aligns with user expectations. By actively involving users in the assessment process, researchers gain firsthand insights into user behavior and preferences. 

This information is invaluable for crafting a seamless user experience, ultimately fostering loyalty and satisfaction. In the ever-evolving landscape of user preferences, ongoing evaluative research becomes a strategic tool for user researchers to consistently refine and elevate the design, fostering products that stand the test of time.

With the significance of evaluative research established, it's essential to know when is the right time to conduct it.

When should you conduct evaluative research?

Knowing the opportune moments to conduct evaluative research is vital. Whether in the early stages of development or after a product launch, this research helps pinpoint areas for enhancement:


Prototype stage

During the prototype stage, conducting evaluative research is crucial to gather insights and refine the product. 

Engage users with prototypes to identify usability issues, gauge user satisfaction, and validate design decisions. 

This early evaluation ensures that potential problems are addressed before moving forward, saving time and resources in the later stages of development. 

By actively involving users at this stage, product owners can enhance the user experience and align the product with user expectations.

Pre-launch stage

In the pre-launch stage, evaluative research becomes instrumental in assessing the final product's readiness. 

Evaluate user interactions, uncover any remaining usability concerns, and verify that the product meets user needs. 

This phase helps refine features, optimize user flows, and address any last-minute issues. 

By actively seeking user feedback before launch, product owners can make informed decisions to improve the overall quality and performance of the product, ultimately enhancing its market success.

Post-launch stage

After the product is launched, evaluative research remains essential for ongoing improvement. Monitor user behavior, gather feedback, and identify areas for enhancement. 

This active approach allows product owners to respond swiftly to emerging issues, optimize features based on real-world usage, and adapt to changing user preferences. 

Continuous evaluative research in the post-launch stage helps maintain a competitive edge, ensuring the product evolves in tandem with user expectations, thus fostering long-term success.

Now that we understand the timing of evaluative research, let's distinguish it from generative research and understand their respective roles.

Evaluative vs. generative research

While evaluative research assesses existing products, generative research focuses on generating new ideas. Understanding this dichotomy is crucial for product owners and user researchers to choose the right approach for the specific goals of their projects:


With the differentiation between evaluative and generative research clear, let's delve into the three primary types of evaluative research.

What are the 3 types of evaluative research?

Evaluative research can take various forms. The three main types include formative evaluation, summative evaluation, and outcome evaluation. 


Each type serves a distinct purpose, offering valuable insights throughout different stages of a product's life cycle:

1) Formative evaluation research

Formative evaluation research is a crucial phase in the development process, focusing on improving and refining a product or program. 

It involves gathering feedback early in the development cycle, allowing product owners to make informed adjustments. 

This type of research seeks to identify strengths and weaknesses, providing insights to enhance the user experience. 

Through surveys, usability testing, and focus groups, formative evaluation guides iterative development, ensuring that the end product aligns with user expectations and needs.

2) Summative evaluation research

Summative evaluation research occurs after the completion of a product or program, aiming to assess its overall effectiveness. 

This type of research evaluates the final outcome against predefined criteria and objectives. 

Summative research is particularly relevant for product owners seeking to understand the overall impact and success of their offering. 

Through methods like surveys, analytics, and performance metrics, it provides a comprehensive overview of the product's performance, helping stakeholders make informed decisions about future developments or investments.

3) Outcome evaluation research

Outcome evaluation research delves into the long-term effects and impact of a product or program on its users. 

It goes beyond immediate outcomes, assessing whether the intended goals and objectives have been met over time. 

Product owners can utilize this research to understand the sustained benefits and challenges associated with their offerings. 

By employing methods such as longitudinal studies and trend analysis, outcome evaluation research helps in crafting strategies for continuous improvement and adaptation based on evolving user needs and market dynamics.

Now that we've identified the types, let's explore five key evaluative research methods commonly employed by product owners and user researchers.

5 Key evaluative research methods

Product owners and user researchers utilize a variety of methods to conduct evaluative research. Choosing the right method depends on the specific goals and context of the research:

1) Surveys

Surveys represent a versatile evaluative research method for product owners and user researchers seeking valuable insights into user experiences. These structured questionnaires gather quantitative data, offering a snapshot of user opinions and preferences.

Types of surveys:

  • Customer satisfaction (CSAT) survey: measures users' satisfaction with a product or service through a straightforward rating scale, typically ranging from 1 to 5.
  • Net promoter score (NPS) survey: evaluates the likelihood of users recommending a product or service on a scale from 0 to 10, categorizing respondents as promoters, passives, or detractors.
  • Customer effort score (CES) survey: focuses on the ease with which users can accomplish tasks or resolve issues, providing insights into the overall user experience.
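As a concrete illustration, the short Python sketch below shows how these three scores are commonly computed from raw responses. The response lists are invented, and the exact formulas (top-two-box CSAT, a simple average for CES, and so on) vary from team to team.

```python
# Hedged sketch: computing CSAT, NPS, and CES from invented survey responses.
csat_ratings = [5, 4, 3, 5, 4, 2, 5]        # 1-5 satisfaction ratings
nps_ratings = [10, 9, 7, 6, 8, 10, 3, 9]    # 0-10 likelihood-to-recommend
ces_ratings = [2, 1, 3, 2, 1, 4]            # 1-5 effort scores (lower = easier)

# CSAT: share of respondents choosing the top two boxes (4 or 5)
csat = 100 * sum(r >= 4 for r in csat_ratings) / len(csat_ratings)

# NPS: % promoters (9-10) minus % detractors (0-6); passives (7-8) are ignored
promoters = sum(r >= 9 for r in nps_ratings)
detractors = sum(r <= 6 for r in nps_ratings)
nps = 100 * (promoters - detractors) / len(nps_ratings)

# CES: often reported as a simple average effort score
ces = sum(ces_ratings) / len(ces_ratings)

print(f"CSAT {csat:.0f}%  NPS {nps:.0f}  CES {ces:.1f}")
```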

When to use surveys:

  • Product launches: Gauge initial user reactions and identify areas for improvement.
  • Post-interaction: Capture real-time feedback immediately after a user engages with a feature or completes a task.

2) Closed card sorting

Closed card sorting is a powerful method for organizing and evaluating information architecture. Participants categorize predefined content into predetermined groups, shedding light on users' mental models and expectations.


What closed card sorting entails:

  • Predefined categories: users sort content into categories predetermined by the researcher, allowing for targeted analysis.
  • Quantitative insights: provides quantitative data on how often participants correctly place items in designated categories.

When to employ closed card sorting:

  • Information architecture overhaul: ideal for refining and optimizing the structure of a product's content.
  • Prototyping phase: use early in the design process to inform the creation of prototypes based on user expectations.
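To show the kind of quantitative insight a closed card sort produces, here is a small hypothetical Python sketch that computes how often participants placed each card in the category the researcher expected. The cards, categories, and placements are all invented.

```python
# Hedged sketch: per-card "expected placement" rates from an invented closed card sort.
from collections import Counter

expected = {"Refund policy": "Support", "Pricing": "Plans", "Careers": "Company"}

# One dict per participant: card -> category the participant chose
placements = [
    {"Refund policy": "Support", "Pricing": "Plans", "Careers": "Company"},
    {"Refund policy": "Plans",   "Pricing": "Plans", "Careers": "Company"},
    {"Refund policy": "Support", "Pricing": "Support", "Careers": "Company"},
]

hits = Counter()
for answers in placements:
    for card, category in answers.items():
        hits[card] += (category == expected[card])

for card in expected:
    rate = 100 * hits[card] / len(placements)
    print(f"{card}: {rate:.0f}% placed as expected")   # low rates flag confusing labels or categories
```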

3) Tree testing

Tree testing is a method specifically focused on evaluating the navigational structure of a product. Participants are presented with a text-based representation of the product's hierarchy and are tasked with finding specific items, highlighting areas where the navigation may fall short.


What tree testing involves:

  • Text-based navigation: users explore the product hierarchy without the influence of visual design, focusing solely on the structure.
  • Task-based evaluation: research participants complete tasks that reveal the effectiveness of the navigational structure.

When to opt for tree testing:

  • Pre-launch assessment: evaluate the effectiveness of the proposed navigation structure before a product release.
  • Redesign initiatives: use when considering changes to the existing navigational hierarchy.

4) Usability testing

Usability testing is a cornerstone of evaluative research, providing direct insights into how users interact with a product. By observing users completing tasks, product owners and user researchers can identify pain points and areas for improvement.


What usability testing entails:

  • Task performance observation: Researchers observe users as they navigate through tasks, noting areas of ease and difficulty.
  • Think-aloud protocol: Participants vocalize their thoughts and feelings during the testing process, providing additional insights.
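Alongside those qualitative observations, teams often tally simple task metrics. The sketch below summarizes per-task completion rate and average time on task from an invented session log; the tasks and timings are purely illustrative.

```python
# Sketch: per-task completion rate and mean time on task (hypothetical log).
sessions = [
    {"task": "Create an account", "completed": True,  "seconds": 95},
    {"task": "Create an account", "completed": True,  "seconds": 120},
    {"task": "Create an account", "completed": False, "seconds": 240},
    {"task": "Export a report",   "completed": True,  "seconds": 60},
    {"task": "Export a report",   "completed": False, "seconds": 310},
]

by_task = {}
for s in sessions:
    by_task.setdefault(s["task"], []).append(s)

for task, rows in by_task.items():
    completion = sum(r["completed"] for r in rows) / len(rows)
    avg_time = sum(r["seconds"] for r in rows) / len(rows)
    print(f"{task}: {completion:.0%} completed, {avg_time:.0f}s on average")
```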

When to conduct usability testing:

  • Early design phases: Gather feedback on wireframes and prototypes to address fundamental usability concerns.
  • Post-launch iterations: Continuously improve the user experience based on real-world usage and feedback.

5) A/B testing

A/B testing, also known as split testing, is a method for comparing two versions of a webpage or product to determine which performs better. This method allows for data-driven decision-making by comparing user responses to different variations.


What A/B testing involves:

  • Variant comparison: Users are randomly assigned to either version A or version B, and their interactions are analyzed to identify the more effective option.
  • Quantitative metrics: Metrics such as click-through rates, conversion rates, and engagement help assess the success of each variant.
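One common way to decide whether the difference between variants is more than noise is a two-proportion z-test on a conversion metric. The sketch below implements that test with no external libraries; the visitor and conversion counts are made up.

```python
# Sketch: comparing conversion rates of variants A and B with a two-proportion z-test.
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (rate_a, rate_b, z, two-sided p-value) for conversions/visitors."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Normal CDF via erf: Phi(z) = 0.5 * (1 + erf(z / sqrt(2)))
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return p_a, p_b, z, p_value

rate_a, rate_b, z, p = two_proportion_z_test(conv_a=120, n_a=2400, conv_b=156, n_b=2400)
print(f"A: {rate_a:.1%}  B: {rate_b:.1%}  z={z:.2f}  p={p:.3f}")
```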

When to implement A/B testing:

  • Feature optimization: Compare different versions of a specific feature to determine which resonates better with users.
  • Continuous improvement: Use A/B testing regularly to refine and enhance the product based on user preferences and behavior.

Now that we're familiar with the methods, let's see some practical evaluative research question examples to guide your research efforts.

Evaluative research question examples

The formulation of well-crafted research questions is fundamental to the success of evaluative research. Clear and targeted questions guide the research process, ensuring that valuable insights are gained to inform decision-making and improvements:

Usability evaluation questions:

Usability evaluation is a critical aspect of understanding how users interact with a product or system. It involves assessing the ease with which users can complete tasks and the overall user experience. Here are essential evaluative research questions for usability:

How was your experience completing this task? (Gain insights into the overall user experience and identify any pain points or positive aspects encountered during the task.)

What technical difficulties did you experience while completing the task? (Pinpoint specific technical challenges users faced, helping developers address potential issues affecting the usability of the product.)

How intuitive was the navigation? (Assess the user-friendliness of the navigation system, ensuring that users can easily understand and move through the product.)

How would you prefer to do this action instead? (Encourage users to provide alternative methods or suggestions, offering valuable input for enhancing user interactions and task completion.)

Were there any unnecessary features? (Identify features that users find superfluous or confusing, streamlining the product and improving overall usability.)

How easy was the task to complete? (Gauge the perceived difficulty of the task, helping to refine processes and ensure they align with user expectations.)

Were there any features missing? (Identify any gaps in the product’s features, helping the development team prioritize enhancements based on user needs and expectations.)

Product survey research questions:

Product surveys allow for a broader understanding of user satisfaction, preferences, and the likelihood of recommending a product. Here are evaluative research questions for product surveys:

Would you recommend the product to your colleagues/friends? (Measure user satisfaction and gauge the likelihood of users advocating for the product within their network.)

How disappointed would you be if you could no longer use the feature/product? (Assess the emotional impact of potential disruptions or discontinuation, providing insights into the product's perceived value.)

How satisfied are you with the product/feature? (Quantify user satisfaction levels to understand overall sentiment and identify areas for improvement.)

What is the one thing you wish the product/feature could do that it doesn’t already? (Solicit specific user suggestions for improvements, guiding the product development roadmap to align with user expectations.)

What would make you cancel your subscription? (Identify potential pain points or deal-breakers that might lead users to discontinue their subscription, allowing for proactive mitigation strategies.)

With these questions in hand, let's look at a case study of evaluative research in practice.

Case study on evaluative research: Spotify


The case study discusses the redesign of Spotify's Your Library feature, a significant change that included the introduction of podcasts in 2020 and audiobooks in 2022. The goal was to accommodate content growth while minimizing negative effects on user experience. The study, presented at the CHI conference in 2023, emphasizes three key factors for the successful launch:

Early involvement: Data science and user research were involved early in the product development process to understand user behaviors and mental models. An ethnographic study explored users' experiences and attitudes towards library organization, revealing the Library as a personal space. Personal prototypes were used to involve users in the evaluation of new solutions, ensuring alignment with their mental models.

Evaluating safely at scale: To address the challenge of disruptive changes, the team employed a two-step evaluation process. First, a beta test allowed a small group of users to try the new experience and provide feedback. This observational data helped identify pain points and guided iterative improvements. Subsequently, A/B testing at scale assessed the impact on key metrics, using non-inferiority testing to ensure the new design was not unacceptably worse than the old one.
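To illustrate the non-inferiority logic described above, here is a minimal sketch that checks whether a new design's success rate is, with one-sided 95% confidence, no more than a pre-agreed margin worse than the old design's. The numbers and the margin are invented and are not Spotify's actual data or statistical procedure.

```python
# Sketch: non-inferiority check on a success metric (illustrative numbers only).
import math

def non_inferior(success_old, n_old, success_new, n_new, margin=0.01, z=1.645):
    """One-sided 95% check that the new rate is not worse than the old by more than margin."""
    p_old, p_new = success_old / n_old, success_new / n_new
    se = math.sqrt(p_old * (1 - p_old) / n_old + p_new * (1 - p_new) / n_new)
    lower_bound = (p_new - p_old) - z * se   # lower confidence bound on the difference
    return lower_bound > -margin, lower_bound

ok, lb = non_inferior(success_old=82000, n_old=100000, success_new=81800, n_new=100000)
print(f"non-inferior: {ok}, lower bound on difference: {lb:.4f}")
# Passes when even the pessimistic bound stays within the acceptable-loss margin.
```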

Mixed method studies: The study employed a combination of qualitative and quantitative methods throughout the process. This mixed methods approach provided a comprehensive understanding of user behaviors, motivations, and needs. Qualitative research, including interviews, diaries, and observational studies, was conducted alongside quantitative data collection to gain deeper insights at all stages.

More details can be found here: Minimizing change aversion through mixed methods research: a case study of redesigning Spotify’s Your Library

Authors: Ingrid Pettersson, Carl Fredriksson, Raha Dadgar, John Richardson, Lisa Shields, Duncan McKenzie

Best tools for evaluative research

Utilizing the right tools is instrumental in the success of evaluative research endeavors. From usability testing platforms to survey tools, having a well-equipped toolkit enhances the efficiency and accuracy of data collection.

Product owners and user researchers can leverage these tools to streamline processes and derive actionable insights, ultimately driving continuous improvement:

1) Blitzllama


Blitzllama stands out as a powerhouse tool for evaluative research, aiding product owners and user researchers in comprehensive testing. Its user-friendly interface facilitates the quick creation of surveys and usability tests, streamlining data collection. With real-time analytics, it offers immediate insights into user behavior. The tool's flexibility accommodates both moderated and unmoderated studies, making it an invaluable asset for product teams seeking actionable feedback to enhance user experiences.

2) Maze

Maze emerges as a top-tier choice for evaluative research, delivering a seamless user testing experience. Product owners and user researchers benefit from its intuitive platform, allowing the creation of interactive prototypes for realistic assessments. Maze excels in remote usability testing, enabling diverse user groups to provide valuable feedback. Its robust analytics provide a deep dive into user journeys, highlighting pain points and areas of improvement. With features like A/B testing and metrics tracking, Maze empowers teams to make informed decisions and iterate rapidly based on user insights.

3) Survicate


Survicate proves to be an essential tool in the arsenal of product owners and user researchers for evaluative research. This versatile survey and feedback platform simplifies the process of gathering user opinions and preferences. Survicate's customization options cater to specific research goals, ensuring targeted and relevant data collection. Real-time reporting and analytics enable quick interpretation of results, facilitating swift decision-making. Whether measuring user satisfaction or testing new features, Survicate’s agility makes it a valuable asset for teams aiming to refine products based on user feedback.

In conclusion, evaluative research equips product owners and user researchers with indispensable tools to enhance product effectiveness. By employing various methods such as usability testing and surveys, they gain valuable insights into user experiences. 

This knowledge empowers swift and informed decision-making, fostering continuous product improvement. The types of evaluative research, including formative, summative, and outcome evaluations, cater to diverse needs, ensuring a comprehensive understanding of user interactions. Real-world examples underscore the practical applications of these methodologies. 

In essence, embracing evaluative research is a proactive strategy for refining products, elevating user satisfaction, and ultimately achieving success in the dynamic landscape of user-centric design.

FAQs related to evaluative research

1) What is evaluative research, and what are some examples?

Evaluative research assesses the effectiveness, efficiency, and impact of programs, policies, products, or interventions. For instance, a company may conduct evaluative research to determine how well a new website design functions for users or to gauge customer satisfaction with a revamped product. Other examples include measuring the success of educational programs or evaluating the effectiveness of healthcare interventions.

2) What are the goals of evaluative research?

The primary goals of evaluative research are to determine the strengths and weaknesses of a program, product, or policy and to provide actionable insights for improvement. Through evaluative research, product owners and UX researchers aim to understand how well their offerings meet user needs, identify areas for enhancement, and make informed decisions based on data-driven findings. Ultimately, the goal is to optimize outcomes and enhance user experiences.

3) What are the three types of evaluation research methods?

Evaluation research employs three main methods: formative evaluation, summative evaluation, and developmental evaluation. Formative evaluation focuses on assessing and improving a program or product during its development stages. Summative evaluation, on the other hand, evaluates the overall effectiveness and impact of a completed program or product. Developmental evaluation is particularly useful in complex or rapidly changing environments, emphasizing real-time feedback and adaptation to emergent circumstances.

4) What is the difference between evaluative and formative research?

Evaluative research and formative research serve distinct purposes in the product development and assessment process. Evaluative research examines the outcomes and impacts of a completed program, product, or policy to determine its effectiveness and inform decision-making for future iterations or improvements. In contrast, formative research focuses on gathering insights during the developmental stages to refine and enhance the program or product before its implementation. While evaluative research assesses the end results, formative research shapes the design and development process along the way.


J Gen Intern Med. 2006 Feb; 21(Suppl 2)

The Role of Formative Evaluation in Implementation Research and the QUERI Experience

Cheryl B Stetler

1 Independent Consultant, Amherst, MA, USA

Marcia W Legro

2 VA Puget Sound Health Care System, Seattle, WA, USA

3 University of Washington, Seattle, WA, USA

Carolyn M Wallace

Candice Bowman

4 VA San Diego Healthcare System, San Diego, CA, USA

Marylou Guihan

5 Edward Hines, Jr. VA Healthcare System, Hines, IL, USA

Hildi Hagedorn

6 Minneapolis VA Medical Center, Minneapolis, MN, USA

Barbara Kimmel

7 Baylor College of Medicine, Houston, TX, USA

8 Houston Veterans Affairs Medical Center, Houston, TX, USA

Nancy D Sharp

Jeffrey L Smith

9 Central Arkansas Veterans Healthcare System, Little Rock, AR, USA

This article describes the importance and role of 4 stages of formative evaluation in our growing understanding of how to implement research findings into practice in order to improve the quality of clinical care. It reviews limitations of traditional approaches to implementation research and presents a rationale for new thinking and use of new methods. Developmental, implementation-focused, progress-focused, and interpretive evaluations are then defined and illustrated with examples from Veterans Health Administration Quality Enhancement Research Initiative projects. This article also provides methodologic details and highlights challenges encountered in actualizing formative evaluation within implementation research.

As health care systems struggle to provide care based on well-founded evidence, there is increasing recognition of the inherent complexity of implementing research into practice. Health care managers and decision makers find they need a better understanding of what it takes to achieve successful implementation, and they look to health care researchers to provide this information. Researchers in turn need to fill this need through collection of new, diverse sets of data to enhance understanding and management of the complex process of implementation.

A measurement approach capable of providing critical information about implementation is formative evaluation (FE). Formative evaluation, used in other social sciences, is herein defined as a rigorous assessment process designed to identify potential and actual influences on the progress and effectiveness of implementation efforts . Formative evaluation enables researchers to explicitly study the complexity of implementation projects and suggests ways to answer questions about context, adaptations, and response to change.

The Department of Veterans Affairs (VA) Quality Enhancement Research Initiative (QUERI) has integrated FE into its implementation program. 1 – 3 This article introduces QUERI and its implementation focus. It then describes research challenges that call for the use of FE in this specialized field of study, reviews FE relative to QUERI implementation research, identifies 4 evaluative stages, and presents challenges to the conduct of FE.

THE VETERANS HEALTH ADMINISTRATION'S QUERI PROGRAM

The Quality Enhancement Research Initiative, begun in 1998, is a comprehensive, data-driven, outcomes-based, and output-oriented improvement initiative. 2 , 3 It focuses on identification and implementation of empirically based practices for high-risk/high-volume conditions among the veteran population and on the evaluation and refinement of these implementation efforts. 3 The Quality Enhancement Research Initiative's innovative approach 1 – 4 calls upon researchers to work toward rapid, significant improvements through the systematic application of best clinical practices. It also calls upon researchers to study the implementation process to enhance and continuously refine these quality improvement (QI) efforts. 1 – 4

Classic intervention research methods 5 , 6 provide the means to evaluate targeted outcomes of implementation/QI efforts. From an evaluation perspective, studies using intervention designs, such as a cluster-randomized trial or quasi-experimental approaches, routinely include a summative evaluation . Summative evaluation is a systematic process of collecting data on the impacts, outputs, products, or outcomes hypothesized in a study. 7 Resulting data provide information on the degree of success, effectiveness, or goal achievement of an implementation program.

In an action-oriented improvement program, such as QUERI, summative data are essential but insufficient to meet the needs of implementation/QI researchers. Evaluative information is needed beyond clinical impact of the change effort and beyond discovering whether a chosen adoption strategy worked. Implementation researchers need to answer critical questions about the feasibility of implementation strategies, degree of real-time implementation, status and potential influence of contextual factors, response of project participants, and any adaptations necessary to achieve optimal change. Formative evaluation provides techniques for obtaining such information and for overcoming limitations identified in early implementation/QI studies.

NEED FOR FE IN IMPLEMENTATION/QI RESEARCH

The RE-AIM framework of Glasgow and colleagues highlights critical information that is missing from current research publications—i.e., information needed to evaluate a study's potential for translation and public health impact. 8, 9 Such information includes the efficacy/effectiveness of an intervention, its reach relative to actual/representative subject participation rate, its adoption relative to actual/representative setting participation rate, its implementation or intervention fidelity, and its maintenance over time.

The focus of the RE-AIM framework is the study of health promotion interventions. Similar issues must be addressed during implementation research if potential adopters are to replicate critical implementation processes. In addition, implementation researchers need to capture in-depth information on participant and contextual factors that facilitate or hinder successful implementation. Such factors can be used during the project to optimize implementation and inform post hoc interpretation.

As implementation efforts can be a relatively messy and complex process, traditional study designs alone are often inadequate to the task of obtaining evaluative information. For example, randomized clinical trials (RCT) may leave questions important to system-wide uptake of targeted research unanswered. As Stead et al. 10 , 11 suggest, traditional intervention research can fail to “capture the detail and complexity of intervention inputs and tactics” (10, p. 354), thereby missing the true nature of interventions as well as significant organizational factors important for replication. 10 , 11

Another argument for performing FE has been highlighted in guideline/QI literature, i.e., the need to address potential interpretive weaknesses. Such weaknesses relate to a failure to account for key elements of the implementation process and may lead to unexplainable and/or poor results. For example, Ovretveit and Gustafson 12 identified implementation assessment failure, explanation failure , and outcome attribution failure . Implementation assessment failure can lead to a “Type III” error, where erroneous study interpretations occur because the intervention was not implemented as planned. 12 , 13 Explanation and outcome attribution relate to failures to explore the black box of implementation. Specifically, what actually did/did not happen within the study relative to the implementation plan, and what factors in the implementation setting, anticipated or unanticipated, influenced the actual degree of implementation? By failing to collect such data, potential study users have little understanding of a particular implementation strategy. For example, 1 study regarding opinion leadership did not report the concurrent implementation of standing orders. 14

Use of a traditional intervention design does not obviate collection of the critical information cited above. Rather, complementary use of FE within an experimental study can create a dual or hybrid style approach for implementation research. 15 The experimental design is thus combined with descriptive or observational research that employs a mix of qualitative and quantitative techniques, creating a richer dataset for interpreting study results.

FORMATIVE EVALUATION WITHIN QUERI

As with many methodologic concepts, there is no single definition/approach to FE. In fact, as Dehar et al. 16 stated, there is a decided “lack of clarity and some disagreement among evaluation authors as to the meaning and scope” of related concepts (16, p. 204; see Table 1 for a sampling). Variations include differences in terminology, e.g., an author may refer to FE, process evaluation, or formative research. 16, 17

A Spectrum of Definitions of Formative Evaluation

Given a mission to make rapid, evidence-based improvements to achieve better health outcomes, the authors have defined FE as a rigorous assessment process designed to identify potential and actual influences on the progress and effectiveness of implementation efforts . Related data collection occurs before, during, and after implementation to optimize the potential for success and to better understand the nature of the initiative, need for refinements, and the worth of extending the project to other settings. This approach to FE incorporates aspects of the last 2 definitions in Table 1 and concurs with the view that formative connotes action. 16 In QUERI, this action focus differentiates FE from “process” evaluations where data are not intended for concurrent use.

Various uses of FE for implementation research are listed in Table 2. Uses span the timeframe or stages of a project, i.e., development/diagnosis, implementation, progress, and interpretation. Within QUERI, these stages are progressive, integrated components of a single hybrid project. Each stage is described below, in the context of a single project, and illustrated by QUERI examples (Tables 3, 4, and 6–10). Each table provides an example of 1 or more FE stages. However, as indicated in some of the examples, various evaluative activities can serve multiple stages, which then merge in practice. Formative evaluation at any stage requires distinct plans for adequate measurement and analysis.

Potential Uses of Formative Evaluation 10 , 13 , 16 , 20 – 27

An Example of Developmental FE

QUERI, quality enhancement research initiative; SCI, spinal cord injury; VHA, Veterans Health Administration; CCR, computerized clinical reminder; FE, formative evaluation.

Implementation-Focused FE

QUERI, quality enhancement research initiative; FE, formative evaluation; CHF, congestive heart failure.

Implementation and Progress-Focused FE

QUERI, quality enhancement research initiative; FE, formative evaluation; QI, quality improvement.

An Illustrative, Potential FE

QUERI, quality enhancement research initiative; FE, formative evaluation.

Developmental Evaluation

Developmental evaluation occurs during the first stage of a project and is termed a diagnostic analysis . 1 , 28 It is focused on enhancing the likelihood of success in the particular setting/s of a project, and involves collection of data on 4 potential influences: (a) actual degree of less-than-best practice; (b) determinants of current practice; (c) potential barriers and facilitators to practice change and to implementation of the adoption strategy; and (d) strategy feasibility, including perceived utility of the project. (Note: studies conducted to obtain generic diagnostic information prior to development of an implementation study are considered formative research, not FE. Even if available, a diagnostic analysis is suggested given the likelihood that generically identified factors will vary across implementation sites.)

Activity at this stage may involve assessment of known prerequisites or other factors related to the targeted uptake of evidence, e.g., perceptions regarding the evidence, attributes of the proposed innovation, and/or administrative commitment. 11, 21, 29–31 Examples of formative diagnostic tools used within QUERI projects include organizational readiness and attitude/belief surveys 32, 33 (also see Tables 3 and 7). Such developmental data enable researchers to understand potential problems and, where possible, overcome them prior to initiation of interventions in study sites.

Developmental/Implementation/Progress FE

QUERI, quality enhancement research initiative; FE, formative evaluation; SCI, spinal cord injury.

In addition to information available from existent databases about current practice or setting characteristics, formative data can be collected from experts and representative clinicians/administrators. For example, negative unintended consequences might be prospectively identified by key informant or focus group interviews. This participatory approach may also facilitate commitment among targeted users. 34

Implementation-Focused Evaluation

This type of FE occurs throughout implementation of the project plan. It focuses on analysis of discrepancies between the plan and its operationalization and identifies influences that may not have been anticipated through developmental activity. As Hulscher et al. note in a relevant overview of “process” evaluation, FE allows “researchers and implementers to (a) describe the intervention in detail, (b) evaluate and measure actual exposure to the intervention, and (c) describe the experience of those exposed (13, p. 40)”— concurrently. It also focuses on the dynamic context within which change is taking place, an increasingly recognized element of implementation. 37 – 40

Implementation-focused formative data enable researchers to describe and understand more fully the major barriers to goal achievement and what it actually takes to achieve change, including the timing of project activities. By describing the actuality of implementation, new interventions may be revealed. In terms of timing, formative data can clarify the true length of time needed to complete an intervention, as failure to achieve results could relate to insufficient intervention time.

Implementation-focused formative data are also used to keep the strategies on track and, as a result, optimize the likelihood of effecting change by resolving actionable barriers, enhancing identified levers of change, and refining components of the implementation interventions. Rather than identify such modifiable components on a post hoc basis, FE provides timely feedback to lessen the likelihood of type III errors (see Tables 4, 6, 7, and 9).

Implementation/Interpretive FE

QUERI, quality enhancement research initiative; FE, formative evaluation; CR, clinical reminder; SAS, site activation scale.

In summary, FE data collected at this stage offer several advantages. They can (a) highlight actual versus planned interventions, (b) enable implementation through identification of modifiable barriers, (c) facilitate any needed refinements in the original implementation intervention, (d) enhance interpretation of project results, and (e) identify critical details and guidance necessary for replication of results in other clinical settings.

Measurement within this stage can be a simple or complex task. Table 5 describes several critical issues that researchers should consider. As with other aspects of FE, both quantitative and qualitative approaches can be used.

Critical Measures of Implementation

Progress-Focused Evaluation

This type of FE occurs during implementation of study strategies, but focuses on monitoring impacts and indicators of progress toward goals. The proactive nature of FE is emphasized, as progress data become feedback about the degree of movement toward desired outcomes. Using implementation data on dose, intensity, and barriers, factors blocking progress may be identified. Steps can then be taken to optimize the intervention and/or reinforce progress via positive feedback to key players. As Krumholz and Herrin 49 suggest, waiting until implementation is completed to assess results “obscures potentially important information … about trends in practice during the study [that] could demonstrate if an effort is gaining momentum—or that it is not sustainable” (see Tables 6 and 7).

Interpretive Evaluation

This stage is usually not considered a type of FE but deserves separate attention, given its role in the illumination of the black box of implementation/change. Specifically, FE data provide alternative explanations for results, help to clarify the meaning of success in implementation, and enhance understanding of an implementation strategy's impact or “worth.” Such “black box” interpretation occurs through the end point triangulation of qualitative and quantitative FE data, including associational relationships with impacts.

Interpretive FE uses the results of all other FE stages. In addition, interpretive information can be collected at the end of the project about key stakeholder experiences. Stakeholders include individuals expected to put evidence into practice as well as those individuals expected to support that effort. These individuals can be asked about their perceptions of the implementation program, its interventions, and changes required of them and their colleagues. 10 , 13 , 27 , 38 , 46 , 54 Information can be obtained on stakeholder views regarding (a) usefulness or value of each intervention, (b) satisfaction or dissatisfaction with various aspects of the process, (c) reasons for their own program-related action or inaction, (d) additional barriers and facilitators, and (e) recommendations for further refinements.

Information can also be obtained regarding the degree to which stakeholders believe the implementation project was successful, as well as the overall “worth” of the implementation effort. Statistical significance will be calculated using the summative data. However, as inferential statistical significance does not necessarily equate with clinical significance, it is useful to obtain perceptions of stakeholders relative to the “meaning” of statistical findings. For some stakeholders, this meaning will be placed in the context of the cost of obtaining the change relative to its perceived benefits (see Tables 8 – 10 ).

Interpretive FE

QUERI, quality enhancement research initiative; FE, formative evaluation; QI, quality improvement; MH, mental health; ATIP, antipsychotic treatment improvement program.

Formative evaluation, as a descriptive assessment activity, does not per se test hypotheses. However, within an experimental study, in-depth data from a concurrent FE can provide working hypotheses to explain successes or failures, particularly when the implementation and evaluation plans are grounded in a conceptual framework. 55 – 57 In this respect, interpretive FE may be considered as case study data that contribute to theory building. 58 Overall, FE data may provide evidence regarding targeted components of a conceptual framework, insights into the determinants of behavior or system change, and hypotheses for future testing.

CHALLENGES OF CONDUCTING FE

Formative evaluation is a new concept as applied to health services research and as such presents multiple challenges. Some researchers may need help in understanding how FE can be incorporated into a study design. Formative evaluation is also a time-consuming activity and project leaders may need to be convinced of its utility before committing study resources. In addition, much is yet to be learned about effective approaches to the following types of issues:

  • In the well-controlled RCT, researchers do not typically modify an experimental intervention once approved. However, in improvement-oriented research, critical problems that prevent an optimal test of the planned implementation can be identified and resolved. Such actions may result in alterations to the original plan. The challenge for the researcher is to identify that point at which modifications create a different intervention or add an additional intervention. Likewise, when the researcher builds in “local adaptation,” the challenge is to determine its limits or clarify statistical methods available to control for the differences. An implementation framework and clear identification of the underlying conceptual nature of each intervention can facilitate this process. As Hawe et al. 43 suggest, the researcher has to think carefully about the “essence of the intervention” in order to understand the actual nature of implementation and the significance of formative modifications.
  • Implementation and QI researchers may encounter the erroneous view that FE involves only qualitative research or that it is not rigorous, e.g., that it consists of “just talking to a few people”. However, FE does not lack rigor nor is it simply a matter of qualitative research or a specific qualitative methodology. Rather, FE involves selecting among rigorous qualitative and quantitative methods to accomplish a specific set of aims, with a plan designed to produce credible data relative to explicit formative questions. 61
  • A critical challenge for measurement planning is selection or development of methods that yield quantitative data for the following types of issues: (a) assessment of associations between outcome findings and the intensity, dose, or exposure to interventions and (b) measurement of the adaptations of a “standard” protocol across diverse implementation settings. 62 Whether flexibility is a planned or unplanned component of a study, it should be measured in some consistent, quantifiable fashion that enables cross-site comparisons. Goal attainment scaling is 1 possibility. 47 , 48
  • A final issue facing implementation researchers is how to determine the degree to which FE activities influence the results of an implementation project. If FE itself is an explicit intervention, it will need to be incorporated into recommendations for others who wish to replicate the study's results. More specifically, the researcher must systematically reflect upon why formative data were collected, how they were used, by whom they were used, and to what end. For example, to what extent did FE enable refinement to the implementation intervention such that the likelihood of encountering barriers in the future is adequately diminished? Or, in examining implementation issues across study sites, to what extent did FE provide information that led to modifications at individual sites? If the data and subsequent adjustments at individual sites were deemed critical to project success, upon broader dissemination to additional sites, what specific FE activities should be replicated, and by whom?

Formative evaluation is a study approach that is often key to the success, interpretation, and replication of the results of implementation/QI projects. Formative evaluation can save time and frustration as data highlight factors that impede the ability of clinicians to implement best practices. It can also identify at an early stage whether desired outcomes are being achieved so that implementation strategies can be refined as needed; it can make the realities and black box nature of implementation more transparent to decision makers; and it can increase the likelihood of obtaining credible summative results about effectiveness and transferability of an implementation strategy. Formative evaluation helps to meet the many challenges to effective implementation and its scientific study, thereby facilitating integration of research findings into practice and improvement of patient care.

Acknowledgments

The work reported here was supported by the Department of Veterans Affairs, Veterans Health Administration, Health Services Research and Development Service. The views expressed in this article are those of the authors and do not necessarily represent the views of the Department of Veterans Affairs.

Summative Evaluation

W. Douglas Evans

Living reference work entry in The Palgrave Encyclopedia of Social Marketing; first online 24 February 2022.

Summative (or outcome) monitoring and evaluation (M&E) is research on the extent to which a social marketing campaign achieved its outcome objectives. It draws on multiple study designs, depending on the campaign implementation context, and integrates process data in order to attribute observed changes in outcomes to the campaign rather than to external or secular events and trends.

Social marketing evaluation, as in other program evaluation research, can be divided into process and outcome evaluation methods (Valente 2002 ). Process evaluation helps to assess whether (1) the target audience has been exposed to a campaign’s messages (e.g., did adolescents hear the truth campaign tobacco use prevention message?) and (2) the target audience reacts favorably to the messages in real-world circumstances (e.g., how did adolescents react to truth messages when they heard them?). Outcome evaluation helps to determine the effects of messages on health...


References

Andrade, E. L., Evans, W. D., Barrett, N., Edberg, M., & Cleary, S. (2018). Design, implementation, and monitoring of the Adelante community social marketing campaign. Health Education Research. https://doi.org/10.1093/her/cyx076

Edberg, M., Cleary, S. D., Andrade, E. L., Evans, W. D., Simmons, L., & Cubilla, I. (2016). Applying ecological positive youth development theory to address the co-occurrence of substance abuse, sex risk, and interpersonal violence among immigrant Latino youth. Health Promotion Practice. https://doi.org/10.1177/1524839916638302. ePub April 18, 2016.

Evans, W. D., Blitstein, J., & Hersey, J. (2008a). Evaluation of public health brands: Design, measurement, and analysis. In W. D. Evans & G. Hastings (Eds.), Public health branding: Applying marketing for social change. London: Oxford University Press.

Evans, W. D., Davis, K. C., & Farrelly, M. C. (2008b). Planning for a media evaluation. In D. Holden & M. Zimmerman (Eds.), A practical guide to program evaluation planning. Thousand Oaks: Sage.

Evans, W. D., Andrade, E. L., Barrett, N., Edberg, M., & Cleary, S. (2019). Outcomes of the Adelante community social marketing campaign. Health Education Research. https://doi.org/10.1093/her/cyz016

Farrelly, M. C., Davis, K. C., Haviland, M. L., Messeri, P., & Healton, C. G. (2005). Evidence of a dose-response relationship between ‘truth’ antismoking ads and youth smoking. American Journal of Public Health, 95(3), 425–431.

Guba, E. G., & Lincoln, Y. S. (1994). Competing paradigms in qualitative research. In N. K. Denzin & Y. S. Lincoln (Eds.), Handbook of qualitative research. Thousand Oaks: Sage.

Rossi, P., & Freeman, R. (1993). Evaluation: A systematic approach. Thousand Oaks: Sage.

Valente, T. W. (2002). Evaluating health promotion programs. Oxford, UK: Oxford University Press.

Author information

W. Douglas Evans, Milken Institute School of Public Health, George Washington University, Washington, DC, USA

Cite this entry

Evans, W.D. (2022). Summative Evaluation. In: The Palgrave Encyclopedia of Social Marketing. Palgrave Macmillan, Cham. https://doi.org/10.1007/978-3-030-14449-4_156-1

© 2022 The Author(s), under exclusive licence to Springer Nature Switzerland AG

Summative Assessment and Feedback


Summative assessments are given to students at the end of a course and should measure the skills and knowledge a student has gained over the entire instructional period. Summative feedback is aimed at helping students understand how well they have done in meeting the overall learning goals of the course.

Effective summative assessments

Effective summative assessments provide students a structured way to demonstrate that they have met a range of key learning objectives and to receive useful feedback on their overall learning. They should align with the course learning goals and build upon prior formative assessments. These assessments will address how well the student is able to synthesize and connect the elements of learning from the entirety of the course into a holistic understanding and provide an opportunity to provide rich summative feedback.

The value of summative feedback

Summative feedback is essential for students to understand how far they have come in meeting the learning goals of the course, what they need further work on, and what they should study next. This can affect later choices that students make, particularly in contemplating and pursuing their major fields of study. Summative feedback can also influence how students regard themselves and their academic disciplines after graduation.

Use rubrics to provide consistency and transparency

A rubric is a grading guide for evaluating how well students have met a learning outcome. A rubric consists of performance criteria, a rating scale, and indicators for the different rating levels. Rubrics are typically presented in a chart or table format.

Instructors often use rubrics for both formative and summative feedback to ensure consistency of assessment across different students. Rubrics also can make grading faster and help to create consistency between multiple graders and across assignments.

Students might be given access to the rubric before working on an assignment. No criterion or metric within a summative assessment should come as a surprise to students. Transparency with students on exactly what is being assessed can help them more effectively demonstrate how much they have learned.

Types of summative assessments

Different summative assessments are better suited to measuring different kinds of learning. 

Examinations

Examinations are useful for evaluating student learning in terms of remembering information, and understanding and applying concepts and ideas. However, exams may be less suited to evaluating how well students are able to analyze, evaluate, or create things related to what they've learned.

Presentation

A presentation tasks the student with teaching others what they have learned, typically by speaking, presenting visual materials, and interacting with their audience. This can be useful for assessing a student's ability to critically analyze and evaluate a topic or content.

Projects

With projects, students will create something, such as a plan, document, artifact, or object, usually over a sustained period of time, that demonstrates skills or understanding of the topic of learning. They are useful for evaluating learning objectives that require high levels of critical thinking, creativity, and coordination. Projects are good opportunities to provide summative feedback because they often build on prior formative assessments and feedback.

Portfolios

With a portfolio, students create and curate a collection of documents, objects, and artifacts that collectively demonstrate their learning over a wide range of learning goals. Portfolios usually include the student's reflections and metacognitive analysis of their own learning. Portfolios are typically completed over a sustained period of time and are usually done by individual students as opposed to groups.

Portfolios are particularly useful for evaluating how students' learning, attitudes, beliefs, and creativity grow over the span of the course. The reflective component of portfolios can be a rich form of self-feedback for students. Generally, portfolios tend to be more holistic and are often now done using ePortfolios .

Effectiveness-implementation hybrid designs: implications for quality improvement science

Alice C Bernet, David E Willens & Mark S Bauer

Implementation Science, volume 8 (Suppl 1), Article number S2 (2013). Open access; published 19 April 2013. Part of the Proceedings of Advancing the Methods in Health Quality Improvement Research 2012 Conference.

Presentation

This article summarizes the three types of hybrid effectiveness-implementation designs and associated evaluation methods. It includes a discussion of how hybrid designs have the potential to enhance knowledge development and application of clinical interventions and implementation strategies in “real world” settings. The authors propose implications of hybrid designs for quality improvement research.

Traditionally, researchers think of knowledge development and application as a uni-directional, step-wise progression, in which different questions are addressed in isolation. First, a randomized clinical trial (RCT) is deployed to determine if an intervention implemented under controlled conditions has efficacy in specific populations. Next, “effectiveness research” methods determine if the effect remains when implemented in less controlled conditions with broader populations. Finally, “implementation research” methods, such as cluster randomized controlled trials, are deployed to understand the best methods to introduce the intervention into practice. While systematic, this unidirectional approach can take a great deal of time from the original efficacy study design to the final conclusions about implementation, and conditions may change so that original clinical and policy questions become less relevant [1]. Additionally, the unidirectional approach does not help us understand interaction effects between the intervention and the implementation strategy.

Hybrid designs simultaneously evaluate the impact of interventions introduced in real world settings (e.g. “effectiveness”), and the implementation strategy. Such designs enhance the ability to identify important intervention-implementation interactions, which inform decisions about optimal deployment and generalized impact, and may accelerate the introduction of valuable innovations into practice. This has implications for quality improvement researchers, who are often guiding the deployment and evaluating the impact of interventions in healthcare settings.

Types of hybrid designs

Hybrid designs form a continuum between pure effectiveness research and pure implementation research, as defined above. Hybrid designs are best suited for the study of minimal risk interventions with at least indirect evidence of effectiveness, and strong face validity to support applicability to the new setting, population or delivery method in question. Each type describes two a priori aims: one for testing intervention effectiveness, and one for evaluating the implementation strategy. The types differ according to the emphasis placed on testing the intervention or the implementation. These designs incorporate evaluation methods—process, formative and summative evaluation—which distinguish hybrid designs from traditional effectiveness research but are typical of implementation research. For more detailed examples of hybrid design in published research, please refer to Curran et al. [ 2 ].

Type 1 hybrid designs rigorously test the clinical intervention and secondarily gather data to inform subsequent implementation research trials. These studies measure patient functioning or symptoms in response to a clinical intervention, while simultaneously evaluating feasibility and acceptability of implementation through qualitative, process-oriented, or mixed methods.

Type 2 hybrid designs simultaneously test the clinical intervention, while rigorously testing the implementation strategy. Ideal targets include populations or settings that are reasonably close to those studied in prior effectiveness trials. The use of fractional factorial designs can inform allocation of study groups in hybrid Type 2 studies. This design strategy allows for multiple “doses” of implementation resources to be available during a test of intervention effectiveness.

Type 3 designs primarily test the implementation strategy, via measures of adoption of and fidelity to clinical interventions. The secondary aim measures patient-level effects of the clinical intervention, such as symptoms, functioning and service use. When there is robust intervention data, but effects are suspected to be vulnerable during the implementation trial, a Type 3 design may elucidate the implementation barriers.

Process, formative, and summative evaluation methods are identical to those used in implementation research [3]. Process evaluation identifies potential and actual influences on the conduct and quality of implementation. In contrast to formative evaluation, data are not used during the study to influence the process. Data informing process evaluation can be collected prior to the study, concurrently, or retrospectively. Formative evaluation makes use of data throughout the intervention trial to modify the intervention procedures or implementation process during the study. To augment randomized or observational study designs, formative evaluation data are used in Type 2 and Type 3 hybrid designs to refine and improve the clinical intervention and implementation process while under study. This allows for real-time refinement of intervention and implementation techniques; however, it may diminish external generalizability.

The summative evaluation provides information about the impact of the intervention, similar to classical clinical trials, which in this case of hybrid designs helps to inform a local healthcare system’s decision to adopt the intervention under study. The summative evaluation outcomes may include, for instance: patient level health outcomes for a clinical intervention, process or quality measures for an implementation strategy, population-level health status, or an index of system function for an organizational-level intervention. It is important to note that for hybrid designs, no less than for other types of studies, power considerations are important for summative evaluation outcomes both for intervention and the implementation measures.

Two aspects of hybrid designs are particularly germane to quality improvement research. One aspect is the summative evaluation. The summative evaluation provides additional contextual elements, which inform the decision of a healthcare system to adopt the intervention under study. Second is the formative evaluation aspect of hybrid designs. This type of evaluation is comparable to the Plan-Do-Study-Act (PDSA) methods used by quality improvement researchers to refine intervention and implementation strategies during the study. The rigor of formative evaluation and PDSA methods can be maximized when combined with robust study designs. Key features of these research designs include time series measurement, testing the stability of the baseline, use of replication and reversals, constancy of the treatment effect, and statistical techniques for evaluating effects [4].

Hybrid designs combine concepts that are familiar to improvement researchers, such as statistical methods to account for modifications to the intervention or improvement strategy, a priori considerations of process and clinical level outcomes, and a focus on interventions with at least indirect evidence of clinical effectiveness. As a result, the use of hybrid designs, and corresponding evaluation strategies, are primed for adoption by quality improvement researchers.

Recommendations

Study of the contextual aspects of implementation and intervention effectiveness are essential to compare quality improvement interventions across a range of settings. Tools such as summative and formative evaluation could accomplish this. Hybrid designs would allow for evolving implementation and intervention strategies, which more closely resemble real-world changes to health care systems.

Grounding the use of hybrid designs in a unified conceptual framework will support the collaboration among professionals in quality improvement research, implementation science, and associated fields. Consideration for hybrid designs can further our understanding of how and why interventions may vary in effectiveness across settings. Such an achievement would enhance the legitimacy of quality improvement research as a solution to the problems facing healthcare.

Glasgow RE, Lichtenstein E, Marcus AC: Why don't we see more translation of health promotion research to practice? Rethinking the efficacy-to-effectiveness transition. Am J Public Health. 2003, 93: 1261-1267. 10.2105/AJPH.93.8.1261.


Curran GM, Bauer M, Mittman B, Pyne JM, Stetler C: Effectiveness-implementation hybrid designs: combining elements of clinical effectiveness and implementation research to enhance public health impact. Med Care. 2012, 50: 217-226. 10.1097/MLR.0b013e3182408812.

Stetler CB, Mittman BS, Francis J: Overview of the VA Quality Enhancement Research Initiative (QUERI) and QUERI theme articles: QUERI series. Implement Sci. 2008, 3: 8-10.1186/1748-5908-3-8.

Speroff T, O’Connor GT: Study designs for PDSA quality improvement research. Q Manage Health Care. 2004, 13: 17-32.



Author information

Authors and affiliations

Veterans Affairs National Quality Scholars Program, VA Tennessee Valley Healthcare System, Nashville, Tennessee, 37212, USA

Alice C Bernet

Henry Ford Health System, Detroit, Michigan, 48202, USA

David E Willens

Center for Organization, Leadership, and Management Research, VA Boston Healthcare System, Boston, Massachusetts, 02130, USA

Mark S Bauer

Department of Psychiatry, Harvard Medical School, Boston, Massachusetts, 02215, USA


Corresponding author

Correspondence to Alice C Bernet .

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( http://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


About this article

Cite this article

Bernet, A.C., Willens, D.E. & Bauer, M.S. Effectiveness-implementation hybrid designs: implications for quality improvement science. Implementation Sci 8 (Suppl 1), S2 (2013). https://doi.org/10.1186/1748-5908-8-S1-S2


Published: 19 April 2013

DOI: https://doi.org/10.1186/1748-5908-8-S1-S2


Keywords

  • Implementation Strategy
  • Summative Evaluation
  • Implementation Research
  • Hybrid Design
  • Quality Improvement Research



Formative vs. Summative Assessment: What’s the Difference? [+ Comparison Chart]


In education, assessments are the roadmap guiding teachers and students to successful outcomes, from navigating subject matter to reaching academic milestones. But not all means of measuring success are the same. In this blog post, we'll explore two of these methods: formative and summative assessment.

To maximize teaching effectiveness, it’s important to understand the differences between each assessment type. Keep reading to learn the benefits of tailoring instruction to meet the diverse needs of every learner, plus tips on implementing both techniques.


What Is Formative Assessment?

Formative assessment is not a single method but rather a variety of ways for teachers to evaluate student comprehension, learning needs, and academic progress in real time throughout a lesson, unit, or course.

These assessments aid in identifying areas where students are struggling, skills they find challenging, or learning standards they have not yet achieved. This information enables teachers to make necessary adjustments to lessons and instructional techniques to better meet the needs of their students. 

Its primary goal is to gauge a student's understanding during instruction, for example with quick quizzes, exit tickets, or class discussions, rather than to assign a final grade.

As learning and formative assessment expert Paul Black puts it, “when the cook tastes the soup, that’s formative assessment. When a customer tastes the soup, that’s summative assessment.”

What Is Summative Assessment?

Summative assessment, on the other hand, is any type of evaluation that measures a student’s overall comprehension and achievement at the end of a unit, course, or academic period. It typically takes the form of final exams or projects, and aims to gauge what students have learned. Unlike formative assessment, which provides ongoing feedback, summative assessment focuses on determining the extent to which students have mastered the content overall.

This culmination of the learning process helps teachers determine proficiency levels against predefined standards or benchmarks. These assessments — which often carry higher stakes — are used for accountability, such as grading, ranking, and reporting student achievement to parents and school administrators.



3 Examples of Formative Assessment

For a clearer idea of formative assessment, explore these three examples:

  • Exit tickets are brief assessments given to students at the end of a lesson or class period featuring questions that relate to that day’s work. Teachers use exit tickets to gauge student understanding before they leave the class, allowing them to adjust future instruction based on the feedback received. 
  • Think-Pair-Share involves three stages: First, prompting students to independently think about a question related to a lesson, then having them pair up with a classmate to discuss their thoughts, before finally asking them to share their discussion with the class. The process encourages active engagement, collaboration, and comprehension.
  • One-minute paper is aptly named, allowing students 60 seconds at the end of a lesson or class period to write down the most important concepts from the presented material. Teachers can review these papers to assess how well students understand the material at hand and address any misconceptions.

3 Examples of Summative Assessment

Likewise, here are a few examples of summative assessments:

  • Final exams are comprehensive assessments, typically given at the end of a course or academic year, that cover the broad range of topics taught over that longer period of time.
  • Standardized tests , such as the SAT and ACT, are administered and scored consistently across a large number of students for comparison purposes. They are also useful for identifying areas for improvement in educational systems and making decisions about student placement or advancement, such as admission into higher education institutions.  
  • End-of-unit projects are typically more extensive than regular class assignments and require students to demonstrate their understanding of multiple concepts or skills covered in the unit. Research, originality, collaboration, and presentation are often involved.

How to Grade Formative Assessments

Because of the unique nature of each type of student evaluation, there is also variety in grading summative vs. formative assessments. The following are considerations when grading formative assessments:

  • Focus on feedback by prioritizing constructive notes that guide students’ learning and improvement.
  • Use rubrics to establish clear criteria for assessment and ensure consistency in grading (a small rubric-scoring sketch follows this list).
  • Provide descriptive feedback that highlights strengths and areas for improvement.
  • Encourage self-assessment to promote accountability and reflection as students examine their own work.
  • Focus on growth and development over time instead of final outcomes and grades exclusively.
  • Track progress to call out student achievement trends over time.
  • Use peer assessment to cultivate collaboration and diverse perspectives in evaluation.
  • Consider participation and effort in addition to academic achievement in order to take a big-picture look at education and achievement.
  • Communicate clearly to facilitate understanding and successful outcomes.
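To illustrate the rubric and descriptive-feedback bullets above, here is a minimal sketch that turns per-criterion ratings into narrative feedback rather than a grade. The criteria, level descriptors, and function name are illustrative assumptions, not an official rubric.

```python
# Minimal rubric-based formative feedback; levels 1-3 per criterion (assumed scale).
RUBRIC = {
    "understanding of concept": {3: "explains and applies the idea", 2: "explains the idea", 1: "partial or unclear"},
    "use of evidence":          {3: "accurate and well chosen",      2: "mostly accurate",    1: "missing or inaccurate"},
    "communication":            {3: "clear and organized",           2: "generally clear",    1: "hard to follow"},
}

def formative_feedback(ratings: dict[str, int]) -> str:
    """Map per-criterion ratings to descriptive feedback instead of a score."""
    lines = [f"- {criterion}: level {level} ({RUBRIC[criterion][level]})"
             for criterion, level in ratings.items()]
    strengths = [c for c, l in ratings.items() if l == 3]
    growth = [c for c, l in ratings.items() if l < 3]
    lines.append(f"Strengths: {', '.join(strengths) or 'keep practicing'}")
    lines.append(f"Focus next: {', '.join(growth) or 'extend and apply'}")
    return "\n".join(lines)

print(formative_feedback({"understanding of concept": 3, "use of evidence": 2, "communication": 2}))
```

Because the output names strengths and next steps rather than a mark, it stays aligned with the feedback-first spirit of formative grading.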

How to Grade Summative Assessments

Consider these methods as you grade summative assessments, keeping in mind a fair and accurate representation of students’ learning outcomes and progress.

  • Establish clear criteria to guide students on what is expected and to ensure transparency in assessment standards.
  • Use rubrics to keep evaluation criteria structured and promote consistency.
  • Assign numerical or letter grades to quantify performance and clearly communicate overall achievement.
  • Consider weighting grades to reflect the relative importance of different aspects of student performance (see the sketch after this list for a worked example).
  • Provide feedback that is specific and actionable. 
  • Ensure fairness and consistency to uphold equitable grading for all students.
  • Communicate results clearly so that parents, students, and administrators understand learning outcomes.
  • Offer opportunities for review and reflection to encourage students to engage with their assessment and improve moving forward.
  • Use assessment data for instructional planning to tailor teaching strategies to student needs.
  • Adhere to school or district policies to maintain compliance and consistency.
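To make the weighting bullet above concrete, here is a minimal sketch of a weighted summative grade. The component names, weights, scores, and letter-grade cutoffs are illustrative assumptions rather than any school's actual policy.

```python
# Hypothetical weighted summative grade: each component contributes its weight x score.
weights = {"final_exam": 0.40, "end_of_unit_project": 0.35, "standardized_benchmark": 0.25}
scores  = {"final_exam": 88.0, "end_of_unit_project": 92.0, "standardized_benchmark": 79.0}

assert abs(sum(weights.values()) - 1.0) < 1e-9  # weights should sum to 1 (100%)

overall = sum(weights[k] * scores[k] for k in weights)

def to_letter(pct: float) -> str:
    """Map a percentage to a letter grade on a common (but school-specific) scale."""
    cutoffs = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]
    return next((letter for cutoff, letter in cutoffs if pct >= cutoff), "F")

print(f"Weighted overall: {overall:.1f}% -> {to_letter(overall)}")
```

The relative importance of each assessment type lives entirely in the weights dictionary, so rebalancing a grading scheme is a one-line change.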

Formative vs. Summative Assessment Comparison Chart

  • Purpose: formative assessment monitors understanding to guide ongoing instruction; summative assessment measures overall mastery of the content.
  • Timing: formative assessment happens throughout a lesson, unit, or course; summative assessment happens at the end of a unit, course, or academic period.
  • Stakes: formative assessment is typically low-stakes and feedback-focused; summative assessment often carries higher stakes, such as grades, ranking, and reporting.
  • Examples: formative assessment includes exit tickets, think-pair-share, and one-minute papers; summative assessment includes final exams, standardized tests, and end-of-unit projects.

Understanding these differences is crucial for educators to help students succeed in meaningful and effective ways. When teachers try out different assessment methods and grading styles, they get a better handle on student needs and can create an environment for widespread growth and improvement.

The best way for teachers to advance their knowledge and understanding of the latest assessment methods is to keep up with professional development opportunities, such as with the University of San Diego’s Professional and Continuing Education (PCE) certificate program. Explore the website to learn more about hundreds of online and independent courses for teachers covering a wide range of subjects.



