Comprehensive Assessment Research Review: Annotated Bibliography

Alvarez, L., & Corn, J. (2008). Exchanging Assessment for Accountability: The Implications of High-Stakes Reading Assessments for English Learners. Language Arts, 85 (5), 354-365. Teacher research demonstrates the detrimental effects on English learners of replacing authentic literacy assessments with standardized assessments designed primarily for purposes of accountability.

Andrade, H. (2007). Self-Assessment Through Rubrics. Educational Leadership, 65 (4), 60-63. Rubrics can be a powerful self-assessment tool -- if teachers disconnect them from grades and give students time and support to revise their work.

Andrade, H., Du, Y., & Mycek, K. (2010). Rubric-Referenced Self-Assessment and Middle School Students’ Writing. Assessment in Education: Principles, Policy & Practice, 17 (2), 199-214. This study investigated the relationship between 162 middle school students’ scores for a written assignment and a process that involved students in generating criteria and self-assessing with a rubric. In one condition, a model essay was used to generate a list of criteria for an effective essay, and students reviewed a written rubric and used the rubric to self-assess first drafts. The comparison condition involved generating a list of criteria and reviewing first drafts. The results suggest that reading a model, generating criteria, and using a rubric to self-assess can help middle school students produce more-effective writing.

Andrade, H., Du, Y., & Wang, X. (2008). Putting Rubrics to the Test: The Effect of a Model, Criteria Generation, and Rubric-Referenced Self-Assessment on Elementary School Students’ Writing. Educational Measurement: Issues and Practice, 27 (2), 3-13. Third- and fourth-grade students (N = 116) in the experimental condition used a model paper to scaffold the process of generating a list of criteria for an effective story or essay. They received a written rubric and used the rubric to self-assess first drafts. Matched students in the control condition generated a list of criteria for an effective story or essay and reviewed first drafts. Findings include a main effect of treatment and of previous achievement on total writing scores. The results suggest that using a model to generate criteria for an assignment and using a rubric for self-assessment can help elementary school students produce more-effective writing.

Andrade, H., & Valtcheva, A. (2009). Promoting Learning and Achievement Through Self-Assessment. Theory Into Practice, 48 (1), 12-19. doi:10.1080/00405840802577544. The authors describe how to do criteria-referenced self-assessment, and they review research in which criteria-referenced self-assessment has been shown to promote achievement. Criteria-referenced self-assessment is a process during which students collect information about their own performance or progress; compare it to explicitly stated criteria, goals, or standards; and revise accordingly. The purpose of self-assessment is to identify areas of strength and weakness in one’s work in order to make improvement and promote learning.

Bandura, A. (1997). Self-Efficacy: The Exercise of Control. New York, NY: W. H. Freeman and Company. Self-Efficacy is the result of more than 20 years of research by the psychologist Albert Bandura and related research that has emerged from Bandura’s original work. The book is based on Bandura’s theory that those with high self-efficacy expectancies -- the belief that one can achieve what one sets out to do -- are healthier, more effective, and generally more successful than those with low self-efficacy expectancies.

Bennett, R. E. (2011). Formative Assessment: A Critical Review. Assessment in Education: Principles, Policy & Practice, 18 (1), 5-25. doi:10.1080/0969594X.2010.513678. This paper takes a critical look at the research on formative assessment, raising concerns about the conclusions drawn from landmark studies such as Black & Wiliam (1998). Bennett argues that the term “formative assessment” is problematic since it is often used to capture a wide range of practices. Furthermore, formative assessment lacks a sufficient body of peer-reviewed, methodologically rigorous studies to support a thorough analysis of its effectiveness. He concludes by stating that additional research is needed.

Black, P., Harrison, C., Hodgen, J., Marshall, B., & Serret, N. (2010). Validity in Teachers’ Summative Assessments. Assessment in Education: Principles, Policy & Practice, 17 (2), 215-232. This paper describes some of the findings of a project that set out to explore and develop teachers’ understanding and practices in their summative assessments. The focus was on those summative assessments that are used on a regular basis within schools for guiding the progress of pupils and for internal accountability. The project combined both intervention and research elements. The intervention aimed both to explore how teachers might improve those practices in light of their reexamination of their validity and to engage them in moderation exercises within and between schools to audit examples of students’ work and to discuss their appraisals of these examples. It was found that teachers’ attention to validity issues had been undermined by the external test regimes, but teachers could readdress these issues by reflection on their values and by engagement in a shared development of portfolio assessments.

Black, P., & Wiliam, D. (1998, October). Inside the Black Box: Raising Standards Through Classroom Assessment. Phi Delta Kappan, 80 (2), 139-148. Black and Wiliam conducted a review of 250 book chapters and journal articles, finding firm evidence that innovations designed to strengthen the practice of formative assessment yield substantial and significant learning gains. Learning gains are measured by comparing the average improvements in the test scores of pupils, represented by the statistical size of the effect. Typical effect sizes of the formative-assessment experiments were between 0.4 and 0.7 and are larger than most of those found for educational interventions. An effect size gain of 0.7 in the recent international comparative studies in mathematics would have raised the score of a nation in the middle of the pack of 41 countries (e.g., the United States) to one of the top five. The authors conclude that “while formative assessment can help all pupils, it yields particularly good results with low achievers by concentrating on specific problems with their work and giving them a clear understanding of what is wrong and how to put it right.” The authors recommend that “feedback to any pupil should be about the particular qualities of his or her work, with advice on what he or she can do to improve, and should avoid comparisons with other pupils.” In addition, three elements of feedback are defined: recognition of the desired goal, evidence about present position, and some understanding of a way to close the gap between the two. The authors also point out that sustained programs of professional development and support are required “if the substantial rewards promised by the research evidence are to be secured,” so that each teacher can “find his or her own ways of incorporating [feedback] into his or her own patterns of classroom work and into the cultural norms and expectations of a particular school community.”
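Several entries in this bibliography report results as effect sizes (the 0.4-0.7 range above, Kingston and Nash’s 0.2, Gollwitzer and Sheeran’s d = .65). As an illustrative sketch only -- the sample scores below are invented, not data from any study cited here -- the standardized mean difference (Cohen’s d) behind such figures can be computed as:

```python
# Illustrative only: Cohen's d, the effect-size statistic cited throughout
# this bibliography. The score lists are invented for demonstration.
from statistics import mean, stdev

def cohens_d(treatment, control):
    """Standardized mean difference between two groups, using the pooled SD."""
    n1, n2 = len(treatment), len(control)
    s1, s2 = stdev(treatment), stdev(control)
    # Pooled standard deviation across both samples
    pooled_sd = (((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)) ** 0.5
    return (mean(treatment) - mean(control)) / pooled_sd

# Hypothetical end-of-term test scores for a class that received
# formative-assessment feedback versus a comparison class.
treated = [78, 85, 90, 72, 88, 95, 81, 79]
baseline = [70, 75, 80, 65, 72, 85, 68, 74]
print(round(cohens_d(treated, baseline), 2))
```

An effect size of 0.2 is conventionally read as small, 0.5 as moderate, and 0.8 as large, which is why the 0.4-0.7 range reported by Black and Wiliam is considered substantial for an educational intervention.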

Black, P., & Wiliam, D. (2009). Developing the Theory of Formative Assessment. Educational Assessment, Evaluation and Accountability, 21 (1), 5-31. doi:10.1007/s11092-008-9068-5. This article provides a unifying framework for the diverse set of formative-assessment practices and aims to help practitioners implement the practices more fruitfully.

Bransford, J. D., Brown, A. L., & Cocking, R. R. (Eds.). (2000). Learning and Transfer. In How People Learn: Brain, Mind, Experience, and School (pp. 51-78). Washington, DC: National Academy Press. The authors explore key characteristics of learning and transfer that have important implications for education. The authors assert that all new learning involves transfer based on previous learning and that transfer from school to everyday environments is the ultimate purpose of school-based learning. Transfer is supported by abstract representations of knowledge and best viewed as an active, dynamic process rather than a passive end product of a particular set of learning experiences. Helping learners choose, adapt, and invent tools for solving problems is one way to facilitate transfer while also encouraging flexibility. Adaptive expertise, which involves the ability to monitor and regulate understanding in ways that promote learning, is an important model for students to emulate.

Briggs, D. C., Ruiz-Primo, M. A., Furtak, E., Shepard, L., & Yin, Y. (2012). Meta-Analytic Methodology and Inferences About the Efficacy of Formative Assessment. Educational Measurement: Issues and Practice, 31 (4), 13-17. doi:10.1111/j.1745-3992.2012.00251.x. This paper is a commentary on the debate around formative assessment research, focusing on inconsistent results regarding its effectiveness. While Black and Wiliam (1998) found an effect size of 0.4 to 0.7 (moderate), Kingston and Nash (2011) found an effect size of 0.2 (small) in their own meta-analysis. Briggs et al. point out methodological concerns with Kingston and Nash’s analysis, and argue that additional research is needed.

Carlson, D., Borman, G. D., & Robinson, M. (2011). A Multistate District-Level Cluster Randomized Trial of the Impact of Data-Driven Reform on Reading and Mathematics Achievement. Educational Evaluation and Policy Analysis, 33 (3), 378-398. In a randomized experiment spanning seven states and more than 500 schools -- 59 districts for the reading portion of the project and 57 districts for the math portion -- approximately half of the participating districts were randomly offered quarterly benchmark student assessments and received extensive training on interpreting and using the data to guide reform. The benchmark assessments monitored the progress of children in grades 3-8 (3-11 in Pennsylvania) in mathematics and reading and guided data-driven reform efforts. The outcome measure was school-level performance on state-administered achievement tests. The Center for Data-Driven Reform in Education model was found to have a statistically significant positive effect on student mathematics achievement. In reading, the results were positive but did not reach statistical significance.

Chang, C.-C. (2009). Self-Evaluated Effects of Web-Based Portfolio Assessment System for Various Student Motivation Levels. Journal of Educational Computing Research, 41 (4), 391-405. The purpose of this study was to explore the self-evaluated effects of a Web-based portfolio assessment system on various categories of students’ motivation. The subjects for this study were the students of two computer classes in a junior high school. The experimental group used the Web-based portfolio assessment system whereas the control group used traditional assessment. The results reveal that the Web-based portfolio assessment system was more effective, in terms of self-evaluated learning effects, for low-motivation students.

Chi, B., Snow, J. Z., Goldstein, D., Lee, S., & Chung, J. (2010). Project Exploration: 10-Year Retrospective Program Evaluation Summative Report . This report describes the independent evaluation, conducted in 2010, by the Center for Research, Evaluation, and Assessment (REA) at the Lawrence Hall of Science, University of California, Berkeley. The evaluators undertook a 10-year retrospective study of Project Exploration programming and participation by nearly 1,000 Chicago public school students. The survey and follow-up interviews attempted to surface factors that affected students’ decisions to get involved and stay involved with science. Key findings from the REA study include the following: increased science capacity; positive youth development; and engagement in a community of practice that nurtured relationships and helped students learn from one another, envision careers in science, and feel good about their futures.

Cohen, G. L., Garcia, J., Apfel, N., & Master, A. (2006). Reducing the Racial Achievement Gap: A Social-Psychological Intervention. Science, 313 (5791), 1307-1310. In two field studies, students were led to self-affirm in order to assess the consequences on academic performance. In these studies (separated by a year and composed of a separate set of students), seventh-grade students at a racially diverse middle school in the northeast United States were randomly assigned to self-affirm or not to self-affirm as part of a brief classroom exercise. Students who self-affirmed did so by indicating values that were important to them and writing a paragraph indicating why those values were important. Students who did not self-affirm indicated their least important values and wrote a paragraph regarding why those values might be important to others. The effects on academic performance during the term were dramatic. African American students who had been led to self-affirm performed about 0.3 grade points better during the term than those who had not. Moreover, benefits occurred regardless of preintervention levels of demonstrated ability. The self-affirmation intervention appears to have attenuated a drop in performance occurring for the African American students.

Cohen, G. L., Steele, C. M., & Ross, L. D. (1999). The Mentor’s Dilemma: Providing Critical Feedback Across the Racial Divide. Personality and Social Psychology Bulletin, 25 (10), 1302-1318. Stereotype threat is eliminated and motivation and domain identification are increased by so-called wise mentoring that offers criticism accompanied by high expectations and the view that each student is capable of reaching those expectations. Across two experiments, an emphasis on high standards and student capability eliminated perceived bias, eliminated differences in motivation based on race, and preserved identification with the domain in question. These results suggest that feedback that might be viewed in terms of negative stereotypes differs in effectiveness, according to the presence of an emphasis on high standards and assurance that the individual can meet those standards.

Council of Chief State School Officers (CCSSO). (2013, February). Knowledge, Skills, and Dispositions: The Innovation Lab Network State Framework for College, Career, and Citizenship Readiness, and Implications for State Policy (CCSSO White Paper). This white paper communicates the shared framework and definitional elements of college, career, and citizenship readiness accepted by Innovation Lab Network (ILN) chief state school officers in June 2012. Going forward, each ILN state has committed to adopting a definition of college and career readiness that is consistent with these elements, although precise language may be adapted, and to reorient its education system in pursuit of this goal.

Danaher, K., & Crandall, C. S. (2008). Stereotype Threat in Applied Settings Re-Examined. Journal of Applied Social Psychology, 38 (6), 1639-1655. Reducingstereotypethreat.org says this about the study: "Given the importance of standardized-test performance in determining educational opportunities, career paths, and life choices, Danaher and Crandall argue that use of standard statistical decision criteria is misplaced in this context. Accordingly, these authors reexamined the data presented by Stricker and Ward (2004) but used criteria of p < .05 from the overall analysis of variance and η ≥ .05 for the standard." Results indicate that soliciting identity information at the end rather than at the beginning of the test-taking session shrunk sex differences in performance by 33%. When test takers did not report their identities before the test, women’s performance improved noticeably and men’s scores declined slightly. Reducingstereotypethreat.org concludes, "This re-analysis suggests that soliciting social-identity information prior to test taking does produce small differences in performance consistent with previous findings in the stereotype-threat literature that, when generalized to the population of test takers, can produce profound differences in outcomes for members of different groups."

Darling-Hammond, L., & Adamson, F. (2010). Beyond Basic Skills: The Role of Performance Assessment in Achieving 21st Century Standards of Learning. Stanford, CA: Stanford University, Stanford Center for Opportunity Policy in Education. This paper is the culminating report of a Stanford University project aimed at summarizing research and lessons learned regarding the development, implementation, consequences, and costs of performance assessments. A set of seven papers was commissioned to examine experiences with and lessons from large-scale performance assessment in the United States and abroad, including technical advances, feasibility issues, policy implications, usage with English-language learners, and costs.

Darling-Hammond, L., & Barron, B. (2008). Teaching for Meaningful Learning: A Review of Research on Inquiry-Based and Cooperative Learning. In Powerful Learning: What We Know About Teaching for Understanding (pp. 11-16). San Francisco, CA: Jossey-Bass. This is a comprehensive review of research on inquiry-based-learning outcomes and approaches, including project-based learning, problem-based learning, and design-based instruction. Darling-Hammond and Barron describe the following evidence-based approaches to supporting inquiry-based teaching in the classroom: (1) clear goals and carefully designed guiding activities; (2) a variety of resources (e.g., museums, libraries, Internet, videos, lectures) and time for students to share, reflect, and apply knowledge while thinking through classroom dilemmas more productively; (3) participation structures and classroom norms that increase the use of discussion and a culture of collaboration (e.g., framing discussions to allow for addressing misconceptions midproject and using public performances); (4) formative assessments that provide opportunities for revision; and (5) assessments that are multidimensional. Ultimately, these practices will support students in evaluating their own work against predefined rubrics and promote assessment, knowledge development, and collaboration.

Darling-Hammond, L., Herman, J., Pellegrino, J., et al. (2013). Criteria for High-Quality Assessment. Stanford, CA: Stanford University, Stanford Center for Opportunity Policy in Education. The Common Core State Standards, adopted by 45 states, feature an increased focus on deeper learning, or students’ ability to analyze, synthesize, compare, connect, critique, hypothesize, prove, and explain their ideas. This report provides a set of criteria for high-quality student assessments. These criteria can be used by assessment developers, policy makers, and educators as they work to create and adopt assessments that promote deeper learning of 21st-century skills that students need to succeed in today’s knowledge-based economy.

Duckworth, A. L., Grant, H., Loew, B., Oettingen, G., & Gollwitzer, P. M. (2011). Self-Regulation Strategies Improve Self-Discipline in Adolescents: Benefits of Mental Contrasting and Implementation Intentions. Educational Psychology, 31 (1), 17-26. doi:10.1080/01443410.2010.506003. Sixty-six second-year high school students who were preparing during English class to take a high-stakes exam (the PSAT) by practicing the writing section were randomly assigned to one of two conditions: a 30-minute written “mental contrasting combined with implementation intentions” (MCII) exercise or a control condition. All students answered a question about the likelihood of accomplishing a goal (“How likely do you think it is that you will complete all 10 practice tests in the PSAT workbook?”), wrote about the importance of that goal, and listed two positive outcomes associated with completing that goal and two obstacles that could interfere. Students in the control condition wrote a short essay about an influential person or event in their life. Students in the MCII condition elaborated in writing on both the positive outcomes and obstacles of the goal by imagining each as vividly as possible. Students then rewrote both obstacles and proposed a specific solution for each one by writing three “if-then” plans (i.e., implementation intentions) in this form: “If [obstacle], then I will [solution].” The third if-then specified where and when they would complete the workbook. Students in the MCII condition completed 60 percent more practice questions than did controls. The authors conclude that “these findings point to the utility of directly teaching to adolescents mental contrasting with implementation intentions as a self-regulatory strategy of successful goal pursuit.”

Duckworth, A. L., Peterson, C., Matthews, M. D., & Kelly, D. R. (2007). Grit: Perseverance and Passion for Long-Term Goals. Journal of Personality and Social Psychology, 92 (6), 1087-1101. doi:10.1037/0022-3514.92.6.1087. The authors tested the importance of one noncognitive trait: grit. Defined as perseverance and passion for long-term goals, grit accounted for an average of 4 percent of the variance in success outcomes, including educational attainment among two samples of adults; grade point average among Ivy League undergraduates; retention among two classes at the United States Military Academy, West Point; and ranking in the Scripps National Spelling Bee. Grit did not relate positively to IQ but was highly correlated with “conscientiousness.” The authors conclude that achieving difficult goals involves not only talent but also sustained and focused application of talent over time.

Duckworth, A. L., & Quinn, P. D. (2009). Development and Validation of the Short Grit Scale (Grit-S). Journal of Personality Assessment, 91 (2), 166-174. doi:10.1080/00223890802634290. This paper validates the use of a shorter version of the Grit Scale, which measures the trait of perseverance and passion for long-term goals. The shorter version (Grit-S) correlated with educational attainment and fewer career changes among adults and predicted GPA among adolescents in addition to inversely predicting hours watching TV. Among West Point cadets, the Grit-S predicted retention, and among Scripps National Spelling Bee competitors, the Grit-S predicted final round attained, a relationship mediated by spelling practice.

Dweck, C. S. (2006). Mindset: The New Psychology of Success. New York, NY: Ballantine Books/Random House Publishing. Dweck shows how mindset takes shape when we are children and, in adulthood, influences every part of our lives: jobs, athletics, relationships, child rearing. Dweck details the ways in which creative talents across all genres -- music, literature, science, sports, business -- use the growth mindset to get results. She also demonstrates how we can change our mindset at any time to achieve true success and fulfillment. Dweck covers a range of applications and helps parents and educators see how they can promote the growth mindset.

Gersten, R., Beckmann, S., Clarke, B., Foegen, A., Marsh, L., Star, J. R., & Witzel, B. (2009). Assisting Students Struggling With Mathematics: Response to Intervention (RtI) for Elementary and Middle Schools (NCEE 2009-4060). Washington, DC: National Center for Education Evaluation and Regional Assistance, Institute of Education Sciences, U.S. Department of Education. This guide provides eight specific recommendations intended to help teachers, principals, and school administrators use response to intervention to identify students who need assistance in mathematics and to address the needs of these students through focused interventions. The guide provides suggestions on how to carry out each recommendation and explains how educators can overcome potential roadblocks to implementing the recommendations.

Gersten, R., Compton, D., Connor, C. M., Dimino, J., Santoro, L., Linan-Thompson, S., & Tilly, W. D. (2008). Assisting Students Struggling With Reading: Response to Intervention (RtI) and Multi-Tier Intervention for Reading in the Primary Grades (NCEE 2009-4045). Washington, DC: National Center for Education Evaluation and Regional Assistance, Institute of Education Sciences, U.S. Department of Education. This guide offers five specific recommendations to help educators identify struggling readers and implement evidence-based strategies to promote their reading achievement. Teachers and reading specialists can utilize these strategies to implement RtI and multitier intervention methods at the classroom or school level. Recommendations cover screening students for reading problems, designing a multitier intervention program, adjusting instruction to help struggling readers, and monitoring student progress.

Gielen, S., Peeters, E., Dochy, F., Onghena, P., & Struyven, K. (2010). Improving the Effectiveness of Peer Feedback for Learning. Learning and Instruction, 20 (4), 304-315. doi:10.1016/j.learninstruc.2009.08.007. A quasi-experimental repeated-measures design examined the effectiveness of (a) peer feedback for learning, more specifically, certain characteristics of the content and style of the provided feedback, and (b) a particular instructional intervention to support the use of the feedback. Writing assignments of 43 students in grade seven in secondary education showed that receiving “justified” comments in feedback improves performance, but this effect diminishes for students with better pretest performance. The presence of a justification mattered more than the accuracy of the comments. The instructional intervention of asking students who received peer assessment to reflect upon feedback after peer assessment did not increase learning.

Goldenberg, C. (1992/1993). Instructional Conversations: Promoting Comprehension Through Discussion. The Reading Teacher, 46 (4), 316-326. This article describes an instructional conversation model developed in collaboration with elementary school teachers who wanted to promote these types of learning opportunities for their students.

Gollwitzer, P. M., & Sheeran, P. (2006). Implementation Intentions and Goal Achievement: A Meta-Analysis of Effects and Processes. Advances in Experimental Social Psychology, 38, 69-119. Holding a strong goal intention (“I intend to reach Z!”) does not guarantee goal achievement because people may fail to deal effectively with self-regulatory problems during goal striving. This review analyzes whether realization of goal intentions is facilitated by forming an implementation intention that spells out the when, where, and how of goal striving in advance (“If situation Y is encountered, then I will initiate goal-directed behavior X!”). Findings from 94 independent tests showed that implementation intentions had a positive effect of medium-to-large magnitude (d = .65) on goal attainment. Implementation intentions were effective in promoting the initiation of goal striving, the shielding of ongoing goal pursuit from unwanted influences, disengagement from failing courses of action, and conservation of capability for future goal striving. There was also strong support for postulated component processes: Implementation-intention formation both enhanced the accessibility of specified opportunities and automated respective goal-directed responses. Several directions for future research are outlined.

Griffin, P. (2007). The Comfort of Competence and the Uncertainty of Assessment. Studies in Educational Evaluation, 33 (1), 87-99. This article argues that a probabilistic interpretation of competence can provide the basis for a link between assessment, teaching and learning, curriculum resources, and policy development. Competence is regarded as a way of interpreting the quality of performance in a coherent series of hierarchical tasks. The work of Glaser is combined with that of Rasch and Vygotsky. When assessment performance is reported in terms of competence levels, the score is simply a code for a level of development and helps to indicate Vygotsky’s zone of proximal development in which the student is ready to learn.

Hattie, J. (2009). Visible Learning: A Synthesis of Over 800 Meta-Analyses Relating to Achievement. New York, NY: Routledge. Hattie analyzed a total of about 800 meta-analyses, encompassing 52,637 studies, 146,142 effect sizes, and millions of students. Hattie points out that in education, most things work, more or less, and sets out to identify educational practices that work best and therefore best repay the effort invested. According to Hattie, the simplest prescription for improving teaching is to provide “dollops of feedback.” Providing students with feedback had one of the largest effect sizes on learning compared with other interventions studied.

Karegianes, M. L., Pascarella, E. T., & Pflaum, S. W. (1980). The Effects of Peer Editing on the Writing Proficiency of Low-Achieving Tenth Grade Students. The Journal of Educational Research, 73 (4), 203-207. This article found peer editing to be as effective as, if not more effective than, teacher editing for low-achieving students in 10th grade.

Kingston, N., & Nash, B. (2011). Formative Assessment: A Meta-Analysis and a Call for Research. Educational Measurement: Issues and Practice, 30 (4), 28-37. doi:10.1111/j.1745-3992.2011.00220.x. This meta-analysis reviews the research on formative assessment, re-examining Black and Wiliam’s (1998) claim that it has an effect size of 0.4-0.7 (moderate) on student learning. Kingston and Nash evaluated each study in Black and Wiliam’s meta-analysis and discovered many that had flawed research designs, reducing the number of valid studies from 300 to 13. Upon conducting their own meta-analysis, they found an effect size of 0.2 (small effect).

Koh, K. (2011). Improving Teachers’ Assessment Literacy Through Professional Development. Teaching Education, 22 (3), 255-276. This study examined the effects of professional development on teachers’ assessment literacy between two groups of teachers: (1) teachers who were involved in ongoing and sustained professional development in designing authentic classroom assessment and rubrics and (2) teachers who were given only short-term, one-shot professional-development workshops in authentic assessment. The participating teachers taught fourth- and fifth-grade English, science, and mathematics. The teachers who were involved in ongoing, sustained professional development showed significantly increased understanding of authentic assessment.

Liem, G. A. D., Ginns, P., Martin, A. J., Stone, B., & Herrett, M. (2012). Personal Best Goals and Academic and Social Functioning: A Longitudinal Perspective. Learning and Instruction, 22 (3), 222-230. This study examined the role of personal best (PB) goals in academic and social functioning. Alongside academic and social outcome measures, PB goal items were administered to 249 high school students at the beginning and end of their school year. Personal best goals were correlated with a range of positive variables at Time 1; moreover, at Time 2 the effects of personal best goals on deep learning, academic flow, and positive teacher relationship remained significant after controlling for prior variance of the corresponding Time 1 factors, suggesting that students with personal best goals show sustained resilience in academic and social development.

Marx, D. M., Stapel, D. A., & Muller, D. (2005). We Can Do It: The Interplay of Construal Orientation and Social Comparisons Under Threat. Journal of Personality and Social Psychology, 88 (3), 432. Results of four experiments showed that women tended to perform as well as men on a math test when the test was administered by a woman with high competence in math, but they performed more poorly (and showed a lower state of self-esteem) when the test was administered by a man. Results indicated that these effects were due to the perceived competence, and not just the gender, of the experimenter.

Mento, A. J., Steel, R. P., & Karren, R. J. (1987). A Meta-Analytic Study of the Effects of Goal Setting on Task Performance: 1966-1984 . Organizational Behavior and Human Decision Processes, 39 (1), 52-83. This meta-analysis of published research from 1966 to 1984 focuses on the relationship between goal-setting variables and task performance. The analyses “yielded support for the efficacy of combining specific hard goals with feedback versus specific hard goals without feedback.”

Murphy, P. K., Wilkinson, I. A. G., Soter, A. O., Hennessey, M. N., & Alexander, J. F. (2009). Examining the Effects of Classroom Discussion on Students’ High-Level Comprehension of Text: A Meta-Analysis . Journal of Educational Psychology, 101 (3), 740-764. This comprehensive meta-analysis of empirical studies was conducted to examine evidence of the effects of classroom discussion on measures of teacher and student talk and on individual student comprehension and critical-thinking and reasoning outcomes. Results revealed that several discussion approaches produced strong increases in the amount of student talk and concomitant reductions in teacher talk, as well as substantial improvements in text comprehension. Few approaches to discussion were effective at increasing students’ literal or inferential comprehension and critical thinking and reasoning. While the range of ages of participants in the reviewed studies was large, a majority of studies were conducted with students in grades 4-6.

National Education Association (NEA) (2012). Preparing 21st Century Students for a Global Society: An Educator’s Guide to the “Four Cs” (PDF) . NEA, in collaboration with other U.S. professional organizations, developed this guide to help educators integrate policies and practices for building the “Four Cs” (critical thinking, communication, collaboration, and creativity) into their own instruction. They argue that what was considered a good education 50 years ago is no longer enough for success in college, career, and citizenship in the 21st century.

Parker, W. C., Lo, J., Yeo, A. J., Valencia, S. W., Nguyen, D., Abbott, R. D., . . . & Vye, N. J. (2013). Beyond Breadth-Speed-Test: Toward Deeper Knowing and Engagement in an Advanced Placement Course . American Educational Research Journal, 50 (6), 1424-1459. This mixed-methods-design experiment was conducted with 289 students in 12 classrooms across four schools in an “excellence for all” context of expanding enrollments and achieving deeper learning in AP U.S. Government and Politics. Findings suggest that quasi-repetitive projects can lead to higher scores on the AP test but a floor effect on the assessment of deeper learning. Implications are drawn for assessing deeper learning and helping students adapt to shifts in the grammar of schooling.

Reis, S. M., McCoach, D. B., Little, C. A., Muller, L. M., & Kaniskan, R. B. (2011). The Effects of Differentiated Instruction and Enrichment Pedagogy on Reading Achievement in Five Elementary Schools . American Educational Research Journal, 48 (2), 462-501. Five elementary schools (63 teachers and 1,192 students in grades 2-5) were randomly assigned to differentiated or whole-group classroom instruction in reading. The differentiated approach focused on student engagement in reading through a three-phase model. The model begins with a book discussion and read-aloud that integrates reading strategies or higher-level-thinking questions, along with time for independent reading. Teachers then listened to individual students read and provided differentiated reading strategies in five-minute conferences, or students participated in literary discussions. Groups then had options for independent reading, creativity training, buddy reading, or other choices. Differentiated instruction increased reading fluency in three out of five schools. The authors conclude that differentiated instruction was at least as effective as, and in some cases more effective than, the traditional whole-group approach.

Rosenshine, B., & Meister, C. (1994). Reciprocal Teaching: A Review of the Research . Review of Educational Research, 64 (4), 479-530. An analysis of 16 studies indicated that reciprocal teaching was effective as long as the quality of instruction was reasonably high. The effect size was much larger for experimenter-developed comprehension tests (short answers and passage summaries) than when standardized tests were used.

Rowe, M. B. (1974). Wait-Time and Rewards as Instructional Variables, Their Influence on Language, Logic, and Fate Control: Part One -- Wait-Time . Journal of Research in Science Teaching, 11 (2), 81-94. The level of complexity in student responses rises as a teacher pauses after asking questions. Analysis of more than 300 tape recordings over six years of investigations showed mean wait time after teachers ask questions to be about one second. If students do not begin a response, teachers then repeat, rephrase, ask a different question, or call on another student. When mean wait times of three to five seconds are achieved through training, the length of student responses increases, the number of unsolicited but appropriate responses also increases, and failures to respond decrease.

Shute, V. J. (2008). Focus on Formative Feedback (PDF) . Review of Educational Research, 78 (1), 153-189. doi:10.3102/0034654307313795. Shute defines formative feedback as “information communicated to the learner intended to modify his or her thinking or behavior to improve learning.” One hundred and forty-one publications that met the criteria for inclusion serve as the basis for this review, which uncovers several guidelines for generating effective feedback. These guidelines include the following: (1) Feedback to the learner should focus on the specific features of his or her work in relation to the task and provide suggestions on how to improve. (2) Feedback should focus on the “what, how, and why” of a problem. (3) Elaborated feedback should be presented in manageable units, and feedback should present information only to the extent that students can correct answers on their own. (4) Feedback is more effective when it comes from a trusted source. And (5) immediate feedback is most helpful for procedural or conceptual learning, at the beginning of the learning process, and when the task is new and difficult relative to the learner’s capability; delayed feedback is best when tasks are simple relative to the learner’s capability or when transfer to other contexts is sought.

Slavin, R. E., Cheung, A., Holmes, G., Madden, N. A., & Chamberlain, A. (2012). Effects of a Data-Driven District Reform Model on State Assessment Outcomes . American Educational Research Journal, 50 (2), 371-396. A district-level reform model created by the Center for Data-Driven Reform in Education (CDDRE) provided consultation with district leaders on strategic use of data and selection of proven programs. Fifty-nine districts in seven states were randomly assigned to CDDRE or control conditions. A total of 397 elementary schools and 225 middle schools were followed over a period of up to four years. Positive effects were found on reading outcomes in elementary schools by year four. An exploratory analysis found that reading effects were larger for schools that selected reading programs with good evidence of effectiveness than for those that did not.

Stecher, B. (2010). Performance Assessment in an Era of Standards-Based Educational Accountability (PDF) . Stanford, CA: Stanford University, Stanford Center for Opportunity Policy in Education. This paper is one of eight written through a Stanford University project aimed at summarizing research and lessons learned regarding the development, implementation, consequences, and costs of performance assessments. The paper defines performance assessment and different types of performance tasks; reviews recent history of performance assessments in the United States; and summarizes research on the quality, impact, and burden of performance assessments used in large-scale K-12 achievement testing.

Steele, C. M., & Aronson, J. (1995). Stereotype Threat and the Intellectual Test Performance of African Americans (PDF) . Journal of Personality and Social Psychology, 69 (5), 797-811. This paper raised the possibility that culturally shared stereotypes suggesting poor performance by certain groups can, when made salient in a context involving the stereotype, disrupt the performance of an individual who identifies with that group. This effect was termed stereotype threat, and its existence and consequences were investigated in four experiments. Study 1 involved black and white college students who took a difficult test using items from the verbal GRE under one of three conditions. In the stereotype-threat condition, the test was described as diagnostic and as a good indicator of their intellectual abilities. In one nonthreat condition, the test was described as simply a problem-solving exercise that was not diagnostic of ability. In a third, also nondiagnostic, condition, participants were encouraged to view the test as a challenge. Performance was compared across the three conditions after statistically controlling for self-reported SAT scores. Black participants performed less well than their white counterparts in the diagnostic condition, but in the nonthreat conditions, their performance was close to that of their white counterparts. Study 2 replicated this effect and also showed that blacks both completed fewer test items and had less success in correctly answering items under stereotype threat. In Study 3, black and white undergraduates completed a task that was described either as evaluative (assessing strengths and weaknesses) or as not evaluative of ability, though experimenters encouraged students to try their best and let them know that they could find out their abilities later.
When the task supposedly measured ability, African American participants showed heightened awareness of their racial identity (by completing word fragments related to their race), more doubts about their ability (by completing word fragments related to self-doubt), a greater likelihood for excuses for poor performance (i.e., self-handicapping), a tendency to avoid racial-stereotypic preferences, and a lower likelihood of reporting their race compared with students in the low-threat condition. Study 4 sought to identify the conditions sufficient to activate stereotype threat by having undergraduates complete the nonthreat conditions from Studies 1 and 2. Unlike in those experiments, however, students’ ethnic information was solicited from some of the students before they completed the test items. Results showed that performance was poorer only among African Americans whose racial identity was made salient prior to testing. These studies established the existence of stereotype threat and provided evidence that stereotypes suggesting poor performance, when made salient in a context involving the stereotypical ability, can disrupt performance, produce doubt about one’s abilities, and cause an individual to disidentify with one’s ethnic group.

Strobel, J., & van Barneveld, A. (2009). When Is PBL More Effective? A Meta-Synthesis of Meta-Analyses Comparing PBL to Conventional Classrooms . The Interdisciplinary Journal of Problem-Based Learning, 3 (1). Researchers from Purdue University and Concordia University synthesized eight meta-analyses of problem-based learning (PBL) studies to evaluate the effectiveness of problem-based learning and the conditions under which PBL is most effective. The meta-analyses included medical students and adult learners in postsecondary settings. PBL was more effective than traditional instruction for long-term retention, skill development, and satisfaction of students and teachers. Traditional approaches, on the other hand, were more effective for improving performance on standardized exams, considered by the researchers as a measure of short-term retention.

Topping, K. J. (2009). Peer Assessment . Theory Into Practice, 48 (1), 20-27. Peer assessment is an arrangement in which learners consider and specify the level, value, or quality of a product or performance of other equal-status learners. Products to be assessed can include writing, oral presentations, portfolios, test performance, or other skilled behaviors. A formative approach to peer assessment helps students help one another plan their learning, identify their strengths and weaknesses, target areas for remedial action, and develop metacognitive and other personal and professional skills. A peer assessor with less skill at assessment but more time in which to do it can produce an assessment of reliability and validity equal to that of a teacher. Because peer feedback is available in greater volume and with greater immediacy than teacher feedback, teachers are encouraged to use it.

Tubbs, M. E. (1986). Goal Setting: A Meta-Analytic Examination of the Empirical Evidence . Journal of Applied Psychology, 71 (3), 474-483. Tubbs conducted meta-analyses to estimate the amount of empirical support for the major postulates of the goal theory of E. A. Locke (see record 1968-11263-001) and Locke et al. (see record 1981-27276-001). The results of well-controlled studies were generally supportive of the hypotheses that specific and challenging goals led to higher performance than easy goals, “do your best” goals, or no goals. Goals affect performance by directing attention, mobilizing effort, increasing persistence, and motivating strategy development.

Usher, E. L., & Pajares, F. (2008). Sources of Self-Efficacy in School: Critical Review of the Literature and Future Directions . Review of Educational Research, 78 (4), 751-796. The purpose of this review was threefold. First, the theorized sources of self-efficacy beliefs proposed by A. Bandura (1986) are described and explained, including how they are typically assessed and analyzed. Second, findings from investigations of these sources in academic contexts are reviewed and critiqued, and problems and oversights in current research and in conceptualizations of the sources are identified. Although mastery experience is typically the most influential source of self-efficacy, the strength and influence of the sources differ as a function of contextual factors such as gender, ethnicity, academic ability, and academic domain. Finally, suggestions are offered to help guide researchers investigating the psychological mechanisms at work in the formation of self-efficacy beliefs in academic contexts.

Walker, A., & Leary, H. (2009). A Problem-Based Learning Meta-Analysis: Differences Across Problem Types, Implementation Types, Disciplines, and Assessment Levels . Interdisciplinary Journal of Problem-Based Learning, 3 (1), 12-43. In a meta-analysis of 82 studies examining 201 outcomes, problem-based learning (PBL) was favored over traditional instructional methods. The authors review Jonassen’s (2000) typology of 11 problem types, which range from logical problems to dilemmas and distinguish highly structured problems (focused on an accurate and efficient path to an optimal solution) from ill-structured problems (which do not necessarily have solutions and which prioritize evaluation of evidence and reasoning). The typology includes logical problems; algorithmic problems; story problems (underlying algorithms with a story wrapper); rule-using problems; decision-making problems (e.g., cost-benefit analysis); troubleshooting (systematically diagnosing a fault and eliminating a problem space); diagnosis-solution problems (characteristic of medical school, in which small groups work to understand the problem, research possible causes, generate hypotheses, perform diagnostic tests, and monitor a treatment to restore a goal state); strategic performance; case analysis (characteristic of law or business school, involving adapting tactics to support an overall strategy and reflecting on authentic situations); design problems; and dilemmas (such as global warming, which are complex, involve competing values, and may have no obvious solutions). Strategic-performance and design problems were deemed especially effective in producing positive PBL outcomes.

Watt, K. M., Powell, C. A., & Mendiola, I. D. (2004). Implications of One Comprehensive School Reform Model for Secondary School Students Underrepresented in Higher Education (PDF) . Journal of Education for Students Placed at Risk, 9 (3), 241-259. A study of 10 high schools that implemented Advancement Via Individual Determination (AVID) found that all 10 of the AVID schools improved their accountability ratings during the first three years of AVID implementation. AVID students outperformed their classmates on various standardized tests and attended school more often than their classmates.

Wiggins, G., & McTighe, J. (2005). Understanding by Design . Alexandria, VA: Association for Supervision and Curriculum Development. The ASCD website says the following about this book: “What is understanding and how does it differ from knowledge? How can we determine the big ideas worth understanding? Why is understanding an important teaching goal, and how do we know when students have attained it? How can we create a rigorous and engaging curriculum that focuses on understanding and leads to improved student performance in today’s high-stakes, standards-based environment? Authors Grant Wiggins and Jay McTighe answer these and many other questions in this second edition of Understanding by Design. Drawing on feedback from thousands of educators around the world who have used the UbD framework since its introduction in 1998, the authors have greatly revised and expanded their original work to guide educators across the K-16 spectrum in the design of curriculum, assessment, and instruction.”

Wiliam, D. (2010). The Role of Formative Assessment in Effective Learning Environments . In H. D. Dumont, D. Istance, & F. Benavides (Eds.), The Nature of Learning: Using Research to Inspire Practice (pp. 135-159). OECD Publishing. This chapter summarizes and elaborates upon formative-assessment research and effective practices to date.

Yeh, S. S. (2007). The Cost-Effectiveness of Five Policies for Improving Student Achievement (PDF) . American Journal of Evaluation, 28 (4), 416-436. doi:10.1177/1098214007307928. Yeh conducts a cost-benefit analysis comparing five educational policies: rapid assessment, voucher programs, charter schools, accountability, and increased spending. Rapid assessment is identified as the most cost-effective of the strategies analyzed.

Go to the first section of the Comprehensive Assessment Research Review, Definitions and Outcomes .

  • Introduction: Definition and Outcomes
  • Start With Challenging, Multifaceted Goals
  • Provide Ongoing, Actionable Feedback
  • Motivate Students to Improve
  • Avoiding Pitfalls
  • Annotated Bibliography

Grading and Ungrading: An Annotated Bibliography

This document brings together a number of resources on the topic of ungrading, drawn from publications ranging from the popular press to academic venues. The resources were curated by facilitators and members of a Fall 2023 learning community on this topic, who included discussion questions as well as citations. Rather than endorsing a single perspective, the resources are intended to prompt discussion and consideration of grading and its alternatives.

Ungrading: Why Rating Students Undermines Learning (And What to Do Instead). Edited by Susan D. Blum. Morgantown: West Virginia University Press, 2020. 

Abstract: This interdisciplinary edited collection brings together theoretical and practical explorations of ungrading, examining different models and offering both practical examples and reflections from practitioners across the disciplines. Examples include contract grading in writing-driven courses as well as an organic chemistry course restructured around ungraded work.

Suggested discussion questions: Why do we grade? What does it feel like to be graded? What do we want grading to do or not do in our classrooms? Could you imagine implementing one of these models in your classroom? Why or why not? 

Tags: ungrading, contract grading, distance-traveled

Bowles, Samuel. The Moral Economy: Why Good Incentives Are No Substitute for Good Citizens. New Haven: Yale University Press, 2016.

Abstract: Reviewing the economic research on moral and economic motives, the author emphasizes how both economic rewards and penalties can crowd out intrinsic motivation. This insight may transfer to our thinking about assessment: penalties or better grades may not be the best way to encourage students to truly devote themselves to learning.

Suggested discussion question(s): How can we focus on rewarding positive behavior, rather than just censuring negative behavior? How can we build intrinsic motivation through fostering relationships in the classroom (between students; between instructors and students), rather than creating transactional relationships?  

Tags: incentives, intrinsic motivation, extrinsic motivation

Cimino, Adria J. “An Inside Look at Sorbonne Grades,” Medium (February 15, 2015). https://medium.com/paris-stories/an-inside-look-at-sorbonne-grades-66ee4a87b0e5

Abstract: A short blog post from an American experiencing the French education system through her children and her husband’s past experience. The author explores the differences between French and American grades, while reflecting on how those differences shape engagement and reflect cultural norms. 

Suggested discussion question(s): How does culture impact grading? How do past experiences shape students’ perceptions of our grading or ungrading systems? How do we instill intrinsic motivation in our students? Does written feedback change the way that students perceive their own progress, and which disciplines prioritize that kind of feedback?

Tags: grading systems, culture, international education

Gorichanaz, Tim.  “‘It Made Me Feel Like It Was Okay to Be Wrong’: Student Experiences with Ungrading.” Active Learning in Higher Education, May 2022: 1-23. 

Abstract: A qualitative study based on in-depth interviews with eight students, reporting four experiential themes that characterize the switch to ungrading: “de-gamification, or unsettling the ‘gamified’ nature of evaluation in the traditional grading system; time to think and reflect, creating space for review and the deepening of learning; rich communication, or continual feedback between teacher and student; and learning community, in which students felt like they were part of a team effort rather than siloed individuals.”

Suggested discussion question(s): What characteristics of ungrading can best equip students to maximize their learning and succeed in an ever-changing fast-paced world?

Tags: ungrading, gamification, reflection, communication, feedback

Miller, Michelle D. “Ungrading Light: 4 Simple Ways to Ease the Spotlight off Points.” The Chronicle of Higher Education. August 2, 2022. 

This brief piece considers both the appeal of ungrading and some ways of easing into the process – yielding some of the benefits of the process without redesigning a course entirely. Those include ideas like implementing some ungraded required assignments, dropping late work penalties, and offering two-stage exams, among others. 

Suggested discussion questions: Which of these ideas seem practicable to you? What drawbacks or benefits might you see after implementation? 

Ren, Eva. “What Your Grades Really Mean: A TEDx Talk.” April 19, 2017, https://www.youtube.com/watch?v=yu5GPsnxBS4 .  

Abstract: Grades don’t tell the full story. This TEDx Talk, delivered by a twelfth-grade student, explores how grades limit creativity, discourage students from taking classes outside their comfort zones, and undermine their self-confidence.

Suggested discussion question(s): Does grading impede creativity and long-term knowledge? 

Tags: Grading, motivation, student voice

Scheinfeld, Daniel R., Karen M. Haigh, and Sandra J. P. Scheinfeld. We Are All Explorers: Learning and Teaching with Reggio Principles in Urban Settings.

Abstract: A real-world case study of education using the Reggio Emilia approach, which treats preschool and elementary learning as a student-focused, self-guided experience. “While focusing on the application, meaning, and value of Reggio Emilia principles in preschool classrooms, the authors also describe how those same principles and processes pervade relationships with parents, the professional development of teachers, and the overall organization of the program. Offering a powerful combination of theory and practice, this comprehensive model: Provides 10 years of lessons learned from successfully implementing the Reggio Approach in American inner-city schools.”

Suggested discussion question(s): How can the 11 pedagogical principles (or any one of them) be adapted to higher education? If graduate education is already shaped by students’ individual interests, how can we build curricula and assessment mechanisms that honor that? What systems and practices have prevented these approaches from being accepted as valid pedagogical tools for higher education?

Tags: Interest-driven, exploratory, early childhood, perspective-taking

These resources were curated by members of the Fall 2023 Grading and Ungrading Learning Community for graduate students and postdoctoral scholars, facilitated by Gina Marie Hurley and Rachel Wilson. 

Members of this group included Tianyi Zeng, Devin Thomas, Jasper Eastman, Kasturi Roy, Emma Mew, Fiona Bell, Hannah Keller, Thomas Zapadka, Alana Felton, Leonardo Carvalho, Patricia DuCharme, Allegra Ayida, Brielle Januszewski, and Isabelle Chouinard.



Writing Evaluative Annotated Bibliographies

Dawn Atkinson

Chapter Overview

This chapter aims to help you understand what an annotated bibliography is and how this type of document can be used when planning assignments, conducting research, and evaluating sources. An annotated bibliography generally takes one of two forms: descriptive annotated bibliographies reference and briefly describe sources, while evaluative (or critical) annotated bibliographies reference, succinctly summarize, and evaluate resources. Regardless of form, an annotated bibliography may be incorporated into a longer text, such as a formal report, or be produced as a stand-alone piece to document research work—for example, to accompany an in-depth assignment like a researched argument essay. Either way, the sources listed on an annotated bibliography should center on a topic or focus; if the annotated bibliography documents research efforts related to an associated assignment, the focus will reflect the author’s thesis, research questions (questions that a study seeks to answer), or research objectives.

The remainder of this chapter addresses evaluative annotated bibliographies.

What, specifically, is an evaluative annotated bibliography?

An evaluative annotated bibliography focuses on an overarching topic by listing pertinent references and by providing sentences that discuss and assess the resources identified in those references. Place the reference for a source at the beginning of an annotated bibliography entry, and then craft sentences and paragraphs about the source that do some or all of the following in accordance with the specifications for the assignment.

  • Summarize the source’s main argument, main point, central themes, or key takeaways.
  • Evaluate the source; in other words, assess the source based on criteria. For example, what is your view of the source’s usefulness or relevance (in terms of research about a topic), accuracy, trustworthiness, timeliness, level of objectivity, or quality, and why do you hold that view? What methods did the author(s) of the source use to collect data, are they sound, and how did you draw that conclusion? How does the source compare with other publications listed on the annotated bibliography that address the same topic? You will need to read all the sources on your annotated bibliography before you can answer this last question.
  • Comment on how the source corresponds with your research aim. How does it fit with, support, or differ from your viewpoint? How has it expanded your thinking on a topic? How might you use the source when writing an associated assignment?

An annotation may also include information about an author’s credentials, the intended audience for a source, and the purpose of a text. Note that in technical and academic genres, authors oftentimes foreground the purposes of their works by indicating them early on, to help readers understand the overall direction of the writing. The purpose of a scholarly journal article, for instance, will typically be stated in its abstract, which is a summary of the article located after the publication’s title but before its introduction.

Why might you be asked to consider author credentials and date of publication when compiling sources for an annotated bibliography?

What does an evaluative annotated bibliography look like?

When constructing an annotated bibliography, follow your instructor’s directions about what information to include and how to format the document, and structure its references according to the style conventions specified. References in the following annotated bibliography entries adhere to APA style; the annotated bibliography as a whole also follows APA formatting conventions. The entries, which are adapted from McLaughlin (2020) as cited in Excelsior Online Writing Lab (2020, “Sample Annotated Bibliography”), feature combinations of the annotated bibliography information listed previously in this chapter and center on the topic of the transferability of writing skills, or applying knowledge and skills about writing gained in one context to another context—a practice that may advance a writer’s knowledge and skills. If you are asked to produce an annotated bibliography for a class, help readers navigate its contents by being consistent about the type of information you supply in each of its entries. In accordance with APA style, the references in the following sample are alphabetized by first author’s last name.

Boone, S., Biggs Chaney, S., Compton, J., Donahue, C., & Gocsik, K. (2012). Imagining a writing and rhetoric program based on principles of knowledge “transfer”: Dartmouth’s Institute for Writing and Rhetoric. Composition Forum, 26. http://compositionforum.com/issue/26/dartmouth.php

In this article, Boone et al. (2012) provide an overview of the writing program at Dartmouth College’s Institute for Writing and Rhetoric to show what a program based on writing transfer research looks like. The authors trace the history of the program’s development, explain its current curriculum and organization, and look at future directions for the program. Beginning with the idea that skills and knowledge do not all transfer in the same way, program developers at Dartmouth set out to explore what kind of knowledge writing is and how this knowledge is transferred. By developing curriculum and sequences of courses that foster reflection and connections to future courses and by encouraging faculty development, Dartmouth has established a thoughtfully constructed writing program that serves as a model for other such programs. The authors also explore the state of research on the program and articulate goals based on the results of that research.

This piece serves as a useful guide composed by writing program administrators and writing researchers who are interested in seeing how current studies of writing transfer can be applied to an operating program. The authors offer practical advice, include sample syllabi and curriculum, and honestly reflect on successes and struggles of the program. This article provides much-needed information for those interested in developing a writing program that aligns with transfer research.

Moore, J. (2012). Mapping the questions: The state of writing-related transfer research. Composition Forum, 26. http://compositionforum.com/issue/26/map-questions-transfer-research.php

Moore (2012) reviews the literature on writing skill transfer in this article as a starting point for those who are interested in the research area and are already conversant in the language of rhetoric and composition studies. The author begins by discussing the history of research on writing skill transfer, describing issues related to common definitions and multi-institutional studies. She also explores the goals and methods of recent investigations and, ultimately, calls for explorations of new areas pertinent to writing transfer research. In doing the latter, she raises important questions about how students’ involvement in non-writing courses and non-academic activities may influence what they do when writing.

Moore’s article provides a helpful overview of studies in the field of writing skill transfer and establishes a jumping-off point for new investigations in the area. I can use information from the article in my term paper introduction to establish context for the reader before exploring different dimensions of writing skill transfer in the body of the piece.

Wardle, E. (2007). Understanding ‘transfer’ from FYC: Preliminary results of a longitudinal study. Writing Program Administration, 31(1-2), 65-85. http://associationdatabase.co/archives/31n1-2/31n1-2wardle.pdf

In her report on a longitudinal study conducted at the University of Dayton, Wardle (2007) explores the transfer of writing skills from first-year college composition courses. She begins by explaining that research is limited when it comes to transfer of writing skills, even though transfer is seen as a key function of first-year writing courses. The research that does exist indicates that the skills do not transfer well. With this in mind, Wardle established a curriculum designed to support writing transfer and followed students for two years after they had completed first-year composition. Her research indicates that the skills from first-year writing did not transfer well, not because students were unable to make the transfer but because the writing assignments they encountered, along with a variety of other issues, made them feel there was no need to transfer the skills.

This longitudinal study is a foundational piece for writing program directors and serves as a call for more research on writing skill transfer, particularly as it relates to first-year college writing courses. Consequently, lessons gleaned from this study continue to inform writing teachers, program directors, and researchers. In the article, Wardle cites her work with colleague Doug Downs. Together, Wardle and Downs are known as leaders in writing transfer research, which again speaks to the article’s contribution as a trustworthy and influential piece of scholarship.

While the above sample focuses exclusively on the topic of writing skill transfer, an annotated bibliography that focuses on multiple topics related to a central theme may organize these under specific and informative headings to help readers distinguish one topic area from another. Additionally, if you are asked to produce an annotated bibliography as a stand-alone document, you may be required to provide an introduction to help set the context for the rest of the piece and to explain its purpose.

What are evaluative annotated bibliographies used for?

Because evaluative annotated bibliographies summarize, evaluate, and consider the relevance of sources, they can be used to narrow a research focus, to weigh up research in an area, and to document research findings. To illustrate, maybe you have been asked to compose an evaluative annotated bibliography on the way to producing a researched argument essay. Although you know which topic you want to write about in your essay, you are less clear about what the research says regarding this topic. After reading a book chapter and several journal, magazine, and newspaper articles on the topic, you begin drafting your annotated bibliography and notice that the sources discuss similar and opposing viewpoints and support these with various pieces of evidence. Although you were fairly certain of your perspective on the issue before you began the annotated bibliography assignment, you acknowledge that your view has expanded as a result of reading, writing about, and considering how the sources relate to your researched argument paper. Furthermore, by evaluating the sources for accuracy, quality, and relevance, you are also able to determine which ones best underpin your claims, as well as opposing claims. Thus, you are able to develop a focused thesis statement and supporting topic sentences for your essay that acknowledge the complexities of the topic. Finally, your annotated bibliography documents your research work for readers, communicating which sources you investigated for purposes of composing your researched argument and your evaluations of these sources.

Activity: Produce an Evaluative Annotated Bibliography Entry

Read Michael Bunn’s (2011) essay “How to Read Like a Writer,” which can be found at https://wac.colostate.edu/docs/books/writingspaces2/bunn–how-to-read.pdf . Bunn teaches in the University of Southern California’s Writing Program. After reading, reflect on the essay and its pertinence to your own reading and writing life by answering the four discussion questions on page 85 of the text. Be prepared to talk about your answers in class.

Once you have read, reflected on, and discussed the essay, produce an annotated bibliography entry for the source by following these steps.

Step 1: Write a complete, accurate APA reference list entry for the source.

Step 2 : Answer the following questions.

  • What qualifications does the author have? Google him to discover additional information about his credentials beyond that already supplied.
  • Who is the intended audience for the source?
  • What is the purpose of the source?
  • How do the audience and purpose influence how information is presented in the source?
  • What argument does the author make?
  • Is the argument convincing? Why or why not?
  • How does the source contribute to your own ideas about reading and writing or relate to other sources you have read about reading and writing?

Step 3: Use the notes you have made to draft an evaluative annotated bibliography entry for the Bunn (2011) text. Refer to the information and examples provided in this chapter for guidance.

Homework: Produce an Evaluative Annotated Bibliography

Identify a topic of inquiry you can explore by means of an annotated bibliography. Your instructor may assign you a topic or ask you to select one. Research the topic by locating and reading sources about it; a librarian can help you identify a focused list of sources. Afterwards, compose an evaluative annotated bibliography that references, summarizes, and evaluates the sources. Your instructor may also ask you to identify author credentials and the intended audience and purpose for each source. In addition, you may be asked to discuss how the sources relate to a larger research aim. Since the evaluative annotated bibliography is a stand-alone assignment, supply an introduction to help set the context for the rest of the piece and to explain its purpose.

After drafting your evaluative annotated bibliography, your instructor may ask you to assess it in relation to the rubric criteria outlined in Rinto (2013, p. 10) in order to refine its content. The rubric is provided here in adapted format for your reference.

When refining your annotated bibliography, use the following handout, produced by the Writing and Communication Centre, University of Waterloo (n.d.), to ensure you have used the words that and which correctly.

 Which vs That: Restrictive and Non-Restrictive Clauses

Restrictive and Non-Restrictive Clauses

Which and that both introduce clauses (groups of words) that provide more information about part of a sentence.

e.g., The daily special, which was poached salmon, cost a lot.

e.g., The dish that the sous-chef prepared turned out to be better than the daily special.

Using Restrictive Clauses: That

Use that when the information in the clause is necessary to the meaning of the sentence. It’s called a restrictive clause because it restricts, or limits, the meaning of the sentence.

e.g., Suitcases that weigh more than 23 kg must be checked.

that weigh more than 23 kg is necessary to the meaning of the sentence. If you removed this restrictive clause, it would imply that all suitcases must be checked, which isn’t what the author intends.

e.g., Drinks that have caffeine make it hard to fall asleep.

that have caffeine is also restrictive. If you take this part out, it suggests that all drinks make it hard to fall asleep.

Some writers will use which for a restrictive clause instead of that. This is technically acceptable, but to avoid confusion it is clearer to reserve that for restrictive clauses and which for non-restrictive ones.

Using Non-Restrictive Clauses: Which

Use which when the information in the clause is not necessary to the meaning of the sentence. It might be helpful or interesting, but if you took it out, the sentence would still make sense.

e.g., The suitcase, which was stuffed with dirty clothes, didn’t fit in the overhead bin.

If the which clause were removed: e.g., The suitcase didn’t fit in the overhead bin.

e.g., Coffee and tea, which both have caffeine, are Canada’s favourite morning drinks.

If the which clause were removed: e.g., Coffee and tea are Canada’s favourite morning drinks.

Note that the non-restrictive which clause is set off by commas.

Use that without commas for a restrictive (necessary) clause. That is used more often than which. Use which with commas for a non-restrictive (not necessary) clause.

Write in that (for restrictive clauses) or which (for non-restrictive clauses).

  • The spoon __________ fell on the floor needed to be washed.
  • The book __________ she wanted was on the top shelf.
  • They used Post-It notes __________ come in various colours to organize the pages.
  • For the hike I need shoes __________ are sturdy.
  • For the hike I need sturdy shoes __________ are expensive.
  • The first skyscraper we saw __________ was the biggest one on that street had 67 floors.
  • The only elevator __________ went all the way to the top was out of service.
  • The cord __________ charges this computer is missing.
  • He provided us with a whole box of samples __________ we didn’t really need so we could make a decision.

Bunn, M. (2011). How to read like a writer. In C. Lowe & P. Zemliansky (Eds.), Writing spaces: Readings on writing (Vol. 2, pp. 71-86). Parlor Press. License: CC-BY 4.0. https://wac.colostate.edu/docs/books/writingspaces2/bunn–how-to-read.pdf

Excelsior Online Writing Lab. (2020). Annotated bibliographies. License: CC-BY 4.0. https://owl.excelsior.edu/research/annotated-bibliographies/

Rinto, E. E. (2013). Developing and applying an information literacy rubric to student annotated bibliographies. Evidence Based Library and Information Practice, 8(3), 5-18. License: CC-BY-NC-SA 4.0. https://doi.org/10.18438/B8559F

Writing and Communication Centre, University of Waterloo. (n.d.). Which vs that: Restrictive and non-restrictive clauses. License: CC-BY-SA 4.0. https://uwaterloo.ca/writing-and-communication-centre/sites/ca.writing-and-communication-centre/files/uploads/files/which_vs_that.pdf

Mindful Technical Writing Copyright © 2020 by Dawn Atkinson is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.


Evaluation & Assessment Bibliography

This bibliography includes resources for assessing students' competence in ethics as well as for evaluating the effectiveness of ethics instruction. It also includes a section on methods for assessing the ethical climate of an organization.

Evaluation of Instructional Methods - Course Level

Antes, Alison L., Stephen T. Murphy, Ethan P. Waples, Michael D. Mumford, Ryan P. Brown, Shane Connelly, and Lynn D. Devenport. 2009. “A Meta-Analysis of Ethics Instruction Effectiveness in the Sciences.” Ethics and Behavior 19(5): 379-402. doi: 10.1080/10508420903035380. In the present study, the authors conducted a quantitative meta-analysis based on 26 previous ethics program evaluation efforts of responsible conduct of research courses in the sciences, and the results showed that the overall effectiveness of ethics instruction was modest. The effects of ethics instruction, however, were related to a number of instructional program factors, such as course content and delivery methods, in addition to factors of the evaluation study itself, such as the field of investigator and criterion measure utilized. An examination of the characteristics contributing to the relative effectiveness of instructional programs revealed that more successful programs were conducted as seminars separate from the standard curricula rather than being embedded in existing courses. Furthermore, more successful programs were case based and interactive, and they allowed participants to learn and practice the application of real-world ethical decision-making skills.

Cates, Cheryl, and Bryan Dansberry. 2004. "A Professional Ethics Learning Module For Use in Co-operative Education."  Science & Engineering Ethics 10 (2):401-407. The Professional Practice Program, also known as the co-operative education (co-op) program, at the University of Cincinnati (UC) is designed to provide eligible students with the most comprehensive and professional preparation available. Beginning with the Class of 2006, students in UC's Centennial Co-op Class will be following a new co-op curriculum centered around a set of learning outcomes. Regardless of their particular discipline, students will pursue common learning outcomes by participating in the Professional Practice Program, which will cover issues of organizational culture, technology, professional ethics, and the integration of theory and practice. During their third co-op work term, students will complete a learning module on Professional Ethics. To complete the learning module, students must familiarize themselves with the code of ethics for their profession, create a hypothetical scenario portraying an ethical dilemma that involves issues covered by the code, resolve the dilemma, and explain why their resolution is the best course of action based upon the code of ethics. A three-party assessment process involving students, employers, and faculty completes the module.

Clancy, Edward A., Paula Quinn, and Judith E. Miller. 2005. “Assessment of a Case Study Laboratory to Increase Awareness of Ethical Issues in Engineering.” IEEE Transactions on Education 48(2): 313-317. This article discusses the assessment of a three-hour “laboratory period,” during which students read and discussed three short cases on engineering ethics. The assessment included focus groups and surveys; while students in focus groups agreed that this activity enhanced their awareness of ethical issues, the survey results were equivocal.

Cruz, José A., and William J. Frey. 2003. "An Effective Strategy for Integrating Ethics Across the Curriculum in Engineering: An ABET 2000 Challenge."  Science & Engineering Ethics 9 (4):543-568. This paper describes a one-day workshop format for introducing ethics into the engineering curriculum prepared at the University of Puerto Rico at Mayagüez (UPRM). It responds to the ethics criteria newly integrated into the accreditation process by the Accreditation Board of Engineering and Technology (ABET). It also employs an ethics across the curriculum (EAC) approach; engineers identify the ethical issues, write cases that dramatize these issues, and then develop exercises making use of these cases that are specially tailored to mainstream engineering classes. The different activities and strategies employed in this workshop are set forth. Specific references are made to the cases and exercises developed as a result of these workshops. The paper ends by summarizing the different assessments made of the workshop by addressing the following questions: how did it contribute to the overall ABET effort at UPRM; could other universities benefit from a similar activity; and how did the participants evaluate the workshop?

Davis, Michael, and Alan Feinerman. 2012. "Assessing Graduate Student Progress in Engineering Ethics."  Science and Engineering Ethics 18 (2):351-367. Under a grant from the National Science Foundation, the authors (and others) undertook to integrate ethics into graduate engineering classes at three universities -- and to assess success in a way allowing comparison across classes (and institutions). This paper describes the attempt to carry out that assessment. Standard methods of assessment turned out to demand too much class time. Under pressure from instructors, the authors developed an alternative method that is both specific in content to individual classes and allows comparison across classes. Results are statistically significant for ethical sensitivity and knowledge. They show measurable improvement in a single semester.

Fan, Y., X. Zhang, and X. Xie. 2015. "Design and Development of a Course in Professionalism and Ethics for CDIO Curriculum in China."  Science and Engineering Ethics 21 (5):1381-9. doi: 10.1007/s11948-014-9592-2. At Shantou University (STU) in 2008, a stand-alone engineering ethics course was first included within a Conceive-Design-Implement-Operate (CDIO) curriculum to address the scarcity of engineering ethics education in China. The philosophy of the course design is to help students to develop an in-depth understanding of social sustainability and to fulfill the obligations of engineers in the twenty-first century within the context of CDIO engineering practices. To guarantee the necessary cooperation of the relevant parties, we have taken advantage of the top-down support from the STU administration. Three themes corresponding to contemporary issues in China were chosen as the course content: engineers' social obligations, intellectual property and engineering safety criteria. Some popular pedagogies are used for ethics instruction such as case studies and group discussions through role-playing. To impart the diverse expertise of the practical professional practice, team teaching is adopted by interdisciplinary instructors with strong qualifications and industrial backgrounds. Although the assessment of the effectiveness of the course in enhancing students' sense of ethics is limited to assignment reports and class discussions, our endeavor is seen as positive and will continue to sustain the CDIO reform initiatives of STU.

Feldhaus, Charles R., and Patricia L. Fox. 2004. "Effectiveness of an Ethics Course Delivered in Traditional and Non-Traditional Formats."  Science & Engineering Ethics 10 (2):389-400. This paper details a three-credit-hour undergraduate ethics course that was delivered using traditional, distance, and compressed formats. OLS 263: Ethical Decisions in Leadership is a 200-level course offered by the Department of Organizational Leadership and Supervision in the Purdue School of Engineering and Technology at Indiana University Purdue University Indianapolis (IUPUI). Students in engineering, technology, business, nursing, and other majors take the course. In an effort to determine student perceptions of course and instructor effectiveness, end-of-course student survey data were compared using data from traditional, distance, and compressed sections of the course. In addition, learning outcomes from the final course project were evaluated using a standardized assessment rubric and scores on the course project.

Finelli, Cynthia J., Matthew A. Holsapple, Eunjong Ra, Rob M. Bielby, Brian A. Burt, Donald D. Carpenter, Trevor S. Harding, and Janel A. Sutkus. 2012. "An Assessment of Engineering Students' Curricular and Co-Curricular Experiences and Their Ethical Development."  Journal of Engineering Education 101 (3):469-494. We apply a conceptual framework to the study of engineering students' ethical development. This framework suggests that both formal curricular experiences and co-curricular experiences are related to students' ethical development. Using survey data collected from nearly 4,000 engineering undergraduates at 18 institutions across the U.S., we present descriptive statistics related to students' formal curricular experiences and their co-curricular experiences. Additionally, we present data for three constructs of ethical development (knowledge of ethics, ethical reasoning, and ethical behavior). Our data highlight opportunities for improving the engineering undergraduate/bachelor's level curricula in order to have a greater impact on students' ethical development. We suggest that institutions integrate ethics instruction throughout the formal curriculum, support use of varied approaches that foster high-quality experiences, and leverage both influences of co-curricular experiences and students' desires to engage in positive ethical behaviors.

Goldin, Ilya M., Rosa Lynn Pinkus, and Kevin Ashley. 2015. "Validity and Reliability of an Instrument for Assessing Case Analyses in Bioengineering Ethics Education."  Science and Engineering Ethics 21 (3):789-807. doi: 10.1007/s11948-015-9644-2. Assessment in ethics education faces a challenge. From the perspectives of teachers, students, and third-party evaluators like the Accreditation Board for Engineering and Technology and the National Institutes of Health, assessment of student performance is essential. Because of the complexity of ethical case analysis, however, it is difficult to formulate assessment criteria, and to recognize when students fulfill them. Improvement in students' moral reasoning skills can serve as the focus of assessment. In previous work, Rosa Lynn Pinkus and Claire Gloeckner developed a novel instrument for assessing moral reasoning skills in bioengineering ethics. In this paper, we compare that approach to existing assessment techniques, and evaluate its validity and reliability. We find that it is sensitive to knowledge gain and that independent coders agree on how to apply it.

Hashemian, Golnaz, and Michael C. Loui. 2010. "Can Instruction in Engineering Ethics Change Students’ Feelings about Professional Responsibility?"  Science & Engineering Ethics 16 (1):201-215. doi: 10.1007/s11948-010-9195-5. How can a course on engineering ethics affect an undergraduate student’s feelings of responsibility about moral problems? In this study, three groups of students were interviewed: six students who had completed a specific course on engineering ethics, six who had registered for the course but had not yet started it, and six who had not taken or registered for the course. Students were asked what they would do as the central character, an engineer, in each of two short cases that posed moral problems. For each case, the role of the engineer was successively changed and the student was asked how each change altered his or her decisions about the case. Students who had completed the ethics course considered more options before making a decision, and they responded consistently despite changes in the cases. For both cases, even when they were not directly involved, they were more likely to feel responsible and take corrective action. Students who were less successful in the ethics course gave answers similar to students who had not taken the course. This latter group of students seemed to have weaker feelings of responsibility: they would say that a problem was “not my business.” It appears that instruction in ethics can increase awareness of responsibility, knowledge about how to handle a difficult situation, and confidence in taking action.

Heitman, Elizabeth, Cara H. Olsen, Lida Anestidou, and Ruth Ellen Bulger. 2007. "New graduate students’ baseline knowledge of the responsible conduct of research." Academic Medicine 82(9): 838-845. doi: 10.1097/ACM.0b013e31812f7956 To assess (1) new biomedical science graduate students' baseline knowledge of core concepts and standards in responsible conduct of research (RCR), (2) differences in graduate students' baseline knowledge overall and across the Office of Research Integrity's nine core areas, and (3) demographic and educational factors in these differences, the authors developed a 30-question multiple-choice test and asked new graduate students to take the test. They found that the students had inadequate and inconsistent knowledge of RCR, regardless of what type of former training they had gone through.

Kalichman, Michael W., Matthew A. Allison, and Sean T. Powell. 2007. “Effectiveness of a Responsible Conduct of Research Course: A Preliminary Study.” Science and Engineering Ethics 13(2): 246-264. Training in the responsible conduct of research (RCR) is required for many research trainees nationwide, but little is known about its effectiveness. For a preliminary assessment of the effectiveness of a short-term course in RCR, medical students participating in an NIH-funded summer research program at the University of California, San Diego (UCSD) were surveyed using an instrument developed through focus group discussions. In the summer of 2003, surveys were administered before and after a short-term RCR course, as well as to alumni of the courses given in the summers of 2002 and 2001. Survey responses were analyzed in the areas of knowledge, ethical decision-making skills, attitudes about responsible conduct of research, and frequency of discussions about RCR outside of class. The only statistically significant improvement associated with the course was an increase in knowledge, while there was a non-significant tendency toward improvements in ethical decision-making skills and attitudes about the importance of RCR training. The nominal impact of a short-term training course should not be surprising, but it does raise the possibility that other options for delivering information only, such as an Internet-based tutorial, might be considered as comparable alternatives when longer courses are not possible.

Keefer, Matthew W., and Michael Davis. 2012. "Curricular Design and Assessment in Professional Ethics Education: Some Practical Advice."  Teaching Ethics: The Journal of the Society for Ethics across the Curriculum 13 (1):81-90. Written by a philosopher and an educational psychologist, this article offers some practical advice and examples on designing assignments for a professional ethics course and assessing students’ work.

Keefer, Matthew, Sara Wilson, Harry Dankowicz, and Michael Loui. 2014. "The Importance of Formative Assessment in Science and Engineering Ethics Education: Some Evidence and Practical Advice."  Science & Engineering Ethics 20 (1):249-260. doi: 10.1007/s11948-013-9428-5. Recent research in ethics education shows a potentially problematic variation in content, curricular materials, and instruction. While ethics instruction is now widespread, studies have identified significant variation in both the goals and methods of ethics education, leaving researchers to conclude that many approaches may be inappropriately paired with goals that are unachievable. This paper speaks to these concerns by demonstrating the importance of aligning classroom-based assessments to clear ethical learning objectives in order to help students and instructors track their progress toward meeting those objectives. Two studies at two different universities demonstrate the usefulness of classroom-based, formative assessments for improving the quality of students' case responses in computational modeling and research ethics.

Kirkman, Robert. 2008. “Teaching for Moral Imagination: Assessment of a Course in Environmental Ethics.” Teaching Philosophy 31(4): 333-350. This paper reports the results of an assessment project conducted in a semester-length course in environmental ethics. The first goal of the project was to measure the degree to which the course succeeded in meeting its overarching goal of enriching students' moral imagination and its more particular objectives relating to ethics in the built environment. The second goal of the project was to contribute toward a broader effort to develop assessment tools for ethics education. Through qualitative analysis of an exit survey and of a pair of writing assignments, the study yielded some promising results, outlined here, and suggested particular ways of improving both the course and the assessment procedure.

Moore, Christy, Hilary Hart, D’Arcy Randall, and Steven P. Nichols. 2006. "PRiME: Integrating Professional Responsibility into the Engineering Curriculum."  Science & Engineering Ethics 12 (2):273-289. doi: 10.1007/s11948-006-0027-6. Engineering educators have long discussed the need to teach professional responsibility and the social context of engineering without adding to overcrowded curricula. The PRiME (Professional Responsibility Modules for Engineering) Project (http://www.engr.utexas.edu/ethics/primeModules.cfm) described in this paper was initiated at the University of Texas at Austin to provide web-based modules that could be integrated into any undergraduate engineering class. Using HPL (How People Learn) theory, PRiME developed and piloted four modules during the academic year 2004-2005. This article introduces the modules and the pilot, outlines the assessment process, analyzes the results, and describes how the modules are being revised in light of the initial assessment. In its first year of development and testing, PRiME made significant progress towards meeting its objectives.

Mumford, Michael D., et al. 2008. "A Sensemaking Approach to Ethics Training for Scientists: Preliminary Evidence of Training Effectiveness."  Ethics & Behavior 18 (4):315-339. doi: 10.1080/10508420802487815. In recent years, we have seen a new concern with ethics training for research and development professionals. Although ethics training has become more common, the effectiveness of the training being provided is open to question. In the present effort, a new ethics training course was developed that stresses the importance of the strategies people apply to make sense of ethical problems. The effectiveness of this training was assessed in a sample of 59 doctoral students working in the biological and social sciences using a pre–post design with follow-up and a series of ethical decision-making measures serving as the outcome variable. Results showed not only that this training led to sizable gains in ethical decision making but also that these gains were maintained over time. The implications of these findings for ethics training in the sciences are discussed.

Mumford, Michael D., Logan Steele, and Logan L. Watts. 2015. "Evaluating Ethics Education Programs: A Multilevel Approach."  Ethics & Behavior 25 (1):37-60. doi: 10.1080/10508422.2014.917417. Although education in the responsible conduct of research is considered necessary, evidence bearing on the effectiveness of these programs in improving research ethics has indicated that, although some programs are successful, many fail. Accordingly, there is a need for systematic evaluation of ethics education programs. In the present effort, the authors examine procedures for evaluation of ethics education programs from a multilevel perspective: examining both within-program evaluation and cross-program evaluation. With regard to within-program evaluation, we note that requisite designs and measures for conducting systematic program evaluation have been developed and that multiple measures should be applied in program evaluation. With regard to cross-program evaluation, we argue that a meta-analytic framework should be employed where analyses are used to identify best practices in ethics education. The implications of this multilevel approach for improving responsible conduct of research educational programs are discussed.

Pimple, Kenneth. 2001. "Assessing Student Learning in the Responsible Conduct of Research."  Poynter Center for the Study of Ethics in American Institutions, Indiana University. Discusses the challenges of assessing ethics instruction, offers advice on forming realistic expectations and goals, and suggests some possible ways to assess students in courses and workshops.

Pinkus, Rosa, Claire Gloeckner, and Angela Fortunato. 2015. "The Role of Professional Knowledge in Case-Based Reasoning in Practical Ethics."  Science & Engineering Ethics 21 (3):767-787. doi: 10.1007/s11948-015-9645-1. While there is a general consensus that case studies play a central role in the teaching of professional ethics, there is still much to be learned regarding how professionals learn ethics using case-based reasoning. This paper reports the results of a study designed to investigate one of the issues in teaching case-based ethics: the role of one's professional knowledge in learning methods of moral reasoning. Using a novel assessment instrument, we compared case studies written and analyzed by three groups of students whom we classified as: (1) experts in a research domain in bioengineering; (2) novices in a research domain in bioengineering; and (3) a non-research group of students using an engineering domain in which they were interested but had no in-depth knowledge. This study demonstrates that a student's level of understanding of a professional knowledge domain plays a significant role in learning moral reasoning skills.

Plemmons, Dena K., Suzanne A. Brody, and Michael W. Kalichman. 2006. "Student Perceptions of the Effectiveness of Education in the Responsible Conduct of Research."  Science and Engineering Ethics 12 (3):574-582. Responsible conduct of research (RCR) courses are widely taught, but little is known about the purposes or effectiveness of such courses. As one way to assess the purposes of these courses, students were surveyed about their perspectives after recent completion of one of eleven different research ethics courses at ten different institutions. Participants (undergraduate and graduate students, post-doctoral fellows and faculty, staff and researchers) enrolled in RCR courses in spring and fall of 2003 received a voluntary, anonymous survey from their instructors at the completion of the course. Responses were received from 268 participants. Seventy-seven percent of open-ended responses listed specific kinds of information learned; only a few respondents talked about changes in skills or attitudes. The two principal findings of this multi-institutional study are that respondents reported: (1) a wide variety of positive outcomes for research ethics courses, but that (2) the impact on knowledge was greater than that for changes in skills or attitudes.

Rudnicka, Ewa A. 2005. "Ethics in an Operations Management Course."  Science and Engineering Ethics 11 (4):645-654. Includes a model grading rubric for evaluating students' understanding of ethics case studies.

Schonfeld, Toby, Erin L. Dahlke, and John M. Longo. 2011. "Pre-test/Post-test Results from an Online Ethics Course: Qualitative Assessment of Student Learning."  Teaching Philosophy 34 (3):273-290. This paper describes a project that attempted to assess whether or not an online course was an effective way to teach applied ethics to students preparing for the health professions by qualitatively analyzing responses to a pretest and post-test administered to students in the course. While previous studies have reported various findings regarding the success of online ethics courses, the authors of this study failed to demonstrate that students gained a greater understanding of key concepts in ethics -- respect for autonomy, decisional capacity, informed consent, and role of the provider. The  findings demonstrate the need for better subjective methods of evaluation and raise questions regarding the efficacy of current models of online ethics courses for health professional students.

Schonfeld, Toby, Hugh Stoddard, and Cory Andrew Labrecque. 2014. "Examining Ethics: Developing a Comprehensive Exam for a Bioethics Master's Program."  Cambridge Quarterly of Healthcare Ethics 23 (4):461-471. doi: 10.1017/s0963180114000139. In this article, the authors describe the rationale, development process, and features of the comprehensive exam they created as a culminating experience of a master's program in bioethics. The exam became the students' opportunity to demonstrate the way they were able to integrate course, textual, and practical knowledge gained throughout the experience of the program. Additionally, the exam assessed students' proficiency in the field of bioethics and their ability to critically and constructively analyze bioethical issues. The authors offer tips to other exam creators regarding their experiences with question and answer development, scoring of the exam, and relationships between coursework and exam preparation and completion.

Seiler, Stephanie N., Bradley J. Brummel, Kerri L. Anderson, Kyoung Jin Kim, Serena Wee, C. K. Gunsalus, and Michael C. Loui. 2011. "Outcomes Assessment of Role-Play Scenarios for Teaching Responsible Conduct of Research."  Accountability in Research: Policies & Quality Assurance 18 (4):217-246. doi: 10.1080/08989621.2011.584760. The authors describe the summative assessment of role-play scenarios used to teach topics in the responsible conduct of research (RCR) to graduate students in science and engineering. Interviews with role-play participants, with participants in a case discussion training session, and with untrained students suggested that role-playing might promote a deeper appreciation of RCR. The authors also present the results of a think-aloud case analysis study and describe the development of a behaviorally-anchored rating scale (BARS) to assess participants' case analysis performance.

Sim, Kang, Min Yi Sum, and Deborah Navedo. 2015. "Use of Narratives to Enhance Learning of Research Ethics in Residents and Researchers."  BMC Medical Education 15:41. doi: 10.1186/s12909-015-0329-y. This article discusses the assessment methods and results of incorporating narratives into the learning environment of a research ethics course. The narratives were chosen from the history of research ethics and from humanities literature related to human subject research, and learners were asked to provide post-session feedback through an anonymised questionnaire on their learning session. An outcomes logic model was used for assessment, with a focus on immediate outcomes such as engagement, motivation, understanding, and reflective learning. The study found that the majority of learners felt engaged, more motivated to learn, and better equipped to handle the subject matter. Appreciation of the learning topic, engagement, motivation to learn, and feeling equipped were strongly correlated with the promotion of reflective learning, effectiveness of teaching, promotion of critical thinking, and overall positive rating of the teaching session on research ethics.

Wilson, William. 2013. "Using the Chernobyl Incident to Teach Engineering Ethics."  Science & Engineering Ethics 19 (2):625-640. doi: 10.1007/s11948-011-9337-4. This paper discusses using the Chernobyl Incident as a case study in engineering ethics instruction. Groups of students are asked to take on the role of a faction involved in the Chernobyl disaster and to defend their decisions in a mock debate. The results of student surveys and the Engineering and Science Issues Test indicate that the approach is very popular with students and has a positive impact on moral reasoning. The approach incorporates technical, communication and teamwork skills and has many of the features suggested by recent literature.

Zhu, Qin, Carla B. Zoltowski, Megan K. Feister, Patrice M. Buzzanell, William C. Oakes, and Alan D. Mead. 2014. "The Development of an Instrument for Assessing Individual Ethical Decisionmaking in Project-based Design Teams: Integrating Quantitative and Qualitative Methods."  Paper presented at the 2014 ASEE Annual Conference, Indianapolis, IN. This paper introduces the development of an instrument for assessing individual ethical decision making in a project-based design context.

Evaluation and Assessment of Institution-wide Programs and Educational Approaches

Ajuwon, A. J., and N. Kass. 2008. "Outcome of a research ethics training workshop among clinicians and scientists in a Nigerian university."  BMC Med Ethics 9:1. doi: 10.1186/1472-6939-9-1. In Nigeria, as in other developing countries, access to training in research ethics is limited, due to weak social, economic, and health infrastructure. The project described in this article was designed to develop the capacity of academic staff of the College of Medicine, University of Ibadan, Nigeria to conduct ethically acceptable research involving human participants. Three in-depth interviews and one focus group discussion were conducted to assess the training needs of participants. A research ethics training workshop was then conducted with College of Medicine faculty. A 23-item questionnaire that assessed knowledge of research ethics, application of principles of ethics, operations of the Institutional Review Board (IRB) and ethics reasoning was developed to be a pre-post test evaluation of the training workshop. Ninety-seven workshop participants completed the questionnaire before and after the workshop; 59 of them completed a second post-test questionnaire one month after the workshop. The training improved participants' knowledge of principles of research ethics, international guidelines and regulations and operations of IRBs. It thus provided an opportunity for research ethics capacity development among academic staff in a developing country institution.

Bebeau, Muriel J. 2002. "The Defining Issues Test and the Four Component Model: Contributions to Professional Education."  Journal of Moral Education 31 (3):271-295. Describes the development of a standardized test that can be used to measure the growth of moral reasoning skills in students over time.

Berry, Roberta M., Jason Borenstein, and Robert J. Butera. 2013. "Contentious Problems in Bioscience and Biotechnology: A Pilot Study of an Approach to Ethics Education."  Science and Engineering Ethics 19 (2):653-668. doi: 10.1007/s11948-012-9359-6. This manuscript describes a pilot study in ethics education employing a problem-based learning approach to the study of novel, complex, ethically fraught, unavoidably public, and unavoidably divisive policy problems, called "fractious problems," in bioscience and biotechnology. Diverse graduate and professional students from four US institutions and disciplines spanning science, engineering, humanities, social science, law, and medicine analyzed fractious problems employing "navigational skills" tailored to the distinctive features of these problems. The students presented their results to policymakers, stakeholders, experts, and members of the public. This approach may provide a model for educating future bioscientists and bioengineers so that they can meaningfully contribute to the social understanding and resolution of challenging policy problems generated by their work.

Borenstein, Jason, Matthew J. Drake, Robert Kirkman, and Julie L. Swann. 2010. "The Engineering and Science Issues Test (ESIT): A Discipline-Specific Approach to Assessing Moral Judgment."  Science & Engineering Ethics 16 (2):387-407. doi: 10.1007/s11948-009-9148-z. Describes a tool called the Engineering and Science Issues Test (ESIT). ESIT measures moral judgment in a manner similar to the Defining Issues Test, second edition, but is built around technical dilemmas in science and engineering. The authors used a quasi-experimental approach with pre- and post-tests, and compared the results to those of a control group with no overt ethics instruction. Their findings are that several (but not all) stand-alone classes showed a significant improvement compared to the control group when the metric includes multiple stages of moral development.

Brock, Meagan E., Andrew Vert, Vykinta Kligyte, Ethan P. Waples, Sydney T. Sevier, and Michael D. Mumford. 2008. "Mental Models: An Alternative Evaluation of a Sensemaking Approach to Ethics Instruction."  Science & Engineering Ethics 14 (3):449-472. doi: 10.1007/s11948-008-9076-3. In spite of the wide variety of approaches to ethics training, it is still debatable which approach has the highest potential to enhance professionals’ integrity. The current effort assesses a novel curriculum that focuses on metacognitive reasoning strategies researchers use when making sense of day-to-day professional practices that have ethical implications. The training's effectiveness was assessed by examining five key sensemaking processes, such as framing, emotion regulation, forecasting, self-reflection, and information integration, that experts and novices apply in ethical decision-making. Mental models of trained and untrained graduate students, as well as faculty, working in the field of physical sciences were compared using a think-aloud protocol 6 months following the ethics training. Evaluation and comparison of the mental models of participants provided further validation evidence for sensemaking training. Specifically, it was found that trained students applied metacognitive reasoning strategies learned during training in their ethical decision-making that resulted in complex mental models focused on the objective assessment of the situation. Mental models of faculty and untrained students were externally-driven with a heavy focus on autobiographical processes. The study shows that sensemaking training has a potential to induce shifts in researchers’ mental models by making them more cognitively complex via the use of metacognitive reasoning strategies. Furthermore, field experts may benefit from sensemaking training to improve their ethical decision-making framework in highly complex, novel, and ambiguous situations.

Carrese, Joseph A., Janet Malek, Katie Watson, Lisa Soleymani Lehmann, Michael J. Green, Laurence B. McCullough, Gail Geller, Clarence H. Braddock, III, and David J. Doukas. 2015. "The Essential Role of Medical Ethics Education in Achieving Professionalism: The Romanell Report."  Academic Medicine 90 (6):744-752. doi: 10.1097/acm.0000000000000715. This article, the Romanell Report, offers an analysis of the current state of medical ethics education in the United States, focusing in particular on its essential role in cultivating professionalism among medical learners. Education in ethics has become an integral part of medical education and training over the past three decades and has received particular attention in recent years because of the increasing emphasis placed on professional formation by accrediting bodies such as the Liaison Committee on Medical Education and the Accreditation Council for Graduate Medical Education. Yet, despite the development of standards, milestones, and competencies related to professionalism, there is no consensus about the specific goals of medical ethics education, the essential knowledge and skills expected of learners, the best pedagogical methods and processes for implementation, and optimal strategies for assessment. Moreover, the quality, extent, and focus of medical ethics instruction vary, particularly at the graduate medical education level. Although variation in methods of instruction and assessment may be appropriate, ultimately medical ethics education must address the overarching articulated expectations of the major accrediting organizations. With the aim of aiding medical ethics educators in meeting these expectations, the Romanell Report describes current practices in ethics education and offers guidance in several areas: educational goals and objectives, teaching methods, assessment strategies, and other challenges and opportunities (including course structure and faculty development). The report concludes by proposing an agenda for future research.

Carpenter, D., Harding, T., Sutkus, J., and Finelli, C. 2014. "Assessing the Ethical Development of Civil Engineering Undergraduates in Support of the ASCE Body of Knowledge."  Journal of Professional Issues in Engineering Education and Practice 140 (4):A4014001. doi: 10.1061/(ASCE)EI.1943-5541.0000177. Developing engineers must be aware that technological development and emerging global issues will require a keen sense of ethical responsibility. Therefore, they must be prepared to reason through and act appropriately on the ethical dilemmas they will experience as professionals. From a civil engineering professional perspective, graduates need to conform to the ASCE Body of Knowledge as they prepare for the Vision of 2025. This investigation evaluated different institutional approaches for ethics education with a goal of better preparing students to be ethical professionals. The project included visiting 19 diverse partner institutions and collecting data from nearly 150 faculty and administrators and more than 4,000 engineering undergraduates, including 567 civil engineering undergraduates who completed the survey. Findings suggest that co-curricular experiences have an important influence on ethical development, that quality of instruction is more important than quantity of curricular experiences, that students are less likely to be satisfied with ethics instruction when they have higher ethical reasoning skills, and that the institutional culture affects how students behave and how they articulate concepts of ethics. Overall, regression analysis indicates that civil engineering student responses were consistent with the overall engineering undergraduate population. Finally, the research suggests the curricular foundation is in place, but that institutions need to improve their curricular and co-curricular offerings to facilitate ethical development of students and fulfill ASCE Body of Knowledge outcomes.

Culver, Steven, Ishwar Puri, Richard Wokutch, and Vinod Lohani. 2013. "Comparison of Engagement with Ethics Between an Engineering and a Business Program."  Science & Engineering Ethics 19 (2):585-597. doi: 10.1007/s11948-011-9346-3. Increasing university students' engagement with ethics is becoming a prominent call to action for higher education institutions, particularly professional schools like business and engineering. This paper provides an examination of student attitudes regarding ethics and their perceptions of ethics coverage in the curriculum at one institution. A particular focus is the comparison between results in the business college, which has incorporated ethics in the curriculum and has been involved in ethics education for a longer period, with the engineering college, which is in the nascent stages of developing ethics education in its courses. Results show that student attitudes and perceptions are related to the curriculum. In addition, results indicate that it might be useful for engineering faculty to use business faculty as resources in the development of their ethics curricula.

Curzer, Howard J., Sabrina Sattler, and Devin G. DuPree. 2014. "Do Ethics Classes Teach Ethics?"  Theory and Research in Education 12 (3):366-382. The ethics assessment industry is currently dominated by the second version of the Defining Issues Test (DIT2). In this article, we describe an alternative assessment instrument called the Sphere-Specific Moral Reasoning and Theory Survey (SMARTS), which measures the respondent's level of moral development in several respects. We describe eight difficulties that an instrument must overcome in order to assess ethics classes successfully. We argue that the DIT2 fails to solve these problems, and that the SMARTS succeeds. The SMARTS was administered as pretest and post-test during several semesters to ethics and nonethics classes. Ethics students improved significantly more than nonethics students in both moral theory choice and moral reasoning. Thus, ethics classes do indeed teach ethics.

Davis, Michael, Elisabeth Hildt, and Kelly Laas. 2016. "Twenty-Five Years of Ethics Across the Curriculum."  Teaching Ethics 16 (1). doi: 10.5840/tej201633028. After twenty-five years of integrating ethics across the curriculum at the Illinois Institute of Technology (IIT), the Center for the Study of Ethics in the Professions conducted a survey of full-time faculty to investigate: a) what ethical topics faculty thought students from their discipline should be aware of when they graduate, b) how widely ethics is currently being taught at the undergraduate and graduate level, c) what ethical topics are being covered in these courses, and d) what teaching methods are being used. The survey found that while progress spreading ethics across the curriculum has been substantial, it remains incomplete. The faculty think more should be done. From these findings the authors draw six lessons for ethics centers engaged in encouraging ethics across the curriculum.

Feldhaus, Charles R., Robert M. Wolter, Stephen P. Hundley, and Tim Diemer. 2006. "A Single Instrument: Engineering and Engineering Technology Students Demonstrating Competence in Ethics and Professional Standards."  Science & Engineering Ethics 12 (2):291-311. This paper details efforts by the Purdue School of Engineering and Technology at Indiana University Purdue University Indianapolis (IUPUI) to create a single instrument for honors science, technology, engineering and mathematics (STEM) students wishing to demonstrate competence in the IUPUI Principles of Undergraduate Learning (PULs) and Accreditation Board for Engineering and Technology (ABET) Engineering Accreditation Criterion (EAC) and Technology Accreditation Criterion (TAC) 2, a through k. Honors courses in Human Behavior, Ethical Decision-Making, Applied Leadership, International Issues and Leadership Theories and Processes were created along with a specific menu of activities and an assessment rubric based on PULs and ABET criteria to evaluate student performance in the aforementioned courses. Students who complete the series of 18 Honors Credit hours are eligible for an Honors Certificate in Leadership Studies from the Department of Organizational Leadership and Supervision. Finally, the paper analyzes and discusses how university assessment criteria, in this case the IUPUI Principles of Undergraduate Learning, can be linked to ABET outcomes to demonstrate student competence in both, using the aforementioned courses, menu of activities, and assessment rubrics.

Fowers, Blaine J. 2014. "Toward Programmatic Research on Virtue Assessment: Challenges and Prospects."  Theory and Research in Education 12 (3):309-328. Poor construct definition has characterized research on virtue, beginning with Hartshorne and May's honesty studies and continuing to the present. Recently, scholars have begun to define virtues in ways that improve the prospects for measuring virtue constructs, but a coordinated, programmatic approach is necessary for success in virtue measurement. A brief overview of the construct of virtue includes six key elements that can structure virtue assessment design. Recent research on the trait/situation problem suggests that situational factors do not obviate traits. Veridicality issues such as social desirability and positive illusions are significant challenges for self-report virtue measurement. In summary self-report measures, these challenges can be met with a number of methods, including directly assessing social desirability and item construction to remove social desirability. These challenges can also be met using other reports, experience sampling, or experimental procedures. A brief discussion of construct validity in virtue measurement leads to the conclusion that many studies with a variety of methods are necessary to establish valid measures of virtue.

Funk, Carolyn L., Kirsten A. Barrett, and Francis L. Macrina. 2007. "Authorship and Publication Practices: Evaluation of the Effect of Responsible Conduct of Research Instruction to Postdoctoral Trainees."  Accountability in Research: Policies & Quality Assurance 14 (4):269-305. doi: 10.1080/08989620701670187. We have studied postdoctoral trainees funded by NIH F32 fellowship awards in order to test the effectiveness of responsible conduct of research (RCR) education in the areas of authorship and publication practices. We used a 3-wave telephone and on-line survey design, conducted over a period of two years, in order to test for individual change before and after completing RCR education. Overall the responses of the subjects suggested a clear awareness of standards and practices in publication. However, our results failed to suggest that RCR education in this group significantly increased the level of ethically appropriate behavioral responses measured in the study. Similarly we saw no significant effect on increasing awareness of or attention to ethical guidelines about authorship and publication practices. Our interpretation of these null findings was influenced by the significant publication experience of our cohort of subjects. We forward possible explanations for these null findings in this context. Most importantly, we do not suggest that our results argue against continued instruction in RCR education. Instead, we believe our data reinforce the importance of careful articulation of course goals and objectives with attention to the background and experience of the student audience when developing RCR curricula.

Helton-Fauth, Whitney, Blaine Gaddis, Ginamarie Scott, Michael Mumford, Lynn Devenport, Shane Connelly, and Ryan Brown. 2003. "A New Approach to Assessing Ethical Conduct in Scientific Work."  Accountability in Research: Policies & Quality Assurance 10 (4):205-228. doi: 10.1080/08989620390263708. The intent of the current article is to describe the development of a new approach to the study of ethical conduct in scientific research settings. The approach presented in this article has two main components. The first component entails the development of a taxonomy of ethical events as they occur across a broad range of scientific disciplines. The second involves the identification of proximate criteria that will allow systematic and objective evaluation of ethical behaviors through low-fidelity performance simulations. Two proposed measures based on the new approach are intended to identify and measure variations in the scientific environment that might predispose certain individuals to make unethical decisions.

Kligyte, Vykinta, Richard T. Marcy, Ethan P. Waples, Sydney T. Sevier, Elaine S. Godfrey, Michael D. Mumford, and Dean F. Hougen. 2008. "Application of a Sensemaking Approach to Ethics Training in the Physical Sciences and Engineering."  Science & Engineering Ethics 14 (2):251-278. doi: 10.1007/s11948-007-9048-z. One ethics education approach that shows some promise in improving researchers’ integrity has focused on the development of ethical decision-making skills. The current effort proposes a novel curriculum that focuses on broad metacognitive reasoning strategies researchers use when making sense of day-to-day social and professional practices that have ethical implications for the physical sciences and engineering. This sensemaking training has been implemented in a professional sample of scientists conducting research in electrical engineering, atmospheric and computer sciences at a large multi-cultural, multi-disciplinary, and multi-university research center. A pre-post design was used to assess training effectiveness using scenario-based ethical decision-making measures. The training resulted in enhanced ethical decision-making of researchers in relation to four ethical conduct areas, namely data management, study conduct, professional practices, and business practices. Broad implications of the findings for ethics training development, implementation, and evaluation in the sciences are also discussed.

Martin, April, Zhanna Bagdasarov, and Shane Connelly. 2015. "The Capacity for Ethical Decisions: The Relationship Between Working Memory and Ethical Decision Making."  Science & Engineering Ethics 21 (2):271-292. doi: 10.1007/s11948-014-9544-x. Although various models of ethical decision making (EDM) have implicitly called upon constructs governed by working memory capacity (WMC), a study examining this relationship specifically has not been conducted. Using a sense making framework of EDM, we examined the relationship between WMC and various sensemaking processes contributing to EDM. Participants completed an online assessment comprised of a demographic survey, intelligence test, various EDM measures, and the Automated Operation Span task to determine WMC. Results indicated that WMC accounted for unique variance above and beyond ethics education, exposure to ethical issues, and intelligence in several sensemaking processes.

Mecca, J. T., K. E. Medeiros, V. Giorgini, C. Gibson, M. D. Mumford, S. Connelly, and L. D. Devenport. 2014. "The Influence of Compensatory Strategies on Ethical Decision Making."  Ethics and Behavior 24 (1):73-89. doi: 10.1080/10508422.2013.821389. Ethical decision making is of concern to researchers across all fields. However, researchers typically focus on the biases that may act to undermine ethical decision making. Taking a new approach, this study focused on identifying the most common compensatory strategies that counteract those biases. These strategies were identified using a series of interviews with university researchers in a variety of areas, including biological, physical, social, and health as well as scholarship and the performing arts. Interview transcripts were assessed with two scoring procedures, an expert rating system and computer-assisted qualitative analysis. Although the expert rating system identified Understanding Guidelines, Recognition of Insufficient Information, and Recognizing Boundaries as the most frequently used compensatory strategies across fields, other strategies, Striving for Transparency, Value/Norm Assessment, and Following Appropriate Role Models, were identified as most common by the computer-assisted qualitative analyses. Potential reasons for these findings and implications for ethics training and practice are identified and discussed.

Monzon, J. E., O. L. Ariasgago, and A. Monzon-Wyngaard. 2010. "Assessment of moral judgment of BME and other health sciences students."  Conf Proc IEEE Eng Med Biol Soc 2010:2963-6. doi: 10.1109/IEMBS.2010.5626266. The accreditation criteria for engineering programs require that the curriculum introduce students to the ethical, social, economic, and safety issues arising from the practice of engineering. This paper presents the assessment of moral judgment of biomedical engineering, dentistry and biochemistry students through the standardized Defining Issues Test (DIT). Results show that college students, as most active members of society, remain at a stage of moral development where morality is still predominantly dictated by outside forces. It is expected that after formal Ethics studies, students will score higher in the last stages of moral development, where laws are regarded as social contracts and moral reasoning is based on universal ethical principles.

Mumford, Michael D., Steele, L., & Watts, L. L. 2015. “Evaluating Ethics Education Programs: A Multilevel Approach.” Ethics & Behavior, 25 (1), 37-60. doi:10.1080/10508422.2014.917417. Although education in the responsible conduct of research is considered necessary, evidence bearing on the effectiveness of these programs in improving research ethics has indicated that, although some programs are successful, many fail. Accordingly, there is a need for systematic evaluation of ethics education programs. In the present effort, the authors examine procedures for evaluation of ethics education programs from a multilevel perspective: examining both within-program evaluation and cross-program evaluation. With regard to within-program evaluation, we note that requisite designs and measures for conducting systematic program evaluation have been developed and that multiple measures should be applied in program evaluation. With regard to cross-program evaluation, we argue that a meta-analytic framework should be employed where analyses are used to identify best practices in ethics education. The implications of this multilevel approach for improving responsible conduct of research educational programs are discussed.

Olson, Lynne E. 2014. "Articulating a Role for Program Evaluation in Responsible Conduct of Research Programs."  Accountability in Research: Policies & Quality Assurance 21 (1):26-33. doi: 10.1080/08989621.2013.822265. Since “Integrity in Scientific Research: Creating an Environment That Promotes Responsible Conduct” was released in 2001, there has been increased interest in evaluating programs designed to foster the responsible conduct of research (RCR). The field of program evaluation is designed to determine the worth or value of programs and can serve as a resource for institutions interested in evaluating their RCR programs. This article provides a very brief overview of program evaluation, demonstrates how it can be applied to RCR, and provides key reference information. Evaluating RCR programs can promote institutional accountability for the resources that are used in supporting those programs.

Olson, Lynne E. 2010. “Developing a Framework for Assessing Responsible Conduct of Research Education Programs.” Science and Engineering Ethics 16 (1): 185-200. doi: 10.1007/s11948-010-9196-4. This article discusses the process of developing a program evaluation module that could be used to document and assess educational programs focused on teaching responsible conduct of research. A programmed series of questions for each of the nine RCR content areas identified by the United States Office of Research Integrity was created based on a performance-monitoring evaluation model. The questions focus on educational goals, resources provided to support the educational efforts, educational content, content delivery, educational outcomes, compliance requirements and feedback. Answers collected in response to the questions could be used to both document and continually improve the quality of RCR educational programs through on-going formative assessment and feedback.

Ponton, Richard F. 2015. "Evaluating continuing professional education in ethics."  Psychologist-Manager Journal (American Psychological Association) 18 (1):12-30. doi: 10.1037/mgr0000026. Currently 31 states and the District of Columbia require psychologists to acquire some form of continuing education in ethics throughout their careers. Of the jurisdictions that do have mandated continuing ethics training, there is wide variation in the minimum hours, specificity of content, and acceptable delivery methods. Psychologist-managers both for their own development and to promote the ethical behavior of organizations often evaluate ethics training programs. This review suggests that a framework for the conceptualization of the goals of ethics education and the evaluation of ethics training programs is needed to move beyond the current self-reported satisfaction model of evaluation toward valid outcome measures. Rest's (1986) model of moral decision making is extended to organizational ethics and a conceptual model of evaluation is suggested.

Quesenberry, Le Gene, Jamie Phillips, Paul Woodburne, and Chin Yang. 2012. "Ethics assessment in a general education programme."  Assessment & Evaluation in Higher Education 37 (2):193-213. doi: 10.1080/02602938.2010.515017. This study sought to assess whether flagged ‘values intensive’ courses within a public university's general education curriculum impacted on students' abilities to reason ethically. The major research question to be explored was, ‘what effect does taking a values intensive course have on students' ethical reasoning ability, when factors such as initial matriculation ability and total coursework are taken into account?’ Papers written by a sample of students in Legal Environment of Business (BSAD 240), mainly first‐year students and sophomores, were holistically scored to determine the level of values reasoning exhibited by the students. It was found that students who had completed more values intensive courses scored higher on the samples used for this research. After providing an overview of the university and the State System of Higher Education (of which the subject university is a part), this paper provides an overview of the university's General Education Programme.

Sacco, Donald, Samuel Bruton, Alen Hajnal, and Chris Lustgraaf. 2015. "The Influence of Disclosure and Ethics Education on Perceptions of Financial Conflicts of Interest."  Science & Engineering Ethics 21 (4):875-894. doi: 10.1007/s11948-014-9572-6. This study explored how disclosure of financial conflicts of interest (FCOI) influences naïve or 'lay' individuals' perceptions of the ethicality of researcher conduct. On a between-subjects basis, participants read ten scenarios in which researchers disclosed or failed to disclose relevant financial conflicts of interest. Participants evaluated the extent to which each vignette represented a FCOI, its possible influence on researcher objectivity, and the ethics of the financial relationship. Participants were then asked if they had completed a college-level ethics course. Results indicated that FCOI disclosure significantly influenced participants' perceptions of the ethicality of the situation, but only marginally affected perceptions of researcher objectivity and had no significant influence on perceptions of the existence of FCOIs. Participants who had previously completed a college-level ethics course appeared more sensitive to the importance of FCOI disclosure than those who lacked such background. This result suggests that formal ethical training may help individuals become more critical consumers of scientific research.

Schuurbiers, Daan. 2011. "What happens in the Lab: Applying Midstream Modulation to Enhance Critical Reflection in the Laboratory."  Science & Engineering Ethics 17 (4):769-788. doi: 10.1007/s11948-011-9317-8. In response to widespread policy prescriptions for responsible innovation, social scientists and engineering ethicists, among others, have sought to engage natural scientists and engineers at the 'midstream': building interdisciplinary collaborations to integrate social and ethical considerations with research and development processes. Two 'laboratory engagement studies' have explored how applying the framework of midstream modulation could enhance the reflections of natural scientists on the socio-ethical context of their work. The results of these interdisciplinary collaborations confirm the utility of midstream modulation in encouraging both first- and second-order reflective learning. The potential for second-order reflective learning, in which underlying value systems become the object of reflection, is particularly significant with respect to addressing social responsibility in research practices. Midstream modulation served to render the socio-ethical context of research visible in the laboratory and helped enable research participants to more critically reflect on this broader context. While lab-based collaborations would benefit from being carried out in concert with activities at institutional and policy levels, midstream modulation could prove a valuable asset in the toolbox of interdisciplinary methods aimed at responsible innovation.

Sindelar, Mark, Larry Shuman, Mary Besterfield-Sacre, Ronald Miller, Carl Mitcham, Barbara Olds, Rosa Pinkus, and Harvey Wolfe. 2003. “Assessing Engineering Students’ Abilities to Resolve Ethical Dilemmas.” 33rd Annual Frontiers in Education Conference, November 5-8, 2003, Boulder, Colorado. S2A-25-30. doi: 10.1109/FIE.2003.1265937. ABET's accreditation criteria provide additional impetus for preparing engineering graduates to act in an ethically responsible manner. However, methods to assess the effectiveness of educational efforts to do this remain primitive at best. We describe the first phase of a joint study at the University of Pittsburgh and the Colorado School of Mines to develop a measurement tool for assessing students' abilities to recognize and resolve ethical dilemmas. Pre- and post-tests at the beginning and end of a semester-long course focusing on engineering ethics are used to assess students' comprehension, analysis, and resolution of ethical dilemmas. Each test consists of two ethical dilemmas addressed through a response essay that is then holistically scored using a rubric that classifies students' level of achievement. Results are analyzed using statistical methods to determine if any "shifts" have occurred to indicate a significant positive change in the cohort's collective ability. A second phase will involve the development of a web-based assessment instrument similar to CSM's Cogito© that can be easily used by engineering faculty.

Steneck, Nicholas H. 1999. "Designing Teaching and Assessment Tools for an Integrated Engineering Ethics Curriculum." Proceedings of the 29th ASEE/IEEE Frontiers in Education Conference. 12d6-11, 12d6-17.  Describes how the faculty at the College of Engineering at the University implemented an across-the-curriculum approach for teaching engineering ethics, the development of strategic goals that shaped the program’s design, and the development of numerous assessment techniques to measure the effectiveness of the program.  

Thompson, Carla. 2014. “Responsible Conduct of Research Assessment of Doctor of Education Candidates, Graduate Faculty, and Curriculum Considerations.” Innovative Higher Education, 39 (5), 349-360. doi:10.1007/s10755-014-9289-0. The study included an assessment of doctoral students, graduate faculty, and curriculum considerations to determine the degree of infusion of research integrity and responsible conduct of research (RCR) principles within a Doctor of Education program. Study results showed substantial increases in doctoral candidates' knowledge levels of RCR, and faculty members serving as dissertation committee chairs reported greater understanding of RCR tenets than did non-dissertation chairs. The study also revealed a strong presence of research within the Ed.D. core curriculum.

Ethical Environment of Organizations

Anderson, Melissa S., Ronning, Emily A., De Vries, Raymond, & Martinson, Brian C. 2007. “The perverse effects of competition on scientists' work and relationships.” Science and Engineering Ethics, 13 (4), 437-461. Competition among scientists for funding, positions and prestige, among other things, is often seen as a salutary driving force in U.S. science. Its effects on scientists, their work and their relationships are seldom considered. Focus-group discussions with 51 mid- and early-career scientists, on which this study is based, reveal a dark side of competition in science. According to these scientists, competition contributes to strategic game-playing in science, a decline in free and open sharing of information and methods, sabotage of others' ability to use one's work, interference with peer-review processes, deformation of relationships, and careless or questionable research conduct. When competition is pervasive, such effects may jeopardize the progress, efficiency and integrity of science.

Anderson, Melissa S., Martinson, Brian C., & De Vries, Raymond. 2007. “Normative dissonance in science: Results from a national survey of U.S. scientists.” Journal of Empirical Research in Human Research Ethics, 2 (4), 3-14. doi: 10.1525/jer.2007.2.4.3. Norms of scientific research represent ideals to which most scientists subscribe. Our analysis of the extent of dissonance between these widely espoused ideals and scientists' perceptions of their own and others' behavior is based on survey responses from 3,247 mid- and early-career scientists who had research funding from the U.S. National Institutes of Health. We found substantial normative dissonance, particularly between espoused ideals and respondents' perceptions of other scientists' typical behavior. Also, respondents on average saw other scientists' behavior as more counternormative than normative. Scientists' views of their fields as cooperative or competitive were associated with their normative perspectives, with competitive fields showing more counternormative behavior. The high levels of normative dissonance documented here represent a persistent source of stress in science.

Crain, A., Brian Martinson, and Carol Thrush. 2013. "Relationships Between the Survey of Organizational Research Climate (SORC) and Self-Reported Research Practices."  Science & Engineering Ethics 19 (3):835-850. doi: 10.1007/s11948-012-9409-0. The Survey of Organizational Research Climate (SORC) is a validated tool to facilitate promotion of research integrity and research best practices. This work uses the SORC to assess shared and individual perceptions of the research climate in universities and academic departments and relate these perceptions to desirable and undesirable research practices. An anonymous web- and mail-based survey was administered to randomly selected biomedical and social science faculty and postdoctoral fellows in the United States. Respondents reported their perceptions of the research climates at their universities and primary departments, and the frequency with which they engaged in desirable and undesirable research practices. More positive individual perceptions of the research climate in one's university or department were associated with higher likelihoods of desirable, and lower likelihoods of undesirable, research practices. Shared perceptions of the research climate tended to be similarly predictive of both desirable and undesirable research practices as individuals' deviations from these shared perceptions. Study results supported the central prediction that more positive SORC-measured perceptions of the research climate were associated with more positive reports of research practices. There were differences with respect to whether shared or individual climate perceptions were related to desirable or undesirable practices but the general pattern of results provide empirical evidence that the SORC is predictive of self-reported research behavior.

Croney, C. C., and R. Anthony. 2010. "Engaging science in a climate of values: tools for animal scientists tasked with addressing ethical problems."  Journal of Animal Science. 88 (13 Suppl):E75-81. doi: 10.2527/jas.2009-2353. In the United States, escalating concerns about current farm animal science and production methods have resulted not only in increased food animal protection policies, but also in animal welfare legislation. Animal scientists and industry leaders are apprehensive that such policies may be driven primarily by emotion and a lack of scientific understanding, and thus may have unforeseen consequences. However, decisions about animal care, and particularly animal welfare, cannot be made solely on the basis of science because the potential effects on producers, animals, and concerned citizens and the implications for the environment and on food prices must also be considered. Balancing the interests and values of all stakeholders in regard to animal welfare problems has presented a considerable challenge. Ethical accounting processes, such as the Ethical Matrix and the ethics assessment process by Campbell, offer models to combine socioethical concerns with relevant factual information, thereby facilitating decision making that is ethically responsible and that offers viable solutions. A case study is used to illustrate application of the ethics assessment process by Campbell that includes identification of the ethical problems, the embedded values, the relevant facts, and moral tests that can be applied. Awareness of these emerging ways of examining ethics that offer real solutions to conflicts of interests and not merely "one size fits all" answers should be an asset to animal and poultry scientists.

Fisher, Celia B., Gala True, Leslie Alexander, and Adam L. Fried. 2013. "Moral Stress, Moral Practice, and Ethical Climate in Community-Based Drug-Use Research: Views From the Front Line."  AJOB Primary Research 4 (3):27-38. doi: 10.1080/21507716.2013.806969. The role of front-line researchers, those whose responsibilities include face-to-face contact with participants, is critical to ensuring the responsible conduct of community-based drug use research. To date, there has been little empirical examination of how front-line researchers perceive the effectiveness of ethical procedures in their real-world application and the moral stress they may experience when adherence to scientific procedures appears to conflict with participant protections.  This study represents a first step in applying psychological science to examine the work-related attitudes, ethics climate, and moral dilemmas experienced by a national sample of 275 front-line staff members whose responsibilities include face-to-face interaction with participants in community-based drug-use research. Using an anonymous Web-based survey we psychometrically evaluated and examined relationships among six new scales tapping moral stress (frustration in response to perceived barriers to conducting research in a morally appropriate manner); organizational ethics climate; staff support; moral practice dilemmas (perceived conflicts between scientific integrity and participant welfare); research commitment; and research mistrust. As predicted, front-line researchers who evidence a strong commitment to their role in the research process and who perceive their organizations as committed to research ethics and staff support experienced lower levels of moral stress. Front-line researchers who were distrustful of the research enterprise and frequently grappled with moral practice dilemmas reported higher levels of moral stress.

Kisamore, Jennifer, Thomas Stone, and I. Jawahar. 2006. “Academic Integrity: The Relationship Between Individual and Situational Factors on Misconduct Contemplations.” Journal of Business Ethics 75 (4): 381-394. doi:10.1007/s10551-006-9260-9. Recent, well-publicized scandals involving unethical conduct have rekindled interest in academic misconduct. Prior studies of academic misconduct have focused exclusively on situational factors (e.g., integrity culture, honor codes), demographic variables or personality constructs. We contend that it is important to also examine how these classes of variables interact to influence perceptions of and intentions relating to academic misconduct. In a sample of 217 business students, we examined how integrity culture interacts with Prudence and Adjustment to explain variance in estimated frequency of cheating, suspicions of cheating, considering cheating and reporting cheating. Age, integrity culture, and personality variables were significantly related to different criteria. Overall, personality variables explained the most unique variance in academic misconduct, and Adjustment interacted with integrity culture, such that integrity culture had more influence on intentions to cheat for less well-adjusted individuals. Implications for practice are discussed and future research directions are offered.

Louis, Karen S., Holdsworth, Janet M., Anderson, Melissa S., & Campbell, Eric G. 2007. “Becoming a scientist: The effects of work-group size and organizational climate.” Journal of Higher Education, 78 (3), 311-336. The purpose of this article is to explore the effects of organizational and work-group characteristics on the socialization of new scientists. It focuses on the experiences of graduate students and postdoctoral fellows in science. The authors chose to look at outcomes that reflect behaviors (early productivity) and attitudes (willingness to share research findings) since both likely have an impact on the future attitudes and behavior of individuals once they enter the scientific work force. The first point suggested by the data is that the "local setting matters" in graduate education. For both of the outcome variables, a limited number of indicators of organizational structure and climate predict a relatively robust percentage of the variance. Although the rewards of science, from grants to the Nobel Prize, go to individuals, there is evidence that graduate students and postdoctoral fellows who find themselves in the right kind of work setting may have a leg up in their trajectories toward becoming successful scientists. A second overall finding is that "work group size is positively associated with early productivity." The authors conclude that, in a typical university setting, both graduate and postdoctoral students are better off being in larger laboratories. With respect to early productivity, the authors found that life science graduate students and postdoctoral fellows publish and present more than their chemical engineering peers. In spite of the increasingly cross-disciplinary nature of scientific research, this finding suggests the need to continue to explore underlying disciplinary differences that may make generalizations about graduate education inappropriate.

Martinson, Brian C., Anderson, Melissa S., & De Vries, Raymond. 2006. “Scientists' perceptions of organizational justice and self-reported misbehaviors.” Journal of Empirical Research on Human Research Ethics, 1 (1), 51-66. Policymakers concerned about maintaining the integrity of science have recently expanded their attention from a focus on misbehaving individuals to characteristics of the environments in which scientists work. Little empirical evidence exists about the role of organizational justice in promoting or hindering scientific integrity. Our findings indicate that when scientists believe they are being treated unfairly they are more likely to behave in ways that compromise the integrity of science. Perceived violations of distributive and procedural justice were positively associated with self-reports of misbehavior among scientists.

Martinson, B., Thrush, C., & Crain, A. L. (2013). Development and Validation of the Survey of Organizational Research Climate (SORC). Science & Engineering Ethics, 19 (3), 813-834. doi:10.1007/s11948-012-9410-7. Development and targeting efforts by academic organizations to effectively promote research integrity can be enhanced if they are able to collect reliable data to benchmark baseline conditions, to assess areas needing improvement, and to subsequently assess the impact of specific initiatives. A web- and mail-based survey was administered in 2009 to 2,837 randomly selected biomedical and social science faculty and postdoctoral fellows at 40 academic health centers in top-tier research universities in the United States. Measures included the Survey of Organizational Research Climate (SORC) as well as measures of perceptions of organizational justice. Exploratory and confirmatory factor analyses yielded seven subscales of organizational research climate, all of which demonstrated acceptable internal consistency and adequate test-retest reliability. The study found that the SORC demonstrates good internal (alpha) and external reliability (test-retest) as well as both construct and discriminant validity.

Mumford, Michael D., Ethan P. Waples, Alison L. Antes, Stephen T. Murphy, Shane Connelly, Ryan P. Brown, and Lindsay D. Devenport. 2009. “Exposure to Unethical Career Events: Effects on Decision-Making, Climate, and Socialization.” Ethics and Behavior 19 (5): 351-378. An implicit goal of many interventions intended to enhance integrity is to minimize peoples' exposure to unethical events. The intent of the present effort was to examine if exposure to unethical practices in the course of one's work is related to ethical decision-making. Accordingly, 248 doctoral students in the biological, health, and social sciences were asked to complete a field appropriate measure of ethical decision-making. In addition, they were asked to complete measures examining the perceived acceptability of unethical events and a measure examining perceptions of ethical climate. When these criterion measures were correlated with a measure examining the frequency with which they had been exposed to unethical events in their day-to-day work, it was found that event exposure was strongly related to ethical decision-making, but less strongly related to climate perceptions and perceptions of event acceptability. However, these relationships were moderated by level of experience. The implications of these findings for practices intended to improve ethics are discussed.

United States National Research Council and the National Institute of Medicine. 2002. Integrity in Scientific Research: Creating an Environment That Promotes Responsible Conduct. Washington D.C.: National Academies Press. doi:10.17226/10430. The pursuit and diffusion of knowledge enjoy a place of distinction in American culture, and the public expects to reap considerable benefit from the creative and innovative contributions of scientists. Major social institutions, including research institutions, are expected to be accountable to the public. Fostering an environment that promotes integrity in the conduct of research is an important part of that accountability. As a consequence, it is more important than ever that individual scientists and their institutions periodically assess the values and professional practices that guide their research as well as their efforts to perform their work with integrity. Considerable effort has been devoted to the task of defining research misconduct and elaborating methods for investigating allegations of misconduct. Much less attention has been devoted, however, to the task of fostering a research environment that promotes integrity. This report focuses on the research environment and attempts to define and describe those elements that enable and encourage unique individuals, regardless of their role in the research organization or their backgrounds on entry, to act with integrity. Although integrity and misconduct are related, the focus of this report is on integrity.

This material is based upon work supported by the National Science Foundation under Award No. 2055332. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

Academic Development Centre

Using an annotated bibliography to assess learning

Introduction

An annotated bibliography is a selected list of sources (texts, primary sources and/or internet sites) reported in an agreed referencing convention and accompanied by a short summary or analysis. The main focus is not to provide a list of sources but to demonstrate an understanding of them. Annotated bibliographies can be a useful starting point for a literature review, and may be better suited to formatively assessed components within an assessment strategy. AI will become increasingly adept at supporting the creation of descriptive and analytical annotated bibliographies. This sort of activity therefore presents good opportunities for students to develop research skills in working with AI, where the 'added human value' of evaluative judgement or selection will be the focus of assessment.

Different types of annotated bibliographies

There are two main types of annotated bibliography depending on their purpose and function.

A descriptive or informative annotated bibliography usually summarises a source, describes its distinctive features and usefulness for researching a particular topic or question. It also describes the author's main arguments and conclusions without making an evaluative judgement on what the author says.

An analytical or critical annotated bibliography not only summarises the material but also analyses the author’s argument, examines the strengths and weaknesses of what is presented, and considers the applicability of the author’s conclusions to the research being conducted.

What can annotated bibliographies assess?

This method can be used to assess students’ ability to access and manage information. More specifically it can give students the opportunity to develop skills and demonstrate their competence in:

  • researching
  • investigating
  • interpreting
  • organising information
  • reviewing and paraphrasing information
  • collecting data
  • comparing sources
  • referencing

Given their format and purpose, annotated bibliographies are not suitable for assessing the way in which a coherent and/or original argument is presented and developed.

Depending on the assignment, an annotated bibliography might have different purposes:

  • introduce students to research activities
  • provide a literature review on a particular subject
  • identify a gap in the literature
  • help to formulate a thesis on a subject
  • demonstrate the research students have performed on a particular subject
  • provide examples of major sources of information available on a topic
  • describe items that other researchers may find of interest on a topic
  • work with and build upon AI generated content.

It is essential to produce a clear brief that defines the purpose of the annotated bibliography as well as what it should provide. This could include, amongst others:

  • full reference details of the text / resource
  • details of the method employed by the author, including how AI was used (if applicable)
  • a synopsis of the argument made by the author(s)
  • identification of the advantages / limits in the way the study was conducted
  • an evaluation of the text’s relevance to a specific research question.

It might also be useful to define the range / number of sources that you expect students to include and the order in which they should present them (i.e. alphabetically, thematically, chronologically, etc.).

The brief should also specify if students are expected to preface the bibliography with a short overall introduction and include a concluding paragraph that draws together key points.

Clear marking criteria should be set as part of the design of the task and shared with the students.

Diversity & inclusion

Annotated bibliographies, when supported by an appropriate marking rubric, provide an opportunity for students to distinguish themselves through their selection of sources, their ability to reference, the quality of their writing, and their analytical insights. This method can also support inclusivity by allowing students a choice of topic, sources and approach. It can be a powerful way to empower students by giving them the opportunity to contribute to the curriculum and can contribute to efforts to decolonise the curriculum. Consideration should be given as to how these elements will impact on the quality of the annotated bibliography and whether they should be incorporated into the intended learning outcomes and marking criteria.

Academic integrity

Students should be provided with clear guidance on how to preserve academic integrity. A wide range of annotated bibliographies is available on the web, so students might be tempted either to use pre-existing materials or to rework abstracts instead of reading the whole text. AI can generate material for an annotated bibliography and has the potential to become an integral part of academic research, so it should be considered when designing the brief. Assistance from AI will enable students to do more within the assessment hours, and assessment criteria may need to reflect this. Alternatively, if the intended learning outcomes necessitate AI-free or AI-light engagement, you might want to integrate mechanisms that also require and assess a personal response to the task, or a requirement to integrate material that cannot be readily scraped from the internet.

One way of reducing the risk of misconduct is to design marking criteria that focus on how students critique the sources in relation to a specific question. It might also be useful to ask students to write a brief introduction to their work that explains how and why they selected the sources, and a brief conclusion that pulls together the key points raised across the various sources into a cohesive argument in relation to the question set.

Student and staff experience

This type of task can help develop students’ ability to do independent research, identify relevant sources, and write a clear and concise evaluation of these texts. It can be a useful way to give students a good grounding in the topic and help them to connect a range of sources and see various perspectives. Once marked, annotated bibliographies can also be shared between peers as a resource and can function as a study aid for later assessed work.

Students might be unfamiliar with the format of this task; it is therefore important to share examples of work with students and carefully explain what is expected. To avoid students writing long, general summaries of sources, they should be given the opportunity to practise summarising the whole source in one initial sentence. Once they have mastered this, they can then develop the rest of the annotation and expand on the:

  • quality of the arguments put forward
  • academic rigour of the source
  • perspective taken by the source
  • theoretical underpinnings of the source
  • possible impact of the source.

Given the need to summarise and the short word count available, students should be particularly wary of making unsupported judgements and sweeping generalisations.

Both students and staff might underestimate the time it takes to write an annotated bibliography. To ensure the workload is manageable, it might be useful to consider the type of sources; depending on the time allocated, it might be more realistic for students to produce a good annotated bibliography of a selection of shorter articles rather than of a range of very long texts or books. Careful definition of the task in this way helps to control the workload and time spent.

Useful resources

https://sites.umuc.edu/library/libhow/bibliography_apa.cfm

https://library.leeds.ac.uk/info/1401/academic_skills/80/annotated_bibliographies

https://info.lse.ac.uk/staff/divisions/Teaching-and-Learning-Centre/Assessment-Toolkit/Assessment-methods/Annotated-bibliographies



FairTest

Annotated Bibliography: Performance Assessment

American Psychological Association Presidential Task Force on Psychology in Education and Mid-continent Regional Educational Laboratory. (1993). Learner-Centered Principles for School Reform. Washington, DC: APA (Office of Psychology in Education, Education Directorate, APA, 750 First St., NE, Washington, DC 20002).

Includes principles for developing and using student-centered performance assessments that can enhance the learning process. Calls for consulting and involving students in the design of assessment systems. Principles also cover cognitive, affective, developmental, personal and social factors of learning, and their implications for school redesign and reform.

Archbald, D. and F. Newmann. (1988). Beyond Standardized Testing: Assessing Authentic Academic Achievement in the Secondary School. Reston, VA: National Association of Secondary School Principals.

Discusses authentic academic achievement, how to assess it, and how to implement assessment programs. Considers secondary school and college-level assessment alternatives currently in place. Appendix includes a critique of standardized tests. Authors also have chapters in Berlak, below. (1904 Association Dr., Reston, VA 22091).

Ball, A. F. (1993). Incorporating Ethnographic-Based Techniques to Enhance Assessments of Culturally Diverse Students’ Written Exposition. Educational Assessment, 1(3), 255-281.

Explains the need to include ethnographic-based approaches in writing assessments of culturally and linguistically diverse populations, in order for teachers to better understand and work with a broad range of students. Discusses as an example of this approach the assessment of eight students’ essays.

Barrs, M. et al. (1988). Primary Language Record. Portsmouth, NH: Heinemann (361 Hanover St., Portsmouth, NH 03801-3912, (800) 541-2086).

Provides an excellent, succinct explanation of how and why children acquire literacy and how to document and assess their literacy behaviors. Integrates instruction and assessment of literacy from a whole language perspective. Developed for use with multilingual school populations. Top-notch comprehensive handbook serves as a text on whole language. Includes sample recording forms. A somewhat different handbook, adapted for U.S. use, is the California Learning Record (see below).

Note: A variety of materials based on the PLR are discussed in this bibliography. See: California Learning Record; Cooper & Barr; Darling-Hammond, Ancess & Falk; Falk & Darling-Hammond; Falk, MacMurdy & Darling-Hammond; and Hester.

Baxter, G.P., Glaser, R., & Raghavan, K. (1994). Analysis of Cognitive Demand in Selected Alternative Science Assessments. CSE Technical Report 382 (see CRESST, at Organizations, below).

Research concluding that constructing high-quality tasks that will spur deeper thinking in and across subject areas is quite difficult. Tasks need to be subject to careful review, including interviewing students, itself a difficult process.

Berlak, H., et al. (1992). Toward a New Science of Educational Testing and Assessment. Albany, NY: State University of New York Press.

Eight essays discuss why assessment must change. Provides examples of changing assessment in math and social studies, as well as more general discussions of assessment methods and approaches in the U.S. and England. Proposes a new structure for assessment systems, focusing on portfolios and documenting student work.

Burke, K. (1993). The Mindful School: How to Assess Authentic Learning. Palatine, IL: IRI/Skylight Publishing.

An easy-to-read introduction to classroom assessment. Compensates for lack of depth with wide range of assessment options, plus many graphics and examples.

Burton, E. & Linn, R. (1994). Comparability Across Assessments: Lessons from the Use of Moderation Procedures in England. Los Angeles, CA: Center for Research on Evaluation, Standards, and Student Testing. CSE Technical Report 369 (see CRESST, at Organizations, below).

A detailed look at several methods of establishing comparability among performance assessments which involve bringing readers closer to agreement, termed “moderation.”

California Assessment Collaborative. (1993). Charting the Course Toward Instructionally Sound Assessment: A Report of the Alternative Assessment Pilot Project. San Francisco: Author.

Summarizes and analyzes work of nearly 20 California school district projects in implementing performance assessment. Provides a conceptual map of essential elements and narratives of various experiences. Includes a thoughtful discussion of the costs and benefits of performance assessment. Implications of changing assessments on education as a whole are discussed in the conclusion. (WestEd, 730 Harrison St., San Francisco, CA 94107; $8.00).

California Learning Record. (1994). El Cajon: Center for Language in Learning. Adapted from Barrs, et al., Primary Language Record (see above). (CLL, 10610 Quail Canyon Rd., El Cajon, CA 92021; (619) 443-6320.)

CLR provides handbooks for teachers in grades K-6 and 6-12, forms for documenting student learning, and scales and directions for evaluation and gauging student progress. The forms and scales are now available in Spanish, making them more useful in classrooms in which Spanish is spoken and for working with Spanish-speaking parents.

California State Department of Education. (1989). A Question of Thinking: A First Look at Students’ Performance on Open-Ended Questions in Mathematics. Sacramento, CA: Author.

Reviews student responses to open-ended math tasks from the 1987-88 grade twelve California Assessment Program. Tasks were designed to emphasize written communication in math, and this book’s purpose is to help teachers use the assessment to guide instruction to that end. Student misconceptions are analyzed and corresponding instructional recommendations are made. One chapter is devoted to how scoring rubrics were developed, but little attention is paid to how questions were developed.

California Department of Education. (1994-1995). A Sampler of Science Assessment and A Sampler of Science Assessment: Elementary (Preliminary Edition). Sacramento, CA: Author.

The bulk of these books is made up of tasks and student responses from the California Learning Assessment System (CLAS), for grades five, eight and ten. They emphasize how and why responses should be/were scored, as well as how rubrics were developed. Includes hands-on performance tasks, open-ended response items and “enhanced” multiple-choice questions.

Calfee, R. C. & Perfumo, P. (1993). Student Portfolios and Teacher Logs: Blueprint for a Revolution in Assessment. Berkeley, CA: University of California, National Center for the Study of Writing.

Looks at some problems with large-scale uses of classroom literacy portfolios, particularly inconsistency of information and lack of a rigorous technical base to support large-scale use. Proposes a method for systematizing the collection and evaluation of work that enables classroom flexibility and individualization of portfolios and teacher decision-making with standardization of summary information.

Carini, P. F. (1994). Dear Sister Bess: An Essay on Standards, Judgement and Writing. Assessing Writing, 1(1), 29-65.

Interesting article on assessing writing in the classroom that is also a personal reflection from a leading voice in portfolio assessment. Shows how standards can emerge for each student writer through her own work. Argues standardized assessments remove context and personhood, and externally-imposed standards could cause the same with performance assessment.

Carstens, L. (1993). From the Bottom Up: A Sourcebook of Scoring Rubrics Designed by Teachers. San Diego, CA: San Diego City Schools.

Developing high-quality scoring guides, or rubrics, is difficult, and good ones are rare. This book contains a number of high-quality examples.

Cooper, W., & Barr, M. (1995). The Primary Language Record & The California Learning Record in Use. El Cajon, CA: Center for Language in Learning (see California Learning Record, above).

“Proceedings from the PLR/CLR International Seminar” contain a set of articles, many written by teachers, on the early years of implementing the PLR/CLR in classrooms in London, New York City, and California. Articles examine parent involvement, using PLR/CLR across the curriculum, equal opportunity, professional development, bilingual students, student self-assessment, and the use of the instruments for accountability. An excellent companion to other pieces on the PLR/CLR (see also Barrs, et al., above).

Council for Exceptional Children. (1995). Performance Assessment and Students with Disabilities. 1920 Association Drive, Reston, VA 20191; (800) 232-7323.

A “mini-library” from CEC containing four articles providing theory and practical information for teachers and teacher educators. Authors are L. Fuchs; S. Elliott; M. Thurlow; and M. McLaughlin & S. Warren.

Darling-Hammond, L., Ancess, J., & Falk, B. (1995). Authentic Assessment in Action: Studies of Schools and Students at Work. New York: Teachers College Press.

Powerful set of case studies on the development of performance assessments and their positive effects on the schools which use them — three high schools and two elementary schools, all but one in New York City. Cases are: Graduation by Portfolio at Central Park East Secondary School; The Senior Project at Hodgson Vocational Technical School (in Delaware); Collaborative Learning and Assessment at International High School; The Primary Language Record at P.S. 261; The Bronx New School. The first and last chapters frame the discussion and draw implications.

Darling-Hammond, L., Einbender, L., Frelow, F., & Ley-King, J. (1993). Authentic Assessment in Practice: A Collection of Performance Tasks, Exhibitions, and Documentation. New York, NY: NCREST (see Organizations, below).

Useful compilation of methods for collecting and evaluating student work, divided into three sections. “Performance Tasks and Exhibitions” includes a general explanation of performance assessment, a piece on how to design tasks, and task plans for various subject areas and cross-disciplinary assessments. “Portfolios” looks at state (Vermont), school (high school graduation) and classroom examples. “Documentation of Learning Over Time” includes discussion of the Primary Language Record (see Barrs, et al., above) and California Learning Record (see above) and other classroom documentation practices.

Diversity and Equity in Assessment Network (DEAN). (1993). Guidelines for Equitable Assessment. Cambridge, MA: FairTest.

A brief set of guidelines for fairness in assessment. Includes recommendations for use of performance assessment, with cautions to ensure equity. (Free from FairTest with SASE.)

Edelsky, C. & Harman, S. (1988). One More Critique of Testing – With Two Differences. English Education. Oct., pp. 157-171.

Excellent summary of many problems with standardized tests, suggests appropriate assessment procedures to meet the different needs of: parents, teachers and students; the public and elected officials; and researchers.

Educational Assessment.

A quarterly academic journal with a primary focus on performance assessment. (Lawrence Erlbaum Associates, Publishers, Mahwah, NJ. Editor: Robert Calfee, Stanford University).

Educational Leadership. (1989). Redirecting Assessment. 46(7), April.

Includes 17 short articles on developing alternative assessments, providing a wide range of introductory materials useful for understanding alternatives theoretically and in practice.

Educational Leadership. (1992). Using Performance Assessment. 49(8), May.

This issue focuses entirely on performance assessment. Nineteen articles are divided into three sections: “Using Performance Assessment,” “Using Portfolios,” and “Synthesis of Research.” This updates the April 1989 issue. (See ASCD, at Organizations, below).

Estrin, E. T. (1993). Alternative Assessment: Issues in Language, Culture, and Equity. Knowledge Brief #11. San Francisco: WestEd (see Organizations, below).

Summary of many important issues in the current assessment reform movement. Provides information for considering equity issues. Notes that all assessment presumes cultural experiences and values. Recommends portfolio and performance assessments in the multilingual or multicultural classroom, but cautions that such assessment may not be compatible with social experiences and community practices for some students.

Estrin, E.T., & Nelson-Barber, S. (1995). Issues in Cross Cultural Assessment: American Indian and Alaska Native Students. San Francisco: WestEd (see Organizations, below).

This “Knowledge Brief” covers areas similar to Nelson-Barber & Estrin (cited below) with a focus on assessment.

FairTest. (1993). Bibliography on Testing and Evaluating Young Children. Cambridge, MA: Author. (See FairTest order form, last page).

Annotated bibliography contains entries on standardized tests and performance assessments for children from pre-school through grade three.

FairTest. (1995). Selected Annotated Bibliography on Language Minority Assessment. Cambridge, MA: Author. (See FairTest order form, last page).

Includes both more detailed annotations of material on performance assessment for students who are learning English and articles critical of standardized tests used on limited English proficient students.

FairTest. (1991). Standardized Tests and Our Children: A Guide to Testing Reform. Cambridge, MA: Author. (See FairTest order form, last page).

A 32-page, easy-to-read pamphlet which explains what standardized tests are, how they are used, what’s wrong with them, and alternative ways to evaluate students. Also includes sections on parents’ rights, testing terms, and what you can do. Available in Spanish and English; special New York edition available in both languages.

FairTest Examiner. (See FairTest order form, last page).

Quarterly newsletter from FairTest surveys developments in testing and testing reform, including pre-school, elementary & secondary school, IQ, university admissions and employment testing. Contains regular discussions of performance assessments.

Falk, B. and Darling-Hammond, L. (1993). The Primary Language Record at P.S. 261: How Assessment Transforms Teaching and Learning. New York: NCREST (see Organizations, below).

Reports in detail on one New York City school’s very positive experiences using the PLR (see Barrs, et al., above). Describes student-centered classroom activities that provide rich information about individual learning, the principal’s critical role, and the ways teaching was improved and home-school relations strengthened. Warns that traditional school structure does not provide time needed for implementation of the PLR.

Falk, B., MacMurdy, S., & Darling-Hammond, L. (1995). Taking a Different Look: How the Primary Language Record Supports Teaching for Diverse Learners. New York: NCREST (see Organizations, below).

Discusses how the PLR can be used to improve instruction for students with limited English proficiency, with special needs, from low-income or minority group backgrounds. The impact on instruction and placement decisions is noted. Contains many concrete examples from student records and teacher interviews. A useful complement to the PLR (see Barrs, above).

Garcia, G.E., & Pearson, P.D. (1994). Assessment and Diversity. In L. Darling-Hammond (Ed.), Review of Research in Education. Washington, DC: American Educational Research Association.

Demonstrates harm caused by standardized test use with language minority students. Examines variety of performance assessment practices, in the classroom and for accountability, exploring advantages and potential difficulties. Concludes that changes in assessment are needed, but professional development and political changes are required to support the reforms.

Gardner, H. (1991). Assessment in Context: The Alternative to Standardized Testing. In B. Gifford & M.C. O’Connor, eds., Cognitive Approaches to Assessment. Boston: Kluwer Academic.

Detailed discussion of alternatives rooted in students’ classroom work and recent scientific understandings. Gardner has published many articles and books on assessment. The book has a number of other articles of potential interest.

Gearheart, M., Herman, J., Baker, E. L., & Whittaker, A. K. (1993). Whose Work Is It? A Question for the Validity of Large-Scale Portfolio Assessment. Los Angeles, CA: Center for Research on Evaluation, Standards, and Student Testing, CSE Technical Report 363 (see CRESST at Organizations, below).

Discusses how student collaboration, teacher-student collaboration and outside help for students raise issues for use of portfolios in large-scale assessments.

Glazer, S. M. & Brown, S. B. (1993). Portfolios and Beyond: Collaborative Assessment in Reading and Writing. Norwood, MA: Christopher-Gordon Publishers.

Provides “how to” assessment information, addressing “Why Change?” in the first chapter, “Questions Teachers Ask” in the last, and issues such as student-teacher collaboration throughout. The theoretical assumptions about literacy acquisition and the approach to instruction are non-traditional and informative.

Goodman, K. S., Bird, L. B., & Goodman, Y. M. (1992). The Whole Language Catalogue: Supplement on Authentic Assessment. Santa Rosa, CA: SRA School Group.

This guide, primarily for teachers, includes contributions from expert researchers, teachers, principals, parents and students. The combination of viewpoints creates a comprehensive, powerful picture of literacy assessment that is integrated with teaching and learning. Assessment tools are discussed in detail in various categories — conferences and interviews, anecdotal records, checklists, learning logs, learning portfolios, and parent-teacher communication. Sections on student and teacher self-evaluation are valuable. Examples are plentiful.

Grace, C. & Shores, E. F. (1994). The Portfolio and Its Uses: Developmentally Appropriate Assessment of Young Children, 3rd edition. Little Rock: Southern Early Childhood Association (P.O. Box 5403, Brady Stn., Little Rock, AR 72215; $10).

Provides an excellent, detailed look at how and why to organize young children’s assessment around the portfolio. Sections include using the portfolio in evaluating children and in communicating with parents.

Harman, S. (1992). “Snow White and the Seven Warnings: Threats to Authentic Evaluation.” Reading Teacher. November, pp. 22-25.

How traditional testing ideas and practices — such as norm-referencing, aggregating data, calibrating assessments, and new commercial products that are touted as authentic but are little more than recycled basal readers — can undermine performance assessment.

Hein, G. E., ed. (1990). The Assessment of Hands-on Elementary Science Programs. Grand Forks, ND: North Dakota Study Group on Evaluation.

Eleven chapters discuss assessment theory, large-scale assessments, and classroom assessment with elementary schoolchildren. Contains many examples and details. (Box 8158, University of North Dakota, Grand Forks, ND 58202; $12 plus 15% handling).

Hester, H., et al. (1993). Guide to the Primary Learning Record. London: Center for Language in Learning. (Available from FairTest for $30).

Very valuable tool for documenting and assessing the process and content of student learning. Includes sections for the subjects mandated by the British national curriculum: the “core” subjects (most detailed for language/English, math, and science); the “foundation” subjects (art, geography, history, physical education, and technology); and religious education. The PLeR language arts section is a condensed version of the PLR (see Barrs, et al., above). PLeR contains a Guide and record-keeping forms. Adapting PLeR to the U.S. will require making basic decisions about standards or curriculum that can be used to define the content to be assessed.

High Scope Educational Research Foundation. (1992). Child Observation Record. Ypsilanti, MI: Author.

Manual and forms for developmentally appropriate, ongoing observation and assessment of children 2-1/2 through 6 years of age. Provides examples of behaviors that can be recorded in each of six categories; describes various forms of performance assessment.

Hill, B.C., & Ruptic, C. (1994). Practical Aspects of Authentic Assessment. Norwood, MA: Christopher-Gordon.

This large volume is intended as a practical guide to elementary school classroom assessment. Focuses on literacy, includes chapters on assessing in content areas, assessing special needs students, student self-assessment, parent involvement, and reporting. Contains many examples and reproducible materials for observation, documentation, portfolios, evaluations.

Hill, C. and Larsen, E. (1993). Testing and Assessment in Secondary Education: A Critical Review of Emerging Practices. Berkeley, CA: National Center on Research on Vocational Education (NCRVE Materials Distribution, Western Illinois Univ., 46 Horrabin Hall, Macomb, IL 61455; (800) 637-7652; #MDS-237, $6.50).

Wide-ranging discussion of authentic assessment focuses on two major approaches — alternative testing and documentation. Each is subjected to careful analysis, with examples from current assessments. While supporting authentic assessment, the authors question some claims for its benefits and point to a range of difficulties.

International Reading Association and National Council of Teachers of English. (1994). Standards for the Assessment of Reading and Writing. Newark, DE, and Urbana, IL: Authors.

These standards call for performance assessment that is integrated with teaching and learning, places the needs of the student first, is equitable, does not have harmful consequences, involves multiple perspectives and multiple forms of data, and is based in a school community in which all members have a voice in assessment.

Johnston, P. H. (1992). Constructive Evaluation of Literate Activity. New York: Longman.

Based on student-centered learning and assessment, contains detailed discussions of portfolios, observational checklists and dialogue journals. Has thorough instructions, accompanied by an audiotape for practice, for using “Running Records” to assess reading. Includes a strong section on “What We Value in Evaluation,” which places validity, fairness and reporting in the context of what is valued, primarily important individual learning and thoughtful, personalized teaching.

Johnston, P. (1987). Teachers as Evaluation Experts. The Reading Teacher. April, pp. 744-748.

Discusses the fact that teachers can and do evaluate and urges more help for teachers so that they can become evaluation experts.

Kentucky Department of Education, Office of Assessment and Accountability. (1994). Kentucky Mathematics Portfolio: Teacher’s Guide. Frankfort, KY: Author.

Designed to guide Kentucky teachers into portfolio assessment, this comprehensive handbook covers theory, contents, development, scoring and task ideas. Outlines types of entries to include (e.g., interdisciplinary; writing), explains core math concepts students are expected to learn and corresponding evidence of meeting expectation. Presents task ideas for grades 5 and 8 and high school, accompanied by “Criteria for Appropriate Portfolio Tasks.” Includes state’s holistic scoring guide, with instructions for use. Concludes with a “Q&A” for teachers.

Koelsch, N., Estrin, E. T., & Farr, B. (1995). Guide to Developing Equitable Performance Assessments. San Francisco: WestEd (see Organizations, below).

Useful booklet provides background information on equity in assessment, including linguistic and cultural issues and connecting assessment to local context; strategies for developing performance tasks, with examples drawn from assessments used with American Indian students; and guidelines for workshops on the topic, including thoughtful questions for examining tasks. Contains five good criteria for “authentic” assessments, ways to reduce bias, and ways to think about equity.

Kulm, G. (1994). Mathematics Assessment: What Works in the Classroom. San Francisco: Jossey-Bass.

A comprehensive book on assessing math in the classroom. Starts with the purposes of assessment (based largely on the NCTM Standards, see below) and goals for teaching and learning. Part 2 explains how to plan and design a variety of assessments, for individuals and groups, including performance tasks, investigations, journals, portfolios, interviews, student self-assessment, and scoring. Part 3 presents classroom assessment models based on actual cases.

LeMahieu, P., Gitomer, D., & Eresh, J. (1995). Portfolios in Large Scale Assessment: Difficult But Not Impossible. Educational Measurement: Issues and Practice, 14(3), Fall.

Analyzes success of Pittsburgh’s writing portfolio assessment as an accountability tool. Shows portfolios can allow for student choice in selecting materials, diversity in content and variety in classroom work, and still obtain sufficient reliability for public accounting. Portfolios support high quality classroom practice, help reveal actual instructional practices and student opportunity to learn.

Linn, R. L., Baker, E. L., & Dunbar, S. B. (1991). Complex, Performance-based Assessment: Expectations and Validation Criteria. Educational Researcher , November, pp. 15-21.

Influential effort to rethink the criteria for judging the quality of educational assessments to meet the rise of performance assessment and more complex understandings of validity. Proposed criteria include consequences, fairness, generalizability, cognitive complexity, content quality and coverage, meaningfulness, and cost. To some extent, the criteria presume discrete assessments, rather than a continuous assessment process.

Maryland Assessment Consortium. (1994-1995). Performance Assessment Tasks. Frederick, MD: Author.

The Consortium has a growing number of tasks in a variety of subject areas, compiled in 3-ring binders. Tasks are selected in part for use in assessing the Maryland State Learning Outcomes, but have wider value. Vol. 6 is the best of their elementary school tasks, Vol. 7 the best for middle school. Available for sale or on a site-license basis, and will also be issued on CD-ROM. (c/o Frederick County Public Schools, 115 E. Church St., Frederick, MD 21701; (301) 694-1337).

Mathematical Sciences Education Board, National Research Council. (1993). Measuring What Counts: A Conceptual Guide for Mathematics Assessment. Washington, DC: National Academy Press.

In-depth discussion of mathematics curriculum and the role of assessment in education, coupled with good examples of individual assessment tasks, group activities and performance exams. Also covers task design, scoring and reporting.

McCollum, S. L. (1994). Performance Assessment in the Social Studies Classroom: A How-To Book for Teachers. Joplin, MO: Chalk Dust Press.

Describes a host of classroom activities that can be turned into assessment opportunities, and how to do so. Some activities are of mediocre quality and the suggested scoring rubrics tend to be somewhat sparse and superficial, but the book is useful as it is one of the few devoted only to social studies.

McColskey, W. & O’Sullivan, R. (1993). How to Assess Student Performance in Science: Going Beyond Multiple-Choice Tests. Greensboro, NC: SouthEastern Regional Vision for Education (SERVE).

Useful level of detail on how to create and score science assessments, including informal and formal observation, journals, performance tasks and open-ended questions. Provides examples of tasks and systems.

McDonald, J. P., Smith, S., Turner, D., Finney, M., & Barton, E. (1993). Graduation by Exhibition: Assessing Genuine Achievement . Alexandria, VA: Association for Supervision and Curriculum Development. (See ASCD, at Organizations, below).

Discusses Coalition of Essential Schools idea of using exhibitions as a basis for determining high school graduation. Teachers provide four case studies (using essays, position papers, multi-media presentations, and Socratic seminars) of how to use exhibitions to “plan backwards” toward school reform. Having defined what students should be able to do, reform involves reshaping the school to best help students meet the goal. Discusses obstacles and setbacks as well as accomplishments and progress.

McLaughlin, B., Gesi Blanchard, A., & Osanai, Y. (1995). Assessing Language Development in Bilingual Preschool Children. Washington, DC: National Clearinghouse for Bilingual Education, Program Information Guide Series, No. 22.

Provides an introduction to various routes to language acquisition, points out the limitations of standardized tests, and offers guidelines for assessing bilingual children, calling for instructionally embedded assessment. Outlines the California Early Language Development Assessment Process, which utilizes a series of steps: planning assessment; collecting information through observation and documentation; developing a portfolio; writing a narrative summary; meeting with family and staff; and developing curriculum and instruction based on the needs found in the assessment.

Meisels, S. J. (1992). The Work Sampling System. Ann Arbor: Rebus Planning Associates (1103 S. University Ave., Ann Arbor, MI 48104; 1-800-435-3085).

Provides performance assessment methods for evaluating young children from age 3 through grade 3, utilizing developmental checklists, portfolios and summary reports. All parts are classroom-focused. Now used in hundreds of schools.

Medina, N., & Neill, D. M. (1990). Fallout from the Testing Explosion: How 100 Million Standardized Exams Undermine Equity and Excellence in America’s Public Schools. Cambridge, MA: FairTest, third edition. (See order form, last page).

Includes: a survey on the extent of test use; analysis of problems with test construction, reliability, validity, administration, and bias; and the harmful impact of testing on educational goals, curriculum, student progress, and local control of schools. Contains an annotated bibliography. For a shorter version, see D. M. Neill and N. J. Medina, “Standardized Testing: Harmful to Educational Health,” Phi Delta Kappan (May 1989), pp. 688-697. (The issue contains several other relevant articles, including Wiggins, below).

Messick, S. (1992). The Interplay of Evidence and Consequences in the Validation of Performance Assessments. Princeton, NJ: Educational Testing Service, RR-92-39.

Messick, a leading theorist of validity, argues that key concepts of validity can be used to shape and evaluate performance assessments, which should be driven not by tasks but by understanding the construct to be assessed if performance assessments are to measure and promote higher order thinking skills. The focus is on validating assessments used for high-stakes decision-making, such as high school graduation. The discussion is therefore at times helpful, at times misleading for other assessment purposes.

Mitchell, R. (1992). Testing for Learning. New York: Free Press/Macmillan. (Available from FairTest, see order form on last page).

Comprehensive overview of performance assessments as they are developing in the U.S. Moves from a critique of multiple-choice testing to a wide-ranging study of performance assessments. Discusses writing, math, and science, assessed via portfolios and other methods, using programs in Arizona, California, Maryland and Vermont as examples. Teachers’ roles in assessment and involving parents and the community are also discussed.

Mitchell, R., Willis, M., & Chicago Teachers Union Quest Center. (1995). Learning in Overdrive: Designing Curriculum, Instruction, and Assessment from Standards. Golden, CO: North American Press.

Guide for cooperatively writing and using standards at the school level. Contains clear examples and step-by-step procedures. Though it has a too-narrow approach to the broader question of the purposes of schooling and offers a limited range of assessment practices, it is still worthwhile for teachers, curriculum specialists, and principals.

Morrow, L. M. (1988). Retelling Stories As a Diagnostic Tool. In Glazer, S. M., Searfoss, L. W., & Gentile, L. M., eds., Re-examining Reading Diagnosis. Newark, DE: International Reading Association.

Describes the wide range of information teachers can obtain about students’ reading comprehension in a structured discussion, or “re-telling,” of stories. Provides a “how-to” for retelling.

Moss, P. A. (1992). Shifting Conceptions of Validity in Educational Measurement: Implications for Performance Assessment. Review of Educational Research , 62(3), 229-258.

Argues that expanding the concept of validity to include the consequences of assessment provides support for performance assessment. While validity criteria historically have privileged standardized forms of assessment, researchers, assessors and educators should question the traditional assumptions and principles of validity and expand the ways in which assessment is validated.

Moss, P. A., et al. (1992). Portfolios, Accountability, and an Interpretive Approach to Validity. Educational Measurement: Issues and Practice, 11(3), Fall, pp. 12-21.

Discusses how individually-varied classroom portfolios, developed by students and teachers, can be used in providing public information. Focus is on classroom and school-level information, but makes suggestions for large-scale assessment. A case study is used to illustrate the development of the portfolio.

Moya, S. S. & O’Malley, J. M. (1994). “A Portfolio Assessment Model for ESL.” The Journal of Educational Issues of Language Minority Students. Spring, pp. 13-36.

Proposes guidelines for use of portfolio assessment with limited English proficient students in elementary and secondary settings. Provides a rationale for portfolios and includes a portfolio assessment model for English as a Second Language (ESL) classes.

National Association for the Education of Young Children & National Association on Early Childhood Specialists in State Departments of Education. (1991). “Guidelines for Appropriate Curriculum Content and Assessment Programs Serving Children Ages 3 through 8,” Young Children . March, pp. 21-38.

Excellent set of guidelines for teachers and administrators for developmentally appropriate curricula and assessments for children ages 3-8, including those in special education programs. (Also available from NAEYC, 1509 15th St., NW, Washington, DC 20036).

National Council of Teachers of Mathematics. (1995). Assessment Standards for School Mathematics . Reston, VA: Author.

Standards state that assessment should reflect important math content, enhance math learning, promote equity, be an open process, promote valid inferences about learning, and be a coherent process. A section on using the standards for different purposes includes examples and discussion of performance tasks, projects and portfolios.

National Education Association. (1993). Student Portfolios . Washington, DC: Author.

A good complement to more comprehensive works on portfolios, the articles describe a range of experiences with portfolios, including early childhood literacy assessment, high school-wide cross-curricular evaluation sensitive to learning style, and parent-teacher-student collaboration.

National Forum on Assessment (1995). Principles and Indicators for Student Assessment Systems . Cambridge, MA: FairTest. (See FairTest order form, last page).

Endorsed by many leading education and civil rights organizations, this is a thorough set of principles for developing, reviewing or revising student assessment systems. Seven principles are: primary assessment purpose is to improve learning; other assessment uses must support learning; assessment is fair; professional collaboration and development supports assessment; the community participates in assessment development; communication about assessment is clear; and assessment systems are reviewed and improved. A section on Educational Foundations describes conditions supportive of good assessment.

Navarrete, C., Wilde, J., Nelson, C., Martinez, R., & Hargett, G. (1990). Informal Assessment in Educational Evaluation: Implications for Bilingual Education Programs. Washington, DC: National Clearinghouse for Bilingual Education.

Presents concerns with standardized testing in bilingual education programs and offers informal assessment techniques as an alternative. Defines informal assessment, describes examples of both structured and unstructured informal assessment, and explains various scoring methods for these assessments. Gives guidelines for using portfolios in bilingual education programs.

Neill, M., Bursh, P., Schaeffer, B., Thall, C., Yohe, M., & Zappardino, P. (1995). Implementing Performance Assessment: A Guide to Classroom, School and System Reform. Cambridge, MA: FairTest. (See FairTest order form, last page).

This guide from FairTest, written for teachers, administrators and parents, is a comprehensive, systematic look at performance assessments, from classroom to system-wide uses. Describes observations, interviews, work samples, projects, performances, exhibitions, exams, portfolios and learning records. Discusses equity, communication with parents, getting started. Additional chapters focus on use of assessment information in evaluating students, establishing validity, conducting school-level evaluation, and using assessment for accountability. Final chapter is on organizing for change. Includes a lengthy bibliography and list of resources.

Nelson-Barber, S., & Estrin, E.T. (n.d., 1995). Culturally Responsive Mathematics and Science Education for Native Students . San Francisco: WestEd (see Organizations, below).

Although the report does not directly focus on assessment, its implications are important. The authors detail how ways of knowing are culturally based, and how basing standards and instruction, and thus assessment, on one culture tends both to exclude students from other cultural backgrounds from accessing knowledge and to limit the very conceptions of science and math. If assessment is to serve all students, these issues must be addressed. (See also Estrin & Nelson-Barber, above).

Nettles, M. T. & Nettles, A. L. , eds. (1995). Equity and Excellence in Educational Testing and Assessment . Norwell, MA: Kluwer Academic Publishers.

Papers from 1993 Ford Foundation conference. Provides valuable ideas and information on equity in performance assessment. Offers strong support for use of performance assessments, while exploring potential problems in using them with minority group students. Includes overviews, analyses of particular projects, and impact of new assessments on classroom, state and national levels.

Newmann, F., Secada, W., & Wehlage, G. (1995). A Guide to Authentic Instruction and Assessment: Vision, Standards and Scoring . Wisconsin Center for Educational Research, 1025 W. Johnson St., Rm. 242, Madison, WI 53706; (608) 263-4214.

Defines authenticity as including construction of knowledge, disciplined inquiry, and value beyond school. Establishes seven standards for use in assessment, instruction, and setting performance standards in math and social studies. Provides examples of assessments and instruction that meet the standards. The last chapter explores how to use the Guide in school reform. Appendices discuss scoring criteria and explain the studies on which the Guide is based.

Office of Bilingual Education and Minority Languages Affairs (OBEMLA). (1992). Focus on Evaluation and Measurement: Proceedings of the Second National Research Symposium on Limited English Proficient Student Issues. 2 vols. Washington, DC: U.S. Department of Education.

Compilation of papers from the Second National Research Symposium sponsored by OBEMLA. Documents focus on the role of assessment in relation to accountability and program improvement at the federal, state, and local levels. The authors believe that the core of the school reform movement is the dissemination of innovations in evaluation and measurement. Papers by Canales, Damico, French, and Ortiz discuss various uses and methods of performance assessment with LEP students.

Paulson, L. (1994). Portfolio Guidelines in Primary Math. Portland, OR: Multnomah Education Service District Press.

Fine introductory booklet to math portfolios. While the discussion and examples focus on primary math, there is much of use for secondary math teachers and for other subject areas.

Perrone, V., ed. (1991). Expanding Student Assessment. Alexandria, VA: Association for Supervision and Curriculum Development. (See ASCD, at Organizations, below).

Many valuable chapters written by educators cover: methods, such as portfolios and documentation; assessment in writing, science, and other areas; classroom implementation; and using the correct assessment for each specific purpose.

Polakowski, B., Spicer, W., & Zimmerman, J. (1992). Linking Assessment to Accountability/ Linking Curriculum to Appropriate Assessment: South Brunswick School District, South Brunswick, NJ. In Day, B., Malarz, L., & Terry, M., eds., Education and Care of Young Children . Alexandria, VA: ASCD (see ASCD, at Organizations, below).

Outlines process by which teachers implemented portfolio and performance assessment for young children across the district. Discusses problems and successes, including teacher workload, the process of constructing a scale for standardizing the data for accountability purposes, the effects on instruction and learning, and communicating with parents.

Portfolio News . (U. California – San Diego, Teacher Education Program, Portfolio Assessment Clearinghouse, 8500 Gilman Dr., La Jolla, CA 92093-0070. $32 yearly).

Quarterly newsletter full of information about using portfolios, including descriptions of individual projects at various schools and districts and discussion of issues of concern in portfolio assessment. Reviews and resources are published regularly.

Puckett, M. B. & Black, J. K. (1994). Authentic Assessment of the Young Child: Celebrating Development and Learning. New York: Macmillan College Publishing Company.

Teacher-oriented, clear and comprehensive book. Includes discussion of theoretical issues and practical advice, with many examples.

Raizen, S. A., Baron, J. B., Champagne, A. B., Haertel, E., Mullis, I., & Oakes, J. (1990). Assessment in Science Education: The Middle Years. Andover, MA: National Center for Improving Science Education. (The NETWORK, 300 Brickstone Square, Suite 900, Andover, MA 01810).

Book’s premises are that assessment should improve instruction; and content, instruction, activities and assessment must be developmentally appropriate. Discusses cognitive abilities (and limitations) of 10-14 year-olds as they relate to teaching, learning and assessment in science. Suggests classroom activities for scientific inquiry, along with ways to turn them into assessments. A few state assessment systems (e.g., Vermont) are reviewed, and policy implications of testing are discussed.

Raven, J. (1991). The Tragic Illusion: Educational Testing. Unionville, NY: Trillium Press.

Provides a comprehensive critique of standardized testing. Explains why assessment should: help students attain competence, which varies individually and has many dimensions; include cognitive, affective, and conative (effort and motivation) elements; and be put in social context. Explains ways to do this and the consequences for schooling. (An abbreviated version is in Berlak, above).

Rethinking Schools . (1994). Rethinking Our Classrooms: Teaching for Equity and Justice. Rethinking Schools, 1001 E. Keefe Ave., Milwaukee, WI 53212; (414) 964-9646; $6.00 + S&H.

Contains many activities and ways of thinking about teaching that will be valuable for teachers seeking to integrate performance assessment into a changing classroom. Includes criticisms of origins and consequences of multiple-choice and norm-referenced standardized tests.

Ryan, C. D. (1994). Authentic Assessment. Westminster, CA: Teacher Created Materials.

A good, though not comprehensive, introductory booklet for the teacher new to authentic assessment. Covers a wide array of topics, including portfolios, performance tasks and observations. Useful section on parent involvement, and helpful overviews on assessment strategies specific to content areas: language arts, mathematics, science and social studies. The publisher also has a booklet on portfolio assessment.

Shepard, L. & Bliem, C. (1993). Parent Opinions about Standardized Tests, Teacher’s Information and Performance Assessments. Los Angeles: Center for Research on Evaluation, Standards, and Student Testing, CSE Technical Report 367. (See CRESST, at Organizations, below).

A Colorado case study of how parents responded very favorably to introduction of performance assessments. Nearly all preferred them to standardized tests, and half would simply do away with standardized tests.

Sizer, T. (1992). Horace’s School. Boston: Houghton Mifflin.

Uses Coalition of Essential Schools as basis for fictitious Franklin High to discuss how to fundamentally restructure schools. A committee struggles to develop and implement change. Uses the exhibition as the center of the assessment process, as is the case in many Coalition schools. (See CES, at Organizations, below).

Stenmark, J. K. (1991). Mathematics Assessment: Myths, Models, Good Questions, and Practical Suggestions. Reston, VA: National Council of Teachers of Mathematics.

An excellent handbook for teachers — and not only math teachers — to begin developing and using performance assessments in the classroom. Covers tasks and projects, observations and interviews, portfolios. Contains many valuable concrete ideas.

Taylor, C. (1994). Assessment for Measurement or Standards: The Peril and Promise of Large-Scale Assessment Reform. American Educational Research Journal, 31(2), 231-262.

Lucidly describes the dangers of shaping performance assessment to fit the norm-referencing that dominates traditional measurement. Calls for large-scale assessments to be based on standards, not norm-referenced comparisons. Does not address classroom assessments.

Thomas, W., et al. (1995). The CLAS Portfolio Assessment Research and Development Final Report . Princeton, NJ: Center for Performance Assessment, ETS Mail Stop 11-P, 08541.

Summary of a project to develop and score “organic” portfolios, for accountability purposes, based on California’s curriculum frameworks in language arts and math. Essential “dimensions of learning” were identified and a scoring guide constructed. The idea was for students to include evidence to “build the best case” for demonstrating achievement. Thus, great variety in portfolio content was accepted. Report discusses the process of developing and scoring, shows strongly positive results and provides evidence that this sort of portfolio system can be built. These are not, however, portfolios oriented toward helping classroom instruction and learning during the school year.

Thrust for Educational Leadership . (Nov.-Dec., 1993). (ACSA, 1575 Old Bayshore Hwy., Burlingame, CA, 94010. $5.00).

Issue focuses on performance assessment. Includes summaries of math standards and of San Diego’s pioneering city assessment system. Discusses portfolios and different assessment systems, of uneven quality, from different California districts. Leads off with “Equitable Assessment” by FairTest’s Monty Neill.

Tierney, R. J., Carter, M. A., & Desai, L. E. (1991). Portfolio Assessment in the Reading-Writing Classroom . Norwood, MA: Christopher-Gordon Publishers.

Excellent, accessible, detailed practical and theoretical overview of portfolio assessment for literacy takes readers from concepts to evaluation and record keeping. Includes a broad range of examples across various grades and subjects. Authors emphasize that each student’s collection of work will differ and that each classroom will exhibit a unique approach to authentic assessment.

Valdez Pierce, L. & O’Malley, J. M. (1992). Performance and Portfolio Assessment for Language Minority Students. Washington, DC: National Clearinghouse for Bilingual Education, #9, Spring.

Provides definitions and summary of alternative assessment and two of its varieties, performance assessment and portfolio assessment. Details each in terms of purpose, types, design, administration, and scoring with emphasis on use with language minority students. Lists common concerns of portfolio assessment and provides remedies for these concerns.

Virginia Education Association and Appalachia Educational Laboratory. (1992). Alternative Assessments in Math and Science: Moving Toward a Moving Target. Charleston, WV: Appalachia Educational Laboratory.

Eleven pairs of Virginia teachers used authentic assessment in their math and science classrooms for a school year. Booklet reports their findings and recommendations. Describes changes in student achievement and attitude, instruction, teacher effectiveness, and working conditions. A list of implementation strategies is emphasized. Includes sample activities conducive to classroom assessment in math and science.

Wiggins, G. P. (1993). Assessing Student Performance: Exploring the Purpose and Limits of Testing . San Francisco: Jossey-Bass Publishers.

More a philosophical treatise than a “how-to,” the book offers insights and ideas important for practitioners. Argues that implementing performance assessment requires a world view based on new values that recognize assessment is a profoundly interpersonal activity. It does contain many particulars and some examples.

Wiggins, G. (1989). A True Test: Toward More Authentic and Equitable Assessment. Phi Delta Kappan. May, pp. 703-713.

Criticizes standardized tests. Argues that authentic tests (as sometimes different from assessment in general) can be valuable for learning and enhancing equity. Defines authentic tests as evidencing and providing a means to judge knowledge. Establishes criteria for authentic exams and for grading them. Links new forms of testing and assessment to restructuring schools. (See CLASS, at Organizations, below).

Winner, E. (1991). Arts PROPEL: An Introductory Handbook . Cambridge, MA: Project Zero (see Organizations, below).

A look at using portfolio assessment in the arts in a highly influential model program in Pittsburgh (see also Wolf in Educational Leadership , April 1989). Contains examples, discussion of many particulars of implementation.

Wolf, D. et al. (1991). To Use Their Minds Well: Investigating New Forms of Student Assessment. In G. Grant, ed., Review of Research in Education 17. Washington, DC: American Educational Research Association.

Provides a critique of the “culture of testing.” Proposes development of a “culture of assessment” based on use of the process-folio as a means of documenting and analyzing student progress and achievements. Discusses the impact of assessment change on schooling.

ORGANIZATIONS

The Association for Supervision and Curriculum Development (ASCD) has a rapidly growing list of fine materials — books, manuals and videos — on performance assessment. 1250 N. Pitt St., Arlington, VA 22314-1403; (703) 549-9110.

The Center for Collaborative Education works with the Coalition of Essential Schools and is closely connected with Central Park East Secondary School. Assessment reform is part of their school restructuring efforts. 1573 Madison Ave., Rm. 201, New York, NY 10029.

The Center on Learning, Assessment, and School Structure (CLASS) provides written materials and consultants for school reform, with a major focus on assessment. Its director is Grant Wiggins (see bibliography). 648 The Great Road, Princeton, NJ 08540; (609) 252-1211.

The Center for Research on Evaluation, Standards and Student Testing (CRESST) is the federal research center on testing and assessment. They have a free newsletter (CRESSTLine) and produce research reports on performance assessments, among other things. 405 Hilgard Ave., 145 Moore Hall, Los Angeles, CA 90024-1522; (310) 206-1532.

The Coalition of Essential Schools is a national network of schools engaged in restructuring, including assessment. Central Park East and International High, discussed in Darling-Hammond, Ancess & Falk, and the schools in McDonald, et al., are in the CES (see Bibliography). Box 1969, Brown University, Providence, RI 02912; (401) 863-3384.

Many states are working on performance assessments; some have included performance items as part of statewide exams. The Council of Chief State School Officers has a number of interstate consortia working on performance assessments, and they can put you in touch with states developing performance assessments, such as Vermont (portfolio assessments), Connecticut, Delaware, Kentucky, Maryland, and Maine (performance exams). Contact Ed Roeber, CCSSO, One Mass. Ave., NW, #700, Washington, DC 20001; (202) 336-7045.

The National Center for Education Outcomes for Students with Disabilities has sets of indicators that include assessment. 350 Elliot Hall, University of Minnesota, 75 East River Road, Minneapolis, MN 55455; (612) 624-4014.

The National Center for Fair & Open Testing (FairTest) works to make assessment fair, equitable and educationally helpful. In addition to conducting reform campaigns, it has numerous books, pamphlets, reports, bibliographies and fact sheets (see FairTest order form, last page). 15 Court Square, Suite 820, Boston, MA 02108; (857) 350-8207.

The National Center for Restructuring Education, Schools, and Teaching (NCREST) has many publications on performance assessment. Box 110, Teachers College, Columbia University, New York, NY 10027; fax (212) 678-4170.

The New Standards Project is a national effort to develop both standards and assessments, including performance exams, projects and portfolios. They are working with a number of states and districts. c/o National Center on Education and the Economy, 700 11th Street NW – Suite 750, Washington, DC 20001; (202) 783-3668.

Performance Assessment Collaboratives for Education (PACE) works with schools, districts and states in developing performance assessments. Harvard Graduate School of Education, 8 Story Street, Cambridge, MA 02138; (617) 496-2770.

Project Zero conducts research on performance assessment from pre-school through high school, works with some schools in implementing assessments, and provides a bibliography of papers and materials about their work. Harvard Graduate School of Education, Longfellow Hall, Appian Way, Cambridge, MA 02138; (617) 495-4342.

Prospect Archive and Center is a pioneer in the use of documentation, portfolios and teacher dialogue about student learning. Also runs institutes on assessment. P.O. Box 226, N. Bennington, VT 05257.

WestEd, formerly the Far West Laboratory, has a growing list of publications and videos on assessment, with a focus on equity, in particular for American Indians. WestEd, 730 Harrison St., San Francisco, CA 94107-1242; (415) 565-3000.

In addition, many of the national subject area organizations, the Regional Laboratories, and organizations of teachers, principals and administrators are working to develop and implement new assessments.


Engagement in Online Learning

3 Annotated Bibliography

Al Mamun, A., Lawrie, G., & Wright, T. (2020). Instructional design of scaffolded online learning modules for self-directed and inquiry-based learning environments. Computers & Education, 144, 1-17. https://doi.org/10.1016/j.compedu.2019.103695

Tags: Instructional Design, Discipline-specific (STEM)
Summary: This article outlines a scaffolding model for use in asynchronous settings. The instructional design follows a ‘predict-observe-explain-evaluate’ (POEE) sequence, and the inquiry-based approach is relevant to STEM disciplines.

Aloni, M., & Harrington, C. (2018). Research based practices for improving the effectiveness of asynchronous online discussion boards. Scholarship of Teaching and Learning in Psychology, 4(4), 271–289. https://doi.org/10.1037/stl0000121

Tags: Online Discussions, Instructional Design
Summary: This literature review discusses the challenges, benefits, teaching strategies, and best practices for asynchronous discussions, with tables that outline these issues and list the specific sources that address them. Many of the issues are also relevant for synchronous and hybrid classes.

Angelino, L. M., Williams, F. K., & Natvig, D. (2007). Strategies to engage online students and reduce attrition rates. Journal of Educators Online, 4 (2), 1-14.

Tags: Group Work Summary: Table 1 shows strategies to reduce attrition. Group projects are mentioned as one strategy to engage learners in an online environment and reduce attrition; they aim to create a community of learners.

Bacca-Acosta, J., & Avila-Garzon, C. (2020). Student engagement with mobile‐based assessment systems: A survival analysis. Journal of Computer Assisted Learning, 37 (1), 158–171. https://doi.org/10.1111/jcal.12475

Tags: Instructional Design, Discipline-specific (ESL), Assessment Summary: Research shows that mobile-based assessment increases students’ learning outcomes and motivation. This paper shows that students with positive acceptance of mobile-based assessment and higher levels of self-reported effort engage for longer periods of time.

Banta, T. W., Jones, E. A., & Black, K. E. (2009). Designing effective assessment: Principles and profiles of good practice. Jossey-Bass. [ link ]

Tags: Assessment Summary: The book highlights examples of effective assessment in higher education, including a section on “CLASSE: Measuring Student Engagement at the Classroom Level” by Robert Smallwood and Judith Ouimet. The Classroom Survey of Student Engagement (CLASSE) is an adaptation of the NSSE that has been piloted in in-person classrooms. Smallwood and Ouimet indicate that an overview of CLASSE, along with survey results and the instrument, can be found at http://assessment.ua.edu/CLASSE/Overview.htm. While these pages are no longer online, copies can be accessed using the Internet Archive’s Wayback Machine.

Belland, B. R., Kim, C., & Hannafin, M. J. (2013). A framework for designing scaffolds that improve motivation and cognition. Educational Psychologist, 48 (4), 243–270.  https://doi.org/10.1080/00461520.2013.838920

Tags: Instructional Design Summary: This article proposes guidelines for the design of computer-based scaffolds to promote motivation and engagement. The guidelines are to establish task value, promote mastery goals, promote belonging, promote emotion regulation, promote expectancy for success, and promote autonomy. Through better motivational scaffolds, all three kinds of engagement (behavioral, emotional, cognitive) can be enhanced.

Bigatel, P.M., & Edel-Malizia, S. (2018). Using the “Indicators of Engaged Learning Online” framework to evaluate online course quality. TechTrends, 62 , 58-70. https://doi.org/10.1007/s11528-017-0239-4

Tags: Instructional Design Summary: This paper uses the Indicators of Engaged Learning Online Framework (appendix 1) to assess the quality of online courses.

Bigatel, P., & Edel-Malizia, S. (2018). Predictors of instructor practices and course activities that engage online students. Online Journal of Distance Learning Administration, 21 (1), 1-19. [ link ]

Tags: Instructional Design Summary: This paper describes what activities and attitudes and behaviors of instructors most increase student engagement in online courses. An important finding was the sharing of knowledge and expertise within the learning community increases engagement.

Bond, M., Buntins, K., Bedenlier, S., Zawacki-Richter, O. & Kerres, M. (2020). Mapping research in student engagement and educational technology in higher education: A systematic evidence map. International Journal of Educational Technology in Higher Education, 17 (2), 1–30. https://doi.org/10.1186/s41239-019-0176-8

Tags: Introduction, Assessment Summary: This article maps 243 studies published between 2007 and 2016 in the area of student engagement and digital technology. Most studies focus on undergraduates and on text-based tools such as online discussion boards and blended learning. Many studies neither defined student engagement nor used a theoretical framework.

Boton, E. C., & Gregory, S. (2015). Minimizing attrition in online degree courses. Journal of Educators Online, 12 (1), 62-90.

Tags: Group Work Summary: This paper uses qualitative case studies to find out how culture, motivation, learning management systems, and online pedagogy can increase student engagement and reduce attrition. Group work is mentioned as one strategy.

Bovee, B. S., Jernejcic, T., & El-Gayar, O. (2020). A gamification technique to increase engagement in asynchronous online discussions. Issues in Information Systems, 21 (3), 20–30. https://doi.org/10.48009/3_iis_2020_20-30

Tags: Online Discussions Summary: This article studied the effects of gamification on student video discussion posts. Winners were the students whose posts received the most replies, and a website displayed those in the lead. The gamification increased both behavioral and cognitive engagement.

Brown, R. E. (2001). The process of community-building in distance learning classes. Journal of Asynchronous Learning Networks, 5 (2), 18−35. http://dx.doi.org/10.24059/olj.v5i2.1876

Tags: Online Discussions, Instructional Design Summary: Three key stages to community building: making online acquaintances (i.e., meeting people who may share similar interests), sensing community acceptance (i.e., being willing to share and accept similar and opposing points of view), and achieving camaraderie (i.e., mutual respect).

Cavanagh, S. R. (2016). The spark of learning: Energizing the college classroom with the science of emotion. West Virginia University Press. [ link to publisher site ]

Tags: Emotional Engagement, Instructional Design Summary: This book focuses on the emotional aspect of student engagement.  It provides a combination of evidence from scholarly sources and practical, anecdotal examples.

Czerkawski, B. C., & Lyman, E. W. (2016). An instructional design framework for fostering student engagement in online learning environments. TechTrends, 60 , 532-539. https://doi.org/10.1007/s11528-016-0110-z

Tags: Instructional Design Summary: This paper presents a 4-phase instructional design framework and strategies to foster student engagement in online classes. It combines student participation, motivation and student success.

Darby, F., & Lang, J.M. (2019). Small teaching online: Applying learning science in online classes. Jossey-Bass. [ link to publisher site ]

Tags: Online Discussions, Group Work Summary: Part 1, on designing for learning, covers both backward design (alignment of course goals, activities, and assessment) and student engagement. Part 2 focuses on building community and giving feedback to foster student success. Part 3 is about motivating students, specifically by making connections and giving autonomy. Discussion boards and group work can both build community and establish connections. A reference list is included at the end of the book.

Davies, W. M. (2009). Groupwork as a form of assessment: common problems and recommended solutions. Higher Education, 58 , 563-584. http://dx.doi.org/10.1007/s10734-009-9216-y

Tags: Group Work, Social Engagement Summary: The paper discusses problems connected with group work, such as free riding and the ‘sucker effect’, and how the design of group work can alleviate these issues.

deNoyelles, A., Zydney, J.M., & Chen, B. (2014). Strategies for creating a community of inquiry through online asynchronous discussions.  MERLOT Journal of Online Learning and Teaching, 10 (1), 153-165. [ link ]

Tags: Online Discussions, Instructional Design Summary: Designing online discussion with a community of learners in mind. Each member should have a social, cognitive and teaching presence. One strategy mentioned is students acting as peer-reviewers or peer-facilitators. By serving as such, students have a greater connection to the discussion as they may be leading the discussion or providing additional insights.

Ding, L., Kim, C., & Orey, M. (2017). Studies of student engagement in gamified online discussions. Computers & Education, 115 , 126-142. https://doi.org/10.1016/j.compedu.2017.06.016

Tags: Online Discussions Summary: Gamification of online discussion has a positive effect on student engagement (behavioral, emotional, and cognitive). Practical items such as badges, thumbs-ups, progress bars, and avatars were used.

Dixson, M. D. (2015). Measuring student engagement in the online course: The online student engagement scale (OSE). Online Learning, 19 (4). http://dx.doi.org/10.24059/olj.v19i4.561

Tags: Assessment Summary: This study provides validation of the Online Student Engagement scale (OSE) by correlating student self-reports of engagement (via the OSE) with tracking data of student behaviors from an online course management system.

Dziuban, C., Picciano, A., Graham, C., & Moskal, P. (2017). Conducting research in online and blended learning environments: New pedagogical frontiers. Routledge. https://doi.org/10.4324/9781315814605

Tags: Assessment Summary: This book can be useful when designing research in online and blended learning environments. As a Faculty Learning Community, we found that few papers research the assessment of student engagement; this book can therefore be valuable for future research.

Fehrman, S., & Watson, S. L. (2020). A systematic review of asynchronous online discussions in online higher education. American Journal of Distance Education. Advance online publication. https://doi.org/10.1080/08923647.2020.1858705

Tags: Online Discussions Summary: This systematic review identifies key themes on asynchronous online discussions in higher education found in peer reviewed literature with publication dates from 2010-2020.

Fredricks, J.A., Blumenfeld, P.C., & Paris, A.H. (2004). School engagement: Potential of the concept, state of the evidence. Review of Educational Research, 74 (1), 59-109. https://doi.org/10.3102/00346543074001059

Tags: Behavioral Engagement, Emotional Engagement, Cognitive Engagement Summary: This paper outlines three main types of engagement: behavioral, emotional, and cognitive. It gives concrete examples of engagement in face-to-face classrooms. The authors call for richer characterizations of how students behave, feel, and think in order to assess engagement and develop interventions.

Fredricks, J. A., Wang, M. T., Linn, J. S., Hofkens, T. L., Sung, H., Parr, A., & Allerton, J. (2016). Using qualitative methods to develop a survey measure of math and science engagement. Learning and Instruction, 43 , 5–15. https://doi.org/10.1016/j.learninstruc.2016.01.009

Tags: Social Engagement, Discipline-specific (STEM), Assessment Summary: This is a study of face-to-face STEM classes (middle school and high school). Table 1 contains indicators of engagement drawn from interviews. A major conclusion is the need to develop valid and reliable measures of engagement in STEM that include a social engagement dimension.

Gay, G. H. E. & Betts, K. (2020).  From discussion forums to emeetings: Integrating high touch strategies to increase student engagement, academic performance, and retention in large online courses. Online Learning, 24 (1), 92-117. https://doi.org/10.24059/olj.v24i1.1984

Tags: Group Work Summary: Student engagement and group work are critical to developing competencies, deeper learning, and attributes that align with 21st-century skills. Quantitative data shows higher academic scores and lower attrition, while qualitative data shows increased engagement.

Garrison D. R., Anderson, T., & Archer, W. (2010). The first decade of the community of inquiry framework: A retrospective. Internet and Higher Education, 13 (1/2), 5-9. https://doi.org/10.1016/j.iheduc.2009.10.003

Tags: Online Discussions, Social Engagement, Cognitive Engagement, Emotional Engagement Summary: This is an update on the evolution of the Community of Inquiry framework. When initially presented in 2000 (see Garrison, Anderson, & Archer, 2000), the three essential factors (social, cognitive, and teaching presence) were treated individually. In retrospect, the authors view these three presences as interconnected.

Garrison, D. R., Anderson, T., & Archer, W. (2000). Critical inquiry in a text-based environment: Computer conferencing in higher education. The Internet and Higher Education, 2 (2/3), 87-105. http://dx.doi.org/10.1016/S1096-7516(00)00016-6

Tags: Online Discussions, Social Engagement, Cognitive Engagement, Emotional Engagement Summary: Authors propose three essential factors that contribute to a student’s successful educational experience: social presence, cognitive presence, and teaching presence.

Garrison, D. R., & Arbaugh, J. B. (2007). Researching the community of inquiry framework: Review, issues, and future directions. The Internet and Higher Education, 10 (3), 157-172. http://dx.doi.org/10.1016/j.iheduc.2007.04.001

Tags: Social Engagement, Cognitive Engagement, Emotional Engagement Summary: This paper tracks the community of inquiry literature and finds that future research should include more quantitatively oriented and cross-disciplinary studies. In addition, more research is needed on factors that moderate and/or extend the relationship between the CoI framework components and online course outcomes.

Grandzol, C. J., & Grandzol., J. G. (2010). Interaction in online courses: More is not always better. Online Journal of Distance Learning Administration, 13 (2). [ link ]

Tags: Online Discussions, Instructional Design Summary: An effective online discussion depends in part on class size: small classes (e.g., fewer than 6 students) tend to have trouble getting students to actively engage with course discussions due to a lack of variety in replies, while large classes may hinder instructor participation.

Groccia, J. E. (2018). What is student engagement? New Directions for Teaching & Learning, 2018 (154), 11–20. https://doi.org/10.1002/tl.20287

Tags: Definition of Engagement Summary: This chapter reviews various definitions of student engagement in a general sense (not in the online environment per se ).

Guajardo-Leal, B. E., Navarro-Corona, C., & González, J. R. V. (2019). Systematic mapping study of academic engagement in MOOC. International Review of Research in Open and Distributed Learning, 20 (2), 113–139. https://doi.org/10.19173/irrodl.v20i2.4018

Tags: Instructional Design Summary: This is a synthesis of studies conducted from 2015 to 2018 on student engagement in MOOCs. One goal of the study was to develop a technique for locating and evaluating previous studies. The synthesis found that a majority of studies used qualitative techniques to study engagement, and learning analytics were the most common type of data collected.

Hall, D., & Buzwell, S. (2013). The problem of free-riding in group projects: Looking beyond social loafing as reason for non-contribution. Active Learning in Higher Education, 14 (1), 37-49.  http://dx.doi.org/10.1177/1469787412467123

Tags: Group Work, Social Engagement Summary: This study surveyed students on their attitudes towards group work and free-riders, or students who do not contribute a perceived equal share by peers but receive the same group grade. Hypotheses for the causes of voluntary and involuntary free-riding in groups are proposed but additional work is needed.

Henrie, C. R., Halverson, L. R., & Graham, C. R. (2015). Measuring student engagement in technology-mediated learning: A review. Computers & Education, 90 , 36–53. https://doi.org/10.1016/j.compedu.2015.09.005

Tags: Assessment Summary: This paper reviews effective methods to conceptualize and measure student engagement in technology-mediated learning. Table 8 shows how to operationalise engagement. More research is needed on the role of emotional engagement in learning.

Hew, K. F. (2016). Promoting engagement in online courses: What strategies can we learn from three highly rated MOOCS. British Journal of Educational Technology, 47 (2) , 320–341. https://doi.org/10.1111/bjet.12235

Tags: Group Work, Online Discussions, Instructional Design Summary: Five factors for MOOC popularity were found: (1) problem‐centric learning with clear expositions, (2) instructor accessibility and passion, (3) active learning, (4) peer interaction, and (5) using helpful course resources. These factors can guide course design.

Hirumi, A., & Bermudez, A. B. (1996). Interactivity, distance education, and instructional systems design converge on the information superhighway. Journal of Research on Computing in Education, 29 (1), 1−16. https://doi.org/10.1080/08886504.1996.10782183

Tags: Online Discussions, Instructional Design Summary: Distance education programs were historically based on correspondence courses, where little interaction took place. Designing courses for interactivity is possible and can create advantages over face-to-face design, for example, personal messages from the instructor and quicker replies.

Hsieh, Y-H., & Tsai, C-C. (2012). The effect of moderator’s facilitative strategies on online synchronous discussions. Computers in Human Behavior, 28 (5), 1708–1716. https://doi.org/10.1016/j.chb.2012.04.010 .

Tags: Online Discussions, Group Work Summary: Facilitators can increase collaboration and participation in online synchronous discussions. Moderator messages focusing on the main topic and giving students positive feedback were the most common strategies observed.

Kim, M. K., Lee, I. H., & Wang, Y. (2020). How students emerge as learning leaders in small group online discussions. Journal of Computer Assisted Learning, 36 (5), 610–624.  https://doi.org/10.1111/jcal.12431

Tags: Online Discussions, Group Work Summary: This study examined the online asynchronous interactions of graduate students to identify emerging learning leaders. Student posts were coded for leadership style and for behavioral, cognitive, and emotional engagement. The researchers used the IBM tone analyzer to recognize student emotions in the text.

Kim, M. K., Wang, Y., & Ketenci, T. (2020). Who are online learning leaders? Piloting a leader identification method (LIM). Computers in Human Behavior, 105 , 1-15.  https://doi.org/10.1016/j.chb.2019.106205

Tags: Group Work, Assessment Summary: Referenced in the Bacca-Acosta article as providing evidence that emotional engagement is one of the main attributes of leaders in online learning contexts. Draws on social network theory.

Kinsella, G. K., Mahon, C., & Lillis, S. (2017). Facilitating active engagement of the university student in a large-group setting using group work activities.   Journal of College Science Teaching, 46 (6),  34-43.

Tags: Group Work Summary: This study was conducted in face-to-face courses. It surveyed student participants on their experiences participating in small group exercises as part of large classes to enhance student engagement and promote peer learning.

Linnenbrink-Garcia, L., & Pekrun, R. (2011). Students’ emotions and academic engagement: Introduction to the special issue. Contemporary Educational Psychology, 36 (1), 1-3.  https://doi.org/10.1016/j.cedpsych.2010.11.004

Tags: Introduction Summary: Introduction to a special issue about emotion and engagement: how and why student emotions emerge, how these emotions in turn shape students’ engagement and achievement, and the ways in which students can harness emotional resources for facilitating their engagement and achievement.

Liu, M., Liu, L., & Liu, L. (2018). Group awareness increases student engagement in online collaborative writing. The Internet and Higher Education, 38 , 1–8. https://doi.org/10.1016/j.iheduc.2018.04.001

Tags: Group Work, Instructional Design Summary: This study examined the impact of a group-awareness tool called Cooperpad on behavioral and cognitive student engagement. The article cites sources showing that group-awareness tools that make individual participation visible increase participation.

Martin, F., & Bolliger, D.U. (2018). Engagement matters: Student perceptions on the importance of engagement strategies in the online learning environment. Online Learning Journal, 22 (1), 205–222. https://doi.org/10.24059/olj.v22i1.1092

Tags: Group Work, Online Discussions, Instructional Design Summary: Viewing engagement through a community of inquiry lens (learner to instructor, learner to learner, learner to content), the authors find that learner-to-content engagement is improved with realistic assignments and that learner-to-learner engagement is improved with collaboration and discussions.

Mandernach, B. J.  (2015). Assessment of student engagement in higher education: A synthesis of literature and assessment tools. International Journal of Learning, Teaching and Educational Research, 12 (2), 1-14. [ link ]

Tags: Assessment Summary: Engagement is a dynamic concept that comprises behavioral, affective, and cognitive dimensions. Table 1 gives an overview of how engagement data can be collected.

Mayer, G., Lingle, J., & Usselman, M. (2017). Experiences of advanced high school students in synchronous online recitations. Educational Technology & Society, 20 (2), 15–26. [ link ]

Tags: Online Discussions, Group Work, Discipline-specific (STEM) Summary: Learning activities encouraged anonymous input and group work in a student-centered learning environment. Participants were high school calculus students.

Mendini, M., & Peter, P. C. (2019). Research note: The role of smart versus traditional classrooms on students’ engagement. Marketing Education Review, 29 (1), 17–23. https://doi.org/10.1080/10528008.2018.1532301

Tags: Instructional Design Summary: This study compared a traditional face-to-face classroom environment to one using smart technology. Results suggest higher student engagement with groups and the instructor in the classroom without technology.

Morley, C., & Ablett, P. (2017). Designing assessment to promote engagement among first year social work students.  E-Journal of Business Education and Scholarship of Teaching, 11 (2), 1–14. https://files.eric.ed.gov/fulltext/EJ1167329.pdf

Tags: Assessment Summary: Assessing students on a group task (a presentation) increased collaboration and cooperation. The authors propose assessing group work as a way to promote engagement.

Morgan, C. K., & Tam, M. (1999). Unraveling the complexities of distance education student attrition. Distance Education, 20 (1), 96−108. https://doi.org/10.1080/0158791990200108

Tags: Online Discussions Summary: Addresses a connection between motivation and participation/engagement in online courses.  Students in distance learning environments express feelings of isolation or alienation in comparison to traditional face-to-face classes which have a physical classroom and face-to-face interactions with other students.

Muir, T., Dyment, J., Hopwood, B., Milthorpe, N., Stone, C., & Freeman, E. (2019). Chronicling engagement: students’ experience of online learning over time. Distance Education, 40 (2), 262–277. https://doi.org/10.1080/01587919.2019.1600367

Tags: Assessment, Online Discussions Summary: Weekly survey of students to uncover factors that affect student engagement and the factors that affect fluctuation in student engagement. Tracking of student engagement throughout the course as opposed to a fixed point in time.

Nagel, L., Blignaut, A. S., & Cronje, J. C. (2009). Read-only participants: A case for student communication in online classes. Interactive Learning Environments, 17 , 37–51. https://doi.org/10.1080/10494820701501028

Tags: Online Discussions Summary: Makes an argument that discussion forums are essential forms of communication for online classes. In addition, the paper highlights negative aspects of online discussions, such as read-only participants.

Newton, D. W., LePine, J. A., Kim, J. K., Wellman, N., & Bush, J. T. (2020). Taking engagement to task: The nature and functioning of task engagement across transitions. Journal of Applied Psychology, 105 (1), 1–18. https://doi.org/10.1037/apl0000428

Tags:  Cognitive Engagement, Emotional Engagement, Behavioral Engagement Summary: This is a job-based rather than a classroom-based study.  Engagement can differ between tasks, and there can be a spillover effect from one task to the next in that engagement in one task can influence engagement in the next task.

Ouyang, F., & Chang, Y. H. (2019). The relationships between social participatory roles and cognitive engagement levels in online discussions. British Journal of Educational Technology, 50 (3), 1396-1414. https://doi.org/10.1111/bjet.12647

Tags: Introduction, Group Work, Instructional Design, Social Engagement, Cognitive Engagement Summary: This was a multi-method study to examine student participation in asynchronous online discussions. Engagement on a social level can deepen interaction on a cognitive level and vice versa.

Pérez-López, R., Gurrea-Sarasa, R., Herrando, C., Martín-De Hoyos, M. J., Bordonaba-Juste, V., & Acerete, A. U. (2020). The generation of student engagement as a cognition-affect-behaviour process in a Twitter learning experience. Australasian Journal of Educational Technology, 36 (3), 132–146. https://doi.org/10.14742/ajet.5751

Tags: Online Discussions, Group Work Summary: This study evaluates the use of Twitter as online discussion tool, to increase student engagement. Recommendation is to use active and collaborative activities to increase engagement and performance.

Peterson, A.T., Beymer, P.N., & Putnam, R.T. (2018). Synchronous and asynchronous discussions: Effects on cooperation, belonging, and affect. Online Learning, 22 (4), 7-25. https://doi.org/10.24059/olj.v22i4.1517

Tags: Group Work, Online Discussions Summary: Asynchronous communication interferes with cooperative (group) learning dynamics.  Asynchronous work increases students’ perception of independence and therefore makes them perceive less interdependence for group work.

Rodriguez, R. J., & Koubek, E. (2019). Unpacking high-impact instructional practices and student engagement in a preservice teacher preparation program. International Journal for the Scholarship of Teaching and Learning, 13 (3), Article 11, 1-9. https://doi.org/10.20429/ijsotl.2019.130311

Tags: Instructional Design, Group Work Summary: The study identifies high-impact practices, such as applied learning, collaborative assignments, understanding diverse points of view, and constructive feedback on assignments, as essential components of engagement and learning.

Rovai, A. P. (2003). A constructivist approach to online college learning. Internet and Higher Education, 7 (2), 79−93. https://doi.org/10.1016/j.iheduc.2003.10.002

Tags: Online Discussions Summary: Some highlights include ways to improve student motivation (e.g., extrinsic factors like grades) and the importance of fostering a sense of community in the classroom (i.e., a sense of belonging and contributing to the greater good of the class tends to lead to greater engagement and the quality of the contributions is enhanced).

Rovai, A. (2007). Facilitating online discussions effectively. The Internet and Higher Education, 10 , 77-88. https://doi.org/10.1016/j.iheduc.2006.10.001

Tags: Online Discussions Summary: This article presents a synthesis of the theoretical and research literature on facilitating asynchronous online discussions effectively. Building community requires not only instructor-learner contact but also learner-learner contact. For example, choose topics in areas of student interest, and provide clear guidelines and grading rubrics.

Salter, N., & Conneely, M. (2015). Structured and unstructured discussion forums as tools for student engagement. Computers in Human Behavior, 46 , 18-25. https://doi.org/10.1016/j.chb.2014.12.037

Tags: Online Discussions, Instructional Design Summary: Structured forums were seen as more engaging, and students used the feedback more often than in unstructured forums; there was more peer engagement in less structured forums.

Skinner, E. (2009). Using community development theory to improve student engagement in online discussion: a case study. ALT-J Research in Learning Technology, 17 (2), 89-100. https://doi.org/10.1080/09687760902951599

Tags: Online Discussions Summary: Participation by students is a prerequisite for building community. Students need to interject their own personal and emotional interests to increase participation (i.e., they need to be personally invested to get something out of the class). Instructors can choose topics/questions that align with student interest.

Sweat, J., Jones, G., Han, S., & Wolfgram, S. M. (2013). How does high impact practice predict student engagement? A comparison of white and minority students. International Journal for the Scholarship of Teaching and Learning, 7 (2), Article 17. https://doi.org/10.20429/ijsotl.2013.070217

Tags: Group Work Summary: HIPs that have an effect on engagement across racial categories are service learning, undergraduate research, group assignments, learning communities, sequence courses, and, especially, having a close faculty mentor.

Taneja, A. (2014). Teaching tip: Enhancing student engagement: A group case study approach. Journal of Information Systems Education, 25 (3), 181-188. http://jise.org/Volume25/n3/JISEv25n3p181.pdf

Tags: Group Work Summary: Group work is treated as a learning goal in its own right. Within the groups, case studies are used to transfer theoretical knowledge to real-world practice.

Tofade, T., Elsner, J., & Haines, S. T. (2013). Best practice strategies for effective use of questions as a teaching tool. American Journal of Pharmaceutical Education, 77 (7). https://doi.org/10.5688/ajpe777155

Tags: Online Discussions Summary: Uses different lenses to look at question formulation, i.e., a taxonomy of questions. Skinner (2009) also mentions the importance of asking relevant questions.

Truhlar, A. M., Walter, M. T., & Williams, K. M. (2018). Student engagement with course content and peers in synchronous online discussions. Online Learning, 22 (4), 298-312. https://doi.org/10.24059/olj.v22i4.1389

Tags: Online Discussions, Group Work Summary: Group reflection enhances critical student-content interaction, and assigning roles enhances critical student-student interaction. Interestingly, self-reflection did not enhance either interaction.

Warburton, D. (1998). Community and sustainable development. Earthscan.

Tags: Online Discussions, Group Work Summary: Participation precedes learning: a student needs to be present to be engaged and must take part in a learning community in order to learn.

Waschull, S. B. (2005). Predicting success in online psychology courses: Self-discipline and motivation. Teaching of Psychology, 32 , 190–192. https://doi.org/10.1207/s15328023top3203_11

Tags: Instructional Design Summary: A number of factors were measured relative to success in the course. Only self-discipline and motivation mattered; factors such as time commitment, study skills, preference for text-based learning, access to technology, and technology experience did not.

Watson, F. F., Castano Bishop, M., & Ferdinand-James, D. (2017). Instructional strategies to help online students learn: Feedback from online students. TechTrends, 61 , 420–427. https://doi.org/10.1007/s11528-017-0216-y

Tags: Instructional Design Summary: This paper reviews the top 10 preferred instructional strategies from online students.

Wollschleger, J. (2019). Making it count: Using real-world projects for course assignments. Teaching Sociology, 47 (4), 314-324. https://doi.org/10.1177/0092055X19864422

Tags: Group Work, Instructional Design Summary: Describes redesigning a course so that assignments become community-involved, real-world projects. This reflects high-impact practices such as community engagement, collaborative learning, and inquiry-based learning.

Woods, K. & Bliss, K. (2016). Facilitating successful online discussions. The Journal of Effective Teaching, 16 (2), 76-92. https://uncw.edu/jet/articles/vol16_2/woods.html

Tags: Online Discussions Summary: Relevant to student engagement to discussions is the level of structure as well as the ability for conversations to evolve as the discussion progresses.  They make the case that less structure can allow more degrees of freedom in discussions. An interesting counterpoint to the argument that more structure is better for online courses.

Xie, K. & Ke, F. (2011). The role of student’s motivation in peer-moderated asynchronous online discussions.  British Journal of Educational Technology, 42 (6), 916-930. https://doi.org/10.1111/j.1467-8535.2010.01140.x

Tags: Online Discussions Summary: Intrinsic motivation is a factor in individual interaction, while relatedness is a factor in collaboration. Low-level interaction is related to perceived value, competency, and autonomy. Knowing the type of motivation, instructors can scaffold accordingly.

Engagement in Online Learning: An Annotated Bibliography Copyright © 2021 by Elizabeth Johnson; Caleb Adams; Agatha Engel; and Lisa Vassady is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License , except where otherwise noted.


Assessment and Evaluation Language Resource Center

Annotated Bibliographies


The following annotated bibliographies compile relevant research on various topics related to language assessment and evaluation.

Get up-to-date on all aspects of oral proficiency testing with our comprehensive annotated bibliography.

Read our annotated overview of different issues related to HL assessment and program evaluation.

Take a look at our annotated guide to studies on evaluation and SLO assessment in language education.



Education is a highly active field. From teacher retention strategies to early childhood development, the study of education features a range of disciplines including psychology, sociology, history, economics, philosophy, anthropology, and political science. The multidisciplinary nature of education makes it challenging to stay informed regarding applicable areas. Moreover, much of the relevant scholarship has moved online with the most recent scholarship, research, and statistics appearing in online databases. Thus, there are consistently new discoveries, new interpretations, and new theoretical concepts to take into account. With  Oxford Bibliographies in Education, students and scholars now have a reliable, selective, and authoritative guide to the best literature in the field.



Welcome to MyBib

Generate formatted bibliographies, citations, and works cited automatically

What is MyBib?

MyBib is a free bibliography and citation generator that makes accurate citations for you to copy straight into your academic assignments and papers.

If you're a student, academic, or teacher, and you're tired of the other bibliography and citation tools out there, then you're going to love MyBib. MyBib creates accurate citations automatically for books, journals, websites, and videos just by searching for a title or identifier (such as a URL or ISBN).

Plus, we're using the same citation formatting engine as professional-grade reference managers such as Zotero and Mendeley, so you can be sure our bibliographies are perfectly accurate in over 9,000 styles -- including APA 6 & 7, Chicago, Harvard, and MLA 7 & 8.


Computer Science > Computation and Language

Title: The Promises and Pitfalls of Using Language Models to Measure Instruction Quality in Education

Abstract: Assessing instruction quality is a fundamental component of any improvement efforts in the education system. However, traditional manual assessments are expensive, subjective, and heavily dependent on observers' expertise and idiosyncratic factors, preventing teachers from getting timely and frequent feedback. Different from prior research that mostly focuses on low-inference instructional practices on a singular basis, this paper presents the first study that leverages Natural Language Processing (NLP) techniques to assess multiple high-inference instructional practices in two distinct educational settings: in-person K-12 classrooms and simulated performance tasks for pre-service teachers. This is also the first study that applies NLP to measure a teaching practice that is widely acknowledged to be particularly effective for students with special needs. We confront two challenges inherent in NLP-based instructional analysis, including noisy and long input data and highly skewed distributions of human ratings. Our results suggest that pretrained Language Models (PLMs) demonstrate performances comparable to the agreement level of human raters for variables that are more discrete and require lower inference, but their efficacy diminishes with more complex teaching practices. Interestingly, using only teachers' utterances as input yields strong results for student-centered variables, alleviating common concerns over the difficulty of collecting and transcribing high-quality student speech data in in-person teaching settings. Our findings highlight both the potential and the limitations of current NLP techniques in the education domain, opening avenues for further exploration.


Joint World Bank, UN Report Assesses Damage to Gaza’s Infrastructure

Damages to Physical Structures Estimated at $18.5 billion as of end January

WASHINGTON, April 2, 2024  – The cost of damage to critical infrastructure in Gaza is estimated at around $18.5 billion, according to a new report released today by the World Bank and the United Nations with the financial support of the European Union. That is equivalent to 97% of the combined GDP of the West Bank and Gaza in 2022.

The Interim Damage Assessment report used remote data collection sources to measure damage to physical infrastructure in critical sectors incurred between October 2023 and end of January 2024. The report finds that damage to structures affects every sector of the economy. Housing accounts for 72% of the costs. Public service infrastructure such as water, health and education account for 19%, and damages to commercial and industrial buildings account for 9%. For several sectors, the rate of damage appears to be leveling off as few assets remain intact. An estimated 26 million tons of debris and rubble have been left in the wake of the destruction, an amount that is estimated to take years to remove.

The report also looks at the impact on the people of Gaza. More than half the population of Gaza is on the brink of famine and the entire population is experiencing acute food insecurity and malnutrition. Over a million people are without homes and 75% of the population is displaced. Catastrophic cumulative impacts on physical and mental health have hit women, children, the elderly, and persons with disabilities the hardest, with the youngest children anticipated to be facing life-long consequences to their development.

With 84% of health facilities damaged or destroyed, and a lack of electricity and water to operate remaining facilities, the population has minimal access to health care, medicine, or life-saving treatments. The water and sanitation system has nearly collapsed, delivering less than 5% of its previous output, with people dependent on limited water rations for survival. The education system has collapsed, with 100% of children out of school.

The report also points to the impact on power networks as well as solar generated systems and the almost total power blackout since the first week of the conflict. With 92% of primary roads destroyed or damaged and the communications infrastructure seriously impaired, the delivery of basic humanitarian aid to people has become very difficult.

The Interim Damage Assessment Note identifies key actions for early recovery efforts, starting with an increase in humanitarian assistance, food aid and food production; the provision of shelter and rapid, cost-effective, and scalable housing solutions for displaced people; and the resumption of essential services.

About the Gaza Interim Damage Assessment Report

The Gaza Interim Damage Assessment report draws on remote data collection sources and analytics to provide a preliminary estimate of damages to physical structures in Gaza from the conflict in accordance with the Rapid Damage & Needs Assessment (RDNA) methodology. RDNAs follow a globally recognized methodology that has been applied in multiple post-disaster and post-conflict settings. A comprehensive RDNA that assesses economic and social losses, as well as financing needs for recovery and reconstruction, will be completed as soon as the situation allows. The cost of damages, losses and needs estimated through a comprehensive RDNA is expected to be significantly higher than that of an Interim Damage Assessment.


