Contemporary Peer Review: Construct Modeling, Measurement Foundations, and the Future of Digital Learning

  • Ashley Nichole Reese University of South Florida
  • Rajeev Reddy Rachamalla University of South Florida
  • Alex Rudniy Fairleigh Dickinson University
  • Laura Aull Wake Forest University
  • Dave Eubanks Furman University
Keywords: corpus linguistics, first-year composition, peer review, student writing, writing analytics


  • Background: In this article, we offer a study of peer review in a digital learning environment. Our analysis focuses on intrapersonal and interpersonal domains of the writing construct as they are enacted in the peer review process in terms of self-reflection and transaction. Our study is organized as a demonstration of the force of construct articulation, the usefulness of fairness as an integrative measurement framework, and the affordances of research in digital ecologies. Based on findings from our National Science Foundation-funded research, we conclude with considerations for future peer review research.
  • Literature Review: Although most first-year composition programs use peer review, there is little writing studies research on the practice (Haswell, 2005, p. 211). The studies that have addressed peer review generally find that it leads to positive outcomes (Moxley & Eubanks, 2016; Ross, Liberman, Ngo, & LeGrand, 2016). Notably, peer reviews appear to help both the reviewer and the reviewee (Dochy, Segers, & Sluijsmans, 1999). Our understanding of the revision process is rooted in Flower and Hayes’ (1981) social cognitive theory of writing. Combined with a need to expand models of the writing construct based on cognitive, interpersonal, and intrapersonal demands, our research seeks to fill the gap in acknowledging that the metacognitive nature of peer review is part of the construct of writing.
  • Research Questions: Our research questions divide into three categories: the intrapersonal and interpersonal domain, forms of evidence, and digital learning affordances. We inquire into (1) the tone and quality of student self-reflection, as well as (2) the quality and tone of the peer review transaction. In the study of fairness evidence, we ask (3) what may be learned by investigating responses when student sub-groups are disaggregated according to gender, ethnicity, race, and English language learning. In the study of reliability evidence, we ask (4) what forms of evidence related to response consistency are useful in the analysis of peer review. In the study of validity evidence, we ask (5) how a precise definition of the writing construct lends precision to construct-related evidence. In terms of digital learning, we ask (6) what instrumental value Questions 1–5 hold in demonstrating the affordances of participation in the MyReviewers (MyR) peer review process.
  • Research Methodology: Our research utilizes a sample of 837 students enrolled in first-year composition at a public research university, specifically their self-reflection ratings and transaction ratings. The surveys were voluntary, presented to students upon completing peer review (for reviewers) and the revision plan (for reviewees) as part of the MyR software.
  • Results: The study shows that while self-reflection and transaction surveys received high to neutral ratings for helpfulness, politeness, and kindness, encouragement received only high or low ratings. In terms of fairness evidence for self-reflection, women believed their feedback was more polite and helpful. Similarly, Hispanic students believed their reviews were more helpful than non-Hispanic students did, and students who claimed proficiency in two or more languages felt their own reviews were more helpful than English-only speakers did. For fairness evidence for transactions, men were perceived as more encouraging than women in their feedback, while Hispanic students’ reviews were no more helpful than non-Hispanic students’ reviews. No statistically significant differences were found between English language learners and native English speakers. In relation to reliability evidence for self-reflection, reliability for the most part reaches statistically significant levels. In terms of digital learning affordances, students in groups typically associated with low writing performance thrived in the digital learning platform when the construct included domains beyond the cognitive.
  • Discussion: Based on these findings, there are three areas of consideration worthy of extended pursuit: 1) consider the advantages of expanded notions of the writing construct; 2) consider information analysis in terms of opportunity to learn; and 3) consider digital ecologies as a way to advance writing instruction for all students.
  • Conclusions: This study provides unique insight that writing program administrators (WPAs) might utilize to inform their programs. A natural next step to implement the findings of our study would be for WPAs to systematically examine how evidence related to gender, ethnicity, and race is manifested within the classroom at their own institutions.

Author Biographies

Ashley Nichole Reese, University of South Florida
Digital Teaching Fellow, Department of English
Rajeev Reddy Rachamalla, University of South Florida
Application Project Manager / Scrum Master, Department of English
Alex Rudniy, Fairleigh Dickinson University
Assistant Professor in Computer Science
Laura Aull, Wake Forest University
Associate Professor of English and Linguistics
Dave Eubanks, Furman University
Assistant Vice President, Office of Institutional Assessment and Research


American Educational Research Association, American Psychological Association, and National Council on Measurement in Education (2014). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.

Aull, L. (2015). First-year university writing: A corpus-based study with implications for pedagogy. New York: Palgrave Macmillan.

Behizadeh, N., & Engelhard, G. (2015). Valid writing assessment from the perspectives of the writing and measurement communities. Pensamiento Educativo. Revista de Investigación Educacional Latinoamericana, 52(2), 34–54.

Bennett, R. E. (1993). On the meanings of constructed response. In R. E. Bennett & W. C. Ward (Eds.), Construction vs. choice in cognitive measurement: Issues in constructed response, performance testing, and portfolio assessment (pp. 1–27). Hillsdale, NJ: Erlbaum.

Chang, C. Y.-h. (2016). Two decades of research in L2 peer review. Journal of Writing Research, 8(1), 81–117.

Cheng, W., & Warren, M. (1997). Having second thoughts: Student perceptions before and after a peer assessment exercise. Studies in Higher Education, 22(2), 233–239.

College Board (2016). 2016 college-bound seniors: Total group profile. New York, NY: College Board. Retrieved from

Condon, W. (2013). Large-scale assessment, locally-developed measures, and automated scoring of essays: Fishing for red herrings? Assessing Writing, 18(1), 100–108.

Dochy, F., Segers, M., & Sluijsmans, D. (1999). The use of self-, peer and co-assessment in higher education: A review. Studies in Higher Education, 24(3), 331–350.

Elliot, N. (2016). A theory of ethics for writing assessment. Journal of Writing Assessment, 9(1). Retrieved from

Falakmasir, M. H., Ashley, K. D., Schunn, C. D., & Litman, D. J. (2014). Identifying thesis and conclusion statements in student essays to scaffold peer review. In S. Trausan-Matu, K.E. Boyer, M. Crosby, & K. Panourgia (Eds.), Intelligent Tutoring Systems. ITS 2014. Lecture notes in computer science, vol 8474. Springer.

Flower, L., & Hayes, J. R. (1981). A cognitive process theory of writing. College Composition and Communication, 32(4), 365–387.

Gallagher, C. W. (2016). What writers do: Behaviors, behaviorism, and writing studies. College Composition and Communication, 68(2), 238–265.

Gibson, A., Kitto, K., & Bruza, P. (2016). Towards the discovery of learner metacognition from reflective writing. Journal of Learning Analytics, 3(2), 22–36.

Greeno, J. G., & Gresalfi, M. S. (2008). Opportunities to learn in practice and identity. In P. A. Moss, D. C. Pullin, J. P. Gee, E. H. Haertel, & L. J. Young (Eds.), Assessment, equity, and opportunity to learn (pp. 170–199). Cambridge, UK: Cambridge University Press.

Hamer, J., Purchase, H., Luxton-Reilly, A., & Denny, P. (2015). A comparison of peer and tutor feedback. Assessment and Evaluation in Higher Education, 40(1), 151–164.

Hart-Davidson, W., McLeod, M., Klerkx, C., & Wojcik, M. (2010). A method for measuring helpfulness in online peer review. In Proceedings of the 28th ACM International Conference on Design of Communication (SIGDOC '10) (pp. 115–121). New York, NY: ACM.

Haswell, R.H. (2005). NCTE/CCCC’s recent war on scholarship. Written Communication, 22(2), 198–223.

Hayes, J. R. (2012). Modeling and remodeling the writing construct. Written Communication, 29(3), 369–388.

Hewett, B. L. (2015). A review of WriteLab. WLN: A Journal of Writing Center Scholarship, 40 (3-4), 8–19.

Hu, G., & Lam, S.T.E. (2010). Issues of cultural appropriateness and pedagogical efficacy: Exploring peer review in a second language writing class. Instructional Science, 38, 371–394.

Hussar, W. J., & Bailey, T. M. (2014). Projections of education statistics to 2022 (NCES 2014-051). 41st ed. U.S. Department of Education, National Center for Education Statistics. Washington, DC: U.S. Government Printing Office. Retrieved from

Inoue, A. B. (2015). Antiracist writing assessment ecologies: Teaching and assessing for a socially just future. Fort Collins, CO & Anderson, SC: WAC Clearinghouse and Parlor Press.

Intemann, K. (2010). 25 years of feminist empiricism and standpoint theory: Where are we now? Hypatia, 25(4), 778–796.

Jarratt, S. C., Mack, K., Sartor, A., & Watson, S. E. (2009). Pedagogical memory: Writing, mapping, translating. WPA: Writing Program Administration, 33(1–2), 46–73.

Kane, M. T. (2016). Validation strategies: Delineating and validating proposed interpretations and uses of test scores. In S. Lane, M. R. Raymond, & T. M. Haladyna (Eds.), Handbook of test development (pp. 64–80). New York, NY: Routledge.

Kelly-Riley, D., & Whithaus, C. (2016). A theory of ethics for writing assessment. [Special issue]. Journal of Writing Assessment, 9(1). Retrieved from

Kirschenbaum, M. (2012). Digital humanities as/is a tactical term. In M.K. Gold & L.F. Klein (Eds.), Debates in the digital humanities (n.p.). Minneapolis, MN: University of Minnesota Press.

Lawrence, S.M., & Sommers, E. (1996). From the park bench to the (writing) workshop table: Encouraging collaboration among inexperienced writers. Teaching English in the Two-Year College, 23(2), 101–109.

Leijen, D.A.J. (2017). A novel approach to examine the impact of web-based peer review on the revisions of L2 writers. Computers and Composition, 43, 35–54.

Leijten, M., Van Waes L., Schriver, K., & Hayes, J. R. (2014). Writing in the workplace: Constructing documents using multiple digital sources. Journal of Writing Research, 5(3), 285–337.

Liang, M-Y. (2010). Using synchronous online peer response groups in EFL writing: Revision-related discourse. Language Learning & Technology, 14(1), 45–64.

Lundstrom, K., & Baker, W. (2009). To give is better than to receive: The benefits of peer review to the reviewer’s own writing. Journal of Second Language Writing, 18, 30–43.

McLaughlin, P., & Simpson, N. (2004). Peer assessment in first year university: How the students feel. Studies in Educational Evaluation, 30(2), 135–149.

Meizlish, D., LaVaque-Manty, D., & Silver, N. (2013). Think like/write like. In R. Thompson (Ed.), Changing the conversation about higher education (pp. 53–74). New York, NY: Rowman & Littlefield.

Mislevy, R. J. (2016). How developments in psychology and technology challenge validity argumentation. Journal of Educational Measurement, 53(3), 265–292.

Mislevy, R. J., Steinberg, L. S., & Almond, R. G. (2003). On the structure of educational assessment (with discussion). Measurement: Interdisciplinary Research and Perspective, 1(1), 3–62.

MongoDB (2016). MongoDB Atlas best practices. New York, NY: MongoDB. Retrieved from

Moss, P. A., Pullin, D. C., Gee, J. P., Haertel, E. H., & Young, L. J. (Eds.). (2008). Assessment, equity, and opportunity to learn. Cambridge, UK: Cambridge University Press.

Moxley, J. M., & Eubanks, D. (2016). On keeping score: Instructors’ vs. students’ rubric ratings of 46,689 essays. Writing Program Administration, 39(2), 53–80.

National Research Council. (2012). Education for life and work: Developing transferable knowledge and skills in the 21st century. Committee on Defining Deeper Learning and 21st Century Skills, J. W. Pellegrino & M. L. Hilton, Board on Testing and Assessment and Board on Science Education, Division of Behavioral and Social Sciences and Education (Eds.). Washington, DC: The National Academies Press.

National Research Council. (2013). Frontiers in massive data analysis. Washington, D.C.: The National Academies Press.

Nystrand, M. (1984). Learning to write by talking about writing: A summary of research on intensive peer review in expository writing at the University of Wisconsin—Madison. ED 255 914. Retrieved from

Paulus, T.M. (1999). The effect of peer and teacher feedback on student writing. Journal of Second Language Writing, 8, 265–289.

Poe, M., & Inoue, A. B. (2016). Writing assessment as social justice [Special issue]. College English, 79(2).

Raymond, R.C. (1989). Teaching students to revise: Theories and practice. Teaching English in the Two-Year College, 16(1), 49–58.

Ross, V., Liberman, M., Ngo, L., & LeGrand, R. (2016). Weighted log-odds-ratio, informative Dirichlet prior method to enhance peer review feedback for low- and high-scoring college students in a required first-year writing program. Proceedings of the EDM 2016 Workshop and Tutorial. Retrieved from

Rudniy, A., & Elliot, N. (2016). Collaborative review in writing analytics: N-gram analysis of instructor and student comments. Proceedings of the EDM 2016 Workshops and Tutorials. Raleigh, NC, USA, June 29, 2016, 1–8.

Stricker, L. J., & Ward, W. C. (2004). Stereotype threat, inquiring about test taker’s ethnicity and gender, and standardized test performance. Journal of Applied Social Psychology, 34(4), 665–693.

Struyven, K., Dochy, F., Janssens, S., & Gielen, S. (2006). On the dynamics of students’ approaches to learning: The effects of the teaching/learning environment. Learning and Instruction, 16(4), 279–294.

Teixeira, R., Frey, W. H., & Griffin, R. (2015). States of change: The demographic evolution of the American electorate, 1974-2060. Washington, DC: Center for American Progress, American Enterprise Institute, & Brookings Institution. Retrieved from

Topping, K. (1998). Peer assessment between students in colleges and universities. Review of Educational Research, 68(3), 249–276.

Tsui, A. B. M., & Ng, M. (2000). Do secondary L2 writers benefit from peer comments? Journal of Second Language Writing, 9(2), 147–170.

Tucker, R. (2014). Sex does not matter: Gender bias and gender differences in peer assessments to contributions to group work. Assessment & Evaluation in Higher Education, 39(3), 293–309.

Weiss, C. H. (1995). Nothing as practical as good theory: Exploring theory-based evaluation for comprehensive community initiatives for children and families. In J. I. Connell, A. C. Kubisch, L. B. Schorr, & C. H. Weiss (Eds.), New approaches to evaluating community initiatives: Concepts, methods, and contexts (pp. 65–92). New York, NY: The Aspen Institute.

Weiss, C. H. (1998). Have we learned anything new about the use of evaluation? American Journal of Evaluation, 19(1), 21–33.

Wen, M. L., & Tsai, C. C. (2006). University students’ perceptions of and attitudes toward (online) peer assessment. Higher Education, 51, 27–44.

White, E. M., Elliot, N., & Peckham, I. (2015). Very like a whale: The assessment of writing programs. Logan, UT: Utah State University Press.

White, T. (2015). Hadoop: The definitive guide: Storage and analysis at internet scale (4th ed.). Sebastopol, CA: O’Reilly Media.

Willey, K., & Gardner, A. (2010). Investigating the capacity of self and peer assessment activities to engage students and promote learning. European Journal of Engineering Education, 35(4), 429–443.

Wilson, M. J., Diao, M. M., & Huang, L. (2015). ‘I’m not here to learn how to mark someone else’s stuff’: An investigation of online peer-to-peer review workshop tool. Assessment & Evaluation in Higher Education, 40(1), 15–32.