Writing Mentor: Writing Progress Using Self-Regulated Writing Support

  • Jill Burstein Educational Testing Service
  • Norbert Elliot University of South Florida
  • Beata Beigman Klebanov Educational Testing Service
  • Nitin Madnani Educational Testing Service
  • Diane Napolitano Educational Testing Service
  • Maxwell Schwartz
  • Patrick Houghton Educational Testing Service
  • Hillary Molloy Educational Testing Service
Keywords: automated writing evaluation, feedback, natural language processing, self-efficacy, self-regulated writing, writing analytics, Writing Mentor


The Writing Mentor™ (WM) application is a Google Docs add-on designed to help students improve their writing in a principled manner and to promote their writing success in postsecondary settings. WM provides automated writing evaluation (AWE) feedback using natural language processing (NLP) methods and linguistic resources. The AWE features in WM are informed by research on postsecondary student writers often classified as developmental (Burstein et al., 2016b), and they address a breadth of writing sub-constructs, including use of sources, claims, and evidence; topic development; coherence; and knowledge of English conventions. Through an optional entry survey, WM collects data from users about writing self-efficacy and English language status; an optional exit survey collects users' perceptions of the tool. Informed by language arts models consistent with the Common Core State Standards Initiative and valued by the writing studies community, WM takes initial steps toward integrating the reading and writing processes by offering a range of textual features, including vocabulary support intended to help users understand unfamiliar vocabulary in coursework reading texts. This paper describes WM and discusses descriptive evaluations from an Amazon Mechanical Turk (AMT) usability task situated in WM and from users-in-the-wild data. The paper concludes with a framework for developing writing feedback and analytics technology.


Attali, Y., & Burstein, J. (2006). Automated essay scoring with e-rater v.2.0. Journal of Technology, Learning, and Assessment, 4(3).

Beigman Klebanov, B., Stab, C., Burstein, J., Song, Y., Gyawali, B., & Gurevych, I. (2016). Argumentation: Content, Structure, and Relationship with Essay Quality. In Proceedings of the 3rd Workshop on Argument Mining, ACL 2016, Berlin, Germany.

Beigman Klebanov, B., Madnani, N., Burstein, J., & Somasundaran, S. (2014). Content Importance Models for Scoring Writing from Sources. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, Baltimore, MD, June 23-25, 2014.

Bennett, R. E. (2011). Formative assessment: A critical review. Assessment in Education: Principles, Policy & Practice, 18, 5-25.

Beigman Klebanov, B., & Flor, M. (2013). Word association profiles and their use for automated scoring of essays. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 1148-1158).

Brooke, J. (1996). SUS-A quick and dirty usability scale. Usability evaluation in industry, 189(194), 4-7.

Burstein, J., Kukich, K., Wolff, S., Lu, C., Chodorow, M., Braden-Harder, L., & Harris, M. D. (1998, August). Automated scoring using a hybrid feature identification technique. In Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics-Volume 1 (pp. 206-210). Association for Computational Linguistics.

Burstein, J., Marcu, D., & Knight, K. (2003). Finding the WRITE Stuff: Automatic Identification of Discourse Structure in Student Essays. IEEE Intelligent Systems: Special Issue on Advances in Natural Language Processing, 18(1), 32-39.

Burstein, J., Tetreault, J., & Madnani, N. (2013). The E-rater® Automated Essay Scoring System. In M. D. Shermis & J. Burstein (Eds.), Handbook of Automated Essay Evaluation (pp. 55-67). New York, NY: Routledge.

Burstein, J., Elliot, N., & Molloy, H. (2016a). Informing Automated Writing Evaluation Using the Lens of Genre: Two Studies. Special Issue: CALICO Journal, 33(1) (Guest Editors: Volker Hegelheimer, Ahmet Dursun, Zhi Li).

Burstein, J., Beigman Klebanov, B., Elliot, N., & Molloy, H. (2016b). A Left Turn: Automated Feedback & Activity Generation for Student Writers. To appear in the Proceedings of the 3rd Language Teaching, Language & Technology Workshop, co-located with Interspeech, San Francisco, CA.

Burstein, J., Chodorow, M., & Leacock, C. (2004). Automated Essay Evaluation: The Criterion Online Service. AI Magazine, 25(3), 27-36.

Burstein, J., McCaffrey, D., Beigman Klebanov, B., & Ling, G. (2017b). Exploring Relationships between Writing and Broader Outcomes with Automated Writing Evaluation. In Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications (BEA), EMNLP 2017, Copenhagen, Denmark.

CCSSO. (2010). Common Core State Standards for English Language Arts & Literacy in History/Social Studies, Science, and Technical Subjects. Appendix A: Research supporting key elements of the Standards. Washington, DC.

Cohen, K. B., & Demner-Fushman, D. (2014). Biomedical natural language processing (Vol. 11). John Benjamins Publishing Company.

Coleman, R., & Goldenberg, C. (2012, February). The Common Core challenge: English language learners. Principal Leadership, 46-51.

Collins-Thompson, K., & Callan, J. (2004). A language modeling approach to predicting reading difficulty. In Proceedings of HLT/NAACL.

Complete College America. (2012). Remediation: Higher education’s bridge to nowhere. Retrieved from http://completecollege.org/docs/CCA-Remediation-final.pdf

EU High Level Group of Experts on Literacy. (2012). Final Report. Retrieved from: http://ec.europa.eu/dgs/education_culture/repository/education/policy/school/doc/literacy-report_en.pdf.

Foltz, P. W., Streeter, L. A., Lochbaum, K. E., & Landauer, T. K. (2013). Implementation and applications of the Intelligent Essay Assessor. In M. Shermis & J. Burstein (Eds.), Handbook of Automated Essay Evaluation (pp. 68-88). New York, NY: Routledge.

Graesser, A. C., McNamara, D. S., & Kulikowich, J. (2011). Coh-Metrix: Providing multilevel analyses of text characteristics. Educational Researcher, 40, 223-234.

Graham, S., Bruch, J., Fitzgerald, J., Friedrich, L., Furgeson, J., Greene, K., Kim, J., Lyskawa, J., Olson, C.B., & Smither Wulsin, C. (2016). Teaching secondary students to write effectively (NCEE 2017-4002). Washington, DC: National Center for Education Evaluation and Regional Assistance (NCEE), Institute of Education Sciences, U.S. Department of Education. Retrieved from the NCEE website: http://whatworks.ed.gov.

Heilman, M., & Smith, N. A. (2010). Good Question! Statistical Ranking for Question Generation. In Proceedings of NAACL.

Inoue, A. B. (2014). Theorizing failure in US writing assessments. Research in the Teaching of English, 48, 330-352.

Klobucar, A., Elliot, N., Deess, P., Rudniy, O., & Joshi, K. (2013). Automated scoring in context: rapid assessment for placed students. Assessing Writing, 18, 62–84.

Musu-Gillette, L., de Brey, C., McFarland, J., Hussar, W., Sonnenberg, W., and Wilkinson-Flicker, S. (2017). Status and Trends in the Education of Racial and Ethnic Groups 2017 (NCES 2017-051). U.S. Department of Education, National Center for Education Statistics. Washington, DC. Retrieved from http://nces.ed.gov/pubsearch.

NCES (2016). Remedial Coursetaking at U.S. Public 2- and 4-Year Institutions: Scope, Experience, and Outcomes. NCES 2016-405. Retrieved from: https://nces.ed.gov/pubs2016/2016405.pdf

Pane, J. F., Steiner, E. D., Baird, M. D., Hamilton, L. S. & Pane, J. D. (2017). Informing progress: Insights on personalized learning implementation and effects. RAND. Retrieved from https://www.rand.org/pubs/research_reports/RR2042.html

Perin, D., & Lauterbach, M. (2018). Assessing text-based writing of low-skilled college students. International Journal of Artificial Intelligence in Education, 28, 56-78.

PISA (2016). PISA 2015 Results in Focus. Retrieved from: https://www.oecd.org/pisa/pisa-2015-results-in-focus.pdf

Roscoe, R., Varner, L., Weston, J., Crossley, S., & McNamara, D. (2014). The Writing Pal Intelligent Tutoring System: Usability testing and development. Computers and Composition, 34, 39-59.

Sabatini, J., Bruce, K., Steinberg, J., & Weeks, J. (2015). SARA Reading Components Tests, RISE Forms: Technical Adequacy and Test Design (2nd ed.). Technical Report ETS-RR-15-32.

Shermis, M., Burstein, J., Elliot, N., Miel, S., & Foltz, P. (2015). Automated writing evaluation: An expanding body of knowledge. In C. A. MacArthur, S. Graham, & J. Fitzgerald (Eds.), Handbook of writing research (2nd ed., pp. 395-409). New York, NY: Guilford.

Somasundaran, S., Burstein, J., & Chodorow, M. (2014). Lexical Chaining for Measuring Discourse Coherence Quality in Test-taker Essays. In Proceedings of COLING 2014, Dublin, Ireland.

White, E. M., Elliot, N., & Peckham, I. (2015). Very like a whale: The assessment of writing programs. University Press of Colorado.

Winerip, M. (2012, April). Facing a Robo-Grader? Just Keep Obfuscating Mellifluously. New York Times, p. A11. Retrieved from http://www.nytimes.com/2012/04/23/education/robo-readers-used-to-grade-test-essays.html