Tuesday 1:10 p.m.–1:45 p.m.

A Pipeline for Modeling Automated Scoring Using Python, R and Jupyter Notebooks

Nitin Madnani

Audience level:
Intermediate

Description

In this talk, we will present RSMTool, a tool that we built to help NLP and speech scientists at the Educational Testing Service (ETS) streamline their workflows for building and evaluating the machine learning models used to score written and spoken test responses.

Abstract

In this talk, we will present a tool (RSMTool) that we have developed to build and evaluate machine learning models used to score written and spoken responses from various high- and low-stakes assessments developed at ETS.

  1. RSMTool builds upon numpy, pandas, seaborn, and Jupyter notebooks to create an end-to-end pipeline for NLP and speech scientists.

  2. RSMTool supports processing and standardization of the input features, descriptive feature analyses, training of the machine learning model, and production of a wide variety of model evaluations recommended by the educational assessment and psychometric community.

  3. RSMTool uses Jupyter notebooks to output a self-contained, highly customizable HTML report. It also outputs an Excel workbook containing each of the raw analyses as a separate sheet.

  4. RSMTool supports at least a dozen different linear and non-linear regression algorithms (from both R and scikit-learn) for its experiments.

We will present a detailed description of the RSMTool architecture and run through an example of how we use it at ETS to streamline our machine learning workflows.
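The stages listed above (feature standardization, model training, and evaluation against human scores) can be sketched in a few lines of Python. This is a minimal illustration only, not RSMTool's API: the function names, the single word-count feature, and the toy data are all hypothetical, and the sketch uses only the standard library, whereas RSMTool itself builds on numpy, pandas, and scikit-learn. For brevity it evaluates on the same responses it trains on, which a real experiment would not do.

```python
from statistics import mean, pstdev


def standardize(values):
    """Z-score a feature column (population standard deviation)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def fit_simple_linear(x, y):
    """Ordinary least squares with one feature; returns (slope, intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical toy data: one feature per response and a human score (1-5).
word_counts = [120, 250, 310, 400, 520, 610]
human_scores = [1, 2, 3, 3, 4, 5]

# 1. Preprocess: standardize the input feature.
feature = standardize(word_counts)

# 2. Train: fit a linear model that predicts the human score.
slope, intercept = fit_simple_linear(feature, human_scores)
predicted = [slope * f + intercept for f in feature]

# 3. Evaluate: correlation with human scores, plus exact agreement
#    after rounding predictions onto the 1-5 score scale.
rounded = [min(5, max(1, round(p))) for p in predicted]
r = pearson(predicted, human_scores)
exact_agreement = sum(p == h for p, h in zip(rounded, human_scores)) / len(human_scores)

print(f"Pearson r = {r:.3f}, exact agreement = {exact_agreement:.2f}")
```

Human-machine correlation and agreement are only two of the many evaluations a scoring model typically needs; they are shown here because they are the simplest to compute by hand.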
