State-of-the-art modeling of language learning for Duolingo
Duolingo is a global language learning company with over 300 million students.
As they practice, students generate large amounts of data about the mistakes they make. The goal of this work was to predict the future mistakes that learners of English, Spanish, and French will make, based on the mistakes they have made in the past.
Analyzing a student's mistakes makes it possible to detect their knowledge gaps. Sana Labs used an ensemble approach that does not require encoding domain knowledge and can capture more of the nuances of learning. For example, the student below seems to be struggling with "my," "mother," and "father." These mistakes could point to difficulties with possessive pronouns or with the spelling of English "th" sounds. The Sana model picks up on these trends and predicts each student's knowledge gaps in a personalized way that evolves over time. These predictions can then be used to personalize learning by recommending exactly what to study and when, based on each individual's needs.
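To make the idea of an ensemble concrete, here is a minimal Python sketch. It is not Sana's model, whose internals are not described here. It assumes a toy history of (student, word, mistake) records and averages two simple predictors: how often a word is missed by everyone, and how often it is missed by this particular student. All names and data (`history`, `smoothed_rate`, `predict_mistake`) are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy history: (student, word, made_mistake) records.
history = [
    ("s1", "my", True),
    ("s1", "mother", True),
    ("s1", "father", True),
    ("s1", "the", False),
    ("s2", "my", False),
    ("s2", "mother", False),
    ("s2", "father", True),
    ("s2", "the", False),
]

def smoothed_rate(mistakes, attempts, prior=0.5, strength=1.0):
    """Mistake rate pulled toward `prior` when there are few attempts."""
    return (mistakes + prior * strength) / (attempts + strength)

def fit_counts(records, key):
    """Count [mistakes, attempts] per key, e.g. per word or per (student, word)."""
    counts = defaultdict(lambda: [0, 0])
    for student, word, made_mistake in records:
        entry = counts[key(student, word)]
        entry[0] += int(made_mistake)
        entry[1] += 1
    return counts

# Two base predictors (assumed for illustration only).
word_counts = fit_counts(history, key=lambda s, w: w)
student_word_counts = fit_counts(history, key=lambda s, w: (s, w))

def predict_mistake(student, word):
    """Ensemble prediction: unweighted average of the two base predictors."""
    m, n = word_counts.get(word, (0, 0))
    global_p = smoothed_rate(m, n)        # how hard the word is for everyone
    m, n = student_word_counts.get((student, word), (0, 0))
    personal_p = smoothed_rate(m, n)      # how hard it is for this student
    return 0.5 * (global_p + personal_p)

if __name__ == "__main__":
    for student in ("s1", "s2"):
        print(student, "mother", predict_mistake(student, "mother"))
```

In this toy data, student `s1` gets a higher predicted mistake probability for "mother" than student `s2`, because the personal predictor reflects `s1`'s past error on that word. A real system would learn the combination weights and use richer features, and predictions would be updated as new answers arrive.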
The Sana technology made the most accurate predictions and produced the best evaluation metrics on Duolingo's benchmark.