Duolingo and Foreign Language Learning
UX Research + Usability Test
Overview
This usability test of Duolingo’s character lessons was an 8-week UX research class project. My teammates and I wanted to examine how the “unfamiliarity” of non-Latin language courses to English speakers shapes the overall language-learning experience on Duolingo, and to discover which usability strengths and shortcomings currently exist on the platform. After conducting 10 usability tests, we provided recommendations for Duolingo’s “Learn a new writing system” tool, hoping to help English speakers better navigate the Japanese and Korean writing systems.
My Role:
UX Researcher​
Duration:
January 2022 - March 2022 (8 weeks)
Tools:
Miro | Microsoft Excel | Google Docs
Project Timeline

Planning + Recruiting
About the Product
-
Duolingo is a digital language-learning tool that allows users to practice dozens of languages online. Currently, Duolingo courses exist in 37 languages for English speakers.
-
The area we are specifically interested in is how Duolingo's learning experience works for languages with non-Latin writing systems. (Note: we are exclusively examining English speakers here.)

Duolingo Icon
-
Currently, 10 of the languages offered on Duolingo use non-Latin writing systems. Of those 10, 8 have a specific character learning tool: Korean, Japanese, Russian, Ukrainian, Hebrew, Yiddish, Greek, and Arabic.
​
-
We wanted to explore how users interact with the Characters feature.

Character Page for Japanese on Duolingo
Product Exploration
After deciding to explore Duolingo’s Characters feature, we developed a user flow for one character lesson in Japanese. The research questions and test tasks that follow are based on this user flow.

Screener Questionnaire & Info Collection
-
We sent out a Screener Questionnaire to recruit participants and received 33 responses in total.
​​
-
Based on the language interests reported in the screener questionnaire, we decided to recruit people who were interested in learning Japanese or Korean.

Screener Question - Language Interest Info Collection
Research Scope Down & Research Questions
-
After reviewing the rest of the information from our screener questionnaire, we defined our target audience by three characteristics.
​
-
Being new to the respective writing system means that participants could not read or write it.
​​
-
If a participant had used Duolingo before, they should have used it only a few times in the past year.

Our Target Audience
-
After scoping down our research and selecting our target audience, we developed these three research questions.

Research Questions for the study
Test Execution
-
After reviewing the responses to the screener questionnaire, we sent a Background Questionnaire to the qualified participants before running the usability tests with them. In the end, we recruited 5 participants for Japanese testing and 5 participants for Korean testing (10 participants in total), according to their indicated interest in Japanese or Korean.
​​
Participant Overview - Demographic




Participant Overview - Foreign Language Experience
-
90% of the participants had foreign language learning experience. The languages included:
-
Traditional Chinese
-
Korean (tested for Japanese)
-
Indonesian
-
German
-
Spanish
-
French

Participant Overview - Duolingo Platform Experience
-
Though most of the participants had used Duolingo before as an online language learning tool, they had either stopped using it or used it only a few times in the past year.
​
-
The top reasons they stopped using it were:
-
Inappropriate structure of the lessons
-
Lesson difficulty that did not match their language level

Participant Overview - Motivation to learn Japanese/Korean




Test Plan
-
We decided to run remote moderated usability tests over Zoom because of the COVID situation and because some of the participants we recruited were not in Seattle.
-
We asked participants to share their screens while they completed the assigned tasks.
​
-
All tests were done on computers.
​

Usability Tasks

Task 1 ​
Navigating Duolingo’s homepage and explaining what it communicates to them

Task 2 ​
Navigating Duolingo’s Characters page and explaining what it communicates to them

Task 3 ​
Completing 2 Characters lessons in Japanese / Korean
Test Data Collection
-
We asked participants to think aloud while we observed how each of them interacted with Duolingo’s interface, and we collected a variety of qualitative and quantitative data.

Analysis
Test Data Collection



-
After completing the usability tests, we began analyzing the results. All notes were collected in the Interview Note Form and cleaned up.
Snippets of Our Interview Notes
Affinity Diagram Analysis
-
After cleaning up the test data, we built an affinity diagram. Through affinity diagramming, we extracted meaningful information and identified common usability issues and patterns.

Affinity Diagramming after Collecting Interview Notes
Reporting
-
Based on the analysis, we arrived at the major findings summarized below. For more details, including the positive findings, please see the study report.
-
The negative findings were ranked using the following severity rating criteria (adapted from Carol M. Barnum). Overall, Duolingo’s usability is relatively good: we found no catastrophic issues that prevented task completion during the tests.
3 MAJOR ISSUES, which create significant delay/frustration
1 MINOR ISSUE, which has minor effects on usability
2 COSMETIC ISSUES, which are subtle suggestions and improvements
Major Issue 1: Unexpected Audio Play
Issue found in 7/10 studies
Issue Type: General Feedback
-
Sound plays without notice as soon as a lesson starts, which users do not expect
​​
-
Participants wanted to know before starting that the lesson would include sound
"Ohh, I thought the sound is coming from some web pages that I left open"

Recommendation
-
Add a notification about sound before the lesson starts and let users confirm it

Major Issue 2: Lack of Instruction
Issue Type: Build Character Exercise (Korean)
-
Not enough information is provided about how users should answer the question
​​
-
During testing, participants assumed they had to drag and drop rather than click
"There were no cues for this exercise and you just clicked on the buttons to put them in the right box, I thought you had to trace them."
Issue found in 4/10 studies (4/5 in Korean)

Recommendation
-
Before the first lesson, add a more thorough explanation that Korean characters are made of building blocks
​​
-
Include an animation of a few characters being built

Major Issue 3: Wrong Answer Mechanisms
Issue Type: When users made mistakes answering questions
-
Many participants did not notice that Duolingo displayed the correct answer
​​
-
Participants took some time to realize the question they got wrong was repeated
"When I get it wrong I don't really notice what I got wrong."

Issue found in 4/10 studies
Recommendation
-
Directly display the correct answer in the quiz
​​
-
Communicate that users will be asked the same question again until they get the answer right
-
Add a note, “Don’t worry. We’ll try this question again.”, when users make a mistake
​​​

Reflection
-
At first, my teammates and I thought that Duolingo’s usability was quite good and that we would not find any major usability issues. However, after running all the usability tests, we learned that there were many more problems than we expected. Overall, this project was fun, and I learned more skills and approaches for usability testing. If I had the chance to redo the project, these are the things I would improve or change:
​​
-
Recruit more diverse participants. We would like to include more people who are not familiar with non-Latin writing systems at all and who have no foreign language learning experience.
​
-
Incorporate more quantitative data. The quantitative data we collected (the time to finish a lesson) did not sufficiently support our final findings, so we did not use it. We would like to collect more meaningful quantitative data if we could redo the test.