The dataset consists of videos of students watching educational videos, together with annotations of each student's level of intellectual challenge while viewing. The original goal of the data collection was to provide data for developing an automatic feedback tool for lecturers delivering material by video (either pre-recorded or live-streamed).
The problem that motivated this research is: how can an instructor get feedback on the effectiveness of a presentation at various points in the presentation when all of the students are remote? This contrasts with the traditional classroom, where the instructor can see the students and assess their facial expressions. This motivates the PUZZLED research project, which is investigating whether a feedback signal can be extracted from the facial expressions of the students as they watch the videos on their laptops, observed via the laptop's webcam.
An example frame from the educational video that the students are watching is on the left below, and an example frame from the video of the student watching the educational video is on the right.
The original 9:35 video (83 Mb) that the students are watching can be downloaded here.
A screenshot of the interface that the students see while watching and self-annotating is on the left below, and a screenshot of the facial-feature analysis results is on the right.
The dataset consists of 10 videos of student heads, each about 10 minutes long, showing the volunteers watching the same educational video. We attempted to balance the genders and ethnicities of the volunteers so as to provide a variety of viewing styles and skin tones; however, we were limited to a small pool of volunteers and so had little flexibility in balancing their characteristics. Ethical permission was given to record and distribute this data, and all volunteers agreed to allow these videos to be distributed.
Associated with each video is a second-by-second annotation of the student's degree of engagement. The annotations were produced manually by 1) the student and 2) three researchers.
The engagement annotations have 4 values:
There were few instances of the Bored label.
Note: for these videos, the students produced the labels at the same time as they watched the videos. This required the students to shift their gaze from the video to a labeling panel.
The CSV annotation files consist of a sequence of lines of the form VIDEO_ID,ANNOTATOR_ID,START_LABEL,LABEL. VIDEO_ID is the same as the CSV filename. ANNOTATOR_ID is 0, 1, 2, or 3, where 0 is the student and 1-3 are the three researchers. START_LABEL gives the time in seconds at which LABEL starts to apply; that label holds for all subsequent seconds until the start time on the next line of the CSV file. The video is assumed to start at the 'OK' level of engagement.
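The run-length format above can be expanded into one label per second, for example as in this minimal sketch (the function name and the label strings in the sample rows are illustrative, not part of the dataset specification):

```python
import csv
from io import StringIO

def expand_annotations(csv_text, duration, annotator_id, default="OK"):
    """Expand run-length CSV lines into one label per second.

    Per the dataset description, the video is assumed to start at the
    'OK' engagement level until the first annotated change.
    """
    labels = [default] * duration
    for video_id, ann_id, start, label in csv.reader(StringIO(csv_text)):
        if int(ann_id) != annotator_id:
            continue
        # This label applies from its start time onward; a later line
        # with a larger start time overwrites the tail of the list.
        for t in range(int(start), duration):
            labels[t] = label
    return labels

# Hypothetical excerpt in the documented VIDEO_ID,ANNOTATOR_ID,START_LABEL,LABEL format:
sample = "163904405748,0,12,Puzzled\n163904405748,0,40,OK\n"
per_second = expand_annotations(sample, duration=60, annotator_id=0)
```

Expanding all four annotators this way gives aligned per-second label sequences that can be compared directly, e.g. for inter-annotator agreement.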
The table below links to the videos and CSV files. Ethnicities were self-declared as C: Caucasian, SA: South Asian, EA: East Asian; genders were self-declared as F: female, M: male.
| Video | CSV | Video Size (Mb) | Est. FPS | Gender | Ethnicity |
|---|---|---|---|---|---|
| 163904405748.webm | 163904405748.csv | 220 | 30 | F | SA |
| 163904657409.webm | 163904657409.csv | 206 | 26 | M | EA |
| 163949565032.webm | 163949565032.csv | 335 | 11.8 | F | C |
| 163965258378.webm | 163965258378.csv | 193 | 30 | F | C |
| 163974758282.webm | 163974758282.csv | 202 | 24 | F | SA |
| 16400282145.webm | 16400282145.csv | 164 | 12.5 | F | C |
| 164002054426.webm | 164002054426.csv | 188 | 30 | M | C |
| 164006412052.webm | 164006412052.csv | 234 | 29 | F | EA |
| 164007913498.webm | 164007913498.csv | 215 | 30 | M | C |
| 164008493608.webm | 164008493608.csv | 177 | 30 | F | EA |
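Because the webcam recordings have varying (and sometimes low) frame rates, as shown in the Est. FPS column, mapping an individual video frame to its per-second annotation requires the frame rate. A minimal sketch, assuming per-second labels have already been expanded from the CSV (the function name and sample labels are illustrative):

```python
def label_for_frame(frame_idx, fps, per_second_labels):
    """Return the annotation label covering the second in which a frame falls."""
    second = int(frame_idx / fps)
    # Clamp in case the video runs slightly past the last annotated second.
    second = min(second, len(per_second_labels) - 1)
    return per_second_labels[second]

# Illustrative per-second labels: 10 s of 'OK' then 5 s of 'Puzzled'.
labels = ["OK"] * 10 + ["Puzzled"] * 5

# For the 11.8 fps video, frame 140 falls at about 11.9 s.
label_for_frame(140, fps=11.8, per_second_labels=labels)
```

Note that the estimated FPS values are approximate, so frame-to-second alignment near label boundaries may be off by a frame or two.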
The data is freely available for research use. Any publications or public display of images or videos based on the data must cite:
A. Linson, Y. Xu, A. R. English, R. B. Fisher; Identifying student struggle by analyzing facial expressions during asynchronous video lecture viewing: Towards an automated tool to support instructors, Proc. 23rd Int. Conf. on Artificial Intelligence in Education, Durham, 2022.
Funding for the data collection was by the University of Edinburgh Regional Skills program. Ethics approval for the data collection and dissemination was given by the School of Informatics, University of Edinburgh.
© 2022 Robert Fisher