Overview

Speech-driven 3D facial motion describes how a 3D human face moves while speaking. This behavior is repeatable and person-specific, which makes it promising for many applications, e.g., person recognition and lip language analysis. This database focuses on dynamic human faces while the subjects speak a short phrase. It contains 1030 samples in two parts: Speaking with Frontal Pose (S3DFM-FP) and Speaking with Varying Pose (S3DFM-VP). There are 770 samples from 77 participants in the FP sub-dataset and 260 samples from 26 participants in the VP sub-dataset. The participants vary in age, gender, ethnicity, and mother tongue.
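The sample counts follow from 10 repetitions per participant: 77 x 10 + 26 x 10 = 1030. As a minimal sketch, the full set of samples can be enumerated as below. The tuple layout and the names `PARTICIPANTS` and `sample_index` are illustrative assumptions and are not part of the database itself.

```python
from itertools import product

# Counts taken from the overview text.
SEQUENCES_PER_PARTICIPANT = 10
PARTICIPANTS = {"S3DFM-FP": 77, "S3DFM-VP": 26}


def sample_index():
    """Return every (subset, participant, sequence) triple, numbered from 1."""
    return [
        (subset, participant, sequence)
        for subset, count in PARTICIPANTS.items()
        for participant, sequence in product(
            range(1, count + 1), range(1, SEQUENCES_PER_PARTICIPANT + 1)
        )
    ]


if __name__ == "__main__":
    index = sample_index()
    print(len(index))  # 1030
```

An index like this can be handy for defining train/test splits by participant rather than by sequence, so that repetitions from one person do not end up on both sides of a split.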

Data Acquisition

A high-frame-rate (500 FPS) 3D video sensor from DI4D Ltd was used to capture the data. The sensor is a binocular stereo vision system consisting mainly of two intensity cameras. Each participant was asked to repeat a short phrase, ni'hao (Chinese for 'Hello'), 10 times while looking naturally straight at the cameras. For each repetition, we captured a video sequence with the sensor and a synchronized audio sequence with a microphone. For the varying-pose capture, the participant repeated the same phrase while moving the head naturally.

The 3D reconstruction of each video sequence was done using DI4D's commercial software, with additional spatial smoothing and temporal filtering. Each sample contains a depth/3D sequence and a pixel-wise registered intensity sequence, plus a short 'passphrase' (the synchronized audio sequence). Each video sequence contains 500 frames, i.e. 1 second at 500 FPS, and each audio sequence also covers 1 second at a sampling frequency of 44.1 kHz. The depth/3D and intensity images each have a resolution of 600 x 600 points. (The video sequences were downsampled from their original resolution of 1200 x 1200 pixels to improve processing efficiency and reduce 3D noise.)
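Since the video and the audio both span 1 second, each of the 500 frames corresponds to 44100 / 500 = 88.2 audio samples. The sketch below maps a frame to its range of audio samples. It assumes 0-based frame indices and that both streams start at the same instant; the function name `audio_span_for_frame` is hypothetical.

```python
FRAME_RATE_HZ = 500      # from the text: 500 FPS sensor
AUDIO_RATE_HZ = 44_100   # from the text: 44.1 kHz sampling frequency
NUM_FRAMES = 500         # from the text: 500 frames per sequence


def audio_span_for_frame(frame_index: int) -> tuple[int, int]:
    """Return the half-open [start, stop) audio-sample range for one frame.

    Integer arithmetic keeps the ranges contiguous and non-overlapping,
    so spans hold either 88 or 89 samples (88.2 on average).
    """
    if not 0 <= frame_index < NUM_FRAMES:
        raise ValueError(f"frame_index must be in [0, {NUM_FRAMES})")
    start = frame_index * AUDIO_RATE_HZ // FRAME_RATE_HZ
    stop = (frame_index + 1) * AUDIO_RATE_HZ // FRAME_RATE_HZ
    return start, stop


if __name__ == "__main__":
    print(audio_span_for_frame(0))    # (0, 88)
    print(audio_span_for_frame(499))  # (44011, 44100)
```

If the frame numbers used elsewhere are 1-based, subtract 1 before calling the function.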

Overall, the database contains two parts: Frontal Pose (S3DFM-FP) and Varying Pose (S3DFM-VP).

In the S3DFM-FP, there are 770 samples with

In the S3DFM-VP, there are 260 samples with

Data Examples

As examples, Fig. 1 shows the cosine-shaded depth data from two participants, together with the registered intensity frames (frames 50, 150, 300, and 450) from a video sequence and the synchronized audio sequence.
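Cosine shading is commonly understood as rendering each surface point with a brightness equal to the cosine of the angle between the surface normal and the viewing direction. The exact rendering used for the figure is not specified here, so the following is only a sketch of that common definition. It assumes a depth map given as a list of rows, a light source along the viewing axis, and depth values in the same unit as `pixel_size`; the function name and the parameter are hypothetical.

```python
import math


def cosine_shade(depth, pixel_size=1.0):
    """Shade a depth grid by the cosine between normal and viewing axis.

    The surface z = depth[row][col] has a normal proportional to
    (-dz/dx, -dz/dy, 1), so its cosine with the axis (0, 0, 1) is
    1 / sqrt(dz/dx^2 + dz/dy^2 + 1).
    """
    rows, cols = len(depth), len(depth[0])
    if rows < 2 or cols < 2:
        raise ValueError("depth grid must be at least 2 x 2")

    shaded = []
    for r in range(rows):
        # Central differences inside the grid, one-sided at the borders.
        r0, r1 = max(r - 1, 0), min(r + 1, rows - 1)
        out_row = []
        for c in range(cols):
            c0, c1 = max(c - 1, 0), min(c + 1, cols - 1)
            dz_dx = (depth[r][c1] - depth[r][c0]) / ((c1 - c0) * pixel_size)
            dz_dy = (depth[r1][c] - depth[r0][c]) / ((r1 - r0) * pixel_size)
            out_row.append(1.0 / math.sqrt(dz_dx**2 + dz_dy**2 + 1.0))
        shaded.append(out_row)
    return shaded
```

Under these assumptions, a surface facing the camera is rendered at full brightness (1.0), and brightness falls toward 0 as the surface turns away from the viewing axis.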

Figure 1. Example samples from 2 participants. For each set: 3D images (top row); registered intensity images (middle row); a synchronized audio sequence (bottom row; phrase: ni'hao).

The mouth is the principal dynamic region of a speaking face. We represent the 3D mouth region by the mouth width and the mouth opening. Fig. 2 shows how the 3D mouth region of one participant changes and how repeatable this change is across 10 sequences.
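One plausible way to obtain mouth width and opening from a 3D frame is to measure Euclidean distances between mouth landmarks. The sketch below assumes four landmarks per frame (two mouth corners and the midpoints of the upper and lower lip), all in the same length unit. The landmark set, the `MouthLandmarks` type, and the function name are illustrative assumptions; the landmark definition used for the database is not stated here.

```python
import math
from typing import NamedTuple

Point3D = tuple[float, float, float]


class MouthLandmarks(NamedTuple):
    """Hypothetical per-frame 3D mouth landmarks."""
    left_corner: Point3D
    right_corner: Point3D
    upper_lip: Point3D
    lower_lip: Point3D


def mouth_width_and_opening(lm: MouthLandmarks) -> tuple[float, float]:
    """Return (width, opening) as 3D Euclidean distances."""
    width = math.dist(lm.left_corner, lm.right_corner)
    opening = math.dist(lm.upper_lip, lm.lower_lip)
    return width, opening


if __name__ == "__main__":
    frame = MouthLandmarks(
        left_corner=(-25.0, 0.0, 0.0),
        right_corner=(25.0, 0.0, 0.0),
        upper_lip=(0.0, 5.0, 0.0),
        lower_lip=(0.0, -7.0, 0.0),
    )
    print(mouth_width_and_opening(frame))  # (50.0, 12.0)
```

Applying this to every frame of a sequence gives two curves over time, and comparing those curves across a participant's 10 sequences is one way to inspect repeatability.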

LM distance
Figure 2. Analysis of the mouth region of a participant speaking a phrase

Data Download

The database is freely available to other researchers and parties under the terms of the CC BY-NC-ND license. Note that the database may only be used for academic research. If you use the data in a publication, please cite:

Each file listed below contains 10 sequences (3D, intensity, and audio) from a participant. To download a file, click on it and then unzip it; each file must be unzipped individually.

Part 1: Speaking Faces with Frontal Pose (S3DFM-FP)


Participant 1:   Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 2:   Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 3:   Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 4:   Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 5:   Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 6:   Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 7:   Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 8:   Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 9:   Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 10: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 11: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 12: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 13: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 14: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 15: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 16: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 17: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 18: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 19: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 20: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 21: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 22: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 23: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 24: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 25: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 26: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 27: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 28: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 29: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 30: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 31: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 32: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 33: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 34: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 35: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 36: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 37: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 38: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 39: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 40: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 41: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 42: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 43: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 44: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 45: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 46: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 47: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 48: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 49: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 50: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 51: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 52: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 53: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 54: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 55: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 56: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 57: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 58: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 59: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 60: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 61: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 62: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 63: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 64: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 65: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 66: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 67: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 68: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 69: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 70: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 71: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 72: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 73: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 74: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 75: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 76: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 77: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   


Part 2: Speaking Faces with Varying Pose (S3DFM-VP)

The video sequences above were recorded while the participants faced forward and were essentially static apart from speaking. For 26 of the participants, we recorded an additional 10 videos each in which they move their heads while speaking the same passphrase.


Participant 1:   Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 2:   Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 3:   Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 4:   Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 5:   Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 6:   Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 7:   Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 8:   Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 9:   Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 10: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 11: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 12: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 13: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 14: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 15: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 16: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 17: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 18: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 19: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 20: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 21: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 22: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 23: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 24: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10
Participant 25: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10   
Participant 26: Seq1   Seq2   Seq3   Seq4   Seq5   Seq6   Seq7   Seq8   Seq9   Seq10

Note: The participants in S3DFM-VP are a subset of those in S3DFM-FP. The correspondences between the identity numbers are given below for linking purposes.

S3DFM-VP 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
S3DFM-FP 1604359112187496162636465666768697071727374757677

Research Team

The database was established by Jie Zhang as part of her PhD research while she was a visiting PhD student at the University of Edinburgh (UoE). Jie Zhang was with Beihang University and UoE. Robert B. Fisher and Luis Horna are with UoE.

You might be interested in these related papers:

If you have any questions, please don't hesitate to contact us.

Acknowledgement

This research was supported by funding from the China Scholarship Council (CSC) under grant 201606020087 and from the National Council for Science and Technology (CONACyT) of Mexico. We would like to thank all the participants in the data acquisition, and DI4D Ltd for their support.