MATCH corpus download
Use this page to download audio and annotations from the MATCH corpus. The annotations, which include the orthographic transcription, come together in a single zip file. The audio files are too large to package in this way, so use the chooser to indicate which ones you wish to download.
Annotations, including transcription
Annotations are in NXT format. To use them with the signals downloaded below, unzip the annotations file into the 'amicorpus' directory. Requires NXT version 1.4.4.
- MATCH annotations v1.0 30-May-2014 (4.3MB)
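The unzip step can also be scripted. The following is a minimal sketch using Python's standard zipfile module; the function name unpack_annotations and the archive file name in the usage note are hypothetical, and the default target directory is simply the 'amicorpus' name given above.

```python
import zipfile
from pathlib import Path


def unpack_annotations(zip_path, target_dir="amicorpus"):
    """Extract the annotations archive into target_dir.

    Returns the list of member names found in the archive.
    """
    target = Path(target_dir)
    # Create the target directory if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
        archive.extractall(target)
    return names
```

For example, unpack_annotations("match_annotations.zip") would extract a downloaded archive saved under that (assumed) name into ./amicorpus.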
Audio
Each participant interacts with the MATCH Wizard-of-Oz system 9 times, resulting in 9 separate audio files. The audio files differ in size depending on the length of the interactions. There is about 200MB of audio for each of the 46 participants, and 8.4GB of audio in total. Please download only what you need.
To use the audio files with the NXT annotations, they should be stored in a directory called signals in the same directory as the MATCH metadata file match.xml.
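The layout described above can be set up with a few lines of Python. This is a minimal sketch, not part of the corpus tooling: the function name place_audio is hypothetical, and it only assumes that corpus_root is the directory containing match.xml.

```python
import shutil
from pathlib import Path


def place_audio(corpus_root, audio_files):
    """Copy downloaded audio files into <corpus_root>/signals.

    corpus_root is assumed to be the directory that holds match.xml.
    Returns the list of destination paths.
    """
    root = Path(corpus_root)
    # The signals directory must sit next to the metadata file.
    if not (root / "match.xml").is_file():
        raise FileNotFoundError(f"match.xml not found in {root}")

    signals = root / "signals"
    signals.mkdir(exist_ok=True)

    placed = []
    for src in map(Path, audio_files):
        dest = signals / src.name
        shutil.copy2(src, dest)  # copy, leaving the download in place
        placed.append(dest)
    return placed
```

Copying rather than moving keeps the original downloads untouched; swap shutil.copy2 for shutil.move if disk space is a concern.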
If you prefer, you can download individual audio files using your browser here.
1) Select one or more MATCH participants
All of the signals, transcription and annotations have been released publicly under the Creative Commons Attribution-NonCommercial-ShareAlike 2.5 Licence.