AMI Corpus Participant IDs Explained
Participant IDs in the AMI Meeting corpus take the form:
[MF][IET][EDO][0-9][0-9][0-9]
A limited number of participant IDs also have a further two or three letters at the end. Examples of participant IDs are MIO016, FEE088 and MTD012ME.
- First letter: gender
  - Must be either M or F
- Second letter: location
  - Must be I, E or T, for the site at which the participant was recorded: I = Idiap (Switzerland), E = University of Edinburgh (UK), T = TNO (the Netherlands)
- Third letter: native language
  - Must be E, D or O, for English, Dutch or Other
- Numbers
  - Three digits chosen to make the identifier unique
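The scheme above can be checked and decoded with a short regular expression. The following is a minimal sketch, not part of the corpus tooling: the names ID_PATTERN, ParticipantID and parse_participant_id are hypothetical, and the optional suffix is assumed to be two or three uppercase letters, based only on the MTD012ME example.

```python
import re
from typing import NamedTuple, Optional

# Letter codes as described in the list above.
SITES = {"I": "Idiap", "E": "University of Edinburgh", "T": "TNO"}
LANGUAGES = {"E": "English", "D": "Dutch", "O": "Other"}

# Assumption: the optional trailing letters are uppercase (as in MTD012ME).
ID_PATTERN = re.compile(r"([MF])([IET])([EDO])([0-9]{3})([A-Z]{2,3})?")


class ParticipantID(NamedTuple):
    gender: str            # "M" or "F", kept as the raw letter
    site: str              # recording site
    native_language: str   # "English", "Dutch" or "Other"
    number: str            # three digits, kept as text to preserve leading zeros
    suffix: Optional[str]  # extra trailing letters, or None if absent


def parse_participant_id(pid: str) -> ParticipantID:
    """Split a participant ID into its documented components."""
    match = ID_PATTERN.fullmatch(pid)
    if match is None:
        raise ValueError(f"not a valid participant ID: {pid!r}")
    gender, site, language, number, suffix = match.groups()
    return ParticipantID(gender, SITES[site], LANGUAGES[language], number, suffix)


print(parse_participant_id("MTD012ME"))
# ParticipantID(gender='M', site='TNO', native_language='Dutch', number='012', suffix='ME')
```

Keeping the number as a string avoids turning 016 into 16, which would no longer match the ID as written in the corpus files.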
Note that further information about the participants' language skills was collected in a questionnaire. It is included in the NXT-format download as corpusResources/participants.xml. If the native language is marked as English, the english_language element can have a region attribute stating which country or region the native English speaker has mainly lived in. If the native language is not English, the english_language element may have two attributes, country and months, which hold the name of an English-speaking country and the number of months the person has been resident there. If neither is present, the assumption is that the participant has not lived in an English-speaking country.
There is also information mapping channel and camera numbers to these participants (and their roles for scenario meetings) in meetings.xml.
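The english_language attributes described above could be interpreted along the following lines. This is only a sketch under stated assumptions: the function name english_background and the sample fragment are hypothetical, the overall layout of participants.xml is not described here, and how the native language itself is marked is left to the caller, who passes it in as a flag.

```python
import xml.etree.ElementTree as ET


def english_background(elem: ET.Element, native_english: bool) -> dict:
    """Summarise one english_language element.

    `native_english` must be supplied by the caller; this sketch does not
    guess how the native language is recorded in participants.xml.
    """
    if native_english:
        # Native speakers: optional region attribute.
        return {"native_english": True, "region": elem.get("region")}
    country = elem.get("country")
    months = elem.get("months")
    return {
        "native_english": False,
        "country": country,
        "months": months,  # kept as text; the stored format is not assumed
        # If neither attribute is present, the participant is assumed not to
        # have lived in an English-speaking country.
        "lived_in_english_speaking_country": country is not None or months is not None,
    }


# Hypothetical fragment for illustration only, not copied from the corpus.
sample = ET.fromstring('<english_language country="UK" months="18"/>')
print(english_background(sample, native_english=False))
```

Elements would normally be obtained by parsing participants.xml with xml.etree.ElementTree and iterating over the english_language elements it contains.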