Data Problems

Issues concerning the collection and pre-processing of audio, video, and auxiliary data, and resolution status

As indicated in the table below, some of the recordings are affected by signal discontinuities. In most cases, these were caused by a sudden failure of the audio recording equipment. To generate transcriptions and annotated data for these files, the signal segments were concatenated with zero padding (performed on the 48 kHz files) matching the length of each discontinuity. Frame drops also occurred in some of the video signals collected at Edinburgh and IDIAP when they were encoded from DV tape to DivX. These videos have undergone additional processing and are now synchronized with the audio signals. Due to hardware issues during recording, the video signals collected at TNO are not perfectly synchronized. However, they have undergone an additional processing step (see Meeting Rooms) and now exhibit acceptable levels of synchronization.
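The zero-padding step described above can be sketched as follows. This is a minimal illustration, not the tool actually used to process the corpus: the function name `pad_discontinuity`, its parameters, and the assumption that the length of the gap is known in seconds are all hypothetical. Only the 48 kHz sample rate comes from the text.

```python
SAMPLE_RATE_HZ = 48_000  # the text states that padding was performed on 48 kHz files


def pad_discontinuity(before, after, gap_seconds, sample_rate=SAMPLE_RATE_HZ):
    """Join two runs of PCM samples, filling the gap between them with zeros.

    `before` and `after` are sequences of samples recorded on either side of
    the discontinuity; `gap_seconds` is the (assumed known) length of the gap.
    """
    if gap_seconds < 0:
        raise ValueError("gap_seconds must be non-negative")
    # Number of silent samples needed to cover the gap at this sample rate.
    gap_samples = round(gap_seconds * sample_rate)
    return list(before) + [0] * gap_samples + list(after)


# Example: a 1 ms dropout at 48 kHz is filled with 48 zero-valued samples.
joined = pad_discontinuity([1, 2, 3], [4, 5], gap_seconds=0.001)
print(len(joined))
```

Padding with silence rather than simply splicing the segments together keeps every later sample at its original time offset, so transcription and annotation timestamps remain aligned with the other signals.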

Other data collection issues relate to seat swapping by meeting participants. In a few cases, this has resulted in the mislabeling of participant IDs and thumbnails. In a very small number of meetings, headset microphones are also positioned inappropriately. In one meeting, a participant removes his headset altogether. These and other data collection and pre-processing issues are listed in the following table, along with whether and, where relevant, how each issue was resolved.

MEETING ID | PROBLEM | RESOLVED (Y/N) | NOTES
* all meetings | Logitech I/O digital pen output not synchronized with rest of data | N | pens' internal clocks do not drift by more than a few seconds during each meeting, providing sufficiently accurate calibration
E and I meetings | some frame drops in video signals when encoding from DV tape to DivX | Y | video now synchronized with audio signals
IN1014 | Recording stops before the end of the meeting. | N/A |
IS1000a | PM and UI remove microphones for part of meeting | N/A |
IS1000b | PM and UI swap seats | N/A |
IS1000c | audio dropout | N | videos un-synched with audio after 23:05; no audio dropout timing information available
IS1001c,d | ID and UI swap seats before meeting | N/A |
IS1002* | incomplete trial due to dropout in IS1002a | N |
IS1003b | no mic array audio | N | lost
IS1003c,d | ID and UI swap seats before meetings | N/A | correct map in metadata xml files
IS1005* | incomplete trial due to dropout in IS1005d | N |
IS1007d | no mic array audio | N | lost