Video clips
created July 11, 2003 and January 20, 2004
Introduction
For the CAVIAR
project a number of video clips were recorded acting out the different scenarios
of interest. These include people walking alone, meeting with others, window
shopping, entering and exiting shops, fighting and passing out and last, but
not least, leaving a package in a public place.
The first set
of video clips was filmed for the CAVIAR project with
a wide angle camera lens in the entrance lobby of the INRIA Labs at Grenoble,
France. The resolution is half-resolution PAL standard (384 x 288 pixels, 25
frames per second) and compressed using MPEG2. The file sizes are mostly
between 6 and 12 MB, a few up to 21 MB.
A typical frame
from the image sequences is below. It shows three individual boxes (yellow) and
one group box (green). There are several people in the video sequence that are
not boxed because they do not move over the course of the sequence.
The second set of data was also
filmed with a wide angle lens, along and across the hallway in a shopping centre in
Lisbon. For each sequence, there are two time synchronised videos, one with the
view across and the other along the hallway. The resolution is half-resolution
PAL standard (384 x 288 pixels, 25 frames per second) and compressed using
MPEG2. The MPEG file sizes are mostly between 6 and 12 MB, with a few up to 21 MB. All data is
publicly available and can be downloaded from this page in MPEG2 format or
as individual JPEG frames. If you publish results using the data, please acknowledge
the data as coming from the EC Funded CAVIAR project/IST 2001 37540, found at
URL: http://homepages.inf.ed.ac.uk/rbf/CAVIAR/.
Usage licence: Creative Commons BY-SA.
The ground truth for these
sequences was found by hand-labeling the images, as in the example shown above.
The Java programs for the interactive labeller can be found here (unsupported). This is the user guide. The finite state automaton
that describes the allowable roles, activities and sequence of situations in
each context is here. The XML grammar
for the ground truth labeling files is here.
Note that these XML files are large (averaging about 1.5 MB), so viewing one of the files in a browser window can take a minute
or more to load up because of the default formatting of the XML.
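Because the files are large, it can be more practical to stream them than to open them in a browser. The following is a minimal sketch, assuming a layout of `frame` elements (with a `number` attribute) that contain `object` elements (with an `id` attribute), each holding a `box` element with `xc`, `yc`, `w` and `h` attributes. These names are assumptions made for illustration; check them against the XML grammar before relying on them.

```python
import io
import xml.etree.ElementTree as ET


def iter_boxes(source):
    """Yield (frame_number, object_id, xc, yc, w, h) for every labelled box.

    The element and attribute names used here are assumptions about the
    file layout, not taken from the published grammar.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "frame":
            continue
        frame_number = int(elem.get("number"))
        for obj in elem.iter("object"):
            box = obj.find("box")
            if box is None:
                continue
            yield (
                frame_number,
                int(obj.get("id")),
                int(box.get("xc")),
                int(box.get("yc")),
                int(box.get("w")),
                int(box.get("h")),
            )
        # Drop the finished frame so memory use stays roughly constant.
        elem.clear()


# Tiny hand-written sample in the assumed layout (not real CAVIAR data).
SAMPLE = b"""<dataset name="demo">
  <frame number="0">
    <objectlist>
      <object id="3"><box h="40" w="20" xc="100" yc="150"/></object>
    </objectlist>
    <grouplist/>
  </frame>
  <frame number="1">
    <objectlist>
      <object id="3"><box h="41" w="20" xc="102" yc="151"/></object>
      <object id="4"><box h="30" w="15" xc="200" yc="90"/></object>
    </objectlist>
  </frame>
</dataset>"""

boxes = list(iter_boxes(io.BytesIO(SAMPLE)))
```

For a downloaded ground truth file, pass its path or an opened binary file to `iter_boxes` in place of the in-memory sample.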
The ground truth XML is based on the CVML language (CVML - An XML-based Computer Vision Markup Language), which was
presented at ICPR.
CVML is described in more detail at mindmakers.org.
The CVML material includes a C++ based CoreLibrary, which supports
reading and writing the XML (amongst other things).
A discussion about the INRIA datasets was presented at PETS04. The ground truth labelling notation discussed in the paper has changed
to XML and some minor details have changed, but most of the concepts and discussion are still useful.
Some tracked targets in 17 of the sequences have had additional information
about their head, gaze direction, hand, feet and shoulder positions added.
(Not all because we have no more time and money at the moment.)
An example of a marked up frame with heads, gaze, hands, feet and shoulders is:
Clips from INRIA (1st Set)
Six basic scenarios were acted out by the CAVIAR team members. Most clips start with one
member showing the scene number in body sign language. This can be used for
calibration or be removed at will.
For people interested in the ground plane homography, the mapping can be
computed from the point correspondences (pixel position and ground position in cm) listed for this set. The image pixel positions depend on
your image scaling; the positions given are for the 384 x 288 JPEG images.
For the image just below the corresponding pixel positions are:
Walking
One person walking - straight line: Walk1.mpg (7 Mb); JPEGS: Walk1_jpg.tar.gz (13 Mb)
One person walking - straight line and return: Walk2.mpg (12 Mb)
One person walking - B-line: Walk3.mpg (16 Mb)
Browsing
Person browsing back and forth: Browse1.mpg (12 Mb)
Person browsing and reading for a while: Browse2.mpg (10 Mb)
Person browsing and reading with back turned: Browse3.mpg (11 Mb)
Person browsing reception desk: Browse4.mpg (13 Mb); JPEGS: Browse4_jpg.tar.gz (24 Mb)
Person browsing while waiting short: Browse_WhileWaiting1.mpg (9 Mb)
Person browsing while waiting long: Browse_WhileWaiting2.mpg (21 Mb)
Resting, slumping or fainting
Person resting in chair: Rest_InChair.mpg (12 Mb)
Person slump on floor: Rest_SlumpOnFloor.mpg (11 Mb); JPEGS: Rest_SlumpOnFloor_jpg.tar.gz (19 Mb)
Person wiggle on floor: Rest_WiggleOnFloor.mpg (15 Mb)
Person fall down immobile: Rest_FallOnFloor.mpg (12 Mb)
Leaving bags behind
Person leaving bag by wall: LeftBag.mpg (17 Mb)
Person leaving bag at chairs: LeftBag_AtChair.mpg (13 Mb)
Person leaving bag behind chairs: LeftBag_BehindChair.mpg (13 Mb)
Person leaving box: LeftBox.mpg (10 Mb)
Person leaving bag but then pick it up again: LeftBag_PickedUp.mpg (16 Mb)
People/groups meeting, walking together and splitting up
Two people meet and walk together: Meet_WalkTogether1.mpg (8 Mb)
Two other people meet and walk together: Meet_WalkTogether2.mpg (9 Mb)
Two people meet, walk together and split: Meet_WalkSplit.mpg (7 Mb)
Two people meet, walk, split with third person: Meet_Split_3rdGuy.mpg (11 Mb)
Crowd of four people meet, walk and split: Meet_Crowd.mpg (6 Mb)
Two people enter walk and split: mpg missing
Two people fighting
Two people meet, fight and run away: Fight_RunAway1.mpg (6 Mb)
Two other people meet, fight and run away: Fight_RunAway2.mpg (6 Mb)
Two people meet, fight, one down, other runs away: Fight_OneManDown.mpg (11 Mb); JPEGS: Fight_OneManDown_jpg.tar.gz (20 Mb)
Two people meet, fight and chase each other: Fight_Chase.mpg (5 Mb); JPEGS: Fight_Chase_jpg.tar.gz (9 Mb)
Clips from Shopping Center in Portugal (2nd Set)
In the second set
of experiments each clip was recorded from two different points of view. The
first one shows a view of the corridor, while the second shows a frontal view of
the scenario. The two video sequences should be time synchronised frame by frame.
However, each video set may start at a slightly different time, so you need to
figure out the frame correspondences. The time-code in the
upper left of the images gives the necessary information.
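As a rough illustration of working out the frame correspondence, the sketch below converts the start time-code of each view into a frame count at 25 frames per second and takes the difference. The `HH:MM:SS:FF` time-code format and the function names are assumptions made for this example; the time-code burnt into the images has to be read off by eye (or by other means) first.

```python
FPS = 25  # PAL frame rate stated for both data sets


def timecode_to_frames(timecode, fps=FPS):
    """Convert an 'HH:MM:SS:FF' time-code (assumed format) to a frame count."""
    hours, minutes, seconds, frames = (int(part) for part in timecode.split(":"))
    return ((hours * 60 + minutes) * 60 + seconds) * fps + frames


def frame_offset(cor_start, front_start, fps=FPS):
    """Return the offset such that front_index = cor_index + offset.

    cor_start and front_start are the time-codes shown on the first frame
    of the corridor and front videos respectively.
    """
    return timecode_to_frames(cor_start, fps) - timecode_to_frames(front_start, fps)


# Hypothetical start time-codes, for illustration only.
offset = frame_offset("00:01:10:05", "00:01:09:20")
```

With these made-up values the front video started 10 frames earlier, so corridor frame 0 corresponds to front frame 10.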
During capture and digitisation, a frame segmentation error means that some
frames are duplicated and the expected next frame is missing. Thus, the
overall rate is correct, but occasionally 2 consecutive frames are identical.
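If the repeated frames matter for your processing, they can be flagged before tracking. The sketch below marks every frame whose encoded bytes are identical to those of the previous frame. It assumes that a duplicated frame survives the split into JPEGs as a byte-identical file, which may not hold; if it does not, a pixel-difference threshold would be needed instead.

```python
import hashlib


def find_repeated_frames(frames):
    """Return the indices i where frame i is byte-identical to frame i - 1.

    frames is any iterable of bytes objects, for example the contents of
    the JPEG files of one sequence read in frame order.
    """
    repeated = []
    previous_digest = None
    for index, data in enumerate(frames):
        digest = hashlib.sha1(data).digest()
        if digest == previous_digest:
            repeated.append(index)
        previous_digest = digest
    return repeated


# Stand-in data: short byte strings play the role of encoded frames.
demo = [b"frame-a", b"frame-b", b"frame-b", b"frame-c"]
repeats = find_repeated_frames(demo)
```

Here `repeats` holds the index of the second copy of each repeated frame, which is the position where the expected frame is missing.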
The sequences in this set are longer (1500 frames on average) and contain
more individuals and groups than the first set. Example synchronized
images are shown here:
For people interested in the ground plane homography, the mapping can be
computed from the point correspondences (pixel position and ground position in cm) listed for this set. The image pixel positions depend on
your image scaling; the positions given are for the 384 x 288 JPEG images.
For the image just below the corresponding pixel
positions are:
Each entry below gives the scenario, then the CORRIDOR VIEW clip, then the FRONT VIEW clip.
Couple walking along corridor browsing, persons going inside and coming out of stores. Corridor view: WalkByShop1cor.mpg (14 Mb). Front view: WalkByShop1front.mpg (14 Mb).
Two persons cross paths at the entrance of a store, couple walking on the corridor. Corridor view: EnterExitCrossingPaths1cor.mpg (2 Mb), ground truth XML version: ceecp1gt.xml. Front view: EnterExitCrossingPaths1front.mpg (2 Mb).
Two persons cross paths at the entrance of a store. Corridor view: EnterExitCrossingPaths2cor.mpg (3 Mb), ground truth XML version: ceecp2gt.xml. Front view: EnterExitCrossingPaths2front.mpg (3 Mb).
Person goes outside a store. Visible on corridor view only: three persons walking together in the corridor. Corridor view: OneLeaveShop1cor.mpg (2 Mb). Front view: OneLeaveShop1front.mpg (2 Mb).
Person goes outside a store. Visible on corridor view only: four persons walking together in the corridor. Corridor view: OneLeaveShop2cor.mpg (6 Mb). Front view: OneLeaveShop2front.mpg (6 Mb).
Person comes out of store and later reenters. Visible on corridor view only: person coming out of store and walking on the corridor. Corridor view: OneLeaveShopReenter1cor.mpg (2 Mb). Front view: OneLeaveShopReenter1front.mpg (2 Mb).
Person comes out of store and later reenters. Visible on corridor view only: four persons walk on the corridor. Corridor view: OneLeaveShopReenter2cor.mpg (3 Mb). Front view: OneLeaveShopReenter2front.mpg (3 Mb).
Couple walking on the corridor, one goes inside a store, the other waits outside, later they rejoin and leave together. Corridor view: OneShopOneWait1cor.mpg (8 Mb). Front view: OneShopOneWait1front.mpg (8 Mb).
Similar to OneShopOneWait1front, but contains various groups of persons walking along the corridor. Corridor view: OneShopOneWait2cor.mpg (5 Mb). Front view: OneShopOneWait2front.mpg (8 Mb).
Couple goes inside store. Corridor view: OneStopEnter1cor.mpg (9 Mb). Front view: OneStopEnter1front.mpg (9 Mb).
Person browses stores and goes inside and out, a couple of walkers along the corridor. Visible on corridor view only: two persons go inside stores. Corridor view: OneStopEnter2cor.mpg (16 Mb). Front view: OneStopEnter2front.mpg (16 Mb).
Person stops outside store, goes inside and out of store. Five groups of people walking along the corridor. Corridor view: OneStopMoveEnter1cor.mpg (9 Mb). Front view: OneStopMoveEnter1front.mpg (9 Mb).
Person goes inside and out of a store twice. Visible on corridor view only: group of 4 people comes out of store. Corridor view: OneStopMoveEnter2cor.mpg (13 Mb). Front view: OneStopMoveEnter2front.mpg (13 Mb).
Person stops outside store, goes inside and out of store. Visible on corridor view only: couple comes out of store. Corridor view: OneStopMoveNoEnter1cor.mpg (10 Mb). Front view: OneStopMoveNoEnter1front.mpg (10 Mb).
Person goes inside store, browses, and leaves store. Visible on corridor view only: couple walking along the corridor. Corridor view: OneStopMoveNoEnter2cor.mpg (6 Mb). Front view: OneStopMoveNoEnter2front.mpg (6 Mb).
Person stops outside a store and continues walking along the corridor. Corridor view: OneStopNoEnter1cor.mpg (4 Mb). Front view: OneStopNoEnter1front.mpg (4 Mb).
Person stops outside a store and continues walking, another person goes inside a store, browses and then leaves the store. Corridor view: OneStopNoEnter2cor.mpg (9 Mb). Front view: OneStopNoEnter2front.mpg (9 Mb).
Person goes inside a store and browses, another person joins and they leave the store together. Corridor view: ShopAssistant1cor.mpg (10 Mb). Front view: ShopAssistant1front.mpg (10 Mb).
Person goes inside a store, browses, another person joins, later they split, 3 people walking together along the corridor. Corridor view: ShopAssistant2cor.mpg (21 Mb). Front view: ShopAssistant2front.mpg (21 Mb).
3 persons walking in the corridor. Corridor view: ThreePastShop1cor.mpg (10 Mb). Front view: ThreePastShop1front.mpg (10 Mb).
Another 3 persons walking in the corridor. Corridor view: ThreePastShop2cor.mpg (9 Mb). Front view: ThreePastShop2front.mpg (9 Mb).
Couple goes inside a store and later comes out. Corridor view: TwoEnterShop1cor.mpg (10 Mb). Front view: TwoEnterShop1front.mpg (10 Mb).
Couple goes inside a store and later comes out. Corridor view: TwoEnterShop2cor.mpg (9 Mb), ground truth XML version: c2es2gt.xml. Front view: TwoEnterShop2front.mpg (9 Mb), ground truth XML version: f2es2gt.xml.
Two couples go inside store and later one comes out. Corridor view: TwoEnterShop3cor.mpg (7 Mb). Front view: TwoEnterShop3front.mpg (6 Mb).
Couple leaves a store while browsing. Corridor view: TwoLeaveShop1cor.mpg (8 Mb). Front view: TwoLeaveShop1front.mpg (8 Mb).
A couple leaves a store. Corridor view: TwoLeaveShop2cor.mpg (4 Mb). Front view: TwoLeaveShop2front.mpg (4 Mb).
INRIA Sequences
Datafile Sequence Comments
fomdgt2.xml Fight_OneManDown only targets marked: 0,1,4
mc1gt.xml Meet_Crowd all marked
mwt2gt.xml Meet_WalkTogether2 all marked
fcgt.xml Fight_Chase all marked
mws1gt.xml Meet_WalkSplit all marked
ms3ggt.xml Meet_Split_3rdGuy only targets marked: 0
fra1gt.xml Fight_RunAway1 only targets marked: 7
fra2gt.xml Fight_RunAway2 only targets marked: 4
Lisbon Sequences
Datafile Sequence Comments
c2es1gt.xml TwoEnterShop1cor no gaze directions annotated
fsa1gt.xml ShopAssistant1front all marked
csa1gt.xml ShopAssistant1cor all marked
c3ps1gt.xml ThreePastShop1cor all marked
f3ps1gt.xml ThreePastShop1front all marked
f2es1gt.xml TwoEnterShop1front all marked
fosow2gt.xml OneShopOneWait2front all marked
cosow2gt.xml OneShopOneWait2cor all marked
cosme2gt.xml OneStopMoveEnter2cor only targets marked: 0,2,11
c3ps2gt.xml ThreePastShop2cor all marked
f3ps2gt.xml ThreePastShop2front all marked
Point (Col,Row) (pixels) (X,Y) (cm)
1 (64,88) (0,671.5)
2 (211,40) (1116,670)
3 (349,184) (1545,190)
4 (39,187) (0,0)
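The four correspondences above are enough to compute a ground plane homography exactly. The sketch below is one way to do it in plain Python: it fixes the last entry of the 3x3 matrix to 1, builds the usual 8x8 linear system from the four point pairs and solves it by Gaussian elimination. The function names are illustrative, and the choice of normalisation is an assumption of this example. Since the clips were filmed with a wide angle lens, a single homography is only an approximation away from the four reference points.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def fit_homography(pixels, ground):
    """Return a 3x3 matrix H (with H[2][2] = 1) mapping (col, row) to (X, Y).

    Expects exactly four point pairs, no three of them collinear.
    """
    a, b = [], []
    for (c, r), (x, y) in zip(pixels, ground):
        a.append([c, r, 1, 0, 0, 0, -x * c, -x * r])
        b.append(x)
        a.append([0, 0, 0, c, r, 1, -y * c, -y * r])
        b.append(y)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def to_ground(h, col, row):
    """Map an image pixel (col, row) to ground plane coordinates in cm."""
    w = h[2][0] * col + h[2][1] * row + h[2][2]
    x = (h[0][0] * col + h[0][1] * row + h[0][2]) / w
    y = (h[1][0] * col + h[1][1] * row + h[1][2]) / w
    return x, y


# The four correspondences from the table above (384 x 288 images).
PIXELS = [(64, 88), (211, 40), (349, 184), (39, 187)]
GROUND_CM = [(0, 671.5), (1116, 670), (1545, 190), (0, 0)]

H = fit_homography(PIXELS, GROUND_CM)
```

Calling `to_ground(H, col, row)` with, for example, the bottom centre of a bounding box gives an estimate of where that target stands on the floor. If you rescale the images, scale the pixel coordinates by the same factor before fitting.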
Ground truth XML version: wk1gt.xml
Ground truth XML version: wk2gt.xml
JPEGS: Walk2_jpg.tar.gz (22 Mb)
Ground truth XML version: wk3gt.xml
JPEGS: Walk3_jpg.tar.gz (29 Mb)
Ground truth XML version: br1gt.xml
JPEGS: Browse1_jpg.tar.gz (22 Mb)
Ground truth XML version: br2gt.xml
JPEGS: Browse2_jpg.tar.gz (18 Mb)
Ground truth XML version: br3gt.xml
JPEGS: Browse3_jpg.tar.gz (19 Mb)
Ground truth XML version: br4gt.xml
Ground truth XML version: bww1gt.xml
JPEGS: Browse_WhileWaiting1_jpg.tar.gz (16 Mb)
Ground truth XML version: bww2gt.xml
JPEGS: Browse_WhileWaiting2_jpg.tar.gz (39 Mb)
Ground truth XML version: ricgt.xml
JPEGS: Rest_InChair_jpg.tar.gz (21 Mb)
Ground truth XML version: rsfgt.xml
Ground truth XML version: rwgt.xml
JPEGS: Rest_WiggleOnFloor_jpg.tar.gz (28 Mb)
Ground truth XML version: rffgt.xml
JPEGS: Rest_FallOnFloor_jpg.tar.gz (21 Mb)
Ground truth XML version: lb1gt.xml
JPEGS: LeftBag_jpg.tar.gz (31 Mb)
Ground truth XML version: lb2gt.xml
JPEGS: LeftBag_AtChair_jpg.tar.gz (24 Mb)
Ground truth XML version: lbbcgt.xml
JPEGS: LeftBag_BehindChair_jpg.tar.gz (23 Mb)
Ground truth XML version: lbgt.xml
JPEGS: LeftBox_jpg.tar.gz (18 Mb)
Ground truth XML version: lbpugt.xml
JPEGS: LeftBag_PickedUp_jpg.tar.gz (29 Mb)
Ground truth XML version: mwt1gt.xml
JPEGS: Meet_WalkTogether1_jpg.tar.gz (15 Mb)
Ground truth XML version: mwt2gt.xml
JPEGS: Meet_WalkTogether2_jpg.tar.gz (17 Mb)
Ground truth XML version: mws1gt.xml
JPEGS: Meet_WalkSplit_jpg.tar.gz (13 Mb)
Ground truth XML version: ms3ggt.xml
JPEGS: Meet_Split_3rdGuy_jpg.tar.gz (19 Mb)
Ground truth XML version: mc1gt.xml
JPEGS: Meet_Crowd_jpg.tar.gz (10 Mb)
Ground truth XML version: spgt.xml
JPEGS: Split_jpg.tar.gz (11 Mb)
Ground truth XML version: fra1gt.xml
JPEGS: Fight_RunAway1_jpg.tar.gz (12 Mb)
Ground truth XML version: fra2gt.xml
JPEGS: Fight_RunAway2_jpg.tar.gz (12 Mb)
Ground truth XML version: fomdgt1.xml fomdgt2.xml fomdgt3.xml
Ground truth XML version: fcgt.xml
Point (Col,Row) (pixels) (X,Y) (cm)
1 (91,163) (000,975)
2 (241,163) (290,975)
3 (98,266) (000,-110)
4 (322,265) (290,-110)
5 (60,153) (000,000)
6 (359,153) (000,975)
7 (50,201) (382,098)
8 (367,200) (382,878)
Ground truth XML version: cwbs1gt.xml
JPEGS: WalkByShop1cor.tar.gz (48 Mb)
Ground truth XML version: cwbs1gt.xml
JPEGS: WalkByShop1front.tar.gz (56 Mb)
JPEGS: EnterExitCrossingPaths1cor.tar.gz (8 Mb)
Ground truth XML version: feecp1gt.xml
JPEGS: EnterExitCrossingPaths1front.tar.gz (10 Mb)
JPEGS: EnterExitCrossingPaths2cor.tar.gz (10 Mb)
Ground truth XML version: feecp2gt.xml
JPEGS: EnterExitCrossingPaths2front.tar.gz ( 12 Mb)
Ground truth XML version: cols1gt.xml
JPEGS: OneLeaveShop1cor.tar.gz ( 6 Mb)
Ground truth XML version: fols1gt.xml
JPEGS: OneLeaveShop1front.tar.gz ( 8 Mb)
Ground truth XML version: cols2gt.xml
JPEGS: OneLeaveShop2cor.tar.gz ( 23 Mb)
Ground truth XML version: fols2gt.xml
JPEGS: OneLeaveShop2front.tar.gz ( 29 Mb)
Ground truth XML version: colsr1gt.xml
JPEGS: OneLeaveShopReenter1cor.tar.gz ( 8 Mb)
Ground truth XML version: folsr1gt.xml
JPEGS: OneLeaveShopReenter1front.tar.gz ( 10 Mb)
Ground truth XML version: colsr2gt.xml
JPEGS: OneLeaveShopReenter2cor.tar.gz ( 12 Mb)
Ground truth XML version: folsr2gt.xml
JPEGS: OneLeaveShopReenter2front.tar.gz ( 15 Mb)
Ground truth XML version: cosow1gt.xml
JPEGS: OneShopOneWait1cor.tar.gz ( 30 Mb)
Ground truth XML version: fosow1gt.xml
JPEGS: OneShopOneWait1front.tar.gz ( 10 Mb)
Ground truth XML version: cosow2gt.xml
JPEGS: OneShopOneWait2cor.tar.gz ( 32 Mb)
Ground truth XML version: fosow2gt.xml
JPEGS: OneShopOneWait2front.tar.gz ( 38 Mb)
Ground truth XML version: cose1gt.xml
JPEGS: OneStopEnter1cor.tar.gz ( 31 Mb)
Ground truth XML version: fose1gt.xml
JPEGS: OneStopEnter1front.tar.gz ( 38 Mb)
Ground truth XML version: cose2gt.xml
JPEGS: OneStopEnter2cor.tar.gz ( 56 Mb)
Ground truth XML version: fose2gt.xml
JPEGS: OneStopEnter2front.tar.gz ( 71 Mb)
Ground truth XML version: cosme1gt.xml
JPEGS: OneStopMoveEnter1cor.tar.gz ( 37 Mb)
Ground truth XML version: fosme1gt.xml
JPEGS: OneStopMoveEnter1front.tar.gz ( 42 Mb)
Ground truth XML version: cosme2gt.xml
JPEGS: OneStopMoveEnter2cor.tar.gz ( 46 Mb)
Ground truth XML version: fosme2gt.xml
JPEGS: OneStopMoveEnter2front.tar.gz ( 58 Mb)
Ground truth XML version: cosmne1gt.xml
JPEGS: OneStopMoveNoEnter1cor.tar.gz ( 35 Mb)
Ground truth XML version: fosmne1gt.xml
JPEGS: OneStopMoveNoEnter1front.tar.gz ( 42 Mb)
Ground truth XML version: cosmne2gt.xml
JPEGS: OneStopMoveNoEnter2cor.tar.gz ( 21 Mb)
Ground truth XML version: fosmne2gt.xml
JPEGS: OneStopMoveNoEnter2front.tar.gz ( 27 Mb)
Ground truth XML version: cosne1gt.xml
JPEGS: OneStopNoEnter1cor.tar.gz ( 15 Mb)
Ground truth XML version: fosne1gt.xml
JPEGS: OneStopNoEnter1front.tar.gz ( 18 Mb)
Ground truth XML version: cosne2gt.xml
JPEGS: OneStopNoEnter2cor.tar.gz ( 31 Mb)
Ground truth XML version: fosne2gt.xml
JPEGS: OneStopNoEnter2front.tar.gz ( 37 Mb)
Ground truth XML version: csa1gt.xml
JPEGS: ShopAssistant1cor.tar.gz ( 26 Mb)
Ground truth XML version: fsa1gt.xml
JPEGS: ShopAssistant1front.tar.gz ( 43 Mb)
Ground truth XML version: csa2gt.xml
JPEGS: ShopAssistant2cor.tar.gz ( 69 Mb)
Ground truth XML version: fsa2gt.xml
JPEGS: ShopAssistant2front.tar.gz ( 101 Mb)
Ground truth XML version: c3ps1gt.xml
JPEGS: ThreePastShop1cor.tar.gz ( 36 Mb)
Ground truth XML version: f3ps1gt.xml
JPEGS: ThreePastShop1front.tar.gz ( 42 Mb)
Ground truth XML version: c3ps2gt.xml
JPEGS: ThreePastShop2cor.tar.gz ( 33 Mb)
Ground truth XML version: f3ps2gt.xml
JPEGS: ThreePastShop2front.tar.gz ( 39 Mb)
Ground truth XML version: c2es1gt.xml
JPEGS: TwoEnterShop1cor.tar.gz ( 35 Mb)
Ground truth XML version: f2es1gt.xml
JPEGS: TwoEnterShop1front.tar.gz ( 43 Mb)
Ground truth: c2es2gt.txt
JPEGS: TwoEnterShop2cor.tar.gz ( 35 Mb)
Ground truth: f2es2gt.txt
JPEGS: TwoEnterShop2front.tar.gz ( 42 Mb)
Ground truth XML version: c2es3gt.xml
JPEGS: TwoEnterShop3cor.tar.gz ( 24 Mb)
Ground truth XML version: f2es3gt.xml
JPEGS: TwoEnterShop3front.tar.gz ( 28 Mb)
Ground truth XML version: c2ls1gt.xml
JPEGS: TwoLeaveShop1cor.tar.gz ( 28 Mb)
Ground truth XML version: f2ls1gt.xml
JPEGS: TwoLeaveShop1front.tar.gz ( 35 Mb)
Ground truth XML version: c2ls2gt.xml
JPEGS: TwoLeaveShop2cor.tar.gz ( 13 Mb)
Ground truth XML version: f2ls2gt.xml
JPEGS: TwoLeaveShop2front.tar.gz ( 16 Mb)
© 2007 Robert Fisher