A Large-Scale Database of Images and Captions for Automatic Face Naming
(This dataset is temporarily unavailable.)
FAN-Large contains over 125,000 images with accompanying text captions. It is designed as a resource for testing algorithms that learn visual models from weakly supervised data.
We collected the images by querying Google Images with combinations of celebrity names and verbs corresponding to distinct upper-body poses. In the accompanying paper (Ozcan et al., BMVC 2011), we give detailed statistics of the dataset
and present an evaluation of several name-face association algorithms on it.
In addition to the image-caption pairs, this release also includes:
- ground-truth association of faces to names
- ground-truth association of body poses to verbs
- names extracted automatically from the captions using 
- face bounding boxes detected automatically from the images using [3,4] (these facilitate a direct comparison with our results)
- auxiliary programs and scripts that we used to collect and evaluate FAN-Large
These images were downloaded from the internet and may be subject to copyright. We do not own the copyright of the images and provide them only for non-commercial research purposes.
FAN-Large requires a lot of disk space (the image archive alone is 18 GB). We partitioned the dataset and the accompanying files into smaller chunks for more convenient download. All tarballs should be extracted into the same folder to obtain the proper directory structure.
| File | Description | Date | Size |
|---|---|---|---|
| images.tar.gz | Images (alternatively you can download them in parts and concatenate them as explained below: P1 P2 P3 P4 P5) | 19 September 2011 | 18 GB |
| captions.tar.gz | Captions in text format | 19 September 2011 | 42 MB |
| bbx.tar.gz | Detected face bounding-boxes in text format. | 19 September 2011 | 16 MB |
| face_features.tar.gz | Face features extracted using [3,4] in text format | 19 September 2011 | 2.0 GB |
| mturk_annotations.tar.gz | Ground-truth annotations in text format. | 19 September 2011 | 9.0 MB |
| persons.tar.gz | Names detected in the caption using  | 19 September 2011 | 17 MB |
| DatasetXML.tar.gz | XML files holding the information of the dataset folder structure. They can be used for conveniently parsing the dataset. | 19 September 2011 | 23 MB |
| README | README files for the dataset folder structure. Additionally, each tarball in the list above has its own README describing its specific contents. | 19 September 2011 | 3.4 KB |
| dataset_matlab.tar.gz | The dataset and the information extracted (e.g. face and name detections) in matlab file format. | 19 September 2011 | 2.7 GB |
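The download steps described above (joining the image parts into one archive, then extracting every tarball into the same folder) could be scripted roughly as follows. This is only a minimal sketch: the part file names and the `FAN-Large` destination folder in the usage comment are hypothetical placeholders, not the names actually used by the release, and the helper functions `concatenate_parts` and `extract_all` are introduced here for illustration.

```python
import shutil
import tarfile
from pathlib import Path


def concatenate_parts(parts, target):
    """Join downloaded chunks, in the given order, into a single archive."""
    with open(target, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Stream the bytes so large chunks are never fully held in memory.
                shutil.copyfileobj(src, out)
    return Path(target)


def extract_all(archives, dest):
    """Extract every gzip-compressed tarball into the same destination folder."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    for archive in archives:
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(dest)
    return dest


# Example usage (hypothetical part names; adjust to the files you downloaded):
#   parts = [f"images.tar.gz.part{i}" for i in range(1, 6)]
#   concatenate_parts(parts, "images.tar.gz")
#   extract_all(["images.tar.gz", "captions.tar.gz", "bbx.tar.gz"], "FAN-Large")
```

The parts must be passed in their original order, since the archive is only valid when the bytes are joined in sequence.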
We also provide auxiliary files that can be used with FAN-Large. These include the web crawler we used to download the data, a GUI to display the data, scripts to extract captions from the HTML files, face and name detectors, name-face association algorithms, and scripts used to process the Amazon Mechanical Turk annotations.
| File | Description | Date | Size |
|---|---|---|---|
| auxilliary.tar.gz | Auxiliary files for the FAN-Large dataset | 19 September 2011 | 146 MB |
Related Publications and Software
[1] M. Ozcan, L. Jie, V. Ferrari and B. Caputo.
A Large-Scale Database of Images and Captions for Automatic Face Naming
British Machine Vision Conference (BMVC), 2011.
This work was done while Mert Ozcan was an intern at the Idiap Research Institute. Luo Jie was supported by the PASCAL Pump Priming SS2-Rob project, Vittorio Ferrari by an SNSF Professorship, and Barbara Caputo by the SNSF project NINAPRO.