Idiap Research Institute
ETH Zurich, CALVIN group


A Large-Scale Database of Images and Captions for Automatic Face Naming

(This dataset is temporarily unavailable.)

Mert Ozcan,    Luo Jie,    Vittorio Ferrari,    Barbara Caputo


Annotated example

FAN-Large contains over 125,000 images with accompanying text captions. It is designed as a resource for testing algorithms that learn visual models from weakly supervised data. We collected the images by querying Google Images with combinations of celebrity names and verbs corresponding to distinct upper-body poses. In [1] we give detailed statistics of the dataset and present an evaluation of several name-face association algorithms on it.

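In addition to the image-caption pairs, this release also includes detected face bounding boxes, face features, names detected in the captions, and ground-truth annotations.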

Important Notice

These images were downloaded from the internet and may be subject to copyright. We do not own the copyright of the images and provide them only for non-commercial research purposes.


FAN-Large takes up a lot of disk space (the image archive alone is 18 GB). We partitioned the dataset and the accompanying files into smaller chunks for more convenient download. The tarballs should be extracted into the same folder to obtain the proper directory structure.

| Filename | Description | Release Date | Size |
|---|---|---|---|
| images.tar.gz | Images (alternatively you can download them in parts and concatenate them as explained below: P1 P2 P3 P4 P5) | 19 September 2011 | 18 GB |
| captions.tar.gz | Captions in text format | 19 September 2011 | 42 MB |
| bbx.tar.gz | Detected face bounding boxes in text format | 19 September 2011 | 16 MB |
| face_features.tar.gz | Face features extracted using [3,4] in text format | 19 September 2011 | 2.0 GB |
| mturk_annotations.tar.gz | Ground-truth annotations in text format | 19 September 2011 | 9.0 MB |
| persons.tar.gz | Names detected in the caption using [2] | 19 September 2011 | 17 MB |
| DatasetXML.tar.gz | XML files holding the information of the dataset folder structure. They can be used for conveniently parsing the dataset. | 19 September 2011 | 23 MB |
| README | README files for the dataset folder structure. Additionally, each tarball in the list above has its own README describing its specific contents. | 19 September 2011 | 3.4 KB |

For convenience, we also provide the dataset (excluding the images) in MATLAB format (.mat). This way the dataset can be used in MATLAB without the need to parse the text files.

| Filename | Description | Release Date | Size |
|---|---|---|---|
| dataset_matlab.tar.gz | The dataset and the information extracted (e.g. face and name detections) in MATLAB file format | 19 September 2011 | 2.7 GB |

We also provide some auxiliary files that can be used with FAN-Large. These files include the web crawler we used to download the data, a GUI to display the data, scripts to extract captions from the HTML files, face and name detectors, name-face association algorithms, and scripts used to process the Amazon Mechanical Turk annotations.

| Filename | Description | Release Date | Size |
|---|---|---|---|
| auxilliary.tar.gz | Auxiliary files for the FAN-Large dataset | 19 September 2011 | 146 MB |
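
The download steps described above (joining the image parts into a single archive, then extracting every tarball into the same folder) could be scripted roughly as follows. This is only a sketch in Python using the standard library: the function names `concatenate_parts` and `extract_all` are ours, and the part file names in the usage note are placeholders, since this page does not state how the downloaded parts are named.

```python
import shutil
import tarfile
from pathlib import Path


def concatenate_parts(parts, output):
    """Join split archive parts, in the order given, into a single file."""
    with open(output, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Stream the bytes so multi-gigabyte parts are never held in memory.
                shutil.copyfileobj(src, out)
    return Path(output)


def extract_all(archive_dir, target_dir):
    """Extract every *.tar.gz found in archive_dir into one shared target_dir."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    extracted = []
    for archive in sorted(Path(archive_dir).glob("*.tar.gz")):
        with tarfile.open(archive, "r:gz") as tar:
            # All archives go to the same folder so their contents merge
            # into a single directory tree.
            tar.extractall(target)
        extracted.append(archive.name)
    return extracted
```

A call might look like `concatenate_parts(["images.part1", "images.part2"], "downloads/images.tar.gz")` followed by `extract_all("downloads", "FAN-Large")`, where the part names and folder names are hypothetical. The parts must be listed in their original order, otherwise the joined archive will not be readable.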

Related Publications and Software

[1] M. Ozcan, L. Jie, V. Ferrari and B. Caputo.
     A Large-Scale Database of Images and Captions for Automatic Face Naming
     British Machine Vision Conference (BMVC), 2011.





This work was done while Mert Ozcan was an intern at the Idiap Research Institute. Luo Jie was supported by the PASCAL Pump Priming SS2-Rob Project. Vittorio Ferrari was supported by an SNSF Professorship. Barbara Caputo was supported by the SNSF project NINAPRO.

Please report problems with this page to
Luo Jie   or    Vittorio Ferrari
Last updated 20th September 2011.