Word Storms:
Multiples of Word Clouds for Visual Comparison of Documents


Introduction

Word storms are a visualization tool for analysing text corpora. Just as a storm is a group of clouds, a word storm is a group of word clouds. Each cloud in the storm represents a subset of the corpus. For example, a storm might contain one cloud per document, or alternatively one cloud to represent all the documents written in each year, or one cloud to represent each track of an academic conference, etc.
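As a rough illustration of how a corpus could be split into the subsets behind each cloud, the sketch below groups documents by a key (the document id or the year) and counts word frequencies per group. The function name `group_word_counts`, the simple tokenizer and the toy corpus are hypothetical and are not taken from the Word Storms code.

```python
import re
from collections import Counter, defaultdict


def group_word_counts(documents, key):
    """Return {group: Counter of word frequencies} for a list of document dicts.

    `key` maps a document to the group (that is, the cloud) it belongs to.
    """
    groups = defaultdict(Counter)
    for doc in documents:
        # Naive tokenizer (assumption): lowercase runs of ASCII letters.
        words = re.findall(r"[a-z]+", doc["text"].lower())
        groups[key(doc)].update(words)
    return dict(groups)


# Hypothetical toy corpus; each dict stands for one document.
corpus = [
    {"id": "a", "year": 2011, "text": "Light alloys and composite materials"},
    {"id": "b", "year": 2011, "text": "Composite materials research"},
    {"id": "c", "year": 2012, "text": "Number theory research"},
]

per_document = group_word_counts(corpus, key=lambda d: d["id"])  # one cloud per document
per_year = group_word_counts(corpus, key=lambda d: d["year"])    # one cloud per year
```

Changing only the `key` function switches between the groupings mentioned above; each resulting counter holds the raw material for one cloud.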

Example of a coordinated Word Storm representing six scientific articles in the Complexity field.

Motivation

Although word clouds are a popular tool for visualizing documents, they are poorly suited to comparing documents, because the same word is not presented consistently across different clouds. To make comparison easier, we build the clouds in a coordinated way, placing shared words in similar positions and emphasizing the most informative words. As a result, similar documents are represented by visually similar clouds.

Standard Clouds

Standard clouds are difficult to compare. The figure shows four articles: three about materials and one about mathematics. The clouds all look very different, and it is hard to check whether a given word is present.

Coordinated Word Storm

A Coordinated Word Storm is easier to analyse. The figure shows the same four documents as before. Because shared words appear in similar locations with the same color and orientation, words are easier to find and similar documents are represented by similar clouds. In this example, the clouds make it clear that the first three articles are related and that the fourth is the most different. Moreover, because the transparency of each word's color is tied to its importance, the informative terms stand out the most. Here, words such as 'materials', 'research' and 'development' carry little information because they are very common, while words such as 'light', 'alloys', 'composite' and 'theory' tell us more.
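The coordination idea can be sketched in a few lines. The example below is a minimal illustration, not the project's algorithm: it assumes a normalized inverse document frequency as the measure of importance, maps that score to opacity, and derives each word's color and orientation from a hash of the word so that the same word is styled identically in every cloud. The function names, the palette and the toy word sets are all hypothetical, and word placement is not covered.

```python
import hashlib
import math

PALETTE = ["#1f77b4", "#d62728", "#2ca02c", "#9467bd"]  # hypothetical colors


def word_style(word):
    """Deterministic color and orientation, so a word looks the same in every cloud."""
    digest = hashlib.md5(word.encode("utf-8")).digest()
    color = PALETTE[digest[0] % len(PALETTE)]
    angle = 90 if digest[1] % 2 else 0  # vertical or horizontal
    return color, angle


def word_opacity(word, clouds):
    """Opacity in [0.2, 1.0]; words present in every cloud get the lowest value.

    Assumption: importance is a normalized inverse document frequency.
    The original tool may weight words differently.
    """
    n = len(clouds)
    df = sum(1 for words in clouds if word in words)
    if df == 0:
        raise ValueError(f"{word!r} does not appear in any cloud")
    # 0.0 when the word is in all clouds, 1.0 when it is in exactly one.
    idf = math.log(n / df) / math.log(n) if n > 1 else 1.0
    return 0.2 + 0.8 * idf


# Toy word sets loosely modeled on the example above, not the real data.
clouds = [
    {"materials", "research", "light"},
    {"materials", "research", "alloys"},
    {"materials", "research", "composite"},
    {"materials", "research", "theory"},
]

common = word_opacity("materials", clouds)  # shared by every cloud, so faint
rare = word_opacity("light", clouds)        # unique to one cloud, so fully opaque
```

Hashing the word, rather than assigning styles cloud by cloud, is one simple way to guarantee consistency without sharing any state between clouds.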

Examples of Applications

Word Storms can be used in different scenarios. Here, we show some real deployment examples:

Create your Word Storms!

Analyse your documents by creating and customizing your own word storms. You can control how the clouds look by setting the font or the number of words, and you can also decide how terms are selected and how the important ones are emphasized.

Download the code from GitHub.

About

This project was developed by Quim Castellà and Charles Sutton at the University of Edinburgh.

More information can be found in the article:
Word Storms: Multiples of Word Clouds for Visual Comparison of Documents. Quim Castella and Charles Sutton [arXiv]

This research project was made possible by funding from the Engineering and Physical Sciences Research Council [grant number EP/J00104X/1].