What is a word cloud, what are word clouds good for, and what are they not good for?

Ray Poynter, 11 January 2023

Word clouds have grown in popularity since they first appeared about twenty years ago, expanding rapidly when Wordle was launched. In this post, I look at what word clouds are, what they are good for, and what they are not good for.

Tag clouds that focus on text and frequency
A tag cloud is a visual representation of information that uses the size of objects to express their weight or importance. A word cloud is a tag cloud that displays information from text. The text can be collected from a wide variety of sources, such as articles, reviews, press releases, interviews, discussions, and open-ended comments collected in surveys.
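As a rough illustration of how size can express weight, the sketch below maps each word's count to a font size by simple linear scaling. The function name, the size range, and the counts are all made up for the example; real word cloud tools may use different scaling rules.

```python
def font_sizes(weights, min_size=12, max_size=72):
    """Map each item's weight to a font size by linear interpolation."""
    if not weights:
        return {}
    lo, hi = min(weights.values()), max(weights.values())
    if hi == lo:
        # All weights are equal, so give every item the same mid-range size.
        mid = (min_size + max_size) / 2
        return {item: mid for item in weights}
    scale = (max_size - min_size) / (hi - lo)
    return {item: min_size + (w - lo) * scale for item, w in weights.items()}


# Illustrative counts only, not taken from any real text.
sizes = font_sizes({"america": 30, "ukraine": 15, "economy": 10})
for word, size in sizes.items():
    print(f"{word}: {size:.0f}pt")
```

The most frequent word gets the largest size, the least frequent gets the smallest, and everything else falls in between.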

The word cloud below (created with Word Cloud Plus) uses the text of President Biden’s 2022 State of the Union speech. The more frequently a word appears in President Biden’s speech, the larger it appears in the word cloud.

The word cloud does not show all of the frequently used words. Words such as ‘a’, ‘the’ and ‘and’ are typically suppressed, i.e. they are not displayed in the word cloud. These suppressed words are known as ‘stop words’.

The most frequently used words were America, Americans and American, which is hardly surprising when working with a President’s State of the Union speech. Words such as Ukraine, Russia, Putin, Justice, Covid-19, Health, and Economy may all be indicators of key topic areas.

By contrast with these terms, the word ‘Tonight’ is used as a narrative device, for example “so tonight i’m offering a unity agenda for the nation” and “but tonight i say that we will never just accept living with covid-19”. In the case of this particular data set, it might make sense to treat the word ‘tonight’ as a stop word.
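The counting behind a word cloud can be sketched in a few lines of Python. This is a minimal illustration, not how Word Cloud Plus works internally: the stop-word list is a tiny made-up sample, and the function name and its extra_stop_words parameter are assumptions for the example. It shows how adding ‘tonight’ as an extra stop word removes it from the counts.

```python
import re
from collections import Counter

# A tiny illustrative stop-word list; real tools use much longer ones.
STOP_WORDS = {
    "a", "an", "and", "the", "of", "to", "in", "for", "that",
    "we", "i", "is", "will", "with", "so", "but", "just",
}


def word_frequencies(text, extra_stop_words=()):
    """Count words in text, skipping the default and any extra stop words."""
    stop = STOP_WORDS | set(extra_stop_words)
    # Lower-case the text and keep words with inner apostrophes or hyphens.
    words = re.findall(r"[a-z0-9][a-z0-9'-]*", text.lower())
    return Counter(w for w in words if w not in stop)


text = (
    "so tonight i'm offering a unity agenda for the nation. "
    "but tonight i say that we will never just accept living with covid-19"
)

default = word_frequencies(text)
custom = word_frequencies(text, extra_stop_words={"tonight"})

print(default.most_common(3))
print("tonight" in custom)
```

With the default list, ‘tonight’ is the most frequent word in these two sentences. With the extra stop word, it disappears and the topic words are left.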

The two key uses for Word Clouds
There are two key uses for Word Clouds. The first is to help find and understand the messages in the text. The second is to help convey information to an audience. It is also, of course, possible to use a Word Cloud for both finding and conveying information.

Finding Messages in Text
Word frequencies are a useful way of summarising the data if the question being answered is very simple. For example, if people are asked to state one word that they feel best describes something, a summary of the frequencies is a sufficient analysis. An example of this approach was the analysis of the recent UK Prime Minister, Liz Truss. Two separate word clouds were enough to identify that in September, the Government was shown as Determined and Strong, and by the start of October, the key words were Incompetent, Useless, and Untrustworthy.

However, in most cases, a word cloud will be more of a starting point to the analysis, as opposed to a complete solution. Turning to the example of President Biden’s speech, we can get a clearer read by tweaking the algorithm to favor phrases, rather than words. In Word Cloud Plus we describe this as the Combination Counting algorithm.

These phrases give a clearer picture of the key topics, including: Electric Vehicles, Fight Inflation, and Child Care. However, it is still necessary to dig deeper. Consider the phrase “Burn Pits”. By digging into the original text we see: "Breathing in toxic smoke from burn pits", "these burn pits that incinerate waste", "just yards from burn pits the size of football fields", and "but cancer from prolonged exposure to burn pits ravaged heath’s lungs and body".
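To give a feel for counting phrases rather than single words, here is a minimal sketch that counts two-word phrases (bigrams). It is a simple stand-in and should not be read as the Combination Counting algorithm in Word Cloud Plus, whose details are not described here. The stop-word list and function name are assumptions for the example.

```python
import re
from collections import Counter

# A tiny illustrative stop-word list for this example only.
STOP_WORDS = {"a", "the", "and", "of", "to", "in", "from", "that", "these", "just", "but"}


def phrase_frequencies(text, n=2):
    """Count n-word phrases that contain no stop words."""
    counts = Counter()
    # Split on punctuation first so that phrases do not span sentences.
    for chunk in re.split(r"[.,;:!?\n]+", text.lower()):
        words = re.findall(r"[a-z0-9][a-z0-9'-]*", chunk)
        for i in range(len(words) - n + 1):
            gram = words[i:i + n]
            if any(w in STOP_WORDS for w in gram):
                continue
            counts[" ".join(gram)] += 1
    return counts


text = (
    "breathing in toxic smoke from burn pits. "
    "these burn pits that incinerate waste. "
    "just yards from burn pits the size of football fields."
)

phrases = phrase_frequencies(text)
print(phrases.most_common(3))
```

In this small sample, ‘burn pits’ stands out as a phrase, whereas a single-word count would report ‘burn’ and ‘pits’ separately.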

In general, word clouds can often be used to speed up your analysis of text. However, in many cases you still need to access the actual text, because the word counts are unlikely to be enough.
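Going back to the actual text for a given phrase can also be partly automated. The sketch below pulls out every sentence that contains a phrase, so it can be read in context. The function name is hypothetical, the sentence splitting is deliberately naive, and the sample text is made up of fragments quoted in this post.

```python
import re


def sentences_containing(text, phrase):
    """Return the sentences in text that contain phrase (case-insensitive)."""
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
    return [s for s in sentences if pattern.search(s)]


text = (
    "breathing in toxic smoke from burn pits. "
    "so tonight i'm offering a unity agenda for the nation. "
    "these burn pits that incinerate waste."
)

hits = sentences_containing(text, "burn pits")
for sentence in hits:
    print(sentence)
```

Reading the matching sentences side by side is often the quickest way to see what a frequently used phrase actually refers to.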

Using a Word Cloud to Convey Information
Regardless of whether a word cloud was used to help find the meaning in the text, a word cloud can be used to help visualize the messages. In the cloud below I have drawn on other information about President Biden’s plans, as well as a more detailed reading of the text. I have also used color to highlight associations and changed the layout of the items.

The word cloud illustrates that the 2022 State of the Union speech covers a wide range of topics, such as Ukraine and the burn pits. But the word cloud can also be used to illustrate the centrality to President Biden of his ‘American Rescue Plan’.

What can’t Word Clouds Do?
On their own, the only thing a word cloud will do is illustrate the frequencies of words in the text. When answering a question such as “Which one word describes the customer service you have received?”, the word counts can readily convey the main message in the text.

In most cases, a Word Cloud is at most a starting point for the search for meaning. If the meaning in the text is closely linked to the frequency of terms, then a Word Cloud can give you a start, identifying phrases that occur frequently.

However, if you are looking for a deeper meaning, for example looking at the underlying beliefs and motivations, a word cloud is unlikely to be helpful. To draw an analogy from the world of art, we can’t understand Da Vinci’s Mona Lisa by counting the number of paint strokes, or the frequency of blue versus green.

Do you want to try it yourself?
You can use Word Cloud Plus by setting up your free account. If you want to use the data from the 2022 State of the Union speech, you can access it here.

PS: For many years, Wordle (wordle.net) was the go-to tool for creating word clouds. However, it no longer appears to be available. You can read more about tag clouds and Wordle (not the word game) at the Wikipedia entry for Tag Clouds.