Joyce and Mathematics

By Frank O’Shea

If I were to tell you that this is an article about Joyce and mathematics, I wonder which would be more likely to alarm you. In Melbourne, given the work of a number of hardworking Joycean enthusiasts, I suspect that mathematics would be the bigger turn-off. So I will start with Joyce.

The Americans had great fun with the Dublin man, partly because they did not understand him, but mainly because he provided a rich mine for doctoral theses. At the University of Wisconsin back in the 1940s, a group of research students typed out the complete text of Ulysses, cut out all the individual words and sorted them into piles, one pile for each different word.

Then they counted each pile to find the number of times that word appeared in the text. Here is one thing they found:

 

Word          Occurrences   Rank
I             2653          10
say           265           100
big           26            1000
orangefiery   2             10000

The word “I” occurred 2653 times and was the tenth most popular word. The word “say” was used 265 times and was the 100th most popular. The pattern continued with the 1000th most popular word, “big”, which appeared 26 times in the text, and even that strange word at the end, which occurs twice, was the 10,000th most used. In other words, each time the rank goes up tenfold, the number of occurrences drops to roughly a tenth.
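For anyone who would like to check the neatness without an equation, here is a minimal Python sketch that multiplies each word's rank by its number of occurrences, using only the four sets of figures quoted above. The names `ulysses_rows` and `rank_times_count` are mine, chosen purely for illustration.

```python
# Rank and number of occurrences for the four Ulysses words quoted above.
ulysses_rows = {
    "I": (10, 2653),
    "say": (100, 265),
    "big": (1000, 26),
    "orangefiery": (10000, 2),
}


def rank_times_count(rows):
    """Return {word: rank * occurrences} for every row."""
    return {word: rank * count for word, (rank, count) in rows.items()}


products = rank_times_count(ulysses_rows)
for word, product in products.items():
    print(f"{word:12s} {product}")
```

The four products all fall between 20,000 and 27,000, even though the ranks themselves differ by a factor of a thousand.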

Ulysses in American Colleges. Credit Cartoonstock


No mathematician worth his chalkdust would bypass such neatness without making some comment. Since we must dismiss the possibility that Joyce was carefully counting how many times he was using words, there must be something else happening. It is tempting to put it in an equation, but I am conscious that, as Stephen Hawking was warned, every equation halves the number of your readers – and he was talking about the kind of people who read Stephen Hawking!

The data come from a book titled ‘Alex Through the Looking-Glass’ by British writer Alex Bellos. He was as intrigued as you and I by this Joycean tidiness, so he did a similar analysis of the words in his own book, using a computer program which cut down the work to considerably less than the 14 months it had taken the folk at Wisconsin. He found that the word he used most frequently was “the”, which appeared 10 times more often than the tenth-ranked word “was”, 100 times more often than his 100th-ranked word “who” and about 1000 times more often than the 1000th-ranked word “spirals”.
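The article does not say what program Bellos used, so what follows is only a rough sketch of how such a count might be done in Python with the standard library. The function name `rank_words` and the sample sentence are my own inventions, and the simple pattern used to split words ignores curly apostrophes and accented letters.

```python
import re
from collections import Counter


def rank_words(text):
    """Return (rank, word, count) tuples, most frequent word first."""
    # Lower-case everything so that "The" and "the" land in the same pile.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    return [
        (rank, word, count)
        for rank, (word, count) in enumerate(counts.most_common(), start=1)
    ]


sample = "the cat and the dog and the bird"
ranking = rank_words(sample)
for rank, word, count in ranking:
    print(rank, word, count)
```

Fed the full text of a book instead of the sample sentence, the same function produces the kind of rank and occurrence list that the Wisconsin students built with scissors.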

It turns out that this pattern is common to books in all languages. Moreover, it has been found that in a typical book about 50 per cent of the different words are used only once. These are hardly earth-shattering results, but they are surprising. They form particular instances of what is called Zipf’s law, which is not confined to books but applies to many different collections of figures.
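The claim about words used only once is also easy to test on any text you have to hand. This is a small Python sketch under my own assumptions: the function name `share_used_once` is hypothetical, and it treats the 50 per cent as a share of the different words in the text rather than of the total word count.

```python
from collections import Counter


def share_used_once(words):
    """Return the fraction of different words that occur exactly once."""
    counts = Counter(words)
    if not counts:
        return 0.0
    used_once = sum(1 for count in counts.values() if count == 1)
    return used_once / len(counts)


words = "the cat and the dog and the bird".split()
share = share_used_once(words)
print(share)
```

In the toy sentence three of the five different words appear once, which is far too small a sample to say anything about real books.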

For example, populations of people in large metropolitan areas obey Zipf’s law. The population of New York/New Jersey is 10 times greater than that of the tenth-largest city in the US, Cleveland, which is in turn 10 times greater than that of the 100th-ranked city, Hamilton/Middletown. The wealth of the richest person in the country is 10 times greater than that of the 10th-richest, 100 times that of the 100th-richest and so on.

The latter is a particular example of what is known as Pareto’s Law or the 80-20 rule. This says that 80 per cent of the wealth of a country is in the hands of 20 per cent of the people, 80 per cent of the output of a factory is down to 20 per cent of the employees, 80 per cent of the profit of a business comes from 20 per cent of the customers, and we achieve 80 per cent of our happiness during 20 per cent of our time.

Faced with delightful little pieces of useless information like those, how could someone think that orangefiery Joyce was more interesting?