Why St Paul’s?
Before the Great Fire of London in 1666, Old St Paul’s stood on the same site as the current cathedral. The fire destroyed Old St Paul’s, and the cathedral became a central part of a major rebuilding program in the city. In this way, St Paul’s is symbolic of London’s rebirth and can serve as a measure of general attitudes during the period.

Methodology
In examining Reconstructing Utopias in London, one of our main questions is who the “Utopia” was built for and which groups of people were considered in its design. To address this, we look at how sentiment toward St Paul’s changes between 1600 and 1700, and combine that with hapax richness to track the complexity of texts.
Processing Texts for Sentiment Analysis
- After cleaning our data set, we filtered for texts containing terms relevant to St Paul’s (e.g., St Paul’s, Saint Paul’s, St. Paule, etc.).
- Then, the 10 words before and after the key term were extracted and placed into the ‘text’ column. This was done for each occurrence of the key term.
- The idea behind extracting the sections around the keyword is that we can leave out text that is irrelevant to our sentiment analysis.
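
The extraction step above can be sketched in Python as follows. This is a minimal illustration, not the project's actual code: the `KEY_TERM` pattern and the `keyword_windows` function are hypothetical names, the pattern covers only a few spelling variants, and splitting on whitespace is an assumed tokenization.

```python
import re

# Hypothetical pattern for a few spelling variants of the key term
# (St Paul's, Saint Paul's, St. Paule). The project's full term list
# is not given, so this is an assumption.
KEY_TERM = re.compile(r"\b(?:st\.?|saint)\s+paule?(?:[’']s)?\b", re.IGNORECASE)


def keyword_windows(text, width=10):
    """Return one snippet per key-term occurrence.

    Each snippet holds up to `width` whitespace-separated words before
    the match, the matched term itself, and up to `width` words after it.
    """
    windows = []
    for match in KEY_TERM.finditer(text):
        words_before = text[:match.start()].split()
        words_after = text[match.end():].split()
        # Guard width == 0, because list[-0:] would return the whole list.
        before = words_before[-width:] if width > 0 else []
        after = words_after[:width]
        windows.append(" ".join(before + [match.group(0)] + after))
    return windows
```

Each returned snippet would correspond to one row of the ‘text’ column, so a text that mentions the key term three times contributes three snippets.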

For our analysis, we used the Bing lexicon, which categorizes terms as either positive or negative. We chose this lexicon specifically because we found it was mostly accurate for texts over 200 words, which fit most of our extracted text lengths. Finally, we calculated sentiment as the proportion of positive tokens among all sentiment-bearing tokens:
Sentiment = # of positive tokens / (# of positive tokens + # of negative tokens)
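As a rough illustration of this ratio, the sketch below scores a snippet against two tiny word lists. The lists are hypothetical stand-ins chosen for the example, not entries taken from the Bing lexicon, and the lowercase regex tokenizer and the `sentiment_ratio` name are likewise assumptions.

```python
import re

# Hypothetical stand-in word lists for illustration only; a real run
# would load the full positive/negative lexicon instead.
POSITIVE = {"glorious", "beautiful", "noble", "worthy"}
NEGATIVE = {"ruin", "corrupt", "waste", "wicked"}


def sentiment_ratio(text):
    """Return positive / (positive + negative) token counts for `text`.

    Returns None when the text contains no sentiment-bearing tokens,
    since the ratio is undefined in that case.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    positive = sum(token in POSITIVE for token in tokens)
    negative = sum(token in NEGATIVE for token in tokens)
    if positive + negative == 0:
        return None
    return positive / (positive + negative)
```

A score of 0.5 means positive and negative tokens are balanced; values closer to 1 are more positive and values closer to 0 more negative. Words missing from the lexicon do not affect the score.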
Sentiment Analysis

Our initial results indicate that changes in sentiment reflect a few major historical events relevant to the reconstruction:
- 1666: the Great Fire of London
- 1672: Coal taxes are allocated to the St Paul’s project
- 1678-86: Coal dues funding is halted, prompting widespread requests for private donations
- Mid-1690s: After more than two decades of construction, the most intense backlash against St Paul’s arises
- 1697: The Chancel is consecrated, though not the rest of the cathedral

Hapax Richness
- Hapax richness comes from the term hapax legomenon (plural: hapax legomena), a word or expression that occurs only once within a context.
- Hapax richness is a way of assessing lexical richness, which is how diverse or how complex a text is.
- The equation we used to calculate hapax richness is included below:
Hapax Richness = total # of words that occur once / total # of words
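A minimal Python sketch of this calculation is shown below. The `hapax_richness` name and the lowercase alphabetic tokenizer are assumptions made for illustration; a different tokenizer or spelling normalization would change the counts, especially for early modern texts with variable spelling.

```python
import re
from collections import Counter


def hapax_richness(text):
    """Return (# of words occurring exactly once) / (total # of words).

    Returns None for a text with no words, where the ratio is undefined.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return None
    counts = Counter(tokens)
    hapaxes = sum(1 for count in counts.values() if count == 1)
    return hapaxes / len(tokens)
```

For the phrase "the church and the dome", three of the five words (church, and, dome) occur once, giving a richness of 0.6.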
Conclusions
Comparing hapax richness with sentiment, we can see that although the overall trends are similar, the two measures move in opposite directions during key historical events. One possible explanation is the genre of the texts. For instance, in 1697 sentiment dips while hapax richness peaks.
During this time, the chancel (the part of the church near the altar) was consecrated, or made sacred, though the rest of the cathedral was not. Because St Paul’s was taking so long to finish, a visible milestone was needed so that people would feel inspired by progress toward the completion of construction.
Prior to the consecration, St Paul’s was facing backlash, and people often used poetry as a form of protest. Since poetry is often more descriptive and dense than everyday letters or pamphlets, this could explain the increase in hapax richness.
After the consecration, sentiment increases and hapax richness decreases, which suggests that the more complex texts protesting St Paul’s (e.g., protest poetry) became less frequent. In the future, hapax richness and sentiment analysis could possibly be combined to form a testable hypothesis about the genres of texts.