Whether literature is data is a controversial question, and the answer depends on who you ask. To answer it, one must first define what data is. Data does not really have a set definition, and its meaning varies with the kind of data you are talking about. If data is a quantitative set of points that can be analyzed, then I argue that everything is data, and therefore literature is data. But I don’t believe that treating literature as data is ‘the end of the book as we know it,’ as Stephen Marche believes. Treating literature as data, or distant reading literature and other forms of writing, adds to the experience one can get from reading. Distant reading is also completely optional, a practice one can abstain from if one chooses. In that way, books have a multidimensional character: a literary scholar can simply analyze the text of a novel, while a digital scholar can analyze the word counts and word frequencies of that same novel. The two could come to significantly different conclusions about the meaning of the novel, but this adds to the creativity the author put into it rather than taking anything away. Distant reading can only add to the experience of reading, and it can give us a grasp of ideas that could not have been discovered with human brainpower alone. Tools like Google N-Gram, or text analyses of J.K. Rowling’s novels, use distant reading and the data in literature to bring out complex patterns that carry real meaning. Projects like these, especially Google N-Gram, augment scholarship by analyzing sets of data so large that no one person, or even a whole university of people, could analyze them alone. By digitizing and searching over 6% of all literature ever published, N-Gram gives us insight into periods of history when record-keeping took place only in literature.
It allows us a holistic insight into periods of history that in the past could only be achieved by reading as many books from that time as possible. Now we have libraries upon libraries at our fingertips.
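The kind of word count and frequency analysis a digital scholar might run can be sketched in a few lines of Python. This is only a minimal illustration, not how Google N-Gram or any particular project works: the function name `word_frequencies` and the sample sentence are made up for the example, and a real analysis would read in the full text of a novel instead.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Return a Counter mapping each lowercase word to how often it appears."""
    # Treat runs of letters (with optional internal apostrophes) as words.
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    return Counter(words)


# A made-up sample sentence standing in for the full text of a novel.
sample = "The boy who lived. The boy had a scar, and the scar hurt."
counts = word_frequencies(sample)

print(sum(counts.values()))   # total number of words
print(counts.most_common(3))  # the three most frequent words with their counts
```

Even a simple count like this treats the text as data without changing the text itself, so the same novel is still there to be read closely.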
The argument over whether literature is data depends upon the definition of data. Data can carry a negative connotation, as something that removes the artistic and creative elements and turns a work into a quantitative subject. It can also be viewed simply as a form of information, from which we can establish interpretations and analyses that we can learn from. In his article “Literature is not Data: Against Digital Humanities,” Stephen Marche makes several bold statements, claiming that the introduction of digital books has brought about the “…end of the book as we know it”. However, digitizing literature offers us an additional medium through which literature can be experienced, analyzed, and interpreted in different ways. It is not the end of the book as we know it, but rather the expansion of the book as we know it. Literature has always been taken apart: quotes, syntax, characters, plots, symbols, themes, and more have been discussed, interpreted, debated, and given meaning since their origin. Digitizing books doesn’t put an end to this kind of thinking, but rather provides tools that allow us to go even further in depth. Digital tools, such as n-gram, allow us to compare thousands of works of literature in seconds. Computer algorithms allow us to delve, in as little as a few seconds, into details that would otherwise take years of collecting and studying to analyze.
Literature has always been data: we have always learned from it and used it as a tool to examine different elements of writing, the human psyche, cultural reflections, and more. Just because there are new means of examining this data doesn’t mean that the old form no longer exists. The book as we know it today still exists, and personally I prefer reading a hard copy of a book. I can choose not to use the digital tools that are being developed and experience books in the more nostalgic form. However, the fact that those digital tools exist gives me the opportunity to expand my knowledge and understanding of a book, if I choose. Digital tools open the door to examining books on a large scale: going into depth on details such as word choice while covering a wide sample, which could range from several works to an entire era’s worth of literature.
Marche, Stephen. “Literature is not Data: Against Digital Humanities.” Los Angeles Review of Books. 28 Oct. 2012: n. pag. Web. 2 Oct. 2013.
I randomly came across this so I wanted to share. It reminded me of how we were discussing interactive and impassive video games, but this gets very specific.
I included the actual image and site where I found it.
Although it offers benefits, analyzing literature as big data has become a controversial issue. The most volatile aspect of the issue is whether or not literature is data. If data is defined as information, then everything, including literature, is data. It is because data has a connotation of codes and numbers that academics like Stephen Marche suggest that this is “the end of books as we know it.” However, viewing literature as data and analyzing it as such is more like what Kim suggested in class: the “expanding of books as we know it.” Looking at literature from a different point of view is encouraged in all literature classes because, to many, the importance of literature lies in what it represents and how people understand it. If this is the case, then why are some academics up in arms about looking at it through the lens of a computer? Perhaps if distant reading were explained as macroanalysis, as suggested by Matthew Jockers, people would be more at ease. By his definition, treating literature as data creates a school of thought that complements close reading: just as macroeconomics complements microeconomics, so too would macroanalysis complement close reading.
One of the aims of this class is to show us how different forms of media change the way we experience written work, thereby augmenting reality. Similarly, using different media to analyze written work is another way of augmenting reality. It is not that the literature is changing, but rather that what can be gained from it has been augmented to enhance the learning experience.