American Networks, American Nerds

I’m very pleased to announce that I’ve been invited to speak at Emory University’s new Digital Scholarship Commons next week. If you find yourself in the vicinity you won’t want to miss it. Here are the details:

The Digital Scholarship Commons Presents Ed Finn, Ph.D.: “American Networks, American Nerds”
Wednesday, November 2, 4:00 pm – 5:00 pm
Research Commons, Third Floor, Robert W. Woodruff Library

Ed Finn, a recent Stanford graduate and University Innovation Fellow at Arizona State University, will speak about his network analysis of Amazon consumer reviews of David Foster Wallace and Junot Díaz, explaining how these differ from literary critics’ assessments. You can read about Dr. Finn’s work in the New York Times.
This talk explores changing systems of literary reputation in contemporary American fiction through two case studies: Junot Díaz and David Foster Wallace. Long-established models of literary production are changing dramatically as the digital era continues to blur the divisions between authors, critics and readers. Millions of cultural consumers are now empowered to participate in previously closed literary conversations and to express forms of mass distinction through their purchases and reviews of books. The bookselling behemoth Amazon has been collecting such information from its users since 1996, assembling a rich ecology of cultural data. Drawing on Amazon’s archive and a set of professional book reviews, I analyze the literary networks that readers have created for Wallace and Díaz through their collective acts of distinction. Tracing contemporary shifts in critical and commercial reception, I argue that both writers use style as a way to reinvent authorship for a hyper-mediated age. By redrawing the boundaries of dialect and slang in American English, they promote radical revisions to contemporary concepts of literary identity and community.

Pamphlet #3

I’m very excited to announce that a version of my essay on David Foster Wallace has just been published online as the Stanford Literary Lab’s third pamphlet. Here’s the lead-in:

If there is one thing to be learned from David Foster Wallace, it is that cultural transmission is a tricky game. This was a problem Wallace confronted as a literary professional, a university-based writer during what Mark McGurl has called the Program Era. But it was also a philosophical issue he grappled with on a deep level as he struggled to combat his own loneliness through writing. To really study this question we need to look beyond the symbolic markets of prestige to the real market, the site of mass literary consumption, where authors succeed or fail based on their ability to speak to that most diverse and complicated of readerships: the general public. Unless we study what I call the social lives of books, we make the mistake of keeping literature in the same ascetic laboratory that Wallace tried to break out of with his intense authorial focus on popular culture, mass media, and everyday life.

Writing in the Real World

It’s been a summer of major changes in my life: completing grad school and moving on to my first job as a fellow at Arizona State University. As I adjust to a new position where I am “doing” almost as much as I am “thinking” (for a very word-based, university definition of doing), the impossible has occurred. I’ve begun to miss the abundant time I used to spend just sitting at the keyboard, writing. And thinking about writing. And fiddling.

I still do a fair amount of sitting and fiddling in the new job, of course, but my full agenda there does not include any special time for research. There is no gilt-edged appointment in my office Outlook calendar. I need to make that time myself, and I’ve begun to wish I was a faster writer. I mean, I’m fast enough at drafting proposals, emails and memos, but I don’t have the prodigious speed that some academics seem to have for polishing off whole essays in an evening. I can barely read whole essays in an evening.

So my ambitions for this year are to practice the arts of making time and of thinking through problems on the go. It’s dawned on me that my new slate of responsibilities is not a temporary condition, and that the period of graduate navel-gazing is done forever.

The positive side of this new reality is that I am actually starting to enjoy working on my own stuff once again. It’s still a challenge of will to revise dissertation work for publication, but I am really starting to look forward to some new projects and fresh directions. Who knows, maybe I’ll even put more time into this blog?

On Reading The Pale King

This was a strange experience for me, having recently spent a lot of time thinking about Wallace for a chapter of my dissertation. Somehow reading this unfinished novel brought the sad fact of his death unavoidably to mind in a way that my other DFW research never did.

The novel itself is really enjoyable–I could see Wallace extending himself into the new style that he was struggling to develop. The various chapters are full of life and intelligence, and seem in a sense less guarded and cerebral than his previous fiction. I found the whole setting of the novel (an IRS center in the 1980s) to be hilarious and was drawn into the book in a way that I usually am not by this kind of postmodern fiction (though I love it anyway). That quality was particularly surprising because the book doesn’t really cohere as a novel and clearly was part of something larger that will never be.

At the end of the text, Michael Pietsch, Wallace’s editor, chose to include a collection of notes drawn from the author’s working files on characters, potential plot twists and various endings for the book. (Unless, of course, this was also some kind of postmodern DFW gag, but it didn’t read that way.) This closing chapter was what really brought Wallace’s death home for me. I felt as if I’d been let in behind the curtain and seen the magician preparing his next trick, and he’d seen me see him, and there we both were, feeling upset and depressed and unable to think of a way to correct the situation. With most authors I would find this kind of glimpse into the voyaging writerly mind intriguing. In a different context I would probably enjoy this kind of thing with Wallace, too–I hope to check out his archives at the Ransom Center in Austin one day. But here, at the end of The Pale King, it just made me wish he’d been able to finish the book.

Culturomics: Not Quite Yet

Well, it’s time to stick my oar in on the Google Ngrams discussion. While a number of computational linguistics scholars have pointed out the pitfalls of Google’s latest toy, I think I have a unique perspective to offer on the issue. I understand what the Ngrams creators were trying to do, because I’m trying to do exactly the same thing: get some things cooking. My research on contemporary literary reception is not exhaustive or dependent on highly complex statistical models. That’s because literary reception is a huge, multiply mediated field ranging from café conversations to book reviews, and my access to data is limited. But where I have adopted a “core sample” model, choosing a few accessible data sources to make some robust but limited generalizations about readers and reading culture, Google has gone for the moon shot. By creating an opaque front-end to their 5 million book archive, they offer the illusion of a truly global Ngram search, and they emphasize the scale of their ambition by claiming their tool isn’t merely a corpus search mechanism but the portal to a new science of “culturomics.”

As my colleague Matthew Jockers noted in his own oar-insertion post, “To call these charts representations of ‘culture’ is, I think, a dangerous move.” He goes on to suggest it “may be,” but I have to go a bit farther and say “definitely not.” Here’s the problem: we can’t get reasonable, arguable claims about things like culture or literary history unless the limitations of the corpus are acknowledged and dealt with from the outset. Typically, projects like this limit themselves either by going too small or too big, and Google has gone way big. Let me explain what I mean.

Too Small:

The opposite example would be a research project on a small, meticulously tended patch of texts. Classic humanities research, really, but of limited usefulness for making grounded claims about larger literary-historical or cultural issues (at least until enough such small projects emerge with commensurable results that we can begin to construct some causal chains). Traditional humanities as a whole is full of projects that are “too small” for making broad cultural claims because they are limited to a small data footprint. The walled garden of closely tended results is fascinating and lovely to explore, but it’s difficult or impossible to compare the work to anything outside.

Too Big:

Google, by contrast, flies off the macro end of the scale by trying to do too much and claim too much. The corpus is amazing, but nevertheless limited and contingent in many ways. As others have pointed out, the OCR is problematic; the metadata is sloppy; the text distribution almost certainly has a number of biases (how could it not? What is the gender, historical and language distribution of the world’s universal library supposed to be anyway?). By choosing to obscure these limitations instead of illuminating them, Google turns “culturomics” into a toy, not a tool.

Fortunately, the data is all there, and these problems can be fixed. Google loves a good algorithm and will presumably figure out solutions to the various technical problems. With luck (and the persistence of its academic research partners) the Ngrams team will also come to acknowledge and reveal the limitations on its data. Once that happens, we can really get cooking and make a clear case for when this vast corpus really does reveal broad cultural trends.

For now, Ngrams is a blunt object but it still has some value as a tool. I’ll post some examples next time.

Stanford Dissertation Browser

While I’ve had the dissertation specter floating before me for several years now, it has never looked so beautiful. Created by two Stanford graduate students in Computer Science, the Stanford Dissertation Browser uses topic modeling to graph recent dissertations by their disciplinary affiliation. The visualization was created with Flare, successor to Prefuse, which I was using for my own visualizations for a while (this being Stanford, the guy who created all of these visualization tools, Jeffrey Heer, is advising the project).

I’m looking forward to adding my dissertation to the mix next June. I wonder where it will line up?

Talking in Tempe

I had a great time speaking on Saturday at the Southwest English Graduate Student Symposium, or SWEGS, according to its intimidating acronym. This was a good way to introduce some of my research on chapter 2 of the dissertation, which is a case study on Toni Morrison’s oeuvre. It was great to meet some other members of the local English grad student community, and I was shocked (and pleased) to encounter a fellow panelist who’s also looking at Amazon’s recommendation networks. I’m looking forward to sharing ideas with him down the road.

This was the first stop in the 2010 road tour, which will include Stanford, New Orleans and, hopefully, London. I’ll be updating the presentation with new bells and whistles as I make more progress on some new ways of looking at references in book reviews.

Until then, back to the mines.

A Big Year

2010! Where is my jetpack?

It’s been a busy year so far, and I’m hoping to keep up with this new, futuristic energy. After a bit of a slow autumn (we use the term metaphorically here in Phoenix) and the usual distraction of the holidays, I finally got to check a few major items off my list this week. Yesterday I completed a funding application for the Stanford Humanities Center–they offer a few dissertation fellowships each year. Today I finally–FINALLY–finished revising a paper submission based on my Pynchon chapter and sent it back for round two.

Now it’s time to buckle down and return to data analysis. I’ve assembled a great pile of book reviews and recommendations in a MySQL database, and I have a few discrete challenges ahead of me:

First, I need to come up with an effective way to identify and then tag proper nouns in book reviews. This is easy to do badly and then clean up by hand, which is what I did for the last chapter. But there are a lot of Morrison reviews out there, so now I really need a computer for this. As a first pass/proof of concept I’m hand-editing a little “dictionary” of all the proper noun literary references made in professional reviews of Morrison’s work. Then I’ll write some kind of program to search for and tag those references in the reviews.
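For the curious, here is a minimal sketch of what such a dictionary-based tagger could look like in Python. It is only an illustration under stated assumptions: the dictionary entries, the function names (`build_pattern`, `tag_references`) and the sample sentence are all hypothetical, and it uses plain regular-expression matching rather than any particular natural language toolkit.

```python
import re

# Hypothetical hand-edited "dictionary": surface form -> canonical name.
# A real one would be compiled from the professional reviews themselves.
REFERENCE_DICTIONARY = {
    "William Faulkner": "William Faulkner",
    "Faulkner": "William Faulkner",
    "Woolf": "Virginia Woolf",
    "Beloved": "Beloved",
}


def build_pattern(dictionary):
    """Compile one regex that matches any surface form as a whole word."""
    # Longest forms first, so "William Faulkner" wins over plain "Faulkner".
    forms = sorted(dictionary, key=len, reverse=True)
    alternation = "|".join(re.escape(form) for form in forms)
    return re.compile(r"\b(?:" + alternation + r")\b")


def tag_references(text, dictionary):
    """Return (offset, matched text, canonical name) for each reference found."""
    pattern = build_pattern(dictionary)
    return [
        (match.start(), match.group(0), dictionary[match.group(0)])
        for match in pattern.finditer(text)
    ]


if __name__ == "__main__":
    sample = "Like William Faulkner and Woolf, Morrison bends time."
    for offset, surface, canonical in tag_references(sample, REFERENCE_DICTIONARY):
        print(offset, surface, "->", canonical)
```

Matching is case-sensitive on purpose, since capitalization is most of what separates a proper noun like Beloved from an ordinary word. It will still produce false positives (a sentence that opens with "Beloved by critics..."), so some hand cleanup would remain.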

Once I get that figured out, the second trial process is going to be creating network graphs of these literary references based on collocations. I think I’ll probably start by defining links as “in the same paragraph,” but this might change depending on how useful the graphs end up being.
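As a rough sketch of the "in the same paragraph" rule, the Python below counts how often each pair of references shares a paragraph. It assumes the tagging step has already produced, for each paragraph, a list of canonical names; the function name `cooccurrence_edges` and the sample data are hypothetical, and the weighted edge list it returns would still need to be handed to a graphing tool.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(paragraph_refs):
    """Count paragraphs in which each pair of references appears together.

    paragraph_refs: one list of canonical reference names per paragraph.
    Returns a Counter mapping (name_a, name_b) -> number of shared paragraphs,
    with each pair stored in alphabetical order so the graph is undirected.
    """
    edges = Counter()
    for refs in paragraph_refs:
        # De-duplicate so repeated mentions within one paragraph count once.
        unique = sorted(set(refs))
        for pair in combinations(unique, 2):
            edges[pair] += 1
    return edges


if __name__ == "__main__":
    # Hypothetical tagged paragraphs from a single review.
    paragraphs = [
        ["William Faulkner", "Virginia Woolf"],
        ["William Faulkner", "Virginia Woolf", "Ralph Ellison"],
        ["Ralph Ellison"],
    ]
    for (a, b), weight in cooccurrence_edges(paragraphs).most_common():
        print(a, "--", b, weight)
```

Changing the definition of a link (same sentence, same review) only means changing how the text is split into units before this function is called, which makes it cheap to experiment if the paragraph-level graphs turn out not to be useful.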

If I can get all this working in the next week or two, hopefully I will get some kind of epiphany for how to automate the process elegantly for a much larger, and badly proofread, set of consumer reviews of Morrison. It’s 2010…where is my artificial intelligence research assistant?