A Reflection on Building a Puzzle with No Guides and a Deadline

This project was full of ups and downs, of failed project plans, unexpectedly boring results, and little moments of genius. Aryamaan and I first thought of tracking the geographical locations in Rooftop Rhythms and the qualities associated with them. After hours of searching for a world gazetteer, fighting speech-to-text (STT) output that mistranscribed proper names, and discovering that Voyant Tools could have done the job in 3 minutes, we realised that the corpus we could work with contained almost no place names. We changed our aim again, to using Named Entity Recognition and collocate analysis to find the most common subjects of the Rooftop Rhythms performances, an idea we abandoned because of a technical constraint: we would have had to tag the corpus ourselves to train our own algorithm.


Aryamaan then came up with the idea of tracking whether the different emotions in the ‘nrc’ lexicon were interrelated: for example, whether more joy in an episode necessarily means less fear, or more sadness means less anger, and so on. We knew this would be a very broad reading when performed on the whole corpus, and were worried the results wouldn’t be interesting enough. So we decided to explore topic modeling alongside sentiment analysis, in order to have more than one direction our final project could take.
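
To make the question concrete, here is a minimal Python sketch of one way such a check could be framed: compute the Pearson correlation between per-episode scores for each pair of emotions. It is an illustration only. The function names and the per-episode numbers below are hypothetical, and this is not the project's actual code or data.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def emotion_correlations(scores):
    """Map each pair of emotions to the correlation of their per-episode scores.

    `scores` maps an emotion name to a list with one value per episode,
    e.g. the share of words in that episode tagged with the emotion.
    """
    return {
        (a, b): pearson(scores[a], scores[b])
        for a, b in combinations(sorted(scores), 2)
    }


# Hypothetical per-episode scores, invented purely for illustration.
example_scores = {
    "joy": [0.12, 0.15, 0.09, 0.14],
    "fear": [0.08, 0.06, 0.10, 0.07],
    "sadness": [0.07, 0.09, 0.05, 0.08],
}

for pair, r in emotion_correlations(example_scores).items():
    print(pair, round(r, 2))
```

A coefficient near -1 for a pair such as joy and fear would suggest that more of one goes with less of the other; coefficients near +1 across the board would suggest the emotions rise and fall together.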


At one point, our work had left us with a sentiment analysis that was less than impressive, where all emotions varied similarly and seemed to be affected by something external, and a topic model that was anything but easy to read. However, we had something in hand: the five minutes of genius Aryamaan and I had at the end of every class. What seemed unimpressive and unreadable started making sense: plotting the topics in a heatmap let us see that for each episode there was exactly one very strongly related topic, and we noticed others that, like the different sentiments we found through the ‘nrc’ lexicon, were consistent throughout the corpus. We also both had the opportunity to write our own reflections on the biases and pitfalls of speech-to-text algorithms, which gave us a possible cause for the way the sentiments all varied together.


All the little bits of work and the different techniques we used throughout the final project seemed to fit together at this point, which gave us solid ground from which to start fleshing out interesting observations. We started noticing that the diversity of speakers in Rooftop Rhythms might have caused lower levels of all emotions in some episodes because of transcription faults, and that some of the topics corresponded to the different structural parts of a Rooftop Rhythms episode, such as Dorian and Bill’s introductions of the speakers. In addition, although this couldn’t be done for all emotions in our study, we found certain words that are related to joy only in the context of the show (“nyu” and “Bill” are among them!), as well as other interesting effects that the start of the pandemic and the change to online episodes had on the performances.


In short, this project had a rocky start, but we were fortunate to find an order to all our pieces of exploration, and making these connections after days without progress felt like a victory to us. We managed to build this puzzle. It is not a particularly pretty one, and some of the pieces might seem jammed in, but, if you squint, it contains amazing ideas that I (and, I hope, Aryamaan too) am proud of.


ready for grading. december 13th, 2021. i want to go home now.