Wednesday, November 27, 2013

Happy Thanksgivukkah!

The first post of this blog will commemorate the extremely rare event occurring this Thursday:  Thanksgivukkah.

For those of you who don't know, this Thanksgiving (2013) we will experience the convergence of the first day of the Jewish holiday of Hanukkah and the American holiday of Thanksgiving.  The two rarely line up because they are set by two different calendars:  Thanksgiving follows the Gregorian calendar, while Hanukkah follows the Hebrew calendar.  By widely quoted estimates, it will not happen again for more than 70,000 years.  

Social media is a nice way to see how people reflect on events and holidays.  With that in mind, I built a word cloud by mining Twitter for "#Thanksgivukkah".  The search returned about 1,500 tweets with this hashtag (the limit I asked for), which is impressive.  

This was done using R.  I also had some useful instruction from a blogger on setting up the Twitter API so that R can search hashtags.  



##Load packages (Twitter authentication must already be set up)
library(twitteR)
library(tm)
library(wordcloud)
##Search Twitter (n caps the number of tweets returned)
thanksgivukkah<-searchTwitter("#Thanksgivukkah",n=1500,cainfo="cacert.pem")
thanksgivukkah_text<-sapply(thanksgivukkah,function(x) x$getText())
##Build corpus
thanksgivukkah_text_corpus<-Corpus(VectorSource(thanksgivukkah_text))
##Lowercase, then remove punctuation and English stopwords
##(newer versions of tm need content_transformer() around tolower)
thanksgivukkah_text_corpus<-tm_map(thanksgivukkah_text_corpus,content_transformer(tolower))
thanksgivukkah_text_corpus<-tm_map(thanksgivukkah_text_corpus,removePunctuation)
thanksgivukkah_text_corpus<-tm_map(thanksgivukkah_text_corpus,removeWords,stopwords("english"))
##Build wordcloud from words that appear at least 20 times
wordcloud(thanksgivukkah_text_corpus,scale=c(8,.3),min.freq=20,vfont=c("sans serif","plain"))
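
If you want to see roughly what the cleaning and min.freq steps are doing without a Twitter connection, here is a small base R sketch.  The sample tweets, the tiny stopword list, and the min_freq value are all made up for illustration; this is not what tm or wordcloud run internally, just the same idea by hand.

```r
## Hypothetical sample tweets standing in for the real search results
sample_tweets <- c(
  "Happy #Thanksgivukkah everyone!",
  "Latkes and turkey, happy Thanksgivukkah",
  "So happy it's Thanksgivukkah"
)

## Lowercase, strip punctuation, and split on whitespace
clean <- gsub("[[:punct:]]", "", tolower(sample_tweets))
words <- unlist(strsplit(clean, "[[:space:]]+"))

## Drop a small hand-picked stopword list (an assumption, not tm's list)
words <- words[!words %in% c("and", "so", "its", "it")]

## Count each word, most frequent first
word_freq <- sort(table(words), decreasing = TRUE)

## Keep only words that appear at least min_freq times
min_freq <- 2
frequent <- word_freq[word_freq >= min_freq]
print(frequent)
```

The words left in frequent are the ones a word cloud would draw, with the most common ones drawn largest.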

Pretty encouraging overall...seems like people are happy...which is great.  Happy Hanukkah and Happy Thanksgiving!