data | 15 December 2020

SongscrapR Package for R

Tidy lyric-scraping for text analysis in R

After analysing my own Spotify data, I thought it would be interesting to also look at patterns and topics in the lyrics of the songs I was listening to. A structured dataset of song lyrics is not easy to find, so I decided to learn the {rvest} package and scrape my own dataset from A-Zlyrics.com. To make the code reusable in other projects, I combined the functions into a small package.
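For readers who haven't used {rvest}, here is a minimal sketch of the general scrape-then-tidy pattern. It is not the package's actual code: the HTML fragment, its class names and the column names are all made up for illustration, and the page is inlined so the example runs offline. Real lyric pages are laid out differently.

```r
library(rvest)  # read_html() is made available through rvest/xml2

# Hypothetical page fragment (assumption: invented markup, not a real lyrics page)
page <- read_html('
<html><body>
  <h1 class="title">Example Song</h1>
  <div class="lyrics">
    <p>First line</p>
    <p>Second line</p>
    <p>Third line</p>
  </div>
</body></html>')

# Pull the song title and each lyric line out with CSS selectors
# (html_node()/html_nodes() still work in newer rvest, where
#  html_element()/html_elements() are the preferred names)
title <- html_text(html_node(page, "h1.title"))
lines <- html_text(html_nodes(page, "div.lyrics p"))

# One row per lyric line: a tidy shape that is easy to feed into text analysis
lyrics <- data.frame(
  song  = title,
  line  = seq_along(lines),
  lyric = lines,
  stringsAsFactors = FALSE
)

print(lyrics)
```

Keeping one row per line, with the song name repeated, means results for many songs or albums can simply be stacked with `rbind()` into a single dataset.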

The package has a well-documented set of functions and can be used to construct a tidy dataset of entire albums (or any range you wish!) by any artist. I’ve found it tremendously helpful for collecting data on artists with a large body of work and using it for further analysis. For example, I used it to create a dataset of songs by The Beatles for another project that later used Markov chains. Attached is the GitHub repository for the package.