In the article, Sophia Ciocca describes three types of recommendation models that are used to generate the playlists. The first is collaborative filtering: crudely, people with tastes like yours like this, so you might like it too. Digging deeper, the mathematical modelling sounds fascinating. The third is raw audio models: analysis of the audio tracks themselves. This is why Release Radar works so well, even though its tracks have not been played many times.
But I didn’t know about the second one, the emphasis Spotify puts on natural language processing, or NLP:
Spotify crawls the web constantly looking for blog posts and other written texts about music, and figures out what people are saying about specific artists and songs — what adjectives and language is frequently used about those songs, and which other artists and songs are also discussed alongside them.
While I don’t know the specifics of how Spotify chooses to then process their scraped data, I can give you an understanding of how the Echo Nest used to work with them. They would bucket them up into what they call “cultural vectors” or “top terms.” Each artist and song had thousands of daily-changing top terms. Each term had a weight associated, which reveals how important the description is (roughly, the probability that someone will describe music as that term.)
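To get a feel for what weighted top terms make possible, here is a minimal sketch in Python. Everything in it is my own assumption rather than anything Spotify or the Echo Nest has published: the artist names, the terms, and the weights are made up, and cosine similarity is simply one common way to compare weighted term lists like these.

```python
import math

# Hypothetical top terms with invented weights, for illustration only.
# Real cultural vectors are described as having thousands of terms.
top_terms = {
    "artist_a": {"dreamy": 0.8, "synth": 0.6, "mellow": 0.3},
    "artist_b": {"dreamy": 0.7, "synth": 0.5, "upbeat": 0.2},
    "artist_c": {"heavy": 0.9, "riffs": 0.7, "mellow": 0.1},
}


def cosine_similarity(a, b):
    """Compare two term-weight dictionaries; 1.0 means identical direction."""
    shared = set(a) & set(b)
    dot = sum(a[term] * b[term] for term in shared)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(name, vectors):
    """Return the other artist whose top terms best match the named artist."""
    scores = {
        other: cosine_similarity(vectors[name], vec)
        for other, vec in vectors.items()
        if other != name
    }
    return max(scores, key=scores.get)
```

Calling `most_similar("artist_a", top_terms)` picks `artist_b`, because the two share heavily weighted terms, while `artist_c` overlaps only on a lightly weighted one.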
Unlike many others, I’m a fan of the Apple Music UI and implementation. But I’ve not had terrific results with their recommendation engines. The opposite is true for Spotify. It’d be nice to save some money by cancelling one or other of the services, but they do such different things for me that I can’t see that happening any time soon.