Big Data Conversation with Spotify

I spoke with Eliot Van Buskirk ( @listeningpost ), Data Storyteller at Spotify, as part of my Big Data Conversations series for Dell EMC. Note that this is not a product or technology blog and certainly there are no endorsements implied from either side but, in all fairness, I am a paying Spotify user and I am in love with what they do to bring music to the masses.

Erin K. Banks: How do you utilize big data at Spotify?

photo credit: Larisa Barita

Eliot Van Buskirk: A good way to understand how I utilize data in my role at Spotify is to look at my background. I came out of journalism. I worked at CNET and Wired, reviewed the first 50-plus digital players in the world, and have covered digital music since it came onto the scene in the 1990s. I left Wired to go to a startup called The Echo Nest, a music data company. They had more individual data points about music than any other entity on the planet. This is everything from “which artists are similar to other artists”, which enabled the clients to make radio stations that make sense, down to what key the chorus of a song is in and where the beats are in a song. It is incredibly detailed data. There are attributes including acousticness, which measures how acoustic a song is versus how electronic, and mechanism, which is tied to how regular the beat is. Whatever you can think of, these guys were teaching computers to listen to music and to understand what people were saying about music on the web. If a thousand people say something is dubstep, then the system says that is probably dubstep. It draws conclusions using the acoustic data and cross-references that with the semiotic data from the web to figure out how to understand music at a scale that has pretty much never been possible before.

Going from journalism to The Echo Nest, I found the sweet spot in telling stories about the data and analyzing all this incredible technology that is there to solve other problems like “what someone should listen to right now” or “will someone like a certain song.” We started to realize the potential for storytelling: things like how music from various decades is listened to now. The break-out story for us was how artists are related to the individual states. We looked at which artist is listened to disproportionately in each US state, and we made a map that got a ton of traction.

Music charts are great, and they tell us a lot about what people like, but the most popular music is popular everywhere. We started looking at what music makes things distinct from each other. You can apply this to anything, like what music is distinctive to millennials versus boomers… the answer is Skrillex versus Roy Orbison. The basic idea is that data people view as kind of dry is capable of producing these incredibly interesting nuggets of information, and as a journalist who was shifting into a marketing and communications role, what appealed to me was that everything we did with the data is real. It is like being a journalist where you actually do have access to all of the facts. At Spotify, I help publications like the NY Times or whoever tell data stories, and I publish my own data stories based on the massive amounts of data at Spotify. It was great doing this at The Echo Nest because we had all the data about the music itself, and now that I am a part of the largest on-demand music service in the world, there is a lot of anonymized data about how people listen to music, so we are able to see even more, and that makes my job even more interesting.
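The difference between "most popular" and "distinctive" described above can be sketched as a simple ratio: an artist's share of plays within one region divided by that artist's share of plays overall. This is only a minimal illustration of the idea, not the method Spotify or The Echo Nest actually used; the function name, the input shape, and the play counts are all made up for the example.

```python
from collections import defaultdict


def most_distinctive_artist(plays):
    """Return {region: artist} for the artist most over-represented in each region.

    `plays` maps region -> {artist: play_count}. This input shape is an
    assumption made for the sketch.
    """
    # Total plays per artist across all regions, plus the overall total.
    global_plays = defaultdict(int)
    for artists in plays.values():
        for artist, count in artists.items():
            global_plays[artist] += count
    total = sum(global_plays.values())

    result = {}
    for region, artists in plays.items():
        region_total = sum(artists.values())

        def lift(artist):
            # Share of this region's plays vs. share of all plays.
            local_share = artists[artist] / region_total
            global_share = global_plays[artist] / total
            return local_share / global_share

        result[region] = max(artists, key=lift)
    return result


# Made-up numbers for illustration only.
sample_plays = {
    "Tennessee": {"Chart Topper": 900, "Local Band A": 100},
    "Washington": {"Chart Topper": 950, "Local Band B": 50},
}
distinctive = most_distinctive_artist(sample_plays)
```

In this toy data the chart-topping artist has the most plays in both states, but its share is nearly the same everywhere, so the ratio picks out the local band in each state instead.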

Erin: You said you created a map to identify music regions. Is this map publicly available?

Eliot: This map is publicly available and went everywhere. I think USA Today almost printed it instead of the weather map. This is what I mean about it being a success: to get this information out there and to get people talking about it. When we did it originally, it was a static map, a snapshot of a certain time when those were the popular artists. I have gone further with the concept now and made a musical map of the world that you can find at insights.spotify.com, which updates twice a month so you can see the music that is distinct for a thousand cities worldwide at any given time. People from those cities will recognize local bands that people listen to in, say, Nashville, but that others might not have heard of outside of Nashville. The beautiful thing about this map is that I worked with this incredibly talented guy called Glenn McDonald ( @EveryNoise ), and he maintains these playlists that update for each city. In order to create the map, I linked each spot on the map to one of these playlists. This is a demonstration of what we call at Spotify “Care at Scale”. We are lavishing attention on these cities and showing exactly what makes them special compared to the rest of the world, but it scales remarkably well. We have had a great response from people all over the world, like a radio station in Napoli, Italy, where they have a bi-weekly show dedicated just to the Naples playlist on this map. I get emails from people all over the place who can tell that the care is there. It doesn’t come off as “here is the Top Ten globally”. Big data can essentially allow for a more nuanced view of the world, and again, it is important to emphasize that this is all based on anonymized and aggregated listening data.

Erin: Your background is journalism but do you consider yourself a data scientist? Are there data scientists there? Lately I have been hearing that it is not about just the data scientist. When it comes down to it, it is regular people searching through the data trying to get insights. Do you have thoughts on that?

Eliot: We definitely do have data scientists here and I do not consider myself one of them. I also don’t consider myself a data journalist, which is another title. I have been a journalist before and this is not journalism. I came up with “data storyteller” as the title, which I think is the sweet spot for describing what I do. I have a math and science background from a long time ago. I was very much into calculus and physics and became an English major because it was more interesting to me at that point. I have sort of a split early background between the humanities and math and sciences, and I did a tiny amount of programming but really nothing after the age of ten or eleven. I taught myself SQL with the help of people at Spotify who are legendary programmers and who are helping me along. I am finally at the point where I can get direct access to data, write my own queries, and crunch my own data. It is actually empowering and I recommend it to anyone. The data can be a bottleneck, as we all know, because only certain people know how to analyze it, and without the ability to do that, I couldn’t do my job.

Erin: Do you pull from the data scientists or do you do it yourself? What is the difference between what you are doing and what data scientists are doing?

Eliot: The data scientists have PhDs and patents and I was an English major who was a former journalist. The skill level is very different, but there is a democratizing aspect to this where once you learn your way around the tools, you are looking at the same data as everyone else, so if you learn how to use the data, your results are going to be similar to theirs, and that is the great thing about technology. Regardless of whether you have a PhD or not… if you figure something out, you figure it out. I would never put myself on the same level as they are, and I think there need to be more of us who are in between various fields and data science. At Spotify, the data scientists are solving major problems and doing very difficult work. They keep this whole thing running as they optimize everything and provide the best music recommendations. For example, it would not make sense for them to be working on a data-driven story about what happens to Aerosmith listening when there is a comet in the news. It turns out that people listen to “I Don’t Want to Miss a Thing” every time there is a comet event. I consider that fun and interesting from a news and PR point of view, but that isn’t something you need a data scientist working full time on.

Erin: I think it says a lot that you know that whenever there is news about a comet, people listen to that song and that it turns into something else. How has big data impacted your business? Did Spotify start off as a streaming business and then they added analytics in order to provide different services?

Eliot: Spotify was always data driven. The Echo Nest brought a different approach. They combined what people say about music with what the music actually is. To have a machine ear that says this song sounds happy or this song sounds sad… this is an actual number that we have, an emotional valence. It varies between 0, which is completely unhappy, and 1, which is completely happy. I don’t think anyone else had been able to scale up like that. Spotify is scaling up massively in terms of usage and The Echo Nest scales very well for understanding music and how people listen to music. I would like to add that when people hear about big data being associated with music, they can understandably get the wrong idea that it is just algorithms: what do they know, and why would a robot make a good radio station? That is completely the wrong way to look at it. Thinking of music as data and algorithms is misleading. People make music and people listen to music, and it is all about humans being involved. The comparison I often make: I used to review music. That is one person’s opinion, and you can base something on that, or you can base something on how 75 million people listen to music. I argue that 75 million people can make a data-driven approach more human than one person can. The algorithms are a way of getting at very human things through a route that is sometimes mistakenly viewed as artificial.

Erin: What is the biggest myth about big data? Do you think that it is that big data is not looking at the human aspect?

Eliot: Yes, that would be a fair answer to that question. A great example is the “Discover Weekly” feature on Spotify. Every Monday, for every single user of Spotify, Discover Weekly creates a personalized playlist of music that you have never heard before on Spotify and that we know you are going to like. It seems impossible, but when you have that many people listening to music, making playlists, and adding things to their collections, it is possible to say that for every single user, here is music that you will enjoy and we know that you have never listened to it before on Spotify, so here it is. That happens every Monday, and people’s minds are blown. It is a similar feeling to a friend making a mix tape. You feel like “There is no way this thing knows me this well”. That might seem strange or magical or impossible if you are just viewing it as an algorithm, but once you understand that it is all about people, the human element, their responses and reactions, then it makes sense.

Erin: Now on Monday you have people interested in listening to new music and you are offering it up to them based on everything they have given you before. You have a portfolio on me as a Spotify user: this is what I listen to and this is what I like. Then on Monday you give me something different based on me and others.

Eliot: Yes, that is essentially how it works. It is a great demonstration of how big data or algorithms actually provide a very human experience. Once you understand that it is about people, it makes sense.

Erin: So, big data allows your business to be more human. Instead of being a streaming music service you are now creating human interaction with us by gathering all this data around us. You are changing how I listen to music and how I discover music.

Eliot: I think that is true. People really love this feature. People return every Monday saying “give me more of this please.” I think it is one of the most powerful ways that we are using data to improve people’s listening experiences right now.

Erin: Does data allow you to create additional products and services?

Eliot: For sure. As we discussed, there is “Discover Weekly” and there is a degree of outreach being done. If we can tell that you are really into a certain band, then we can tell you if they are going to tour where you live. There is no end to where you can go with the data. I am not involved in the product here, but I would guess that we don’t want to “over hit” people with offers. But if we could help artists find people that are interested in them, only good things can come from that.

Erin: I feel people are not seeing all the power of the data and the additional products and services it can provide. We tend to get stuck in our bubble and not see outside of it.

Eliot: I think we are seeing a lot of examples where people are in bubbles more and more with technology, but data can also expand people’s spheres. Discover Weekly is a great example of that. A similar approach could be used on social networks to bring us outside of our bubbles. People can use the amount of data that we generate in everything we do these days in a number of different ways. The part that is interesting to me is using it to expand people’s exposure to things rather than keeping them in their bubbles.

Erin: What can you leave us with that is important to note?

Eliot: Another of our offerings that we couldn’t have done without big data is “Fresh Finds,” which we release every Wednesday, and it is all about expanding people’s spheres. It introduces bands that are not in the mainstream. In one case, a local bar band in Florida was featured on the list. They were found entirely through big data, by using the data to pick up very subtle signals that this band was about to take off. They were then showcased on Fresh Finds, toured in NY, and, I think, scheduled shows all around the United States. There are ways to use big data to personalize things, or ways to use it to identify very early trends, in our case music trends, but it can be applied to every kind of market.

Erin: When you said that people are talking about them, are you bringing in social media?

Eliot: The whole web is sitting right there if you want to know about music. If people suddenly start tweeting about bands, there is a way to do semantic analysis of social networks to see what is bubbling up. What I think is powerful is to combine lots of different signals like listening data, what regions are heating up, social media, and who is opening for what band. The number of signals is almost without limit, and that is what is so powerful about big data and the ability to analyze it: you can come close to understanding all of this, resulting in a playlist that feels magical. It is the result of lots of hard work and data processing and analysis of anonymized listening. The more signals you can factor in, the better. Then you can continue to apply new data and get more results.

Erin: Can bands receive additional services through Spotify based on big data?

Eliot: We released the Spotify for Artists site and service, where artists can access data that they couldn’t get before, at levels beyond what they can see in the Spotify app, to get an understanding of where their fans are, and so on. It is another great opportunity from big data.

Erin: It isn’t just healthcare companies or banks. It is more than that: we are talking about bands.

Eliot: Yes, and we’re still talking about data. If you are a band in Seattle and you are trying to decide if you should drive to San Francisco to play, there are ways to answer that question… even that question can be answered using data.

About the Author: Erin K. Banks

Erin K. Banks is a Product Marketing Director in the Telco Systems Business at Dell Technologies. Previously she was the Director of Product Marketing for the Unstructured Data Solutions group as well as the Messaging Director for Security Transformation at Dell Technologies. She has been in the IT industry for almost 20 years, previously working at Dell EMC as Portfolio Marketing Director for Data Analytics. She has also worked at Juniper Networks in Technical Marketing for the Security Business Unit and VMware and EMC as an SE in the Federal Division, focused on Virtualization and Security. She holds both CISSP and CISA accreditations. Erin has a BS in Electrical Engineering and is an author, blogger, and on the board for Our SAM Foundation.