Vicki Boykis Data, tech, and sometimes Nutella



How many people are actually using your stuff?

Five million years after it became popular, I’ve gotten into Spotify. The thing I like the most about the service is the curated content.

When I was working with television viewing data, my boss at the time had a hypothesis that there are two viewing behaviors:

  • Flipping through the channels to see what’s on or
  • Actively seeking out a show.

The same is true for music: you’re either actively looking for a song, or you want to listen to a “channel.” If I just want to play specific songs, I use my own music on my phone or go to YouTube. If I want channels, I’ll use Spotify.

Spotify is really great at building channels not only by genre, but by intangible things like mood. “Coffeehouse chill” is one of my favorites. I’m not sure if this is still the case, but initially, Spotify had a staff of 35 people working on this curation problem by creating manual lists.

Every day, there is a new collection of playlists up on my app. They are so carefully managed to show relevant lists by time of day that I love this section and check it out first.

Last week, Spotify put out a list called “20 Million Thank Yous” to mark the service reaching 75 million active users and 20 million premium subscribers.

This number is pretty impressive, but, as a data person, I was immediately curious how they came up with it, because there is no such thing as active users.

Well, there definitely is. There are people using Spotify all the time.

But what’s the definition of active?

  • Is it someone who’s signed up for Spotify?
  • Is it someone who’s signed up for Spotify and created a profile?
  • Is it someone who’s played one song?
  • Is it someone who’s played one song in the past minute? Week? Month? Year?
  • Is it someone who spends 30 minutes on the site a day? What if they’re on the app but have the sound muted and forgot it’s playing?
  • Is it someone who likes to create playlists?
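To make the point concrete, here is a minimal sketch of how the same event log produces very different "active user" counts depending on which definition you pick. Everything in it is made up for illustration: the event names, the four users, and the `users_with` helper are assumptions, not anything Spotify actually uses.

```python
from datetime import datetime, timedelta

# Hypothetical "now" and toy event log: (user_id, event_type, timestamp).
# None of this reflects a real schema; it only illustrates the definitions above.
NOW = datetime(2020, 1, 15, 12, 0)
EVENTS = [
    ("ann", "signup", NOW - timedelta(days=400)),
    ("ann", "play", NOW - timedelta(days=2)),
    ("bob", "signup", NOW - timedelta(days=200)),
    ("bob", "play", NOW - timedelta(days=90)),
    ("cat", "signup", NOW - timedelta(days=10)),
    ("dan", "signup", NOW - timedelta(days=5)),
    ("dan", "play", NOW - timedelta(hours=3)),
    ("dan", "create_playlist", NOW - timedelta(days=1)),
]


def users_with(event_type, within=None):
    """Users with at least one event of this type, optionally only
    counting events inside a trailing window that ends at NOW."""
    return {
        user
        for user, kind, ts in EVENTS
        if kind == event_type and (within is None or NOW - ts <= within)
    }


# Same data, five different answers to "how many active users?"
counts = {
    "signed up": len(users_with("signup")),
    "ever played a song": len(users_with("play")),
    "played in the past week": len(users_with("play", timedelta(days=7))),
    "played in the past day": len(users_with("play", timedelta(days=1))),
    "created a playlist": len(users_with("create_playlist")),
}

for definition, n in counts.items():
    print(f"{definition}: {n}")
```

With four people in the log, the "active" count ranges from 4 down to 1 depending purely on the definition, and nothing about the underlying behavior changed.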

This problem is relevant across all industries, particularly those that operate on the web, and numerous companies have tried to tackle the issue in different ways.

No one has a single answer, because intrinsic human intellectual activity such as “listening,” “viewing,” or “using” is usually very hard to define.

I once worked on a research project trying to measure knowledge worker productivity, and it, like many discussions about how to tell whether developers are productive, ended in no concrete answer.

As a result, for simplicity’s sake, PR usually just uses the most impressive number the data team comes up with.

Because of this, it’s important to be critical of any usage number you hear about.

For example, Facebook usually says they have almost a billion active users for any given month.

That’s 13% of the world’s total population logging in every month! That’s enormous! How does anyone get any work done when almost 1/7 of the entire world is just reposting gifs of slow lorises?

The answer is that obviously not that many people are actively using Facebook; it all depends on Facebook’s definition of “active.” Is active:

  • People who have ever logged into Facebook?
  • People who log into Facebook x times a day?
  • People who comment at least x times a day?
  • People who like people’s statuses?
  • People who have over 100 friends?

Just think about how many of your Facebook friends post at any given time. It’s probably around 20-30%. The rest are all dormant, just creeping on people’s slow loris gifs.

Facebook itself notes how crazy hard it is to measure people in its 10-K statement:

While these numbers are based on what we believe to be reasonable estimates of our user base for the applicable period of measurement, there are **inherent challenges** in measuring usage of our products across large online and mobile populations around the world.

For example, there may be individuals who maintain one or more Facebook accounts in violation of our terms of service. We estimate, for example, that “duplicate” accounts (an account that a user maintains in addition to his or her principal account) may have represented between approximately 4.3% and 7.9% of our worldwide MAUs in 2013.

That’s corporate speak for “We guesstimate.”
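To get a feel for how wide that guesstimate is, here is a quick back-of-the-envelope sketch. The 4.3% and 7.9% bounds come from the quote above; the one billion MAU figure is only a round stand-in for the "almost a billion" mentioned earlier, not a reported number, and the `duplicate_range` helper is hypothetical.

```python
def duplicate_range(reported_mau, low_pct=4.3, high_pct=7.9):
    """Estimated number of duplicate accounts at each end of the quoted range."""
    low = round(reported_mau * low_pct / 100)
    high = round(reported_mau * high_pct / 100)
    return low, high


# Assumption: a round one billion MAUs, used purely for illustration.
assumed_mau = 1_000_000_000
low, high = duplicate_range(assumed_mau)

print(f"Duplicates: between {low:,} and {high:,} accounts")
print(f"Width of the uncertainty alone: {high - low:,} accounts")
```

Under that assumption, the gap between the low and high estimates is tens of millions of accounts, which is bigger than the entire user base of plenty of services.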

According to Facebook’s 10-K statement, the way Facebook does measure that funny term, MAUs, is:

A registered Facebook user who logged in and visited Facebook through our website or a mobile device, used our Messenger app, or took an action to share content or activity with his or her Facebook friends or connections via a third-party website or application that is integrated with Facebook, in the last 30 days as of the date of measurement.

That’s pretty broad. You count if you logged in and visited the site, but you also count if you only used Messenger or shared something through a third-party app that is integrated with Facebook, possibly unknowingly. It also includes anyone who has logged in just once in a 30-day window, which power Facebook users will tell you is a pretty long span of time.

The point is not that measurement is impossible, but that every company will measure this kind of stuff differently, and if you are a stockholder, developer, or even just plain user, it’s important to keep an ear tuned for how analytics can be manipulated.

