About 100 years ago, a few years after the MBA program was founded, Professor Melvin T. Copeland started teaching capital-M Marketing at Harvard Business School. Thanks to his early work, marketing became its own domain in business education. Ever since, business school academics and marketing practitioners have acknowledged that customer segmentation is a key to growth.

It’s key because when you segment customers, you can vary the features in their products, the messages you use to talk to them, the type of customer support they get, or even their pricing. When you do that, you can increase growth through higher conversion rates and higher revenue per user. And you can increase profitability because each dollar of product, marketing, and support spend works harder for you.

Sometimes people assume that good segmentation is complex and time-intensive to build. In my experience, that’s not the case. In fact, with only a few simple ideas, you can build a terrific segmentation to guide your product, marketing, and customer support.

If your users are everyday people, you might be tempted to segment them based on demographics (gender, age, income). That’s been a popular way to do it for many years. Popular and lazy. 

Demographic-based segmentation is really just a descriptive segmentation of who your users are, not a segmentation based on how they behave.

Behavioral segmentation is more powerful than descriptive segmentation for both consumer and business users. When you use people’s actions to sort them into clusters, you can build a much more direct connection between the product, marketing, and customer support investments you make and the business outcomes that you want.

If you want to be great at this, you need to do three things:

  1. Gather data about your customers' actual usage of your product or service;
  2. Analyze it using a few simple human and machine methods;
  3. Use the resulting insights to guide real business decisions in product, marketing, and support.

The behavioral segmentation I like has three parts. The first sorts users based purely on their activity. It’s called Recency-Frequency-Monetary Value segmentation, or RFM for short, and it’s simple enough to do by hand. The second is a machine-based method called K-Means Clustering. It spots groups of users who are “closest” to each other behaviorally but might be invisible to a human eye. Third, Sequential Pattern Mining spots the common lifecycle events that led high-value users and low-value users to become the way they are.

Tactic: R-F-M

This technique was invented in the mid-20th Century in the catalog business in the United States. But it’s just as useful today. Here’s how you do it in four steps:

Step 1: Gather data 

Gather data on every user’s activity (each instance with the date) and how much money they generate. Use your judgment to define “activity” (a session, an engagement with a particular feature, or a purchase) and “money” (direct measures like gross merchandise sales, your direct revenue, or indirect measures like dollar value of ad impressions they saw or amortized subscription revenue).

Step 2: Calculate

For each user, calculate three measures: one for R - Recency (days since last activity is a good one), one for F - Frequency (days active / days since registration is a common example), and one for M - Monetary value. Use your judgment to decide if you need to time-bound the F and M measures (for instance, last 60 days instead of lifetime).
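
Here is a minimal sketch of Steps 1 and 2 in Python, using only the standard library. The event-log layout (user, date, money), the function name rfm_measures, and the sample users are my assumptions for illustration; swap in whatever definitions of “activity” and “money” you chose above.

```python
from datetime import date

def rfm_measures(events, registered, today):
    """events: list of (user_id, activity_date, money) tuples (assumed layout).
    registered: dict of user_id -> registration date.
    Returns dict of user_id -> (recency_days, frequency, monetary).
    Users with no events at all are left out of the result."""
    active_days = {}  # user -> set of distinct dates with activity
    money = {}        # user -> total money generated
    for user, day, amount in events:
        active_days.setdefault(user, set()).add(day)
        money[user] = money.get(user, 0.0) + amount

    measures = {}
    for user, days in active_days.items():
        recency = (today - max(days)).days                # days since last activity
        tenure = max((today - registered[user]).days, 1)  # avoid dividing by zero
        frequency = len(days) / tenure                    # days active / days since registration
        measures[user] = (recency, frequency, money[user])
    return measures

# Hypothetical sample data.
events = [
    ("ann", date(2023, 3, 1), 20.0),
    ("ann", date(2023, 3, 9), 35.0),
    ("bob", date(2023, 1, 15), 5.0),
]
registered = {"ann": date(2023, 3, 1), "bob": date(2023, 1, 1)}
measures = rfm_measures(events, registered, today=date(2023, 3, 11))
print(measures["ann"])  # (recency_days, frequency, monetary)
```

To time-bound F and M, one option is to filter the events list to the last 60 days before calling the function.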

Step 3: Sort

Sort the entire user base into quintiles (five equal-sized groups) by R measure. By this I mean: sort the user base by the value in the R column, break it into five equally sized groups, and give every member of the quintile with the “worst” recency a 1. Score the users in the second quintile 2, third quintile 3, and fourth quintile 4. Finally, give every member of the quintile with the “best,” most recent activity a score of 5. Your users now have R - Recency scores of 1 through 5.

Step 4: Repeat

Repeat step 3 for F - Frequency, and M - Monetary measures. 1 = lowest performing 20%, 5 = highest performing 20%.
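
The quintile scoring in Steps 3 and 4 can be sketched in a few lines of Python. The function names and the ten made-up users below are my assumptions, not part of any library. Note one judgment call baked in: users with tied values can land on either side of a quintile boundary, so decide how you want to break ties before you trust the edges.

```python
def quintile_scores(values, higher_is_better=True):
    """values: dict of user -> raw measure. Returns dict of user -> score 1..5,
    where 5 is the best-performing 20% and 1 is the worst-performing 20%."""
    # Sort worst-first so that position in the list maps directly to a score.
    ranked = sorted(values, key=values.get, reverse=not higher_is_better)
    n = len(ranked)
    return {user: (i * 5) // n + 1 for i, user in enumerate(ranked)}

def rfm_scores(measures):
    """measures: dict of user -> (recency_days, frequency, monetary).
    Returns dict of user -> (R, F, M) scores, each 1..5."""
    # For recency measured in days since last activity, lower is better.
    r = quintile_scores({u: m[0] for u, m in measures.items()}, higher_is_better=False)
    f = quintile_scores({u: m[1] for u, m in measures.items()})
    m_score = quintile_scores({u: m[2] for u, m in measures.items()})
    return {u: (r[u], f[u], m_score[u]) for u in measures}

# Ten hypothetical users: u0 is the most recent, most frequent, highest spender.
measures = {f"u{i}": (i * 3, 1.0 / (i + 1), 100 - 10 * i) for i in range(10)}
scores = rfm_scores(measures)
print(scores["u0"], scores["u9"])
```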

How to understand your RFM score

Every single user is now tagged with a three-digit RFM score. The digits define where in the RFM “space” (a 5x5x5 cube defined by the RFM axes) each user exists. While there are 125 cells in this cube, that does not mean you have 125 actionable segments for varying product, marketing, or customer support. The actionable segments for product, marketing, or customer support purposes are made up of multiple cells grouped based on your business judgment.

For example, one ecommerce app company that uses RFM decided that their Highest Value Users (HVUs) included 5-5-5, 5-4-5, 5-4-4, 5-5-4, and 4-4-5. Those five cells alone accounted for more than 75% of total company revenue. And they decided to group 4-4-4 and 4-3-4 into a High-Potential Users (HPUs) segment. This company built social media acquisition campaigns targeting HVU look-alikes, product marketing directly to HPUs to promote features loved by HVUs, and boosted customer support for both of those segments because they were worth it economically. In the months that followed, the company was able to 8x their total revenue using a new strategy based on behavioral segmentation. Behavioral segmentation + great business judgment = good work.
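
Grouping cells into segments is just a lookup. Here is a sketch that uses the HVU and HPU cells from the example above; the function names, the four users, and their revenue numbers are made up for illustration, and your own cell groupings should come from your business judgment.

```python
from collections import defaultdict

# Cell groupings taken from the ecommerce example in the text.
HVU_CELLS = {(5, 5, 5), (5, 4, 5), (5, 4, 4), (5, 5, 4), (4, 4, 5)}
HPU_CELLS = {(4, 4, 4), (4, 3, 4)}

def segment(cell):
    """Map an (R, F, M) cell to a segment name."""
    if cell in HVU_CELLS:
        return "HVU"
    if cell in HPU_CELLS:
        return "HPU"
    return "Other"

def revenue_share_by_segment(scores, revenue):
    """scores: user -> (R, F, M); revenue: user -> money.
    Returns segment name -> share of total revenue (0.0 to 1.0)."""
    totals = defaultdict(float)
    for user, cell in scores.items():
        totals[segment(cell)] += revenue.get(user, 0.0)
    grand_total = sum(totals.values())
    if not grand_total:
        return {}
    return {name: amount / grand_total for name, amount in totals.items()}

# Hypothetical users and revenue.
scores = {"ann": (5, 5, 5), "bob": (4, 3, 4), "cat": (2, 1, 1), "dan": (5, 4, 4)}
revenue = {"ann": 600.0, "bob": 150.0, "cat": 50.0, "dan": 200.0}
shares = revenue_share_by_segment(scores, revenue)
print(shares)
```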

You can automate the living heck out of this, re-run the scoring daily if you like, tune and tweak it to your heart’s delight. Whether you do it manually or with machines, I guarantee you will never think of your users the same way again after you have made an R-F-M cube. It sparks questions like:

  • How concentrated is my business in the high-value segments?
  • How do the users in the boundary zones (2s, 3s, 4s) change over time?
  • Where did my 5s come from? How do I find more?
  • Can I get my 3s and 4s to step it up and act more like my 5s?
  • Am I wasting effort on 1s and 2s? Or are there some lapsed 5s in there I should go back after with targeted investment?
  • Why am I capping my acquisition spend based on average user value? Why not target my spend at my high-value and high-potential users (3s and 4s who can become 5s)?
  • Most important question of all: How can I treat them differently – product, marketing, support – so that I maximize their value?
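
A couple of these questions can be answered with very little code once you have scores from two different dates. The sketch below counts how users moved between scores on one axis and flags possible “lapsed 5s.” The function names, the thresholds (recency score of 2 or lower, monetary score of 5), and the sample users are my assumptions; tune them to your business.

```python
from collections import Counter

def score_migration(before, after, axis=0):
    """Count users moving from one score to another between two scoring runs.
    axis: 0 = R, 1 = F, 2 = M. Returns Counter of (old_score, new_score) -> users."""
    moves = Counter()
    for user, old_cell in before.items():
        if user in after:  # skip users missing from the later run
            moves[(old_cell[axis], after[user][axis])] += 1
    return moves

def lapsed_high_value(scores, max_recency=2, min_monetary=5):
    """Users with a poor recency score but a top monetary score (assumed thresholds)."""
    return sorted(u for u, (r, f, m) in scores.items()
                  if r <= max_recency and m >= min_monetary)

# Hypothetical scores from two runs, for example a month apart.
before = {"ann": (5, 5, 5), "bob": (3, 3, 4), "cat": (3, 2, 2), "dan": (1, 2, 5)}
after = {"ann": (5, 5, 5), "bob": (4, 3, 4), "cat": (2, 2, 2), "dan": (1, 2, 5)}

moves = score_migration(before, after, axis=0)
print(moves)                     # recency moves, e.g. (3, 4) means a 3 became a 4
print(lapsed_high_value(after))  # candidates for a targeted win-back
```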

If you stop here, you are in the 99th percentile of businesses with respect to segmentation. Sad, but as my mom and dad always said, “A low bar is your opportunity, MD!” With just R-F-M, you have a powerful behavior-based segmentation that can inform product, marketing, and support investments.

Tactic: K-Means Clustering

The R-F-M space I described above is a nice frame for this slightly more advanced approach. K-Means Clustering is a machine-driven technique that spots relationships between users that would take you forever to find using a human eye. The name “K-Means” refers to the number of clusters (K) and to the center of each cluster, which is the mean of the users assigned to it. The machine optimizes the distances between users and those centers. That’s a mouthful. So here’s a simple explanation of how it works.

Step 1: Access and run

Using the same user data set as in R-F-M (make sure you use the raw measures of R, F, and M, not just the quintile scores 1 - 5), access and run the K-Means clustering tool in your analytics package of choice (like R or MATLAB). K-Means is an “unsupervised” method: you tell it how many clusters (K) you want, and it starts by dropping K “centroids” (center points in the R-F-M space) among your users and assigning each user to the nearest one. Imagine the machine scanning the cloud of users and drawing boundaries around the ones that seem similar.

Step 2: Iterate, iterate, iterate

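K-Means then iterates: it moves each centroid to the average position of the users assigned to it, reassigns every user to whichever centroid is now nearest, and repeats. Each pass shrinks the differences between members of the same cluster, measured by each user's distance from the cluster's center, which in turn pulls the clusters apart from each other. It stops when no user changes cluster, meaning no better centroids can be found from that starting point. Across a big user base that adds up to millions of distance calculations, and this is the part that would take a human until Doomsday. So, thank you, Industrial Revolution!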
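
If you are curious what the machine is doing in Steps 1 and 2, here is a bare-bones sketch of the standard K-Means loop in plain Python. It is for intuition, not a replacement for the tool in your analytics package. Two assumptions of mine: I rescale R, F, and M to comparable units first (otherwise the dollar column swamps the other two in the distance math), and I start from K randomly chosen users. The six sample users are made up.

```python
import random
from statistics import mean, pstdev

def standardize(points):
    """Rescale each column to mean 0 and standard deviation 1."""
    columns = list(zip(*points))
    mus = [mean(c) for c in columns]
    sigmas = [pstdev(c) or 1.0 for c in columns]  # constant column: avoid dividing by zero
    return [tuple((x - mu) / s for x, mu, s in zip(p, mus, sigmas)) for p in points]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, k, max_iterations=100, seed=0):
    """Returns (labels, centroids); labels[i] is the cluster index of points[i]."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen users
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each user joins the nearest centroid.
        new_labels = [min(range(k), key=lambda j: squared_distance(p, centroids[j]))
                      for p in points]
        if new_labels == labels:  # nobody changed cluster, so we are done
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # keep the old centroid if a cluster emptied out
                centroids[j] = tuple(mean(col) for col in zip(*members))
    return labels, centroids

# Hypothetical users as (recency_days, frequency, monetary).
raw = [(2, 0.9, 500), (3, 0.8, 450), (1, 0.95, 520),
       (60, 0.05, 20), (75, 0.02, 10), (90, 0.01, 5)]
labels, centroids = k_means(standardize(raw), k=2)
print(labels)  # the first three users share one label, the last three the other
```

Because the starting centroids are random, the result can depend on the seed when the clusters are less clear-cut than in this toy data. Running it from several starting points and keeping the tightest result is a common precaution.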

Step 3: Assess

The clusters that fall out of K-Means might be subsets of R-F-M segments. Or they might be completely independent of your R-F-M work. But with K-Means Clustering, you can quickly have a whole new lens on customer segments that would have been invisible to your human eye.

All this prompts the same kinds of questions that R-F-M did, including the most important question of all: “Given their unique behavioral profiles, how might I treat the clusters differently – using product, marketing, support programs – to maximize their value?”

Tactic: Sequential Pattern Mining

This one you can do manually or by machine. The principle is the same either way. But let’s start assuming you are doing it by hand so it’s easier to imagine.

Step 1: Gather data

Gather lifetime data about a group of interesting users (could be an R-F-M segment, could be a K-Means cluster, or just your top 25 users).

Step 2: Write it all out

For just one user in that segment, write out the chain of events they experienced, the actions they took (including all marketing, product, and support interactions), from day zero to today. Not kidding. Write the event, put an arrow to the right of the event pointing at the next event. You want to make a map of that user’s journey that leads up to today.

Step 3: Repeat

Repeat for the next users on the list. And the next. And the next.

Step 4: Recognize patterns

Start to notice the overlaps in their journey as well as the differences. Write down your observations about the patterns, the sequence of steps they passed through. You will naturally build clusters of users who had similar journeys. It’s like you are stacking all the drawings from step 2 to get a master map of the whole population’s journey.

Step 5: Hypothesize

Write a hypothesis – to be used to inform product, marketing, support decisions – that explains the migration of users from newbie to high-value veteran. Or from newbie to low-value lapsed.

Now imagine doing steps one through five with a machine. You can process huge volumes of users very quickly. You can (or the machine can) spot sequential patterns that are non-obvious to humans, like non-adjacent sequential events predictive of high or low value. This is a very powerful form of behavioral segmentation: use events to sort and group users. With these insights you can modify your product, marketing, and support investment plans to maximize the good patterns and minimize the bad ones. Thank you, again, Industrial Revolution.
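
To make the machine version concrete, here is a deliberately simple sketch. It is not a full sequential pattern mining algorithm; it only looks at ordered pairs of events (A happened somewhere before B, adjacent or not) and compares how often each pair shows up in high-value versus low-value journeys. The event names, the function names, and the 0.5 cutoff are all my assumptions for illustration.

```python
from collections import Counter
from itertools import combinations

def ordered_pairs(journey):
    """All (earlier, later) event pairs in one journey, adjacent or not."""
    return {(a, b) for a, b in combinations(journey, 2) if a != b}

def pattern_support(journeys):
    """Fraction of journeys that contain each ordered pair at least once."""
    counts = Counter()
    for journey in journeys:
        counts.update(ordered_pairs(journey))
    total = len(journeys) or 1  # guard against an empty group
    return {pattern: n / total for pattern, n in counts.items()}

def distinguishing_patterns(high_value, low_value, min_gap=0.5):
    """Pairs whose support differs between the groups by at least min_gap,
    sorted with the most high-value-leaning pattern first."""
    high = pattern_support(high_value)
    low = pattern_support(low_value)
    gaps = {p: high.get(p, 0.0) - low.get(p, 0.0) for p in set(high) | set(low)}
    return sorted(((p, g) for p, g in gaps.items() if abs(g) >= min_gap),
                  key=lambda item: -item[1])

# Hypothetical journeys, each written out from day zero to today.
high_value = [
    ["signup", "tutorial", "invite_friend", "purchase"],
    ["signup", "invite_friend", "support_chat", "purchase"],
    ["signup", "tutorial", "purchase", "invite_friend"],
]
low_value = [
    ["signup", "purchase", "support_chat"],
    ["signup", "tutorial"],
    ["signup", "support_chat"],
]
patterns = distinguishing_patterns(high_value, low_value)
for pattern, gap in patterns:
    print(pattern, round(gap, 2))
```

A pattern that surfaces here is a hypothesis for Step 5, not a proven cause. In this toy data, inviting a friend after signup shows up in every high-value journey and no low-value one, which is exactly the kind of thing worth testing with a product or marketing change.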

Just like with R-F-M or K-Means Clusters, you can use the insights you gather from Sequential Pattern Mining behavioral segmentation to answer the fundamental question: How might I treat user segments differently – through product, marketing, or support – to maximize value?

Parting thoughts

In ten minutes of reading this and a few hours of playing with the numbers, you can be on the varsity squad of segmentation. If you can connect insights from simple, behavioral segmentation to your product, marketing, and support operations, you can be in the 99th percentile of all business people who have ever lived on Earth with respect to customer segmentation. It shouldn’t be that easy. But it is. :)

These tools are Swiss Army knives: generally useful in lots of situations. Use them and let me know if I can help you think through the analysis or the application of them to your business (md@harrisonmetal.com or @mcgd on Twitter). I have used these techniques – directly myself and coaching others – to help create billions of dollars of value. Segmentation works. Now go and try it.