When you want to analyze your marketing data, it is simply not realistic to look at each customer separately. True, it is beneficial to collect and store rich data for each customer; however, it is impossible to organize and communicate analyses that look at thousands or millions of individual customer records at the same time. Making decisions at a strategic level would be impractical.
Our brains simply cannot process information at such a granular level. At the same time, we know that we don't want to oversimplify it down to a one-size-fits-all approach. There has to be a middle ground where the customer’s voice is adequately heard, even if some segmentation of the user base is required.
In fact, there is a way to elegantly approach the challenge of segmenting customers. It is called cluster analysis, and it is one of the most accessible and explainable ways to apply machine learning on marketing data.
Why cluster analysis?
Let's take a step back before diving into this technique. It’s important to understand how cluster analysis differs from other approaches. If the goal is to segment customers, why can't you do this segmentation manually?
Well, you can. In fact, if you work with web analysis tools like Google Analytics, you are probably used to manually defining traffic and user segments of interest in order to keep the analysis focused on the right places.
This approach is very common and for good reason, but it has its limitations.
While it can be effective when working with a small number of user dimensions, it does not scale easily to a large number of user attributes. Luckily, when the human brain reaches its limit, advanced analytics and machine learning can provide solutions.
Prepare your data first
Cluster analysis is a fascinating technique and one of the top advanced analytics methods used in marketing.
To work effectively with clustering, your organization first needs a solid data foundation, and that means carefully preparing your data.
You'll want to make sure your basic digital marketing reporting needs are well taken care of. Having a solid automated data and reporting pipeline in place will free up resources, reduce human errors, and improve data quality.
The quantity and diversity of data also play a key role. The reason for this is that most of the advanced marketing analytics techniques such as clustering perform significantly better in the presence of larger volumes of granular data collected from a variety of sources.
Improvado can help with all of these aspects of your preparation before you dive into advanced marketing analytics, from automating your marketing reports to collecting and storing granular level data.
Use Cases for Marketing
Customer clustering is one of the best-known applications of cluster analysis. It helps marketers group together similar customer stories. Once you become familiar with the technique, there is no shortage of other marketing-related fields where you can meaningfully apply it.
Customer use case
You can cluster customers based on the many types of characteristics available about them and their behavior. For example, clustering can be based on:
- Customer browsing activity
- Customer demographics
- Recency, frequency, and monetary value of a customer
- Items bought by a customer
- Offline customer behavior
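To make the recency, frequency, and monetary value item more concrete, here is a minimal sketch of how those three numbers could be derived from a raw purchase log. The `purchases` data, the `rfm_features` helper, and the reference date are all hypothetical and only serve as an illustration.

```python
from datetime import date

# Hypothetical purchase log: (customer_id, purchase_date, amount).
purchases = [
    ("c1", date(2024, 1, 5), 40.0),
    ("c1", date(2024, 3, 20), 60.0),
    ("c2", date(2023, 11, 2), 250.0),
    ("c3", date(2024, 3, 28), 15.0),
    ("c3", date(2024, 3, 30), 20.0),
    ("c3", date(2024, 4, 1), 25.0),
]

def rfm_features(rows, as_of):
    """Return {customer_id: (recency_days, frequency, monetary)}."""
    features = {}
    for customer_id, purchased_on, amount in rows:
        days_ago = (as_of - purchased_on).days
        # Start from this purchase if the customer has not been seen yet.
        recency, frequency, monetary = features.get(customer_id, (days_ago, 0, 0.0))
        features[customer_id] = (
            min(recency, days_ago),  # days since the most recent purchase
            frequency + 1,           # number of purchases
            monetary + amount,       # total amount spent
        )
    return features

rfm = rfm_features(purchases, as_of=date(2024, 4, 2))
for customer_id, row in sorted(rfm.items()):
    print(customer_id, row)
```

Each customer ends up as one row of three numbers, which is the shape a clustering algorithm expects as input.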
Product use case
Another interesting use case is product clustering, which can be based on attributes of products such as:
- When the product was purchased
- Who purchased the product
- In which store the product was purchased
SEO use case
Likewise, you can apply cluster analysis to SEO keywords if you have data about:
- Keyword rankings
- Difficulty score
- Authority score
How clustering works
The basic concept
Now that you have seen how useful clustering is in a marketing context, it's time to gain some intuition on how it works. Incidentally, if you have been wondering how a machine learning technique can work in practice for marketing, this will give you a great sense. In fact, clustering is among the most widely used unsupervised machine learning techniques.
Why unsupervised? Because there isn't any ground truth that we want the machine to learn or predict. Instead, we want the data itself to reveal the natural structures within it. Sound confusing? It's not. To make the concept clearer, let's look at a simple example.
A simple example
Imagine you are in charge of a T-shirt company that wants to customize the fit of T-shirts for its customers. You have sample data regarding the height and weight of your customers. This is how the data looks when plotted in two dimensions:
The clustering algorithm labels each customer, represented by a point in the graph, with the cluster it fits best. The key is to make clusters as homogeneous as possible.
How are the clusters determined? The idea is to form clusters in a way that maximizes the similarity between the points of each group. “How is similarity defined?” you might ask. It's expressed as the distance between each possible pair of points: the smaller the distance, the more similar the two customers.
How can you measure that distance? This is where the Pythagorean theorem comes in (you might have heard of it in geometry class). If you have the x and y values of two points (in our example, the weight and height measurements of two customers), you can calculate the distance between them as the square root of the sum of the squared differences in x and in y. This simple calculation, based on this classic theorem, is the foundation of the clustering algorithm.
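The distance calculation can be sketched in a few lines of Python. The three customers and their measurements are made up for illustration, and the `distance` helper is just one way to write it.

```python
import math

def distance(a, b):
    """Straight-line distance between two (weight, height) points."""
    dx = a[0] - b[0]
    dy = a[1] - b[1]
    # Pythagorean theorem: the distance is the hypotenuse of dx and dy.
    return math.sqrt(dx ** 2 + dy ** 2)

# Hypothetical customers: (weight in kg, height in cm).
anna = (60.0, 165.0)
ben = (63.0, 169.0)
carl = (90.0, 190.0)

print(distance(anna, ben))                          # 5.0
print(distance(anna, carl) > distance(anna, ben))   # True
```

Anna and Ben are close together, so they are likely to share a T-shirt size, while Carl sits much further away from both.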
A marketing example
Hopefully by now, the information in this article has helped you to start connecting the dots.
As a next step, forget about heights and weights and think about some more realistic scenarios. While clustering with two variables might seem easy and intuitive, this is not the case when you start adding customer attributes. Once you move beyond three attributes, it's no longer possible to visualize the data directly.
Instead of measurements like height and weight, you now have variables such as customer income, age, purchase value, and so on. You can calculate the distances in the same way as in the simple example above until you find the optimal clusters.
This last step, however, cannot happen in one go. It should happen iteratively by following one of the several clustering algorithms available. The most common one is called k-means, which, as we'll see, comes with some favorable properties. K-means repeatedly assigns each point to its nearest cluster center and then moves each center to the average of its points, until the assignments stop changing.
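To give a feel for the iterative process, here is a bare-bones sketch of k-means in plain Python. It is a simplified teaching version rather than a production implementation: the sample points, the `k_means` function name, and its `iterations` and `seed` parameters are assumptions made for this example.

```python
import math
import random

def k_means(points, k, iterations=100, seed=0):
    """Minimal k-means: returns (labels, centers) for a list of tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)      # start from k randomly picked points
    labels = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:         # nothing moved, so we have converged
            break
        labels = new_labels
        # Update step: each center moves to the mean of its points.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:                  # keep the old center if a cluster is empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers

# Hypothetical customers: (weight in kg, height in cm).
points = [
    (55, 160), (58, 162), (60, 165),
    (70, 174), (72, 175), (75, 178),
    (90, 188), (92, 190), (95, 192),
]
labels, centers = k_means(points, k=3)
print(labels)
print(centers)
```

The same function works unchanged with more than two attributes per customer, because `math.dist` handles points of any dimension.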
Once the algorithm determines the optimal clusters, the ball is back in your court.
The marketer's role
As a marketer, you need to use your domain knowledge, intuition, and experience to give descriptive names to the clusters produced by the algorithm and, of course, ensure that the outcomes make sense from a practical and business standpoint. You might want, for example, to experiment with adding or removing one or more of the initial attributes and then rerun the algorithm to check if it produces more meaningful clusters.
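One practical aid for naming clusters is to look at the average of each attribute per cluster. The sketch below assumes you already have one cluster label per customer. The attribute names, the customer rows, and the `cluster_profiles` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical output of a clustering run: one label per customer.
attributes = ("age", "income", "avg_purchase")
customers = [
    (24, 28_000, 35.0),
    (27, 31_000, 40.0),
    (45, 82_000, 120.0),
    (51, 90_000, 140.0),
]
labels = [0, 0, 1, 1]

def cluster_profiles(rows, labels, names):
    """Average every attribute per cluster to help describe and name it."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    return {
        label: {name: sum(col) / len(col) for name, col in zip(names, zip(*members))}
        for label, members in grouped.items()
    }

profiles = cluster_profiles(customers, labels, attributes)
for label, profile in sorted(profiles.items()):
    print(label, profile)
```

With profiles like these, cluster 0 might be described as younger, lower-spending customers and cluster 1 as older, higher-spending ones. The final naming remains a judgment call for the marketer.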
Applying the clustering technique
- To prepare for clustering, you'll need to have granular level data for each customer, each product, etc. This technique simply doesn't work with aggregate data.
- Ideally, if your data lives in different places you’ll want to collect them and store them in a data warehouse such as BigQuery, Redshift, or Snowflake for easy access. Remember that Improvado is here to help you with this.
- Before applying the technique, you'll need to make sure that the data is numeric or converted into a numeric form so that the mathematical distances can be calculated.
- You might also need to normalize the data of the various attributes if they are expressed in different scales. Otherwise, an attribute with large values, such as income, would dominate the distance calculation. One way to do this is by converting the values of each attribute so that they range between zero and one, while preserving the order and relative spacing of the original values.
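The last two preparation steps, converting data to numeric form and normalizing it, can be sketched as follows. This is a minimal illustration that assumes one-hot encoding for categories and min-max scaling for numbers, which are common choices but not the only ones. The sample customers and both helper functions are hypothetical.

```python
def one_hot(value, categories):
    """Turn a category into 0/1 indicator columns."""
    return [1.0 if value == category else 0.0 for category in categories]

def min_max_scale(column):
    """Rescale a numeric column to the range [0, 1]."""
    low, high = min(column), max(column)
    if high == low:                # constant column: no spread to preserve
        return [0.0 for _ in column]
    return [(x - low) / (high - low) for x in column]

# Hypothetical customers: (age, yearly income, preferred channel).
raw = [
    (25, 30_000, "email"),
    (40, 60_000, "social"),
    (55, 120_000, "email"),
]
channels = ["email", "social"]

ages = min_max_scale([row[0] for row in raw])
incomes = min_max_scale([row[1] for row in raw])
prepared = [
    [age, income, *one_hot(row[2], channels)]
    for age, income, row in zip(ages, incomes, raw)
]
for row in prepared:
    print(row)
```

After this step every column lies between zero and one, so no single attribute outweighs the others when distances are calculated.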
Once the data is ready from a preprocessing point of view, there are a few options as to how to apply the algorithm:
- If you have data scientists on your team, they can use open source programming languages such as R or Python for such tasks.
- SaaS and other analytics tools like Tableau have integrated functionality to allow users to perform clustering in a drag-and-drop fashion.
- With the right add-on packages, it is also possible to carry out clustering in Excel.
- These days, another very convenient way to do this is via BigQuery, especially if you are familiar with SQL syntax. Implementation of clustering can be accomplished within a few lines of SQL code with the option to immediately visualize results.
Cluster analysis in practice
The image below shows what the outcome of a cluster analysis might look like in practice. This particular example is from Tableau, which provides a built-in function for clustering. A large number of products have been grouped into three distinct clusters, based on their sales value and profit ratio.
The clustering algorithm could have included many more variables. But even with just these two, the result of the analysis can be really informative. For instance, if you are in charge of marketing and product strategy, you now have a data-driven way to prioritize the products based on which “performance” cluster they belong to. Notice also the presence of some outliers that might require your special attention!
Clustering, despite its merits, is not the perfect solution for all segmentation use cases. Here are some pros and cons of clustering to keep in mind:
Pros
- It is a very interpretable technique and is easy to visualize.
- It is efficient to implement and can easily scale to large data with millions of records.
- It is dynamic. The definitions of clusters evolve as data changes.
- It can be used as a data exploration technique to better understand data before making decisions.
Cons
- The cluster analysis result is not deterministic. With k-means, the starting cluster centers are typically chosen at random, so different executions of the algorithm might return different results.
- With k-means clustering, the marketer must predefine the number of clusters, which is not always an easy, straightforward decision.
- Some preprocessing of the data is needed before applying the technique, as discussed in the section on applying the clustering technique.
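The drawbacks around run-to-run variation and choosing the number of clusters are commonly softened with two habits: running the algorithm from several random starts and keeping the tightest result, and comparing a tightness score across different cluster counts (often called the elbow method). The sketch below assumes a compact plain-Python k-means. The sample points and the `best_of` and `within_cluster_ss` helpers are hypothetical names for this example.

```python
import math
import random

def k_means(points, k, seed, iterations=100):
    """Compact k-means; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = None
    for _ in range(iterations):
        new_labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers

def within_cluster_ss(points, labels, centers):
    """Sum of squared distances from each point to its cluster center."""
    return sum(math.dist(p, centers[label]) ** 2 for p, label in zip(points, labels))

def best_of(points, k, restarts=10):
    """Run k-means from several random starts and keep the tightest result."""
    runs = [k_means(points, k, seed) for seed in range(restarts)]
    return min(runs, key=lambda run: within_cluster_ss(points, *run))

# Hypothetical customers: (weight in kg, height in cm).
points = [
    (55, 160), (58, 162), (60, 165),
    (70, 174), (72, 175), (75, 178),
    (90, 188), (92, 190), (95, 192),
]
scores = {k: within_cluster_ss(points, *best_of(points, k)) for k in range(1, 6)}
for k, score in scores.items():
    print(k, round(score, 1))
```

A common rule of thumb is to pick the number of clusters after which the score stops dropping sharply. Fixing the seeds, as done here, also makes the result reproducible from one run to the next.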
Great, now that all the steps have been followed and some interesting clusters have been produced, what’s next?
Well, there are many options depending on the exact use case.
For clustering of customers and prospects, you can use the clusters to:
- customize your re-targeting and re-marketing strategies
- better adjust promotional and other types of marketing messages
- customize the product for the various personas to better fit their needs
- personalize the website design and UI.
When clustering is used on the product level, it is possible to better capture cross- and up-selling opportunities between the different product clusters.
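To act on clusters in day-to-day campaigns, you typically need to place new customers or products into an existing cluster. A simple way to do that is to pick the nearest cluster center, as sketched below. The cluster names, center values, and the `nearest_cluster` helper are hypothetical, and the attributes are assumed to be already scaled between zero and one.

```python
import math

# Hypothetical cluster centers on already-normalized attributes:
# (recency, frequency, monetary value), each scaled to 0..1.
centers = {
    "loyal high spenders": (0.1, 0.9, 0.8),
    "occasional shoppers": (0.5, 0.3, 0.3),
    "lapsed customers": (0.9, 0.1, 0.2),
}

def nearest_cluster(customer, centers):
    """Name of the cluster whose center is closest to the customer."""
    return min(centers, key=lambda name: math.dist(customer, centers[name]))

new_customer = (0.2, 0.8, 0.7)
print(nearest_cluster(new_customer, centers))  # loyal high spenders
```

The returned name can then drive which message, offer, or page variant the customer sees.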
Clustering is a natural fit for marketing. It reveals the natural structure in marketing data. It is a great tool for data exploration and it is relatively easy to explain and visualize. It is also one of the most accessible machine learning techniques for marketing. It is very effective in clustering customers, products, keywords, ad groups, you name it!