How to Build a Custom Marketing Attribution Model [Guide]
Allocating marketing budget efficiently is problematic without knowing the contribution of each marketing channel to a user’s lifetime value and your overall conversions.
The problem arises because analytics and marketing tools typically report only the last click that occurred and are biased towards their own data. Plus, they can’t always detect other channels reliably.
As companies and marketing budgets grow, customer journeys become more complex and convoluted since the number of marketing channels they use increases.
Here’s an example. Let’s think about a conversion path in which a user first interacts with a Facebook ad, then performs a Google search for the product and finally opens the company’s newsletter before converting.
In this case, the same conversion would be reported differently in different tools.
- Facebook would get credit in Facebook Ads Manager, and yet
- Google would get credit as the last paid channel in Google Analytics.
As the example above shows, relying on third-party tools for attribution will not reveal the whole customer journey. Each tool also carries inherent biases because of the data it has access to and the way it treats that data.
Three steps to create a custom marketing attribution model
Why would anybody need to create a custom revenue attribution model? Because it lets you give credit where credit is due across the different channels on your customer journey.
Having a data-driven model helps you to:
- Allocate your budget to the most effective channels
- Target the most valuable users based on their lifetime value.
Step #1. Collect data for your attribution model
In order to create an attribution model, you need all the customer paths (converting and non-converting) that occurred on your website or app.
A path consists of the touchpoints (clicks) that the user interacted with during a typical conversion window.
Touchpoints are captured through the UTM tags used in your campaigns.
Capturing UTM Tags
UTM tags are parameters appended to the URLs of marketing campaigns. They do not modify the destination of the URL but pass on information that can be captured by analytics tools.
For example, a URL from a Facebook campaign would look like:
This URL directs people to example.com. Everything after the ? is not part of the web address but passes on parameters to identify information for the origin source of the traffic. These parameters fire when the above URL is loaded.
The parameters used to identify the sources of traffic are:
- utm_source: The channel that brought the traffic (e.g. Google, Facebook etc)
- utm_medium: The type of traffic (e.g. social, paid search etc)
- utm_campaign: The name of the campaign
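As a rough illustration of how these three parameters can be read from a landing-page URL, here is a minimal Python sketch using only the standard library. The URL and its parameter values are hypothetical and exist only for the example; the function name `extract_utm` is ours as well.

```python
from urllib.parse import urlparse, parse_qs

UTM_KEYS = ("utm_source", "utm_medium", "utm_campaign")

def extract_utm(url: str) -> dict:
    """Return the UTM parameters found in a landing-page URL."""
    query = parse_qs(urlparse(url).query)
    # parse_qs returns a list per key; keep the first value of each UTM tag.
    return {key: query[key][0] for key in UTM_KEYS if key in query}

# Hypothetical tagged URL, for illustration only.
url = "https://example.com/?utm_source=facebook&utm_medium=social&utm_campaign=spring_sale"
touchpoint = extract_utm(url)
print(touchpoint)
```

A URL without any UTM tags simply yields an empty dictionary, which is the case where analytics tools fall back to the referrer.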
Note that you can manually choose the UTM tags for your campaigns, but you can also assign them dynamically. For example, this article shows you how to use dynamic UTM tagging for your Facebook campaigns.
If you don’t use UTM tags for your campaigns, the analytics tools that you use will capture the URL of the referrer but will label it as organic instead of paid.
Capturing user level data
The second thing to consider is user-level data, i.e. the ability to identify users by assigning a unique ID to them if possible. In that case, you’ll be able to have a proper cross-channel and cross-device attribution and not rely just on cookies.
Analytics tools like Mixpanel or Heap give you the ability to identify device IDs and user IDs, and use the combination of the above dimensions to identify multiple devices for each user across the customer journey.
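To make the idea concrete, the sketch below merges touchpoints from several devices into one path per user. It does not use the Mixpanel or Heap APIs; the event tuples, the `device_to_user` mapping, and the function name `stitch_paths` are all assumptions made for the example.

```python
from collections import defaultdict

def stitch_paths(events, device_to_user):
    """Group touchpoints by user, falling back to the device ID when unknown.

    `events` is a list of (device_id, timestamp, channel) tuples.
    `device_to_user` maps device IDs to user IDs once a user has been identified.
    """
    paths = defaultdict(list)
    # Process events chronologically so each path reads first touch to last touch.
    for device_id, timestamp, channel in sorted(events, key=lambda e: e[1]):
        identity = device_to_user.get(device_id, device_id)
        paths[identity].append(channel)
    return dict(paths)

# Hypothetical events: two devices belong to the same user, one is anonymous.
events = [
    ("phone-1", 1, "facebook"),
    ("laptop-7", 2, "google"),
    ("tablet-3", 3, "newsletter"),
]
device_to_user = {"phone-1": "user-42", "laptop-7": "user-42"}
paths = stitch_paths(events, device_to_user)
print(paths)
```

Without the mapping, the phone and laptop touchpoints would look like two unrelated single-touch journeys.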
User-level data is particularly important if you intend to connect Google Analytics data with your Customer Relationship Management system (CRM).
UTM tags alone are problematic when used with CRMs. So, to properly track revenue from your CRM, you need to match every visitor to a ClientID through the Google Analytics API.
This allows you to get valuable data about the channels that led them to your website or store, even without a UTM tag.
Capturing conversion & revenue data
The attribution model data consists of the converting and non-converting paths of your users, built from the touchpoints that you get from the UTM tags. You will also need the conversions and the respective value (revenue) of those conversions, which are likely found in your CRM.
These data points will help you calculate the conversions, revenue and ROI for each marketing channel.
Step #2. Pull all your data together
In order to create your own custom attribution model without relying on the attribution models your analytics tools provide, you will need to import all the data we described above into one place, ideally a database or data warehouse.
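As a small sketch of what "one place" can look like, the example below loads touchpoints and CRM conversions into an in-memory SQLite database and joins them per user. SQLite is only a stand-in for a real warehouse, and the table names, column names, and rows are hypothetical. The join also assumes at most one conversion per user.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for a real database or warehouse
conn.executescript("""
    CREATE TABLE touchpoints (user_id TEXT, ts TEXT, utm_source TEXT, utm_medium TEXT);
    CREATE TABLE conversions (user_id TEXT, ts TEXT, revenue REAL);
""")
conn.executemany(
    "INSERT INTO touchpoints VALUES (?, ?, ?, ?)",
    [
        ("u1", "2024-02-01", "facebook", "social"),
        ("u1", "2024-02-10", "google", "paid search"),
        ("u2", "2024-02-05", "google", "paid search"),
    ],
)
conn.executemany(
    "INSERT INTO conversions VALUES (?, ?, ?)",
    [("u1", "2024-02-12", 120.0)],
)

# LEFT JOIN keeps non-converting users; their revenue comes back as NULL (None).
rows = conn.execute("""
    SELECT t.user_id, t.utm_source, c.revenue
    FROM touchpoints AS t
    LEFT JOIN conversions AS c ON c.user_id = t.user_id
    ORDER BY t.user_id, t.ts
""").fetchall()

paths = {}
for user_id, source, revenue in rows:
    entry = paths.setdefault(user_id, {"path": [], "revenue": revenue})
    entry["path"].append(source)
print(paths)
```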
An easy and efficient way to do this is to use an ETL tool like Improvado which enables you to connect all your marketing data in minutes, saving massive amounts of time and developer resources.
Once you have all the metrics and dimensions required for the attribution model imported into your database, you should consider a few factors for your model.
Step #3. Decide on an attribution window
Decide on an attribution window based on your data and business considerations like the purchase cycle for your products. The attribution window is the time period during which a purchase should be credited to a touchpoint that happened within that period.
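The effect of the window can be sketched in a few lines of Python: only touchpoints that fall inside the window before the conversion are kept in the path. The 30-day default, the dates, and the function name `path_within_window` are assumptions for illustration, not recommendations.

```python
from datetime import datetime, timedelta

def path_within_window(touchpoints, conversion_time, window_days=30):
    """Return the channels touched within `window_days` before a conversion.

    `touchpoints` is a list of (timestamp, channel) tuples for one user.
    """
    window_start = conversion_time - timedelta(days=window_days)
    eligible = [
        (ts, channel)
        for ts, channel in touchpoints
        if window_start <= ts <= conversion_time
    ]
    # Order chronologically so the path reads first touch to last touch.
    return [channel for ts, channel in sorted(eligible)]

# Hypothetical journey: the Facebook click is about two months before the purchase.
touchpoints = [
    (datetime(2024, 1, 2), "facebook"),
    (datetime(2024, 2, 20), "google"),
    (datetime(2024, 2, 27), "newsletter"),
]
conversion_time = datetime(2024, 3, 1)

short_window = path_within_window(touchpoints, conversion_time, window_days=30)
long_window = path_within_window(touchpoints, conversion_time, window_days=90)
print(short_window, long_window)
```

With the shorter window the Facebook click drops out of the path entirely, which is why the window should reflect your real purchase cycle.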
Different industries have different purchase cycles and that affects their attribution window.
For example, it takes much longer for a customer to decide on the purchase of a vacation package worth thousands of dollars than to buy an inexpensive t-shirt. Generally, expensive purchases have long cycles that might take months and dozens of touchpoints to complete, whereas cheaper and impulse purchases might take only a few hours from the first touchpoint to the conversion.
Analytics tools like Google Analytics provide reports that help you see the distribution of users based on how long it took them in terms of time and number of touchpoints to convert.
Select a Data-driven Attribution Model as a Foundation
There are two widely accepted data-driven models for attribution:
- Shapley Value
- Markov Chains.
The inputs needed for both models are the touchpoints and the conversions, which, as stated above, are part of the data that you will import into your database.
Using the Shapley Value Attribution Model
Shapley Value - named after the Nobel Prize-winning economist Lloyd Shapley - is a game theory model for cooperative problems. In other words, the model tries to assign credit to different parties that contributed to a total value. This is also the question we’re trying to answer with an attribution model, namely how much credit every marketing channel should get for making a user convert along the path.
The Shapley model is also the one used by Google for their own data-driven attribution model in Google Analytics 360. However, by creating your own model you will have better control over your data and will avoid the biases that Google Analytics might have by giving more credit to Google Search.
In order to calculate the contribution of a channel under the Shapley Value model, we compare all the different permutations of paths and touchpoints that occurred. For example, we take two paths that differ by a single touchpoint and we assign the difference in total value to that extra touchpoint, since it is the only difference between the two.
Then we compute all the permutations and we assign credit to each channel accordingly. Thus, the model calculates the probability of conversion when a specific channel is present in the conversion path.
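The following Python sketch shows one common simplification of this calculation, not Google’s implementation. It assumes the value of a set of channels is the number of conversions from paths that used only channels in that set, and it ignores touchpoint order and non-converting paths. The input counts and the function name `shapley_attribution` are hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_attribution(conversions_by_channel_set):
    """Split conversions across channels using Shapley values.

    `conversions_by_channel_set` maps a frozenset of channels to the number
    of conversions from paths that contained exactly those channels.
    """
    channels = sorted(set().union(*conversions_by_channel_set))
    n = len(channels)

    def value(coalition):
        # Worth of a coalition: conversions from paths that used only its channels.
        return sum(
            count
            for channel_set, count in conversions_by_channel_set.items()
            if channel_set <= coalition
        )

    credit = {}
    for channel in channels:
        others = [c for c in channels if c != channel]
        total = 0.0
        for size in range(n):
            # Standard Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = frozenset(subset)
                # Marginal contribution of adding `channel` to the coalition.
                total += weight * (value(coalition | {channel}) - value(coalition))
        credit[channel] = total
    return credit

# Hypothetical conversion counts per set of channels seen in a path.
data = {
    frozenset({"facebook"}): 10,
    frozenset({"google"}): 20,
    frozenset({"facebook", "google"}): 30,
}
credit = shapley_attribution(data)
print(credit)
```

In this example the 60 conversions are split into 25 for Facebook and 35 for Google, so the credits still add up to the total. The number of coalitions grows exponentially with the number of channels, so this brute-force version is only practical for a handful of channels.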
Using the Markov Chain Attribution Model
The Markov Chain model - named after the Russian mathematician Andrey Markov - describes the sequence of various events and tries to make predictions based on them. Once again, we try to assign the probability of a user converting when exposed to various marketing channels.
The Markov Chain model assigns credit to marketing channels by calculating the removal effect. The removal effect measures how many conversions would still take place if a marketing channel were removed from every path.
By calculating all the different permutations of paths and the removal effects for every touchpoint, we end up with a probability to convert for each marketing channel.
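A minimal first-order Markov sketch of the removal effect is shown below, under several assumptions: each path is a list of channels plus a converted flag, a removed channel sends the user straight to a non-converting state, and credit is shared in proportion to the removal effects. The state names, function names, and sample paths are hypothetical. The conversion probability is approximated by simple iteration rather than solved exactly.

```python
from collections import defaultdict

START, CONV, NULL = "(start)", "(conversion)", "(null)"

def transition_probs(paths):
    """Estimate transition probabilities from (channels, converted) paths."""
    counts = defaultdict(lambda: defaultdict(int))
    for channels, converted in paths:
        states = [START, *channels, CONV if converted else NULL]
        for a, b in zip(states, states[1:]):
            counts[a][b] += 1
    return {
        a: {b: n / sum(nexts.values()) for b, n in nexts.items()}
        for a, nexts in counts.items()
    }

def conversion_prob(probs, removed=None, iterations=200):
    """Approximate the chance of reaching CONV from START.

    Visiting the `removed` channel counts as ending in NULL.
    """
    reach = defaultdict(float)
    reach[CONV] = 1.0
    for _ in range(iterations):
        for state, nexts in probs.items():
            if state == removed:
                continue
            reach[state] = sum(
                p * reach[nxt] for nxt, p in nexts.items() if nxt != removed
            )
    return reach[START]

def markov_attribution(paths):
    """Share total conversions across channels by their removal effects."""
    probs = transition_probs(paths)
    base = conversion_prob(probs)
    channels = [s for s in probs if s != START]
    effects = {c: 1 - conversion_prob(probs, removed=c) / base for c in channels}
    total_conversions = sum(1 for _, converted in paths if converted)
    total_effect = sum(effects.values())
    return {c: total_conversions * e / total_effect for c, e in effects.items()}

# Hypothetical paths: (channels in order, did the user convert?)
paths = [
    (["facebook", "google"], True),
    (["facebook"], False),
    (["google"], True),
    (["google"], False),
]
credit = markov_attribution(paths)
print(credit)
```

In this toy data, removing Google eliminates every conversion while removing Facebook eliminates a third of them, so Google receives three times the credit of Facebook. The sketch assumes at least one converting path; otherwise the base probability is zero and the division fails.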
Shapley Value vs Markov Chain Attribution Models
In both the Shapley and the Markov model, the output is a matrix of all marketing channels and a probability or credit for all conversions that occur thanks to each of those channels.
The above table is an example of the output of a custom attribution model compared to a standard last-click model. Note that the total number of conversions is the same for both models, but what changes is the allocation between different channels. Moreover, the data-driven model can have fractional conversions, since credit for a conversion is given to multiple channels.
You can also calculate the revenue and ROI for each of the channels since you have conversions, revenue and marketing cost in your database. This will help you allocate your marketing budget across channels.
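As a simple sketch of that last step, ROI per channel can be computed as (revenue - cost) / cost once attributed revenue and cost sit side by side. The figures and the function name `channel_roi` are hypothetical.

```python
def channel_roi(attributed_revenue, cost):
    """Return ROI per channel as (revenue - cost) / cost."""
    roi = {}
    for channel, spend in cost.items():
        revenue = attributed_revenue.get(channel, 0.0)
        # Channels with no spend (e.g. an in-house newsletter) have no defined ROI.
        roi[channel] = (revenue - spend) / spend if spend else None
    return roi

# Hypothetical attributed revenue and marketing cost per channel.
attributed_revenue = {"facebook": 1500.0, "google": 900.0}
cost = {"facebook": 1000.0, "google": 600.0, "newsletter": 0.0}
roi = channel_roi(attributed_revenue, cost)
print(roi)
```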
How to run a “Lift Test”
In the models and data mentioned above, we talked about capturing touchpoints via UTM tags. UTM tags are captured only when a user clicks, which means that channels that rely mainly on impressions (mainly social media) will be underrepresented.
This also has a similar impact on display advertising, where visitors mostly convert after viewing your display ads multiple times across different content networks.
In order to incorporate impressions into your model, you should consider running lift tests for channels like Facebook and Instagram as they rely on impressions more than other channels.
A lift test is a randomized controlled test where we randomly split an audience into a test group and a control group. We only show ads to the test group.
The difference in conversions between the two groups is known as lift or incrementality and represents the real impact of a channel’s ads on the audience. Moreover, since this is based on the concept of randomized controlled trials, it also incorporates the concept of causality, meaning that we know that it was the ads that caused the extra conversions.
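The arithmetic behind lift is straightforward and can be sketched as follows. The group sizes and conversion counts are hypothetical, and a real test would also need a significance check, which this sketch leaves out.

```python
def lift(test_conversions, test_size, control_conversions, control_size):
    """Return (incremental conversions, relative lift) for a lift test."""
    test_rate = test_conversions / test_size
    control_rate = control_conversions / control_size
    # Conversions in the test group beyond what the control rate would predict.
    incremental = test_conversions - control_rate * test_size
    relative_lift = (test_rate - control_rate) / control_rate
    return incremental, relative_lift

# Hypothetical test: 10,000 users per group, ads shown only to the test group.
incremental, relative_lift = lift(
    test_conversions=600, test_size=10_000,
    control_conversions=500, control_size=10_000,
)
print(f"{incremental:.0f} incremental conversions, {relative_lift:.0%} lift")
```

Here the ads account for roughly 100 extra conversions, a lift of about 20% over the control group.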
A good practice is to regularly run lift tests (e.g. once a quarter) so that you can see the effect of Facebook, Instagram and other impression-heavy channels and calibrate your attribution model accordingly.
On the other hand, if you run an ad campaign but purchases occur on a marketplace (like Amazon), econometric modeling is your only option for assessing your campaign’s efficacy.
Lift test vs attribution model
Both attribution models and lift tests are useful and should work in conjunction to give the best possible results. They both have their advantages and limitations, as you can see in the table below:
Lift test:
- One data point in time
- Based on results, not on arbitrary rules
- Baseline (organic, brand effect) is taken into account
- Impressions are taken into account (but not segregated)
- Not all channels have lift tests (imperfect alternatives like matched market test, before-after etc exist)
Attribution model:
- Approximation based on model
- Tool that can be used on a daily basis
- Rule-based unless you build a data-driven model
- Gives little to no credit to organic
- Impressions hard to track (depends on channel)
- Models all digital channels. Offline is problematic
Incorporating Offline Activities into your attribution model with “Matched Market Tests”
For offline activities (TV, billboards etc) it is recommended to run Matched Market Tests, where you take two similar geographic areas and use them as a test and control group to get results. The calculation of results is similar to a lift test but we have to acknowledge that this is not a perfect test, as the audiences are not randomized.
You can also employ before-after tests, where you compare two periods of time with different marketing activities.
Something we have to take into account when running all sorts of tests is duration and seasonality.
A rule of thumb is that tests should last for at least one week (and ideally 4 weeks or more), since there might be fluctuations of conversions for different days (e.g. a lot more conversions on weekends compared to weekdays). Moreover, you should avoid periods during which you experience big increases or decreases (e.g. Christmas, Black Friday).
What is the Best Attribution Model?
With numerous marketing attribution models available, many marketers want to find the one model that generates the best results for every business.
While that would be nice to have, there is no one-size-fits-all, “ultimate” attribution model at the time of writing. Your ideal choice of attribution model largely depends on what you intend to achieve.
However, we’ve noticed many instances where marketers record significant boosts in conversion after switching to a data-driven attribution model. Your chosen data-driven model could be the Shapley Model or the Markov Chain Model already described above.
The only catch is that your business needs to have a solid backlog of conversion data to be eligible for data-driven attribution. Thus, newer businesses can’t benefit from it.
If you can’t use data-driven attribution, then you can work with any rule-based attribution model. There are several types of rule-based attribution models:
3. Last Non-Direct Touch
4. Time Decay
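To show how simple the rules behind these models are, here is a hedged Python sketch of the two models named above. The 7-day half-life, the `"direct"` channel label, and the function names are assumptions for the example; pick values that match your own purchase cycle and naming.

```python
def last_non_direct_touch(path):
    """Give all credit to the last channel in the path that is not direct traffic."""
    for channel in reversed(path):
        if channel != "direct":
            return channel
    return "direct"  # the path contained only direct visits

def time_decay_credit(touchpoints, half_life_days=7.0):
    """Split one conversion across touchpoints, favoring recent ones.

    `touchpoints` is a list of (channel, days_before_conversion) tuples.
    A touchpoint's weight halves every `half_life_days`.
    """
    weights = [
        (channel, 0.5 ** (days / half_life_days)) for channel, days in touchpoints
    ]
    total = sum(weight for _, weight in weights)
    credit = {}
    for channel, weight in weights:
        credit[channel] = credit.get(channel, 0.0) + weight / total
    return credit

# Hypothetical journey ending in a conversion.
winner = last_non_direct_touch(["facebook", "google", "direct"])
credit = time_decay_credit([("facebook", 14), ("google", 7), ("newsletter", 0)])
print(winner, credit)
```

With a 7-day half-life, the touchpoints above receive weights of 0.25, 0.5 and 1, so the newsletter opened on the day of conversion earns four times the credit of the Facebook click from two weeks earlier.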
We suggest trying different optimization methods using each attribution model to figure out the one that gives the highest boost in sales. Make sure to run these tests with proper planning and involve skilled analysts if needed.
Data is a powerful force in the 21st-century marketing ecosystem. Data-driven businesses are able to scale and maintain a solid level of competitive advantage in their industry.
With that being said, marketing attribution is one of the many ways data can be used to make efficient marketing decisions for businesses. However, using this concept the traditional way can come with several challenges, as already mentioned in different sections of this article.
This is where Improvado comes to the rescue.
Improvado’s attribution modeling enables you to optimize your campaigns by associating your leads and purchases with the right sources. We understand that not all leads are equal, so we provide the infrastructure you need to discover the sources that give you the hottest leads so you can drive conversions as fast as possible.
Improvado supports popular analytics platforms, including Google Analytics, Mixpanel, Adobe Analytics, and more. Furthermore, you can connect these platforms with your favorite CRMs, including HubSpot, Salesforce, Pipedrive, and more.
Adopting an intelligent approach to marketing attribution can help you boost your ROI by up to 25%.