Running conversion rate optimisation (CRO) tests is part and parcel of any good marketer’s toolbox in 2019, but knowing which testing methodology to use is something many ponder over. The right choice depends on your point of view and overall objective. Below is an explanation of each methodology, so you have a clear understanding of what each one does and when to use it.
Types of Testing:
A/B/n testing is the most-used type of testing. It compares two or more versions of the same page, with only one difference between them, to see which performs better.
Performance is monitored by sending a certain number of users to each variant of the page, usually through a testing platform such as Google Optimize, Optimizely or VWO. You typically see only moderate changes being made, such as changing the colour of a button, re-writing text or moving elements on a page to make them more or less prominent.
The A in the name of this test refers to the original version of a page (known as ‘control’ or simply ‘A’) and the new version is named ‘B’. The ‘n’ signifies that you can run A/B tests with several variations.
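To make "which performs better" concrete, here is a minimal sketch of one common way to compare two variants: a two-proportion z-test. The visitor and conversion figures are made up for illustration, and testing platforms may use different statistics internally, so treat this as an assumption-laden example rather than how any particular tool works.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled conversion rate, assuming there is no real difference
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_error
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical results: control (A) vs new version (B)
z, p = two_proportion_z_test(conv_a=200, n_a=5000, conv_b=250, n_b=5000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value (0.05 is a common threshold) suggests the difference is unlikely to be down to chance alone.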
An uncommon type of testing, A/A testing is the process of testing two identical pages against each other. It is best for ensuring your data is accurate: because the pages are identical, any significant difference between them points to a problem with your setup rather than with the page. Data integrity is vital to ensuring your CRO activity is a success and yields results you can rely on.
The best time to start A/A testing is as soon as you’ve set up a new testing tool, such as Optimizely or Google Optimize. It takes very little time to set up an A/A test for a page, so it’s worth doing. We recommend running A/A tests periodically to ensure no other changes to a website are having a detrimental impact on your data integrity.
Often confused with A/B testing, split testing compares two different pages, each with its own URL, against each other. This is the key difference: you are testing two different pages as opposed to small changes on one page.
This can be extremely time-consuming, as new pages need to be designed and built and may need new text, imagery or functionality. Split testing is best employed when making large-scale changes and if you have only a small amount of traffic coming to your website or specific page. If you have a lot of traffic, it’s best to make changes incrementally using A/B/n testing.
Multivariate testing is very similar to A/B/n testing, but with one key difference: it combines multiple variations to find the best possible combination. You create variations of multiple elements on a page and the test tries all the possible combinations to find what works best.
Multivariate testing is only viable if your website or the page specifically in question receives a lot of traffic, as it must be able to test all the possible variations. Otherwise you may be waiting a long time to get any statistically significant data.
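The traffic requirement is easier to see with some arithmetic. The sketch below counts the combinations for a hypothetical page; the element names, variations and visitor figure are all assumptions chosen for illustration.

```python
from itertools import product

# Hypothetical page elements and their variations (illustrative only)
elements = {
    "headline": ["Original", "Benefit-led"],
    "button_colour": ["green", "orange", "blue"],
    "hero_image": ["product", "lifestyle"],
}

# Every possible combination of one variation per element
combinations = list(product(*elements.values()))

monthly_visitors = 12000  # assumed traffic to the page
visitors_per_combination = monthly_visitors // len(combinations)

print(len(combinations))
print(visitors_per_combination)
```

Three elements with two, three and two variations already produce 12 combinations, so each one receives only a twelfth of the traffic.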
If you’ve ever attempted CRO, it is likely you adopted A/B testing. There’s nothing wrong with it, but there may be a better approach for you. It’s the go-to method for many as it allows you to easily compare one difference on a page.
Bandit testing is an alternative testing methodology that we recommend every marketer has in their arsenal. Don’t think of it as a replacement for A/B testing entirely, as that still has its uses. But bandit testing may mean you can scale faster or get the most from the traffic to your website.
An alternative to the traditional 50/50 split, the bandit method gives priority to the best performing variations over the duration of the test, learning and adapting as it goes to meet your objective.
The multi-armed bandit problem is a mathematics problem usually described using a casino setting. It takes its name from the one-armed bandit, a nickname for a slot machine.
Imagine you walk into a casino with £200, and in front of you stand 20 one-armed bandits (slot machines). With a goal of maximising your returns, how do you know which ones to play?
The problem is obvious: you need to know more about the machines. If you knew which one paid out the most, you could spend your time purely on that one, as it would ensure you walk out of that casino with the biggest return.
Now apply this problem to conversion rate optimisation and you can see clear similarities. If you knew which layout, copy or colour worked best you would run that page all the time. But you don’t. You need to know more about the machines first.
Explore vs Exploit: The conundrum comes from dividing your time between exploring the different bandits to see what their pay-outs are and exploiting any that have been favourable so far.
The pros and cons:
Traditional A/B/n testing spends a set period exploring each option, then jumps to exploiting the one that performed best. This has one major drawback:
- It wastes resources on inferior options for the sake of gathering more data
Simply put, it can waste vital time, sales or leads whilst it figures out which is best.
Bandit testing aims to solve this problem by being adaptive and using machine learning to explore and exploit simultaneously. As it identifies variations that perform better, it adapts to push more traffic to them.
Often referred to as earning while you are learning, this maximises opportunity and minimises regret.
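Regret here means the conversions you miss out on by not sending every visitor to the best variation. A rough sketch of the calculation, using assumed conversion rates (in a real test these are unknown, which is the whole problem):

```python
def expected_regret(true_rates, traffic_share, visitors):
    """Expected conversions lost compared with always showing the best variation."""
    best_rate = max(true_rates.values())
    expected = sum(
        visitors * traffic_share[name] * rate
        for name, rate in true_rates.items()
    )
    return visitors * best_rate - expected

# Assumed true conversion rates, for illustration only
true_rates = {"A": 0.04, "B": 0.06}

even_split = {"A": 0.5, "B": 0.5}      # traditional 50/50 test
adaptive_split = {"A": 0.1, "B": 0.9}  # traffic shifted towards the leader

regret_even = expected_regret(true_rates, even_split, visitors=10000)
regret_adaptive = expected_regret(true_rates, adaptive_split, visitors=10000)
print(round(regret_even), round(regret_adaptive))
```

Under these assumptions the even split gives up around 100 conversions over 10,000 visitors, while the split that favours the better variation gives up around 20.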
There are numerous types of bandit testing algorithms, with the most popular being the Epsilon-Greedy algorithm.
The Epsilon-Greedy algorithm has been shown to perform well consistently. It works by keeping track of how many times each variation has been shown, how many times it had the desired outcome and how many times it didn’t. Most of the time it shows the variation with the best results so far (exploitation); for a small share of traffic, known as epsilon, it shows a variation chosen at random (exploration). It adjusts as new results come in, balancing exploitation and exploration.
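As a minimal sketch of the idea, assuming an epsilon of 0.1 and made-up conversion rates for three variations, an Epsilon-Greedy bandit could look like the following. The class and method names are hypothetical and not taken from any testing platform.

```python
import random

class EpsilonGreedy:
    """Tracks shows and conversions per variation and picks what to show next."""

    def __init__(self, variations, epsilon=0.1, rng=None):
        self.epsilon = epsilon
        self.rng = rng or random.Random()
        self.shown = {v: 0 for v in variations}
        self.converted = {v: 0 for v in variations}

    def conversion_rate(self, variation):
        shown = self.shown[variation]
        return self.converted[variation] / shown if shown else 0.0

    def choose(self):
        # Explore: with probability epsilon, show a random variation
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.shown))
        # Exploit: otherwise show the best performer so far
        return max(self.shown, key=self.conversion_rate)

    def record(self, variation, converted):
        self.shown[variation] += 1
        if converted:
            self.converted[variation] += 1

# Simulated visitors with assumed (normally unknown) conversion rates
true_rates = {"A": 0.04, "B": 0.06, "C": 0.05}
rng = random.Random(42)
bandit = EpsilonGreedy(true_rates, epsilon=0.1, rng=rng)
for _ in range(10000):
    variation = bandit.choose()
    bandit.record(variation, rng.random() < true_rates[variation])

print(bandit.shown)
```

The printed counts show how traffic was shared out; over a long enough run, most of it should go to whichever variation has the best observed conversion rate.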
The Epsilon-Greedy method is easy to implement and, because it never stops exploring, it copes with seasonality or changes in trends. However, it does have some drawbacks. It takes no account of the variance in its estimates, so a variation with very little data is treated the same as one with plenty, and it leaves open the question of whether exploration should decrease over time.
There are alternative methods that try to address such issues, but most simply try to balance exploration against exploitation in different ways. Alternatives:
- Upper Confidence Bound (UCB)
- Thompson sampling
- Bayesian Bandits
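For comparison, here is a rough sketch of Thompson sampling, one of the alternatives listed above. It assumes a uniform Beta(1, 1) prior for each variation and uses made-up results; the function name and data layout are hypothetical.

```python
import random

def thompson_choose(stats, rng):
    """Pick a variation. stats maps name -> (conversions, non_conversions)."""
    draws = {
        # Sample a plausible conversion rate from each variation's Beta posterior
        name: rng.betavariate(1 + wins, 1 + losses)
        for name, (wins, losses) in stats.items()
    }
    # Show the variation whose sampled rate is highest
    return max(draws, key=draws.get)

rng = random.Random(7)
stats = {"A": (40, 960), "B": (60, 940)}  # assumed results so far
picks = [thompson_choose(stats, rng) for _ in range(1000)]
print(picks.count("B") / len(picks))
```

Because each choice is a random draw, variations with little data still get shown occasionally, while clear leaders receive most of the traffic.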
When it comes to testing on your site, these alternatives are usually overkill. Statisticians and analysts will argue about the various methods but often miss the point. The Epsilon-Greedy method delivers results quickly and efficiently.
Bandit vs A/B/n Testing:
Which should you use? The easy answer is both! If you want to answer a simple question, such as whether one variant is better than another, then A/B/n testing would suffice.
Beyond that, ask yourself which will be more beneficial to you or your business. Matt Gershoff of Conductrics said “If you actually care about optimisation, rather than understanding, bandits are often the way to go.”
Bandit testing is often preferable due to its sympathetic approach to risk and reward. Any business that is conducting tests will want to maximise the impact of a successful test as soon as possible. Standard A/B testing does not offer this, as the test must run until you have statistically significant data to make a judgement call. Bandit testing adapts as it goes, ensuring the variation that is making you the most money is given preference.