Smart marketers A/B test businesses, not websites. Here’s how.
Let’s talk about how A/B testing is more than a way of improving conversions on your website or email marketing campaign. Let’s talk about how it’s a way of improving your entire business.
Don’t worry if you’re new to A/B testing – we’ll run through the basics. And if you’re experienced, we’ll keep it interesting for you by busting a few myths along the way. Then we’ll air our dirty laundry with a few examples from our own archive, and look at how a better way of talking about and planning A/B tests can transform your entire approach to marketing.
It’s not about changing button colours…
A lot of articles you read about A/B testing jump straight into the “middle” of a test, with sensational headlines like “How changing ONE button colour led to a 600% increase in sales” or “Halving the number of form fields DOUBLED our conversions”. We don’t mean to belittle this type of article – the advice they give is often good! But they very often just confirm what most user interface designers would consider ‘best practice’.
This type of test can still have value, though. If your team is missing an interface designer, or if you’re designing in an environment with multiple stakeholders and the experience, talent or intuition of the designer is being called into question, A/B tests are a good way of settling debates: Either you’re right, you’re wrong, or it doesn’t make a difference and you’re wasting time talking about it.
And it’s not about changing headlines either…
Another common A/B test is the ‘change the words around’ test, where a change in a heading, call to action or microcopy results in an increase in leads or an increase in sales. We’re fairly keen on this type of test as words are subtle, devious things, and the way you combine them can have unexpected results.
The challenge with this type of test is setting good goals. Occasionally, changing a headline on a landing page may improve metrics deep in your conversion funnel (for example, increasing purchases because it single-handedly changes a new visitor’s understanding of what you are offering), but just as often, it only affects behaviour on the page where you have changed the headline (i.e. it makes your visitors a bit more keen to look around, but the rest of your site still needs to step up and convert them from a visitor to a customer).
To detect the changes in behaviour resulting from wording changes, you may have to use something like CrazyEgg heatmaps or Optimizely custom events to track user interaction with the page you’re running the test on, for example:
- After seeing the new headline, did more people scroll further down the page?
- After seeing the new headline, did more people click a ‘more’ link on one of our case studies?
- After seeing the new headline, did people spend longer on the page?
Lance at CopyHackers runs A/B tests every day for companies of all shapes and sizes. He’s forgotten more about A/B tests than most people will ever learn, and he offers some essential advice on this topic: don’t focus on more than one metric.
If you pick two metrics (for example ‘played video on landing page’ and ‘signed up to mailing list’), and your test causes them to pull in different directions, two things could go very wrong:
- Best case scenario – you’ll end up confused, and decide to take no action. This sounds foolish, but it’s better than…
- Worst case scenario – you continue to optimise for the metric that increased, because that ‘feels good’. But unfortunately the metric which decreased is the one that was more important to your business. In the example I gave above, would you really optimise for people playing a video, over people giving you their email address and permission to market to them?
If you absolutely must focus on more than one metric, be very clear which metrics are most closely linked to you ‘making money’, and have the guts to ignore changes to ones which aren’t. If a variation results in more engagement on a heatmap, but fewer sales, it’s not a winner! Don’t fool yourself with vanity metrics.
So what is it about?
So we’re fans of both the “user interface tweak” test and the “change the words around” test. But we’re not always fans of the writing and discussion we see about A/B testing. Too often, we chat about the topic to other marketers and startups and hear things like:
- “A/B testing, that’s about testing different photos and seeing which one gets most people to sign up for your site, right?”
- “We tried A/B testing, but it’s useless because we can’t design tests that increase sales”
This breaks our hearts. Let’s look at why…
A/B testing – done right
More than a way of optimising on-site conversions, A/B testing is a way of optimising your whole business by carrying out ongoing, data-driven market research (and if you can also use that research to increase on-site conversions, great!).
To feel its full benefits, you have to spend some time in the unfashionable parts of the A/B testing process: the beginning and end. (Not just the fun bit in the middle where you get to feel like a superhero designer tinkering with your website in Optimizely). Let’s look at the stages you should be carrying out in every single A/B test you run:
- Hypothesis – the theory your test sets out to prove or disprove.
- Implementation – building and running the test itself.
- Analysis and conclusions – working out what the results actually tell you.
- Follow up – this is the crucial stage which most businesses miss.
Hypothesis

This is the difference between a worthwhile and a worthless A/B test. A hypothesis is the ‘theory’ you set out to test. In the tests we’ve mentioned so far, the hypotheses under test would be things like “a more prominent button gets more clicks”, “a shorter form puts fewer people off”, or “this headline better matches what our visitors are looking for”.
The hypothesis should be something you can actually test. It’s really important to get this clearly defined and documented, because everything else in your test hinges on it. If your hypothesis is murky, you’ll waste time in your testing tool changing things that don’t need to be changed, and you’ll draw bad conclusions from the test results.
A common complaint is “we don’t know what to test”. A clear hypothesis should tell you what to test! If you’re stuck for a good hypothesis, it may be that you’re thinking too much about your site, and not enough about your visitors. Step away from the site, and make educated guesses about what could encourage even more of the human beings who visit your site to ‘convert’ (whatever ‘conversion’ means to you).
If you’re still stuck, it may be time to bring in external help in the form of a consultant – often a fresh pair of eyes is all that’s needed.
Implementation

This is the fun part, and as a lot has already been written about it all over the web, we won’t labour the point. We use Optimizely, but there are plenty of alternatives available. See the resources section at the end of this post!
Don’t worry if the variations you create in Optimizely aren’t completely beautiful. The aim of the test isn’t to win design awards, it’s to learn something new about your visitors that helps you sell more. If we create a successful variation that’s ugly, we can use this variation (and the conclusions we draw from it – see the next section) as a brief to our talented designer (hi Ryan!), to create a beautiful new page that incorporates what we learned from the test. In fact, it can be wise to try variations that are a bit ugly – challenge everything, including the assumption that a ‘beautiful’ design is what sells.
Finally, don’t forget that you are smarter than your testing tools. I can’t say this better than Justin at WhichTestWon, another massively-respected A/B testing pro:
If you see a major lift, say in the triple digits on the first day of a test, the testing tech will conclude that the changes are statistically significant at probably around a 99% confidence rate. Do you call the test after the first day? Of course not, there are so many factors! We need to humanize the numbers and account for changes in interaction based on time of day, week, and other considerations.
Again, the danger is that you might let your testing tools tell you what you want to hear. If you expect a variation to do well, and Optimizely (or your tool of choice) tells you it’s doing well, don’t end the test early! Have the courage to question yourself – has the test run long enough? Are there factors happening outside of your site (for example a targeted Adwords campaign) which are driving unnatural traffic to your site which would make this variation perform better in the short term, but make it a bad long-term idea?
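To get a feel for why a huge day-one lift can mislead, here’s a minimal sketch of the two-proportion z-test that most testing tools run under the hood. This is a simplified illustration using made-up numbers, not Optimizely’s actual implementation:

```python
import math

def significance(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test: two-sided p-value for
    'variation B converts at a different rate to variation A'."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value via the standard normal CDF
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# Day one: a triple-digit lift on a tiny sample looks 'significant'...
print(significance(3, 50, 12, 50))        # p < 0.05 on 100 visitors
# ...but a modest difference over thousands of visitors may not be
print(significance(300, 5000, 330, 5000))  # p > 0.05 on 10,000 visitors
```

The maths will happily declare a winner on 100 visitors; it’s your job to keep the test running until the sample reflects your real traffic, across days of the week and traffic sources.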
Analysis and conclusions
This is the hard part, but the most interesting part. And, from many of the discussions we’ve had, the part that people struggle with most. We can’t give you a ‘quick fix’, or ‘3 surprisingly simple steps to automatically become an A/B test analysis superstar’, but we can set you on the right track. Like anything else, the more tests you run, the better you’ll get at it.
It’s easy to fall into the trap of making bad conclusions from an A/B test. For example, in the ‘fewer form fields’ test we’ve already used as an example, the simple conclusion is:
“Fewer form fields means more people complete that form. Specifically, halving the number of fields doubled our conversions, therefore we should expect a similar doubling if we halve the remaining form fields. Let’s do it!”

That may be true, but it’s not exactly what you tested. Take an 8-field sign-up form, for instance. If you ran a split test which cut it down to 4 fields, and doubled signups for your e-book over the test’s duration…
All you truly know is that removing the exact 4 fields you actually removed resulted in double signups. Think about it for a second. Perhaps a different 4 fields wouldn’t have the same effect. Perhaps removing just one of the fields caused the doubling. Maybe your visitors just resented being asked their age and saw it as annoying or intrusive. You simply don’t know for sure, and unless you test every possible combination of fields you can’t know for sure.
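To get a feel for how many untested alternatives there are, here’s a quick sketch. The field names are made up for illustration:

```python
from itertools import combinations

# Hypothetical sign-up form fields
fields = ["name", "email", "company", "phone",
          "job title", "company size", "age", "country"]

# How many distinct sets of 4 fields could you have removed?
removable = list(combinations(fields, 4))
print(len(removable))  # 70 possible ways to cut 8 fields down to 4
```

Your test compared exactly one of those 70 possibilities against the original – and that’s before you consider removing 1, 2, or 3 fields instead.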
Don’t worry though. This is where practice and a clear hypothesis come in. You know your audience – was one of the fields likely to cause difficulty? If you’re not confident in your conclusions, why not run a new test comparing the 4-field form against the original form with only the ‘annoying’ field removed?
The most important thing about the analysis phase is keeping a record of your conclusions, so you can revisit them and keep learning from them. (To save you wasting time, we’ve given you a template spreadsheet at the end of this post – take it and use it!)
Why is recording your conclusions important? Well, keep the above ‘form’ example in mind and imagine you ran a second test to check your conclusions. Surprisingly the 7-field form performs as well as the 4-field form – it was just that one ‘troublesome’ field all along! You may as well keep the 7-field form as it gathers more information for your inbound leads database.
However, if in a few months’ time a new designer joins your team and questions the size of the form, you run the risk of restarting this whole process, unless you’ve got a record of your design decisions and the motivations behind them. This is a simple example, but especially when you start doing more subtle tests (for example headline/call to action changes), a record of all your tests and their conclusions will save you hours in going over old ground.
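If you don’t fancy a spreadsheet, even a small script appending each test to a CSV log will do. This is a hypothetical sketch – the column names and the example entry are ours, not a prescribed format:

```python
import csv
from datetime import date

# One row per test: what you believed, what you changed, what happened
record = {
    "date": date.today().isoformat(),
    "hypothesis": "The 'age' field puts visitors off signing up",
    "change": "Removed the 'age' field (7-field form vs original 8)",
    "primary_metric": "e-book signups",
    "result": "7-field form performed as well as the 4-field form",
    "conclusion": "The troublesome field was the problem; keep the other fields",
}

with open("ab_test_log.csv", "a", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=record.keys())
    if f.tell() == 0:   # new file – write the header row first
        writer.writeheader()
    writer.writerow(record)
```

The format matters far less than the habit: whatever tool you use, write the conclusion down while the test is fresh in your mind.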
For another example, let’s look at the ‘change the words around’ tests we mentioned earlier. Imagine your hypothesis is ‘our customers are more interested in saving money than time’, so you test your current headline:
Our widget-refreshing service will save your team 12 hours a week – guaranteed!
against a B-variation of:
“This widget-refreshing service saved me $1200 a month! – Ms. A. Happy-Customer”
… and you see a 200% increase in brochure downloads. You conclude that your site visitors care about saving money, but not saving time, right?
Maybe. Can you spot the problem?
Maybe including a real customer testimonial caused the increase. Maybe your customers actually care more about saving time, and if you combined a ‘time saving’ message with a customer testimonial, you would have seen a 400% increase in downloads!
To avoid making this kind of analysis error, run tests which only change one thing.
(In this example, you could actually run an A/B/C/D test – “cost”, “time”, “cost+testimonial”, “time+testimonial” – if your traffic levels are high enough.)
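If you do run a four-way test like this, each visitor needs to be consistently assigned to one variant, so they see the same headline on every visit. Your testing tool handles this for you, but here’s a rough sketch of the underlying idea, using a hash of a visitor ID (the IDs and variant names are illustrative):

```python
import hashlib

VARIANTS = ["cost", "time", "cost+testimonial", "time+testimonial"]

def assign_variant(visitor_id: str) -> str:
    """Deterministically bucket a visitor into one of four variants.

    Hashing the ID means the split is roughly even across variants,
    and the same visitor always lands in the same bucket.
    """
    digest = hashlib.md5(visitor_id.encode()).hexdigest()
    return VARIANTS[int(digest, 16) % len(VARIANTS)]

print(assign_variant("visitor-42"))
```

The catch, as mentioned above, is traffic: splitting visitors four ways means each variant needs its own statistically meaningful sample, so the test takes correspondingly longer to call.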
Hopefully these examples show you how errors can creep into your analysis, and how you can avoid them. Remember – keep clear hypotheses, and change one thing to make your job of analysing a test easier!
One last piece of advice – don’t let your own biases sneak into your analysis. You might be utterly convinced that something is true, but if a well designed A/B test (clear hypothesis, changing one thing, picking metrics which actually relate to the thing you changed), shows you are wrong, you may well be wrong. Being wrong is often uncomfortable, but if being wrong leads to a 500% increase in sales, it can be easier to take.
Follow up (The biggest missed opportunity in A/B testing)
So you’ve run an A/B test, drawn some conclusions, and implemented them on your site. You rush back into Optimizely and start a new test, right?
This is where lots of A/B testers make a big mistake. It’s great that you’ve found a change you can make on your website which increases conversions – but your business is so much more than your website!
Let’s assume you ran the “save time vs. save money” test properly and confirmed that your site visitors prefer saving time to saving money (incidentally, this is an AWFUL value proposition – try harder!). That’s a really valuable bit of market research. Are you just going to use it to improve your website, or are you going to use it to improve your whole business? Here are just a few places you can apply this knowledge before you dive into your next A/B test:
- Do you have any printed sales literature? Are your findings significant enough to change it?
- Do your social media bios need updating?
- Do you need to brief your whole team on the findings? How are they going to internalise the findings and use them when they discuss your company at networking events?
- Do you rely on content generation for your marketing? Does your content schedule need to change?
- Do you have personas which guide your content creation and your web copy? Do they assume that your customers prefer saving time? Do they need updating?
- What about your product itself? If you sell something complicated like Software as a Service (SaaS), you might need to change the messaging within the product itself, or even alter its functionality to emphasise the ‘time saving’ benefits it offers.
This is a contrived example – but it shows you just how far-reaching the changes you need to make from a website A/B test can be. This is the most exciting, and least discussed benefit of split testing – it’s not just a way of boosting conversions on your site, it’s a data-driven way of doing market research.
To clarify some of the points we’ve made in this post, we’ll share some of our own A/B tests. They say you learn best from mistakes, so here’s your chance to learn from ours.
Here’s how our website looked last November. I’ve cropped the screenshot roughly where it crops on my (large) laptop screen. Our hypothesis? We aren’t providing enough ‘information scent’ to people who land on our site to encourage them to explore further. They just see a headline – is that going to make them want to scroll?
Our simple, but brutal A/B test of that hypothesis? Remove the image.
Even on small screens, you’re guaranteed to see more information when you land on the site.
(Results graph: goal ‘Reached signup form’ – the new variation came out ahead.)
OK – looks like the change we made was positive! What’s our conclusion?
Well, because our hypothesis was clear, and we made one change, we’re pretty confident that we weren’t giving people enough to engage them. The ‘B’ variation isn’t as pretty, but it’s what people want to see!
Do we need to make any changes in the business as a result of this? Not this time. This was a simple test, with a simple outcome. Let’s look at another.
Value proposition change
Last year, we were exploring different variations on the first headline on our homepage. On the expert advice of Jo at CopyHackers, we were using a simple value proposition:
But because we were lacking in market research, we weren’t sure what the ‘perfect’ value proposition for our target audience was. Our hypothesis? “Visitors might be looking for ‘customer feedback’, not ‘customer service’”. Let’s try a value proposition about customer feedback and see what happens:
Engagement is up – by a statistically significant margin!
More people reached the plans and pricing page, but this time, the numbers aren’t enough for Optimizely to tell us that the results are significant. It’s close though!
So, what do we conclude?
It’s tricky! Have you spotted the errors we made? Our hypothesis was pretty solid, but…
- “Simplified” isn’t quite the same as “the easiest”. Maybe bringing in this new, stronger word was responsible for the increased conversions and visits to plans and pricing. Maybe our visitors just don’t care about the difference between ‘customer feedback’ and ‘customer service’ (If you read this blog often, you’ll know that they should!)
- Our goals for this test aren’t great. ‘Reached plans and pricing’ is all well and good but did we make any more money? (Answer: no – I haven’t included that graph because it’s a bit boring!). It would be good to set up some more on-page goals to see if the ‘Customer Feedback’ headline is having an effect on people in the moments after they read it.
But still – more engagement and more people looking at how much it costs to get the benefits we provide. We concluded that people are more interested in ‘Feedback’, so doubled down on that. We updated our personas, our social media profiles, and used this learning when we redeveloped the ‘welcome guide’ to CustomerSure – ensuring that we focused on helping people get feedback rather than carry out a more general, less focused customer service program.
You could argue that the results of this test weren’t strong enough to draw these conclusions, and you’d have a good point. But this is where the experience and intuition that I mentioned come in: we are pretty clear that we offer customer feedback you can act on, not customer service, and we have a back-story to prove it. If this had been a surprise to us, we might have run more tests, but as it was, it was all we needed to double down on our existing messages.
Hopefully, we’ve given you a solid framework to build your own A/B tests upon. But more than that, we’ve shown you that whilst A/B testing often starts with your website, it can end up touching your entire marketing strategy. If you’ve got any examples of A/B tests you’ve run which have extended far beyond your site, use the comment form to let us know about them, we’d love to write a sequel to this post and feature you!
Rather than give you all this advice and leave you to go it alone, we’ve collated a few tools and some further reading to help you on your way to A/B success!
- Testing record spreadsheet: Copy this spreadsheet and use this to track all your tests and conclusions!
- Optimizely: Optimizely is what we use to run our A/B tests, and we love it.
- Visual Website Optimizer: A lot of A/B pros swear by VWO as an alternative to Optimizely.
- Unbounce allows you to build landing pages and A/B test them. Great if you don’t have the HTML skills to make pages from scratch.
- “How do you determine what is statistically significant when looking at the results of an A/B test?” on Quora
- “How to run A/B tests that give your business big wins” on KissMetrics.
- Smashing Magazine have a pretty comprehensive guide to A/B testing.
Some marketers struggle to run A/B tests because of low traffic to their sites. This article explains how to make the most of a bad situation:
- Which Test Won? publish a regular newsletter packed with advice for marketers interested in A/B testing.
Any more resources we should add to this list? Let us know in the comments.
Are you great?
Would you like to know the techniques we use to help service teams get great at customer feedback?
We’ve compiled them into a short, practical ebook you can start using in your organisation today. Download it for free now.