We wouldn’t dream of running an A/B or Multivariate test without a solid hypothesis in place. These little statements are the tiny hearts that power an idea through to completion.
But what is it about this statement that makes it so invaluable? And how specifically has it helped us? I’m hoping this article will give you an answer to these questions, as well as convince you to use them in your testing program (if you’re not already doing so, that is).
Best of all, I’ll be using a real example as a case study!
Where do hypotheses come from?
We get user feedback all the time: from usability studies, feedback forms, comments on Twitter, our parents, etc.
One of the recurring bits of feedback we got was that some users didn’t understand what “Direct” meant within the context of a rail journey. Could there be simpler terminology we could use?
The feedback suggested the following hypothesis:
“On the search results page, saying ‘0 changes’ instead of ‘Direct’ will improve clarity and confidence, therefore leading to an increase in conversion.”
Looking at this statement, it does a number of things:
- It tells us what we want to change (as well as the page we want to apply the change to)
- That two variables are related, namely the language of the number of changes and the conversion rate
- How and why we think there is a relationship between the two variables
We wanted to create an experiment to validate this statement!
How the hypothesis validates itself
With any hypothesis, it’s essential to interrogate it properly.
So going back to our statement, we paid special attention to the part that said:
“…will improve clarity and confidence…”.
Remember I said that the statement is showing us how two variables are related? Well, this part provides the justification for that relationship — the “how” and “why” a relationship exists, i.e.
[Variable 1] … will improve clarity and confidence, therefore leading to … [Variable 2]
Justifying how two variables are related helps us “buy into” a hypothesis. Without that justification, the statement would feel weak and too much like it’s a leap in logic.
Our hypothesis originated from user feedback, so the justification seemed to be backed up by qualitative insight. Even so, it’s a good idea to cross-reference with as many other sources of data as possible, so we reviewed past tests, and also checked if there was any quantitative data from analytics packages to help us size the opportunity.
In this instance, all we had to go on was user feedback.
Another thing to note was that the test idea centred around the “search results page”.
That meant we needed to know what the drop-off rate for this page was. We also needed to know the traffic volume: the more traffic a given page has, the greater the chance of reaching a high level of statistical significance with a test result.
How the hypothesis allows us to combine test ideas
Another check we do with our hypothesis is to compare it with other ideas in our backlog to find other tests that are similar.
Once we find a similar idea, we have a choice: either roll these tests into one, or just plan a test launch order for them, perhaps starting with the simplest test that will give us some validation, and go from there.
In this instance, we also found this in our backlog:
“On the search results page, increasing the prominence of journey details will provide customers with more clarity for the journey they are choosing, thus increasing overall confidence and therefore conversion.”
Here again is an image to help you with context.
Clicking on each one of those columns brings up an overlay with more journey details. This idea is about improving the clarity of each of those columns, and giving it some prominence for the user.
Both our hypotheses focus on the same area of the site, so we felt there was an opportunity to combine these ideas into a single test. After all, they both target the same area of the page, and the hypotheses complement each other nicely.
We hoped that combining the two tests would amplify any impact we measured. We essentially decided to run a multivariate test (MVT).
What’s the deal with an MVT?
So we had two hypotheses and two changes to deal with. We decided on conducting an MVT. But what does that mean for our test design?
Let’s review both hypotheses again…
H1: “On the search results page, saying ‘0 changes’ instead of ‘Direct’ will improve clarity and confidence, therefore leading to an increase in conversion.”
H2: “On the search results page, increasing the prominence of journey details will provide customers with more clarity for the journey they are choosing, thus increasing overall confidence and therefore conversion.”
We decided on the changes that would best validate each hypothesis.
H1: we wanted to change the text to read “0 changes”, instead of “direct”.
H2: we wanted to tweak the design to draw more attention to the journey details columns (notice the cell background colours? It’s a subtle change, I know).
These are the two factors in our test. i.e. They are the changes we want to measure. An MVT means that we get to test all combinations of these factors.
For our example, we arrived at the following variations:
- Control: No changes
- Variation 1: H1 change only
- Variation 2: H2 change only
- Variation 3: H1 change + H2 change
You still with me? The reason we had these combinations was because we wanted to measure the effects of all of our factors, not just in conjunction with one another (Variation 3), but separately as well (Variations 1 and 2).
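The variation list above is just the full set of factor combinations, so it can be generated mechanically. A minimal sketch, with hypothetical factor names standing in for our two changes:

```python
from itertools import product

# Hypothetical factor names; each factor is either off (original copy/design)
# or on (the changed version).
factors = {
    "H1_zero_changes_copy": [False, True],
    "H2_prominent_journey_details": [False, True],
}

# Every combination of on/off across the factors gives one variation.
variations = []
for combo in product(*factors.values()):
    enabled = [name for name, on in zip(factors, combo) if on]
    variations.append(enabled or ["control"])

for i, variation in enumerate(variations):
    print(f"Variation {i}: {' + '.join(variation)}")
```

Two factors with two levels each yield the four variations listed above; adding a third two-level factor would double that to eight, which is why MVTs demand so much more traffic than simple A/B tests.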
For reference, variation 3 (with all changes) looks like this:
This way, we could attribute any conversion effect to a single change or to a combination of changes, so we’d be able to tell exactly what caused a conversion increase or decrease.
The point is that anything we build should be designed in such a way that it answers the questions raised by our hypotheses.
How the hypothesis helps to keep our focus
As an aside, I breezed over the “coming up with a design” part for the respective test ideas above. That’s because the designs for these came easily.
That’s not always the case though. For some test designs, we find that, while looking for a solution, we over-complicate the final design. On some occasions, we try to change too much; on others, we get lost in details that are either inessential or too broad for the core focus of the test.
Whatever the reason, when this happens, we fall back on our hypothesis to help us regain our focus. It’s a nice way of reminding us why we’re running a test in the first place.
Making sure we have all the metrics covered
A hypothesis is also great during the build phase. It helps us ensure we have all the metrics we need to measure any effects that our changes might have.
In order for it to help us, we study our hypotheses carefully, taking special consideration of all the variables mentioned in them. The idea is to think about how we’re going to measure each of them. We’re predicting that these variables are related, remember? So we need to add metrics that prove that relationship.
In our case, we decided to measure clicks on the journey columns. We predicted an increase in clicks here.
As for the text change, it’s difficult to measure what people see. However, because we were only applying the text change where the wording “Direct” appeared, we knew that we needed to measure searches where that text appeared. That way, we could be sure we were measuring users who had definitely seen the “0 changes” wording.
Next, we needed to ensure we were measuring overall conversion, as well as step-conversion. We measure those for all tests anyway, so that part was easy.
Checking if the hypothesis is validated
Weeks later, we had a concluded test to analyse!
Much time may have elapsed since the formation of that idea, so the hypothesis acts as a reference point for the analyst, reminding them what they’re looking for during analysis.
These are the possible outcomes of the test:
- Hypothesis validated (Win!)
- Hypothesis invalidated (Lose!)
- Hypothesis inconclusive (No measured effect)
In our case, our test was a success! Both our hypotheses were validated, and we found that the changes in combination with each other did amplify the win. So that was great! These validations also helped us uncover new ideas, and thus new hypotheses were born.
Additionally, and this goes without saying, we recommended putting these changes into production straight away.
It’s worth noting, however, that even if this test had been unsuccessful or inconclusive, we would still have learned something. Perhaps we’d have proved that there was no relationship between our variables after all. Or maybe we’d have uncovered new friction points, like inconsistent messaging across the site. This would mean potentially revising the test conditions and trying again.
We could also see some surprising insights, which may lead to new hypotheses as well.
Either way, win or lose, the findings of any test should lead to more hypotheses, and these should in turn lead to more validations or invalidations, leading to yet more hypotheses!
As I mentioned at the beginning of the article, we’re huge fans of the hypothesis. Hopefully you now understand why that is.
It all sounds really obvious though, doesn’t it? After all if you’re running a scientific experiment, you want to stick to the scientific method, right?
However, within the context of a business, it’s sometimes easy to fall into the trap of just “winging it” with an idea that you think will prove itself. But if you do that, you’d be missing out on all the benefits that a hypothesis provides.
Written well, a hypothesis will guide your A/B testing program from start to finish. And if you are hypothesis-driven, the entire process almost runs itself!
About the author
My name’s Iqbal and I’m part of the conversion optimisation team at Trainline. I’m fascinated by every aspect of the optimisation process — ideation, building, and analysis of tests — and I’m passionate about learning new stuff.