A/B Split and Multivariate Testing: Statistics/Evidence Based Optimization

by Ruben S
For the past few weeks we have been talking about pay-per-click, conversion rates, and other topics about what we need our sites to do. How do we go about accomplishing our goals of a higher conversion rate, or of having more people visit our sites? We could go about this in two ways. First, we could go willy-nilly changing items on our site and hope that this accomplishes our task. Although some people might make a few lucky guesses, this is no way to systematically measure what our changes actually accomplish. The second method is to make changes based on statistical evidence. Companies no longer guess at which changes should be kept or scrapped; they rely on statistical evidence to make and keep changes on their sites. There are several methods a company can use to test its changes, but the two I will cover today, which happen to be the most common, are A/B (Split) Testing and Multivariate Testing. With this type of testing, we can implement different changes to our site and let our customers use the site as they naturally would. Customers actually using our site give us feedback on which changes work and which do not. Each method has its advantages and disadvantages, but both are far better than not doing any controlled testing at all.

Why Use Statistical Analysis?

In making changes to a site, it is better to optimize for the long term, not just for clickthroughs. In the old days of website optimization, the person with the highest paycheck was the one who decided which changes to make, and as you will see with A/B and Multivariate Testing, making the right call can prove extremely difficult without data to back up your decision. Changes to a site should be made based on the Overall Evaluation Criterion, or OEC (Kohavi, R., Longbotham, R., Sommerfield, D., & Henne, R. M., 2009). The organization has to determine its OEC, though; this is not done by individuals. The other thought to keep in mind when performing statistical experiments is to focus on quantitative metrics, because there will not always be an explanation as to “why” a change works. Two issues with controlled experiments on websites are the Primacy Effect and consistency. With the Primacy Effect, a change to the site, to the navigation bar for example, may at first degrade the customer experience even if the change is better, because returning customers are used to the old layout. Consistency issues may arise when customers use multiple computers, because assignment to an experiment is often cookie based, so the same customer may see different versions on different machines. A/B Split Testing and Multivariate Testing do their best to mitigate these issues, and they are the two most popular methods of controlled testing.
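To make the consistency issue concrete, here is a minimal sketch of one way to keep assignments stable across computers. It assumes the site has a stable user identifier, such as a login ID, which is not something the sources above require; the function name, experiment name, and variant labels are all hypothetical. Instead of storing the assignment in a cookie, it hashes the identifier, so the same user lands in the same group on any machine.

```python
import hashlib


def assign_variant(user_id: str, experiment: str, variants=("A", "B")) -> str:
    """Deterministically map a user to one variant of an experiment.

    Assumption: user_id is a stable identifier (for example a login ID).
    A cookie value would also work here, but it would differ per browser.
    """
    # Include the experiment name so different experiments split users differently.
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    # Interpret the hash as a large integer and pick a bucket from it.
    bucket = int(digest, 16) % len(variants)
    return variants[bucket]


if __name__ == "__main__":
    # Hypothetical user IDs and experiment name, for illustration only.
    for uid in ("user-1", "user-2", "user-3"):
        print(uid, assign_variant(uid, "checkout-redesign"))
```

Calling the function twice with the same user and experiment always returns the same variant. Visitors who never log in still need a cookie or a similar fallback, so this reduces the consistency problem rather than eliminating it.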

A/B (Split) Testing

One of the big online companies to have wildly successful A/B testing is Amazon. From an early stage, Amazon had a culture where data trumps intuition and a system that made running experiments easy, which allowed innovation at Amazon to move quickly and effectively (Kohavi, R., Sommerfield, D., & Henne, R. M., 2007). One such innovation pioneered by Amazon, credited to Greg Linden, was the idea of showing recommendations based on cart items. The advantage of this idea was that it would increase the average size of the basket by making it easier to cross-sell items. The idea was shut down by an executive who said the feature would reduce conversion rates because it would distract people from checking out. On its face, the executive's statement seems like a logical conclusion. Greg Linden thought it was incorrect, though, so he ran a simple A/B test, and the test proved that the feature was successful. Without this test, it might have been many years before we saw that feature on many of our websites.

I hope this famous scenario makes you ask, “So what exactly is A/B Testing?” Well, I am glad you asked. In website optimization, A/B (Split) Testing refers to the testing of multiple versions of the same webpage (Kaushik, A., 2006). Each version is a separate page. So, for example, you might have two or three different home pages, or for the Amazon example you would have two different carts. This is the simplest of the testing methods because most of it can be done with your current resources. The name A/B Testing makes it seem like there can only be a control and a test, but really the name should be A/B/n Testing, where A is the original page you started with and the rest are variations upon it (Goward, C., 30 Reasons 2011). Because it tests whole pages at a time, this type of testing is more useful for big changes than Multivariate Testing is, as you will see in a minute. Let's run an experiment.

In the image below there are two different pages with 9 differences between them. Page A is the current page for your site, and you hired a designer to create an upgrade. If a designer showed you these two pages and it was up to you to decide which to deploy based on its ability to generate a higher conversion rate, could you decide? Can you tell just from looking at them which has the higher conversion rate, how large the difference is, and whether the change is significant?


Which would you choose?

[Figure: checkout page (A) and checkout page (B), side by side]


Checkout page A outperformed page B by a considerable amount. What Doctor FootCare thought was an “upgrade” lost 90% of their revenue. Most of the changes were positive, but the coupon code field made people think they might be paying too much for the product, because there was a discount they were not getting (Kohavi, R., Sommerfield, D., & Henne, R. M., 2007). Another example of how an organization can easily implement A/B Testing is email testing. Let's say you have a customer list of 337,466 emails. You can send two different emails to your customers and see which email works better for your site (A/B Split Testing, 2005).
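To show what "see which email works better" could look like in numbers, here is a rough sketch of a two-proportion z-test, one common way to check whether a difference in conversion rates is statistically significant. The sources cited above do not prescribe this particular test. The list size comes from the example above; the even split and the conversion counts are invented for illustration.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both versions convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("standard error is zero; no variation in the data")
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # 337,466 emails split evenly between email A and email B.
    n_a = n_b = 337_466 // 2
    # Hypothetical conversion counts, not real results.
    z, p = two_proportion_z_test(1_500, n_a, 1_650, n_b)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value (a common but arbitrary cutoff is 0.05) suggests the difference between the two emails is unlikely to be chance alone. The split between the two groups should be random so that the groups are comparable.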

[Figure: A/B Testing with Email]


A problem with A/B Testing can occur when multiple changes are present on one webpage. If there is more than one change per page, as there often is, it is difficult to measure which change causes the desired or undesired response from the user. For incremental changes to a webpage, Multivariate Testing can be useful.

Multivariate Testing

With A/B Testing, several pages have to be created to test the different outcomes. With Multivariate Testing, you can take just one webpage, modularize it (cut it into pieces), and change just the modules you choose. The picture below shows 5 different factors being tested on the MSN home page.

[Figure: Multivariate Testing, 5 factors tested on the MSN home page (Kohavi, R., Longbotham, R., Sommerfield, D., & Henne, R. M., 2009)]

Some of these factors could be changed by region, individualized per customer, chosen by the editor of the day, etc. With the Multivariate method of testing, an organization can isolate small elements on a page and measure their individual effects, on conversion rate for example. This allows for more moderate changes and incremental improvement. Though this method allows for some fascinating statistical analysis, it has problems of its own. Because there can be so many different combinations (5 different factors, each with multiple variations), Multivariate Testing requires a lot more traffic than A/B Testing to reach a statistically significant conclusion.
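The traffic problem is easy to see with a little arithmetic. The sketch below assumes a full factorial design, where every combination of variations is shown to some visitors; the sources do not say which design the MSN test used. The factor names and variations are made up, as is the figure of 1,000 visitors per combination.

```python
from itertools import product
from math import prod

# Hypothetical factors and variations; these are not the real MSN modules.
FACTORS = {
    "headline": ["current", "shorter"],
    "hero_image": ["photo", "illustration"],
    "search_box": ["top", "side"],
    "news_module": ["editor_pick", "regional"],
    "ad_slot": ["banner", "text"],
}


def count_combinations(factors):
    """Number of distinct page versions in a full factorial test."""
    return prod(len(levels) for levels in factors.values())


def all_combinations(factors):
    """Yield every page version as a dict of factor -> chosen variation."""
    names = list(factors)
    for values in product(*(factors[name] for name in names)):
        yield dict(zip(names, values))


def visitors_needed(factors, per_combination):
    """Total visitors if each combination needs the same sample size."""
    return count_combinations(factors) * per_combination


if __name__ == "__main__":
    print(count_combinations(FACTORS))     # 2**5 = 32 page versions
    print(visitors_needed(FACTORS, 1000))  # 32 * 1000 = 32000 visitors
```

With the same assumed 1,000 visitors per version, a plain A/B test of two pages would need only 2,000 visitors, which is why a multivariate test takes so much longer to reach a conclusion on a low-traffic site.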

[Figure: Multivariate Testing Email]


Because changes are made module by module, significant layout changes are very difficult to implement. This may hinder creativity and stifle innovation (Goward, C., A/B Split 2011). Most Multivariate implementations are done on the server side of the website. There are several companies that facilitate this, which I’ll get to below, so all your IT staff has to do is put a few lines of JavaScript on the page, and that is all that is needed on the organization's part (Kaushik, A., 2006). This is a good thing, because one of the biggest obstacles in any type of testing comes from the collection of referrer and destination points (Kohavi, R., Longbotham, R., Frasca, B., & Crook, T., 2009). Using the MSN page as an example, if the only information you collect is what configuration the factors were in when users clicked through, you will not know what the factors were like when users did not click through. Both sets of information are extremely valuable, and it is much easier to record analytics for users who do not click through on the server side.
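To illustrate why both sets of information matter, here is a small sketch that computes a click-through rate per configuration from a server-side log. The log format, a list of (configuration, clicked) pairs, is an assumption made for this example and is not taken from any of the tools mentioned here; the sample data is invented.

```python
from collections import defaultdict


def click_through_rates(impressions):
    """Compute click-through rate per configuration.

    impressions: iterable of (configuration, clicked) pairs, where
    configuration is a hashable value such as a tuple of factor settings.
    """
    shown = defaultdict(int)
    clicked = defaultdict(int)
    for config, was_clicked in impressions:
        shown[config] += 1          # every page view counts, click or not
        if was_clicked:
            clicked[config] += 1
    return {config: clicked[config] / shown[config] for config in shown}


if __name__ == "__main__":
    # Hypothetical log: each entry is one page view.
    log = [
        (("headline=short", "image=photo"), True),
        (("headline=short", "image=photo"), False),
        (("headline=long", "image=photo"), False),
        (("headline=long", "image=photo"), False),
    ]
    for config, rate in click_through_rates(log).items():
        print(config, rate)
```

If the log contained only the clicked rows, the `shown` counts would be missing and no rate could be computed, which is the problem described above.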

How to go about doing these tests?

Although I mentioned a few ways to readily implement A/B Testing on your site, it is much more difficult to explain how a typical admin can effectively implement Multivariate Testing on their own. For this reason there are major companies who can do both, and several other methods of testing, for you. Adobe Test & Target offers a whole range of products that allow for A/B and Multivariate Testing, different types of goal setting, etc. (Adobe.com). I highly recommend you watch the demo videos on the Adobe link in my references to see how incredibly detailed the testing can be. Another major competitor in the optimization field is SiteSpect.com. It has been used by major companies such as Staples, Mozilla Firefox, Barnes & Noble, TigerDirect, Newegg, and The New York Times, to name a few (SiteSpect.com). SiteSpect differs from other optimization platforms because it is non-intrusive. The traffic between you and your client first travels through SiteSpect. The service can make changes to your HTML, CSS, XML, or JavaScript on the fly, while saving only the definitions of the changes to apply to traffic.


[Figure: AB & MultiTesting]


Optimizing your site has come a long way from making changes based on intuition. Major competitors in many markets now make optimization decisions based on statistical data. The simplest method to implement is A/B Testing. This method gives the quickest response to changes and allows you to make bigger leaps toward your goals. Multivariate Testing should be done periodically to fine-tune pages that have already shown improvement through A/B Testing. This allows you to make the best use of both methods.


References

A/B Split Testing (2005, August 16th). Retrieved from http://www.marketingexperiments.com/improving-website-conversion/ab-split-testing.html

Adobe.com (n.d.). Web optimization and testing | Adobe Test&Target. Retrieved from http://www.adobe.com/products/testandtarget.html

Goward, C. (2011, May 1st). 30 Reasons to Use AB Split Testing for Conversion Optimization. Retrieved from http://www.widerfunnel.com/conversion-rate-optimization/30-reasons-to-use-ab-split-testing-for-conversion-optimization

Goward, C. (2011, May 1st). A/B Split Testing vs. Multivariate: Pros & Cons. Retrieved from http://www.widerfunnel.com/conversion-rate-optimization/ab-split-testing-vs-multivariate-pros-cons

Kaushik, A. (2006, May). Experimentation and Testing: A Primer. Retrieved from http://www.kaushik.net/avinash/experimentation-and-testing-a-primer/

Kohavi, R., Longbotham, R., Sommerfield, D., & Henne, R. M. (2009). Controlled experiments on the web: Survey and practical guide. Data Mining and Knowledge Discovery, 18(1), 140-181. doi:http://dx.doi.org/10.1007/s10618-008-0114-1

Kohavi, R., Longbotham, R., Frasca, B., & Crook, T. (2009). Seven Pitfalls to Avoid when Running Controlled Experiments on the Web. KDD ’09 Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, 1105-1114.

Kohavi, R., Sommerfield, D., & Henne, R. M. (2007). Practical Guide to Controlled Experiments on the Web: Listen to Your Customers not to the HiPPO. Retrieved from http://www.exp-platform.com/Documents/GuideControlledExperiments.pdf

SiteSpect.com (n.d.). A/B and Multivariate Testing, Behavioral Targeting. Retrieved from http://www.sitespect.com