Testing your website before launching: how to let content drive design (Part 1)

7 Sep 2011

In this post we discuss several useful online tools for performing tests on your website copy before starting the design process. We use Decal Mockups to create "content first" wireframes. If you'd like to know more please enter your email at the bottom of this post.

Prelude to a test

For a long time, we here at Working Software have been strong believers in the "content precedes design" philosophy, as famously espoused by Jeffrey Zeldman and summarised here on UX Myths.

I've also been reading a lot of blogs and listening to interviews on Mixergy, and it seems that the one thing that distinguishes successful marketers and business people is testing (in particular, check out the interviews with Chad Mureta, Zack Linford and Juan Martitegui).

We're relaunching this website, so we had already gone through the process of creating our "content first" wireframe using Decal Mockups.

It was around this time that I came across this blog post discussing how, when choosing a web designer, you should look for words on their website like "Business Objectives" and "Return on Investment". This paragraph in particular left me a little crestfallen:

Put a cross in the Bad column every time you see a term like "branding", "beautiful", "passion", "making a difference", "modern", "clean", or any other puff word that doesn't convey a clear benefit to you.

D Bnonn Tennant

Guess what? All the copy in the mockup I'd just spent the last couple of weeks creating looked almost exactly like that.

But rather than just take his word as gospel, it occurred to me that I should somehow test which style of copywriting was more effective.

All the testing I've seen discussed in these blog articles refers to A/B testing with services like Optimizely on a website that's already deployed, with traffic driven through Pay Per Click campaigns.

We simply don't have the resources to do an entire redesign of both our copy and our visual/graphic design and layout for each test like 37signals does.

We somehow needed to be able to test our website copy before we even started our visual design process.

Deciding what to test

Whilst the structure and exact content are things we'll test once we've deployed, establishing our "voice" prior to engaging in the design process seems appropriate, as that's the component of the copy that will have the biggest effect on which overall design direction we take.

I came up with the following 3 voices that I wanted to test and produced different versions of my original mockup for each:

UPDATE: So that search engines don't go on indexing these forever, I've replaced the links below with screenshots and added 301 redirects to the original sites. Incidentally, check out http://www.paulhammond.org/webkit2png/ for taking full-length screenshots.

Deciding how to test

I then went and looked for services that allow you to get feedback on your website without having launched something, and without using traditional advertising.

I settled on the following 4 services, each of which will be reviewed in detail below:

  • PickFu: this service is based on Mechanical Turk but provides a framework for getting feedback on any question in an "A/B" fashion (ie. which logo do you like best?), and the promise of "$5.00 for 50 opinions" up the top of the page was just too good to pass up!
  • Domain Polish: a website created by Dan Shipper which provides a framework for getting feedback on your website based on Amazon's Mechanical Turk
  • Feedback Army: the value proposition here was pretty similar to Domain Polish but "higher volume"
  • Feedback Roulette: this site uses a really interesting and heavily "gamified" model to drive a community of reviewers

What follows is a detailed account of how I used each of these services to try and "A/B test" my web copy.

PickFu: A/B test anything

PickFu is brilliant. I realise that it's not "technically" A/B testing: you're not randomising who sees what, you're not measuring "engagement" so much as just asking a question, and there are no statistics or mathematical principles underlying the calculation of the results (like confidence intervals and what have you). Even so, this site is absolutely fantastic for getting feedback from a very diverse range of people on absolutely any question.
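
Of course, nothing stops you from running the raw vote counts through a standard confidence interval calculation yourself if you want a rough sense of how decisive a split is. Here's a minimal sketch using the normal approximation for a proportion; the counts are made up for illustration, not my actual PickFu numbers:

```python
# Rough 95% confidence interval for a two-option vote split.
# The counts below are hypothetical, purely for illustration.
import math

def proportion_ci(successes, total, z=1.96):
    """Approximate (low, high) 95% confidence interval for a proportion."""
    p = successes / total
    margin = z * math.sqrt(p * (1 - p) / total)
    return max(0.0, p - margin), min(1.0, p + margin)

votes_for_voice_1 = 120   # hypothetical respondents preferring the "business voice"
total_votes = 200         # hypothetical total respondents in one test
low, high = proportion_ci(votes_for_voice_1, total_votes)
print(f"Voice 1: {votes_for_voice_1 / total_votes:.0%} (95% CI roughly {low:.0%} to {high:.0%})")
```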

There was one little limitation, though: since I'm actually trying to test the effectiveness of three different variations of my copywriting, I couldn't cram it all into one test.

I decided to run three separate tests:

You can view those links to see the real results and how each voice stacked up.

Fortunately, the cost is really low, especially if you go for one of the higher plans. I paid just $17 for 200 responses on each combination, so $54 for a total of 600 responses.

The signup process is very simple - you don't need to "create an account", just enter your email and pay via PayPal, too easy. I was set up in a matter of minutes.

As for the structure of the tests, I simply linked to each of the "voice websites" and posed the following question:

Which of these two websites has the most compelling copywriting? (ignoring the design)

PickFu tallies the results for you, as well as providing a breakdown of responses by age, income, level of education, gender and race of the respondents.

The best thing about using PickFu was getting some conclusive and useful results! Not only could I see that the "business voice" (Voice 1) won in each case, but each respondent also gave a quick reason for their response, and reading through them yielded actionable, valuable insights for improving the copy and the information architecture of the site.

Really, I have absolutely nothing bad to say about PickFu, and I highly recommend you start testing everything with it, from your hairstyle to your blog headlines (hint: I tested the headline for this article too!).

Domain Polish

Domain Polish uses Mechanical Turk for reviews, but the idea is that the reviewers have been heavily vetted and screened for quality (mostly for better English skills, as far as I can tell).

The majority of respondents were from the USA and even those that weren't had good English.

When you sign up there are some sample questions to get you started and these were useful - I actually ended up basing my questions for Feedback Roulette and Feedback Army on the initial sample questions provided by Domain Polish.

I settled on the following questions for Domain Polish:

  • What is your age?
  • What is your gender?
  • If you wanted to get a website online for your business does the copy on this site make you feel comfortable that we'd be able to deliver?
  • Does the copy on this site make you want to do business with us? (ignoring the design)
  • Do you think that the information on this site is organised such that you are able to find out what you need in order to make a purchasing decision?
  • Once you have made a purchasing decision is it clear how to act and what will happen when you do?
  • What country are you from?
  • Have you ever built a website before yourself?
  • Have you ever paid someone else to build a website on your behalf?

I think I made the mistake of not emphasising the "ignore the design and focus on the copy" angle enough, because I got some decidedly negative sentiments in return.

It almost seemed to me that some of the Domain Polish reviewers were used to a higher calibre of site and felt a bit offended by my shitty design.

All in all, the downfall of Domain Polish for this purpose was the cost per review. The cost at the time of writing is $20 for 7 reviews, and because I had 3 URLs to review, I paid $60 for 21 reviews total.

Given that the volumes are so low, this isn't really "A/B testing", and it's not particularly well suited to getting a feel for what type of voice to use on the site.

The reviews were split roughly 50/50 between positive and negative, and I couldn't really conclude anything useful from the results.

As the name suggests, I think that Domain Polish will be useful at the finishing stages of our design process when we have settled on a design - in particular helping us to improve things like information architecture and the "sales funnel" of the site.

Aside from a couple of minor bugs in the review management interface, the experience with Domain Polish was a positive one and Dan Shipper was very helpful and attentive. I'll be heading back to give this site another try at a different stage of our design process.

Feedback Army

Feedback Army is basically the same premise as Domain Polish but without the "vetting" of respondents, so you get a higher volume of reviews for less money.

I submitted each "voice" as a separate survey with the same questions, and paid $55 for 50 reviews of each.

I basically asked the same set of questions on Feedback Army as I did on Domain Polish, although I left out the gender, age and location questions.

Whilst this is far more expensive than PickFu, it's considerably cheaper "per review" than either Domain Polish or Feedback Roulette. I wasn't really expecting quality to go with my quantity, but I was pleasantly surprised!

The first few responses I got on each survey were quite thoughtful, and although there were plenty of crappy responses when all was said and done, there were also some real gems.

There was one obvious challenge here: the results were so unstructured that there was no clear way to gain "immediate insight" into them. Whilst I could read through every comment and note down any usable insights in a spreadsheet, I couldn't get a "50,000 foot view" of how people had reacted to each of the different voices.

Fortunately, I'm a programmer, so I was able to write some code which allowed me to copy and paste all the responses and then analyse them.

I wanted to analyse the following things:

  • Yes or no answers to each question
  • Positive, negative or neutral sentiment in the responses (ie. did they say "no" or did they say "no I was really confused and frustrated")
  • Whether they got the point of the exercise (ie. that they were supposed to be reviewing the copy and not the design)

I first created a database structure to hold the information and then a script to loop through the raw data and attempt to split the responses up into individual questions.
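
To give an idea of what that looks like, here's a minimal sketch of the splitting step (not the exact code; the table layout and the assumption that answers come back numbered "1.", "2." and so on are purely illustrative):

```python
# Minimal sketch: store Feedback Army responses in SQLite and split each
# pasted response block into per-question answers. The schema and the
# numbered-answer format are illustrative assumptions.
import re
import sqlite3

conn = sqlite3.connect("feedback.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS answers (
        voice        TEXT,     -- which "voice" website the response relates to
        respondent   INTEGER,  -- index of the respondent within that survey
        question     INTEGER,  -- question number
        answer       TEXT,     -- raw answer text
        answered_yes INTEGER,  -- filled in later via the review interface
        sentiment    TEXT      -- 'positive'/'neutral'/'negative', also filled in later
    )""")

def split_into_answers(block):
    """Split one respondent's pasted block into {question_number: answer_text}."""
    answers = {}
    current = None
    for line in block.splitlines():
        match = re.match(r"\s*(\d+)[.)]\s*(.*)", line)
        if match:
            current = int(match.group(1))
            answers[current] = match.group(2)
        elif current is not None:
            answers[current] += " " + line.strip()
    return answers

def load_raw_responses(voice, raw_text):
    """Store each respondent's answers so they can be corrected and tagged later."""
    for i, block in enumerate(re.split(r"\n\s*\n", raw_text.strip())):
        for question, text in split_into_answers(block).items():
            conn.execute(
                "INSERT INTO answers (voice, respondent, question, answer) VALUES (?, ?, ?, ?)",
                (voice, i, question, text),
            )
    conn.commit()
```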

I then created a web interface that allows me to do the following:

  • Correct poorly formatted responses
  • Set the sentiment for each response (ie. did they just say "Yes" or did they say "Yes! A thousand times yes!!")
  • Set whether each question response was "yes" or "no"
  • Set whether or not each respondent "got the point" (ie. that they were supposed to be focusing on content and not design)

Whilst this sounds like an overwhelmingly laborious task, it only took a few hours to do, and it gave me a chance to read each and every response carefully, copying and pasting anything useful into a spreadsheet I was using to take notes.
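
Once everything is tagged, the tallying itself boils down to a few GROUP BY queries over the responses. Again, a sketch only, with illustrative column and voice names rather than the real schema:

```python
# Sketch of the tallying step, assuming the review interface has populated the
# answered_yes (0/1) and sentiment ('positive'/'neutral'/'negative') columns.
# The voice labels are hypothetical.
import sqlite3

conn = sqlite3.connect("feedback.db")

def tally(voice):
    """Print yes counts and sentiment counts per question for one voice."""
    rows = conn.execute(
        """SELECT question,
                  SUM(answered_yes),
                  COUNT(*),
                  SUM(CASE WHEN sentiment = 'positive' THEN 1 ELSE 0 END),
                  SUM(CASE WHEN sentiment = 'negative' THEN 1 ELSE 0 END)
           FROM answers WHERE voice = ? GROUP BY question ORDER BY question""",
        (voice,),
    ).fetchall()
    print(f"== {voice} ==")
    for question, yes_count, total, positive, negative in rows:
        print(f"Q{question}: {yes_count}/{total} yes, {positive} positive, {negative} negative")

for voice in ("business", "casual", "technical"):  # hypothetical voice labels
    tally(voice)
```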

After feeding all this information into the Feedback-O-Matic and watching all the lights blink and the tapes whirr around, here are the results!

As you can see, the "business voice" not only produced more responses in the affirmative to each of the questions, but also produced fewer negative sentiments and more positive sentiments than either of the other two voices by a clear margin.

Since the code to analyse Feedback Army responses might well be useful to others treading the content review path (or using Feedback Army for just about anything else), I've put it on GitHub for everyone to share and enjoy.

Feedback Roulette

Feedback Roulette has a really interesting, heavily gamified model for driving a community of reviewers. The premise is quite simple: it's Chat Roulette for getting feedback on your website.

However the system of incentives and monetisation they've come up with is really interesting. Here's how it works:

  • In order to get your website reviewed, you need to have Feedback Points
  • You can earn Feedback Points by reviewing other people's websites
  • In order to make sure you write good reviews, your reviews are rated by the reviewee from 1 to 5 stars. You receive a maximum of 1 point per review, but if your review is rated fewer than 5 stars you only receive a fraction of a point.
  • If you don't have time to go and write a bunch of reviews, you can buy Premium Feedback Points, BUT ...
  • Since the site would be utterly worthless if everyone just bought feedback points, they've designed it so that you build up your "reputation" and "trust level" by writing good reviews and helping resolve public disputes. If you have a high reputation, then when you submit a website for review, your reviewers will have high reputations, ensuring that those who put in the effort are rewarded.

In order to conduct my test, I submitted each of my "voices" as a separate website, and enabled "throttling" in the FBR preferences, ensuring that each would only get a maximum of 5 reviews. I then included these notes for reviewers indicating the purpose of the exercise and asking some specific questions.

It's a really solid model, and Feedback Roulette is a great product, but for this particular purpose it fell down in some key ways.

First of all, reviewing sites so that you get 5 star ratings takes a really long time. You have to spend about 45 minutes to write a really good review, and then you feel pretty annoyed if the reviewee only gives you 4 stars and you don't get your full feedback point afterwards!

Secondly, if you want to buy your Premium Feedback Points they're really expensive compared with the higher volume feedback options based on Mechanical Turk. The price for Premium FPs fluctuates according to market demand but when I was testing they were running about USD$8.00 per point.

If you're looking for "quick and easy" testing for some copy ideas on your website, this is not it.

I think I spent between 10 and 15 hours all told reviewing websites and public disputes to build up 17 Feedback Points (although my reputation and trust level were also very high at that point).

Thirdly, the volume of reviewers on the site is still really low (this could be because my reputation was high, so I was only being matched with other high-reputation reviewers).

At the time of writing this, I've had my 3 "voices" on there for about a week now, and I've only received 4 reviews total.

I think that Feedback Roulette will be better suited for getting more detailed feedback on the site once it's nearer to completion.

They've got a pretty serious chicken and egg issue over there, but it's a great system and the higher the site's usage the better the experience will be for everyone (like Yum Cha!).

So what are you waiting for?! Go and give some Feedback!

Conclusion

Having copy written before starting your design process is one thing, but having multiple versions of your copy written and tested by a large indiscriminate audience is just fantastic.

Even in our very early-stage visual mockups, I can see just how much new direction this exercise has given us in our design, and I think you can expect to see great things in Part II of this post when we launch our new site!

The reason I'm undertaking this exercise to begin with is that we're redesigning and relaunching our site. Subscribe via email using the form below or via RSS, follow Working Software on Twitter, or become a fan of Working Software on Facebook to get notified when I follow up with Part II once the design process is finalised and our new site is launched!

Lastly, please fill in the form below if you'd be interested in using Decal Mockups to create multiple versions of your web copy to help with pre-launch A/B testing as described in this post:

If you've read this far, you should definitely follow Working Software on Twitter, become a fan of Working Software on Facebook, subscribe via email using the form above, subscribe via RSS and share the love using the social icons above.

If you subscribe via Email you'll also be informed when I follow up with a post once we actually launch the new site.

If you liked this post, you'd probably also like to read What I've learned about making product videos (that don't totally suck).
