Secret Truth Series #18: Effective Text Ad Testing

Text ads are trying to answer questions. Writing text ads is difficult because you have only 95 characters to stand out from a sea of competing messages and persuade the searcher that you’re the ad to click. There are many strategies and tactics to accomplish this, and both technical skill and creativity are required. The process takes considerable time and effort. And there is only one way to measure success. Testing. But testing requires more than simply running a couple of ads simultaneously. It requires the conditions for a fair test, a clear goal, and valid measurement and analysis. Much of what passes for text-ad testing in paid search lacks one or more of these requirements. Let’s look a little closer at each to better understand how to properly test text ads.

Conditions For Text Ad Testing

Text ads cannot be effectively tested too early in the process of optimizing your account or ad groups. If you haven’t yet optimized the keywords, match types, ad group organization, and negatives, then the search queries coming into the ad group will be too diverse. If people are asking 25 different questions, it’s impossible to compose any single answer that will satisfy all of them. If you try to test text ads too early, you won’t be able to trust the results. Maybe you’ve got a lousy text ad, or maybe the ad is just running against a lot of very untargeted or unqualified search queries. So before even beginning to worry about text ad testing, make sure you’ve read and implemented Secret Truths #1 through #8. When the vast majority of the queries coming into an ad group are similar, you’re ready to test text ads. Of course, you have to write some text ads when you set up an ad group. And you should monitor and review their performance and make changes as necessary. But hard-core text-ad testing – statistical comparisons – isn’t reasonable or necessary until the ad group has been properly constructed, intelligently optimized, and – in terms of search queries – stabilized.

What You Can and Can’t Control

Another factor to consider in text-ad testing is the reality that paid search is a dynamic environment. Keywords get added, negatives expanded, bids change, competitors impact average positions, and other account modifications take place on a regular basis. There are variations in activity and results based on day of the week, week of the month, month of the year, weather, news events, sales, inventory, competitor promotions, and more. So in the time it takes your ad group to get a sufficient number of impressions or clicks for a good test, how can you be sure that it is the ad copy that you’re really testing? The answer is that you really can’t. There are no static environments in PPC. But all the ads in the test are subject to almost all the same environmental conditions, so many would argue these external influences don’t affect relative performance. That may be true; it may not. But you can’t control for many of these variables, so we ultimately have to accept them as a fact of life, a limitation in the system. Whenever possible, however, try to limit the changes you do control during deliberate text-ad tests. Don’t introduce new keywords or negatives or dramatically shift bids. Chances are that if you find the need to make radical changes of any of these types, you’d be better off making them and then restarting your tests.

Clear Text Ad Testing Goals

The goal of text ad testing is to determine which ad copy delivers the best click-through and/or conversion rates.

  • Most ad testing focuses on CTR. That’s clearly the direct goal of the ad, and helps to drive up quality score.
  • Conversion rate should be tracked and considered, even if CTR is the primary goal. There are many ways to incite a click, but Google gets paid for clicks while you get paid for conversions.
  • The conversions-per-1000-impressions metric (CP1K) is a great way to blend these two goals and find the truly optimal ad copy. (I hope to write more about CP1K in the near future.)
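
To make the three metrics concrete, here is a minimal Python sketch. The function name `ad_metrics` and the ad numbers are hypothetical, chosen only for illustration; the one thing taken from the text is the definition of CP1K as conversions per 1000 impressions, which works out to 1000 × CTR × conversion rate.

```python
def ad_metrics(impressions, clicks, conversions):
    """Return CTR, conversion rate, and CP1K for one ad (hypothetical helper)."""
    ctr = clicks / impressions
    # Conversion rate is measured per click; guard against an ad with no clicks.
    conversion_rate = conversions / clicks if clicks else 0.0
    # CP1K: conversions per 1000 impressions.
    cp1k = 1000 * conversions / impressions
    return {"ctr": ctr, "conversion_rate": conversion_rate, "cp1k": cp1k}


# Made-up numbers: ad A wins on CTR, ad B wins on conversion rate.
ad_a = ad_metrics(impressions=10_000, clicks=300, conversions=6)
ad_b = ad_metrics(impressions=10_000, clicks=200, conversions=8)

for name, m in (("A", ad_a), ("B", ad_b)):
    print(f"Ad {name}: CTR {m['ctr']:.2%}, "
          f"conv. rate {m['conversion_rate']:.2%}, CP1K {m['cp1k']:.2f}")
```

In this made-up case ad A has the higher CTR, but ad B produces more conversions for the same number of impressions, so B comes out ahead on CP1K.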

Statistically Valid Text Ad Testing Analysis

Assuming you have a clear goal in mind and a stable testing environment, test data becomes the next hurdle. How many impressions and clicks does a set of text ads need for a valid test? The answer to that relatively straightforward question has eluded most PPC managers for years. I assume this is due to the fact that most of us aren’t trained mathematicians or statisticians. (I’m certainly not.) And most of the software we use to create and edit text ads does not provide the statistical support we really need. So we’ve slithered forward based on the conventional wisdom that suggests tracking ‘at least 100 impressions or 10 clicks before there is enough data to declare a winner’. Unfortunately this really isn’t very accurate or useful advice. Statistically, it turns out that those of us who’ve been reacting to text ads with anything near 100 impressions or a dozen or so clicks have regularly made essentially random decisions. We’ve paused the better ad many times, letting the loser run. We’ve sabotaged our own results. Repeatedly. Over long periods of time.

Consider the example shown at right: Three text ads running in an ad group. About 500 impressions each. Is there enough data to make a wise decision? It seems pretty clear. The first ad at 1.98% CTR appears to be our winner. But the statistics tell us that it isn’t that clear-cut. I looked at the statistical significance and confidence intervals for these ads. We can be only 80% confident that the CTRs of the first and second ads are actually different. The same goes for the second and third ads. 80% confidence is not very high. It’s not considered high enough to be sure something is true in most activities where statistical confidence is considered. For scientific work, 95% confidence is the usual standard. To understand the potential error in accepting these numbers, look (below) at the range of possible CTRs for each of these ads that we can be sure of with 95% confidence.
The first ad may actually be as low as 0.82% CTR, or could be as high as 3.14%. That’s a pretty wide range – we just don’t know yet, with a high level of confidence, what the CTR of this ad is going to be. You can see similarly wide ranges for the other two ads, and in comparison see there is plenty of overlap in the potential ranges, which means that if we really let this test play out, we may get a very different result. So how many impressions would it take to be 95% confident in the differences? If we let these ads run until they had around 1000 impressions each, they’d reach 90% confidence. It takes nearly 1500 impressions per ad to hit 95% confidence. The actual number needed for any given set of ads depends a lot on the CTRs and their relative difference. But it’s a rare circumstance when anything like 100 impressions or 10 clicks is adequate. You can check the numbers on your own ads using two great tools:

  • A simple, free, online utility that lets you enter CTRs for two ads and check the confidence level.
  • Teascalc is an Excel sheet that costs $49 but offers both confidence and interval data.
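
If you would rather see the arithmetic than trust a tool, the sketch below shows one common way to do these calculations in Python. It is an assumption on my part, not a description of how either tool above works: it uses the normal approximation for the CTR interval and a pooled two-proportion z-test for the confidence level. The function names and the click counts in the example are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def ctr_interval(clicks, impressions, confidence=0.95):
    """Normal-approximation interval for an ad's true CTR."""
    p = clicks / impressions
    z = STD_NORMAL.inv_cdf(0.5 + confidence / 2)
    margin = z * sqrt(p * (1 - p) / impressions)
    return max(0.0, p - margin), min(1.0, p + margin)


def confidence_of_difference(clicks_a, imps_a, clicks_b, imps_b):
    """Two-sided confidence that two CTRs really differ (pooled z-test)."""
    pooled = (clicks_a + clicks_b) / (imps_a + imps_b)
    se = sqrt(pooled * (1 - pooled) * (1 / imps_a + 1 / imps_b))
    if se == 0:
        return 0.0  # no clicks at all (or all clicks): nothing to compare
    z = abs(clicks_a / imps_a - clicks_b / imps_b) / se
    return 2 * STD_NORMAL.cdf(z) - 1


def impressions_needed(ctr_a, ctr_b, confidence=0.95):
    """Impressions per ad needed to reach the confidence level,
    assuming the observed CTRs hold and both ads get equal impressions."""
    if ctr_a == ctr_b:
        raise ValueError("identical CTRs never reach significance")
    z = STD_NORMAL.inv_cdf(0.5 + confidence / 2)
    pooled = (ctr_a + ctr_b) / 2
    return ceil(z ** 2 * 2 * pooled * (1 - pooled) / (ctr_a - ctr_b) ** 2)


# Hypothetical ads: 10 clicks vs. 6 clicks on 500 impressions each.
low, high = ctr_interval(10, 500)
print(f"Ad A 95% CTR range: {low:.2%} to {high:.2%}")
print(f"Confidence A differs from B: {confidence_of_difference(10, 500, 6, 500):.0%}")
print(f"Impressions per ad for 95%: {impressions_needed(0.020, 0.012)}")
```

With these made-up numbers, a 2.0% versus 1.2% CTR gap at 500 impressions each falls well short of 95% confidence, which is the same pattern as the example above. The normal approximation gets shaky when an ad has only a handful of clicks, so treat results on very small samples with extra caution.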

Making The Grade

Everything we do to create and optimize paid search accounts is done in hopes of showing the right people the right ad at the right price. Their reaction to our ads is feedback on how well we’ve done at targeting them and organizing our accounts as well as on how aligned our answers are to their questions. Fortunately for us, if we do things right – in setting goals, creating testable conditions, and accurately measuring and analyzing – we can get this feedback in clear, powerful, and actionable form. Text ad testing isn’t just another wise and important step in paid search management. It’s the crucial step that pays off all the others.

This blog post is part of a series extending and amplifying the ideas in our free ebook '21 Secret Truths of High-Resolution PPC'.


