My latest foray into internet sociology has involved beating my head against the Google AdWords program. I’m interested in seeing what data researchers can extrapolate regarding search engine traffic and market interest in search terms based on the information AdWords gives to potential advertisers.
Google’s AdWords program works on an auction model similar to a second-price auction. (In a second-price auction, the winning bidder pays the second-highest price bid. It’s a model designed to minimize “the winner’s curse”, the tendency of the winner in an auction to overpay.)
From the user’s perspective, it’s pretty simple. A bidder identifies the terms she’s interested in advertising on and the maximum price she’s willing to pay for a click. The system estimates how many clicks will be available for purchase at prices beneath her price threshold, the average she will pay per click, and where her ads will likely be ranked on the page. She can then change the maximum price she’s willing to pay per click (which will likely change the number of clicks available for purchase and her projected rank) and then set a ceiling on the total she’s willing to pay per day.
Behind the scenes, some really fearsome math takes place. Google has a good sense – based on past performance – of how many searches for a given term will occur per day. They also can make certain assumptions about the clickthrough an ad will receive because they reserve the right to pull ads whose clickthrough falls below a certain level, generally around 1%. (I’ve run a few test campaigns: Ad/search term combinations that get 2% or better are marked as “strong”; 1-2% gets you a “moderate”; under 1% and the system warns you, and then slows delivery of your ads.) And Google knows what everyone else has bid for the search terms in question.
Based on this information, the system starts allocating page impressions to each bidder. The high bidder gets her fill of top-ranked ads at a price one cent per click above the next-highest bid. The next highest bidder gets the remainder of available inventory in the top slot, and then runs her fill of ads in the second slot. For example:
Assume Google has 10,000 searches per day for “Africa” – this implies 100 clicks for sale at a 1% clickthrough on ads. Buyer A is willing to pay a maximum of $0.25 per click, and will spend up to $5 per day to run her ads. Buyer B is willing to pay up to $0.10 per click and has a $10 budget per day. The AdWords system solves some equations, and runs Buyer A’s ad on roughly 4,500 pages in the first position, charging her $0.11 per click, one cent higher than the next highest bid. Buyer B’s ad runs in first position on the 5,500 pages that didn’t include Buyer A’s ad, and in second position on the other 4,500 pages. Because no one else has bid, B’s ads should run at the system minimum cost – $0.05 per click. So while B’s ads don’t always appear in first position – their “average position” is 1.45 – they’re a lot cheaper than Buyer A’s ads.
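Here is a minimal Python sketch of the arithmetic in that example. It is my own toy model, not Google's algorithm: it assumes exactly two bidders, the same 1% clickthrough in either slot, a $0.05 minimum and a one-cent increment, and the function and field names are invented for illustration.

```python
def two_bidder_allocation(searches, ctr, high, low, min_cpc=0.05, step=0.01):
    """Toy model of the two-buyer example. `high` and `low` are dicts with
    'max_cpc' and 'budget'; `high` is the bidder with the larger max_cpc."""
    total_clicks = searches * ctr

    # The high bidder pays one increment above the next bid, capped at her own maximum.
    high_price = min(high["max_cpc"], low["max_cpc"] + step)
    high_clicks = min(total_clicks, high["budget"] / high_price)
    high_pages = high_clicks / ctr  # pages needed to produce those clicks

    # Nobody bids below the low bidder, so she pays the system minimum.
    low_price = min_cpc
    low_clicks = min(total_clicks, low["budget"] / low_price)
    low_pages = low_clicks / ctr

    # The low bidder takes slot 1 wherever the high bidder is absent, slot 2 elsewhere.
    low_first = min(low_pages, searches - high_pages)
    low_second = low_pages - low_first
    low_avg_position = (low_first * 1 + low_second * 2) / low_pages

    return {
        "high_price": high_price,
        "high_pages": high_pages,
        "low_price": low_price,
        "low_first_slot_pages": low_first,
        "low_second_slot_pages": low_second,
        "low_avg_position": low_avg_position,
    }


buyer_a = {"max_cpc": 0.25, "budget": 5.00}
buyer_b = {"max_cpc": 0.10, "budget": 10.00}
result = two_bidder_allocation(10_000, 0.01, buyer_a, buyer_b)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

With the numbers from the example, Buyer A's $5 budget at $0.11 per click buys about 45 clicks, which is where the "roughly 4,500 pages" comes from, and Buyer B's average position lands at about 1.45.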
Unfortunately for anyone trying to build a mathematical model of this process, that’s not the whole story. The popularity of an ad matters as well. When AdWords calculates its ranking of ads, it multiplies the clickthrough rate by the maximum cost per click. Google’s rationale for this is that it benefits their users – more relevant ads move towards the top, like search results; it also benefits Google economically, as they have a disincentive to show poorly crafted ads, since they’re paid per click, not per impression. (There’s an excellent paper by Juan Feng, Hemant Bhargava and David Pennock that demonstrates quite elegantly why this is a far better way for Google to allocate ad placements than based on willingness to pay alone…)
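To make the ranking rule concrete, here is a small Python sketch that orders ads by clickthrough rate times maximum cost per click. The two ads and their numbers are hypothetical, and the real system surely has more moving parts than a single multiplication.

```python
def rank_ads(ads):
    """Order ads by clickthrough rate x maximum cost per click, best first.
    Each ad is a (name, max_cpc, ctr) tuple; this scoring is the simple
    product described in the text, nothing more."""
    return sorted(ads, key=lambda ad: ad[1] * ad[2], reverse=True)


# Hypothetical ads: a high bid with weak clickthrough vs. a low bid with strong clickthrough.
ads = [
    ("big_spender", 0.25, 0.01),   # score 0.25 * 0.01 = 0.0025
    ("popular_ad", 0.10, 0.04),    # score 0.10 * 0.04 = 0.0040
]

for position, (name, max_cpc, ctr) in enumerate(rank_ads(ads), start=1):
    print(position, name, f"score={max_cpc * ctr:.4f}")
```

In this made-up case the cheaper ad outranks the more expensive one, because its expected revenue per impression is higher.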
So what can an internet detective glean from the numbers the AdWords system reveals to a potential customer? Create a dummy ad, and you can get a good guess at the number of searches per day for a set of keywords. AdWords tells me that the optimum maximum price for my keywords “Africa News” is $0.57 per click, that I’ll pay $0.21 per click on average, receive 9.3 clicks per day and have an average position of 1.3. Trying a few other values for my maximum price per click, I get the following data set:
at $0.06 maximum per click: 7 clicks, $0.06 average per click, position 2.7
at $0.12 maximum per click: 7.9 clicks, $0.07 average per click, position 2.2
at $0.25 maximum per click: 8.4 clicks, $0.11 average per click, position 1.8
at $0.57 maximum per click: 9.3 clicks, $0.21 average per click, position 1.3
at $1.00 maximum per click: 9.5 clicks, $0.27 average per click, position 1.2
at $5.00 maximum per click: 9.7 clicks, $0.36 average per click, position 1.0
There’s a clear logarithmic relationship between maximum price and clicks available. (log(price)=n*log(clicks) fits a larger data set – prices for my usual set of keywords representing 180 nations – at R² = 0.97.) This suggests that there’s an asymptotic ceiling to the number of clicks Google will predict. Once your average position has reached 1.0, Google is anticipating a situation where your ad is served on top of every page available, and increasing the amount of money you’re willing to pay per click is unlikely to increase the number of clicks available because Google simply can’t give you any more page impressions.
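For anyone who wants to try the fit themselves, here is a rough Python sketch that regresses log(price) on log(clicks) using only the six “Africa News” estimates above. It adds an intercept term, which is my assumption, and six points from one keyword won't reproduce the fit on the 180-nation set.

```python
import math

# (maximum price per click, projected clicks per day) for "Africa News"
estimates = [
    (0.06, 7.0), (0.12, 7.9), (0.25, 8.4),
    (0.57, 9.3), (1.00, 9.5), (5.00, 9.7),
]


def loglog_fit(points):
    """Ordinary least squares for log(price) = n * log(clicks) + c.
    Returns (n, c, r_squared)."""
    xs = [math.log(clicks) for _, clicks in points]
    ys = [math.log(price) for price, _ in points]
    count = len(points)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


n, c, r2 = loglog_fit(estimates)
print(f"n = {n:.2f}, c = {c:.2f}, R^2 = {r2:.2f}")
```

A large exponent n is the ceiling showing up in the numbers: near the top position, big jumps in price buy only tiny gains in clicks.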
Turning this figure into total searches per day is an inexact process. I’ve run ads targeted to “Africa News” for the past week, paying a maximum of $0.05 per click – the ad has appeared on 2035 pages and received 41 clicks, with an average placement of 2.2. At 291 impressions per day, this ad would need to receive a 2.4% clickthrough to experience the 7 clicks AdWords projects for $0.06 – the ad has actually received 2% clickthrough. It’s possible that Google is using the actual clickthrough on ads targeted to “africa news” to calculate clickthrough and that other ads for “africa news” are doing better than my ad – it’s also possible that they’re using a fixed clickthrough of 2-3% as an estimator. Assuming that range, and dividing the 9.7-click ceiling by 3% and by 2%, I can project that Google is experiencing 323 to 485 searches for “africa news” on a given day. (That figure seems depressingly low. If I’m somehow getting this very, very wrong, please let me know.)
It’s also possible that Google has vastly more searches for that term and only places my ad on some of the searches, but I don’t think so. I’ve told Google I’m willing to spend $5 a day – with only 5.8 clicks a day at $0.05, I’m paying $0.29 a day, or about 6% of the money Google could extract from me if they delivered more ads. The only rational reason for Google not to serve my ads is lack of inventory… and they can create more inventory by adding more ads to the sidebar, lowering my rank, but selling me impressions. (Google’s not shy about this – “St. Lucia”, the most expensive search term I’ve found in my “nations” set, gets 8 ads per page of results. “Africa News” gets two – mine, and a website selling South African television programming.)
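The back-of-the-envelope estimate above fits in a few lines of Python. The 2-3% clickthrough range is the guess from the text, and treating the 9.7-click ceiling as “clicks available if my ad ran on every page” is my reading of the numbers, not something AdWords states.

```python
# One week of my "Africa News" campaign
impressions = 2035
clicks = 41
days = 7

impressions_per_day = impressions / days          # about 291
observed_ctr = clicks / impressions               # about 2%
ctr_needed_for_7_clicks = 7 / impressions_per_day # about 2.4%

# If Google's click ceiling assumes a fixed clickthrough somewhere in 2-3%,
# daily searches are roughly ceiling / clickthrough.
click_ceiling = 9.7
searches_low = click_ceiling / 0.03
searches_high = click_ceiling / 0.02

print(f"impressions per day: {impressions_per_day:.0f}")
print(f"observed clickthrough: {observed_ctr:.1%}")
print(f"clickthrough needed for 7 clicks: {ctr_needed_for_7_clicks:.1%}")
print(f"estimated searches per day: {searches_low:.0f} to {searches_high:.0f}")
```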
It’s also possible to glean something about the market value of a term from this data. Bid very little for ads and you can get a sense for just how competitive each search term is, by looking at what the projected rank is for your ad. At the minimum bid, $0.05 per click, you’ll be ranked near the top (1.3 – 1.4) for searches for “Solomon Islands”, Mauritania, “Burkina Faso”, “Vanuatu”, Swaziland, “Sao Tome” or Lesotho. The same bid puts you far down the page (3 – 3.4) bidding for Maldives, “Costa Rica”, St. Lucia, Croatia, “Dominican Republic”, Fiji, Italy, Belize, Bulgaria, Cyprus, Spain and Bahamas. The most popular terms feature a plethora of ads from rival travel agencies; the least popular are places you’re probably not traveling to any time soon. Market scarcity may also play a role – Maldives, St. Lucia, Croatia, Belize and Bulgaria all get fewer projected clicks per day than the median (66) for my set of nations. (Then again, ALL the unpopular terms get fewer than the median.)
When you first start a campaign on AdWords, Google suggests the maximum price per click you should pay – it appears to set this price at whatever will get you a projected average rank of 1.3. For St. Lucia, for instance, this is $4.32 per click. Before concluding that the system is a) broken or b) preying on the very dumb, there are a couple of reasons to set your keyword price that high. One is that you rarely pay full-freight – even with a maximum of $4.32, Google projects you’ll pay $1.38 on average for your St. Lucia ads – you’ll get most at under $1 per click, but your willingness to bid higher will ensure you end up top ranked even when someone else bids $3 per ad… The second is elegantly explained by this roofing contractor who is willing to pay $25 per click on Google: he closes 30-70% of the deals that come to him through Google, generally for hundreds or thousands of dollars. At that point, a $25 customer acquisition cost is a bargain… (Feng and her colleagues speculate that interest in ads decays exponentially depending on ad placement – the second ad gets only a fraction of the attention the first does, and the third a fraction of the second. They end up recommending that programs like AdWords reward ads that manage to get decent clickthrough in lower positions…)
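The roofing contractor's logic can be sketched in Python as a cost per closed deal. This assumes, generously, that every paid click turns into a lead he can try to close; the function name is mine, and the real number depends on how many clicks never become leads at all.

```python
def cost_per_closed_deal(cost_per_click, close_rate):
    """Ad spend per deal actually closed, assuming every click becomes a lead
    (a simplifying assumption) and `close_rate` of those leads close."""
    if not 0 < close_rate <= 1:
        raise ValueError("close_rate must be in (0, 1]")
    return cost_per_click / close_rate


for close_rate in (0.30, 0.70):
    cost = cost_per_closed_deal(25.00, close_rate)
    print(f"close rate {close_rate:.0%}: ${cost:.2f} per closed deal")
```

Even at the low end of his close rate, the spend per deal stays well under the hundreds or thousands of dollars each job brings in.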
Let’s posit a projected rank of 1.3 as the threshold of sanity – i.e., not even Google, who is taking your money, thinks you should be willing to spend $5 per click on a St. Lucia ad. At $2.50 per click, the following nations are still below the sanity threshold (i.e., you’re going to be ranked 1.4 – 1.8 if you’re willing to pay “only” $2.50 per click): Lebanon, St. Lucia, Maldives, Cyprus, Bulgaria, Costa Rica, Panama, Barbados, Spain, Jamaica, Angola, Sudan, Dominican Republic, Mauritius, Italy, Malta, UK, Iceland, Portugal, Mexico, Turkey, Macedonia and Peru.
So why does the free market think these nations are worth so much per click? Some are obvious: St. Lucia, Costa Rica, Jamaica, Italy and others are expensive vacation destinations – a user clicking on the ad might be prepared to pay thousands for tickets or a hotel. Others – Mexico, Panama, Dominican Republic, Bulgaria, Lebanon – have large expatriate populations who search for flights home, discount phone cards or financial remittance services.
Sudan’s the really weird one. (Angola baffled me for a moment, before I followed a few links and discovered that advertisers were encouraging me to travel to Angola, Indiana.) Search for Sudan on Google. You’ll get a results page with eight ads, the maximum Google puts on a page. Every ad is from a nonprofit organization. Save the Children, Care USA, Doctors Without Borders, Amnesty International and Mercy Corps are running straightforward “We work in Sudan – support our work” ads; American Progress Action Fund and National Public Radio are running ads for their Sudan information sites. The top bidder is “Global Nomads Group”, an NGO which aims to connect children around the world through videoconferencing – they’re also the leading bidder for “Rwanda”.
The rank/price relationship for “Sudan” implies that one or more advertisers either are receiving an excellent clickthrough rate, or are paying well over a dollar per click for their ads, likely both. This reveals an uncomfortable truth about the relief business – on those rare occasions a humanitarian crisis gets global attention, aid agencies have to take advantage of the situation to raise money.
Doctors Without Borders’ website lists projects in 85 countries that they’ve worked on in the past few years. It’s pretty rare that the ongoing strife in Burundi gets international attention – the money that comes in from donors concerned about Darfur supports a program for rape survivors in Bujumbura, HIV prevention efforts in Malawi and anti-malarial efforts in Nigeria. The situation is analogous to the controversy over the Red Cross’s “Liberty Fund”, where the organization announced an intention to use some of the money donated to support the victims of 9/11 to support Red Cross projects around the country – Red Cross CEO Bernadine Healy ended up resigning over the public outcry. Jim Moore has raised concerns about Bono’s DATA (Debt AIDS Trade Africa) project buying Sudan impressions, advertising a site that had little to do with Sudan. (DATA no longer appears to be buying the “Sudan” keyword.)
While it’s interesting (and soul-crushingly depressing) to discover bidding wars over keywords associated with human suffering, I’m focused on the idea that I can pull data about web users’ interest in different subjects out of this data. My data collection holy grail would be an algorithm that allowed me to estimate how much money is spent on each keyword based on click availability and predicted rank at different maximum click levels. Unfortunately, the math is way beyond my capabilities – any game theory/auction economists out there want to give me some pointers?