In honor of the 20th anniversary of the ad exchange, I’m posting chapters from my unfinished book on the history of programmatic advertising. I hope you enjoy!
Built to spec
The little Right Media team had to find its own space now that we were breaking free from Poindexter. We rented an office at 29th and 5th. Today we would call it NoMad, but back then it was the garment district, full of wholesalers and rug dealers. Our space was tiny, just three offices and a conference room. On my first day, Matt Philips, my boss, handed me a fifteen-page spec for what my ad server was supposed to do. I got to work, hiring another engineer to help out.
Six weeks later, when Matt and I went to Mike Walrath, the founder and CEO, to show him the prototype, I assumed he would be excited. I had built almost exactly what Matt had written down. The three of us gathered in our only conference room and Matt began to walk through the screens of the prototype.
Mike’s instant response was frustration. “That won’t work, that just doesn’t do what we need it to do!” Matt tried to defend the design choices we had made, but it was clear that Mike wasn’t satisfied.
After 20 minutes, I decided to step in. “Mike, instead of telling me why it’s so terrible, could you tell me what you want it to do?”
Over the next half hour, I learned how Right Media worked. The business model was straightforward: it was an ad network, aggregating ad inventory from a multitude of websites and selling it to advertisers. There was no technology, just Mike’s craft and canniness. He would pore over reports to see which ads worked on which sites, calling publishers to ask them to move the ads to different placements or to increase or reduce the frequency at which they displayed.
But that wasn’t the vision. Mike wanted to build technology that would be smarter than he could be by picking up the phone. A system that would make sure that the right ad ran on the right medium at the right moment.
Mike explained that ad servers up to this time had always ranked ads based on a priority. You’d run the highest priority ad first, usually the one with the most budget, and then the next, and the next. To run a performance campaign where the advertiser paid based on the number of clicks or sales, you’d have to guess at the right priority. In reality, different ads would be appealing to different people on different sites, so you’d create multiple line items in the ad server with different rules and different priorities. This was complicated to manage, and in fact often meant that even though a performance ad might be the most profitable for a particular user, it would still be stuck at a lower priority. “I need you to build a system that ensures that we always run the ad that will make us the most money.”
I realized the real goal of the technology platform: to build Mike’s natural intelligence and skill into software that could figure out which ad to run on which site, ideally in real-time. I was going to build a software version of Mike’s brain. And moreover, I was going to rank ads based on their expected value, not their priority. It would be an auction, not a daisy chain. This simple idea, of auctioning ads instead of prioritizing them, led to the $100B programmatic advertising industry and, as Cory Doctorow would say, the enshittification of the internet.
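To make the contrast concrete, here is a minimal sketch of the two selection rules. It is an illustration only, not the Right Media code: the `Ad` fields, the ad names, and every number below are made up, and expected value is simplified to payout times an estimated response rate.

```python
from dataclasses import dataclass


@dataclass
class Ad:
    name: str
    priority: int         # hypothetical: lower number is served first by a priority-based server
    payout: float         # dollars paid per billable event (impression, click, or sale)
    response_rate: float  # estimated probability of that event on this impression


def pick_by_priority(ads):
    """Daisy chain: always serve the highest-priority ad, whatever it earns."""
    return min(ads, key=lambda ad: ad.priority)


def expected_value(ad):
    """Expected dollars earned by showing this ad once."""
    return ad.payout * ad.response_rate


def pick_by_expected_value(ads):
    """Auction: serve whichever ad is expected to earn the most on this impression."""
    return max(ads, key=expected_value)


# Illustrative numbers only.
ads = [
    # Paid per impression at a $1 CPM, so the "event" happens every time.
    Ad("brand_banner", priority=1, payout=0.001, response_rate=1.0),
    # Paid $20 per sale, with an assumed 0.02% chance of a sale per impression.
    Ad("signup_offer", priority=2, payout=20.0, response_rate=0.0002),
]

print(pick_by_priority(ads).name)
print(pick_by_expected_value(ads).name)
```

With these made-up numbers the performance ad is worth about four times as much per impression as the banner, yet a priority-based server would never reach it while the banner still has budget.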
Side note: On meritocracy in founding teams
You might wonder what Matt thought about me taking over the meeting and questioning the product design that he had hired me to build. Most people would have been furious, but Matt stayed cool. A couple of weeks later, he suggested to me that I should be the CTO of the company. I asked him what that meant for him, since his title was head of technology. He said, “I will do whatever is best for the company. Right now, that means working for you.”
Matt’s decision to step aside was one of the best and smartest business decisions I’ve been witness to. It wasn’t a selfless act - Matt ended up making a ton of money when we sold Right Media a few years later. It wasn’t an act of friendship, as we didn’t know each other well enough yet for a personal relationship to weigh on the decision. It was the only way for Right Media to be a transformational company. It enabled a partnership between Mike and me that took us further, faster, than we could have done with Matt in the middle. Think about the 28-year-olds that you know. How many of them have the maturity or wisdom to make that decision? For that matter, forget age. How many executives would willingly give up their role to let a younger, less experienced person take over?
Over the next few years Matt and I did become friends. He taught me how to live in New York City like a native and supported me when things got tough in my life. He was an instrumental part of Right Media’s success, devoting himself to learning how to build and run datacenters and the immense technical infrastructure we needed to operate at internet scale. I never heard him complain or express jealousy about my role or success. Matt is the embodiment of what it means to put the mission ahead of oneself. Years later, when we created the Founder’s Award at AppNexus, making a major equity grant to someone who made a massive contribution to the company as an honorary founder, it should have been named the Matt Philips Award.
Predicting Value
I spent the next few months building a revised ad platform based on hours of interviewing Mike and digging into how the performance advertising business worked. Mike had some fundamental insights about how internet ads worked based on his experience running the business by hand. For instance, the more times you’ve seen a particular ad - the frequency - or the more recently you’ve seen an ad - the recency - the less likely you’ll respond to it. To address this without technology, the publisher team would call websites and ask them to “frequency cap” ads so that they only showed once or twice a day per website.
My insight was that we could use machine learning to determine the decrease in response rate based on frequency and recency, and use this to find the optimal cadence for each ad. In addition, we could predict not just which ad would have the best response rate, but how much money we would make from each ad. The challenge of trying to do a continuous valuation of each ad based on frequency and recency was that we would need to perform this calculation every time an ad would serve, requiring far more computational power than ad servers of the time that executed simple rules like “serve this ad once a day then stop”. I was convinced that the increase in ad performance, and thus revenue and profit for our ad network, was worth the cost of more servers.
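As a rough sketch of what valuing an ad on every impression could look like, the snippet below discounts a base response rate by frequency and recency. The functional form (a fixed decay per prior view and a half-life on recency) and all parameter values are assumptions chosen for illustration; the text says only that the decrease was learned from data, not what shape it took.

```python
def adjusted_response_rate(base_rate, times_seen, hours_since_last_view,
                           frequency_decay=0.7, recency_half_life_hours=6.0):
    """Discount a base response rate for frequency and recency.

    Assumed model, not the production one: each prior view multiplies the
    rate by `frequency_decay`, and the recency penalty halves every
    `recency_half_life_hours` since the last view.
    """
    frequency_factor = frequency_decay ** times_seen
    if hours_since_last_view is None:
        # The user has never seen this ad, so there is no recency penalty.
        recency_factor = 1.0
    else:
        recency_factor = 1.0 - 0.5 ** (hours_since_last_view / recency_half_life_hours)
    return base_rate * frequency_factor * recency_factor


def predicted_value(payout, base_rate, times_seen, hours_since_last_view):
    """Expected dollars from showing the ad to this user right now."""
    return payout * adjusted_response_rate(base_rate, times_seen, hours_since_last_view)


# Illustrative numbers: a $20 payout and a 1% base response rate.
fresh = predicted_value(20.0, 0.01, times_seen=0, hours_since_last_view=None)
seen_an_hour_ago = predicted_value(20.0, 0.01, times_seen=2, hours_since_last_view=1.0)
seen_yesterday = predicted_value(20.0, 0.01, times_seen=2, hours_since_last_view=24.0)
print(fresh, seen_an_hour_ago, seen_yesterday)
```

Unlike a fixed rule such as "serve this ad once a day then stop", this calculation has to run for every candidate ad on every impression, which is where the extra computational cost comes from.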
Since cloud computing wasn’t an option in 2004, we needed to put physical servers in a datacenter. I asked Matt to take point on acquiring servers and datacenter space. Our budget was incredibly limited since Right Media had only raised a few hundred thousand dollars from our investors. Matt found a datacenter in Chinatown, a short taxi ride away from our offices at 29th and 5th, and, after countless hours of tuning the precise configuration, spent a huge portion of the company’s cash reserves on 10 servers, a pair of used load balancers, and some networking equipment. When this all arrived, we spent a couple of days in a loud, frigid cage at the datacenter. For some reason we decided to crimp our own Ethernet cables to save money, and many of our amateur wiring jobs just didn’t work properly. Finally we got everything working well enough to try lighting it all up.
In January 2004, we started running our first traffic through the ad server. That first month we ran about 1 million ads through the system. At the time it seemed like a big number, and we were both thrilled and exhausted as we tracked down bugs and tried to see if the algorithm actually worked. In February and March the volume increased significantly as we moved the entire business to our new platform.
I got used to hearing “Yes!” yelled from Mike’s office, as he refreshed the reporting and saw that a user had clicked on an ad and signed up for AOL, our largest advertiser. Mike was the ultimate power user. He dug into the details of every campaign, finding countless places where the algorithm seemed to be doing illogical things. I loved making him happy by tweaking the algorithm to make it even more effective. His excitement and intensity were contagious. We had an almost spiritual connection, me building the system and him pushing it to its limits. Our quest to drive users to click on ads and then buy things didn’t feel like an advertising business; it was more abstract, like trading stocks is to running companies. We were manipulating a system, not people.
By April, the system was working well. We needed more hardware; we needed to rearchitect parts of the system; but we were confident that we could drive performance better than almost any ad network out there. Now the business needed more advertisers and more publishers. We hired Ramsey McGrory to lead sales for the former, and I offered to take over the business development team to sign more web sites since the technology side of the house was stable. Mike agreed, and I turned my attention to understanding publishers, the people who operated and made money from web sites.
Understanding publishers
My first step was to learn as much as I could from our publisher lead, Aaron Letscher, a co-founder who had followed Mike from DoubleClick. He had been spending his time calling publishers and getting them to give us better rates or change the frequency caps on our ads. Our pitch was pretty simple: money. We would offer a share of the revenue we’d get from selling the ad space, or in some cases, agree to pay based on volume. Publishers would generally allocate their ad space across a handful of ad networks like us, often based on relationships or promises of big checks.
In theory, now that our prediction algorithm was working, we should be able to generate more clicks, and thus more revenue, for the traffic that publishers sent us. My naive assumption was that publishers would realize this and start sending us more traffic. However, this wasn’t happening. So we got on the phone and called some publishers to try to understand why we were writing bigger checks but not getting more traffic. As an example, one of our publishers was runescape.com. Our account manager, Meghan Ficca, called them and we waited with bated breath to hear their explanation. It was simple: “You’re paying us $0.50 per thousand impressions; FastClick is paying us $4.00.”
We were shocked and confused. There was no way that FastClick was generating 8 times the performance that our ads were. In fact, due to our close relationship with some of the large advertisers, we knew that our ads were outperforming theirs. Maybe there were advertisers they had that we didn’t? I spent hours refreshing websites and using browser plugins to identify when the ad came from FastClick, tallying the advertisers that they served. No significant difference, except that FastClick showed a surprising number of public service ads instead of running paying advertisers. When we were completely out of ideas, Meghan called Runescape and asked if we could borrow their FastClick login to see the reports for ourselves.
In the Right Media publisher portal, we reported 1 million ads shown for $500 of net revenue, or $0.50 per thousand impressions. The FastClick publisher portal reported 100,000 ads shown for $400, or $4.00 per thousand impressions. We stared at each other, confused. Was Runescape really splitting their traffic evenly? And if so, why did it seem like we were getting 10 times more traffic than FastClick? I looked back at my data set from manually refreshing the site. What if they weren’t counting the public service ads in reporting? That would artificially inflate the CPM dramatically. But how could they get away with it without causing a counting discrepancy with the publisher?
We called Runescape again and asked them how they tracked the number of ads that they sent to each ad network. They admitted that they didn’t. We pointed out that we were writing bigger checks than FastClick, and that the CPMs were based on what FastClick felt like counting, not the actual volume. The Runescape team was surprised but pleased to act upon the data, and they began to shift more traffic to us.
I grabbed Matt and Mike and excitedly explained to them what we had figured out. They both understood the issue immediately. At DoubleClick, they had seen a number of ad network shenanigans around reporting, most commonly reporting gross revenue instead of net revenue. They believed the industry should use an “effective CPM” as the metric of choice: how much money you actually got paid divided by the actual number of impressions you showed. I said, “That makes a ton of sense. Why don’t publishers have the ability to track the number of impressions that they send each network and how much money they actually make?” Neither of them knew.
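The arithmetic behind the Runescape discrepancy is small enough to sketch. The reported figures come from the two publisher portals described above. The number of impressions FastClick actually received is not known from the story, so the one-million figure below is purely an assumption, based on the idea that Runescape was splitting traffic evenly.

```python
def cpm(revenue_dollars, impressions):
    """Dollars earned per thousand impressions."""
    return revenue_dollars * 1000 / impressions


# Figures reported in each network's publisher portal.
right_media_reported_cpm = cpm(500, 1_000_000)
fastclick_reported_cpm = cpm(400, 100_000)

# ASSUMPTION for illustration: FastClick really received as much traffic as
# Right Media did, but only counted the impressions with paying advertisers.
assumed_fastclick_actual_impressions = 1_000_000
fastclick_effective_cpm = cpm(400, assumed_fastclick_actual_impressions)

print(right_media_reported_cpm)   # 0.5
print(fastclick_reported_cpm)     # 4.0
print(fastclick_effective_cpm)    # 0.4
```

Under that assumption, the network that looked eight times better on paper was paying slightly less per impression actually sent, which is the gap an effective CPM is meant to expose.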
Yield Manager for Publishers
I told the business development team to use our discovery to start convincing publishers to send us more traffic. Then I went back to my desk and started coding. I love the way it feels to write code, seeing an idea come into existence. The rhythm of code, compile, test, repeat was addictive. Each cycle my creation would look a little more like the idea in my head, or the idea would evolve based on new insights or roadblocks. I couldn’t have explained to someone what I was building. My mind was racing faster than I could communicate, making the conceptual leap from seeing the Runescape problem to building a solution.
After a few days of frenzied development, I was homing in on a solution and felt like I could actually explain it to the team. I grabbed my lead engineer, Ed Kozek, and explained what I was thinking. As usual, he caught on quickly, and we began to work together on a project I called Yield Manager: an ad server for publishers that would allow them to intelligently track how many impressions they sent to each ad network and how much money they actually made. By the middle of August, I was ready to test it. I called a few publishers and convinced them to roll out Yield Manager on their sites.
The initial results from the publishers were disappointing. My hypothesis was that most ad networks were acting like FastClick and taking advantage of publishers’ lack of technology. In reality, most networks reported impression numbers that were much closer to reality. The largest networks, like Google’s AdSense, reported effective CPM just as we did. This meant that publishers didn’t see a dramatic improvement in yield across the board, just on the second-tier networks like FastClick.
Yield Manager wasn’t intended to just be a reporting tool. It was also supposed to help publishers do a better job of allocating impressions across the ad networks based on their comparative value. I was hopeful that basing allocation decisions on better data would improve publisher revenue. Most publishers at the time used spreadsheets to make these decisions, pulling in their numbers once a month. Yield Manager enabled publishers to do this allocation every day. I thought that more frequent updates would improve yield, but based on our first few publishers, it didn’t.
I was frustrated. This seemed like such an obvious problem to solve with technology, and yet our innovative solution wasn’t actually helping. I spent hours scribbling ideas on notepads, brainstorming with Mike and Matt, trying to crack the code. My entire thesis for building a real-time prediction algorithm was that better prediction would mean better yield; better yield would drive increased volume; and increased volume should drive better advertiser outcomes. But if we couldn’t get publishers to allocate more volume as we increased performance, we’d be better off hiring salespeople and account managers to cajole and convince them than investing in technology.
I paced. I was cranky to my co-workers. I came home from work, pulled out my laptop, and threw myself back into the problem. I think my conscious mind knew there was a solution floating around, and my unconscious mind was plugging away. I tossed and turned in bed. I went on long runs along the Hoboken waterfront, hoping for inspiration in perspiration. Finally, the solution came to me, and it felt so obvious that I was almost embarrassed to present it to my colleagues.
Auctions? Not so fast.
“We have a real-time prediction engine, right? We can predict what each individual impression is worth. Instead of setting overall allocations to each network every day, we could do it on every impression! Take Runescape. We get a quarter of their traffic, but a lot of what they send us we know isn’t going to drive performance. What if we thought of this as an auction where we bid on the impressions, so we get more high-performing impressions and fewer low-performing impressions? I bet other networks see different impressions that perform well, so for any given impression, there should be an optimal network that will pay the most. That has to be far better for yield than randomly allocating each impression. Right?” Imagine me saying all of that at about twice normal speed, gesticulating wildly like someone who has barely slept or spoken to a human for a few days.
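The pitch can be sketched in a few lines. This is a toy illustration, not Yield Manager itself. The network names and values are invented, a fixed rotation stands in for "randomly allocating" so the result is repeatable, and it assumes every network can supply a predicted value for each individual impression.

```python
# Hypothetical predicted value (in dollars CPM) that each network places on
# four individual impressions from the same publisher.
impressions = [
    {"network_a": 0.80, "network_b": 0.30},
    {"network_a": 0.50, "network_b": 0.40},
    {"network_a": 0.10, "network_b": 0.60},
    {"network_a": 0.05, "network_b": 0.90},
]


def fixed_allocation_yield(impressions, rotation):
    """Allocate impressions in a fixed rotation, ignoring per-impression value."""
    total = 0.0
    for index, bids in enumerate(impressions):
        network = rotation[index % len(rotation)]
        total += bids[network]
    return total


def auction_yield(impressions):
    """Give each impression to whichever network values it most."""
    total = 0.0
    for bids in impressions:
        winner = max(bids, key=bids.get)
        total += bids[winner]
    return total


rotation_total = fixed_allocation_yield(impressions, ["network_a", "network_b"])
auction_total = auction_yield(impressions)
print(round(rotation_total, 2), round(auction_total, 2))
```

Because the auction picks the maximum on every impression, it can never do worse than a fixed split, and it does strictly better whenever the networks disagree about which impressions are valuable.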
Running an auction on each impression was a fundamentally different way of thinking about how advertising worked. This wasn’t how TV or print or radio operated. You could choose a show or a time slot, an edition or a day of the week. The idea that every individual would see a different ad was unique to digital. I didn’t invent that - a few startups had products that could target ads to people based on their browsing history during the dot com boom - but nobody had figured out how this idea of advertising to individuals would change the way ad space was sold and allocated. Mike, Matt, and Aaron were some of the most advanced thinkers in how to manipulate the existing system of online advertising to drive performance. I had the benefit of knowing so little about the way things were done that I didn’t know better than to question the operational foundations of the entire $600 billion advertising industry. The fact that hundreds of thousands of people around the world at some of the biggest companies in the world, from ad agencies to media companies, spent their entire working day buying and selling allocations of ads didn’t bother me… because I didn’t know.
I updated Yield Manager to use our real-time predicted bid instead of the average CPM and watched to see what happened. As expected, we cherry-picked more quality impressions, and our performance and payouts increased. However, the incremental impact was difficult to measure. Were we cherry-picking the same impressions that produced value for other networks? We had no way to know if our payout was actually higher than what our competitors were generating… unless they could give us real-time predictions too.
With the help of our beta publishers, I reached out to some of the ad networks to see if they could give us real-time estimates of how valuable they found each impression. None could. It was just not how they operated. Their models were like the ones Mike and Aaron had used before we built our predictive algorithm: smart campaign managers would tweak the system, changing frequency, recency, and the inventory used in order to increase performance. In aggregate, these techniques worked, but there was no concept of what any individual impression was worth.
It was a dead end. My inspired idea that I could help publishers improve yield through some network optimization technology or through the concept of a real-time auction simply wouldn’t work without other networks implementing predictive optimization. Reluctantly, I ended the beta and went back to work calling publishers and convincing them to join our network.
At this point, in early fall 2004, Right Media was running around a million dollars a month of media with around a 40% margin. There were around 700 publishers in our network. Each month we delivered over a billion ads. We had grown the team to around 15 people, and were consistently running a small profit. Mike was thrilled with our progress as a company, and he began talking to venture capitalists about investing in the business. In October 2004, we received a term sheet that would value Right Media at $18 million, up from just $1 million when Mike raised his angel round the year prior. Aside from my frustration that my innovative new product had failed, the team was thrilled. Our little company was thriving.
Special bonus!
Thanks to Ramsey McGrory for digging up this training deck about dynamic CPM pricing. It’s fascinating that many of these performance concepts have disappeared from the programmatic landscape… maybe they need to come back!