Understanding the nuances of a digital strategy in the travel business is a never-ending series of projects and experimentation.
One of the core disciplines of such an approach has been the emergence of test-and-learn - a process by which organizations conduct hundreds, often thousands, of small, medium and large experiments on their platforms to gauge which changes are useful and efficient for customers.
The Expedia Group is one of the big cheerleaders of the test-and-learn model, conducting these experiments throughout the year across its brands.
We spoke to its vice president of global product, Brent Harrison, to understand more about the scope of the test-and-learn system, what it does with the successes and failures, and how it impacts strategy across the group.
Can you give me some background on Expedia’s test-and-learn strategy?
Test-and-learn just describes our application of the scientific method.
Being a global travel provider and platform, we deal with a myriad of complexity that’s predicated on data. Data of course comes from suppliers – that could be lodging, airlines, cars and ground transportation, activities and so on.
Of course, we see a lot of data from understanding consumers and how they interact with products and services online. Test-and-learn is just a way we can generate hypotheses either through qualitative observation or quantitative analysis of that data and come up with things that we think will better serve our customer.
The test part would be constructing something like an A/B test, where we take a current state of a website or application, then we make a modification in the interest of serving the customer better. Then we learn from results.
We get quantitative data that say, hey, the original worked well or didn’t work as well as the new variant. It’s a way that we systematically look at improving our products to better serve our customers.
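To make that concrete, here is a minimal sketch in Python of how a readout like that might be computed – a two-proportion z-test on invented conversion counts. The numbers and the decision threshold are illustrative assumptions, not Expedia’s actual tooling.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Two-sided z-test for a difference in conversion rates (B minus A)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_b - p_a, p_value

# Invented counts: control (A) vs. variant (B), 10,000 users each.
lift, p = two_proportion_z(conv_a=480, n_a=10_000, conv_b=540, n_b=10_000)
print(f"lift={lift:+.4f}, p={p:.3f}")  # call a winner/loser only if p is small
```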
What are some examples of test-and-learn strategies that have worked, specifically around customer experience?
We generally will break down customer experience broadly into three buckets: One is dream/discover; one is shop/buy; and the other is the actual trip experience itself.
If you think about shopping and buying, the way test-and-learn can manifest itself – I’ll use myself as an example: I have a business life when I’m traveling with Expedia Group to a variety of destinations, and I also have a personal life when I’m traveling or vacationing with my family.
I could go into a shopping experience on one of our properties like Hotels.com and be looking for a business-friendly hotel that’s walking distance to our office in downtown San Francisco.
The way test-and-learn might play out – and the way we use intelligence to better tune the experience to the consumer need – could be something as subtle as the types of photos I see for a hotel property on our site being different. Like highlighting the bar I have access to or the grab-and-go coffee station or the business center or the gym.

I could go back to that property when they know that I’m traveling with my family – a trigger for that could be as simple as, instead of a party of one, I’m traveling with two adults and two children … [and I would want to know] hey, do I have an adjoining room; what does the bed configuration look like; is there a pool on property?
Things like that are a great example, at least in the hotel shopping use case, of how we can use not only the data that we have or the signal from the customer – we still have to generate the hypothesis that the photos you see, depending on your context, could materially change your likelihood of choosing one property over another.
With the same data set and the same general use case that I need a place to stay, the context could be different … and test-and-learn would be basically to discover by presenting different approaches.
In this case, surfacing content in a slightly different manner that might better help me select a property depending whether I was traveling in a business context or a personal or family context.
Can you give an example of a strategy that hasn’t worked and what you learned in that situation?
We tend to lionize or celebrate the winners, but the losers are just as important.
Let me give you a mobile example. Part of my responsibility is I build both for Android and iOS. In general, those experiences would look similar, but sometimes you see bizarre behavior.
One example would be the way users on iOS want to access, sort and filter. The basic use case for hotels: You come in and you do a basic search around date and location.
Now you want to sort on things like price, star rating and amenities. In the case of amenities, that could be free parking, breakfast included, Wi-Fi or athletic facilities on site.
Sometimes we get data that doesn’t really make sense. Say on iOS we present the amenities of a property in one particular order, and we run the exact same test on Android. The hypothesis would be: Hey, it’s the same need, it’s generally the same mobile consumer in the same market, so the data should be the same.
But in iOS on this particular property, people cared more about Wi-Fi, but on Android, they cared about the free breakfast.
It depends on how the test is constructed, but in that case, one would have been a failure, one would have been a winner.
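To illustrate the kind of divergence he describes, here is a toy per-platform readout in Python; every count below is invented for the sketch and the structure is an assumption, not Expedia’s actual tooling.

```python
# Hypothetical per-platform results for the same amenity-ordering test.
# (control conversions, control users, variant conversions, variant users)
results = {
    "ios":     (500, 10_000, 555, 10_000),   # variant up on iOS
    "android": (520, 10_000, 470, 10_000),   # variant down on Android
}

for platform, (ca, na, cb, nb) in results.items():
    lift = cb / nb - ca / na
    verdict = "winner" if lift > 0 else "loser"
    print(f"{platform}: lift={lift:+.4%} -> {verdict}")
```

Read out in aggregate, the two platforms would largely cancel each other out; segmented, each yields a clear signal – which is what forces the product decision about maintaining two experiences.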
We then have to adapt or adopt or make the decision from a product perspective: Is the customer engagement and the utility the user derives worth actually having two different experiences – one that showcases one thing for the exact same property vs. another, purely on the basis of the device you’re using?
You see a lot of these things that are a little bit counter-intuitive. You’d think they are just the same, but the data turns out to give you a different signal.
Although that’s maybe kind of a negative, I think it also underscores the importance of that scientific method and how test-and-learn plays out and allows us to make more intelligent product choices for our customers.
Of the tests that you run, what’s the success rate of things that make it onto your live sites?
I don’t know if I have exact rates top of mind, but what I will say as a product leader: I like seeing divergence. There are three outcomes when you do a test. Let’s take a simple example of an A/B test. There’s a control, then there’s a variant - the A and the B.
I like to see almost an equal number of winners vs. losers. If you’re getting too many winners, the challenge I would have is you aren’t pushing the boundaries of what people really want, and it’s a fascinating place because people oftentimes can’t articulate what they want, or in isolation will give you a false impression of what they want.
It’s only when you test that you see behaviorally how people react… The beauty is if you get a strong signal – a strong signal is positive or a strong signal is negative – the benefit of positive of course is you can just roll those out globally.
In our case where we have multiple brands and multiple markets we’re serving, it fairly rapidly provides value to consumers wherever they are in the world. Conversely, from a learning and a modification of the product perspective over time, the negative results are also helpful.
I go back to my second example – we never would have hypothesized that iOS and Android users are different in that way.
They are different in other ways, but in that particular way, the beauty of running a test was we got a strong negative signal from one of the platforms, and that allowed us to learn and then run another test where we modified the experience and then got a positive signal.
I may be a little bit operational in my answer, but that’s how we use it, and that’s the benefit. I look for almost an equivalent number of positives vs. negatives.
We’re doing two things: We’re always improving the product for consumers, and secondarily, we’re improving the quality and specificity of the learning that helps inform what we are going to do next for our customers.
Looking at what you said about “pushing boundaries” - how do you try and push the boundaries of what you’re giving to consumers, and what’s your objective there?
I’ll try and answer the second question first: I think the objective for us is equipping people to have the best possible experience regardless of where they are in their proverbial journey.
I gave you a simple construct at the outset that people are either dreaming or planning travel; shopping actively and buying; and then getting ready to or actually experiencing the trip.
It’s fascinating that, just from a time perspective, we spend most of our time dreaming about travel and not actually shopping for it or doing it. So that dynamic is interesting in and of itself. We use that as a model to say, well, what is it that you need in each of those phases?

Let me give you a couple of examples in the in-trip one, which is an area I’m really passionate about. One is we introduced a conversation facility that allowed our suppliers – our hoteliers – to basically establish a live chat-like connection to consumers.
The problem is kind of a two-sided problem: On the one hand, a customer has come to us as an online travel agency and entrusted us with their travel plans, and that’s a big deal.
For most people, it’s an expensive item. They get to do it once, maybe twice, a year if they’re lucky, and there’s a lot of stress that goes along with that. They want to make sure – and we have a responsibility to make sure – we do everything in our power so they get the right product and ultimately have a good trip.
On the other side of our marketplace, we hear from suppliers that say, you know what, historically we don’t know much about the customer you bring our way.
While we love you filling our rooms or beach resort or whatever, we think we can provide a better quality of service if we had a connection with that customer before they arrive.
So the product that we developed is really just a conversation platform: if you go to your mobile app itinerary – which is a place people are increasingly going to manage their trip – you can now actually send a message to the hotelier.
Let me give you an example: When I travel to India, I know I’m going to get in at one in the morning, and I know I’m going to be horribly jetlagged. It’s a way for me to very easily send a message to the hotel and say, hey, can you make sure you’re ready for me to check in? I don’t want to spend time at the front desk. Can you make sure I have an extra bottle of water?
Conversely, the hotelier can say, we got it, we’ll take care of you. And would you like a newspaper in the morning? Would you like a coupon for a drink when you arrive? Can we preorder something at the breakfast buffet?
That’s a great example where we just need to provide a facility that allows the two most important people in the transaction – the consumer and the supplier – to make that connection. We’re trying to be a good steward of that dynamic and just letting those conversations happen.
When did that functionality roll out?
We started in test mode six months ago. We started actively signing up suppliers maybe three months ago – basically they have to opt in, because if you expose the functionality to consumers and they use it, we want to make sure the hotelier is prepared to respond so we don’t create a bad experience.
It’s one of those things where you’ll increasingly see more and more hoteliers equipped and therefore when you book a trip on Expedia or one of our other brands, you’re able to access this messaging capability.
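A minimal sketch of the kind of opt-in gate he describes, in Python – the property IDs, the set lookup and the function name are all hypothetical stand-ins, not Expedia’s actual implementation:

```python
# Only surface the messaging entry point for properties that have opted in,
# so a consumer never messages a hotelier who isn't prepared to respond.
OPTED_IN_PROPERTIES = {"hotel-123", "hotel-456"}  # grows as suppliers sign up

def show_messaging_entry_point(property_id: str) -> bool:
    return property_id in OPTED_IN_PROPERTIES

if show_messaging_entry_point("hotel-123"):
    print("Render the 'Message the hotel' button on the itinerary screen")
```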
We need that for airlines…
I hear your frustration on that one. Just to give one other example about how we’re trying new things: One of the challenging things with airlines is the vagaries of weather and reliability, so we’re testing getting access to flight data – like real-time data.
Take, for example, you’re flying to Europe and you have to connect in Iceland and you’ve got a relatively tight window and you’ve got a delay on your initial flight.
Wouldn’t it be nice if you got not just a notification that you’re going to be delayed, but also a prompt to dynamically reroute you? We realize you’re going to miss your connection; here are your options; click to make this modification.
That’s an example of where we’re trying to wire in the understanding of data – the flight schedule in this case and the fact that you’re a valued customer – with the vagaries of change, because things can happen that we can’t control. But how can we use that to be proactive with the customer to give them peace of mind?
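As a toy illustration of that proactive check – with invented times and a made-up minimum connection window, not any real airline feed – the core logic might look like:

```python
from datetime import datetime, timedelta

MIN_CONNECTION = timedelta(minutes=45)  # assumed minimum transfer time

def connection_at_risk(arrival: datetime, delay: timedelta,
                       departure: datetime) -> bool:
    """True if the delayed arrival leaves less than the minimum window."""
    return arrival + delay + MIN_CONNECTION > departure

arrival = datetime(2019, 6, 1, 10, 0)    # scheduled arrival in Iceland
departure = datetime(2019, 6, 1, 11, 0)  # scheduled connecting departure
if connection_at_risk(arrival, timedelta(minutes=40), departure):
    print("You'll likely miss your connection - here are your rebooking options")
```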
For tests that you’re running, about how many people on Expedia’s end are involved in executing a test and interpreting the data?
It varies a little bit team by team, but let me give you a concrete example. The mobile app team would have a variety of multi-functional teams – in the software world you’d call them “scrum” or “agile” teams; we call them pods – and they’re basically a group.
You could think of these as a product manager who’s helping establish priorities, a designer – in the case of a front-end technology like a mobile native app you need a designer to work on flows and look and feel – and you would typically have an engineer or two.

In the case of mobile, where you have two platforms, there could be as many as four or more engineers, which would include testing and release engineering.
We would have an analytics person who specializes in test construction – it’s part science, it’s part statistics and it’s part analytics. It’s the technical side of analytics, so we’re using an analytic tool framework.
Those are kind of the steps we would go through to get something into the product: we run the test, with a certain portion of our customers seeing it, until we get statistical significance. So if [three people] were all using the app doing the same search at the same time, one of us actually might get a slightly different experience.
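A common way to get that behavior – a sketch assuming hash-based bucketing, which is a standard industry pattern rather than a description of Expedia’s actual system – is to assign variants deterministically from a stable user ID:

```python
import hashlib

def assign_variant(user_id: str, test_name: str, exposure: float = 0.5) -> str:
    """Deterministically map a user to a bucket for a given test."""
    digest = hashlib.sha256(f"{test_name}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 10_000 / 10_000  # uniform-ish in [0, 1)
    if bucket >= exposure:
        return "not_in_test"  # outside the exposed portion of customers
    return "variant" if bucket < exposure / 2 else "control"

# Three people doing the same search can land in different variants,
# but each individual always sees the same one on every visit.
for user in ("alice", "bob", "carol"):
    print(user, assign_variant(user, "hotel_photo_ordering"))
```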
If we go back to the example of hotel photos – we’re all putting in the same thing on the same iOS application, but I’m seeing a different photo presentation – once we have statistical significance … and it might only take a week or two to get there, we would want to make a call.
Then the analytics person would typically work back with the product manager, and in some cases the engineer again, to determine whether it was a winner or a loser and what we learned from it.
If I add up those people, it’s anywhere from six to eight people. And they’re not doing one test at a time. These teams on mobile at any point in time could be running a hundred tests.
Of the tests that are being worked on at any given time, how do you prioritize what goes where in the queue, and what’s the reasoning behind it?
It kind of strikes at the heart of the role of product management. The simple answer is: I want to optimize for user value, so whatever is going to either create more value for the user or eliminate the most amount of pain for a user.
We kind of joked about the flight example, but that’s a case where there’s probably value and delight if we did it well.
Sometimes there are things we need to do for our suppliers that add value. There are things we need to do to make sure we’re a good partner for the business, meaning returning value for the business. We’re set up where our businesses are ultimately working off the same unified platform, but they’re kind of run as businesses unto themselves.
And you’re dealing with contested real estate. If you think about mobile, there are only so many things you can put on a screen. Do I show hotels? Do I show flights? Do I show packages?
After you book, is it more important for me to show you your confirmation and itinerary, or to show you the things you could now start to plan to do in trip? We noticed you didn’t book a vehicle – perhaps that was an oversight? It gets a little more nuanced.
My general default answer as a consumer product technologist is, it starts with the customer. If we’re doing right by the customer, that’s a great unifier and how I would think about the priorities of one test vs. another.
How does test-and-learn change when you’re thinking about testing something on an Alexa or another third-party system?
We can test all of it. We’re very data-driven, we’re very test-driven. It’s very core to our DNA. Every piece of functionality to date is tested, so I can look at how many times a skill was opened, what pieces of functionality did people use, what was the auditory language or the intent… We can measure all of that.
The analytics approach that we use on the back end is exactly the same. To us this is, in a sense, hey, what we did on the web in terms of the testing methodology and the tools we used on the mobile app we’ll do on Alexa and we’ll do in chat.
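The kind of per-surface instrumentation he describes might look like the following sketch – the event names, fields and transport are hypothetical, not Expedia’s actual analytics schema:

```python
import json
import time

def log_voice_event(session_id: str, intent: str, utterance: str) -> None:
    """Emit one analytics event per recognized voice intent."""
    event = {
        "ts": time.time(),
        "session_id": session_id,
        "surface": "alexa",        # same schema could cover web, app, chat
        "intent": intent,
        "utterance": utterance,    # the recognized language
    }
    print(json.dumps(event))       # stand-in for the analytics pipeline

log_voice_event("abc-123", "GetFlightTime", "what time is my flight tomorrow")
```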
We’re spending a lot of time on that. We were one of the first travel companies to be on Alexa – almost two years now. Voice is a different beast, I think. So for core shopping, it’s challenging because our product is highly configured.
What I mean by that is there are a lot of dimensions. It’s not like, oh, I have a brand of toilet paper or toothpaste and I need to reorder it – which is a great use case, reordering that consumable the moment I think of it.
Booking a trip is far more complicated. To date, we’re investing across that life cycle. Can travelers ask questions about destinations? If so, engagement might look more like: hey, tell me about weekend excursions from Seattle, or, what are the best beaches in the Caribbean?
For some of that dream discovery, voice works well. If you jump to the other end of the continuum of the in-trip, it also works really well.
I think using my use case where San Francisco is a frequent trip for me, saying, hey, Alexa, it’s Sunday night, we’re cooking in the kitchen, remind me what time my flight is tomorrow, or remind me what airline I’m on or what time I need to leave to be at the airport – those are good use cases.
I think what we’ll see in core shopping is an evolution toward more of a hybrid, and what I mean by hybrid are things that are voice-enacted but have a visual component to them.
The best example … is where I can talk to my phone but then I can get a visual representation of a search result. So instead of getting one result, I can get cards that say next time you go to Las Vegas, you can check out these three properties.
I can also save those lists and come back to them on voice, or – and I think this is really powerful – next time I’m on my laptop or tablet or mobile device, I can initiate a shopping session, go back and do my sorting and filtering before I actually buy.
Our hypothesis is … it’s that interplay of voice and visual that for our business and for our customers I think will be the place we’ll see the most traction in time.