Data crunching to find the cheapest airline in the world

By Viewpoints | December 14, 2012

NB: This is a guest article by Michael Cameron, co-founder of Rome2rio.

The growth of big data and the availability of APIs are providing exciting new opportunities for making sense of travel data, even for a fledgling start-up like Rome2rio.

Airfares fluctuate wildly, but they do follow certain obvious trends: longer flights cost more, and some airlines are more expensive per mile flown than others.

We recently started an internal project to model approximate, typical airfares for the flight itineraries assembled by our system. Our aim was to use this model to improve the accuracy of our multi-modal routing engine. In the process, however, we generated some interesting data worth sharing with the industry.

We modeled airfares using some simple parameters. To do this, we examined the economy class airfares displayed by Rome2rio to users over the past four months, totalling some 1,780,832 price points. We grouped the airfares by distance and selected the 20th percentile fare for each distance (where 20% of fares are lower and 80% are higher), to produce the following graph:

The graph shows a pretty clear linear relationship between distance traveled and airfare. Based on this data, we can create a simple equation to model this relationship:

Fare = $50 + (Distance * $0.11)

where Fare is the cost in US$ of flying Distance miles. On average, a fare costs $50 before any flight distance is taken into account, plus an average of 11 cents per mile traveled.

So what happens if we divide our data by airline? How does the 11 cents per mile flown vary per carrier?

We analyzed the average cost per mile for fares grouped by airline, using the same methodology. We only considered competitive fares (those within two times the cheapest fare for that price search) to remove outlier price points.
We also excluded airlines where we had insufficient data.

The results are summarized below:

The results are fascinating, and there are some clear trends. Budget carriers such as Ryanair and AirAsia are at the low end of the scale; short-haul, turboprop-operating carriers such as Regional Express and Darwin Airlines are at the high end.

There are, however, many factors that can influence per-mile costs, including type of aircraft flown, routes flown, local salary and fuel costs, ancillary revenue, and airport landing fees.

The results should also be taken with a grain of salt, since our sample set is small, no statistical analysis has been performed, and the results may be biased depending upon the types of searches performed on Rome2rio. Also, Rome2rio may not always have access to the cheapest fares. A major, comprehensive meta-search player such as Kayak or Skyscanner could perform a more thorough analysis based on a far greater sample of search logs or their airfare caches. Nonetheless, we wanted to share this data since we thought the results would be of interest to the travel industry, travel buffs, or anyone excited about big data.

NB2: Globe image via Shutterstock

NB3: Special thanks to Fenn Bailey from Adioso and Timothy O'Neil-Dunne for providing valuable feedback on the analysis.