From Intuition to Insight

 

Notes on why analytics is key for product managers by Airbnb data scientist Dr. Theresa Johnson...

It’s important that early stage startups build a strong data culture early. In many cases those insights are the difference between shutdown and success.
— Brian Chesky
  • Small businesses should develop an appreciation of data and a data science culture early
  • When can we start to infuse data and data-driven thinking into our product landscape?

Data Stack Levels:
Visualization - Sustained narrative around execution
Experimentation - Disentangling causality
Data Products - Machine learning and feedback loops
Analysis - Understanding user behavior and business drivers
ETL (Extract, Transform, Load) - Curates clean data for analysis and reporting
Infrastructure - Stability of warehouse systems and tools

Airbnb's embedded Data Science Team model:
Outcome and OKR driven
Smaller teams: the two-pizza rule
Push down decision-making

Data scientists are only as impactful as the context they have for the set of problems they’re meant to solve.
— Dr. Theresa Johnson
  • Sub-optimal approach: Someone gives a data scientist a list and says, "Could you give me this number and this number and this number?" Data scientists are more than just SQL monkeys, and treating them this way under-utilizes them.
  • Optimal approach: Data scientists should be involved in the conversations. That's how you start guiding product development with data. They have to be involved in framing the problem statement and in deciding how to go about solving it.

Machine Learning Vision: Establishing a robust, extensible, and efficient machine learning workflow is critical to unlocking productivity and democratizing machine learning across Airbnb

  • Use predictive modeling with feedback loops to drive insight. That means taking the next round of data and feeding it back into the model to make the model better.
  • If you have any sort of customer experience support (CX), how inefficient are they? How much more efficient could they be if there was an algorithm running in the background that was constantly taking feedback from them and being able to predict the most important issue that they might face in a given day or in a given week or in a given month? Could you plan a bit better?
  • Use ML to develop a lifetime value model (LVM) for every single home and every single host. Prioritize top users, segment and target users, etc.

Controlled Experiments: The fundamental building block for decision-making at Airbnb

  • Is it safe to launch the feature? You can't assume that it's safe for the business. Is it actually driving your specific goal?
  • Motivational factor: How much is this experiment contributing? How are the late nights and long hours contributing to the team goal?
  • AB testing: Without AB tests you have no counterfactual. What would have happened if we had not launched the treatment?
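
A minimal sketch of how a treatment group can be compared against its control counterfactual, assuming the goal metric is a conversion rate and using a standard two-proportion z-test. The function name and the counts are hypothetical, and the talk does not say this is the test Airbnb uses.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_control, n_control, conv_treatment, n_treatment):
    """Return (absolute lift, z statistic, two-sided p-value)."""
    p_c = conv_control / n_control
    p_t = conv_treatment / n_treatment
    # Pooled rate under the null hypothesis that both arms convert equally.
    pooled = (conv_control + conv_treatment) / (n_control + n_treatment)
    se = sqrt(pooled * (1 - pooled) * (1 / n_control + 1 / n_treatment))
    z = (p_t - p_c) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_t - p_c, z, p_value


# Hypothetical counts, for illustration only.
lift, z, p = two_proportion_z_test(500, 10_000, 560, 10_000)
print(f"lift={lift:.4f}, z={z:.2f}, p={p:.3f}")
```

The control arm stands in for "what would have happened without the launch", so the lift is only meaningful if both arms were assigned at random.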

Characteristics About Experimentation:

  • How many populations could you be testing something on?
  • What is the visit pattern of use?
  • How can you map logged-out traffic to logged-in traffic? If this is not solved, the data might be a bit muddled, and experiments might take longer to show results
  • How can you track multiple devices and should you track them independently?
That was Airbnb’s real innovation— a platform of ‘trust’— where everyone could not only see everyone else’s identity but also rate them as good, bad, or indifferent hosts or guests
— Thomas Friedman

Unique characteristics about experimentation at Airbnb:

  • 2-sided marketplace
  • Users only visit when planning trips (guests) or adding a home (hosts); the visit pattern isn't consistent the way it is for Facebook, for example
  • User decisions take a long time (conversion funnel of 2-3 weeks)
  • Users often logged out when researching
  • Users browse from multiple devices
  • Offline experience that you can't control
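
One way to picture the logged-out and multi-device problem is a simple identity-stitching pass. This is a sketch under strong assumptions: every event carries a device ID, a device belongs to the first user who logs in on it, and the field names (`device_id`, `user_id`, `action`) are hypothetical. Shared devices would break it.

```python
from collections import defaultdict


def stitch_events(events):
    """Attribute events to users, using any login seen on the same device.

    `events` is a list of dicts with keys "device_id", "user_id" (None when
    logged out), and "action". Returns {identity: [actions in order]}.
    """
    # First pass: learn which user each device belongs to from logged-in events.
    device_owner = {}
    for e in events:
        if e["user_id"] is not None:
            device_owner.setdefault(e["device_id"], e["user_id"])

    # Second pass: credit every event, logged in or not, to the resolved identity.
    stitched = defaultdict(list)
    for e in events:
        if e["user_id"] is not None:
            owner = e["user_id"]
        else:
            owner = device_owner.get(e["device_id"], f"anon:{e['device_id']}")
        stitched[owner].append(e["action"])
    return dict(stitched)


# Hypothetical events: a guest researches on a phone while logged out,
# later logs in on the phone, and books from a laptop.
events = [
    {"device_id": "phone", "user_id": None, "action": "search"},
    {"device_id": "phone", "user_id": "u1", "action": "wishlist"},
    {"device_id": "laptop", "user_id": "u1", "action": "book"},
    {"device_id": "tablet", "user_id": None, "action": "search"},
]
print(stitch_events(events))
```

Events that never resolve to a login stay under an anonymous identity, which is the part of the traffic that keeps the data muddled.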

Case Study: Creating and increasing Airbnb's platform of trust

By 2012, most hosts around the world were eligible for a free professional photoshoot of their listing, to move from the couch-surfing feel to something more aesthetic and professional. From 2011 to 2016, they found that providing this service was also a large business operation, with millions of dollars spent every year.

Fast Company article here.

Was professional photography still serving its function of building trust in 2016? Was there a changing landscape? Do photos still equal trust?

Changing landscape:

  • 2013: launched "verified ID" system
  • 2014: changed rating system to be more "honest" (double-blind ratings)
  • 2014: normal publicity (Inc.'s Company of the Year)
  • 2015: Apple's iPhone started including a 12-megapixel camera

Did professional photography matter anymore?

AB Testing to measure impact of professional photography:

Group A (50%): Eligible for photography
Group B (50%): Received "Sorry, no photographers available at the moment" email
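
A minimal sketch of how a 50/50 split like this could be assigned, assuming hosts are bucketed by hashing a host ID. The function name, the experiment label, and the hashing scheme are illustrative assumptions, not Airbnb's documented mechanism.

```python
import hashlib


def assign_group(host_id: str, experiment: str = "pro_photography") -> str:
    """Deterministically place a host in group A or B, roughly 50/50."""
    # Salting with the experiment name keeps assignments independent
    # across different experiments.
    digest = hashlib.sha256(f"{experiment}:{host_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # bucket in 0..99
    return "A" if bucket < 50 else "B"


# The same host always lands in the same group.
print(assign_group("host-123"), assign_group("host-123"))
```

Hashing instead of drawing a random number on each request means a host who asks for photography twice gets the same answer both times.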

Experiment challenges:

  • Endogeneity: Only hosts with the willpower to request photography get it. The people who go do it are fundamentally different from the people who are otherwise floating and might not do it. Are we comparing two populations with the same intent?
  • Infeasible guest-facing test: A guest visiting the site would have to see either all professional photography (treatment) or all cell phone photography (control). That would be unethical because you'd have to fake pictures.
  • Bookings cannibalization: Are the listings with professional photos cannibalizing the others? How do you know if it's an extra booking or a booking that shifted from one listing to another? Maybe overall bookings stay the same but shift between listings.
  • Statistical power and significance: The subset of people who request professional photography is already small. If you narrow it down further, how long will the experiment need to run? Will it reach statistical significance?
  • The human element, CX agents: You can try to control it, but "customers are always right"
  • Metrics: The proxy for trust is bookings as a host. Are the metrics you see the true effect, or is there a cannibalization factor that you have to account for?
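
The statistical power question above can be made concrete with the standard sample-size formula for comparing two proportions. This is a sketch, assuming a booking-rate metric, a two-sided test at 5% significance, and 80% power; the rates and the daily volume are hypothetical, not figures from the talk.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_group(p_control, p_treatment, alpha=0.05, power=0.80):
    """Hosts needed in each group to detect the given difference in rates."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_control + p_treatment) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p_control * (1 - p_control) + p_treatment * (1 - p_treatment))
    ) ** 2
    return ceil(numerator / (p_treatment - p_control) ** 2)


def days_to_run(n_per_group, daily_requests):
    """Days needed to fill both groups at a given rate of eligible requests."""
    return ceil(2 * n_per_group / daily_requests)


# Hypothetical inputs: 5.0% vs 5.6% booking rate, 500 photo requests per day.
n = sample_size_per_group(0.050, 0.056)
print(n, "hosts per group,", days_to_run(n, 500), "days")
```

Halving the detectable difference roughly quadruples the required sample, which is why a small eligible population translates directly into a long experiment.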

Results:

  • Negative impact on global Airbnb bookings (trust) was almost undetectable
  • Upper bound: assumes no cannibalization
  • Professional photos mattered to hosts in certain markets more than others
  • Positive impact on owner bookings and booking value

Conclusion: Offer a unique benefit to hosts at a nominal fee. Allow more hosts, especially in newer markets, to uplevel their listing appeal.

Rolled out Globally Available Professional Photography for Hosts.

  • Only pay once they get a booking
  • Affordable based on average earnings in every market
  • Allows the service to scale up and be available to more hosts
  • Saves Airbnb money to focus on different methods of driving trust and safety for owners

Airbnb's Data University Vision:

A set of courses taught by data scientists for everyone in the company to empower every employee to make data-informed decisions by providing data education that scales to their team and their role.

TechCrunch article here.
Medium article here.

Data 101: Data-Driven Decision Making and Problem Solving

  • Identify root cause of issue
  • Set data-informed targets and goals
  • Generate hypotheses tree (or list out all possibilities)
  • Solve root cause of issue
  • Consider data as a factor in all aspects of problem solving
     
  • Tool: 5 Why's
    • It might take asking WHY 5 times to get to the root of the problem
  • Tool: Issue Trees
    • How do you deep dive in a data-informed fashion, avoiding all-over-the-place intuitive thinking and using a data-driven approach to reach the cause?
    • Make sure the issues are mutually exclusive and collectively exhaustive
    • Overlay with data; be concise
    • Output = concise problem statement to understand why it's happening
  • Tool: Hypothesis Tree
    • Drive the solution
    • Estimate impact
    • Which one is the biggest win if you were to address it? (small wins vs big wins)
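
The impact-estimation step of a hypothesis tree can be sketched as a simple ranking. The scoring rule here (share of users affected times expected relative lift) is an assumption chosen for illustration, as are the class, the field names, and the example hypotheses; the course may estimate impact differently.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    name: str
    affected_share: float  # share of users touched by the issue, 0..1
    expected_lift: float   # estimated relative improvement if fixed, 0..1

    @property
    def impact(self) -> float:
        # Assumed scoring rule: reach multiplied by expected improvement.
        return self.affected_share * self.expected_lift


def rank_hypotheses(hypotheses):
    """Return hypotheses sorted from biggest to smallest estimated win."""
    return sorted(hypotheses, key=lambda h: h.impact, reverse=True)


# Hypothetical leaves of a hypothesis tree.
candidates = [
    Hypothesis("Simplify checkout form", 0.50, 0.02),
    Hypothesis("Fix search filter bug", 0.10, 0.30),
    Hypothesis("Reword confirmation email", 0.90, 0.005),
]
for h in rank_hypotheses(candidates):
    print(f"{h.name}: {h.impact:.4f}")
```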