How to Predict Seattle Airbnb Listing Price using Machine Learning with Python

Vincent Chong
Jun 13, 2021

Introduction and Motivation

Airbnb has revolutionised the way travellers and hosts connect with each other. Hosts who have an extra room or house can easily list it on Airbnb to earn extra cash. Hosts can set any price they want, but a few factors are likely to play a role in deciding the most suitable listing price.

Let’s explore the Seattle Airbnb dataset and attempt to answer the following questions:

  1. Does location (e.g. neighbourhood) affect the listing price?
  2. Does the type of property and room affect the listing price?
  3. What are the top 5 factors influencing the listing price? To be answered using a machine learning model, Random Forest.

Finally, we will attempt to train a Random Forest model to predict the listing price.

The Seattle Airbnb dataset has been downloaded from Kaggle for us to work with.

Question 1: Does location e.g. neighbourhood affect listing price?

Our first impression is usually that location plays a part in determining the listing price. Areas with better amenities, better public transport or more development should generally command a higher listing price. The chart below shows the average listing price for each neighbourhood.

Average Price based on Location

From the bar chart above, we can see that there is obvious price variation based on location. The lowest average price is around $80 in Delridge, while the highest average price is around $160 in Magnolia. Neighbourhoods in the northern part of Seattle near the bay, such as Magnolia, Queen Anne and Downtown, are able to command a higher listing price than other locations.

In conclusion, location does play an important role in the Airbnb listing price.
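The averages behind a chart like this can be computed with a pandas group-by. The snippet below is a minimal sketch rather than the exact code used for this analysis: the column names `neighbourhood_group_cleansed` and `price` are assumptions about the Kaggle `listings.csv` file, the helper `clean_price` is hypothetical, and the rows are made-up values used only to make the example runnable. It also assumes the price is stored as text such as "$1,200.00"; if it is already numeric, the cleaning step can be skipped.

```python
import pandas as pd

# Made-up rows standing in for the real listings file (illustrative values only).
# Column names are assumptions about the Kaggle dataset.
listings = pd.DataFrame({
    "neighbourhood_group_cleansed": [
        "Magnolia", "Magnolia", "Delridge", "Delridge", "Queen Anne",
    ],
    "price": ["$150.00", "$170.00", "$75.00", "$85.00", "$1,000.00"],
})


def clean_price(series):
    """Hypothetical helper: turn strings such as '$1,200.00' into floats."""
    return series.str.replace(r"[$,]", "", regex=True).astype(float)


listings["price"] = clean_price(listings["price"])

# Mean price per neighbourhood, highest first.
avg_price_by_area = (
    listings.groupby("neighbourhood_group_cleansed")["price"]
    .mean()
    .sort_values(ascending=False)
)
print(avg_price_by_area)
```

Calling `avg_price_by_area.plot(kind="bar")` would then draw a bar chart of the result, provided matplotlib is installed.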

Question 2: Does the type of property and room affect listing price?

For property type, we can see that tents, dorms and treehouses have the lowest listing prices, averaging around $25 to $50. Boats have the highest listing price, with an average of $250. For the other property types, the average listing price ranges between $100 and $150, with lofts and condominiums at the higher end. The property types that affect the listing price the most are therefore the unusual ones, such as tents, dorms, treehouses and boats.

Average Price based on Property Type

For room type, the average listing price differs significantly depending on the type of room offered. This makes sense: a shared room is typically cheaper, while an entire home / apartment costs much more because you get more space and more rooms.

Average Price based on Room Type
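The same group-by idea works for both property type and room type. Here is a small sketch, assuming columns named `property_type`, `room_type` and a numeric `price`; the helper `average_price_by` is hypothetical and the rows are made up for illustration, not taken from the dataset.

```python
import pandas as pd

# Made-up rows for illustration only; column names are assumptions.
listings = pd.DataFrame({
    "property_type": ["House", "House", "Boat", "Tent", "Apartment", "Apartment"],
    "room_type": [
        "Entire home/apt", "Private room", "Entire home/apt",
        "Shared room", "Private room", "Shared room",
    ],
    "price": [200.0, 80.0, 250.0, 30.0, 70.0, 40.0],
})


def average_price_by(df, column):
    """Hypothetical helper: mean price per category, cheapest first."""
    return df.groupby(column)["price"].mean().sort_values()


avg_by_property = average_price_by(listings, "property_type")
avg_by_room = average_price_by(listings, "room_type")

print(avg_by_property)
print(avg_by_room)
```

Keeping the grouping in one small function means any other categorical column can be checked the same way.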

Question 3: What are the top 5 factors influencing the listing price? To be answered using a Random Forest model.

To answer this question, I trained a machine learning model called a Random Forest Regressor to predict the listing price from features such as the listing's number of bedrooms and number of bathrooms.

One useful output of this model is that it can tell us which features have the strongest influence on the listing price, based on the dataset used to train it.

The chart below shows the importance of each feature in determining the listing price, in descending order. We can see that the top 5 factors influencing the listing price are:
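For readers who want to try this, here is a minimal sketch of training a Random Forest Regressor with scikit-learn. It is not the exact pipeline from my notebook: the data is synthetic and generated on the spot, the feature names are assumptions, and the hyperparameters (`n_estimators=100`, an 80/20 split) are illustrative choices.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the listings data (not real Airbnb values).
rng = np.random.default_rng(0)
n = 300
data = pd.DataFrame({
    "bedrooms": rng.integers(1, 5, n),
    "bathrooms": rng.integers(1, 4, n),
    "accommodates": rng.integers(1, 9, n),
    "room_type": rng.choice(
        ["Entire home/apt", "Private room", "Shared room"], n
    ),
})
# Synthetic price: driven by bedrooms and bathrooms plus random noise.
data["price"] = (
    40 + 35 * data["bedrooms"] + 15 * data["bathrooms"] + rng.normal(0, 10, n)
)

# One-hot encode the categorical column so the model receives numbers only.
X = pd.get_dummies(data.drop(columns="price"), columns=["room_type"], dtype=float)
y = data["price"]

# Hold out 20% of the rows to check the model on unseen listings.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
predictions = model.predict(X_test)

print("R^2 on held-out rows:", model.score(X_test, y_test))
```

One-hot encoding is what produces a feature such as "room type is Private Room", since each category becomes its own 0/1 column.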

  1. Number of bedrooms
  2. Number of bathrooms
  3. Cleaning fee
  4. How many people the place can accommodate
  5. Whether the room type is ‘Private Room’

Importance of Features

I believe the cleaning fee ended up correlating with the listing price because hosts who list an expensive house at a high price are likely to impose a high cleaning fee as well, to discourage travellers from messing up their home.
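A ranking like the one above can be read from a fitted scikit-learn forest through its `feature_importances_` attribute. The sketch below uses synthetic data in which `bedrooms` is deliberately the strongest driver and `noise_feature` is an unrelated column; all names and numbers are assumptions for illustration. Note that these are impurity-based importances, which indicate how much the model relies on a feature, not that the feature causes the price.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Synthetic data (not real listings): price depends mostly on bedrooms.
rng = np.random.default_rng(1)
n = 400
X = pd.DataFrame({
    "bedrooms": rng.integers(1, 5, n).astype(float),
    "bathrooms": rng.integers(1, 4, n).astype(float),
    "noise_feature": rng.normal(size=n),  # unrelated to price by construction
})
y = 50 * X["bedrooms"] + 10 * X["bathrooms"] + rng.normal(0, 5, n)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# Pair each importance with its column name and sort, largest first.
importances = pd.Series(
    model.feature_importances_, index=X.columns
).sort_values(ascending=False)

top_features = importances.head(2)
print(importances)
```

On the real dataset, `importances.head(5)` would give a top 5 list in the same way.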

The table below compares the real listing price with the listing price predicted by the model from the given features, such as the number of bedrooms and bathrooms.

Comparison between the Real Listing Price and Predicted Listing Price

As can be seen from the table above, the predicted values are generally close to the real values. This suggests that the model has managed to capture the important predictors of the listing price.
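Eyeballing a table is a good first check, but a couple of summary numbers make "close" more concrete. The sketch below builds a small comparison table and computes the mean absolute error and R² with scikit-learn. The prices are made up for illustration and are not the values from my table.

```python
import pandas as pd
from sklearn.metrics import mean_absolute_error, r2_score

# Made-up actual and predicted prices (illustrative only).
comparison = pd.DataFrame({
    "actual_price": [100.0, 150.0, 80.0, 200.0],
    "predicted_price": [110.0, 140.0, 85.0, 190.0],
})

# Per-listing error in dollars.
comparison["abs_error"] = (
    comparison["actual_price"] - comparison["predicted_price"]
).abs()

# Average size of the error, in dollars.
mae = mean_absolute_error(
    comparison["actual_price"], comparison["predicted_price"]
)
# Share of the price variance explained by the predictions (1.0 is perfect).
r2 = r2_score(comparison["actual_price"], comparison["predicted_price"])

print(comparison)
print(f"MAE: {mae:.2f}, R^2: {r2:.3f}")
```

These metrics are most meaningful when computed on listings the model did not see during training.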

Conclusion

We can conclude the following:

  1. Location does play a part in influencing the listing price. Neighbourhoods in the northern part of Seattle near the bay, such as Magnolia, Queen Anne and Downtown, are able to command a higher listing price than other locations.
  2. The type of property and room does influence the listing price. Townhouses, lofts and condominiums tend to have a higher listing price. A shared room is generally cheaper than a private room or an entire apartment.
  3. The top 5 factors influencing the listing price are:
  • Number of bedrooms
  • Number of bathrooms
  • Cleaning fee
  • How many people the place can accommodate
  • Whether the room type is ‘Private Room’

To check out more on the technical side of this analysis, see the link to my GitHub available here.

The Seattle Airbnb dataset can be downloaded from Kaggle here.
