Waiting to be sold: IUPUI researchers develop model to predict probability of home sales
March 21, 2017
What is the probability that the house you want to sell -- or buy -- will be sold within a month, two months, three months or more? Computer scientists from Indiana University-Purdue University Indianapolis have developed what they believe to be the first data-based answer to how long it will take for a house to sell.
Their machine-learning solution innovatively draws upon methodology used to predict length of disease survival in patients with life-threatening medical conditions.
"We went to the websites that homebuyers and sellers visit -- Trulia, Zillow and Redfin," said Mohammad Al Hasan, associate professor of computer and information science in the School of Science, who led the house-selling probability study. "There was a lot of information to help in the decision-making process for both buyers and sellers, but what was missing was the answer to 'How long does it take for a house to be sold after it first appears in the listing?'"
In addition to predicting how long a specific house is likely to remain on the market, the algorithms developed and validated by Hasan and Mansurul Bhuiyan, a former IUPUI graduate student now with IBM Research, also account for how changing a feature -- such as lowering the price of the home or adding a bathroom -- influences the length of time the house remains unsold.
Hasan and Bhuiyan "trained" their computer with three months' worth of data on 7,216 houses on the market in five Central Indiana cities and towns: Indianapolis, Carmel, Fishers, Noblesville and Zionsville. In addition to the details typically found in real estate listings, the data included the dates of the initial listing and the sale. This information enabled the computer to study features and patterns, with the goal of predicting how long newly listed homes will remain on the market. The scientists then evaluated and validated the approach.
Drawing upon survival analysis, the methodology used to determine the probability that a patient with a certain disease stage will live for a specific length of time, the researchers designed a machine-learning model. Given a set of features such as price, location, age, size, number of bedrooms, number of bathrooms, school ratings and local crime information, the model determines the probability that a house will sell within a certain time frame.
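For readers curious about what survival analysis looks like in practice, here is a minimal sketch of the classic Kaplan-Meier estimator applied to house listings. It is not the researchers' model, which also uses house features; it only illustrates the basic idea of estimating "probability of selling within N days" from listing and sale dates. The function names and the sample numbers are invented for illustration, and houses still unsold when observation ends are assumed to be treated as "censored," as is standard in survival analysis.

```python
def kaplan_meier(durations, sold):
    """Estimate the chance a house is still unsold after each observed sale day.

    durations: days each house was observed on the market (hypothetical data)
    sold:      True if the house sold on that day, False if it was still
               unsold when observation ended (a censored record)
    Returns a list of (day, probability_still_unsold) pairs.
    """
    # Only days on which at least one sale happened change the estimate.
    sale_days = sorted({d for d, s in zip(durations, sold) if s})
    still_unsold = 1.0
    curve = []
    for day in sale_days:
        # Houses that had not yet sold or dropped out before this day.
        at_risk = sum(1 for d in durations if d >= day)
        # Houses that sold exactly on this day.
        sales = sum(1 for d, s in zip(durations, sold) if s and d == day)
        still_unsold *= 1.0 - sales / at_risk
        curve.append((day, still_unsold))
    return curve


def prob_sold_within(curve, days):
    """Probability that a house sells within the given number of days."""
    still_unsold = 1.0
    for day, value in curve:
        if day > days:
            break
        still_unsold = value
    return 1.0 - still_unsold


# Made-up example: eight listings, three of them unsold when tracking stopped.
durations = [10, 20, 20, 35, 50, 60, 90, 90]
sold = [True, True, True, True, False, True, False, False]

curve = kaplan_meier(durations, sold)
for horizon in (30, 60, 90):
    print(f"Sold within {horizon} days: {prob_sold_within(curve, horizon):.3f}")
```

A feature-aware model like the one described in the study would go further, letting the estimated probabilities shift with price, location and the other listing details.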
Information generated by the house-selling probability model could provide the seller with recommendations on what can be done -- reducing the asking price or remodeling, for example -- to expedite the sale within the time frame in which the family needs to sell the home. A potential buyer could find the same information helpful to inform the timing and amount of a purchase offer.
"As long as there is a steady stream of data so we know how long houses are on the market and the features of those houses, our model can provide valuable information to homebuyers and sellers," Hasan said. "We can expand beyond the three months of our study to account for the seasonality of a real estate market, if it exists. We can use the methodology to look at other geographic areas with different real estate dynamics and predict the probability that a home will sell in a specific time period, adjusting that probability when changes in a feature, like a drop in price, occur."
"Waiting to be Sold: Prediction of Time-Dependent House Selling Probability" is published online ahead of print in 2016 IEEE International Conference on Data Science and Advanced Analytics. The study was supported by the National Science Foundation through a CAREER award to Hasan.