This repository contains a set of Python scripts that scrape a real estate webpage, clean and analyze the data, plot visualizations, and perform a multiple linear regression fit.
See also the associated report featured on Medium - Towards Data Science
- `scrapeweb.py`: uses Requests to connect to mlslistings, BeautifulSoup to pull the verification token, html to get web content, Re to clean the results, and Pandas to store the scraped content as a dataframe (a combined scraping sketch appears after this list)
- `getdata.py`: pulls zip codes from a .csv file, uses the `webscrape` function defined in `scrapeweb.py` to scrape content from the webpage and store it in a Pandas dataframe, and writes a .csv file with the scraped content
- `plotmaps.py`: pulls the .csv file with listing information, uses the `price_quintiles` function in `calculatequintiles.py` to place listings into five bins by price, and uses the `cartoplot_x_price` (x = bay, sf, eastbay, peninsula, southbay) functions defined in `cartoplotfunctions.py` to plot data points on a map of the respective region (a quintile-binning sketch appears after this list). Also contains scripts to plot commute and school quality data using zip code shapefiles
- `cartoplotfunctions.py`: pulls data from the .csv file and city or zip code borders from a shapefile, uses Matplotlib.pyplot and Cartopy to plot maps with a terrain background, bounded by a given set of latitude and longitude coordinates for the full Bay Area as well as its sub-regions (a Cartopy sketch appears after this list)
- `plotboxplots.py`: pulls data from the .csv file and selects cities of interest to plot their price information using Seaborn box + strip plots (a box plot sketch appears after this list)
- `fitdata.py`: pulls data from the .csv file, filters outliers, uses Statsmodels.formula.api to perform an ordinary least squares fit and summarize the result, uses Sklearn.linear_model to create price predictions from the fitted coefficients, and uses functions defined in `plotfunctions.py` to plot a histogram of the residuals (a fitting sketch appears after this list)
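
The scraping step (`scrapeweb.py` / `getdata.py`) follows a common two-request pattern: load the search page, read the hidden verification token, post the search form, and parse the listing cards into a dataframe. The sketch below illustrates that pattern only; the URL, token field name, CSS selectors, and column names are assumptions and do not reproduce the repo's actual `webscrape` function.

```python
import re
import requests
import pandas as pd
from bs4 import BeautifulSoup

def scrape_zipcode(zipcode):
    """Scrape listings for one zip code and return them as a DataFrame."""
    session = requests.Session()

    # First request: load the search page and pull the hidden verification token
    landing = session.get("https://www.mlslistings.com/Search/Result")        # assumed URL
    soup = BeautifulSoup(landing.text, "html.parser")
    token_tag = soup.find("input", {"name": "__RequestVerificationToken"})    # assumed field name
    token = token_tag["value"] if token_tag else ""

    # Second request: post the search form with the token and the target zip code
    response = session.post(
        "https://www.mlslistings.com/Search/Result",                          # assumed URL
        data={"__RequestVerificationToken": token, "SearchText": zipcode},    # assumed form fields
    )
    soup = BeautifulSoup(response.text, "html.parser")

    # Parse each listing card into a row; the selectors are placeholders
    rows = []
    for card in soup.select("div.listing-card"):                              # assumed selector
        price_text = card.select_one(".listing-price").get_text(strip=True)   # assumed selector
        address = card.select_one(".listing-address").get_text(strip=True)    # assumed selector
        price = int(re.sub(r"[^\d]", "", price_text))                         # strip "$" and ","
        rows.append({"zipcode": zipcode, "address": address, "price": price})

    return pd.DataFrame(rows)

# Driver loop in the style of getdata.py: read zip codes, scrape each, write one .csv
zipcodes = pd.read_csv("zipcodes.csv")["zipcode"]                             # assumed file/column names
listings = pd.concat([scrape_zipcode(z) for z in zipcodes], ignore_index=True)
listings.to_csv("listings.csv", index=False)
```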
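
`plotmaps.py` relies on `price_quintiles` to bin listings by price. One minimal way to get five equal-count price bins is `pandas.qcut`; the sketch below assumes a `price` column in `listings.csv` and may not match the repo's own binning logic.

```python
import pandas as pd

listings = pd.read_csv("listings.csv")                    # assumed file name

# qcut splits prices at the 20th/40th/60th/80th percentiles,
# so each bin holds roughly the same number of listings
listings["price_quintile"] = pd.qcut(listings["price"], q=5, labels=[1, 2, 3, 4, 5])

# Quick check of the bin edges and counts
print(listings.groupby("price_quintile")["price"].agg(["min", "max", "count"]))
```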
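
The map plotting in `cartoplotfunctions.py` combines a tiled terrain background with listing coordinates bounded by a latitude/longitude box. The sketch below shows that general Cartopy pattern; the tile source, extent values, and column names (`longitude`, `latitude`, `price`) are assumptions, not the repo's `cartoplot_*` implementations.

```python
import cartopy.crs as ccrs
import cartopy.io.img_tiles as cimgt
import matplotlib.pyplot as plt
import pandas as pd

listings = pd.read_csv("listings.csv")                    # assumed lon/lat/price columns

tiler = cimgt.GoogleTiles(style="terrain")                # assumed terrain-style tile source
fig = plt.figure(figsize=(10, 10))
ax = fig.add_subplot(1, 1, 1, projection=tiler.crs)

# Bound the map by (lon_min, lon_max, lat_min, lat_max) for the Bay Area (assumed extent)
ax.set_extent([-122.6, -121.7, 37.2, 38.0], crs=ccrs.PlateCarree())
ax.add_image(tiler, 10)                                   # zoom level 10

# One point per listing, colored by price
ax.scatter(listings["longitude"], listings["latitude"],
           c=listings["price"], cmap="viridis", s=10,
           transform=ccrs.PlateCarree())
fig.savefig("bay_area_prices.png", dpi=300)
```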
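
For the box + strip plots in `plotboxplots.py`, Seaborn lets the two plot types share one axis so individual listings overlay the per-city distributions. The sketch below assumes `city` and `price` columns and an arbitrary set of cities of interest.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

listings = pd.read_csv("listings.csv")                    # assumed file name
cities = ["Palo Alto", "San Mateo", "Oakland"]            # assumed cities of interest
subset = listings[listings["city"].isin(cities)]          # assumed "city" column

fig, ax = plt.subplots(figsize=(8, 6))
# Box plot shows the price distribution per city; strip plot overlays individual listings
sns.boxplot(data=subset, x="city", y="price", showfliers=False, ax=ax)
sns.stripplot(data=subset, x="city", y="price", color="black", size=3, alpha=0.5, ax=ax)
ax.set_ylabel("Listing price ($)")
fig.savefig("city_boxplots.png", dpi=300)
```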
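
`fitdata.py` pairs a Statsmodels OLS summary with Scikit-learn predictions and a residual histogram. The sketch below shows that workflow with an assumed formula (`price ~ sqft + beds + baths`), an assumed outlier cutoff, and assumed column names; the repo's actual predictors and filters may differ.

```python
import matplotlib.pyplot as plt
import pandas as pd
import statsmodels.formula.api as smf
from sklearn.linear_model import LinearRegression

listings = pd.read_csv("listings.csv").dropna(subset=["price", "sqft", "beds", "baths"])
listings = listings[listings["price"] < 5_000_000]        # crude outlier filter (assumed cutoff)

# Ordinary least squares via the formula interface; summary() reports coefficients, p-values, R^2
model = smf.ols("price ~ sqft + beds + baths", data=listings).fit()
print(model.summary())

# Re-fit the same predictors with scikit-learn and generate predicted prices
X = listings[["sqft", "beds", "baths"]]
y = listings["price"]
predicted = LinearRegression().fit(X, y).predict(X)

# Histogram of residuals (actual minus predicted) to check the error distribution
residuals = y - predicted
plt.hist(residuals, bins=50)
plt.xlabel("Residual ($)")
plt.ylabel("Count")
plt.savefig("residual_histogram.png", dpi=300)
```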
Written by Michael Boles in the summer of 2019 with help from the Stack Overflow community.