Loading Zillow Housing Data in SAS

Zillow is a well-known website widely used by people searching for a home or curious about the value of their current home. What you may not know is that Zillow has a dedicated research page. To make their website work optimally, they churn through tons of data on the American housing market, and they share the insights they glean at zillow.com/research. If you visit their research website, you’ll notice they have a data page where you can download some really cool data sets for your own research. They even have an API with which you can load data directly, but you’ll have to register for access. In this post, we’ll look at how to load the CSV files that are available for direct download into SAS for analysis. ...

Aug 1, 2022 · 5 min · 892 words · D. Michael Senter

North Carolina Housing Data

A popular beginner's machine learning problem is the prediction of housing prices. A frequently used data set for this purpose combines housing prices in California with some additional information gathered through the 1990 Census. One such data set is available here at Kaggle. Unfortunately, that data set is rather old. And I live in North Carolina, not California! So I figured I might as well create a new housing data set, but this time with more up-to-date information and using North Carolina as the state to be analyzed. One thing that may be interesting about North Carolina as compared to California is the position of major population centers. In California, major population centers are near the beach, while major population centers in North Carolina are in the interior of the state. Both large cities and proximity to the beach tend to correlate with higher housing prices. In California, unlike in North Carolina, both of these go together. ...

Nov 6, 2020 · 8 min · 1644 words · D. Michael Senter

Teacher Salaries

What do you do when your data table is in PDF format? Let’s use tabula-py to extract teacher salary information from PDFs directly into Pandas dataframes. We’ll also use some regex to clean up the results.

Oct 29, 2020 · 9 min · 1788 words · D. Michael Senter

Accessing Census Data via API

The Census Bureau makes an incredible amount of data available online. In this post, I will summarize how to get access to this data via Python by using the Census Bureau’s API. The Census Bureau makes a pretty useful guide available here - I recommend checking it out. ...

Aug 22, 2020 · 5 min · 999 words · D. Michael Senter