Visualizing the housing bubble and crash

Content Warning: d3, data science, data visualization, housing bubble, great recession

If you know me, you'll know that my wife and I are relocating to the Boston area for work. As you might imagine, the housing market in the Boston metro area is a tad bit different from Albuquerque's -- Boston's the 3rd most expensive metro area, behind San Francisco and New York City. My wife and I got to chatting about housing and rentals in general, and noticed via Zillow and Trulia's trend charts that many neighborhoods out here didn't seem to be all that affected by the 2007--2012 crash -- the houses didn't lose much value and just seemed to stop gaining value during those years. Being the nerd I am, I wanted to get some raw, fine-grained data to see for myself.

Turns out, Zillow provides monthly housing valuations down to the zip-code level back to 1996, though the data is far from complete. After a bit of exploration, my wife suggested building a map that could also display charts when you looked at a specific place, and voila, I had something to do for the next few days, or so I thought.

I explored several options for making interactive maps in the Python ecosystem, including Bokeh, Vincent, and several other D3.js wrappers. None did exactly what I needed, which is often the case. I also explored the nasty landscape of just building maps in the first place, which was a lengthy, interesting, and somewhat frustrating endeavor that I would detail if I didn't think it would bore you. I might write a basic tutorial on that process some other time. Have you ever wondered why the hell so many US maps are shown down to the county level? Turns out, it's far easier to work with counties than with smaller areas, and laziness likely plays a big factor here.

I spent long enough toying with the zip-level data to realize that the data cleaning and prep involved was turning what was supposed to be a fun sabbatical project into something more laborious than it was worth. County-level data it is.

Anyway, I settled on using pure D3.js to make the visualization and doing the minor data prep in Python. I had to learn D3.js, a good deal of JavaScript, the basics of front-end development, how the hell SVGs work, etc. to do this, so the code is more than a bit messy. I would refactor it if I were going to expand on it -- which I might do someday. I'd love to build this same map at the ZIP or census block level and show a line chart instead of a table on hover. While I already had high respect for the folks who make quality data visualizations, I have a good deal more after this mini-project.

Anyway, use the slider to change the yearly comparison value, hover over a county to display a table of information about that county's average home value, and click or scroll to zoom on the map. Note that the map shows the percentage change in each county's average home value from one year to the next, and that the 2016 data is current through May.
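For the curious, here is a rough Python sketch of how a year-over-year percentage like that could be computed from monthly valuations. This is not the actual prep code behind the map: the function names, the 'YYYY-MM' keys, the choice to average the months within each year, and the dollar figures are all assumptions made up for illustration.

```python
from collections import defaultdict


def yearly_averages(monthly_values):
    """Average monthly valuations into one value per year.

    monthly_values maps 'YYYY-MM' strings to a home value (or None when
    the month is missing). The key format is an assumption, not Zillow's.
    """
    by_year = defaultdict(list)
    for month, value in monthly_values.items():
        if value is None:  # skip gaps in the data
            continue
        by_year[int(month[:4])].append(value)
    return {year: sum(vals) / len(vals) for year, vals in sorted(by_year.items())}


def year_over_year_pct_change(yearly):
    """Percentage change of each year's average relative to the prior year."""
    changes = {}
    for year, value in yearly.items():
        previous = yearly.get(year - 1)
        if previous:  # no prior year (or a zero value) means no comparison
            changes[year] = (value - previous) * 100 / previous
    return changes


# Made-up numbers for a single hypothetical county.
county = {
    "2006-01": 200000, "2006-07": 220000,
    "2007-01": 190000, "2007-07": 188000,
    "2008-01": None,  # a missing month is ignored
}
averages = yearly_averages(county)
changes = year_over_year_pct_change(averages)
print(averages)
print(changes)
```

A partial year, like the 2016 data that stops in May, would simply be averaged over the months that exist, which is one reason to read the most recent year's number with some caution.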

It's pretty wild how quickly some areas gained value -- big swaths of Florida, Arizona, and California come to mind -- and how quickly they fell right back down. Also, D3 is really, really awesome and Mike Bostock is a brilliant and kind person for having initially developed D3 and for providing so many tutorials for it. Those of you familiar with learning new software development skills will know that this visualization is the hacked-together-and-tinkered-with child of about a dozen examples with a healthy dose of Stack Overflow thrown in for good measure. County-level data isn't exactly what I set out to map when I started, but it should be straightforward to extend this to zip-level or block-level from here -- if I ever come back to it.