Project Summary & Skills Used
Team Members
Project Title
Urban Shift: Visualizing U.S. Housing Market Behavior Through Time-Series Analytics & Geographic Clustering
Description
Urban Shift is an interactive housing analytics platform built in Java and Vaadin. Users can explore long-term home and rental price trends for any U.S. county and view nationwide growth clusters on a fully interactive map. The system processes Zillow’s ZHVI and ZORI datasets, computes percent-change growth for selected time windows, and uses K-Means clustering to reveal spatial housing trends across the United States.
Industrial Engineering Context
The project aligns with IE concepts related to data-driven decision support, time-series analysis, geospatial visualization, and classification modeling. It demonstrates how real estate market data can be transformed into actionable visual insights for planning, forecasting, and resource allocation.
Skills Practiced
- Custom Java class design, modular architecture, and debugging
- Working with large datasets using Tablesaw
- Implementing machine learning through SMILE K-Means
- JSON handling with Jackson
- Building UI components and client–server interactions in Vaadin 24
- Integrating external JavaScript (Leaflet) due to Vaadin’s missing map support
- Resolving real-world issues such as inconsistent date formats and FIPS code mismatches
- GitHub-based collaboration, versioning, and merging team contributions
Project Development Process
Urban Shift began as a simple trend viewer for Zillow housing data. We worked in two Git projects, TFP1 and TFP2. TFP1 served as our experimentation and learning phase: we tested date melting, growth metrics, and clustering approaches, and we practiced parsing messy real-world datasets. Many of these early files were not used directly, but they helped us understand how Zillow formats its time-series data and how to reliably clean and transform it.
During TFP2, the project evolved into a full Vaadin web application. Several key design pivots shaped the final system:
Switch from Point-Based Maps to Polygon Shading
We originally planned to compute county centroids from uscities.csv and display circular markers whose size reflected growth. This approach failed because:
- centroids were often inaccurate,
- zooming caused thousands of overlapping markers, and
- the visualization was unreadable at the national scale.
This led us to adopt full county polygons using a GeoJSON dataset. The result was much cleaner and more informative.
Fixing the FIPS Mismatch Problem
Zillow’s datasets store FIPS codes inconsistently: sometimes as integers (which drops leading zeros), sometimes split into separate state and county parts, and sometimes as strings. The GeoJSON file uses strict five-digit GEOID strings. Because the two keys did not match, our map initially colored the wrong counties or none at all.
We solved this by implementing a standardized FIPS builder that converts every county identifier into a normalized 5-digit code (e.g., “05031”). With the data and the GeoJSON keyed on the same string format, each data row joins to the correct map shape.
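The normalization described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual builder: the class name FipsNormalizer and both method names are hypothetical, and it assumes a county FIPS code is a 2-digit state part followed by a 3-digit county part.

```java
import java.util.Optional;

/** Hypothetical helper that builds 5-digit county FIPS strings from loose inputs. */
final class FipsNormalizer {
    private FipsNormalizer() {}

    /** Normalizes a combined code such as "5031", " 05031 ", or "5031.0". */
    static Optional<String> fromCombined(String raw) {
        if (raw == null) {
            return Optional.empty();
        }
        String s = raw.trim();
        // Assumption: numeric columns may be read back as doubles, e.g. "5031.0".
        if (s.endsWith(".0")) {
            s = s.substring(0, s.length() - 2);
        }
        // Accept 1 to 5 ASCII digits only; anything else is treated as unusable.
        if (!s.matches("\\d{1,5}")) {
            return Optional.empty();
        }
        // Left-pad with zeros to restore any dropped leading digits.
        return Optional.of("0".repeat(5 - s.length()) + s);
    }

    /** Builds a code from separate state (2-digit) and county (3-digit) parts. */
    static String fromParts(int stateCode, int countyCode) {
        if (stateCode < 0 || stateCode > 99 || countyCode < 0 || countyCode > 999) {
            throw new IllegalArgumentException("FIPS part out of range");
        }
        return String.format("%02d%03d", stateCode, countyCode);
    }
}
```

For example, FipsNormalizer.fromCombined("5031") and FipsNormalizer.fromParts(5, 31) both produce "05031", which can then be compared directly with the GeoJSON GEOID string.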
Leaflet Integration
Vaadin 24 no longer includes Vaadin Maps, so we integrated Leaflet.js manually through a custom JavaScript module. This allowed us to:
- render county polygons,
- apply cluster colors,
- generate legends, and
- dynamically update the map based on UI inputs.
Flexible Date Parsing
Zillow data includes multiple date formats (M/D/YY, YYYY-MM, MM/YYYY). To avoid hard-coding formats or assumptions, we built a flexible parser using regex detection and LocalDate conversion. This allowed the user to choose any time window and ensured clustering remained stable.
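A parser along these lines might look like the sketch below. It is only an illustration of regex detection followed by LocalDate conversion. The class name FlexibleDateParser is hypothetical, and two choices are assumptions rather than facts from the project: two-digit years are read as 20xx, and month-only formats are mapped to the last day of that month.

```java
import java.time.DateTimeException;
import java.time.LocalDate;
import java.time.YearMonth;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parser for column headers in M/D/YY, YYYY-MM, or MM/YYYY form. */
final class FlexibleDateParser {
    private static final Pattern MDY = Pattern.compile("(\\d{1,2})/(\\d{1,2})/(\\d{2}|\\d{4})");
    private static final Pattern YEAR_MONTH = Pattern.compile("(\\d{4})-(\\d{1,2})");
    private static final Pattern MONTH_YEAR = Pattern.compile("(\\d{1,2})/(\\d{4})");

    private FlexibleDateParser() {}

    /** Returns the parsed date, or empty if the header is not a recognized date. */
    static Optional<LocalDate> parse(String header) {
        if (header == null) {
            return Optional.empty();
        }
        String s = header.trim();
        try {
            Matcher m = MDY.matcher(s);
            if (m.matches()) {
                int year = Integer.parseInt(m.group(3));
                if (year < 100) {
                    year += 2000; // assumption: two-digit years mean 20xx
                }
                return Optional.of(LocalDate.of(
                        year, Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2))));
            }
            m = YEAR_MONTH.matcher(s);
            if (m.matches()) {
                // assumption: a month-only value stands for the month's last day
                return Optional.of(YearMonth.of(
                        Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2))).atEndOfMonth());
            }
            m = MONTH_YEAR.matcher(s);
            if (m.matches()) {
                return Optional.of(YearMonth.of(
                        Integer.parseInt(m.group(2)), Integer.parseInt(m.group(1))).atEndOfMonth());
            }
        } catch (DateTimeException e) {
            // Matched a pattern but the values are invalid, e.g. month 13.
            return Optional.empty();
        }
        return Optional.empty();
    }
}
```

Returning an empty Optional for non-date headers (such as a region-name column) means the same method can double as a date-column detector: any header that parses is treated as a time-series column.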
By the end of the process, TFP1 gave us the analytical foundation, while TFP2 delivered the complete interactive system. The final project exceeded our expectations in clarity, performance, and user experience.
Key Features or Highlights
Housing & Rental Trend Viewer
Users select a county and generate two aligned time-series charts with dual axes for housing and rental prices.
Nationwide Growth Clustering
Using SMILE’s K-Means, the system groups counties based on percent change over a chosen date range.
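To make the two steps concrete, the sketch below computes percent change as (end - start) / start * 100 and then groups the resulting values with a small one-dimensional Lloyd's-style k-means. The project itself used SMILE's K-Means; this dependency-free version is a stand-in for illustration only and does not reproduce SMILE's API. The class and method names are hypothetical, and the deterministic seeding (evenly spaced order statistics) is an assumption chosen to keep the example reproducible.

```java
import java.util.Arrays;

/** Hypothetical stand-in showing percent-change growth and 1-D k-means grouping. */
final class GrowthClustering {
    private GrowthClustering() {}

    /** Percent change from start to end; NaN when it cannot be computed. */
    static double percentChange(double start, double end) {
        if (Double.isNaN(start) || Double.isNaN(end) || start == 0.0) {
            return Double.NaN;
        }
        return (end - start) / start * 100.0;
    }

    /**
     * Assigns each value to one of k clusters. The caller is expected to
     * remove NaN (missing) values first. Cluster 0 starts from the lowest seed.
     */
    static int[] kMeans1D(double[] values, int k, int maxIter) {
        if (k < 1 || values.length < k) {
            throw new IllegalArgumentException("need at least k values");
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        double[] centroids = new double[k];
        for (int c = 0; c < k; c++) {
            // Seed each centroid at the middle of one of k equal slices of the sorted data.
            centroids[c] = sorted[(int) ((c + 0.5) * sorted.length / k)];
        }
        int[] labels = new int[values.length];
        Arrays.fill(labels, -1); // forces the first pass to count as a change
        for (int iter = 0; iter < maxIter; iter++) {
            boolean changed = false;
            // Assignment step: nearest centroid by absolute distance.
            for (int i = 0; i < values.length; i++) {
                int best = 0;
                for (int c = 1; c < k; c++) {
                    if (Math.abs(values[i] - centroids[c]) < Math.abs(values[i] - centroids[best])) {
                        best = c;
                    }
                }
                if (labels[i] != best) {
                    labels[i] = best;
                    changed = true;
                }
            }
            if (!changed) {
                break; // assignments are stable, so the centroids are too
            }
            // Update step: move each non-empty centroid to the mean of its members.
            double[] sum = new double[k];
            int[] count = new int[k];
            for (int i = 0; i < values.length; i++) {
                sum[labels[i]] += values[i];
                count[labels[i]]++;
            }
            for (int c = 0; c < k; c++) {
                if (count[c] > 0) {
                    centroids[c] = sum[c] / count[c];
                }
            }
        }
        return labels;
    }
}
```

Because percentChange returns NaN for missing or zero starting prices, those counties can be filtered out before clustering and handled separately on the map.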
Interactive Leaflet Map
Full county polygons are shaded according to cluster membership. A dynamic legend explains the growth ranges, and counties with missing data appear in white.
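One possible way to derive the shading and legend text on the server side is sketched below. Everything here is illustrative: the class name ClusterLegend, the four-color palette, and the legend line format are assumptions, not the project's actual values. The only behavior taken from the description above is that counties without data fall back to white.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper mapping clusters to fill colors and legend ranges. */
final class ClusterLegend {
    static final String MISSING_COLOR = "#ffffff";
    // Hypothetical palette; supports up to four clusters in this sketch.
    private static final String[] PALETTE = {"#2c7bb6", "#abd9e9", "#fdae61", "#d7191c"};

    private ClusterLegend() {}

    /** Fill color for a cluster index; null or out-of-range indexes map to white. */
    static String colorFor(Integer cluster) {
        if (cluster == null || cluster < 0 || cluster >= PALETTE.length) {
            return MISSING_COLOR;
        }
        return PALETTE[cluster];
    }

    /** One legend line per non-empty cluster, giving its min and max growth. */
    static List<String> legendLines(double[] growth, int[] labels, int k) {
        double[] min = new double[k];
        double[] max = new double[k];
        Arrays.fill(min, Double.POSITIVE_INFINITY);
        Arrays.fill(max, Double.NEGATIVE_INFINITY);
        for (int i = 0; i < growth.length; i++) {
            int c = labels[i];
            min[c] = Math.min(min[c], growth[i]);
            max[c] = Math.max(max[c], growth[i]);
        }
        List<String> lines = new ArrayList<>();
        for (int c = 0; c < k; c++) {
            if (min[c] <= max[c]) { // skip clusters that received no counties
                lines.add(String.format(Locale.ROOT, "%s: %.1f%% to %.1f%%",
                        colorFor(c), min[c], max[c]));
            }
        }
        return lines;
    }
}
```

The resulting color strings and legend lines could then be serialized to JSON and passed to the client-side map code for rendering.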
Robust Data Processing Pipeline
- Automatic date-column detection
- Flexible parsing of mixed date formats
- Consistent FIPS normalization
- Filtering by user-defined time windows
- Clean handling of missing values
User-Friendly Vaadin UI
Clean interface, dynamic dropdowns, responsive map, and a clear division between trend viewing and clustering modes.
Each of these features contributes to a system that is both technically rigorous and easy for users to understand.
Reflection
This project significantly improved our understanding of Java-based data pipelines, object-oriented design, and collaborative development. We learned how to troubleshoot issues that arise only when working with messy real-world datasets—like inconsistent formatting, missing values, and mismatched identifiers.
From a coding perspective, we grew more confident in debugging, modular design, and integrating external libraries. From a teamwork perspective, we coordinated through GitHub, combined features from different contributors, and adapted as our design evolved.
In particular, we are proud of the clustering logic, the FIPS-matching solution, and the Leaflet integration, none of which we knew how to do at the start. Overall, the project strengthened our technical skills and prepared us to solve similar data-driven problems in more advanced Industrial Engineering courses.