Project Summary & Skills Used

Team Members

Project Title

Urban Shift: Visualizing U.S. Housing Market Behavior Through Time-Series Analytics & Geographic Clustering

Description

Urban Shift is an interactive housing analytics platform built in Java and Vaadin. Users can explore long-term home and rental price trends for any U.S. county and view nationwide growth clusters on a fully interactive map. The system processes Zillow’s ZHVI and ZORI datasets, computes percent-change growth for selected time windows, and uses K-Means clustering to reveal spatial housing trends across the United States.

Industrial Engineering Context

The project aligns with IE concepts related to data-driven decision support, time-series analysis, geospatial visualization, and classification modeling. It demonstrates how real estate market data can be transformed into actionable visual insights for planning, forecasting, and resource allocation.

Skills Practiced

- Custom Java class design, modular architecture, and debugging
- Working with large datasets using Tablesaw
- Implementing machine learning through SMILE K-Means
- JSON handling with Jackson
- Building UI components and client–server interactions in Vaadin 24
- Integrating external JavaScript (Leaflet) due to Vaadin’s missing map support
- Resolving real-world issues such as inconsistent date formats and FIPS code mismatches
- GitHub-based collaboration, versioning, and merging team contributions

Project Development Process

Urban Shift began as a simple trend viewer for Zillow housing data. We worked in two Git projects, TFP1 and TFP2. TFP1 served as our experimentation and learning phase: we tested date melting, growth metrics, and clustering approaches, and practiced parsing messy real-world datasets. Many of these early files were not used directly, but they helped us understand how Zillow formats its time-series data and how to reliably clean and transform it.

During TFP2, the project evolved into a full Vaadin web application. Several key design pivots shaped the final system:

Switch from Point-Based Maps to Polygon Shading

We originally planned to compute county centroids from uscities.csv and display circular markers whose size reflected growth. This approach failed because:
- centroids were often inaccurate,
- zooming caused thousands of overlapping markers, and
- the visualization was unreadable at the national scale.

This led us to adopt full county polygons using a GeoJSON dataset. The result was much cleaner and more informative.

Fixing the FIPS Mismatch Problem

Zillow’s datasets store FIPS codes inconsistently: sometimes as integers (which drops leading zeros), sometimes split into separate state and county parts, and sometimes as strings. The GeoJSON file uses strict five-digit GEOID strings. Because the two keys did not match, our map initially colored the wrong counties or none at all.

We solved this by implementing a standardized FIPS builder that converts every county identifier into a normalized five-digit code (e.g., “05031”), zero-padded on the left. With both sides using the same key format, data rows and map shapes line up.
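The normalization described above can be sketched roughly as follows. This is a minimal illustration, not our exact class: the name FipsNormalizer and both method names are hypothetical, and it assumes the raw value arrives either as one combined code (possibly with a trailing ".0" from numeric parsing) or as separate state and county integers.

```java
import java.util.Locale;

// Hypothetical helper; names are illustrative, not the project's actual API.
final class FipsNormalizer {
    private FipsNormalizer() {}

    /** Normalizes a combined code such as "5031", "5031.0" or "05031" to five digits. */
    static String normalize(String raw) {
        if (raw == null) {
            throw new IllegalArgumentException("FIPS value is null");
        }
        String s = raw.trim();
        // Assumption: numeric columns may be rendered as "5031.0"; drop that suffix.
        if (s.endsWith(".0")) {
            s = s.substring(0, s.length() - 2);
        }
        boolean allDigits = !s.isEmpty() && s.chars().allMatch(c -> c >= '0' && c <= '9');
        if (!allDigits || s.length() > 5) {
            throw new IllegalArgumentException("Not a county FIPS code: " + raw);
        }
        // Left-pad with zeros so that 5031 becomes 05031.
        return "0".repeat(5 - s.length()) + s;
    }

    /** Builds the code from a state part (up to 2 digits) and a county part (up to 3 digits). */
    static String fromParts(int stateCode, int countyCode) {
        if (stateCode < 0 || stateCode > 99 || countyCode < 0 || countyCode > 999) {
            throw new IllegalArgumentException("FIPS parts out of range");
        }
        return String.format(Locale.ROOT, "%02d%03d", stateCode, countyCode);
    }
}
```

For example, FipsNormalizer.normalize("5031") and FipsNormalizer.fromParts(5, 31) both yield "05031", which can then be compared directly against the GEOID strings in the GeoJSON file.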

Leaflet Integration

Vaadin 24’s free component set does not include a map component (Vaadin’s Map component is a commercial add-on), so we integrated Leaflet.js manually through a custom JavaScript module. This allowed us to:
- render county polygons,
- apply cluster colors,
- generate legends, and
- dynamically update the map based on UI inputs.

Flexible Date Parsing

Zillow data includes multiple date formats (M/D/YY, YYYY-MM, MM/YYYY). Rather than assuming a single format, we built a flexible parser that detects the format with regular expressions and converts each value to a LocalDate. This lets the user choose any time window and keeps the clustering input consistent across datasets.
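A parser along these lines might look like the sketch below. The class name FlexibleDateParser is hypothetical, and two choices are assumptions made for illustration: two-digit years are read as 20YY, and formats without a day are mapped to the first of the month. Returning an empty Optional for anything unrecognized also gives a simple way to tell date columns apart from columns such as region names.

```java
import java.time.DateTimeException;
import java.time.LocalDate;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; covers only the three formats named in the text.
final class FlexibleDateParser {
    private static final Pattern MDY = Pattern.compile("(\\d{1,2})/(\\d{1,2})/(\\d{2}|\\d{4})");
    private static final Pattern YEAR_MONTH = Pattern.compile("(\\d{4})-(\\d{1,2})");
    private static final Pattern MONTH_YEAR = Pattern.compile("(\\d{1,2})/(\\d{4})");

    private FlexibleDateParser() {}

    /** Returns the parsed date, or empty if the text is not in a recognized format. */
    static Optional<LocalDate> parse(String text) {
        if (text == null) {
            return Optional.empty();
        }
        String s = text.trim();
        try {
            Matcher m = MDY.matcher(s);
            if (m.matches()) {
                int year = Integer.parseInt(m.group(3));
                if (m.group(3).length() == 2) {
                    year += 2000; // Assumption: two-digit years mean 20YY.
                }
                return Optional.of(LocalDate.of(
                        year, Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2))));
            }
            m = YEAR_MONTH.matcher(s);
            if (m.matches()) {
                // Assumption: month-only values map to the first day of the month.
                return Optional.of(LocalDate.of(
                        Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)), 1));
            }
            m = MONTH_YEAR.matcher(s);
            if (m.matches()) {
                return Optional.of(LocalDate.of(
                        Integer.parseInt(m.group(2)), Integer.parseInt(m.group(1)), 1));
            }
        } catch (DateTimeException e) {
            // Digits matched a pattern but do not form a real date (e.g. month 13).
            return Optional.empty();
        }
        return Optional.empty();
    }
}
```

Because every format ends up as a LocalDate, filtering to a user-selected window reduces to ordinary isBefore and isAfter comparisons.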

By the end of the process, TFP1 gave us the analytical foundation, while TFP2 delivered the complete interactive system. The final project exceeded our expectations in clarity, performance, and user experience.

Key Features or Highlights

Housing & Rental Trend Viewer

Users select a county and generate two aligned time-series charts with dual axes for housing and rental prices.

Nationwide Growth Clustering

Using SMILE’s K-Means, the system groups counties based on percent change over a chosen date range.
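To make the growth metric and the grouping step concrete, here is a small self-contained sketch. It is not the SMILE call used in the application: it swaps in a plain-Java, one-dimensional version of Lloyd's k-means algorithm with a deterministic starting point so the example runs without dependencies. The class name GrowthClustering and its methods are hypothetical.

```java
import java.util.Arrays;

// Illustrative stand-in for SMILE's K-Means; names are hypothetical.
final class GrowthClustering {
    private GrowthClustering() {}

    /** Percent change from start to end; NaN when it cannot be computed. */
    static double percentChange(double start, double end) {
        if (Double.isNaN(start) || Double.isNaN(end) || start == 0.0) {
            return Double.NaN;
        }
        return (end - start) / start * 100.0;
    }

    /**
     * Clusters 1-D values with Lloyd's algorithm and returns one label per value.
     * Callers are expected to remove NaN values (counties with missing data) first.
     */
    static int[] cluster(double[] values, int k, int maxIterations) {
        if (k < 1 || values.length < k) {
            throw new IllegalArgumentException("Need at least k values");
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        // Deterministic start: pick k centroids spread evenly over the sorted values.
        double[] centroids = new double[k];
        for (int c = 0; c < k; c++) {
            centroids[c] = sorted[(int) ((c + 0.5) * sorted.length / k)];
        }
        int[] labels = new int[values.length];
        for (int iter = 0; iter < maxIterations; iter++) {
            boolean changed = false;
            // Assignment step: each value joins its nearest centroid.
            for (int i = 0; i < values.length; i++) {
                int best = 0;
                for (int c = 1; c < k; c++) {
                    if (Math.abs(values[i] - centroids[c]) < Math.abs(values[i] - centroids[best])) {
                        best = c;
                    }
                }
                if (labels[i] != best) {
                    labels[i] = best;
                    changed = true;
                }
            }
            // Update step: move each non-empty centroid to the mean of its members.
            double[] sum = new double[k];
            int[] count = new int[k];
            for (int i = 0; i < values.length; i++) {
                sum[labels[i]] += values[i];
                count[labels[i]]++;
            }
            for (int c = 0; c < k; c++) {
                if (count[c] > 0) {
                    centroids[c] = sum[c] / count[c];
                }
            }
            // After the first pass, unchanged labels mean the centroids are stable.
            if (!changed && iter > 0) {
                break;
            }
        }
        return labels;
    }
}
```

With growth values {-5, -4, -6, 10, 11, 9, 30, 31, 29} and k = 3, the sketch separates the declining, moderate, and fast-growing groups. A library implementation such as SMILE's typically uses randomized initialization, so its cluster numbering may differ between runs.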

Interactive Leaflet Map

Full county polygons are shaded according to cluster membership. A dynamic legend explains the growth ranges, and counties with missing data appear in white.
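One way the legend's growth ranges could be derived is to take the minimum and maximum growth value within each cluster. The sketch below shows that idea only. The class name LegendRanges, the label format, and the "no counties" text are assumptions for illustration, and counties with missing data are assumed to have been excluded before this step.

```java
import java.util.Arrays;
import java.util.Locale;

// Hypothetical helper for building legend text from cluster assignments.
final class LegendRanges {
    private LegendRanges() {}

    /** Returns one label per cluster, e.g. "9.0% to 11.0%". */
    static String[] rangeLabels(double[] growth, int[] labels, int k) {
        if (growth.length != labels.length) {
            throw new IllegalArgumentException("growth and labels must have the same length");
        }
        double[] min = new double[k];
        double[] max = new double[k];
        Arrays.fill(min, Double.POSITIVE_INFINITY);
        Arrays.fill(max, Double.NEGATIVE_INFINITY);
        // Track the smallest and largest growth value seen in each cluster.
        for (int i = 0; i < growth.length; i++) {
            int c = labels[i];
            min[c] = Math.min(min[c], growth[i]);
            max[c] = Math.max(max[c], growth[i]);
        }
        String[] out = new String[k];
        for (int c = 0; c < k; c++) {
            out[c] = min[c] > max[c]
                    ? "no counties" // Cluster received no members.
                    : String.format(Locale.ROOT, "%.1f%% to %.1f%%", min[c], max[c]);
        }
        return out;
    }
}
```

The resulting strings can be sent to the map module alongside the cluster colors so the legend always reflects the currently selected time window.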

Robust Data Processing Pipeline

- Automatic date-column detection
- Flexible parsing of mixed date formats
- Consistent FIPS normalization
- Filtering by user-defined time windows
- Clean handling of missing values

User-Friendly Vaadin UI

Clean interface, dynamic dropdowns, responsive map, and a clear division between trend viewing and clustering modes.

Each of these features contributes to a system that is both technically rigorous and easy for users to understand.

Reflection

This project significantly improved our understanding of Java-based data pipelines, object-oriented design, and collaborative development. We learned how to troubleshoot issues that arise only when working with messy real-world datasets—like inconsistent formatting, missing values, and mismatched identifiers.

From a coding perspective, we grew more confident in debugging, modular design, and integrating external libraries. From a teamwork perspective, we coordinated through GitHub, combined features from different contributors, and adapted as our design evolved.

Individually, we are proud of the clustering logic, the FIPS-matching solution, and the Leaflet integration—none of which we knew how to do at the start. Overall, the project strengthened our technical skills and prepared us to solve similar data-driven problems in more advanced Industrial Engineering courses.