Portfolio
LiveBatting Cleanup
Civic TechIn ProductionGeospatial BackendCity of ToledoApplied Labs

Batting Cleanup

QR-code waste reporting for a smarter Toledo.

A production civic-tech platform connecting public QR-code reports, edge APIs, and PostGIS geospatial validation to improve downtown Toledo maintenance workflows.

Live
Production — downtown Toledo
10K+
Localized assets in simulator
<10ms
ST_DWithin geofence query
GiST
Spatial index — no table scans
7-dev
Docker Compose team environment
Edge
Cloudflare Workers + Hono API
Batting Cleanup
01 — Overview

Not theoretical. Used by real people in a real city.

Batting Cleanup is a live smart-city waste reporting application deployed in downtown Toledo. The system lets residents scan QR codes placed near public waste assets and instantly file a maintenance report — overflowing trash, nearby litter, anything that needs attention.

The project is more than a reporting form. It connects public-facing QR codes, edge-hosted APIs, geospatial validation, city asset telemetry, and backend data integrity into one civic technology platform. Residents get a low-friction way to report problems. Maintainers get structured, location-aware data to prioritize response.

I contributed as a core backend engineer. My work transitioned the system from raw initialization scripts into a modern, type-safe Deno monorepo bridging Cloudflare Workers/Hono APIs with a PostGIS geospatial database — and built the spatial modeling, performance optimization, and validation infrastructure that makes the system reliable at city scale.

02 — Problem

Cities need better feedback loops for public infrastructure

Traditional waste maintenance workflows create friction on both sides: residents don't know who to contact, and maintenance teams receive incomplete, unlocated, or duplicate reports. The gap between “a trash can is overflowing” and “a maintenance crew knows about it” can be days.

No reporting path

Most residents have no simple way to report an overflowing trash can. Phone lines and 311 apps create friction that kills adoption.

Unlocated reports

Even when reports come in, they often lack precise asset location — just a street address, or nothing.

GPS noise downtown

Urban canyons and indoor environments degrade GPS accuracy. A naive location check will reject valid reports.

Spoofing risk

A public-facing system with no location validation is easy to flood with fake or mislocated reports.

Duplicate submissions

Multiple users report the same asset without any deduplication, creating noise for maintenance teams.

Local dev friction

A 7-person team doing manual database setup diverges quickly — inconsistent environments slow everyone down.

Batting Cleanup solves the first-mile reporting problem by putting QR codes directly where the issue happens — no app install, no account, no friction. Scan, report, done.

03 — Architecture

Edge API → validation → PostGIS → city ops

The system flows from a physical QR code through an edge-hosted API, a type-safe service layer, and a geospatial database before surfacing structured data to city maintainers.

User reporting flow

1Resident spots overflowing trash or litter near a QR-coded asset
2Scans the QR code on their phone — no app install required
3Public report form opens with the asset pre-identified
4Resident selects issue type and optionally provides contact info
5Backend validates report context: location, geofence, plausibility
6Report stored with verification metadata in PostGIS database
7Maintainers query structured, location-aware report data

System layers

01
Public QR Interface
Physical QR codes downtownZero-install reportingOptional contact infoMobile-first flow
02
Edge API
Cloudflare WorkersHono REST routesFast public-facing handlerLightweight serverless
03
Type-Safe Backend
Deno monorepoDrizzle ORMRepository patternService patternTypeScript schema
04
Geospatial Database
PostgreSQLPostGISSpatial asset modelsGiST indexesST_DWithin geofence queriesLocation verification logs
05
Validation Layer
isIndoor flagsGeofence checksAnti-spoofing logicGPS uncertainty handlingReport plausibility metadata
06
Local Infrastructure
Docker ComposeAlpine LinuxBash automationSQL hydration scripts10,000+ simulated assets
04 — Geospatial Database Design

Real-world location data is messier than static records

The database had to handle real-world GPS data, not clean coordinates. Each entity was modeled to preserve enough location context to support both query performance and validation logic.

Waste Assets
QR-code identity
Physical location (PostGIS)
Asset metadata
Report history
Geofence radius
Reports
Submitted timestamp
Issue type
Associated asset
Optional contact
Device location
Verification status
Location Verification
Reported coordinates
Expected asset coordinates
Distance from asset
isIndoor flag
GPS confidence
Spoofing indicators
Audit Trail
Report context metadata
Validation outcomes
Timestamp chain
Source tracking
Simulator Data
10,000+ generated assets
Relational report records
Geospatial test state
Repeatable seed data
Maintainer View
Asset clusters
Report frequency
Recurring hotspots
Response status

This structure lets the backend answer operational questions in real time: Is the user near the asset they're reporting? Are multiple reports clustering around one location? Which assets have recurring issues? How does query performance hold at city scale?

05 — Performance Optimization

B-Tree indexes don't work for proximity queries

A major engineering focus was geospatial query performance. The backend needs to run proximity checks on every report submission: Is this report within the expected geofence? Which asset is nearest to this coordinate? How many reports are clustered near this location?

I benchmarked ST_DWithingeofence queries against 10,000+ localized assets. The initial run used standard B-Tree indexing — which is the PostgreSQL default but completely wrong for spatial distance lookups. B-Tree indexes sort by scalar value; they cannot efficiently prune the search space for a geometric “within N meters” query.

Before — B-Tree Index

Full table scan on every geofence query. Latency grows linearly with asset count. Not viable at city scale — every report submission triggers an O(n) scan across all assets.

After — GiST Spatial Index

GiST (Generalized Search Tree) indexes are designed for spatial lookup patterns. They prune the search space geometrically. Result: sub-10 ms ST_DWithin benchmarks against 10,000+ assets in the simulator.

Key queries optimized
Which asset is nearest to this submitted coordinate?
Is this report coordinate within the asset's geofence radius?
Which assets exist within a given bounding area?
How many reports are clustered within N meters of a location?
06 — Infrastructure as Code

Reproducible local environments for a 7-person team

Manual database setup across a 7-person team creates drift fast. One developer has a different schema version, another has stale seed data, a third has a different PostGIS extension installed. These inconsistencies slow everyone down and hide bugs that only appear in certain environments.

I helped create reproducible local development infrastructure using Docker Compose and Alpine Linux containers, scripted with Bash and SQL initialization files. The local simulator generates a fully hydrated, geospatially accurate development environment from scratch in a single command.

Local simulator generates
  • 10,000+ localized waste assets
  • Relational report records
  • Geospatially accurate test coordinates
  • Repeatable development seed state
  • PostGIS extension and GiST indexes pre-configured
What this eliminates
  • Schema version drift between developers
  • Manual PostGIS setup steps
  • Stale or inconsistent seed data
  • Reliance on production data for testing
  • "Works on my machine" environment bugs

Stack

IaC
Infrastructure
Docker ComposeAlpine Linux containersBash automation scriptsSQL initialization filesConditional hydration logic
07 — Data Integrity & Anti-Spoofing

Public-facing systems attract imperfect data

Because Batting Cleanup is publicly accessible, the backend must handle a wide range of real-world edge cases — not as errors to reject, but as signals to classify. The goal is not to block every imperfect report; it is to preserve enough verification context that the system can distinguish trusted, uncertain, and suspicious submissions.

Real-world edge cases

GPS inaccuracy downtown

Urban canyons and tall buildings degrade mobile GPS accuracy. A report 30m from the asset may still be legitimate.

Location permission denied

A user may scan a QR code but decline to share their location. The report is still valid — the QR code already identifies the asset.

Indoor submissions

A user near a public trash can inside a lobby may have GPS coordinates that appear far from the street asset.

Distant submissions

Users may attempt to file reports from far away — either mistakenly or maliciously.

Duplicate reports

Multiple users may report the same asset in a short window without knowing others already did.

Spoofed coordinates

A bad actor may attempt to inject fake location data to flood or mislead the maintenance system.

Validation mechanisms

isIndoor

Flags reports where device signals suggest an indoor environment — preserved as metadata rather than rejected.

Location verification log

Stores reported coordinates, expected asset coordinates, computed distance, and GPS confidence alongside every report.

Geofence check

ST_DWithin validates whether the submitted location is within the asset's configured radius.

Plausibility metadata

Contextual signals (distance, confidence, indoor flag) are stored to support downstream filtering and trust scoring.

Audit trail

Structured record of report context allows maintainers and analysts to review submission quality over time.

08 — Lessons Learned

What production civic software teaches you

Geospatial data needs specialized indexes

B-Tree indexes are the PostgreSQL default but useless for distance queries. GiST is not an optimization — it is a correctness requirement for spatial workloads.

Production civic software handles imperfect input

Real-world users scan QR codes in parking garages, deny location permissions, and submit from unpredictable environments. The system must tolerate all of it gracefully.

Dockerized environments save team time compounding

Local environment drift across 7 developers is slow, annoying, and hard to debug. A reproducible setup script pays for itself the first week.

Repository/service separation makes code scalable

Keeping persistence logic out of API handlers and business logic out of repositories made the codebase much easier to test, extend, and reason about.

Edge APIs need careful database boundary design

Cloudflare Workers are powerful but stateless. The handoff between edge handler and database layer must be explicit — connection pooling, timeouts, and error propagation all matter.

Real city deployments surface edge cases no classroom project reveals

GPS noise, QR code durability, public network reliability, and user behavior at scale are not theoretical concerns. They show up immediately.

09 — Future Work

The reporting layer is the foundation

The current system solves the first-mile problem: getting structured, location-aware reports from residents into a reliable database. The next phase turns that data into operational intelligence for city maintainers.

Operations
  • Live operations dashboard for city maintainers
  • SLA tracking for report resolution time
  • Route optimization for cleanup crews
  • Automated escalation rules for high-frequency assets
Intelligence
  • Report deduplication and clustering
  • Heatmaps for recurring waste hotspots
  • Anomaly detection for spam or suspicious report patterns
  • Analytics for seasonality, event impact, and foot traffic correlation
Public
  • Public transparency metrics dashboard
  • Richer QR-code asset metadata and status pages
  • Notification system for users who filed reports
  • Expanded coverage beyond downtown Toledo

The hard part was not making a form. The hard part was making location-aware public reporting reliable enough for real city use.

Cleaner cities need better feedback loops. Batting Cleanup is the first one.