Our methodology is systematic, automated, and reproducible. Every step — from data ingestion through scoring — follows documented rules applied uniformly to all parcels and all owners in a county.
We process the complete parcel dataset from the county assessor. For St. Louis County, this is 401,458 parcel records with 73 data fields, including owner name, mailing address, property address, valuations, deed type, year built, and property classification.
Source data is obtained from publicly available county assessor portals (typically ArcGIS Hub or similar open data platforms). We do not scrape or purchase data, and we do not use non-public data.
Raw owner names and addresses contain significant variation that would prevent accurate matching. We apply two normalization functions:
Owner name normalization: Convert to uppercase, strip entity suffixes (LLC, Inc, Corp, Trust, LP, Ltd, and 25+ variants), remove punctuation, collapse whitespace, strip leading "THE". This allows "ABC Properties LLC" and "ABC PROPERTIES, L.L.C." to match as the same entity.
Address normalization: Convert to uppercase, strip suite/unit/apt designators, standardize directionals (North → N, South → S), standardize street suffixes (Street → ST, Avenue → AVE, Boulevard → BLVD), remove punctuation, collapse whitespace.
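The two normalization steps above can be sketched roughly as follows. This is a minimal illustration, not the production code: the function names `normalize_owner` and `normalize_address` are hypothetical, the suffix and abbreviation tables are small subsets of the variants mentioned, and the order of operations (periods removed before suffixes are stripped, only trailing suffixes removed) is an assumption chosen so that "L.L.C." collapses to "LLC".

```python
import re

# Illustrative subsets only; the full 25+ suffix variants are not listed in the text.
ENTITY_SUFFIXES = {"LLC", "INC", "CORP", "TRUST", "LP", "LTD"}
DIRECTIONALS = {"NORTH": "N", "SOUTH": "S", "EAST": "E", "WEST": "W"}
STREET_SUFFIXES = {"STREET": "ST", "AVENUE": "AVE", "BOULEVARD": "BLVD"}
# Assumed pattern: a designator word followed by an optional "#" and unit number.
UNIT_PATTERN = re.compile(r"\b(?:SUITE|UNIT|APT)\b\s*#?\s*[A-Z0-9-]*")


def _tokens(text: str) -> list:
    """Replace punctuation with spaces and split, which also collapses whitespace."""
    return re.sub(r"[^A-Z0-9 ]", " ", text).split()


def normalize_owner(name: str) -> str:
    """Hypothetical helper: reduce a raw owner name to a matching key."""
    text = name.upper()
    text = re.sub(r"[.']", "", text)  # "L.L.C." -> "LLC" before tokenizing
    tokens = _tokens(text)
    if tokens and tokens[0] == "THE":  # strip leading "THE"
        tokens = tokens[1:]
    # Strip trailing entity suffixes, but never empty the name entirely.
    while len(tokens) > 1 and tokens[-1] in ENTITY_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def normalize_address(address: str) -> str:
    """Hypothetical helper: reduce a raw street address to a matching key."""
    text = UNIT_PATTERN.sub(" ", address.upper())  # drop suite/unit/apt designators
    tokens = _tokens(text)
    # Abbreviate directionals and street suffixes token by token.
    return " ".join(DIRECTIONALS.get(t, STREET_SUFFIXES.get(t, t)) for t in tokens)
```

With these definitions, `normalize_owner("ABC Properties LLC")` and `normalize_owner("ABC PROPERTIES, L.L.C.")` both yield `"ABC PROPERTIES"`. A token-by-token mapping like this is deliberately simple; it would, for example, also abbreviate a street that is literally named "North", which a fuller implementation would need to handle.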
We construct a mailing address key from the owner's mailing address, city, state, and ZIP code. We then group all parcels by this normalized mailing address key and count the number of distinct normalized owner names at each address.
An address qualifies as a cluster when 3 or more distinct normalized owner/entity names share the same normalized mailing address key. This threshold balances sensitivity (catching real clusters) against specificity (avoiding false positives from incidental address sharing).
The question each cluster answers is: "How many different entity names all receive their mail at this same address?"
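The grouping step can be illustrated with a short sketch. It assumes each parcel is a dictionary whose fields have already been normalized; the field names (`mail_address`, `mail_city`, `mail_state`, `mail_zip`, `owner_name`) and the function name `find_clusters` are hypothetical, not the actual schema.

```python
from collections import defaultdict

MIN_DISTINCT_OWNERS = 3  # threshold stated in the methodology


def find_clusters(parcels):
    """Group parcels by mailing address key and keep addresses with 3+ distinct owners.

    `parcels` is an iterable of dicts with already-normalized fields
    (field names here are assumptions for illustration).
    """
    owners_by_key = defaultdict(set)
    for p in parcels:
        key = (p["mail_address"], p["mail_city"], p["mail_state"], p["mail_zip"])
        owners_by_key[key].add(p["owner_name"])  # a set counts each name once
    return {
        key: names
        for key, names in owners_by_key.items()
        if len(names) >= MIN_DISTINCT_OWNERS
    }
```

Because owner names are collected in a set, an entity that holds many parcels still counts once toward the threshold, which matches the "distinct names" wording above.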
Known registered agent addresses, virtual office providers, and commercial mail centers are automatically excluded from clustering. This prevents common business service addresses from generating false positive clusters.
Our exclusion list includes addresses matching: registered agent services (CT Corporation, National Registered Agents, Corporation Service Company, United States Corporation Agents), identified mail-drop addresses, and addresses flagged through manual review of high-entity-count clusters.
The exclusion list is maintained and expanded with each report update. When in doubt, we exclude — a missed cluster is preferable to a false positive.
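One simple way to apply such a list is a lookup against the mailing address key, sketched below. How the real exclusion matching works is not described in detail here, so this is an assumption: the function name `apply_exclusions` is hypothetical and the excluded key in the usage example is a made-up placeholder, not a real registered agent address.

```python
def apply_exclusions(clusters, excluded_keys):
    """Drop any cluster whose mailing address key is on the exclusion list.

    `clusters` maps a mailing address key (a tuple) to its set of owner names;
    `excluded_keys` is a set of keys in the same normalized form.
    """
    return {key: names for key, names in clusters.items() if key not in excluded_keys}


# Usage with placeholder data (addresses are invented for illustration).
excluded = {("100 AGENT PLAZA", "ANYTOWN", "MO", "00000")}
clusters = {
    ("100 AGENT PLAZA", "ANYTOWN", "MO", "00000"): {"A", "B", "C", "D"},
    ("1 MAIN ST", "CLAYTON", "MO", "63105"): {"ALPHA", "BETA", "GAMMA"},
}
kept = apply_exclusions(clusters, excluded)
```

Exclusion keys need to pass through the same address normalization as the parcel data, otherwise formatting differences would let an excluded address slip through.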
Each cluster receives a composite concentration index (referred to as "risk score" in the data) based on multiple factors:
| Component | Value |
|---|---|
| Base: number of distinct entities at the address | entity_count |
| Majority out-of-state owners (>50% non-local) | +3 |
| Each entity with distress score above 40 | +2 each |
| Total cluster appraised value exceeds $1M | +5 |
| 10+ entities at address | +10 |
| 5-9 entities at address | +5 |
| Each quitclaim deed in cluster | +2 each |
Higher concentration indices indicate greater ownership density and complexity. They do not indicate wrongdoing.
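As a worked illustration of the table, the following sketch adds up the components for a single cluster. The `Cluster` fields and the function name are hypothetical, and two points are assumptions: entity-level distress scores are taken as given (the text does not say how parcel scores roll up to an entity), and the two entity-count tiers are treated as mutually exclusive.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cluster:
    entity_count: int                 # distinct entities at the address
    out_of_state_share: float         # fraction of owners out of state, 0.0 to 1.0
    entity_distress_scores: List[float]
    total_appraised_value: float      # dollars
    quitclaim_deed_count: int


def concentration_index(c: Cluster) -> int:
    """Sum the table's components for one cluster (illustrative sketch)."""
    score = c.entity_count                                        # base
    if c.out_of_state_share > 0.5:
        score += 3                                                # majority out of state
    score += 2 * sum(1 for s in c.entity_distress_scores if s > 40)
    if c.total_appraised_value > 1_000_000:
        score += 5
    if c.entity_count >= 10:
        score += 10                                               # 10+ entities
    elif c.entity_count >= 5:
        score += 5                                                # 5-9 entities
    score += 2 * c.quitclaim_deed_count
    return score


example = Cluster(
    entity_count=6,
    out_of_state_share=0.6,
    entity_distress_scores=[45, 30, 50],
    total_appraised_value=1_200_000,
    quitclaim_deed_count=2,
)
print(concentration_index(example))  # 6 + 3 + 4 + 5 + 5 + 4 = 27
```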
Individual parcels receive a distress score (0-100) based on publicly observable signals that may indicate property neglect, vacancy, or financial stress:
| Signal | Points |
|---|---|
| Out-of-state absentee owner | 20 |
| In-state absentee owner | 10 |
| Vacant land (no structure) | 15 |
| Low improvement ratio (<10% of appraised value) | 10 |
| Long ownership: 20+ years | 10 |
| Long ownership: 10-20 years | 5 |
| Pre-1950 building | 5 |
| Low total appraised value (<$30,000) | 10 |
| Large lot: 1+ acre residential | 5 |
| Multi-family absentee | 5 |
Distress scores are observational indicators derived from public assessor data. A high distress score does not imply negligence or wrongdoing — it highlights parcels that may exhibit patterns associated with deferred maintenance, vacancy, or long-term holding.
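The signal table can be read as a simple additive score, sketched below for a single parcel. The `Parcel` fields and function name are hypothetical, and several details are assumptions not stated in the text: how absentee status is determined, that the improvement ratio is improvement value divided by total appraised value, that exactly 20 years of ownership falls in the 20+ tier, and that the result is capped at 100.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Parcel:
    owner_state: str
    is_absentee: bool            # assumed to be precomputed upstream
    is_vacant_land: bool
    improvement_value: float     # dollars
    appraised_value: float       # total appraised value, dollars
    years_owned: float
    year_built: Optional[int]    # None when there is no structure
    acres: float
    is_residential: bool
    is_multi_family: bool


def distress_score(p: Parcel, home_state: str = "MO") -> int:
    """Add up the table's signals for one parcel (illustrative sketch)."""
    score = 0
    if p.is_absentee:
        score += 20 if p.owner_state != home_state else 10
    if p.is_vacant_land:
        score += 15
    if p.appraised_value > 0 and p.improvement_value / p.appraised_value < 0.10:
        score += 10
    if p.years_owned >= 20:
        score += 10
    elif p.years_owned >= 10:
        score += 5
    if p.year_built is not None and p.year_built < 1950:
        score += 5
    if p.appraised_value < 30_000:
        score += 10
    if p.is_residential and p.acres >= 1:
        score += 5
    if p.is_multi_family and p.is_absentee:
        score += 5
    return min(score, 100)  # keep within the stated 0-100 range


example = Parcel(
    owner_state="TX", is_absentee=True, is_vacant_land=False,
    improvement_value=5_000, appraised_value=25_000, years_owned=25,
    year_built=1925, acres=0.2, is_residential=True, is_multi_family=True,
)
print(distress_score(example))  # 20 + 10 + 5 + 10 + 5 = 50
```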
Our analysis is only as current and accurate as the underlying county assessor data. Known limitations include:
We encourage all users to independently verify findings before taking action.
Every report includes a full methodology section with county-specific data currency dates.