University of Southern California
USC College of Letters, Arts & Sciences  
USC GIS Research Laboratory
Geocoding Platform Technical Details

Version 2.92 - November 4th 2009

Technical Reports

The following technical report details the inner workings of the USC WebGIS Geocoding Platform:

Goldberg, D.W., 2009. The USC WebGIS Open Source Geocoding Platform. Technical Report No 11. Los Angeles CA: University of Southern California GIS Research Laboratory. Available online at: http://gislab.usc.edu/i/publications/gislabtr11.pdf.

Geographic Coverage

The USC Geocoding Platform currently supports US geocoding.

Reference Data Sources

2008 US Census Bureau TIGER/Line Files
Edges
Places
Cities
Consolidated Cities
Zip Code Tabulation Area (ZCTA)
County Sub Regions
Counties
States
2005 US Census Bureau TIGER/Line Files
Edges
2000 US Census Bureau Cartographic Boundary Files
Places
Cities
Zip Code Tabulation Area (ZCTA)
Counties
Los Angeles County Assessor’s Parcel Files
USPS Tiger/ZIP + 4 Files

If you have other reference data sets that you would like to use for, or contribute to, the geocoding process on this site, we can incorporate them. Please contact us for more information.

Feature Matching

Deterministic Matching
This version of the USC geocoder performs strictly deterministic matching, i.e., probabilistic matching is not attempted. This means that if an exact match is not found in a particular reference dataset, no match is returned for that reference data layer.

Attribute Relaxation
This option directs the geocoder to try alternative versions of the input data when an exact match cannot be found. In particular, attributes of the input address are removed from the query, first one at a time and then in combination with each other. Attribute relaxation is performed (if the option is selected) on the following address attributes:

Rank  Attribute
1     Street predirectional
2     Street postdirectional
3     Street suffix
4     City
5     Zip


An example of the first few iterations it will try is listed in the following table (a dash marks an attribute that has been removed from the query):

Number  Pre  Name     Suffix  Post  Zip    City         State
3620    S    Vermont  Ave     N     90089  Los Angeles  Ca
3620    -    Vermont  Ave     N     90089  Los Angeles  Ca
3620    S    Vermont  Ave     -     90089  Los Angeles  Ca
3620    S    Vermont  -       N     90089  Los Angeles  Ca
3620    -    Vermont  Ave     -     90089  Los Angeles  Ca
3620    -    Vermont  -       N     90089  Los Angeles  Ca
3620    S    Vermont  -       -     90089  Los Angeles  Ca
3620    -    Vermont  -       -     90089  Los Angeles  Ca
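
The iteration order in the table can be sketched in Python as follows. This is only an illustration, not the geocoder's actual code: the function name and the dictionary keys are hypothetical, and the sketch relaxes just the three street attributes that vary in the table. How city and Zip are interleaved into the sequence is not shown in the table, so they are left out here.

```python
from itertools import combinations

# Street attributes in rank order (hypothetical key names).
# City and Zip are omitted because the example table does not show
# where they enter the sequence.
RELAXABLE = ("pre", "post", "suffix")


def relaxed_queries(address, relaxable=RELAXABLE):
    """Yield the original address, then copies with attributes removed:
    first one at a time, then in combination with each other."""
    yield dict(address)
    for size in range(1, len(relaxable) + 1):
        for dropped in combinations(relaxable, size):
            candidate = dict(address)
            for attribute in dropped:
                candidate[attribute] = ""  # an empty string marks a removed attribute
            yield candidate


address = {
    "number": "3620", "pre": "S", "name": "Vermont", "suffix": "Ave",
    "post": "N", "zip": "90089", "city": "Los Angeles", "state": "Ca",
}
queries = list(relaxed_queries(address))
for query in queries:
    print(query["number"], query["pre"], query["name"], query["suffix"], query["post"])
```

In a deterministic matcher, each candidate would be tried against the reference data in turn, stopping at the first one that matches exactly.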


Substring Matching
This option directs the geocoder to use substring matching techniques to test for matches in the database. Using this approach increases the likelihood of finding a match if the input data or reference data are incomplete (the recall is increased). However, it also increases the chances that wrong results are returned (the precision is decreased). The following table shows examples of this strategy:

Query    Reference Feature  Match
Vermont  Vermont            yes
Verm     Vermont            yes
Mont     Vermont            yes
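
A minimal Python sketch of this kind of test is shown below. The function name is hypothetical, and the case-insensitive comparison is an assumption inferred from the "Mont" row of the table; the geocoder's own implementation may differ.

```python
def substring_match(query, reference):
    """Return True if the query occurs anywhere inside the reference name.

    Case is ignored (an assumption), and an empty query never matches.
    """
    needle = query.strip().lower()
    return bool(needle) and needle in reference.lower()


for query in ("Vermont", "Verm", "Mont"):
    print(query, "Vermont", substring_match(query, "Vermont"))
```

Because any contained fragment matches, a short query such as "Mont" would also match other reference names that contain it, which is the loss of precision described above.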


Soundex Matching
This option directs the geocoder to use soundex matching techniques to test for matches in the database. Using this approach increases the likelihood of finding a match if the input data or reference data have minor misspellings (the recall is increased). However, it also increases the chances that wrong results are returned (the precision is decreased). The following table shows examples of this strategy:

Query    Query Soundex  Reference Feature  Reference Feature Soundex  Match
Vermont  V655           Vermont            V655                       yes
Vermond  V655           Vermont            V655                       yes
Varnend  V655           Vermont            V655                       yes
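
The codes in the table can be reproduced with the standard American Soundex rules, sketched below in Python. Whether the geocoder uses exactly this variant is an assumption; the function names are hypothetical.

```python
# Standard American Soundex digit groups.
_GROUPS = {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT", "4": "L", "5": "MN", "6": "R"}
_CODES = {letter: digit for digit, letters in _GROUPS.items() for letter in letters}


def soundex(name):
    """Return the four-character Soundex code of a name ("" if it has no letters)."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    previous = _CODES.get(first, "")
    for letter in letters[1:]:
        if letter in "HW":
            continue  # H and W do not separate letters that share a code
        code = _CODES.get(letter, "")
        if code and code != previous:
            digits.append(code)
        previous = code  # a vowel resets this, so a repeated code is kept
    return (first + "".join(digits) + "000")[:4]


def soundex_match(query, reference):
    """Two names match when their Soundex codes are equal."""
    return soundex(query) == soundex(reference)


for name in ("Vermont", "Vermond", "Varnend"):
    print(name, soundex(name), soundex_match(name, "Vermont"))
```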

Feature Interpolation

Linear Interpolation
The following linear interpolation techniques are used:
1) Address range interpolation
2) Uniform lot interpolation
The following areal-unit interpolation techniques are used:
1) Geometric centroid

Full list of API Output InterpolationType Values


Unknown
LinearInterpolation
ArealInterpolation
None
NotAttempted

Full list of API Output InterpolationSubType Values


Unknown
LinearInterpolationAddressRange
LinearInterpolationUniformLot
LinearInterpolationActualLot
LinearInterpolationMidPoint
ArealInterpolationBoundingBoxCentroid
ArealInterpolationConvexHullCentroid
ArealInterpolationGeometricCentroid
None
NotAttempted

Matching Geography Types

Parcel centroid
An exact match was found to a parcel and its centroid is returned as output

Street segment
A match was found to the street segment and the address range associated with the segment was used to interpolate a point to return as output

ZCTA
A match was found to the ZIP portion of the address and its centroid was returned as output

City
A match was found to the city portion of the address and its centroid was returned as output

County subregion
A match was found to the city portion of the address in the county subregion reference data set and its centroid was returned as output

County
A match was found to the city portion of the address in the county reference data set and its centroid was returned as output

Unmatchable
A match could not be found for the input

Full list of API Output MatchingGeographyTypes Values


Unknown
GPS
BuildingCentroid
BuildingDoor
Parcel
StreetSegment
StreetIntersection
StreetCentroid
USPSZipPlus5
USPSZipPlus4
USPSZipPlus3
USPSZipPlus2
USPSZipPlus1
USPSZip
ZCTAPlus5
ZCTAPlus4
ZCTAPlus3
ZCTAPlus2
ZCTAPlus1
ZCTA
City
ConsolidatedCity
MinorCivilDivision
CountySubRegion
County
State
Country
Unmatchable

Geocode Quality

Exact parcel centroid
An exact match was found to a parcel and its centroid is returned as output

Nearest parcel centroid
A match was found to the nearest parcel and its centroid is returned as output

Uniform lot interpolation
A match was found to the street segment and the number of lots on the segment was used to interpolate a point to return as output

Address range interpolation
A match was found to the street segment and the address range associated with the segment was used to interpolate a point to return as output

ZCTA centroid
A match was found to the ZIP portion of the address and its centroid was returned as output

City centroid
A match was found to the city portion of the address and its centroid was returned as output

County subregion centroid
A match was found to the city portion of the address in the county subregion reference data set and its centroid was returned as output

County centroid
A match was found to the city portion of the address in the county reference data set and its centroid was returned as output

Unmatchable
A match could not be found for the input

Full list of API Output GeocodeQualityType Values


Unknown
GPS
BuildingFrontDoor
BuildingCentroid
ExactParcelCentroidPoint
ExactParcelCentroid
NearestParcelCentroidPoint
NearestParcelCentroid
ActualLotInterpolation
UniformLotInterpolation
AddressRangeInterpolation
StreetIntersection
StreetCentroid
ZCTAPlus5Centroid
ZCTAPlus4Centroid
ZCTAPlus3Centroid
ZCTAPlus2Centroid
ZCTAPlus1Centroid
ZCTACentroid
USPSZipPlus5LineCentroid
USPSZipPlus4LineCentroid
USPSZipPlus5AreaCentroid
USPSZipPlus4AreaCentroid
USPSZipPlus3AreaCentroid
USPSZipPlus2AreaCentroid
USPSZipPlus1AreaCentroid
USPSZipAreaCentroid
CityCentroid
ConsolidatedCityCentroid
CountySubdivisionCentroid
CountyCentroid
StateCentroid
CountryCentroid
DynamicFeatureCompositionCentroid
Unmatchable

Uncertainty Hierarchy

The USC geocoder lets the user choose whether the "best" geocode returned for an address is selected dynamically, based on an accuracy metric calculated by the geocoder, or statically, always in the same order.

The uncertainty hierarchy option directs the geocoder to choose the geocode with the lowest uncertainty as the "best geocode" returned for an address. This option will slow down the processing of your records. For details on how this uncertainty is calculated, please contact us.

When this option is not selected, the "best geocode" is the first geocode type in the following list for which a match is found:
Exact parcel centroid
Nearest parcel centroid
Uniform lot interpolation
Address range interpolation
ZIP code centroid
City centroid
County subdivision centroid
County centroid
State centroid
Country centroid
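
The two selection modes can be sketched in Python as follows. This is an illustration only: the function names, the shape of the candidate data, and the uncertainty numbers are hypothetical, and the way the geocoder actually computes uncertainty is not described here.

```python
# The static order listed above, from most to least preferred.
STATIC_HIERARCHY = [
    "Exact parcel centroid",
    "Nearest parcel centroid",
    "Uniform lot interpolation",
    "Address range interpolation",
    "ZIP code centroid",
    "City centroid",
    "County subdivision centroid",
    "County centroid",
    "State centroid",
    "Country centroid",
]


def best_geocode_static(candidates):
    """Return (type, geocode) for the first type in the hierarchy that matched."""
    for quality in STATIC_HIERARCHY:
        if quality in candidates:
            return quality, candidates[quality]
    return None


def best_geocode_dynamic(candidates, uncertainty):
    """Return (type, geocode) for the candidate with the lowest uncertainty."""
    if not candidates:
        return None
    quality = min(candidates, key=lambda q: uncertainty[q])
    return quality, candidates[quality]


# Hypothetical candidates for one address: type -> (latitude, longitude).
candidates = {
    "City centroid": (34.05, -118.24),
    "Address range interpolation": (34.02, -118.29),
}
# Hypothetical uncertainty values; lower is better.
uncertainty = {"City centroid": 1200.0, "Address range interpolation": 35.0}

print(best_geocode_static(candidates))
print(best_geocode_dynamic(candidates, uncertainty))
```

The static mode only needs to test which types matched, whereas the dynamic mode needs an uncertainty value for every candidate, which is consistent with the note above that the dynamic option is slower.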

Query Status Codes

All APIs available from the USC WebGIS site (geocoding, address parsing, etc.) use the same set of query status codes, listed in the following table.

Group              Code Value  Code Name
Success            200         Success

API Key Errors     400         API Key Error
API Key Errors     401         API Key Missing
API Key Errors     402         API Key Invalid
API Key Errors     403         API Key Not Activated

Non-Profit Errors  450         Non Profit Error
Non-Profit Errors  451         Non Profit Not Confirmed

Quota Errors       470         Quota Exceeded Error
Quota Errors       471         Anonymous Quota Exceeded
Quota Errors       472         Paid Quota Exceeded

Versions Errors    480         Version Missing
Versions Errors    481         Version Invalid

Internal Errors    500         Failure
Internal Errors    501         Internal Error

Unknown Errors     0           Unknown
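
A client might keep these codes in a lookup table, as in the Python sketch below. The names are hypothetical, and treating any code not in the table as "Unknown" is an assumption made for this example, not documented API behavior.

```python
# Code value -> (group, code name), copied from the table above.
QUERY_STATUS_CODES = {
    200: ("Success", "Success"),
    400: ("API Key Errors", "API Key Error"),
    401: ("API Key Errors", "API Key Missing"),
    402: ("API Key Errors", "API Key Invalid"),
    403: ("API Key Errors", "API Key Not Activated"),
    450: ("Non-Profit Errors", "Non Profit Error"),
    451: ("Non-Profit Errors", "Non Profit Not Confirmed"),
    470: ("Quota Errors", "Quota Exceeded Error"),
    471: ("Quota Errors", "Anonymous Quota Exceeded"),
    472: ("Quota Errors", "Paid Quota Exceeded"),
    480: ("Versions Errors", "Version Missing"),
    481: ("Versions Errors", "Version Invalid"),
    500: ("Internal Errors", "Failure"),
    501: ("Internal Errors", "Internal Error"),
    0: ("Unknown Errors", "Unknown"),
}


def describe_status(code):
    """Return (group, name) for a status code; unlisted codes map to Unknown."""
    return QUERY_STATUS_CODES.get(code, QUERY_STATUS_CODES[0])


def is_success(code):
    """Only code 200 indicates a successful query."""
    return code == 200


for code in (200, 402, 471, 999):
    group, name = describe_status(code)
    print(code, group, name, is_success(code))
```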

Matched Location Types

StreetAddress
The matched input data was a postal street address
Example: 3620 South Vermont Ave, Los Angeles, CA 90089-0255
PostOfficeBox
The matched input data was a Post Office Box address
Example: PO Box 0255, Los Angeles, CA 90089-0255
RuralRoute
The matched input data was a Rural Route address
Example: RR 13 Box 2, Los Angeles, CA 90089-0255
StarRoute
The matched input data was a Star Route address
Example: Star Route 13 Box 2, Los Angeles, CA 90089-0255
HighwayContractRoute
The matched input data was a Highway Contract Route address
Example: HC 13 Box 2, Los Angeles, CA 90089-0255
Intersection
The matched input data was an intersection of two or more streets
Example: 36th and Vermont, Los Angeles, CA
NamedPlace
The matched input data was a named place
Example: USC GIS Research Laboratory, Los Angeles, CA
RelativeDirection
The matched input data was a relative direction
Example: 1 mile south of downtown Los Angeles
Unmatchable
A match could not be found for the input

Full list of API Output MatchedLocationTypes Values


Unknown
StreetAddress
PostOfficeBox
RuralRoute
StarRoute
HighwayContractRoute
Intersection
NamedPlace
RelativeDirection
Unmatchable

Matched Types

Exact
The input data exactly matched a feature in the reference data source:
input 3620 S Vermont Ave, Los Angeles, CA 90089-0255
reference 3620 S Vermont Ave, Los Angeles, CA 90089-0255
Relaxed
One or more of the input data attributes had to be removed to find a match in the reference data source:
input 3620 Vermont Ave, Los Angeles, CA 90089-0255
reference 3620 S Vermont Ave, Los Angeles, CA 90089-0255

- South is missing from input and is present in the reference

For more details, see the section on Attribute Relaxation above.

Soundex
One or more of the input data attributes had to be matched with soundex to find a match in the reference data source:
input 3620 S Vermonnt Ave, Los Angeles, CA 90089-0255
reference 3620 S Vermont Ave, Los Angeles, CA 90089-0255

- Soundex("Vermonnt") = V655 = Soundex("Vermont")

For more details, see the section on Soundex Matching above.

Full list of API Output FeatureMatchTypes Values


NoMatch
Exact
Relaxed
Substring
Soundex
Composite
Unknown

Feature Matching Result Types

Success
An exact match was found

Unmatchable
A match could not be found for the input

Full list of API Output FeatureMatchingResultTypes Values


Unknown
Success
Ambigous
Composite
LessThanMinimumScore
InvalidFeature
NullFeature
Unmatchable
ExceptionOccurred

Known Bugs

The following list contains the set of known bugs for this release of the geocoding service. We are constantly working to improve the service and will be addressing these bugs in future releases. If you discover or suspect another bug, please report it.
ID  Type                                    Description
1   Input data                              Street intersection data are not supported
2   Feature matching                        A state is required to obtain any type of street- or parcel-level match
3   Feature matching                        Probabilistic feature matching is not supported
4   Feature matching/Feature interpolation  Street centroids are not returned for street-level matches without address numbers
5   Address parsing/Feature matching        ZIP codes with leading zeros have the leading zeros removed
6   Address parsing/Feature matching        City portions of an address are not normalized, so you should pre-normalize them if possible, e.g., use "East Hanover" instead of "E Hanover"
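
For bug 6, city names can be pre-normalized on the client side before they are submitted. The Python sketch below expands a leading directional abbreviation only; the function name and the abbreviation list are assumptions for this example, and other abbreviations (such as "St" or "Mt") are deliberately left alone because they are ambiguous.

```python
# Leading directional abbreviations to expand (an illustrative list).
DIRECTIONALS = {
    "N": "North", "S": "South", "E": "East", "W": "West",
    "NE": "Northeast", "NW": "Northwest", "SE": "Southeast", "SW": "Southwest",
}


def normalize_city(city):
    """Expand a directional abbreviation at the start of a multi-word city name."""
    words = city.split()
    if len(words) > 1:
        key = words[0].rstrip(".").upper()
        if key in DIRECTIONALS:
            words[0] = DIRECTIONALS[key]
    return " ".join(words)


print(normalize_city("E Hanover"))
print(normalize_city("Los Angeles"))
```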