Parcel Data Onboarding and FAQ
Documentation and Frequently Asked Questions
Where does your parcel data come from?
We source our data directly from counties, states, municipalities and their designated vendors wherever possible. We also work with trusted partners and even digitize paper maps when needed.
How do you standardize county data generally?
The main way we make county data much easier to work with is by standardizing the column names of the raw data provided by each county. We do not standardize the values in most columns; we keep those exactly as provided by the county. We do make sure that every county in our system is converted to a standard table schema, with consistent column names across the nationwide dataset. Please see the “What is the Regrid Parcel Schema?” question below.
In addition, we further standardize the parcel address fields using the US Postal Service database of addresses. For details on address specific standardization please see the "How do you standardize and normalize addresses?" question below.
What Coordinate Reference System (CRS) / Spatial Reference System (SRS) is your data in?
All geometries are delivered in the 2D CRS of the World Geodetic System 1984 (WGS84) with the Spatial Reference Identifier (SRID) EPSG:4326.
How do you clean parcel geometries?
We rely on authoritative data from county and county designated sources for our data. We seek to minimize unwarranted changes that would inappropriately modify this source data. That said, we take accuracy of the data seriously, including the accuracy of the spatial alignment of the data. To that end, we have the below processes to ensure the highest level of spatial accuracy of our data.
- We perform a visual inspection to assess the completeness of the county.
- Recognizing that imagery can itself contain spatial inaccuracies, we do not manually align parcels to imagery.
- If large scale shifts, skews, or rotations are discovered, we work with the source so they can correct them, or we seek an alternative source.
- We use standard geospatial functions to correct common errors in the individual parcel polygons and remove any polygons that cannot be made valid in that process.
- We also remove some slivers and overly large or detailed polygons like road sets, water or wetlands that have proven to cause issues for many users.
How do you deliver bulk data?
All bulk data is provided via SFTP as zip files of each county in the format of your choice (GeoJSON, NDGeoJSON, SQL, Shapefile, FileGDB, GeoPackage, KML, CSV), using a pull model. We organize things on a county by county basis using the county’s FIPS code (geoid in our Regrid Parcel Schema column).
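Because delivery is organized by county geoid, it can help to normalize geoids before using them to look up files. The following is a minimal sketch, assuming the geoid is the standard 5-digit county FIPS code (2-digit state code followed by a 3-digit county code). The function name split_geoid is hypothetical and not part of any Regrid tooling.

```python
def split_geoid(geoid):
    """Return (state_fips, county_fips) for a 5-digit county geoid.

    Accepts ints or strings and restores leading zeros that spreadsheet
    or CSV tools sometimes strip (for example 1001 becomes "01001").
    """
    text = str(geoid).strip().zfill(5)
    if len(text) != 5 or not text.isdigit():
        raise ValueError(f"not a 5-digit county geoid: {geoid!r}")
    return text[:2], text[2:]


print(split_geoid("26163"))  # ('26', '163')
print(split_geoid(1001))     # ('01', '001')
```

Keeping geoids as zero-padded strings rather than integers avoids mismatches for states whose FIPS code starts with 0.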
What is the Regrid Parcel Schema?
We standardize column names for easy access across counties in our nationwide dataset into our Regrid Parcel Schema for tables.
We are currently at version 6 of our Regrid Parcel Schema and it standardizes the column names for approximately 80 county provided data columns, and 25 Regrid provided data columns. This schema is applied to 100% of our dataset. A data dictionary is available: https://docs.google.com/spreadsheets/d/14RcBKyiEGa7q-SR0rFnDHVcovb9uegPJ3sfb3WlNPc0/edit#gid=1010834424
Do you have a specific attribute for a specific county?
A current, detailed list of every county in our data set and what data fields we have for each is always available at the following URL. This spreadsheet can be downloaded as a CSV for closer analysis: https://docs.google.com/spreadsheets/d/1rvRYv6_ppZlwbmyi2kbzemot6FOEm2EEPdHPyENTQPE/
Why do my files have columns not in the Regrid Parcel Schema?
Every county has our schema columns, so any code or process can rely on those columns (also called attributes in GIS software) being present. However, most counties also provide attributes that do not map into one of our Regrid Parcel Schema columns, so we keep those attributes and include them on the end of our schema columns using whatever name is provided by the county. In many tools you can control what attributes are retained during import or merging, so we suggest folks working across multiple counties just keep the schema columns or a sub set of them and leave what we call "custom columns" off entirely to get a uniform set of columns. Going back and reviewing those custom columns can reveal interesting information provided by the counties, but not provided very often in other counties.
Nationwide dataset clients also have access to a 'schema-columns-only' directory by default, which will not have any of the county custom columns and will always just be the columns listed in our current parcel schema. If you do not receive the full nationwide dataset, please just let us know and we can provide you a 'schema-columns-only' version of the counties you receive. See the Regrid Parcel Schema and technical data dictionary.
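If you receive files that include custom columns, dropping them after loading is one way to get a uniform set of columns across counties. The sketch below assumes each parcel is available as a Python dict. SCHEMA_COLUMNS is a small hypothetical subset used for illustration (the data dictionary has the full list), and juris_cd is a made-up custom column name.

```python
# Hypothetical subset of schema columns; the data dictionary has the full list.
SCHEMA_COLUMNS = {"geoid", "parcelnumb", "ll_uuid", "ll_stack_uuid", "ll_last_refresh"}


def keep_schema_columns(record, schema_columns=SCHEMA_COLUMNS):
    """Return a copy of `record` with only the schema columns retained."""
    return {name: value for name, value in record.items() if name in schema_columns}


def custom_columns(record, schema_columns=SCHEMA_COLUMNS):
    """List the county custom columns present on `record`, sorted by name."""
    return sorted(name for name in record if name not in schema_columns)


row = {"geoid": "26163", "parcelnumb": "01-0234", "juris_cd": "A1"}
print(keep_schema_columns(row))  # {'geoid': '26163', 'parcelnumb': '01-0234'}
print(custom_columns(row))       # ['juris_cd']
```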
How can I explore the custom columns for each county?
We work with all of our counties in a PostgreSQL database, each county in its own table. That makes managing the custom columns from each county much easier. Most database servers provide a way to search the column names of the tables in a database. For example, in Postgres you would do it this way:
-- List every column in the public schema whose name contains 'juri'
SELECT table_name, column_name
FROM information_schema.columns
WHERE table_schema = 'public'
  AND column_name ~ 'juri'
ORDER BY table_name, column_name;
Also, directly browsing the data on a place-by-place process of areas or regions you are interested in can be very useful. DBeaver is a cross platform, multi vendor database client that can render geographic data.
Why do Shapefile attribute names not match the Regrid Parcel Schema column names?
Some of our Regrid Parcel Schema column names are longer than the Esri Shapefile format allows and the column names in your attribute table will be truncated to the first 10 characters of the Regrid Parcel Schema column names.
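To anticipate what a schema column will be called in a Shapefile attribute table, you can apply the same 10-character truncation yourself. This is only a sketch of the rule described above. When two schema names share the same first 10 characters, the export tool has to rename at least one of them, and how it does so is not covered here, so check the actual attribute table in those cases. The helper names are hypothetical.

```python
from collections import defaultdict


def truncate_for_shapefile(columns, limit=10):
    """Map each column name to its first `limit` characters."""
    return {name: name[:limit] for name in columns}


def find_collisions(columns, limit=10):
    """Return truncated names shared by more than one source column."""
    groups = defaultdict(list)
    for name in columns:
        groups[name[:limit]].append(name)
    return {short: names for short, names in groups.items() if len(names) > 1}


cols = ["parcelnumb", "parcelnumb_no_formatting", "ll_stack_uuid", "ll_last_refresh"]
print(truncate_for_shapefile(cols)["ll_last_refresh"])  # ll_last_re
print(find_collisions(cols))
# {'parcelnumb': ['parcelnumb', 'parcelnumb_no_formatting']}
```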
When was your data last updated?
On average 94% of our parcels have been refreshed in the last 12 months, with most of those in the last 6 months. We target approximately 600 counties that are generally fast growing and populous for more frequent updates. Some of these are targeted for updates as frequently as four times per year. We work on a rolling update schedule, refreshing county data directly for 100 - 300 counties per month, usually grouped by state.
Monthly we share, both in machine readable format and via a monthly update email, what counties have been updated, and what states are in the pipeline for the upcoming quarter. Here is an example of our Monthly Data Update Email. Regrid tracks the date of last refresh from the county in the ll_last_refresh attribute.
Our USPS related attributes are updated monthly for our entire data set. Several other Regrid generated attributes are also updated outside of updates available from the parcel data source provider. These updates do not impact the ll_last_refresh attribute.
A detailed listing of every county in our data set and the date we last refreshed directly from the county is available in the Regrid coverage report. This report can be downloaded as a CSV: Regrid Attribute Completeness Report
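If you want to check freshness for your own subset of parcels, the ll_last_refresh attribute can be summarized directly. The sketch below assumes the values have been read as ISO formatted date strings (YYYY-MM-DD), which is an assumption about your export rather than a documented guarantee. The sample dates are made up.

```python
from datetime import date, timedelta


def share_refreshed(refresh_dates, as_of, days=365):
    """Fraction of parcels whose ll_last_refresh falls within `days` of `as_of`.

    Empty or missing values are ignored.
    """
    cutoff = as_of - timedelta(days=days)
    known = [date.fromisoformat(value) for value in refresh_dates if value]
    if not known:
        return 0.0
    recent = sum(1 for d in known if d >= cutoff)
    return recent / len(known)


dates = ["2024-05-01", "2023-01-15", "2024-02-20", None]  # made-up sample values
print(share_refreshed(dates, as_of=date(2024, 6, 1)))
```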
How do you provide data updates?
Please see our guide on data updates
How do I keep my data up-to-date?
Please see our guide on data updates
What software can I use to work with your data?
Editing or working with most of our data requires software for working with geographic and geospatial data. The OSGeo project provides QGIS, free and open source desktop software for working with this kind of data.
There are other free and paid software options for working with geospatial data, but all of it has a learning curve. We are not able to provide support for those 3rd party software applications. For QGIS at least, the best options for learning are via the many community developed tutorial videos and texts.
We suggest starting with the GeoPackage (geoPKG, .gpkg) formatted files to use in QGIS or any geospatial software.
What about Google Earth?
We provide KML/KMZ options for Google Earth and Google Earth Pro, but neither of those applications support editing our data, only viewing the data. If you need to make changes to the data you get from us, you will need a desktop application like QGIS discussed above.
How large is the nationwide dataset?
The nationwide dataset is approximately 400-800 GB uncompressed, varying by file format, storage method, attribute tier, and other factors.
How do you load all of these files into a database?
We generally work in a GNU/Linux environment using the command line. Our internal workflow makes use of the OSGeo Foundation libraries and tools including GDAL/OGR and PostGIS for PostgreSQL. The OSGeo project also provides an MS Windows 10 installer, named OSGeo4W, for using the tools on a Windows machine.
The ogr2ogr command line tool is the best way to import data into a PostgreSQL/PostGIS or MS SQL Server database.
Below is a typical command line to cycle through a set of FileGDB (.gdb) files and append them to a table in PostgreSQL:
# Append every FileGDB in the directory to a single PostgreSQL table
for gdb in /path_to_files/*.gdb; do
  echo "$gdb"
  ogr2ogr -f "PostgreSQL" PG:"dbname=dbname" "$gdb" -progress --config PG_USE_COPY YES -nln public_schema.target_table_name -append
done
Dealing with custom columns at scale is best accomplished by ignoring them during import. One approach is to use the -sql option with ogr2ogr to restrict what columns you load to all of, or a sub-set of, the columns in our Regrid Parcel Schema. We use the filename_stem value from our verse table to assign layer names in formats that support layer names. In the below example, replace <filename_stem> with that layer name and extend the column list as needed:
ogr2ogr -f 'PostgreSQL' PG:'dbname=dbname' st_county.gpkg -sql 'select wkb_geometry,geoid,etc from <filename_stem>' -nln st_county_imported
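When loading many counties, it can be convenient to generate the command above from a list of the schema columns you want. The following is a minimal sketch that only builds the argument list, using the same ogr2ogr options shown above (-f, -sql, -nln). The function name, the assumed layer name st_county, and the column list are hypothetical, and layer names containing special characters would need quoting in the SQL.

```python
import shlex


def build_ogr2ogr_command(source_path, layer_name, columns, target_table, dbname="dbname"):
    """Build an ogr2ogr argument list that loads only the chosen columns."""
    select = "select {} from {}".format(",".join(columns), layer_name)
    return [
        "ogr2ogr",
        "-f", "PostgreSQL",
        f"PG:dbname={dbname}",
        source_path,
        "-sql", select,
        "-nln", target_table,
    ]


cmd = build_ogr2ogr_command(
    "st_county.gpkg",                         # placeholder file name
    "st_county",                              # assumed layer name (filename_stem)
    ["wkb_geometry", "geoid", "parcelnumb"],  # schema columns to keep
    "st_county_imported",
)
print(shlex.join(cmd))
# To execute it: import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with the SQL statement.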
Using ogr2ogr to load parcel data into MS SQL Server works the same way. Parcel data should use the geometry data type in MS SQL Server. A good example of how to do that is this blog post by Alastair Aitchison. They also cover installing the OSGeo4W environment.
An example OSGeo4W shell command to load a folder full of GeoPackages into MS SQL Server looks like this:
for /R %G in (*.gpkg) do ogr2ogr -progress -f "MSSQLSpatial" "MSSQL:server=localhost;database=alabama;trusted_connection=yes" %G -nln "%~nG"
The most important command line options are the database connection options. You will have to make sure the user name and password are available and that the client can actually connect to the database and has all the needed permissions. For PostgreSQL on GNU/Linux, the standard PG* environment variables and the .pgpass file for storing credentials work with the ogr2ogr commands, so credentials do not have to be included in the command line.
Why do some Shapefiles say '2GB_WARN' in the file name?
The shapefile format itself has a 'soft limit' of 2 gigabytes (GB) of data. 'Soft limit' means it is just a rule of the format: "no data larger than 2GB". There is no technical limit preventing more data being encoded as a shapefile. When we export large counties, our tools inform us the resulting shapefile is over that soft limit.
We can confirm that some software handles 2GB and larger shapefiles just fine (OSGeo tools like QGIS), but some software will just silently ignore attribute data above the 2GB limit (ArcGIS). A sincere thank you to the Regrid client who did a lot of in depth research and testing on this and shared their results with all of us.
Starting in July 2020, we made the following changes to help flag the counties where the data exceeds the 2GB soft limit. Please double check how you are handling these files.
- The filenames themselves will indicate the county generated the 2GB warning on export. A _2GB_WARN suffix is added to the file names so you can know just by checking the name.
- We added a column to our verse table, named shapefile_size_flag, so you can check if a place needs a different format than Shapefile, or to generate a list of places you need to pull the alternate format for.
- Shapefiles that are larger than 2GB are not available through our data store.
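A download script can use the file name marker to decide which counties to pull in another format. This is a small sketch that only inspects names. The file names shown are hypothetical, and it assumes the _2GB_WARN marker appears somewhere in the base file name.

```python
from pathlib import Path

WARN_SUFFIX = "_2GB_WARN"


def needs_alternate_format(filenames):
    """Return the names that carry the 2GB warning marker."""
    return [name for name in filenames if WARN_SUFFIX in Path(name).name]


files = ["tx_harris_2GB_WARN.shp.zip", "tx_loving.shp.zip"]  # hypothetical names
print(needs_alternate_format(files))  # ['tx_harris_2GB_WARN.shp.zip']
```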
We recommend the GeoPackage (GeoPKG) exports for situations where you would normally use a Shapefile. These can be opened by both Open Source and ArcGIS tools. ArcGIS uses ESRI’s "Data Interoperability" extension to work with GeoPackages.
Why do my files not have the ogc_fid column listed in your parcel record schema?
The ogc_fid (Open Geospatial Consortium Feature ID) is a limited, table-by-table unique identifier in our data and is either not used or not supported by several geospatial file formats so is not included in those files. Specifically the following file formats will not include an 'ogc_fid' column: geojson, kml, shp, csv, parquet.
Typically when importing data into a database, a unique record/row/feature id will be automatically created if needed.
What about multi unit parcels or parcels with secondary addresses?
Our parcel data packages do not contain the list of every secondary unit on a parcel. If you need the full secondary address list per parcel, please contact us at parcels@regrid.com as we do have a separate nationwide, complete Matched Secondary Addresses product available.
Our parcel data attribute packages have the main, primary physical address for a parcel and the mailing address of record for the parcels. These do vary somewhat based on what the county provides, so please review our "Attribute Completeness Report":
Regrid Attribute Completeness Report
What about condominiums or cooperative buildings or parcels?
Our data is directly from county GIS and county assessor’s offices around the US and as such there can be some disparity from county to county in how condominium and cooperative buildings/parcels are handled. In many local jurisdictions condominiums and cooperative buildings are assessed and taxed individually by each owner and reflected in the related GIS data, however this is not always the case. In some instances, local authorities will assess and tax condominiums and co-ops collectively and they will be linked to a single GIS Parcel record. It is also possible that while these condominiums and co-ops are assessed and taxed independently on the assessors tax roll, these unique assessment records may not be reflected as a GIS parcel, or they may be reflected as a non-unique parcel geometry (duplicate/stack). Regrid does make an effort to add condo/co-op records from tax and assessor rolls where they are missing from the GIS when possible.
Why are some parcels duplicated or stacked parcels?
Our data is directly from county GIS and county assessor's offices around the US. They are primarily focused on collecting taxes so recording ownership and mailing addresses is their main goal. They often use their software in creative ways to get things recorded so they can track the taxes. For parcels with multiple, individual owners the counties often 'stack' parcels so they can enter different owner and contact information for each owner. The 3 most common ways are:
- Identical polygons stacked, exactly the same size parcel, just 4 or 10 or 100 stacked, all exactly on top of each other, with different attributes for the different owners. By far this is the most common way and is widespread around the US.
- Puzzle pieces, ground parcels with exact cutouts for the footprint of the building. This is common for downtown buildings. They have no intentional overlap.
- Laying condo parcels on top of a ground parcel. This is a polygon the exact size of the building, but instead of 'cut out' of the ground parcels, it just lays on top of the ground parcel, like a stacked solution, but only 2 layers: one big ground parcel, with smaller parcels stacked on the ground parcel, but spread out.
In these cases, our dataset usually does contain all of the addresses associated with the parcel, as each owner's 'parcel' usually has what is considered a primary address.
The vast majority of the counties create unique parcel numbers for each stacked parcel. The unique parcel number is a benefit of the stacked parcel approach. However, some counties will duplicate the parcel number and use a secondary id field for the sub-parcels. We think this is much more rare, and we would retain any secondary parcel numbers as a custom column in our data.
For the most common way counties handle condos and other shared ownership situations (stacked, identical parcel geometries), Regrid provides a special column to help work with stacked parcels: ll_stack_uuid. For a set of stacked parcels, a common ll_stack_uuid is assigned to every parcel in the stack, so they all share the same value. This ll_stack_uuid is generated by selecting the ll_uuid from a random parcel in the stack and then using that value to fill in the ll_stack_uuid for every parcel in the stack. This makes it easy to identify parcels that are part of a stack and all the other parcels in that same stack.
Developers and analysts can use SQL to select subsets of parcels using the ll_stack_uuid:
-- Select just the parcels that are part of a stack
SELECT *
FROM table_name
WHERE ll_stack_uuid IS NOT NULL;
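-- Select just one parcel from every stack
SELECT *
FROM (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY ll_stack_uuid ORDER BY ll_uuid) AS rn
FROM table_name
WHERE ll_stack_uuid IS NOT NULL -- skip unstacked parcels, which would otherwise form one NULL partition
) subquery
WHERE rn = 1;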
-- Select all the non stacked parcels and one parcel from each stack
SELECT *
FROM (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY COALESCE(ll_stack_uuid, ll_uuid::text) ORDER BY ll_uuid) AS rn
FROM table_name
) subquery
WHERE rn = 1;
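If the parcels are already loaded into application code rather than a database, the last query can be approximated in memory. The following is a sketch under the assumption that each parcel is a dict with ll_uuid and ll_stack_uuid keys and that unstacked parcels have an empty or missing ll_stack_uuid. The function name is hypothetical.

```python
def one_per_stack(parcels):
    """Keep every unstacked parcel and one parcel from each stack.

    The representative of a stack is the parcel with the smallest ll_uuid,
    which keeps the result stable between runs.
    """
    chosen = {}
    for parcel in parcels:
        # Unstacked parcels fall back to their own ll_uuid as the grouping key.
        key = parcel.get("ll_stack_uuid") or parcel["ll_uuid"]
        current = chosen.get(key)
        if current is None or parcel["ll_uuid"] < current["ll_uuid"]:
            chosen[key] = parcel
    return list(chosen.values())


rows = [
    {"ll_uuid": "b", "ll_stack_uuid": "a"},
    {"ll_uuid": "a", "ll_stack_uuid": "a"},
    {"ll_uuid": "c", "ll_stack_uuid": None},
]
print([p["ll_uuid"] for p in one_per_stack(rows)])  # ['a', 'c']
```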
Why do some counties have partial data?
Some counties in our dataset exclude parcels for a variety of reasons. The most common reason is non-taxable state, federal and/or tribal lands. In other places, parcel shapes may not be digitized, or the place may not distribute them. When we know a county excludes some parcels intentionally, we indicate that in the verse table in the partial column with the value "partial". No value (null) indicates the county should have all the parcels in the county.
Parcel attribute details
Why do your parcel numbers (apn, pin, etc) not look the way I expect or are null?
County assessor parcel identification numbers ('parcelnumb') are well known for being complicated and often have variable punctuation or zeros (0) that can affect searching or matching by parcel id number. Counties do occasionally change their method for generating or assigning parcel numbers and that can lead to "new" and "old" parcel number situations. We always retain any identification number attributes as 'County Custom Columns' so it should be possible to match up our data with county data directly, even if our parcel number field is not the only identification number used by the county.
Also, sometimes a State GIS source will add their own unique id to a local parcel id number.
We suggest visiting the county's website or calling the assessor's office directly to better understand their parcel numbering system. If, after contacting the county it appears we have an error in what values we have in our parcel id attribute field, please send an email to help@regrid.com and we will review our data asap.
Null and duplicated parcel numbers are common across the US. Counties do it for a variety of reasons, but the most common reason is they do not bother with parcel numbers for non taxable parcels like city parks, rights of way, etc. However, condos, timeshares, mineral or air rights, unassigned rights etc, all might use duplicated or null parcel numbers and the county uses some other attribute, combination of attributes or system for tracking them.
How do I match your parcel data with other systems or datasets?
To maximize the chances of locating a parcel in other systems or datasets, we convert all the available identification fields provided by counties to several standardized column names in our data. This makes it easy to automate checking all the known identification type fields for our county data. We recommend trying these fields in our data in the following order:
parcelnumb
parcelnumb_no_formatting
state_parcelnumb
account_number
tax_id
alt_parcelnumb1
alt_parcelnumb2
alt_parcelnumb3
Each field may contain a different identifier used by various systems. By checking these fields sequentially, you increase the likelihood of finding a match across different databases or platforms.
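The sequential check described above can be automated. The sketch below tries each identifier field in the recommended order and compares values after removing punctuation and spacing. That comparison rule is an assumption made for illustration and is not how any Regrid column is produced. It also does not handle differences in leading zeros. The function names and sample values are hypothetical.

```python
import re

# Identifier fields in the recommended order.
ID_FIELDS = [
    "parcelnumb", "parcelnumb_no_formatting", "state_parcelnumb",
    "account_number", "tax_id",
    "alt_parcelnumb1", "alt_parcelnumb2", "alt_parcelnumb3",
]


def strip_formatting(value):
    """Upper-case the value and drop everything except letters and digits."""
    return re.sub(r"[^0-9A-Za-z]", "", str(value)).upper()


def find_matching_field(parcel, external_id):
    """Return the first identifier field whose value matches, else None."""
    wanted = strip_formatting(external_id)
    if not wanted:
        return None
    for field in ID_FIELDS:
        value = parcel.get(field)
        if value is not None and strip_formatting(value) == wanted:
            return field
    return None


parcel = {"parcelnumb": None, "tax_id": "12-345-678"}
print(find_matching_field(parcel, "12345678"))  # tax_id
```

Returning the field name rather than a boolean shows which identifier the other system is actually using, which helps when matching the rest of a county.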
Additionally, we provide the following 3rd party parcel identification value fields that can be used to match parcels when available in the other system or dataset:
placekey
How do you standardize and normalize addresses?
The county-provided parcel address data is transformed to populate the typical address fields in use, address line 1, address line 2, city, state, zip. Basic text cleaning is done at that same time to remove special or non-printing characters.
Where counties have not provided full addresses for parcels, we use US Census data to fill in missing zip codes and cities.
Parcel addresses are then passed through a USPS address verification system which further normalizes and standardizes the address to USPS guidelines where a match was made against the USPS database of addresses. Non matching addresses are left as they were sent by the county.
The original county-provided raw address data is always retained unaltered and included on the parcel record in the original_address field.
Why do your parcel value fields ('parval') not look the way I expect?
Our parcel value related fields are all directly from the county assessor's data. We populate our 'improvements value' (improvval), 'land value' (landval), 'parcel value' (parval), 'ag value' (agval) and 'parcel valuation method' (parvaltype) attributes as directly from the assessor attributes as possible. However, while those are the most common value related attributes, every county has their own definition for those attributes and their own methods for how they calculate, record and display amounts for tax purposes. We cannot answer questions about why the county records the values in those attributes. We suggest visiting the county's website or calling the assessor's office directly to better understand those values. If, after contacting the county, it appears we have an error in the values in those attribute fields, please send an email to help@regrid.com and we will review our data asap.
What are the parcel value types (parvaltype)?
We generally keep the parval type provided by the county, or moderately standardize the values we get from the County into 'Assessed' and 'Market' as applicable.
The two most common types of parval the Assessors use or provide are "assessed" (or some variation of that term) and "market" (or some variation of that term).
Assessed values are valuations determined by the county or taxing authority and they can be very formula based (number of rooms, fireplace, basement, wall covering, wall construction, heating type, etc). Assessed values can take into account things like exemptions for homesteads, veterans, seniors, etc. It often takes into consideration the current market value of the property, but is usually not the same as the market value of the property.
Market valuations are much more based on the current market value of the property if it were to be sold in the current market, usually based on comparable recent sales in the area or perhaps it directly uses a recent sale price for the parcel. The tax authorities can determine the market rate, or an appraisal professional can generate it.
Equalized valuations are a less common type of parval. How they are calculated varies widely, but they are usually a formula applied to the assessed value, with tax rates adjusted to use that "equalized" value instead of the raw assessed or market valuation.
Assessed and market values are not interchangeable usually, but most parval types that have "true", "market", "full" or "appraised" in their names are generally comparable.
The less commonly used parval types like "VALUE 20", "COST", "TOTAL" are difficult to be sure what they are comparable to. We pass on these types from the county and they may be county-specific as different areas will use different systems for valuing parcels.
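When comparing values across counties, a rough grouping of parvaltype labels can help separate values that are generally comparable from those that are not. The sketch below encodes only the rule of thumb described above (labels containing "true", "market", "full" or "appraised"). The grouping names, the function name, and the sample label "Full Cash Value" are hypothetical, and any real analysis should review the labels county by county.

```python
MARKET_LIKE_TERMS = ("true", "market", "full", "appraised")


def parvaltype_group(parvaltype):
    """Roughly group a parvaltype label as 'market-like', 'assessed', or 'other'."""
    if not parvaltype:
        return "other"
    label = parvaltype.lower()
    if any(term in label for term in MARKET_LIKE_TERMS):
        return "market-like"
    if "assess" in label:
        return "assessed"
    return "other"


for label in ["Market", "ASSESSED", "Full Cash Value", "VALUE 20", None]:
    print(label, "->", parvaltype_group(label))
```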
Building Data
What are your premium schema building footprints attributes?
Our premium schema building footprints attributes include ll_bldg_footprint_sqft (total building footprint in square feet) and ll_bldg_count (total number of buildings on the parcel) which are calculated by Regrid using the nationwide building footprints data provided by EarthDefine.
Please note that these fields do not provide the actual building footprint geometries. Our Premium Parcel Data + Matched Building Footprints dataset provides the full spatial dataset of nationwide building footprints joined with our parcel data. See below to learn more about that dataset.
How are these attributes different from the ones included in the matched building footprints schema?
Our Premium Parcel Data + Matched Building Footprints dataset combines 156+ million nationwide building footprints geometries with our 152+ million nationwide parcel data as one solution. The matched buildings schema comes with the building geometries along with relevant buildings data attributes such as building uuid, structure uuid, and much more. See our Matched Buildings Schema. We provide this data with a join table - every building and structure in the buildings dataset is matched with our parcel uuid. If you are interested in learning more about this combined parcel data solution, please contact us at parcels@regrid.com.
How do you determine the Regrid Building Count value (ll_bldg_count)?
We work with EarthDefine, an industry leading Machine Learning and AI firm that specializes in feature extraction from high-resolution imagery and point clouds, to generate a comprehensive, nationwide, seamless building footprint data set across the US.
We then process that footprint data set with our parcel shapes data set to determine how many buildings are on the parcel and how many square feet of parcel are covered by buildings.
Please see our Matched Buildings Product FAQ or contact us at help@regrid.com if you have any questions related to buildings.
How do you calculate the parcel acres, parcel square feet, and building footprint square feet (ll_gisacre, ll_gissqft, ll_bldg_footprint_sqft)?
We project each parcel and building footprint into its UTM Zone SRS, calculate the area in square meters, and convert that to acres and square feet. This should provide a relatively uniform and consistent value across the US. The ll_bldg_footprint_sqft value for a building that covers more than one parcel is only for the portion of the building that is on the single parcel. For example, a 10,000 square foot building footprint might only have 500 square feet of building footprint on a parcel, so that parcel's ll_bldg_footprint_sqft value would be 500.
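To reproduce a similar calculation on your own geometries, the two pieces that do not need a GIS library are picking the UTM zone and converting units. The sketch below assumes the WGS84 UTM EPSG codes (326xx north, 327xx south). The exact UTM definition used for the published values is not stated here, so treat that as an assumption. The projection step itself is left to your GIS tool.

```python
import math

SQ_METERS_PER_ACRE = 4046.8564224   # international acre
SQ_METERS_PER_SQFT = 0.09290304     # international foot, squared


def utm_epsg(lon, lat):
    """Return the EPSG code of the WGS84 UTM zone containing (lon, lat)."""
    zone = min(int(math.floor((lon + 180.0) / 6.0)) + 1, 60)
    return (32600 if lat >= 0 else 32700) + zone


def area_units(square_meters):
    """Convert an area in square meters to (acres, square feet)."""
    return square_meters / SQ_METERS_PER_ACRE, square_meters / SQ_METERS_PER_SQFT


print(utm_epsg(-83.05, 42.33))           # 32617
acres, sqft = area_units(4046.8564224)
print(round(acres, 6), round(sqft, 2))   # 1.0 43560.0
```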
API Access
What is an API and why might I use one?
API stands for Application Programming Interface, and is for use by software developers to interact directly with our national dataset. They are usually used when someone wants to automate some process that interacts with our national parcel dataset. All APIs require programming work to make them useful for any specific application.
What APIs does Regrid make available?
We provide two different APIs for working with our data. Please remember, all APIs are intended for use by software developers and are not for end users’ direct use.
Tile Map Service (TMS) Layer - This is an interactive, vector and raster rendering of our dataset, for use in web or desktop applications where using the parcel shapes overlaid on other layers is useful. Regrid’s TMS layer is in Mapbox Vector Tile (.mvt) format. https://docs.mapbox.com/vector-tiles/reference/
RESTful Parcel API - This is a typical stateless, client/server based API that supports retrieving parcel shape and meta data using lat / lon coordinates, among other options. Please see Using The Api for a complete reference to the API.
Miscellaneous
How do I download Regrid Parcel data?
We use the "Secure File Transfer Protocol", also called SFTP. This is supported in most traditional FTP clients and SSH client software.
Both of the clients listed below are multi protocol and also support connecting to services like S3.
Windows users can use WinSCP, which also supports scripting and .NET/PowerShell integration.
macOS users can use Cyberduck.
Why does Regrid have different county names for Connecticut?
The state of Connecticut and US Census Bureau redefined all of the state's counties, called Planning Regions, in 2022. As part of that process, all the county FIPS / geoid codes changed as well. We updated our data to reflect these new geoids in December 2022. The old FIPS/geoid is still attached to every parcel as the first 5 digits of the census_block attribute. According to the Census Bureau, those will remain unchanged until 2030. For more background, please see the extensive writeup of the changes in the Federal Register.
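If another system still keys Connecticut records on the old county codes, the legacy code can be recovered from census_block as described above. This is a minimal sketch. The function name is hypothetical and the sample parcel values are illustrative only, not taken from the dataset.

```python
def legacy_county_fips(census_block):
    """Return the first 5 digits of a census block ID (state + county FIPS)."""
    text = str(census_block).strip()
    if len(text) < 5 or not text[:5].isdigit():
        raise ValueError(f"unexpected census_block value: {census_block!r}")
    return text[:5]


# Illustrative values only: a parcel whose geoid is a planning region code
# while its census_block still starts with the pre-2022 county code.
parcel = {"geoid": "09190", "census_block": "090010101011000"}
print(legacy_county_fips(parcel["census_block"]))  # 09001
```

Read census_block as text rather than a number so the leading zero of the Connecticut state code is preserved.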
Regrid Standardized Land Use Codes - Classification
Please visit our Regrid Standardized Land Use Codes specific documentation.