Data Updates
Overview
Regrid processes raw data from thousands of sources. Working on a rolling basis, we update batches of counties from raw source data. We run hundreds of automated and manual cleaning and enhancement steps on every county to prepare our nationwide parcel product. Roughly every two weeks, we deliver an updated batch of counties to our customers. We also make dataset-wide improvements and fixes that we deploy when they are available.
With 100% coverage of all United States counties, Regrid is committed to keeping our product data up to date. We target updating over 3,000 counties (92%) at least annually. Additionally, Regrid targets about 600 counties for at least 2 updates per year, and 200 of those 600 are updated 4 times per year. These 600 and 200 counties represent approximately 45% and 27% of all parcels nationwide, respectively.
How Regrid provides bulk data updates
Bulk files
We deliver our data files through our SFTP and cloud buckets. We generally push updated files to SFTP/cloud every two weeks. In some cases, we make corrections or significant improvements and push new data files to the SFTP/cloud at other times. This means that you will always have access to the freshest data.
Other update schedules
Monthly, all Premium schema files are updated for every county in the country, even when we have not updated the county data from the source. Premium schema contains fields that benefit from monthly updates across the country. For example, parcel addresses are validated and the vacancy indicator is updated monthly.
Quarterly, every file across all tiers and schemas is re-exported. We make improvements to the nationwide data set on a regular basis, often based on feedback from clients. Quarterly full-dataset exports ensure every data tier has all of the improvements, even if the data was not refreshed directly from the county that quarter. We are working to provide an easily shareable changelog for dataset-wide fixes, but currently we announce significant updates in our monthly data update emails.
Feature Service, API and tileserver
The Regrid Feature Service, API, and tileserver always pull from our latest data. No action is required if you are using one of these services.
Contents of updates
Each time we update a county, we replace the entire existing county file. We do not provide 'diff' files or deltas for the parcel data.
Updates typically include 100 - 200 counties delivered roughly every two weeks that have been refreshed from source data, as well as other fixes and improvements made as needed.
We send out an email each month listing the counties that were refreshed during the previous month, as well as a listing of the states we are targeting for update over the next two months. The email also contains a link to our coverage report, which is a listing of all the counties and their last_refresh date, which is the last date we updated the data directly from the source. You can subscribe to that email here.
How to update
We recommend replacing each updated county with a full new copy of the county.
We do not track changes at any level below the county level, but with our ll_uuid it should be possible to determine if you need to update, add, or remove a row in your database if you do not want to replace the whole county.
That same ll_uuid should be used if you need a permanent, nationwide unique id for a parcel, and to match up with data you might attach to our parcel data in your usage.
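As a rough illustration of the row-level approach, the sketch below compares locally stored rows with a freshly downloaded county file, both keyed by ll_uuid. The function name plan_county_sync and the dictionary layout are assumptions made for this example, not part of any Regrid tooling, and comparing whole rows for equality is only one possible way to detect changed parcels.

```python
from typing import Any, Dict, Mapping, Set


def plan_county_sync(
    local_rows: Mapping[str, Mapping[str, Any]],
    fresh_rows: Mapping[str, Mapping[str, Any]],
) -> Dict[str, Set[str]]:
    """Decide which ll_uuid values to add, update, or remove for one county.

    Both arguments map ll_uuid -> row attributes (hypothetical layout).
    """
    local_ids = set(local_rows)
    fresh_ids = set(fresh_rows)

    # Present only in the new county file: insert these rows.
    to_add = fresh_ids - local_ids
    # Present only in the local database: the parcel is gone, delete it.
    to_remove = local_ids - fresh_ids
    # Present in both: update only if any attribute differs.
    to_update = {
        uuid for uuid in local_ids & fresh_ids
        if dict(local_rows[uuid]) != dict(fresh_rows[uuid])
    }
    return {"add": to_add, "update": to_update, "remove": to_remove}
```

Replacing the whole county remains the simpler path; a plan like this is mainly useful when other tables in your database reference parcel rows.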
When to update
We advise one of two timelines:
- To have the freshest data, you can build your data loading system to check for updates daily. If there are updated files, those should be processed. On many days there will be no refreshed files; however, on some days there will be changes, and these changes are likely important. This automated approach lets you get the freshest data as soon as we have it, with minimal manual work.
- If you prefer a set date every month, we suggest the 7th of every month. Because some of our monthly processes vary in duration, we cannot commit to an exact completion date. However, we always expect a month's processing to be fully completed and available in your download directories by the 7th day of the following month.
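A daily check could be sketched along these lines. The example assumes your loader can already produce a listing of remote file names with their modification times (for instance from an SFTP or cloud bucket listing) and that it records the times it has already processed. The function name files_to_process and both dictionaries are hypothetical, and using modification time as the change signal is an assumption of this sketch rather than a documented Regrid convention.

```python
from typing import Dict, List


def files_to_process(remote: Dict[str, float], seen: Dict[str, float]) -> List[str]:
    """Return remote file names that are new or newer than what was processed.

    remote: file name -> modification time from today's listing (assumed input).
    seen:   file name -> modification time recorded at the last successful load.
    """
    return sorted(
        name
        for name, mtime in remote.items()
        # A file never seen before compares against -infinity, so it is always picked up.
        if mtime > seen.get(name, float("-inf"))
    )
```

On most days the returned list would be empty, and the loader can exit without doing any work.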
Metadata
We provide our verse tracking table (metadata) in many formats. It has key metadata, including county name, state+county FIPS code (the geoid column), the date we last refreshed data directly from the source (the last_refresh column), and a filename_stem column that gives the basename of the file, with no format extension or .zip suffix. You can see more about the verse metadata here.
We update the verse file whenever we update county data.
Improvements Regrid makes to the data do not affect the last_refresh of a county. The last_refresh is always the date we last pulled data directly from the county source.
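To show how the metadata columns might be used, here is a minimal sketch that reads a CSV copy of the verse table and lists counties whose last_refresh is later than a cutoff date. It assumes the CSV has geoid, last_refresh, and filename_stem headers and that last_refresh begins with an ISO date (YYYY-MM-DD); the function name counties_refreshed_since is hypothetical. Because last_refresh only tracks pulls from the county source, this will not surface files that changed through monthly or quarterly re-exports.

```python
import csv
import io
from datetime import date
from typing import Dict, List


def counties_refreshed_since(verse_csv_text: str, cutoff: date) -> List[Dict[str, str]]:
    """Return verse rows whose last_refresh date is strictly after cutoff."""
    reader = csv.DictReader(io.StringIO(verse_csv_text))
    refreshed = []
    for row in reader:
        raw = (row.get("last_refresh") or "").strip()
        if not raw:
            continue  # skip counties with no recorded refresh date
        # Assumption: the value starts with YYYY-MM-DD; any time part is ignored.
        if date.fromisoformat(raw[:10]) > cutoff:
            refreshed.append(
                {
                    "geoid": row["geoid"],
                    "filename_stem": row["filename_stem"],
                    "last_refresh": raw,
                }
            )
    return refreshed
```

The filename_stem of each returned row can then be combined with your chosen format extension to locate the county file to download.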
How files are organized
All bulk data is provided via SFTP or pushed to client cloud storage as compressed files (.zip and .gz) of each county in the format of your choice. We organize files on a county-by-county basis using the county geoid (FIPS code). We refresh 200 - 500 counties every month with data directly from the county source.
You can use the Verse metadata, described above, to find the predictable filename_stem corresponding to any county you are looking for.
We also provide our Verse metadata table that lists the last_refresh date for each county.
On each parcel, we provide a unique ID (ll_uuid) that permanently and uniquely identifies each parcel across data refreshes and updates. The ll_uuid can be used to match any locally stored parcels with updated parcels in the county file.
The ll_uuid is tied to the county-provided parcel numbers. Parcel revisions such as splits and other modifications that result in new parcel numbers will get new ll_uuid values.