Talk:AcquiringData
Welcome to Acquiring Data
The purpose of this page is to answer any questions that community members have about acquiring data to use in their analysis. This could include questions about:
- how to navigate a FOIL system or process
- how to ask questions to get the right data
- how to deal with challenging issues in acquiring data
- how to file public information request appeals
All questions are valued here! For anyone replying with assistance, remember that our goal is to assist and help.
Best Technical Practices
While there is some good starter content about acquiring public data here, there is a lot of additional advice that we can provide that is specific to the technical side of getting useful data.
In a perfect world this is what we get: a dataset in a machine-readable form (XLS, CSV, SHP), with ample documentation, an understanding of the limitations of the data, and other helpful metadata (the vintage or time period of the data, the owner of the data, any changes to the data over time, etc.). However, we frequently don't receive this robust data and context. We may get "malicious compliance", where someone responds to a FOIL request but sends the data as scanned PDFs to deliberately make it difficult to use, or the supplemental information may simply not exist off the shelf.
We think the way to increase your chances of receiving the data that you need is a combination of the golden rule (plenty of pleases and thank yous) and a detailed ask. Your detailed ask might look like a wish list, but why not ask for everything? For example, we may submit a FOIL request for crime data that looks like this:
"Please provide the individual incident data for crimes in our city from 2015-2025. If the data are not available for any portion of that range, would you be able to explain the reasoning? For the schema or layout, please provide the data in machine-readable form (CSV, XLS) and not PDF format. Our expectation is that the crime data, for each incident, has time, date, address (or other location), FBI Uniform Crime Reporting category, disposition of the incident, demographics, and potentially some local-specific categorization. If possible, please provide any documentation associated with the fields. If there is any commentary that you can provide on how underlying methods for incident reporting have changed, that would be great to see. Lastly, please provide a contact name and email that I can circle back to with any questions. Thank you so much!" KarlTyche (talk) 16:28, 16 November 2025 (UTC)
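Once a response arrives, one quick way to check it against a detailed ask like the one above is to compare the file's header row with the fields that were requested. Here is a minimal sketch in Python, assuming the agency sent a CSV. The column names and the `missing_fields` helper are hypothetical, made up for illustration; a real export will use its own headers, so the expected list would need to be adjusted.

```python
import csv
import io

# Hypothetical column names based on the sample request; a real agency
# export will almost certainly label its fields differently.
EXPECTED_FIELDS = ["date", "time", "address", "ucr_category", "disposition"]


def missing_fields(csv_text, expected=EXPECTED_FIELDS):
    """Return the expected fields that are absent from the CSV header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty file yields an empty header
    present = {name.strip().lower() for name in header}
    return [field for field in expected if field not in present]


# Made-up sample data, only to show the check running.
sample = "Date,Time,Address,Disposition\n2015-01-01,13:05,100 Main St,Closed\n"
print(missing_fields(sample))  # prints ['ucr_category']
```

Any field that comes back as missing is something concrete to raise with the contact person named in the response.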
"We can't give you the data, a 3rd party owns it?"
You may ask your local government for a dataset that was constructed, in some manner, by a 3rd party. It is now quite common for a local government to have a vendor responsible for a task or process; the task or process generates data, and the 3rd party houses or processes that data. Then, when you request the data, you get a response like this: "We can't give you the data, a 3rd party owns it?"
We can't speak for the regulations in every state; however, we know that in the State of NY, data that is built/stored/governed by a 3rd party is just as available via a FOIL or public information request as data that is created by the local government. FOIL applies to records created "with, by or for an agency by a 3rd party" (there is a precedent case: see Encore College Bookstores, Inc. v. Auxiliary Services Corporation of the State University of New York at Farmingdale, ___ NY 2d ___, December 27, 1995).
While we can't speak for the laws in other states, we believe that 3rd party housing (and perceived ownership) should not stop anyone seeking to acquire data. Let us know if you hit this issue.