Talk:AcquiringData
Welcome to Acquiring Data
The purpose of this page is to answer any questions that community members have about acquiring data to use in your analysis. This could include questions about:
- how to navigate a FOIL system or process
- how to ask questions to get the right data
- how to deal with challenging issues in acquiring data
All questions are valued here! For anyone replying with assistance, remember that our goal is to assist and help. KarlLDS (talk) 15:48, 2 September 2025 (UTC)
Best technical practices
There is some good starter content about acquiring public data here, but there is a lot of additional advice we can provide that is specific to the technical side of getting data that is useful.
In a perfect world, this is what we get: a dataset in a machine-readable form (XLS, CSV, SHP), with ample documentation, an understanding of the limitations of the data, and other helpful metadata (the vintage or time period of the data, the owner of the data, any changes to the data over time, etc.). However, we frequently don't receive this robust data and context. We may get "malicious compliance", where someone responds to a FOIL request but sends the data as scanned PDFs to deliberately make it difficult to use. In other cases, no such supplemental information exists off the shelf.
We think the best way to increase your chances of receiving the data that you need is a combination of the golden rule (plenty of pleases and thank-yous) and a detailed ask. Your detailed ask might look like a wish list, but why not ask for everything? For example, we might submit a FOIL request for crime data that looks like this:
"Please provide the individual incident data for crimes in our city from 2015-2025. If the data are not available for any portion of that range, would you be able to explain the reasoning? For the schema or layout, please provide the data in machine-readable form (CSV, XLS) and not PDF format. Our expectation is that the crime data, for each incident, has time, date, address (or other location), FBI Uniform Crime Reporting category, disposition of the incident, demographics, and potentially some local-specific categorization. If possible, please provide any documentation associated with the fields. If there is any commentary that you can provide on how underlying methods for incident reporting have changed, that would be great to see. Lastly, please provide a contact name and email that I can circle back with to ask any questions. Thank you so much!"
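Once a response arrives, it can help to check it against the wish list before starting any analysis. The following is only a minimal sketch in Python, assuming the agency delivers a CSV with a header row and ISO-style dates (YYYY-MM-DD). The column names in EXPECTED_FIELDS and the helper names check_delivery and missing_years are hypothetical placeholders; a real export will almost certainly use different labels, so adjust them to match what you receive.

```python
import csv
import io

# Hypothetical column names taken from the wish list in the sample request.
# A real agency export will likely label these differently.
EXPECTED_FIELDS = {
    "incident_date",
    "incident_time",
    "address",
    "ucr_category",
    "disposition",
}


def check_delivery(csv_text, expected=EXPECTED_FIELDS, date_field="incident_date"):
    """Return (missing_fields, years_present) for a delivered CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Compare headers case-insensitively and ignore stray whitespace.
    headers = [h.strip().lower() for h in (reader.fieldnames or [])]
    missing = sorted(set(expected) - set(headers))

    years = set()
    for row in reader:
        normalized = {k.strip().lower(): v for k, v in row.items() if k is not None}
        value = (normalized.get(date_field) or "").strip()
        # Assumes ISO dates (YYYY-MM-DD); values in other formats are skipped.
        if len(value) >= 4 and value[:4].isdigit():
            years.add(int(value[:4]))
    return missing, sorted(years)


def missing_years(years, start=2015, end=2025):
    """List the requested years that have no incidents in the delivery."""
    present = set(years)
    return [y for y in range(start, end + 1) if y not in present]


# Made-up sample rows, for illustration only.
sample = (
    "Incident_Date,Incident_Time,Address,Disposition\n"
    "2015-03-01,14:05,100 Main St,Closed\n"
    "2017-07-19,09:30,22 Oak Ave,Open\n"
)

missing, years = check_delivery(sample)
print("Fields to follow up on:", missing)
print("Years with no incidents:", missing_years(years))
```

Any missing fields or empty years are good material for the follow-up questions you send to the contact person named in the response. An empty year may mean the data does not exist for that period, which is exactly the kind of gap the request asks the agency to explain.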