Heads up developers, some census data just became easier to use

lterrat's bookmarks 2017-03-12

Summary:

"People have to overcome a steep learning curve to use granular data from the American Community Survey — perhaps the Census Bureau’s best-known product — because of its structure and lack of metadata.

Historically only available in common tabular file formats like CSV, the dataset requires reference to a separate dictionary document to understand it. But now, developers and data scientists will be able to more easily use the ACS data and build apps from it because it has been transformed into linked data, the Census Chief Marketing Officer Jeff Meisel announced Saturday during a panel at the SXSW Conference in Austin.

The Austin-based data.world, funded by the National Science Foundation, brought on then-graduate student Jonathan Ortiz to address problems with the Public Use Microdata Sample, as it’s called.

'What comes to you in the microdata survey file … is essentially just: one piece is the CSV, which has coded values throughout, and you constantly have to refer back and forth to the data dictionary,' said Ortiz, who now works as a data scientist for data.world, in an interview with FedScoop. 'And the data dictionary is a human-readable document, it’s not computer-readable at all.'

But semantic technology allows users to 'put that metadata into the data itself so that you’re consuming both at the same time, and you’re also able to use unique identifiers for each of the data resources in that data so the computer can actually understand them, make sense of them.'

The tradeoff in getting the metadata is that 'the size of the data explodes when you start incorporating all this other information.'

To address the storage issue, Amazon Web Services is making it available as an AWS public dataset: Anyone can then analyze the data in the cloud without downloading or storing a copy. The old formats will still be available, Ortiz said. Most spreadsheet programs can easily read a CSV file."
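
As a rough illustration of the idea Ortiz describes, the sketch below turns a coded CSV plus a machine-readable stand-in for the data dictionary into subject/predicate/object triples, so that each value carries an identifier and a label. Everything here is an assumption for illustration: the column names, the tiny sample, the `http://example.org/pums/` namespace, and the `to_triples` helper are hypothetical and are not the actual data.world or Census Bureau schema.

```python
import csv
import io

# Hypothetical coded extract; real PUMS files have many more columns.
CODED_CSV = "person_id,SEX,STATE\n1,2,48\n2,1,06\n"

# Hypothetical machine-readable version of the human-readable dictionary.
DATA_DICTIONARY = {
    "SEX": {"label": "Sex", "codes": {"1": "Male", "2": "Female"}},
    "STATE": {"label": "State of residence",
              "codes": {"48": "Texas", "06": "California"}},
}

BASE = "http://example.org/pums/"  # placeholder namespace, not a real one


def to_triples(csv_text, dictionary, base=BASE):
    """Return (subject, predicate, object) triples with labels embedded."""
    triples = []
    seen = set()

    def add(triple):
        # Emit each distinct triple once, preserving first-seen order.
        if triple not in seen:
            seen.add(triple)
            triples.append(triple)

    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"{base}person/{row['person_id']}"
        for column, spec in dictionary.items():
            code = row[column]
            predicate = f"{base}variable/{column}"
            value = f"{base}code/{column}/{code}"
            add((subject, predicate, value))            # the data itself
            add((predicate, "label", spec["label"]))    # variable metadata
            add((value, "label", spec["codes"][code]))  # code metadata
    return triples


triples = to_triples(CODED_CSV, DATA_DICTIONARY)
serialized = "\n".join(" ".join(t) for t in triples)
print(len(CODED_CSV), "characters as CSV vs", len(serialized), "as triples")
```

Even on two rows, the serialized triples are several times longer than the CSV, which is a small-scale picture of the size tradeoff mentioned in the article.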

Link:

https://www.fedscoop.com/heads-developers-census-data-just-became-easier-use/

From feeds:

Open Access Tracking Project (OATP) » lterrat's bookmarks

Date tagged:

03/12/2017, 22:53

Date published:

03/12/2017, 18:53