Open Data is a big idea, and it means different things to different people. The most common definition concerns government-owned data sets that are published online under a license that allows for free dissemination, modification, and re-use. The label ‘open’ becomes more complicated when we realize that even with a permissive license, data isn’t even remotely useful (and thus not truly open) unless it is properly organized, annotated, and saved in a format that is reasonably interoperable across software. Poor data quality can also present issues: a high percentage of inaccuracies, missing records, or messy taxonomies can make it difficult to draw reliable conclusions. So what is truly Open Data?
Some people feel we should set a high bar for openness. In 2007, a group of influential Open Data advocates defined open government data with eight principles: complete, primary, timely, accessible, processable, non-discriminatory, non-proprietary, and license-free.
Tim Berners-Lee offers a more technical definition with the 5 stars of Open Data ...
The Open Knowledge Foundation dedicated an entire website, opendefinition.org, to defining openness with an 11-point legalistic definition, summarized as follows: 'A piece of data or content is open if anyone is free to use, reuse, and redistribute it — subject only, at most, to the requirement to attribute and/or share-alike.'
Defining openness precisely can be helpful. It lets us categorize content as open, not open, or open to the nth degree. When asking a government to invest in an open data policy, the ask is clear: please conform your data to this specification. Clear guidelines help to inform the growing community of open publishers, scientists, artists, and other content creators. And the technology powering open content evolves more deliberately when the theory underpinning its use is spelled out.
There can also be drawbacks to being so explicit about definitions. Consider a couple of examples ...
To open data means to apply any combination of open principles to achieve one’s goals in the context of a particular situation. What’s different about this approach is (1) the context and (2) the goals. These vary from project to project because, unlike a definition, they are idiosyncratic and they change. We can still consider degrees of openness with respect to definitions. But we must give equal weight to the appropriateness of an approach in a given context and to the merits of the intended outcomes.
When using or building an open data site or app, ask yourself: who is this built for, to do what with, and why? Please don’t only ask: is the data open enough? ...