Intro to the Open Data Toolkit
The Open Data Toolkit is designed to help governments and open data enthusiasts understand the basic concepts of Open Data, how to plan and implement an open government data programme and some tricks on how to approach a dataset.
Getting started with Open Data can be easier than many people think. If you have data in a re-usable electronic form, a publicly accessible domain to put it on and an open license, then you’re on the right path.
This toolkit provides a comprehensive step by step approach to help government agencies release more open data in a simplified manner.
Learn about Open Data
What is Open Data?
According to the Open Data Institute, “Open data is data that anyone can access, use or share. Simple as that. When big companies or governments release non-personal data, it enables small businesses, citizens and medical researchers to develop resources which make crucial improvements to their communities.”
What is Open Government Data?
Open government data means:
- Data produced or commissioned by government or government controlled entities
- Data which is open as defined in the Open Definition – that is, it can be freely used, reused and redistributed by anyone.
What is machine readable data?
Data in a data format that can be automatically read and processed by a computer, such as CSV, JSON, XML, etc. Machine-readable data must be structured data.
Some human-readable formats, such as PDF, are not machine-readable as they are not structured data, i.e. the representation of the data on disk does not represent the actual relationships present in the data.
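To make the distinction concrete, here is a minimal Python sketch using only the standard library. The sample records (dog walking zones) and the column names are invented for illustration. It shows that once data is in a structured format such as CSV or JSON, a program can address each field by name without guessing at the layout.

```python
import csv
import io
import json

# Hypothetical sample data, invented for illustration only.
CSV_TEXT = """zone_id,name,off_leash
1,Riverside Park,yes
2,Hilltop Reserve,no
"""

JSON_TEXT = '[{"zone_id": 1, "name": "Riverside Park", "off_leash": true}]'


def read_csv_records(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def read_json_records(text):
    """Parse a JSON array of objects into a list of dicts."""
    return json.loads(text)


csv_records = read_csv_records(CSV_TEXT)
json_records = read_json_records(JSON_TEXT)

# Because the structure is explicit, fields can be selected by name.
off_leash_zones = [row["name"] for row in csv_records if row["off_leash"] == "yes"]
print(off_leash_zones)
print(json_records[0]["name"])
```

The same selection against a PDF would first require extracting and re-structuring the text, which is exactly the work that publishing in a machine-readable format saves every user.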
Why Open Data?
Releasing Open Data is not easy. It requires time and resources, and it can be a challenge to stay on the right side of the fine line between transparency and privacy. So if it is challenging and costs money, the question is: is it worth it?
Take a moment to watch this video about the potential of open data
This is just the tip of the iceberg; there is much more to this.
Some of the key areas where Open Government Data is creating value are:
- Transparency and democratic control
- Improved or new private products and services
- Improved efficiency and effectiveness of government services
- Impact measurement of policies
- New knowledge from combined data sources and patterns in large data volumes
A few examples that validate the above:
- The UK ‘Where does my money go’ site shows users how their tax money is being spent by government
- In Canada, open data helped save $3.2 billion in charity tax fraud.
- Various websites, such as the Danish folketsting.dk, track activity in parliament and the law-making process, so you can see exactly what is happening and which parliamentarians are involved.
For more examples, see here.
In summary, key benefits of open data include:
- Promoting greater transparency and engagement between government and communities by allowing data to be analysed and visualised in unique and different ways (where government may not always have the expertise or resources to do these). This can lead to a more engaged, connected and informed community and can help highlight some of the work happening behind the scenes to collect and manage public data.
- Facilitating social and commercial innovation, by allowing the growth of new business and service models that rely on open data.
- Improving service delivery and community satisfaction by allowing citizens to interact with public government data via online interfaces or community-developed apps.
In the computer world there is something called Linus’ Law, which states: “given enough eyeballs, all bugs (problems) are shallow.” Parallels can be drawn with almost any field, open data included. There can also be long-term or unanticipated benefits to opening data, because it is not always possible to predict the kinds of innovation that may evolve in response to its release.
Examples of Open Data usage
So, you think Open Data sounds like a great proposition but you’re still keen to know what can actually be achieved with open data? Surely people aren’t interested in X dataset are they?
Let us introduce you to a few interesting uses of Open Data that have popped up over the past few years. They may seem peculiar but there’s no doubt that they’ve definitely challenged the assumptions about how useful any given data set may be!
Adopt a what? Yup, you read it right. The precise location of fire hydrants in the City of Boston may put you to sleep at first, but what if someone took that data and made something not only enjoyable but also beneficial to the city?
A typical snow storm in Boston can blanket the city for days. Residents shovel their driveways, but hydrants and other utilities can stay buried. In a life or death situation like a burning building, a firefighter may not have time to dig out a hydrant, let alone spend the time necessary to locate it under such heavy snowfall.
This was exactly the type of problem solved by Adopt A Hydrant using Open Data. Citizens of Boston could ‘adopt’ a hydrant of their own which they kept clean and shovelled. If they didn’t keep up their responsibilities in a timely manner, neighbours would have the opportunity to “steal” their hydrant away from them!
You can read more about the story here but it’s a good example of how you can “gamify” data and get citizens engaged in city affairs in a way that they otherwise wouldn’t have.
Best of all, it’s open source as well, which means anyone can take the application and, with a bit of retooling, adapt it to fit their city too. There have been a few interesting spin-offs already, like Adopt A Siren in Honolulu.
What do you do if you’re a city that has an ever-expanding number of trees but not enough resources to monitor them all? Give them email addresses!
There have been other alternatives but this crafty application of data and technology produced a website that let Melbourne citizens email specific trees to report issues like broken branches or fallen leaves.
Funnily enough, citizens also started expressing their affection for their favourite trees as well, writing in to say how much they adored their luscious leaves or that they were upset about trucks scraping the undersides of branches.
These are just a few examples of how Open Data can be utilised, and even disguised as a novelty, which makes it much more attractive than something like a survey.
Open Government Partnership
The Open Government Partnership is a multilateral initiative that aims to secure concrete commitments from governments to promote transparency, empower citizens, fight corruption, and harness new technologies to strengthen governance. In the spirit of multi-stakeholder collaboration, OGP is overseen by a Steering Committee including representatives of governments and civil society organizations.
To become a member of OGP, participating countries must endorse a high-level Open Government Declaration, deliver a country action plan developed with public consultation, and commit to independent reporting on their progress going forward.
To learn how to become a member, read more here.
What does it mean for New Zealand?
New Zealand Data and Information Management Principles
The principles under which the New Zealand Government releases its open data are called the New Zealand Data and Information Management Principles (NZDIMP).
The core principles of NZDIMP are:
- Open
- Protected
- Readily available
- Trusted and authoritative
- Well managed
- Reasonably priced
- Reusable
Should NZ adopt the Open Data Charter?
This is not a simple question and doesn’t have a straightforward answer. There has been a lot of discussion around it; to take part and read more, follow the links below.
Laying the Groundwork
Launching an Open Data initiative doesn’t have to be a tricky endeavour, but rushing into it without proper planning and assessment beforehand may cause pain later on in the process.
Defining the goals of your Open Data initiative
Your Open Data initiative should align with the larger strategic goals and objectives that your organisation strives towards. It may be tempting to “do open data” for the sake of it, but to have the most lasting impact, you should have a greater goal in mind.
You should be clear from the beginning as you define your Open Data goals. This will aid your team immensely when articulating your vision to potential stakeholders.
You should ask yourself:
- What makes Open Data important for your agency?
- What would a successful Open Data initiative look like?
- What existing needs and priorities does your organisation have that can be supported by Open Data?
For example: Public trust and confidence and being able to work closely with well-informed communities are two things critical to Police. Open data is now a key contributor to Police achieving both these goals.
Common goals and outcomes for Open Data
Depending on the values, priorities and resources afforded to your Open Data initiative, you may have a single goal or multiple. Here are some examples of common goals:
- Saving staff hours through increased sharing of information between departments.
- Enabling citizens to gain a better understanding of government activity, promoting economic development and improving quality of life.
- Providing greater availability, and awareness, of data used in data-driven decision-making.
- Providing a basis for a local civic technology ecosystem.
- Reducing the volume of incoming Official Information Act requests by making highly requested, non-sensitive information available in a self-serve format.
Aligning with organisational goals
A key avenue for generating initial buy-in from leadership is to show how embracing open data can drive progress on high priority issues. You should consider how open data can support the priorities of high-level policies that are relevant to your context.
It could be aiding in creating jobs, reducing vacant and abandoned properties, or increasing government transparency. The way you pitch your Open Data plan can help get some quick wins under your belt and get on your way to a longer-term initiative.
You may find it useful to review recent high-profile speeches and statements from your agency leadership to identify some potential key issues your initiative may assist with.
Don’t be afraid to be creative with your solutions as open data can have impact in the most unexpected areas.
Open Data Toolkit
This guide will be most useful to IT managers, GIS coordinators, asset managers and database administrators.
The focus of this guide will be generic, easily uploadable datasets such as CSVs, spreadsheets and so on.
Listing your dataset on data.govt.nz
The general process for uploading a dataset is as follows:
- Identify datasets to publish
- Export the data and ensure it is clean and machine-readable
- Publish the data (make it accessible on the internet as a file to download or via an API/web service)
- Apply an appropriate license
- List the dataset on data.govt.nz, providing metadata
- Schedule regular updates
For the first dataset, pick something that is likely to be useful straight away. The data needs to be good quality as well as being free from any privacy or confidentiality issues.
Some examples of good starting datasets that don’t change frequently include drain pipes, waste collection zones, dog walking zones, and customer service centre locations.
Preparing the Datasets
In order to prepare data sets to maximize their usefulness, it’s important to acknowledge the different types of formats available as well as the limitations that they may present:
| Format | Suitability for open data | Notes |
|---|---|---|
| CSV | High | A versatile format for opening structured data (e.g. spreadsheets) |
| XLS or XLSX | Low | Limits machine reading and use on non-Microsoft systems |
| KML | High | An open standard developed for Google Earth. May not translate to other systems. KMZ is also available as a packaged set of KML files. |
| GeoJSON | High | A form of JSON that caters for simple geospatial attributes |
| WMS | High | Standardized format for georeferenced map images |
| WFS | High | Standardized format for geographical features |
| TXT | High | Simple text format readable on most operating systems. No formatting is available |
| RTF | High | Simple text format readable on most operating systems which retains some formatting |
| ODT | Medium | Limits machine reading |
| DOC or DOCX | Low | Limits machine reading and use on non-Microsoft systems |
| PDF | Low | Useful for document exchange to preserve formatting, but has limitations for machine reading, character recognition and remixing. |
| XML | High | Extensible Markup Language (XML) is a markup language that defines a set of rules for encoding data using tags, attributes and content |
| RSS/ATOM | High | An XML based standard used in data feeds/web services |
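As an illustration of moving between two of the formats in the table, the sketch below converts a small CSV of point locations into GeoJSON using only the Python standard library. The service centre names, coordinates and column names are hypothetical. One detail worth knowing is that GeoJSON lists coordinates as longitude first, then latitude.

```python
import csv
import io
import json

# Hypothetical service-centre locations, invented for illustration.
CSV_TEXT = """name,latitude,longitude
Central Service Centre,-41.2889,174.7772
North Service Centre,-41.2200,174.8100
"""


def csv_to_geojson(text):
    """Convert CSV rows with latitude/longitude columns to a FeatureCollection."""
    features = []
    for row in csv.DictReader(io.StringIO(text)):
        features.append({
            "type": "Feature",
            "geometry": {
                "type": "Point",
                # GeoJSON orders coordinates as [longitude, latitude].
                "coordinates": [float(row["longitude"]), float(row["latitude"])],
            },
            "properties": {"name": row["name"]},
        })
    return {"type": "FeatureCollection", "features": features}


collection = csv_to_geojson(CSV_TEXT)
print(json.dumps(collection, indent=2))
```

Publishing both the CSV and the GeoJSON version lets spreadsheet users and mapping tools each pick the format that suits them.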
Any tabular data should also be cleansed. This means that files should be:
- Clean – using uniform data formatting (e.g. numerical dates, postcodes in every field), with data in every field, no embedded non-text information and as few mistakes as possible, and
- Machine-readable – data in a structured, predictable and non-proprietary format that can be consumed by a software program.
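A check along these lines can flag rows that fail the criteria above before a file is published. This is a minimal sketch rather than a full validator: the column names, the sample rows, the choice of `YYYY-MM-DD` as the uniform date format and the treatment of `-` as a missing value are all assumptions made for the example.

```python
import csv
import io
from datetime import datetime

# Hypothetical extract; the second data row is deliberately messy.
CSV_TEXT = """date,age,gender,postcode
2011-12-10,15,F,6011
11/12/2011,,M,-
"""


def find_problems(text, date_column="date", date_format="%Y-%m-%d"):
    """Return a list of (line_number, message) for rows that are not clean."""
    problems = []
    reader = csv.DictReader(io.StringIO(text))
    # Data rows start on line 2 because line 1 is the header.
    for line_number, row in enumerate(reader, start=2):
        for column, value in row.items():
            if not isinstance(value, str):
                # DictReader yields None for short rows and a list for extra fields.
                problems.append((line_number, "wrong number of fields"))
            elif value.strip() in ("", "-"):
                problems.append((line_number, f"missing value in '{column}'"))
        try:
            datetime.strptime(row[date_column], date_format)
        except (TypeError, ValueError):
            problems.append((line_number, f"'{date_column}' is not in {date_format} format"))
    return problems


for line_number, message in find_problems(CSV_TEXT):
    print(f"line {line_number}: {message}")
```

Running a check like this as part of the export step keeps formatting problems from reaching the published file.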
Examples of machine readable, clean data
Examples of data that is not machine readable or clean
|Copyright of Dept. X|
|10th Dec 11||15||Fem||-|
|* Footnote Information|
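The messy row above can be tidied programmatically. The sketch below is one possible approach in Python: the column meanings (date, age, gender, notes) are assumptions, since the header row is not shown, and the gender codes are an invented convention. Copyright lines and footnotes belong in the dataset's metadata rather than in the data file, so they are simply left out.

```python
import re
from datetime import datetime

# The messy row from the example above. What each column means is assumed.
messy_row = ["10th Dec 11", "15", "Fem", "-"]

# Hypothetical code list for normalising free-text gender values.
GENDER_CODES = {"fem": "F", "female": "F", "f": "F", "male": "M", "m": "M"}


def clean_date(value):
    """Turn a date such as '10th Dec 11' into ISO 8601 form ('2011-12-10')."""
    without_suffix = re.sub(r"(\d+)(st|nd|rd|th)", r"\1", value.strip())
    # %b expects English month abbreviations under the default C locale.
    return datetime.strptime(without_suffix, "%d %b %y").date().isoformat()


def clean_row(row):
    """Return a dict with uniform formatting; placeholder values become None."""
    date, age, gender, notes = (cell.strip() for cell in row)
    return {
        "date": clean_date(date),
        "age": int(age),
        "gender": GENDER_CODES.get(gender.lower(), gender),
        "notes": None if notes in ("", "-") else notes,
    }


print(clean_row(messy_row))
```

A value that ends up as `None` still needs to be followed up at the source, because clean data should have a real entry in every field.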
Listing the Datasets
From the front page of the data.govt.nz website, you’ll be able to see an Add Data button in the top right corner followed by Submit Dataset on the next page.
Your submission to data.govt.nz will require a few pieces of information:
- Dataset title
- A direct link to the dataset
- A description of the dataset
- The category it fits under
- The formats it is available in
- Re-use rights / license (See below)
- Update frequency (daily, monthly, yearly?)
- Title of the agency responsible for the dataset
- A contact email address (which will NOT be published)
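It can help to gather this information in one place before opening the submission form. The sketch below models the fields listed above as a Python dataclass and reports any that are still empty. The class, the field names and the sample values are hypothetical and are not part of any data.govt.nz interface.

```python
from dataclasses import dataclass, fields


@dataclass
class DatasetListing:
    """Hypothetical record mirroring the submission fields listed above."""
    title: str
    url: str
    description: str
    category: str
    formats: list
    licence: str
    update_frequency: str
    agency: str
    contact_email: str


def missing_fields(listing):
    """Return the names of any fields that have been left empty."""
    return [f.name for f in fields(listing) if not getattr(listing, f.name)]


# Sample values are invented; the contact email is deliberately blank.
listing = DatasetListing(
    title="Waste collection zones",
    url="https://example.org/data/waste-collection-zones.csv",
    description="Boundaries and collection days for kerbside waste zones.",
    category="Environment",
    formats=["CSV", "GeoJSON"],
    licence="CC BY 4.0",
    update_frequency="yearly",
    agency="Example City Council",
    contact_email="",
)

print(missing_fields(listing))
```

Keeping a record like this alongside each dataset also makes it easier to re-submit or update the listing later.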
Picking a licence
Licensing is an important aspect of releasing your dataset. It will determine whether or not anyone actually has permission to reuse your dataset, let alone redistribute it or use it for a commercial purpose. It also specifies whether others have to provide attribution to you for your work.
In New Zealand, the NZGOAL framework provides guidance for agencies to follow when releasing copyright works and non-copyright material for re-use by others.
After you’ve listed your dataset on data.govt.nz, it will appear in the catalogue once it has been moderated by DIA.
Scheduling regular updates
It’s important to keep a schedule of how often the datasets you have released are updated. Some datasets may not need updating very often, but others, such as weekly petrol prices, are most useful when they are updated promptly.
Petrol prices updated only once a month, for example, would send interested users elsewhere for fresher data.
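A simple script can help keep to the schedule by comparing each dataset's last update against its declared frequency. This is a sketch under stated assumptions: the frequency names, the maximum ages chosen for them, and the catalogue entries are all invented for the example.

```python
from datetime import date, timedelta

# Assumed mapping from a declared update frequency to a maximum age.
MAX_AGE = {
    "daily": timedelta(days=1),
    "weekly": timedelta(days=7),
    "monthly": timedelta(days=31),
    "yearly": timedelta(days=366),
}


def is_overdue(last_updated, frequency, today):
    """Return True if the dataset is older than its declared frequency allows."""
    return today - last_updated > MAX_AGE[frequency]


# Hypothetical catalogue entries: (name, declared frequency, last update).
datasets = [
    ("Weekly petrol prices", "weekly", date(2024, 3, 1)),
    ("Dog walking zones", "yearly", date(2023, 9, 15)),
]

today = date(2024, 3, 20)
overdue = [name for name, freq, updated in datasets if is_overdue(updated, freq, today)]
print(overdue)
```

Passing `today` in as a parameter, rather than reading the system clock inside the function, keeps the check easy to test.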
Open Data Standards
All content on this site is licensed Creative Commons Attribution 4.0. Please attribute Code for Aotearoa.