Recently I’ve been chatting to a lot of different people about the benefits of Open Data. One question that crops up again and again is simply: why should they open up the data they have collected and publish it on the web?

It’s the question at the root of it all when it comes to changing how proprietary people feel about their organisation’s data.

Just yesterday I was talking with a friend who has been working in computing since the 80s and who, during the 90s, was head of IT in a university department. She told me of meetings they had with the heads of department and leaders in the university management when the World Wide Web was new. The meetings were to discuss the benefits of creating one of these new-fangled websites for the university.

The management team were in general supportive of the idea, but had some questions that fell into two key areas:

A feeling of loss of control

How could they see who was reading the information? Contact with potential students and other academics happened person to person and could be given as feedback to management so that they would know what people were asking for. If they put information on a website, would the web visitors simply come and go without ever making “real” contact with the university, leaving management with no idea whether the person’s questions had been answered?

Of course, as time went on they found that providing information on a website cut down on the amount of time administration staff had to spend dealing with duplicate queries about the same subject. The website provided points of contact which funnelled queries to the right departments, and new questions prompted new content on the website.

Unsure of the wider benefits

How would anyone find the information? This was at the very early stages of the World Wide Web. At the time of the meeting there were no search engines yet, and organisations provided index pages to navigate the collection of web pages that they had written.

Seeing the potential benefits of publishing information on web pages, they pushed ahead along with other early adopters of the Web. As more and more web pages were added, programmers developed search engines, which made a huge difference to finding information on web pages. The more information was published on the Web, the more it pushed new technologies to be developed. Search engines were the most game-changing of these, but user-friendly reporting of visitor logs and of the searches used on websites also allowed content creators to gain insight into what their visitors needed.

I could carry on and talk about how, years later, companies would open up their functionality with APIs and allow other organisations to make use of funky tools on their own websites, such as embedded image galleries or maps, or images pinpointed on maps! That came later. Everything I’ve spoken about so far happened in the 90s: initial advancements such as search engines and web analytics, which wouldn’t have been developed if the content hadn’t been out there to search for.

The similarities between that meeting and the recent discussions I’ve had about Open Data certainly didn’t escape me!

Before I get going, I am at no point talking about personal data. The type of data I’m talking about would more accurately be defined as research data.

Changing your mindset from keeping raw data locked away to publishing it as Open Data can be difficult to imagine when the concept is relatively new. One of the points I try to reassure people about is that no one is suggesting you dump every bit of data you own onto a public website! In the same way that the university didn’t suddenly publish every piece of information they had for students, you can look at your data and think about what would be useful to provide openly. A good starting point is the data used in any reports that you make public. After all, you must want that information out there and people making use of it. Otherwise, why spend time on publishing reports, or papers, or whatever other formats of data you already make public?

Why do it at all? I hear you ask. One simple word:


The more raw information is opened up and published as Open Data, the more it pushes new technologies and applications to be developed by the wider Web community of developers and researchers.

Data from different sources can be pulled together to be shown in new and interesting ways, whether that’s by developers who want to simply provide a useful tool, or academics who are going to be able to push research forward in a field that you care about.

Open Data will drive the next generation of applications in the same way that simply publishing text on web pages drove the formative technologies on the World Wide Web.


Note: this post was originally posted on my personal blog