Open data workshop on global definitions and regional understanding

Open data is attracting growing attention worldwide, and the term has several definitions. According to The Open Definition (n.d.), “Open data and content can be freely used, modified, and shared by anyone for any purpose.” This definition covers many types of data, such as data on cultural works and artefacts, scientific data and publications, financial data and government expenditure, statistics and census data, weather and climate data, and environmental data such as pollution levels or the quality of water and air.

At “Open Southeast Asia” in Bonn, the European Chapter held its own workshop with the goal of bringing together people with Western and Asian backgrounds to share ideas and best-practice examples of how to deal with open data.

The workshop was designed as a hands-on session. All participants were invited to contribute to the discussions and findings. Within the workshop, we focused our discussion on the following two questions:

How is access to open data provided in Southeast Asia and Germany?

What is the added value of open data in Southeast Asia and Germany?

The aim was to identify whether there are differences in the understanding of open data and its impact, and whether we can learn from each other in an open discussion such as the one this workshop format allows.

In total, 12 people participated: seven students, two PhD candidates/researchers, and the rest activists. The participants were split into two groups according to their interests in open data: one group more interested in the technical side and one more interested in the impact of open data.

In addition, we presented the work of ASIS&T in Europe. Because many of the participants were students, we also introduced the European Student Chapter. As open data is the subject of diverse research in information science and technology, we look forward to establishing a new Special Interest Group (SIG) Open Data and to enabling and sharing our research on this topic within the ASIS&T community.

If you like the idea of introducing a SIG Open Data, please contact Agnes at agnes.mainka[at]hhu.de. She will coordinate further steps. This idea is a work in progress, and everybody who is interested is invited to join.

Report

Workshop on Open Data: Global Definitions and Regional Understanding

The workshop was held at Open Southeast Asia in Bonn (Germany), 19–20 May

Introduction

Do we have a global understanding of what open data is, and does the impact of open data differ between cultures? In this workshop, we discussed the German or Western understanding of open data and its impact and compared it with the understanding and impact in Southeast Asia. Are there common ideas, and can we learn from each other?

The term open data has several definitions. According to The Open Definition (n.d.), “Open data and content can be freely used, modified, and shared by anyone for any purpose.” This definition covers many types of data, such as data on cultural works and artefacts, scientific data and publications, financial data and government expenditure, statistics and census data, weather and climate data, and environmental data such as pollution levels or the quality of water and air.

Besides the impact of open data and its possible areas of application, there are technical aspects that need to be considered. Datasets are only useful if they can be accessed and processed easily. It is not enough to create an open dataset, publish it somewhere on the web and hope that somebody will find it. For this reason, a number of open data platforms have been created where datasets can be enriched with metadata and indexed by aspects such as license and category. These platforms also let users search or browse for datasets. Over time, multiple platforms with similar names have emerged, which causes confusion about where to host and where to search for open datasets. The datasets themselves often contain no information about licensing, so there is a need for platforms where additional information (file format, license, date, etc.) can be added.
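
The kind of catalogue entry described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name DatasetRecord, the function search, the field names, the license identifiers and the sample entries are all invented here and do not describe any particular open data platform.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DatasetRecord:
    """Catalogue entry for one open dataset (hypothetical schema)."""
    title: str
    file_format: str
    license_id: str
    published: date
    categories: list[str] = field(default_factory=list)


def search(records, license_id=None, category=None):
    """Return the records that match every filter that was supplied."""
    results = []
    for record in records:
        if license_id is not None and record.license_id != license_id:
            continue
        if category is not None and category not in record.categories:
            continue
        results.append(record)
    return results


# Invented sample entries, for illustration only.
catalogue = [
    DatasetRecord("Air quality measurements", "CSV", "CC-BY-4.0",
                  date(2016, 1, 15), ["environment"]),
    DatasetRecord("School statistics", "JSON", "CC0-1.0",
                  date(2016, 3, 1), ["education"]),
    DatasetRecord("Water quality samples", "CSV", "CC0-1.0",
                  date(2016, 4, 20), ["environment"]),
]

for record in search(catalogue, license_id="CC0-1.0", category="environment"):
    print(record.title)
```

Keeping the license, format and date next to each dataset is what makes this kind of filtering possible at all; a bare data file without such a record could not be found by license or category.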

Young and Verhulst (2016) recently investigated the impact of open data projects. They grouped the impact of open data into four categories: improving government, empowering citizens, solving public problems, and creating opportunities. Projects in the German-speaking region fall into these categories as well. The category improving government, for example, covers projects like Maerker Brandenburg (maerker.brandenburg.de), which helps citizens and city services to communicate easily and to fix problems. Citizens can, for example, report broken lights in a dark part of the city, and the city service can react to the issue directly. In the category creating opportunities, a mobile application called RIS:App has made it easier to access national laws and judicial decisions in Austria. This has improved the way advocates retrieve their information, and the success of the app has helped its developers to become well known and successful. In the category solving public problems, OpenStreetMap (www.openstreetmap.org) is used (not only in German-speaking regions) to add geographical data to a map and to share this information with the community. Because many people update the information, the knowledge of the crowd eventually leads to richer and more up-to-date information than maps that are generated purely automatically. Finally, the project JedeSchule (jedeschule.de) illustrates the category empowering citizens. This project makes statistical data on schools available and improves the ways in which diverse stakeholders and politicians can interact with each other. The goal is to enhance the transparency of educational institutions and to empower parents and children to make better decisions.

Hence, the examples we are aware of and work with originate from our Western understanding of how open data and citizen engagement can improve our lives. Looking at other cultures and their understanding of how data may help to solve regional problems may open our minds and help us to identify real opportunities. The Open SOA Workshop was a unique opportunity to discuss open data access and impact in a mixed group of people with Western and Southeast Asian backgrounds.

Method

The workshop was designed as a hands-on session. All participants were invited to contribute to the discussions and findings. We focused our discussion on the following two questions:

How is access to open data provided in Southeast Asia and Germany?

What is the added value of open data in Southeast Asia and Germany?

The aim was to identify whether there are differences in the understanding of open data and its impact, and whether we can learn from each other in an open discussion such as the one this workshop format allows.

In total, 12 people participated: seven students, two PhD candidates/researchers, and the rest activists. The participants were split into two groups according to their interests in open data: one group more interested in the technical side and one more interested in the impact of open data.

Discussion and Conclusion

Impact of Open Data

In the first session, the impact group had two tasks: first, to brainstorm all the open data projects they had been in contact with or knew personally, and second, to identify the impact these projects have on them. After a break, they received two further tasks: first, to identify cultural differences and similarities in the impact of one of the mentioned projects, and second, to discuss opportunities to improve that impact and to discover untapped potential. The open data project known to most participants was Massive Open Online Courses (MOOCs). Their goal is to make education accessible to everyone through online courses. The impact of these projects is that knowledge becomes decentralised: it is now possible to attend courses from Harvard and MIT without being physically in Cambridge. Through MOOCs, everybody with an internet connection and access to technology can benefit. One disadvantage is the technological barrier. The participants emphasised the role of public libraries in overcoming this technological gap.

The differences between the impact for German or Western people and the impact for people living in Southeast Asia can be huge. However, countries like Singapore generally meet the same standards as Western countries, whereas Indonesia and Myanmar are less developed in comparison (ITU, 2015). Thus, we are dealing not so much with cultural differences as with a large divide in ICT development, and the impact of MOOCs declines accordingly. In addition, merely participating and learning is free, but certificates must be paid for. In this sense, MOOCs are not truly open, and thus they cannot help to overcome educational gaps. Furthermore, a qualification from a school that people attend in person and pay for, even if it is of lower quality than a MOOC offered by MIT, may be more accepted in the local labour market. Nevertheless, MOOCs may help to empower young people in the Southeast Asian region. In Singapore, for example, most young people have to study what their parents expect of them. Joining online webinars can help those students to identify their own preferences. One way to make MOOC certificates worth paying for would be for universities around the world to officially recognise them. Further opportunities are to develop standards that are recognised by employers and to award certificates free of charge in less developed countries. Ideally, the transformation of MOOCs into truly open data would result in global access to knowledge.

Technical Accessibility of Open Data

The second group focused on the technical accessibility of open data. The first task was to use a SWOT analysis to discuss the strengths, weaknesses, opportunities and threats of providing open data through centralised portals. As there is currently no consistent strategy for providing open datasets, we focused our analysis on a centralised solution. Having one (or only a few) central portal(s) where open data can be published and searched for has many benefits. It would save resources and money if the operation and maintenance of the data platforms were centralised. Further, people who are currently working on different solutions could collaborate and share the work. With a single point of access, it would be easy to find and share datasets and to reduce duplicate data. Regarding the scope of such a portal, the question arose whether centralisation should be undertaken at country level or for an even broader audience. The ultimate goal was envisioned as creating worldwide standards for the provision of open data. Besides these possible benefits, a number of drawbacks were identified. Coordinating and organising a centralised solution is difficult, especially as the number of people involved increases. The group also questioned who would take responsibility and how misuse of control could be avoided.

Whereas the first task was about the provision of open datasets, the second task focused on standardising the contents of published datasets. The group discussed and collected the pros and cons of standards that define a fixed set of common fields for datasets, as well as ways to include additional fields. Each argument was assigned a weight on a scale from one to three, and the weights on each side were summed. Overall, the group decided in favour of standardisation, as the pro side won by five points. One of the central pro arguments was that such datasets would be easy to analyse, that software tools could be created to process all dataset files following such a standard, and that these standards could be set by publishing bodies. Another suggestion was to standardise only the metadata (format and extent) while letting the content of the data files vary. It remains questionable who would define these standards and which resources could be used. Furthermore, it is not clear whether institutions or people would still publish open data if they do not like the standards. Probably the hardest part would be to convince people from different countries, different cultures and with different work habits to adopt these standards and to (re-)publish their datasets accordingly.
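
The weighted tally used in this exercise can be written down as a small Python sketch. The argument texts below paraphrase points from the discussion, but sorting the open questions onto the con side and all of the weights are assumptions made purely for illustration; the workshop's actual weights were not recorded in this report, so the numbers do not reproduce the five-point margin.

```python
def tally(arguments):
    """Sum the weights of (argument, weight) pairs; each weight must be 1, 2 or 3."""
    for text, weight in arguments:
        if weight not in (1, 2, 3):
            raise ValueError(f"weight for {text!r} must be 1, 2 or 3")
    return sum(weight for _, weight in arguments)


# Placeholder weights, NOT the values assigned in the workshop.
pros = [
    ("datasets are easy to analyse", 3),
    ("software tools can process every conforming file", 3),
    ("publishing bodies can set the standards", 2),
]
cons = [
    ("unclear who defines the standards", 2),
    ("publishers may stop publishing if they dislike the standards", 2),
    ("hard to convince people across countries and cultures", 3),
]

margin = tally(pros) - tally(cons)
print(f"pros={tally(pros)} cons={tally(cons)} margin={margin}")
```

A positive margin means the pro side carries more weight. The method is deliberately simple, which makes it quick to run in a group, but the outcome depends entirely on how the participants weight each argument.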

To sum up, we had a great hands-on workshop with enthusiastic, open-minded and communicative participants, without whom it would not have been such a success. The ideas and questions that came up during the session will feed into our further research on open data.