Open Data Day (ODD) is an annual celebration of open data held all over the world. In 2018, more than 400 cities organised hackathons simultaneously on 3 March. According to one Hong Kong organiser, Bastien Douglas, most local organisers of ODD are government affiliates. In Hong Kong, communities like OSHK and ODHK lead the organisation every year. One highlight of ODD-HK-18 was the talk by Jessica Lo, the system manager at OGCIO responsible for the open data portal, data.gov.hk.

When initially launched, the portal was just a thin wrapper around CKAN, or in other words yet another CKAN deployment with a different visual theme. After a few years' development, we have started to see innovations in the portal that are Hong Kong firsts or Hong Kong only. For example, the portal can now search dataset content instead of only metadata such as name, author and tag, which is the default CKAN behaviour. Indexing the content of all datasets is a tremendous effort, and it adds complexity to the ranking of search results. Each dataset is now associated with a geo label, which enables a map view of datasets and geo-search functions. Two more initiatives address the time dimension of open data: one is the historical data archive, and the other is a daily updated RSS feed of datasets.
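To make the "metadata only" default concrete, here is a minimal sketch of how a client might build a query against CKAN's standard `package_search` action, which matches on dataset metadata. The base URL below is a placeholder, and whether data.gov.hk exposes this endpoint in the same form is an assumption, not something stated in the talk.

```python
from urllib.parse import urlencode

# Placeholder base URL (an assumption); point this at a real CKAN site to try it.
CKAN_BASE = "https://ckan.example.org"


def metadata_search_url(query: str, rows: int = 10) -> str:
    """Build a URL for CKAN's package_search action (a metadata search)."""
    params = urlencode({"q": query, "rows": rows})
    return f"{CKAN_BASE}/api/3/action/package_search?{params}"


# Only builds the URL; no network request is made here.
print(metadata_search_url("air quality"))
```

A query like this only finds a dataset if the words appear in its title, description or tags. Searching inside the files themselves is the extra work the portal took on.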

Now that there is a nice bottle, how does the wine taste? Data quantity, quality, usability and licensing are common concerns from the community. According to an HK01 report in March 2017, the headline numbers were largely made up of unusable data (e.g. pictures and scanned PDFs), data slicing (e.g. by year or by region) and language versioning (3x). The estimated number of "true unique datasets" was around 2,000. As of March 2018, there are 3,200+ unique datasets, which is growth of roughly 60%. As for APIs, the number has doubled, from 500 last year to 1,000 at present. Even without peeking into the details, we see clear progress in terms of quantity.
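As a quick sanity check on those growth figures, the arithmetic can be sketched as follows. It uses the rounded counts quoted above (about 2,000 and 3,200 datasets, 500 and 1,000 APIs), so the results are approximate. With these inputs the dataset growth comes out at about 60%.

```python
def growth_pct(before: float, after: float) -> float:
    """Percentage growth from `before` to `after`."""
    return (after - before) * 100 / before


# Rounded counts taken from the figures quoted in the text.
datasets = growth_pct(2000, 3200)
apis = growth_pct(500, 1000)

print(f"datasets: {datasets:.0f}% growth")
print(f"APIs: {apis:.0f}% growth")
```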

According to the latest open data census, the 2016 Global Open Data Index conducted by OKFN, Hong Kong ranked 24th worldwide. It was 37th in 2015, 54th in 2014 and 56th in 2013. Despite the delightful change in figures, we still cannot call it a "big leap". As with university ranking systems, there are ways to game the indicators and the summary formula. As one insider who participated in the rating process once put it, reviewers interpreted some metrics differently and disagreed on certain points, and the scope of evaluation also changed over the years. Although the index cannot be used as a wholesale rating of progress, it is still a good reference, just as people generally believe that students who consistently score higher in all kinds of exams are more competent.

Jessica Lo presenting data.gov.hk updates on ODD-HK-18

There was a lively discussion during the Q/A session, and participants raised several valuable points. Developers usually want to reformat the data tables into a structure that is easier for programs to handle. Sammy from OSHK proposed revising the Terms so that people can derive from and redistribute the datasets. People also proposed setting up a mechanism for the community to submit their derivatives, such as enriched, polished, reformatted and joined datasets, back to the portal. In this way, all open data related to Hong Kong would be centralised. Another suggestion, which I echo, is to set up a regular feedback loop. Hearing sessions similar to the Q/A session on the ODD-HK-18 day are a good example. More interactive and immediate feedback will help to shape future areas of development. In terms of implementation, conducting surveys among potential user groups can help prioritise new data requests.

Hong Kong can also draw on Shanghai's experience of aligning multiple stakeholders through the SODA competition. Each time SODA is organised, new datasets from related government bodies are released. The community gets new datasets to work on, and the government sees new applications of the data. It is a brilliant way to solve the chicken-and-egg problem. Now that the Chief Executive has included "open data" in the 2017 policy address (77th statement) and the new budget proposes to allocate more than $10 billion to technology innovation, what shall we do with this solid support and these ample resources to push further action on, adoption of and awareness of open data?