Beyond freedom of information lies open data

Don’t laugh, but what you see above is very exciting; it should send a tingle down your spine.

What the hell is it? It’s a dataset from a United States census of public library systems containing data elements that cover library service measures such as the number of uses of electronic resources, the number of Internet terminals available to the general public, reference transactions, interlibrary loans, circulation, library visits, children’s program attendance, and circulation of children’s materials. It also includes information on collection sizes, staffing, operating revenue and expenditures.

And libraries are exciting? No, the data is exciting.

It is an example of the US government releasing raw data to the public in machine-readable format. This allows civil society groups, or even single interested persons, to mine the data to discover significant information about public libraries, the way they are used, their cost-benefit aspects etc.

Releasing raw data to the public in machine-readable format – that’s what is exciting. It empowers citizens, it keeps government transparent, and it harnesses the intelligence of crowds to diagnose public problems and find solutions. This is the outcome of the Open Data Movement, a push to get governments to put as much raw data as they have onto the internet, subject to personal confidentiality and critical national security limitations. The data should be in machine-readable format. Citizens, either singly or in collaborative groups, write applications that mine the data to tease out patterns. The knowledge so gained will be of immeasurable public value.
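
To make "machine-readable" a little more concrete, here is a minimal sketch in Python of the kind of small program a citizen might write. Everything in it is an assumption for illustration: the library names, the column names and the figures are made up, and a real release would be a downloaded file rather than a string in the code. It simply works out annual visits per resident for each library system.

```python
import csv
import io

# Hypothetical sample rows standing in for a government CSV release.
# Column names and figures are invented for illustration only.
SAMPLE_CSV = """library,state,population_served,visits,circulation
Alpha County Library,OH,50000,200000,350000
Beta City Library,OH,120000,300000,600000
Gamma Town Library,TX,8000,40000,30000
"""


def visits_per_capita(csv_text):
    """Return {library name: annual visits per person served}."""
    reader = csv.DictReader(io.StringIO(csv_text))
    result = {}
    for row in reader:
        population = int(row["population_served"])
        if population == 0:
            continue  # skip rows where a rate cannot be computed
        result[row["library"]] = int(row["visits"]) / population
    return result


rates = visits_per_capita(SAMPLE_CSV)
# List the busiest libraries (relative to population) first.
for name, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {rate:.1f} visits per resident")
```

The point is less the arithmetic than the fact that it can be done at all: with a table locked inside a PDF or a web page, even this small calculation would first require retyping the numbers by hand.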

For example, one might be able to see the effect of road widening (and increased traffic noise) on transactional values of flats within 100 metres — which will enable the public to better judge one of the consequences of vehicle population increase. Or we can track permanent residents’ contribution to Singapore in terms of how many really stay long-term, convert to citizenship, have their sons do national service, etc. That way we can have a more informed debate about immigration.

Readers are perhaps more familiar with the Freedom of Information (FOI) movement. It is a topic I have written about several times, and increasingly others are talking about it. Workers’ Party member of parliament Pritam Singh made a call for a Freedom of Information Act recently. But there is a huge difference between FOI and Open Data.

FOI is request-driven. It enshrines a right of citizens to obtain information from government, with disputes about whether such a request is practical adjudicated by an ombudsman. Agencies required to respond have to be truthful and the answer may be quite voluminous in content.

Open data is not request-driven, though public pressure can be instrumental in ensuring that more and more data is made available online. The default position is that all data collected by government (including local government) should be publicly available in machine-readable format, unless personal confidentiality or national security is at risk.

FOI and Open Data do not conflict; they can reinforce each other. While data that is numerical or database-like in nature suits Open Data, it may take an FOI request to obtain, for example, the technical specifications and decision trail for a government tender. An anti-corruption group, for example, may wonder why so many vendors were disqualified in a tender (a pattern it might have seen via Open Data). How fair or limiting were the tender conditions? How did judging proceed? That kind of information is more suitable for an FOI route.

* * * * *

Open Data is a relatively recent movement, with the US and UK governments currently leading implementation (see data.gov and data.gov.uk). While I myself had heard about it 18 – 24 months ago, it wasn’t until two students at the Singapore Management University presented a paper in mid-2011 that I took any real interest. I caught up with them again to find out more, beginning with the question: Why did you choose this topic for your paper?

Randy Lai was interested in data as a problem-solving tool. As he describes it, humans solve problems using both intuition and data. If we rely on intuition alone, our conclusions may be biased. Certain fundamental causes or effects may not have occurred to us, but data may reveal the connections. Data is useful in a limitless number of fields, e.g. in public health and medicine, and can bring about forecasts and action.

Priscilla Soh said her interest came from a different direction. “I’m more passionate about the People’s Action Party story,” she said. “Currently, there’s a one-way flow of information. There needs to be a level playing field.”

I asked them for some examples of how Open Data might produce interesting insights in Singapore. Priscilla suggested home ownership. “What percentage are held by foreigners? What’s the trend over the years?” she gave by way of examples. “We need to test whether grievances are justified by ground facts.”

I see her point, for dissatisfaction has a political cost, and if it’s unfounded, why do we keep paying this cost merely for lack of data?

Randy had a very different example to offer. He would use data to see what happens to standardised test scores when school curricula are changed. If possible, “I would want to capture a trend between this and future income,” he added.

We also discussed the association between test performance and ethnicity that government statistics implicitly make. Is that really the case? What other social factors are determinants? Which groups really need help?

Here’s a map that illustrates what can be done with data.

Someone took the released data on locations of accidents involving cyclists in London and laid them over a map to show fellow cyclists where the danger spots are. Firstly, it prompts cyclists to be more careful on those roads, but it is also a good place to start when civil society and the authorities want to improve road conditions for cyclists’ safety.
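
For readers curious how such danger spots might be found before anything is drawn on a map, here is a rough sketch in Python. It is not how the London map was actually made; the coordinates are invented, and the grid size and threshold are arbitrary assumptions. The idea is simply to divide the city into small grid cells and count the accidents that fall in each.

```python
from collections import Counter
import math

# Hypothetical accident locations as (latitude, longitude) pairs.
# These are invented points, not real accident records.
ACCIDENTS = [
    (51.5074, -0.1278),
    (51.5079, -0.1271),
    (51.5071, -0.1275),
    (51.5155, -0.0922),
    (51.5033, -0.1195),
    (51.5152, -0.0925),
]

CELL_SIZE = 0.005  # grid cell size in degrees (an arbitrary choice)


def cell_of(lat, lon, size=CELL_SIZE):
    """Map a coordinate to the integer index of its grid cell."""
    return (math.floor(lat / size), math.floor(lon / size))


def hotspots(points, min_count):
    """Return (cell, count) pairs with at least min_count accidents,
    ordered from the most accidents to the fewest."""
    counts = Counter(cell_of(lat, lon) for lat, lon in points)
    return [(cell, n) for cell, n in counts.most_common() if n >= min_count]


for cell, n in hotspots(ACCIDENTS, min_count=2):
    print(f"grid cell {cell}: {n} accidents")
```

A real analysis would want to allow for how many cyclists use each road, since a busy route will collect more accidents even if it is no more dangerous per trip. That, again, is only possible if the traffic counts are released too.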

* * * * *

Randy and Priscilla pointed me to a study led by Becky Hogge for the Soros Open Society Foundation, published May 2010. The study’s aim was

. . . to identify the strategies used in the US and UK contexts with a view to building a set of criteria to guide the selection of pilot countries, which in turn suggests a template strategy to open government data.

The report finds that in both the US and UK, a three-tiered drive was at play. The three groups of actors who were crucial to the projects’ success were: civil society, and in particular a small and motivated group of “civic hackers”; an engaged and well-resourced “middle layer” of skilled government bureaucrats; and a top-level mandate, motivated by either an outside force (in the case of the UK) or a refreshed political administration hungry for change (in the US).

To promote the development of digital applications that mine data for public benefit, civic organisations have a huge role to play. In the United States, there is the Sunlight Foundation, for example, which declares on its homepage that they are:

. . . . a non-profit, nonpartisan organization that uses the power of the Internet to catalyze greater government openness and transparency, and provides new tools and resources for media and citizens, alike. We are committed to improving access to government information by making it available online, indeed redefining “public” information as meaning “online,” and by creating new tools and websites to enable individuals and communities to better access that information and put it to use.

Besides raw data, the data.gov website from the US now contains, for many datasets, its own applications, which you can run by clicking on available icons. These will filter the data in the way you want or visualise them in different ways. While you may still need to be clear-headed about what you’re looking for, you don’t have to write computer code to retrieve what you want.

* * * * *

Believe it or not, Singapore has a site: data.gov.sg. Alas, it is a pathetic attempt to mimic what has been done elsewhere. The chief problem is that its links lead back to ministries’ websites and to already-processed data (not raw data). Data is presented in one format and no other. Nor is it machine-readable. But that is exactly what Open Data is NOT about – presenting data in just one way to suit the government’s agenda.

Moreover, on its homepage, Singapore’s site features strongly the statement “Create value by catalysing application development”, which also strikes me as rather off the mark. Of course, “value” can mean many things, but there’s something about that statement that suggests commercial value. Commercial value may indeed be an incidental spin-off of data mining, but it isn’t the primary point of Open Data, an offshoot of Open Society Initiatives. The primary aim is transparency in government and the empowering of civil society for public (not private) benefit.

It does not surprise me that what we have is mimicry without any understanding at all of the substance; wanting (once again) to look like a modern, developed, liberal democratic country without letting go of authoritarian, control-freak habits.

—-

Randy Lai and Priscilla Soh’s paper on Open Data will be a chapter in a forthcoming book to be launched in early November by Singapore Management University and the Wee Kim Wee Centre of SMU. The book’s title is Progress and its (Dis-)Contents. Look out for it.

20 Responses to “Beyond freedom of information lies open data”


  1. 1 Singaporean 23 October 2011 at 22:46

    This is an interesting topic.
    Boy, Alex, your blog is really good.
    Much much better than the ST trash.
    Schools should encourage students to read your blog and have meaningful discussions.

    Perhaps Education Minister Heng Swee Keat can start by highlighting the useful blogs that provide REAL information, rather than that from the MSM.

  2. 2 Wy 23 October 2011 at 23:08

    Open data is an interesting project that promises a lot. However, from my own research on the project, it seems like not many people actually pay much attention to the downside of it.

    In fact, if you look at the operations of “Anonymous” and WikiLeaks, they are pretty much striving for the same goal: freedom of information.

    You might be interested in Michael Gurstein’s work on data divide. http://gurstein.wordpress.com/?s=data+divide

    =)

    • 3 Brendan 24 October 2011 at 11:38

      Don’t try to obfuscate the issue. Comparing open data to WikiLeaks is like comparing apples to oranges. Open data is data published and endorsed by the govt. WikiLeaks, on the other hand, contains leaked classified military ops data (some unverified), information specifically excluded from the Freedom of Information Act (FOIA).

  3. 4 Anonymous 24 October 2011 at 00:37

    In Singapore, you need to purchase data from the government. Nothing comes free; even if you are working on government projects, you will still need to pay. That is how our government sucks money from the people.

    There is no reason why the government should refuse to adopt a freedom of information act unless it does not want to be accountable for its actions.

    • 5 Gazebo 25 October 2011 at 02:16

      as an academic researcher, i have requested for data from singapore government agencies before. one of the most frustrating replies i have received was, “sorry, no. research should only be done with public data.”

      *face palm*

      i have conveyed to the government before, that if they want to raise singapore’s profile and enhance its punching power in the international arena, then perhaps as a start, allow researchers to work with singapore data. research featuring singapore will help to profile the country! singapore is such an interesting setting for good academic research — small, controlled setting, relatively small selection bias etc. — but data is so unavailable. and i think singapore has many good things going for it i.e. the results might be much better than the government probably fears they would be.

      • 6 Fox 25 October 2011 at 13:54

        Like Gazebo, I’ve had similar experience requesting for data from Singapore government agencies/organizations such as MOE, MOM and the Department of Statistics. I can personally attest to how much stonewalling these agencies are capable of.

        Once I requested for the number of students who are homeschooled from MOE and some MOE bureaucrat referred me to the Education Statistics Digest which has absolutely no information pertaining to this. Further correspondence with MOE did not yield anything. My take from this was that if it is something not already in public domain, these agencies/stat boards will not provide the data, never mind that it is not anything sensitive.

  4. 7 RSF 24 October 2011 at 01:38

    Open data is one of the main topics to be discussed at the upcoming GovCamp.

    http://www.facebook.com/event.php?eid=169650753122654

    The tech community have complained that the paltry data at data.gov.sg doesn’t leave a lot of room for development and would like to see a framework for getting more data released. They hope to open some lines of discussion in this area.

    • 8 yawningbread 24 October 2011 at 09:21

      Now, that’s interesting — see what crowd sourcing can do?
      I saw this Youtube clip in the comments trail there and I think it’s worth putting here too to help fill out ideas about what benefits can come from open data.

  5. 9 Tan Tai Wei 24 October 2011 at 12:38

    For “transparency”, thereby truthfulness and justice, free info is justified, inter alia. And “confidentiality” is called for, surely in exceptional situations, only when info stoppage would on the contrary prevent probable abuse of info that threatens fairplay.

    In practice, however, “confidentiality” often is abused by administrators, in order to do just the opposite. Often, it is abused for hiding mistakes, preventing the very divulging of info needed for rectifying and seeing to injustices undone. For example, confidentiality about exam marking may be abused for keeping quiet about mistakes at grading or adding of marks, rather than upholding of fairness at assessment. The concern for the latter should rather call for open acknowledging of such mistakes, precisely to best ensure truthful and just rectifying!

  6. 10 suggestion 24 October 2011 at 14:42

    Thumbs up for another excellent article!

    Your insight and analysis trump many of the so-called analysis and perspective pieces in the ST.

  7. 11 Saycheese 24 October 2011 at 16:33

    Giving info to the populace is empowering them… now, where will this leave the MIWs… in the culvert with their hatchets and knuckle-dusters?

  8. 12 Chanel 24 October 2011 at 17:21

    I very much doubt that the PAP govt would adopt either FOI or Open Data. Even in the unlikely event that they did, it would probably be such a heavily watered-down version as to make it useless.

    The govt wants a monopoly over all national data because full disclosure of information would result in serious questions over the purported benefits to S’poreans of many national policies. The govt’s tactic continues to be “keep them in the dark and feed them bullshit”. A recent case in point is the Health Minister Gan Kim Yong’s parliamentary reply:

    “In 2010, there were around 2 million working Singaporean CPF members, of which 80,000, or less than 5%, used their CPF to pay for their parents’ healthcare expenses.”

    What is wrong with the above statement? A lot. It does not answer the question raised by NCMP Gerald Giam on parents having to burden their children’s Medisave for their own medical expenses. The correct percentage should be the number of CPF members (i.e. 80,000) who used their CPF divided by the total number of parents requiring hospital treatments. The latter figure should be a lot smaller than 2 million.

  9. 13 ttyy1 25 October 2011 at 02:43

    For those interested, tim oreilly’s government as a platform http://ofps.oreilly.com/titles/9780596804350/

  10. 14 Tan Chong Kee 25 October 2011 at 05:47

    One of the pioneers of open data is Hans Rosling, who envisions all countries sharing data. See his famous talk here:

  11. 15 serendib 25 October 2011 at 10:29

    I’m a big fan of transparency of information (as long as it doesn’t undermine national security and personal info). However, we have to consider that the govt here (and in many not fully democratic and/or developing countries) has suppressed data for a considerable amount of time. And releasing some of that info now, publicly, could have unintended negative consequences. For example, what if health data shows that those residing in western singapore have a higher rate of cancer than those living elsewhere on the island? On the plus side, this could be used to pressure the govt to get heavy industries there to clean up their act – no doubt something that will take time and will be met with resistance (or the cancer rate could be due to some other reason entirely). On the downside, property values could fall and insurers could increase premiums for western residents immediately.
    Can such conflicts be mitigated somehow?

    • 16 Fox 25 October 2011 at 14:05

      It would be impossible to conclude that the environment in Western Singapore is responsible for the higher cancer rate solely on the basis of that piece of information.

      You would also have to look at demographic data (ethnic group, age, sex ratio, etc) and do some heavy duty multivariate regressions. Furthermore, you would have to adjust for the duration of residency and take into account people who moved from the Eastern parts of Singapore. I do not think it would be possible to draw that sort of conclusion solely from demographic data; you would need some kind of purposeful longitudinal study to make that inference.

    • 17 Chanel 25 October 2011 at 16:13

      serendib,

      Withholding information leads to mistrust by the electorate and may potentially result in corruption at the highest level.

      The examples you cited should be arguments for an FOI law. If (erstwhile hidden) data show higher incidence of cancer in western singapore, the MOH should explain why this is so. And if this is due to toxic effluents from nearby factories, shouldn’t the logical thing be for immediate action to be taken against those factory owners? The pertinent question in this case should be why MOH hid such information from the public.

  12. 18 shengz 26 October 2011 at 14:32

    The reality that this govt withholds information or gives incomplete versions is symptomatic of its inability to be transparent. Unfortunately, not many are able to understand or appreciate the ulterior intentions of the powers that be. An FOI law would empower citizens to obtain more complete data when they wish to have it. LKY has always operated on the principle that “He knows best – you don’t need to know – just obey what I dish out”. I would say that’s why the PAP still has that 60% of votes.

    • 19 Gazebo 26 October 2011 at 22:50

      exactly. this inability to be transparent pervades every level of the machinery. would you believe it, that government agencies belonging to the same ministry, withhold information from each other?! in my previous life as a civil servant, my colleagues and i had requested for information before from another agency several times. it was denied. basically the only way any information can be shared is if someone from the top requests it. the inherent distrust that singaporeans hold for each other is unbelievable.

  13. 20 Tan Tai Wei 27 October 2011 at 13:51

    On the other hand, there occurs the same sort of “fear” of info as what Erich Fromm has called “the fear of freedom”, i.e. the laziness about having to undergo the agony of choice that freedom brings, therefore the willingness to live under autocracy. So, people shy from the effort that information processing and judging requires, preferring to “trust” the powers-that-be, and acquiesce with the policy of confidentiality.

    And LKY and the PAP might have taken that to be convenient governance, calling for the people’s “trust”.

    But then, the passivity and complacency thus engendered among the people goes against the grain of, say, the “entrepreneur” spirit they also want to cultivate.

