YLibrary? Embracing Open Data: The Library’s Role as Digital Curator by Rachel Cobcroft

In May 2011, Europe faced a significant health crisis: a deadly outbreak of the bacterium E. coli had emerged from an unknown food source, affecting 4000 people and killing 53. Researchers turned to the broader scientific community for help, releasing details of the sequenced bacterial genome via Twitter and sharing publicly accessible sequence data via the NCBI (National Center for Biotechnology Information) database. Within 24 hours, international teams were uploading analyses and annotations to an open repository on GitHub, and within days, possible ancestral strains were being identified. At record speed, scientists were able to pinpoint the source of the contamination, allowing authorities to isolate the farms in question and to declare the outbreak over by the end of July.

Such collaboration was enabled by the open licensing of the genomic data under the 'no rights reserved' CC0 licence. This licence, released by Creative Commons, enables copyright holders to waive their rights to materials, placing them as completely as possible in the public domain. This allows scientists, educators, artists, and other creators to build upon, enhance, and reuse these materials for any purpose, without restriction under copyright or database law.

‘Open data’ is data ‘that can be freely used, reused and redistributed by anyone – subject only, at most, to the requirement to attribute and sharealike.’ According to the Open Knowledge Foundation, open data’s key features are:

  • Availability and Access: data must be available as a whole, and at no more than reasonable reproduction cost, preferably over the Internet. It must be in convenient and modifiable form;
  • Reuse and Redistribution: data must be made available under terms that permit reuse and redistribution, including intermixing with other datasets. It must be machine-readable;
  • Universal Participation: data must be available to everyone to use, reuse, and redistribute. There must be no restrictions against persons or groups, or against commercial interests, for example.

As individuals and institutions face increasingly complex computational challenges and grapple with exponentially increasing amounts of data, there is an urgent need to establish frameworks to support open data distribution, use, and reuse. It is here that the library of the 21st century plays a critical role – that of digital curator.

What is Digital Curation? Why Does It Matter?

The UK’s Digital Curation Centre (DCC) defines this essential research activity as follows:

‘Digital curation involves maintaining, preserving, and adding value to digital research data throughout its lifecycle.’

Through its education and oversight of the various stages of the digital curation lifecycle, depicted using the DCC model below, the library can ensure that data is appropriately captured, described, stored and secured, appraised and preserved, and disposed of, according to relevant policies, procedures, and legal requirements. Such frameworks will ensure that meaningful data is preserved for others to access, use, share, and re-use in both the short and long term.

The Digital Curation Centre’s Digital Curation Lifecycle Model

accurate, authentic, integrity: i.e. value and veracity

Embracing ‘Intelligent Openness’

Worldwide, scientific institutions such as the Royal Society have called for ‘intelligent openness,’ in which data and its associated metadata (‘data about data,’ which enables its retrieval, management, and use) must be accessible, intelligible, assessable, and re-usable.

Here, the library plays an integral role in achieving intelligent openness – by encouraging owners of data to engage in the following steps, defined by the Open Knowledge Foundation:

  • Make your data available: in bulk and in a useful format;
  • Make it discoverable: put it on the web with its associated metadata;
  • Apply an open licence to your datasets.
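
As a minimal sketch of these three steps, the Python below writes a small dataset as CSV (a bulk, machine-readable format), builds a metadata record for it, and names CC0 as its licence. The sample rows, the field names, and the helper functions `to_csv` and `describe` are hypothetical, chosen only for illustration; the metadata layout is an assumption and does not follow any particular standard.

```python
import csv
import io
import json

# Hypothetical example records; not taken from any real dataset.
ROWS = [
    {"branch": "Central", "year": 2012, "loans": 1200},
    {"branch": "North", "year": 2012, "loans": 450},
]


def to_csv(rows):
    """Step 1: serialise the rows as CSV text, a bulk, machine-readable format."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def describe(title, rows, licence_name, licence_url):
    """Steps 2 and 3: build a metadata record that also states the licence."""
    return {
        "title": title,
        "format": "text/csv",
        "fields": list(rows[0].keys()),
        "record_count": len(rows),
        "licence": {"name": licence_name, "url": licence_url},
    }


csv_text = to_csv(ROWS)
metadata = describe(
    "Example library loans",  # hypothetical dataset title
    ROWS,
    "CC0 1.0",
    "https://creativecommons.org/publicdomain/zero/1.0/",
)
metadata_json = json.dumps(metadata, indent=2)

print(csv_text)
print(metadata_json)
```

In practice the CSV text and the JSON metadata would be published side by side on the web, so that anyone who finds the dataset also finds its description and its licence.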


Wiles, S. (2011). An outbreak of crowdsourcing, Sciblogs. Retrieved October 1, 2013, from http://sciblogs.co.nz/infectious-thoughts/2011/06/09/an-outbreak-of-crowdsourcing/. The genome was originally sequenced by BGI and the University Medical Centre Hamburg-Eppendorf.

NCBI. (2011). Sequencing for E.coli. Retrieved October 1, 2013, from http://www.ncbi.nlm.nih.gov/sra?linkname=bioproject_sra&from_uid=67657.

GitHub. (2011). E. coli O104:H4 Genome Analysis Crowdsourcing. Retrieved October 1, 2013, from https://github.com/ehec-outbreak-crowdsourced/BGI-data-analysis/wiki.

Edmunds, S. (2011). Notes from an E. coli “tweenome” – lessons learned from our first data DOI, GigaScience. Retrieved October 1, 2013, from http://blogs.biomedcentral.com/gigablog/2011/08/03/notes-from-an-e-coli-tweenome-lessons-learned-from-our-first-data-doi/.

Creative Commons. (n.d.). About CC0 – “No Rights Reserved”. Retrieved October 1, 2013, from http://creativecommons.org/about/cc0. See also Creative Commons. (n.d.). CC0 FAQ. Retrieved October 1, 2013, from http://wiki.creativecommons.org/CC0_FAQ.

Creative Commons, http://creativecommons.org/.

Open Knowledge Foundation. (n.d.). Open Definition. Retrieved October 1, 2013, from http://opendefinition.org/. An open licence may require that users of the data credit the owner of the dataset (‘attribution’), or that users who mix the data with other data also release the results under an identical licence (‘share-alike’). For more information regarding open licence terms, see Creative Commons Australia. (n.d.). About the licences. Retrieved October 1, 2013, from http://creativecommons.org.au/learn-more/licences.

Open Knowledge Foundation. (n.d.). Open Data. Retrieved October 1, 2013, from http://okfn.org/opendata/.

See, for example, Open Knowledge Foundation. (2012). Why Open Data? Open Data Handbook. Retrieved October 1, 2013, from http://opendatahandbook.org/en/why-open-data/.

Digital Curation Centre. (2013). What is digital curation? Retrieved October 1, 2013, from http://www.dcc.ac.uk/digital-curation/what-digital-curation.

Digital Curation Centre. (2013). DCC Curation Lifecycle Model. Retrieved October 1, 2013, from http://www.dcc.ac.uk/resources/curation-lifecycle-model.


The Royal Society. (2012). Science as an open enterprise. Retrieved October 1, 2013, from http://royalsociety.org/uploadedFiles/Royal_Society_Content/policy/projects/sape/2012-06-20-SAOE.pdf.

For more information about metadata, see National Information Standards Organization (NISO). (2004). Understanding Metadata. Retrieved October 1, 2013, from http://www.niso.org/publications/press/UnderstandingMetadata.pdf.

Open Knowledge Foundation. (2013). Open Data – An Introduction. Retrieved October 1, 2013, from http://okfn.org/opendata/.

For more information on appropriate data licences, see Open Knowledge Foundation. (n.d.). Making Your Data Open: A Guide. Retrieved October 1, 2013, from http://opendatacommons.org/guide/ and Open Knowledge Foundation. (n.d.). Guide to Open Data Licensing. Retrieved October 1, 2013, from http://opendefinition.org/guide/data/.



My son and I attend our local library (the Dimond branch) almost every week to check out new children's books that we read together. From what I have read on the saveoaklandlibrary.org site, our branch will be one of only four left if the proposed A Scenario happens, and will be reduced to being open only 3 days a week. Once this happens our library will become increasingly overcrowded and books will become less available, as we have to cope with the traffic that will be redirected our way from the closure of the other 14 branches. I am trying to foster a love of reading and learning in my son, but to do that I need the resources of my local library, which provides me and him with an endless supply of books. If multiple branches close, it will make it so much more difficult not only for my son, but for other children in Oakland, to have access to these valuable tools. I am probably one of the people who uses the library the least; there are many others who depend on it for internet use, study space, and research material. Libraries are for many children one of the only places where they can access such an infinite amount of knowledge. Let's not take this away from the generations to come.

I agree that the library should be the centre of the community, and that access to knowledge is critical for children's education. I sympathise that closures in Oakland may have affected your branch. It's impressive how organised the http://saveoaklandlibrary.org/ campaign is, with a popular Facebook presence! I note that you also contributed your story to the campaign (http://saveoaklandlibrary.org/tell-us-your-stories/#comment-77), which now appears to have saved the library for the 2013-15 period! Great news!

The digital library is one aspect of a library service. In fact, ensuring that materials are licensed openly means that they will exist for a long time: they can be shared easily, kept in multiple locations, and adapted to multiple devices. They can even be translated into many languages! Moreover, materials which are 'free' in both cost and licensing mean that there are as few impediments to access as possible, ensuring that people can enjoy them both now and into the future.
