The British Oceanographic Data Centre (BODC) has launched a Published Data Library (PDL) that enables specific datasets to be cited in journal papers through the assignment of a Digital Object Identifier (DOI), in collaboration with the British Library and the global DataCite initiative. A fundamental assumption is that the copy of the dataset will be exactly the same each time it is referenced.
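One common way to check that a cited copy is "exactly the same" is to compare a cryptographic checksum recorded at publication time with one computed later. The following is a minimal sketch under that assumption; it is not a description of BODC's own procedure, and the function names `sha256_of_file` and `is_unchanged` are hypothetical helpers introduced for illustration.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large datasets need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path, recorded_digest):
    """True if the file's current digest matches the digest recorded earlier.

    `recorded_digest` is assumed to be a hex string stored alongside the
    dataset when it was published (a hypothetical convention).
    """
    return sha256_of_file(path) == recorded_digest.strip().lower()
```

A digest would be recorded once when the dataset is deposited; any later byte-level change to the file then causes `is_unchanged` to return `False`.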
The assignment of a DOI is not a substitute for long-term data management by the Natural Environment Research Council (NERC) data centres, which enables users to construct their own datasets from all data holdings.
The PDL is designed for base datasets suitable for future re-use in other applications, rather than data reworked specifically for a single research publication (the latter sometimes termed 'data behind the graph'). It is based on a model in which multiple copies of datasets are stored indefinitely and in full. Because data storage is not an infinite resource, the system has limits on the size and number of datasets that it can handle, and procedures are in place to decide which datasets should be included.
PDL datasets will be of two types:
- Datasets that have not yet been ingested into the BODC system, but are destined for future ingestion.
Candidate datasets will be identified through negotiation between data originators and BODC. Ensuring that the technical quality of these datasets (including metadata) meets an acceptable standard is the responsibility of the data originator. BODC will judge the acceptability of candidate datasets in terms of their completeness, but not in terms of their scientific quality or value.
- Datasets that have been ingested into the BODC system and subsequently exported.
Candidate datasets of this type will be identified through discussion between the scientists who supplied the data and BODC. The technical quality of these datasets is BODC's responsibility.