Data Availability Statements – Tips

The Background: While there aren’t agreed standards as such, there is a degree of consensus regarding certain types of statements’ suitability for use, as well as some emerging communities of practice. 

6 Quick General Tips

  1. Encourage the use of persistent identifiers or PIDs (for example, DOIs for datasets, ORCIDs for authors, RRIDs for reagents – more information can be found on the ORCID website here)
  2. Engage with journal editors, learned societies and other domain leaders to work out what standards, identifiers and language are appropriate for the community. You could use the RDA policy framework as the outline for the conversation. 
  3. It is preferable to upload data to a repository, and include a link within a research article, rather than hosting via a supplementary material facility.
  4. Sometimes data do need to be kept closed, but this doesn’t need to be the default situation. Ask the researcher/author why should it be closed rather than why should it be open. 
  5. Where possible, have some information (metadata) in front of any paywall to point to where underlying data can be found. See the following examples:

Taken from Scientific Data, a Gold Open Access journal.
This example points towards a dataset being held in a repository, although it doesn’t give the specific DOI.
The inclusion of the data licence is particularly useful for potential sharing.


Taken from The Journal of Social Psychology, which is a subscription journal.
Of particular interest and merit is the fact that although the article was behind a paywall, the information about the data is freely accessible as part of the metadata in front of the paywall.

Taken from Mathematical Problems in Engineering a wholly Gold Open Access journal.
Although it does not provide any information about additional data,
it does enable the reader to be certain that they have access to all the relevant data underlying the article.

6. Think about the workflow – pre-submission as well as the peer review and production of research outputs – from the researcher’s viewpoint.

  • Is there anything that can be simplified through using Crossref’s or DataCite’s automated services? 
  • Are there any confusing or even contradictory requirements that can be clarified, thereby improving the experience and eventual quality of outputs?
  • Is it possible that your colleagues and vendors (such as copy-editors and typesetters) should be included in this investigation for the sake of consistency throughout the workflow?

This text is partly based upon a blogpost published on the OASPA website via a CC-BY licence: 

DAS Samples

To help support publishers implement DASs, we have collected some sample statements that have been developed by experts and can be adapted for re-use by publishers. 

Table 1 is a slightly modified version of the Belmont Forum’s DAS template. It was designed by a combined group of funder and publisher representatives, ratified in October 2018 and is available through a CC-BY 4.0 licence.

TABLE 1

Requirement Example(s) 







Confirmation the data do (or do not) exist 
Crystal structures are accessible, subject to registration, from “My University” [​grid.423328.c] ​at https://doi.org/10.15125/010203 [1] 

This study was a re-analysis of existing data that are available from XXXX [grid.23345] at https://doi.org/10.15125/12345. Further documentation about data processing are available subject to registration from the XXXX Centre [grid.317] at https://doi.org/10.15125/12345. 

No new data​​ were created during this study.The study brought together existing data obtained upon request and subject to licence restrictions from a number of different sources. Full details on how these data were obtained, including individual persistent identifiers, workflow analyses and processing, are available in the documentation available at https://doi.org/10.15125/12345​. 
Information on where the data underlying the article can be found Microscopy images are openly accessible, using a CC-0 licence from XXXX University [​grid.466587.e] ​at https://doi.org/10.15125/01423
NB Specific contact details and charges for accessing should not be included in the DAS itself as they are likely to change over time.
As well as reference to the specific data underpinning the reproducibility of the relevant research article (subset) and to the overall dataset(s) relating to the project[2]Information ​on the full suite of datasets relating to the ​“Mapping Rivers and Impacts” Program ​​can be accessed at https://doi.org/10.15125/12345 
Persistent identifiers where available including ​DOIs​​, ​Accession Numbers​​, IGSN​​s. We are aware that these systems are in the process of developing so ​Grant ID​​s, for instance, may become compulsory at a later point in timeThis paper contains several samples identified by IGSN, one of them is IGSN: SSH000SUA. Information about this sample can be obtained by resolving the IGSN by adding the URL of the resolver before the IGSN: http://igsn.org/SSH000SUA​​ 
Standard terms for repositories/institutions (such as GRID/ISNI/OrgRef​​) Microscopy images are openly accessible, using a CC-0 licence from XXXX University [​​grid.466587.e]​​ ​at https://doi.org/10.15125/01423  
Licensing restrictions Microscopy images are openly accessible, using a ​CC-0 licence​​ from XXXX University [​grid.466587.e] ​at https://doi.org/10.15125/01423 
Access requirements (e.g., ethics, environmental concerns privacy, registration and fees) Due to the fact that ​human subjects​​ are involved, supporting data cannot be made openly accessible. Further information about the data and conditions for access can be found at the University of XXX [​grid.7340.0] ​data archive: https://doi.org/10.15125/1234 
The name of Project and Granting Agency/Agencies (until Grant Identifiers mature). The underlying data relating to this article were funded by the ​National Science Foundation, Japanese Technology Fund, UK Research Fund and National Taiwan Research​​ within the ​Belmont Forum “Mapping Rivers and Impacts” Program​​.

Further DAS Information

For further perspective on how different publishers’ work and thinking in this area has already begun to align, please see below. Table 2 is a synthesised set of DAS characteristics based upon the Center for Open Science survey across a range of STM publisher websites. It shows how consistent the requests for information are and how even providing a limited range of responses will elicit useful information that strengthens the links between underlying data and the published article.

TABLE 2

Option (Multi choice selection)Follow-up question (Allows free text response)
Data openly available in a repositoryPlease provide a DOI or other identifier of the dataset
Data available in a protected access repository.Please provide a persistent identifier or description of the dataset and a link to a location that describes how data can be accessed.
Data derived from public domain resourcesPlease describe the data sources
Data are available on request due to privacy or other restrictionsPlease describe the restrictions to data access.
The data are not available.Please provide the rationale for inability to share data. For example, data that could be potentially harmful to individuals, or data provided upon condition of confidentiality.
All of the data supporting underlying findings are included in the manuscript and its supplemental files. 
Not applicable, no data were used to support this article  

[1] Global Research Identifier Database (GRID) is part of the Digital Science portfolio and downloadable using a Creative Commons Public Domain 1.0 licence. Other research organisational identifier systems, such as ROR, are available or in development.

[2] Do note, this characteristic emerges because of the involvement of the funder. The Belmont Forum is not just interested in data specifically underlying published articles, it is also interested in any research data that is produced by their funded research.