Where relevant and appropriate, we encourage authors to cite data as they would cite other research objects, such as publications and books, in their reference lists. See below for guidance and examples of how to cite datasets in your manuscript.
Citing data may not be as straightforward as citing other research objects for a number of reasons, as discussed by NISO. Infrastructure for storing and sharing data varies significantly across disciplines, and not all datasets are static. There is therefore no single universal standard for citing data, but several organisations have created guiding principles for data citation. FORCE11’s Joint Declaration of Data Citation Principles is one example, endorsed by NISO.
Several of our journals have existing policies on data citation. When considering how to cite data, you should first check your journal's information pages to see if they provide guidance that is relevant to you and your field.
If you are citing data stored in a repository or archive, the repository itself may also provide guidance on data citation. If your journal does not have a policy on data citation, you should follow the repository's guidance on how to cite its data.
If you would like to cite data you have used in your work, and neither your journal nor the data source provides guidance on how to do so, we recommend including the following minimum elements necessary for dataset identification and retrieval. These are adapted from IASSIST's Quick Guide to Data Citation.
Name(s) of each individual or organisational entity responsible for the creation of the dataset.
The year the dataset was published or disseminated.
The complete title of the dataset, including the edition or version number, if applicable.
The organisational entity that makes the dataset available by archiving, producing, publishing, and/or distributing the dataset.
The electronic location or identifier of the dataset. Whenever possible, this should be a unique, persistent, global identifier used to locate the dataset (such as a DOI). Otherwise, a web address and the date of access can be used.
As a general rule, you should include enough identifying elements to precisely specify which data you have used in your work. These elements can then be arranged following the punctuation and order used for other citations in your journal's style. If the dataset has a DOI, you can use DataCite's DOI Citation Formatter to generate a citation in your choice of many standard formatting styles.
We also recommend prefacing any data citation with [dataset], to help us recognise and identify it with appropriate metadata when publishing your article.
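As an illustration only, the minimum elements above could be assembled into a single citation string along the following lines. This is a minimal sketch in Python: the function name `format_dataset_citation`, the punctuation and ordering it uses, and every value in the usage example (including the DOI) are hypothetical placeholders rather than any journal's or repository's required style. In practice, you should arrange the elements to match your journal's own citation format.

```python
from typing import Optional, Sequence


def format_dataset_citation(
    authors: Sequence[str],
    year: int,
    title: str,
    publisher: str,
    identifier: str,
    version: Optional[str] = None,
    accessed: Optional[str] = None,
) -> str:
    """Join the minimum citation elements into one string.

    The layout is an assumed, generic one:
    [dataset] Authors (Year). Title (Version N). Publisher. Identifier.
    """
    if not authors:
        raise ValueError("At least one author or organisation is required.")

    # Author(s) and year of publication or dissemination.
    parts = ["; ".join(authors) + f" ({year})."]

    # Complete title, with the edition or version number if applicable.
    if version is None:
        parts.append(f"{title}.")
    else:
        parts.append(f"{title} (Version {version}).")

    # Organisation that archives, publishes, or distributes the dataset.
    parts.append(f"{publisher}.")

    # Persistent identifier (such as a DOI), or a web address plus access date.
    if accessed is None:
        parts.append(f"{identifier}.")
    else:
        parts.append(f"{identifier} (accessed {accessed}).")

    # Preface with [dataset] so the citation can be recognised as data.
    return "[dataset] " + " ".join(parts)


# Hypothetical placeholder values, not a real dataset or DOI.
print(
    format_dataset_citation(
        authors=["Example Survey Team"],
        year=2020,
        title="Example Rainfall Records",
        publisher="Example Data Archive",
        identifier="https://doi.org/10.1234/example",
        version="2",
    )
)
# [dataset] Example Survey Team (2020). Example Rainfall Records (Version 2). Example Data Archive. https://doi.org/10.1234/example.
```

When the dataset has no persistent identifier, passing a web address as `identifier` together with an `accessed` date yields the fallback form described above.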
Example 1: The journal Political Analysis gives specific guidance for citing data in its author instructions, including example citations of datasets in the form Author, "Title", Identifier, Version, Date. This format should be followed when citing data in this journal, for example:
Example 2: The U.S. Geological Survey's ScienceBase Catalog provides example citations for many of its datasets. If your journal does not specify otherwise, you should follow the citation format given, for example:
Example 3: The India Water Portal hosts many datasets but does not provide specific guidance on how to cite them. If your journal does not specify how to cite data, you should follow your journal’s style for other citations, including all the elements necessary to identify the dataset. For example, if your journal uses the Chicago Manual of Style:
If you have further questions about citing data, please feel free to contact us: researchdata@cambridge.org