Citing Data
With many researchers now sharing and reusing data, there is a growing need to cite data as a scholarly output in the same way as traditional print outputs such as books, journal articles and conference papers: by including a bibliographic reference that acknowledges the original data creator(s).
For further information you can read:
Short term benefits and long term value for making datasets citable (Alex Ball & Monica Duke, UKOLN, University of Bath) Available online: http://www.dcc.ac.uk/resources/briefing-papers/introduction-curation/data-citation-and-linking#benefits
Data Citation for researchers (Australian National Data Service) Available online: https://www.ands.org.au/working-with-data/citation-and-identifiers/data-citation/data-citation-for-researchers
Ball, A. & Duke, M. (2015). ‘How to Cite Datasets and Link to Publications’. DCC How-to Guides. Edinburgh: Digital Curation Centre. Available online: https://dcc.ac.uk/guidance/how-guides/cite-datasets
Standards for Data Citation
Standards for data citation vary across disciplines. Some data repositories and archives provide formats for citing data as part of the metadata record for the dataset.
The DataCite Consortium provides a recommended minimum format for citing data:
Required elements
Optional elements
DataCite format examples for a data citation:
Creator (PublicationYear): Title. Publisher. (resourceTypeGeneral). Identifier
Creator (PublicationYear): Title. Version. Publisher. (resourceTypeGeneral). Identifier
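The two patterns above can be assembled from a metadata record in a few lines of code. The following is a minimal sketch in Python, not an official DataCite tool: the class name `DatasetRecord`, the function name `format_datacite_citation` and the field names are hypothetical, chosen to mirror the element names in the patterns. It treats Version as the only element that may be left out, because that is the only difference between the two patterns shown.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatasetRecord:
    """Hypothetical container for the elements named in the patterns above."""
    creator: str
    publication_year: int
    title: str
    publisher: str
    resource_type_general: str
    identifier: str
    version: Optional[str] = None  # present only in the second pattern


def format_datacite_citation(record: DatasetRecord) -> str:
    """Build a citation string following the two patterns shown above."""
    parts = [f"{record.creator} ({record.publication_year}): {record.title}."]
    if record.version:
        # Version sits between Title and Publisher when it is supplied.
        parts.append(f"{record.version}.")
    parts.append(f"{record.publisher}.")
    parts.append(f"({record.resource_type_general}).")
    parts.append(record.identifier)
    return " ".join(parts)


# Placeholder values for illustration only, not a real dataset.
example = DatasetRecord(
    creator="Surname, A.",
    publication_year=2020,
    title="Example survey responses",
    publisher="Example University",
    resource_type_general="Dataset",
    identifier="https://doi.org/10.0000/example",
)
print(format_datacite_citation(example))
```

Setting `version="2.1"` on the same record would place `2.1.` between the title and the publisher, as in the second pattern.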
Tips for Citing Data
To cite data in APA 7th referencing style:
In-Text Citation
(Paris et al., 2015)
Reference List
Paris, T., Kim, J., & Davis, C. (2015). EEG responses to two contexts of AV speech presentation [Data set]. Western Sydney University. http://doi.org/10.4225/35/54bf146fa4012
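If you generate references programmatically, the APA 7th example above can be reproduced with a short script. This is a sketch only: the function names `apa_in_text` and `apa_dataset_reference` are hypothetical, it outputs plain text (so the italics APA applies to the title are not represented), and it does not handle the special rule for works with more than 20 authors.

```python
from typing import List


def apa_in_text(surnames: List[str], year: int) -> str:
    """In-text citation: one author, two authors, or 'et al.' for three or more."""
    if not surnames:
        raise ValueError("at least one author surname is required")
    if len(surnames) == 1:
        lead = surnames[0]
    elif len(surnames) == 2:
        lead = f"{surnames[0]} & {surnames[1]}"
    else:
        lead = f"{surnames[0]} et al."
    return f"({lead}, {year})"


def apa_dataset_reference(authors: List[str], year: int, title: str,
                          publisher: str, url: str) -> str:
    """Reference list entry for a dataset; authors are given as 'Surname, I.'."""
    if not authors:
        raise ValueError("at least one author is required")
    if len(authors) == 1:
        names = authors[0]
    else:
        # A comma and an ampersand precede the final author.
        names = ", ".join(authors[:-1]) + ", & " + authors[-1]
    return f"{names} ({year}). {title} [Data set]. {publisher}. {url}"


print(apa_in_text(["Paris", "Kim", "Davis"], 2015))
print(apa_dataset_reference(
    ["Paris, T.", "Kim, J.", "Davis, C."],
    2015,
    "EEG responses to two contexts of AV speech presentation",
    "Western Sydney University",
    "http://doi.org/10.4225/35/54bf146fa4012",
))
```

The two calls reproduce the in-text citation and the reference list entry shown above.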
Data Citation Formatting Examples
Heathcote, A. (2006) Examining the origins of the word frequency effect in episodic recognition memory and its relationship to the word frequency effect in lexical memory. University of Newcastle, Australia. http://hdl.handle.net/1959.13/807086.
PANGAEA - Earth & Environmental Science Data Library
Jahnke, A et al. (2007): Polyfluorinated alkyl substances (PFAS) in high-volume air samples collected during Polarstern expedition ANT-XXIII/1. doi:10.1594/PANGAEA.610160
Australian Social Science Data Archive
Dobson, A. J., et al. Australian Longitudinal Study on Women's Health, 2003: Food Frequency Questionnaire. [Computer file]. Canberra: Australian Social Science Data Archive, The Australian National University, 2005.
Barnes RSK, Ellwood MDF (2011) Data from: Macrobenthic assemblage structure in a cool-temperate intertidal dwarf-eelgrass bed in comparison to those in lower latitudes. Biological Journal of the Linnean Society doi:10.5061/dryad.v8gg2
SEER (Surveillance Epidemiology and End Results), National Cancer Institute (US)
Surveillance, Epidemiology, and End Results (SEER) Program Populations (1969-2009) (www.seer.cancer.gov/popdata), National Cancer Institute, DCCPS, Surveillance Research Program, Cancer Statistics Branch, released January 2011.
Bibliographic Management Software
EndNote
The EndNote software (Thomson Reuters) includes a template for reference type 'dataset' for versions X4 and above.
Other bibliographic management software may support creating custom templates for datasets. Consult your style manual or guide for advice, or use one of the DataCite standards.
Altman, M. & Florence, D. (2007) A Proposed Standard for the Scholarly Citation of Quantitative Data. D-Lib Magazine, 13(3-4). doi:10.1045/march2007-altman
Ball, A. & Duke, M. (2011) Data Citation and Linking. Data Seal of Approval. http://www.datasealofapproval.org/?q=node/66
Birney, E., Hudson, T. J., Green, E. D., Gunter, C., Eddy, S., Rogers, J., et al. (2009). Prepublication data sharing. Nature, 461(7261), 168-70. doi:10.1038/461168a
CODATA (The Committee on Data for Science and Technology) (2010). Data Citation Standards and Practices. http://www.codata.org/task-groups/data-citation-standards-and-practices
Constable, H., Guralnick, R., Wieczorek, J., Spencer, C., & Peterson, A. T. (2010). VertNet: A new model for biodiversity data sharing. PLoS Biology, 8(2), e1000309. doi:10.1371/journal.pbio.1000309
Green, T. (2009) We need publishing standards for datasets and data tables. OECD Publishing White Paper, OECD Publishing. doi:10.1787/603233448430
Mons, B., Haagen, H. van, Chichester, C., Hoen, P.-B. ’T, Dunnen, J. T. den, Ommen, G. van, et al. (2011). The value of data. Nature Genetics, 43(4), 281-3. Nature Publishing Group. doi:10.1038/ng0411-281
Moore, A. J., McPeek, M. A., Rausher, M. D., Rieseberg, L., & Whitlock, M. C. (2010). The need for archiving data in evolutionary biology. Journal of Evolutionary Biology, 23(4), 659-60. doi:10.1111/j.1420-9101.2010.01937.x
Page, R. D. M. (2010). Enhanced display of scientific articles using extended metadata. Web Semantics: Science, Services and Agents on the World Wide Web, 8(2-3), 190-195. doi:10.1016/j.websem.2010.03.004
Piwowar HA, Day RS, Fridsma DB (2007) Sharing Detailed Research Data Is Associated with Increased Citation Rate. PLoS ONE 2(3): e308. doi:10.1371/journal.pone.0000308
Sieber, J. & Trumbo, B. (1995). (Not) giving credit where credit is due: Citation of data sets. Science and Engineering Ethics, 1(1), 11–20. doi:10.1007/BF02628694
Tenopir, C., Allard, S., Douglass, K., Aydinoglu, A. U., Wu, L., Read, E., et al. (2011). Data Sharing by Scientists: Practices and Perceptions. (C. Neylon, Ed.) PLoS ONE, 6(6), e21101. doi:10.1371/journal.pone.0021101
Wellcome Trust. (2003). Sharing Data from Large-scale Biological Research Projects: A System of Tripartite Responsibility. https://wellcome.ac.uk/sites/default/files/wtd003207_0.pdf
Whitlock, M. C., McPeek, M. A., Rausher, M. D., Rieseberg, L., & Moore, A. J. (2010). Data archiving. The American Naturalist, 175(2), 145-6. doi:10.1086/650340
Whitlock, M. C. (2011). Data archiving in ecology and evolution: best practices. Trends in Ecology & Evolution, 26(2), 61-65. doi:10.1016/j.tree.2010.11.006