Research Data Management
Good practice in data management is required to maintain reliable and accurate data throughout the data's lifecycle. Good data management will also facilitate data re-use after completion of the project and enable others to replicate research outcomes into the future. Raising awareness of good practice for data management starts with planning.
Grant Applications
Preparing a data management statement
The following information is provided to help you develop a data management (DM) statement for your research project when completing your grant application form. The statement can be drafted by responding to the prompts below in relation to your data and your project.
As a guide, responses to the following prompts are intended to produce approximately two to three paragraphs of text.
Data Management Outline | What to include in your statement |
Describe your Data | Describe the data that will be collected, generated or created during the project, including details of how the data will be generated. Describe the characteristics and features of the data, i.e. what type/s of data. Also outline the approximate volume/quantity of data you may generate. |
Data storage | Provide details of where you will store your data during the project (working data) and on completion of your project (archival data). Are there any specific data storage issues that will need to be managed? UON has a number of research data storage options for working data and provides data storage for archival data when you publish your data. |
Publishing your Data | A statement on your intent to archive and/or publish your data on completion of the project. Outline where and how your archived data will be published. UON provides services to archive and publish your research data. Other external options include subject- or discipline-specific data repositories. |
Re-use of your data | A statement outlining whether your data can or cannot be shared and re-used by others. If relevant, include details of intent to apply a Creative Commons licence (AUSGOAL) to facilitate sharing and re-use. Refer to AUSGOAL licensing: http://www.ausgoal.gov.au/the-ausgoal-licence-suite |
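For the "Describe your Data" prompt, an approximate volume can be estimated from any pilot or existing working data. The following is a minimal sketch only, assuming the data already sits in a single folder; the function name `summarise_volume` is hypothetical and not part of any UON service.

```python
from collections import defaultdict
from pathlib import Path


def summarise_volume(root):
    """Return {extension: (file_count, total_bytes)} for all files under root."""
    totals = defaultdict(lambda: [0, 0])
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Group by lower-cased extension; files without one go under "(none)".
            entry = totals[path.suffix.lower() or "(none)"]
            entry[0] += 1
            entry[1] += path.stat().st_size
    return {ext: tuple(values) for ext, values in totals.items()}
```

Calling `summarise_volume("path/to/project_data")` on a representative sample gives per-format counts and sizes that can be scaled up to the expected size of the full project.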
Note: Successful grant applications may require a full data management plan (DMP), as opposed to the above outline or statement. The Library provides data management plan templates, along with advice and assistance, to help you write your DMP.
Data Management Plans
A data management plan is a document outlining how research data and associated materials will be managed, stored, documented and secured throughout a research project, and what will happen to the data and materials after completion of the project. This includes retention and disposal, archiving, accessing, sharing or publishing the data, and any conditions or restrictions on sharing the data. The plan is intended to describe the data, the processes and the decisions involved, and to identify roles and responsibilities.
The following templates are provided to assist with the development of a data management plan. The templates address the main broad issues for consideration during the planning stage, for example:
Data Management Plan – Template (version 1)
Version 1 of the template is based on a format of 56 short questions and answers covering the broad areas of data management identified above. The author can determine the amount of information to include; however, addressing all relevant issues will create a more robust data management plan.
Download Rich Text Format (.rtf) version
Data Management Plan – Template (version 2)
Version 2 of the template is based on a format of 8 broad multi-dimensional questions requiring descriptive responses. The author can determine the level of detail for inclusion.
Download Rich Text Format (.rtf) version
Guides to Assist with Data Management and Planning
UK Data Archive. Managing and Sharing Data, 3rd edition, 2011
Australian National Data Service. Data Management Planning, 2011
ICPSR. Inter-University Consortium for Political and Social Research. Guide to Social Science Data Preservation and Archiving. Best Practice Throughout the Data Life Cycle, 2009
UK Data Archive. Data Management for Qualitative Data Using NVIVO9, 2011
Data Management Checklist
The Data Management Checklist is based on the above Data Management Plans. The checklist provides an alternative method to using the formal templates and allows the author to develop their own data management plan by identifying the issues that need decisions and relevant information to be captured and documented.
File Formats
File formats are an important consideration in managing research data.
To ensure your data remains usable and accessible over time, you need to choose durable file formats. This applies throughout the life cycle of a research project and to any long term access required after the project is completed, for example to meet a data retention period.
Not all file formats are durable over time or compatible with the need to share data with others. There is also a distinction between the types of file formats that are optimal for presentation versus those optimal for preservation and longer term access.
Considerations include:
The National Library of the Netherlands suggests the following criteria when evaluating file formats for long term preservation:
Source: Selecting file formats for long term preservation. The National Archives. 2008.
Long Term Preservation
There are a number of resources available to assist you in determining an appropriate file format for your data. The following table, prepared by the UK Data Archive, lists the data formats the Archive identifies as optimal for long term preservation of data deposited with it. These can serve as a guide.
Type of data | Acceptable formats for sharing, reuse and preservation | Other acceptable formats for data preservation |
Quantitative tabular data with extensive metadata (a dataset with variable labels, code labels, and defined missing values, in addition to the matrix of data) | SPSS portable format (.por); delimited text and command ('setup') file (SPSS, Stata, SAS, etc.) containing metadata information; some structured text or mark-up file containing metadata information, e.g. DDI XML file | proprietary formats of statistical packages, e.g. SPSS (.sav), Stata (.dta); MS Access (.mdb/.accdb) |
Quantitative tabular data with minimal metadata (a matrix of data with or without column headings or variable names, but no other metadata or labelling) | comma-separated values (CSV) file (.csv); tab-delimited file (.tab); including delimited text of given character set with SQL data definition statements where appropriate | delimited text of given character set, where only characters not present in the data should be used as delimiters (.txt); widely-used formats, e.g. MS Excel (.xls/.xlsx), MS Access (.mdb/.accdb), dBase (.dbf) and OpenDocument Spreadsheet (.ods) |
Geospatial data (vector and raster data) | ESRI Shapefile (essential: .shp, .shx, .dbf; optional: .prj, .sbx, .sbn); geo-referenced TIFF (.tif, .tfw); CAD data (.dwg); tabular GIS attribute data | ESRI Geodatabase format (.mdb); MapInfo Interchange Format (.mif) for vector data; Keyhole Mark-up Language (KML) (.kml); Adobe Illustrator (.ai), CAD data (.dxf or .svg); binary formats of GIS and CAD packages |
Qualitative data (textual) | eXtensible Mark-up Language (XML) text according to an appropriate Document Type Definition (DTD) or schema (.xml); Rich Text Format (.rtf); plain text data, ASCII (.txt) | Hypertext Mark-up Language (HTML) (.html); widely-used proprietary formats, e.g. MS Word (.doc/.docx); some proprietary/software-specific formats, e.g. NUD*IST, NVivo and ATLAS.ti |
Digital image data | TIFF version 6 uncompressed (.tif) | JPEG (.jpeg, .jpg) but only if created in this format; TIFF (other versions) (.tif, .tiff); Adobe Portable Document Format (PDF/A, PDF) (.pdf); standard applicable RAW image format (.raw); Photoshop files (.psd) |
Digital audio data | Free Lossless Audio Codec (FLAC) (.flac) | MPEG-1 Audio Layer 3 (.mp3) but only if created in this format; Audio Interchange File Format (AIFF) (.aif); Waveform Audio Format (WAV) (.wav) |
Digital video data | MPEG-4 (.mp4); motion JPEG 2000 (.mj2) | |
Documentation and scripts | Rich Text Format (.rtf); PDF/A or PDF (.pdf); HTML (.htm); OpenDocument Text (.odt) | plain text (.txt); some widely-used proprietary formats, e.g. MS Word (.doc/.docx) or MS Excel (.xls/.xlsx); XML marked-up text (.xml) according to an appropriate DTD or schema, e.g. XHTML 1.0 |
Source: UK Data Archive: http://www.data-archive.ac.uk/create-manage/format/formats-table
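As a rough illustration of how the table can be applied when reviewing project files, the sketch below sorts file names into the two columns by extension. It is a partial, hypothetical helper, not a UK Data Archive tool: the function name `classify_format` and the two sets are assumptions built from a subset of the table. Extensions that appear in both columns or depend on the data type or file version (for example .tif, .pdf, .txt, .xml, .dbf) are deliberately left out and reported for manual checking.

```python
from pathlib import Path

# Illustrative subset of extensions from the UK Data Archive table above.
# Ambiguous extensions (.tif, .pdf, .txt, .xml, .dbf, .htm/.html) are omitted
# because the extension alone does not determine which column applies.
PREFERRED = {".por", ".csv", ".tab", ".shp", ".shx", ".dwg",
             ".rtf", ".flac", ".mp4", ".mj2", ".odt"}
OTHER_ACCEPTABLE = {".sav", ".dta", ".mdb", ".accdb", ".xls", ".xlsx", ".ods",
                    ".mif", ".kml", ".ai", ".dxf", ".svg", ".doc", ".docx",
                    ".jpg", ".jpeg", ".raw", ".psd", ".mp3", ".aif", ".wav"}


def classify_format(filename):
    """Classify a file name by extension against the table subset above."""
    ext = Path(filename).suffix.lower()
    if ext in PREFERRED:
        return "preferred"
    if ext in OTHER_ACCEPTABLE:
        return "other acceptable"
    # Unknown or ambiguous extension: needs a person to check the table.
    return "check manually"
```

For example, `classify_format("interview_01.docx")` falls in the "other acceptable" group, which may prompt a conversion to Rich Text Format before archiving.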
Xena. Software for Digital Preservation.
National Archives of Australia.
Xena software aids digital preservation by performing two important tasks:
PRONOM - The Technical Registry
Developed by the Digital Preservation Department of the UK National Archives, PRONOM provides authoritative information about data file formats and supporting software products including technical requirements and support lifecycles. Users can search by file format extension, name of software, vendor or keyword.
Library of Congress. Sustainability of File formats
This website provides a browsable alphabetical list of descriptions of file formats, file format classes, bitstream structures and encodings, along with information on compression of files or bitstreams.
Global Digital Format Registry. Available in January 2012
A collaborative project between Harvard University Library, the National Archives and Records Administration (NARA) and the Online Computer Library Center (OCLC).
Developed as part of a broader program providing guidance on the creation of digital content and preservation for future generations.
Digital Curation and Preservation Bibliography
From the Digital Scholarship website covering digital copyright, digital curation, digital repositories, open access, scholarly communication, and other digital information issues.
Additional Resource Guides for File formats
UK Data Archive. Managing and Sharing Data (file formats, p. 11)
http://www.data-archive.ac.uk/media/2894/managingsharing.pdf
DCC. Digital Curation Manual. File Formats.
http://www.dcc.ac.uk/sites/default/files/documents/resource/curation-manual/chapters/file-formats/file-formats.pdf
Selecting file formats for long term preservation. The National Archives. 2008
http://www.nationalarchives.gov.uk/documents/selecting-file-formats.pdf
Library of Congress. Sustainability of File formats
http://www.digitalpreservation.gov/formats/fdd/browse_list.shtml
National Library of Australia. Preserving Access to Digital Information
http://www.digitalpreservation.gov/formats/fdd/browse_list.shtml
Cambridge University. (2011). Common Image Formats: what to use when.
http://www.lib.cam.ac.uk/dataman/resources/common_image_formats_table.pdf
JISC Digital Media (2011). Still images, moving images and sound advice.
http://www.jiscdigitalmedia.ac.uk/
File Formats – Working Level. ANDS Guide
http://ands.org.au/guides/file-formats-working.html
PDF Files for long term preservation
http://hul.harvard.edu/ois/digpres/docs/OIS_recs_for_pdf.pdf
A format for digital preservation of images. A study on JPEG2000 File robustness.
D-Lib Magazine. July/August 2008. Vol. 14, number 7/8.
http://www.dlib.org/dlib/july08/buonora/07buonora.html
Find UoN Data
View University of Newcastle data collections in Research Data Australia
Research Technology Catalogue - A guide developed by UoN Academic Research Computing Support listing tools, technologies and services to support researchers at the University of Newcastle.