“Open data can underpin innovation, for example when researchers with fresh perspectives use data in unexpected ways or when companies use data to help them develop new products.”
Data storage and preservation at the University of Hull
If you will be handling sensitive personal data which is in scope of Data Protection regulations, read the guidance on Managing Personal Data before making a decision about where to store your files.
Written for University of Hull students, the Digital Student guide to making the most of digital technology covers Online Safety and Security, including storing your data securely and protecting yourself from scams and phishing attacks.
Introduction to OneDrive from the University of Hull's IT Services
OneDrive is a networked file storage system, with the current quota set at 0.5 TB of storage space per person (maximum individual file size 250GB). It's part of the Microsoft Office 365 suite of services available to all University of Hull staff and students, which can be accessed from any device, in any location with internet access.
Files are encrypted in transit and located in EU datacentres for compliance with Data Protection regulations. OneDrive is regularly backed up, and recovery of lost files can be initiated.
OneDrive: Guidance around storing and sharing
Files located in OneDrive can be shared with specific people within or outside the University (via their email address), anyone at the University of Hull, or anyone with the link (in which case the link expires after 90 days). You can choose whether people you share with have permission to edit the files, comment, or simply view them.
What happens to my OneDrive when I leave the University?
OneDrive should not be used for preserving data after a project is completed, as access to files is blocked once the file owner's account is closed. If you are leaving the University, you will need to ask colleagues to make their own copies of the files, or move them to a shared Team or offsite location before your account is closed.
Microsoft Teams is a cloud-based communication and collaboration platform, which integrates with other MS Office products and supports working together on files. Each Team is connected to a SharePoint parent site for file storage.
- University of Hull IT Services support for Microsoft Teams
- Skills Guide for Teams (from University of Hull Library).
How do I share and work on files with others in Teams?
If you are undertaking a project with multiple collaborators over a significant period of time, IT Services recommend setting up a Team - anyone with an email address can be invited. Members can access the Team from any browser, to upload or download files, and view/edit/comment on files added by other members, as well as benefitting from other Teams functions such as chat, meetings and whiteboards.
University of Hull staff can request a new Team from IT Services. Students should ask their academic supervisor to request a Team on their behalf.
Teams has limited potential as a platform for preserving files after completion of a project. Files are owned by the Team rather than the creator, so remain accessible to all members if the creator leaves the institution. However, University of Hull policy commits ICT to archive any Team which has been inactive for 90 days, then delete the content after a further 90 days (with a 30-day recovery window).
All staff and students are advised to use OneDrive as their primary filestore to benefit from vast storage capacity, access from any networked device, and compliance with EU Data Protection regulations.
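The archive and deletion timeline described above can be worked through as simple date arithmetic. The sketch below is illustrative only: the function name is hypothetical, and it assumes the 30-day recovery window starts on the deletion date, which the policy summary does not spell out.

```python
from datetime import date, timedelta

INACTIVITY_DAYS = 90  # inactive period before a Team is archived
ARCHIVE_DAYS = 90     # further period before archived content is deleted
RECOVERY_DAYS = 30    # recovery window; ASSUMED here to start at deletion


def team_retention_dates(last_activity: date) -> dict:
    """Return the key dates for a Team with no activity after `last_activity`."""
    archived = last_activity + timedelta(days=INACTIVITY_DAYS)
    deleted = archived + timedelta(days=ARCHIVE_DAYS)
    recovery_ends = deleted + timedelta(days=RECOVERY_DAYS)
    return {"archived": archived, "deleted": deleted, "recovery_ends": recovery_ends}


if __name__ == "__main__":
    # Hypothetical Team last used on 1 January 2025.
    for label, day in team_retention_dates(date(2025, 1, 1)).items():
        print(f"{label}: {day.isoformat()}")
```

Under these assumptions, a Team last used on 1 January 2025 would be archived on 1 April 2025 and its content deleted on 30 June 2025, so files that need to be kept should be moved well before then.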
The on-campus hardware that hosts the University's networked filestore is reaching the end of its life. Starting September 2024, IT Services will contact all current users to migrate their data to OneDrive and retire their G: Drive.
More information for University staff and students.
Wasabi is an S3-compatible cloud object storage system now offered by the University, intended for storing large static datasets that need to be kept. It's low cost, with no ingress or egress charges.
University of Hull staff can request further information, and discuss whether the solution will fit their requirements, by contacting IT Services via the Staff Portal. Students should ask their academic supervisor to request this information.
Sharepoint site for Worktribe: access, guides and FAQs
If your research project is logged in Worktribe, any datasets which have potential value after the completion of the project should be recorded as project outputs. Each record must include the title of the dataset, date of collection and location of the files. The files themselves can be added to the record if no suitable external repository can be identified.
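As an informal illustration of the three details each record must include, a dataset description could be checked for completeness before it is logged. The field names and the function below are hypothetical; they are not Worktribe's own data model or API.

```python
# Hypothetical field names for the three required details listed above.
REQUIRED_FIELDS = ("title", "date_of_collection", "file_location")


def missing_fields(record: dict) -> list:
    """Return the required fields that are absent or blank in `record`."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


if __name__ == "__main__":
    # Placeholder record with one detail still to be filled in.
    draft = {
        "title": "Example survey responses",
        "date_of_collection": "2024-03-05",
        "file_location": "",
    }
    print(missing_fields(draft))
```

An empty list means all three details are present; anything else names the gaps to fill before the output record is created.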
Datasets deposited in Worktribe can be made openly accessible through Repository@Hull if appropriate, up to a maximum file size of 512 MB. The Repository will provide a citable URL and reusable metadata, but it does not assign a DOI to the dataset.
Worktribe's suitability for preserving zipped files and proprietary formats is untested. It has not yet been evaluated against the international standards for a Trusted Data Repository.
The University of Hull's Data Safe Haven is a secure research environment with appropriate technical and information governance controls for the storage and processing of sensitive research data, managed by the Hull Health Trials Unit.
The DSH is available to researchers from their own machines, wherever they are. Data resides within a secured data centre and is accessed via a secured connection, meaning that all activity occurs within the environment and not on the end user's machine.
The DSH gives researchers an environment to store and analyse sensitive data without it ever leaving the security of the data centre. It is available to support activity across all areas of the University, for researchers with or without a budget.
The University of Hull's instance of Box has been decommissioned, as the University supports the use of Microsoft OneDrive and Teams to facilitate cyber-secure cloud storage and collaborative working. A separate Box instance has been retained by Hull Health Trials Unit for managing their own data.
Hydra was the University of Hull's original repository for datasets and other research outputs, before the launch of Worktribe in 2017. It is closed to new deposits as it no longer meets current cyber-security standards. A project is underway to transfer Hydra files to alternative repositories.
If you have a stake in datasets preserved in Hydra, contact repository@hull.ac.uk to discuss next steps.
Anonymization:
The EU-funded OpenAIRE consortium has released Amnesia, a free web-based tool for removing identifying information from data.
Encryption:
If you are working with sensitive data, you may need to encrypt that data, particularly if you are moving the data between systems, or sharing with others.
- Guidance about encryption from University of Hull IT Services.
- The ESRC-funded UK Data Service has published a guide to data encryption software.
Further advice about backups, external hard drives and data encryption can be found at Get Safe Online, a UK government-funded initiative for businesses and the general public.
You may be familiar with the cloud-based services Dropbox and Google Drive for sharing files. Be mindful that these services may store files in jurisdictions outside the European Economic Area, which, for personal data, is a breach of UK Data Protection legislation. As these services are not supported by the University, standards for file recovery and preservation may be subject to change without notice.
Any USB flash drives left behind in PCs or sockets on campus will be destroyed, so don't rely on a USB stick for storing or transporting files you can't afford to lose.
Don't share files as email attachments when you can use OneDrive instead, to benefit from file encryption and daily backup.
See the University's File Storage Policy (April 2021) for a detailed overview of the provision of local disk drives, network storage, cloud storage options, and guidance on the use of removable media (including USB sticks and external hard drives).
The University's Cloud Security Policy (2024) outlines the University's expectations in relation to the security of cloud services used to process or store University data.
If your project will involve the introduction of IT systems or technology into the University, you should submit an ICT Project Request as early as possible in the planning process, so that questions of security, data protection and technical compatibility can be assessed before implementation. A Data Protection Impact Assessment may be required if the platform will be used for storing or processing sensitive personal data.
Choosing a file format
The file formats best suited for long-term sustainability and accessibility:
• Are frequently used
• Have open specifications
• Are independent of specific software, developers or vendors.
It may be appropriate to deposit copies of your data in more than one format to maximise both utility and preservation (for instance, .xls and .csv).
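For tabular data, an open-format copy can be produced with nothing more than a scripting language's standard library. The sketch below uses Python's built-in `csv` module; the function name is hypothetical, and the rows are assumed to have been read from the spreadsheet already.

```python
import csv
from pathlib import Path


def write_csv_copy(rows, header, path):
    """Write tabular data to a UTF-8 CSV file, an open, software-independent format."""
    path = Path(path)
    # newline="" lets the csv module control line endings, as its documentation advises.
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)
    return path
```

Calling `write_csv_copy(rows, ["site", "depth_m"], "survey.csv")` would save a plain-text copy to deposit alongside the original spreadsheet. Formatting, formulas and multiple sheets are not carried over, which is one reason to deposit both versions.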
Many established data repositories maintain directories of preferred (ideal) and accepted (good enough) file formats for each type of data: tabular, audio, video, text, geographical information and others:
UK Data Service Recommended File Formats
DANS File Formats (from the Dutch national data repository, one of the largest in the world)
Archaeology Data Service Guidelines for Depositors
The global Open Preservation Foundation has carried out an international comparison of recommended file formats.
Digital repositories which offer a mediated deposit service may permit certain less sustainable formats for file transfer, then migrate the data to an open format on your behalf: for instance, The National Archives (UK).
File management hints and tips
Recommended: the University of Cambridge's guide to Organizing Your Data, covering best practice with file naming, version control, documentation and metadata.
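A consistent naming scheme is easier to keep up if file names are generated rather than typed. The sketch below shows one common convention (ISO date first, lower-case words joined by hyphens, a zero-padded version number); this pattern is an assumption for illustration and is not taken from the Cambridge guide, and the function names are hypothetical.

```python
import re
from datetime import date


def slugify(text: str) -> str:
    """Lower-case `text` and replace runs of other characters with single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_filename(project: str, description: str, day: date, version: int, extension: str) -> str:
    """Build a name such as 2024-03-05_project_description_v02.csv."""
    stem = "_".join([
        day.isoformat(),      # ISO dates sort chronologically in a file listing
        slugify(project),
        slugify(description),
        f"v{version:02d}",    # zero-padded so v02 sorts before v10
    ])
    return f"{stem}.{extension.lstrip('.').lower()}"


if __name__ == "__main__":
    # Placeholder project and file description.
    print(build_filename("Humber Survey", "Interview transcripts", date(2024, 3, 5), 2, ".CSV"))
```

Whatever pattern you choose, write it down in the project documentation so that collaborators apply it in the same way.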
The free-to-use Omni Calculator (an open source EU-based start-up) offers tools for projecting file sizes, plus tips for compressing files.
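Rough projections of this kind can also be made by hand. The sketch below applies the standard formulas for uncompressed image and audio data; it is not Omni Calculator's own method, the function names are hypothetical, and compressed formats such as JPEG or MP3 will produce considerably smaller files.

```python
def raw_image_bytes(width_px: int, height_px: int, channels: int = 3, bits_per_channel: int = 8) -> int:
    """Uncompressed image size: pixels x channels x bit depth, rounded up to whole bytes."""
    total_bits = width_px * height_px * channels * bits_per_channel
    return (total_bits + 7) // 8


def raw_audio_bytes(seconds: int, sample_rate_hz: int = 44_100, bit_depth: int = 16, channels: int = 2) -> int:
    """Uncompressed audio size: duration x sample rate x channels x bit depth."""
    total_bits = seconds * sample_rate_hz * channels * bit_depth
    return (total_bits + 7) // 8


def to_gb(size_bytes: int) -> float:
    """Convert bytes to decimal gigabytes (1 GB = 1,000,000,000 bytes)."""
    return size_bytes / 1_000_000_000


if __name__ == "__main__":
    print(raw_image_bytes(4000, 3000))        # one 12-megapixel RGB image
    print(to_gb(raw_audio_bytes(3600) * 40))  # forty hours of CD-quality stereo audio
```

Multiplying a per-file estimate by the expected number of files gives a figure to compare against storage quotas and repository upload limits.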
Archiving websites
If you have created a project website outside the University's web domain, this should be preserved like any other research output after the completion of the project. If you have been relying on project funding to pay hosting fees, you should consider archiving your website with one of the dedicated online services.
A blog post from the University of Kent Research Support Team (June 2021) offers a detailed overview of your options.
See also Key Contacts at the University of Hull for sources of specialist advice on institutional policy, infrastructure and services.
External data storage and preservation services
A selection of established external data storage platforms and repositories for scholarly research. The University of Hull has not formally endorsed these services.
If you are a University of Hull researcher with a Worktribe account, and your data relates to a project recorded in Worktribe, you should also create an output record for the data, to signpost its location. It is not necessary to upload the files to Worktribe if they are preserved in an external data repository.
Dataverse is open source research data repository software for institutions and communities, utilized by the Harvard Dataverse Repository and many other US and European research organisations. Researchers who aren't affiliated with Harvard can set up a standalone Dataverse collection for an ongoing project or research centre: data is assigned a DOI or other permanent ID for easy citation.
By default, all datasets deposited in a Dataverse repository are assigned a CC-0 licence (Public Domain), i.e. the author/creator does not retain any rights to attribution or re-use of the data, although users are encouraged to cite their source.
Dryad is a not-for-profit service supported by the University of California, which is free to use for data storage, but charges a base rate of $120 for publication (rising in line with file size).
A number of publishers sponsor Dryad so that their authors can deposit and publish data associated with a journal article free-of-charge. As part of the peer review process, reviewers will be able to access deposited data privately before the paper is accepted.
Data deposited in Dryad is assigned a mandatory CC-0 licence, i.e. data can be re-used for any purpose, without attribution.
The NERC Environmental Data Service (EDS) consists of a network of five disciplinary centres (specializing in marine, freshwater, terrestrial, atmospheric and polar research), which preserve and disseminate data from environmental scientists in the UK and around the world. Deposit in the EDS is mandatory for NERC-funded researchers (see the Policies tab); data generated from other funded projects in relevant disciplines will also be considered.
A Figshare account is currently free-of-charge; the service is owned by Digital Science (founders of Altmetric.com), and endorsed by a number of large funders and commercial publishers.
Upload any file format up to 20GB, for private retention or sharing publicly or with specific individuals. Receive a DOI in order to cite your data within your publications. Replacing the file or editing the metadata (e.g. title, authors, linked url) automatically triggers a new version, for accountability.
Figshare users are encouraged to choose a CC-0 licence for their data, or CC-BY for content which has undergone further processing such as figures and filesets. Figshare's Copyright and Licence Policy.
How to use Figshare for thesis and dissertation outputs: specific advice for graduate students looking to share outputs from their thesis work, "including datasets, presentations, posters, and other supplementary material".
Altmetric data (indicators of social media attention) are collected for all Figshare deposits. Dataset records with the necessary metadata are crawled by Google Scholar.
Figshare+ offers cloud storage for larger datasets, for a one-time data deposit charge, which it may be possible to include in a grant application.
A GitHub repository enables you to store and collaborate on project code. When you're ready to make your work public, GitHub's choosealicense.com helps you decide what licence is appropriate for your open source software.
Now owned by Elsevier, Mendeley Data acts as a searchable index to data repositories as well as a preservation platform. Register for a free account with your University of Hull ID in order to create dataset records and deposit files, up to a maximum size of 10GB per dataset. You can choose to keep records in 'draft' form (private), share with collaborators, set an embargo or publish with a Creative Commons licence and a DOI.
Mendeley Data is stored in the EU, and archived in perpetuity with the Dutch Data Archiving and Networked Services (DANS).
OSF is an open-source collaboration platform for researchers, which supports study design, literature reviews and data collection as well as file storage (both private and public). It is maintained by the US-based Center for Open Science, sponsored by federal agencies, private foundations, and commercial entities. You can choose to store files on a server in Germany to meet EU data protection standards.
Register with OSF for a free account to manage projects and files. Individual files must be 5GB or less; the overall storage for a project is capped at 5GB for private projects or 50GB for public; there is no cap on the amount of storage per user. Depositors can choose any level of open licence or none: be mindful that if you do not specify a licence, your data cannot be considered 'open', regardless of whether it is publicly available.
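The per-file and per-project limits quoted above can be checked before starting an upload. The sketch below encodes only the figures given on this page; the function name is hypothetical, sizes are assumed to be in decimal gigabytes, and the current limits should be confirmed with OSF Support.

```python
MAX_FILE_GB = 5                                   # limit for any individual file
PROJECT_CAP_GB = {"private": 5, "public": 50}     # overall storage per project


def osf_limit_problems(file_sizes_gb, visibility="private"):
    """Return a list of limit breaches for a planned set of files (empty if none)."""
    problems = []
    for index, size in enumerate(file_sizes_gb):
        if size > MAX_FILE_GB:
            problems.append(f"file {index} is {size} GB, over the {MAX_FILE_GB} GB per-file limit")
    total = sum(file_sizes_gb)
    cap = PROJECT_CAP_GB[visibility]
    if total > cap:
        problems.append(f"total of {total} GB is over the {cap} GB cap for {visibility} projects")
    return problems


if __name__ == "__main__":
    # Two 3 GB files fit a public project but exceed the private project cap.
    print(osf_limit_problems([3, 3], "public"))
    print(osf_limit_problems([3, 3], "private"))
```

A dataset that breaches the per-file limit may need to be split into smaller files, or deposited in a repository with a higher ceiling.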
See also OSF Support and OSF FAQs.
Depositing Data with the UK Data Service
Funded by the ESRC to "meet the data needs of researchers, students and teachers from all sectors", the UK Data Service is both a portal to national and international socioeconomic macrodata, and a data repository for researchers in quantitative and qualitative social sciences and humanities.
Register for a free account to deposit your data in ReShare: you can choose an open licence, or 'safeguard' your data for registered users only, to be used for research and teaching purposes. Deposit in ReShare is mandatory for ESRC-funded projects.
Zenodo is EU-funded; files and metadata are stored at CERN. Upload any file format up to 50GB, and receive a DOI. Zenodo supports DOI versioning: new deposits can be linked, and you can direct people to a specific version or the general DOI.
The depositor is permitted to choose the appropriate level of licence for their data (Creative Commons or another open licensing scheme). Your funder or publisher may specify a licence.
Register a Zenodo Community in order to share working documents, data and publications with other researchers in your field.
See the Zenodo FAQs for more information and tips.
A number of publishers who are committed to open research incorporate storage options for data associated with papers, e.g. arXiv, F1000.
Re3data.org (Registry of Research Data Repositories) provides a searchable directory of validated data repositories for all disciplines and formats.
FAIRsharing Collections and Recommendations: browse by discipline, funder or publisher platform to identify their recommended data repositories and metadata standards. Compiled by a team based at the University of Oxford.
Documenting your data
It's important that any dataset you preserve for potential use by other researchers has enough descriptive information (metadata) for humans and search engines to understand what the files consist of. A trusted data repository is likely to provide a metadata template, to help depositors create metadata systematically.
The UK Data Service Learning Hub provides an overview of rationale and best practice for documenting data at study level and file level, written for social science researchers but applicable in principle to all disciplines.
Research areas which have a strong tradition of sharing data may have published discipline-specific metadata standards to facilitate the discovery and re-use of files.
- DataCite Schema: developed by a UK, European and US working group formed from research councils and libraries, "a list of core metadata properties chosen for an accurate and consistent identification of a resource for citation and retrieval purposes, along with recommended use instructions". Member organisations who adopt DataCite metadata standards are able to issue DOIs (permanent digital IDs) for deposited files.
- FAIRsharing: Standards: a searchable directory of controlled vocabularies, identifier schemas and reporting guidelines for every academic discipline, maintained by the Data Readiness Team at the University of Oxford.
- The UK's Digital Curation Centre maintains a disciplinary directory of metadata standards.
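To give a sense of what a core metadata record contains, the sketch below assembles the six properties that the DataCite schema treats as mandatory (identifier, creator, title, publisher, publication year and resource type) and serialises them as JSON. The key names are simplified for readability and are not the official DataCite XML or JSON serialisation; the function name and all values are placeholders.

```python
import json

# Simplified, hypothetical key names for DataCite's mandatory properties.
MANDATORY = ("identifier", "creators", "title", "publisher", "publication_year", "resource_type")


def build_record(**fields) -> str:
    """Return the record as JSON, or raise ValueError if a mandatory property is missing."""
    missing = [name for name in MANDATORY if not fields.get(name)]
    if missing:
        raise ValueError("missing mandatory properties: " + ", ".join(missing))
    return json.dumps(fields, indent=2, ensure_ascii=False)


if __name__ == "__main__":
    print(build_record(
        identifier="10.5555/placeholder",   # placeholder, not a real DOI
        creators=["Surname, Given"],
        title="Example dataset title",
        publisher="Example Repository",
        publication_year=2024,
        resource_type="Dataset",
    ))
```

In practice the repository's deposit form collects these details for you; the value of thinking in these terms is knowing in advance what you will be asked for.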
Licensing your data
A licence statement on your data helps users understand what they are permitted to do with it. For instance, do you expect to be credited if other researchers convert your data into a new format, or publish new findings derived from your data? Are you willing to allow private sector organisations to utilize your data without payment?
The Library's guide to Copyright provides an overview of the Creative Commons licensing scheme and other open licences.
See also Who Owns My Data? for further recommended reading.
Publishing your data
Data journals are becoming more commonplace in the academic publishing landscape. Papers focus specifically on the dataset rather than the research outcomes, and provide details such as the context of the data collection, the choice of software environment and data processing decisions including file formats. There is no analysis or examination of the data; instead the paper describes the data, and signposts its long-term location in a repository.
Publishing in a data journal provides researchers with an opportunity to get their data peer-reviewed, and gain credit for their research through citation. (Adapted from Research Data Management & Sharing, University of Strathclyde, 2020).
Examples of data journals:
Launched in 2015, Data in Brief has now published over 5000 papers. Editors explain the benefits of publishing your data in two videos, and the publishers provide a template for submission and guidance for authors. The current charge for publishing in this title is $530 (May 2023).
With an editorial team based at King's College London, this not-for-profit journal publishes short data papers (1000 words), and full-length research papers which "discuss and illustrate methods, challenges, and limitations in the creation, collection, management, access, processing, or analysis of data in humanities research". The standard article publication charge of £450 may be waived for authors who do not have the ability to pay.
A UK-based, not-for-profit publication platform designed to enable the immediate dissemination of research in linked smaller units, including 'Rationale/Hypothesis', 'Method', 'Results' (describing and summarising the raw data), 'Analysis' and 'Interpretation'. Register with an ORCID account to upload files free of charge for publication with an open licence and a DOI. See the Author Guide and FAQs to help you get started.
Scientific Data is an open access journal dedicated to data, publishing descriptions of research datasets and articles on research data sharing from all areas of natural sciences, medicine, engineering and social sciences. UK-based authors are asked to pay an article processing charge of £1690.
See also the University of Edinburgh's directory of Data Journals (2021), including a number of discipline-specific titles.