An introduction to FAIR research data and metadata
The acronym FAIR stands for Findable, Accessible, Interoperable, Reusable. It names a set of principles for research data and metadata that aim to improve how they are discovered and accessed, how they interact with other datasets and systems, and ultimately how they can be reused by others.
The term ‘FAIR’ was launched at a Lorentz workshop in 2014, and the resulting FAIR principles were published in 2016. They have been widely accepted and promoted by researchers, institutions, funders, publishers, and political leaders. There are a number of initiatives committed to developing, understanding and meeting them. The principles are summarised by the GO FAIR initiative as:
Findable
The first step in (re)using data is to find them. Metadata and data should be easy to find for both humans and computers. Machine-readable metadata are essential for automatic discovery of datasets and services, so this is an essential component of the FAIRification process.
Accessible
Once the user finds the required data, they need to know how the data can be accessed, possibly including authentication and authorisation.
Interoperable
The data usually need to be integrated with other data. In addition, the data need to interoperate with applications or workflows for analysis, storage, and processing.
Reusable
The ultimate goal of FAIR is to optimise the reuse of data. To achieve this, metadata and data should be well-described so that they can be replicated and/or combined in different settings.
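To make the idea of machine-readable metadata under the Findable principle more concrete, here is a minimal sketch of a metadata record expressed with the schema.org "Dataset" vocabulary and serialised as JSON. The function name, the field selection, and every value (including the DOI-style identifier) are illustrative placeholders, not a required or complete schema; a repository will usually define its own metadata standard.

```python
import json


def build_dataset_metadata(title, description, identifier, license_url,
                           keywords, creators):
    """Return a schema.org Dataset description as a JSON-LD-style dict.

    Hypothetical helper for illustration only; the chosen fields are a
    small subset of what a real repository may require.
    """
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": title,
        "description": description,
        "identifier": identifier,    # ideally a persistent identifier
        "license": license_url,      # tells others how the data may be reused
        "keywords": list(keywords),  # supports discovery through search
        "creator": [{"@type": "Person", "name": name} for name in creators],
    }


# All values below are placeholders, not a real dataset.
record = build_dataset_metadata(
    title="Example survey responses",
    description="Illustrative dataset used to show a metadata record.",
    identifier="https://doi.org/10.0000/example",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    keywords=["example", "survey"],
    creators=["A. Researcher"],
)

metadata_json = json.dumps(record, indent=2)
print(metadata_json)
```

A record like this can be read by software as well as by people, which is what allows search services to index a dataset automatically.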
Within each principle there are several steps to work towards, and ‘work towards’ is a good way of looking at it, because achieving FAIR isn’t a binary state. Rather, it is a spectrum: you can meet different aspects of FAIR to different degrees. Realistically you might not meet every measure of FAIR, but each one you do meet helps to enable data reuse. As the Turing Way handbook for reproducible data science describes, FAIR applies not just to data files or datasets themselves, but to different entities in the storing and sharing infrastructure:
“The FAIR principles refer to three types of entities: data (as any digital object), metadata (information about that digital object), and infrastructure (i.e. software, repositories). For instance, the findability principle F4 defines that both metadata and data are registered or indexed in a searchable resource (e.g. a data repository).”
Furthermore, the responsibility and contribution towards making data FAIR are shared by researchers, institutions, technology providers, funders and publishers, with some examples being:
- Individual Researchers will strive to: Document data to agreed community standards that describe provenance and enable discovery, assessment of reliability, and reuse.
- Funding agencies and organizations will strive to: Review data management plan requirements regularly to validate support of open and FAIR standards and promulgate leading practices.
- Societies, communities, and institutions will strive to: Promote open and FAIR data activities as important criteria in promotion, awards, and honours.
- Publishers will strive to: Adopt a shared set of author guidelines that support FAIR principles, providing a common set of expectations for authors.
- Repositories will strive to: Ensure that research outputs curated by repositories are open and FAIR, have essential documentation, and include human-readable and machine-readable metadata (e.g. on landing pages) in standard formats that are exposed and publicly discoverable.
These are among the principles contained in the Commitment Statement in the Earth, Space, and Environmental Sciences, which is one of many initiatives by groups with a disciplinary or process-driven approach.
Where do I start?
The ‘How FAIR are your data?’ checklist is a good place to start. Use it to think in advance about what you might need to do to make your data FAIR, and to assess your data before they are archived and shared.
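Because FAIR is a spectrum rather than a pass or fail test, a self-assessment can be recorded as a per-principle tally. The sketch below shows one way to do that. The questions are illustrative paraphrases of the four principle summaries given earlier, not the wording of the checklist itself, and the answers are made-up placeholders.

```python
from collections import defaultdict

# Illustrative questions paraphrased from the principle summaries;
# these are NOT the items of the actual checklist. Answers are placeholders.
ANSWERS = [
    ("Findable", "Is there machine-readable metadata?", True),
    ("Findable", "Is the dataset indexed in a searchable resource?", False),
    ("Accessible", "Is it clear how the data can be accessed?", True),
    ("Interoperable", "Can the data be integrated with other data?", False),
    ("Reusable", "Are the data and metadata well described?", True),
]


def summarise(answers):
    """Count how many questions are met for each principle.

    Returns a dict mapping principle -> (met, total).
    """
    met = defaultdict(int)
    total = defaultdict(int)
    for principle, _question, is_met in answers:
        total[principle] += 1
        met[principle] += int(is_met)
    return {principle: (met[principle], total[principle]) for principle in total}


summary = summarise(ANSWERS)
for principle, (met_count, total_count) in summary.items():
    print(f"{principle}: {met_count}/{total_count} met")
```

A tally like this makes partial progress visible and shows which principle needs attention before the data are archived.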
If you’d like advice about making your data FAIR, please get in touch with us at research.data@kcl.ac.uk.
Further reading and resources:
- GO FAIR - a bottom-up, stakeholder-driven and self-governed initiative that aims to implement the FAIR data principles
- FAIRsharing.org - a curated, informative and educational resource on data and metadata standards, inter-related to databases and data policies
- FAIRsFAIR - A project aiming to supply practical solutions for the use of the FAIR data principles throughout the research data life cycle