Research Data Management
The FAIR principles explained
Findable data
Data can only be reused if they can first be found. It should be easy for both humans and computers to find data and the associated metadata. To that end, data and metadata are each assigned a globally unique, persistent identifier (PID) that is machine-readable. Metadata and PIDs are essential for automatic and reliable discovery of datasets and services. Metadata must clearly and explicitly include the identifier of the data they describe. Using community-endorsed standards for data and metadata (such as data models and metadata schemas), whether general or discipline-specific, is recommended.
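As an illustration, the two identifier requirements above can be checked mechanically. The following is a minimal sketch, not a reference implementation: the field names `identifier` and `describes`, the function `findability_problems`, and the DOI values are all hypothetical, and the pattern only checks the rough shape of a DOI.

```python
import re

# Loose shape check for a DOI ("10.<registrant>/<suffix>").
# Illustrative only; this is not a full DOI validator.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def findability_problems(record):
    """Return a list of findability problems for a metadata record.

    The record layout (keys "identifier" and "describes") is an
    assumption made for this example, not a standard schema.
    """
    problems = []
    if not DOI_PATTERN.match(record.get("identifier", "")):
        problems.append("metadata record has no well-formed persistent identifier")
    if not record.get("describes"):
        problems.append("metadata do not name the identifier of the data they describe")
    return problems


record = {
    "identifier": "10.1234/example.metadata.1",  # placeholder, not a real DOI
    "describes": "10.1234/example.dataset.1",    # placeholder, not a real DOI
    "title": "Example dataset",
}
print(findability_problems(record))
```

An empty list means both checks passed. In practice, a community metadata schema would define the actual field names and a much richer set of required elements.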
Accessible data
Once data are found, it must be clear how to access them. This can include information on whether authentication or authorisation is required before the data can be accessed. Data and metadata should be retrievable by their identifier using a standardised communications protocol that is open, free and universally implementable. Where access is restricted, the protocol must allow for authentication and authorisation procedures. Importantly, metadata should remain available and accessible even when the data are no longer available.
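The last point, that metadata outlive the data, can be sketched with a toy in-memory repository. Everything here is hypothetical (the `Repository` class, its method names, and the `data_available` flag); a real repository would expose this behaviour over an open protocol such as HTTPS rather than through Python method calls.

```python
class Repository:
    """Toy in-memory repository, standing in for a real service."""

    def __init__(self):
        self._metadata = {}
        self._data = {}

    def deposit(self, identifier, metadata, data):
        # Store metadata and data separately under the same identifier.
        entry = dict(metadata)
        entry["data_available"] = True
        self._metadata[identifier] = entry
        self._data[identifier] = data

    def withdraw_data(self, identifier):
        # Remove the data only; keep the metadata and record the withdrawal.
        self._data.pop(identifier, None)
        self._metadata[identifier]["data_available"] = False

    def get_metadata(self, identifier):
        # Metadata stay retrievable whether or not the data still exist.
        return self._metadata[identifier]

    def get_data(self, identifier, authorised=True):
        # Access conditions are checked before any data are returned.
        if not authorised:
            raise PermissionError("authorisation required to access the data")
        if identifier not in self._data:
            raise LookupError("data no longer available; metadata still retrievable")
        return self._data[identifier]


repo = Repository()
repo.deposit("example-id-1", {"title": "Example dataset"}, b"1,2,3")
repo.withdraw_data("example-id-1")
print(repo.get_metadata("example-id-1")["data_available"])  # prints False
```

After withdrawal, a request for the data fails with an explanatory error, while the metadata record can still be retrieved and states that the data are gone.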
Interoperable data
It should be possible to integrate the data with other data, applications and workflows. This means that the format of the data should be open and interpretable by various tools for analysis, storage and processing (which proprietary software applications generally don’t allow). Data and metadata must use a formal, accessible, shared and broadly applicable language for knowledge representation. Interoperability applies at both the data and the metadata level. Common formats and standards for data and metadata, as well as controlled vocabularies (that themselves follow the FAIR principles), are essential to ensure data and metadata interoperability.
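A small sketch of two of these ideas together: values are checked against a controlled vocabulary, then written in an open format (JSON) that any tool can read back. The vocabulary `UNIT_VOCABULARY`, the function `to_open_format`, and the row layout are assumptions made for this example; a real project would take its terms from a published, community-maintained vocabulary.

```python
import json

# Hypothetical controlled vocabulary for the "unit" field.
UNIT_VOCABULARY = {"celsius", "kelvin"}


def to_open_format(rows):
    """Serialise rows as JSON after checking units against the vocabulary."""
    for row in rows:
        if row["unit"] not in UNIT_VOCABULARY:
            raise ValueError(f"unit not in controlled vocabulary: {row['unit']!r}")
    return json.dumps(rows, sort_keys=True)


rows = [{"value": 21.5, "unit": "celsius"}]
text = to_open_format(rows)

# Any JSON-capable tool can recover the same records from the text.
assert json.loads(text) == rows
```

Rejecting a free-text unit such as "degrees C" at write time is what keeps the dataset combinable with others that use the same vocabulary.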
Reusable data
The ultimate goal of FAIR is to optimise data reuse. To achieve this, data and metadata should be well described, have a clear and accessible licence for data usage, include detailed information on provenance, and meet domain-relevant community standards. This supports the repeatability of experiments and the reproducibility of results.