Bibliometrical Phase
Step 0 - Preparation for Long-Term Data Archival
In the pre-phase of the long-term archival, general information is entered into the long-term archive based on the CMIP6 web pages, e.g. the project description, templates for summaries, and descriptions of the provided variables (short names, long names and associated CF Standard Names).
Step 1 - Create Use Metadata from the ESGF
The long-term archival starts at the snapshot date agreed with the IPCC, TGICA and IPCC Working Group I on 15th October 2020.
Detailed, fine-grained information is retrieved from the ESGF Search API and stored in the metadata database of the long-term archive at DKRZ. This information is written into the file headers during the production phase and extracted during ESGF data publication. The information provided by the ESGF Search API is mapped to the local database schema and linked to the information entered during Step 0.
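The mapping described above might look roughly like the following sketch. The endpoint node, the choice of facets and the local column names in FACET_TO_COLUMN are assumptions for illustration; the actual schema of the DKRZ metadata database is not described here. No network request is made: one function only builds a paged query URL, the other maps an already retrieved result document.

```python
from urllib.parse import urlencode

# Assumed ESGF index node; any node exposing the Search API would do.
SEARCH_ENDPOINT = "https://esgf-data.dkrz.de/esg-search/search"

# Hypothetical mapping: ESGF facet name -> column in the local metadata database.
FACET_TO_COLUMN = {
    "activity_id": "mip",
    "institution_id": "institution",
    "source_id": "model",
    "experiment_id": "experiment",
    "variable_id": "variable_short_name",
}


def build_query_url(offset=0, limit=100):
    """Build a paged CMIP6 dataset query for the Search API."""
    params = {
        "project": "CMIP6",
        "type": "Dataset",
        "format": "application/solr+json",
        "offset": offset,
        "limit": limit,
    }
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


def map_record(doc, step0_variables):
    """Map one Search API result document to a local database row.

    `step0_variables` is a hypothetical lookup of short name -> long name,
    standing in for the variable descriptions entered during Step 0.
    """
    row = {}
    for facet, column in FACET_TO_COLUMN.items():
        value = doc.get(facet)
        # Facet values typically arrive as single-element lists.
        if isinstance(value, list):
            value = value[0] if value else None
        row[column] = value
    row["dataset_id"] = doc.get("id")
    # Link the record to the Step 0 variable description, if one exists.
    row["variable_long_name"] = step0_variables.get(row["variable_short_name"])
    return row
```

Keeping the facet-to-column mapping in a single table makes it easy to review against the database schema and to extend when further facets are needed.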
Step 2 - Data Archival and Adding Ancillary Metadata
In this second step two parallel processes are carried out:
- The data of the data pool is physically archived at DKRZ and moved to tape in the HPSS.
- The metadata is enriched by adding information provided by ancillary metadata repositories within CMIP6: ES-DOC for model and simulation descriptions (see https://es-doc.org ), Citation Service content (see https://cmip6cite.wdc-climate.de ) and other accessible information provided by CMIP6 repositories for ancillary data or the modeling centers.
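The enrichment in the second process can be pictured as a lookup-and-merge over the archived metadata records. The sketch below assumes that the ancillary content has already been harvested into two in-memory dictionaries; the keys, field names and the function enrich are hypothetical and do not reflect the actual interfaces of ES-DOC or the Citation Service.

```python
def enrich(record, es_doc_index, citation_index):
    """Return a copy of `record` with ancillary metadata attached.

    es_doc_index:   hypothetical dict keyed by (institution, model)
    citation_index: hypothetical dict keyed by (institution, model, mip)
    """
    enriched = dict(record)  # leave the original record untouched
    model_key = (record["institution"], record["model"])
    citation_key = model_key + (record["mip"],)
    # Missing ancillary entries are stored as None rather than dropped,
    # so that gaps stay visible to later consistency checks.
    enriched["model_description"] = es_doc_index.get(model_key)
    enriched["citation"] = citation_index.get(citation_key)
    return enriched
```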
Step 3 - Finalize Data Archival: Technical Quality Assurance
As a final step of the long-term data archival, the data and metadata are quality-checked in the Technical Quality Assurance step. This procedure was successfully applied for the CMIP5 data archival (see QC level 3 details at: https://cmip5qc.wdc-climate.de ). The main checks ensure data and metadata consistency as well as conformance and data accessibility. Because the metadata originates from several different locations and sources, the consistency and conformance checks are particularly important. After the Technical Quality Assurance step has been completed successfully, the long-term archival is finished.
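To illustrate the kind of checks involved, here is a minimal sketch that tests one archived file for accessibility, checksum consistency and completeness of a few metadata fields. The field names in REQUIRED_FIELDS, the use of SHA-256 and the function names are assumptions; the actual checks of the Technical Quality Assurance are more extensive and are not specified in this text.

```python
import hashlib

# Hypothetical set of metadata fields that must be filled for every entry.
REQUIRED_FIELDS = ("dataset_id", "institution", "model", "experiment")


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_entry(metadata, path):
    """Return a list of problems; an empty list means the entry passed."""
    problems = [
        f"missing metadata field: {name}"
        for name in REQUIRED_FIELDS
        if not metadata.get(name)
    ]
    try:
        actual = sha256_of(path)
    except OSError as err:
        # Accessibility check: the file could not be opened or read.
        problems.append(f"file not accessible: {err}")
        return problems
    # Consistency check: stored checksum must match the archived file.
    if actual != metadata.get("checksum"):
        problems.append("checksum mismatch")
    return problems
```

Returning a list of problems instead of raising on the first failure lets a report collect all issues for an entry in one pass.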
Step 4 - Data Publication and Citation / IPCC Data Distribution Centre
The long-term archived data forms the AR6 Reference Data Archive of the IPCC Data Distribution Centre (IPCC DDC, https://ipcc-data.org ). The archived data is therefore added to the IPCC DDC web pages.
Stable data collections of the IPCC DDC data are registered with DataCite DOIs at the same granularities as used for the CMIP6 citation of evolving data (see https://cmip6cite.wdc-climate.de ), i.e.:
- model/MIP data: all data contributed by an institution with one model to an individual MIP
- experiment data: all data contributed by an institution with one model to a CMIP6 experiment
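The two granularities can be derived by grouping dataset identifiers, as in the following sketch. It assumes identifiers that follow the CMIP6 DRS order (project, MIP, institution, model, experiment, ...) and, as a further assumption, keeps the MIP as part of the experiment-level key; the function name doi_collections is hypothetical.

```python
from collections import defaultdict


def doi_collections(dataset_ids):
    """Group dataset identifiers into the two DOI granularities.

    Returns (model_mip, experiment), each a dict mapping a key tuple
    to the list of dataset identifiers in that collection.
    """
    model_mip = defaultdict(list)
    experiment = defaultdict(list)
    for dataset_id in dataset_ids:
        parts = dataset_id.split(".")
        # Assumed DRS order: project, MIP, institution, model, experiment, ...
        _project, mip, institution, model, experiment_id = parts[:5]
        model_mip[(institution, model, mip)].append(dataset_id)
        experiment[(institution, model, mip, experiment_id)].append(dataset_id)
    return dict(model_mip), dict(experiment)
```

Each key of the first dictionary would correspond to one model/MIP DOI and each key of the second to one experiment DOI.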
After DOI assignment, the DOI is documented on the IPCC DDC web page. Finally, the data-to-data relation to the DDC data is added to the CMIP6 citation of the evolving data, and the updated citation information is published.
The IPCC DDC data in the long-term archive is also published in the ESGF and thus made accessible for ESGF users.
Step 5 - Data Curation and Data Reuse Phase
After data archival and data publication, the data curation and data reuse phase starts. In this phase the metadata is updated with new information, e.g. references to papers that use CMIP6 data, obtained via Scholix services, or errata information. Information on the data is provided for harvesting by secondary portals, e.g. OpenAIRE or EUDAT.