This presentation guides listeners through all stages of the publication life cycle at the CERN Document Server (CDS): from ingestion using one of the various tools, through curation and processing, until the data is ready to be exported to other systems. It describes the different tools we use to curate incoming publications and to further improve the existing data on CDS. The second part of the talk goes through the challenges we have faced in the past and how we plan to overcome them in the new version of CDS.
CERN Document Server (CDS) is the CERN Institutional Repository. It plays a key role in the storage, dissemination and archival of all research material published at CERN, as well as multimedia and some administrative documents. As CERN’s document hub, it brings together submission and publication workflows dedicated to the CERN experiments, the video and photo teams, the administrative groups, and the outreach groups. In the past year, Invenio, the underlying software platform for CDS, has been undergoing major changes: it is transitioning from a digital library system to a digital library framework and moving to a new software stack. Invenio is now built on top of the Flask web development framework, using the Jinja2 template engine, the SQLAlchemy ORM, a JSONSchema data model, and Elasticsearch for information retrieval. To reflect these changes on CDS, we are launching a parallel service, CDSLabs, with the goal of offering our users a continuous view of the reshaping of CDS and of gathering more feedback from the community during the development phase, rather than after release.