The CDS and other astronomical data centers store and distribute
astronomical data to promote their use, primarily by professional
astronomers.
To ensure the scientific quality of the data,
we require that the data be related to a publication
in a refereed journal, either as tables or catalogues actually published,
or as a paper describing the data and
their context.
For a quick view of the guidelines and recommendations for publishing your data
at CDS, please have a look at the "Make your data visible" brochure.
See also the Best Practices for Data Publication in the Astronomical Literature (T. Chen, 2022). The article is intended for authors and sets out the good practices expected by journals and data centers.
A training course summarizing how to publish your data in VizieR following the best practices
is available at The journey of your data through the Virtual Observatory and the European Open Science Cloud.
In order to facilitate the usability of the data, and to allow their processing
by the data centers, we require that:
- the data are described accurately enough to allow
an unambiguous interpretation of the data, as well as an
understanding of the context in which the data were acquired
and/or processed;
a single ascii file, named ReadMe, is designed for this role.
- the data are in a format usable
by the tools currently in use in our discipline, normally
flat ascii files; other formats can be accepted, but
are converted into flat files.
A full description of the standard conventions used for the documentation
of the catalogues is available at URL
http://cds.unistra.fr/doc/catstd.htx.
The present document answers some frequently asked questions
about how to prepare the data for their inclusion in the
Data Center documents. The following topics are covered:
Contents:
1 The new submission interface
Since January 2018, the new submission interface has been online; it includes a FITS ingestion procedure to improve the discoverability of images and spectra.
2 How to prepare the Data files
It is assumed that each component of the data set is stored in a file;
each file can represent a table, a spectrum (1-D data),
or an image (2-D data).
As a general rule, plain ascii data files (also called
flat files) are preferred, simply because such files can
always be processed.
More explicitly, the following formats can be used:
- for tables and catalogues: ascii (simple flat files),
with their structure (description of columns)
detailed in the ReadMe file.
Some other data formats can be accepted, but are
converted into flat files:
LaTeX, FITS, or TSV / CSV.
TSV (tab-separated values) and CSV
(character-separated values) are presentations
in which a dedicated character (the tab in TSV, or
a punctuation mark in CSV, typically the semicolon) is used
as a column separator; this is one of the formats
available for the output of spreadsheets.
What cannot be used: postscript files, or the internal documents
of word processors and spreadsheets (Word, Excel).
- for spectra (1-D data): either FITS file(s),
or 2-column ascii tables.
What cannot be used: postscript, word/excel documents,
GIF or JPEG images.
- for images (2-D data): FITS is the preferred format;
for images of the sky, the inclusion of the
FITS-WCS (World Coordinate System) parameters
describing the conversion between celestial coordinates
and pixel position is strongly encouraged.
What cannot be used: postscript, word/excel documents.
Therefore, never submit postscript files: postscript is a language
designed for printers, not for storing scientific data!
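As an illustration of what "converted into flat files" can mean for a CSV table, here is a minimal Python sketch. It is not the conversion used at CDS; the function name csv_to_flat and the choice of left-justified, space-padded columns are assumptions made for this example only.

```python
import csv
import io


def csv_to_flat(text, delimiter=";"):
    """Convert character-separated text into fixed-width (flat) lines.

    Hypothetical helper: every column is padded to the width of its
    longest value, and columns are separated by a single blank.
    """
    rows = [
        [cell.strip() for cell in row]
        for row in csv.reader(io.StringIO(text), delimiter=delimiter)
        if row  # skip empty lines
    ]
    if not rows:
        return []
    ncols = max(len(row) for row in rows)
    # Pad short rows so that every record has the same number of fields.
    rows = [row + [""] * (ncols - len(row)) for row in rows]
    widths = [max(len(row[i]) for row in rows) for i in range(ncols)]
    return [
        " ".join(cell.ljust(width) for cell, width in zip(row, widths))
        for row in rows
    ]


if __name__ == "__main__":
    sample = "HD 1;12.5;A0\nHD 22;3.25;K2III\n"
    for line in csv_to_flat(sample):
        print(repr(line))
```

Because every output line has the same length and every column starts at a fixed position, the result can be described column by column in the ReadMe file.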
A short word about file naming conventions:
according to the ISO 9660 standard, file names are
restricted to 8 + 3 characters: 8 characters in the
set [a-z0-9_-], followed by a dot and an extension made of 3
characters with the following conventions:
.dat for data files, .fit for FITS files,
.tex for TeX/LaTeX files, and
.txt for text files (ascii files containing only printable text).
Full details about the files and directories structures
can be found in the Adopted Standards for Catalogues document.
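The naming convention above can be checked with a short script. The sketch below is only one reading of the rule: it assumes a name part of 1 to 8 characters and an extension of 3 lowercase letters or digits, and it does not cover the ReadMe file itself, which has no extension. The names NAME_RE, KNOWN_EXTENSIONS and check_name are hypothetical.

```python
import re

# 1 to 8 characters from [a-z0-9_-], a dot, then a 3-character extension.
# The allowed characters for the extension are an assumption of this sketch.
NAME_RE = re.compile(r"[a-z0-9_-]{1,8}\.[a-z0-9]{3}")

# Extensions listed in the conventions above.
KNOWN_EXTENSIONS = {
    "dat": "data file",
    "fit": "FITS file",
    "tex": "TeX/LaTeX file",
    "txt": "text file",
}


def check_name(filename):
    """Return the kind of file, "other" for an unlisted extension,
    or None when the name does not follow the 8 + 3 pattern."""
    if NAME_RE.fullmatch(filename) is None:
        return None
    extension = filename.rsplit(".", 1)[1]
    return KNOWN_EXTENSIONS.get(extension, "other")


if __name__ == "__main__":
    for name in ("table1.dat", "Table1.dat", "spectrum_long.fit", "img_01.fit"):
        print(name, "->", check_name(name))
```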
The CDS provides tools and services for author submissions:
- build ReadMe and tables:
- FITS spectra/images validation service: FITS validator
3 How to fill the ReadMe description file
This file is aimed at describing all data files stored in a catalogued
data set, and at providing the necessary explanations and references to
the stored material.
All catalogues available at CDS and in associated astronomical data centers
have such an associated file, and numerous
examples can be found on the FTP directories at CDS.
A full description of the conventions used in this ReadMe file can be
found in the
Standards for Astronomical Catalogues,
and a template is
readily accessible for
all journals.
A typical example is
J/A+A/382/389/ReadMe. Brief explanations of how
to fill in the ReadMe file:
- the volume and page numbers:
for papers accepted for publication in A&A, but not yet
published, these will be added directly at CDS as soon
as we get these from the publisher. For papers accepted
for publication in other journals, it is recommended to
mail them (to cds-cats(at)unistra.fr) when you get
these details from the publisher.
- the Keywords: part lists the following keywords:
- ADC_Keywords introduces the list of
data-related keywords, out of a
controlled set
- Keywords: introduces the list of keywords
as in the printed publication
Unlike the Keywords: set,
which is generally related to the scientific goal of
a paper, the ADC_Keywords are strictly related
to the tabular material collected in the paper.
- the Description: section is expected to describe the
context of the data, like the instrumentation used
or the observing conditions;
it therefore differs from the
Abstract, which tends to describe the scientific results
that the author derived from the data.
- the File Summary: section describes the files making up the set:
for each file, it specifies the filename, the length of the
longest line (lrecl), the number of records (number of lines),
and a caption (short title of the file). Lengthy notes
can be added if necessary.
- the Byte-by-byte Description of file: section describes
the structure of each of the data files (files with the
.dat extension). This description is made in a tabular form,
each row describing one field (column) of the data file.
The description contains the following columns:
- the starting column of the data field
- the format of the field as a fortran-like
format:
An | for a character column made of n characters;
In | for a column containing an integer number of n digits;
Fn.d | for a column containing a number of width n digits and up to d digits in the fractional part;
En.d Dn.d | for a number using the exponential notation.
- the
units
used in the field; the use of SI units is strongly
encouraged, and CGS units should be avoided
(for instance, use mW/m2 instead of
ergs/s/cm2).
- the label (heading) of the field, made of
a single word (no embedded blank);
a few
basic conventions are used for usual parameters
(e.g. positions) and related quantities
(e.g. mean errors).
- the explanations can start with the following
special characters related to some important data
characteristics:
* | (the asterisk) | indicating a lengthy note
[...] | (square brackets) | indicating data ranges
? | (question mark) | indicating a possibility of blank or NULL (unspecified) values
- the References: section contains the necessary references;
the usage of the
bibcode
is strongly encouraged.
For large sets of references, it is suggested to gather
them into a dedicated reference file
named refs.dat.
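To make the Byte-by-byte Description more concrete, the following Python sketch shows how a starting column and a fortran-like format (An, In, Fn.d, En.d, Dn.d) are enough to read one field from a line of a flat file. It is an illustration only, not a CDS tool; parse_format and read_field are hypothetical names, and treating a blank field as an unspecified value is an assumption that matches the "?" convention above.

```python
import re

# Kind letter, field width, and an optional number of decimals.
FORMAT_RE = re.compile(r"([AIFED])(\d+)(?:\.(\d+))?")


def parse_format(fmt):
    """Split a format such as 'F6.2' into (kind, width, decimals)."""
    match = FORMAT_RE.fullmatch(fmt.strip())
    if match is None:
        raise ValueError(f"unrecognised format: {fmt!r}")
    kind, width, decimals = match.groups()
    return kind, int(width), int(decimals) if decimals else None


def read_field(line, start, fmt):
    """Extract one field; `start` is the 1-based starting column."""
    kind, width, _ = parse_format(fmt)
    raw = line[start - 1:start - 1 + width].strip()
    if raw == "":
        return None  # blank field, read here as an unspecified value
    if kind == "A":
        return raw
    if kind == "I":
        return int(raw)
    # F, E and D are all floating-point; a fortran D exponent becomes E.
    return float(raw.replace("D", "E").replace("d", "e"))


if __name__ == "__main__":
    # Columns: 1-8 name (A8), 10-11 count (I2), 13-17 value (F5.2),
    # 19-25 flux (D7.1); the values are made up for the example.
    line = "Vega".ljust(8) + " " + " 3" + " " + " 0.03" + " " + "1.5D+03"
    for start, fmt in ((1, "A8"), (10, "I2"), (13, "F5.2"), (19, "D7.1")):
        print(fmt, "->", read_field(line, start, fmt))
```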
4 How to deposit the data
If not too bulky, the ascii (text)
data files with their ReadMe file can be uploaded from
https://cdsarc.cds.unistra.fr/vizier.submit/
where some basic checks on the ReadMe and data files are performed.
The checking procedure is also available as the
anafile package
which can be installed with the standard Linux configure and make
procedure
(man page).
Alternatively (needed for binary files like FITS) you can:
- upload the files with their ReadMe via ftp
(recommended for large files)
- e-mail your files to the address
cds-cats(at)unistra.fr
if these are not too bulky (less than a few megabytes).
- contact us for other possibilities like
download from your site, DVD posting, etc., at
Centre de Données astronomiques
11, rue de l'Université
67000 STRASBOURG, France
cds-cats(at)unistra.fr
5 What happens to your data
At the CDS, some checking procedures are executed to verify
the compatibility between the data files and their description.
This can lead to interactions with the authors, but we try
to keep these to a minimum.
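Authors can run a similar consistency check on their side before submitting. The Python sketch below compares the longest line (lrecl) and the number of records of a data file with the values declared in the File Summary of the ReadMe. It is not the procedure run at CDS; the function names and the rule that both values must match exactly are assumptions of this example.

```python
def file_summary(lines):
    """Return (lrecl, records) for an iterable of text lines."""
    lrecl = 0
    records = 0
    for line in lines:
        records += 1
        # Line terminators do not count towards the record length.
        lrecl = max(lrecl, len(line.rstrip("\r\n")))
    return lrecl, records


def check_against_readme(lines, declared_lrecl, declared_records):
    """Return a list of mismatches between a file and its description."""
    lrecl, records = file_summary(lines)
    problems = []
    if lrecl != declared_lrecl:
        problems.append(f"lrecl is {lrecl}, ReadMe declares {declared_lrecl}")
    if records != declared_records:
        problems.append(
            f"file has {records} records, ReadMe declares {declared_records}"
        )
    return problems


if __name__ == "__main__":
    data = ["HD 1  12.5 A0\n", "HD 22 3.25 K2III\n"]
    print(file_summary(data))
    print(check_against_readme(data, declared_lrecl=16, declared_records=3))
```

With a real file, the list of lines can be replaced by an open file object, for instance `open("table1.dat")`, where table1.dat stands for any data file of the set.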
Once the data are public, they are accessible as plain files
in FTP directories at CDS and other
participating data centers (e.g. at
NAOJ/ADAC, Japan).
The data are also added to the
CDS
service, with mirrors at
CfA/Harvard (USA),
NAOJ/ADAC (Japan),
IUCAA (India),
INASAN (Russia),
NAO (China),
IDIA (South Africa).
6 Contacts
For any question related to the preparation of the data, for
problems related to non-standard data formatting, or any other
difficulty in the management or the transfer of the electronic tables,
contact the VizieR team directly.