Data File Structure
Data acquired from the mass spectrometer is saved into data files on the computer's hard disk.
These data files may contain more than one acquisition function and may also contain processed
data derived from the original raw data. Since files are seen by programs as streams of data, a
method is required to determine the format of a particular file within the file system; this
format information is an example of metadata. Different operating systems have traditionally
taken different approaches to this problem, each with its own advantages and disadvantages.
Most modern operating systems and individual applications need to use all of these approaches
to process various files, at least to be able to read 'foreign' file formats, even if they
cannot work with them completely.
Filename extension
One popular method in use is to determine the format of a file based on the section of its name
following the final period. This portion of the filename is known as the filename extension. For
example, HTML documents are identified by names that end with .html (or .htm).
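
As a minimal sketch of this method, the function below returns the text after the final period of a bare filename. The function name `filename_extension` is hypothetical, and treating a name with no period, or with only a leading period, as having no extension is an assumption of this sketch rather than a rule from the text.

```python
def filename_extension(name: str) -> str:
    """Return the text after the final period, lowercased, or '' if none.

    Assumes `name` is a bare filename with no directory components.
    """
    base, dot, ext = name.rpartition(".")
    # No period at all, or a period only at the very start (e.g. ".profile"),
    # is treated here as "no extension" (an assumption of this sketch).
    if not dot or not base:
        return ""
    return ext.lower()


if __name__ == "__main__":
    for name in ("index.html", "page.HTM", "archive.tar.gz", "README"):
        print(name, "->", repr(filename_extension(name)))
```

Only the last period matters, so a name such as archive.tar.gz is classified by gz alone.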
File structure
There are several ways to structure data in a file. The most common ones are described
below.
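Unstructured formats, which dump the memory images of one or more data structures directly
into the file, have several drawbacks. Unless the memory images also have reserved spaces for future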
extensions, extending and improving this type of structured file is very difficult. It also creates
files that might be specific to one platform or programming language (for example a structure
containing a Pascal string is not recognized as such in C). On the other hand, developing tools
for reading and writing these types of files is very simple.
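
The trade-off can be illustrated with a small fixed-layout record, sketched here with Python's standard `struct` module. The record layout (a 32-bit identifier, a 64-bit floating-point value, and 8 reserved bytes) and the names `RECORD`, `pack_record`, and `unpack_record` are hypothetical and chosen only for illustration; they do not describe any actual data file.

```python
import struct

# Hypothetical record layout, little-endian with no alignment padding:
#   uint32  record identifier
#   float64 measured value
#   8 bytes reserved for future extensions (written as zeros)
RECORD = struct.Struct("<Id8x")


def pack_record(record_id: int, value: float) -> bytes:
    """Serialize one record as a fixed-size memory image."""
    return RECORD.pack(record_id, value)


def unpack_record(buf: bytes) -> tuple:
    """Read one record back; the reserved bytes are ignored."""
    return RECORD.unpack(buf)


if __name__ == "__main__":
    image = pack_record(7, 1.5)
    print(len(image), unpack_record(image))
    # The same integer has a different byte image under another byte order,
    # which is one reason such files can be platform specific.
    print(struct.pack("<I", 1) != struct.pack(">I", 1))
```

Reading and writing take a single call each, which is the simplicity noted above. Any new field, however, must fit into the reserved bytes or the layout changes for every reader.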
The limitations of the unstructured formats led to the development of other types of file
formats that could be easily extended and be backward compatible at the same time.
With a chunk-based file structure, in which each piece of data is wrapped in a container
that carries an identifier, tools simply skip the chunks whose identifiers they do not
recognize.
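
The skipping behaviour can be sketched as follows. The chunk layout assumed here (a 4-byte identifier, a 4-byte big-endian payload length, then the payload) is a simplified, IFF-like assumption made for illustration, and the names `write_chunk` and `read_known_chunks` are hypothetical. Real formats add details such as padding or checksums.

```python
import io
import struct


def write_chunk(stream, chunk_id: bytes, payload: bytes) -> None:
    """Write one chunk: 4-byte identifier, 4-byte big-endian length, payload."""
    stream.write(chunk_id + struct.pack(">I", len(payload)) + payload)


def read_known_chunks(stream, known_ids) -> dict:
    """Collect payloads of recognized chunks and skip all others."""
    found = {}
    while True:
        header = stream.read(8)
        if len(header) < 8:  # end of data (or a truncated header)
            break
        chunk_id, length = struct.unpack(">4sI", header)
        if chunk_id in known_ids:
            found[chunk_id] = stream.read(length)
        else:
            # Unknown identifier: jump over the payload without interpreting it.
            stream.seek(length, io.SEEK_CUR)
    return found


if __name__ == "__main__":
    buf = io.BytesIO()
    write_chunk(buf, b"NAME", b"demo")
    write_chunk(buf, b"XTRA", b"added by a newer writer")
    write_chunk(buf, b"DATA", b"\x01\x02")
    buf.seek(0)
    print(read_known_chunks(buf, {b"NAME", b"DATA"}))
```

Because the length is stored alongside the identifier, an older reader can step over a chunk introduced by a newer writer, which is what makes such formats extensible and backward compatible.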
This concept has been adopted repeatedly, by RIFF (the Microsoft-IBM equivalent of IFF), PNG,
JPEG storage, DER (Distinguished Encoding Rules) encoded streams and files (which were
originally described in CCITT X.409:1984 and therefore predate IFF), and Structured Data
Exchange Format (SDXF). Even XML can be considered a kind of chunk-based format, since each
data element is surrounded by tags which are akin to chunk identifiers.