0% found this document useful (0 votes)
57 views

DSRC User Guide

The document is a user guide for DSRC, a program for compressing DNA sequence reads stored in FASTQ format. It describes what DSRC is, its main features like compression ratio and speed, how to download, build and use the program, and contact support. DSRC can compress FASTQ files in a lossless or lossy way, has APIs for Python and C++, and works on Linux, Windows and MacOS.

Uploaded by

renaud
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
57 views

DSRC User Guide

The document is a user guide for DSRC, a program for compressing DNA sequence reads stored in FASTQ format. It describes what DSRC is, its main features like compression ratio and speed, how to download, build and use the program, and contact support. DSRC can compress FASTQ files in a lossless or lossy way, has APIs for Python and C++, and works on Linux, Windows and MacOS.

Uploaded by

renaud
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 21

DNA Sequence Reads Compression

User Guide

Release 2.0

March 31, 2014


Contents

Contents ii

1 Introduction 1
1.1 What is DSRC? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Main features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.3 Compression factor and speed . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.4 Contact and support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

2 Quickstart 3
2.1 Download . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2.2 Building . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2.3 Program usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

3 DSRC integration 6
3.1 Python API . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
3.2 Python examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.3 C++ API . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.4 C++ examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

ii
1
Introduction

1.1 W HAT IS DSRC?

DNA Sequence Reads Compression is an application designed for loosless and lossy compression
of DNA sequencing reads stored in FASTQ format. The first release (0.x) of DSRC was accompa-
nied by the research paper:
• S. Deorowicz and S. Grabowski: Compression of DNA sequence reads in FASTQ format, Bioin-
formatics (2011).
The newest release is described in:
• Ł. Roguski and S. Deorowicz: DSRC 2—Industry-oriented compression of FASTQ files (under
review).

1.2 M AIN FEATURES

The main features of DSRC are:


• Effective multithreaded compression of DNA sequencing data stored in FASTQ format.
• Full support for Illumina, ABI SOLiD, and 454/Ion Torrent dataset formats with non-standard
(AGCTN) IUPAC base values.
• Support for lossy quality values compression using Illumina binning scheme.
• Support for lossy IDs compression keeping only key fields selected by user.
• Pipes support for easy integration with current pipelines.
• Python and C++ libraries allowing to integrate DSRC archives in own applications.
• Availability for Linux, Windows, and MacOS 64-bit operating systems.
• Open C++ source code under GNU GPL 2 license.

1.3 C OMPRESSION FACTOR AND SPEED

In terms of lossless compression factor (the ability to reduce the file size), DSRC in is usually 35–
60% better than gzip and 15–30% better than bzip2, which are currently the most common tools
for storing FASTQ files in a compressed format. DSRC is a multithreaded software and the speed
in the fast mode usually reaches the I/O limits (e.g., about 500 MB/s for 8 threads). The achieved
compression ratio and speeds are described in the mentioned research papers.

1
1.4 C ONTACT AND SUPPORT

• Official Web site: http://sun.aei.polsl.pl/dsrc/


• Contact and bug reports: sebastian.deorowicz [at] polsl.pl , lucas.roguski [at] gmail.com

2
2
Quickstart

2.1 D OWNLOAD

The compiled binaries, C++ and Python libraries with example of usage can be downloaded from
the official Web site: http://sun.aei.polsl.pl/dsrc/

2.2 B UILDING

DSRC compiles on both Windows and Linux platforms.

2.2.1 Linux
To compile DSRC on Linux platform do use provided makefile files in the main directory. The
default Makefile will compile DSRC in a static-linking variant using boost::thread1 library
for multithreading functionality, so boost development libraries are required to be present. The
Makefile.g++11 will use g++ (≥ 4.8) with implementation of C++11 standard, which provides
multithreading functionality without using Boost libraries.
To build DSRC binary in the main directory:
make bin
where the resulting dsrc binary file will be placed in bin directory.

To build DSRC C++ library use:


make lib
where the resulting dsrc binary file will be placed in lib directory.

2.2.2 Windows
To compile on Windows platform there are provided project solution files dsrc-vs2k10.sln and
dsrc-vs2k12.sln for respectively Microsoft Visual Studio 2010 and 2012 with configurations
for building DSRC binary and C++ library. When building DSRC using Microsoft Visual Studio
2010 boost::thread library will be required to provide multithreading functionality, while by
using Microsoft Visual Studio 2012 C++11 threads implementation will be used. DSRC can be also
compiled using MinGW-W64 (64-bit) using linux makefile files.
1 Boost libraries can be downloaded from http://www.boost.org/

3
2.2.3 Mac OSX
To compile DSRC on Mac OSX use the provided Makefile.osx file in the main directory. The
makefile uses clang compiler with C++11 standard support for multithreading.
To build DSRC binary in the main directory:
make -f Makefile . osx bin
where the resulting dsrc binary file will be placed in bin directory.

To build DSRC C++ library use:


make -f Makefile . osx lib
where the resulting dsrc binary file will be placed in lib directory.

2.2.4 Python module


To build the DSRC Python module the boost::python, boost::thread and boost::system
libraries in development versions are required. The module will be build using the Boost Build
bjam tool using Jamroot file located in py folder.
In order to compile Python module, the location of the local boost path needs to be edited in
Jamroot file:
# User - specified path to the Boost project
use - project boost
: / path / to / main / boost / directory / ;
After setting the local boost path to compile the pydsrc module just type in the main directory:
make pylib
which will launch Boost Build tool and place the library in the output pylib directory tree.

2.3 P ROGRAM USAGE

DSRC can be run from the command prompt:


dsrc <c|d> [options] <input_file_name> <output_file_name>
in one of two modes:

c — compression,
d — decompression.
Available compression options are:

-d<n > — DNA compression mode: 0–3, default: 0


-q<n > — Quality compression mode: 0–2, default: 0
-f<1,...> — keep only those fields no. in ID field string, default: keep all
-b<n > — FASTQ input buffer size in MB, default: 8

4
-m<n > — Automated compression mode (one of the three preset combination of other pa-
rameters): 0–2
-o<n > — Quality offset, 0 for auto selection, default: 0
-l — use Quality lossy mode (Illumina binning scheme), default: false
-c — calculate and check CRC32 checksum calculation per block (slows the compression
about twice), default: false

Available ‘automated’ compression modes:

-m0 — fast mode, equivalent to: -d0 -q0 -b8


-m1 — medium mode, equivalent to: -d2 -q2 -b64
-m2 — best mode, equivalent to: -d3 -q2 -b256
Options for both compression and decompression modes:

-t<n > — processing threads number, default: max available hardware threads
-s — use stdin/stdout for reading/writing FASTQ data (stderr is used for info/warning
messages)
Usage examples:

• Compress SRR001471.fastq file saving DSRC archive to SRR001471.dsrc:


dsrc c SRR001471.fastq SRR001471.dsrc
• Compress file in the fast mode with CRC32 checking and using 4 threads:
dsrc c -m0 -c -t4 SRR001471.fastq SRR001471.dsrc
• Compress file using DNA and Quality compression level 2 and using 512 MB buffer:
dsrc c -d2 -q2 -b512 SRR001471.fastq SRR001471.dsrc
• Compress file in the best mode with lossy Quality mode and preserving only 1–4 fields from
record IDs:
dsrc c -m2 -l -f1,2,3,4 SRR001471.fastq SRR001471.dsrc
• Compress in the best mode reading FASTQ file read from stdin:
cat SRR001471.fastq | dsrc c -m2 -s SRR001471.dsrc
• Decompress SRR001471.dsrc archive saving output FASTQ file to SRR001471.out.fastq:
dsrc d SRR001471.dsrc SRR001471.out.fastq
• Decompress archive using 4 threads and streaming FASTQ data to stdout:
dsrc d -t4 -s SRR001471.dsrc » SRR001471.out.fastq

5
3
DSRC integration

DSRC can be easily integrated with applications written in C++ or Python. We provide C++ and
Python libraries with very similar interfaces. Although the methods and members names are al-
most identical in both cases, for clarity the C++ and Python descriptions are divided into two
separate sections.

3.1 P YTHON API

To start using the compressor functionality in Python it’s only needed to import pydsrc module
(pydsrc.so file) in the project. In order to provide high performance of compression routines the
core operations were written in C++ and exported to Python.

3.1.1 FastqRecord
FastqRecord represents a single DNA sequencing read. The ID, sequence, plus, and quality fields
are accessible as Python string type.

3.1.2 FastqFile
FastqFile is used for reading and writing FASTQ files in a sequential manner, read by read,
where each is of type FastqRecord. Figure 3.1 shows FastqFile available public methods—all
methods throw exception on error or failure.

Table 3.1: FastqFile public methods


Method Returns Parameters Description
Open — string filename Opens specified file.
Creates a new file or overwrites one if
Create — string filename
exists.
Close — — Closes the file.
Gets the next record. Returns True on
FastqRecord
ReadNextRecord boolean successful read and False on reaching
record
the end of file.
FastqRecord
WriteRecord — Writes the next record.
record

6
3.1.3 DsrcArchive
DsrcArchive represents DSRC archive file and provides compression and decompression rou-
tines. Figure 3.2 shows DsrcArchive available public methods—all methods throw exception on
error or failure. Figure 3.3 shows public properties. It’s important to note, that properties cannot
be set, when DsrcArchive is already in processing mode (compression or decompression routines
have been stared).

Table 3.2: DsrcArchive methods


Method Returns Parameters Description
Creates a new DSRC archive and pre-
StartCompress — string filename
pares for compression.
WriteNextRecord — FastqRecord rec Writes a new record to file.
Finalizes the compression of the
FinishCompress — —
archive and performs cleanup.
Opens a DSRC archive and prepares
StartDecompress — string filename
for decompression.
Reads the next decompressed FASTQ
FastqRecord record from archive. Returns True on
ReadNextRecord boolean
record success and False on reaching end of
archive.
Finalizes the decompression of the
FinishDecompress — —
archive and performs cleanup.

Table 3.3: DsrcArchive properties


Property Type Description
DnaCompressionLevel int DNA compression level: 0–3. Default: 0.
QualityCompressionLevel int Quality compression level: 0–2. Default: 0.
Size of input FASTQ buffer specified in MB. De-
FastqBufferSize int
fault: 8.
Lossy quality compression mode indicator. De-
LossyCompression boolean
fault: False.
Mask defining ID fields to keep. Default: 0 (keep
TagFieldFilterMask int
all).
Value of Quality offset—typically 33 or 64. Default:
QualityOffset int
auto determined.
CRC32 checksum check during compression indi-
Crc32Checking boolean
cator. Default: False.
Color space archive type (e.g., ABI SOLiD) indica-
ColorSpace boo lean
tor. Default: False.
Flag indicating keeping additional information in
PlusRepetition boolean
plus lines. Default: False.

7
3.1.4 DsrcModule
DsrcModule provides automated and parallel compression routines, working on whole files in-
stead of single records. Figures 3.4 and 3.5 show available public methods and public properties.
All methods throw exception on error or failure.

Table 3.4: DsrcModule methods


Method Returns Parameters Description
string
fastqFilename, Compresses the FASTQ file to DSRC
Compress —
string archive.
dsrcFilename
string
dsrcFilename, Decompresses the DSRC archive to file
Decompress —
string in FASTQ format.
fastqFilename

Table 3.5: DsrcModule properties


Property Type Description
DnaCompressionLevel int DNA compression level: 0–3. Default: 0.
QualityCompressionLevel int Quality compression level: 0–2. Default: 0.
FastqBufferSize int Size of input FASTQ buffer in MB. Default: 8.
Lossy Quality compression mode indicator. De-
LossyCompression boolean
fault: False.
Mask defining ID fields to keep. Default: 0 (keep
TagFieldFilterMask int
all).
Value of Quality offset, typically 33 or 64. Default:
QualityOffset int
auto determined.
Flag indicating if CRC32 checking is enabled dur-
Crc32Checking boolean
ing compression. Default: False.

8
3.2 P YTHON EXAMPLES

3.2.1 Compressing FASTQ file using automated DsrcModule module

import pydsrc

filename_fq = " SRR001471 . fastq "


4 filename_dsrc = " SRR001471 . dsrc "

# create and configure DSRC module


module = pydsrc . DsrcModule ()
8 module . L o s s y C o m p r e s s i o n = True
module . D n a C o m p r e s s i o n L e v e l = 2
module . Q u a l i t y C o m p r e s s i o n L e v e l = 2
module . F a s t q B u f f e r S i z e M B = 512
12 module . ThreadsNumber = 2

module . Compress ( filename_fq , filename_dsrc )

3.2.2 Decompressing DSRC archive using automated DsrcModule module

import pydsrc

filename_dsrc = " SRR001471 . dsrc "


4 filename_fq = " SRR001471 . dsrc . fastq "

# create and configure DSRC module


module = pydsrc . DsrcModule ()
8 module . ThreadsNumber = 2

module . Decompress ( filename_dsrc , filename_fq )

9
3.2.3 Manual compression using DsrcArchive and FastqFile

import pydsrc

filename_fq = " SRR001471 . fastq "


4 filename_dsrc = " SRR001471 . dsrc "

fqfile = pydsrc . FastqFile ()


fqfile . Open ( filename_fq )
8
# create and configure DSRC archive file
archive = pydsrc . DsrcArchive ()
archive . D n a C o m p r e s s i o n L e v e l = 2
12
# lossy compress quality values using use Illumina bining scheme
archive . L o s s y C o m p r e s s i o n = True
archive . Q u a l i t y C o m p r e s s i o n L e v e l = 2
16
archive . F a s t q B u f f e r S i z e M B = 512

# do not keep extra information in ’ plus ’ field


20 archive . Plus Repet itio n = False

archive . StartCompress ( filename_dsrc )

24
# read all records from FASTQ file and write to DSRC archive
rc = 0
rec = pydsrc . FastqRecord ()
28 while fqfile . Read Next Recor d ( rec ):
archive . W ri t eN e xt R ec o rd ( rec )
rc += 1

32 # save archive and close FASTQ file


archive . Fini shCom pres s ()
fqfile . Close ()

36 print " Success !\ nRecords ␣ written : ␣ % d " % rc

10
3.2.4 Manual decompression using DsrcArchive and FastqFile

import pydsrc

filename_dsrc = " SRR001471 . dsrc "


4 filename_fq = " SRR001471 . dsrc . fastq "

# open existing DSRC archive file


8 archive = pydsrc . DsrcArchive ()
archive . S ta r tD e co m pr e ss ( filename_dsrc )

# create results FASTQ file


12 fqfile = pydsrc . FastqFile ()
fqfile . Open ( filename_fq )

# read all records from DSRC archive and write to FASTQ file
16 rc = 0
rec = pydsrc . FastqRecord ()
while archive . Re adNex tRec ord ( rec ):
fqfile . Wr i te N ex t Re c or d ( rec )
20 rc += 1

# close ( and save ) FASTQ file and close DSRC archive


fqfile . Close ()
24 archive . F i n i s h D e c o m p r e s s ()

print " Success !\ nRecords ␣ written : ␣ % d " % rc

11
3.3 C++ API

To start using DSRC C++ library it’s only needed to include Dsrc.h header file and to link appli-
cation with libdsrc (libdsrc.a file under Linux or libdsrc.lib file under Windows) library.

3.3.1 FastqRecord
FastqRecord stores a single DNA sequencing read information. The IDs, sequence, plus, and
quality fields are represented using std::string type.

3.3.2 FastqFile
FastqFile is used to read and write FASTQ records, where each record is of FastqRecord type.
Figure 3.6 shows FasqtFile public methods—methods throw std::runtime_error exception
on error.

Table 3.6: FastqFile public methods


Method Returns Parameters Description
std::string
Open — Opens specified file for reading.
filename
std::string Creates a new file or overwrites if one
Create —
filename exists for writing.
Close — — Closes the file.
FastqRecord& Gets the next buffered record. Returns
ReadNextRecord bool
record false when reaching end of file.
FastqRecord
WriteNextRecord — Writes the next record.
record

3.3.3 DsrcArchive
DsrcArchve provides methods to read from and write to DSRC archive file in a continuous way,
read by read, where record is of FastqRecord type. Figure 3.7 shows DsrcArchive public meth-
ods, while Figure 3.8 shows public accessors.

3.3.4 DsrcModule
DsrcModule provides automated parallel compression and decompression routines operating on
whole files instead of single records. Figure 3.9 shows available public methods, Figure 3.10 shows
public accessors.

12
Table 3.7: DsrcArchive public methods
Method Returns Parameters Description
Creates a new DSRC archive and pre-
StartCompress — std::string
pares for compression.
FastqRecord
WriteNextRecord — Writes a new record to file.
record
Finalizes the compression of the
FinishCompress — —
archive and performs cleanup.
Opens a DSRC archive and prepares
StartDecompress — std::string
for decompression.
Reads the next decompressed FASTQ
FastqRecord&
ReadNextRecord bool record from archive. Returns false on
record
reaching end-of-file.
Finalizes the decompression of the
FinishDecompress — —
archive and performs cleanup.

Table 3.8: DsrcArchive public accessors


Value
Methods Description
type
Get/Set Gets/Sets the level of DNA compression: 0–3. De-
int
DnaCompressionLevel fault: 0
Get/Set Gets/Sets the level of quality compression: 0–3.
int
QualityCompressionLevel Default: 0.
Get/Set Gets/Sets the size of input FASTQ buffer in MB.
int
FastqBufferSize Default: 8.
Is/Set Gets/Sets the state of lossy quality mode enabled.
bool
LossyCompression Default: false.
Get/Set unsigned Gets/Sets the mask describing ID fields to keep in
TagFieldFilterMask long FASTQ record. Default: 0 (keep all fields).
Gets/sets the quality values offset—typically 33 or
Get/Set QualityOffset int
64. Default: automatic selection.
Gets/sets CRC32 hash calculations. Default:
Is/Set Crc32Checking bool
false.
Gets/sets repetition of ID fields on ‘plus’ field in
Is/Set PlusRepetition bool
FASTQ record.
Gets/sets indicator whether file is encoded in color
Is/Set ColorSpace bool
space (e.g., ABI SOLiD type).

13
Table 3.9: DsrcModule public methods
Method Returns Parameters Description
std::string
fastqFilename, Compresses the FASTQ file to DSRC
Compress bool
std::string archive.
dsrcFilename
std::string
dsrcFilename, Decompresses the DSRC archive to
Decompress bool
std::string file.
fastqFilename

Table 3.10: DsrcModule public accessors


Value
Methods Description
type
Get/Set Gets/Sets the level of DNA compression: 0–3. De-
int
DnaCompressionLevel fault: 0
Get/Set Gets/Sets the level of quality compression: 0–2.
int
QualityCompressionLevel Default: 0.
Get/Set Gets/Sets the size of input FASTQ buffer in MB.
int
FastqBufferSize Default: 8.
Is/Set Gets/Sets the state of lossy quality mode enabled.
bool
LossyCompression Default: false.
Get/Set unsigned Gets/Sets the mask describing ID fields to keep in
TagFieldFilterMask long FASTQ record. Default: 0 (keep all fields).
Gets/sets the quality values offset—typically 33 or
Get/Set QualityOffset int
64. Default: automatic selection.
Gets/sets CRC32 hash calculations. Default:
Is/Set Crc32Checking bool
false.

14
3.4 C++ EXAMPLES

3.4.1 Compressing FASTQ file using automated DsrcModule module

# include < string >


# include < iostream >

4 # include " Dsrc . h "


using namespace dsrc :: lib ;

int main ( int argc_ , char * argv_ [])


8 {
if ( argc_ != 3)
return -1;

12 const std :: string inFastqFile = argv_ [0];


const std :: string outDsrcFile = argv_ [1];

try
16 {
DsrcModule dsrc ;

// compress archive in MED mode ( flags : - d2 - q2 - b64 )


20 dsrc . S e t Q u a l i t y C o m p r e s s i o n L e v e l (2);
dsrc . S e t D n a C o m p r e s s i o n L e v e l (2);
dsrc . S e t F a s t q B u f f e r S i z e M B (64);

24 // additionally perform CRC32 checking when compressing


dsrc . S e t C r c 3 2 C h e c k i n g ( true );

// set maximum compressing worker threads number


28 dsrc . S e t T h r e a d s N u m b e r (4);

dsrc . Compress ( inFastqFile , outDsrcFile );


}
32 catch ( const DsrcException & e )
{
std :: cerr << " Error ! " << std :: endl ;
std :: cerr << e . what () << std :: endl ;
36 return -1;
}

std :: cout << " Success ! " << std :: endl ;


40 return 0;
}

15
3.4.2 Decompressing DSRC archive using automated DsrcModule module

# include < string >


# include < iostream >

4 # include " Dsrc . h "


using namespace dsrc :: lib ;

int main ( int argc_ , char * argv_ [])


8 {
if ( argc_ != 3)
return -1;

12 const std :: string inDsrcFile = argv_ [0];


const std :: string outFastqFile = argv_ [1];

try
16 {
DsrcModule dsrc ;

// set maximum decompressing worker threads number


20 dsrc . S e t T h r e a d s N u m b e r (4);

dsrc . Decompress ( inDsrcFile , outFastqFile );


}
24 catch ( const DsrcException & e )
{
std :: cerr << " Error ! " << std :: endl ;
std :: cerr << e . what () << std :: endl ;
28 return -1;
}

std :: cerr << " Success ! " << std :: endl ;


32 return 0;
}

16
3.4.3 Manual compression using DsrcArchive and FastqFile

# include < string >


# include < iostream >

4 # include " Dsrc . h "


using namespace dsrc :: lib ;

int main ( int argc_ , char * argv_ [])


8 {
if ( argc_ != 3)
return -1;

12 const std :: string inFastqFile = argv_ [0];


const std :: string outDsrcFile = argv_ [1];

// 1. Open FASTQ file


16 FastqFile fastqFile ;
try
{
fastqFile . Open ( inFastqFile );
20 }
catch ( const DsrcException & e )
{
std :: cerr << " Error !\ n " << e . what () << std :: endl ;
24 return -1;
}

// 2. Create and configure DSRC archive


28 DsrcArchive archive ;
try
{
// compress quality values using Illumina bining scheme
32 archive . S e t L o s s y C o m p r e s s i o n ( true );
archive . S e t Q u a l i t y C o m p r e s s i o n L e v e l (2);

archive . S e t D n a C o m p r e s s i o n L e v e l (2);
36 archive . S e t F a s t q B u f f e r S i z e M B (512);

// discard repeated TAG information in "+" lines


archive . S e t P l u s R e p e t i ti o n ( false );
40
archive . StartCompress ( outDsrcFile );
}
catch ( const DsrcException & e )
44 {
fastqFile . Close ();

std :: cerr << " Error !\ n " << e . what () << std :: endl ;
48 return -1;
}

// 3. When configured , start compression


52 FastqRecord rec ;
long int recCount = 0;
while ( fastqFile . Rea dNex tReco rd ( rec ))
{
56 archive . W ri t eN e xt R ec o rd ( rec );
recCount ++;

17
}

60 // 4. Finish compression and close FASTQ file


archive . Finis hCom press ();
fastqFile . Close ();

64 std :: cout << " Sucess !\ nCompressed ␣ records : ␣ " << recCount << std :: endl ;
return 0;
}

18
3.4.4 Manual decompression using DsrcArchive and FastqFile
# include < string >
# include < iostream >

4 # include " Dsrc . h "


using namespace dsrc :: lib ;

int main ( int argc_ , char * argv_ [])


8 {
if ( argc_ != 3)
return -1;

12 const std :: string inDsrcFile = argv_ [0];


const std :: string outFastqFile = argv_ [1];

// 1. Open DSRC archive


16 DsrcArchive archive ;
try
{
archive . S ta r tD e co m pr e ss ( inDsrcFile );
20 }
catch ( const DsrcException & e )
{
std :: cerr << " Error !\ n " << e . what () << std :: endl ;
24 return -1;
}

// 2. Create output FASTQ file


28 FastqFile fastqFile ;
try
{
fastqFile . Create ( outFastqFile );
32 }
catch ( const DsrcException & e )
{
archive . F i n i s h D e c o m p r e s s ();
36
std :: cerr << " Error !\ n " << e . what () << std :: endl ;
return -1;
}
40
// 3. Start decompression
FastqRecord rec ;
long int recCount = 0;
44 while ( archive . R eadN extRe cord ( rec ))
{
fastqFile . W ri t eN e xt R eco rd ( rec );
recCount ++;
48 }

// 4. Finish decompression and close FASTQ file


archive . F i n i s h D e c o m p r e s s ();
52 fastqFile . Close ();

std :: cout << " Success !\ nDecompressed ␣ records : ␣ " << recCount << std :: endl ;
return 0;
56 }

19

You might also like