
OTUbase: December 9, 2011

The OTUbase package provides tools for organizing and analyzing operational taxonomic unit (OTU) data. It includes classes like OTUset and TAXset that allow storing of OTU sequences, classifications, sample data and metadata. Functions are provided for summarizing data, calculating abundances, clustering samples, and accessing metadata slots. The package aims to make OTU data an accessible format for microbiome analysis.


OTUbase

December 9, 2011

OTUbase-package

The OTUbase package: A tool for organizing and manipulating Operational Taxonomic Unit data

Description
The OTUbase Base class for OTU data

Details
Package: OTUbase
Type: Package
Version: 0.1.0
Date: 2010-04-05
License: Artistic-2.0
LazyLoad: yes

OTUbase includes a number of OTUset-type classes which provide structure for OTU-based data. These classes allow the user to store information that may be useful in the analysis of OTUs. Slots are provided for sequence and quality values, OTU classifications, sample identifications, and metadata associated with samples and OTUs. In addition, basic functions are provided for the analysis and visualization of the data. Beyond OTU-type analysis, classification data is also supported through the TAXset classes.

Author(s)
Daniel Beck - <[email protected]>, Matt Settles - <[email protected]>, and James Foster
Maintainer: Daniel Beck - <[email protected]>

References
An_introduction_to_OTUbase.pdf

.OTUset-class

"OTUset" class for OTU data

Description
This class provides a way to store and manipulate operational taxonomic unit data. ".OTUset" is inherited by "OTUsetQ", "OTUsetF", and "OTUsetB". Use "OTUsetQ" when quality data is available, "OTUsetF" when sequence data (without quality data) is available, and "OTUsetB" when only OTU and sample data are available.

Slots
OTUsetB includes slots id, sampleID, otuID, sampleData, assignmentData.
OTUsetF includes slots id, sampleID, otuID, sampleData, assignmentData, sread.
OTUsetQ includes slots id, sampleID, otuID, sampleData, assignmentData, sread, quality.

Methods
Methods include:
id: provides access to the id slot of object
sampleID: provides access to the sampleID slot of object
otuID: provides access to the otuID slot of object
sampleData: provides access to the sampleData slot of object
assignmentData: provides access to the assignmentData slot of object
sread: provides access to the sread slot of object
quality: provides access to the quality slot of object
seqnames: returns the first word of the id line. Intended to extract the sequence name from other sequence information.
nsamples: returns the number of samples in an OTUset object
notus: returns the number of OTUs in an OTUset object
show: signature(object=".OTUset"): provides a brief summary of the object, including its class, number of sequences, number of samples, and number of OTUs.

Examples
showClass(".OTUset")
showMethods(class=".OTUset")
showClass("OTUsetQ")
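The relationship between a virtual parent class and its B, F and Q variants can be pictured with a small S4 sketch. The class names (ToyBase, ToyB, ToyF, ToyQ), the generic nsamplesToy, and the slot types are hypothetical stand-ins and are not the OTUbase definitions; plain character vectors are assumed in place of the sequence, quality and metadata objects the real slots hold.

```r
library(methods)

# Hypothetical virtual parent, standing in for ".OTUset" (simplified slot types).
setClass("ToyBase",
         representation("VIRTUAL",
                        id = "character",
                        sampleID = "character",
                        otuID = "character"))

# "B" variant: only OTU and sample assignments.
setClass("ToyB", contains = "ToyBase")

# "F" variant: adds sequences.
setClass("ToyF", contains = "ToyBase",
         representation(sread = "character"))

# "Q" variant: adds sequences and quality strings.
setClass("ToyQ", contains = "ToyBase",
         representation(sread = "character", quality = "character"))

# A method defined once on the parent is available to every variant.
setGeneric("nsamplesToy", function(object) standardGeneric("nsamplesToy"))
setMethod("nsamplesToy", "ToyBase",
          function(object) length(unique(object@sampleID)))

x <- new("ToyQ",
         id = c("r1", "r2", "r3"),
         sampleID = c("A", "A", "B"),
         otuID = c("otu1", "otu2", "otu1"),
         sread = c("ACGT", "ACGA", "ACGT"),
         quality = c("IIII", "IIII", "IIII"))

n_samples <- nsamplesToy(x)  # counts distinct sample labels
```

Defining accessors on the parent class is what lets one function serve all three variants, which is presumably why the methods above are listed once for ".OTUset".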

.TAXset-class

"TAXset" class for TAX data

Description
This class provides a way to store and manipulate read-classification data. ".TAXset" is inherited by "TAXsetQ", "TAXsetF", and "TAXsetB". Use "TAXsetQ" when quality data is available, "TAXsetF" when sequence data (without quality data) is available, and "TAXsetB" when only classification and sample data are available.

Slots
TAXsetB includes slots id, sampleID, tax, sampleData, assignmentData.
TAXsetF includes slots id, sampleID, tax, sampleData, assignmentData, sread.
TAXsetQ includes slots id, sampleID, tax, sampleData, assignmentData, sread, quality.

Methods
Methods include:
id: provides access to the id slot of object
sampleID: provides access to the sampleID slot of object
tax: provides access to the tax slot of object
sampleData: provides access to the sampleData slot of object
assignmentData: provides access to the assignmentData slot of object
sread: provides access to the sread slot of object
quality: provides access to the quality slot of object
seqnames: returns the first word of the id line. Intended to extract the sequence name from other sequence information.
nsamples: returns the number of samples in a TAXset object
show: signature(object=".TAXset"): provides a brief summary of the object, including its class, number of sequences, and number of samples.

Examples
showClass(".TAXset")
showMethods(class=".TAXset")
showClass("TAXsetQ")

abundance

abundance

Description
abundance generates an abundance table. This table can be either weighted or unweighted.

Usage
abundance(object, ...)

Arguments
object: An OTUset or a TAXset object
...: Additional arguments. These depend on whether the object is an OTUset or a TAXset object.

Details
These are other arguments passed to abundance:
taxCol: When generating the abundance from a TAXset object, taxCol selects the column of the tax dataframe from which to calculate the abundance.
assignmentCol: When generating the abundance from an OTUset object, assignmentCol selects a column of the assignmentData dataframe to use when calculating abundance. This overrides the default of creating an abundance table of the OTUs and instead creates an abundance table of a column in the assignmentData dataframe.
sampleCol: Generates the abundance table using a column of the sampleData dataframe instead of the default sampleID.
collab: An optional parameter that selects a column of the sampleData dataframe to use when labeling the columns of the abundance table.
weighted: FALSE by default. When set to TRUE, abundance returns proportional abundances.

Value
The returned value is a data.frame.

Examples
## locate directory with data
dirPath <- system.file("extdata/Sogin_2006", package="OTUbase")

## read in data into OTUset object
soginOTU <- readOTUset(dirPath=dirPath, level="0.03", samplefile="sogin.groups", fastafil
## calculate abundance
abundance(soginOTU, collab="Site")
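The difference between an unweighted and a weighted table can be illustrated in base R without the package. This is only a sketch: the reads data frame is invented, and it assumes that "proportional abundances" means each sample's counts are rescaled to sum to 1, which the text above does not spell out.

```r
# Hypothetical per-read assignments: one row per sequence.
reads <- data.frame(
  sampleID = c("S1", "S1", "S1", "S2", "S2", "S3"),
  otuID    = c("otu1", "otu1", "otu2", "otu1", "otu3", "otu3"),
  stringsAsFactors = FALSE
)

# Unweighted table: raw counts, OTUs as rows and samples as columns.
counts <- table(reads$otuID, reads$sampleID)
abund <- as.data.frame.matrix(counts)

# Weighted table (assumed meaning): each sample column is rescaled to sum to 1.
abund_weighted <- as.data.frame.matrix(prop.table(counts, margin = 2))

print(abund)
print(abund_weighted)
```

Proportions make samples with very different sequencing depths comparable, at the cost of discarding the raw read counts.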

accessors

Accessor functions for OTUset and TAXset objects

Description
These functions provide access to some of the slots of OTUset and TAXset objects. otuID returns the otuID slot of OTUset objects. sampleID returns the sampleID slot of both OTUset and TAXset objects. tax and tax<- return and replace the tax slot of TAXset objects.

Usage
sampleID(object, ...)
otuID(object, ...)
tax(object, ...)
tax(object)<-value

Arguments
object: An OTUset or a TAXset object
value: The replacement value for tax
...: Added for completeness. Enables the passing of arguments.

Value
sampleID and otuID return a character. tax returns a data.frame.

See Also
ShortRead

Examples
## locate directory with data
dirPath <- system.file("extdata/Sogin_2006", package="OTUbase")

## read in data into OTUset object
soginOTU <- readOTUset(dirPath=dirPath, level="0.03", samplefile="sogin.groups", fastafil
## get the sampleID slot
sampleID(soginOTU)
## get the otuID slot
otuID(soginOTU)

assignmentData

assignmentData

Description
These accessors access and replace the assignmentData slot of OTUbase objects. assignmentData is an AnnotatedDataFrame. assignmentData and assignmentData<- access and replace this AnnotatedDataFrame. assignmentLabels and assignmentLabels<- access and replace the labels of this AnnotatedDataFrame. aData and aData<- access and replace the dataframe component of the AnnotatedDataFrame. assignmentNames returns the assignment names present in the assignmentData slot.

Usage
aData(object,...)
aData(object)<-value
assignmentData(object,...)
assignmentData(object)<-value
assignmentLabels(object,...)
assignmentLabels(object)<-value
assignmentNames(object,...)

Arguments
object: An OTUset or a TAXset object
value: The replacement value for assignmentData or assignmentLabels
...: Added for completeness. Enables the passing of arguments.

Value
aData returns a dataframe. assignmentData returns an AnnotatedDataFrame. assignmentLabels returns a character. assignmentNames returns a character.

Examples
## locate directory with data
dirPath <- system.file("extdata/Sogin_2006", package="OTUbase")

## read in data into OTUset object
soginOTU <- readOTUset(dirPath=dirPath, level="0.03", samplefile="sogin.groups", fastafil
## get the aData dataframe
aData(soginOTU)
## get the assignmentData slot
assignmentData(soginOTU)

clusterSamples

clusterSamples

Description
This function is a wrapper for the vegan function vegdist and for hclust. It allows the user to cluster samples using a number of different distance measures and clustering methods. Please see the documentation for vegdist and hclust for a more in-depth explanation.

Usage
clusterSamples(object, ...)

Arguments
object: An OTUset or a TAXset object
...: Additional arguments. These depend on whether the object is an OTUset or a TAXset object.

Details
These are other arguments passed to clusterSamples. For further information on specific arguments, please see abundance, vegdist, or hclust.
taxCol: Column of the tax slot dataframe on which to cluster (unique to TAXset objects). Passed to the abundance function.
assignmentCol: Column of the assignmentData dataframe used to classify sequences for clustering. This overrides the default of using the OTUs to cluster samples. Passed to the abundance function.
collab: Specifies a column of the sampleData dataframe that will provide the sample labels for the cluster analysis. Passed to the abundance function.
distmethod: The distance method to be used. This value is passed to the vegdist function. The default is the Bray-Curtis distance.
clustermethod: The clustering method to be used. This value is passed to the hclust function. The default is complete linkage clustering.

Examples
## locate directory with data
dirPath <- system.file("extdata/Sogin_2006", package="OTUbase")

## read in data into OTUset object
soginOTU <- readOTUset(dirPath=dirPath, level="0.03", samplefile="sogin.groups", fastafil
## cluster samples
clusterSamples(soginOTU, collab="Site", distmethod="jaccard")
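The two default steps named above, a Bray-Curtis dissimilarity followed by complete linkage clustering, can be sketched in base R. The count matrix is invented, and the Bray-Curtis formula is written out by hand so the sketch does not depend on vegan being installed; it is not the package's own implementation.

```r
# Hypothetical sample-by-OTU count matrix (rows are samples).
counts <- matrix(c(10, 0, 5,
                    8, 1, 6,
                    0, 9, 1),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("S1", "S2", "S3"),
                                 c("otu1", "otu2", "otu3")))

# Bray-Curtis dissimilarity between two count vectors.
bray_curtis <- function(x, y) sum(abs(x - y)) / sum(x + y)

# Fill a symmetric dissimilarity matrix, then convert it to a "dist" object.
n <- nrow(counts)
d <- matrix(0, n, n, dimnames = list(rownames(counts), rownames(counts)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    d[i, j] <- bray_curtis(counts[i, ], counts[j, ])
  }
}
d <- as.dist(d)

# Complete linkage hierarchical clustering on the dissimilarities.
hc <- hclust(d, method = "complete")
print(hc$merge)
```

Calling plot(hc) would draw the dendrogram; with real data the sample labels would come from the column chosen through collab.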

other_functions

Other functions

Description
These are other functions available. Caution is advised when using them. Some are still in development and others only work on specific objects (OTUset or TAXset).

Usage
getOTUs(object, colnum, value, exact)
getSamples(object, colnum, value, exact)
o_diversity(object, ...)
o_estimateR(object, ...)

Arguments
object: An OTUset or a TAXset object.
colnum: The column of the sampleData or assignmentData dataframe that contains the value.
value: The desired value.
exact: If exact=TRUE, value must match exactly. If exact=FALSE, value is matched with grep instead.
...: Other arguments. Often these are passed to abundance.

Details
getOTUs: Returns OTU names that match given values in the assignmentData dataframe.
getSamples: Returns sample names that match given values in the sampleData dataframe.
o_diversity: Wrapper for vegan's diversity function.
o_estimateR: Wrapper for vegan's estimateR function.
otuseqplot: Plots the samples according to number of OTUs and number of sequences.
otusize: Returns the size of each OTU.
otuspersample: Lists the number of OTUs in each sample.
rseqplot: Plots the samples by estimated richness and number of sequences.
seqspersample: Returns the number of sequences in each sample.
sharedotus: Returns the number of OTUs shared between samples.

Examples
## locate directory with data
dirPath <- system.file("extdata/Sogin_2006", package="OTUbase")

## read in data into OTUset object
soginOTU <- readOTUset(dirPath=dirPath, level="0.03", samplefile="sogin.groups", fastafil
getSamples(soginOTU, colnum="Site", value="Labrador", exact=FALSE)

o_estimateR(soginOTU)
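The effect of the exact argument can be shown with a small base R sketch. The helper match_samples and the metadata values are hypothetical; the sketch only mirrors the behaviour described in the Arguments (whole-value matching versus a grep-style pattern search) and is not the code of getSamples.

```r
# Hypothetical sample metadata with a Site column; row names are sample names.
sample_data <- data.frame(
  Site = c("Labrador seawater", "Labrador", "Azores", "Oregon"),
  row.names = c("S1", "S2", "S3", "S4"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: exact comparison or pattern search on one column.
match_samples <- function(df, colnum, value, exact = TRUE) {
  column <- as.character(df[[colnum]])
  hits <- if (exact) column == value else grepl(value, column)
  rownames(df)[hits]
}

exact_hits <- match_samples(sample_data, "Site", "Labrador", exact = TRUE)
grep_hits  <- match_samples(sample_data, "Site", "Labrador", exact = FALSE)
print(exact_hits)  # only the sample whose Site is exactly "Labrador"
print(grep_hits)   # every sample whose Site contains "Labrador"
```

In the pattern branch the value is interpreted as a regular expression, so characters such as "." or "(" would need escaping.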

otherGenerics

Other Generics

Description
Various functions. notus returns the number of OTUs in an OTUset object. nsamples returns the number of samples in either an OTUset or a TAXset object. seqnames returns the sequence names of the OTUset or TAXset object without the extra information commonly present with the id.

Usage
notus(object, ...)
nsamples(object, ...)
seqnames(object, ...)

Arguments
object: An OTUset or a TAXset object.
...: Other arguments. These are currently nonfunctional.

Examples
dirPath <- system.file("extdata/Sogin_2006", package="OTUbase")

## read in data into OTUset object
soginOTU <- readOTUset(dirPath=dirPath, level="0.03", samplefile="sogin.groups", fastafil
## get the number of OTUs
notus(soginOTU)
## get the number of samples
nsamples(soginOTU)
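The idea behind seqnames, keeping only the first word of each id line, can be sketched with one regular expression. The id strings below are invented, and the exact splitting rule used by the package is an assumption here (any whitespace ends the name).

```r
# Hypothetical id lines: a sequence name followed by extra information.
ids <- c("read001 length=250 sample=A",
         "read002\tlength=248",
         "read003")

# Drop everything from the first whitespace character onward.
first_word <- sub("[[:space:]].*$", "", ids)
print(first_word)
```

Ids that contain no whitespace pass through unchanged, so the operation is safe to apply to already-clean names.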

readOTUset

readOTUset

Description
This function reads in data and creates an OTUset object.

Usage
readOTUset(dirPath, otufile, level, fastafile, qualfile, samplefile, sampleA

Arguments
dirPath: The directory path where the data files are located. This is the current directory by default.
otufile: The OTU file. The only format currently supported is the Mothur format for .list files.
level: The OTU clustering level. By default this is 0.03. This level must correspond to levels present in the otufile.
fastafile: The fasta file. This is read in by ShortRead.
qualfile: The quality file. This is read in by ShortRead.
samplefile: The sample file. Currently this must be in Mothur format (.groups).
sampleADF: The sample metadata file. This is in AnnotatedDataFrame format.
assignmentADF: The assignment metadata file (the OTU metadata). This is generally in AnnotatedDataFrame format, although it is also possible to read in an RDP classification file if there is only one read classification for each cluster and rdp=TRUE.
sADF.names: The column of the sampleADF file that has the sample names.
aADF.names: The column of the assignmentADF file that has the assignment names.
rdp: By default this is FALSE. Change to TRUE if assignmentADF is an RDP classification file. The RDP file must be in the fixed format.
otufiletype: The type of OTU file. Takes values "mothur", "cdhit", and "blastclust". Defaults to "mothur".

Examples
dirPath <- system.file("extdata/Sogin_2006", package="OTUbase")

## read in data into OTUset object
soginOTU <- readOTUset(dirPath=dirPath, level="0.03", samplefile="sogin.groups", fastafil
soginOTU

readTAXset

readTAXset

Description
Function to read in data and create a TAXset object.

Usage
readTAXset(dirPath, taxfile, namefile, fastafile, qualfile, samplefile, samp

Arguments
dirPath: The directory path where the data files are located. This is the current directory by default.
taxfile: The classification file. The default format is RDP's fixed format.
namefile: A names file in the Mothur format. This is used to add removed unique sequences back into the dataset.
fastafile: The fasta file. This is read in by ShortRead.
qualfile: The quality file. This is read in by ShortRead.
samplefile: The sample file. Currently this must be in Mothur format (.groups).
sampleADF: The sample metadata file. This is in AnnotatedDataFrame format.
assignmentADF: The assignment metadata file (the OTU metadata). This is in AnnotatedDataFrame format.
sADF.names: The column of the sampleADF file that has the sample names.
aADF.names: The column of the assignmentADF file that has the assignment names.
type: The type of taxfile. By default this is the RDP fixed format. If type is changed to anything else, the read.table function is used to read in the taxfile. In this case the first column of the taxfile must be the sequence names.
...: Additional arguments passed to read.table to read in the taxfile.

Examples
dirPath <- system.file("extdata/Sogin_2006", package="OTUbase")

## read in data into TAXset object
soginTAX <- readTAXset(dirPath=dirPath, samplefile="sogin.groups", fastafile="sogin.fasta
soginTAX

sampleData

sampleData

Description
These functions access and replace the sampleData slot of OTUbase objects. sampleData and sampleData<- access and replace the AnnotatedDataFrame sampleData. sampleLabels and sampleLabels<- access and replace the labels of this AnnotatedDataFrame. sData and sData<- access and replace the dataframe component of the AnnotatedDataFrame.

Usage
sData(object,...)
sData(object)<-value
sampleData(object,...)
sampleData(object)<-value
sampleLabels(object,...)
sampleLabels(object)<-value

Arguments
object: An OTUset or a TAXset object
value: The replacement value for sampleData or sampleLabels
...: Added for completeness. Enables the passing of arguments.

Value
sData returns a dataframe. sampleData returns an AnnotatedDataFrame. sampleLabels returns a character. assignmentNames returns a character.

Examples
## locate directory with data
dirPath <- system.file("extdata/Sogin_2006", package="OTUbase")

## read in data into OTUset object
soginOTU <- readOTUset(dirPath=dirPath, level="0.03", samplefile="sogin.groups", fastafil
## get the sData dataframe
sData(soginOTU)
## get the sampleData slot
sampleData(soginOTU)

subOTUset

subOTUset

Description
Function to get a subset of an OTUset object.

Usage
subOTUset(object, samples, otus)

Arguments
object: An OTUset object
samples: A list of sample names
otus: A list of OTU names

Value
subOTUset returns an OTUset

Examples
## locate directory with data
dirPath <- system.file("extdata/Sogin_2006", package="OTUbase")

## read in data into OTUset object
soginOTU <- readOTUset(dirPath=dirPath, level="0.03", samplefile="sogin.groups", fastafil
## get subset of soginOTU
subOTUset(soginOTU, samples=getSamples(soginOTU, colnum="Site", value="Labrador", exact=F
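Subsetting by sample and OTU names can be pictured as row filtering on a per-read table. The data frame and the helper subset_reads are hypothetical, and the sketch assumes that when both samples and otus are given a read must satisfy both conditions; the manual does not state how subOTUset combines them.

```r
# Hypothetical per-read table standing in for the id, sampleID and otuID slots.
reads <- data.frame(
  id       = c("r1", "r2", "r3", "r4", "r5"),
  sampleID = c("S1", "S1", "S2", "S2", "S3"),
  otuID    = c("otu1", "otu2", "otu1", "otu3", "otu3"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep reads whose sample and OTU are both in the
# requested sets; a NULL argument means "no restriction".
subset_reads <- function(reads, samples = NULL, otus = NULL) {
  keep <- rep(TRUE, nrow(reads))
  if (!is.null(samples)) keep <- keep & reads$sampleID %in% samples
  if (!is.null(otus))    keep <- keep & reads$otuID %in% otus
  reads[keep, , drop = FALSE]
}

by_sample <- subset_reads(reads, samples = c("S1", "S2"))
by_both   <- subset_reads(reads, samples = c("S1", "S2"), otus = "otu1")
print(by_sample$id)
print(by_both$id)
```

Because every slot is indexed by read, filtering the rows keeps the id, sample and OTU assignments aligned with one another.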

subTAXset

subTAXset

Description
Function to get a subset of a TAXset object.

Usage
subTAXset(object, samples)

Arguments
object: A TAXset object
samples: A list of sample names

Value
subTAXset returns a TAXset

Examples
## locate directory with data
dirPath <- system.file("extdata/Sogin_2006", package="OTUbase")

## read in data into TAXset object
soginTAX <- readTAXset(dirPath=dirPath, samplefile="sogin.groups", fastafile="sogin.fasta
## get subset of soginTAX
subTAXset(soginTAX, samples=getSamples(soginTAX, colnum="Site", value="Labrador", exact=F

