Dear Temesgen,
I need some clarification on the output of SLIMM's species-level classification file, "_profile.tsv". For a test sample I get the following output:
taxa_level taxa_id linage abundance read_count
species 9606 k__Eukaryota|p__Chordata|c__Mammalia|o__Primates|f__Hominidae|g__Homo|s__Homo sapiens 89.1426 25602864
species 45219 k__Viruses|p__unknown_phylum|c__unknown_class|o__unknown_order|f__Arenaviridae|g__Mammarenavirus|s__Guanarito mammarenavirus 0.0178544 5128
species 1821749 k__Viruses|p__unknown_phylum|c__unknown_class|o__Picornavirales|f__Picornaviridae|g__Cardiovirus|s__Cardiovirus A 0.0136867 3931
species 0* k__unknown_superkingdom|p__unknown_phylum|c__unknown_class|o__unknown_order|f__unknown_family|g__unknown_genus|s__unknown_species 10.8259 3109321
While most reads are classified as human (89.14%), about 10.8% are assigned to an unknown species. This is confusing, because every read in the BAM file must have been aligned to a reference genome (bowtie2 was run with --no-unal, which suppresses unaligned reads).
Does this 10.8% correspond to a single species? Could the species not be resolved because taxonomic information is missing? Or are these reads simply not discriminative for any one species?
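As a sanity check on how the percentages relate to the read counts, here is a minimal Python sketch. It assumes that the abundance column is simply read_count divided by the total read count over the rows shown above, times 100; that is my assumption and not something I have confirmed from the SLIMM documentation. The function name recompute_abundance is made up for illustration.

```python
# Rows copied from the _profile.tsv excerpt above:
# (taxa_id, reported abundance in percent, read_count).
rows = [
    ("9606", 89.1426, 25602864),
    ("45219", 0.0178544, 5128),
    ("1821749", 0.0136867, 3931),
    ("0*", 10.8259, 3109321),
]


def recompute_abundance(rows):
    """Return {taxa_id: percent of reads}.

    Assumption (not confirmed by SLIMM): abundance = read_count / total * 100,
    where total is the sum of read_count over the given rows.
    """
    total = sum(count for _, _, count in rows)
    return {taxa_id: 100.0 * count / total for taxa_id, _, count in rows}


recomputed = recompute_abundance(rows)
for taxa_id, reported, _ in rows:
    # Print reported and recomputed values side by side for comparison.
    print(f"{taxa_id}\treported={reported}\trecomputed={recomputed[taxa_id]:.4f}")
```

On the excerpt above, the recomputed values agree with the reported ones to within rounding, so the four rows appear to account for all counted reads and the "0*" row holds the remainder.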
All the best,
Johannes