
BLAST and Multiple Sequence Alignment of NCBI 1

1. The sequences in Set 1 are from Bos taurus (cattle) and Cephalophorus (duikers, a genus of small antelopes), suggesting the purpose was to identify which organisms the sequences came from. 2. Set 2 sequences come from two plants, Arabidopsis thaliana and Malus domestica, and one animal, Mus musculus (house mouse), and belong to genes involved in glucose metabolism, indicating the purpose was to study the role of these genes, for example in plant growth. 3. Set 3 sequences include Trueperella pyogenes (an animal pathogen) and an uncultured bacterium clone, suggesting the aim was to identify infection-related organisms and to check which sequences match the clone.

Learning Assessment:

Part 1: Identify sequences with BLAST

Identify unknown sequences

>Unknown sequence 1
AAATGAGTTAATAGAATCTTTACAAATAAGAATATACACTTCTGCTTAGGATGATA
ATTGGAGGCAAGTG
AATCCTGAGCGTGATTTGATAATGACCTAATAATGATGGGTTTTATTTCCAGACTTC
ACTTCTAATGGTG
ATTATGGGAGAACTGGAGCCTTCAGAGGGTAAAATTAAGCACAGTGGAAGAATTT
CATTCTGTTCTCAGT
TTTCCTGGATTATGCCTGGCACCATTAAAGAAAATATCATCTTTGGTGTTTCCTATG
ATGAATATAGATA
CAGAAGCGTCATCAAAGCATGCCAACTAGAAGAGGTAAGAAACTATGTGAAAACT
TTTTGATTATGCATA
TGAACCCTTCACACTACCCAAATTATATATTTGGCTCCATATTCAATCGGTTAGTCT
ACATATATTTATG
TTTCCTCTATGGGTAAGCTACTGTGAATGGATCAATTAATAAAACACATGACCTAT
GCTTTAAGAAGCTT GCAAACACATGAA

>Unknown sequence 2
AAATGAGTTAATAGAATCTTTACAAATAAGAATATACACTTCTGCTTAGGATGATA
ATTGGAGGCAAGTG
AATCCTGAGCGTGATTTGATAATGACCTAATAATGATGGGTTTTATTTCCAGACTTC
ACTTCTAATGGTG
ATTATGGGAGAACTGGAGCCTTCAGAGGGTAAAATTAAGCACAGTGGAAGAATTT
CATTCTGTTCTCAGT
TTTCCTGGATTATGCCTGGCACCATTAAAGAAAATATCATTGGTGTTTCCTATGATG
AATATAGATACAG
AAGCGTCATCAAAGCATGCCAACTAGAAGAGGTAAGAAACTATGTGAAAACTTTTT
GATTATGCATATGA
ACCCTTCACACTACCCAAATTATATATTTGGCTCCATATTCAATCGGTTAGTCTACA
TATATTTATGTTT
CCTCTATGGGTAAGCTACTGTGAATGGATCAATTAATAAAACACATGACCTATGCT
TTAAGAAGCTTGCA AACACATGAA
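
The two unknown sequences are nearly identical, so a direct comparison can be a useful check alongside BLAST. The following is a minimal sketch and not part of the original exercise: it uses a short excerpt copied from each listing above (the excerpt boundaries and the helper name `find_single_deletion` are assumptions made for illustration) and looks for one contiguous block that is present in the longer excerpt and absent from the shorter one.

```python
def find_single_deletion(longer: str, shorter: str):
    """Return (position, deleted_bases) if `shorter` equals `longer` with one
    contiguous block removed, otherwise None."""
    gap = len(longer) - len(shorter)
    if gap <= 0:
        return None
    # Walk along the common prefix of the two sequences.
    pos = 0
    while pos < len(shorter) and longer[pos] == shorter[pos]:
        pos += 1
    # Removing `gap` bases at `pos` must reproduce the shorter sequence.
    if longer[:pos] + longer[pos + gap:] == shorter:
        return pos, longer[pos:pos + gap]
    return None


# Excerpts copied from Unknown sequence 1 and Unknown sequence 2.
seq1_excerpt = "CACCATTAAAGAAAATATCATCTTTGGTGTTTCCTATGATG"
seq2_excerpt = "CACCATTAAAGAAAATATCATTGGTGTTTCCTATGATG"

result = find_single_deletion(seq1_excerpt, seq2_excerpt)
print(result)  # → (21, 'CTT')
```

On these excerpts the sketch reports a three-base block that appears in sequence 1 but not in sequence 2, so the alignment for sequence 2 is worth checking for a gap. Because the surrounding bases repeat, the exact position of such a block can be reported slightly differently by different tools.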
Part 1: Questions

1. In the Descriptions section, look at the top result, which should be the result with the highest
score. Write down information about the best match:

Sequence 1

● Description - Homo sapiens CF transmembrane conductance regulator (CFTR), RefSeqGene (LRG_663) on chromosome 7
● E value - 0.0
● Identity - 100%
● Query - 504/504
● Cover - 100%

Sequence 2

● Description - Homo sapiens CF transmembrane conductance regulator (CFTR), RefSeqGene (LRG_663) on chromosome 7
● E value - 0.0
● Identity - 100%
● Query - 504/504
● Cover - 100%

2. Now scroll down to the Alignments heading. Look at the top result, which should be the same
one. Look at the alignment between your query and the reference. Do you see any mismatches?
- None

3. How can you judge whether this is a good match?

- The alignment shows no mismatches between the query and the reference, and the
statistics support this: the E value is 0.0, the identity is 100% and the query cover is
100%, so the query and the subject correspond to the same gene.
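
To make the identity and coverage figures more concrete, here is a minimal sketch of how they can be derived from a pairwise alignment. It is a simplified illustration, not BLAST's own implementation: the function name `alignment_stats` and the toy alignment are assumptions, and the E value is not reproduced because it also depends on the database size and the scoring scheme.

```python
def alignment_stats(aligned_query: str, aligned_subject: str, query_length: int):
    """Return (percent_identity, percent_query_coverage) for one alignment.

    Both aligned strings must have the same length and use '-' for gaps.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have the same length")
    columns = len(aligned_query)
    # Identical columns: same base in both rows, and not a gap.
    identical = sum(
        1 for q, s in zip(aligned_query, aligned_subject) if q == s and q != "-"
    )
    # Query bases that take part in the alignment (gaps in the query excluded).
    query_bases = sum(1 for q in aligned_query if q != "-")
    identity = 100.0 * identical / columns
    coverage = 100.0 * query_bases / query_length
    return identity, coverage


# Toy alignment (hypothetical): the subject lacks three bases of the query.
identity, coverage = alignment_stats("ATCATCTTTGGT", "ATCAT---TGGT", query_length=12)
print(identity, coverage)  # → 75.0 100.0
```

A result such as 100% identity with 100% coverage means every query base aligned and every aligned column matched, which is why the reported hit is judged a good match.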

4. What is this gene? Google the name of the gene and write down something significant you
learned about it.

- The CFTR gene provides instructions for making a protein called the CF transmembrane
conductance regulator (CFTR). This protein is found in the membranes of epithelial cells
that produce mucus, sweat, saliva, tears and digestive enzymes, so it affects multiple
organ systems in the human body. The CFTR gene therefore plays a vital role: it helps
maintain the balance of salt and water on many surfaces in the body, such as the surface
of the lung.
Part 2: Investigating sets of sequences
Each of the following sets of sequences was obtained from a sequencing experiment.

For each experiment (Set 1, Set 2 and Set 3), answer these questions:
1. What do these sequences have in common?
2. What is your best guess about the original purpose of this experiment?

Set 1
>Sequence1a GTAATGTACATAACATTAATGTAATAAAGA
>Sequence1b ATCACGAGCTTAATTACCATGCCGCGTGAAACCAGCAACC
>Sequence1c ATGGACTAATGGCTAATCAGCCCATGCTCACACATA

1. What do these sequences have in common?

⮚ Sequences 1a and 1b are both from Bos taurus (domestic cattle), while
sequence 1c is from Cephalophorus, a genus of duikers (small antelopes).
All three sequences therefore come from bovids (family Bovidae).

Figure 1: Slanted cladogram of Bos taurus and Cephalophorus

2. What is your best guess about the original purpose of this experiment?
⮚ The purpose of this experiment is to determine which organisms the
sequences in the set come from.

Set 2
>Sequence2a
TTTGGTTGTTCGACGACGGATGCAGAGCTCAGGGAAGTGGGGACGTGTTTTGGCT
ATCCT
>Sequence2b
GCGATGCATCAGGATGCATCCTCTGATCTTAGGGTGGTACGAGAAAAATTGAAG
AATGTA
>Sequence2c
GCGGTTCCACAAGACCCTGAGGCGCCTGGTGCCTGACTCGGACGTCCGGTTCCTCCT
CTC

1. What do these sequences have in common?


- The sequences in Set 2 do not all come from plants. Sequence 2a is from
Arabidopsis thaliana, a small flowering plant that is widely used as a
model organism in plant biology. Sequence 2b is Malus domestica (apple)
hexokinase, which is involved in sugar metabolism and accumulation in
the fruit. Sequence 2c is Mus musculus (house mouse) hexokinase, whose
cellular localization regulates the metabolic fate of glucose. What the
sequences have in common is therefore the kind of gene rather than the
organism: they come from genes involved in glucose metabolism.

2. What is your best guess about the original purpose of this experiment?
- My best guess is that the purpose of this experiment is to compare genes
for glucose metabolism across organisms and to understand their role in
growth, for example in plants.

Set 3
>Sequence3a
TAACCTACGGGTGGCCGCAGTGGGGAATATTGCACAATGGACACAAGTCTGATGCA
GCGACGC CG
CGTGGGGGATGAAGGCTTTCGGGTTGTAAACTCCTTTCAGTACAGAAGAAGCATTT
TTGTGAC GG
TATGTGCAGAAGAAGCGCCGGCTAACTACGTGCCAGCAGCCGCGGTAATACGTAGG
GCGCGA GCG
TTGTCCGGAATTATTGGGCGTAAAGAGCTCGTAGGCGGTTTGTTGCGCCTGCTGTG
>Sequence3b
TGTCCTACGGGGGGCTGCAGTGAGGAATATTGGTCAATGGGCGAGAGCCTGAACC
AGCCAAG TCG
CGTGAAGGATGACTGTCTTATGGATTGTAAACTTCTTTTATACGGGAATAACAAGA
GTCACGT GT
GGCTCCCTGCATGTACCGTATGAATAAGCATCGGCTAACTCCGTGCCAGCAGCCGC
GGTAATA CG
GAGGATGCGAGCGTTATCCGGATTTATTGGGTTTAAAGGGTGCGTAGGCGGC
>Sequence3c
GGCCTACGGGGGGCTGCAGTGGGTACGGGCAGACTAGAGTGTGGTAGGGGTAATTG
GAATTC CTG
GTGTAGCGGTGGAATGCGCAGATATCAGGAGGAACACCGATGGCGAAGGCAGGTT
ACTGGGC CAT
TACTGACGCTGAGGAGCGAAAGCGTGGGTAGCGAACAGGATTAGATACCCTAGTA
GTCT
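
The listings above are wrapped across several lines and contain stray spaces, which is worth cleaning up before pasting a sequence into a search form. The following is a minimal sketch of one way to do that, assuming standard FASTA input with the header on its own line; the function name `parse_fasta` and the record name `Sequence3c_start` are hypothetical, and the two sequence lines are copied from the start of Sequence3c.

```python
def parse_fasta(text: str) -> dict:
    """Parse FASTA text into {record_name: sequence}, joining wrapped lines
    and dropping any whitespace inside the sequence."""
    parts = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: everything after '>' is used as the record name.
            name = line[1:].strip()
            parts[name] = []
        elif name is not None:
            # Sequence line: remove internal spaces and normalize the case.
            parts[name].append("".join(line.split()).upper())
    return {rec: "".join(chunks) for rec, chunks in parts.items()}


example = """>Sequence3c_start
GGCCTACGGGGGGCTGCAGTGGGTACGGGCAGACTAGAGTGTGGTAGGGGTAATTG
GAATTC CTG
"""

records = parse_fasta(example)
print(records["Sequence3c_start"])
```

The cleaned record is a single unbroken string of bases, which is the form a sequence search expects.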

1. What do these sequences have in common?


- Sequences 3a and 3c are both from Trueperella pyogenes, an
opportunistic pathogen that causes suppurative infections in animals,
including humans. Sequence 3b, on the other hand, matches an
uncultured bacterium clone.

Figure 2: Slanted cladogram of Trueperella pyogenes and uncultured bacterium clone.

2. What is your best guess about the original purpose of this experiment?
- The purpose of this experiment is to identify the organisms, and the genes
they carry, that cause infections, and to check whether the given
sequences are the best match to the uncultured clone.
