W9-SIO1003 Practical 4-Questions
Instructions:
1. Two questions are given. Each question carries 10 marks. Answer all questions.
2. Please submit your answers in .doc, .docx, or .pdf format via Spectrum-UM, with
the filename in the following format: SIO1003_ID_FIRSTNAME_W9_P4
3. Please be reminded that plagiarism is an academic offense. If you are found to
have plagiarized your colleague's work, you will be penalized with 0 marks.
4. Submission deadline: next Tuesday, 11.59 pm
a. Description
(no need to write the whole thing)
= Homo sapiens CF transmembrane conductance regulator (CFTR),
RefSeqGene (LRG_663) on chromosome 7
b. E value
= 0.0
c. Percent identity
= 503/503 (100%)
d. Query cover
= 100%
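The percent identity and query cover values above are simple ratios, and the arithmetic can be sketched as follows. This is a minimal illustration, not how BLAST itself is implemented: the function names are hypothetical, and the query length of 503 and the aligned range 1 to 503 are assumptions inferred from the reported 503/503 identities and 100% query cover.

```python
def percent_identity(identities, alignment_length):
    """Percentage of aligned positions where query and subject agree."""
    return 100.0 * identities / alignment_length


def query_cover(query_start, query_end, query_length):
    """Percentage of the query that falls inside the aligned region."""
    aligned = query_end - query_start + 1
    return 100.0 * aligned / query_length


# Values from the top hit: 503 identical positions out of 503 aligned.
print(percent_identity(503, 503))  # 100.0
# Assumed: the alignment spans query positions 1..503 of a 503-base query.
print(query_cover(1, 503, 503))    # 100.0
```

For a hit made of several separate aligned segments, BLAST reports the cover over all of them combined, which this single-range sketch does not model.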
2. Now scroll down to the Alignments heading. Look at the top result, which should
be the same one. Look at the alignment between your query and the reference.
Do you see any mismatches?
= No mismatch.
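Checking an alignment for mismatches amounts to comparing the two aligned rows position by position. The sketch below shows one way to do that in Python. The function name and the short example sequences are hypothetical and are not taken from the BLAST result; the function assumes both rows have already been aligned to the same length.

```python
def mismatch_positions(query, subject):
    """Return the 1-based positions where two aligned sequences differ."""
    if len(query) != len(subject):
        raise ValueError("aligned sequences must have the same length")
    return [
        i + 1
        for i, (q, s) in enumerate(zip(query, subject))
        if q != s
    ]


# Hypothetical toy alignment with a single substitution at position 5.
print(mismatch_positions("ATGGACTAAT", "ATGGTCTAAT"))  # [5]
# Identical rows, as in a 100% identity hit, give an empty list.
print(mismatch_positions("ATGGACTAAT", "ATGGACTAAT"))  # []
```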
4. What is this gene? Google the name of the gene and write down something
significant you learned about it.
= CFTR gene
This gene encodes a member of the ATP-binding cassette (ABC) transporter
superfamily. The encoded protein functions as a chloride channel, making it
unique among members of this protein family, and controls ion and water
secretion and absorption in epithelial tissues. Channel activation is mediated by
cycles of regulatory domain phosphorylation, ATP-binding by the nucleotide-
binding domains, and ATP hydrolysis. Mutations in this gene cause cystic
fibrosis, the most common lethal genetic disorder in populations of Northern
European descent. The most frequently occurring mutation in cystic fibrosis,
DeltaF508, results in impaired folding and trafficking of the encoded protein.
Multiple pseudogenes have been identified in the human genome.
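To illustrate why DeltaF508 removes a single amino acid without shifting the reading frame, here is a small sketch. It assumes the commonly cited coding sequence around codons 506 to 510 (ATC ATC TTT GGT GTT) and the commonly described three-base CTT deletion; neither is taken from the BLAST result above. The codon table is deliberately limited to the codons used here, and the function name is hypothetical.

```python
# Minimal codon table: only the codons that appear in this example.
CODONS = {"ATC": "I", "ATT": "I", "TTT": "F", "GGT": "G", "GTT": "V"}


def translate(dna):
    """Translate a DNA string codon by codon, ignoring trailing bases."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, usable, 3))


# Assumed wild-type codons 506-510: Ile Ile Phe Gly Val.
wild_type = "ATCATCTTTGGTGTT"
# Delete the three bases CTT (0-based indices 5, 6 and 7).
mutant = wild_type[:5] + wild_type[8:]

print(translate(wild_type))  # IIFGV
print(translate(mutant))     # IIGV
```

Because exactly three bases are removed, the downstream codons stay in frame, and only the phenylalanine (F) is lost from the protein.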
Exercise 2: Investigating sets of sequences
Assume that you have joined a bioinformatics company as a staff member after your
undergraduate study. Your supervisor is actively involved in research and has a
number of sequences obtained from a sequencing experiment. He has given you three
sets of sequences to analyze and obtain some information from.
The sequences are given in the table below:
SET 1
>Sequence1a
GTAATGTACATAACATTAATGTAATAAAGA
>Sequence1b
ATCACGAGCTTAATTACCATGCCGCGTGAAACCAGCAACC
>Sequence1c
ATGGACTAATGGCTAATCAGCCCATGCTCACACATA
SET 2
>Sequence2a
TTTGGTTGTTCGACGACGGATGCAGAGCTCAGGGAAGTGGGGACGTGTTTTG
GCTATCCT
>Sequence2b
GCGATGCATCAGGATGCATCCTCTGATCTTAGGGTGGTACGAGAAAAATTGA
AGAATGTA
>Sequence2c
GCGGTTCCACAAGACCCTGAGGCGCCTGGTGCCTGACTCGGACGTCCGGTT
CCTCCTCTC
SET 3
>Sequence3a
TAACCTACGGGTGGCCGCAGTGGGGAATATTGCACAATGGACACAAGTCTGA
TGCAGCGACGCCGCGTGGGGGATGAAGGCTTTCGGGTTGTAAACTCCTTTC
AGTACAGAAGAAGCATTTTTGTGACGGTATGTGCAGAAGAAGCGCCGGCTAA
CTACGTGCCAGCAGCCGCGGTAATACGTAGGGCGCGAGCGTTGTCCGGAAT
TATTGGGCGTAAAGAGCTCGTAGGCGGTTTGTTGCGCCTGCTGTG
>Sequence3b
TGTCCTACGGGGGGCTGCAGTGAGGAATATTGGTCAATGGGCGAGAGCCTG
AACCAGCCAAGTCGCGTGAAGGATGACTGTCTTATGGATTGTAAACTTCTTTT
ATACGGGAATAACAAGAGTCACGTGTGGCTCCCTGCATGTACCGTATGAATA
AGCATCGGCTAACTCCGTGCCAGCAGCCGCGGTAATACGGAGGATGCGAGC
GTTATCCGGATTTATTGGGTTTAAAGGGTGCGTAGGCGGC
>Sequence3c
GGCCTACGGGGGGCTGCAGTGGGTACGGGCAGACTAGAGTGTGGTAGGGG
TAATTGGAATTCCTGGTGTAGCGGTGGAATGCGCAGATATCAGGAGGAACAC
CGATGGCGAAGGCAGGTTACTGGGCCATTACTGACGCTGAGGAGCGAAAGC
GTGGGTAGCGAACAGGATTAGATACCCTAGTAGTCT
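Before searching these sequences against a database, it can help to get basic information such as length and GC content for each one. The following is a minimal sketch of that first step, using only the Python standard library. The function names are hypothetical, and only Set 1 is embedded as sample input; the other sets can be pasted in the same FASTA layout. The parser joins sequences that are wrapped over several lines.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict of {name: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = ""
        elif name is not None:
            # Wrapped sequence lines are appended to the current record.
            records[name] += line
    return records


def gc_content(seq):
    """Return the percentage of G and C bases in a sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


# Set 1 as given in the table above.
SET_1 = """\
>Sequence1a
GTAATGTACATAACATTAATGTAATAAAGA
>Sequence1b
ATCACGAGCTTAATTACCATGCCGCGTGAAACCAGCAACC
>Sequence1c
ATGGACTAATGGCTAATCAGCCCATGCTCACACATA
"""

records = parse_fasta(SET_1)
for name, seq in records.items():
    print(f"{name}\tlength={len(seq)}\tGC={gc_content(seq):.1f}%")
```

Length and GC content do not identify a sequence by themselves, but they give a quick sanity check that each record was copied completely before it is submitted to a search tool such as BLAST.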