Practical 01: Retrieval of FASTA Sequence: 01: Insr Gene


Practical 01:

Retrieval of FASTA sequence

01: INSR GENE:


Official Symbol: INSR provided by HGNC
Official Full Name: insulin receptor provided by HGNC
Primary source: HGNC: HGNC: 6091
Ensembl: ENSG00000171105 MIM: 147670
Gene type: protein coding
Organism: Homo sapiens
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini; Catarrhini;
Hominidae; Homo
Location: 19p13.2 
Exon count: 22
Summary
This gene encodes a member of the receptor tyrosine kinase family of proteins.
The encoded preproprotein is proteolytically processed to generate alpha and beta
subunits that form a heterotetrameric receptor. Binding of insulin or other ligands
to this receptor activates the insulin signaling pathway, which regulates glucose
uptake and release, as well as the synthesis and storage of carbohydrates, lipids and
protein. Mutations in this gene underlie the inherited severe insulin resistance
syndromes including type A insulin resistance syndrome, Donohue syndrome and
Rabson-Mendenhall syndrome. Alternative splicing results in multiple transcript
variants. [Provided by RefSeq, Oct 2015].
Homo sapiens chromosome 19, GRCh38.p13 Primary Assembly
TCCACCAAGAAATGTGCTTATTGGATTGGGAGGTGTTTATTTGTAGTCTGCTGTAACACGTGTGAAAGAG
CAGGAGCGTCATCAGCATATGACTTGCGCTGGTCATCCGGTAAATGGATGTGCTGTAGTCCCAGTGCTAA
TCATTTCTCTCCTTCACAGTGGGTGGAAGTTTAGGGTTAAATGTCCTTTGAATGTCACCTGGTGAGTCCT
TGACACCTTAGGCTCTTCAGAAACAATGGTTTTGTTGAGGATGGGGAACAGGGAATGCCGATTTTATATA
CATGGTACACAGAGAGGGGTGTCACTTCAGAAAATCTTCCAGCATGTTCTTCAGAATATTAATTTATATG
CGAGGTGAGGTTGGGAATGAAAAGAACAGGTCAGCACTTTTTTTTTTCCTAGAACATACAAAAGAACATG
GTGGACTTTCAGGGAGTGCAATGGAAGGTGAATATTTCCTTAAGGGTCCCCGAGAAATGGGAGTGAGGGG
AGGGGACACAATGGCTTTTTGAGCTTACTTTTACCTTCTGATACTAGTCAAGGTCCAGAACCAGCCACCA
GCCAAATTTCTATCTGGGTGCGGGCCACTGAAAATCCTTGTTAAAAACCAGATCACAAATCTGGGGCTCT
TGGTCCCATTGGAGAAGGAAGGAAGAGCCTCAAAATAAGTGTGCACCCATGCACATATTCAGGAACAGCT
TGTTTAGTCTTTACACTTTGCCTGAAAGTTGCTTCTCCTCGTCCCTTTGTGTGCCTGGGTGGCCTCGGCC
CTGTGCGTTGGCAACGCAGGATCAAATGTGCTGCAGCTTTTGCAGAAAACAACTCAGAAACACAAAACCC
CCCAACAGCTCAATTATTATTTTTTCAATGTTTTCCTACAAGAGCCAAGTAGCACCATGTACAGAAGACG
CCTTTTTTTTTGGAATATTGAAATCGTTCTGCATGTAAAATATGGGATAATGACCTGTTTATATTAAAAT
TCTGATTAAATTATCTGAGAA

02: ADIPOQ gene


Official Symbol: ADIPOQ provided by HGNC
Official Full Name: adiponectin, C1Q and collagen domain containing provided
by HGNC
Primary source: HGNC: HGNC: 13633
Ensembl: ENSG00000181092 MIM: 605441
Gene type: protein coding
Organism: Homo sapiens
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini; Catarrhini;
Hominidae; Homo
Location: 3q27.3 
Exon count: 4
Summary
This gene is expressed in adipose tissue exclusively. It encodes a protein with
similarity to collagens X and VIII and complement factor C1q. The encoded
protein circulates in the plasma and is involved with metabolic and hormonal
processes. Mutations in this gene are associated with adiponectin deficiency.
Multiple alternatively spliced variants, encoding the same protein, have been
identified. [Provided by RefSeq, Apr 2010]
Homo sapiens chromosome 3, GRCh38.p13 Primary Assembly
 
ATTCTGACTGCAGTCTGTGGTTCTGATTCCATACCAGAGGGTAAGAGCAATTCTGTGAAGTTCCAGGCTG
GGTGGGGGATGCATGCATAGCCTCTGGCTGGGATCACCCAGGCTCTCCCGTCCGTAGTAGTGTGGGAGTG
GATACAGGTGGATACTCTGGTCAGAGCAGCACTGGTGGAGGCAGATATGCACTGGGCTTCTTCCTCCGTT
CTCCCACAGCCCCAAGAGAGAAAGGGTTATTTCAGACATTCCTTCTAAGATGCATGGAACCATTCTGAAT
TTTGCCCAGTTCGCTCTGTAGCAGGATACCTATTGAGAAAAAGTTAGGGTCAGTAAGGTGGAAGGGTCTG
TCCACAGATGAAGTCCAATTCGATTAAGGGGGATAAGGGAATACATTGCCTCTTAGCTTGACCAGGTAGG
GCAAAGGAAGAAGCATATATGAAGGCAGCTTCAGAAAAGTCAAGCTGAGCACTGACTTCAGACTGGAATT
AGGAATCCAGCTCTGCCACTTTATTCTACTCAGCAAATATTTACTGAGCAAATTCTATGGGCTAGACAGT
GGATTGGGTTCACAAGATACAATGAGTGTGACATGGTTGTTGTCTATGGATTTGGGGATATATGTAGGTA
TAGGGATATCTTACAAGGTAATCAAGAGGTTCTAATGAGGCCAGCCATGGTGGCTCACACCTGTAATCCC
AGCAATTTGGGAGACCGAGGCGGGTGGATCACCTGAGGTCAGGAGTTCCAGACTAGCCTGACCAACATGG
TGAAACCCCGCCTCTACCAAAAATACAAAAATTAGTTGGGCGTGATGGCAGGTGCCTGTAATCCCAGCTT
CTCGGGAGGCTGAGGCAGGAGAATTGTCTGAACCTGGGAGGCAGAGGTTGCAGTGAGCCGAGATTGTTGC
CACTGCATTCCAGCCTGGGTGACAGAGCGAGACTTTGTGTCAAAAAAAAAAAAAAAAGAAAGAAAAGAAA
AAGAGGCTCTAATGAGATAAA
03: Leptin receptor gene
Official Symbol: LEPR provided by HGNC
Official Full Name: leptin receptor provided by HGNC
Primary source: HGNC: HGNC: 6554
Ensembl: ENSG00000116678 MIM: 601007
Gene type: protein coding
Organism: Homo sapiens
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini; Catarrhini;
Hominidae; Homo
Location: 1p31.3 
Exon count: 24
Summary:
The protein encoded by this gene belongs to the gp130 family of cytokine
receptors that are known to stimulate gene transcription via activation of cytosolic
STAT proteins. This protein is a receptor for leptin (an adipocyte-specific hormone
that regulates body weight), and is involved in the regulation of fat metabolism, as
well as in a novel hematopoietic pathway that is required for normal
lymphopoiesis. Mutations in this gene have been associated with obesity and
pituitary dysfunction. Alternatively spliced transcript variants encoding different
isoforms have been described for this gene. It is noteworthy that this gene and
LEPROT gene (GeneID: 54741) share the same promoter and the first 2 exons,
however, encode distinct proteins (PMID: 9207021). [Provided by RefSeq, Nov
2010]
 
Homo sapiens chromosome 1, GRCh38.p13 Primary Assembly
 
CCGGTCTGGCTTGGGCAGGCTGCCCGGGCCGTGGCAGGAAGCCGGAAGCAGCCGCGGCCCCAGTTCGGGA
GACATGGCGGGCGTTAAAGGTACATCGCGGTCCCCGGCTCGCTTGTCGTGTGGTGGGGTTGCCACCTCCG
TTCCGGTCAAGCCTGGGGCTGCGCCTTCCGCGCGCCGTTGGGGAACGGCCTCACCACCCTTCCCGCCTCT
CCGGTTCGGGAGGCGATCGACCGCTCCCTTCGTCCCTTGGGGTGTGGGTGGAGCGGCGTTTCGGGGGAGC
CTTGGCCCTTTTTGCAAGGCCGCTGGCCTTCCTGCTCTCCAGTGGCGGCACCAGACCCCTCCCCAGCCTG
AGCCCTCGGCGAGAGCGGCGCCCTCCACCTCTCCTGAGTTTTTAAACAAGGTTTCCACTTCTTGACGCCA
CCCTAACGCCTTTCTTCTGGTTTTCTGCCCCTCCGCAGTTTCCTATTTGCCACAAAGGACCCTTTGTCTC
CAGCGTTCTTAGCATTGGGAAACTTAATCCTCTTTTTCCTAATGATTTTTCCAACTGGGCAGAGTTGACC
GCGGGCGGGTGTCAATGGAAAGCACCCAGAAAGACGGTGTTTCTCGCAGTCGTGGAGAGTAGATTACGTG
TAATTTTAATACTGCTTTCTTCGGTGTTTTCTCTGTTTATGGACAGAGAAGAAACCAGTGTGTGTGTAGT
ATGTGTTTTTTGCATGGGGCAGTTGGTAAAAACACCGCGTCCCTTATCTGTATGGCTTCAGAGCAATGCG
AGACGGAAAAGGTTTTTTGCAAGGCTTCCTGTATTTTGGTAGGAAAACATTCCATTTCTAATCTGTCGAA
TAGAGTAGCATGACTTGTTTTTATATTGGCTTTTATATCATCCTTGAAATTTGCACCCAAAGATATCCCC
AGTTAACATCTGCTATAAACATGTAGATAGTATATATACGAGTACGCCTCTGTTCACTATGAATTTATCC
GAGTATGTGAGCTGTTAGCAG
04: BRCA1 gene
Official Symbol: BRCA1 provided by HGNC
Official Full Name: BRCA1 DNA repair associated provided by HGNC
Primary source: HGNC: HGNC: 1100
Ensembl: ENSG00000012048 MIM: 113705
Gene type: protein coding
Organism: Homo sapiens
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini; Catarrhini;
Hominidae; Homo
Location: 17q21.31 
Exon count: 24
Summary
This gene encodes a 190 kD nuclear phosphoprotein that plays a role in
maintaining genomic stability, and it also acts as a tumor suppressor. The BRCA1
gene contains 22 exons spanning about 110 kb of DNA. The encoded protein
combines with other tumor suppressors, DNA damage sensors, and signal
transducers to form a large multi-subunit protein complex known as the BRCA1-
associated genome surveillance complex (BASC). This gene product associates
with RNA polymerase II, and through the C-terminal domain, also interacts with
histone deacetylase complexes. This protein thus plays a role in transcription, DNA
repair of double-stranded breaks, and recombination. Mutations in this gene are
responsible for approximately 40% of inherited breast cancers and more than 80%
of inherited breast and ovarian cancers. Alternative splicing plays a role in
modulating the subcellular localization and physiological function of this gene.
Many alternatively spliced transcript variants, some of which are disease-
associated mutations, have been described for this gene, but the full-length natures
of only some of these variants has been described. A related pseudogene, which is
also located on chromosome 17, has been identified. [Provided by RefSeq, May
2020]
Homo sapiens chromosome 17, GRCh38.p13 Primary Assembly
 
TCCCTAAGTTTACTTCTCTAAAACCCTGTGTTCACAAAGGCAGAGAGTCAGACCCTTCAATGGAAGGAGA
GTGCTTGGGATCGATTATGTGACTTAAAGTCAGAATAGTCCTTGGGCAGTTCTCAAATGTTGGAGTGGAA
CATTGGGGAGGAAATTCTGAGGCAGGTATTAGAAATGAAAAGGAAACTTGAAACCTGGGCATGGTGGCTC
ACGCCTGTAATCCCAGCACTTTGGGAGGCCAAGGTGGGCAGATCACTGGAGGTCAGGAGTTCGAAACCAG
CCTGGCCAACATGGTGAAACCCCATCTCTACTAAAAATACAGAAATTAGCCGGTCATGGTGGTGGACACC
TGTAATCCCAGCTACTCAGGTGGCTAAGGCAGGAGAATCACTTCAGCCCGGGAGGTGGAGGTTGCAGTGA
GCCAAGATCATACCACGGCACTCCAGCCTGGGTGACAGTGAGACTGTGGCTCAAAAAAAAAAAAAAAAAA
AGGAAAATGAAACTAGAAGAGATTTCTAAAAGTCTGAGATATATTTGCTAGATTTCTAAAGAATGTGTTC
TAAAACAGCAGAAGATTTTCAAGAACCGGTTTCCAAAGACAGTCTTCTAATTCCTCATTAGTAATAAGTA
AAATGTTTATTGTTGTAGCTCTGGTATATAATCCATTCCTCTTAAAATATAAGACCTCTGGCATGAATAT
TTCATATCTATAAAATGACAGATCCCACCAGGAAGGAAGCTGTTGCTTTCTTTGAGGTGATTTTTTTCCT
TTGCTCCCTGTTGCTGAAACCATACAGCTTCATAAATAATTTTGCTTGCTGAAGGAAGAAAAAGTGTTTT
TCATAAACCCATTATCCAGGACTGTTTATAGCTGTTGGAAGGACTAGGTCTTCCCTAGCCCCCCCAGTGT
GCAAGGGCAGTGAAGACTTGATTGTACAAAATACGTTTTGTAAATGTTGTGCTGTTAACACTGCAAATAA
ACTTGGTAGCAAACACTTCCA
05: FSH receptor gene
Official Symbol: FSHR provided by HGNC
Official Full Name: follicle stimulating hormone receptor provided by HGNC
Primary source: HGNC: HGNC: 3969
Ensembl: ENSG00000170820 MIM: 136435
Gene type: protein coding
Organism: Homo sapiens
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini; Catarrhini;
Hominidae; Homo
Location: 2p16.3 
Exon count: 14
Summary
The protein encoded by this gene belongs to family 1 of G-protein coupled
receptors. It is the receptor for follicle stimulating hormone and functions in gonad
development. Mutations in this gene cause ovarian dysgenesis type 1, and also
ovarian hyperstimulation syndrome. Alternative splicing results in multiple
transcript variants. [Provided by RefSeq, Mar 2010]
Homo sapiens chromosome 2, GRCh38.p13 Primary Assembly
 
ACTGTGGGGTATCACATTCTGAGCCCTAACACTTCCAATATTATGCTATGAATTTACATCATGATTTCAG
GTAATTATTCCAACAATGCCACAAGGTGAGCATTTGTGTTATCCAGTTTCACAGATGCAGAAACTGAAGT
GGAAAAAATTGACTAGCATTATATGGCTGGCAAGTGATCAAACAGGATTTTCTCATTATTTCATTCACTC
AATAGTTATTGAGCTCATAATATATGCCAGGCATTATGTCAGACTTCATGGATACAGACAGGTACACAGT
AAACAAGGTGGCCACTGCCCAAATGGAGCTTGCATTCTGGTGGGGAAGACAGATAATAAACAACAAGAAA
GAAGCAATATAACAGATTGGGACAGTGCTATTAATATAAGTAAATGAAGGAGGGATATCATCAGGAGAAT
CTGGGAAGGAGTGGATGCTACCTGAGACAGGATGGTCAAGGATCTGCCTAGTTGCAAAGCACTAGACTTT
CCACAACCCCTTCTACCCTCCAGTGGGCCTCTGCAGTATATATGGCAACCAATTCTGGTTTCATGTATTC
TACCACTTACTCCAACTCTAGTAAATATCTGCAAAGCTTACCATTGCCTACGACTCTCAGATTATTTCCC
CAAGATGCTGCAGAATCCTTATAATGTTTCTCAGCCTCAATAGAATGAAAAGCAGGTCTGTGCTTATATC
ACTTAATGACCAAAGAGGAAGGAAATTTACAATTAAAGTGTACTTTGCCAACTGTGGATGAATTAGTTAG
GTCACTGTGATCTACAGGTTAGATGTCTGTTCAGCAGTGTCCTCTACTTGAGATTCCAAGGAGGTTGAAG
CTCACTACTCGCCACCCCTCGCACCCCCCTCCGTTCTCTTCTTTCCCTTACCTGCTTCCTCACACTGATT
CAAAATCTCCCCCATGGCCCGGGCACGGTGGCTCACACCTGTAATCCCAGCACTTTGGGAGGCCGAGCCA
GGTGGGTCACCTAAGGTCA
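
The sequence blocks above are pasted as bare 70-column lines, without the ">" header line that a FASTA record normally starts with. As a minimal sketch only (the record name INSR_fragment and both helper names are hypothetical, not part of the practical), the following Python shows how such a block could be written as a FASTA record and read back:

```python
def format_fasta(name, sequence, width=70):
    """Return one FASTA record: a '>' header line, then the sequence wrapped at `width`."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


def parse_fasta(text):
    """Parse FASTA text into {header: sequence}, joining the wrapped lines."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


# First two lines (140 nt) of the INSR fragment shown in item 01.
insr_start = (
    "TCCACCAAGAAATGTGCTTATTGGATTGGGAGGTGTTTATTTGTAGTCTGCTGTAACACGTGTGAAAGAG"
    "CAGGAGCGTCATCAGCATATGACTTGCGCTGGTCATCCGGTAAATGGATGTGCTGTAGTCCCAGTGCTAA"
)
record = format_fasta("INSR_fragment", insr_start)  # hypothetical record name
print(record)
```

Keeping the header on its own line matters for the later practicals, because alignment tools treat everything on the ">" line as the sequence name.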
Practical no 2
Finding similar sequence for DNA
BLAST: Basic Local Alignment Search Tool
BLAST finds regions of similarity between biological sequences. The
program compares nucleotide or protein sequences to sequence databases and
calculates the statistical significance of the matches.
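
The "Expect" column in the results below is this statistical significance. As a rough sketch of the standard Karlin-Altschul relation E = m * n * 2^(-S'), where S' is the bit score, m the query length and n the database length, the function below shows how the expected number of chance hits falls as the bit score rises. The database length used here is a made-up illustration value, and BLAST itself applies effective-length corrections, so this is not expected to reproduce the reported E-values exactly.

```python
def expect_value(bit_score, query_len, db_len):
    """Karlin-Altschul estimate E = m * n * 2**(-S'), with S' the bit score."""
    return query_len * db_len * 2.0 ** (-bit_score)


# 210 nt is the query length used in this practical.
# The database length is a hypothetical value chosen only for illustration.
hypothetical_db_len = 1e11
for bits in (40, 100, 388):
    print(bits, expect_value(bits, 210, hypothetical_db_len))
```

Each additional bit of score halves the expected number of chance matches, which is why a 388-bit hit has an E-value that is effectively zero.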

1) Similar sequence for INSR Gene


FASTA Sequence:
 
TGACACCTTAGGCTCTTCAGAAACAATGGTTTTGTTGAGGATGGGGAACAGGGAATGCCGATTTTATATA
CATGGTACACAGAGAGGGGTGTCACTTCAGAAAATCTTCCAGCATGTTCTTCAGAATATTAATTTATATG
CGAGGTGAGGTTGGGAATGAAAAGAACAGGTCAGCACTTTTTTTTTTCCTAGAACATACAAAAGAACATG
BLAST:
This 210-nucleotide query sequence is a 100% match to the following three
database entries.

1) Homo sapiens insulin receptor (INSR), transcript variant 2, mRNA

Score Expect Identities Gaps Strand
388 bits(210) 6e-104 210/210(100%) 0/210(0%) Plus/Plus

Query  1     TGACACCTTAGGCTCTTCAGAAACAATGGTTTTGTTGAGGATGGGGAACAGGGAATGCCG  60
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  8645  TGACACCTTAGGCTCTTCAGAAACAATGGTTTTGTTGAGGATGGGGAACAGGGAATGCCG  8704

Query  61    ATTTTATATACATGGTACACAGAGAGGGGTGTCACTTCAGAAAATCTTCCAGCATGTTCT  120
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  8705  ATTTTATATACATGGTACACAGAGAGGGGTGTCACTTCAGAAAATCTTCCAGCATGTTCT  8764

Query  121   TCAGAATATTAATTTATATGCGAGGTGAGGTTGGGAATGAAAAGAACAGGTCAGCACttt  180
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  8765  TCAGAATATTAATTTATATGCGAGGTGAGGTTGGGAATGAAAAGAACAGGTCAGCACTTT  8824

Query  181   tttttttCCTAGAACATACAAAAGAACATG  210
             ||||||||||||||||||||||||||||||
Sbjct  8825  TTTTTTTCCTAGAACATACAAAAGAACATG  8854

2)  Accession number: NM_000208.4


3) Accession Number: NG_008852.2 
2) Similar sequence for ADIPOQ gene:
FASTA Sequence:
GGTGGGGGATGCATGCATAGCCTCTGGCTGGGATCACCCAGGCTCTCCCGTCCGTAGTAGTGTGGGAGTG
GATACAGGTGGATACTCTGGTCAGAGCAGCACTGGTGGAGGCAGATATGCACTGGGCTTCTTCCTCCGTT
CTCCCACAGCCCCAAGAGAGAAAGGGTTATTTCAGACATTCCTTCTAAGATGCATGGAACCATTCTGAAT
BLAST:
This 210-nucleotide query sequence is a 100% match to the following three
database entries.
1) Homo sapiens adiponectin enhancer region (LOC106660625) on
chromosome 3
Sequence ID: NG_044949.1  Length: 13241  Number of Matches: 1

Score Expect Identities Gaps Strand
388 bits(210) 6e-104 210/210(100%) 0/210(0%) Plus/Plus

Query  1     GGTGGGGGATGCATGCATAGCCTCTGGCTGGGATCACCCAGGCTCTCCCGTCCGTAGTAG  60
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  2871  GGTGGGGGATGCATGCATAGCCTCTGGCTGGGATCACCCAGGCTCTCCCGTCCGTAGTAG  2930

Query  61    TGTGGGAGTGGATACAGGTGGATACTCTGGTCAGAGCAGCACTGGTGGAGGCAGATATGC  120
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  2931  TGTGGGAGTGGATACAGGTGGATACTCTGGTCAGAGCAGCACTGGTGGAGGCAGATATGC  2990

Query  121   ACTGGGCTTCTTCCTCCGTTCTCCCACAGCCCCAAGAGAGAAAGGGTTATTTCAGACATT  180
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  2991  ACTGGGCTTCTTCCTCCGTTCTCCCACAGCCCCAAGAGAGAAAGGGTTATTTCAGACATT  3050

Query  181   CCTTCTAAGATGCATGGAACCATTCTGAAT  210
             ||||||||||||||||||||||||||||||
Sbjct  3051  CCTTCTAAGATGCATGGAACCATTCTGAAT  3080

2) Homo sapiens adiponectin, C1Q and collagen domain containing
(ADIPOQ), RefSeqGene on chromosome 3
Accession number: NG_021140.1
3) Homo sapiens chromosome 3 clone WI2-3182J7, complete sequence
Accession Number: AC193164.1

3) Similar sequence for leptin receptor (LEPR) gene:


FASTA Sequence:
CCGGTTCGGGAGGCGATCGACCGCTCCCTTCGTCCCTTGGGGTGTGGGTGGAGCGGCGTTTCGGGGGAGC
CTTGGCCCTTTTTGCAAGGCCGCTGGCCTTCCTGCTCTCCAGTGGCGGCACCAGACCCCTCCCCAGCCTG
AGCCCTCGGCGAGAGCGGCGCCCTCCACCTCTCCTGAGTTTTTAAACAAGGTTTCCACTTCTTGACGCCA
BLAST:
This 210-nucleotide query sequence is a 100% match to the following three
database entries.
1) Homo sapiens Leptin receptor (LEPR) gene, partial cds
Sequence ID: MK050966.1  Length: 982  Number of Matches: 1

Alignment statistics for match #1

Score Expect Identities Gaps Strand
388 bits(210) 6e-104 210/210(100%) 0/210(0%) Plus/Plus

Query  1    CCGGTTCGGGAGGCGATCGACCGCTCCCTTCGTCCCTTGGGGTGTGGGTGGAGCGGCGTT  60
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  275  CCGGTTCGGGAGGCGATCGACCGCTCCCTTCGTCCCTTGGGGTGTGGGTGGAGCGGCGTT  334

Query  61   TCGGGGGAGCCTTGGCCCTTTTTGCAAGGCCGCTGGCCTTCCTGCTCTCCAGTGGCGGCA  120
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  335  TCGGGGGAGCCTTGGCCCTTTTTGCAAGGCCGCTGGCCTTCCTGCTCTCCAGTGGCGGCA  394

Query  121  CCAGACCCCTCCCCAGCCTGAGCCCTCGGCGAGAGCGGCGCCCTCCACCTCTCCTGAGTT  180
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  395  CCAGACCCCTCCCCAGCCTGAGCCCTCGGCGAGAGCGGCGCCCTCCACCTCTCCTGAGTT  454

Query  181  TTTAAACAAGGTTTCCACTTCTTGACGCCA  210
            ||||||||||||||||||||||||||||||
Sbjct  455  TTTAAACAAGGTTTCCACTTCTTGACGCCA  484

2) Homo sapiens Leptin receptor (LEPR), RefSeqGene (LRG_283) on
chromosome 1
Accession number: NG_015831.2
3) Homo sapiens cDNA FLJ37482 fis, clone BRAWH2013941
Accession number: AK094801.1

4) Similar sequence for BRCA1 gene:
FASTA Sequence:
CATTGGGGAGGAAATTCTGAGGCAGGTATTAGAAATGAAAAGGAAACTTGAAACCTGGGCATGGTGGCTC
ACGCCTGTAATCCCAGCACTTTGGGAGGCCAAGGTGGGCAGATCACTGGAGGTCAGGAGTTCGAAACCAG
CCTGGCCAACATGGTGAAACCCCATCTCTACTAAAAATACAGAAATTAGCCGGTCATGGTGGTGGACACC
BLAST:
This 210-nucleotide query sequence is a 100% match to the following three
database entries.

1) Homo sapiens BRCA1 DNA repair associated (BRCA1), transcript
variant 1, mRNA
Sequence ID: NM_007294.4  Length: 7088  Number of Matches: 1

Alignment statistics for match #1

Score Expect Identities Gaps Strand
388 bits(210) 6e-104 210/210(100%) 0/210(0%) Plus/Plus

Query  1     CATTGGGGAGGAAATTCTGAGGCAGGTATTAGAAATGAAAAGGAAACTTGAAACCTGGGC  60
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  6228  CATTGGGGAGGAAATTCTGAGGCAGGTATTAGAAATGAAAAGGAAACTTGAAACCTGGGC  6287

Query  61    ATGGTGGCTCACGCCTGTAATCCCAGCACTTTGGGAGGCCAAGGTGGGCAGATCACTGGA  120
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  6288  ATGGTGGCTCACGCCTGTAATCCCAGCACTTTGGGAGGCCAAGGTGGGCAGATCACTGGA  6347

Query  121   GGTCAGGAGTTCGAAACCAGCCTGGCCAACATGGTGAAACCCCATCTCTACTAAAAATAC  180
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  6348  GGTCAGGAGTTCGAAACCAGCCTGGCCAACATGGTGAAACCCCATCTCTACTAAAAATAC  6407

Query  181   AGAAATTAGCCGGTCATGGTGGTGGACACC  210
             ||||||||||||||||||||||||||||||
Sbjct  6408  AGAAATTAGCCGGTCATGGTGGTGGACACC  6437
2) Homo sapiens BRCA1 DNA repair associated (BRCA1), transcript variant
6, non-coding RNA 
Accession number: NR_027676.2
3) Homo sapiens BRCA1 DNA repair associated (BRCA1), transcript variant
2, mRNA
Accession Number: NM_007300.4

5) Similar sequence for FSH receptor (FSHR) gene:
FASTA Sequence:
GTAATTATTCCAACAATGCCACAAGGTGAGCATTTGTGTTATCCAGTTTCACAGATGCAGAAACTGAAGT
GGAAAAAATTGACTAGCATTATATGGCTGGCAAGTGATCAAACAGGATTTTCTCATTATTTCATTCACTC
AATAGTTATTGAGCTCATAATATATGCCAGGCATTATGTCAGACTTCATGGATACAGACAGGTACACAGT
BLAST:
This 210-nucleotide query sequence is a 100% match to the following three
database entries.

1) Homo sapiens follicle stimulating hormone receptor (FSHR),
transcript variant X4, mRNA
Sequence ID: XM_011532735.2  Length: 11453  Number of Matches: 1

Alignment statistics for match #1

Score Expect Identities Gaps Strand
388 bits(210) 6e-104 210/210(100%) 0/210(0%) Plus/Plus

Query  1      GTAATTATTCCAACAATGCCACAAGGTGAGCATTTGTGTTATCCAGTTTCACAGATGCAG  60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  10523  GTAATTATTCCAACAATGCCACAAGGTGAGCATTTGTGTTATCCAGTTTCACAGATGCAG  10582

Query  61     AAACTGAAGTGGAAAAAATTGACTAGCATTATATGGCTGGCAAGTGATCAAACAGGATTT  120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  10583  AAACTGAAGTGGAAAAAATTGACTAGCATTATATGGCTGGCAAGTGATCAAACAGGATTT  10642

Query  121    TCTCATTATTTCATTCACTCAATAGTTATTGAGCTCATAATATATGCCAGGCATTATGTC  180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  10643  TCTCATTATTTCATTCACTCAATAGTTATTGAGCTCATAATATATGCCAGGCATTATGTC  10702

Query  181    AGACTTCATGGATACAGACAGGTACACAGT  210
              ||||||||||||||||||||||||||||||
Sbjct  10703  AGACTTCATGGATACAGACAGGTACACAGT  10732

2) Homo sapiens follicle stimulating hormone receptor (FSHR), transcript
variant X3, mRNA
Accession Number: XM_011532736.2
3)  Homo sapiens follicle stimulating hormone receptor (FSHR), transcript
variant X2, mRNA
Accession Number: XM_011532734.2
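
The "Identities" and "Gaps" figures in the alignment statistics can be recounted directly from the Query and Sbjct lines. The sketch below does this for the INSR alignment from item 1); the function names are hypothetical, and the comparison is case-insensitive because BLAST prints masked low-complexity query bases in lowercase (the "ttt" run at query position 178).

```python
def read_alignment(blast_text):
    """Concatenate the Query and Sbjct segments of a BLAST pairwise alignment."""
    query, subject = [], []
    for line in blast_text.splitlines():
        fields = line.split()  # e.g. ['Query', '1', 'TGAC...', '60']
        if len(fields) == 4 and fields[0] == "Query":
            query.append(fields[2])
        elif len(fields) == 4 and fields[0] == "Sbjct":
            subject.append(fields[2])
    return "".join(query), "".join(subject)


def identity_stats(query, subject):
    """Return (identities, gap columns, alignment length), ignoring letter case."""
    pairs = list(zip(query.upper(), subject.upper()))
    identities = sum(q == s and q != "-" for q, s in pairs)
    gaps = sum("-" in (q, s) for q, s in pairs)
    return identities, gaps, len(pairs)


insr_alignment = """
Query  1     TGACACCTTAGGCTCTTCAGAAACAATGGTTTTGTTGAGGATGGGGAACAGGGAATGCCG  60
Sbjct  8645  TGACACCTTAGGCTCTTCAGAAACAATGGTTTTGTTGAGGATGGGGAACAGGGAATGCCG  8704
Query  61    ATTTTATATACATGGTACACAGAGAGGGGTGTCACTTCAGAAAATCTTCCAGCATGTTCT  120
Sbjct  8705  ATTTTATATACATGGTACACAGAGAGGGGTGTCACTTCAGAAAATCTTCCAGCATGTTCT  8764
Query  121   TCAGAATATTAATTTATATGCGAGGTGAGGTTGGGAATGAAAAGAACAGGTCAGCACttt  180
Sbjct  8765  TCAGAATATTAATTTATATGCGAGGTGAGGTTGGGAATGAAAAGAACAGGTCAGCACTTT  8824
Query  181   tttttttCCTAGAACATACAAAAGAACATG  210
Sbjct  8825  TTTTTTTCCTAGAACATACAAAAGAACATG  8854
"""

query, subject = read_alignment(insr_alignment)
identities, gaps, length = identity_stats(query, subject)
print(f"Identities {identities}/{length}, Gaps {gaps}/{length}")
```

For this alignment the count agrees with the reported 210/210 identities and 0/210 gaps.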

Practical no: 03

Use ExPASy Translate to translate a DNA sequence into a protein sequence and
identify the translated frame:
1) INSR gene
FASTA Sequence:
TCCACCAAGAAATGTGCTTATTGGATTGGGAGGTGTTTATTTGTAGTCTGCTGTAACACGTGTGAAAGAG
CAGGAGCGTCATCAGCATATGACTTGCGCTGGTCATCCGGTAAATGGATGTGCTGTAGTCCCAGTGCTAA
TCATTTCTCTCCTTCACAGTGGGTGGAAGTTTAGGGTTAAATGTCCTTTGAATGTCACCTGGTGAGTCCT
TGACACCTTAGGCTCTTCAGAAACAATGGTTTTGTTGAGGATGGGGAACAGGGAATGCCGATTTTATATA

Protein Sequence:
5'3' Frame 1

STKKCAYWIGRCLFVVCCNTCERAGASSAYDLRWSSGKWMCCSPSANHFSPSQWVEV-G-
MSFECHLVSP-HLRLFRNNGFVEDGEQGMPILY
5'3' Frame 2

PPRNVLIGLGGVYL-
SAVTRVKEQERHQHMTCAGHPVNGCAVVPVLIISLLHSGWKFRVKCPLNVTW-
VLDTLGSSETMVLLRMGNRECRFYI
5'3' Frame 3

HQEMCLLDWEVFICSLL-HV-KSRSVISI-LALVIR-MDVL-SQC-SFLSFTVGGSLGLNVL-
MSPGESLTP-ALQKQWFC-GWGTGNADFI
3'5' Frame 1

YIKSAFPVPHPQQNHCF-RA-GVKDSPGDIQRTFNPKLPPTVKERND-
HWDYSTSIYRMTSASHMLMTLLLFHTCYSRLQINTSQSNKHISWW
3'5' Frame 2

I-NRHSLFPILNKTIVSEEPKVSRTHQVTFKGHLTLNFHPL-RREMISTGTTAHPFTG-
PAQVIC--RSCSFTRVTADYK-TPPNPISTFLGG
3'5' Frame 3

YKIGIPCSPSSTKPLFLKSLRCQGLTR-HSKDI-P-TSTHCEGEK-
LALGLQHIHLPDDQRKSYADDAPALSHVLQQTTNKHLPIQ-AHFLV
Translated frame (5'3' Frame 2):
MTCAGHPVNGCAVVPVLIISLLHSGWKFRVKCPLNVTW-
Reason:
This is the translated frame because it contains the longest open reading
frame: it starts with the start codon (methionine, M) and ends with a stop
codon (-).

2) ADIPOQ gene
FASTA Sequence:
GATACAGGTGGATACTCTGGTCAGAGCAGCACTGGTGGAGGCAGATATGCACTGGGCTTCTTCCTCCGTT
CTCCCACAGCCCCAAGAGAGAAAGGGTTATTTCAGACATTCCTTCTAAGATGCATGGAACCATTCTGAAT
TTTGCCCAGTTCGCTCTGTAGCAGGATACCTATTGAGAAAAAGTTAGGGTCAGTAAGGTGGAAGGGTCTG
TCCACAGATGAAGTCCAATTCGATTAAGGGGGATAAGGGAATACATTGCCTCTTAGCTTGACCAGGTAGG

Protein Sequence:
5'3' Frame 1

DTGGYSGQSSTGGGRYALGFFLRSPTAPREKGLFQTFLLRCMEPF-
ILPSSLCSRIPIEKKLGSVRWKGLSTDEVQFD-GG-GNTLPLSLTR-
5'3' Frame 2

IQVDTLVRAALVEADMHWASSSVLPQPQERKGYFRHSF-DAWNHSEFCPVRSVAGYLLRKS-GQ-
GGRVCPQMKSNSIKGDKGIHCLLA-PGR
5'3' Frame 3

YRWILWSEQHWWRQICTGLLPPFSHSPKRERVISDIPSKMHGTILNFAQFAL-QDTY-
EKVRVSKVEGSVHR-SPIRLRGIREYIAS-LDQV
3'5' Frame 1

PTWSS-EAMYSLIPLNRIGLHLWTDPSTLLTLTFSQ-
VSCYRANWAKFRMVPCILEGMSEITLSLLGLWENGGRSPVHICLHQCCSDQSIHLY
3'5' Frame 2

LPGQAKRQCIPLSPLIELDFICGQTLPPY-P-LFLNRYPATERTGQNSEWFHAS-KECLK-
PFLSWGCGRTEEEAQCISASTSAALTRVSTCI
3'5' Frame 3

YLVKLRGNVFPYPP-SNWTSSVDRPFHLTDPNFFSIGILLQSELGKIQNGSMHLRRNV-
NNPFSLGAVGERRKKPSAYLPPPVLL-PEYPPV
Translated frame (3'5' Frame 1):
MYSLIPLNRIGLHLWTDPSTLLTLTFSQ-
Reason:
This is the translated frame because it starts with the start codon
(methionine, M) and ends with a stop codon (-).

3) Leptin receptor gene


FASTA Sequence:
CCGGTCTGGCTTGGGCAGGCTGCCCGGGCCGTGGCAGGAAGCCGGAAGCAGCCGCGGCCCCAGTTCGGGA
GACATGGCGGGCGTTAAAGGTACATCGCGGTCCCCGGCTCGCTTGTCGTGTGGTGGGGTTGCCACCTCCG
TTCCGGTCAAGCCTGGGGCTGCGCCTTCCGCGCGCCGTTGGGGAACGGCCTCACCACCCTTCCCGCCTCT
CCGGTTCGGGAGGCGATCGACCGCTCCCTTCGTCCCTTGGGGTGTGGGTGGAGCGGCGTTTCGGGGGAGC
Protein Sequence:
5'3' Frame 1

PVWLGQAARAVAGSRKQPRPQFGRHGGR-
RYIAVPGSLVVWWGCHLRSGQAWGCAFRAPLGNGLTTLPASPVREAIDRSLRPLGCGWSGVSGE
5'3' Frame 2

RSGLGRLPGPWQEAGSSRGPSSGDMAGVKGTSRSPARLSCGGVATSVPVKPGAAPSARRWGTASPPFPPL
RFGRRSTAPFVPWGVGGAAFRGS
5'3' Frame 3

GLAWAGCPGRGRKPEAAAAPVRETWRALKVHRGPRLACRVVGLPPPFRSSLGLRLPRAVGERPHHPSRLS
GSGGDRPLPSSLGVWVERRFGG
3'5' Frame 1

APPKRRSTHTPRDEGSGRSPPEPERREGW-
GRSPTARGRRSPRLDRNGGGNPTTRQASRGPRCTFNARHVSRTGAAAASGFLPRPGQPAQARP
3'5' Frame 2

LPRNAAPPTPQGTKGAVDRLPNRRGGKGGEAVPQRRAEGAAPGLTGTEVATPPHDKRAGDRDVPLTPAMS
PELGPRLLPASCHGPGSLPKPDR
3'5' Frame 3

SPETPLHPHPKGRRERSIASRTGEAGRVVRPFPNGARKAQPQA-PERRWQPHHTTSEPGTAMYL-
RPPCLPNWGRGCFRLPATARAACPSQT
Translated frame (3'5' Frame 3):
MYL-
Reason:
This is the translated frame because it starts with the start codon
(methionine, M) and ends with a stop codon (-).

4) BRCA1 gene
FASTA Sequence:
TCCCTAAGTTTACTTCTCTAAAACCCTGTGTTCACAAAGGCAGAGAGTCAGACCCTTCAATGGAAGGAGA
GTGCTTGGGATCGATTATGTGACTTAAAGTCAGAATAGTCCTTGGGCAGTTCTCAAATGTTGGAGTGGAA
CATTGGGGAGGAAATTCTGAGGCAGGTATTAGAAATGAAAAGGAAACTTGAAACCTGGGCATGGTGGCTC
ACGCCTGTAATCCCAGCACTTTGGGAGGCCAAGGTGGGCAGATCACTGGAGGTCAGGAGTTCGAAACCAG
Protein Sequence:
5'3' Frame 1
SLSLLL-NPVFTKAESQTLQWKESAWDRLCDLKSE-
SLGSSQMLEWNIGEEILRQVLEMKRKLETWAWWLTPVIPALWEAKVGRSLEVRSSKP
5'3' Frame 2
P-VYFSKTLCSQRQRVRPFNGRRVLGIDYVT-SQNSPWAVLKCWSGTLGRKF-GRY-K-
KGNLKPGHGGSRL-SQHFGRPRWADHWRSGVRNQ
5'3' Frame 3
PKFTSLKPCVHKGRESDPSMEGECLGSIM-LKVRIVLGQFSNVGVEHWGGNSEAGIRNEKET-
NLGMVAHACNPSTLGGQGGQITGGQEFET
3'5' Frame 1
LVSNS-PPVICPPWPPKVLGLQA-
ATMPRFQVSFSFLIPASEFPPQCSTPTFENCPRTILTLSHIIDPKHSPSIEGSDSLPL-
TQGFREVNLG
3'5' Frame 2
WFRTPDLQ-SAHLGLPKCWDYRREPPCPGFKFPFHF-YLPQNFLPNVPLQHLRTAQGLF-L-VT-
SIPSTLLPLKGLTLCLCEHRVLEK-T-G
3'5' Frame 3
GFELLTSSDLPTLASQSAGITGVSHHAQVSSFLFISNTCLRISSPMFHSNI-
ELPKDYSDFKSHNRSQALSFH-RV-LSAFVNTGF-RSKLR

Translated frame (3'5' Frame 1):
MPRFQVSFSFLIPASEFPPQCSTPTFENCPRTILTLSHIIDPKHSPSIEGSDSLPL-
Reason:
This is the translated frame because it contains the longest open reading
frame: it starts with the start codon (methionine, M) and ends with a stop
codon (-).

 
 5) FSH receptor gene
FASTA Sequence:
ACTGTGGGGTATCACATTCTGAGCCCTAACACTTCCAATATTATGCTATGAATTTACATCATGATTTCAG
GTAATTATTCCAACAATGCCACAAGGTGAGCATTTGTGTTATCCAGTTTCACAGATGCAGAAACTGAAGT
GGAAAAAATTGACTAGCATTATATGGCTGGCAAGTGATCAAACAGGATTTTCTCATTATTTCATTCACTC
AATAGTTATTGAGCTCATAATATATGCCAGGCATTATGTCAGACTTCATGGATACAGACAGGTACACAGT
Protein Sequence:
5'3' Frame 1

TVGYHILSPNTSNIML-IYIMISGNYSNNATR-AFVLSSFTDAETEVEKID-HYMAGK-
SNRIFSLFHSLNSY-AHNICQALCQTSWIQTGTQ
5'3' Frame 2

LWGITF-ALTLPILCYEFTS-
FQVIIPTMPQGEHLCYPVSQMQKLKWKKLTSIIWLASDQTGFSHYFIHSIVIELIIYARHYVRLH
GYRQVHS
5'3' Frame 3

CGVSHSEP-HFQYYAMNLHHDFR-LFQQCHKVSICVIQFHRCRN-SGKN-
LALYGWQVIKQDFLIISFTQ-LLSS-YMPGIMSDFMDTDRYT
3'5' Frame 1

TVYLSVSMKSDIMPGIYYELNNY-VNEIMRKSCLITCQPYNASQFFPLQFLHL-
NWITQMLTLWHCWNNYLKS-CKFIA-YWKC-GSECDTPQ
3'5' Frame 2

LCTCLYP-SLT-CLAYIMSSITIE-MK--ENPV-SLASHIMLVNFFHFSFCICETG-
HKCSPCGIVGIIT-NHDVNS-HNIGSVRAQNVIPHS
3'5' Frame 3

CVPVCIHEV-HNAWHIL-AQ-LLSE-NNEKILFDHLPAI-C-
SIFSTSVSASVKLDNTNAHLVALLE-LPEIMM-IHSIILEVLGLRM-YPT
 
Translated frame (3'5' Frame 1):
MRKSCLITCQPYNASQFFPLQFLHL-
Reason:
This is the translated frame because it starts with the start codon
(methionine, M) and ends with a stop codon (-).
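
The six-frame translation and the "starts with M, ends with a stop" selection rule used throughout this practical can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the ExPASy implementation: it uses the standard genetic code, writes stop codons as "-" as in the output above, and the helper names are hypothetical. The input is the 280-nt INSR fragment from item 1).

```python
import re

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(dna):
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons only; stop codons are written as '-'."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons).replace("*", "-")


def six_frames(dna):
    """Return the three forward (5'3') and three reverse (3'5') translations."""
    frames = {}
    for label, strand in (("5'3'", dna), ("3'5'", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{label} Frame {offset + 1}"] = translate(strand[offset:])
    return frames


def longest_orf(frames):
    """Return (frame label, peptide) for the longest stretch from M to a stop."""
    best = ("", "")
    for label, protein in frames.items():
        for match in re.finditer(r"M[^-]*-", protein):
            if len(match.group()) > len(best[1]):
                best = (label, match.group())
    return best


insr = (
    "TCCACCAAGAAATGTGCTTATTGGATTGGGAGGTGTTTATTTGTAGTCTGCTGTAACACGTGTGAAAGAG"
    "CAGGAGCGTCATCAGCATATGACTTGCGCTGGTCATCCGGTAAATGGATGTGCTGTAGTCCCAGTGCTAA"
    "TCATTTCTCTCCTTCACAGTGGGTGGAAGTTTAGGGTTAAATGTCCTTTGAATGTCACCTGGTGAGTCCT"
    "TGACACCTTAGGCTCTTCAGAAACAATGGTTTTGTTGAGGATGGGGAACAGGGAATGCCGATTTTATATA"
)
frames = six_frames(insr)
print(longest_orf(frames))
```

Open reading frames that run off the end of the fragment without reaching a stop codon are deliberately ignored here, which matches the reasoning given for each gene above. Note that a short genomic fragment like this one may contain intronic sequence, so the longest M-to-stop stretch is a formal answer to the exercise rather than evidence of a real coding region.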

Prac 4:
Physical and chemical properties of protein using
ProtParam.

1) Keratin:
VTLARTDLEMQIEGLKEELAYLRKNHEEEMLALRGQTGGDVNVEMDAAPGVDLSRILNEMRDQYEQMAEK
NRRDAETWFLSKTEELNKEVASNSELVQSSRSEVTELRRVLQGLEIELQSQLSTKASLENSLEETKGRYC
MQLSQIQGLIGSVEEQLAQLRCEMEQQSQEYQILLDVKTRLEHEIATYRRLLXGEDAHLSSQQASGQSYS
SREVFTSSSSSSSRQTRPILKEQSSSSFSQGQSS

Properties:
Number of amino acids: 244
 
Molecular weight: 27659.84
 
Theoretical pI: 4.81
 
Amino acid composition: 
Ala (A)  13      5.3%
Arg (R)  18      7.4%
Asn (N)   7      2.9%
Asp (D)   8      3.3%
Cys (C)   2      0.8%
Gln (Q)  24      9.8%
Glu (E)  33    13.5%
Gly (G)  12      4.9%
His (H)   3      1.2%
Ile (I)   8      3.3%
Leu (L)  30    12.3%
Lys (K)   9      3.7%
Met (M)   7      2.9%
Phe (F)   3      1.2%
Pro (P)   2      0.8%
Ser (S)  34    13.9%
Thr (T)  12      4.9%
Trp (W)   1      0.4%
Tyr (Y)   6      2.5%
Val (V)  11      4.5%
Pyl (O)   0      0.0%
Sec (U)   0      0.0%
 
 (B)   0      0.0%
 (Z)   0      0.0%
 (X)   1      0.4%
 
 
Total number of negatively charged residues (Asp + Glu): 41
Total number of positively charged residues (Arg + Lys): 27
 
 
Atom composition:
 
As there is at least one ambiguous position (B,Z or X) in the sequence
considered, the atomic composition cannot be computed.
 
Extinction coefficients:
 
Extinction coefficients are in units of M⁻¹ cm⁻¹, at 280 nm measured in water.

 
Ext. coefficient    14565
Abs 0.1% (=1 g/l)   0.527, assuming all pairs of Cys residues form cystines
 
 
Ext. coefficient    14440
Abs 0.1% (=1 g/l)   0.522, assuming all Cys residues are reduced
 
Estimated half-life:
 
The N-terminal of the sequence considered is V (Val).
 
The estimated half-life is: 100 hours (mammalian reticulocytes, in vitro).
                            >20 hours (yeast, in vivo).
                            >10 hours (Escherichia coli, in vivo).
 
 
Instability index:
 
The instability index (II) is computed to be 76.14
This classifies the protein as unstable.
 
 
 
Aliphatic index: 79.14
 
Grand average of hydropathicity (GRAVY): -0.753

2) Gloverin:
Number of amino acids: 175
 
Molecular weight: 19064.25
 
Theoretical pI: 9.14
 
Amino acid composition: 
Ala (A)  14      8.0%
Arg (R)   9      5.1%
Asn (N)   8      4.6%
Asp (D)  14      8.0%
Cys (C)   1      0.6%
Gln (Q)  12      6.9%
Glu (E)   4      2.3%
Gly (G)  24    13.7%
His (H)   2      1.1%
Ile (I)   7      4.0%
Leu (L)   6      3.4%
Lys (K)  13      7.4%
Met (M)   4      2.3%
Phe (F)  11      6.3%
Pro (P)   6      3.4%
Ser (S)  12      6.9%
Thr (T)   7      4.0%
Trp (W)   2      1.1%
Tyr (Y)   7      4.0%
Val (V)  12      6.9%
Pyl (O)   0      0.0%
Sec (U)   0      0.0%
 
 (B)   0      0.0%
 (Z)   0      0.0%
 (X)   0      0.0%
 
 
Total number of negatively charged residues (Asp + Glu): 18
Total number of positively charged residues (Arg + Lys): 22
 
Atomic composition:
 
Carbon      C          841
Hydrogen    H          1289
Nitrogen    N          241
Oxygen      O          258
Sulfur      S            5
 
Formula: C841H1289N241O258S5

Total number of atoms: 2634


 
Extinction coefficients:
 
Extinction coefficients are in units of M⁻¹ cm⁻¹, at 280 nm measured in water.

 
Ext. coefficient    21430
Abs 0.1% (=1 g/l)   1.124, assuming all pairs of Cys residues form cystines
 
 
Ext. coefficient    21430
Abs 0.1% (=1 g/l)   1.124, assuming all Cys residues are reduced
 
Estimated half-life:
 
The N-terminal of the sequence considered is M (Met).
 
The estimated half-life is: 30 hours (mammalian reticulocytes, in vitro).
                            >20 hours (yeast, in vivo).
                            >10 hours (Escherichia coli, in vivo).
 
 
Instability index:
 
The instability index (II) is computed to be 21.47
This classifies the protein as stable.
 
 
 
Aliphatic index: 56.86
 
Grand average of hydropathicity (GRAVY): -0.597
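
Several of the ProtParam values above follow directly from the amino acid composition. As a sketch, using the residue counts and molecular weights copied from the two reports and the formulas ProtParam documents (Tyr 1490, Trp 5500 and 125 per cystine for the extinction coefficient; Ala + 2.9 x Val + 3.9 x (Ile + Leu) in mole percent for the aliphatic index), the extinction coefficients, Abs 0.1% and aliphatic index can be recomputed. The function and dictionary names are hypothetical.

```python
def extinction_coefficients(n_tyr, n_trp, n_cys):
    """Extinction coefficient at 280 nm in M^-1 cm^-1.

    Returns (all Cys pairs as cystines, all Cys reduced).
    """
    reduced = n_tyr * 1490 + n_trp * 5500
    with_cystines = reduced + (n_cys // 2) * 125
    return with_cystines, reduced


def aliphatic_index(n_ala, n_val, n_ile, n_leu, length):
    """Aliphatic index = X(Ala) + 2.9*X(Val) + 3.9*(X(Ile) + X(Leu)), X in mole percent."""
    return 100.0 * (n_ala + 2.9 * n_val + 3.9 * (n_ile + n_leu)) / length


# Counts, lengths and molecular weights copied from the ProtParam reports above.
keratin = dict(length=244, mw=27659.84, A=13, V=11, I=8, L=30, Y=6, W=1, C=2)
gloverin = dict(length=175, mw=19064.25, A=14, V=12, I=7, L=6, Y=7, W=2, C=1)

for name, p in (("Keratin", keratin), ("Gloverin", gloverin)):
    oxidised, reduced = extinction_coefficients(p["Y"], p["W"], p["C"])
    abs_01 = oxidised / p["mw"]  # absorbance of a 1 g/l solution
    index = aliphatic_index(p["A"], p["V"], p["I"], p["L"], p["length"])
    print(name, oxidised, reduced, round(abs_01, 3), round(index, 2))
```

The recomputed figures agree with the reports: 14565 and 14440 with Abs 0.1% 0.527 and aliphatic index 79.14 for keratin, and 21430 with Abs 0.1% 1.124 and aliphatic index 56.86 for gloverin, whose single cysteine cannot form a cystine.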
…………………………………………………………………………………………………………………………………………………….
Prac 5
Primer designing

Leptin:
GAGCCCCGTAGGAATCGCAGCGCCAGCGGTTGCAAGGTAAGGCCCCGGCGCGCTCCTTCCTCCTTCTCTG
CTGGTCTTTCTTGGCAGGCCACAGGGCCCCACACAACTCTGGATCCCGGGGAAACTGAGTCAGGAGGGAT
GCAGGGCGGATGGCTTAGTTCTGGACTATGATAGCTTTGTACCGAGTTCTAGCCAGATAGAAGGTTACCG
GGAGCTGGGGAGCGTTGGATTTGCTGCTGGGCTGTGCCGGTGCCCAGAAGGCAGGACCTTGCAGAACCAG
PRIMER 3
OLIGO          seq                    start  len     tm    gc%   any    3'
LEFT PRIMER    TCCTTCCTCCTTCTCTGCTG      54   20  59.67  55.00  2.00  2.00
RIGHT PRIMER   CAGCTCCCGGTAACCTTCTA     217   20  59.33  55.00  5.00  2.00
Complementary Strand (reverse complement of right primer): TAGAAGGTTACCGGGAGCTG
SEQUENCE SIZE: 280
INCLUDED REGION SIZE: 280
PRODUCT SIZE: 164

Primer BLAST
                 Sequence (5'->3')      Template strand  Length  Start  Stop  Tm     GC%    Self complementarity  Self 3' complementarity
Forward primer   GATCCCGGGGAAACTGAGTC   Plus             20      112    131   59.82  60.00  7.00                  3.00
Reverse primer   CAGCAGCAAATCCAACGCTC   Minus            20      239    220   60.46  55.00  3.00                  0.00
Product length   128

FSH:
GTAATTATTCCAACAATGCCACAAGGTGAGCATTTGTGTTATCCAGTTTCACAGATGCAGAAACTGAAGT
GGAAAAAATTGACTAGCATTATATGGCTGGCAAGTGATCAAACAGGATTTTCTCATTATTTCATTCACTC
AATAGTTATTGAGCTCATAATATATGCCAGGCATTATGTCAGACTTCATGGATACAGACAGGTACACAGT
AAACAAGGTGGCCACTGCCCAAATGGAGCTTGCATTCTGGTGGGGAAGACAGATAATAAACAACAAGAAA

PRIMER 3
OLIGO          seq                    start  len     tm    gc%   any    3'
LEFT PRIMER    TGGCAAGTGATCAAACAGGA      98   20  60.24  45.00  6.00  0.00
RIGHT PRIMER   ATCTGTCTTCCCCACCAGAA     264   20  59.51  50.00  4.00  0.00
Complementary Strand (reverse complement of right primer): TTCTGGTGGGGAAGACAGAT
SEQUENCE SIZE: 280
INCLUDED REGION SIZE: 280
PRODUCT SIZE: 167
PRIMER BLAST
                 Sequence (5'->3')      Template strand  Length  Start  Stop  Tm     GC%    Self complementarity  Self 3' complementarity
Forward primer   AACAATGCCACAAGGTGAGC   Plus             20      12     31    59.32  50.00  3.00                  3.00
Reverse primer   TGTCTTCCCCACCAGAATGC   Minus            20      261    242   59.96  55.00  3.00                  2.00
Product length   250
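
The start positions, GC% and product sizes reported by the primer tools can be cross-checked against the template. The sketch below does this for the leptin sequence: the right (reverse) primer anneals to the opposite strand, so its reverse complement is what appears in the template as written, and the product size is the right primer start minus the left primer start plus one. Positions are 1-based as in the reports; the function names are hypothetical, and melting temperature is not recomputed because it depends on the thermodynamic model each tool uses.

```python
def reverse_complement(dna):
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def gc_percent(primer):
    return 100.0 * sum(base in "GC" for base in primer) / len(primer)


def amplicon(template, left, right):
    """Return (left start, right start, product size), 1-based.

    The right primer is located through its reverse complement; its start
    is the position of its 5' base on the template numbering.
    """
    left_idx = template.find(left)
    right_idx = template.find(reverse_complement(right))
    if left_idx < 0 or right_idx < 0:
        raise ValueError("primer not found in template")
    left_start = left_idx + 1
    right_start = right_idx + len(right)
    return left_start, right_start, right_start - left_start + 1


leptin = (
    "GAGCCCCGTAGGAATCGCAGCGCCAGCGGTTGCAAGGTAAGGCCCCGGCGCGCTCCTTCCTCCTTCTCTG"
    "CTGGTCTTTCTTGGCAGGCCACAGGGCCCCACACAACTCTGGATCCCGGGGAAACTGAGTCAGGAGGGAT"
    "GCAGGGCGGATGGCTTAGTTCTGGACTATGATAGCTTTGTACCGAGTTCTAGCCAGATAGAAGGTTACCG"
    "GGAGCTGGGGAGCGTTGGATTTGCTGCTGGGCTGTGCCGGTGCCCAGAAGGCAGGACCTTGCAGAACCAG"
)

primer3_pair = ("TCCTTCCTCCTTCTCTGCTG", "CAGCTCCCGGTAACCTTCTA")
primer_blast_pair = ("GATCCCGGGGAAACTGAGTC", "CAGCAGCAAATCCAACGCTC")

for left, right in (primer3_pair, primer_blast_pair):
    print(amplicon(leptin, left, right), gc_percent(left), gc_percent(right))
```

For the leptin template this gives starts 54 and 217 with a 164 bp product for the Primer3 pair, and 112 and 239 with a 128 bp product for the Primer-BLAST pair, in agreement with both reports.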

……………………………………………………………………………………………………………………………………………………….

Prac 6
Multiple sequence alignment of proteins using ClustalW,
Clustal Omega and T-Coffee.
Coronin protein
In human
>EAW74626.1MSFRKVVRQSKFRHVFGQPVKNDQCYEDIRVSRVTWDSTFCAVNPKFLAVIVEASGGGAFLVLPLSKTGR
IDKAYPTVCGHTGPVLDIDWCPHNDEVIASGSEDCTVMVWQIPENGLTSPLTEPVVVLEGHTKRVGIIAW
In rabbit
>AAD23736.1MSFRKVVRQSKFRHVFGQPVKNDQCYEDIRVSRVTWDSTFCAVNPKFLAVIVEASGGGAFLVLPLSKTGR
IDKAYPTVCGHTGPVLDIEWCPHNDGVIASGSEDCTVMVWQIPEDGLTSPLTEPVVVLEGHTKRVGIVTW
In mouse
>EDL33019.1MSFRKVVRQSKFRHVFGQPVKNDQCYEDIRVSRVTWDSTFCAVNPKFLAVIVEASGGGAFMVLPLNKTGR
IDKAYPTVCGHTGPVLDIDWCPHNDEVIASGSEDCTVMVWQIPENGLTSPLTEPVVVLEGHTKRVGIITW
 
ClustalW:
Sequence 1: EAW74626.1    70 aa
Sequence 2: EDL33019.1    70 aa
Sequence 3: AAD23736.1    70 aa
Start of Pairwise alignments
Aligning...

Sequences (1:2) Aligned. Score: 98.5714

Sequences (1:3) Aligned. Score: 92.8571

Sequences (2:3) Aligned. Score: 94.2857

There are 2 groups


Start of Multiple Alignment
Aligning...
Group 1: Sequences:   2      Score: 1183

Group 2: Sequences:   3      Score: 1155


Alignment Score 1326

EAW74626.1MSFRKVVRQSKFRHVFGQPV      IDKAYPTVCGHTGPVLDIDWCPHNDEVIASGSEDCTVMVWQIPENGLTSP
EDL33019.1MSFRKVVRQSKFRHVFGQPV      IDKAYPTVCGHTGPVLDIDWCPHNDEVIASGSEDCTVMVWQIPENGLTSP
AAD23736.1MSFRKVVRQSKFRHVFGQPV      IDKAYPTVCGHTGPVLDIEWCPHNDGVIASGSEDCTVMVWQIPEDGLTSP
                                    ******************:****** ******************:*****

EAW74626.1MSFRKVVRQSKFRHVFGQPV      LTEPVVVLEGHTKRVGIIAW
EDL33019.1MSFRKVVRQSKFRHVFGQPV      LTEPVVVLEGHTKRVGIITW
AAD23736.1MSFRKVVRQSKFRHVFGQPV      LTEPVVVLEGHTKRVGIVTW
                                    *****************::*
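
The pairwise scores reported by ClustalW appear to be plain percent identities over the 70 aligned residues. The sketch below recomputes them from the three 70-residue sequences shown in the alignment block. It assumes an ungapped, position-by-position comparison, which holds here because all three sequences have the same length and align without gaps; the function name is illustrative.

```python
from itertools import combinations

# The 70 residues aligned for each accession (both alignment blocks joined).
SEQS = {
    "EAW74626.1": "IDKAYPTVCGHTGPVLDIDWCPHNDEVIASGSEDCTVMVWQIPENGLTSPLTEPVVVLEGHTKRVGIIAW",
    "EDL33019.1": "IDKAYPTVCGHTGPVLDIDWCPHNDEVIASGSEDCTVMVWQIPENGLTSPLTEPVVVLEGHTKRVGIITW",
    "AAD23736.1": "IDKAYPTVCGHTGPVLDIEWCPHNDGVIASGSEDCTVMVWQIPEDGLTSPLTEPVVVLEGHTKRVGIVTW",
}


def percent_identity(a: str, b: str) -> float:
    """Percent of positions with identical residues (equal-length, ungapped)."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


for (name1, seq1), (name2, seq2) in combinations(SEQS.items(), 2):
    print(f"{name1} vs {name2}: {percent_identity(seq1, seq2):.4f}")
```

One mismatch out of 70 gives 98.5714, four give 94.2857 and five give 92.8571, matching the three scores in the log.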
Clustal Omega:
AAD23736.1MSFRKVVRQSKFRHVFGQPVKNDQCYEDIRVSRVTWDSTFCAVNPKFLAVIVEASGGGAFLVLPLSKTGR     IDKAYPTVCGHTGPVLDIEWCPHNDGVIASGSEDCTVMVWQIPEDGLTSPLTEPVVVLEG    60
EAW74626.1MSFRKVVRQSKFRHVFGQPVKNDQCYEDIRVSRVTWDSTFCAVNPKFLAVIVEASGGGAFLVLPLSKTGR     IDKAYPTVCGHTGPVLDIDWCPHNDEVIASGSEDCTVMVWQIPENGLTSPLTEPVVVLEG    60
EDL33019.1MSFRKVVRQSKFRHVFGQPVKNDQCYEDIRVSRVTWDSTFCAVNPKFLAVIVEASGGGAFMVLPLNKTGR     IDKAYPTVCGHTGPVLDIDWCPHNDEVIASGSEDCTVMVWQIPENGLTSPLTEPVVVLEG    60
                                                                                     ******************:****** ******************:***************

AAD23736.1MSFRKVVRQSKFRHVFGQPVKNDQCYEDIRVSRVTWDSTFCAVNPKFLAVIVEASGGGAFLVLPLSKTGR     HTKRVGIVTW    70
EAW74626.1MSFRKVVRQSKFRHVFGQPVKNDQCYEDIRVSRVTWDSTFCAVNPKFLAVIVEASGGGAFLVLPLSKTGR     HTKRVGIIAW    70
EDL33019.1MSFRKVVRQSKFRHVFGQPVKNDQCYEDIRVSRVTWDSTFCAVNPKFLAVIVEASGGGAFMVLPLNKTGR     HTKRVGIITW    70
                                                                                     *******::*

T-Coffee:
AAD23736.1MSFRKVVRQSKFRHVFGQPVKNDQCYEDIRVSRVTWDSTFCAVNPKFLAVIVEASGGGAFLVLPLSKTGR  IDKAYPTVCGHTGPVLDIEWCPHNDGVIASGSEDCTVMVWQIPEDGLTSP
EAW74626.1MSFRKVVRQSKFRHVFGQPVKNDQCYEDIRVSRVTWDSTFCAVNPKFLAVIVEASGGGAFLVLPLSKTGR  IDKAYPTVCGHTGPVLDIDWCPHNDEVIASGSEDCTVMVWQIPENGLTSP
EDL33019.1MSFRKVVRQSKFRHVFGQPVKNDQCYEDIRVSRVTWDSTFCAVNPKFLAVIVEASGGGAFMVLPLNKTGR  IDKAYPTVCGHTGPVLDIDWCPHNDEVIASGSEDCTVMVWQIPENGLTSP
                                                                                  ******************:****** ******************:*****

AAD23736.1MSFRKVVRQSKFRHVFGQPVKNDQCYEDIRVSRVTWDSTFCAVNPKFLAVIVEASGGGAFLVLPLSKTGR  LTEPVVVLEGHTKRVGIVTW
EAW74626.1MSFRKVVRQSKFRHVFGQPVKNDQCYEDIRVSRVTWDSTFCAVNPKFLAVIVEASGGGAFLVLPLSKTGR  LTEPVVVLEGHTKRVGIIAW
EDL33019.1MSFRKVVRQSKFRHVFGQPVKNDQCYEDIRVSRVTWDSTFCAVNPKFLAVIVEASGGGAFMVLPLNKTGR  LTEPVVVLEGHTKRVGIITW
                                                                                  *****************::*
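
All three tools print the same conservation line under the alignment: `*` for an identical column, `:` for a strongly similar one, `.` for a weakly similar one, and a blank otherwise. The sketch below rebuilds that line for the 70 aligned residues. The group lists are the Clustal "strong" and "weak" residue groups as I recall them from the Clustal documentation, so treat them as an assumption; the function names are illustrative.

```python
# Clustal residue groups (assumed from the Clustal documentation).
STRONG = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK", "MILV", "MILF", "HY", "FYW"]
WEAK = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND",
        "SNDEQK", "NDEQHK", "NEQHRK", "FVLIM", "HFY"]


def column_symbol(column) -> str:
    """Conservation symbol for one alignment column."""
    residues = set(column)
    if len(residues) == 1 and "-" not in residues:
        return "*"  # fully conserved
    if any(residues <= set(group) for group in STRONG):
        return ":"  # all residues fall in one strong group
    if any(residues <= set(group) for group in WEAK):
        return "."  # all residues fall in one weak group
    return " "


def consensus(seqs) -> str:
    """Conservation line for equal-length aligned sequences."""
    return "".join(column_symbol(col) for col in zip(*seqs))


aligned = [
    "IDKAYPTVCGHTGPVLDIDWCPHNDEVIASGSEDCTVMVWQIPENGLTSPLTEPVVVLEGHTKRVGIIAW",  # EAW74626.1
    "IDKAYPTVCGHTGPVLDIDWCPHNDEVIASGSEDCTVMVWQIPENGLTSPLTEPVVVLEGHTKRVGIITW",  # EDL33019.1
    "IDKAYPTVCGHTGPVLDIEWCPHNDGVIASGSEDCTVMVWQIPEDGLTSPLTEPVVVLEGHTKRVGIVTW",  # AAD23736.1
]
print(consensus(aligned))
```

The only blank column is position 26 (E, E, G), since glutamate and glycine share neither a strong nor a weak group.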
…………………………………………………………………………………………………………………………………………………………

Prac 07
SNP:

ANK3
rs10994336
Alleles: C>T
Chromosome: 10:60420054 (GRCh38), 10:62179812 (GRCh37)
Canonical SPDI: NC_000010.11:60420053:C:T
Gene: ANK3
Functional Consequence: intron_variant, genic_upstream_transcript_variant
Validated: by frequency, by alfa, by cluster
MAF:
T=0.063421/8173 (ALFA)
T=0.046296/10 (Qatari)
T=0.05/2 (GENOME_DK)
T=0.054207/201 (TWINSUK)
T=0.056305/217 (ALSPAC)
T=0.0722/9066 (TOPMED)
T=0.076585/87 (Daghestan)
T=0.077154/77 (GoNL)
T=0.085559/2684 (GnomAD)
T=0.11/66 (NorthernSweden)
T=0.113497/37 (HapMap)
T=0.123203/617 (1000Genomes)
T=0.130841/28 (Vietnamese)
T=0.132143/592 (Estonian)
T=0.132516/10429 (PAGE_STUDY)
T=0.282594/828 (KOREAN)
T=0.286572/525 (Korea1K)
C=0.345/69 (SGDP_PRJ)
C=0.363636/8 (Siberian)
HGVS: NC_000010.11:g.60420054C>T, NC_000010.10:g.62179812C>T, NG_029917.1:g.318473G>A
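
The Canonical SPDI and the genomic HGVS name describe the same change with different coordinate conventions: the SPDI position is 0-based, the HGVS `g.` position is 1-based. The sketch below shows the conversion for the two simplest cases only (a single-base substitution and a single-base deletion). The function name is hypothetical, and the code does not attempt the 3' shifting that HGVS requires for deletions inside repeated bases.

```python
def spdi_to_hgvs(spdi: str) -> str:
    """Convert a simple SPDI string to a genomic HGVS expression (sketch)."""
    seq_id, position, deleted, inserted = spdi.split(":")
    start = int(position) + 1  # 0-based SPDI position -> 1-based HGVS position
    if len(deleted) == 1 and len(inserted) == 1:
        return f"{seq_id}:g.{start}{deleted}>{inserted}"  # substitution
    if len(deleted) == 1 and inserted == "":
        return f"{seq_id}:g.{start}del"                    # single-base deletion
    raise ValueError("only single-base substitutions and deletions are handled")


print(spdi_to_hgvs("NC_000010.11:60420053:C:T"))
```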

rs10994415
Variant type: SNV
Alleles: T>C,G
Chromosome: 10:60562276 (GRCh38), 10:62322034 (GRCh37)
Canonical SPDI: NC_000010.11:60562275:T:C, NC_000010.11:60562275:T:G
Gene: ANK3
Functional Consequence: intron_variant, genic_upstream_transcript_variant
Validated: by frequency, by alfa, by cluster
MAF:
C=0.235513/1837 (ALFA)
C=0.064815/14 (Qatari)
C=0.074434/276 (TWINSUK)
C=0.075/3 (GENOME_DK)
C=0.079398/306 (ALSPAC)
C=0.091648/11508 (TOPMED)
C=0.093186/93 (GoNL)
C=0.095284/2990 (GnomAD)
C=0.096667/58 (NorthernSweden)
C=0.126786/568 (Estonian)
C=0.14996/751 (1000Genomes)
C=0.172727/57 (HapMap)
C=0.275701/59 (Vietnamese)
T=0.394231/82 (SGDP_PRJ)
C=0.396246/1161 (KOREAN)
C=0.400109/733 (Korea1K)
T=0.5/10 (Siberian)
HGVS: NC_000010.11:g.60562276T>C, NC_000010.11:g.60562276T>G, NC_000010.10:g.62322034T>C, NC_000010.10:g.62322034T>G, NG_029917.1:g.176251A>G, NG_029917.1:g.176251A>C
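
Each MAF entry has the form `allele=frequency/count (study)`. Assuming the number after the slash is the count of the minor allele, dividing it by the frequency gives the approximate number of alleles sampled in that study. The sketch below parses a few of the rs10994415 entries and sorts them by frequency; the regular expression and field names are illustrative.

```python
import re

# allele=frequency/count (study)
MAF_LINE = re.compile(r"^([ACGT])=([0-9.]+)/(\d+) \((.+)\)$")


def parse_maf(line: str) -> dict:
    """Split one MAF entry into allele, frequency, count and study name."""
    match = MAF_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised MAF entry: {line!r}")
    allele, freq, count, study = match.groups()
    return {"allele": allele, "freq": float(freq), "count": int(count), "study": study}


def alleles_sampled(record: dict) -> int:
    """Approximate total alleles in the study (assumes count is the minor-allele count)."""
    return round(record["count"] / record["freq"])


entries = [
    "C=0.064815/14 (Qatari)",
    "C=0.075/3 (GENOME_DK)",
    "C=0.14996/751 (1000Genomes)",
    "C=0.396246/1161 (KOREAN)",
]

records = [parse_maf(entry) for entry in entries]
for rec in sorted(records, key=lambda r: r["freq"]):
    print(rec["study"], rec["allele"], rec["freq"], alleles_sampled(rec))
```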
 
ALDOB

rs1057517091
Variant type: DEL
Alleles: T>-
Chromosome: 9:101426633 (GRCh38), 9:104188915 (GRCh37)
Canonical SPDI: NC_000009.12:101426632:T:
Gene: ALDOB
Functional Consequence: coding_sequence_variant, frameshift_variant
Clinical significance: likely-pathogenic
HGVS: NC_000009.12:g.101426633del, NC_000009.11:g.104188915del, NG_012387.1:g.14148del, NM_000035.4:c.546del, NM_000035.3:c.546del, NP_000026.2:p.Leu183fs

rs1057516534
Variant type: DELINS
Alleles: C>-
Chromosome: 9:101430775 (GRCh38), 9:104193057 (GRCh37)
Canonical SPDI: NC_000009.12:101430774:CC:C
Gene: ALDOB
Functional Consequence: coding_sequence_variant, splice_donor_variant
Clinical significance: likely-pathogenic
HGVS: NC_000009.12:g.101430776del, NC_000009.11:g.104193058del, NG_012387.1:g.10006del

………………………………………………………………………………………………………………………………………………………………….

Prac 08
Prediction of secondary structure of protein:
Jpred4
Insulin
 
MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGG
GPGAGSLQPLALEGSLQKRGIVEQCCTSICSLYQLENYCN
Name: Hormone
Title: Monoclinic human insulin in complex with p-coumaric acid
Structure: Insulin. Chain: a, b, c, d, e, f, g, h, i, j, k, l. Engineered: yes
Source: Homo sapiens. Human. Organism_taxid: 9606. Gene: ins. Expressed in: saccharomyces cerevisiae. Expression_system_taxid: 4932
Resolution: 1.36Å    R-factor: 0.160    R-free: 0.188
Authors: D.-P.Triandafillidis, N.Parthenios, M.Spiliopoulou, A.Valmas, C.Kosinas, F.Gozzo, M.Reinle-Schmitt, D.Beckers, T.Degen, M.Pop, A.Fitch, J.Wollenhaupt, M.S.Weiss, F.Karavassili, I.Margiolaki
Key ref: D.P.Triandafillidis et al. (2020). Insulin polymorphism induced by two polyphenols: new crystal forms and advances in macromolecular powder diffraction. Acta Crystallogr D Struct Biol, 76, 1065-1079. PubMed id: 33135678 DOI: 10.1107/S205979832001195X
Date: 04-Nov-19    Release date: 11-Nov-20

Myosin:
CILITGESGAGKTEASKLVMSYVAAVCGKGAEVNQVKEQLLQSNPVLEAFGNAKTVRNDNSSRFGKYMDI
EFDFKGDPLGGVISNYLLEK
Name: Structural protein
Title: High-resolution cryo-em structures of actin-bound myosin states reveal the mechanism of myosin force sensing
Structure: Actin, alpha skeletal muscle. Chain: a, b, c, d, e. Synonym: alpha-actin-1. Unconventional myosin-ib. Chain: p. Synonym: myosin i alpha,mmia,myosin heavy chain myr 1. Calmodulin. Chain: r
Source: Oryctolagus cuniculus. Rabbit. Organism_taxid: 9986. Rattus norvegicus. Rat. Organism_taxid: 10116. Unidentified. Organism_taxid: 32644
Authors: A.Mentes, A.Huehn, X.Liu, A.Zwolak, R.Dominguez, H.Shuman, E.M.Ostap, C.V.Sindelar
Key ref: A.Mentes et al. (2018). High-resolution cryo-EM structures of actin-bound myosin states reveal the mechanism of myosin force sensing. Proc Natl Acad Sci U S A, 115, 1292-1297. PubMed id: 29358376
Date: 04-Jan-18    Release date: 31-Jan-18
