
Tut3 2022

The document provides instructions for writing programs to analyze DNA sequences, including: 1) Calculating AT content and complementing a DNA sequence. 2) Finding restriction fragments after digestion with EcoRI. 3) Separating coding and non-coding regions and calculating the percentage that is coding. 4) Splitting a genomic sequence into coding and non-coding parts and writing them to separate files. 5) Creating a FASTA file with 3 sequences in uppercase only containing bases A, T, G and C.

Uploaded by

Jia Lin

1) Calculating AT content

Write a program that will print out the AT content of this DNA sequence.

DNA sequence: ACTGATCGATTACGTATAGTATTTGCTATCATACATATATATCGATGCGTTCAT

Sample output:

AT content is 0.6851851851851852
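One possible approach, sketched here rather than given as the official solution, is to count the A and T bases with str.count and divide by the sequence length. The variable names (my_dna, a_count, t_count, at_content) are illustrative assumptions.

```python
# Sequence taken from the exercise text.
my_dna = "ACTGATCGATTACGTATAGTATTTGCTATCATACATATATATCGATGCGTTCAT"

# Count each base separately, then divide by the total length.
a_count = my_dna.count("A")
t_count = my_dna.count("T")
at_content = (a_count + t_count) / len(my_dna)

print("AT content is " + str(at_content))
```

In Python 3 the / operator always performs floating-point division, so no explicit conversion is needed.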

2) Complementing DNA

Write a program that will print the complement of this sequence.

DNA sequence: ACTGATCGATTACGTATAGTATTTGCTATCATACATATATATCGATGCGTTCAT

Sample output:

TGACTAGCTAATGCATATCATAAACGATAGTATGTATATATAGCTACGCAAGTA

Intermediate results after each step (replace A with t, T with a, C with g, G with c, then convert to uppercase):

tCTGtTCGtTTtCGTtTtGTtTTTGCTtTCtTtCtTtTtTtTCGtTGCGTTCtT

tCaGtaCGtaatCGatatGataaaGCataCtatCtatatataCGtaGCGaaCta

tgaGtagGtaatgGatatGataaaGgatagtatgtatatatagGtaGgGaagta

tgactagctaatgcatatcataaacgatagtatgtatatatagctacgcaagta

TGACTAGCTAATGCATATCATAAACGATAGTATGTATATATAGCTACGCAAGTA
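The stepwise output above suggests a chain of str.replace calls. A minimal sketch of that idea follows; the variable names are assumptions. Each base is replaced with the lowercase form of its complement so that a later replacement cannot undo an earlier one, and the result is uppercased at the end.

```python
# Sequence taken from the exercise text.
my_dna = "ACTGATCGATTACGTATAGTATTTGCTATCATACATATATATCGATGCGTTCAT"

# Replace each base with its complement in lowercase, one base at a time.
# Lowercase keeps already-complemented bases from being replaced again.
step1 = my_dna.replace("A", "t")
step2 = step1.replace("T", "a")
step3 = step2.replace("C", "g")
step4 = step3.replace("G", "c")

# Convert back to uppercase for the final result.
complement = step4.upper()
print(complement)
```

An alternative that avoids the intermediate strings is my_dna.translate(str.maketrans("ATGC", "TACG")), which maps all four bases in a single pass.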
3) Restriction fragment lengths

The sequence contains a recognition site for the EcoRI restriction enzyme, which cuts at the motif
G*AATTC (the position of the cut is indicated by an asterisk). Write a program that will calculate the size
of the two fragments that will be produced when the DNA sequence is digested with EcoRI.

DNA sequence: ACTGATCGATTACGTATAGTAGAATTCTATCATACATATATATCGATGCGTTCAT

Suggested approach:

1. Assign the sequence to a variable: my_dna
2. Find the start of the motif: my_dna.find("GAATTC")
3. Calculate the length of fragment 1: frag1_len = my_dna.find("GAATTC") + 1
4. Calculate the length of fragment 2: frag2_len = len(my_dna) - frag1_len
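The steps can be put together as a short program. This is a sketch with assumed variable names. str.find returns the zero-based index where the motif starts, and EcoRI cuts after the G, so one is added to get the length of the first fragment.

```python
# Sequence taken from the exercise text.
my_dna = "ACTGATCGATTACGTATAGTAGAATTCTATCATACATATATATCGATGCGTTCAT"

# Zero-based index of the first base of the recognition site.
site_start = my_dna.find("GAATTC")

# The cut falls after the G, so fragment 1 includes that G.
frag1_len = site_start + 1
frag2_len = len(my_dna) - frag1_len

print("length of fragment one is " + str(frag1_len))
print("length of fragment two is " + str(frag2_len))
```

For this sequence the motif starts at index 21, which gives fragments of 22 and 33 bases.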

4) Splicing out introns

Here is a section of genomic DNA:


ATCGATCGATCGATCGACTGACTAGTCATAGCTATGCATGTAGCTACTCGATCGATCGATCGATCGATCGATCGATCGATCGATCATGCTATCATCGATCGATATCGATGCATCGACTACTAT

It comprises two exons and an intron. The first exon runs from the start of the sequence to the sixty-third character, and the second exon runs from the ninety-first character to the end of the sequence.

a) Write a program that will print just the coding regions of the DNA sequence.
Sample output:
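A minimal sketch for part a, assuming the variable names shown. Python slices are zero-based and exclude the end index, so characters 1 to 63 are my_dna[0:63] and characters 91 onwards are my_dna[90:].

```python
# Genomic sequence from the exercise, split over two literals for readability.
my_dna = (
    "ATCGATCGATCGATCGACTGACTAGTCATAGCTATGCATGTAGCTACTCGATCGATCGA"
    "TCGATCGATCGATCGATCGATCGATCATGCTATCATCGATCGATATCGATGCATCGACTACTAT"
)

# First exon: characters 1 to 63; second exon: character 91 to the end.
exon1 = my_dna[0:63]
exon2 = my_dna[90:]

coding = exon1 + exon2
print(coding)
```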

b) Using the data from part a, write a program that will calculate what percentage of the DNA sequence is coding.

Sample output:
78….
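One way to compute the percentage, given as a sketch with assumed variable names: divide the combined exon length by the total length and multiply by 100.

```python
# Genomic sequence from the exercise, split over two literals for readability.
my_dna = (
    "ATCGATCGATCGATCGACTGACTAGTCATAGCTATGCATGTAGCTACTCGATCGATCGA"
    "TCGATCGATCGATCGATCGATCGATCATGCTATCATCGATCGATATCGATGCATCGACTACTAT"
)

exon1 = my_dna[0:63]
exon2 = my_dna[90:]

# Coding length is the sum of the two exon lengths (63 + 33 = 96 of 123 bases).
coding_length = len(exon1) + len(exon2)
coding_percentage = coding_length / len(my_dna) * 100

print(coding_percentage)
```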
c) Write a program that will print out the original genomic DNA sequence with coding bases in
uppercase and non-coding bases in lowercase.

Sample outcome:

exon1 = my_dna[0:63]

exon2 = my_dna[90:]

non_coding = my_dna[63:90]

print(exon1 + non_coding.lower() + exon2)

5) Splitting genomic DNA

Write a program that will split the genomic DNA from file genomic_dna.txt into coding and non-coding
parts, and write these sequences to two separate files.
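A sketch of one way to do this. The exercise does not state where the exon boundaries lie in the file, so this example assumes they are the same as in the previous exercise (intron at characters 64 to 90). The function name split_genomic_dna and the output file names coding_dna.txt and noncoding_dna.txt are also assumptions. So that the example runs on its own, the demonstration writes a sample genomic_dna.txt into a temporary directory first.

```python
import os
import tempfile


def split_genomic_dna(input_path, coding_path, noncoding_path,
                      intron_start=63, intron_end=90):
    """Split a genomic sequence into coding and non-coding parts.

    intron_start and intron_end are zero-based slice positions and are
    assumed here; adjust them to match the actual file.
    """
    with open(input_path) as dna_file:
        # Join on whitespace so line breaks in the file are ignored.
        my_dna = "".join(dna_file.read().split())

    coding = my_dna[:intron_start] + my_dna[intron_end:]
    noncoding = my_dna[intron_start:intron_end]

    with open(coding_path, "w") as coding_file:
        coding_file.write(coding + "\n")
    with open(noncoding_path, "w") as noncoding_file:
        noncoding_file.write(noncoding + "\n")
    return coding, noncoding


# Demonstration with the sequence from the previous exercise (an assumption).
sample_dna = (
    "ATCGATCGATCGATCGACTGACTAGTCATAGCTATGCATGTAGCTACTCGATCGATCGA"
    "TCGATCGATCGATCGATCGATCGATCATGCTATCATCGATCGATATCGATGCATCGACTACTAT"
)
with tempfile.TemporaryDirectory() as tmp_dir:
    input_path = os.path.join(tmp_dir, "genomic_dna.txt")
    coding_path = os.path.join(tmp_dir, "coding_dna.txt")
    noncoding_path = os.path.join(tmp_dir, "noncoding_dna.txt")
    with open(input_path, "w") as dna_file:
        dna_file.write(sample_dna + "\n")
    coding, noncoding = split_genomic_dna(input_path, coding_path, noncoding_path)
    with open(coding_path) as coding_file:
        coding_on_disk = coding_file.read()

print(len(coding), len(noncoding))
```

With a real genomic_dna.txt in the working directory, the call would simply be split_genomic_dna("genomic_dna.txt", "coding_dna.txt", "noncoding_dna.txt").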

6) Write a program that will create a FASTA file (output.txt) for the following three sequences. Make sure that all sequences are in uppercase and only contain the bases A, T, G and C.

Sequence header    DNA sequence

ABC123             ATCGTACGATCGATCGATCGCTAGACGTATCG

DEF456             actgatcgacgatcgatcgatcacgact

HIJ789             ACTGAC-ACTGT--ACTGTA----CATGTG
Sample outcome
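A sketch of one way to build the file, with assumed helper names (clean_sequence, to_fasta). Each record becomes a header line starting with ">" followed by the sequence on the next line. Cleaning uppercases the sequence and drops any character that is not A, T, G or C, which removes the gap characters from HIJ789.

```python
def clean_sequence(seq):
    """Uppercase the sequence and keep only the bases A, T, G and C."""
    return "".join(base for base in seq.upper() if base in "ATGC")


def to_fasta(records):
    """Format (header, sequence) pairs as FASTA text."""
    lines = []
    for header, seq in records:
        lines.append(">" + header)
        lines.append(clean_sequence(seq))
    return "\n".join(lines) + "\n"


# Headers and sequences taken from the exercise table.
records = [
    ("ABC123", "ATCGTACGATCGATCGATCGCTAGACGTATCG"),
    ("DEF456", "actgatcgacgatcgatcgatcacgact"),
    ("HIJ789", "ACTGAC-ACTGT--ACTGTA----CATGTG"),
]

fasta_text = to_fasta(records)

# Write the result to output.txt in the current directory.
with open("output.txt", "w") as output_file:
    output_file.write(fasta_text)

print(fasta_text, end="")
```

Filtering on membership in "ATGC" is slightly more general than calling replace("-", ""), since it also removes any other unexpected character.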
