Manual de Ejercicios de Python

This document provides Python solutions to exercises on manipulating DNA sequences. The examples calculate the AT content of a sequence, complement a sequence, calculate restriction fragment lengths after digestion with an enzyme, splice out introns, and write DNA sequences to files in FASTA format. The solutions demonstrate reading and writing files, string methods, slicing, and looping over sequences read from a file.

Uploaded by

Daniel Alonso

Chapter 2: Printing and manipulating text

1. Calculating AT content

Here's a short DNA sequence:

ACTGATCGATTACGTATAGTATTTGCTATCATACATATATATCGATGCGTTCAT

Write a program that will print out the AT content of this DNA sequence. Hint: you can use normal mathematical symbols like add (+), subtract (-), multiply (*), divide (/) and parentheses to carry out calculations on numbers in Python.

Solution:

from __future__ import division

my_dna = "ACTGATCGATTACGTATAGTATTTGCTATCATACATATATATCGATGCGTTCAT"
length = len(my_dna)
a_count = my_dna.count('A')
t_count = my_dna.count('T')

at_content = (a_count + t_count) / length
print("AT content is " + str(at_content))

AT content is 0.685185185185

2. Complementing DNA

Here's a short DNA sequence:

ACTGATCGATTACGTATAGTATTTGCTATCATACATATATATCGATGCGTTCAT

Write a program that will print the complement of this sequence.

Solution:

my_dna = "ACTGATCGATTACGTATAGTATTTGCTATCATACATATATATCGATGCGTTCAT"
# replace each base with its complement in lower case, so that a base
# that has already been replaced is not replaced a second time
replacement1 = my_dna.replace('A', 't')
print(replacement1)
replacement2 = replacement1.replace('T', 'a')
print(replacement2)
replacement3 = replacement2.replace('C', 'g')
print(replacement3)
replacement4 = replacement3.replace('G', 'c')
print(replacement4)
print(replacement4.upper())

tCTGtTCGtTTtCGTtTtGTtTTTGCTtTCtTtCtTtTtTtTCGtTGCGTTCtT
tCaGtaCGtaatCGatatGataaaGCataCtatCtatatataCGtaGCGaaCta
tgaGtagGtaatgGatatGataaaGgatagtatgtatatatagGtaGgGaagta
tgactagctaatgcatatcataaacgatagtatgtatatatagctacgcaagta
TGACTAGCTAATGCATATCATAAACGATAGTATGTATATATAGCTACGCAAGTA

3. Restriction fragment lengths

Here's a short DNA sequence:

ACTGATCGATTACGTATAGTAGAATTCTATCATACATATATATCGATGCGTTCAT

The sequence contains a recognition site for the EcoRI restriction enzyme, which cuts at the motif G*AATTC (the position of the cut is indicated by an asterisk). Write a program which will calculate the size of the two fragments that will be produced when the DNA sequence is digested with EcoRI.

Solution:

my_dna = "ACTGATCGATTACGTATAGTAGAATTCTATCATACATATATATCGATGCGTTCAT"
# find() returns the index of the G that starts the motif; the cut falls
# just after that G, so the first fragment is one base longer than the index
frag1_length = my_dna.find("GAATTC") + 1
frag2_length = len(my_dna) - frag1_length
print("length of fragment one is " + str(frag1_length))
print("length of fragment two is " + str(frag2_length))

length of fragment one is 22
length of fragment two is 33

4. Splicing out introns, part one

Here's a short section of genomic DNA:

ATCGATCGATCGATCGACTGACTAGTCATAGCTATGCATGTAGCTACTCGATCGATCGA
TCGATCGATCGATCGATCGATCGATCATGCTATCATCGATCGATATCGATGCATCGACT
ACTAT

It comprises two exons and an intron. The first exon runs from the start of the sequence to the sixty-third character, and the second exon runs from the ninety-first character to the end of the sequence. Write a program that will print just the coding regions of the DNA sequence.

Solution:

my_dna = "ATCGATCGATCGATCGACTGACTAGTCATAGCTATGCATGTAGCTACTCGATCGATCGATCGATCGATCGATCGATCGATCGATCATGCTATCATCGATCGATATCGATGCATCGACTACTAT"
exon1 = my_dna[0:63]
exon2 = my_dna[90:]
print(exon1 + exon2)

ATCGATCGATCGATCGACTGACTAGTCATAGCTATGCATGTAGCTACTCGATCGATCGATCGAATCATCGATCGATATCGATGCATCGACTACTAT

5. Splicing out introns, part two

Using the data from part one, write a program that will calculate what percentage of the DNA sequence is coding.

Solution:

from __future__ import division

my_dna = "ATCGATCGATCGATCGACTGACTAGTCATAGCTATGCATGTAGCTACTCGATCGATCGATCGATCGATCGATCGATCGATCGATCATGCTATCATCGATCGATATCGATGCATCGACTACTAT"
exon1 = my_dna[0:63]
exon2 = my_dna[90:]
coding_length = len(exon1 + exon2)
total_length = len(my_dna)
print(100 * coding_length / total_length)

78.0487804878

6. Splicing out introns, part three

Using the data from part one, write a program that will print out the original genomic DNA sequence with coding bases in uppercase and non-coding bases in lowercase.

Solution:

my_dna = "ATCGATCGATCGATCGACTGACTAGTCATAGCTATGCATGTAGCTACTCGATCGATCGATCGATCGATCGATCGATCGATCGATCATGCTATCATCGATCGATATCGATGCATCGACTACTAT"
exon1 = my_dna[0:63]
intron = my_dna[63:90]
exon2 = my_dna[90:]
print(exon1 + intron.lower() + exon2)

ATCGATCGATCGATCGACTGACTAGTCATAGCTATGCATGTAGCTACTCGATCGATCGATCGAtcgatcgatcgatcgatcgatcgatcatgctATCATCGATCGATATCGATGCATCGACTACTAT

Chapter 3: Reading and writing files

7. Splitting genomic DNA

Look in the chapter_3 folder for a file called genomic_dna.txt – it contains the same piece of genomic DNA that we were using in the final exercise from chapter 2. Write a program that will split the genomic DNA into coding and non-coding parts, and write these sequences to two separate files.

Hint: use your solution to the last exercise from chapter 2 as a starting point.

Solution:

# open the file and read its contents
dna_file = open("/home/daniel/Python/exercises/Chapter_3/exercises/genomic_dna.txt")
my_dna = dna_file.read()

# extract the different bits of DNA sequence
# (same coordinates as in the chapter 2 exercises)
exon1 = my_dna[0:63]
intron = my_dna[63:90]
exon2 = my_dna[90:]

# open the two output files
coding_file = open("/home/daniel/Python/exercises/Chapter_3/exercises/coding_dna.txt", "w")
noncoding_file = open("/home/daniel/Python/exercises/Chapter_3/exercises/noncoding_dna.txt", "w")

# write the sequences to the output files
coding_file.write(exon1 + exon2)
noncoding_file.write(intron)

# close the files so that everything is written to disk
dna_file.close()
coding_file.close()
noncoding_file.close()

8. Writing a FASTA file

FASTA file format is a commonly-used DNA and protein sequence file format. A single sequence in FASTA format looks like this:

>sequence_name
ATCGACTGATCGATCGTACGAT

Where sequence_name is a header that describes the sequence (the greater-than symbol indicates the start of the header line). Often, the header contains an accession number that relates to the record for the sequence in a public sequence database. A single FASTA file can contain multiple sequences, like this:

>sequence_one
ATCGATCGATCGATCGAT
>sequence_two
ACTAGCTAGCTAGCATCG
>sequence_three
ACTGCATCGATCGTACCT

Write a program that will create a FASTA file for the following three sequences – make sure that all sequences are in upper case and only contain the bases A, T, G and C.

Sequence header    DNA sequence
ABC123             ATCGTACGATCGATCGATCGCTAGACGTATCG
DEF456             actgatcgacgatcgatcgatcacgact
HIJ789             ACTGAC-ACTGT--ACTGTA----CATGTG

Solution:

# set the values of all the header variables
header_1 = "ABC123"
header_2 = "DEF456"
header_3 = "HIJ789"

# set the values of all the sequence variables
seq_1 = "ATCGTACGATCGATCGATCGCTAGACGTATCG"
seq_2 = "actgatcgacgatcgatcgatcacgact"
seq_3 = "ACTGAC-ACTGT--ACTGTA----CATGTG"

# make a new file to hold the output
output = open("/home/daniel/Python/exercises/Chapter_3/exercises/sequences.fasta", "w")

# write the header and sequence for seq1
output.write('>' + header_1 + '\n' + seq_1 + '\n')

# write the header and uppercase sequence for seq2
output.write('>' + header_2 + '\n' + seq_2.upper() + '\n')

# write the header and sequence for seq3 with hyphens removed
output.write('>' + header_3 + '\n' + seq_3.replace('-', '') + '\n')

# close the file so that everything is written to disk
output.close()

9. Writing multiple FASTA files

Use the data from the previous exercise, but instead of creating a single FASTA file, create three new FASTA files – one per sequence. The names of the FASTA files should be the same as the sequence header names, with the extension .fasta.

Solution:

# set the values of all the header variables
header_1 = "ABC123"
header_2 = "DEF456"
header_3 = "HIJ789"

# set the values of all the sequence variables
seq_1 = "ATCGTACGATCGATCGATCGCTAGACGTATCG"
seq_2 = "actgatcgacgatcgatcgatcacgact"
seq_3 = "ACTGAC-ACTGT--ACTGTA----CATGTG"

# make three files to hold the output, each named after its header
output_dir = "/home/daniel/Python/exercises/Chapter_3/exercises/"
output_1 = open(output_dir + header_1 + ".fasta", "w")
output_2 = open(output_dir + header_2 + ".fasta", "w")
output_3 = open(output_dir + header_3 + ".fasta", "w")

# write sequence 1 to output file 1
output_1.write('>' + header_1 + '\n' + seq_1 + '\n')

# write sequence 2 to output file 2
output_2.write('>' + header_2 + '\n' + seq_2.upper() + '\n')

# write sequence 3 to output file 3
output_3.write('>' + header_3 + '\n' + seq_3.replace('-', '') + '\n')

# close the files so that everything is written to disk
output_1.close()
output_2.close()
output_3.close()

Chapter 4: Lists and loops

10. Processing DNA in a file

The file input.txt contains a number of DNA sequences, one per line. Each sequence starts with the same 14 base pair fragment – a sequencing adapter that should have been removed. Write a program that will (a) trim this adapter and write the cleaned sequences to a new file and (b) print the length of each sequence to the screen.

Solution:

# open the input file
file = open("/home/daniel/Python/exercises/Chapter_4/exercises/input.txt")

# open the output file
output = open("/home/daniel/Python/exercises/Chapter_4/exercises/trimmed.txt", "w")

# go through the input file one line at a time
for dna in file:

    # get the substring from the 15th character to the end
    trimmed_dna = dna[14:]

    # get the length of the trimmed sequence
    # (subtract one so that the newline at the end of the line is not counted)
    trimmed_length = len(trimmed_dna) - 1

    # write the trimmed sequence to the output file
    output.write(trimmed_dna)

    # print out the length to the screen
    print("processed sequence with length " + str(trimmed_length))

# close the files so that everything is written to disk
file.close()
output.close()

processed sequence with length 42
processed sequence with length 37
processed sequence with length 48
processed sequence with length 33
processed sequence with length 47

11. Multiple exons from genomic DNA

The file genomic_dna.txt contains a section of genomic DNA, and the file exons.txt contains a list of start/stop positions of exons. Each exon is on a separate line and the start and stop positions are separated by a comma. Write a program that will extract the exon segments, concatenate them, and write them to a new file.

Solution:

# open the genomic dna file and read the contents
genomic_dna = open("/home/daniel/Python/exercises/Chapter_4/exercises/genomic_dna.txt").read()

# open the exons locations file
exon_locations = open("/home/daniel/Python/exercises/Chapter_4/exercises/exons.txt")

# create a variable to hold the coding sequence
coding_sequence = ""

# go through each line in the exon locations file
for line in exon_locations:

    # split the line using a comma
    positions = line.split(',')

    # get the start and stop positions
    start = int(positions[0])
    stop = int(positions[1])

    # extract the exon from the genomic dna
    exon = genomic_dna[start:stop]

    # append the exon to the end of the current coding sequence
    coding_sequence = coding_sequence + exon

# write the coding sequence to an output file
output = open("/home/daniel/Python/exercises/Chapter_4/exercises/coding_sequence.txt", "w")
output.write(coding_sequence)
output.close()
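The exon-extraction logic of exercise 11 can also be tried out without any files on disk. The sketch below is only an illustration under stated assumptions: the helper name `extract_coding_sequence` is hypothetical, and the toy sequence and coordinates are made up rather than taken from genomic_dna.txt or exons.txt. It takes the DNA as a string and the exon locations as any iterable of "start,stop" lines, which is what an open file yields when looped over.

```python
def extract_coding_sequence(genomic_dna, exon_lines):
    """Concatenate the slices named by 'start,stop' lines (hypothetical helper)."""
    coding_sequence = ""
    for line in exon_lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines, e.g. at the end of a file
        start, stop = line.split(",")
        # int() tolerates surrounding whitespace, so "8, 12" also works
        coding_sequence += genomic_dna[int(start):int(stop)]
    return coding_sequence


# made-up toy data for illustration only
toy_dna = "AAAACCCCGGGGTTTT"
toy_exons = ["0,4\n", "8,12\n"]
result = extract_coding_sequence(toy_dna, toy_exons)
print(result)
```

With real files, an open file object for the exon locations could be passed as `exon_lines` in place of the list.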

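Returning to the complement exercise (exercise 2): the chain of four replace() calls works because each replacement is written in lower case before the final upper(). An alternative that avoids the intermediate steps is a translation table. This is a sketch that assumes Python 3, where `str.maketrans` is available; the names `COMPLEMENT_TABLE` and `complement` are illustrative and not part of the original solutions.

```python
# map each base to its complement in a single pass
COMPLEMENT_TABLE = str.maketrans("ATCG", "TAGC")


def complement(dna):
    """Return the complement of an upper-case DNA string (illustrative helper)."""
    return dna.translate(COMPLEMENT_TABLE)


my_dna = "ACTGATCGATTACGTATAGTATTTGCTATCATACATATATATCGATGCGTTCAT"
print(complement(my_dna))
```

Characters that are not in the table, such as gap hyphens or lower-case bases, pass through unchanged.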