User Guide For Propy 1.0: 1.1 What Is This?
This document gives an overview of how to use the propy functionality from Python. It is not comprehensive and it is not a manual. If you find mistakes or have suggestions for improvements, please either fix them yourself in the source document (the .py file) or send them to the mailing list: [email protected]
propy has been successfully tested on Linux and Windows systems. Users can download the propy package (.zip and .tar.gz) from http://code.google.com/p/protpy/downloads/list. Installing propy is straightforward:
On Windows:
(1): download the propy package (.zip)
(2): extract or uncompress the .zip file
(3): cd propy-1.0
(4): python setup.py install

On Linux:
(1): download the propy package (.tar.gz)
(2): tar -zxf propy-1.0.tar.gz
(3): cd propy-1.0
(4): python setup.py install or sudo python setup.py install

1.3 Download proteins from Uniprot
You can get a protein sequence from the Uniprot website by providing a Uniprot ID.
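As a rough illustration of what such a download involves, the sketch below fetches a FASTA record and extracts the sequence. It is not propy's own code: the endpoint `https://rest.uniprot.org/uniprotkb/<ID>.fasta` is assumed from UniProt's current public REST service, and the helper names `parse_fasta_sequence` and `fetch_uniprot_sequence` are hypothetical.

```python
from urllib.request import urlopen

# Assumed endpoint: UniProt's public REST service (not taken from this guide).
UNIPROT_FASTA_URL = "https://rest.uniprot.org/uniprotkb/{uniprot_id}.fasta"


def parse_fasta_sequence(fasta_text):
    """Return the sequence of the first record in a FASTA string."""
    residues = []
    seen_header = False
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if seen_header:
                break  # a second record starts here; keep only the first
            seen_header = True
            continue
        residues.append(line)
    return "".join(residues)


def fetch_uniprot_sequence(uniprot_id, timeout=30):
    """Download one FASTA record (needs network access) and return its sequence."""
    url = UNIPROT_FASTA_URL.format(uniprot_id=uniprot_id)
    with urlopen(url, timeout=timeout) as response:
        return parse_fasta_sequence(response.read().decode("utf-8"))
```

Keeping the parsing step separate from the download makes it possible to check the parser offline on a saved FASTA file.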
You can get all sub-sequences of length 2*window+1 whose central residue is the given amino acid ToAA.
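The windowing idea can be sketched in a few lines of plain Python. This is an illustrative stand-in rather than propy's implementation; the function name `get_subsequences` is hypothetical, and dropping occurrences that sit too close to either end of the sequence to have a full window is an assumption.

```python
def get_subsequences(sequence, to_aa="S", window=3):
    """Return every window of length 2*window+1 centred on `to_aa`."""
    seq = sequence.upper()
    to_aa = to_aa.upper()
    windows = []
    for i, residue in enumerate(seq):
        # Assumption: skip centres that do not have `window` residues on both sides.
        has_full_window = i - window >= 0 and i + window < len(seq)
        if residue == to_aa and has_full_window:
            windows.append(seq[i - window:i + window + 1])
    return windows


print(get_subsequences("MKTSAYLSPRG", to_aa="S", window=2))
```

With `window=2` each returned sub-sequence has 2*2+1 = 5 residues, with the chosen amino acid in the middle position.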
You can also get several protein sequences by providing a file containing Uniprot IDs of these proteins.
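A minimal sketch of the batch case, assuming the file holds one Uniprot ID per line (the guide does not specify the file layout). The helper names are hypothetical, and the download step is passed in as a function so that any single-sequence downloader can be plugged in.

```python
def parse_id_lines(lines):
    """Return the non-empty, whitespace-stripped IDs from an iterable of lines."""
    return [line.strip() for line in lines if line.strip()]


def read_uniprot_ids(path):
    """Read IDs from a text file, assuming one ID per line."""
    with open(path, encoding="utf-8") as handle:
        return parse_id_lines(handle)


def fetch_sequences(uniprot_ids, fetch_one):
    """Map each ID to the sequence returned by `fetch_one(uniprot_id)`."""
    return {uniprot_id: fetch_one(uniprot_id) for uniprot_id in uniprot_ids}
```

Here `fetch_one` stands for whatever function downloads a single sequence for a given ID.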
You can check whether the input sequence is a valid protein sequence.
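A simple validity check can be sketched as follows. It assumes that "valid" means a non-empty sequence made up only of the 20 standard amino acid letters, compared case-insensitively; propy's own rule and return value may differ, and `is_valid_protein` is a hypothetical name.

```python
STANDARD_AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")


def is_valid_protein(sequence):
    """True if the sequence is non-empty and uses only the 20 standard letters."""
    seq = sequence.strip().upper()
    return len(seq) > 0 and all(residue in STANDARD_AMINO_ACIDS for residue in seq)


print(is_valid_protein("MKTAYIAKQR"))  # only standard letters
print(is_valid_protein("MKTXB"))       # X and B are not standard letters
```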
You can get the properties of amino acids from the AAindex database by providing a property name (e.g., KRIW790103). The output is a dictionary.
If the user provides the directory containing the AAindex database, the program reads the given database to get the property. The AAindex database can be downloaded from ftp://ftp.genome.jp/pub/db/community/aaindex/ and consists of three files: aaindex1, aaindex2 and aaindex3.
Note that the propy package already includes the AAindex database. The GetAAIndex1 method in AAIndex gets the property from the aaindex1 database.
If the user does not provide the directory containing the AAindex database, the program downloads the three databases (i.e., aaindex1, aaindex2 and aaindex3) to obtain the property. Note that the downloaded AAindex files are saved in the current directory by default. You can also specify a different directory.
The downloaded databases are saved on the F disk. The GetAAIndex23 method in AAIndex gets the property from the aaindex2 and aaindex3 databases.
There are two ways to calculate protein descriptors in the propy package. One is to call the corresponding functions directly; the other is to first construct a GetProDes object and then call its methods to obtain the protein descriptors. Note that the output is a dictionary whose keys and values are the descriptor names and the descriptor values, respectively, so the meaning of each descriptor is clear to the user.
Use functions:
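To illustrate the function-style approach and the dictionary output described above, here is a self-contained sketch of an amino acid composition descriptor. It is a stand-in written for this guide, not propy's function: the name `amino_acid_composition`, the percentage scale, and the rounding to three decimals are all assumptions.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return {residue: percentage of the sequence} for the 20 amino acids."""
    seq = sequence.upper()
    length = len(seq)
    if length == 0:
        raise ValueError("sequence is empty")
    # Keys are descriptor names, values are descriptor values.
    return {aa: round(seq.count(aa) / length * 100, 3) for aa in AMINO_ACIDS}


composition = amino_acid_composition("AAGC")
print(composition["A"], composition["G"], composition["W"])
```

The result always has 20 entries, one per amino acid, including those that do not occur in the sequence.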
Changing the values of lamda and weight gives different PAAC values. Note that the number of PAAC descriptors depends on the choice of lamda: the total is 20+lamda, so lamda = 10 gives 20+10 = 30 PAAC descriptors.
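The role of lamda and weight can be seen in a compact sketch of Chou's type I pseudo amino acid composition. This is an illustration under stated assumptions, not propy's code: the function names and the `PAAC1` ... `PAAC30` key naming are hypothetical, the values are left as fractions that sum to 1 (propy may scale them differently), and the property table used in the demonstration is a placeholder, not real physicochemical data.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def standardize(prop):
    """Scale a {residue: value} table to zero mean and unit population std."""
    values = [prop[aa] for aa in AMINO_ACIDS]
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return {aa: (prop[aa] - mean) / std for aa in AMINO_ACIDS}


def pseudo_aac(sequence, properties, lamda=10, weight=0.05):
    """Return 20 + lamda values; expects standard amino acid letters only."""
    seq = sequence.upper()
    n = len(seq)
    if n <= lamda:
        raise ValueError("sequence must be longer than lamda")
    props = [standardize(p) for p in properties]

    # theta[k-1]: mean squared property difference between residues k apart.
    theta = []
    for k in range(1, lamda + 1):
        total = 0.0
        for i in range(n - k):
            diffs = [(p[seq[i]] - p[seq[i + k]]) ** 2 for p in props]
            total += sum(diffs) / len(diffs)
        theta.append(total / (n - k))

    freqs = [seq.count(aa) / n for aa in AMINO_ACIDS]
    denom = sum(freqs) + weight * sum(theta)
    result = {"PAAC%d" % (i + 1): f / denom for i, f in enumerate(freqs)}
    for k, t in enumerate(theta, start=1):
        result["PAAC%d" % (20 + k)] = weight * t / denom
    return result


# Placeholder property table (residue index as value), for illustration only.
PLACEHOLDER = {aa: float(i) for i, aa in enumerate(AMINO_ACIDS)}
descriptors = pseudo_aac(AMINO_ACIDS * 2, [PLACEHOLDER], lamda=10, weight=0.05)
print(len(descriptors))
```

The first 20 values reflect the ordinary composition and the last lamda values carry the sequence-order terms; a larger weight shifts more of the total towards the sequence-order terms.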
Example 4: Calculating all protein descriptors
The GetProDes class includes a built-in method which can calculate all protein descriptors.
The user can provide a property as a Python dictionary, and propy then calculates the descriptors based on this user-defined property.
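As a hedged illustration of a descriptor driven by a user-defined property, the sketch below computes a Moreau-Broto-style autocorrelation from a `{residue: value}` dictionary. It is not propy's implementation: the function name is hypothetical, the standardization step is an assumption, and the property table is a placeholder rather than real data.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def moreau_broto_autocorrelation(sequence, prop, max_lag=3):
    """Return {lag: mean of P(i) * P(i + lag)} for lag = 1..max_lag."""
    # Standardize the user-supplied property over the 20 amino acids.
    values = [prop[aa] for aa in AMINO_ACIDS]
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    scaled = {aa: (prop[aa] - mean) / std for aa in AMINO_ACIDS}

    seq = sequence.upper()
    n = len(seq)
    if n <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    result = {}
    for lag in range(1, max_lag + 1):
        total = sum(scaled[seq[i]] * scaled[seq[i + lag]] for i in range(n - lag))
        result[lag] = total / (n - lag)
    return result


# Placeholder user-defined property (residue index as value), illustration only.
user_property = {aa: float(i) for i, aa in enumerate(AMINO_ACIDS)}
print(moreau_broto_autocorrelation("MKTAYIAKQRQISFVK", user_property, max_lag=3))
```

Any dictionary that covers the 20 standard amino acids can be passed as `prop`, which is the point of a user-defined property.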
Example 6:
A powerful feature of propy is that it can easily calculate thousands of protein features by automatically obtaining the needed properties from AAindex.
Feature group                                       Number of descriptors
Amino acid, dipeptide and tripeptide composition    20, 400, 8000
Autocorrelation                                     240 (a), 240 (a), 240 (a)
CTD                                                 21, 21, 105
Quasi-sequence order                                60, 100
Pseudo amino acid composition                       50 (b), 50 (c)

a The number depends on the number of amino acid properties chosen and on the maximum lag. The default is to use eight types of properties and lag = 30.
b The number depends on the number of sets of amino acid properties chosen and on the lamda value. The default is to use the three types of properties proposed by Chou et al. and lamda = 30.
c The number depends on the choice of the lamda value. The default is lamda = 30.