PyProtein module
A class used for computing different types of protein descriptors.
You can freely use and distribute it. If you have any problems,
please contact us.
Authors: Zhijiang Yao and Dongsheng Cao.
Date: 2016.06.04
Email: gadsby@163.com
-
class PyProtein.PyProtein(ProteinSequence='')
This GetProDes class aims at collecting all descriptor calculation modules into a simple class.
-
AALetter = ['A', 'R', 'N', 'D', 'C', 'E', 'Q', 'G', 'H', 'I', 'L', 'K', 'M', 'F', 'P', 'S', 'T', 'W', 'Y', 'V']
-
GetAAindex1(name, path='.')
Get the amino acid property values from aaindex1.
Usage:
result = GetAAindex1(name)
Input: name is the name of the amino acid property (e.g., KRIW790103).
Output: result is a dict containing the property values of the 20 amino acids.
-
GetAAindex23(name, path='.')
Get the amino acid property values from aaindex2 and aaindex3.
Usage:
result = GetAAindex23(name)
Input: name is the name of the amino acid property (e.g., TANS760101, GRAR740104).
Output: result is a dict containing the property values of the 400 amino acid pairs.
-
GetAPAAC(lamda=10, weight=0.5)
Amphiphilic (Type II) pseudo amino acid composition descriptors (default is 30).
Usage:
result = GetAPAAC(lamda=10, weight=0.5)
The lamda factor reflects the rank of correlation and is a non-negative integer, such as 15.
Note that (1) lamda should NOT be larger than the length of the input protein sequence;
(2) lamda must be a non-negative integer, such as 0, 1, 2, ...; (3) when lamda = 0, the
output of the PseAA server is the 20-D amino acid composition.
The weight factor lets users put weight on the additional PseAA components
with respect to the conventional AA components. The user can select any value within the
range from 0.05 to 0.7 for the weight factor.
-
GetGearyAutop(AAP={}, AAPName='p')
Geary autocorrelation descriptors for the given property (30 descriptors).
Usage:
result = GetGearyAutop(AAP={}, AAPName='p')
AAP is a dict containing the physicochemical properties of the 20 amino acids.
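As a rough illustration of what one Geary autocorrelation value measures, the sketch below evaluates the commonly published formula for a toy property. It is not necessarily how GetGearyAutop is implemented; `geary_autocorrelation` and the two-letter `toy_property` dict are hypothetical, and a real AAP dict would cover all 20 amino acids.

```python
def geary_autocorrelation(sequence, prop, maxlag):
    """Return [C(1), ..., C(maxlag)] for one property (hypothetical helper)."""
    values = [prop[aa] for aa in sequence]
    n = len(values)
    mean = sum(values) / n
    # Sample variance of the property along the sequence.
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    result = []
    for d in range(1, maxlag + 1):
        # Sum of squared differences between residues d positions apart.
        squared = sum((values[i] - values[i + d]) ** 2 for i in range(n - d))
        result.append(squared / (2 * (n - d)) / variance)
    return result


toy_property = {"A": 1.0, "C": -1.0}  # illustrative values only
geary = geary_autocorrelation("ACAC", toy_property, maxlag=2)
```

For the alternating toy sequence, the lag-1 value is large because neighbours always differ, while the lag-2 value is zero because residues two positions apart are identical.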
-
GetMoranAutop(AAP={}, AAPName='p')
Moran autocorrelation descriptors for the given property (30 descriptors).
Usage:
result = GetMoranAutop(AAP={}, AAPName='p')
AAP is a dict containing the physicochemical properties of the 20 amino acids.
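A minimal sketch of the Moran autocorrelation for a single property, following the commonly published formula rather than the internals of GetMoranAutop. The function name `moran_autocorrelation` and the toy property dict are assumptions made for illustration; a real AAP dict would hold values for all 20 amino acids.

```python
def moran_autocorrelation(sequence, prop, maxlag):
    """Return [I(1), ..., I(maxlag)] for one property (hypothetical helper)."""
    values = [prop[aa] for aa in sequence]
    n = len(values)
    mean = sum(values) / n
    # Population variance of the property along the sequence.
    variance = sum((v - mean) ** 2 for v in values) / n
    result = []
    for d in range(1, maxlag + 1):
        # Average product of mean-centred values d positions apart.
        covariance = sum(
            (values[i] - mean) * (values[i + d] - mean) for i in range(n - d)
        ) / (n - d)
        result.append(covariance / variance)
    return result


toy_property = {"A": 1.0, "C": -1.0}  # illustrative values only
moran = moran_autocorrelation("ACAC", toy_property, maxlag=2)
```

With this alternating sequence, adjacent residues are perfectly anti-correlated (lag 1 gives -1.0) and residues two positions apart are perfectly correlated (lag 2 gives 1.0).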
-
GetMoreauBrotoAuto()
Normalized Moreau-Broto autocorrelation descriptors (240 descriptors).
Usage:
result = GetMoreauBrotoAuto()
-
GetMoreauBrotoAutop(AAP={}, AAPName='p')
Normalized Moreau-Broto autocorrelation descriptors for the given property (30 descriptors).
Usage:
result = GetMoreauBrotoAutop(AAP={}, AAPName='p')
AAP is a dict containing the physicochemical properties of the 20 amino acids.
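The normalized Moreau-Broto autocorrelation at lag d is usually defined as the average product of property values d positions apart. The sketch below shows that definition on toy data; `moreau_broto_autocorrelation` and the property values are hypothetical, and the library may standardize the property values or round the results differently.

```python
def moreau_broto_autocorrelation(sequence, prop, maxlag):
    """Return the normalized Moreau-Broto values for lags 1..maxlag (hypothetical helper)."""
    values = [prop[aa] for aa in sequence]
    n = len(values)
    return [
        # Sum of products at lag d, normalized by the number of pairs.
        sum(values[i] * values[i + d] for i in range(n - d)) / (n - d)
        for d in range(1, maxlag + 1)
    ]


toy_property = {"A": 2.0, "C": 1.0}  # illustrative values only
moreau_broto = moreau_broto_autocorrelation("AACC", toy_property, maxlag=2)
```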
-
GetPAAC(lamda=10, weight=0.05)
Type I pseudo amino acid composition descriptors (default is 30).
Usage:
result = GetPAAC(lamda=10, weight=0.05)
The lamda factor reflects the rank of correlation and is a non-negative integer, such as 15.
Note that (1) lamda should NOT be larger than the length of the input protein sequence;
(2) lamda must be a non-negative integer, such as 0, 1, 2, ...; (3) when lamda = 0, the
output of the PseAA server is the 20-D amino acid composition.
The weight factor lets users put weight on the additional PseAA components
with respect to the conventional AA components. The user can select any value within the
range from 0.05 to 0.7 for the weight factor.
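To make the roles of lamda and weight concrete, here is a minimal sketch of the Type I pseudo amino acid composition as it is commonly published, using a toy two-letter alphabet. The helper `pseudo_aac`, its `alphabet` argument and the property values are assumptions for illustration only; the library's own scaling (for example percentages instead of fractions) may differ.

```python
def pseudo_aac(sequence, props, lamda, weight, alphabet):
    """Return len(alphabet) + lamda values (hypothetical helper)."""
    n = len(sequence)
    if lamda >= n:
        raise ValueError("lamda must be smaller than the sequence length")

    def theta(k):
        # Rank-k correlation: mean squared property difference at lag k,
        # averaged over all supplied properties.
        total = 0.0
        for i in range(n - k):
            a, b = sequence[i], sequence[i + k]
            total += sum((p[a] - p[b]) ** 2 for p in props) / len(props)
        return total / (n - k)

    thetas = [theta(k) for k in range(1, lamda + 1)]
    freqs = [sequence.count(aa) / n for aa in alphabet]
    denominator = sum(freqs) + weight * sum(thetas)
    # Conventional composition terms followed by the weighted correlation terms.
    return [f / denominator for f in freqs] + [weight * t / denominator for t in thetas]


toy_props = [{"A": 1.0, "C": -1.0}]  # illustrative values only
paac = pseudo_aac("ACAC", toy_props, lamda=2, weight=0.05, alphabet="AC")
plain_composition = pseudo_aac("ACAC", toy_props, lamda=0, weight=0.05, alphabet="AC")
```

With lamda = 0 the sketch reduces to the plain amino acid composition, which mirrors note (3) above; a larger weight shifts more of the normalized total onto the correlation terms.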
-
GetPAACp(lamda=10, weight=0.05, AAP=[])
Type I pseudo amino acid composition descriptors for the given properties (default is 30).
Usage:
result = GetPAACp(lamda=10, weight=0.05, AAP=[])
The lamda factor reflects the rank of correlation and is a non-negative integer, such as 15.
Note that (1) lamda should NOT be larger than the length of the input protein sequence;
(2) lamda must be a non-negative integer, such as 0, 1, 2, ...; (3) when lamda = 0, the
output of the PseAA server is the 20-D amino acid composition.
The weight factor lets users put weight on the additional PseAA components
with respect to the conventional AA components. The user can select any value within the
range from 0.05 to 0.7 for the weight factor.
AAP is a list of properties, each of which is a dict.
-
GetQSO(maxlag=30, weight=0.1)
Quasi-sequence-order descriptors (default is 50).
Usage:
result = GetQSO(maxlag=30, weight=0.1)
maxlag is the maximum lag; the length of the protein should be larger
than maxlag. The default is 30.
-
GetQSOp(maxlag=30, weight=0.1, distancematrix={})
Quasi-sequence-order descriptors for the given distance matrix (default is 50).
Usage:
result = GetQSOp(maxlag=30, weight=0.1, distancematrix={})
maxlag is the maximum lag; the length of the protein should be larger
than maxlag. The default is 30.
distancematrix is a dict containing 400 distance values.
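The following sketch shows how quasi-sequence-order values are usually derived from a distance matrix, using the commonly published definition on a toy three-letter alphabet. The helper `quasi_sequence_order`, the two-letter key format of `toy_distance` and the `alphabet` argument are all assumptions; the actual key format and scaling used by the library are not stated here.

```python
def quasi_sequence_order(sequence, distance, maxlag, weight, alphabet):
    """Return len(alphabet) + maxlag values (hypothetical helper)."""
    n = len(sequence)
    if n <= maxlag:
        raise ValueError("sequence length must be larger than maxlag")
    # Coupling number at lag d: sum of squared distances between residues d apart.
    taus = [
        sum(distance[sequence[i] + sequence[i + d]] ** 2 for i in range(n - d))
        for d in range(1, maxlag + 1)
    ]
    freqs = [sequence.count(aa) / n for aa in alphabet]
    denominator = sum(freqs) + weight * sum(taus)
    # Composition terms first, then the weighted coupling terms.
    return [f / denominator for f in freqs] + [weight * t / denominator for t in taus]


# Toy distances keyed by residue pairs such as "AC" (assumed key format).
positions = {"A": 0, "C": 1, "D": 3}
toy_distance = {a + b: abs(positions[a] - positions[b]) for a in positions for b in positions}
qso = quasi_sequence_order("ACDA", toy_distance, maxlag=2, weight=0.1, alphabet="ACD")
```

In this formulation the output has one value per alphabet letter plus one per lag, and the values sum to 1.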
-
GetSOCN(maxlag=45)
Sequence-order-coupling numbers (default is 45).
Usage:
result = GetSOCN(maxlag=45)
maxlag is the maximum lag; the length of the protein should be larger
than maxlag. The default is 45.
-
GetSOCNp(maxlag=45, distancematrix={})
Sequence-order-coupling numbers for the given distance matrix (default is 45).
Usage:
result = GetSOCNp(maxlag=45, distancematrix={})
maxlag is the maximum lag; the length of the protein should be larger
than maxlag. The default is 45.
distancematrix is a dict containing 400 distance values.
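A sequence-order-coupling number at lag d is commonly defined as the sum of squared distances between all residue pairs d positions apart. The sketch below illustrates this with a toy distance dict; `coupling_numbers` is a hypothetical helper, and the two-letter key format (for example "AC") is an assumption about how such a dict might be organized.

```python
def coupling_numbers(sequence, distance, maxlag):
    """Return the coupling numbers for lags 1..maxlag (hypothetical helper)."""
    n = len(sequence)
    if n <= maxlag:
        raise ValueError("sequence length must be larger than maxlag")
    return [
        # Sum of squared pair distances at lag d.
        sum(distance[sequence[i] + sequence[i + d]] ** 2 for i in range(n - d))
        for d in range(1, maxlag + 1)
    ]


# Toy distances on a three-letter alphabet (illustrative values only).
positions = {"A": 0, "C": 1, "D": 3}
toy_distance = {a + b: abs(positions[a] - positions[b]) for a in positions for b in positions}
coupling = coupling_numbers("ACDA", toy_distance, maxlag=3)
```

The length check reflects the requirement that the protein be longer than maxlag; otherwise the largest lag would have no residue pairs to sum over.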
-
GetSubSeq(ToAA='S', window=3)
Obtain the sub-sequences with length 2*window+1, whose central point is ToAA.
Usage:
result = GetSubSeq(ToAA='S', window=3)
ToAA is the central (query point) amino acid in the sub-sequence.
window is the span on each side of the central amino acid.
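A minimal sketch of the windowing idea, assuming that central residues too close to either end (without a full window on both sides) are skipped; whether GetSubSeq treats the edges this way is not stated here. The function `sub_sequences` is a hypothetical stand-in, not the library method.

```python
def sub_sequences(sequence, to_aa="S", window=3):
    """Return every sub-sequence of length 2*window+1 centred on to_aa."""
    result = []
    for i, residue in enumerate(sequence):
        # Keep only centres that have a full window on both sides (assumption).
        if residue == to_aa and i - window >= 0 and i + window < len(sequence):
            result.append(sequence[i - window : i + window + 1])
    return result


windows = sub_sequences("AAASAAAKSAAAA", to_aa="S", window=3)
```

Each returned string has 7 residues with the query amino acid at index 3.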
-
GetTriad()
Calculate the conjoint triad features from the protein sequence.
Usage:
res = GetTriad()
Output is a dict containing all 343 conjoint triad features.
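The 343 features correspond to all triples over 7 residue classes (7 × 7 × 7). The sketch below counts such triples using the seven-class grouping commonly used for the conjoint triad method; the key format ("111" to "777"), the use of raw counts and the helper name `conjoint_triad` are assumptions and may not match the dict returned by GetTriad.

```python
from itertools import product

# Seven residue classes commonly used for conjoint triad features.
CLASSES = {1: "AGV", 2: "ILFP", 3: "YMTS", 4: "HNQW", 5: "RK", 6: "DE", 7: "C"}
AA_TO_CLASS = {aa: str(label) for label, members in CLASSES.items() for aa in members}


def conjoint_triad(sequence):
    """Count every class triple along the sequence (hypothetical helper)."""
    # Replace each residue by its class label, e.g. "AGVC" becomes "1117".
    encoded = "".join(AA_TO_CLASS[aa] for aa in sequence)
    counts = {"".join(triple): 0 for triple in product("1234567", repeat=3)}
    for i in range(len(encoded) - 2):
        counts[encoded[i : i + 3]] += 1
    return counts


triads = conjoint_triad("AGVC")
```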
-
Version = 1.0
-