GetProtein module
Created on Sat Jul 13 11:18:26 2013
This module downloads PDB files from the RCSB PDB website and extracts
their amino acid sequences. It can also download protein sequences from
the UniProt website (http://www.uniprot.org/). You only need to supply a
protein ID or prepare a file (ID.txt) listing the IDs. The retrieved
protein sequences are saved to a .txt file (ProteinSequence.txt).
Authors: Zhijiang Yao and Dongsheng Cao.
Date: 2016.06.04
Email: gadsby@163.com
- GetProtein.GetPDB(pdbidlist=[])
Download PDB files from the PDB FTP server, given a list of PDB IDs.
- GetProtein.GetProteinSequence(ProteinID)
Get the protein sequence from the UniProt website by ID.
Usage:
result=GetProteinSequence(ProteinID)
Input: ProteinID is a string giving the protein ID, such as “P48039”.
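As a rough illustration of what a lookup by ID involves, the sketch below fetches a FASTA record over HTTP and strips its header line. It is not the module's implementation: the URL pattern (UniProt's current REST endpoint) and the names `fasta_text_to_sequence` and `fetch_protein_sequence` are assumptions made for this example.

```python
import urllib.request

# Assumed URL pattern; the endpoint actually used by GetProtein is not shown here.
UNIPROT_FASTA_URL = "https://rest.uniprot.org/uniprotkb/{id}.fasta"


def fasta_text_to_sequence(text):
    """Drop '>' header lines and blank lines, then join the residue lines."""
    lines = [line.strip() for line in text.splitlines()]
    return "".join(line for line in lines if line and not line.startswith(">"))


def fetch_protein_sequence(protein_id, urlopen=urllib.request.urlopen):
    """Hypothetical stand-in for GetProteinSequence: one HTTP GET per ID.

    `urlopen` is injectable so the function can be exercised offline.
    """
    url = UNIPROT_FASTA_URL.format(id=protein_id)
    with urlopen(url) as response:
        text = response.read().decode("utf-8")
    return fasta_text_to_sequence(text)
```

Calling `fetch_protein_sequence("P48039")` would need network access; passing a fake `urlopen` keeps the parsing logic testable without it.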
- GetProtein.GetProteinSequenceFromTxt(path, openfile, savefile)
Get the protein sequences from the UniProt website for the IDs listed in a file.
Usage:
result=GetProteinSequenceFromTxt(path,openfile,savefile)
Input: path is the directory containing the ID file, such as “/home/orient/protein/”
openfile is the ID file, such as “proteinID.txt”
savefile is the file in which the retrieved protein sequences are saved
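The batch workflow can be sketched as follows. This is a minimal illustration, not the module's code. It assumes one ID per line in the ID file, that the output file is written to the same directory, and that each sequence goes on its own line. The name `sequences_from_id_file` and the injected `fetch` callable are hypothetical.

```python
import os


def sequences_from_id_file(path, openfile, savefile, fetch):
    """Read IDs from path/openfile and write one sequence per line to path/savefile.

    `fetch` is any callable mapping an ID string to a sequence string
    (hypothetical; for instance a wrapper around GetProteinSequence).
    Returns the number of sequences written.
    """
    # Assumption: one ID per line; blank lines are skipped.
    with open(os.path.join(path, openfile)) as handle:
        ids = [line.strip() for line in handle if line.strip()]

    # Assumption: the output file lives in the same directory as the ID file.
    with open(os.path.join(path, savefile), "w") as out:
        for protein_id in ids:
            out.write(fetch(protein_id) + "\n")
    return len(ids)
```

Passing the fetch step as a parameter keeps the file handling separate from the network access, so the former can be checked with a dummy function.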
- GetProtein.IsFasta(seq)
Judge whether the Seq object is in FASTA format. Three situations are checked:
1. No sequence name.
2. The sequence name is illegal.
3. No sequence.
Parameters: seq – Seq object.
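The three checks might look like the sketch below. The `Seq` record here is a hypothetical stand-in with `name` and `seq` fields, and treating a name that contains '>' as illegal is an assumption; the page does not define what makes a name illegal.

```python
from collections import namedtuple

# Hypothetical stand-in for the module's Seq object.
Seq = namedtuple("Seq", ["name", "seq"])


def is_fasta(record):
    """Return False if any of the three listed situations applies."""
    if not record.name:           # 1. no sequence name
        return False
    if ">" in record.name:        # 2. illegal name (assumed rule)
        return False
    if len(record.seq) == 0:      # 3. no sequence
        return False
    return True
```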
- GetProtein.ReadFasta(f)
Read a FASTA file.
Parameters: f – handle to the input, e.g. sys.stdin or open(<file>).
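For readers unfamiliar with the format, a handle-based FASTA reader can be sketched as a generator that starts a new record at each '>' line. The function name `read_fasta` and the choice to yield (name, sequence) tuples are assumptions for this example; the real ReadFasta may return a different type.

```python
def read_fasta(handle):
    """Yield (name, sequence) pairs from any iterable of text lines."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue                      # ignore blank lines
        if line.startswith(">"):
            if name is not None:          # emit the record just finished
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)           # sequence may span several lines
    if name is not None:                  # emit the last record
        yield name, "".join(chunks)
```

Because it only iterates over lines, the same function works with `sys.stdin`, an open file, or an in-memory `io.StringIO`.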
- exception GetProtein.TimeoutException
Bases: exceptions.Exception
- GetProtein.pdbDownload(file_list, hostname='ftp.wwpdb.org', directory='/pub/pdb/data/structures/all/pdb/', prefix='pdb', suffix='.ent.gz')
Download all pdb files in file_list and unzip them.
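The default arguments suggest that each remote file name is built as prefix + ID + suffix. A minimal sketch of that download-and-unzip loop with the standard library is given below. The helper names, the lower-casing of IDs, and the local output name `<id>.pdb` are assumptions, and whether the host still accepts anonymous FTP connections has not been verified here.

```python
import gzip
from ftplib import FTP


def remote_name(pdb_id, prefix="pdb", suffix=".ent.gz"):
    """Build the archive file name, e.g. '1abc' -> 'pdb1abc.ent.gz'."""
    # Assumption: IDs are lower-cased to match the archive's file names.
    return prefix + pdb_id.lower() + suffix


def download_and_unzip(pdb_ids, hostname="ftp.wwpdb.org",
                       directory="/pub/pdb/data/structures/all/pdb/",
                       prefix="pdb", suffix=".ent.gz"):
    """Hypothetical stand-in for pdbDownload; returns the local file names."""
    written = []
    with FTP(hostname) as ftp:
        ftp.login()                       # anonymous login
        ftp.cwd(directory)
        for pdb_id in pdb_ids:
            chunks = []
            ftp.retrbinary("RETR " + remote_name(pdb_id, prefix, suffix),
                           chunks.append)
            local = pdb_id.lower() + ".pdb"   # assumed output name
            with open(local, "wb") as out:
                out.write(gzip.decompress(b"".join(chunks)))
            written.append(local)
    return written
```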