PyBioMed

A Python package for generating various molecular representations for chemicals, proteins, DNA and their interactions

Programming language:
Python
Running environment:
Python version == 2.7.x
License:
BSD 3-clause

Introduction

To develop a powerful model for prediction tasks, one of the most important things to consider is how to effectively represent the molecules under investigation, such as small molecules, proteins, DNA and even complex interactions, by a descriptor. With PyBioMed you can analyze and represent various kinds of complex molecular data. We hope that the package will be helpful when exploring questions concerning the structures, functions and interactions of various molecular data in the context of systems biology.

PyBioMed is a feature-rich Python package for characterizing various complex biological molecules and interaction samples, such as chemicals, proteins, DNA and their interactions. PyBioMed calculates nine types of features: chemical descriptors or molecular fingerprints, structural and physicochemical features of proteins and peptides from amino acid sequences, composition and physicochemical features of DNA from primary sequences, chemical-chemical interaction features, chemical-protein interaction features, chemical-DNA interaction features, protein-protein interaction features, protein-DNA interaction features, and DNA-DNA interaction features. PyBioMed can also pretreat molecular structures, protein sequences and DNA sequences. For the convenience of users, PyBioMed provides a module for retrieving molecular structures, protein sequences and DNA sequences from the Internet.


Features

  • Tools for pretreating molecules, protein sequences and DNA sequences
  • Calculating chemical descriptors or molecular fingerprints from molecular structures
  • Calculating structural and physicochemical features of proteins and peptides from amino acid sequences
  • Calculating composition and physicochemical features of DNA from primary sequences
  • Calculating interaction features, including chemical-chemical, chemical-protein, chemical-DNA, protein-protein, protein-DNA and DNA-DNA interaction features
  • Getting molecular structures, protein sequences and DNA sequences from the Internet through the molecular ID, protein ID and DNA ID
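
As an illustration of the pretreatment step listed above, the sketch below checks that a sequence contains only the expected letters before any descriptors are computed from it. It is plain Python and does not use PyBioMed; the function name `is_valid_sequence` and the two alphabets are assumptions made for this example, not the package's API.

```python
# Hypothetical helper, not part of PyBioMed: validate a raw sequence
# before computing descriptors from it.
from __future__ import print_function  # keeps the example valid on Python 2.7

PROTEIN_ALPHABET = set("ACDEFGHIKLMNPQRSTVWY")  # 20 standard amino acids
DNA_ALPHABET = set("ACGT")                      # 4 standard nucleotides


def is_valid_sequence(sequence, alphabet):
    """Return True if the sequence is non-empty and uses only letters in alphabet."""
    cleaned = sequence.strip().upper()
    return len(cleaned) > 0 and set(cleaned) <= alphabet


print(is_valid_sequence("MKTAYIAKQR", PROTEIN_ALPHABET))  # True
print(is_valid_sequence("ACGTNNAC", DNA_ALPHABET))        # False
```

Rejecting or cleaning such sequences early avoids silently computing features from ambiguous residues such as `N`.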

Function

PyBioMed lets users compute descriptors that characterize various complex biological molecules and interaction samples, such as chemicals, proteins, DNA and their interactions. For details, see the table; the number in parentheses is the number of features of each type.

Types and features
PyMolecule
  •     Constitution (30)
  •     Connectivity descriptors (44)
  •     Basak descriptors (21)
  •     Burden descriptors (64)
  •     Topology descriptors (35)
  •     Kappa descriptors (7)
  •     E-state descriptors (237)
  •     Moran autocorrelation descriptors (32)
  •     Geary autocorrelation descriptors (32)
  •     Molecular property descriptors (6)
  •     Moreau-Broto autocorrelation descriptors (32)
  •     Charge descriptors (25)
  •     MOE-type descriptors (60)
  •     CATS2D descriptors (150)
  •     Daylight-type fingerprints (2048)
  •     MACCS fingerprints (166)
  •     Atom pairs fingerprints (2048)
  •     Topological torsion fingerprints (2048)
  •     E-state fingerprints (79)
  •     FP2 fingerprints (1024)
  •     FP3 fingerprints (210)
  •     FP4 fingerprints (307)
  •     ECFP2 fingerprints (1024)
  •     ECFP4 fingerprints (1024)
  •     ECFP6 fingerprints (1024)
  •     Morgan fingerprints (1024)
  •     Ghose-Crippen fingerprints (110)
  •     FCFP2 fingerprints (1024)
  •     FCFP4 fingerprints (1024)
  •     FCFP6 fingerprints (1024)
  •     Pharm2D2point fingerprints (135)
  •     Pharm2D3point fingerprints (2135)
  •     PubChem fingerprints (881)
PyProtein
  •     Amino acid composition (20)
  •     Dipeptide composition (400)
  •     Tripeptide composition (8000)
  •     CTD composition (21)
  •     CTD transition (21)
  •     CTD distribution (105)
  •     M-B autocorrelation (240)
  •     Moran autocorrelation (240)
  •     Geary autocorrelation (240)
  •     Conjoint triad features (343)
  •     Quasi-sequence order descriptors (100)
  •     Sequence order coupling number (60)
  •     Pseudo amino acid composition 1 (50)
  •     Pseudo amino acid composition 2 (50)
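
To make the first two protein entries concrete, the sketch below computes an amino acid composition (20 values) and a dipeptide composition (400 values) from a sequence. It is a minimal plain-Python illustration, not PyBioMed's implementation: the function names are hypothetical, and the values are expressed as fractions, which is an assumption about scaling.

```python
from __future__ import division, print_function  # valid on Python 2.7 as well

from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard amino acids


def amino_acid_composition(sequence):
    """Fraction of each of the 20 amino acids in the sequence."""
    return {aa: sequence.count(aa) / len(sequence) for aa in AMINO_ACIDS}


def dipeptide_composition(sequence):
    """Fraction of each of the 400 ordered amino acid pairs."""
    # Overlapping windows of length 2: a sequence of n residues has n - 1 pairs.
    pairs = [sequence[i:i + 2] for i in range(len(sequence) - 1)]
    return {
        a + b: pairs.count(a + b) / len(pairs)
        for a, b in product(AMINO_ACIDS, repeat=2)
    }


seq = "ACDAACD"  # toy sequence
aac = amino_acid_composition(seq)
dpc = dipeptide_composition(seq)
print(len(aac), len(dpc))  # 20 400
```

The vector lengths follow directly from the alphabet size: 20 single residues and 20 × 20 ordered pairs.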
PyDNA
  •     Basic kmer (16)
  •     Reverse complement kmer (12)
  •     DAC (76)
  •     DCC (2812)
  •     DACC (2888)
  •     TAC (24)
  •     TCC (264)
  •     TACC (288)
  •     PseDNC (18)
  •     PseKNC (66)
  •     PC-PseDNC (18)
  •     PC-PseTNC (66)
  •     SC-PseDNC (92)
  •     SC-PseTNC (88)
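
The basic kmer entry can be illustrated with a short sketch: for k = 2 there are 4 × 4 = 16 possible dinucleotides, which matches the 16 features listed. The code below is a plain-Python illustration under that assumption, not PyBioMed's implementation; `kmer_frequencies` is a hypothetical name, and normalizing counts to frequencies is a choice made for this example.

```python
from __future__ import division, print_function  # valid on Python 2.7 as well

from itertools import product


def kmer_frequencies(sequence, k=2):
    """Frequency of every length-k word over A, C, G, T in the sequence."""
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(words, 0)
    # Slide a window of length k over the sequence, one position at a time.
    for i in range(len(sequence) - k + 1):
        word = sequence[i:i + k]
        if word in counts:  # skip words with non-standard letters such as N
            counts[word] += 1
    total = sum(counts.values())
    return [counts[w] / total if total else 0.0 for w in words]


features = kmer_frequencies("GACTGAACTGCA", k=2)
print(len(features))  # 16
```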
PyInteraction
  •     Feature type 1
  •     Feature type 2
  •     Feature type 3
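
Interaction features describe a pair of molecules rather than a single one. As a rough sketch, assuming the pair is represented by combining the two individual descriptor vectors, two common constructions are concatenation and the tensor (outer) product. The function names below are hypothetical, and which construction, if any, corresponds to each feature type in the table is not stated here.

```python
from __future__ import print_function  # valid on Python 2.7 as well


def concatenate_features(fa, fb):
    """Pair vector of length len(fa) + len(fb)."""
    return list(fa) + list(fb)


def tensor_product_features(fa, fb):
    """Pair vector of length len(fa) * len(fb): every product fa[i] * fb[j]."""
    return [a * b for a in fa for b in fb]


chem = [1.0, 0.5]       # toy descriptor vector for a chemical
prot = [2.0, 0.0, 4.0]  # toy descriptor vector for a protein

print(concatenate_features(chem, prot))     # [1.0, 0.5, 2.0, 0.0, 4.0]
print(tensor_product_features(chem, prot))  # [2.0, 0.0, 4.0, 1.0, 0.0, 2.0]
```

Concatenation keeps the feature count small, while the tensor product grows multiplicatively with the sizes of the two input vectors.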

Developed by

Zhi-Jiang Yao
Jie Dong




Related links

Software

Github (latest development): >>Click here

Documentation

PyBioMed Documentation.pdf

The introduction of descriptors

The introduction of molecular descriptors

>>Click here

The introduction of protein descriptors

>>Click here

The introduction of DNA descriptors

>>Click here

The introduction of interaction descriptors

>>Click here