Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
538 views
in Technique[技术] by (71.8m points)

python - Calculating the distance between atomic coordinates

I have a text file as shown below

ATOM    920  CA  GLN A 203      39.292 -13.354  17.416  1.00 55.76           C 
ATOM    929  CA  HIS A 204      38.546 -15.963  14.792  1.00 29.53           C
ATOM    939  CA  ASN A 205      39.443 -17.018  11.206  1.00 54.49           C  
ATOM    947  CA  GLU A 206      41.454 -13.901  10.155  1.00 26.32           C
ATOM    956  CA  VAL A 207      43.664 -14.041  13.279  1.00 40.65           C 
.
.
.

ATOM    963  CA  GLU A 208      45.403 -17.443  13.188  1.00 40.25           C  

I would like to calculate the distance between two alpha carbon atoms i.e calculate the distance between first and second atom and then between second and third atom and so on..... The distance between two atoms can be expressed as:distance = sqrt((x1-x2)^2+(y1-y2)^2+(z1-z2)^2) .

The columns 7,8 and 9 represents x,y and z co-ordinates respectively.I need to print the distance and the corresponding residue pairs(column 4) as shown below.(the values of distance are not real)

GLN-HIS   4.5
HIS-ASN   3.2
ASN-GLU   2.5

How can I do this calculation with perl or python?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Do NOT split on whitespace

The other answers given here make a flawed assumption - that coordinates will be space-delimited. Per the PDB specification of ATOM, this is not necessarilly the case: PDB record values are specified by column indices, and may flow into one another. For instance, your first ATOM record reads:

ATOM    920  CA  GLN A 203      39.292 -13.354  17.416  1.00 55.76           C 

But this is perfectly valid as well:

ATOM    920  CA  GLN A 203      39.292-13.3540  17.416  1.00 55.76           C 

The better approach

Because of the column-specified indices, and the number of other problems that can occur in a PDB file, you should not write your own parser. The PDB format is messy, and there's a lot of special cases and badly formatted files to handle. Instead, use a parser that's already written for you.

I like Biopython's PDB.PDBParser. It will parse the structure for you into Python objects, complete with handy features. If you prefer Perl, check out BioPerl.

PDB.Residue objects allow keyed access to Atoms by name, and PDB.Atom objects overload the - operator to return distance between two Atoms. We can use this to write clean, concise code:

Code

from Bio import PDB
parser = PDB.PDBParser()

# Parse the structure into a PDB.Structure object
pdb_code = "1exm"
pdb_path = "pdb1exm.ent"
struct = parser.get_structure(pdb_code, pdb_path)

# Grab the first two residues from the structure
residues = struct.get_residues()
res_one = residues.next()
res_two = residues.next()

try:
    alpha_dist = res_one['CA'] - res_two['CA']
except KeyError:
    print "Alpha carbon missing, computing distance impossible!"

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...