Beta version

Summary

Chemical cross-linking of proteins or protein complexes, followed by mass spectrometry-based localization of the cross-linked amino acids, is a powerful method for generating distance restraints on the substrate’s topology. Xwalk was written to predict and validate these cross-links on existing protein structures. Xwalk calculates and displays non-linear distances between chemically cross-linked amino acids on protein surfaces, mimicking the flexibility and non-linearity of cross-linker molecules. It returns a Solvent Accessible Surface Distance, which is the length of the shortest path between two amino acids, where the path leads through solvent-occupied space without penetrating the protein surface.

Our Application Note on Xwalk is published in the journal Bioinformatics (see Credits).

Cross-links

The inclusion of experimentally determined distance restraints in the computation of protein structures and complex topologies has become a key technique for increasing the reliability of computational structure prediction 1,2. Chemical cross-links are a valuable source of such distance restraints 3,4. Of particular interest are cross-link modifications, in which the cross-linker molecule covalently connects a pair of peptides. If both peptides originate from the same protein chain, the cross-link is referred to as an intra-protein cross-link. In contrast, a cross-link between two peptides from distinct protein chains is called an inter-protein cross-link 3,5. Other frequent modifications include mono-links, where only one side of a bi-functional cross-linker molecule has reacted with the protein, and loop-links, where both sides have reacted with a single peptide. Xwalk does not distinguish between loop-links and intra-protein cross-links. You can get a list of all theoretically possible (virtual) intra- or inter-protein cross-links by running Xwalk in Production Mode.

Challenge

So far, distance restraints from cross-linking experiments have been employed in protein structure prediction as an upper limit on the Euclidean distance between a pair of cross-linked amino acids. With Xwalk you can also use the Euclidean distance to validate your cross-link data. Be aware, however, that the Euclidean distance is not a precise measure of whether two amino acids can be cross-linked. As a standard L2 norm, the Euclidean distance is the length of the vector that connects two points (here, any two atom centers of the cross-linked amino acids) in Cartesian space. Two such points on the protein surface are likely to be separated by molecular slopes and depressions, which form physical barriers for a cross-linker molecule. The cross-linker molecule cannot penetrate these barriers but must circumvent them to bridge two reactive amino acids covalently. A more precise representation of cross-linked amino acids on protein surfaces therefore requires a non-linear distance measure that accommodates the flexibility of the cross-linker molecule and the cross-linked side chains.
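As a small illustration of the Euclidean upper-limit check described above, the following Python sketch computes the straight-line distance between two atom centers and compares it with a maximum distance. The function names, the coordinates and the 30 Å limit are hypothetical and chosen only for this example; this is not Xwalk's own code.

```python
import math

def euclidean_distance(a, b):
    """Length of the straight line between two points (x, y, z) in Å."""
    return math.dist(a, b)

def within_upper_limit(a, b, maxdist):
    """True if the straight-line distance does not exceed maxdist."""
    return euclidean_distance(a, b) <= maxdist

# Hypothetical coordinates of two lysine NZ atoms (illustrative values only).
nz_a = (0.0, 0.0, 0.0)
nz_b = (3.0, 4.0, 12.0)

# Hypothetical upper limit in Å (cross-linker length plus side chain lengths).
print(within_upper_limit(nz_a, nz_b, 30.0))  # True
```

A pair that passes this check may still be impossible to cross-link if the straight line runs through the protein, which is why a path-based measure is needed.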

Implementation

Xwalk mimics the cross-linker molecule by calculating the shortest path between two amino acids on the protein surface while circumventing slopes and bridging depressions. It is essential that the path does not penetrate the protein surface anywhere along its length and leads only through solvent-occupied space. We refer to the length of this shortest path as the Solvent-Accessible-Surface Distance (SASD).

Algorithm

SASDs are calculated with the following breadth-first search algorithm, illustrated here for the ε-amino groups of lysine residues:

  1. Read in input data:
    1. xyz.pdb, spatial coordinates of a protein or protein complex in PDB format.
    2. maxdist: maximum distance of the path (i.e. the length of the cross-linker + AA sidechain length).
    3. listXL, a list of experimentally determined cross-linked lysine residues (Validation Mode).
  2. Remove all non-protein atoms in xyz.pdb and assign each protein atom a radius equal to the sum of its SURFNET atom radius and the solvent radius.
  3. Select a random lysine pair AAab from listXL.
  4. Check the Euclidean distance (Euc) of AAab. Continue if Euc does not exceed maxdist; otherwise disregard the pair and go back to step 3.
  5. Generate a local grid of size maxdist and grid spacing 1 Å centered at AAa.
  6. Set MAX_VALUE as distance for all grid cells and label grid cells as residing in the:
    1. protein
    2. solvent
    3. boundary between protein and solvent
  7. Label grid cells residing in AAab as solvent.
  8. Set distance dist = 0.0 for central grid cell of AAa and store grid cell in the active list listactive.
  9. Start breadth-first search. Iterate through listactive:
    1. Check that grid cell i is labeled as solvent.
    2. Find all immediate neighbors listneighbour.
    3. Iterate through listneighbour:
      1. Check that grid cell j is labeled as solvent.
      2. Compute new distance for grid cell j as the sum of the distance in grid cell i and the Euclidean distance between grid cell i and j.
      3. If the distance sum from step 9.3.2 is smaller than the current distance in grid cell j, store the distance sum as the new distance for grid cell j and add grid cell j to the new active list listnew_active.
      4. Stop the iteration of step 9.3 if grid cell j is the central grid cell of AAb.
  10. Go back to step 9. with listactive = listnew_active.
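
The search in steps 6 to 10 can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not Xwalk's implementation: the grid is a plain dictionary with only two labels (protein and solvent, no separate boundary label), neighbors are taken from the 26-cell neighborhood, and the names `make_grid` and `sasd` are hypothetical. Unlike step 9.3.4, the sketch does not stop when the target cell is first reached; it keeps iterating until no distance improves, which is slower but easier to verify.

```python
import math
from itertools import product

PROTEIN, SOLVENT = 0, 1

def make_grid(nx, ny, nz, protein_cells):
    """Label every cell of an nx*ny*nz grid as PROTEIN or SOLVENT."""
    return {c: (PROTEIN if c in protein_cells else SOLVENT)
            for c in product(range(nx), range(ny), range(nz))}

def sasd(labels, start, goal, spacing=1.0, maxdist=math.inf):
    """Shortest solvent-only path length from start to goal, or None."""
    # Assumption: 26-connected neighborhood (faces, edges and corners).
    offsets = [o for o in product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]
    dist = {cell: math.inf for cell in labels}  # step 6: MAX_VALUE everywhere
    dist[start] = 0.0                           # step 8
    active = [start]
    while active:                               # steps 9 and 10
        new_active = []
        for cell in active:
            if labels[cell] != SOLVENT:         # step 9.1
                continue
            for off in offsets:                 # steps 9.2 and 9.3
                nb = (cell[0] + off[0], cell[1] + off[1], cell[2] + off[2])
                if labels.get(nb) != SOLVENT:   # outside the grid or protein
                    continue
                step = spacing * math.sqrt(sum(o * o for o in off))
                cand = dist[cell] + step
                if cand > maxdist:              # path would exceed maxdist
                    continue
                if cand < dist[nb]:             # shorter path found
                    dist[nb] = cand
                    new_active.append(nb)
        active = new_active
    return dist[goal] if dist[goal] < math.inf else None

# Toy example: a one-layer grid with a protein wall between start and goal.
grid = make_grid(5, 3, 1, {(2, 0, 0), (2, 1, 0)})
d = sasd(grid, (0, 0, 0), (4, 0, 0))
print(round(d, 2))  # 5.66
```

In this toy grid the straight-line distance between the two cells is 4 Å, but the only solvent path detours around the wall, so the path length is about 5.66 Å. With `maxdist=5.0` the same call returns `None`, because no solvent path is short enough.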

Credits

Please refer to the following reference when you cite Xwalk:

Kahraman A., Malmström L., Aebersold R. (2011). Xwalk: Computing and Visualizing Distances in Cross-linking Experiments. Bioinformatics, doi:10.1093/bioinformatics/btr348.

System Requirements

  • xwalk.org was successfully tested on Safari v5, Firefox v3.6, Chrome v9 and Internet Explorer v8.
  • xwalk.org requires at least Java 1.4 to run the Jmol Applet. After having installed Java you might need to restart your operating system.
  • Some functionalities of xwalk.org require JavaScript and DOM.

References

  1. de Vries, S.J., Melquiond, A.S.J., Kastritis, P.L., Karaca, E., Bordogna, A., van Dijk, M., Rodrigues, J.P.G.L.M., and Bonvin, A.M.J.J. (2010). Strengths and weaknesses of data-driven docking in critical assessment of prediction of interactions. Proteins 78, 3242-3249.
  2. Lensink, M.F., and Wodak, S.J. (2010). Docking and scoring protein interactions: CAPRI 2009. Proteins 78, 3073-3084.
  3. Leitner, A., Walzthoeni, T., Kahraman, A., Herzog, F., Rinner, O., Beck, M., and Aebersold, R. (2010). Probing Native Protein Structures by Chemical Cross-linking, Mass Spectrometry, and Bioinformatics. Mol Cell Proteomics 9, 1634-1649.
  4. Sinz, A. (2003). Chemical cross-linking and mass spectrometry for mapping three-dimensional structures of proteins and protein complexes. J Mass Spectrom 38, 1225-1237.
  5. Rappsilber, J. (2010). The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol 173, 530-540.