TALOS: Protein backbone angle restraints from searching a database for chemical shift and sequence homology

Gabriel Cornilescu, Frank Delaglio and Ad Bax

Laboratory of Chemical Physics
National Institute of Diabetes and Digestive and Kidney Diseases
National Institutes of Health
Bethesda, Maryland 20892-0520

Abstract: Chemical shifts of backbone atoms in proteins are exquisitely sensitive to local conformation, and homologous proteins show quite similar patterns of secondary chemical shifts. The inverse of this relation is used to search a database for triplets of adjacent residues whose secondary chemical shifts and sequence provide the best match to the query triplet of interest. The database contains 13Ca, 13Cb, 13C', 1Ha and 15N chemical shifts for 20 proteins for which a high-resolution X-ray structure is available. The computer program TALOS was developed to search this database for strings of residues with chemical shift and residue type homology. The weighting factors attached to the secondary chemical shifts of the five types of resonances, relative to the weight given to sequence similarity, were optimized empirically. TALOS yields the 10 triplets which have the closest similarity in secondary chemical shift and amino acid sequence to those of the query triplet. If the central residues in these 10 triplets exhibit similar phi and psi backbone angles, their averages can reliably be used as angular restraints for the protein whose structure is being studied. Tests carried out for proteins of known structure indicate that the root-mean-square difference (rmsd) between the output of TALOS and the X-ray derived backbone angles is about 15°. Approximately 3% of the predictions made by TALOS are found to be in error.

J. Biomol. NMR, 13 (1999) 289-302
[PubMed PMID: 10212987]
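
The triplet search and angle averaging described in the abstract can be illustrated with a minimal Python sketch. It is not the TALOS implementation: the names Residue, triplet_score, best_matches and circular_mean are hypothetical, the weights are placeholders (the paper optimizes its own weighting factors empirically), and residue-type similarity is reduced here to a simple match/mismatch penalty. Secondary shifts are assumed to be given in ppm and keyed by nucleus.

```python
import math
from dataclasses import dataclass

# Placeholder weights (assumption): one per nucleus type, plus a flat
# penalty for a residue-type mismatch. These are NOT the published values.
SHIFT_WEIGHTS = {"CA": 1.0, "CB": 1.0, "C": 1.0, "HA": 1.0, "N": 1.0}
RESIDUE_MISMATCH_PENALTY = 1.0


@dataclass
class Residue:
    aa: str             # one-letter residue type
    shifts: dict        # secondary chemical shifts in ppm, keyed by nucleus
    phi: float = 0.0    # backbone angles in degrees (database entries only)
    psi: float = 0.0


def triplet_score(query, candidate):
    """Dissimilarity of two residue triplets; lower means more similar."""
    score = 0.0
    for q, c in zip(query, candidate):
        if q.aa != c.aa:
            score += RESIDUE_MISMATCH_PENALTY
        for nucleus, weight in SHIFT_WEIGHTS.items():
            # Only compare nuclei for which both residues have a shift.
            if nucleus in q.shifts and nucleus in c.shifts:
                score += weight * (q.shifts[nucleus] - c.shifts[nucleus]) ** 2
    return score


def best_matches(query, database, n=10):
    """Return the n best (score, central residue) pairs over all chains."""
    scored = []
    for chain in database:
        for i in range(1, len(chain) - 1):
            triplet = chain[i - 1:i + 2]
            scored.append((triplet_score(query, triplet), triplet[1]))
    scored.sort(key=lambda pair: pair[0])
    return scored[:n]


def circular_mean(angles_deg):
    """Average angles in degrees without breaking at the +/-180 wrap."""
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(s, c))
```

Given the matches for a query triplet, circular_mean([m.phi for _, m in matches]) and the corresponding psi average would serve as restraints only when the matched central residues agree with one another. How that agreement is judged is not specified in the abstract, so it is left out of the sketch.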

The following table lists the proteins contained in the TALOS database. For each protein it gives the reference describing the chemical shifts, the reference for the X-ray structure, the accession codes for data deposited in the BMRB and PDB databases, the resolution at which the crystal structure was solved, and the types of nuclei for which chemical shifts are available.

| Protein | Chemical shifts ref. (BMRB code) | No. of residues | X-ray structure ref. (PDB code) | Resolution (Å) | Shifts |
|---|---|---|---|---|---|
| Alpha-lytic protease | Davis et al., 1997 | 198 | Fujinaga et al., 1985 (2alp) | 1.7 | Ca, Cb, C', Ha, N |
| Basic pancreatic trypsin inhibitor | Hansen P.E., 1991 | 58 | Wlodawer et al., 1984 (5pti) | 1.1 | Ca, Cb, C', Ha, N |
| Calbindin | Drakenberg et al., 1989 (390) | 76 | Svensson et al., 1992 (4icb) | 1.6 | Ca, Cb, Ha, N |
| Calmodulin | Ikura et al., 1990 (547) | 148 | Chattopadhyaya et al., 1992 (1cll) | 1.7 | Ca, Cb, C', Ha, N |
| Calmodulin/M13 | Ikura et al., 1991 (1648) | 148 | Meador et al., 1992 (1cdl) | 2.2 | Ca, Cb, C', Ha, N |
| Cutinase | Pompers et al., 1997 (4101) | 214 | Longhi et al., 1997 (1cex) | 1.0 | Ca, Cb, C', Ha, N |
| Cyclophilin | Ottiger et al., 1997 | 165 | Ke, 1992 (2cpl) | 1.63 | Ca, Cb, Ha, N |
| Cyanovirin-N | Bewley et al., 1998 | 101 | Yang et al., 1999 (3ezm) | 1.5 | Ca, Cb, C', Ha, N |
| Dehydrase | Copie et al., 1996 | 171 | Leesong et al., 1996 (1mka) | 2.0 | Ca, Cb, C', Ha, N |
| D-maltodextrin-binding protein | Gardner et al., 1998 | 370 | Sharff et al., 1993 (1dmb) | 1.8 | Ca, Cb, C', Ha, N |
| HIV-1 protease | Yamazaki et al., 1996 | 99 | Lam et al., 1994 | 1.8 | Ca, Cb, C', Ha, N |
| Human carbonic anhydrase I | Sethson et al., 1996 (4022) | 260 | Kumar and Kannan, 1994 (1hcb) | 1.6 | Ca, Cb, C', Ha, N |
| Human thioredoxin in reduced form | Qin et al., 1996 | 105 | Weichsel et al., 1996 (1ert) | 1.7 | Ca, Cb, Ha, N |
| III-glc | Pelton et al., 1991 | 168 | Worthylake et al., 1991 (1f3g) | 2.1 | Ca, Cb, C', Ha, N |
| Interleukin-1b | Clore et al., 1990 (1061) | 153 | Veerapandian et al., 1992 (4i1b) | 2.0 | Ca, Cb, Ha, N |
| Metallo-beta-lactamase | Scrofani et al., 1998 (4102) | 232 | Concha et al., 1996 (1znb) | 1.85 | Ca, Cb, C', Ha, N |
| Profilin | Archer et al., 1994 | 125 | Fedorov et al., 1994 (1acf) | 2.0 | Ca, Cb, C', Ha, N |
| Serine protease PB 92 | Fogh et al., 1995 | 269 | Betzel et al., 1992 (1svn) | 1.4 | Ca, Cb, C', Ha, N |
| Staph nuclease | D. A. Torchia, personal communication | 141 | Loll and Lattman, 1989 (1snc) | 1.65 | Ca, Cb, C', Ha, N |
| Ubiquitin | Wang et al., 1995 | 76 | Vijay-Kumar et al., 1987 (1ubq) | 1.8 | Ca, Cb, C', Ha, N |