J. Mol. Biol. (1994) 243, 388-396

COMMUNICATIONS

New Alignment Strategy for Transmembrane Proteins

M. Cserzö, J.-M. Bernassau, I. Simon and B. Maigret

Laboratoire de Chimie Théorique, URA CNRS No. 510, Université de Nancy-I, BP239, 54506 Vandoeuvre les Nancy Cedex, France
SANOFI Recherche, rue du Professeur J. Blayac, 34082 Montpellier Cedex 04, France
Institute of Enzymology, Biological Research Center, Hungarian Academy of Sciences, 1518 P.O. Box 7, Budapest, Hungary

In this paper an algorithm which locates helical transmembrane segments is described. It is shown that, given the location of the transmembrane helices of a protein, the corresponding helices in another membrane related protein can be pinpointed. The method seems to be extremely insensitive to sequence identity but highly sensitive to the property of a sequence to assume transmembrane helical structure. As an example, using the present method, a sequence alignment between bacteriorhodopsin and human rhodopsin is carried out, and it provides a good starting point for homology modeling of this G-protein coupled receptor. It is difficult to obtain this particular alignment using the traditional methods because of poor sequence homology. There are indications that hint at a broader range of applicability of the presented method.

Keywords: transmembrane helix; G-protein coupled receptor; signal peptide; sequence alignment; homology modeling

Correspondence: B. Maigret, Laboratoire de Chimie Théorique, URA CNRS No. 510, Université de Nancy-I, BP239, Vandoeuvre les Nancy Cedex, France.

Abbreviations used: GPCR, G-protein coupled receptor; TM7, 7-transmembrane helices protein; BR, bacteriorhodopsin; hR, human rhodopsin; 3-D, three-dimensional.

With more than 300 primary structures available, the family of G-protein coupled receptors (GPCR) is one of the most important targets of pharmaceutical research: serious efforts are being made world-wide to discover new potent agonist and antagonist ligands of therapeutic interest (see Kerlavage (1991) and references therein). For an effective, rational drug design, knowledge of the detailed three-dimensional structures of these receptors would be essential. Due to experimental difficulties, only bacteriorhodopsin (Henderson et al., 1990), halorhodopsin (Havelka et al., 1993) and rhodopsin (Schertler et al., 1993) cryo-microscopy structures are available within the seven transmembrane helices (TM7) protein family (the latter two structures are at a low resolution). Recently, a promising technique, which improves the quality of transmembrane protein crystals and may facilitate X-ray structure determination, has been reported (Schafmeister et al., 1993). But even if this approach is successfully applied, it should take several years before the high resolution structures are available.

Therefore, the only way to circumvent this problem and to apply a structure based drug design strategy is to use homologous protein structures for computer modelling. Such models may help to make new hypotheses and drive the chemical synthesis of ligands or protein engineered receptors. Unfortunately, in the case of GPCR, the only useful TM7-like transmembrane 3-D structure with acceptable resolution, namely bacteriorhodopsin, shows a very poor level of sequence homology with all the known members of the GPCR family (see Figure 1). In this case, the application of homology modeling techniques is questionable, and the quality of the obtained 3-D models is doubtful (Hilbert et al., 1993; Kontoyianni & Lybrand, 1993; Hoflack et al., 1994). The key question is thus: despite its poor homology, is bacteriorhodopsin a valuable starting model for homology modeling of any GPCR?

In GPCRs, supposing that the packing of the seven transmembrane helices could be similar to that of bacteriorhodopsin (Taylor & Agarwal, 1993), the first problem would be to find a proper sequence alignment between both systems (see Donnelly & Cogdell, 1993; Edelman, 1993; Oliveira et al., 1993 for discussion). We present here an alternative to the existing methods that is able to detect the location of transmembrane helices in sequences showing a poor level of homology against each other, and which provides a satisfactory unique alignment between bacteriorhodopsin and any GPCR. There is also an indication that the method is able to detect signal peptides as well.

0022-2836/94/430388-09 $08.00/0 © 1994 Academic Press Limited

Figure 1. The homology clustering of 73 transmembrane proteins (panel title: PILEUP of 73 7-transmembrane helices proteins). The BR (SWISS-PROT code: bacr_halha) appears as a stand-alone branch on the tree. The Figure was made with the "pileup" utility of the GCG package.

Sequence alignment surface

Each sequence was randomized to obtain a reference for the alignment. The amino acid composition of each simulated chain was identical to that of its natural pair (Saroff, 1984). The original Dayhoff algorithm (Dayhoff et al., 1978) was used to obtain the sequence alignment surface. The alignment score (S) for a certain window of residues was computed over the complete alignment surface of the randomized sequences, and the average score (A) and the standard deviation (D, the square root of the variance) were obtained. Next, the scores of the real sequences were computed over the complete alignment surface. For each window, the score (S) was normalized as $N = (S - A)/D$, i.e. the values were converted to standard deviation units (sdu). If this normalized score (N) was higher than a cutoff limit (C), the corresponding segment of the alignment surface was marked with a line of thickness $N - C$.

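The normalization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' program: the function names, the identity-based toy scoring function (standing in for the RReM matrix), the number of shuffles and the pooling of all shuffled-window scores into a single background distribution are all choices made here for the example.

```python
import random
import statistics


def identity_score(a, b):
    """Toy residue score (assumption): 1 for identical residues, else 0."""
    return 1.0 if a == b else 0.0


def window_scores(seq_a, seq_b, score, window):
    """S[i][j]: summed score of the ungapped windows starting at seq_a[i], seq_b[j]."""
    rows = len(seq_a) - window + 1
    cols = len(seq_b) - window + 1
    return [[sum(score(seq_a[i + k], seq_b[j + k]) for k in range(window))
             for j in range(cols)]
            for i in range(rows)]


def shuffled(seq, rng):
    """Random permutation of seq: same composition, randomized order."""
    residues = list(seq)
    rng.shuffle(residues)
    return "".join(residues)


def normalized_surface(seq_a, seq_b, score, window=10, cutoff=1.0,
                       n_random=20, seed=0):
    """Return {(i, j): N - C} for every window whose normalized score N exceeds C."""
    rng = random.Random(seed)
    background = []
    for _ in range(n_random):
        surface = window_scores(shuffled(seq_a, rng), shuffled(seq_b, rng),
                                score, window)
        background.extend(value for row in surface for value in row)
    mean = statistics.fmean(background)      # A in the text
    spread = statistics.pstdev(background)   # D in the text
    if spread == 0:
        raise ValueError("background scores have zero spread")
    hits = {}
    for i, row in enumerate(window_scores(seq_a, seq_b, score, window)):
        for j, s in enumerate(row):
            n = (s - mean) / spread          # N = (S - A) / D, in sdu
            if n > cutoff:
                hits[(i, j)] = n - cutoff    # line thickness N - C
    return hits
```

Lowering `cutoff` admits many more low-significance hits; the returned dictionary is the sparse form of the alignment surface that the profile calculations then sum over.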
Cumulative score profiles

For scoring, an analogue of the "Residue Replaceability Matrix" was used (Tüdös et al., 1990). The matrix was recalculated from the updated "Neighbourhood Selectivity" data (Cserzö & Simon, 1989) based on the protein coding, non-ORF regions of the NCBI-GenBank Flat File (release 76.0, Burks et al., 1991). The matrix is referred to as RReM (Table 1).

The values of the alignment surface were summed rowwise and columnwise as well. This process resulted in a cumulated score profile for each of the two sequences:

$X_i = \sum_j (N_{ij} - C)$

The profiles were refined in a second summation cycle, resulting in new ones, $X'_i$, obtained as follows:

$X'_i = \sum_j (N_{ij} - C)\, Y_j / T$

where $X'_i$ is the cross weighted cumulative score for the first sequence at the ith position, $N_{ij} - C$ is the normalized score at the (i,j) position of the alignment surface, $Y_j$ is the cumulative score at the jth position of the second sequence, and $T$ is the sum of the cumulative score over the second sequence. This cross weighting was done analogously for the second sequence.

This procedure scans the alignment surface twice. During the first scan all the hits are considered with equal weight, resulting in a rough score profile for the two sequences (X and Y profiles). In the second scan, the hits are weighted according to the result of the first scanning cycle: a particular score is added with more weight to the X' profile if the hit is located on one of the peaks of the Y profile. Therefore the cross weighting did not affect the strong peaks but amplified the weak ones against the noise.

The usual strategy for sequence alignment surface construction is to choose the cutoff limit as high as possible. For example, if the cutoff is set to 3 sdu, the hit is not random with ≥99.5% confidence. Usually the commercial alignment packages use even higher cutoff criteria by default. In this way one gets very few hits on the alignment surface, so that the construction of the alignment is easy. This method requires high homology between the two sequences to be compared (the minimum homology required is usually about 30%). As this is not the case for the proteins compared here, we chose lower values for the cutoff, such as 1.5 or even 0.5 sdu.

Due to the low cutoff limit, the obtained alignment surface will typically contain several thousand hits with low significance. But in turn, due to the large population, the result can be analysed statistically.

Table 1. RReM scoring matrix: the updated version of the "Residue Replaceability Matrix" (RReM) used for scoring in this paper. Rows and columns are labelled with the one-letter amino acid codes. (The numerical entries of the matrix are not legible in this copy.)

Profile alignment

The cross weighted cumulative score profiles were aligned with a modified Needleman & Wunsch algorithm (Needleman & Wunsch, 1970; for more details see Pearson & Miller, 1992 and references therein).

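The two summation cycles defined under "Cumulative score profiles" can be sketched as follows. This is a minimal illustration, assuming the alignment surface is held as a dictionary mapping each hit position (i, j) to its excess score N_ij - C; the function names and that data layout are hypothetical and not taken from the original program.

```python
def cumulative_profiles(hits, len_x, len_y):
    """First scan: X_i = sum_j (N_ij - C) and Y_j = sum_i (N_ij - C)."""
    x = [0.0] * len_x
    y = [0.0] * len_y
    for (i, j), excess in hits.items():
        x[i] += excess
        y[j] += excess
    return x, y


def cross_weighted_profiles(hits, len_x, len_y):
    """Second scan: X'_i = sum_j (N_ij - C) * Y_j / T, and analogously Y'_j."""
    x, y = cumulative_profiles(hits, len_x, len_y)
    total_x = sum(x)  # sum of the cumulative score over the first sequence
    total_y = sum(y)  # T in the text: sum over the second sequence
    xw = [0.0] * len_x
    yw = [0.0] * len_y
    if total_x == 0 or total_y == 0:
        return xw, yw  # no hits on the surface
    for (i, j), excess in hits.items():
        xw[i] += excess * y[j] / total_y  # hit weighted by the Y profile
        yw[j] += excess * x[i] / total_x  # hit weighted by the X profile
    return xw, yw
```

A hit that falls on a peak of the other sequence's first-pass profile contributes with a larger weight, which is the amplification of weak peaks against the noise described in the text.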
In this procedure the elements of the alignment matrix are defined as follows:

$M_{ij} = X'_i\, Y'_j$

where $X'_i$ and $Y'_j$ are the cross weighted cumulative scores for the two sequences at the ith and jth position, respectively. The elements of the matrix were treated according to three cases:

$T_{ij} = T_{i-1,j-1} + M_{ij}$   (1)

together with two further cases, (2) and (3), that involve the penalty factor F (their expressions are garbled in this copy and are not reproduced here). Case (1) corresponds to the continuation of the alignment, while cases (2) and (3) correspond to the introduction of gaps in one of the sequences. At each position of the alignment matrix the path giving the highest score was chosen. F is a penalty factor that allows the introduction of gaps when the alignment matrix element is low and limits them otherwise. The value for F was chosen to be between 1 and 2. In practice this choice results in a position-dependent penalty scheme.

Smoothing

The alignments obtained using the procedure described above contain many short, consecutive gaps in both sequences. This appears on the dot-plots as a "saw-edge" profile instead of straight lines parallel to the diagonal. This is supposed to be the consequence of the non-optimal target function used, and to overcome this problem the alignments were next "smoothed" as follows. For each position of the alignment a shift of the second sequence relative to the first one can be assigned. This relative shift is constant over the matching regions and may change only at the positions of gaps, i.e. the difference of the relative shifts in consecutive positions is zero. Thus, when summing over the whole matching region, this value remains zero until the first gap in one of the sequences is found. In the case of our saw-edge profiles this value stays within an interval around zero for the matching regions and diverges from this limit at real gaps. The half-width of this interval is referred to as the "smoothing tolerance" and has a typical value of 2 or 3.

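The dynamic programming step of the profile alignment can be sketched as below. This is a hedged illustration only: the match case follows case (1) above, but the two gap cases are an assumption made for this example (a gap costs the penalty factor F times the local matrix element M_ij, which gives a position-dependent penalty that is cheap where M_ij is low). The function name, the free end gaps and the tie-breaking order are likewise choices made here, not the published method.

```python
def align_profiles(xw, yw, penalty=1.5):
    """Align two cross weighted profiles; return (score, list of (i, j) pairs).

    A pair (i, None) or (None, j) marks a position aligned against a gap.
    """
    n, m = len(xw), len(yw)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    move = [[None] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        move[i][0] = "up"      # leading gap in the second sequence (free)
    for j in range(1, m + 1):
        move[0][j] = "left"    # leading gap in the first sequence (free)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            m_ij = xw[i - 1] * yw[j - 1]                      # M_ij = X'_i * Y'_j
            candidates = [
                (score[i - 1][j - 1] + m_ij, "diag"),         # case (1): continue
                (score[i - 1][j] - penalty * m_ij, "up"),     # assumed gap case
                (score[i][j - 1] - penalty * m_ij, "left"),   # assumed gap case
            ]
            # Ties are resolved in favour of the first candidate (continuation).
            score[i][j], move[i][j] = max(candidates, key=lambda c: c[0])
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:      # trace the best path back to the origin
        step = move[i][j]
        if step == "diag":
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif step == "up":
            pairs.append((i - 1, None))
            i -= 1
        else:
            pairs.append((None, j - 1))
            j -= 1
    pairs.reverse()
    return score[n][m], pairs
```

With `penalty` between 1 and 2, as in the text, a gap is inexpensive between peaks, where the product of the two profiles is near zero, and costly inside a peak-to-peak match.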
For the regions where the smoothing tolerance criterion was satisfied, the saw-edge profiles were replaced by a straight line across the original profile. After the smoothing procedure the matching cumulative score profile positions correspond to those residues which match each other in the final sequence alignment.

Selection of the test set

From the SWISS-PROT database 257 sequences having seven "FT/TRANSMEM" annotation lines were extracted. These sequences were aligned with the "pileup" multiple alignment utility of the GCG package (Genetics Computer Group, 1991). In the procedure the default parameters were used. On the basis of this alignment the set was restricted so that no sequence identity was allowed at the level of 25% or above. This criterion reduced the size of the set to only 57 sequences. The corresponding SWISS-PROT codes are listed in Table 2.

Results

In order to test the proposed method we focused first on the comparison between sequences of the transmembrane proteins for which three-dimensional structures are available: photo-reaction center (PRC) L, M and H proteins with themselves and with bacteriorhodopsin (BR). Next, human rhodopsin (hR) was chosen as a reference GPCR protein without 3-D information. The method was also applied to a non-transmembrane helical protein, interleukin 4. The corresponding SWISS-PROT codes are rcel_rhovi, rcem_rhovi, rceh_rhovi, bacr_halha, opsd_human and il4_human, respectively. Finally the method was applied to the members of the test set in an "all against all" fashion.

Alignment of PRC/PRC

The alignment surface of two PRC proteins is presented in Figure 2. Clearly, high density areas of hits appear on it, arranged in a grid-like fashion. These dense areas correlate with the intersections of the 5 versus 5 transmembrane helices of the M and L PRC units.

The associated cross-weighted cumulative scores of the PRC-M versus -L are presented in Figure 3, on which the segments corresponding to dense areas appear as strong peaks. To test the precision of the correlation between the peak positions revealed by our method and the transmembrane helix locations, the PRC-L and -M sequences were aligned peak-to-peak. The result is presented in Figure 4. For comparison a traditional alignment was added using the "gap" utility of the GCG package (Genetics Computer Group, 1991). The gap utility uses the original Needleman & Wunsch algorithm to produce the alignment. The program was used with the default parameters, i.e. gap penalty = 3, gap length penalty = 0.1 and, for the scoring, the "nwsgappep.cmp" matrix of the GCG package.

First of all it should be noted that our procedure tends to insert gaps at the ends of the helices. The obvious errors should be corrected manually. Since the method requires the knowledge of one of the structures, this correction is possible. Probably a more efficient target function in the modified Needleman & Wunsch algorithm can rule out this problem.

Table 2. SWISS-PROT codes of the proteins in the test-set. (The list of codes is not legible in this copy.)

Figure 2. The alignment surface of the PRC-M versus -L protein (RReM matrix, C = 2.0, window = 10). The helical regions are marked according to the FEATURE/HELIX records of the SWISS-PROT data base. A 10 residue window size has been applied.

Figure 3. Cross-weighted cumulative score profiles of PRC-M and -L (RReM matrix, C = 2.0, window = 10; horizontal axis: sequence length in amino acids; curves: rcem_rhovi.sw and rcel_rhovi.sw).

Figure 4. Peak to peak alignment of PRC-M versus -L membrane helices based on the profiles of Figure 3: sequence alignment of rcem_rhovi.sw (seq1) versus rcel_rhovi.sw (seq2). The weighted profiles were generated with the RReM matrix, cutoff = 1.0, window = 7; the sequence alignment was generated with gap penalty = 2.0 and smoothing tolerance = 3. The residues in the helical regions are printed with capital letters. Note that this alignment still contains the false gaps inserted into the helical regions. These should be removed manually by the user. (The aligned sequences are not legible in this copy.)

In this example the two alignment methods show a fair agreement. While the traditional one aligns all the helices "head-to-head", our method allows relative shifts of one or two residues with respect to the beginning of the helical regions. We should emphasize that the five helices of the two protein sequences compared here can be aligned at the cost of only two short gaps despite their quite low sequence identity (27%). This is an exception rather than the rule at this level of homology and may explain the good results obtained with the traditional alignment procedure.

Alignment surfaces of PRC/BR

To test the method further on a less obvious example, it was useful to compare the PRC and BR sequences in order to see if the helices could recognize each other in these two unrelated proteins. For this purpose the alignment surfaces were generated between BR and PRC-M, -L or -H. In the case of the PRC-L and -M versus BR, the helices of each system recognized each other in a 7 x 5 grid fashion, whereas the single transmembrane helix of PRC-H is matched seven times with no false hits in the rest of the sequence. The alignment surface of BR versus PRC-M is presented in Figure 5.

Figure 5. The alignment surface of the BR versus PRC-M (RReM matrix, C = 1.0, window = 10). A 10 residue window size has been applied.

Alignment of hR/BR

The method was next applied to generate alignment surfaces between BR and several selected GPCR proteins. All of these calculations resulted in 7 x 7 grid patterns. Here the result obtained for human rhodopsin (hR) is presented as a typical example. This GPCR was selected as it may be expected that its 9 Å resolution three-dimensional structure will be refined in the near future, providing a direct experimental comparison with our prediction. The alignment surface and the cross-weighted cumulative score of BR versus hR are presented in Figures 6 and 7, respectively. The 7 x 7 grid structure is clearly visible in Figure 6, corresponding to the seven transmembrane helices of these proteins. However, the helix intersections are not equally strong. The obtained peak-to-peak alignment between BR and hR was based on the peak positions of Figure 7 and is presented in Figure 8. The corresponding optimal alignment of the curves is presented in Figure 9.

Figure 6. The alignment surface of the BR versus hR (RReM matrix, C = 1.0, window = 10). A 10 residue window size has been applied.

Two pieces of experimental evidence can be considered to evaluate this alignment: the relative positions of the retinal binding residue and of the counter ion of the Schiff base. In our alignment these key residues, namely the Lys residue on helix 7 and the Asp/Glu residue on helix 3, are shifted by one and two residues, respectively, when going from BR to hR. Considering the other helices, for which no structural data are available, our alignment is at least in agreement with the hydrophobicity profile analysis of these proteins (data not shown).

In contrast to this result, the GCG gap procedure results in shifts of seven and 23 residues for the key residues, respectively. Moreover, nine gaps were inserted into the 14 helical regions of the sequences to achieve the 22% identity in the alignment. These failures suggest that the standard GCG alignment is not useful in this particular case of GPCR proteins. For the alignment the default parameters were used, as in the case of PRC-M versus -L.

The BR versus bovine rhodopsin alignment was also obtained in the same way and was compared to the one reported in the paper of Oliveira et al. (1993). The relative shifts between the two alternative alignments are 0, 6, 0, 0, 2, 4 and 1 residues for the helices from I to VII, respectively. It is not obvious which alignment is wrong at helix II and VI, or which method is closer to reality. We should note that Oliveira's alignment also misses the position of the counter ion in helix III, which indicates that their method can deviate by a couple of residues too. Unfortunately no additional structural information was available for the helices where the two methods disagree.

Figure 7. Cross-weighted cumulative score profiles of BR and hR (RReM matrix, C = 1.0, window = 10; horizontal axis: sequence length in amino acids; curves: bacr_halha.sw and opsd_human.sw).

Alignment surface of IL4/BR

To see if the method is, as expected, specific to transmembrane helix recognition, we have tested it to detect possible mutual recognition between the helices observed in bacteriorhodopsin and in interleukin 4. This latter protein, for which X-ray and NMR structures are available (Smith et al., 1992; Wlodawer et al., 1992), is mainly structured as four helices packed in a left-handed anti-parallel bundle. Despite the presence of similar 3-D structural patterns in these two proteins, the apparent, less dense areas do not correlate with the intersections of the helices. Moreover, the heights of even the highest peaks on the corresponding cross weighted score profiles are lower than 2, i.e. the signal/noise ratio is worse (data not shown). In contrast, if the signal peptide of the interleukin is included in the alignment, a strong, characteristic dense area of hits appears in a 1 x 7 fashion. This suggests that the method recognizes only transmembrane segments, including signal peptides as well.

Figure 8. Peak to peak alignment of BR versus hR transmembrane helices: sequence alignment of bacr_halha.sw (seq1) versus opsd_human.sw (seq2). The weighted profiles were generated with the RReM matrix and window = 10; the sequence alignment was generated with gap penalty = 1.0. For BR the residues falling into the helical regions are printed with capital letters. The residues proven to be involved in a structure/function relationship are marked by *. Note that this alignment still contains the false gaps inserted into the helical regions. These should be removed manually by the user. (The aligned sequences and the remaining parameter values are not legible in this copy.)

Runs on the test-set

To establish the generality of the method, all the 57 sequences of the test-set were aligned against the other 56. The resulting cross weighted cumulative score curves of these 1596 runs were averaged for each protein separately. The seven peaks characteristic of the seven transmembrane domains appear on 55 averaged curves out of the 57 (data not shown). The method failed to detect them only for the gastrin receptor (gasr_canfa) and for a structural polyprotein (pols_rubvm). For all other sequences the positions of the seven strongest peaks correlate with the expected positions of the transmembrane segments according to the "FT/TRANSMEM" records of the SWISS-PROT annotations. In all plots but these two exceptions, the peaks are higher than or close to 2, proving their significance, as this could have happened only if the peaks are located in the same regions for a sequence independently of the second one in the alignment. This effect is demonstrated on the 56 curves obtained for the BR (Figure 10). The value of 2.0 can be considered as an empirical cutoff limit for the significance of a peak in general. In the cases where the sequence includes the signal peptide, a strong extra peak occurred at the corresponding position.

The efficiency of the detection of transmembrane segments is about the same as that which the most recent prediction methods can provide (Persson & Argos, 1994; Dombi & Lawrence, 1994).

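The bookkeeping behind these test-set runs can be illustrated with a short Python sketch. It is only an illustration under assumptions: the helper names are hypothetical, all profiles of one protein are assumed to have the same length, and a "peak" is taken here to be a maximal run of positions whose averaged score reaches the empirical limit of 2.0.

```python
from math import comb


def average_profiles(profiles):
    """Position-wise mean of several equal-length score profiles of one protein."""
    length = len(profiles[0])
    if any(len(p) != length for p in profiles):
        raise ValueError("all profiles must have the same length")
    return [sum(column) / len(profiles) for column in zip(*profiles)]


def peaks_above(profile, threshold=2.0):
    """Return (start, end, height) for each run of values >= threshold (end exclusive)."""
    peaks = []
    start = None
    for pos, value in enumerate(profile):
        if value >= threshold and start is None:
            start = pos                                   # a run begins
        elif value < threshold and start is not None:
            peaks.append((start, pos, max(profile[start:pos])))
            start = None                                  # the run ends
    if start is not None:                                 # run reaches the end
        peaks.append((start, len(profile), max(profile[start:])))
    return peaks


# 57 sequences compared all against all give 57 * 56 / 2 pairwise runs.
n_runs = comb(57, 2)
```

Each pairwise run yields one profile for each of the two proteins involved, so every protein collects 56 curves to average, and seven surviving runs on the averaged curve would correspond to the seven expected transmembrane segments.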
However, the direct comparison of the methods is difficult since they are based on quite different principles. The cited methods score single sequences with transmembrane helical propensities directly or through a neural net. Therefore they give the absolute positions of the helical regions. These values deviate considerably relative to the experimental ones or to the results of alternative predictions, while the number of the transmembrane segments is matched correctly (see Table 3 of Persson & Argos, 1994). In contrast to this, our method requires the positions of the transmembrane segments in one of the sequences as input and gives the location of the potential helical regions relative to them. Unfortunately the lack of experimental data prevents the objective ranking of these alternative methods concerning the accuracy of the prediction.

Figure 9. The cross-weighted cumulative scores of BR and hR as the modified Needleman & Wunsch algorithm aligns them (horizontal axis: sequence length in amino acids; curves: bacr_halha.sw and opsd_human.sw).

Figure 10. The cross weighted cumulative score profiles of BR obtained with the alignment against the 56 other proteins of the test-set (panel title: BR v.s. the test-set; horizontal axis: sequence length in amino acids). The positions of the peaks are conserved; the heights of the peaks deviate considerably, but are typically higher than 2.0.

Conclusion

First, it should be noted that the picture of the alignment surface is strongly dependent on the scoring matrix, and that no dense areas appear if the usual "pam250.cmp" or "pileuppep.cmp" matrices of the GCG sequence analysis package are used. The reason may be that the RReM matrix used here is related to the physico-chemical character of amino acids rather than, like the pam250 matrix, to the genetical aspects of point mutations (for details about the differences between the two matrices, see Tüdös et al., 1990). The original matrix scores identical residues with 1.0; this was modified to make the matrix less sensitive to identity relative to similarity. A value of 0.8 was retained, and this modification did not affect the results significantly. The general aspect of the alignment surfaces was not modified by the cutoff as long as its value was in the 1.5 > C > 0.5 sdu range. The C = 1.0 value was found to be an optimal choice.

Under these conditions, the method presented here seems to be a sensitive and unambiguous way to detect transmembrane helices in proteins regardless of their level of homology. In the particular and very sensitive case of GPCR proteins, since the degree of identity between the primary structures of various members of the group can be below 10%, our method may solve the crucial sequence alignment problem necessary for 3-D model building. The success of the method may be related to the particular character of transmembrane helices compared to those found in globular proteins; the fact that no helical relationship has been found between BR and IL4 suggests that in transmembrane helices supplementary information is required apart from the usual helix-helix hydrophobic packing forces. This may indicate a particular and important functional role for transmembrane helical fragments, as they must pack together while facing the phospholipid environment and using polar residues for transmembrane signaling. This special character may explain why the application of structure prediction algorithms and hydropathy plots suggests that the TM7 helices of the GPCR family are neither regular helices nor convincingly hydrophobic.

Further validation of the method will be carried out by constructing a 3-D molecular model of hR based on the presented alignment, using the 3-D structure of BR as the template, and then comparing it with the experimental 3-D structure of hR when refined to a high resolution.

M. C. is indebted to SANOFI Recherche (France) for a postdoctoral fellowship. This work was partially supported by the Hungarian OTKA 1361 and T 12800 grants. Dave Gillespie's PASCAL-to-C source level translator is acknowledged. Special thanks to Dr Pasupulati Lakshminarasimhula for helpful discussions. To obtain the source code for UNIX systems and the scoring matrix, send E-mail to miklos@lctn.u-nancy.fr, using "TMhelix" as the subject of the message.

References

Burks, C., Cassidy, M., Cinkosky, M. J., Cumella, K., Gilna, P., Hayden, J. E.-D., Keen, G. M., Kelley, T. A., Kelly, M., Kristofferson, D. & Ryals, J. (1991). GenBank. Nucl. Acids Res. 19 (Suppl.), 2221.

Cserzö, M. & Simon, I. (1989). Regularities in the primary structure of proteins. Int. J. Pept. Protein Res. 34, 184-198.

Dayhoff, M. O., Schwartz, R. M. & Orcutt, B. C. (1978). A model of evolutionary changes in proteins. In Atlas of Protein Sequence and Structure (Dayhoff, M. O., ed.), vol. 5, suppl. 3, pp. 345-352, National Biomedical Research Foundation, Georgetown University Medical Center, Washington DC.

Dombi, G. W. & Lawrence, J. (1994). Analysis of protein transmembrane helical regions by a neural network. Protein Sci. 3, 557-566.

Donnelly, D. & Cogdell, R. J. (1993). Predicting the point at which transmembrane helices protrude from the bilayer: a model of the antenna complexes from photosynthetic bacteria. Protein Eng. 6, 629-636.

Edelman, J. (1993). Quadratic minimization of predictors for protein secondary structure: application to transmembrane alpha-helices. J. Mol. Biol. 232, 165-191.

Genetics Computer Group (1991). Program Manual for the GCG Package, Version 7, April 1991, 575 Science Drive, Madison, Wisconsin, USA, 53711.

Havelka, W. A., Henderson, R., Heymann, J. A. W. & Oesterhelt, D. (1993). Projection structure of halorhodopsin from Halobacterium halobium at 6 Å resolution obtained by electron cryo-microscopy. J. Mol. Biol. 234, 837-846.

Henderson, R., Baldwin, J. M., Ceska, T. A., Zemlin, F., Beckmann, E. & Downing, K. H. (1990). Model for the structure of bacteriorhodopsin based on high resolution electron cryo-microscopy. J. Mol. Biol. 213, 899-929.

Hilbert, M., Böhm, G. & Jaenicke, R. (1993). Structural relationship of homologous proteins as a fundamental principle in homology modeling. Proteins: Struct. Funct. Genet. 17, 138-151.

Hoflack, J., Trumpp-Kallmeyer, S. & Hibert, M. (1994). Re-evaluation of bacteriorhodopsin as a model for G-protein-coupled receptors. Trends Pharmacol. Sci. 15, 1-9.

Kerlavage, A. (1991). G-protein-coupled receptor family. Curr. Opin. Struct. Biol. 1, 394-401.

Kontoyianni, M. & Lybrand, T. P. (1993). Three-dimensional models for integral membrane proteins: possibilities and pitfalls. Persp. Drug Discov. Des. 1, 291-300.

Needleman, S. & Wunsch, C. (1970). A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol. 48, 443-453.

Oliveira, L., Paiva, A. C. M. & Vriend, G. (1993). A common motif in G-protein-coupled seven transmembrane helix receptors. J. Comp. Aided Mol. Design 7, 649-658.

Pearson, W. R. & Miller, W. (1992). Dynamic programming algorithms for biological sequence comparison. In Methods in Enzymology (Brand, L. & Johnson, M. L., eds), vol. 210, pp. 575-601, Academic Press, New York.

Persson, B. & Argos, P. (1994). Prediction of transmembrane segments in proteins utilizing multiple sequence alignments. J. Mol. Biol. 237, 182-192.

Saroff, H. A. (1984). The uniqueness of protein sequences. Uniqueness diagrams for the Dayhoff file. Bull. Math. Biol. 46, 661-672.

Schafmeister, C. E., Miercke, L. J. W. & Stroud, R. M. (1993). Structure at 2.5 Å of a designed peptide that maintains solubility of membrane proteins. Science 262, 734-738.

Schertler, G. F. X., Villa, C. & Henderson, R. (1993). Projection structure of rhodopsin. Nature (London) 362, 770-772.

Smith, L. J., Redfield, C., Boyd, J., Lawrence, G. M., Edwards, R. G., Smith, R. A. & Dobson, C. M. (1992). Human interleukin-4: the solution structure of a four-helix bundle protein. J. Mol. Biol. 224, 899-904.

Taylor, E. W. & Agarwal, A. (1993). Sequence homology between bacteriorhodopsin and G-protein coupled receptors: exon shuffling or evolution by duplication? FEBS Letters 325, 161-166.

Tüdös, E., Cserzö, M. & Simon, I. (1990). Predicting isomorphic residue replacement for protein design. Int. J. Peptide Protein Res. 36, 236-239.

Wlodawer, A., Pavlovsky, A. & Gustchina, A. (1992). Crystal structure of human recombinant interleukin-4 at 2.25 Å resolution. FEBS Letters 309, 59-64.

Edited by M. Yaniv

(Received 21 February 1994; accepted 27 July 1994)
