N- and C-Terminal Protein Sequencing by MALDI ISD Feb 2009
By Ejvind Mortz, Thanh Ha Nguyen, and Janne Crawford. Alphalyse A/S, Unsbjergvej 4, Odense, Denmark.
Email correspondence: [email protected]
Analytical characterization of purified proteins, in particular recombinant proteins for drug discovery and development, requires confirmation
of the full-length protein sequence. Here we present a powerful mass spectrometry method (MALDI ISD) that provides partial sequence
information on the intact protein, with up to 20-50 amino acid residues from both the N-terminus and the C-terminus in a single analysis.
The analysis can thus confirm expression and purification of the full-length protein sequence, and detect unexpected truncations and
modifications of the termini. The MALDI ISD analysis can also be applied to N-terminally blocked and PEGylated proteins. Alphalyse
provides protein analysis services, and here we present results obtained on a variety of proteins.
INTRODUCTION
Structural characterization of purified natural and recombinant proteins traditionally includes N-terminal Edman sequencing to confirm 5-15 amino acid residues from the amino terminus. N-terminal sequencing is a requirement according to the ICH Q6B Guideline for characterization of recombinant proteins for clinical testing, and to demonstrate comparability and consistency between cGMP batches. N-terminal Edman sequencing does not work for N-terminally blocked proteins, and the analysis is time-consuming, with a cycle time of 40 minutes per amino acid. An alternative technique, ISD MALDI MS, has been developed and demonstrated in several publications for top-down sequencing of peptides and intact proteins (refs 1-5).
In this article, Alphalyse presents results obtained using ISD MALDI mass spectrometry on a range of purified proteins with Mw from 6-80 kDa. MALDI ISD offers several key benefits compared to Edman sequencing:
• Both N- and C-terminal sequences of 20-50 residues can be obtained.
• N-terminally modified and blocked proteins (acetylated, pyroglutamate, PEGylated) can be sequenced.
• Data acquisition is fast.
• Very long sequence reads can be obtained, with up to 80 residues from a single ISD MALDI mass spectrum.
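To put the 40-minute Edman cycle time quoted above in perspective, the instrument time for a given read length can be estimated with simple arithmetic. The sketch below is illustrative only: the function name is hypothetical, sample preparation and start-up cycles are ignored, and it is an assumption that an Edman run could actually reach the longer read lengths.

```python
# Rough Edman run-time estimate from the 40 min/residue cycle time given in the text.
EDMAN_MIN_PER_CYCLE = 40  # minutes per amino acid (from the text)


def edman_hours(residues, minutes_per_cycle=EDMAN_MIN_PER_CYCLE):
    """Return the cycling time in hours needed to read `residues` amino acids."""
    return residues * minutes_per_cycle / 60


if __name__ == "__main__":
    # 5-15 residues is the typical Edman read quoted in the text;
    # 20-50 residues is the read length quoted for MALDI ISD.
    for n in (5, 15, 20, 50):
        print(f"{n:2d} residues: {edman_hours(n):5.1f} h of Edman cycling")
```

The estimate scales linearly with read length, which is why reads of 20-50 residues are rarely attempted by Edman chemistry.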
www.Alphalyse.COM
Application note: N- and C-terminal sequencing by MALDI ISD # 2000902
RESULTS
A range of different proteins with molecular weights from 6.5 to 77 kDa was analyzed by MALDI ISD. The smallest protein, aprotinin (6.5 kDa, Figure 2), showed two ion series that confirmed 25 N-terminal and 23 C-terminal residues. Only 10 amino acids in the middle of this small protein were not covered by the analysis. Thus, 83% of the sequence was confirmed in a single MS analysis.
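The coverage figure for aprotinin can be checked with a short calculation. The sketch below is illustrative only: the function name is hypothetical, and the chain length of 58 residues is inferred from the numbers in the text (25 + 23 residues confirmed, 10 not covered).

```python
def terminal_coverage(length, n_read, c_read):
    """Return (covered, uncovered, percent) for reads taken from both termini.

    If the two reads would overlap in the middle of the chain, the covered
    count is capped at the chain length.
    """
    covered = min(length, n_read + c_read)
    uncovered = length - covered
    return covered, uncovered, 100.0 * covered / length


if __name__ == "__main__":
    # Aprotinin: 58 residues assumed (25 + 23 confirmed + 10 uncovered).
    covered, uncovered, percent = terminal_coverage(58, 25, 23)
    print(f"covered={covered}, uncovered={uncovered}, coverage={percent:.1f}%")
```

For larger proteins the same read lengths cover a much smaller fraction of the chain, so the method confirms the termini rather than the full sequence.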
The Association of Biomolecular Resource Facilities (ABRF) 2009 study comparing Edman and MS-based techniques for N-terminal analysis included two protein samples provided to a broad range of laboratories as a blinded study. Both samples were analyzed by MALDI ISD. The protein in Sample 1 was identified as a fusion protein between a His-tag expression vector and alcohol dehydrogenase (P00330). The correct protein was identified by a combination of Mascot database searching and de novo sequencing of the fusion area. The MALDI ISD spectrum covered 30 N-terminal and 29 C-terminal amino acid residues. The protein in Sample 2 was identified as GAPDH (P46406).
The MALDI ISD spectrum shown in Figure 3 confirmed 39 N-terminal and 32 C-terminal residues. The leading methionine residue in the database sequence was not present at the N-terminus. See reference 6 for the complete ABRF-ESRG 2009 study results for comparison of Edman and MS techniques for N-terminal sequencing.
FIGURE 3 ABRF - ESRG STUDY. ANNOTATED ISD MALDI SPECTRUM OF SAMPLE 2 (36 kDa)
The ISD MALDI spectrum was used for a Mascot database search (inset) in the NCBI nrdb database for identification of the
protein. The spectrum confirms both the N- and C-termini of the database sequence, covering in total 71 amino acid residues
in a single mass spectrum.
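An annotated ion series is read by matching the spacing between neighbouring peaks to amino acid residue masses. The following is a minimal sketch of that idea, not the Mascot or instrument software procedure used for the figures in this note: the function names, the 0.02 Da tolerance, and the test peptide are assumptions chosen for illustration, and the peak list is synthetic.

```python
# Standard monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def read_ladder(mz_values, tol=0.02):
    """Call one residue per spacing between consecutive singly charged peaks.

    Residues that cannot be told apart within `tol` are reported together,
    e.g. "I/L". A spacing that matches no residue is reported as "?".
    """
    peaks = sorted(mz_values)
    calls = []
    for low, high in zip(peaks, peaks[1:]):
        delta = high - low
        matches = sorted(
            aa for aa, mass in RESIDUE_MASS.items() if abs(mass - delta) <= tol
        )
        calls.append("/".join(matches) if matches else "?")
    return calls


def synthetic_series(sequence, offset=18.034):
    """Build an idealised ladder for `sequence` (hypothetical helper).

    `offset` is an arbitrary constant standing in for the terminal group and
    charge carrier of the ion type; it cancels in the peak-to-peak spacings.
    """
    peaks, total = [], offset
    for aa in sequence:
        total += RESIDUE_MASS[aa]
        peaks.append(total)
    return peaks


if __name__ == "__main__":
    peaks = synthetic_series("GALKQ")
    # The first peak only anchors the ladder, so calls start at residue 2.
    print(read_ladder(peaks))
```

Because only spacings are used, the first residue of the series is not called from the ladder itself, and leucine and isoleucine are always returned as an ambiguous pair.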
In collaboration with ACE BioSciences A/S, a recombinant vaccine protein was analyzed by MALDI ISD (Figure 4). The Mascot database search confirmed an expected N-terminal truncation where a leader sequence is removed. The MALDI ISD analysis confirmed the expected C-terminus, and covered 43 N-terminal and 37 C-terminal residues in total.
The range of different proteins analyzed by MALDI ISD for this application note is summarized below in Table 1. The analysis was successful for proteins from 6 to 80 kDa. For all proteins analyzed, both N-terminal and C-terminal sequences were obtained, with read lengths of 19-43 residues. In several proteins, N-terminal truncations were detected. For the monoclonal antibody, the analysis confirmed the expected N- and C-termini for both the heavy and the light chains. An N-terminal pyroglutamate residue was detected in the heavy chain.
TABLE 1
Protein        Mw (kDa)   N-terminal residues   C-terminal residues
Aprotinin      6.5        25 AA                 23 AA
Myoglobin      16.9       33 AA                 31 AA
GAPDH          35.8       21 AA                 24 AA
Transferrin    76.9       43 AA                 19 AA
CONCLUSION
In the present study, the MALDI ISD technique showed several advantages compared to traditional Edman sequencing. A comparison of the 2 techniques is given in Table 2. The features and advantages of MALDI ISD make it a very useful technique for analysis and quality control of purified natural and recombinant proteins.
• MALDI ISD worked efficiently for all analyzed proteins from 6-80 kDa.
• Both N-terminal and C-terminal sequences were obtained for all proteins.
• Long sequence reads of 19-43 residues were obtained from both termini.
• Truncated and N-terminally blocked proteins could be sequenced.
TABLE 2. COMPARISON OF MALDI ISD AND EDMAN TECHNIQUE FOR PROTEIN SEQUENCING
Disadvantages of Edman sequencing:
• Low-throughput analysis with a cycle time of 40 min per residue.
• Expensive chemicals result in a high cost per residue.
• Does not work for N-terminally blocked proteins.
• No information about the C-terminus.
Disadvantages of MALDI ISD:
• Due to MALDI matrix ions, the first 5-7 amino acids can only be sequenced directly on instruments with “T3-sequencing” utility.
• De novo sequencing of novel proteins is difficult.
• Isobaric or near-isobaric amino acids cannot be distinguished (I/L, Q/K).
REFERENCES
1. Reiber, DC et al. Anal. Chem. 1998, 70, 673-683. Identifying proteins using matrix-assisted laser desorption/ionization in-source fragmentation data combined with database searching.
2. Takayama, M et al. Electrophoresis 2000, 21, 1670-1677. Sequence information of peptides and proteins with in-source decay in matrix-assisted laser desorption/ionization-time of flight mass spectrometry.
3. Schnaible, V et al. Anal. Chem. 2002, 74, 4980-4988. Screening for disulfide bonds in proteins by MALDI in-source decay and LIFT-TOF/TOF-MS.
4. Demeure, K et al. Anal. Chem. 2007, 79, 8678-8685. Rational selection of the optimum MALDI matrix for top-down proteomics by in-source decay.
5. Suckau, D et al. Bruker Daltonics Application Notes #MT-57. Reflector in-source-decay (ISD) MALDI-TOF MS: A powerful tool for N-terminal sequence characterization of proteins.
6. Thoma, RS et al. The ABRF ESRG 2009 Study: Comparison of Edman and mass spectrometry techniques for N-terminal sequencing. http://www.abrf.org/ResearchGroups/EdmanSequencing/EPosters/ESRG2009_poster(2009_02_03).pdf.