PROFILING AND DATA PROCESSING STRATEGIES FOR PEPTIDOMIC ANALYSIS
By
Siobhan Shay
A DISSERTATION
Submitted to Michigan State University in partial fulfillment of the requirements for the degree of Chemistry – Doctor of Philosophy
2013
ABSTRACT
Peptide functions have been underappreciated compared to those of proteins. With a few notable exceptions, peptides have been thought to function largely as degradation intermediates between proteins and individual amino acids, but there is growing recognition that peptides have their own significant biological functions. Peptides from both endogenous and exogenous sources, including diet, play essential roles in regulating beneficial and detrimental physiological functions. Some bioactive peptides are unusual in that they must resist proteolytic enzymes to exhibit their physiological functions. Elucidating the identities of bioactive peptides is essential for determining their abundances, structures and functions. Peptidomics, the analysis and description of the total peptide content within a biological sample, is often hindered both by the nature of the peptides and by current analytical approaches. Peptides are exposed in vivo to multiple proteolytic enzymes, and the constituents of the resulting digestion-resistant and bioactive peptidome are not guaranteed to have specific N- or C-termini. As a result, digestion-resistant peptides often lack terminal amino acids with basic side chains that yield mass spectrometric fragments sufficient for their identification. In addition, the analysis of bioactive peptides is hindered by chromatographic co-elution, as peptidomes are more complex than can be resolved in a single liquid chromatographic separation.
To mitigate the challenges that prevent peptidomic analyses from detecting, identifying and characterizing the maximum number of peptides within a complex biological sample, this dissertation describes an alternate method that applies metabolomic-like data processing strategies to non-targeted profiling of the peptidome, using a simulated digestion procedure to generate a peptide mixture of great complexity. The application of a data-independent LC/TOF MS analysis followed by multivariate statistical analysis allows for the survey of the entire complement of a peptidome, provided that peptide abundances are above limits of detection. Multivariate analysis also allows for the recognition of peptides within a peptidome that differentiate physiological states, genotypes, treatments, or temporal changes. In this work, the extent of posttranslational deamidation of glutamine was found to be a distinguishing characteristic of grain proteins of wheat (Triticum aestivum) and its relative spelt (Triticum spelta). In the model digestion-resistant peptidome generated from proteolysis of wheat storage proteins using gastrointestinal enzymes, surviving peptides exhibit a narrow range of hydrophobicity, and chromatographic co-elution hinders the analysis. The use of non-traditional stationary phase/solvent system combinations for peptide separation can spread peptide elution over a wider range of retention times. This dissertation describes the use of a pentafluorophenylpropyl HPLC stationary phase that provides separations orthogonal to those on octadecylsilyl phases, obtained through manipulation of the stationary phase/solvent system. Finally, while the application of a data-independent HPLC-MS approach can detect thousands of peptides in a single sample, confirmation of peptide identity relies on additional information, most notably MS/MS.
However, many of the digestion-resistant peptides derived from wheat are rich in glutamate, and collision-induced dissociation yielded fragment ion spectra from which the peptides were not identified using sequence database searching. In some of these peptides, fragment ion spectra were dominated by internal sequence fragment ions, including some with masses consistent with dehydrative cyclization. These findings highlight the need for continued improvement to peptidomic technologies.
ACKNOWLEDGMENTS
The completion of this dissertation and the work held within would not have been possible without the support of many people whom I would like to take the time to thank here. I would first like to acknowledge my grandfathers, who both regrettably passed before I could complete my degree. They were exceptional men who, in conjunction with their surviving counterparts, served as an example of unconditional love. In particular, my grandfather Dr. Charles Boyle gave special inspiration, stressing the importance of education from an early age and encouraging me to further my pursuits whenever possible. I would also like to thank my parents, Maura Boyle and Paul Shay, for their unending love and support through what must have felt a long process, especially my mother for allowing me to live with her these past six years without complaint. My brother Kevin also deserves mention for playing his own special part, doing what brothers do best – reminding me that I am not always as smart as I think I am, and bringing out the humor in all situations. I would also like to acknowledge my advisor Dr. A. Daniel Jones. Without his support, endless patience, and guidance I do not think that I would have found my graduate school experience as rewarding as I did. While sometimes frustrating, it was his endeavor to have me answer my own questions that inspired the greatest growth during my career.
Those long conversations starting from the most basic fundamentals of chemistry and working up to answer an indefinitely more complicated question were invaluable and will be something that I will strive to keep with me always. I am also grateful to him for providing me with a project of great interest to me. When I first started within the group, I was striving to find something to hold my interest, and will never be able to fully express my gratitude for when Dr. Jones recognized my struggle and found a collaborative project that better satisfied my curiosity. It is without question that during my time in the Jones lab, my self-confidence and scientific curiosity have grown, which in no small part is due to his leadership. I would also like to thank the Lyman Briggs family that has supported me over the past three years, particularly Dr. Maxine Davis. Since my first semester as a graduate student, I have found that I have a special love of teaching, and have looked to foster that career goal. Maxine and others offered essential advice on my teaching and were friendly people to talk to while I was writing my dissertation. I look forward to continuing these relationships in my future career. I would also like to take the time to acknowledge the friends that I have made during my time here at MSU. I have been extremely lucky with the group members that I have had over the years. They have all been kind and supportive, and have supplied endless amounts of entertainment. I would like to make special mention of Dr. Sarah Luderer, Dr. Danielle McBride, Aaron McBride, and Susan Achberger, all of whom played a part in listening and commiserating when needed, as well as being a part of some of my most treasured memories. Finally, I would like to thank Dr. Michael Stagliano. If there has been one person that has continuously been there for both successes and times of trouble, it has been him.
His unwavering support and advice were integral in getting me through this process sanely. I will forever be grateful.
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
KEY TO SYMBOLS AND ABBREVIATIONS
CHAPTER 1
1.1 Regulatory Endogenous Bioactive Peptides
1.1.1 Insulin and C-peptide
1.1.2 Substance P
1.1.3 Angiotensin
1.1.4 Bradykinin
1.2 Endogenous Neuropeptides
1.2.1 Opioid Neuropeptides: Endorphins and Enkephalins
1.2.2 Neuropeptide Y family: Neuropeptide Y, Peptide YY, Pancreatic Polypeptide
1.2.3 Neurotensin
1.2.4 Adrenocorticotropin Hormone
1.3 Exogenous Bioactive Peptides
1.3.1 Angiotensin Converting Enzyme Inhibitory Peptides
1.3.2 Exogenous Opioid Peptides
1.3.3 Immunomodulatory Peptides
1.3.4 Antioxidative Peptides
1.3.5 Antimicrobial Peptides
1.3.6 Antigenic Peptides
1.4 Current State of Bioactive Peptides
1.5 Current Technologies in Detecting and Identifying Bioactive Peptides
1.5.1 Isolation and Detection
1.5.2 Identification
1.5.3 Fragmentation
1.5.4 Comparative Peptide Quantification
1.5.5 Current Advancements in Technology
1.6 Peptidomics
1.6.1 Targeted vs. Non-Targeted Analysis
1.6.2 Challenges of the Peptidomic Workflow
1.6.3 Proposed Improvements to Peptidomics
CHAPTER 2
2.1 Introduction
2.1.1 Peptidomics and Current Challenges
2.1.2 Current Peptidomic Approaches
2.1.2 Metabolomics and the Multivariate Analysis Approach
2.1.3 Wheat and Spelt
2.2 Materials and Methods
2.3 Results/Discussion
2.3.1 Peptidomic Profiling of Digestion-Resistant Peptidome
2.3.2 Confirmation of Peptide Annotation
2.3.3 Post-translational Modifications that Differentiate Digests
2.4 Conclusion
CHAPTER 3
3.1 Introduction
3.1.1 Challenges of the Proteomic Approach to Identifying Digestion Resistant Peptides
3.1.2 Fragmentation of Peptides
3.2 Materials and Methods
3.3 Results and Discussion
3.3.1 Peptidomic Profiling
3.3.2 Confirmation of Annotations using LC-LIT-FTICR MS
3.3.3 Internal Fragment Ion Formation for the Identification of Challenging to Fragment Peptides
3.4 Conclusion
CHAPTER 4
4.1 Introduction
4.1.1 Challenges of Analyzing the Digestion-Resistant Peptidome
4.1.2 Multidimensional Separations
4.1.3 C18 Retention Mechanisms
4.1.4 Fluorinated Stationary Phases
4.2 Materials and Methods
4.2.1 Materials
4.2.2 Extraction of Gliadin Proteins
4.2.3 Simulated Gastrointestinal Digestion of Wheat Storage Proteins
4.2.4 Liquid Chromatography-Mass Spectrometry
4.2.5 Peak detection, alignment and integration
4.2.6 Peptide annotation
4.2.7 Calculation of peptide properties
4.3 Results and Discussion
4.3.1 Digest Complexity
4.3.2 A fluorinated stationary phase as an alternative to C18
4.3.2.1 Separation using aqueous HCOOH/acetonitrile gradient
4.3.2.2 Effects of mobile phase ammonium on peptide retention
4.3.4 The quest for orthogonal structure-retention relationships
4.3.5 A mechanistic view of retention on the perfluorinated stationary phase
4.4 Conclusions
CHAPTER 5
5.1 Conclusion
BIBLIOGRAPHY
LIST OF TABLES
Table 2.1: Relative mass defect values for internal amino acid residues
Table 2.2: Annotations of discriminating peptides detected in digests of wheat and annotated using LC/FTICR and LC/MS/MS data and the UniProt sequence database, allowing for conversion of glutamine residues to glutamate by deamidation
Table 2.3: Relative quantitative abundances of peptides more abundant in digests of wheat relative to digests of spelt. For multiply-charged peptide ions, the m/z value for the singly-protonated species was calculated from the multiply-charged monoisotopic ion.
Table 3.1: List of peptide masses annotated using ExPASy simulated gastrointestinal digestions through comparison of observed masses from LC/LIT-FTICR MS
Table 4.1: Description of the four separation protocols including UHPLC columns, mobile phase components, and gradients.
Table 4.2: Description of various properties for five selected peptides found in the various separations. Numberings for the figures in reference to these peptides are found in the leftmost column.
Table 4.3: List of identified peptides that display the greatest changes in retention according to % organic modifier (% org) and change in relative retention time. Only those peptides showing a fold change of 2 (or 0.5) or greater were considered significant.
LIST OF FIGURES
Figure 2.1: Chromatograms: LC-TOF MS base peak intensity chromatograms of products of sequential digestions of 70% ethanol extracts of (A) ground wheat and (B) spelt grain using pepsin, trypsin, and chymotrypsin
Figure 2.2: Histogram: Relative mass defect histogram for combined LC-TOF MS data extracted from digests of wheat and spelt. Values of RMD greater than 1000 ppm likely reflect contributions from multiply-charged ions.
Figure 2.3: Histogram: Histogram of detected ions in digests of wheat and spelt as a function of ion m/z
Figure 2.4: PCA Scores Plot: PCA scores plot for LC-TOF MS profiles of peptides generated by sequential pepsin, trypsin, and chymotrypsin digestion of 70% ethanol extracts of ground wheat and spelt grain
Figure 2.5: OPLS-DA S-plot: OPLS-DA S-plot for LC-TOF MS profiles of peptides generated by sequential pepsin, trypsin and chymotrypsin digestion of 70% ethanol extracts of ground wheat and spelt grain. Each data point corresponds to the extracted ion chromatogram peak area for a specific m/z-retention time pair, and the coordinates reflect the loadings (x-axis) and p-corr values (y-axis).
Figure 2.6: MS/MS Spectrum: Product ion MS/MS spectra generated for two peptides detected in the sequential digestion products of 70% ethanol extracts of ground wheat. (A) peptide annotated as LEPHEIAHL, a product of deamidation of the γ-gliadin peptide LQPHQIAHL and (B) peptide annotated as ELEPF, a deamidation product of the α-gliadin peptide QLQPF
Figure 2.7: Bar graphs: LC-TOF MS extracted ion chromatogram peak areas, normalized to the internal standard, for a series of peptide digestion products observed in higher abundance in digests of wheat than in digests of spelt.
Figure 3.1: MS/MS Spectrum: Product ion MS/MS spectrum generated for [M+H]+ of the peptide LEPHEIAEL in the sequential digestion products of 70% ethanol extracts of ground wheat. Open circles (°) indicate loss of water, and bracketed ions [] indicate internal fragment ions.
Figure 3.2: MS/MS Spectrum: Product ion MS/MS spectrum generated for [M+H]+ of the peptide EVPL in the sequential digestion products of 70% ethanol extracts of ground wheat. Open circles (°) indicate loss of water.
Figure 3.3: MS/MS Spectrum: Product ion MS/MS spectrum generated for [M+H]+ of the peptide SEEEEPVL in the sequential digestion products of 70% ethanol extracts of ground wheat. Open circles (°) indicate loss of water, and bracketed ions [] indicate internal fragment ions.
Figure 3.4: MS/MS Spectrum: Product ion MS/MS spectrum of [M+H]+ of the peptide PPEEEEEEL in the sequential digestion products of 70% ethanol extracts of ground wheat. Open circles (°) represent loss of water, and bracketed ions [] represent internal fragment ions.
Figure 3.5: MS/MS Spectrum: Product ion MS/MS spectrum generated for [M+H]+ of the peptide PEEPPFSEEEEPVL in the sequential digestion products of 70% ethanol extracts of ground wheat. Open circles (°) indicate loss of water, and bracketed ions [] indicate internal fragment ions.
Figure 3.6: MS/MS Spectrum: Product ion MS/MS spectrum generated for [M+H]+ of the peptide EPEELPEF in the sequential digestion products of 70% ethanol extracts of ground wheat. Open circles (°) indicate loss of water, and bracketed ions [] indicate internal fragment ions.
Figure 3.7: Product ion MS/MS spectrum generated for [M+H]+ of the peptide NILL in the sequential digestion products of 70% ethanol extracts of ground wheat.
Figure 4.1: Chromatogram: Base peak ion LC-TOF MS chromatogram of a simulated gastrointestinal digest of wheat storage proteins using a C18 column and mobile phase gradient based on 0.15% aqueous formic acid and acetonitrile. The acetonitrile volume percent is displayed as a function of time in the form of a dashed line.
Figure 4.2: Chromatogram: Base peak ion LC-TOF MS chromatogram of a simulated gastrointestinal digest of wheat storage proteins using a C18 column and mobile phase gradient based on 0.15% formic acid and methanol. The methanol volume percent is displayed as a function of time in the form of a dashed line.
Figure 4.3: Chromatograms: Extracted ion chromatograms for the five indicator peptides (m/z 747.40, 528.26, 417.25, 761.30 and 439.26) in simulated gastrointestinal digest of wheat storage proteins using a C18 column and (A) C18/ACN and (B) C18/MeOH for separation.
Figure 4.4: Chromatogram: Base peak ion LC-TOF MS chromatogram of a simulated gastrointestinal digest of wheat storage proteins using a PFPP column and mobile phase gradient based on 0.15% aqueous formic acid and methanol.
Figure 4.5: Scatter plot: Plot of retention time versus GRAVY for the five selected peptides annotated according to Table 4.2. Peptides are those identified from the PFPP/MeOH separation protocol.
Figure 4.6: Scatter plot: Plot of retention time versus GRAVY for peptides annotated according to Table 4.3. Peptides are those identified from the PFPP/MeOH separation protocol.
Figure 4.7: Scatter plot: Plot of retention time versus aliphatic index for peptides annotated according to Table 4.3. Peptides are those identified from the PFPP/MeOH separation protocol.
Figure 4.8: Scatter plot: Plot of retention time versus peptide volume for peptides annotated according to Table 4.3. Peptides are those identified from the PFPP/MeOH protocol.
Figure 4.9: Chromatogram: Extracted ion chromatograms for the five indicator peptides (m/z 747.40, 528.26, 417.25, 761.30 and 439.26) in simulated gastrointestinal digest of wheat storage proteins using a PFPP column and the PFPP/MeOH/NH4 separation protocol.
Figure 4.10: Chromatogram: Base peak ion LC-TOF MS chromatogram of a simulated gastrointestinal digest of wheat storage proteins using a PFPP column and mobile phase gradient based on 10 mM ammonium formate and methanol.
Figure 4.11: Chromatogram: Extracted ion chromatograms for the five indicator peptides (m/z 747.40, 528.26, 417.25, 761.30 and 439.26) in simulated gastrointestinal digest of wheat storage proteins using a PFPP column and the PFPP/MeOH separation protocol.
Figure 4.12: Scatter plot: Plot of retention time versus peptide volume for peptides annotated according to Table 4.3. Peptides are those identified from the PFPP/MeOH/NH4 protocol.
Figure 4.13: Scatter plot: Scatter plot correlating the retention times of corresponding peptides from the PFPP/MeOH and C18/ACN separations. Annotated markers are in accordance to Table 4.3.
Figure 4.14: Scatter plot: Scatter plot correlating the retention times of corresponding peptides from the PFPP/MeOH and C18/MeOH separations. Annotated markers are in accordance to Table 4.3.
Figure 4.15: Scatter plot: Scatter plot correlating the retention times of corresponding peptides from the PFPP/MeOH/NH4 and C18/ACN separations. Annotated markers are in accordance to Table 4.3.
Figure 4.16: Scatter plot: Scatter plot correlating the retention times of corresponding peptides from the PFPP/MeOH/NH4 and C18/MeOH separations. Annotated markers are in accordance to Table 4.3.
Figure 4.17: Scatter plot: Scatter plot correlating the retention times of corresponding peptides from the PFPP/MeOH/NH4 and PFPP/MeOH separations. Annotated markers are in accordance to Table 4.3.
KEY TO SYMBOLS AND ABBREVIATIONS
ACE: Angiotensin Converting Enzyme
ACTH: Adrenocorticotropic Hormone
AI: Aliphatic Index
ATP: Adenosine Triphosphate
BLAST: Basic Local Alignment Search Tool
BPI: Base Peak Ion (or Base Peak Ion abundance)
C18: Octadecylsilyl phase
C18/ACN: Separation performed on C18 stationary phase with 0.15% formic acid/acetonitrile solvent system
C18/MeOH: Separation performed on C18 stationary phase with 0.15% formic acid/methanol solvent system
cAMP: Cyclic Adenosine Monophosphate
CD: Celiac Disease
CID: Collision-Induced Dissociation
CRF: Corticotropin Releasing Factor
DDA: Data Dependent Acquisition
DOPA: 3,4-dihydroxyphenylalanine
EDTA: Ethylenediaminetetraacetic acid
ESI: Electrospray Ionization
ETD: Electron Transfer Dissociation
FWHM: Full Width at Half Maximum
FTICR: Fourier Transform Ion Cyclotron Resonance
GABA: γ-aminobutyric acid
GRAVY: Grand Average Hydropathy
HPLC: High Performance Liquid Chromatography
HPLC-MS: High Performance Liquid Chromatography-Mass Spectrometry
HPLC-MS/MS: High Performance Liquid Chromatography-Tandem Mass Spectrometry
HPLC-TOF-MS: High Performance Liquid Chromatography-Time-of-Flight Mass Spectrometry
IgG: Immunoglobulin G
LC: Liquid Chromatography
LC/LIT-FTICR MS: Liquid Chromatography-Linear Ion Trap-Fourier Transform Ion Cyclotron Resonance Mass Spectrometry
LC-MS: Liquid Chromatography-Mass Spectrometry
LC-MS/MS: Liquid Chromatography-Tandem Mass Spectrometry
LC-TOF MS: Liquid Chromatography-Time-of-Flight Mass Spectrometry
LIT: Linear Ion Trap
MALDI: Matrix Assisted Laser Desorption Ionization
MHC: Major Histocompatibility Complex
MMC: Mixed Mode Chromatography
MS: Mass Spectrometry
MS/MS: Tandem Mass Spectrometry
MudPIT: Multidimensional Protein Identification Technology
NCBI: National Center for Biotechnology Information
ncRNA: Non-coding RNA
NK: Neurokinin
NO: Nitric Oxide
NPY: Neuropeptide Y
OPLS-DA: Orthogonal Projection to Latent Structures-Discriminant Analysis
PBL: Peripheral Blood Lymphocyte
PCA: Principal Components Analysis
PFPP: Pentafluorophenylpropyl
PFPP/MeOH: Separation performed on PFPP stationary phase with 0.15% formic acid/methanol solvent system
PFPP/MeOH/NH4: Separation performed on PFPP stationary phase with 10 mM ammonium formate/methanol solvent system
pH: -log(hydrogen ion concentration)
pI: isoelectric point
PLS-DA: Projection to Latent Structures-Discriminant Analysis
POMC: Pro-opiomelanocortin
PP: Pancreatic Polypeptide
PYY: Peptide YY
RMD: Relative Mass Defect
SP: Substance P
TOF: Time of Flight
TOF-MS: Time of Flight-Mass Spectrometry
UHPLC: Ultra High Performance Liquid Chromatography
Å: Angstrom (10⁻¹⁰ m)
C: Celsius
Da: Dalton
g: gram
x g: times force of gravity
k’: capacity factor
L: Liter
m: Mass
m/z: mass to charge ratio
M: Molarity/Molar
ppm: parts per million
rpm: revolutions per minute
v: velocity
V: Volt
CHAPTER 1
1.1 Regulatory Endogenous Bioactive Peptides
The extent of physiological function is not discovered all at once, but through a series of
observations. Over time, the dogma of function is developed and challenged as new discoveries are made. For example, non-coding RNA (ncRNA), once thought to be a product of “junk DNA” with no biological function, is now known to have significant biological function [1]. miRNAs, among the most diverse of the ncRNAs, regulate genes in a variety of organisms; they have been found to be downregulated in cancer and to regulate leaf morphogenesis in Arabidopsis thaliana and Zea mays [1-3]. As with ncRNA, the physiological functions of peptides have not been well described, especially compared to those of proteins [1-3]. Previously thought to function as degradation intermediates between whole protein molecules and individual amino acids that can be recycled, peptides resulting from the partial digestion of parent proteins are increasingly recognized to have their own significant biological functions [4-18]. As polymer chains consisting of amino acids, peptides are distinguished from proteins based upon size. While the size cutoff varies among the research community, peptides differ from proteins primarily in structure. Proteins are large enough to have tertiary and quaternary structures, providing a wide range of functions. Unlike proteins, peptides are too small to retain tertiary structure, and functionality is determined by side chain groups. These biologically active peptides originate from both endogenous proteins and exogenous food proteins, and vary in function from neurologically active peptides to antimicrobial peptides. However, the entire complement of peptides, known as the peptidome, is only minimally understood, as are the physiological functions of these peptides. The basis for understanding the roles of biologically active peptides was laid approximately 100 years ago, as described in the subsequent sections.

1.1.1 Insulin and C-peptide

Endogenous bioactive peptides are those peptides with significant biological functions that are either directly encoded or produced by partial hydrolysis of a protein precursor. Described in 1921 by Banting and Best, insulin was one of the first bioactive peptides to be discovered [19]. Stored in and secreted by the beta cells of the pancreas, insulin is an essential bioactive peptide responsible for blood sugar regulation [19-22]. Inhibiting hyperglycemia, insulin directs the processing of glucose, rather than fat, as the primary energy source in response to a threshold level of glucose in the blood stream [19-22]. Once the blood glucose threshold is reached, the number of available glucose transporters into the beta cells is also increased, and glycolysis begins. The proinsulin parent protein consists of a signal peptide and three polypeptide regions denoted A-C-B. Prohormone convertases then cleave the protein, leaving the two polypeptide regions A and B bound together by disulfide bonds, and the free connecting C-peptide. Active insulin consists of the bound A and B polypeptide chains [19-22]. The free insulin C-peptide, not discovered until 1967, was thought to be inactive or to have unknown biological activity [23]. The blood levels of C-peptide are now widely used to distinguish type 1 and type 2 diabetes mellitus. Since type 1 diabetes patients are unable to produce sufficient levels of insulin, C-peptide concentrations in the blood of type 1 patients are lower than in type 2 patients, who produce insulin naturally but are resistant to its effects [23-30]. More recently, C-peptide has been recognized as more than an indicator, with activity as an intercellular signaling molecule to renal tubules. Renal tubules are rich in Na+,K+-ATPase, and this enzyme is activated by C-peptide in rats at low nanomolar concentrations, which is associated with improved renal function [25-30].
C-peptide has also been demonstrated to induce the release of ATP, which stimulates endothelial cells to produce NO. When C-peptide was incubated with metals, an increase in ATP release and in glucose uptake was noted. Noting that zinc is present in the beta cells in much higher (mM) concentrations than other metal ions, Spence and Reid examined zinc-C-peptide metal binding [31, 32]. They found that zinc potentiated C-peptide regulated ATP release and subsequent glucose uptake. While no mechanism has fully explained the increase in glucose uptake, it was found that zinc is bound to acidic residues of C-peptide, and that loss of those residues significantly reduces activity [31, 32]. Although C-peptide has been known since 1967, its function has only recently been elucidated, exemplifying the lack of knowledge about the bioactive peptidome.

1.1.2 Substance P

In 1931, a substance was found in the brain and the intestine that was described as qualitatively different from known bioactive compounds, including acetylcholine, and was extracted using a method called preparation P. As this substance was unknown, the reporting scientists, von Euler and Gaddum, named it "Substance P" (SP) [33]. While it plays many physiological roles, the main roles of SP are in vasodilation, inflammation and pain perception. Considered the first neuropeptide, SP is found in high concentrations in the brain and specifically in the dorsal root of the spinal cord, indicating involvement in pain transmission, which was confirmed with electrical stimuli experiments [34]. In vasodilation, SP activates NK1 (neurokinin) receptors on the arterial vessels that are specific for SP, and the response is regulated by nitric oxide released from the endothelium [11, 34, 35]. SP plays an integral role in neurogenic inflammation, in which SP and other agonists are released from the peripheral endings of the primary sensory neurons, an indicator of both inflammation and pain [11, 34, 35].
SP was recognized as a peptide because it was degraded by trypsin and pepsin, suggesting a polypeptide structure. SP is an undecameric polypeptide released from the preprotachykinin-A precursor by convertases, which also release the simultaneously encoded neurokinin A peptide [11, 34, 35]. The C-terminus is then amidated by peptidyl-Gly-α-amidating monooxygenase before the peptide becomes active [11, 34, 35].

1.1.3 Angiotensin

One of the most integral and most studied physiological functions is the control of vasodilation and constriction, in which bioactive peptides play important roles. The control of vasodilation and constriction is essential for health, as it controls physiological blood pressure in response to physical and emotional stress. Early in the 1900s, it was shown that kidney tissue extracts raised blood pressure, and in 1928, Franz Volhard suggested that a substance secreted by the kidney may be responsible for hypertension [36, 37]. Later, Volhard, in conjunction with Page, isolated a kidney extract named renin, which was shown to cause vasoconstriction, but only through some enzymatic activation [36, 37]. After enzymatic degradation of a renin substrate, an active decapeptide named angiotensin I is produced [36, 37]. Angiotensin I is enzymatically converted by angiotensin I converting enzyme (ACE I) to angiotensin II through cleavage of the H-L residues from the C-terminal end, forming an octapeptide [37-41]. A dipeptidylcarboxypeptidase, ACE I is incapable of cleaving a proline peptide bond, which is why the conversion is limited to two amino acids. While angiotensin I has low potency, angiotensin II is responsible for vasoconstriction at low nM concentrations, making the enzyme responsible for the conversion important in the regulation of blood pressure [42]. Angiotensin II induces vasoconstriction by binding to the vascular smooth muscle AT1 receptor, which requires disulfide bridge binding for activation.
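
The conversion of angiotensin I to angiotensin II described above can be illustrated with a short Python sketch. This is a toy model only and does not capture the full specificity of the enzyme. The one-letter sequence of angiotensin I is supplied here as an assumption (the text names only the H-L residues that are removed), the function name `ace_cleave` is hypothetical, and the proline restriction is interpreted as "no cleavage when the bond to be broken is a proline peptide bond".

```python
# Toy model of ACE I acting as a dipeptidyl carboxypeptidase.
# Assumption: one-letter sequence of angiotensin I; it is not given in the text.
ANGIOTENSIN_I = "DRVYIHPFHL"


def ace_cleave(peptide: str) -> tuple[str, str]:
    """Return (remaining peptide, released C-terminal dipeptide).

    No cleavage is modeled when the peptide has fewer than three residues
    or when the leaving dipeptide starts with proline, because the bond
    that would be broken is then a proline peptide bond.
    """
    if len(peptide) < 3 or peptide[-2] == "P":
        return peptide, ""
    return peptide[:-2], peptide[-2:]


angiotensin_ii, released = ace_cleave(ANGIOTENSIN_I)
print(angiotensin_ii, released)    # octapeptide plus the released H-L dipeptide
print(ace_cleave(angiotensin_ii))  # a second cleavage is blocked by proline
```

Under these assumptions the decapeptide loses exactly one dipeptide, because the next candidate bond in the octapeptide (His-Pro) is a proline peptide bond.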
Although increased levels of angiotensin II result in increased levels of receptor activation, if exposed to angiotensin II for prolonged times, the system self-regulates through negative feedback, thus controlling blood pressure in cases of disease [43]. Receptor binding is G-protein coupled and incites a cascade that causes myosin and actin to interact, resulting in smooth muscle contraction of the vascular system [43]. While vasoconstriction is one important aspect of physiological function, dilation, or the prevention of constriction, is equally important and can be regulated by other bioactive peptides such as bradykinin.

1.1.4 Bradykinin

Discovered in 1948 by Rocha e Silva and coworkers through the addition of snake venom to blood, bradykinin stimulated gut smooth muscle and produced a hypotensive effect [44]. In addition to venom, treatment with trypsin resulted in the same smooth muscle stimulation and hypotensive effect, suggesting that the bioactive substance was a polypeptide. Bradykinin belongs to the kininogen family of peptides, which are similar to angiotensins in that they require enzymes for activation [44-48]. Bradykinin binds to endothelial cells through specialized receptors, and its most prominent effect is the reduction of blood pressure. Binding of bradykinin to the B2 receptor triggers nitric oxide production and elevated intracellular calcium ion levels, which then result in vasodilation [44-48]. Bradykinin and ACE I work in opposition to each other, with ACE I specifically inhibiting bradykinin function through proteolysis [47, 49, 50]. With cleavage of any peptide bond, bradykinin loses functionality. Besides vasodilation, bradykinin also regulates many other physiological functions. As a result of vasodilation, bradykinin is believed to be a part of inflammatory disease and pain.
Agonist binding to the B2 receptors instigates a cascade of physiological responses including plasma extravasation, stimulation of sensory neurons, and release of prostaglandins, leukotrienes and cytokines, all of which are integral in the development of inflammation and pain [51].

1.2 Endogenous Neuropeptides

Endogenous bioactive peptides are an essential part of the regulation of physiological function. One of the most studied classes of bioactive peptides is the neuropeptides. Neuropeptides are small signaling polypeptides that are produced and released by neurons and that incite a response when acting upon neural substrates [52]. Neuropeptides include a variety of different families, each with its own specialized functions. These functions include signaling in the brain that governs a wide range of processes, from metabolism to behavior. Substance P, previously discussed, was the first neuropeptide discovered, and is a member of the tachykinin family, noted for contraction of gut tissue. Other members of the tachykinin family include neurokinin A and neurokinin K. Neuropeptides also include peptide hormones and in some cases neurotransmitters, which are defined by the cells from which they are released. Neuropeptides are released from neurons and glial cells and signal to nearby neurons and glial cells [11, 52]. Peptide hormones are released from neuroendocrine cells and signal to distant tissues via the blood to induce a physiological response [11, 52]. Neurotransmitters are normally small molecules that are released from the presynaptic side (axon terminal) of the synapse across the synaptic cleft to the postsynaptic side (dendrite) [11, 52]. The major non-peptide neurotransmitters include amino acids such as glutamate, aspartate and glycine.
More extensively researched neurotransmitters include monoamines and other small molecules such as dopamine, norepinephrine, epinephrine, histamine, acetylcholine, adenosine and nitric oxide. However, some small peptides are also released across the synapse, including somatostatin, SP, and opioid peptides, and these are described below [11, 52].

1.2.1 Opioid Neuropeptides: Endorphins and Enkephalins

Produced in the central nervous system, opioid peptides function as both peptide hormones and neurotransmitters. The most noted effect of the opioid peptides is their mimicry of the alkaloid opiates such as morphine and heroin [4, 53]. Because they bind to the same receptors, opioid peptides produce the same physiological effects as opiates; one of the most noted is the “runner’s high” produced by the endorphin peptides of the opioid family, which are produced in the pituitary gland [54-57]. In addition to endorphins, the opioid family also includes the enkephalins, pentapeptides produced in the spinal cord. Both endorphins and enkephalins require proteolysis of a parent protein to yield the active peptide form. Endorphin derives from the beta-endorphin/ACTH precursor, also known as pro-opiomelanocortin (POMC), and enkephalin derives from the proenkephalin precursor [54-56]. POMC includes the sequences for a variety of different bioactive peptides: the 31 amino acid beta-endorphin sequence at the C-terminus, the 91 amino acid beta-lipotropin precursor and the 39 amino acid sequence of adrenocorticotropic hormone (ACTH). These peptides are released through processing by prohormone convertases [56]. For further activation, beta-endorphin is processed via post-translational modifications, which alter the physiological function of the neuropeptide. Through acetylation of the N-terminus and further proteolytic degradation of the C-terminal end, six different structural analogs are produced [58].
Although the differences among the structural analogs seem relatively minor, the physiological effects are profound. The six beta-endorphin analogs are beta-endorphin 1-27, 1-26, and 1-31, each with an acetylated derivative. The main form of beta-endorphin is beta-endorphin 1-31, followed by the other two non-acetylated forms. The post-translationally acetylated forms are present in lesser quantities, indicating that the primary post-translational pathway is the C-terminal proteolysis, which occurs in the hypothalamus of the brain. While beta-endorphin 1-31 is the only analog to retain opioid agonist activity, the other analogs are not without function [58]. The most significant bioactive function of the other analogs results from the proteolysis of the C-terminal end that yields beta-endorphin 1-27: rather than acting as an agonist that fully activates the opioid receptor, the 1-27 analog blocks beta-endorphin 1-31 from binding, exhibiting antagonistic behavior [58]. When an agonist is bound to the opioid receptor, the main effect is analgesia, caused by the inhibition of the release of inflammatory compounds like SP from sensory nerves [4, 53-55, 59]. Enkephalins, both Met-enkephalin and Leu-enkephalin, which differ by their terminal residue, induce the same analgesic effect as the endorphins. However, unlike beta-endorphin, which primarily acts upon the µ and δ opioid receptors, enkephalins target the δ receptors only. These receptors are found throughout the central nervous system, but the δ receptor is primarily expressed in the cerebral cortex, while the µ receptor is expressed more widely and is found in the amygdala, thalamus and brainstem [4, 53-55, 59]. While both endorphins and enkephalins cause sedation when activating their respective receptors, binding to the µ receptor, unique to endorphin, causes euphoria, while binding to the δ receptor results in an antidepressant effect [4, 53-55, 59].

1.2.2 Neuropeptide Y family: Neuropeptide Y, Peptide YY, Pancreatic Polypeptide

Similar to beta-endorphin and enkephalin, peptides belonging to the neuropeptide Y (NPY) family are found in the central nervous system and are released from neurons across synapses to incite a response. The NPY family consists of the predominant namesake NPY, first isolated by Tatemoto and Mutt in 1981; an NPY-like peptide exhibiting a high degree of homology (70%) with NPY, termed peptide YY (PYY); and a peptide with 50% homology, termed pancreatic polypeptide (PP) [60, 61]. The most abundant member of the NPY family is NPY, found in both the central and peripheral nervous systems. In the central nervous system, it is distributed with other neurotransmitters such as somatostatin, GABA, norepinephrine, epinephrine and serotonin [62, 63]. In the peripheral nervous system, NPY is distributed with norepinephrine and found in the gastrointestinal tract, salivary and thyroid glands, pancreas, urogenital system and heart [62, 63]. PYY is predominantly distributed in the endocrine cells of the intestinal tract and pancreas. Like PYY, PP is not found in the brain, but is located predominantly in the pancreas, as the name implies [60]. Cleaved from the pro-neuropeptide-Y parent protein, the NPY family of peptides binds to a number of receptors, each resulting in its own physiological effects. NPY receptors bind to a pertussis toxin sensitive G-protein and stimulate additional responses. The different receptors are distinguished through the binding affinities of the three NPY peptides [62, 64]. Y1, predominantly expressed in the cerebral cortex, thalamus and amygdala, but also in adipose tissue and vascular smooth muscle, is responsible for vasoconstriction. Y1 has a higher binding affinity towards the peptides NPY and PYY. A postsynaptic receptor, it mediates smooth muscle contraction through the potentiation of norepinephrine.
In addition to vasoconstriction, Y1 has also been associated with behavioral effects: alleviation of anxiety, an increase of the co-released neurotransmitters norepinephrine and dopamine, each responsible for its own behavioral modifications, and stimulation of food intake. With regard to food intake, it was found that not all Y1 agonistic NPY fragments mediate behavior similarly. For example, the full peptide sequence stimulates appetite, while the C-terminal fragment NPY 13-36 does not do so to the same extent, and further experiments showed that the removal of the C-terminus reduced food uptake stimulation [60, 62, 63, 65, 66]. The Y2 receptor binds both NPY and PYY with similar affinities. Found in the hippocampus, thalamus and cortex, Y2 receptors are located presynaptically in neurons, where they mediate neurotransmitter release through suppression of the neurotransmitters. Y2 has also been implicated in memory, circadian rhythm, and angiogenesis. These effects are related to the fact that Y2 receptors are located central to norepinephrine and dopamine terminals, as well as to the location of receptors in the hippocampus, a portion of the brain responsible for memory formation [60, 62, 63, 65, 66]. Y4 is the only NPY family receptor subtype with a high affinity for PP. NPY and PYY are also able to activate the receptor, but require much higher concentrations. Expressed primarily in the gastrointestinal tract and to a lesser extent in the heart, skeletal muscle and thyroid gland, the Y4 receptor acts primarily, upon agonist binding, to inhibit gall bladder contraction and pancreatic secretion [60, 62, 65, 66]. The Y5 receptor is expressed in the hypothalamus and secondarily in the hippocampus.
The location of the receptors is responsible for behavioral mediation, which involves an increase in food intake due to the receptors located in the hypothalamus, and seizure triggering in response to agonistic activation in the hippocampus region [60, 62, 65, 66]. While other NPY receptors have been identified, only subtypes 1, 2, 4, and 5 are found in humans, with other subtypes found and researched more extensively in other mammals (rat and rabbit).

1.2.3 Neurotensin

Neurotensin, a neuropeptide, also works as a neurotransmitter like endorphin or enkephalin, mediating signals from adjacent neurons. In addition, it acts as a hormone in the peripheral system, transducing a physiological signal in a more distant tissue. A 13 amino acid peptide that was isolated in 1973 from the hypothalamus during the purification of SP, it is derived from a larger 117 residue proneurotensin precursor, which also contains the sequence for a closely related neuropeptide, neuromedin [67]. Its main role is as a neurotransmitter, and it is primarily expressed in the central nervous system. In the central nervous system, neurotensin release is strongly correlated with the dopamine system, particularly the pathways involved in movement and ‘reward’. Experiments suggest that two neurotensin receptors exist in mammals, one with a particularly high affinity. Both receptors are of the G protein family, with the high affinity agonist receptor mediated by a guanine nucleotide [65, 67, 68]. Binding of neurotensin to the receptors results in signal transduction that involves secondary intracellular messengers. Phospholipase C hydrolyzes phosphatidylinositol 4,5-bisphosphate to produce diacylglycerol and inositol 1,4,5-trisphosphate, which was experimentally shown to be correlated with an increase in intracellular calcium ion concentration [65, 67, 68].
The disruption of the dopamine pathway involves desensitization of the D2 dopamine receptor in the brain, which stems from neurotensin increasing cAMP production through increased protein kinase activation. This cascade increases dopamine production in response to neurotensin release; however, a side effect of the potentiation of dopamine by neurotensin is desensitization of the response [65, 67, 68]. This results in a diminished response to agonistic binding despite repeated treatment. This diminished response has been found to be irreversible after an extended (90 min+) washout treatment [65, 67, 68]. In the peripheral system, neurotensin is responsible for the relaxation and contraction of gastrointestinal muscles. In the guinea pig ileum, it was found that neurotensin triggers the contraction of the ileal smooth muscle, and after repeated exposure, desensitization occurs, much like in the brain. The contraction of the muscle is counteracted by several receptor antagonists, including atropine and tetrodotoxin [65, 67, 68]. The release of neurotensin in the gut also stimulates the subsequent release of acetylcholine, a more effective muscle contractor, as the result of post-ganglionic excitation. Neurotensin also self-regulates the opposite effect, which is the inhibition of the contraction response. While atropine is an antagonist, it inhibits only approximately 50% of the smooth muscle contraction response. In the absence of other antagonists, neurotensin induces the release of SP from enteric sensory nerves, which prompts a return to basal levels due to specific auto-desensitization [65, 67, 68].

1.2.4 Adrenocorticotropin Hormone

Stress hormones are released in response to a physical or psychological stress or demand. These hormones, unlike neurotransmitters or neuropeptides, do not act upon receptors of the neurons directly, but incite a physiological response in distant tissues.
Synthesized from the same pre-POMC polypeptide that releases endorphins, ACTH is a stress hormone that is released from the pituitary gland and subsequently incites the synthesis of corticosteroids in the adrenal cortex [5]. ACTH release is primarily regulated by corticotropin releasing factor (CRF), which is synthesized in the hypothalamus and then reaches the pituitary through the blood. Other studies have shown that adrenoreceptors induce ACTH release upon agonistic binding of isoproterenol [5]. ACTH is also essential in the regulation of catecholamines and other hormones released from the adrenal gland. The adrenal gland is responsible for the release of essential catecholamines and hormones that regulate brain chemistry and behavior, including dopamine, norepinephrine and subsequently epinephrine [5]. ACTH regulates glucocorticoid synthesis, which then mediates the conversion of norepinephrine to epinephrine. In addition, glucocorticoids regulate the activity of tyrosine hydroxylase, the enzyme responsible for the conversion of tyrosine to DOPA, the precursor to dopamine. The release of ACTH from the pituitary can be regulated not only by CRF and agonistic binding to adrenoreceptors, but by other factors as well. Hormone induced secretion of ACTH requires intracellular secondary messengers to induce a signal [5]. Vasopressin, catecholamine and CRF binding activates a guanine nucleotide stimulatory protein, which then activates adenylate cyclase. Activated adenylate cyclase then catalyzes the formation of cAMP. cAMP in turn activates protein kinase, a catalyst for the phosphorylation of the substrate protein, which then stimulates release of ACTH. ACTH release is potentiated by calcium ion influx through the cellular membrane, induced by polarization from extracellular potassium. ACTH release is inhibited by the presence of glucocorticoids as part of a systematic regulatory response.
Somatostatin also inhibits ACTH release through the inactivation of adenylate cyclase, which prevents the formation of cAMP [5].

1.3 Exogenous Bioactive Peptides

As opposed to endogenous bioactive peptides, which result from partial hydrolysis of an encoded protein for activation, exogenous bioactive peptides are sourced from outside protein sources, with partial hydrolysis and activation occurring through the digestion process. Once introduced into the body, a protein is exposed to a multitude of enzymes that break down foodstuffs into essential building blocks that can be absorbed for nutrition. For maximum nutrient absorption, proteins are completely hydrolyzed into individual amino acids and then absorbed through the intestinal lining [17, 69-71]. Degradation of the protein occurs through sequential exposure to several primary enzymes: pepsin in the stomach, followed by trypsin and chymotrypsin. Bioactive peptides from exogenous sources are those that survive the digestion process as peptides two amino acids in length or larger and are absorbed through the intestinal lining, which allows them to trigger a physiological response [17, 69-71]. The physiological responses to peptides from exogenous sources are varied, ranging from beneficial responses, such as blood pressure regulation and antioxidative effects, to deleterious effects from inappropriate immunological responses.

1.3.1 Angiotensin Converting Enzyme Inhibitory Peptides

Blood pressure is subject to endogenous regulation by bradykinin and ACE I; however, blood pressure can also be regulated through the consumption of food proteins sourced from milk, fish, egg, grains, and other commonly consumed proteins [17, 70]. An assortment of exogenously-derived peptides that regulate blood pressure inhibit the function of ACE I, thereby preventing conversion of angiotensin I to angiotensin II, which results in a decrease in blood pressure.
While ACE inhibitory peptides from different protein sources do not share a completely conserved sequence, there are similarities within the group [17, 72, 73]. ACE inhibitory peptides are typically rich in hydrophobic residues with a proline, lysine or arginine at the C-terminal end. Typically short peptides, ACE inhibitors are mostly di- and tri-peptides, but range as large as nonapeptides [17, 72, 73]. ACE inhibitory peptides are categorized into three different subgroups: the inhibitor type, the prodrug type and the substrate type [72]. The inhibitor type are those peptides whose IC50 values are unaffected by incubation with ACE I. This implies that partial degradation by ACE (a dipeptidylcarboxypeptidase), if it occurs at all, neither diminishes nor enhances the function [72]. Prodrug type inhibitors are those peptides whose inhibitory potency is enhanced (that is, whose IC50 values decrease) upon partial degradation by ACE I. For example, LKPNM, a peptide from bonito muscle, exhibited inhibitory effects with an IC50 of 2.40 µmol/L, but further proteolysis cleaved it to LKP and lowered the IC50 to 0.32 µmol/L [72]. The substrate type of ACE inhibitory peptides are those that exhibit inhibitory effects before incubation with ACE, but whose inhibition is weakened (IC50 increased) or completely negated once exposed to ACE. Continued hydrolysis of these bioactive peptides by ACE destroys the active site of binding, rendering the peptide non- or less functional [72]. For inhibitory peptides to be effective, they must be able to survive the digestive process and be absorbed through the intestinal lining and into the blood stream, where they can bind to ACE and act as inhibitors.

1.3.2 Exogenous Opioid Peptides

While endogenous peptides are an excellent source of opioid activity, food, when ingested, can also affect appetite, behavior and the motility of the gastrointestinal system.
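
Returning briefly to the three ACE inhibitory peptide subgroups of Section 1.3.1, the classification logic can be sketched in Python from IC50 values measured before and after incubation with ACE. This is a minimal illustration, not a published procedure: the function names and the `tolerance` cutoff for calling an IC50 "unaffected" are hypothetical, since the text does not define a threshold. Only the LKPNM and LKP values (2.40 and 0.32 µmol/L) come from the text.

```python
import math


def fold_change_in_potency(ic50_before: float, ic50_after: float) -> float:
    """Fold increase in potency; a lower IC50 means a more potent inhibitor."""
    return ic50_before / ic50_after


def classify_ace_inhibitor(ic50_before: float, ic50_after: float,
                           tolerance: float = 0.1) -> str:
    """Assign a subgroup from IC50 values before and after ACE incubation.

    `tolerance` is a hypothetical relative cutoff for treating two IC50
    values as unchanged; it is an assumption made for this sketch.
    """
    if math.isinf(ic50_after):
        return "substrate"  # inhibition completely negated
    relative_change = (ic50_after - ic50_before) / ic50_before
    if abs(relative_change) <= tolerance:
        return "inhibitor"
    return "prodrug" if relative_change < 0 else "substrate"


# LKPNM (2.40 µmol/L) is cleaved to LKP (0.32 µmol/L).
print(classify_ace_inhibitor(2.40, 0.32))
print(round(fold_change_in_potency(2.40, 0.32), 2))  # about 7.5-fold
```

With the values quoted for the bonito peptide, the sketch places LKPNM in the prodrug subgroup, since its cleavage product is roughly 7.5 times more potent.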
All of these effects are related back to opioid binding, and agonists and antagonists have been discovered from sources including milk, wheat gliadin proteins, corn zein, and barley hordein fractions [17, 70]. The most extensively researched, peptides derived from the partial digestion of milk proteins (casomorphins, lactorphins), bind agonistically, resulting in morphine-like effects similar to those produced when endorphins and enkephalins bind to opioid receptors in the brain [17, 70]. Common to agonistically binding peptides, a terminal tyrosine and proline residue motif is needed to bind to receptor sites. A positive charge near the terminal tyrosine increases the efficacy of binding [17, 70, 74]. In addition to agonists, antagonistic peptides from milk (lactoferroxins and casoxins), produced by partial digestion, counteract naturally produced enkephalins, reducing binding efficacy [17, 70, 74-76]. Appetite can be regulated by a variety of opioid peptides, including milk derived casomorphins, which modify endocrine systems to increase pancreatic output of insulin [17, 70, 74-76]. Wheat gluten is a source of several different exorphins, shown to have a variety of effects. Most notable of these is motility control of the bowels, which can lead to constipation if excessive concentrations of the opioid peptides are introduced [17, 70]. Opiate peptides that exhibit the least effective binding are those that are the most susceptible to further degradation by peptidases when passing through the brush border of the intestine. Also, research into the relationship between wheat and milk products and mental health has been of some interest. As both milk and wheat hydrolysates yield opiate receptor agonists and antagonists, a popular theory has been that consumption of these products can alter brain chemistry, particularly in those afflicted by autism and schizophrenia [77].
As a proposed treatment, a gluten-free, casein-free diet has been adopted by several who abide by this theory. While studies have been published on the effects of gluten and casein digests on those afflicted with autism in particular, the studies have been lacking, specifically in a control group or double blind design for verification of results [77].

1.3.3 Immunomodulatory Peptides

Because humans are mammals, milk is the first source of nutrients to which they are exposed as infants. Therefore, it is logical that milk is an essential source for the development of a healthy digestive system as well as a healthy immune system. Immunomodulatory peptides include those that enhance and those that suppress the immune system [17, 70]. Naturally occurring peptides, those that do not require digestion of a proprotein for release of the bioactive form, include the proline-rich polypeptide, which enhances the response to red blood cells and induces cytokine production as well as B lymphocyte growth and regulation [17, 70]. Milk growth factor, another naturally occurring peptide, has been shown to suppress T-lymphocyte function in humans, an essential part of the immune recognition cascade [17, 70]. Peptides derived from the proteolytic digestion of casein in the gastrointestinal tract have also been shown to be immunomodulatory. While milk is generally thought to be beneficial, some of its immunomodulatory effects have been shown to be suppressive rather than stimulatory. Bovine κ-casein as well as other bovine casein peptides inhibit immune responses in mouse spleen lymphocytes via a reduction of proliferation [7, 8, 17, 70, 75, 76, 78, 79]. In humans, bovine milk proteins also suppress proliferation of peripheral blood lymphocytes (PBL), but only at low concentrations of β-casomorphin and β-casokinin. At higher concentrations, proliferation is induced. Proliferation of PBLs is also induced by small aromatic peptides derived from casein and lactalbumin [17].
While there is no overriding trend in the sequence of peptides that modulate immune response, it is important to note that the major source of immunomodulatory effects of milk comes not from the major milk proteins lactoglobulin and lactalbumin, but from antibodies secreted into milk. It has been estimated that 80-90% of the antibodies in bovine milk are IgG [70]. These antibodies have specificity against human gut pathogens, which can be further boosted by injecting pregnant cows with a polyvalent human-gut bacteria vaccine. The bacteria are then recognized by the bovine immune system, generating antibodies. Benefits of ingesting immunized milk products include a prophylactic effect, in which people consume antibodies generated against particular bacteria and viruses [70]. This is particularly significant for those who are immunocompromised, such as infants and the elderly.

1.3.4 Antioxidative Peptides

While oxidation reactions are necessary for biological function, oxidation must be controlled to prevent oxidative damage to important cellular substances and subsequent cell death. Antioxidants, which can include reducing agents such as ascorbic acid and thiols, can be either produced endogenously, as in the case of glutathione, or taken in from exogenous sources. Antioxidative peptides are derived from a variety of sources, including soybeans, fish, and milk [17, 70]. One common aspect of many antioxidative peptides lies with histidine residues, which are capable of chelating metal ions, but also serve as activated oxygen quenchers and hydroxyl-radical scavengers. Carnosine, a small peptide (β-alanyl-histidine) found in fish muscle cells, has high antioxidative activity and was found to inhibit lipid peroxidation, chelate divalent metals, and inhibit the oxidative stress from hydroxyl radical exposure [17, 70].
While each aspect is important in managing oxidative stress, it is the combination of all three factors (metal chelation, oxygen quenching, and hydroxyl-radical scavenging) that makes antioxidative peptides the most effective. In milk, caseinophosphopeptide exhibits these behaviors as well, chelating ferrous iron as well as scavenging both peroxyl and hydroxyl radicals in vivo. Antioxidative properties of caseinophosphopeptide are attributed to the phosphates, which can stabilize the radicals [17, 76, 80]. Antioxidative peptides are essential for the disruption of an oxidative pathway, regulating oxidative stress in the body.

1.3.5 Antimicrobial Peptides

As was the case for immunomodulatory peptides, antimicrobial peptides also play an important part in regulating health. As multi-drug resistant pathogens present an increasing problem within the health care system, alternative methods for the treatment of infections are needed. In recent years, medicine has turned to antimicrobial peptides as a solution, with advantages of increased specificity and activity with lower toxicity as well as a new mode of attack against the microbes. In 2004, there were 720 antimicrobial drugs either on the market (5%), in clinical trials (38%), or in preclinical phases (56%) [81]. The mechanism of killing by antimicrobial peptides is currently best understood for bacteria. For an antimicrobial drug to be effective, it must first be attracted to the target. This attraction is mediated by the cationic charge of the peptide, which has a high affinity for the negative charge on the bacterial surface [82, 83]. After initial attraction to the target, the peptide attacks the bacterial cell membrane, penetrating into the cytoplasmic membrane. This leads to lysis and cell death.
Penetrating these membranes, which are composed of layers of peptidoglycans (gram positive) or of lipopolysaccharides and peptidoglycans (gram negative), requires a peptide with an amphiphilic, helical structure that allows it to form pores through the lipid bilayer. The ratio of hydrophobic to hydrophilic residues in an antimicrobial peptide ranges from 1:1 to 2:1 [82, 83]. While the endogenous production of antimicrobial peptides is essential for normal defenses, antimicrobial peptides can also be consumed for increased defense against attack. The main exogenous sources of antimicrobial peptides include milk, fish and grains [16, 17, 70, 75, 82-84]. Lactoferrin in milk, found in both human and bovine sources, exhibits antimicrobial effects once hydrolyzed through the digestion process, producing lactoferricin. Used to improve the natural flora of the bowel and prevent infections, lactoferricin has been shown to be effective against both gram positive and gram negative bacteria, including E. coli, Streptococcus sp., Salmonella sp., Pseudomonas sp., and Enterobacter sp. [16, 17, 70, 75, 82-84]. Lysozyme is a prominent protein found in a variety of secretions, including tears and saliva, as well as in milk and egg whites, which can be consumed. A relatively small protein or large peptide with a molecular weight of approximately 14 kDa, it does not need to be enzymatically freed for bioactivity. Lysozyme acts by attacking the peptidoglycan layer of bacterial cells, which is its natural substrate [16, 17, 70, 75, 82-84]. Although lysozyme was previously thought to be ineffective against gram negative bacteria, which have a lipopolysaccharide layer protecting the peptidoglycan layer, it has been shown that gram negative bacteria can be rendered susceptible by EDTA or heat [16, 17, 70, 75, 82-84].
In grains, thionins, which are cysteine- and lysine-rich peptides, have antimicrobial activity against bacteria and fungi, likely due to the combined hydrophobic and hydrophilic nature of these medium-sized peptides [16, 17, 70, 75, 82-84]. This amphiphilic character, along with the formation of helical structures, is essential for the formation of pores in cellular membranes, and is maintained across both exogenous and endogenous sources.

1.3.6 Antigenic Peptides

Although the consumption of foodstuffs is generally thought to be a beneficial activity, providing nutrients and, as previously mentioned, a multitude of peptides with other beneficial modes of action including antioxidant, antimicrobial, and regulatory activity, foodstuffs can also have a deleterious physiological effect. Partially digested peptides are largely responsible for inappropriate immune responses in food allergies [85-91]. The seven most common food allergies are to milk, peanuts, tree nuts, shellfish, fish, eggs, and wheat [85]. Allergic reactions range from skin irritations to more serious forms including anaphylaxis or, in the case of wheat, Celiac Disease (CD), a severe autoimmune disorder [85, 88, 91-97]. Each of these foods presents multiple proteins which, when digested by gastrointestinal enzymes, result in peptides larger than single amino acids that then cross the intestinal brush border. At this point an immune cascade is initiated in sensitive individuals, triggered by recognition of the peptide by the Major Histocompatibility Complex (MHC). The MHC then presents the peptide on the cell surface for further recognition by T-cells (T lymphocytes). Once the peptide is recognized by the T-cells as “not self”, more T-cells are produced for future action, which then leads to the deleterious immune reaction [85, 91, 93-100]. Wheat incites a wide variety of immune responses, ranging from skin irritation to serious forms of immune responses such as anaphylaxis, and in the most extreme cases CD.
The immune response can be incited by one of the four major protein fractions in wheat: the water-soluble albumins, the salt-soluble globulins, the weak-acid-soluble glutenins, and the alcohol-soluble gliadins. Gluten, the combination of glutenin and gliadin fractions which gives wheat the desired elasticity for baking, is the major component that provokes immune responses [101, 102]. This is due to the unique composition of the proteins, which lack proteolytic digestion sites, resulting in peptides that survive proteolysis and reach the small intestine. In the case of CD, an inherited autoimmune disease, the gliadin proteins considered to be the antigen are rich in proline and glutamine residues [92-100, 103-122]. Increased antigenicity is achieved through deamidation of the glutamine as it crosses the brush border through action of transglutaminase enzymes [95, 96, 99, 105, 110, 114, 121, 123-127]. CD causes inflammation of the small intestinal wall and atrophy of the intestinal villi, which inhibits normal absorption of nutrients, and this makes CD particularly painful and severe for those afflicted [92-100, 103-122]. Spelt, a relative of wheat, has been suggested as a possible grain substitution for those afflicted with wheat allergies due to an increased ease of digestibility. Spelt contains gliadin protein sequences similar to those of wheat, and individuals suffering from CD have been advised to abstain from consuming spelt for this reason [109, 122, 128-133]. The α- and γ-gliadin sequences of the wheat and spelt proteins are 70-90% identical, depending on the accession numbers used. Although the wheat protein sequence has been elucidated, wheat is hexaploid, making its genetics particularly complex and subject to sequence diversity among genes of a specific family.
Spelt is also hexaploid and retains a high degree of homology with the wheat gliadin sequences, which makes wheat and spelt ideal non-model samples for investigating digestion-resistant peptides that result from simulated gastrointestinal digestion [109, 122, 128-133].

1.4 Current State of Bioactive Peptides

While many biologically active peptides have been identified and their activities determined, the full range of biologically active peptides is poorly characterized. Although the human genome sequence was completed in 2003, knowing the genome does not mean that the function of all proteins in the body is known [134]. Furthermore, bioactive peptides result from the proteolytic processing of both endogenous and exogenous proteins, and proteolysis then generates peptides with a wider range of potential physiological functions. Alternate reading frames can encode different genes and therefore different proteins, making the proteome and the peptides derived from its degradation increasingly complex and challenging to identify [135]. G protein-coupled receptors mediate numerous physiological responses; however, the human genome project indicates there are many “orphan receptors”, G protein-coupled receptors whose ligands have yet to be found [136]. Knowledge of exogenously consumed bioactive peptides is similarly limited. With the rise in frequency of food allergies, it is increasingly important to identify the antigens in the foods consumed. However, relatively few peptides proven to be antigenic have been discovered. For example in wheat, out of the thousands of peptides that can be generated from the gastrointestinal digestion of gliadin proteins, only on the order of tens of peptides have been identified as antigenic, leaving open the question of how many other peptides exhibit the same antigenicity [99, 100, 108-110, 112, 115, 120, 126, 137-143]. In view of these factors, there is great need to develop improved approaches for profiling peptides in complex mixtures.
1.5 Current Technologies in Detecting and Identifying Bioactive Peptides

1.5.1 Isolation and Detection

Methods and analyses for the detection and identification of proteins and peptides have been in development for over a century, with major advancements having been achieved in the last few decades, particularly in regard to developments in mass spectrometry technologies. The analysis of an entire complement of proteins in a biological sample has been given the term “proteomics”, which combines protein with genome, implying the integration of the two: all the proteins which are coded for by a genome [144]. Given that an entire set of proteins can exceed tens of thousands for an individual organism, isolation of proteins often served as a key step in the analysis. Prior to mass spectrometry, isolation and characterization of proteins relied upon gel-based separations, which could characterize proteins by size and isoelectric point, and upon assays to characterize activity [145-151]. Further identification of the isolated proteins was often performed using Edman degradation, which yielded sequential release of derivatized amino acids from the N-terminus of the protein, and analysis of the products allowed the amino acid sequence to be determined [152]. While mass spectrometry has been in practice since the early 1900s, intact proteins could not be suitably analyzed until 1988 because ionization methods yielded fragmentation of the protein ion and loss of molecular mass information. With the development of Electrospray Ionization (ESI) mass spectrometry by John Fenn in 1988, molecular ion mass information without fragmentation was obtained for proteins with molecular masses above 20 kDa [153-155].
ESI involves passing the protein analytes through a capillary with a high voltage differential of approximately 3000 V between the capillary exit and the entrance to the mass spectrometer, and the process creates charged droplets from the solution. The solvent and the analyte are then expelled from the capillary tip and nebulized; as the solvent evaporates in the nebulizing gas, the droplets split into smaller droplets and eventually release gas-phase ions for mass analysis. This process allows for the creation of multiply-charged ions as well as singly-charged ions without in-source degradation [155-157]. Matrix-assisted laser desorption ionization (MALDI), developed around the same time as ESI, is an alternative soft ionization technique that is also suitable for the analysis of proteins. In contrast to ESI, in which samples are ionized from solution, MALDI requires analytes to be fixed and dried within a matrix on a solid surface. The matrix is essential for transferring laser energy from the beam to the analytes [158, 159]. Once sufficient energy has been transferred, a plume is desorbed from the surface, which includes a mixture of charged analytes and neutral matrix. The energy dispersed among the analytes produces a variety of ions, dominated by singly-charged molecular ions, but also including smaller quantities of multiply-charged ions [158, 159]. During the energy transfer process, ions are generated with sufficient internal energy that fragment ions derived from proteins and peptides may often be observed [158]. Once proteins could be analyzed by mass spectrometry, the main issue facing proteomics became the number of different proteins that can be analyzed in a single analysis. As previously mentioned, to analyze thousands of proteins, they usually need to be separated from other potentially-interfering substances, including other proteins or peptides.
Mass spectrometry can be combined with separation techniques, as was first achieved by McLafferty and Gohlke in 1956 with the combination of gas chromatography with mass spectrometry [160]. For nonvolatile samples, particularly labile analytes such as proteins, high performance liquid chromatography (HPLC) is combined with ESI and mass spectrometric detection. HPLC involves driving a solution of analytes through a column filled with a stationary phase that allows for differential partitioning between mobile and stationary phases, so that analytes that partition more strongly to the stationary phase elute later [161-163]. HPLC offers higher plate numbers, a measure of separation efficiency, through the use of smaller particles, which yields higher chromatographic resolution. The detection of proteins by liquid chromatography-mass spectrometry requires a mass analyzer and detector capable of analyzing high mass molecules in the time frame of an analyte’s elution from the column. One common mass analyzer used for whole protein analysis is the Time-of-Flight (ToF) mass analyzer, which has been paired with the microchannel plate detector [164, 165]. In the ToF mass analyzer, analytes are separated by m/z based on the relationship between mass, velocity, and kinetic energy. Packets of ions are pulsed, giving them the same kinetic energy if they have a common charge state, and the time to reach the detector is measured. As kinetic energy is related to mass and velocity by $E_k = \tfrac{1}{2}mv^2$, heavier ions of the same kinetic energy travel more slowly, so the longer it takes an ion to reach the detector, the larger the m/z [160, 164].

1.5.2 Identification

While HPLC-ToF mass spectrometry is appropriate for the analysis of some intact proteins, many proteins lack adequate solubility to be separated using common columns and mobile phases. For this reason, many proteome analyses rely on proteolytic digestion to generate smaller peptides that have greater solubility and offer improved chromatographic behavior.
In addition, the identification of proteins by mass spectrometry requires subsequent fragmentation along the peptide backbone to confirm the identity because molecular mass alone is often unable to discriminate all proteins from a given source. One of the more commonly used methods for comprehensive proteome analysis is the “bottom-up” approach [144, 147-149, 151, 166-186]. As fragmentation of large intact proteins often suffers from low yields of fragmentation and great spectral complexity when product ions are formed, specific enzymatic digestion is used to break the proteins into smaller peptides. The most common approach employs trypsin, as it cleaves at the basic residues lysine and arginine, leaving a basic C-terminal residue with a greater affinity for protonation [144, 147-149, 151, 166-186]. Enzymatic digestion can be carried out in gels after electrophoretic separation, or in solution, and the digest products are then often injected onto an HPLC column. After peptides are injected onto the column, they are separated, reducing the complexity of ions reaching the mass spectrometer at a given time. Eluting analytes are then ionized, and the resulting ions are isolated by mass, induced to fragment, and the resulting fragments are then detected. To isolate ions by mass, quadrupole or ion-trapping mass analyzers are used. Quadrupoles use a combination of direct current (dc) and radio-frequency (rf) potentials applied to four rods through which the ions pass. At specific combinations of dc and rf potentials, only ions with a particular m/z value have a stable trajectory through the quadrupole, and ions of all other m/z values are removed [187]. 
A second quadrupole can be used as a collision cell, in which the ions transmitted by the first quadrupole are entrained. The chamber is flooded with a collision gas and the ions are “tickled” to incite collisions with the gas molecules that convert translational energy into internal vibrational energy, which in turn results in dissociation reactions that form product ions. These product ions are then sent to another mass analyzer for analysis; this method is referred to as tandem mass spectrometry or MS/MS [188]. To combat the complexity of the proteome, and to maximize the number of proteins identified, John Yates’ group at The Scripps Research Institute developed a technique known as multidimensional protein identification technology (MudPIT) [171]. MudPIT uses two different stationary phases attached in series to further increase the separation of the peptides that result from a proteolytic digestion. During peptide separations, a survey scan is performed to determine the most abundant peptides eluting during a narrow time window, and peptide ions are selected, isolated, and subjected to further fragmentation [186]. This method is known as data dependent analysis (DDA), in which the peptides that are selected for dissociation are determined by the ion abundances in the survey scan and a series of rules established before the analysis [147, 170, 189, 190]. The benefit of using MudPIT in combination with DDA is that the two-dimensional chromatographic separation reduces the co-elution of peptides, and therefore the number of peptides eluting within a single scan, allowing normally less abundant peptides to become more prominent and be selected for subsequent dissociation [149, 171].

1.5.3 Fragmentation

Following ion activation, peptides fragment by two major processes: charge-directed and charge-remote reaction mechanisms.
At the lower collision energies experienced in most tandem mass spectrometers based on quadrupole or ion-trap designs, the charge-directed pathway is relatively more efficient owing to lower activation energies, with charge-remote reactions often less rapid [191-193]. The charge-directed pathway is also explained using the mobile proton model, in which proton migration plays a key role in driving dissociation reactions at multiple sites along a peptide backbone. Since most peptides have at least one basic site of protonation, richer structural information is facilitated by protonation at a second, less-basic site. Protonation of a peptide is determined by proton affinity, with basic side chain groups, particularly arginine and lysine side chains, having greater proton affinities than the amine terminus, followed by the amide CO and the amide nitrogen [191-194]. Generation of informative b and y type ions from cleavage of peptide bonds has been explained as involving migration of the proton to an adjacent amide nitrogen. Protonation of the amide nitrogen on the backbone both weakens the amide bond and makes the protonated amide group a good target for nucleophilic attack by an electron-rich group [191-194]. When the proton moves to the amide backbone, the peptide can cleave by direct bond cleavage, but this is unfavorable in low energy CID since it requires high amounts of energy to be deposited into the ions through collisions. Most excited ions undergo a more favorable low energy rearrangement through nucleophilic attack on the amide bond. Fragmentation through the b-y ion pathway relies on a nucleophilic attack by the carbonyl oxygen of the adjacent N-terminal amide group, forming an oxazolone ring. The amide bond then breaks, forming either a b or a y ion, which represent products of cleavage of the amide bond with charge retention on the N- and C-terminal sides respectively.
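The b and y ion definitions above translate directly into arithmetic on residue masses. The following is a minimal sketch, not the method used in this work: it assumes singly-charged fragments, unmodified residues, and standard monoisotopic residue masses. The function name `fragment_ions` and the example sequence are hypothetical and chosen only for illustration.

```python
# Illustrative sketch: singly-charged b and y ion m/z values for an
# unmodified peptide. Residue masses are standard monoisotopic values.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276   # mass of the charge-carrying proton
WATER = 18.010565   # H2O retained by the C-terminal (y) fragment


def fragment_ions(sequence):
    """Return (b_ions, y_ions) as lists of m/z values for charge 1+.

    b_ions[i-1] holds the first i residues (charge on the N-terminal side);
    y_ions[i-1] holds the last i residues (charge on the C-terminal side).
    """
    masses = [MONO[aa] for aa in sequence]
    b_ions, y_ions = [], []
    for i in range(1, len(masses)):
        b_ions.append(sum(masses[:i]) + PROTON)
        y_ions.append(sum(masses[-i:]) + WATER + PROTON)
    return b_ions, y_ions


if __name__ == "__main__":
    # Arbitrary proline- and glutamine-rich sequence, for illustration only.
    b, y = fragment_ions("PQPQLPY")
    for i, (bi, yi) in enumerate(zip(b, y), start=1):
        print(f"b{i} = {bi:.4f}    y{i} = {yi:.4f}")
```

For any cleavage site, the complementary b and y ions sum to the mass of the singly protonated peptide plus one additional proton, which provides a simple consistency check on an assigned spectrum.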
Whether a b or y ion is formed depends upon proton affinity, and is thermodynamically controlled [191-194]. While singly-charged peptides are capable of undergoing fragmentation, this requires more energy than for doubly-charged peptides, particularly to generate a series of ions derived from cleaving a wide assortment of peptide bonds. Therefore, under normal conditions, small singly-charged peptides are less likely to undergo y ion formation, instead forming an abundance of b ions because the charge is localized at the N-terminus [193, 195, 196]. In contrast to charge-directed fragmentation of peptides, charge-remote fragmentation involves fragmentation of the backbone at a site removed from that of the charge. With acidic residues, the oxygen of the OH in the carboxylic group of the aspartate or glutamate residue attacks the carbonyl of the amide backbone, breaking the amide bond. The charge is fixed to the b ion, where it remains on the N-terminus [191, 192]. Charge-remote pathways at low energies are directed by the chemical properties of residues contained within the peptide itself. Peptides containing acidic residues (aspartate and glutamate) undergo fragmentation that is distinct from that of other peptides [191, 192]. In contrast to those peptides that produce b and y ions, acidic peptides produce abundant b ions from cleavage on the C-terminal side of the acidic residues.

1.5.4 Comparative Peptide Quantification

The most common methods of determining peptides and proteins that differentiate biological states or samples use mass tags to distinguish those peptides that originate from specific biological states or samples. While these methods can target, sequence, and quantify proteins or peptides that are differentially expressed, such targeted processing is often less universal, often requiring significantly more effort than the proposed automated non-targeted approach.
For example, Isotope Coded Affinity Tags (ICAT) rely on the presence of cysteine residues for the covalent binding of light and heavy biotin tags to differentiate two different states prior to digestion. Therefore, proteins with fewer than five cysteine residues often fail to give useful tagging and subsequent recovery through affinity chromatography. As a result, it has been estimated that only 10% of a total protein digest has been commonly recovered and analyzed by MS/MS using ICAT, leaving a significant portion of the digest omitted from analysis. In contrast, a non-targeted approach has the capacity to detect all mass signals in a digest [197], provided their signal is above the lower limit of detection. The use of isobaric tags allows for comparisons of more than two biological states, with the introduction of 4- or 8-plex tagging systems. In the case of Isobaric tags for Relative and Absolute Quantitation (iTRAQ) and dimethyl labeling, where the tags derivatize amine groups on a global scale rather than the cysteine residues alone, the number of tagged proteins increases. However, despite the increased depth of coverage compared to ICAT, introduction of tags increases the complexity of the spectra, as the sample contains not only one complex protein digest, but two, each with labeled and unlabeled peaks. The non-targeted peak detection and multivariate analysis approach removes the need for labels, directly comparing the peak area responses of the two digests automatically without any user input other than defining parameters. By automating the detection, alignment and integration of peaks in the digests, the process of profiling complex peptidomes is compatible with shorter HPLC gradients, and yields higher throughput than the proteomic tagging methods for quantification.
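As a rough illustration of the label-free comparison just described, the sketch below pairs features from two digests by m/z and retention time and then compares their peak areas. It is a simplified stand-in for automated alignment software, not the processing pipeline used in this work; the function names, the tolerance values, and the example feature lists are all hypothetical.

```python
import math


def match_features(sample_a, sample_b, mz_tol=0.01, rt_tol=0.2):
    """Pair features, given as (m/z, retention time in min, peak area),
    that agree within the stated tolerances. Each feature in sample_b
    is used at most once; the first acceptable match is taken."""
    pairs = []
    used = set()
    for mz_a, rt_a, area_a in sample_a:
        for idx, (mz_b, rt_b, area_b) in enumerate(sample_b):
            if idx in used:
                continue
            if abs(mz_a - mz_b) <= mz_tol and abs(rt_a - rt_b) <= rt_tol:
                pairs.append((mz_a, rt_a, area_a, area_b))
                used.add(idx)
                break
    return pairs


def log2_fold_changes(pairs):
    """Return (m/z, retention time, log2(area_b / area_a)) per matched pair."""
    return [(mz, rt, math.log2(area_b / area_a))
            for mz, rt, area_a, area_b in pairs]


if __name__ == "__main__":
    # Made-up feature lists standing in for two digests.
    digest_1 = [(500.250, 12.1, 1000.0), (612.300, 15.4, 400.0)]
    digest_2 = [(500.252, 12.2, 4000.0), (700.400, 20.0, 50.0)]
    for mz, rt, lfc in log2_fold_changes(match_features(digest_1, digest_2)):
        print(f"m/z {mz:.3f} at {rt:.1f} min: log2 fold change {lfc:+.2f}")
```

Features found in only one digest are dropped by this sketch; in practice such features may be among the most informative, so a real workflow would need to retain them, for example by assigning a floor value to the missing area.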
To sufficiently reduce the complexity of the digest prior to MS/MS for the quantification methods, two-dimensional separations or prior fractionation steps may be required, extending the analysis to a time scale of days rather than the time scale of hours presented here. Once differentiating peptides have been recognized with a high degree of confidence, an inclusion list can be made for a more focused analysis using MS/MS, ensuring that as many peptides of interest as possible are annotated while excluding those that do not differentiate. Finally, the reagents needed for tagging the proteins or peptides prior to analysis can cost thousands of dollars, which can become limiting when attempting to analyze large sample sets. Though label-free profiling approaches reduce the complexity of the spectra and increase the throughput of analysis compared to common quantitative techniques, label-free approaches to quantification via MS/MS have some disadvantages. Spectral counting, a method that counts the number of spectra for a peptide that yield a positive identification for a protein, is based upon the assumption that more abundant proteins produce more peptides that will map back to the parent protein. However, in the case of peptidomics, this method has significant drawbacks. The goal of peptidomics is to survey the entire complement of peptides, not proteins, so determining what parent protein they derive from does not satisfy the central goal of peptidome profiling. Also, when attempting to determine which peptides differentiate genotypes (as with simulated gastric digests of wheat and spelt), DDA and spectral counting may miss low-abundance proteins. Recent improvements in mass spectrometer scan speeds have improved DDA coverage by shortening the analytical duty cycle.
One way to reduce the number of peptides excluded by DDA is to use a dynamic exclusion list, in which a particular m/z value that has previously been chosen for fragmentation does not get selected again. While dynamic exclusion has been useful in proteomics, where excluding signals that were previously measured or derived from contamination allows less abundant signals to be probed and boosts the identification of proteins, dynamic exclusion can be blind to sequence isomers of small peptides.

1.5.5 Current Advancements in Technology

Proteomic analysis is technology driven, with ongoing technological advances allowing for further advances in the field. As it stands currently, improved technologies have increased ion transmission and sensitivity, increasing the number of analytes detected and the signal delivered to the detector. A number of instruments on the market today each offer different advantages. The Orbitrap from Thermo Scientific has been paired with several configurations of linear ion trap mass analyzers or a quadrupole mass analyzer on the front end. The Orbitrap is an ion trap that uses electrostatic fields to trap ions, and uses image currents of the oscillating ions to produce a time-domain signal that is then Fourier transformed to yield a mass spectrum. The Orbitrap offers exceptional mass resolution, frequently exceeding 100,000, and can also be fitted with a variety of ion activation accessories including collision-induced dissociation (CID) and electron transfer dissociation (ETD) [198]. Exquisite mass accuracy, often with mass errors of less than 1 part-per-million, is obtained, and additional recent improvements have made faster scan speeds possible. Time-of-flight instruments have also improved in sensitivity, mass resolution, and mass accuracy in recent years.
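To put the figures of merit quoted above in concrete terms, the short sketch below computes a mass error in parts-per-million and the peak width implied by a given resolving power, using the common definition of resolution as m divided by the peak width at half maximum. The function names and the example masses are hypothetical and serve only as an illustration.

```python
def ppm_error(measured_mz, theoretical_mz):
    """Signed mass error in parts-per-million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def peak_width_at_resolution(mz, resolution):
    """Peak width (full width at half maximum) implied by a resolving
    power defined as m / delta-m."""
    return mz / resolution


if __name__ == "__main__":
    # Made-up values: an ion expected at m/z 1000.000, observed at 1000.001.
    print(f"mass error: {ppm_error(1000.001, 1000.000):.2f} ppm")
    # Width of a peak at m/z 500 when the resolving power is 100,000.
    print(f"peak width: {peak_width_at_resolution(500.0, 100_000):.4f} m/z units")
```

At a resolving power of 100,000, a peak at m/z 500 is about 0.005 m/z units wide, which indicates how closely spaced two co-eluting peptide ions can be while still appearing as separate peaks.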
1.6 Peptidomics

In contrast to the bottom-up proteomics workflow, profiling of pools of bioactive peptides requires an alternative approach because of the greater diversity of the population of peptides compared to peptides generated by action of a single defined protease. In 2001, Peter Schulz-Knappe coined a new term, “peptidomics”, as “the technology for the comprehensive qualitative and quantitative description of peptides in a biological sample” [199]. Compared to “bottom-up” proteomics, where the goal is to identify proteins from peptide sequence information, peptidomics aims to identify as many intact peptides as possible without resorting to further proteolysis. As the peptides are the ultimate goal for identification, a preparative enzymatic digestion step is typically not used, particularly in the case of endogenous bioactive peptides [6, 14, 146, 186, 199-213]. Bioactive peptides are often characteristically small and lack sites of proteolytic digestion, which allows them to survive the numerous enzymes present in vivo. For bioactive peptides derived from food, a simulated gastrointestinal digestion is often used as a preparative method to generate the desired peptides. Simulated gastrointestinal digestions primarily use pepsin followed by trypsin and chymotrypsin, resulting in less uniform cleavage sites compared to the tryptic digests common in proteome analyses [14, 16-18, 72, 74, 78, 84, 89, 214-221].

1.6.1 Targeted vs. Non-Targeted Analysis

To profile peptides or proteins in biological samples, two different approaches can be taken: a targeted or an untargeted approach, depending upon the desired analysis. In a targeted analysis, analytes of known parameters are specifically searched for in samples. By using a targeted analysis, instrument parameters can be tuned for specific analytes, optimizing the chromatography and acquisition for separation and sensitivity. Only those peptides of interest are isolated and fragmented.
This increase in separation and sensitivity can result in a higher throughput, allowing for more samples to be analyzed for the targets [149, 151, 175, 176, 179, 180, 182, 222-224]. In an untargeted analysis or comprehensive profiling of analytes, the goal is to maximize discovery. Untargeted analysis seeks to detect and fragment as many analytes as possible in a given biological sample. In each scan, as many peptides as allowed by DDA are isolated and fragmented for maximum identification. As a result, analyses for untargeted profiling are often more time-intensive, as the chromatography is more extensive than for a targeted analysis, resulting in lower throughput [149, 170, 172, 174, 181, 184, 225].

1.6.2 Challenges of the Peptidomic Workflow

Currently, peptidomics is limited by the technologies and methodologies of proteomics. In peptidomics, biological samples may contain thousands of peptides, even without a preparative digestion step. The resulting complexity of the samples makes the analysis challenging in several regards. With insufficient separations, as many as several hundred peptides may co-elute, particularly when attempting to separate peptides with similar properties. Co-elution of peptides then results in both competitive ionization and ion suppression [155, 226]. When analytes co-elute, if there are multiple analytes in a charged solvent droplet, the analytes will compete for the charge, with those with a higher basicity having a greater affinity for charge. As a result, not every analyte is equally ionized, leading to measured abundances that are lower than the actual abundances when the ions reach the detector [155, 226]. In addition to desired analytes, complex samples contain a wide variety of different molecules. Any substance can lead to ion suppression, including those from endogenous and exogenous sources.
Salts, lipids, proteins, or other analogous compounds and metabolites from the biological source can be carried through sample preparation, while on the exogenous side, preparation materials can lead to contamination by polymer residues such as wetting agents, phthalates, or detergents. While some of these substances can be separated from the analytes of interest with chromatography, chromatography does not solve all matrix issues. If interfering compounds are present in the solvent droplet, particularly at high concentrations, the viscosity and surface tension of the droplet can be altered. If the viscosity and surface tension increase, analytes reach the gas phase less efficiently, and fewer ions therefore reach the detector [226]. Also, as previously mentioned, if these co-eluting interfering compounds are ionized, they compete with the analytes for charge, further reducing ionization efficiency.

Besides ionization efficiency, the nature of the peptides themselves prevents the bottom-up proteomics approach from yielding optimal data. Particularly in the case of bioactive peptides derived from food, the peptides can be exposed to multiple enzymatic digestions. As a result, bioactive peptides do not have a specified terminal residue, as peptides from tryptic digestions do. The lack of a guaranteed basic terminal residue and the exposure to multiple enzymes result in a peptidome that is dominated by small, singly-charged peptides. These small, singly-charged peptides are not ideal for the proteomic/peptidomic workflow because it relies on tandem mass spectrometry for the identification of the peptides. In tandem mass spectrometry, multiply-charged peptides are desirable because they are more likely to generate informative y-type ions upon activation. The small, singly-charged peptides often fail to give informative fragment ion peaks following CID, leading to an inability to identify the proteins or peptides based on the CID spectra.
In addition, the proteomic/peptidomic workflow relies upon DDA for the selection of ions to be submitted for fragmentation [191-193, 195, 196]. While DDA is often successful in providing sufficient coverage in proteomics, where peptides derived from a single protein overlap, in peptidomics each peptide may hold biological significance. When the concentrations of peptides span as much as twelve orders of magnitude, as in the case of plasma proteins, important peptide information can be lost in DDA when low-abundance ions are not automatically chosen for tandem mass spectrometry.

1.6.3 Proposed Improvements to Peptidomics

The current limitations of co-elution, competitive ionization, ion suppression, and insufficient fragmentation prevent researchers from detecting and identifying as many peptides as possible. To mitigate the issues that still hinder progress, a new analytical method is needed. In metabolomics, non-targeted analysis is commonly used to obtain a systematic view of the metabolite composition of a biological system at a specific time, usually after a stressor or genetic change [227-242]. As the metabolite profile is extensive, in some cases reaching many thousands of molecules, multivariate statistical analysis is used to reduce the dimensionality of the data to a two-dimensional format that is more easily interpreted. This method is particularly useful when comparing two or more complex biological samples that share a degree of similarity, as the differences will stand out [227-242]. Ions with specific m/z values of interest are then singled out and chosen for further analysis. By applying a metabolomic approach for automated feature extraction to peptidomics, all mass signals that are detected can be monitored simultaneously, even if they co-elute. As the majority of digestion-resistant bioactive peptides are small and singly-charged, they often do not yield informative spectra when subjected to fragmentation.
To better integrate a metabolomic workflow with peptidomics, and to mitigate the complications that arise from insufficient identification, an HPLC-MS workflow that is not reliant on tandem mass spectrometry allows attention to be focused on the analytes of interest. By using existing databases, preliminary potential identifications can be made based on mass, and then confirmed using tandem mass spectrometry in a targeted fashion, reducing workload and increasing throughput. Finally, alternatives to the traditionally used C18 separations can reduce the extent of co-elution and the subsequent competitive ionization and ion suppression, increasing the ionization efficiency of lower-abundance peptides that may hold biological significance. Therefore, a metabolomic-like HPLC-MS workflow that does not rely solely on tandem mass spectrometry for the identification of peptides, combined with an alternative separation, holds promise as a new approach to peptidomics for maximizing the detection and identification of digestion-resistant bioactive peptides and for distinguishing samples based on differences in a small number of peptides in complex mixtures.

CHAPTER 2

2.1 Introduction

Peptides from both exogenous and endogenous sources confer beneficial physiological functions including protection from microbial pathogens, hormonal regulation (e.g. insulin), and various aspects of intra- and intercellular signaling [16, 84, 206, 207, 243, 244]. However, bioactive peptides also play detrimental physiological roles including toxicity and disease, as seen in the cases of the bee venom peptide melittin and of immunological triggers such as those behind common food allergies to milk and wheat [85, 86, 105, 245]. Diet represents an important source of peptides when proteins are degraded to peptides that exhibit resistance to digestive enzymes [70, 86, 207, 220, 221].
Profiling and identification of protein-derived peptides that survive hydrolysis can reveal novel regulators of physiological functions [219, 221, 246-248]. Peptides derived from partial protein digestion are also potential indicators of physiological states (biomarkers). As peptides play integral and varied regulatory and signaling roles, there is a need for improved methods to profile the complete set of peptides and identify candidates of potential biological importance.

2.1.1 Peptidomics and Current Challenges

In 2001, Peter Schulz-Knappe first defined the term peptidomics as “technology for the comprehensive qualitative and quantitative description of peptides in a biological sample” [199]. Though the distinction between peptides and proteins is imprecise, size is key, with 50-100 amino acids representing the transition between peptides and proteins. The most common proteomic strategy involves a bottom-up approach based on selective digestion that yields peptide fragments that are identified from their tandem mass spectra (MS/MS) [144, 182, 222]. Commonly, only a fraction of the total peptide population is detected by data-dependent MS/MS analyses in shotgun proteomics owing to sample complexity [249], though proteome coverage has been improved through longer HPLC gradients, ion accumulation during mass analysis [250], dynamic exclusion lists [251], often with replicate analyses, and improvements in mass spectrometer scan speeds [252]. For similar reasons, profiling of the entire complement of peptides is also limited by measurement technology, sample complexity and dynamic range, as peptide concentrations may span ten orders of magnitude [171, 184, 199]. Furthermore, peptidome components cannot be presumed to arise from site-specific proteolysis by a single protease, but rather by the actions of one or more proteolytic enzymes.
As a result, the N- and C-terminal residues are more variable than those encountered in bottom-up proteome analyses that employ a single protease such as trypsin [70, 147, 207]. This has ramifications for peptide identification using collision-induced dissociation (CID) and MS/MS, since many peptides may lack basic C-terminal residues and yield less uniform fragmentation along the peptide backbone [194]. For bottom-up protein profiling, analytical samples are typically digested with trypsin to yield peptides that are amenable to fragmentation and identification [147, 149, 168, 170, 171]. However, in the case of peptidomics, peptides often must be examined intact to avoid compromising identity and structure. In many cases, surviving peptides lack cleavage sites of abundant proteases. MS/MS, normally carried out by CID, is most amenable to tryptic digests of proteins because the resulting peptides have basic residues located at their termini. Charged termini promote fragmentation at numerous sites along the peptide backbone through charge-directed fragmentation [194, 253]. As digestion-resistant peptides have been subjected to multiple enzymes, many will lack a basic C-terminal residue, and these often fail to give useful CID spectra due to the lack of y-series fragment ions. Furthermore, peptides with internal basic residues often fail to yield a set of fragment ions adequate for peptide identification. Because the peptides lack a basic terminal residue and have been exposed to multiple enzymes, the digestion-resistant peptide pool often contains a larger fraction of small singly-charged peptides, which, when subjected to collisional activation, fail to give the informative fragmentation that is necessary for peptide identification. Proteomic analysis often involves at least one chromatographic separation followed by Data Dependent Acquisition (DDA) of tandem mass spectra.
DDA systematically selects a user-defined number of the most abundant peptide ions detected in a survey scan [254, 255]. Though DDA is a powerful approach, when faced with mixtures of great complexity it commonly misses co-eluting ions of low-abundance peptides and loses potentially vital information about such peptides [190]. Peptides that are low in abundance or do not ionize well are often not selected for MS/MS and are subsequently omitted from analysis. While this method tends to be sufficient for many proteome analyses, its performance in identifying low-abundance peptides that discriminate samples is less certain unless analytical protocols have sufficiently short duty cycles that allow low-abundance ions to be sampled. Unfortunately, low-abundance precursor ions may not yield sufficient ion current in product ions for identification, since the precursor signal is distributed among numerous product ions. In such cases a data-independent approach offers advantages, particularly when sample quantities are limited. Data-independent LC/MS analyses are usually data-intensive, and generally do not include the MS/MS spectra that provide greater confidence in peptide identification. Non-selective CID, or data-independent CID of ions within a narrow range of m/z values, known as SWATH [256], may partially alleviate this limitation, as these approaches employ a constant data acquisition strategy and yield fragment ions from all peptides. The resulting spectra may be searched against an online database or sequenced de novo [253]. Alternatively, the use of accurate mass as a means of identification improves confidence in peptide annotation relative to nominal mass measurements. When the objective is to identify genotype- or treatment-dependent differences in peptide composition, the first task is to recognize which peptides are distinguishing.
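The intensity-ranked selection that underlies DDA, and the reason it tends to skip low-abundance co-eluting ions, can be illustrated with a minimal sketch. The function name, the exclusion-list handling, the 0.05 m/z matching tolerance, and the example scan are all hypothetical choices made for illustration; they do not describe any instrument vendor's acquisition software.

```python
def select_precursors(survey_scan, top_n=5, excluded=(), mz_tol=0.05):
    """Return the top_n most intense (m/z, intensity) pairs from one survey scan.

    survey_scan: list of (mz, intensity) tuples from a single MS survey scan.
    excluded:    m/z values on a dynamic exclusion list (hypothetical input).
    mz_tol:      half-width, in m/z units, used to match excluded ions (assumed).
    """
    candidates = [
        (mz, intensity)
        for mz, intensity in survey_scan
        if not any(abs(mz - ex) <= mz_tol for ex in excluded)
    ]
    # Rank by intensity, most abundant first, and keep only top_n precursors.
    candidates.sort(key=lambda ion: ion[1], reverse=True)
    return candidates[:top_n]


# Hypothetical survey scan: five co-eluting ions with very different abundances.
scan = [(634.3, 1.2e4), (1058.6, 8.0e5), (712.4, 3.0e5), (845.5, 5.0e2), (500.3, 9.0e4)]

# First cycle: only the three most abundant ions are fragmented.
first_pass = select_precursors(scan, top_n=3)

# Second cycle with dynamic exclusion: the weaker ions are reached only now.
second_pass = select_precursors(scan, top_n=3, excluded=[mz for mz, _ in first_pass])
```

In this toy scan the two least abundant ions are selected only in the second cycle, after the abundant ions have been placed on the exclusion list. If those ions elute within a single cycle, they are never fragmented, which is the loss of information described above.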
Once these have been annotated, a more targeted MS/MS analysis of the same sample can help confirm discriminating peptide identities.

2.1.2 Current Peptidomic Approaches

Functional peptidomics, specifically peptidome analyses that focus on bioactive peptides, can be divided into two main fields of study: that of endogenous peptides and that of exogenous peptides, particularly those that stem from foodstuffs. While many groups have focused on endogenous bioactive peptides including hormones and neuropeptides, fewer have focused on novel discovery and differential determination. In the case of endogenous peptides, the discovery of novel bioactive peptides has focused on specific chemistries. Tatemoto and Mutt enriched C-terminal amides, and Yamaguchi’s group isolated peptides with post-translational amidation of the C-terminus, leading to the discovery of neuropeptide Y and neuroendocrine regulatory peptides, respectively [61, 213]. Osaki and coworkers took advantage of the highly basic nature of antimicrobial peptides for the discovery of a novel peptide fragment of an insulin-like growth factor binding protein [205].

Food bioactive peptides are produced from the partial hydrolysis of food proteins, either before consumption by industrial means or after consumption by digestive enzymes. The most extensively studied bioactive peptides produced from food proteins come from milk. Milk bioactive peptides include those exhibiting ACE-inhibitory, immunomodulatory, and antimicrobial effects [8, 79, 220, 257]. Although milk is the most studied source of bioactive peptides, other foods such as egg, fish and cereals have also been shown to have beneficial health effects when consumed [16, 70, 258, 259]. In addition to beneficial effects, food proteins are also the source of the antagonist in food allergies, triggering the major histocompatibility cascade.
The antagonists responsible for triggering allergies are typically attributed to digestion-resistant peptide fragments of food proteins that can pass through the intestinal barrier [86, 90, 215].

2.1.2 Metabolomics and the Multivariate Analysis Approach

For comparisons of peptides in two or more genotypes, treatments, or time points, the goal is often first to recognize those peptides that differ in presence or abundance between samples, which can then be targeted for subsequent identification. Such an approach has been the subject of a recent report regarding neuropeptide profiling [260]. Since a single analysis performed using DDA may miss many distinguishing peptides, a data-independent approach is more desirable, and can often be performed using instrumentation less expensive than the latest fast-scanning orbitraps. In such cases, accurate mass measurement is relied on to support peptide annotation. Differentiation of physiological states based on non-targeted chemical profiling has become common in comprehensive metabolite profiling, or metabolomics, and LC-MS has become a common tool for this purpose [230, 238]. Processing of nontargeted LC-MS data using automated data extraction and multivariate statistical tools, including Principal Components Analysis (PCA), Projection to Latent Structures-Discriminant Analysis (PLS-DA), and Orthogonal Projection to Latent Structures-Discriminant Analysis (OPLS-DA), enables visualization of the effects of these changes on a system in two dimensions and aids recognition of components that distinguish samples [239, 242]. Multivariate analysis takes into account m/z, retention time, and peak area values and determines a set of markers in each sample. To address the complexity of biological samples, multivariate analysis allows for the complete profiling of potentially thousands of mass signals in a data set regardless of abundance, and can differentiate between two or more complex samples or sample groups.
Therefore, by treating the peptidome as a metabolome, there is potential for expanding the breadth of peptidomic analysis to maximize detection, identification and characterization of discriminating peptides. In this context, Sforza and coworkers recently applied multivariate analysis to LC-MS/MS data acquired using DDA, and employed PCA to track changes in oligopeptides during cheese fermentation [261].

2.1.3 Wheat and Spelt

The major objective of this investigation was to evaluate how well nontargeted metabolomic strategies can be applied to distinguish survivor peptides derived from a model system of simulated gastrointestinal digestion of food proteins from related plant genotypes. To serve as a model for the complex digestion-resistant peptidome, protein extracts from the grains of wheat (Triticum aestivum) and spelt, its ancient relative (Triticum spelta), were incubated with a series of proteases to simulate gastrointestinal digestion, following the general approach of Shan et al. (2002) [99]. Wheat is the third-most cultivated grain behind rice and maize, and is also one of the most common sources of food allergies [92, 102]. Wheat allergies, and particularly celiac disease, have been attributed to an inappropriate immune response to partially digested gliadin proteins, which are abundant storage proteins in wheat grain [105]. Gliadins have an unusually high content of glutamine and proline residues, and exhibit low solubility in water [116, 137]. While significantly less is known about how the sequences of spelt gliadins compare to those of wheat, sequences are known for one isoform each of spelt α- and γ-gliadin, and these share 90% and 93% identity with the wheat α- and γ-gliadin amino acid sequences, respectively.
The abundance of multiple glutamine/glutamate and proline residues in gliadins yields regions of sequence that are resistant to proteolysis by the enzymes common to the stomach and upper-mid-intestine (pepsin, trypsin, and chymotrypsin), particularly at residues adjacent to proline, where the nitrogen is tertiary instead of secondary [262, 263]. Many glutamine residues of the gliadin proteins undergo posttranslational deamidation by transglutaminase, which converts them to glutamate residues [124]. Gastric digestion of gliadins by pepsin, which preferentially cleaves at hydrophobic amino acid residues, yields relatively large peptides that are transported into the small intestine, where the pH is higher and trypsin and chymotrypsin can catalyze further hydrolysis.

2.2 Materials and Methods

Whole wheat and spelt grains were obtained from Dr. Laura McCabe of Michigan State University. The gastrointestinal enzymes pepsin, trypsin and chymotrypsin, and the tetracycline internal standard were purchased from Sigma-Aldrich. Sodium bicarbonate, 88% formic acid, and concentrated hydrochloric acid were purchased from VWR Scientific.

Approximately 1 g each of wheat and spelt grains was separately ground to a fine powder using a mortar and pestle, and lipids and other extractable metabolites were removed by extracting the ground powder with 10 mL of dichloromethane. After 5 min of extraction, the mixture was subjected to centrifugation (10000 x g, 10 min), and the supernatant was removed. A protein fraction enriched in gliadin was prepared by extracting the pellets with 10 mL of 70% ethanol in an incubator shaker at 200 rpm for 8 hrs at 37 ˚C. Following centrifugation (10000 x g, 10 min) the supernatant was collected. This extract was divided into triplicate 3 mL aliquots for subsequent digestion. Extracts were evaporated to dryness under vacuum using a Speedvac because ethanol inhibits pepsin [264].
The dried residues were dissolved in 9 mL of 1% hydrochloric acid immediately before pepsin digestion. Gastric digestion was simulated using pepsin at a 75:1 (w/w) protein:pepsin ratio, assuming that 10% of the grain mass was protein extracted into 70% ethanol [102]. Gastric digestion was allowed to proceed for 2 hrs at 37 ˚C in the incubator, with shaking at 200 rpm. Digestion was halted by addition of a saturated sodium bicarbonate solution to raise the solution pH to 8.0. Intestinal digestion was simulated by the sequential addition of trypsin and chymotrypsin. A 0.39 µM trypsin solution was prepared immediately beforehand in 10 mM Tris-HCl buffer (pH 7.5), and 50 µL was added to the digested gliadin proteins. Digestion was allowed to occur at 37 ˚C in the incubator shaker for 2 hrs at 200 rpm. After tryptic digestion, a 9.3 µM chymotrypsin solution was freshly prepared in 10 mM Tris-HCl buffer (pH 7.5), and 50 µL was added to the digest. Digestion was allowed to occur at 37 ˚C in the incubator shaker at 200 rpm. Digestion was halted by drop-wise addition of concentrated hydrochloric acid until the pH was in the range of 2-3. Digestion product solutions were evaporated to dryness under vacuum (Speedvac), and reconstituted in 300 µL of 0.1% aqueous formic acid in preparation for mass spectrometric analysis. After reconstitution, tetracycline was added to a final concentration of 10 µM as an internal standard.

Profiling of residual peptides was performed using a Waters LCT Premier time-of-flight mass spectrometer (TOF-MS) interfaced to a Shimadzu LC-20AD high performance liquid chromatography (HPLC) solvent delivery system. The order of analysis of the digests was randomized using a random number generator in Microsoft Excel, and randomly selected product mixtures were analyzed in duplicate for quality control. Aliquots (10 µL) of the samples were injected per analysis using a Shimadzu SIL-5000 autosampler.
Digestion products were separated using a BetaBasic-18 column from Thermo (1 x 100 mm, 3 µm particle size) held at 40 ˚C. Elution was achieved using gradients based on solvent A (0.1% aqueous formic acid) and solvent B (acetonitrile). The initial solvent mixture (A/B) was 99/1. Initial conditions were held for 3 min, followed by linear gradients to 95/5 (at 5 min), 85/15 (at 20 min), and 65/35 (at 65 min), followed by a 3-min hold, then a step increase to 100% methanol to wash the column for 8 min before returning to initial conditions for 3 min. The flow rate was 0.10 mL/min. Analytes were ionized using electrospray ionization in positive ion mode, and analyzed with a Waters LCT Premier mass spectrometer using V-mode ion optics (resolution ~4500). The capillary and sample cone voltages were optimized at 3500 and 50 V, respectively. Mass spectra were acquired over m/z 200 to 2400 using a scan time of 1 s. Mass spectra were collected in continuum mode to avoid loss of information content for multiply-charged species as a result of real-time signal centroiding.

Once HPLC-TOF-MS data were collected, all mass spectra in each data file were baseline subtracted and smoothed using the mean over 3 channels, and peaks with a minimum width of 10 data points at half height were centroided using MassLynx v 4.1 software to facilitate subsequent peak detection, integration and retention time alignment. Multivariate analysis was carried out using Waters MarkerLynx XS software with the Extended Statistics (Umetrics EZ Info) package. Data files were processed over the peptide elution time range of 10 to 55 min and over m/z 400-2400 using the following parameters: a mass tolerance and mass window of 0.05 Da, a retention time window of 3 min, and an intensity threshold of 40 ion counts.
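The processing window and tolerances listed above can be expressed as a simple peak filter and a marker-matching rule. The sketch below is illustrative only: the dictionary layout, the function names, and the interpretation of the mass and retention time windows as symmetric absolute tolerances are assumptions, and the code does not reproduce the MarkerLynx algorithms.

```python
# Processing limits taken from the parameters stated in the text.
RT_RANGE = (10.0, 55.0)       # retention time window processed, min
MZ_RANGE = (400.0, 2400.0)    # m/z range processed
MIN_INTENSITY = 40            # intensity threshold, ion counts
MZ_TOL = 0.05                 # mass tolerance, Da (treated here as +/-, an assumption)
RT_TOL = 3.0                  # retention time window, min (treated here as +/-, an assumption)


def keep_peak(peak):
    """True if a detected peak lies inside the processed RT/m/z range and above threshold."""
    return (
        RT_RANGE[0] <= peak["rt"] <= RT_RANGE[1]
        and MZ_RANGE[0] <= peak["mz"] <= MZ_RANGE[1]
        and peak["intensity"] >= MIN_INTENSITY
    )


def same_marker(peak_a, peak_b):
    """True if two peaks from different samples fall within both tolerances."""
    return (
        abs(peak_a["mz"] - peak_b["mz"]) <= MZ_TOL
        and abs(peak_a["rt"] - peak_b["rt"]) <= RT_TOL
    )


# Hypothetical peaks from two samples.
wheat_peaks = [
    {"mz": 634.32, "rt": 21.4, "intensity": 310},
    {"mz": 350.10, "rt": 12.0, "intensity": 900},   # below the m/z range
    {"mz": 1058.56, "rt": 33.1, "intensity": 25},   # below the intensity threshold
]
spelt_peak = {"mz": 634.35, "rt": 22.0, "intensity": 120}

retained = [p for p in wheat_peaks if keep_peak(p)]
matched = same_marker(retained[0], spelt_peak)
```

Peaks that pass the filter and match across samples would be reported as one m/z-retention time pair whose peak areas can then be compared between digests.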
To adjust for any chromatographic retention time drift, retention times were adjusted relative to the retention of the tetracycline internal standard, detected as m/z 445.1800 with an allowable error of m/z 0.05 and a retention time of 16.89 min with an allowable error of 2 min. Multivariate analysis was performed with the extended statistics software package, which included Principal Components Analysis (PCA) and Orthogonal Projection to Latent Structures-Discriminant Analysis (OPLS-DA), to determine the peptides that contributed to discrimination of wheat and spelt peptide profiles. In the supervised method, OPLS-DA, the digests were assigned to the classes wheat and spelt for discriminant analysis. Reported ion masses are average masses from the centroided spectra from the raw data files. Reported peak areas were normalized to the internal standard tetracycline and Pareto scaled, and mean values were calculated across the set of subsample replicates (N=3).

Potential identification was carried out using the ExPASy PeptideMass tool (http://web.expasy.org/peptide_mass/). Sequences of wheat gliadin proteins representing α/β- and γ-gliadins were systematically digested in silico using 14 different Uniprot accessions (P02863, P18573, P04723, P04730, P21292, P08079, P08453, P06659, P04722, P04724, P04729, P04721, P04725, P10386), and the exact monoisotopic masses were matched with marker masses collected from the multivariate analysis.

To verify peptide identities, the same digests were analyzed on a ThermoFisher LTQ-FT Ultra mass spectrometer equipped with a Waters nanoAcquity pump (LC/linear ion trap-Fourier Transform Ion Cyclotron Resonance MS, LC/LIT-FTICR MS). A Michrom MAGIC C18AQ column (3 µm, 200 Å, 100 µm x 150 mm) was used with a gradient based on 0.1% aqueous formic acid and acetonitrile.
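The normalization, Pareto scaling, and PCA steps named above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the EZ Info implementation: the peak-area matrix is hypothetical, Pareto scaling is taken in its usual form (mean-centering each variable and dividing by the square root of its standard deviation), and only the first principal component is extracted, using power iteration.

```python
import math
import statistics


def normalize_to_standard(areas, standard_areas):
    """Divide every peak area in a sample by that sample's internal standard area."""
    return [[a / s for a in row] for row, s in zip(areas, standard_areas)]


def pareto_scale(matrix):
    """Mean-center each column and divide by the square root of its standard deviation."""
    scaled_columns = []
    for column in zip(*matrix):
        mean = statistics.mean(column)
        sd = statistics.stdev(column)
        divisor = math.sqrt(sd) if sd > 0 else 1.0  # leave constant columns centered only
        scaled_columns.append([(v - mean) / divisor for v in column])
    return [list(row) for row in zip(*scaled_columns)]


def first_component(matrix, iterations=200):
    """First principal component by power iteration; returns (scores, loadings)."""
    n_features = len(matrix[0])
    loading = [1.0] * n_features
    for _ in range(iterations):
        scores = [sum(x * p for x, p in zip(row, loading)) for row in matrix]
        updated = [sum(row[j] * t for row, t in zip(matrix, scores)) for j in range(n_features)]
        norm = math.sqrt(sum(v * v for v in updated))
        if norm == 0:
            break
        loading = [v / norm for v in updated]
    scores = [sum(x * p for x, p in zip(row, loading)) for row in matrix]
    return scores, loading


# Hypothetical peak areas: rows 0-2 are wheat digests, rows 3-5 are spelt digests.
areas = [
    [200.0, 100.0, 20.0],
    [231.0, 108.0, 24.0],
    [190.0, 96.0, 18.0],
    [40.0, 102.0, 60.0],
    [46.0, 104.0, 66.0],
    [38.0, 94.0, 55.0],
]
standard_areas = [2.0, 2.2, 1.9, 2.0, 2.1, 1.9]  # hypothetical tetracycline peak areas

scaled = pareto_scale(normalize_to_standard(areas, standard_areas))
scores, loadings = first_component(scaled)
```

In this toy data set the first-component scores of the wheat rows take one sign and those of the spelt rows take the opposite sign, which is the kind of grouping a scores plot makes visible. Pareto scaling reduces, but does not remove, the dominance of high-abundance signals, which is one reason supervised methods are applied afterwards.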
Survey scans were generated using the FT-ICR analyzer (25000 resolution at m/z 400), and the five most abundant ions in each survey scan were subjected to CID in the linear ion trap.

2.3 Results/Discussion

2.3.1 Peptidomic Profiling of Digestion-Resistant Peptidome

LC-TOF MS peptide profiling of the simulated gastrointestinal digestion of wheat or spelt yielded evidence for complex populations of peptides (Figure 2.1). Automated peak detection, integration, and alignment using Waters MarkerLynx XS software extracted 2061 individual signals organized by m/z-retention time pairs. Evidence that the majority of detected ions were peptides, and not lipids or residual organic acids, came from relative mass defect (RMD) analysis [265], which serves as a measure of fractional hydrogen content. Excluding multiply-charged ions, the overwhelming majority (92%) of detected singly-charged ions had RMD values ranging from 400-750 ppm (Figure 2.2), consistent with peptides, but higher than oligosaccharides and lower than most lipids (Table 2.1). The detected ions ranged in charge state from singly-charged to larger multiply-charged peptides. However, the majority were small singly-charged peptides, with 96% of detected ions of m/z <1300 (Figure 2.3).

The complex data set of detected peptides is conveniently visualized through multivariate statistical tools, often in the form of a PCA scores plot. A scores plot is calculated from the sum of the contribution of each peak area to the principal components (eigenvectors), and offers visualization of similarities between chemical profiles of the individual samples. Figure 2.4 presents the PCA scores plot based on the LC-TOF MS analyses of the digestion products of wheat and spelt, with the two genotypes separated into two distinct groupings that reflect differences in product compositions.
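The relative mass defect used above to distinguish peptides from other compound classes is the mass defect divided by the monoisotopic mass, expressed in ppm. A minimal sketch follows, assuming that the nominal mass can be approximated by rounding the monoisotopic mass down to the nearest integer; this holds only while the mass defect stays below 1 Da, and the function names are illustrative.

```python
import math


def relative_mass_defect(monoisotopic_mass):
    """Relative mass defect in ppm: (mass defect / monoisotopic mass) * 1e6.

    The nominal mass is approximated with floor(), which is valid only while
    the accumulated mass defect is below 1 Da (an assumption of this sketch).
    """
    defect = monoisotopic_mass - math.floor(monoisotopic_mass)
    return defect / monoisotopic_mass * 1e6


def looks_peptide_like(rmd_ppm, low=400.0, high=750.0):
    """Apply the 400-750 ppm range reported for singly-charged peptide ions."""
    return low <= rmd_ppm <= high


# Residue masses taken from Table 2.1.
residues = {"Alanine": 71.03711, "Arginine": 156.10111, "Cysteine": 103.00919}
rmd = {name: round(relative_mass_defect(mass)) for name, mass in residues.items()}

# A singly-charged ion near m/z 1058.56 falls inside the peptide-like range.
ion_rmd = relative_mass_defect(1058.5629)
```

Applied to the residue masses in Table 2.1, the calculation reproduces the tabulated values (for example 522 ppm for alanine and 648 ppm for arginine). A doubly-charged ion treated as singly-charged would show roughly twice the apparent defect per unit m/z, which is consistent with the high-RMD tail noted in the caption of Figure 2.2.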
The larger spread in the individual wheat replicates is attributed to issues regarding technical replication, since the two most distant samples are technical replicates of the same digestion product.

[Figure 2.1 image: panels A and B; y-axis in % (0-100), x-axis retention time (min), 0-60]

Figure 2.1: LC-TOF MS base peak intensity chromatograms of products of sequential digestions of 70% ethanol extracts of (A) ground wheat and (B) spelt grain using pepsin, trypsin and chymotrypsin. Digests were separated on a C18 stationary phase using a 0.15% formic acid/ACN reversed phase gradient.

Figure 2.2: Relative mass defect histogram for combined LC-TOF MS data extracted from digests of wheat and spelt. Values of RMD greater than 1000 ppm likely reflect contributions from multiply-charged ions. For interpretation of the references to color in this and all other figures, the reader is referred to the electronic version of this dissertation.

Table 2.1: Relative mass defect values for internal amino acid residues

Amino acid       Internal residue formula   Monoisotopic mass (Da)   %H by weight   Relative mass defect (ppm)
Alanine          C3H5NO                     71.03711                 7.09           522
Arginine         C6H12N4O                   156.10111                7.74           648
Asparagine       C4H6N2O2                   114.04293                5.30           376
Aspartic acid    C4H5NO3                    115.02694                4.38           234
Cysteine         C3H5NOS                    103.00919                4.89           89
Glutamic acid    C5H7NO3                    129.04259                5.46           330
Glutamine        C5H8N2O2                   128.05858                6.29           457
Glycine          C2H3NO                     57.02146                 5.30           376
Histidine        C6H7N3O                    137.05891                5.15           430
Isoleucine       C6H11NO                    113.08406                9.80           743
Leucine          C6H11NO                    113.08406                9.80           743
Lysine           C6H12N2O                   128.09496                9.44           741
Methionine       C5H9NOS                    131.04049                6.92           309
Phenylalanine    C9H9NO                     147.06841                6.16           465
Proline          C5H7NO                     97.05276                 7.27           544
Serine           C3H5NO2                    87.03203                 5.79           368
Threonine        C4H7NO2                    101.04768                6.98           472
Tryptophan       C11H10N2O                  186.07931                5.41           426
Tyrosine         C9H9NO2                    163.06333                5.56           388
Valine           C5H9NO                     99.06841                 9.15           691
Mean value:                                                          6.69           468

Figure 2.3: Histogram of detected ions in digests of wheat and spelt as a function of ion m/z

Figure 2.4: PCA scores plot
for LC-TOF MS profiles of peptides generated by sequential pepsin, trypsin, and chymotrypsin digestion of 70% ethanol extracts of ground wheat and spelt grain

The most abundant components in mixtures exert strong influence on PCA results, and the contributions of peptides of minor abundance are perhaps best discovered through use of supervised statistical methods. One of these, OPLS-DA, is a supervised method in which the samples from the LC-TOF MS analysis are classed into two groups, in this case wheat and spelt. Once two groups are selected, OPLS-DA determines the markers that maximize the variation between the classified groups in a method known as PLS-DA, and rotates the scores plot orthogonally (OPLS-DA), so that the first principal component represents inter-class variation and the second component describes intra-class variation [239, 240]. From the OPLS-DA plot, an S-plot can be generated in which a correlation score, indicative of the extent to which each marker is distributed between the two classes, is plotted on the y-axis against the loadings value for each detected ion (x-axis). Markers that are present in one class and absent from the other are given a correlation score of 1.0. Figure 2.5 displays the S-plot generated from LC-TOF MS analyses of wheat and spelt digestion products, where the upper right quadrant contains the markers that most strongly differentiate wheat from spelt, meaning that they are quantitatively and qualitatively present in wheat and not in spelt. The S-plot not only makes possible the selection of markers that discriminate the two classes with a particular p-score, but also allows for the selection of discriminating markers without regard for abundance.
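The two S-plot coordinates can be illustrated with a short calculation. In the formulation commonly used for OPLS-DA S-plots, the magnitude axis is the covariance between each variable and the predictive score vector, and the reliability axis, p(corr), is the corresponding correlation. The sketch below assumes that formulation, takes the predictive scores as given rather than fitting an OPLS-DA model, and uses entirely hypothetical scores and peak areas.

```python
import statistics


def s_plot_coordinates(scores, feature_columns):
    """Return {feature: (covariance, correlation)} against the predictive scores."""
    n = len(scores)
    t_mean = statistics.mean(scores)
    t_sd = statistics.stdev(scores)
    coordinates = {}
    for name, column in feature_columns.items():
        x_mean = statistics.mean(column)
        x_sd = statistics.stdev(column)
        covariance = sum((t - t_mean) * (x - x_mean) for t, x in zip(scores, column)) / (n - 1)
        correlation = covariance / (t_sd * x_sd) if x_sd > 0 else 0.0
        coordinates[name] = (covariance, correlation)
    return coordinates


# Hypothetical predictive scores: three wheat digests (positive), three spelt digests (negative).
scores = [2.0, 2.1, 1.9, -2.0, -2.1, -1.9]

# Hypothetical peak areas for three m/z-retention time markers.
features = {
    "abundant, elevated in wheat": [900.0, 950.0, 880.0, 100.0, 120.0, 90.0],
    "low abundance, elevated in wheat": [9.0, 9.5, 8.8, 1.0, 1.2, 0.9],
    "not discriminating": [50.0, 20.0, 45.0, 48.0, 22.0, 44.0],
}

coordinates = s_plot_coordinates(scores, features)

# Select markers by correlation alone, as with a p(corr) threshold of 0.9.
selected = [name for name, (_, corr) in coordinates.items() if corr >= 0.9]
```

The abundant and low-abundance wheat markers receive nearly identical correlation values even though their covariances differ by about two orders of magnitude, so a threshold on the correlation axis recovers both. This illustrates how selection on p(corr) can proceed without regard for abundance.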
One limitation of the MarkerLynx algorithms is their inability to annotate ions as multiply-charged. However, since the vast majority (>80%) of the detected wheat peptidome is detected as singly-charged ions, MarkerLynx is able to detect the individual mass peaks. The charge states of differentiating ions were established manually, as were assignments as monoisotopic ions, and this information was used to annotate peptides.

Figure 2.5: OPLS-DA S-plot for LC-TOF MS profiles of peptides generated by sequential pepsin, trypsin and chymotrypsin digestion of 70% ethanol extracts of ground wheat and spelt grain. Each data point corresponds to the extracted ion chromatogram peak area for a specific m/z-retention time pair, and the coordinates reflect the loadings (x-axis) and p-corr values (y-axis).

The nontargeted profiling and OPLS-DA analyses bring attention to those peptides that differentiate the peptidomes derived from wheat and spelt, and facilitate recognition of low-abundance peptides. For our purposes, markers with a p(corr) score of 0.9-1.0 were chosen as highly differentiating between the two classes. Of note, no marker possessed a score of 1.0 or -1.0, indicating that no peptides were present in digests of only one genotype. An alternative possibility is that small non-zero peak areas were generated owing to noise. Applying the p(corr) threshold resulted in a list of 96 distinguishing markers that exhibited higher abundances in wheat digests. After confirmation by manual inspection of the raw LC-TOF MS data files, the list was reduced to 87 distinguishing markers once isotopic peaks were removed. These 87 distinguishing markers formed the targeted list of wheat gliadin peptides for identification via a database. The effectiveness of the supervised method was determined using cross-validation in which each sample was purposefully assigned to the incorrect genotype.
If the supervised method for determining differentiating peptides is valid, the mislabeled sample will continue to group with its correct class. Each sample was cross-validated in this way using the OPLS-DA scores plot. All eight samples remained appropriately grouped, indicating that the model correctly predicted whether a sample is wheat or spelt based on the nontargeted peptide profile data.

Traditional identification of peptides has involved fragmentation of peptide bonds through CID to determine primary amino acid sequences via database matching or, when sequence information is limited, via de novo sequencing. Some digestion-resistant peptides, particularly prolamines, fail to give the fragmentation information needed for identification using CID. In the case of prolamines, the high occurrence of proline residues in the primary structure leads to too few b-series fragment ions and a decreased identification rate [262, 266]. Owing to the high prevalence of proline in gliadin proteins, identification of the resulting peptides requires more reliable methods. The use of accurate molecular mass allows the targeted peptides to be identified by comparison with an established protein sequence database without the use of CID. Fourteen wheat gliadin proteins were chosen from the ExPASy database to aid the identification of differentiating peptides. Each protein underwent a simulated digestion using pepsin, trypsin and chymotrypsin sequentially. Three missed cleavages were allowed, and the cysteines were untreated. Lock mass-corrected monoisotopic masses obtained from the LC-TOF MS analyses were used to assign potential identifications. Of the 87 differentiating peptides, 19 were annotated, assuming a maximum mass error of 20 ppm. The majority of the annotated sequences contained proline and glutamine, often with multiple glutamine or glutamate residues.
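The accurate-mass matching step can be sketched as follows. This is an illustrative calculation rather than the software used in this work: the function names and the short candidate list are hypothetical, the residue masses are standard monoisotopic values, and the 20 ppm tolerance is the one stated above.

```python
# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # H2O added for the free N- and C-termini
PROTON = 1.007276   # charge carrier for [M+H]+

def mh_plus(sequence):
    """Theoretical monoisotopic m/z of the singly protonated peptide."""
    return sum(MONO[aa] for aa in sequence) + WATER + PROTON

def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6

def match_candidates(observed_mz, candidates, tol_ppm=20.0):
    """Return (sequence, ppm error) for candidates within the tolerance."""
    hits = []
    for seq in candidates:
        err = ppm_error(observed_mz, mh_plus(seq))
        if abs(err) <= tol_ppm:
            hits.append((seq, round(err, 1)))
    return hits

# Hypothetical candidate list standing in for simulated digestion products.
candidates = ["RIL", "SLVL", "PPEL"]
matches = match_candidates(401.2880, candidates)
```

With a tolerance in ppm, the absolute mass window widens with m/z, so larger peptides are more likely to have several candidate sequences within tolerance.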
2.3.2 Confirmation of Peptide Annotation

To confirm the annotations, the digestion products of both wheat and spelt were analyzed using LC/LIT-FTICR MS in which MS/MS was performed using DDA to obtain product ion spectra. In addition to fragmentation information, which offers support for annotations based on molecular mass alone, the survey scan using the FTICR offers mass accuracy superior to that of the LCT Premier. High-accuracy FTICR mass measurements were used both to support previous annotations and to eliminate those that were outside the range of FTICR experimental error (10 ppm). With the increased mass accuracy of the FTICR analysis, 26 of the 87 differentiating (elevated in wheat) peptides were assigned to wheat gliadin sequences based on molecular mass. As predicted, the LC-MS/MS results failed to provide unambiguous identification of many peptides. Six of the 26 peptides yielded sufficient information in fragment ion masses to confirm their initial annotation from LC-TOF MS data. Seventy-five of the 87 peptides were too low in abundance to be selected for CID by the DDA process, and six failed to yield enough diversity of fragment ions for conclusive identification either by database searching or manual de novo sequencing.

Figure 2.6 displays MS/MS spectra for wheat-derived digestion products detected as singly-charged ions of m/z 1058.6 and 634.3, respectively. The CID spectrum of the former shows a series of abundant b- and y-series fragment ions consistent with the sequence LEPHEIAHL, and internal fragment ions offered additional confirmation of identity. Initial searching of the MS/MS results against the non-redundant NCBI sequence database failed to identify the peptide, so manual de novo sequencing was performed instead. A BLAST search of this sequence against the NCBI sequence database showed sequence similarity to the γ-gliadin peptide LQPHQIAHL, suggesting conversion of both glutamine residues to glutamates via action of transglutaminase.
The CID spectrum for the peptide detected as singly-charged m/z 634.3 showed fragment ions too sparse for unambiguous sequence annotation. With only three major fragment ions, none of which could be assigned as b- or y-series fragment ions, compositional analysis relied on accurate mass measurements using FTICR MS and comparison of fragment ions to those predicted for the gliadin sequences described above, but allowing for deamidation. These tools allowed for annotation of the peptide as ELEPF, supported by two internal fragment ions consistent with fragmentation adjacent to proline residues. These two findings highlight the challenges of identifying peptides by searching sequence databases when extensive deamidation has occurred.

Figure 2.6: Product ion MS/MS spectra generated for two peptides detected in the sequential digestion products of 70% ethanol extracts of ground wheat. (A) peptide annotated as LEPHEIAHL, a product of deamidation of the γ-gliadin peptide LQPHQIAHL and (B) peptide annotated as ELEPF, a deamidation product of the α-gliadin peptide QLQPF.

2.3.3 Post-translational Modifications that Differentiate Digests

Following the annotation of differentiating peptides, the relative quantitative abundances of individual peptides in the wheat and spelt digests were determined, and the peptides preferentially abundant in the wheat digests were annotated using FTICR and MS/MS data (Table 2.2, Table 2.3). For most detected peptides, it was anticipated that the sequences must be identical to enable their detection and alignment at the same retention times and masses. By using a data-independent approach for the analysis of digestion products of proteins, differences in peptide levels in the digests of the two closely related species were quantified. The correlation score from the S-plot offers a guideline for the confidence with which a marker is differentiating, as it approximates a t-table value.
Figure 2.7A is a bar graph that illustrates the relative amounts of the top ten differentiating peptides according to the correlation (p(corr)) score, and Figure 2.7B is a bar graph showing the relative amounts of the top ten differentiating markers according to their one-tailed p-values. While the top ten differentiating markers from the S-plot are significantly different at a confidence level of greater than 99%, the top ten according to p-value were all significantly different at a confidence level of greater than 99.95%. The differences between the calculated p-values and the correlation scores can be attributed to differences between the peak areas extracted from the raw data for the p-value calculation and the peak areas extracted by the MarkerLynx program. The significant quantitative differences between the two genotypes are attributed to post-translational deamidation. For example, in the peptides SEEEEL, PPEEEEEEL, and PSEEPYL all glutamate residues are the result of deamidation of SQQQQL and PPQQQQQQL (low molecular weight glutenin subunit) and PSQQPYL (α/β-gliadin). The fully deamidated forms were significantly more abundant in wheat digests, with normalized peak area ratios (wheat:spelt) of 3.3, 10.6 and 3.5, respectively. The less deamidated forms, however, were more abundant in spelt, the most significant forms being SEEEQL (uncertain location of the glutamine), PPEEQQQQL (uncertain location of the glutamines), and PSQQPYL, respectively.
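The relationship between a genome-encoded sequence and its partially or fully deamidated forms can be enumerated directly. The following minimal sketch, in which the function name is hypothetical and every glutamine is assumed to be a possible deamidation site, lists all glutamine-to-glutamate variants of a peptide grouped by the number of deamidated residues. Each conversion raises the monoisotopic mass by approximately 0.984 Da.

```python
from itertools import combinations

# Monoisotopic mass difference between glutamate and glutamine residues (Da).
DEAMIDATION_SHIFT = 0.98402

def deamidated_variants(sequence):
    """Map number of deamidated sites -> list of variant sequences.

    Assumes every Q may independently be converted to E.
    """
    q_sites = [i for i, aa in enumerate(sequence) if aa == "Q"]
    variants = {}
    for n_sites in range(len(q_sites) + 1):
        for chosen in combinations(q_sites, n_sites):
            residues = list(sequence)
            for i in chosen:
                residues[i] = "E"
            variants.setdefault(n_sites, []).append("".join(residues))
    return variants

variants = deamidated_variants("SQQQQL")
fully_deamidated = variants[4]      # the single fully deamidated form
triply_deamidated = variants[3]     # positional isomers with one Q remaining
```

Positional isomers within one group share the same mass, which is why the location of the remaining glutamine in forms such as SEEEQL cannot be assigned from accurate mass alone.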
Through differential comparison of two genotypes, relative levels of post-translational modifications were determined without time-consuming enrichment and with a data-independent strategy.

Table 2.2: Annotations of discriminating peptides detected in digests of wheat and annotated using LC/FTICR and LC/MS/MS data and the UniProt sequence database, allowing for conversion of glutamine residues to glutamate by deamidation

m/z of [M+H]+   Peptide annotation      Δm (ppm)   Protein class
401.2880        RIL                     2.2        γ, LMW-glutenin
431.2866        SLVL                    0.5        γ
455.2477        PPEL                    -6.1       α/β
472.3133        NILL                    0.8        γ
516.3152        EVIR                    2.3        γ
546.3142        ALETL                   0.7        α/β
568.2634        SEVSF                   2.8        α/β
590.3034        IEESL                   -0.3       γ
590.3034        IEESL                   -0.3       γ
634.3091        ELEPF                   0.5        α/β
634.3451        PYLEL                   -0.2       α/β
737.3026        EPEESF or ESEEPF        4.5        γ
762.4036        QLVQQF                  -14.2      α/β
773.3892        EPRQPF                  -6.8       γ
834.3537        SEEEPPF                 1.1        γ, LMW-glutenin
834.3894        PSEEPYL                 1.9        α/β
844.3733        PEPEEPF or PEEEPPF      0.6        γ
844.3734        PEPEEPF or PEEEPPF      0.6        γ
852.3762        FPEEPSF                 -2.0       LMW-glutenin
860.4300        EESKPASL                -7.4       γ
908.4850        RTTTSVPF                1.5        γ
937.4712        EPHEIAEL                8.7        γ, LMW-glutenin
1058.5650       LEPHEIAHL               1.5        γ
1060.5579       PSELPYLEL               1.3        γ, LMW-glutenin
1100.4671       PPEEEEEEL               3.4        γ
1262.6134       PQQQQQHQQL              -8.1       γ
1617.7722       RPEQPYPQSQPQY           3.8        α/β
1617.7242       SEEPPFSEEEEPVL          4.4        γ

Table 2.3: Relative quantitative abundances of peptides more abundant in digests of wheat relative to digests of spelt. For multiply-charged peptide ions, the m/z value for the singly-protonated species was calculated from the multiply-charged monoisotopic ion.

Mean retention   m/z of     Mean wheat     Mean spelt     Wheat/spelt   p-value
time (min)       [M+H]+     peak area(a)   peak area(a)   abundance
13.32            401.28     1948           409            4.8           <0.001
12.47            424.16     9730           5133           1.9           <0.05
18.59            431.28     2071           536            3.9           <0.0005
13.05            435.30     804            96             8.4           <0.0005
18.89            444.31     2368           729            3.2           <0.0005
13.24            455.26     1776           356            5.0           <0.0005
13.80            458.30     7400           24             304.7         <0.0005
15.45            461.26     6524           2334           2.8           <0.0005
17.68            465.29     2872           280            10.2          <0.0005
13.54            472.32     5427           1447           3.8           <0.0005
16.66            473.30     5135           637            8.1           <0.0005
22.84            473.32     7159           1883           3.8           <0.0005
24.17            473.33     827            1039           0.8           >0.05
16.58            476.28     5308           2602           2.0           <0.0005
13.70            481.26     5240           1276           4.1           <0.0005
13.04            513.29     1735           661            2.6           <0.001
19.17            516.33     8220           2099           3.9           <0.0005
17.10            518.30     7857           2662           3.0           <0.0005
12.57            528.28     2829           488            5.8           <0.0005
18.57            529.35     5627           2791           2.0           <0.01
13.96            545.30     4439           1100           4.0           <0.0005
12.90            546.28     4881           1191           4.1           <0.0005
23.30            557.38     7555           2444           3.1           <0.01
24.73            568.33     1793           0              NA            ND
22.58            572.36     5738           187            30.7          <0.0005
15.43            574.30     2507           903            2.8           <0.025
20.54            574.34     5073           0              NA            ND
18.57            575.35     6150           696            8.8           <0.001
13.18            585.32     5872           913            6.4           <0.0025
18.39            586.38     3461           1142           3.0           <0.0005
23.22            590.29     2119           575            3.7           <0.001
17.88            590.33     4245           817            5.2           <0.0005
22.67            607.38     3480           805            4.3           <0.0005
22.92            614.42     4674           1109           4.2           <0.0005
16.69            620.28     4667           779            6.0           <0.005
15.17            623.29     2263           1021           2.2           <0.0025
21.50            626.41     786            263            3.0           <0.0005
21.42            634.39     2836           1115           2.5           <0.01
15.73            641.38     1846           205            9.0           <0.0005
17.50            666.34     1534           401            3.8           <0.0005
13.58            687.32     2746           265            10.3          <0.001
22.80            690.41     4575           325            14.1          <0.0005
13.48            703.33     2402           667            3.6           <0.0005
33.50            705.44     12803          309            41.5          <0.0005
15.31            735.31     4426           1071           4.1           <0.005
17.69            737.34     2008           282            7.1           <0.0005
18.07            751.37     1297           287            4.5           <0.0005
23.30            762.42     9124           0              NA            ND
13.72            772.39     1599           504            3.2           <0.0005
17.41            773.46     2333           1996           1.2           <0.05
18.10            779.35     6659           1885           3.5           <0.001
18.47            780.42     1841           206            8.9           <0.0005
16.52            787.48     1623           131            12.4          <0.0005
15.80            808.37     2563           213            12.0          <0.0005
16.46            808.45     1822           781            2.3           <0.001
27.67            812.46     12738          3918           3.3           <0.0025
22.84            824.43     2715           1216           2.2           <0.05
12.79            833.42     2673           922            2.9           <0.0025
28.13            834.42     7045           1708           4.1           <0.01
24.18            834.43     5153           4159           1.2           not sig
18.00            837.50     3445           1304           2.6           <0.0005
22.29            844.44     509            352            1.4           <0.05
21.24            844.45     1384           862            1.6           <0.10
13.48            850.42     1248           259            4.8           <0.0005
16.51            852.40     1160           677            1.7           <0.01
22.96            856.41     3800           1401           2.7           <0.0005
18.38            860.48     1390           1042           1.3           not sig
17.00            875.51     4084           2429           1.7           <0.025
18.08            884.46     2250           25             91.5          <0.0005
17.64            884.55     3068           442            6.9           <0.0005
21.53            893.49     11483          6785           1.7           <0.025
13.37            907.37     969            478            2.0           <0.0025
18.26            908.41     2821           196            14.4          <0.0005
23.00            909.51     3926           959            4.1           <0.0005
16.91            937.50     2330           850            2.7           <0.0005
18.50            971.48     3439           111            31.1          <0.001
21.35            1012.54    7698           2194           3.5           <0.01
22.66            1021.52    1773           57             31.2          <0.0005
24.71            1058.61    1541           374            4.1           <0.0005
21.21            1060.51    5354           2733           2.0           <0.025
23.79            1074.54    5165           1966           2.6           <0.025
19.25            1100.54    2148           113            19.0          <0.001
23.82            1135.64    2004           734            2.7           <0.0005
22.76            1155.62    2177           35             61.8          <0.0005
19.16            1262.65    2963           0              NA            ND
35.43            1617.80    2152           806            2.7           <0.001
15.00            1741.77    4619           0              NA            ND

(a) Extracted ion chromatogram peak areas normalized to the internal standard. NA = not applicable; ND = not defined

Figure 2.7: LC-TOF MS extracted ion chromatogram peak areas, normalized to the internal standard, for a series of peptide digestion products observed in higher abundance in digests of wheat than in digest of spelt.

Concurrent with this study, Prandi and coworkers also examined peptides generated by simulated gastrointestinal digestion of wheat varietals, focusing on the prolamine content in its relevance to celiac disease [267]. Their profiling focused on abundant peptides, and they detected and identified about 70 peptides between two wheat varietals (T. aestivum and T. durum), and compared levels of several peptides with CD epitopes.
In contrast, the nontargeted approach presented in this report detected ~2000 peptides, and highlighted those that distinguished the two genotypes both qualitatively and quantitatively. From the differentiating peptides it was determined, without resorting to peptide enrichment and MS/MS, that post-translational deamidation of glutamine residues is responsible for the majority of the quantitative differences.

Although modern bottom-up proteomic approaches provide the capacity to differentiate biological samples based upon tryptic peptide abundances, the label-free and data-independent approach presented here offers several advantages. The first lies in the experimental simplicity of a data-independent approach. All signals above the detection limit are recorded, and there is no concern that [...]. A second advantage lies in cost, since a less-expensive instrument can be used, and costs associated with derivatization are avoided. While it is recognized that multiplexed isotope labeling strategies such as reductive dimethylation allow for comparison of multiple samples in a single analysis, the addition of tagged species is not necessarily desired, as it adds complexity to an already complex protein digest. For example, the wheat peptide resulting from gastrointestinal digestion at m/z 833.38 has three different potential sequences: PGQQEQF (not deamidated), EGQQQPF, and PGEQQQF, where the position of the deamidation can vary between the latter two peptides, adding further complexity. Even if the isomers are retained differently, only the first would be selected (if abundant enough) when dynamic exclusion is applied, particularly if the isomeric peptides elute near one another.
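The isomer problem described above can be made concrete with a brief check. In this minimal sketch, which uses standard monoisotopic residue masses and hypothetical variable names, the three candidate sequences are shown to share one amino acid composition and therefore one theoretical m/z, so no level of mass accuracy can distinguish them without fragmentation or chromatographic information.

```python
from collections import Counter

# Standard monoisotopic residue masses (Da); only the residues needed here.
MONO = {"P": 97.05276, "G": 57.02146, "Q": 128.05858,
        "E": 129.04259, "F": 147.06841}
WATER = 18.010565
PROTON = 1.007276

def mh_plus(sequence):
    """Theoretical monoisotopic m/z of the singly protonated peptide."""
    return sum(MONO[aa] for aa in sequence) + WATER + PROTON

isomers = ["PGQQEQF", "EGQQQPF", "PGEQQQF"]
compositions = [Counter(seq) for seq in isomers]
masses = [round(mh_plus(seq), 4) for seq in isomers]

# True when every candidate has the same residue composition.
same_composition = all(c == compositions[0] for c in compositions)
```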
Therefore, a label-free and non-targeted approach is better suited to profile the peptidome given the increased simplicity and higher throughput through automated peak detection and multivariate analysis as well as lower cost without losing integral information for the determination of those peptides that differentiate two different states. Although a non-targeted ‘metabolomic-like’ approach to peptidomics allows for the determination of those peptides that discriminate two or more physiological states, and for the assignment of potential identifications, there are still some disadvantages to the approach. While the primary goal of the non-targeted approach is the recognition of those peptides that discriminate two or more states from each other for the paring down of the peptidome for further targeted analysis, obtaining a conclusive identification from this approach is not feasible. The non-targeted approach does not provide fragmentation information on the full peptidomic survey, only mass-to-charge/retention time information of individual peptides. This information is then used to compile a list of peptides that differentiate samples, but confirming their identities from fragmentation data must be achieved in a separate, targeted step that includes MS/MS. Also, using HPLC-TOF MS as the primary instrumentation for the initial peptidomic survey and subsequent assignment of tentative identifications is limited by the mass measurement accuracy of the TOF analyzer. As the non-targeted profiling of the peptidome does not have a MS/MS component to provide fragmentation information for the confirmation of identity, greater mass accuracy is desired to gain the most accurate mass match possible from simulated digestions obtained from online databases. 
However, the mass accuracy of the TOF instrumentation, even when an internal standard is used, is approximately 10-20 ppm, which leaves room for error when assigning tentative identifications, particularly when multiple peptide products lie within a reasonable range of the found peak. With higher mass accuracy, peptide annotations are made with greater confidence. The higher degree of confidence is necessary as many of the peptides do not fragment to yield the typical sequence ions, but instead produce internal fragment ions, which are more readily identifiable with prior knowledge of sequence.

2.4 Conclusion

The application of metabolomic tools for nontargeted peptidomic profiling expands the window of analysis to a wider variety of peptides than can be detected using DDA, and in this study, documented substantial quantitative differences in levels of peptides derived from deamidated proteins. This exploration has focused on wheat storage proteins, and since gliadins are notorious for high protein heterogeneity [140], it is not surprising that nontargeted analysis yielded approximately 2000 peptide signals. However, nontargeted strategies are not limited to the investigation of wheat and spelt storage proteins, and have potential to distinguish genotypes, treatments, or temporal changes in peptide levels. The nontargeted approach relies heavily on accurate mass measurements, but recognition of the masses of distinguishing peptides allows subsequent targeted MS/MS analyses to be performed without relying on shotgun DDA to select these peptides from among complex mixtures. Additionally, through the nontargeted profiling of a complex peptidome, the annotation of peptides that differentiate two highly similar peptide profiles revealed post-translational deamidation without the use of fragmentation.
It is noted that a recent report of DDA-based LC/MS/MS profiling of simulated digests of several varieties of wheat annotated 67 peptides that served as indicators of relative amounts of specific gliadins or immunogenic digestion products, but none of the reported peptides appear to be products of deamidation [267]. Though a few observed peptides were common to both investigations, the nontargeted approach described here stands out as yielding unusual detail regarding the extent of deamidation, and offers the potential for deep comparative investigations of protein and peptide processing across a range of applications.

CHAPTER 3

3.1 Introduction

Bioactive peptides represent a largely unexplored region of biochemistry, and the structures and functions of many remain to be elucidated. While neuropeptides have been extensively researched, sequencing the human genome has provided evidence that there are several peptide ligands of G protein-coupled receptors that have yet to be discovered [11, 35, 52, 60, 136, 201, 206]. Digestion of food proteins yields a variety of exogenous bioactive peptides including antihypertensive and antimicrobial peptides; however, in addition to the beneficial bioactive peptides, deleterious antigenic peptides responsible for allergic reactions are not well defined [8, 16, 70, 83, 85, 259]. Elucidating the sequences of peptides within a complex biological sample allows for determination of those peptides that have the capacity for biological activity.

3.1.1 Challenges of the Proteomic Approach to Identifying Digestion Resistant Peptides

The profiling and subsequent identification of bioactive peptides face a variety of challenges due to the complex nature of the biological samples.
Within the proteomic workflow, proteins that have undergone specific digestions are identified and characterized through the isolation and subsequent fragmentation of the most abundant peptides in a single scan, designated as the 'bottom up' approach [168, 170, 184]. In the case of bioactive peptides, the samples are often complex, containing a multitude of peptides that are not well separated by the commonly used C18 reversed phase chromatography. As a result, many peptides elute simultaneously. Because of this simultaneous elution and ionization, ions compete for charge and are suppressed by co-eluting compounds such as salts, neutrals and other biological components that hinder ionization, resulting in lower abundance [226]. Peptides with low abundance are typically not chosen automatically for fragmentation via data-dependent acquisition (DDA), although they may have significant biological activity.

In addition, when subjected to fragmentation for identification, bioactive peptides face additional challenges that limit the informative data obtained. In contrast to the typical proteomic workflow, which uses specific proteolysis, usually with trypsin, to yield multiply-charged peptides with basic C-terminal residues, bioactive peptides are potentially exposed to multiple enzymes, and therefore the terminal residues cannot be guaranteed to be basic. In addition, because of the numerous enzymes, the resulting peptides are typically small and singly-charged. As a result, when subjected to low energy fragmentation such as collision-induced dissociation (CID), small singly-charged peptides typically generate insufficient fragmentation data. This is because small singly-charged peptides lack a secondary charge site removed from the N-terminus to facilitate breaking of the amide backbone via the mobile proton model [193, 196].
3.1.2 Fragmentation of Peptides

Fragmentation of peptides in low energy collisions is carried out via either the charge directed or the charge remote pathway. The most common pathway in low energy collisions is the charge directed pathway, which results in characteristic sequence b and y ions. Charge directed fragmentation is also described by the mobile proton model, in which a mobile proton promotes cleavage along the peptide backbone to form b and y ions [193, 194, 196, 268]. In the mobile proton model, there are typically two sites of protonation: the amine group of the N-terminus, and a secondary site determined by gas phase basicity. The charge site with greater proton affinity is 'sequestered' or fixed and is subsequently not involved in the mobile proton model [193, 194, 196, 268]. The other proton then migrates to other amides for subsequent protonation. When the backbone amide is protonated, the bond is weakened because the resonance between the carbonyl and the lone pair on the nitrogen is destroyed. The amide bond can be broken through direct cleavage of the bond, but this is unlikely at low energies [193, 194, 196, 268]. Under low energy collision conditions, which include CID in ion traps, peptides fragment through the bx-yz pathway. In the bx-yz pathway, the mobile proton attaches to an amide nitrogen, which makes the adjacent carbonyl carbon a good target for nucleophilic attack. The N-terminal carbonyl oxygen then attacks the adjacent carbonyl carbon, which creates a five membered oxazolone ring. The weakened amide bond then breaks, creating a b and y ion, in which the charged ion is determined by proton affinity [193, 194, 196, 268]. While energetically preferential for multiply-charged peptides, singly-charged peptides are also capable of undergoing fragmentation, but it requires more energy.
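The b and y sequence ions described above can be predicted from a candidate sequence by summing residue masses from either terminus. The following is a minimal sketch using standard monoisotopic residue masses; the function name is hypothetical, only singly-charged fragments are considered, and internal or rearranged fragments are not modeled. The example sequence is the peptide LEPHEIAHL annotated in the previous chapter.

```python
# Standard monoisotopic residue masses (Da); only the residues needed here.
MONO = {"L": 113.08406, "E": 129.04259, "P": 97.05276, "H": 137.05891,
        "I": 113.08406, "A": 71.03711}
WATER = 18.010565
PROTON = 1.007276

def b_y_ions(sequence):
    """Return singly-charged b and y ion m/z values for each backbone cleavage.

    b[k] is the N-terminal fragment containing k+1 residues; y[k] is the
    complementary C-terminal fragment produced by the same cleavage.
    """
    b_ions, y_ions = [], []
    for i in range(1, len(sequence)):
        n_term = sum(MONO[aa] for aa in sequence[:i])
        c_term = sum(MONO[aa] for aa in sequence[i:])
        b_ions.append(n_term + PROTON)          # b ion: residues + proton
        y_ions.append(c_term + WATER + PROTON)  # y ion: residues + H2O + proton
    return b_ions, y_ions

sequence = "LEPHEIAHL"
b_ions, y_ions = b_y_ions(sequence)
precursor = sum(MONO[aa] for aa in sequence) + WATER + PROTON
```

For every cleavage site, the b ion and its complementary y ion sum to the precursor [M+H]+ plus one proton mass, which provides a simple internal check when assigning sparse spectra.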
Therefore, under normal conditions, small singly-charged peptides such as bioactive peptides are less likely to undergo y ion formation, instead forming an abundance of b ions, as the charge is fixed at the N-terminus and is less mobile.

Alternatively, charge remote pathways can also occur, but normally in higher energy collisions [191]. In contrast to charge directed pathways, charge remote pathways cleave peptide bonds at a point not adjacent to the site of protonation, and can be directed by the residues of the peptide [191]. The formation of b and y ions via charge directed fragmentation is also influenced by the amino acid residues contained within the peptide. Proline has a unique side chain that incorporates the amide nitrogen to form a five membered ring. This ring alters the conformation of the peptide, directing fragmentation toward cleavage N-terminal to proline and favoring y ion formation over b ion formation [269]. In peptides rich in proline residues, there will be a distinct lack of informative b ions, making it difficult to correctly identify the peptide. In addition to proline residues, acidic residues can also direct fragmentation. In the charge directed pathway, acidic peptides produce abundant b ions C-terminal to the acidic residues [192]. This arises when the hydroxyl oxygen of the carboxylic acid of the glutamate or aspartate residue attacks the carbonyl of the amide backbone, breaking the amide bond. The charge is fixed at the N-terminus, creating b ions rather than a mix of the two determined by proton affinity as in the charge directed pathway [192]. These residue directed fragmentation pathways are of particular interest in the instance of gliadin peptides, where the bioactive peptides are predominantly glutamate and proline rich. During the formation of series ions through low energy collisions, rearrangements and internal fragments also occur, increasing fragmentation spectrum complexity.
Rearranged ions are formed through the cyclization of the b ion from the N-terminus to the C-terminus [158, 196, 270-274]. After ring formation, the ring can be broken open at many points, changing the order of the residues, and the resulting ion can then undergo normal charge directed fragmentation. Internal fragments are shorter peptides arising from the loss of N-terminal or C-terminal residues through previous neutral losses or ion formation. Cyclization and internal fragments are more likely to occur in ion traps, where the reaction times are longer, allowing these products to form and be observed [271-274].

Previously, we presented a new methodology for the detection, identification and characterization of digestion-resistant bioactive peptides. This method applied metabolomic tools such as LC-MS and multivariate analysis to peptidomic data to maximize the number of peptides detected, and gave potential identifications based upon simulated gastrointestinal digestions of known protein sequences. The metabolomic approach mitigates the challenges of co-elution, competitive ionization, ion suppression and DDA. However, despite these advantages, high mass resolution and tandem mass spectrometry are needed to confirm the tentative identities proposed by this method. Shown here are the fragmentation of the wheat peptidome and the unique fragmentation and rearrangement patterns of the digestion-resistant peptides.

3.2 Materials and Methods

Whole wheat and spelt grains were obtained from Dr. Laura McCabe from Michigan State University. The gastrointestinal enzymes pepsin, trypsin, and chymotrypsin were purchased from Sigma Aldrich. The internal standard tetracycline was also purchased from Sigma Aldrich. Formic acid (88%) was purchased from J.T. Baker and concentrated hydrochloric acid was purchased from EMD. Sodium bicarbonate was purchased from J.T. Baker. Tris-HCl was purchased from Invitrogen.
Approximately 1 g of whole wheat and spelt grain was ground to a fine powder, and lipids and other extractable metabolites were removed by extraction with 10 mL of dichloromethane for five minutes. After centrifugation (10000 x g, 10 min), the supernatant was removed. A protein fraction enriched in gliadin was prepared by extracting wheat and spelt grain with 10 mL of 70% ethanol in an incubator shaker at 200 rpm for 8 hrs at 37 ˚C. Following centrifugation (10000 x g, 10 min) the supernatant was collected. This extract was divided into triplicate 3 mL aliquots for subsequent digestion. Extracts were evaporated to dryness under vacuum using a Speedvac. The dried residues were dissolved using 9 mL of 1% hydrochloric acid. Gastric digestion was simulated using pepsin in a 75:1 (w/w) protein:pepsin ratio, assuming that 10% of the grain mass was protein extracted into 70% ethanol. Gastric digestion was allowed to proceed for 2 hrs at 37 ˚C in the incubator, with shaking at 200 rpm. Digestion was halted by addition of sufficient saturated sodium bicarbonate solution to raise the solution pH to 8.0. Intestinal digestion was simulated using the sequential addition of trypsin and chymotrypsin. A 0.39 µM trypsin solution was prepared immediately beforehand in 10 mM Tris-HCl buffer (pH 7.5), and 50 µL was added to the digested gliadin proteins. Digestion was allowed to occur at 37 ˚C in the incubator shaker for 2 hrs at 200 rpm. After tryptic digestion, a 9.3 µM chymotrypsin solution was freshly prepared in 10 mM Tris-HCl (pH 7.5), and 50 µL was added to the digest. Digestion was allowed to occur at 37 ˚C in the incubator shaker for 2 hrs at 200 rpm. Intestinal digestion was halted by drop-wise addition of concentrated hydrochloric acid, lowering the pH to 2.0-3.0. Samples were evaporated to dryness under vacuum (Speedvac), and reconstituted in 0.1% formic acid in preparation for mass spectrometric analysis.
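The enzyme quantities implied by this protocol can be worked out with simple arithmetic. The sketch below is an illustration only: the function names are hypothetical, and the pepsin calculation assumes that each 3 mL aliquot carries 3/10 of the protein extracted from 1 g of grain, which is an interpretation of the procedure rather than a stated value.

```python
def enzyme_picomoles(concentration_uM, volume_uL):
    """Amount of enzyme added, in pmol (1 µM x 1 µL = 1 pmol)."""
    return concentration_uM * volume_uL

def pepsin_mass_mg(grain_mass_mg, protein_fraction, aliquot_fraction,
                   protein_to_pepsin_ratio):
    """Pepsin mass for one aliquot at a given protein:pepsin (w/w) ratio."""
    protein_mg = grain_mass_mg * protein_fraction * aliquot_fraction
    return protein_mg / protein_to_pepsin_ratio

trypsin_pmol = enzyme_picomoles(0.39, 50)        # 0.39 µM, 50 µL
chymotrypsin_pmol = enzyme_picomoles(9.3, 50)    # 9.3 µM, 50 µL

# Assumed: 1 g grain, 10% extractable protein, 3 mL of a 10 mL extract, 75:1 ratio.
pepsin_mg = pepsin_mass_mg(1000, 0.10, 3 / 10, 75)
```

Under these assumptions the calculation gives roughly 19.5 pmol of trypsin, 465 pmol of chymotrypsin, and 0.4 mg of pepsin per aliquot.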
After reconstitution, tetracycline was added to a final concentration of 10 µM as an internal standard. All samples were diluted 1:3 before analysis.

Samples were analyzed by the Proteomics Core Facility at Michigan State University using a Thermo Fourier Transform Ion Cyclotron Resonance-Linear Ion Trap (FTICR-LIT) mass spectrometer. Separation was carried out using an Ascentis Express C18 fused core column (2.1 x 5 mm, 2.7 µm particles; Sigma-Aldrich) with a 0.4 mL/min flow rate. Digestion products were separated using a gradient of 0.15% aqueous formic acid (solvent A) and acetonitrile (solvent B), with the initial conditions (A/B) of 99/1 held for 2 min, followed by a linear increase to 95/5 at 5 min, increasing again to 85/15 at 15 min, undergoing a final increase to 70/30 at 55 min and a 3 min hold before returning to initial conditions. The FTICR allows for high mass accuracy on the parent ion scan, but low mass accuracy on the daughter ion scans. Peptides were selected for fragmentation using DDA. The FTICR survey scan covered m/z 300-2000 with a resolution of 50000. For DDA, the 5 most abundant mass signals within a single scan with a minimum intensity of 8000 were selected for collision in the LIT via CID. Collision energy was 35 eV, and the q-value for activation was 0.25. Peptides previously given tentative identifications through the developed peptidomic workflow were then examined through de novo sequencing to confirm the identity along with the high mass resolution from the survey scan.

3.3 Results and Discussion

3.3.1 Peptidomic Profiling

Through previously developed methods for improved peptidomic profiling of digestion-resistant peptides in complex biological samples, 175 unique peptide masses were detected and assigned a potential identification, and 150 of those were also detected using LC-LIT FTICR MS.
During the initial assignment of identity, accurate masses from simulated gastrointestinal digestions were used as the basis of identification. As digestion-resistant peptides are often small, singly-charged, and lack a basic terminal residue owing to potentially multiple enzymatic digestions, the peptides do not fragment well according to the mobile proton model preferred in low energy collisions. Therefore, to maximize the identifications from detected peptides, a nontraditional LC-MS approach in combination with multivariate analysis is used to determine which peptides differentiate highly complex samples from each other. However, to confirm the identity with greater certainty, higher mass resolution and tandem mass spectrometry are needed. With greater mass resolution, tentative peptide identities can be more accurately assigned through simulated gastrointestinal digestion, as seen in Table 3.1. With the higher mass accuracy afforded by FTICR-MS, 30 peptides were assigned new tentative identifications based on higher accuracy m/z values that allow narrower mass tolerances to be applied for annotation of ion masses measured by the survey scan. In addition, the mass error of the tentative identifications decreased accordingly, with 54 of the 82 annotations agreeing with calculated masses within 5 ppm. However, not all peptide annotations could be confirmed using MS/MS because their abundances were too low for them to be selected by the DDA process.

Table 3.1: List of peptide masses annotated using ExPASy simulated gastrointestinal digestions through comparison of observed masses measured using LC-LIT-FTICR-MS

Observed FTICR m/z [M+H]+   Peptide annotation                     FTICR ppm error
401.2880                    RIL                                    -1.0
403.2306                    RDL                                    0.5
417.2343                    GLDI or VEGL                           -1.4
417.2343                    GLDI or VEGL                           -1.4
431.2867                    SLVL                                   -0.5
445.2665                    DVVL or ALEL                           0.7
455.2477                    PPEL                                   -6.1
457.2670                    VPEL or EVPL                           1.7
472.3133                    NILL                                   -0.2
473.2764                    NLAR                                   -15.0
473.2767                    NLAR                                   -14.4
487.2938                    QAIR                                   -11.1
503.2721                    LEEL or EEIL                           1.0
516.3152                    EVIR                                   1.3
532.3247                    RPLF                                   0.0
546.3142                    ALETL                                  0.7
568.2634                    SEVSF                                  2.8
590.3034                    IEESL                                  -0.3
590.3034                    IEESL                                  -0.3
616.2850                    EPEEL or EEEPL                         3.4
626.2727                    QQSTY                                  -8.5
632.3147                    LEEEL                                  -0.8
634.3451, 634.3091          PYLEL or ELEPF                         -0.2, 0.5
649.3290                    EQYPL                                  14.5
707.3626                    NPCKVF                                 10.7
735.3215                    EPQQSF or ESQQPF                       -13.2
737.3026                    EPEESF or ESEEPF                       4.5
744.3614                    QPQQPF or QQQPPF                       -8.2
747.4305                    SSLAEKL                                7.2
761.3342                    CEEPQR                                 11.9
761.3342                    CEEPQR                                 11.9
762.4036                    QLVQQF                                 -14.2
773.3892                    EPRQPF                                 -6.8
809.4150                    SATTSVPF                               13.1
809.4150                    SATTSVPF                               13.1
828.4136                    PEEPFPL                                -0.7
833.3834                    PGQQEQF, EGQQQPF or PGEQQQF            4.9
834.3530, 834.3869          SEEEPPF or PSEEPYL                     1.1, -1.9
834.3530, 834.3869          PSEEPYL or SEEEPPF                     1.1, -1.9
842.3798                    PEPEEEL                                1.8
844.3733                    PEPEEPF or PEEEPPF                     0.6
844.3733                    PEPEEPF or PEEEPPF                     0.6
850.3858                    EPEESFL or ESEEPFL                     2.8
850.3858                    EPEESFL or ESEEPFL                     2.8
852.3762                    FPEEPSF                                -2.0
852.3762                    FPEEPSF                                -2.0
860.4300                    EESKPASL                               -7.4
866.3536                    EEEQQGF                                0.6
868.4226                    LEPHEPF                                2.5
874.3692                    EEPEEEL                                1.2
876.3615                    EEPEEPF or PEPEEEF                     -1.2
908.4850                    RTTTSVPF                               1.5
931.4292                    WEQQPPF                                -2.2
937.4712                    EPHEIAEL                               8.7
947.4468                    PLEPEESF                               11.3
976.4623                    SEQPQQIF                               -11.9
989.4492                    EPEELPEF                               2.6
993.4032                    SEEEEEEL                               13.5
1006.4850                   FEIPEESR                               0.6
1050.5471                   LEPHEIAEL                              0.1
1050.5474, 1050.5006        LDGSSVQTPF or LEPHEIAEL                -9.5, 0.4
1058.5642                   LEPHEIAHL                              0.7
1058.5652                   LEPHEIAHL                              0.7
1060.4347                   GEEPEEEEL                              2.4
1060.5579                   PSELPYLEL                              1.3
1070.4915                   PETEEPQQL                              -8.4
1100.4671                   PPEEEEEEL                              3.4
1190.5430                   SEEEEVGEGIL                            0.2
1262.6134                   PQQQQQHQQL                             -8.1
1272.5850                   LQELCCHLW                              -1.1
1276.5352, 1276.5670        EEEEEEEQIL or PEEEEPAIESF              -6.2, 6.8
1366.5834                   FPESEEPEQQF                            2.3
1366.5834                   FPESEEPEQQF                            2.3
1368.5618                   FPESEEPEEEF                            9.9
1431.6248                   PEEPPFSEEEEL                           5.7
1433.6348                   EEPLSEEPQQTF                           -6.1
1617.7722, 1617.7242        RPEQPYPQSQPQY or SEEPPFSEEEEPVL        3.8, 4.4
1627.7424                   PEEPPFSEEEEPVL                         2.8
1657.6828                   EEPEEEYPSSEVSF                         4.4
1689.7136                   GECVSEPEQQSQQQL                        -14.9
1697.7416                   EEPEEPFPEPEQQL                         -7.5
1988.8662                   EEEPLPPEESFSEEPPF                      0.5
2425.0807                   VSIILPRSDCEVMEEECCEQL                  2.1
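The mass-based annotation summarized in Table 3.1 amounts to comparing an observed [M+H]+ value with the calculated monoisotopic mass of each candidate sequence and keeping the candidates that fall within a ppm tolerance. The following is a minimal sketch of that comparison, not the ExPASy implementation used in this work. The function names are hypothetical, and the residue masses are standard monoisotopic values, so calculated errors may differ slightly from the tabulated ones depending on calibration and the mass table used.

```python
# Standard monoisotopic residue masses (Da); illustrative, not taken from this work.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # H2O added for the free N- and C-termini
PROTON = 1.007276   # charge carrier for [M+H]+


def mh_plus(sequence):
    """Calculated monoisotopic m/z of the singly protonated peptide."""
    return sum(RESIDUE_MASS[r] for r in sequence) + WATER + PROTON


def ppm_error(observed, calculated):
    """Signed mass error in parts per million."""
    return (observed - calculated) / calculated * 1e6


def candidates_within(observed, sequences, tol_ppm):
    """Keep the candidate sequences whose calculated m/z lies within tol_ppm."""
    return [s for s in sequences
            if abs(ppm_error(observed, mh_plus(s))) <= tol_ppm]


if __name__ == "__main__":
    # Sequence isomers such as VPEL and EVPL share one calculated mass,
    # so accurate mass alone cannot distinguish them.
    print(candidates_within(457.2670, ["EVPL", "VPEL", "NILL"], tol_ppm=5))
    # Deamidation (Q -> E) shifts the mass by about +0.984 Da per site.
    print(round(RESIDUE_MASS["E"] - RESIDUE_MASS["Q"], 5))
```

Because isomeric sequences are indistinguishable by mass, entries such as "VPEL or EVPL" remain ambiguous until MS/MS evidence is available.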
From the 175 peptides annotated using LC-TOF MS, 82 were also detected using the LC-LIT FTICR MS analysis; the other ions were not included because they were not detected using FTICR MS. The peptides resulting from the simulated gastrointestinal digestion of wheat gliadin proteins ranged in mass from m/z 400 to 2425. The overwhelming majority of the peptides were singly-charged in the original LC-TOF MS analysis, and when subjected to LC-LIT FTICR MS, all confirmatory peptide ions were singly-charged. In addition, only 18 of the tentatively annotated peptides contained a basic C-terminal residue; the remainder favored aromatic and leucine terminal residues, consistent with pepsin and chymotrypsin activity. The peptides also show evidence of extensive deamidation, the post-translational modification of glutamines to glutamates, a factor known to increase toxicity of gliadin peptides [110].

3.3.2 Confirmation of Annotations using LC-LIT-FTICR MS

Efforts to annotate peptides in a peptidomic analysis are challenged by numerous factors, since a priori knowledge of terminal residue identities is often minimal, and one cannot rely on information from other peptides to support the identification. The DDA method executed by the Michigan State University Proteomics Core chose the five most abundant peptide mass signals in a single scan for submission to fragmentation. Despite efforts to maximize the number of peptides identified via LC-LIT FTICR MS, of the 82 peptides listed in Table 3.1, only 37 were submitted for fragmentation. More than half of the peptides (45 of 82) that were previously determined to distinguish wheat from spelt with high confidence were therefore too low in abundance for DDA to elucidate their identity. This shows that without prior knowledge of likely peptide identities, the bottom-up approach to proteomics/peptidomics misses many distinguishing peptide mass signals, particularly those of low abundance relative to other peptides in the mixture.
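The way a top-five DDA rule excludes low-abundance ions can be illustrated with a short sketch. This is a simplified model under stated assumptions, not the acquisition software's logic: the function names, the matching tolerance, and the example intensities are all hypothetical, and real DDA methods also apply dynamic exclusion and charge-state rules that are ignored here.

```python
import heapq


def select_precursors(scan, top_n=5):
    """Return the m/z values of the top_n most intense ions in one survey scan.

    scan is a list of (m/z, intensity) pairs.
    """
    return [mz for mz, _ in heapq.nlargest(top_n, scan, key=lambda p: p[1])]


def missed_targets(scan, targets, top_n=5, tol=0.01):
    """Return target m/z values that the top_n rule would not fragment."""
    chosen = select_precursors(scan, top_n)
    return [t for t in targets
            if not any(abs(t - mz) <= tol for mz in chosen)]


if __name__ == "__main__":
    # Hypothetical intensities for co-eluting ions; m/z values are from Table 3.1.
    scan = [(401.288, 50), (457.267, 900), (503.272, 800), (590.303, 700),
            (632.315, 600), (844.373, 500), (1050.547, 40)]
    print(select_precursors(scan))
    print(missed_targets(scan, targets=[401.288, 457.267]))
```

In this toy scan the ion at m/z 401.288 is never selected, however diagnostic it may be, simply because five co-eluting ions are more abundant.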
Further examination of the peptides submitted to fragmentation demonstrated that digestion-resistant peptides, particularly those with characteristics similar to the gliadin digestion products, do not always yield MS/MS spectra with sufficient information for unambiguous identification.

3.3.3 Internal Fragment Ion Formation for the Identification of Challenging to Fragment Peptides

Ion trap fragmentation differs from CID in QTof or triple quadrupole instruments in that the primary ion is isolated and subjected to resonant excitation that leads to ion-molecule collisions and ion dissociation. In the ion trap, product ions that are formed are no longer in resonance with the excitation frequency, and fragmentation is less extensive than with QTof instruments, where product ions continue to undergo energetic collisions with collision gas molecules. For example, Figure 3.1 shows one of the more useful LIT product ion spectra, specifically products of [M+H]+ of the peptide LEPHEIAEL. The product ion spectrum is dominated by numerous b-series ions and one y-series ion, as well as a few internal sequence ions. The preponderance of b-series ions is consistent with the charge being more localized towards the N-terminus, as expected given the N-terminal amino group and the histidine in position 4. The relative abundances of the fragment ions also bolster confidence in the annotation. The proline effect, in which fragmentation N-terminal to the proline is preferential in the presence of a mobile proton, results in higher abundance y-type fragment ions, as seen with the ion y7 [269, 275].

Figure 3.1: Product ion MS/MS spectrum generated for [M+H]+ of the peptide LEPHEIAEL in the sequential digestion products of 70% ethanol extracts of ground wheat. Open circles (°) indicate loss of water, and bracketed ions [] indicate internal fragment ions.
Although this ion contains a C-terminal glutamate residue that normally produces an abundance of b-type ions, the formation of b-ions has been reduced by the preferred proline effect because pyrrolidine has a higher proton affinity for the mobile proton [275]. Additionally, the glutamic acid (aspartic acid) effect is seen in the higher relative abundance of the b8 ion, further increasing the confidence of identification. While the peptide LEPHEIAEL yielded numerous product ions that supported its annotation, the majority of gliadin-derived peptides did not demonstrate the same behavior. An example is the peptide annotated EVPL, whose product ion spectrum is presented in Figure 3.2. The [M+H]+ mass measured by FTICR-MS bolsters the identification of the peptide; however, the product ion spectrum yields minimal evidence to support this annotation, displaying only loss of water from the precursor and b2 and y2 (both with nominal m/z 229, and derived from cleavage of the V-P bond). Since the b2 and y2 ions have the same nominal mass, no distinction could be made between the two ions from the LIT MS/MS spectrum, although the peak is likely a y-type ion due to the proline effect. The high prevalence of glutamate residues in the annotated gliadin peptides helps explain the unusual MS/MS spectra, which often display both internal fragment ions and b-H2O and y-H2O ions. Figure 3.3 shows the product ion MS/MS spectrum of the annotated peptide SEEEEPVL, which formed both b and y series dehydrated ions in addition to the corresponding series ions. In addition to the series ions, two internal fragments, EEE and EEVL, are formed. EEVL indicates a rearrangement of the residues, postulated to be driven by intramolecular dehydrative cyclization involving glutamate side chains.
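The b2/y2 ambiguity for EVPL can be checked numerically. The sketch below computes both fragment masses from standard monoisotopic residue masses (function names are hypothetical); it is an illustration of the arithmetic, not a description of the software used in this work.

```python
# Standard monoisotopic residue masses (Da) for the residues in EVPL.
RESIDUE_MASS = {"E": 129.04259, "V": 99.06841, "P": 97.05276, "L": 113.08406}
WATER = 18.010565
PROTON = 1.007276


def b_ion(sequence, n):
    """m/z of the singly charged b_n ion (first n residues plus a proton)."""
    return sum(RESIDUE_MASS[r] for r in sequence[:n]) + PROTON


def y_ion(sequence, n):
    """m/z of the singly charged y_n ion (last n residues plus water and a proton)."""
    return sum(RESIDUE_MASS[r] for r in sequence[-n:]) + WATER + PROTON


def resolving_power_needed(mz_a, mz_b):
    """Approximate m/delta-m required to separate two neighbouring peaks."""
    return ((mz_a + mz_b) / 2) / abs(mz_a - mz_b)


if __name__ == "__main__":
    b2 = b_ion("EVPL", 2)   # E + V
    y2 = y_ion("EVPL", 2)   # P + L
    print(f"b2 = {b2:.4f}, y2 = {y2:.4f}, difference = {y2 - b2:.4f}")
    print(f"resolving power needed: about {resolving_power_needed(b2, y2):.0f}")
```

Both ions fall at nominal m/z 229 and differ by only about 0.036, which explains why a unit-resolution LIT product ion spectrum cannot tell them apart.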
Figure 3.2: Product ion MS/MS spectrum generated for [M+H]+ of the peptide EVPL in the sequential digestion products of 70% ethanol extracts of ground wheat. Open circles (°) indicate loss of water.

Figure 3.3: Product ion MS/MS spectrum generated for [M+H]+ of the peptide SEEEEPVL in the sequential digestion products of 70% ethanol extracts of ground wheat. Open circles (°) indicate loss of water, and bracketed ions [] indicate internal fragment ions.

Internal fragments, particularly the [EEVL]+ internal fragment that includes four residues out of their sequence order, point towards cyclization and rearrangement. For cyclization to occur, an electron-rich group must make a nucleophilic attack upon an electron-poor group, forming a ring structure. In the case of a peptide containing glutamate residues, the amide nitrogen can attack the carbonyl carbon of the carboxylic acid, causing a water molecule to leave; this is particularly facile when the carboxylic acid is protonated. The resulting ion has the mass of a cyclic dehydrated b ion, which can then undergo subsequent fragmentation. The fragment ion annotated as [EEVL]+ is consistent with cyclization and rearrangement to eliminate the proline from a terminal end. For extensively deamidated peptides that are rich in glutamate residues, the internal fragments are often essential for peptide annotation. As demonstrated by the peptides PPEEEEEEL, PEEPPFSEEEEPVL and EPEELPEF in Figures 3.4, 3.5 and 3.6, respectively, the product ion spectra are dominated by internal fragment ions and show a lack of informative b- or y-series ions. For such glutamate-rich peptides, identification relies upon annotation of internal fragments. The peptide PPEEEEEEL (Figure 3.4) also demonstrates unusual behavior consistent with formation of a cyclic form of the dehydrated molecular ion.
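Whether an annotated internal fragment is a contiguous stretch of the parent sequence, or instead implies rearrangement, can be checked with a simple substring test, and its expected m/z follows from the residue masses. The sketch below assumes b-type internal fragments (residue masses plus one proton); the function names are hypothetical, and the code does not attempt to model the cyclization mechanism itself.

```python
# Standard monoisotopic residue masses (Da) for the residues in SEEEEPVL.
RESIDUE_MASS = {"S": 87.03203, "E": 129.04259, "P": 97.05276,
                "V": 99.06841, "L": 113.08406}
PROTON = 1.007276


def internal_fragment_mz(fragment):
    """m/z of a singly charged b-type internal fragment (assumed ion type)."""
    return sum(RESIDUE_MASS[r] for r in fragment) + PROTON


def is_contiguous(fragment, peptide):
    """True if the fragment occurs as an uninterrupted stretch of the peptide."""
    return fragment in peptide


if __name__ == "__main__":
    peptide = "SEEEEPVL"
    for fragment in ("EEE", "EEVL"):
        print(fragment,
              f"m/z {internal_fragment_mz(fragment):.4f}",
              "contiguous" if is_contiguous(fragment, peptide)
              else "requires rearrangement")
```

In this check EEE is an ordinary internal fragment of SEEEEPVL, whereas EEVL cannot be cut from the linear sequence without removing the intervening proline, in line with the rearrangement argument above.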
Given the distinct lack of y-series ions, and the presence of only two informative b-series ions, whose high abundances are consistent with the glutamic acid effect, identification of the peptide without internal fragments would be challenging, particularly given the extensive deamidation, in which all six glutamates are a product of deamidation. Likewise, annotation of the larger peptide PEEPPFSEEEEPVL (Figure 3.5) also relied heavily upon internal fragments for identification. The most abundant peak, representative of b11, does follow the suppositions of the glutamic acid effect, increasing the confidence of identification; however, given only four non-dehydrative sequence ions, sequence annotation runs a greater risk of false discovery if based entirely on these product ions, and the internal fragments must be used.

Figure 3.4: Product ion MS/MS spectrum of [M+H]+ of the peptide PPEEEEEEL in the sequential digestion products of 70% ethanol extracts of ground wheat. Open circles (°) represent loss of water, closed circles (•) ions plus water ion types, and bracketed ions [] internal fragment ions.

Figure 3.5: Product ion MS/MS spectrum generated for [M+H]+ of the peptide PEEPPFSEEEEPVL in the sequential digestion products of 70% ethanol extracts of ground wheat. Open circles (°) indicate loss of water, and bracketed ions [] indicate internal fragment ions.

Figure 3.6: Product ion MS/MS spectrum generated for [M+H]+ of the peptide EPEELPEF in the sequential digestion products of 70% ethanol extracts of ground wheat. Open circles (°) indicate loss of water, and bracketed ions [] indicate internal fragment ions.

Of note, in the gliadin peptides, which have a high prevalence of proline and glutamine residues and, as a result of deamidation, glutamate residues, the ring openings upon cyclization are predominantly C-terminal to the proline residues (Figures 3.4, 3.5), which is counter to the proline effect.
This phenomenon has been previously observed by Jia and coworkers, where cyclized peptides with a fixed charge show preferential cleavage C-terminal to the proline [271]. However, not all gliadin-derived peptides yielded product ion spectra that provided such support for peptide identification. The peptide annotated as EPEELPEF (Figure 3.6) does not yield fragments derived from cleaving as many peptide bonds as was observed for other glutamate-rich peptides. It is suspected that dehydrative cyclization involving one or more glutamate side chains in parallel may yield intermediate ions that primarily decompose to yield internal sequence ions. While identification of this peptide would have been less convincing if only the traditional a, b, c/x, y, z fragments were considered, the three internal fragment peaks bolster confidence in annotation of this peptide. Similar behavior was observed for a large majority of the gliadin peptides that were heavily deamidated. While several peptides, particularly the heavily deamidated gliadin peptides, were successfully annotated and supported using MS/MS spectra, there were several instances in which this was not possible. An example is the spectrum of the peptide annotated NILL, shown in Figure 3.7. In this example, no confirmatory sequence ions or internal fragment ion peaks were identified using de novo sequencing, making confirmation of identification using MS/MS questionable, although the mass differences between the peaks are in line with known amino acids, suggesting that the chosen ion is a peptide.

Figure 3.7: Product ion MS/MS spectrum generated for [M+H]+ of the peptide NILL in the sequential digestion products of 70% ethanol extracts of ground wheat.

In terms of confirming identification, proteomics often employs automated searches using protein databases.
However, given the variable amount of deamidation seen among the gliadin peptides, no confirmatory matches could be found using common database search tools such as Mascot. Even when entering peptides whose MS/MS spectra contained many fragment masses that were identified as series or internal fragment peaks, such as the peptide LEPHEIAEL seen in Figure 3.1, no protein or peptide matches of good confidence (score >70) were obtained, and none of the lower scoring matches were from a wheat parent protein, despite the inclusion of deamidation as a post-translational modification. The deviation from the parent protein sequence caused by deamidation, together with the increased propensity for internal fragments, appears to decrease the ability to find database matches that confirm the de novo sequence of the peptides. Therefore, especially as the number of sequence ions decreases, the potential identifications assigned from the non-targeted approach, combined with the internal fragments, become essential for the identification of these peptides, as current approaches are insufficient.

3.4 Conclusion

Successful identification of digestion-resistant peptides is often hindered by their low charge states and lack of common a, b, c/x, y, z fragment ions. To mitigate these limitations, a metabolomic approach was developed and used to annotate genotype-differentiating peptides based on peptide molecular mass, regardless of abundance. However, to provide greater confidence in peptide annotation, MS/MS spectra generated using a linear ion trap along with high mass accuracy survey scans from FTICR-MS were employed. The majority of genotype-discriminating digestion-resistant peptides were not selected for fragmentation using DDA owing to their low abundance relative to other co-eluting peptides.
While part of this limitation of LC-MS/MS with DDA might be solved through use of mass inclusion lists, MS/MS spectra of many differentiating peptides were not successfully identified using Mascot and Protein Prospector tools. Those peptides that were selected for fragmentation often showed a lack of informative a, b, c/x, y, z series ions, and the multiple post-translational deamidations of individual peptides also challenge common peptide identification strategies. The highly deamidated gliadin peptides also display unusual fragmentation patterns suggestive of dehydrative cyclization followed by rearrangement, often leading to abundant internal fragment ions. Based on these findings, it is evident that reliable identification of digestion-resistant peptides must rely heavily on accurate mass measurements, and could benefit from application of alternative ion activation methods that yield a more complete range of fragments derived from cleavage at multiple sites along the peptide backbone. However, since the majority are singly-charged peptides, alternative approaches beyond electron capture and electron transfer technologies will probably be needed.

CHAPTER 4

4.1 Introduction

4.1.1 Challenges of Analyzing the Digestion-Resistant Peptidome

The analysis of peptide mixtures through mass spectrometry has been of great interest in the past three decades due to its implications for disease state diagnosis, as well as for determination and characterization of structure with the development of proteomic strategies. Because digestion-resistant peptides arise from a variety of endogenous and exogenous sources and are responsible for a wide range of biological activity, including beneficial and deleterious effects, their characterization and identification are of special interest to researchers in the field of peptidomics [70, 199, 201].
Digestion-resistant peptides present a particular challenge for bottom-up proteomic methods because many peptides lack common proteolytic sites [15, 89, 90]. For example, the wheat gliadin proteins, which are known to cause some wheat allergies and trigger Celiac Disease (CD) symptoms, are rich in proline and glutamine residues, and the peptides derived from gliadins lack sites of proteolysis for the common gastrointestinal enzymes [99, 116]. Many digestion-resistant bioactive peptides have been challenging to identify due to a lack of informative MS/MS spectra, owing to their survival of exposure to multiple enzymes and their low charge states [193, 195, 196]. Therefore, to more successfully identify peptides without relying on deliberate proteolytic digestion to break them into multiple digestion products, an alternative method for annotation of peptides is needed. It was proposed that by eliminating data-dependent MS/MS acquisition and treating peptidomics with a strategy borrowed from non-targeted metabolomic analyses, accurate peptide masses obtained from High Performance Liquid Chromatography-Mass Spectrometry (HPLC-MS) could be used to annotate peptides in complex mixtures. However, the analysis of complex peptide mixtures by liquid chromatography-mass spectrometry (LC-MS) has long been hindered by insufficient separations. Traditionally performed on reversed phase columns, specifically an octadecyl (C18) alkyl chain covalently bonded to a silica surface, the separation of peptide mixtures in a single dimension often suffers from co-elution, competitive ionization, and ion suppression of analytes of interest [226]. These challenges in separating peptides may result in inaccurate relative ion abundances that can compromise recognition of altered peptide profiles in diagnosis of disease or recognition of phenotypic differences between genotypes.
In addition, peptides of interest are frequently some of the least abundant peptides in a complex mixture, and are often obscured by the presence of more abundant peptides, as exemplified by the plasma proteome, where proteins are present in abundances ranging over ten orders of magnitude [173].

4.1.2 Multidimensional Separations

A single liquid chromatographic separation offers peak capacities ranging from hundreds to perhaps a thousand analytes, but these numbers are dwarfed by the tens of thousands of peptides that may be present in biological tissues and fluids. One approach to improve peak capacity derives from employing multi-dimensional separations to reduce co-elution by fractionating complex protein digests based upon two or more modes of retention on-line. Off-line pre-fractionation of protein digests includes two-dimensional gel separations that have found success in fractionating and isolating protein classes using isoelectric points and molecular weight [146, 148, 151]. Excised protein spots have been further digested and analyzed via HPLC-MS/MS [146, 148, 151]. While successful in reducing the complexity of the full protein digest and subsequent spectra, off-line methods are time consuming and labor intensive, and are restricted to the analysis of proteins with moderate isoelectric points and larger molecular weights. In 2001, an on-line multidimensional separation named the Multidimensional Protein Identification Technology (MudPIT) was introduced for the separation of complex protein digests [171]. MudPIT serially couples a strong cation exchange column to a reversed phase column, allowing for initial fractionation based on strength of ionic interactions, where each fraction is then separated further based on hydrophobicity (solubility) on the reversed phase column.
These types of separations achieve the goal of reducing complexity to the point where lower abundance proteins normally obscured by more abundant proteins can be analyzed, but are hindered by the time-consuming step-wise chromatography needed to accomplish sufficient separation [171].

4.1.3 C18 Retention Mechanisms

Separations based largely on solvation thermodynamics lack the selectivity and resolving power needed for peptides sharing similar hydrophobicities. Therefore, alternative mechanisms of retention are needed to separate complex protein digests via HPLC. The prevailing theory for C18 reversed phase separations, proposed by Horvath in 1976, is the solvophobic theory, in which an analyte associates with a non-polar stationary phase due to unfavorable interaction with the initial mobile phase until a sufficient concentration of organic solvent in the mobile phase is reached, at which point the analyte can partition into the mobile phase [276, 277]. If analytes exhibit identical retention behavior owing to similar hydrophobicity, they co-elute, and competitive ionization often leads to ionization suppression [226]. The solvophobic theory, however, does not take into account the role of surface silanol groups in analyte retention. Silanols are ionizable surface functional groups that are capable of participating in ion exchange interactions with ionic functional groups of analytes [278-280]. When the silanols are not endcapped, reversed phase separations are often driven by mixed-mode mechanisms of retention, in this case involving solvophobic partitioning and ion exchange [278-280].

4.1.4 Fluorinated Stationary Phases

To mitigate issues associated with co-elution, alternative separations exhibiting other modes of retention for complex digests are needed.
Fluorinated stationary phases bound to silica have exhibited mixed mode chromatographic (MMC) separation for the unique retention of small polar analytes due to both mobile phase and stationary phase interactions. This behavior was observed to result in a "U-shaped" relationship between organic modifier and retention time [281]. Fluorinated stationary phases have most notably been used for the separation of taxanes, diterpenes used for the synthesis of chemotherapy drugs [282-284]. Needham et al. observed superior retention and ionization of analytes at high (90%) concentrations of organic modifier, implying behavior similar to normal phase retention at these concentrations, which are normally unsuitable for traditional columns but highly amenable to mass spectrometry [285-288]. Using chemometrics, Euerby found that fluorinated stationary phases displayed dissimilar retention of many basic analytes compared to traditional stationary phases, prompting further study [283, 289]. Of particular note was the extended retention of hydrophilic basic analytes that are important to the pharmaceutical industry, resulting in a non-linear relationship between log k' and percent organic modifier, while neutral basic analytes retained the linear relationship observed with traditional stationary phases. Euerby suggested ion exchange at the surface-bound silanols as a possible explanation for the orthogonal selectivity of basic analytes [283, 289]. In a related study, Neue et al. also examined several different stationary phases and established that fluorinated stationary phases exhibited increased polar and phenolic selectivity [290]. However, in both cases, while suggestions of silanol interactions were made, no clear mechanism of retention was given.
Extending the observed retention characteristics of fluorinated stationary phases, Bell and Jones sought to define the retention mechanism of analytes, particularly basic analytes, at high percentages of organic modifier, where a significant increase in the retention of analytes had been observed [281]. In the Bell study, a fluorinated stationary phase was compared to the classical C18 hydrocarbon stationary phase as well as bare silica to explore the implications of ion-exchange mechanisms. The effects of mobile phase parameters were also explored, including not only the proportion of organic modifier but also the pH of the mobile phase components, as both affect the ionization of the surface silanol groups. Bell and Jones found that at higher concentrations of organic modifier (40-100% acetonitrile), basic analytes displayed a "U-shaped" retention time vs. percent organic modifier profile, which was pronounced beyond that observed in other reversed phase columns, a behavior explained by Horvath as an MMC retention arising from silanol interactions [280, 281]. The "U-shaped" retention profile observed by Bell was attributed to a "hydrophobically-assisted ion-exchange" mechanism, which supports the suppositions of Neue and Euerby that silanol interactions at the surface dominate the retention mechanism of basic analytes on a fluorinated column [281, 283, 291]. When bound to the fluorinated stationary phase, basic analytes displayed increased retention through an ion exchange mechanism supported by a hydrophobic mechanism at high concentrations of acetonitrile, but at lower concentrations (40%) ion exchange was less prominent. Bell also found that control over mobile phase conditions (pH) was an indispensable tool for influencing the retention of the basic analytes by manipulating the ionization of both the silanol groups on the surface and the analytes in solution [281].
To date, the pentafluorophenylpropyl (PFPP) stationary phase and its retention mechanisms at low concentrations of organic modifier have not been well characterized. As the efficient separation of peptides for proteomic and peptidomic analyses is essential in order to obtain useful and informative data, improved strategies are needed when reversed phase separations do not provide the retention selectivity needed for key analytes. It is our goal to explore the relationships between analyte structure and function that lead to retention on the fluorinated stationary phase using a complex protein digest consisting of peptides known to contain varying degrees of polar, aromatic and aliphatic groups. To determine if hydrophobicity is a primary or secondary mode of retention, comparisons between a hydrocarbon (C18) stationary phase and the PFPP stationary phase will be made. Although few studies linking retention indexes and hydrophobicity have been performed, some correlations have been found. Commonly, retention indexes have used hydrophobicity coefficients of individual residues to predict peptide elution for peptides up to 50 amino acids in length [292]. More recently, however, Krokhin and coworkers developed a new predictive retention index for tryptic peptides which takes into account retention coefficients and distance from the N-terminus for each residue. Using this new index, they were able to correlate hydrophobicity to retention time with a coefficient of determination of > 0.9 using a reversed phase separation of tryptic peptides [293].

4.2 Materials and Methods

4.2.1 Materials

Whole wheat and spelt grains were obtained from Dr. Laura McCabe from Michigan State University. The gastrointestinal enzymes pepsin, trypsin, and chymotrypsin, and the internal standard tetracycline were purchased from Sigma Aldrich. Sodium bicarbonate, 88% formic acid, Tris-HCl, and concentrated hydrochloric acid were purchased from VWR Scientific.
4.2.2 Extraction of Gliadin Proteins

Approximately 1 g of whole wheat and spelt grain was ground to a fine powder, and lipids and other extractable metabolites were removed by extraction with 10 mL of dichloromethane for five minutes. After centrifugation (10000 x g, 10 min), the supernatant was removed. A protein fraction enriched in gliadin was prepared by extracting wheat and spelt grain with 10 mL of 70% ethanol in an incubator shaker at 200 rpm for 8 hrs at 37 ˚C. Following centrifugation (10000 x g, 10 min), the supernatant was collected. This extract was divided into triplicate 3-mL aliquots for subsequent digestion. Extracts were evaporated to dryness under vacuum using a Speedvac. The dried residues were dissolved in 9 mL of 1% aqueous hydrochloric acid immediately before digestion with pepsin.

4.2.3 Simulated Gastrointestinal Digestion of Wheat Storage Proteins

Gastric digestion was simulated using pepsin in a 75:1 (w/w) protein:pepsin ratio, assuming that 10% of the grain mass was protein extracted into 70% ethanol [102]. Gastric digestion was allowed to proceed for 2 hrs at 37 ˚C in an incubator, with shaking at 200 rpm. Digestion was halted by addition of sufficient saturated sodium bicarbonate solution to raise the solution pH to 8.0. Intestinal digestion was simulated using the sequential addition of trypsin and chymotrypsin. A 0.39 µM trypsin solution was prepared immediately beforehand in a 10 mM Tris-HCl pH 7.5 buffer, and 50 µL was then added to the digested gliadin proteins. Digestion was allowed to occur at 37 ˚C in the incubator shaker for 2 hrs at 200 rpm. After tryptic digestion, a 9.3 µM chymotrypsin solution was freshly prepared in a 10 mM Tris-HCl pH 7.5 solution, and 50 µL was added to the digest. Digestion was allowed to occur at 37 ˚C in the incubator shaker for 2 hrs at 200 rpm. Intestinal digestion was halted by drop-wise addition of concentrated hydrochloric acid, lowering the pH to 2.0-3.0.
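The enzyme quantities implied by the digestion protocol can be worked out with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the assumption that each 3-mL aliquot carries 3/10 of the protein extracted into the 10 mL of 70% ethanol is an inference from the aliquot volumes, not a value stated in the protocol.

```python
def pmol_added(conc_uM, volume_uL):
    """Amount of enzyme added in pmol (µmol/L x µL = pmol)."""
    return conc_uM * volume_uL


def pepsin_mass_mg(grain_g, protein_fraction=0.10, aliquot_fraction=3 / 10,
                   protein_to_pepsin=75):
    """Pepsin mass for a 75:1 (w/w) protein:pepsin ratio.

    protein_fraction: assumed share of grain mass extracted as protein (10%).
    aliquot_fraction: ASSUMPTION, share of the extract in one 3-mL aliquot.
    """
    protein_mg = grain_g * 1000 * protein_fraction * aliquot_fraction
    return protein_mg / protein_to_pepsin


if __name__ == "__main__":
    print(f"trypsin added:      {pmol_added(0.39, 50):.1f} pmol")
    print(f"chymotrypsin added: {pmol_added(9.3, 50):.0f} pmol")
    print(f"pepsin per aliquot: {pepsin_mass_mg(1.0):.2f} mg (under the stated assumptions)")
```

Under these assumptions a 1 g grain sample corresponds to about 30 mg of protein per aliquot, and the chymotrypsin addition is roughly 24 times the molar amount of trypsin.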
Samples were evaporated to dryness under vacuum (Speedvac), and reconstituted in 0.1% formic acid in preparation for mass spectrometric analysis. After reconstitution, tetracycline was added to yield a final concentration of 10 µM as an internal standard. All digestion products were diluted 1:3 in 0.1% formic acid before analysis.

4.2.4 Liquid Chromatography-Mass Spectrometry

Digestion products were analyzed using a Waters LCT Premier time-of-flight mass spectrometer (TOF-MS) interfaced to a Shimadzu LC-20AD high performance liquid chromatography (HPLC) solvent delivery system. The order of analysis for the digests was randomized using Microsoft Excel, and randomly-selected digests were analyzed in duplicate for quality control. Aliquots (10 µL) were injected using a Shimadzu SIL-5000 autosampler. Separations were compared using two endcapped columns of similar dimensions based on superficially porous silica particles: an Ascentis Express C18 column (2.1 x 50 mm, 2.7 µm particles; pore size 90 Å; carbon load of 2 µmol/m²; Supelco) and a Kinetex PFP column (2.1 x 50 mm, 2.6 µm particles; pore size 100 Å; 9% effective carbon load; Phenomenex). The column temperature was held at 40 ˚C. Digestion products were separated using four different low pH (pH 2.7-3.0) protocols abbreviated by the column and mobile phase modifiers: (1) a C18 column using acetonitrile as the organic modifier (C18/ACN), (2) a C18 column using methanol as the organic modifier (C18/MeOH), (3) a PFPP column using methanol as the organic modifier (PFPP/MeOH), and (4) a PFPP column using methanol and 10 mM ammonium formate at pH 3.0 as modifiers (PFPP/MeOH/NH4). All separations employed the same flow rate (0.4 mL/min) and the same gradient program, with the proportions of solvents A and B at each time as shown in Table 4.1. Three technical replicates were performed for each separation.
Analytes were ionized using electrospray ionization in positive ion mode and analyzed with an orthogonal TOF mass analyzer using V-mode ion optics, with mass resolution (FWHM) of >4000 at m/z 587. The capillary and sample cone voltages were set at 3500 V and 50 V, respectively. Mass spectra were acquired over m/z 200-2000 with a scan time of 1 s in continuum mode to avoid the loss of information content for multiply-charged species that can result from real-time signal centroiding. Once HPLC-TOF-MS data were collected, the spectra were centroided to facilitate peak detection, integration, and retention time alignment. Parameters were optimized to ensure proper reporting of multiply-charged ions. The spectra were baseline-subtracted and smoothed using a moving average over 3 channels, and peaks with a minimum width of 10 data points at half height were centered.

4.2.5 Peak detection, alignment and integration

For comprehensive extraction of retention time data for a complex mixture of peptides, Waters MarkerLynx XS software was employed to perform automated peak detection, alignment and integration separately for each chromatographic separation. Processing employed a range of m/z 400-2000 with a mass window and mass tolerance of m/z 0.05 in each case. A retention time window of 1 min was used. The minimum threshold was 300 ion counts for the C18/ACN, C18/MeOH and PFPP/MeOH separations and 150 ion counts for the PFPP/MeOH/NH4 separation, owing to a difference in total ion yields, so that each separation yielded a similar number (approximately 4000) of mass-retention time pairs. The list of mass-retention time pairs was edited further by retaining only those entries with non-zero values for every replicate injection. Peptide ion masses from each separation were then sorted and compared with the corresponding lists from the other separations, and masses within m/z 0.01 with < 5% deviation in relative isotopolog abundances were chosen as probable matches. Solvent composition at the time of elution was calculated for each peptide based on the solvent gradient program.

Table 4.1: Description of the four separation protocols including UHPLC columns, mobile phase components, and gradients (a).

Protocol        Column    Solvent A                                      Solvent B   pH of solvent A
C18/ACN         C18 (b)   0.15% aq. HCOOH                                CH3CN       2.7
C18/MeOH        C18 (b)   0.15% aq. HCOOH                                CH3OH       2.7
PFPP/MeOH       PFP (c)   0.15% aq. HCOOH                                CH3OH       2.7
PFPP/MeOH/NH4   PFP (c)   10 mM aq. NH4OH, adjusted to pH with HCOOH     CH3OH       3.0

(a) Linear gradient elution was performed using the following ratios of solvent A/solvent B: 0-2 min (99/1); to 95/5 at 4 min, to 85/15 at 14 min, to 60/40 at 50 min, held until 52 min, then increased to 100% B from 52-60 min.
(b) Ascentis Express C18 column; 2.1 x 50 mm, 2.7 µm particles.
(c) Kinetex PFP column; 2.1 x 50 mm, 2.6 µm particles.

4.2.6 Peptide annotation

To investigate the relationship between structure and retention, the peptides displaying the greatest fold-change in retention among the five separation protocols were annotated by comparing experimental peptide molecular masses with calculated masses of wheat gliadin peptides, using sequences from the UniProt/ExPASy database (P02863, P18573, P04723, P04730, P21292, P08079, P08453, P06659, P04722, P04724, P04729, P04721, P04725, P10386) and accounting for likely posttranslational modifications including deamidation. To verify peptide identities, the same digests were analyzed on a Waters nanoAcquity pump interfaced to a ThermoFisher LTQ-FT Ultra Fourier Transform Ion Cyclotron Resonance mass spectrometer (LC/LIT-FTICR MS). A Michrom MAGIC C18AQ column (3 µm, 200 Å, 100 µm x 150 mm) was used with a gradient based on 0.1% aqueous formic acid and acetonitrile. The 0.1% formic acid/acetonitrile ratio was held at 99/1 for 2 min, ramped to 95/5 at 5 min, ramped again to 85/15 over 15 min, ramped to 70/30 at 55 min, held for 3 min, and returned to initial conditions.
Survey scans were generated using the FT-ICR analyzer (25000 resolution at m/z 400), and the top five ions in each survey scan were subjected to CID in the linear ion trap. Owing to extensive evidence of deamidation (Q→E conversion), peptides were not readily identified by database searching, and instead were annotated based on accurate molecular mass measurements and manual de novo spectrum interpretation.

4.2.7 Calculation of peptide properties

Peptide hydrophobicity was calculated as the Grand Average of Hydropathy (GRAVY) according to Kyte and Doolittle [294] using ExPASy’s ProtParam tool (http://web.expasy.org/protparam/). Each residue is assigned a hydrophobicity value, where positive values indicate hydrophobic and negative values indicate hydrophilic properties. The sum of the values is then divided by the number of residues to give an average. Isoelectric point (pI) was calculated using ExPASy’s ProtParam tool according to Bjellqvist [295], which uses pK values of individual amino acids. Aliphatic index (AI) was also calculated using ExPASy’s ProtParam tool according to Ikai [296], using the following equation:

$$\mathrm{AI} = X_{\mathrm{Ala}} + a\,X_{\mathrm{Val}} + b\,\left(X_{\mathrm{Ile}} + X_{\mathrm{Leu}}\right)$$

where X represents the mole percent of the respective residues and a and b are 2.9 and 3.9, respectively.

4.3. Results and Discussion

4.3.1 Digest Complexity

The simulated gastrointestinal digestion of extracted wheat proteins produced a complex mixture of peptides, as shown by Ultra High Performance Liquid Chromatography (UHPLC) separation on a C18 column with acetonitrile as the organic modifier (protocol C18/ACN; Figure 4.1). Automated peak detection yielded 4849 individual peptide mass signals in this digest after deisotoping. Such sample complexity overwhelms the peak capacity of even the highest-resolution HPLC separations, and co-elution of more than 25 peptides was observed in virtually all of the chromatographic peaks in the base peak ion (BPI) chromatograms.
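The GRAVY and aliphatic index definitions of Section 4.2.7 can be illustrated with a short Python sketch. This is a minimal re-implementation for illustration only, not the ProtParam code used in this work; the function names `gravy` and `aliphatic_index` are hypothetical, the hydropathy values are the standard Kyte-Doolittle scale, and mole percent is assumed to be 100 x (residue count)/(peptide length).

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathy: summed residue values / number of residues."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def aliphatic_index(sequence, a=2.9, b=3.9):
    """Aliphatic index: X(Ala) + a*X(Val) + b*(X(Ile) + X(Leu)).

    X is the mole percent of each residue (assumed here to be
    100 * count / length); a and b are the coefficients from Section 4.2.7.
    """
    n = len(sequence)

    def mole_percent(residue):
        return 100.0 * sequence.count(residue) / n

    return (mole_percent("A") + a * mole_percent("V")
            + b * (mole_percent("I") + mole_percent("L")))


if __name__ == "__main__":
    for peptide in ("ELEPFL", "SGHDL", "LER", "CEEPQR", "HVAL"):
        print(peptide, round(gravy(peptide), 2))
```

For the five indicator peptides, this calculation gives GRAVY values that agree with those listed in Table 4.2.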
Furthermore, the majority of peptides eluted over a narrow range of organic modifier, approximately 5-23% acetonitrile, which was anticipated because both pepsin and chymotrypsin cleave at hydrophobic residues. Our efforts to annotate peptides suggested that the C-terminal residue was the only hydrophobic residue in most identified peptides. Given the narrow range of peptide elution using the C18/ACN combination, it was decided to explore other stationary phase and solvent system combinations in an effort to spread elution of peptides across a wider range of retention times.

Figure 4.1: Base peak ion LC-TOF MS chromatogram of a simulated gastrointestinal digest of wheat storage proteins using a C18 column and mobile phase gradient based on 0.15% aqueous formic acid and acetonitrile. The acetonitrile volume percent is displayed as a function of time in the form of a dashed line.

To better examine the effects of changing solvents and stationary phases, five peptides (ELEPFL, SGHDL, LER, CEEPQR, and HVAL) were chosen that vary in mass, molecular volume, hydrophobicity based on the GRAVY scale [294], and content of polar and ion-exchangeable functional groups. These peptides were selected because they were detected using all of the column/gradient combinations except PFPP/ACN (Table 4.2). When the organic modifier was changed from acetonitrile to methanol (C18/MeOH), extended retention of the digestion products was observed (Figure 4.2). In contrast to the narrow retention window of digestion products in C18/ACN (5 to 24 min), 3357 peptide mass signals were detected over a 30% organic modifier range (9 to 43 min). This increase in retention is illustrated by comparing the retention times of the five selected peptides shown in Figures 4.3A and B: instead of eluting over a range of 6.5% organic modifier as in the C18/ACN separation, the peptides elute over a range of 28% organic modifier.
Since the organic modifier is the only difference between the separations, the change in elution likely arises from differences in solvent dipole moments, polarizability, and hydrogen-bonding properties.

Table 4.2: Properties of the five selected peptides found in the various separations. The numbers used to label these peptides in the figures are given in the leftmost column.

No.   Peptide sequence   MW      Volume (Å³)   pI     Charge at pH 2.5   GRAVY
5     ELEPFL             746.8   904           3.80   0.8                0.30
48    SGHDL              527.5   639           5.06   1.8                -0.82
41    LER                416.5   503           6.00   1.8                -1.40
24    CEEPQR             760.8   921           4.53   1.8                -2.35
27    HVAL               438.5   530           6.74   1.8                1.65

Figure 4.2: Base peak ion LC-TOF MS chromatogram of a simulated gastrointestinal digest of wheat storage proteins using a C18 column and mobile phase gradient based on 0.15% formic acid and methanol. The methanol volume percent is displayed as a function of time in the form of a dashed line.

Figure 4.3: Extracted ion chromatograms for the five indicator peptides (m/z 747.40, 528.26, 417.25, 761.30 and 439.26) in simulated gastrointestinal digest of wheat storage proteins using a C18 column and (A) C18/ACN and (B) C18/MeOH for separation.

4.3.2 A fluorinated stationary phase as an alternative to C18

4.3.2.1 Separation using aqueous HCOOH/acetonitrile gradient

Continuing our attempt to spread peptide elution over a wider range of retention times, separations were explored using the fluorinated PFPP stationary phase, which exhibits selectively increased retention for aromatic and fluorinated functional groups [284]. Previous work from our laboratory proposed a hydrophobically-assisted ion exchange mechanism to explain the retention behavior of basic (cationic) analytes on a PFPP column. These effects were most pronounced at high concentrations of acetonitrile in the mobile phase [281].
Expanding on this work, the retention of wheat protein digest peptides on the PFPP column was examined, first by employing the same aqueous formic acid/acetonitrile gradient used on the C18 column. To our surprise, few peptides eluted until the acetonitrile content of the mobile phase exceeded 90%, suggesting substantially greater retention on PFPP relative to C18 using this mobile phase. Use of methanol as the organic modifier in separation protocol 3 (PFPP/MeOH), in which a PFPP stationary phase was used in conjunction with 0.15% formic acid and methanol, yielded significant changes in the chromatographic profile compared to the C18/ACN separation, with 5359 peptide mass signals spread over 3-40% organic modifier (Figure 4.4). The increased retention of the digestion products suggests that the PFPP stationary phase retains them through properties other than hydrophobicity. The GRAVY values, which measure the average hydrophobicity of amino acid residues, do not vary widely among the identified digestion products, with 65% of the 26 identified peptides from the PFPP/MeOH separation being classed as hydrophilic, with an average GRAVY value of -0.75 (± 1.3, standard deviation). The narrow range of hydrophobicity shared by the digestion products is consistent with the narrow range of retention observed in the C18/ACN separation, and the comparatively wide range of retention in the PFPP/MeOH separation implies that factors other than hydrophobicity drive retention on the PFPP column.

Figure 4.4: Base peak ion LC-TOF MS chromatogram of a simulated gastrointestinal digest of wheat storage proteins using a PFPP column and mobile phase gradient based on 0.15% aqueous formic acid and methanol.

To more closely examine the retention of peptides, comparisons between the C18/ACN and PFPP/MeOH separations were made using the five selected peptides described above.
In contrast to the hydrophobicity-related elution order observed with the C18/ACN separation, the order of elution with the PFPP/MeOH separation displayed minimal correlation with the average residue hydrophobicity (Figure 4.5). This lack of dependence upon hydrophobicity is demonstrated by the two peptides eluting earliest in the PFPP/MeOH separation, CEEPQR and HVAL, which have the most hydrophilic (-2.35) and most hydrophobic (+1.65) GRAVY values, respectively. To examine the relationship between retention and hydrophobicity beyond the five selected peptides, a plot of retention time versus GRAVY for all 26 identified peptides from the PFPP/MeOH separation is shown in Figure 4.6. The coefficient of determination of 0.00 shows that retention is essentially uncorrelated with hydrophobicity. Since hydrophobicity, as measured by the GRAVY score, was shown not to be a major determinant of retention of the digestion products using PFPP/MeOH, additional parameters that might explain retention were explored. Aliphatic index, a relative measure of the volume occupied by the aliphatic residue side chains (A, V, I/L), was investigated as a possible contributor to retention on the PFPP stationary phase [296]. Figure 4.7 shows a plot of retention time versus aliphatic index for the 26 identified peptides using the PFPP/MeOH separation. The coefficient of determination of 0.00 indicates that aliphatic index also makes a negligible contribution to retention. Another factor, peptide molecular volume, was considered as a potential contributor to retention. Figure 4.8 shows a plot of retention time versus peptide volume for the 26 identified peptides using PFPP/MeOH. The coefficient of determination of 0.18 shows a greater correlation with peptide volume, but suggests that other factors play more important roles.

Figure 4.5: Plot of retention time versus GRAVY for the five selected peptides annotated according to Table 4.2. Peptides are those identified from the PFPP/MeOH separation protocol.

Figure 4.6: Plot of retention time versus GRAVY for peptides annotated according to Table 4.3. Peptides are those identified from the PFPP/MeOH separation protocol.

Figure 4.7: Plot of retention time versus aliphatic index for peptides annotated according to Table 4.3. Peptides are those identified from the PFPP/MeOH separation protocol.

4.3.2.2 Effects of mobile phase ammonium on peptide retention

To further probe the relationship between structure and retention on the PFPP stationary phase at low organic modifier concentrations, the solvent system was adjusted to introduce ammonium ions in the aqueous phase (PFPP/MeOH/NH4) with minimal difference in pH, which manipulates ion exchange chemistries at the stationary phase. The introduction of ammonium ions has the potential to disrupt ion exchange through the addition of competing ions, which would result in earlier elution of analytes retained through ion exchange mechanisms. Since hydrophobicity, aliphatic index and peptide volume were determined not to be influential in the retention of digestion products, it was decided to explore ion exchange as a possibility. Comparison of the PFPP/MeOH and C18/ACN separations showed a significant increase in the retention of polar peptides on PFPP/MeOH (Table 4.3). These polar functional groups are of interest because they are capable of ion exchange chemistries, despite the silanol groups being endcapped, and increased retention of peptides bearing polar functional groups is consistent with ion exchange between these groups and free silanols at the surface. To investigate whether ion exchange contributes to retention, ammonium formate was used. As seen in Figure 4.9, the addition of ammonium to the aqueous phase drastically changes the order of elution of the five peptides compared to the PFPP/MeOH separation.
As with the PFPP/MeOH separation, the elution order of the five peptides changed relative to the C18/ACN separation upon switching the stationary phase and solvent system combination (Figure 4.9).

Figure 4.8: Plot of retention time versus peptide volume for peptides annotated according to Table 4.3. Peptides are those identified from the PFPP/MeOH protocol.

Table 4.3: List of identified peptides that display the greatest changes in retention according to % organic modifier (% org) and change in relative retention time. Only those peptides showing a fold change of 2 (or 0.5) or greater were considered significant. In each block, "% org ratio" and "(tR - t0) ratio" are the values for the first-named separation divided by those for the second-named separation.

No.  [M+H]+      Database ID                   % org ratio  (tR - t0) ratio  pI     GRAVY  Aliphatic index  Volume

PEPTIDES RETAINED MORE

On PFPP/MeOH compared to C18/MeOH
1    604.2054    CCEHL                         5.28         7.84             5.24   0.42   78.00            731.00
2    628.2111    EESTY                         3.39         6.80             3.80   -1.96  0.00             760.00
3    503.2212    HQAF                          10.41        23.15            6.74   -0.53  20.00            606.00

On PFPP/MeOH compared to C18/ACN
4    818.3610    SDPNSSVL                      4.63         6.37             3.80   -0.38  85.00            990.00
5    747.3969    ELEPFL                        4.11         5.46             3.80   0.30   130.00           904.00
6    747.3957    ELEPFL                        4.03         5.46             3.80   0.30   130.00           904.00
7    875.4089    PEPEEQF or EEPEQPF            3.80         5.05             3.67   -2.06  0.00             1059.00
8    616.2937    EPEEL or EEEPL                3.77         4.32             3.67   -1.66  78.00            616.00
9    1175.4720   TEQFDSYGTK                    3.72         4.93             4.37   -1.55  0.00             1422.00
10   989.4107    EPEELPEF                      2.58         3.20             3.58   -1.33  48.75            1197.00
11   1366.5260   FPESEEPEQQF                   2.11         2.48             3.58   -1.76  0.00             1653.00

On PFPP/MeOH/NH4 compared to C18/MeOH
NONE

On PFPP/MeOH/NH4 compared to C18/ACN
12   833.4099    PSEQPYL                       4.14         5.23             4.00   -1.21  55.71            1008.00
13   445.2216    PSEL                          3.61         4.42             4.00   -0.53  78.00            537.00
14   473.2789    NLAR                          3.63         4.49             9.75   -0.60  98.00            572.00
15   747.3950    ELEPFL                        4.00         5.05             3.80   0.30   130.00           904.00
16   789.3022    CCQHLW                        4.13         5.23             6.72   0.20   65.00            955.00
17   833.3693    PGQQEQF, EGQQQPF or PGEQQQF   4.09         5.16             4.00   -1.89  0.00             1008.00
18   937.4004    MQQQQQF                       2.29         2.53             5.28   -1.83  0.00             1134.00
19   1884.6805   EESSCHVMEEECCEQL              2.24         2.54             3.89   -0.74  42.50            2281.00
20   1070.4593   PETEEPQQL                     2.40         2.81             3.67   -1.96  43.33            1295.00
21   1139.5172   QTLPAMCNVY                    2.14         2.39             5.52   0.36   78.00            1378.00
22   1050.4755   LDGSSVQTPF                    2.37         2.78             3.80   -0.05  68.00            1271.00

On PFPP/MeOH/NH4 compared to PFPP/MeOH
23   628.2248    EESTY                         3.82         4.93             3.80   -1.96  0.00             760.00
24   761.3054    CEEPQR                        4.54         6.31             4.53   -2.35  0.00             921.00
25   761.3054    CEEPQR                        4.13         5.54             4.53   -2.35  0.00             921.00
26   790.2740    CCEHLW                        3.56         4.63             5.24   0.20   65.00            956.00
27   439.2615    HVAL                          4.61         6.42             6.74   1.65   156.00           530.00
28   663.3340    SWRSK                         9.28         15.60            11.00  -2.18  0.00             802.00

PEPTIDES THAT RETAIN LESS

On PFPP/MeOH compared to C18/MeOH
29   820.3473    LGEEEPF                       0.45         0.32             3.67   -0.84  55.71            992.00
30   439.2675    HVAL                          0.14         0.22             6.74   1.65   156.00           530.00

On PFPP/MeOH compared to C18/ACN
31   439.2451    HVAL                          0.32         0.24             6.74   1.65   156.00           530.00
32   761.3033    CEEPQR                        0.35         0.26             4.53   -2.35  0.00             921.00
33   809.4301    SATTSVPF                      0.42         0.34             5.24   0.53   48.75            979.00
34   887.4737    LEPREPF                       0.57         0.48             4.53   -1.16  55.71            1073.00

On PFPP/MeOH/NH4 compared to C18/MeOH
35   459.2464    PQSK                          0.11         0.05             9.18   -2.45  0.00             555.00
36   487.2754    VQQL                          0.10         0.05             5.49   0.25   136.00           486.00
37   487.2851    VQQL                          0.11         0.05             5.49   0.25   136.00           486.00
38   649.3112    EQYPL                         0.33         0.22             4.00   -1.22  78.00            785.00
39   649.3142    EQYPL                         0.34         0.22             4.00   -1.22  78.00            785.00
40   745.3308    EQQPPF or EPQQPF              0.48         0.34             4.00   -1.82  0.00             901.00
41   417.2516    LER                           0.13         0.05             6.00   -1.40  78.00            416.00
42   809.4145    SATTSVPF                      0.29         0.19             5.24   0.53   48.75            979.00

On PFPP/MeOH/NH4 compared to C18/ACN
43   457.2484    VPEL or EVPL                  0.29         0.17             4.00   0.73   136.00           552.00
44   457.2525    PANR                          0.23         0.13             10.18  -1.95  20.00            553.00
45   532.3423    RPLF                          0.30         0.20             9.75   0.13   78.00            644.00
46   403.2138    RDL                           0.34         0.16             5.84   -1.40  78.00            486.00
47   417.2274    GLDI                          0.21         0.10             3.80   1.10   156.00           503.00
48   528.2438    SGHDL                         0.25         0.14             5.06   -0.82  78.00            639.00

On PFPP/MeOH/NH4 compared to PFPP/MeOH
49   809.4021    SATTSVPF                      0.56         0.48             5.24   0.53   48.75            979.00
50   417.2375    LER                           0.14         0.07             6.00   -1.40  78.00            416.00
51   528.2561    SGHDL                         0.17         0.09             5.06   -0.82  78.00            639.00

Figure 4.9: Extracted ion chromatograms for the five indicator peptides (m/z 747.40, 528.26, 417.25, 761.30 and 439.26) in simulated gastrointestinal digest of wheat storage proteins using a PFPP column and the PFPP/MeOH/NH4 separation protocol.

Furthermore, addition of ammonium formate resulted in extended retention of the digestion products, with 3295 peptide mass signals detected over an organic modifier range of 3% to 40%. The peptides that were retained more showed polar side chain functionality, which is further consistent with ion exchange as a possible factor leading to retention. However, unlike the PFPP/MeOH separation, in which the peptides elute continuously and are evenly spaced over the 52 min, the majority of the peptides in the PFPP/MeOH/NH4 separation elute earlier than 30 min, as shown in the BPI chromatogram (Figure 4.10).
The earlier elution of the peptides is consistent with displacement of positively-charged peptides from negatively-charged surface sites by competing ammonium ions, as seen in the retention time shifts of the peptides LER and SGHDL, whose extracted ion chromatograms are among those of the five selected peptides shown in Figures 4.9 and 4.11 for the PFPP/MeOH/NH4 and PFPP/MeOH separations. Their retention times shifted from 18.28 min to 1.28 min and from 17.67 min to 1.59 min, respectively. However, not all peptides eluted earlier in the presence of ammonium formate than in the PFPP/MeOH separation. For example, of the five selected peptides, CEEPQR and HVAL were retained longer in the presence of ammonium. Since the peptide CEEPQR has two glutamate side chains and one arginine side chain, yet does not elute earlier when competing ions are present, an additional retentive driving force other than ion exchange is implied. When ammonium and formate ions are added to the mobile phase, the energy required to solvate the peptides increases, as the attractive forces between water molecules increase owing to increased order in the aqueous phase [297]. The increased energy required to partition the peptide into the aqueous phase causes increased, or sustained maximal, retention of larger-volume peptides. This is evident for the peptides CEEPQR and ELEPFL, which shifted in retention from 3.31 min to 20.93 min and maintained retention at 40% organic, respectively, at which point the proportion of the aqueous phase was lowered to a point where solvation of the larger peptides became more favorable. This trend is seen beyond the five peptides here: those peptides more retained in the PFPP/MeOH/NH4 separation (Table 4.3) typically have a larger volume (average = 815 Å³) than those that elute comparatively earlier (average = 678 Å³), and although polar functional groups are present, the energy required to solvate the peptides in a highly aqueous mobile phase is too great for early elution. To confirm the relationship between peptide volume and retention, a plot of retention time versus peptide volume for the identified peptides from the PFPP/MeOH/NH4 separation is shown in Figure 4.12. The correlation coefficient of 0.39, compared with the value of 0.18 obtained for the PFPP/MeOH plot, shows that there is increased retention of those peptides with greater peptide volume. This implies that in addition to ion exchange, peptide volume also plays a role in retention of digestion products.

Figure 4.10: Base peak ion LC-TOF MS chromatogram of a simulated gastrointestinal digest of wheat storage proteins using a PFPP column and mobile phase gradient based on 10 mM ammonium formate and methanol.

Figure 4.11: Extracted ion chromatograms for the five indicator peptides (m/z 747.40, 528.26, 417.25, 761.30 and 439.26) in simulated gastrointestinal digest of wheat storage proteins using a PFPP column and the PFPP/MeOH separation protocol.

4.3.4 The quest for orthogonal structure-retention relationships

To maximize the retention of peptides, a structure-retention relationship orthogonal to that of the C18/ACN separation is needed. Evidence of improved separation is seen in the BPI and extracted ion chromatograms of the PFPP separations, but to further investigate the relationship between structure and retention, scatter plots correlating the retention times of the peptides between pairs of separations were constructed.
If the two separations depend strongly on the same parameters for retention, the coefficient of determination will be near unity, as peptides that elute early on the C18 separations would also elute early on the PFPP separations, and vice versa. Comparison of the C18/ACN and PFPP/MeOH separations shows little correlation, with a coefficient of determination of 0.07 (Figure 4.13).

Figure 4.12: Plot of retention time versus peptide volume for peptides annotated according to Table 4.3. Peptides are those identified from the PFPP/MeOH/NH4 protocol.

Figure 4.13: Scatter plot correlating the retention times of corresponding peptides from the PFPP/MeOH and C18/ACN separations. Annotated markers are in accordance with Table 4.3.

As the C18/ACN separation relies mainly on hydrophobicity for the separation of analytes, this near-orthogonal result indicates that the PFPP/MeOH separation resolves the digestion products on the basis of a factor other than hydrophobic interactions. The lack of dependence on hydrophobicity is confirmed by the identified peptides (Table 4.3), whose GRAVY values span from hydrophilic to hydrophobic regardless of whether they show greater or lesser retention on PFPP/MeOH than on C18/ACN. In addition, the plot of retention time versus hydropathy (Figure 4.6), which had a coefficient of determination of 0.00, showed a similar lack of correlation. Similarly, when the organic modifier of the C18 reversed phase separation was changed to methanol (C18/MeOH), the same organic modifier used in the PFPP/MeOH separation, a similarly low coefficient of determination of 0.06 was obtained (Figure 4.14).
This confirms that the PFPP stationary phase separates on a basis other than hydrophobicity, as the peptides that display increased or decreased retention relative to the C18/MeOH separation span from hydrophobic to hydrophilic, further implying that other factors drive retention of the digestion products. Additionally, with aliphatic index and peptide volume ruled out as major factors guiding retention, other possibilities had to be considered. The extended retention of peptides that contain polar functional groups, which are capable of ion exchange chemistries, led us to investigate the possibility of ion exchange despite the endcapping of the surface silanols. To further examine the possible relationships between structure and retention, particularly those involving ion exchange, the C18 separations were compared with the PFPP/MeOH/NH4 separation. The comparison of the C18/ACN and PFPP/MeOH/NH4 separations gave a coefficient of determination of 0.19 (Figure 4.15), which, although the highest value observed, does not indicate a high degree of similarity between the two separations.

Figure 4.14: Scatter plot correlating the retention times of corresponding peptides from the PFPP/MeOH and C18/MeOH separations. Annotated markers are in accordance with Table 4.3.

Figure 4.15: Scatter plot correlating the retention times of corresponding peptides from the PFPP/MeOH/NH4 and C18/ACN separations. Annotated markers are in accordance with Table 4.3.

With the addition of ammonium formate, retention due to ion exchange chemistries can be impeded; if unfavorable hydrophobic interactions were a secondary factor in retaining analytes, this would result in an increase in the coefficient of determination. However, this is not observed.
Likewise, the comparison of the C18/MeOH separation with the PFPP/MeOH/NH4 separation resulted in a coefficient of determination of 0.00 (Figure 4.16), essentially a completely orthogonal separation when the organic modifiers were identical. The higher correlation with the C18/ACN separation is likely due to the fact that the entire complement of peptides eluted over a very short range (5 to 23 min), so there was less variation in retention time. Because the digestion products show a wide range of retention on the fluorinated stationary phase (2 to 51 min), the implication is that, although hydrophobic interactions do not affect retention, the PFPP stationary phase offers a greater diversity of retentive chemistries that can be exploited for more efficient separation of analytes that do not differ sufficiently in hydrophobicity. To further investigate the retentive characteristics of the PFPP stationary phase, we then examined the effects of adding ammonium formate to the aqueous mobile phase relative to the PFPP/MeOH separation (Figure 4.17). Given the relatively low concentration of ammonium formate, we expected to see evidence that ion exchange interactions at the free silanols were disrupted, in the form of earlier elution of a fraction of the digestion products that show weak ion-exchanging capabilities. This would shift the coefficient of determination from unity towards zero, with a more pronounced effect giving a greater deviation. What we did not expect was a completely orthogonal separation, with a coefficient of determination of 0.00. Rather than having a minor effect on the retention of the digest products, ammonium formate has a prominent one.

Figure 4.16: Scatter plot correlating the retention times of corresponding peptides from the PFPP/MeOH/NH4 and C18/MeOH separations. Annotated markers are in accordance with Table 4.3.

Figure 4.17: Scatter plot correlating the retention times of corresponding peptides from the PFPP/MeOH/NH4 and PFPP/MeOH separations. Annotated markers are in accordance with Table 4.3.

Upon examination of the identified peptides, the peptides that elute earlier in the presence of ammonium formate present ion-exchangeable/hydrogen-bonding functional groups that compete for binding with the silanol groups (Table 4.3). However, those peptides that display greater retention in the presence of ammonium formate are unusual in having a wide range of functional groups, including aromatic, ion-exchangeable/hydrogen-bonding, and highly aliphatic groups. What sets this group apart is that these peptides are larger in volume than those that are retained less, with the exception of HVAL, a highly aliphatic peptide. This suggests that solvating a larger peptide is significantly less favorable energetically in the presence of ammonium formate than in 0.15% aqueous formic acid. In addition, the fact that a 10 mM concentration of ammonium formate has such a pronounced effect on the separation implies that analyte retention can be tailored with a small change in the mobile phase additive. This additional ability to tailor separations on the PFPP stationary phase by adjusting mobile phase composition shows that the PFPP stationary phase offers a greater diversity of chemical factors leading to retention than the C18 stationary phase, even without hydrophobicity, and this diversity can be exploited to target the specialized chemistries of a group of analytes. Also, the orthogonality of the different separations is indicative of the complexity that governs retention. With multiple factors, including peptide charge and volume, playing roles in analyte retention, the retention mechanisms are more complex than in the simpler, hydrophobically driven C18 separations, leading to separations on the PFPP column that are distinct and orthogonal.
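The pairwise comparisons in this section rest on the coefficient of determination between the retention times of the same peptides in two separations. For a simple linear fit this equals the squared Pearson correlation coefficient, which can be computed as in the minimal Python sketch below. The function name `r_squared` is hypothetical and the retention times in the usage example are invented for illustration; they are not data from this work.

```python
def r_squared(x, y):
    """Coefficient of determination for a simple linear fit of y on x.

    For one predictor this is the squared Pearson correlation coefficient.
    x and y hold retention times of the same peptides, in the same order,
    in two different separations.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("retention times must not all be identical")
    return (sxy * sxy) / (sxx * syy)


if __name__ == "__main__":
    # Hypothetical retention times (min) for four peptides in two separations.
    separation_1 = [6.0, 9.5, 12.0, 20.0]
    separation_2 = [31.0, 4.0, 45.0, 18.0]
    print(round(r_squared(separation_1, separation_2), 2))
```

A value near 1 means that the two separations rank and space the peptides in the same way, whereas a value near 0 corresponds to the orthogonal behavior discussed above.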
4.3.5 A mechanistic view of retention on the perfluorinated stationary phase

Efforts to maximize the separation of digestion-resistant peptides through uncommon separation protocols necessitate an examination of the mechanism by which the analytes are separated. Although not all of the experiments needed to fully determine the mechanism by which the peptides were retained have been completed, particularly in the area of ion exchange, some general conclusions can be drawn from the observations.

Separation science currently relies on a variety of techniques, chosen according to the sample and conditions. Previous results have shown that applying a reversed phase gradient to the PFPP stationary phase does not give a separation similar to a classical reversed phase separation, based on comparisons of retention times (Figure 4.13). Unlike reversed phase separations, normal phase separations typically involve polar stationary phases and relatively nonpolar mobile phases, and analytes elute in order of increasing polarity. While the PFPP stationary phase is polar, the mobile phase was distinctly aqueous, as the percentage of water was never less than 60%; this indicates a solvent of high polarity and argues against a classical normal phase mechanism.

Hydrophilic interaction liquid chromatography (HILIC), a variant of normal phase chromatography, must also be considered as a possible mechanistic explanation. HILIC retention is believed to involve formation of an aqueous layer on the surface of a polar stationary phase, with retention based upon partitioning between the organic-rich mobile phase and this aqueous surface layer. HILIC uses hydrophilic stationary phases with reversed phase solvents such as acetonitrile and water, and typically the least polar analytes elute first, followed by the most polar analytes.
However, HILIC separations are generally limited to high organic mobile phases, because under high aqueous conditions analytes lose the capacity to partition between the mobile phase and the aqueous surface layer. In the current study, peptides with more polar groups were retained longer on the PFPP stationary phase, which makes HILIC a more direct comparison than normal phase chromatography. The distinct difference is that the PFPP separations were carried out using a reversed phase gradient that started at a high proportion of aqueous mobile phase.

With the conventional separation mechanisms of reversed phase, normal phase and HILIC excluded as explanations for the retention of the more polar analytes, other modes of retention have to be examined. Given that the polar peptides predominantly contain acidic residues, ion exchange mechanisms are a possibility. Ion exchange would most likely occur at free silanol groups at the surface, causing increased retention of peptides containing acidic functional groups. Evidence for ion exchange was bolstered by the earlier elution of many peptides when the aqueous mobile phase was changed to a 10 mM ammonium formate solution buffered to pH 3.0, indicating that the competing ions cause earlier elution of the peptides (Figure 4.10). However, the Kinetex PFP stationary phase used for the separation is partially endcapped, which significantly decreases the number of free silanols and thus the likelihood of ion exchange. To establish whether ion exchange is truly a driving force behind the increased retention of polar peptides, further experiments using different concentrations of the ammonium salt are needed. If ion exchange drives the retention mechanism, then as the concentration of the salt increases, the concentration of competing ions would increase, further decreasing the retention of polar analytes.
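The proposed follow-up experiment can be framed as a simple trend test. The example below is a sketch under stated assumptions, not the procedure used in this work: it fits retention time against the logarithm of ammonium formate concentration by least squares, and a negative slope for a polar peptide would be consistent with competitive displacement by the salt ions. The function name, the concentrations and the retention times are all hypothetical.

```python
from math import log10


def retention_slope(salt_mM, retention_min):
    """Least-squares slope of retention time (min) versus log10(salt concentration, mM)."""
    if len(salt_mM) != len(retention_min) or len(salt_mM) < 2:
        raise ValueError("need at least two paired measurements")
    x = [log10(c) for c in salt_mM]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(retention_min) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("salt concentrations must differ")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, retention_min))
    return sxy / sxx


# Hypothetical series for one polar peptide; these are not measured data.
salt = [1.0, 10.0, 100.0]   # ammonium formate concentration, mM
rt = [40.0, 30.0, 20.0]     # retention time, min

slope = retention_slope(salt, rt)
print("slope (min per decade of salt):", slope)
print("consistent with ion exchange:", slope < 0)
```

A slope near zero or a positive slope across several concentrations would instead point towards the other explanations considered in this section.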
An alternative explanation of the results observed upon introduction of the ammonium salt is a change in the zeta potential at the surface of the stationary phase. An increased concentration of charge from the ammonium formate ions in the double layer near the surface, becoming increasingly diffuse with distance towards the bulk, is likely to decrease retention of the protonated acidic peptides compared to the PFPP/MeOH separation. This would also explain why peptides that have a larger volume but are still acidic are still sufficiently retained in comparison to the smaller acidic peptides. The peptides with the larger volume typically have a higher number of acidic residues and are capable of overcoming the zeta potential near the double layer, so they are retained longer than the smaller peptides with fewer acidic residues, which cannot overcome the repulsion of the zeta potential at the double layer.

Since no single current mechanistic theory describes the retention of these peptides using a reversed phase solvent system, a new one must be devised. It has been shown that a reversed phase mechanism does not accurately describe the PFPP separation, which is orthogonal to the C18/ACN separation. The descriptions of normal phase and HILIC separations also do not fit, because the solvent system is not congruent and partitioning into an increasingly polar solvent does not sufficiently describe the separation. Instead, assuming there is no ion exchange component, the separation is best described as a mix of these modes: perhaps as a hydrophilic reversed phase chromatography, in which the stationary phase is polar but the mobile phase gradient runs conventionally from polar to non-polar.
The increased retention of the peptides despite the hydrophilicity of their side chains can be explained by the fact that the aqueous phase is still in the majority at the time of elution, and the peptides must be considered as a whole, not at the level of individual side chains. While acidic side chains contribute to hydrophilicity, other side chains contribute to hydrophobicity and can increase the concentration of organic modifier at which the peptide partitions into the mobile phase.

4.4 Conclusions

The detection and subsequent identification of biologically active digestion-resistant peptides by LC-MS often fails to yield useful information because of the complexity of the samples. Ineffective separations lead to co-elution of peptides, which reduces the information obtained from a sample through ion suppression and competitive ionization. However, the implementation of alternate separations, by varying either the solvent system or the stationary phase, can increase the capacity for the recognition and detection of peptides. In an attempt to maximize retention of digestion products at standard reversed phase concentrations, the alternative PFPP stationary phase was used. Comparisons of the retention of identified peptides and of chromatographic profiles, both under traditional reversed phase conditions and in the presence of ammonium formate in the aqueous phase, showed that hydrophobic interactions do not play a significant role in retention of the digestion products. Despite hydrophobic interactions not being a major influence on retention, the PFPP stationary phase still displays a great diversity in retentive properties. These investigations indicate that ion exchange interactions are a dominant retentive property, supported by the earlier elution of peptides with ion exchangeable functional groups in the presence of ammonium formate.
Interestingly, these findings also revealed that small changes in solvent composition result in dramatic changes in peptide retention, exemplified by the orthogonal relationship between the PFPP/MeOH and the PFPP/MeOH/NH4 separations. The great diversity in chemical factors that lead to retention on the fluorinated stationary phase implies that retention can be targeted towards specific chemical functional groups using minor adjustments in mobile phase composition.

Also of note is the substantial impact that solvent composition has on the retentive properties of the stationary phase. In comparison to the acetonitrile separations on the C18 phase, the use of methanol as the organic modifier showed a marked increase in retention of the peptides, pointing towards new opportunities for manipulating retention selectivity. In addition, the extreme retention of peptides on the PFPP stationary phase with acetonitrile as the organic modifier could potentially be exploited, by adjusting the acetonitrile concentration through solvent mixing, to obtain unique retentive properties.

CHAPTER 5

5.1 Conclusion

Current technologies for the targeted and non-targeted profiling of complex peptidomes suffer from a limited pool of selected peptides due to the use of DDA and insufficient separation technologies. Expanding the window of analysis to survey a wider variety of peptides is essential for peptidomics. Many biologically significant peptides may be of low abundance or inhibited by competitive ionization or ion suppression, causing them to be omitted from DDA and left unanalyzed. To maximize the number of peptides detected for subsequent characterization and identification, we developed a metabolomic-like approach for the non-targeted data-independent profiling of a complex peptidome.
The application of metabolomic tools to wheat and spelt storage protein digestion products routinely yielded between two and five thousand individual peptide signals, which is unsurprising given the high degree of heterogeneity of the gliadin fraction. Our non-targeted approach recognized significant quantitative differences between the wheat and spelt digests, most notably due to deamidation of glutamine residues in the wheat peptides. While our non-targeted approach focused on wheat and spelt, this strategy has implications beyond storage protein digests. Metabolomics is an established tool in the world of small molecules for distinguishing genotypes, phenotypes, treatments and temporal changes, and it can also be applied to peptides, as demonstrated by the differentiation of the wheat and spelt genotypes. Once multivariate analysis has recognized the peptides that distinguish one group from another, the accurate mass measurements allow subsequent targeted MS/MS analyses to be performed on only those peptides of interest, without reliance on DDA to select peptides from complex mixtures. Additionally, the accurate mass measurements allow annotations to be made from known protein sequences. This annotation of peptides that differentiated between two highly similar peptide profiles showed, without the use of fragmentation, that deamidation was the post-translational modification responsible for the majority of the differences in wheat.

To confirm the assigned identities of the peptides, MS/MS along with high mass accuracy survey scans were used. Given the extraordinarily high number of differentiating peptides deemed biologically significant through the non-targeted analysis, DDA was used in an effort to maximize the number of peptides submitted for fragmentation, as a preferential list that large could not be compiled.
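The accurate-mass annotation step summarized above can be illustrated with a short sketch. It assumes standard monoisotopic residue masses, counts only glutamine as a deamidation site (each Gln to Glu conversion adds about 0.984 Da), and uses a 10 ppm tolerance chosen purely for illustration. The function names, the tolerance, the sequence PQQPF and the observed m/z are hypothetical and are not taken from this work.

```python
# Standard monoisotopic residue masses (Da).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565        # added once per peptide for the free termini
PROTON = 1.007276
DEAMIDATION = 0.984016   # mass shift for Gln -> Glu


def peptide_mass(sequence, n_deamidated=0):
    """Monoisotopic neutral mass of a peptide with n deamidated glutamines."""
    if not 0 <= n_deamidated <= sequence.count("Q"):
        raise ValueError("more deamidations than glutamine residues")
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER + n_deamidated * DEAMIDATION


def mz(mass, charge=1):
    """m/z of the protonated ion [M + zH]z+."""
    return (mass + charge * PROTON) / charge


def annotate(observed_mz, sequences, charge=1, tol_ppm=10.0):
    """Return (sequence, n_deamidated, ppm error) for candidates within tolerance."""
    hits = []
    for seq in sequences:
        for n in range(seq.count("Q") + 1):
            theoretical = mz(peptide_mass(seq, n), charge)
            error_ppm = (observed_mz - theoretical) / theoretical * 1e6
            if abs(error_ppm) <= tol_ppm:
                hits.append((seq, n, round(error_ppm, 1)))
    return hits


# Hypothetical observed signal matched against one illustrative candidate.
print(annotate(617.2929, ["PQQPF"]))
```

Because one deamidation shifts the mass by slightly less than 1 Da, the mass accuracy of the instrument determines whether a deamidated peptide can be told apart from the 13C isotope peak of its unmodified form, so in practice the tolerance would be set from the measured instrument performance.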
From the list of peptides reasoned to hold potential biological significance, only 25% were sufficiently abundant to be selected for fragmentation by DDA, highlighting the need for a non-targeted approach that is independent of analyte abundance, as seen in the application of OPLS-DA for the extraction of highly differentiating peptides. The peptides that were submitted for fragmentation showed a lack of informative series ions, particularly the y type, due to the absence of a basic internal or terminal residue to carry an additional charge, resulting in a fixed charge on the N-terminus. The gliadin peptides, which have been shown to be highly deamidated in wheat, also demonstrated unique fragmentation patterns essential for the confirmation of identity without series ions. The deamidated peptides form a dehydrated cyclic b ion through a nucleophilic attack on the protonated carboxylic acid. This ring can then reopen to form internal fragments, which include rearranged ions depending on the point of reopening. Without these internal fragments, confirmation of identity would be particularly challenging due to the lack of informative series ions. However, without prior annotations obtained from the non-targeted metabolomic approach, recognition of the internal fragments would be time consuming given the numerous possible combinations, especially with the loss of the low mass region in the LIT.

Despite the advantages demonstrated by the non-targeted metabolomic approach, the analysis of complex peptidomes is still hindered by insufficient separations when using the C18/acetonitrile stationary phase/organic modifier combination. The simultaneous elution of peptides results in competitive ionization and ion suppression, which can yield relative abundances lower than the actual values; with a data-dependent approach, this results in the omission of potentially biologically significant peptides.
In exploring alternative stationary phase/solvent system combinations in an effort to increase the capacity for the recognition and detection of digestion products, it was discovered that not only could retention be expanded, but orthogonal separations could also be obtained. The fluorinated PFPP stationary phase has not been well studied at reversed phase concentrations, and comparative studies showed that, unlike on the C18 stationary phase, hydrophobic interactions do not play a significant role in the retention of digestion products on the PFPP stationary phase. In addition, aliphatic index and peptide volume were found not to be major contributing factors to retention of the peptides, so another parameter must be responsible for the majority of retention. Although the silanols of the stationary phase were endcapped, we observed evidence of ion exchange chemistries. Peptides containing polar and ion exchangeable sites displayed increased retention on the PFPP stationary phase. Furthermore, when a competing ion was introduced, those peptides were displaced to an earlier elution. However, that was not the case for all peptides: peptides with a larger volume were displaced to later retention times due to solvent-ion-solute interactions. This shows a great diversity in the chemical factors that lead to retention on the PFPP stationary phase using a variety of solvent systems, which can be taken advantage of to target specific chemical functional groups.

Although we have presented the application of a new metabolomic-like, non-targeted, data-independent approach to the peptidomic profiling of wheat and spelt storage proteins, the implications extend beyond these two grains. Complex protein digests, particularly those employing a non-tryptic digestion scheme, can benefit from this approach, as neither the resolution nor the abundance of the peptides is a limiting factor.
In combination with alternative separation techniques, separations orthogonal to the C18 separations can be used to target different chemical properties, extending the retention of desired products. Also of note is the drastic effect that the solvent has on retention on both the C18 and PFPP stationary phases. On the C18 stationary phase, the use of methanol as the organic modifier increased the retention of digestion products beyond that obtained with acetonitrile, which indicates that methanol can be used with the C18 stationary phase to extend retention and reduce co-elution. In addition, peptides show severe retention on the PFPP stationary phase when acetonitrile is used, which offers potential for future manipulation: controlling the acetonitrile concentration in the mobile phase could control retention for a more desirable separation.

BIBLIOGRAPHY

[1] He, L., Hannon, G. J., MicroRNAs: Small RNAs with a big role in gene regulation. Nat. Rev. Genet. 2004, 5, 522-531.
[2] Lu, J., Getz, G., Miska, E. A., Alvarez-Saavedra, E., et al., MicroRNA expression profiles classify human cancers. Nature 2005, 435, 834-838.
[3] Proudfoot, N. J., Brownlee, G. G., 3' Non-Coding Region Sequences in Eukaryotic Messenger-RNA. Nature 1976, 263, 211-214.
[4] Akil, H., Watson, S. J., Young, E., Lewis, M. E., et al., Endogenous Opioids - Biology and Function. Annu. Rev. Neurosci. 1984, 7, 223-255.
[5] Axelrod, J., Reisine, T. D., Stress Hormones - Their Interaction and Regulation. Science 1984, 224, 452-459.
[6] Boonen, K., Creemers, J. W., Schoofs, L., Bioactive peptides, networks and systems biology. Bioessays 2009, 31, 300-314.
[7] Britton, J. R., Kastin, A. J., Biologically-Active Polypeptides in Milk. Am. J. Med. Sci. 1991, 301, 124-132.
[8] Choi, J., Sabikhi, L., Hassan, A., Anand, S., Bioactive peptides in dairy products. Int J Dairy Technol 2012, 65, 1-12.
[9] Gill, I., LopezFandino, R., Jorba, X., Vulfson, E.
N., Biologically active peptides and enzymatic approaches to their production. Enzyme Microb Tech 1996, 18, 162-183.
[10] Hartmann, R., Meisel, H., Food-derived peptides with biological activity: from research to food applications. Curr Opin Biotech 2007, 18, 163-169.
[11] Hokfelt, T., Broberger, C., Xu, Z. Q. D., Sergeyev, V., et al., Neuropeptides - an overview. Neuropharmacology 2000, 39, 1337-1356.
[12] Klavdieva, M. M., The history of neuropeptides .4. Front. Neuroendocrinol. 1996, 17, 247-280.
[13] Korhonen, H., Milk-derived bioactive peptides: From science to applications. J Funct Food 2009, 1, 177-187.
[14] Minkiewicz, P., Dziuba, J., Darewicz, M., Iwaniak, A., et al., Food peptidomics. Food Technol Biotech 2008, 46, 1-10.
[15] Moreno, F. J., Gastrointestinal digestion of food allergens: Effect on their allergenicity. Biomed. Pharmacother. 2007, 61, 50-60.
[16] Pellegrini, A., Antimicrobial peptides from food proteins. Curr Pharm Design 2003, 9, 1225-1238.
[17] Rutherfurd-Markwick, K. J., Moughan, P. J., Bioactive peptides derived from food. J Aoac Int 2005, 88, 955-966.
[18] Yamamoto, N., Ejiri, M., Mizuno, S., Biogenic peptides and their potential use. Curr Pharm Design 2003, 9, 1345-1355.
[19] Banting, F. G., Best, C. H., Collip, J. B., MacLeod, J. J. R., Noble, E. C., The effect of pancreatic extract (insulin) on normal rabbits. Am. J. Physiol. 1922, 62, 162-176.
[20] Grodsky, G. M., Forsham, P. H., Insulin and Pancreas. Annu. Rev. Physiol. 1966, 28, 347-&.
[21] Mayhew, D. A., Wright, P. H., Ashmore, J., Regulation of Insulin Secretion. Pharmacol. Rev. 1969, 21, 183-&.
[22] Porte, D., Bagdade, J. D., Human Insulin Secretion - An Integrated Approach. Annu. Rev. Med. 1970, 21, 219-&.
[23] Steiner, D. F., Clark, J. L., Rubenstein, A. H., Isolation of Proinsulin Connecting Peptide (C-Peptide) From Mammalian Pancreas. Diabetes 1969, S 18, 339-&.
[24] Faber, O.
K., Binder, C., C-Peptide Response to Glucagon - Test for Residual Beta-Cell Function in Diabetes-Mellitus. Diabetes 1977, 26, 605-610.
[25] Johansson, B. L., Borg, K., Fernqvist-Forbes, E., Kernell, A., et al., Beneficial effects of C-peptide on incipient nephropathy and neuropathy in patients with Type 1 diabetes mellitus. Diabetic Med. 2000, 17, 181-189.
[26] Kuzuya, H., Blix, P. M., Horwitz, D. L., Steiner, D. F., Rubenstein, A. H., Determination of Free and Total Insulin and C-Peptide in Insulin-Treated Diabetics. Diabetes 1977, 26, 22-29.
[27] Samnegard, B., Jacobson, S. H., Jaremko, G., Johansson, B. L., et al., C-peptide prevents glomerular hypertrophy and mesangial matrix expansion in diabetic rats. Nephrol. Dial. Transplant. 2005, 20, 532-538.
[28] Stevens, M. J., Zhang, W. X., Li, F., Sima, A. A. F., C-peptide corrects endoneurial blood flow but not oxidative stress in type 1 BB/Wor rats. Am. J. Physiol.-Endocrinol. Metab. 2004, 287, E497-E505.
[29] Wahren, J., Kallas, A., Sima, A. A. F., The Clinical Potential of C-Peptide Replacement in Type 1 Diabetes. Diabetes 2012, 61, 761-772.
[30] Wang, S., Wei, W., Zheng, Y. D., Hou, J. L., et al., The Role of Insulin C-Peptide in the Coevolution Analyses of the Insulin Signaling Pathway: A Hint for Its Functions. PLoS One 2012, 7.
[31] Keltner, Z., Meyer, J. A., Johnson, E. M., Palumbo, A. M., et al., Mass spectrometric characterization and activity of zinc-activated proinsulin C-peptide and C-peptide mutants. Analyst 2010, 135, 278-288.
[32] Spence, D. M., The Effect of Combined C-Peptide and Zinc on Cellular Function, Humana Press Inc, Totowa 2012.
[33] von Euler, U. S., Gaddum, J. H., An unidentified depressor substance in certain tissue extracts. J. Physiol.-London 1931, 72, 74-87.
[34] Harrison, S., Geppetti, P., Substance P. Int. J. Biochem. Cell Biol. 2001, 33, 555-576.
[35] O'Connor, T. M., O'Connell, J., O'Brien, D. I., Goode, T., et al., The role of substance P in inflammatory disease. J. Cell.
Physiol. 2004, 201, 167-180.
[36] Page, I. H., Bumpus, F. M., Schwarz, H. J., Angiotensin. Sci.Am. 1959, 200, 54-&.
[37] Peach, M. J., Renin-Angiotensin System - Biochemistry and Mechanisms of Action. Physiol. Rev. 1977, 57, 313-370.
[38] Erdos, E. G., Angiotensin-I Converting Enzyme. Circ.Res. 1975, 36, 247-255.
[39] Natesh, R., Schwager, S. L. U., Sturrock, E. D., Acharya, K. R., Crystal structure of the human angiotensin-converting enzyme-lisinopril complex. Nature 2003, 421, 551-554.
[40] Paul, M., Mehr, A. P., Kreutz, R., Physiology of local renin-angiotensin systems. Physiol. Rev. 2006, 86, 747-803.
[41] Soffer, R. L., Angiotensin-Converting Enzyme and Regulation of Vasoactive Peptides. Annu. Rev. Biochem. 1976, 45, 73-94.
[42] GarciaSainz, J. A., MartinezAlfaro, M., RomeroAvila, M. T., GonzalezEspinosa, C., Characterization of the AT(1) angiotensin II receptor expressed in guinea pig liver. J. Endocrinol. 1997, 154, 133-138.
[43] Mehta, P. K., Griendling, K. K., Angiotensin II cell signaling: physiological and pathological effects in the cardiovascular system. Am. J. Physiol.-Cell Physiol. 2007, 292, C82-C97.
[44] Rocha e Silva, M., Beraldo, W. T., Rosenfeld, G., Bradykinin, a hypotensive and smooth muscle stimulating factor released from plasma globulin by snake venoms and by trypsin. Am J Physiol 1949, 156, 261-273.
[45] Elliott, D. F., Horton, E. W., Lewis, G. P., Actions of Pure Bradykinin. J. Physiol.-London 1960, 153, 473-480.
[46] Fox, R. H., Goldsmith, R., Lewis, G. P., Kidd, D. J., Bradykinin as a Vasodilator in Man. J. Physiol.-London 1961, 157, 589-&.
[47] Regoli, D., Barabe, J., Pharmacology of Bradykinin and Related Kinins. Pharmacol. Rev. 1980, 32, 1-46.
[48] Vanarman, C. G., The Origin of Bradykinin. Proc. Soc. Exp. Biol. Med. 1952, 79, 356-359.
[49] Dorer, F. E., Kahn, J. R., Lentz, K. E., Levine, M., Skeggs, L. T., Hydrolysis of Bradykinin by Angiotensin-Converting Enzyme. Circ.Res. 1974, 34, 824-827.
[50] Maurer, M., Bader, M., Bas, M., Bossi, F., et al., New topics in bradykinin research. Allergy 2011, 66, 1397-1406.
[51] Heitsch, H., Bradykinin B-2 receptor as a potential therapeutic target. Drug News Perspect. 2000, 13, 213-225.
[52] Burbach, J. P. H., in: Merighi, A. (Ed.), Neuropeptides: Methods and Protocols, Humana Press Inc, Totowa 2011, pp. 1-36.
[53] Millan, M. J., Multiple Opioid Systems and Pain. Pain 1986, 27, 303-347.
[54] Benarroch, E. E., Endogenous opioid systems Current concepts and clinical correlations. Neurology 2012, 79, 807-814.
[55] Bodnar, R. J., Endogenous opiates and behavior: 2011. Peptides 2012, 38, 463-522.
[56] Lazarus, L. H., Ling, N., Guillemin, R., Beta-Lipotropin as a Pro-hormone for Morphinomimetic Peptides Endorphins and Enkephalins. Proc. Natl. Acad. Sci. U. S. A. 1976, 73, 2156-2159.
[57] Li, C. H., Chung, D., Doneen, B. A., Isolation, Characterization and Opiate Activity of Beta-Endorphin from Human Pituitary-Glands. Biochem. Biophys. Res. Commun. 1976, 72, 1542-1547.
[58] Millington, W. R., Smith, D. L., The Posttranslational Processing of Beta-Endorphin in Human Hypothalamus. J. Neurochem. 1991, 57, 775-781.
[59] Stein, C., Mechanisms of Disease - The Control of Pain in Peripheral Tissue by Opioids. N. Engl. J. Med. 1995, 332, 1685-1690.
[60] Balasubramaniam, A., Neuropeptide Y family of hormones: Receptor subtypes and antagonists. Peptides 1997, 18, 445-457.
[61] Tatemoto, K., Mutt, V., Isolation of 2 Novel Candidate Hormones Using a Chemical Method for Finding Naturally-Occurring Polypeptides. Nature 1980, 285, 417-418.
[62] Lindner, D., Stichel, J., Beck-Sickinger, A. G., Molecular recognition of the NPY hormone family by their receptors. Nutrition 2008, 24, 907-917.
[63] Wahlestedt, C., Grundemar, L., Hakanson, R., Heilig, M., et al., Neuropeptide-Y Receptor Subtypes, Y1 and Y2. Ann. N.Y. Acad. Sci. 1990, 611, 7-26.
[64] Michel, M. C., Beck-Sickinger, A., Cox, H., Doods, H. N., et al., XVI.
International Union of Pharmacology recommendations for the nomenclature of neuropeptide Y, peptide YY, and pancreatic polypeptide receptors. Pharmacol. Rev. 1998, 50, 143-150.
[65] Vincent, J. P., Neurotensin receptors: Binding properties, transduction pathways, and structure. Cell. Mol. Neurobiol. 1995, 15, 501-512.
[66] Walther, C., Morl, K., Beck-Sickinger, A. G., Neuropeptide Y receptors: ligand binding and trafficking suggest novel approaches in drug development. J. Pept. Sci. 2011, 17, 233-246.
[67] Kitabgi, P., Effects of Neurotensin on Intestinal Smooth-Muscle - Application to the Study of Structure Activity Relationships. Ann.NY Acad.Sci. 1982, 400, 37-55.
[68] Hermans, E., Maloteaux, J. M., Mechanisms of regulation of neurotensin receptors. Pharmacol. Ther. 1998, 79, 89-104.
[69] Kitts, D. D., Bioactive Substances in Food - Identification and Potential Uses. Can. J. Physiol. Pharmacol. 1994, 72, 423-434.
[70] Kitts, D. D., Weiler, K., Bioactive proteins and peptides from food sources. Applications of bioprocesses used in isolation and recovery. Curr. Pharm. Design 2003, 9, 1309-1323.
[71] Korhonen, H., Pihlanto, A., Food-derived bioactive peptides - Opportunities for designing future foods. Curr Pharm Design 2003, 9, 1297-1308.
[72] Li, G. H., Le, G. W., Shi, Y. H., Shrestha, S., Angiotensin I-converting enzyme inhibitory peptides derived from food proteins and their physiological and pharmacological effects. Nutr. Res. 2004, 24, 469-486.
[73] Ondetti, M. A., Cushman, D. W., Enzymes of the Renin-Angiotensin System and Their Inhibitors. Annu. Rev. Biochem. 1982, 51, 283-308.
[74] Yoshikawa, M., Exorphin-Opioid Active Peptides of Exogenous Origin, Elsevier Academic Press Inc, San Diego 2006.
[75] Meisel, H., Biochemical properties of bioactive peptides derived from milk proteins: Potential nutraceuticals for food and pharmaceutical applications. Livest Prod Sci 1997, 50, 125-138.
[76] Tidona, F., Criscione, A., Guastella, A.
M., Zuccaro, A., et al., Bioactive peptides in dairy products. Ital J Anim Sci 2009, 8, 315-340.
[77] Christison, G. W., Ivany, K., Elimination diets in autism spectrum disorders: Any wheat amidst the chaff? J. Dev. Behav. Pediatr. 2006, 27, S162-S171.
[78] Meisel, H., Biochemical properties of regulatory peptides derived from milk proteins. Biopolymers 1997, 43, 119-128.
[79] Miglioresamour, D., Floch, F., Jolles, P., Biologically-Active Casein Peptides Implicated in Immunomodulation. J. Dairy Res. 1989, 56, 357-362.
[80] Meisel, H., Schlimme, E., Bioactive peptides derived from milk proteins: Ingredients for functional foods? Kieler Milchw Forsch 1996, 48, 343-357.
[81] Marx, V., Watching peptide drugs grow up. Chem. Eng. News 2005, 83, 17-+.
[82] Brogden, K. A., Antimicrobial peptides: Pore formers or metabolic inhibitors in bacteria? Nat. Rev. Microbiol. 2005, 3, 238-250.
[83] Zasloff, M., Antimicrobial peptides of multicellular organisms. Nature 2002, 415, 389-395.
[84] Moller, N. P., Scholz-Ahrens, K. E., Roos, N., Schrezenmeir, J., Bioactive peptides and proteins from foods: indication for health effects. Eur J Nutr 2008, 47, 171-182.
[85] Aalberse, R. C., Food allergens. Environ Toxicol Phar 1997, 4, 55-60.
[86] Asero, R., Mistrello, G., Roncarolo, D., de Vries, S. C., et al., Lipid transfer protein: A panallergen in plant-derived foods that is highly resistant to pepsin digestion. Int Arch Allergy Imm 2000, 122, 20-32.
[87] Astwood, J. D., Leach, J. N., Fuchs, R. L., Stability of food allergens to digestion in vitro. Nat Biotechnol 1996, 14, 1269-1273.
[88] Battais, F., Richard, C., Jacquenet, S., Denery-Papini, S., Moneret-Vautrin, D. A., Wheat grain allergies: an update on wheat allergens. Eur Ann Allergy Clin Immunol 2008, 40, 67-76.
[89] Fu, T. T., Abbott, U. R., Hatzos, C., Digestibility of food allergens and nonallergenic proteins in simulated gastric fluid and simulated intestinal fluid - A comparative study. J Agr Food Chem 2002, 50, 7154-7160.
[90] Koppelman, S. J., Hefle, S. L., Taylor, S. L., de Jong, G. A. H., Digestion of peanut allergens Ara h 1, Ara h 2, Ara h 3, and Ara h 6: A comparative in vitro study and partial characterization of digestion-resistant peptides. Mol Nutr Food Res 2010, 54, 1711-1721.
[91] Sampson, H. A., Update on food allergy. J Allergy Clin Immun 2004, 113, 805-819.
[92] Inomata, N., Wheat allergy. Curr Opin Allergy Cl 2009, 9, 238-243.
[93] Auricchio, S., Deritis, G., Devincenzi, M., Silano, V., Toxicity Mechanisms of Wheat and Other Cereals in Celiac-Disease and Related Enteropathies. J Pediatr Gastr Nutr 1985, 4, 923-930.
[94] Maki, M., Collin, P., Coeliac disease. Lancet 1997, 349, 1755-1759.
[95] Sollid, L. M., Molecular basis of celiac disease. Annu Rev Immunol 2000, 18, 53-81.
[96] Sollid, L. M., Coeliac disease: Dissecting a complex inflammatory disorder. Nat Rev Immunol 2002, 2, 647-655.
[97] Taylor, R., Celiac disease. Am. J. Dis. Child. 1923, 25, 46-54.
[98] Diosdado, B., van Oort, E., Wijmenga, C., "Coelionomics": towards understanding the molecular pathology of coeliac disease. Clin Chem Lab Med 2005, 43, 685.
[99] Shan, L., Molberg, O., Parrot, I., Hausch, F., et al., Structural basis for gluten intolerance in Celiac sprue. Science 2002, 297, 2275-2279.
[100] Sjostrom, H., Lundin, K. E. A., Molberg, O., Korner, R., et al., Identification of a gliadin T-cell epitope in coeliac disease: General importance of gliadin deamidation for intestinal T-cell recognition. Scand J Immunol 1998, 48, 111-115.
[101] Carre, B., Mignon-Grasteau, S., Peron, A., Juin, H., Bastianelli, D., Wheat value: improvements by feed technology, plant breeding and animal genetics. World Poultry Sci J 2007, 63, 585-596.
[102] Shewry, P. R., Tatham, A. S., Lazzeri, P., Biotechnology of wheat quality. J Sci Food Agr 1997, 73, 397-406.
[103] Alaedini, A., Green, P. H. R., Narrative review: Celiac disease: Understanding a complex autoimmune disorder. Ann Intern Med 2005, 142, 289-298.
[104] Battais, F., Mothes, T., Moneret-Vautrin, D. A., Pineau, F., et al., Identification of IgE-binding epitopes on gliadins for patients with food allergy to wheat. Allergy 2005, 60, 815-821.
[105] Ciccocioppo, R., Di Sabatino, A., Corazza, G. R., The immune recognition of gluten in coeliac disease. Clin Exp Immunol 2005, 140, 408-416.
[106] Dicke, W. K., Weijers, H. A., v. d. Kamer, J. H., Coeliac Disease II. The Presence in Wheat of a Factor Having a Deleterious Effect in Cases of Coeliac Disease. Acta Paediatr 1953, 42, 34-42.
[107] Henderson, R. A., Michel, H., Sakaguchi, K., Shabanowitz, J., et al., Hla-A2.1-Associated Peptides from a Mutant-Cell Line - a 2nd Pathway of Antigen Presentation. Science 1992, 255, 1264-1266.
[108] Hunt, D. F., Henderson, R. A., Shabanowitz, J., Sakaguchi, K., et al., Characterization of Peptides Bound to the Class-I MHC Molecule HLA-A2.1 by Mass-Spectrometry. Science 1992, 255, 1261-1263.
[109] Kasarda, D. D., D'Ovidio, R., Deduced amino acid sequence of an alpha-gliadin gene from spelt wheat (spelta) includes sequences active in celiac disease. Cereal Chem. 1999, 76, 548-551.
[110] Kim, C. Y., Quarsten, H., Bergseng, E., Khosla, C., Sollid, L. M., Structural basis for HLA-DQ2-mediated presentation of gluten epitopes in celiac disease. P Natl Acad Sci USA 2004, 101, 4175-4179.
[111] Macdonald, W. C., Dobbins, W. O., Rubin, C. E., Studies of Familial Nature of Celiac Sprue Using Biopsy of Small Intestine. N. Engl. J. Med. 1965, 272, 448-&.
[112] Mamone, G., Ferranti, P., Rossi, M., Roepstorff, P., et al., Identification of a peptide from alpha-gliadin resistant to digestive enzymes: Implications for Celiac disease. J Chromatogr B 2007, 855, 236-241.
[113] Marietta, E. V., Schuppan, D., Murray, J. A., In vitro and in vivo models of celiac disease. Expert Opin Drug Dis 2009, 4, 1113-1123.
[114] Schuppan, D., Current concepts of celiac disease pathogenesis. Gastroenterology 2000, 119, 234-+.
[115] Shan, L., Qiao, S.
W., Arentz-Hansen, H., Molberg, O., et al., Identification and analysis of multivalent proteolytically resistant peptides from gluten: implications for celiac sprue. J Proteome Res 2005, 4, 1732-1741. [116] Silano, M., De Vincenzi, M., Bioactive antinutritional peptides derived from cereal prolamins: A Review. Nahrung 1999, 43, 175-184. [117] Singh, M., Khatkar, B. S., Structural and functional properties of wheat storage proteins: A review. J Food Sci Tech Mys 2005, 42, 455-471. [118] Vader, L. W., Stepniak, D. T., Bunnik, E. M., Kooy, Y. M. C., et al., Characterization of cereal toxicity for celiac disease patients based on protein homology in grains. Gastroenterology 2003, 125, 1105-1113. [119] van de Kamer, J. H., Weijers, H. A., Dicke, W. K., Coeliac Disease IV. An Investigation into the Injurious Constituents of Wheat in Connection with their Action on Patients with Coeliac Disease. Acta Paediatr 1953, 42, 223-231. [120] van de Wal, Y., Kooy, Y. M. C., van Veelen, P. A., Pena, S. A., et al., Small intestinal T cells of celiac disease patients recognize a natural pepsin fragment of gliadin. P Natl Acad Sci USA 1998, 95, 10050-10054. [121] van Heel, D. A., West, J., Recent advances in coeliac disease. Gut 2006, 55, 1037-1046. [122] Wieser, H., Comparative investigations of gluten proteins from different wheat species. III. N-terminal amino acid sequences of alpha-gliadins potentially toxic for coeliac patients. Eur Food Res Technol 2001, 213, 183-186. [123] Anderson, R. P., Degano, P., Godkin, A. J., Jewell, D. P., Hill, A. V. S., In vivo antigen challenge in celiac disease identifies a single transglutaminase-modified peptide as the dominant A-gliadin T-cell epitope. Nat Med 2000, 6, 337-342. [124] Dieterich, W., Ehnis, T., Bauer, M., Donner, P., et al., Identification of tissue transglutaminase as the autoantigen of celiac disease. Nat Med 1997, 3, 797-801. [125] Dorum, S., Arntzen, M. O., Qiao, S.
W., Holm, A., et al., The Preferred Substrates for Transglutaminase 2 in a Complex Wheat Gluten Digest Are Peptide Fragments Harboring Celiac Disease T-Cell Epitopes. PLOS One 2010, 5. [126] Molberg, O., Mcadam, S. N., Korner, R., Quarsten, H., et al., Tissue transglutaminase selectively modifies gliadin peptides that are recognized by gut-derived T cells in celiac disease. Nat Med 1998, 4, 713-717. [127] Vader, L. W., de Ru, A., van der Wal, Y., Kooy, Y. M. C., et al., Specificity of tissue transglutaminase explains cereal toxicity in celiac disease. J Exp Med 2002, 195, 643-649. [128] Forssell, F., Wieser, H., Spelt Wheat and Celiac-Disease. Z Lebensm Unters For 1995, 201, 35-39. [129] Hogberg, L., Stenhammar, L., Is spelt wheat toxic to those with celiac disease? J Pediatr Gastr Nutr 2000, 31, 321-321. [130] Klockenbring, T., Boese, A., Bauer, R., Goerlich, R., Comparative investigations of wheat and spelt cultivars: IgA, IgE, IgG1 and IgG4 binding characteristics. Food Agric Immunol 2001, 13, 171-181. [131] Moudry, J., Dvoracek, V., Chemical composition of grain of different spelt (Triticum spelta L.) varieties. Rostl. Vyroba 1999, 45, 533-538. [132] Vincentini, O., Maialetti, F., Gazza, L., Silano, M., et al., Environmental factors of celiac disease: Cytotoxicity of hulled wheat species Triticum monococcum, T-turgidum ssp dicoccum and T-aestivum ssp spelta. J Gastroen Hepatol 2007, 22, 1816-1822. [133] Zielinski, H., Ceglinska, A., Michalska, A., Bioactive compounds in spelt bread. Eur Food Res Tech 2008, 226, 537-544. [134] Collins, F. S., Lander, E. S., Rogers, J., Waterston, R. H., International Human Genome Sequencing Consortium, Finishing the euchromatic sequence of the human genome. Nature 2004, 431, 931-945. [135] Morice, Y., Ratinier, M., Miladi, A., Chevaliez, S., et al., Seroconversion to Hepatitis C Virus Alternate Reading Frame Protein During Acute Infection. Hepatology 2009, 49, 1449-1459. [136] Im, D. S., Orphan G protein-coupled receptors and beyond. Jpn. J.
Pharmacol. 2002, 90, 101-106. [137] Arentz-Hansen, H., McAdam, S. N., Molberg, O., Fleckenstein, B., et al., Celiac lesion T cells recognize epitopes that cluster in regions of gliadins rich in proline residues. Gastroenterology 2002, 123, 803-809. [138] Ellis, H. J., Pollock, E. L., Engel, W., Fraser, J. S., et al., Investigation of the putative immunodominant T cell epitopes in coeliac disease. Gut 2003, 52, 212-217. [139] Ferranti, P., Mamone, G. R., Picariello, G., Addeo, F., Mass spectrometry analysis of gliadins in celiac disease. J Mass Spectrom 2007, 42, 1531-1548. [140] Mamone, G., Addeo, F., Chianese, L., Di Luccia, A., et al., Characterization of wheat gliadin proteins by combined two-dimensional gel electrophoresis and tandem mass spectrometry. Proteomics 2005, 5, 2859-2865. [141] Qiao, S. W., Bergseng, E., Molberg, O., Jung, G., et al., Refining the rules of gliadin T cell epitope binding to the disease-associated DQ2 molecule in celiac disease: Importance of proline spacing and glutamine deamidation. J Immunol 2005, 175, 254-261. [142] Tanabe, S., IgE-binding abilities of pentapeptides, QQPFP and PQQPF, in wheat gliadin. J Nutr Sci Vitaminol 2004, 50, 367-370. [143] Vartdal, F., Johansen, B. H., Friede, T., Thorpe, C. J., et al., The peptide binding motif of the disease associated HLA-DQ (alpha 1* 0501, beta 1* 0201) molecule. Eur J Immunol 1996, 26, 2764-2772. [144] Aebersold, R., Goodlett, D. R., Mass spectrometry in proteomics. Chem Rev 2001, 101, 269-295. [145] Currie, M. G., Geller, D. M., Cole, B. R., Siegel, N. R., et al., Purification and Sequence-Analysis of Bioactive Atrial Peptides (Atriopeptins). Science 1984, 223, 67-69. [146] Schrader, M., Schulz-Knappe, P., Peptidomics technologies for human body fluids. Trends Biotechnol 2001, 19, S55-S60. [147] Spahr, C. S., Davis, M. T., McGinley, M. D., Robinson, J. H., et al., Towards defining the urinary proteome using liquid chromatography-tandem mass spectrometry I.
Profiling an unfractionated tryptic digest. Proteomics 2001, 1, 93-107. [148] Jones, M. B., Krutzsch, H., Shu, H. J., Zhao, Y. M., et al., Proteomic analysis and identification of new biomarkers and therapeutic targets for invasive ovarian cancer. Proteomics 2002, 2, 76-84. [149] McDonald, W. H., Yates, J. R., Shotgun proteomics and biomarker discovery. Dis Markers 2002, 18, 99-105. [150] Verhoeckx, K. C. M., Bijlsma, S., Jespersen, S., Ramaker, R., et al., Characterization of anti-inflammatory compounds using transcriptomics, proteomics, and metabolomics in combination with multivariate data analysis. Int. Immunopharmacol. 2004, 4, 1499-1514. [151] Finehout, E. J., Franck, Z., Choe, L. H., Relkin, N., Lee, K. H., Cerebrospinal fluid proteomic biomarkers for Alzheimer's disease. Ann. Neurol. 2007, 61, 120-129. [152] Bethell, D., Metcalfe, G. E., Sheppard, R. C., Kinetics and Mechanism of Edman Degradation. Chem. Commun. 1965, 189-&. [153] Yamashita, M., Fenn, J. B., Electrospray Ion-Source - Another Variation on the Free-Jet Theme. J Phys Chem-US 1984, 88, 4451-4459. [154] Meng, C. K., Mann, M., Fenn, J. B., Of Protons or Proteins - "A Beam's a Beam for a' That" (O. S. Burns). Z Phys D Atom Mol Cl 1988, 10, 361-368. [155] Fenn, J. B., Mann, M., Meng, C. K., Wong, S. F., Whitehouse, C. M., Electrospray Ionization for Mass-Spectrometry of Large Biomolecules. Science 1989, 246, 64-71. [156] Loo, J. A., Udseth, H. R., Smith, R. D., Solvent Effects on the Charge-Distribution Observed with Electrospray Ionization-Mass Spectrometry of Large Molecules. Biomed Environ Mass 1988, 17, 411-414. [157] Loo, J. A., Udseth, H. R., Smith, R. D., Peptide and Protein-Analysis by Electrospray Ionization Mass-Spectrometry and Capillary Electrophoresis Mass-Spectrometry. Anal Biochem 1989, 179, 404-412. [158] Brown, R. S., Feng, J. H., Reiber, D. C., Further studies of in-source fragmentation of peptides in matrix-assisted laser desorption-ionization. Int. J. Mass Spectrom.
1997, 169, 1-18. [159] Hillenkamp, F., Karas, M., Beavis, R. C., Chait, B. T., Matrix-Assisted Laser Desorption Ionization Mass-Spectrometry of Biopolymers. Anal. Chem. 1991, 63, A1193-A1202. [160] Gohlke, R. S., Time-of-Flight Mass Spectrometry and Gas-Liquid Partition Chromatography. Anal. Chem. 1959, 31, 535-541. [161] Kirkland, J. J., High Speed Liquid-Partition Chromatography with Chemically Bonded Organic Stationary Phases. J. Chromatogr. Sci. 1971, 9, 206-&. [162] Schmit, J. A., Henry, R. A., Williams, R. C., Dieckman, J. F., Applications of High Speed Reversed-Phase Liquid Chromatography. J Chromatogr Sci 1971, 9, 645-&. [163] Pei, P. T. S., Henly, R. S., Ramachandran, S., New Application of High-Pressure Reversed-Phase Liquid-Chromatography in Lipids. Lipids 1975, 10, 152-156. [164] Boyle, J. G., Whitehouse, C. M., Time-of-Flight Mass-Spectrometry with an Electrospray Ion-Beam. Anal. Chem. 1992, 64, 2084-2089. [165] Gilmore, I. S., Seah, M. P., Ion detection efficiency in SIMS: dependencies on energy, mass and composition for microchannel plates used in mass spectrometry. Int. J. Mass Spectrom. 2000, 202, 217-229. [166] Hunt, D. F., Yates, J. R., Shabanowitz, J., Winston, S., Hauer, C. R., Protein Sequencing by Tandem Mass-Spectrometry. P Natl Acad Sci USA 1986, 83, 6233-6237. [167] Eng, J. K., Mccormack, A. L., Yates, J. R., An Approach to Correlate Tandem Mass-Spectral Data of Peptides with Amino-Acid-Sequences in a Protein Database. J Am Soc Mass Spectr 1994, 5, 976-989. [168] McCormack, A. L., Schieltz, D. M., Goode, B., Yang, S., et al., Direct analysis and identification of proteins in mixtures by LC/MS/MS and database searching at the low-femtomole level. Anal Chem 1997, 69, 767-776. [169] Raida, M., Schulz-Knappe, P., Heine, G., Forssmann, W. G., Liquid chromatography and electrospray mass spectrometric mapping of peptides from human plasma filtrate. J Am Soc Mass Spectr 1999, 10, 45-54. [170] Davis, M. T., Spahr, C. S., McGinley, M.
D., Robinson, J. H., et al., Towards defining the urinary proteome using liquid chromatography-tandem mass spectrometry - II. Limitations of complex mixture analyses. Proteomics 2001, 1, 108-117. [171] Washburn, M. P., Wolters, D., Yates, J. R., Large-scale analysis of the yeast proteome by multidimensional protein identification technology. Nat Biotechnol 2001, 19, 242-247. [172] Adkins, J. N., Varnum, S. M., Auberry, K. J., Moore, R. J., et al., Toward a human blood serum proteome - Analysis by multidimensional separation coupled with mass spectrometry. Mol Cell Proteomics 2002, 1, 947-955. [173] Bergquist, J., Palmblad, M., Wetterhall, M., Hakansson, P., Markides, K. E., Peptide mapping of proteins in human body fluids using electrospray ionization Fourier transform ion cyclotron resonance mass spectrometry. Mass Spectrom Rev 2002, 21, 2-15. [174] Li, J. N., Zhang, Z., Rosenzweig, J., Wang, Y. Y., Chan, D. W., Proteomics and bioinformatics approaches for identification of serum biomarkers to detect breast cancer. Clin Chem 2002, 48, 1296-1304. [175] Tsuji, T., Shiozaki, A., Kohno, R., Yoshizato, K., Shimohama, S., Proteomic profiling and neurodegeneration in Alzheimer's disease. Neurochem Res 2002, 27, 1245-1253. [176] Wulfkuhle, J. D., Liotta, L. A., Petricoin, E. F., Proteomic applications for the early detection of cancer. Nat Rev Cancer 2003, 3, 267-275. [177] Liao, H., Wu, J., Kuhn, E., Chin, W., et al., Use of mass spectrometry to identify protein biomarkers of disease severity in the synovial fluid and serum of patients with rheumatoid arthritis. Arthritis Rheum 2004, 50, 3792-3803. [178] Pisitkun, T., Shen, R. F., Knepper, M. A., Identification and proteomic profiling of exosomes in human urine. P Natl Acad Sci USA 2004, 101, 13368-13373. [179] Menzel, C., Guillou, V., Kellmann, M., Khamenya, V., et al., High-throughput biomarker discovery and identification by mass spectrometry. Comb Chem High T Scr 2005, 8, 743-755. [180] Gronborg, M., Kristiansen, T. 
Z., Iwahori, A., Chang, R., et al., Biomarker discovery from pancreatic cancer secretome using a differential proteomic approach. Mol. Cell. Proteomics 2006, 5, 157-171. [181] Hu, S., Loo, J. A., Wong, D. T., Human body fluid proteome analysis. Proteomics 2006, 6, 6326-6353. [182] Ehmann, M., Felix, K., Hartmann, D., Schnolzer, M., et al., Identification of potential markers for the detection of pancreatic cancer through comparative serum protein expression profiling. Pancreas 2007, 34, 205-214. [183] Carpentier, S. C., Panis, B., Vertommen, A., Swennen, R., et al., Proteome analysis of non-model plants: A challenging but powerful approach. Mass Spectrom Rev 2008, 27, 354-377. [184] Hanash, S. M., Pitteri, S. J., Faca, V. M., Mining the plasma proteome for cancer biomarkers. Nature 2008, 452, 571-579. [185] Lin, Y., Zhou, J., Bi, D., Chen, P., et al., Sodium-deoxycholate-assisted tryptic digestion and identification of proteolytically resistant proteins. Anal Biochem 2008, 377, 259-266. [186] Westman-Brinkmalm, A., Ruetschi, U., Portelius, E., Andreasson, U., et al., Proteomics/peptidomics tools to find CSF biomarkers for neurodegenerative diseases. Front Biosci 2009, 14, 1793-1806. [187] Brubaker, W. M., Tuul, J., Performance Studies of Quadrupole Mass Filter. Rev. Sci. Instrum. 1964, 35, 1007-&. [188] Yost, R. A., Enke, C. G., Triple Quadrupole Mass-Spectrometry for Direct Mixture Analysis and Structure Elucidation. Anal. Chem. 1979, 51, 1251-&. [189] America, A. H. P., Cordewener, J. H. G., van Geffen, M. H. A., Lommen, A., et al., Alignment and statistical difference analysis of complex peptide data sets generated by multidimensional LC-MS. Proteomics 2006, 6, 641-653. [190] Liu, H. B., Sadygov, R. G., Yates, J. R., A model for random sampling and estimation of relative protein abundance in shotgun proteomics. Anal. Chem. 2004, 76, 4193-4201. [191] Adams, J., Charge-Remote Fragmentations - Analytical Applications and Fundamental-Studies. Mass Spectrom. Rev.
1990, 9, 141-186. [192] Harrison, A. G., Fragmentation reactions of protonated peptides containing glutamine or glutamic acid. J Mass Spectrom 2003, 38, 174-187. [193] Paizs, B., Suhai, S., Fragmentation pathways of protonated peptides. Mass Spectrom. Rev. 2005, 24, 508-548. [194] Dongre, A. R., Jones, J. L., Somogyi, A., Wysocki, V. H., Influence of peptide composition, gas-phase basicity, and chemical modification on fragmentation efficiency: Evidence for the mobile proton model. J. Am. Chem. Soc. 1996, 118, 8365-8374. [195] Biniossek, M. L., Schilling, O., Enhanced identification of peptides lacking basic residues by LC-ESI-MS/MS analysis of singly charged peptides. Proteomics 2012, 12, 1303-1309. [196] Mouls, L., Aubagnac, J. L., Martinez, J., Enjalbal, C., Low energy peptide fragmentations in an ESI-Q-Tof type mass spectrometer. J. Proteome Res. 2007, 6, 1378-1391. [197] Patton, W. F., Schulenberg, B., Steinberg, T. H., Two-dimensional gel electrophoresis; better than a poke in the ICAT? Curr. Opin. Biotechnol. 2002, 13, 321-328. [198] Perry, R. H., Cooks, R. G., Noll, R. J., Orbitrap Mass Spectrometry: Instrumentation, Ion Motion and Applications. Mass Spectrom. Rev. 2008, 27, 661-699. [199] Schulz-Knappe, P., Zucht, H. D., Heine, C., Jurgens, M., et al., Peptidomics: The comprehensive analysis of peptides in complex biological mixtures. Comb Chem High T Scr 2001, 4, 207-217. [200] Baggerman, G., Verleyen, P., Clynen, E., Huybrechts, J., et al., Peptidomics. J Chromatogr B 2004, 803, 3-16. [201] Fricker, L. D., Lim, J. Y., Pan, H., Che, F. Y., Peptidomics: identification and quantification of endogenous peptides in neuroendocrine tissues. Mass Spectrom Rev 2006, 25, 327-344. [202] Jost, M. M., Budde, P., Tammen, H., Hess, R., et al., The concept of functional peptidomics for the discovery of bioactive peptides in cell culture models. Comb Chem High T Scr 2005, 8, 767-773. [203] Kim, Y. G., Lone, A. M., Nolte, W.
M., Saghatelian, A., Peptidomics approach to elucidate the proteolytic regulation of bioactive peptides. P Natl Acad Sci USA 2012, 109, 8523-8527. [204] Menschaert, G., Vandekerckhove, T. T. M., Baggerman, G., Schoofs, L., et al., Peptidomics Coming of Age: A Review of Contributions from a Bioinformatics Angle. J Proteome Res 2010, 9, 2051-2061. [205] Osaki, T., Sasaki, K., Minamino, N., Peptidomics-Based Discovery of an Antimicrobial Peptide Derived from Insulin-Like Growth Factor-Binding Protein 5. J. Proteome Res. 2011, 10, 1870-1880. [206] Sasaki, K., Takahashi, N., Satoh, M., Yamasaki, M., Minamino, N., A Peptidomics Strategy for Discovering Endogenous Bioactive Peptides. J Proteome Res 2010, 9, 5047-5052. [207] Schulz-Knappe, P., Schrader, M., Zucht, H. D., The peptidomics concept. Comb Chem High T Scr 2005, 8, 697-704. [208] Soloviev, M., Finch, P., Peptidomics, current status. J Chromatogr B 2005, 815, 11-24. [209] Soloviev, M., Finch, P., Peptidomics: Bridging the gap between proteome and metabolome. Proteomics 2006, 6, 744-747. [210] Tammen, H., Peck, A., Budde, P., Zucht, H. D., Peptidomics analysis of human blood specimens for biomarker discovery. Expert Rev Mol Diagn 2007, 7, 605-613. [211] Tinoco, A. D., Saghatelian, A., Investigating Endogenous Peptides and Peptidases Using Peptidomics. Biochemistry-Us 2011, 50, 7447-7461. [212] Villanueva, J., Martorella, A. J., Lawlor, K., Philip, J., et al., Serum peptidome patterns that distinguish metastatic thyroid carcinoma from cancer-free controls are unbiased by gender and age. Mol Cell Proteomics 2006, 5, 1840-1852. [213] Yamaguchi, H., Sasaki, K., Satomi, Y., Shimbara, T., et al., Peptidomic identification and biological validation of neuroendocrine regulatory peptide-1 and -2. J Biol Chem 2007, 282, 26354-26360. [214] Bauchart, C., Morzel, M., Chambon, C., Mirand, P. P., et al., Peptides reproducibly released by in vivo digestion of beef meat and trout flesh in pigs. Brit J Nutr 2007, 98, 1187-1195.
[215] Cavatorta, V., Sforza, S., Aquino, G., Galaverna, G., et al., In vitro gastrointestinal digestion of the major peach allergen Pru p 3, a lipid transfer protein: Molecular characterization of the products and assessment of their IgE binding abilities. Mol Nutr Food Res 2010, 54, 1452-1457. [216] Escudero, E., Sentandreu, M. A., Toldra, F., Characterization of Peptides Released by in Vitro Digestion of Pork Meat. J. Agric. Food Chem. 2010, 58, 5160-5165. [217] Gomez-Ruiz, J. A., Ramos, M., Recio, I., Angiotensin-converting enzyme-inhibitory peptides in Manchego cheeses manufactured with different starter cultures. Int Dairy J 2002, 12, 697-706. [218] Hesari, J., Ehsani, M. R., Mosavi, M. A. E., McSweeney, P. L. H., Proteolysis in ultrafiltered and conventional Iranian white cheese during ripening. Int J Dairy Technol 2007, 60, 211-220. [219] Niehues, M., Euler, M., Georgi, G., Mank, M., et al., Peptides from Pisum sativum L. enzymatic protein digest with anti-adhesive activity against Helicobacter pylori: Structure-activity and inhibitory activity against BabA, SabA, HpaA and a fibronectin-binding adhesin. Mol Nutr Food Res 2010, 54, 1851-1861. [220] Picariello, G., Ferranti, P., Fierro, O., Mamone, G., et al., Peptides surviving the simulated gastrointestinal digestion of milk proteins: Biological and toxicological implications. J Chromatogr B 2010, 878, 295-308. [221] Quiros, A., Contreras, M. D., Ramos, M., Amigo, L., Recio, I., Stability to gastrointestinal enzymes and structure-activity relationship of beta-casein-peptides with antihypertensive properties. Peptides 2009, 30, 1848-1853. [222] Dalle-Donne, I., Scaloni, A., Giustarini, D., Cavarra, E., et al., Proteins as biomarkers of oxidative/nitrosative stress in diseases: The contribution of redox proteomics. Mass Spectrom Rev 2005, 24, 55-99. [223] Link, A. J., Carmack, E., Yates, J.
R., A strategy for the identification of proteins localized to subcellular spaces: Application to E-coli periplasmic proteins. Int J Mass Spectrom 1997, 160, 303-316. [224] Porto, A. C. R. C., Oliveira, L. L., Ferraz, L. C., Ferraz, L. E. S., et al., Isolation of bovine immunoglobulins resistant to peptic digestion: New perspectives in the prevention of failure in passive immunization of neonatal calves. J Dairy Sci 2007, 90, 955-962. [225] Ackermann, B. L., Hale, J. E., Duffin, K. L., The role of mass spectrometry in biomarker discovery and measurement. Curr Drug Metab 2006, 7, 525-539. [226] Antignac, J. P., de Wasch, K., Monteau, F., De Brabander, H., et al., The ion suppression phenomenon in liquid chromatography-mass spectrometry and its consequences in the field of residue analysis. Anal. Chim. Acta 2005, 529, 129-136. [227] America, A. H. P., Cordewener, J. H. G., Comparative LC-MS: A landscape of peaks and valleys. Proteomics 2008, 8, 731-749. [228] Backstrom, D., Moberg, M., Sjoberg, P. J. R., Bergquist, J., Danielsson, R., Multivariate comparison between peptide mass fingerprints obtained by liquid chromatography-electrospray ionization-mass spectrometry with different trypsin digestion procedures. J Chromatogr A 2007, 1171, 69-79. [229] Currie, L. A., Devoe, J. R., Filliben, J. J., Statistical and Mathematical Methods in Analytical-Chemistry. Anal Chem 1972, 44, R497-&. [230] Fiehn, O., Metabolomics - the link between genotypes and phenotypes. Plant Mol Biol 2002, 48, 155-171. [231] Fiehn, O., Kopka, J., Dormann, P., Altmann, T., et al., Metabolite profiling for plant functional genomics. Nat Biotechnol 2000, 18, 1157-1161. [232] Hollywood, K., Brison, D. R., Goodacre, R., Metabolomics: Current technologies and future trends. Proteomics 2006, 6, 4716-4723. [233] Idborg, H., Edlund, P. O., Jacobsson, S. P., Multivariate approaches for efficient detection of potential metabolites from liquid chromatography/mass spectrometry data.
Rapid Commun Mass Spectrom 2004, 18, 944-954. [234] Kowalski, B. R., Chemometrics. Anal. Chem. 1980, 52, R112-R122. [235] Nicholson, J. K., Lindon, J. C., Holmes, E., 'Metabonomics': understanding the metabolic responses of living systems to pathophysiological stimuli via multivariate statistical analysis of biological NMR spectroscopic data. Xenobiotica 1999, 29, 1181-1189. [236] Norden, B., Broberg, P., Lindberg, C., Plymoth, A., Analysis and understanding of high-dimensionality data by means of multivariate data analysis. Chem Biodivers 2005, 2, 1487-1494. [237] Saurina, J., Characterization of wines using compositional profiles and chemometrics. Trac-Trends Anal. Chem. 2010, 29, 234-245. [238] Trygg, J., Gullberg, J., Johansson, A. I., Jonsson, P., Moritz, T., Chemometrics in Metabolomics. Biotechnol Agric For 2006, 57, 117-128. [239] Trygg, J., Holmes, E., Lundstedt, T., Chemometrics in metabonomics. J Proteome Res 2007, 6, 469-479. [240] Trygg, J., Wold, S., Orthogonal projections to latent structures (O-PLS). J Chemometr 2002, 16, 119-128. [241] Wang, C. Z., Ni, M., Sun, S., Li, X. L., et al., Detection of Adulteration of Notoginseng Root Extract with Other Panax Species by Quantitative HPLC Coupled with PCA. J. Agric. Food Chem. 2009, 57, 2363-2367. [242] Wiklund, S., Johansson, E., Sjostrom, L., Mellerowicz, E. J., et al., Visualization of GC/TOF-MS-based metabolomics data for identification of biochemically interesting compounds using OPLS class models. Anal Chem 2008, 80, 115-122. [243] Allaker, R. P., Host defence peptides - a bridge between the innate and adaptive immune responses. T Roy Soc Trop Med H 2008, 102, 3-4. [244] Lopez-Fandino, R., Otte, J., van Camp, J., Physiological, chemical and technological aspects of milk-protein-derived peptides with antihypertensive and ACE-inhibitory activity. Int Dairy J 2006, 16, 1277-1293. [245] Tosteson, M. T., Holmes, S. J., Razin, M., Tosteson, D. C., Melittin Lysis of Red-Cells. J. Membr. Biol.
1985, 87, 35-44. [246] Selle, H., Lamerz, J., Buerger, K., Dessauer, A., et al., Identification of novel biomarker candidates by differential peptidomics analysis of cerebrospinal fluid in Alzheimer's disease. Comb Chem High T Scr 2005, 8, 801-806. [247] Adt, I., Dupas, C., Boutrou, R., Oulahal, N., et al., Identification of caseinophosphopeptides generated through in vitro gastro-intestinal digestion of Beaufort cheese. Int Dairy J 2011, 21, 129-134. [248] Gomez-Ruiz, J. A., Ramos, M., Recio, I., Identification of novel angiotensin-converting enzyme-inhibitory peptides from ovine milk proteins by CE-MS and chromatographic techniques. Electrophoresis 2007, 28, 4202-4211. [249] Michalski, A., Cox, J., Mann, M., More than 100,000 Detectable Peptide Species Elute in Single Shotgun Proteomics Runs but the Majority is Inaccessible to Data-Dependent LC-MS/MS. J. Proteome Res. 2011, 10, 1785-1793. [250] Kelstrup, C. D., Young, C., Lavallee, R., Nielsen, M. L., Olsen, J. V., Optimized Fast and Sensitive Acquisition Methods for Shotgun Proteomics on a Quadrupole Orbitrap Mass Spectrometer. J. Proteome Res. 2012, 11, 3487-3497. [251] Rudomin, E. L., Carr, S. A., Jaffe, J. D., Directed Sample Interrogation Utilizing an Accurate Mass Exclusion-Based Data-Dependent Acquisition Strategy (AMEx). J. Proteome Res. 2009, 8, 3154-3160. [252] Olsen, J. V., Schwartz, J. C., Griep-Raming, J., Nielsen, M. L., et al., A Dual Pressure Linear Ion Trap Orbitrap Instrument with Very High Sequencing Speed. Mol. Cell. Proteomics 2009, 8, 2759-2769. [253] Panchaud, A., Affolter, M., Kussmann, M., Mass spectrometry for nutritional peptidomics: How to analyze food bioactives and their health effects. J Proteomics 2012, 75, 3546-3559. [254] Yates, J. R., Eng, J. K., McCormack, A. L., Schieltz, D., Method to Correlate Tandem Mass-Spectra of Modified Peptides to Amino-Acid-Sequences in the Protein Database. Anal. Chem. 1995, 67, 1426-1436. [255] Yates, J. R., McCormack, A.
L., Schieltz, D., Carmack, E., Link, A., Direct analysis of protein mixtures by tandem mass spectrometry. J. Protein Chem. 1997, 16, 495-497. [256] Gillet, L. C., Navarro, P., Tate, S., Rost, H., et al., Targeted Data Extraction of the MS/MS Spectra Generated by Data-independent Acquisition: A New Concept for Consistent and Accurate Proteome Analysis. Mol. Cell. Proteomics 2012, 11. [257] Paul, M., Somkuti, G. A., Hydrolytic breakdown of lactoferricin by lactic acid bacteria. J Ind Microbiol Biot 2010, 37, 173-178. [258] Di Bernardini, R., Harnedy, P., Bolton, D., Kerry, J., et al., Antioxidant and antimicrobial peptidic hydrolysates from muscle protein sources and by-products. Food Chem 2011, 124, 1296-1307. [259] Jimsheena, V. K., Gowda, L. R., Angiotensin I-converting enzyme (ACE) inhibitory peptides derived from arachin by simulated gastric digestion. Food Chem 2011, 125, 561-569. [260] Romanova, E. V., Lee, J. E., Kelleher, N. L., Sweedler, J. V., Gulley, J. M., Comparative peptidomics analysis of neural adaptations in rats repeatedly exposed to amphetamine. J. Neurochem. 2012, 123, 276-287. [261] Sforza, S., Cavatorta, V., Lambertini, F., Galaverna, G., et al., Cheese peptidomics: A detailed study on the evolution of the oligopeptide fraction in Parmigiano-Reggiano cheese from curd to 24 months of aging. J Dairy Sci 2012, 95, 3514-3526. [262] Vanhoof, G., Goossens, F., Demeester, I., Hendriks, D., Scharpe, S., Proline Motifs in Peptides and Their Biological Processing. Faseb J 1995, 9, 736-744. [263] Yaron, A., The Role of Proline in the Proteolytic Regulation of Biologically-Active Peptides. Biopolymers 1987, 26, S215-S222. [264] Tang, J., Competitive Inhibition of Pepsin by Aliphatic Alcohols. J. Biol. Chem. 1965, 240, 3810-&. [265] Stagliano, M. C., DeKeyser, J. G., Omiecinski, C. J., Jones, A.
D., Bioassay-directed fractionation for discovery of bioactive neutral lipids guided by relative mass defect filtering and multiplexed collision-induced dissociation. Rapid Commun. Mass Spectrom. 2010, 24, 3578-3584. [266] Loo, J. A., Edmonds, C. G., Smith, R. D., Tandem Mass-Spectrometry of Very Large Molecules 2. Dissociation of Multiply Charged Proline-Containing Proteins from Electrospray Ionization. Anal. Chem. 1993, 65, 425-438. [267] Prandi, B., Bencivenni, M., Faccini, A., Tedeschi, T., et al., Composition of peptide mixtures derived from simulated gastrointestinal digestion of prolamins from different wheat varieties. J. Cereal Sci. 2012, 56, 223-231. [268] Lioe, H., Laskin, J., Reid, G. E., O'Hair, R. A. J., Energetics and dynamics of the fragmentation reactions of protonated peptides containing methionine sulfoxide or aspartic acid via energy- and time-resolved surface induced dissociation. J Phys Chem A 2007, 111, 10580-10588. [269] Grewal, R. N., El Aribi, H., Harrison, A. G., Siu, K. W. M., Hopkinson, A. C., Fragmentation of protonated tripeptides: The proline effect revisited. J. Phys. Chem. B 2004, 108, 4899-4908. [270] Dupre, M., Cantel, S., Martinez, J., Enjalbal, C., Occurrence of C-Terminal Residue Exclusion in Peptide Fragmentation by ESI and MALDI Tandem Mass Spectrometry. J Am Soc Mass Spectr 2012, 23, 330-346. [271] Jia, C. X., Qi, W., He, Z. M., Cyclization reaction of peptide fragment ions during multistage collisionally activated decomposition: An inducement to lose internal amino-acid residues. J Am Soc Mass Spectr 2007, 18, 663-678. [272] Palumbo, A. M., Reid, G. E., Evaluation of Gas-Phase Rearrangement and Competing Fragmentation Reactions on Protein Phosphorylation Site Assignment Using Collision Induced Dissociation-MS/MS and MS3. Anal. Chem. 2008, 80, 9735-9747. [273] Vachet, R. W., Bishop, B. M., Erickson, B. W., Glish, G. L., Novel peptide dissociation: Gas-phase intramolecular rearrangement of internal amino acid residues.
J Am Chem Soc 1997, 119, 5481-5488. [274] Yague, J., Paradela, A., Ramos, M., Ogueta, S., et al., Peptide rearrangement during quadrupole ion trap fragmentation: Added complexity to MS/MS spectra. Anal Chem 2003, 75, 1524-1535. [275] Dong, N. P., Zhang, L. X., Liang, Y. Z., A comprehensive investigation of proline fragmentation behavior in low-energy collision-induced dissociation peptide mass spectra. Int. J. Mass Spectrom. 2011, 308, 89-97. [276] Horvath, C., Melander, W., Liquid-Chromatography with Hydrocarbonaceous Bonded Phases - Theory and Practice of Reversed Phase Chromatography. J Chromatogr Sci 1977, 15, 393-404. [277] Horvath, C., Melander, W., Molnar, I., Solvophobic Interactions in Liquid-Chromatography with Nonpolar Stationary Phases. J Chromatogr 1976, 125, 129-156. [278] Bij, K. E., Horvath, C., Melander, W. R., Nahum, A., Surface Silanols in Silica-Bonded Hydrocarbonaceous Stationary Phases 2. Irregular Retention Behavior and Effect of Silanol Masking. J Chromatogr 1981, 203, 65-84. [279] Nawrocki, J., The silanol group and its role in liquid chromatography. J Chromatogr A 1997, 779, 29-71. [280] Nahum, A., Horvath, C., Surface Silanols in Silica-Bonded Hydrocarbonaceous Stationary Phases 1. Dual Retention Mechanism in Reversed-Phase Chromatography. J Chromatogr 1981, 203, 53-63. [281] Bell, D. S., Jones, A. D., Solute attributes and molecular interactions contributing to "U-shape" retention on a fluorinated high-performance liquid chromatography stationary phase. J Chromatogr A 2005, 1073, 99-109. [282] Billiet, H. A. H., Schoenmakers, P. J., Degalan, L., Retention and Selectivity Characteristics of a Nonpolar Perfluorinated Stationary Phase for Liquid-Chromatography. J Chromatogr 1981, 218, 443-454. [283] Euerby, M. R., McKeown, A. P., Petersson, P., Chromatographic classification and comparison of commercially available perfluorinated stationary phases for reversed-phase liquid chromatography using Principal Component Analysis.
J Sep Sci 2003, 26, 295-306. [284] Nichthauser, J., Stepnowski, P., Retention Mechanism of Selected Ionic Liquids On a Pentafluorophenylpropyl Polar Phase: Investigation using RP-HPLC. J Chromatogr Sci 2009, 47, 247-253. [285] Needham, S. R., Brown, P. R., The high performance liquid chromatography electrospray ionization mass spectrometry analysis of diverse basic pharmaceuticals on cyanopropyl and pentafluorophenylpropyl stationary phases. J Pharmaceut Biomed 2000, 23, 597-605. [286] Needham, S. R., Brown, P. R., Duff, K., Phenyl ring structures as stationary phases for the high performance liquid chromatography electrospray ionization mass spectrometric analysis of basic pharmaceuticals. Rapid Commun Mass Spectrom 1999, 13, 2231-2236. [287] Needham, S. R., Brown, P. R., Duff, K., Bell, D., Optimized stationary phases for the high-performance liquid chromatography-electrospray ionization mass spectrometric analysis of basic pharmaceuticals. J Chromatogr A 2000, 869, 159-170. [288] Needham, S. R., Jeanville, P. M., Brown, P. R., Estape, E. S., Performance of a pentafluorophenylpropyl stationary phase for the electrospray ionization high-performance liquid chromatography-mass spectrometry-mass spectrometry assay of cocaine and its metabolite ecgonine methyl ester in human urine. J Chromatogr B 2000, 748, 77-87. [289] Euerby, M., Petersson, P., Chromatographic classification and comparison of commercially available reversed-phase liquid chromatographic columns using principal component analysis. J Chromatogr A 2003, 994, 13-36. [290] Neue, U. D., VanTran, K., Iraneta, P. C., Alden, B. A., Characterization of HPLC packings. J Sep Sci 2003, 26, 174-186. [291] Neue, U. D., Phoebe, C. H., Tran, K., Cheng, Y. F., Lu, Z. L., Dependence of reversed-phase retention of ionizable analytes on pH, concentration of organic solvent and silanol activity. J Chromatogr A 2001, 925, 49-67. [292] Mant, C. T., Burke, T. W. L., Black, J. A., Hodges, R.
S., Effect of Peptide-Chain Length on Peptide Retention Behavior in Reversed-Phase Chromatography. J Chromatogr 1988, 458, 193-205. [293] Krokhin, O. V., Craig, R., Spicer, V., Ens, W., et al., An improved model for prediction of retention times of tryptic peptides in ion pair reversed-phase HPLC - Its application to protein peptide mapping by off-line HPLC-MALDI MS. Mol. Cell. Proteomics 2004, 3, 908-919. [294] Kyte, J., Doolittle, R. F., A Simple Method for Displaying the Hydropathic Character of a Protein. J Mol Biol 1982, 157, 105-132. [295] Bjellqvist, B., Hughes, G. J., Pasquali, C., Paquet, N., et al., The Focusing Positions of Polypeptides in Immobilized pH Gradients Can Be Predicted from Their Amino-Acid-Sequences. Electrophoresis 1993, 14, 1023-1031. [296] Ikai, A., Thermostability and Aliphatic Index of Globular-Proteins. J. Biochem. 1980, 88, 1895-1898. [297] Irudayam, S. J., Henchman, R. H., Long-range hydrogen-bond structure in aqueous solutions and the vapor-water interface. J. Chem. Phys. 2012, 137.