How To Find Isoelectric Point Of Protein

Methods Mol Biol. Author manuscript; available in PMC 2022 Jan 1.

Published in final edited form as:

PMCID: PMC2754287

NIHMSID: NIHMS117568

ProMoST: A tool for calculating the pI and molecular mass of phosphorylated and modified proteins on 2 dimensional gels

Abstract

Protein modifications such as phosphorylation are often studied by 2-dimensional gel electrophoresis, since the perturbation in the protein's pI value is readily detected by this method. It is important to be able to calculate the changes in pI that specific post-translational modifications cause and to visualize how these changes will affect protein migration on 2D gels. To address this need, we have developed ProMoST. ProMoST is a freely accessible web-based application that calculates and displays the mass and pI values for proteins, either identified by accession number in the NCBI database or submitted as FASTA-format sequences.

Keywords: Two-dimensional gel electrophoresis, protein modification, phosphorylation

1. Introduction

One of the most successful methods for detecting and analyzing protein posttranslational modifications (PTMs) has been two-dimensional gel electrophoresis (2D-GE). Since many PTMs, such as phosphorylation, introduce charged groups into the protein, there is often a detectable change in the position of the protein on a 2D gel. Although the change in the mass of the protein due to the PTM is often too small to be easily detected by standard sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE), the modification can cause a change in the net charge of the protein, leading to a change in the isoelectric point, or pI, of the protein. The first dimension of the 2D gel, usually shown horizontally, is the isoelectric focusing dimension; changes in protein pI are reflected as changes in the horizontal position of the protein spot within the 2D pattern of spots. Often it is observed that there are 'trains' of spots on the gel that are presumably formed by multiple versions of the same protein that differ in isoelectric point due to increasing numbers of posttranslational modifications such as phosphorylation or deamidation (1, 2).

Although 2D-GE is a sensitive method for determining that posttranslationally modified forms of proteins are present, it does not directly indicate what the modification is or how many of the residues in the protein are modified. Since proteins vary greatly in their ability to buffer the change in pI due to posttranslational modifications, examining these results more closely requires calculating the predicted pI changes caused by the modification in the context of the protein sequence.

To meet this need, we have developed ProMoST, a web-based application that allows users considerable freedom in calculating the pI values of modified and unmodified proteins (3). ProMoST has predefined modifications so that casual users are able to rapidly determine the predicted pI values of modified proteins and peptides. ProMoST also provides options for more advanced users, allowing them to define additional custom modifications, alter the pKa values for the defined modifications, and even change the default pKa values for the charged amino acids used to calculate pI values. The results of the calculations can be displayed both in a tabular format and as a graphic representation of the migration of the protein on a 2D gel.

1.1 pI

The pKa values of the side chains of the twenty common amino acids that comprise most proteins vary from approximately pH 2.8 to pH 11.2 (4). Three amino acids are positively charged under physiological conditions (lysine, arginine, and histidine) and are termed basic amino acids, and two amino acids are negatively charged under physiological conditions (glutamic acid and aspartic acid) and are termed acidic amino acids. In addition, the amino (N) and carboxyl (C) termini of the protein can also be charged. To determine the total charge of a protein at a given pH, the fractional number of positive and negative charges for each of the amino acids in the protein's sequence is determined, and the sum of these partial charges is equal to the charge on the protein.
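The summation described above can be sketched in a few lines of Python. This is a minimal illustration, not ProMoST's code: the function name `net_charge` and the pKa numbers are assumptions (textbook-style values chosen for the example, not the tool's defaults).

```python
from collections import Counter

# Illustrative pKa values. These are ASSUMED textbook-style numbers,
# not the defaults used by ProMoST.
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0, "N_term": 9.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C_term": 3.1}


def net_charge(sequence: str, ph: float) -> float:
    """Sum the fractional charges of all ionizable groups at the given pH."""
    counts = Counter(sequence.upper())
    counts["N_term"] = 1  # one free amino terminus
    counts["C_term"] = 1  # one free carboxyl terminus
    charge = 0.0
    for group, pka in POSITIVE_PKA.items():
        # Fraction of the group that is protonated and carries +1.
        charge += counts[group] / (1.0 + 10 ** (ph - pka))
    for group, pka in NEGATIVE_PKA.items():
        # Fraction of the group that is deprotonated and carries -1.
        charge -= counts[group] / (1.0 + 10 ** (pka - ph))
    return charge
```

Calling `net_charge("KKRH", 7.0)` gives a positive value and `net_charge("DDEE", 7.0)` a negative one, matching the basic and acidic labels used in the text.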

The isoelectric point, or pI, of the protein is the pH value at which the total charge on the protein is zero. At this pH value the negative and positive charges of the protein are equal and the protein carries no net charge. The pI of the protein therefore gives an indication of whether the protein will carry a net positive or negative charge under physiological conditions. Proteins that have a pI > 7.0 are considered to be basic proteins and proteins that have a pI < 7.0 are considered to be acidic proteins.

In addition to giving an indication of the charge of the protein, the pI is also a good indicator of the solubility of the protein at a given pH. One of the most important aspects of a protein's physicochemical properties that determines solubility is its charge. Thus at a pH equal to the pI of the protein, the protein carries no net charge and is therefore usually at its least soluble. Manipulating protein charge, either by changing pH or by adding salt to neutralize charge, is the basis for many of the early methods for protein purification by differential solubility (5).

The loss of charge at a protein's pI is also part of the fractionation procedure during the first dimension of the 2D gel, which is based on isoelectric focusing (6). Proteins are introduced to a strip on which a pH gradient has been established, and in the presence of a high electric field they migrate to the position on the strip at which the protein has a net neutral charge, where migration stops. This pH value corresponds to the pI of the protein. Thus the final migration position of the protein in the horizontal dimension of the 2D gel is determined by the pI value of the protein.

1.2 Modifications and mutations alter pI

The fact that the migration of proteins in the isoelectric focusing dimension of 2D-GE is very sensitive to changes in pI makes 2D-GE a valuable technique for identifying modifications and mutations (1). Modifications such as phosphorylation that add highly charged groups to the protein can cause easily detectable changes in pI and therefore in the mobility of the protein in the isoelectric focusing dimension. Similarly, the changes in protein mass and pI due to mutations that cause a net loss or gain of charge on the protein, by altering the number of charged acidic and basic residues present in the protein, can also be calculated and displayed. The amount of mobility shift that is observed due to modification or mutation depends on three factors. First, the pKa value for the modification or change induced by mutation is very important to the final change in the protein pI. Modifications, such as phosphorylation, that introduce a group with either a strongly acidic or basic pKa will have a greater effect than those with a pKa value closer to neutrality. Similarly, a mutation that causes a change from an acidic residue to a basic residue will lead to a larger change in pKa than a change from a charged residue to a neutral residue. The larger pKa alteration will lead to a larger change in protein pI and therefore a larger mobility shift in the isoelectric focusing dimension of the 2D gel. Second, the number of modifications or residue changes will also have an impact on the mobility shift observed on the gels. Often for modifications, a train of spots will be observed. Interestingly, the shift in mobility is often not constant and the distance between spots can vary. This is explained by the third factor that determines the magnitude of the observed pI shift: the charge buffering capacity of the protein at a given pH.

Since different proteins are composed of different mixtures of positively and negatively charged amino acids depending on their primary amino acid sequence, the charge titration profile for each protein is unique. Thus the extent to which a modification changes the pI of the protein, and therefore its mobility, differs from protein to protein and from one modification to the next, because the slope of the charge titration profile changes with pH. Figure 1 shows an example of this for human cyclin-dependent kinase 2 (CDK2). Figure 1 Panel A shows the titration of the unmodified protein. Panels B–D show the titrations with one, two, or three phosphate groups. For comparison, Figure 1 Panel E shows the 2D gel spot positions calculated by ProMoST. Note that the magnitude of the shift in the calculated spot position due to additional phosphorylation varies from spot to spot. This variance correlates with the titration curves for CDK2 shown in Figure 1, Panels A–D.

The correlation between the calculated mass and pI for a protein and an actual 2D gel is shown in Figure 2. Proteins were isolated from cultured rat fibroblast cells and analyzed by 2D-GE (Figure 2, Panel A). Proteins were extracted from gel spots, digested with trypsin, and analyzed by MALDI as previously described (7). Protein identification was carried out by peptide mass fingerprinting (PMF) using the Mascot program (8). Figure 2, Panel A shows the stained image of the 2D gel. Figure 2, Panel B shows the ProMoST-produced graphic showing the calculated relative positions of unmodified and phosphorylated vimentin. Figure 2, Panel C shows the composite of the stained image of the 2D gel with the calculated positions of unmodified and phosphorylated vimentin indicated. MALDI analysis of the spots confirmed their identification as containing vimentin.

1.3 Calculation of pI values

The charge state of the protein at a given pH is the sum of the negative and positive charges on the charged residues and on the C-terminal and N-terminal residues of the protein. To determine the pI value for the protein, the pH value at which the charge state of the protein is equal to zero must be found. There are two basic approaches to computing the pI value for a protein. One method is to construct a model of the charge state of the protein as a series of differential equations and then solve the equations for the condition of zero net charge. While this method provides an exact determination of the pI value, it can be computationally expensive, and an exact determination is not required for practical work.

A second approach is to determine, by successive approximations, the pH value at which the charge on the protein is neutral to within a small tolerance. In this method, a starting pH is chosen, usually pH 7, and the charge on the protein is calculated based on the pKa values for each of the charged residues and the N- and C-terminal amino acids of the protein. If the net charge on the protein at pH 7 is positive, the charge calculation is repeated with an increased pH value. If the net charge at pH 7 is negative, the charge calculation is repeated with a decreased pH value. After the first calculation, the pH is changed by 3.5 units, bringing the pH value to 3.5 or 10.5, and the calculation is repeated. If the charge has the same polarity as in the pH 7.0 calculation, the pH is changed by an additional 3.5 units, bringing the pH to 0 or 14, and the calculation is repeated. If the polarity of the charge switches, then the pH change is halved (1.75 units) and the calculation is repeated. This iterative process of calculation, changing the pH value by half of the previous change, and recalculation is repeated until the net charge on the protein at the pH used for calculation is less than a preset tolerance value, usually ± 0.002, or the change in pH value is less than ± 0.01 pH units. This method requires at most 12 rounds of calculation but typically converges in six or fewer, making it far faster than the exact method while yielding an answer of sufficient precision for practical work.
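The iteration can be sketched as follows. This is a simplified illustration rather than the published implementation: it halves the step on every round (a plain bisection of the pH 0 to 14 range), and the names `find_pi` and `toy_charge` are hypothetical. The tolerances default to the values quoted above.

```python
from typing import Callable


def find_pi(net_charge: Callable[[float], float],
            charge_tol: float = 0.002,
            min_step: float = 0.01) -> float:
    """Estimate the pI by stepping the pH toward zero net charge.

    Simplification (assumed): the step is halved after every round.
    """
    ph, step = 7.0, 3.5
    while True:
        charge = net_charge(ph)
        # Stop when the charge is within tolerance or the step is tiny.
        if abs(charge) < charge_tol or step < min_step:
            return ph
        # Positive charge means the pH is below the pI, so move up.
        ph += step if charge > 0 else -step
        step /= 2.0


def toy_charge(ph: float) -> float:
    """Toy molecule: one basic group (pKa 9.0), one acidic group (pKa 3.0)."""
    positive = 1.0 / (1.0 + 10 ** (ph - 9.0))
    negative = 1.0 / (1.0 + 10 ** (3.0 - ph))
    return positive - negative
```

For the toy molecule the exact pI is 6.0, the midpoint of the two pKa values. Because the titration curve is nearly flat around that point, the default charge tolerance stops the search a little early; tightening both tolerances brings the estimate closer to 6.0.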

1.4 ProMoST algorithm

The ProMoST application is based on the successive approximation method for calculating protein pI values. The major difference is that, in addition to the standard acidic amino acid residues (glutamic acid and aspartic acid), ProMoST also considers the pKa values of cysteine and tyrosine when calculating unmodified protein pI values. To calculate the pI values for modified proteins, the number and pKa values of the modifications are included in the calculation. For modifications such as phosphorylation, in which there are multiple charge states, the pKa values for all charge states are also included in the calculation. In some cases, it is necessary to remove from the calculation the pKa values for the unmodified amino acid. For instance, in the case of the phosphorylation of tyrosine, for each phosphate group added to the calculation, a tyrosine group is removed, since the OH group on tyrosine is both the position of the charge and the site of phosphorylation.
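The tyrosine bookkeeping can be illustrated with a short sketch. The function name and the pKa numbers below are placeholders assumed for the example; they are not ProMoST's defaults.

```python
from collections import Counter

# ASSUMED placeholder values, for illustration only.
TYR_PKA = 10.1               # tyrosine side-chain hydroxyl
PHOSPHATE_PKAS = (2.1, 7.2)  # the two charge states of a phosphate group


def acidic_groups_with_phosphotyrosine(sequence, n_phospho):
    """Return (count, pKa) pairs for the tyrosine-derived acidic groups.

    Each phosphate occupies one tyrosine hydroxyl, so one tyrosine is
    removed from the calculation for every phosphate that is added.
    """
    n_tyr = Counter(sequence.upper())["Y"]
    if n_phospho > n_tyr:
        raise ValueError("more phosphates requested than tyrosines present")
    groups = [(n_tyr - n_phospho, TYR_PKA)]
    # Both phosphate pKa values contribute once per phosphate group.
    groups += [(n_phospho, pka) for pka in PHOSPHATE_PKAS]
    return groups
```

For a sequence with two tyrosines and one phosphate, the result lists one remaining tyrosine and one group at each phosphate pKa.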

The first step in the analysis is the determination of the amino acid composition of the protein. The molecular mass of the protein is calculated by summing the high-precision monoisotopic or average isotopic masses of its amino acids and adding the monoisotopic or average isotopic mass of one water molecule, corresponding to an H at the N-terminal end and an OH group at the C-terminal end of the molecule.
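As a minimal sketch of the mass calculation, the following uses the standard monoisotopic residue masses; the table and the function name `monoisotopic_mass` are supplied for illustration and are not taken from ProMoST. The average-mass case would follow the same pattern with a different table.

```python
# Standard monoisotopic residue masses in daltons (amino acid minus water).
MONO_RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MONO = 18.01056  # H on the N terminus plus OH on the C terminus


def monoisotopic_mass(sequence: str) -> float:
    """Sum the residue masses and add one water for the free termini."""
    return sum(MONO_RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER_MONO
```

A sequence containing a letter outside the twenty standard residues raises a `KeyError`, which makes unexpected input easy to spot.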

The charge on the protein at a particular pH value is calculated using a method developed by Tabb (9). This approach works by determining the sum of the fractional charges for all the charged amino acids and modifications using the standard equations:

Positive ions:
Negative ions:

The partial charge contribution, P_Ci, of any species to the entire protein is equal to:

where n is the number of that particular amino acid or modification. The total charge on the protein is the sum of the partial charges:
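Written out in the standard Henderson-Hasselbalch form, the relations described above would look as follows. This is a hedged sketch: the symbol CR (the ratio of charged to uncharged forms of a group), the subscript on n, and the index sets are assumptions of this sketch rather than notation confirmed by the source.

```latex
\begin{align}
\text{Positive ions:}\quad CR_i &= 10^{\,pK_{a,i} - pH} \\
\text{Negative ions:}\quad CR_i &= 10^{\,pH - pK_{a,i}} \\
P_{C_i} &= n_i \,\frac{CR_i}{CR_i + 1} \\
\text{Net charge} &= \sum_{i \in \text{positive}} P_{C_i} \;-\; \sum_{i \in \text{negative}} P_{C_i}
\end{align}
```

Here each P_Ci is a magnitude, so the contributions of negatively charged groups are subtracted when forming the net charge.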

The pI is defined as the pH value at which all charges on the protein are balanced and the net charge is zero. To determine that pH value, an initial value of pH = 7 is tested and the net charge on the protein is calculated. Depending on the sign of the charge on the protein, a ΔpH value of 3.5 is added to or subtracted from the initial value of 7 and the charge on the protein is recalculated. The process of dividing the ΔpH value in half and changing its sign as needed is repeated until a net protein charge of magnitude less than 0.002 is obtained. This 'binary search' method rapidly converges on an accurate value for the protein pI.

2. Materials

The ProMoST web service provides an interface to a Perl-based CGI program that calculates protein molecular mass and pI values. The interface allows the user to choose the standard pK values for charged amino acids and modifications or to modify the values. The program takes protein accession numbers, names, or sequences as input and produces tables of values for modified and unmodified proteins. It also has a graphic output of a theoretical 2D gel.

2.1 Requirements to access ProMoST web application

ProMoST is a web-based application. Currently it is freely available at either http://proteomics.mcw.edu or http://halligan.us/promost.html. It has been tested with most modern web browsers and is compatible with Microsoft Internet Explorer versions 6 and 7, Safari versions 2 and 3, Firefox version 2, SeaMonkey version 1.1, and Opera version 9 running on the Nintendo Wii console.

2.2 Requirements to host ProMoST web application

If confidentiality and control of protein sequences is required, or especially heavy use is anticipated, an organization may wish to host a local copy of ProMoST. Upon request, ProMoST is distributed as a Perl CGI program and has been tested with the Apache web server. It depends on the CGI, Fcntl, and Spreadsheet::WriteExcel CPAN Perl modules as well as GD.pm and the GD libraries for generating the graphic output.

3. Methods

The ProMoST interface has been designed to allow for rapid use by occasional users while still meeting the demands of more advanced users. To do this, two versions of the interface to the program have been designed. The default interface is the basic interface, which allows the user to submit either protein sequence data or accession numbers and apply predefined modifications to generate tables and gel graphics. The advanced interface additionally allows the user to define additional modifications and alter the standard pKa values used in the calculation.

3.1 Basic interface

Figure 3, Panel A shows the default or 'simple' interface to ProMoST. A web interface is used to get protein information from the user. The user has a choice of either entering the protein data in a text box or uploading a file. The protein information can consist of a list of accession numbers or protein names, or protein sequences in FASTA format (10). The program dynamically determines the format of the input protein information. The accession numbers or protein names are used by the program to obtain the sequence data from a local copy of the NCBI nr protein database.

Figure 3. Panel A: ProMoST standard interface. Panel B: ProMoST advanced interface.

In addition to the normal charged amino acids, values for the common protein modifications (deamidation and phosphorylation) are also included. The user is able to specify the number of each modification that is to be considered. Thus, it is possible to examine the effects of a single phosphotyrosine or a series of up to 10 phosphotyrosines on the same protein molecule. The user can also choose to block the N terminus, the C terminus, or both ends of the protein.

3.2 Advanced interface

In addition to the standard interface for ProMoST, there is also an 'advanced' interface that allows more values to be customized (Figure 3, Panel B). The standard pK values for the charged amino acids (internal, C-terminal, and N-terminal) are presented by the web interface as a series of text boxes. The user can thereby examine and change any of the default pK values and also has the ability to exclude any of the charged amino acids from the pI calculation, as would be required if the residue were modified to an uncharged state.

3.3 Defining new PTMs

To extend the ability of ProMoST to calculate the pI of modified proteins, the web interface allows the user to add the name and pK values for up to three additional user-defined protein modifications. For each of the modifications, the user specifies a label to be used in the text output of the program. The user also indicates whether the modification will produce a negative or positive charge and supplies up to two pKa values.

3.4 Input Options

The standard interface allows for the input of either sequence data in FASTA format or accession numbers. This data can either be submitted in a text box on the web form or uploaded as a file. In addition to these standard input options, there are several extended options. Lines that are prefaced with a number sign (#) are treated as comments and ignored by ProMoST. This allows text files containing either FASTA sequences or accession numbers to be annotated. FASTA sequence header lines or accession numbers prefaced with a dollar sign ($) indicate sequences for which post-translational modifications should not be calculated or displayed. This is useful if the user wishes to examine the mobility of a protein in the context of other high-abundance proteins that are normally present in the sample. An example of this is shown in Figure 4 and Figure 5. Figure 4 shows an input text file and Figure 5 shows the resulting ProMoST output. The goal of this demonstration is to show the phosphorylation of the alpha 1-acid glycoprotein, an acute phase serum protein, in the context of other serum proteins.
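The comment and dollar-sign conventions can be illustrated with a small parser for accession-number lists. This is a hypothetical sketch of how such input might be read, not ProMoST's own parser; the function name and the accession strings in the usage example are made up.

```python
def parse_accession_lines(text):
    """Return (accession, unmodified_only) pairs from annotated input text.

    Lines starting with '#' are comments. A leading '$' flags an entry
    whose modified forms should not be calculated or displayed.
    """
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        unmodified_only = line.startswith("$")
        accession = line.lstrip("$").strip()
        entries.append((accession, unmodified_only))
    return entries


example = """# major serum proteins, shown unmodified
$ACC_0001
$ACC_0002
# protein of interest
ACC_0003
"""
print(parse_accession_lines(example))
```

Keeping the flag alongside each accession lets later steps decide per protein whether to generate the modified forms.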

Figure 4. Input text file for ProMoST analysis. Lines beginning with # are treated as comments and are ignored by ProMoST. The $ preceding the accession numbers for major serum proteins indicates that modified forms of these proteins should not be calculated or displayed.

Figure 5. Output of ProMoST using the text file from Figure 4 as input.

3.5 Output options

The output of the program is divided into two sections: the input information and the calculated results. The user can opt to have the input data displayed in the form of the actual input accession number or protein name, the deduced accession number, the sequence read from the database, or the composition of the protein. Any or all of these options can be active at the same time.

There are three main output modes, all of which can be used at the same time. Data can be displayed on the screen, or it can be displayed or saved as a text file or an Excel-format file. The screen display takes the form of an HTML table. The user has the option to choose from different columns of data. The molecular mass choices are the monoisotopic mass, the average isotopic mass, both, or neither. The protein data can be displayed as the input accession numbers, the deduced accession numbers, or the sequence description. The calculated pI is optional. The table also shows which modifications are active for each line in the table. An example of the output of ProMoST is shown in Figure 5.

Data can also be sent to either a tab-delimited text file or an Excel-format file. The files can be viewed on the screen with the browser (text files) or with Excel (Excel files). By using the browser's "Save link as" option, the user can directly save the text or Excel file to their computer.

A graphic gel image output is also available. The user can specify the molecular mass and pI range of the gel as well as the gel size. Proteins are plotted on the gel as ovals at the location of their calculated molecular mass and pI. The ovals are color coded for the modification. The parent, unmodified protein is plotted as an open oval and, in the case of multiple proteins on the same plot, is labeled with a protein index number that matches the table or file of values.
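One way to place a spot on such an image is sketched below. The scaling is an assumption of this sketch, not a documented property of ProMoST: pI is mapped linearly from left to right and mass on a log10 scale with the largest mass at the top, as is typical for SDS-PAGE. The function name `gel_position` is hypothetical.

```python
import math


def gel_position(pi, mass, pi_range, mass_range, width, height):
    """Map a (pI, mass) pair to (x, y) coordinates on a gel image.

    ASSUMED scaling: linear in pI along x, log10 in mass along y,
    with y = 0 at the top of the image (highest mass).
    """
    pi_min, pi_max = pi_range
    mass_min, mass_max = mass_range
    x = (pi - pi_min) / (pi_max - pi_min) * width
    log_span = math.log10(mass_max) - math.log10(mass_min)
    y = (math.log10(mass_max) - math.log10(mass)) / log_span * height
    return x, y


# A pI 6.5 protein on a pH 3-10 gel that is 700 units wide sits at x = 350.
print(gel_position(6.5, 50000.0, (3.0, 10.0), (10000.0, 100000.0), 700, 500))
```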

4. Notes

1 Other uses of ProMoST

Although ProMoST was primarily designed to calculate the mass and pI values for post-translational modifications and map them to 'theoretical' 2D gels, it can also be used to predict the mass, pI, and mobility of mutant and variant forms of proteins. Using the 'FASTA sequence' option, both the original and variant sequences can be entered and analyzed. This allows for the comparison of mutant, variant, or processed forms of a protein.

ProMoST can also be used to display the results from LC-MS/MS experiments in a graphic form. The Visualize program, also developed by the Medical College of Wisconsin NHLBI Proteomics Center, allows for the analysis of results of LC-MS/MS experiments. As one of its output options, it can create files of accession numbers that can be directly imported into ProMoST, so that the proteins identified by LC-MS/MS can be visualized as a pseudo-2D gel.

2 Troubleshooting and limitations

One of the most common errors encountered by ProMoST users is the failure of ProMoST to recognize their input sequence information. The usual cause of this problem is that the sequence information is improperly formatted. ProMoST requires that sequences be submitted in FASTA format. The key component of the FASTA sequence format is that each sequence must begin with a header line, which is designated by a greater-than symbol (>) at the start of the line. If a properly constructed header line is not present, then ProMoST fails to recognize the sequence.
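A quick pre-check of the input can catch a missing header before submission. The following is a minimal, hypothetical FASTA reader written for illustration; it is not part of ProMoST, and the name `read_fasta` is made up.

```python
def read_fasta(text):
    """Parse FASTA text into a {header: sequence} dictionary.

    Raises ValueError if sequence data appears before any '>' header
    line, the formatting problem described above.
    """
    records = {}
    header = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].strip()
            records[header] = ""
        elif header is None:
            raise ValueError("sequence data found before any '>' header line")
        else:
            # Sequence lines may be wrapped; join them under the header.
            records[header] += line
    return records
```

Running a file through a check like this locally makes it clear whether a rejected submission is a formatting problem or something else.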

It is important to remember that the pI values calculated by ProMoST are theoretical and approximate. While these values are useful for approximating the migration of proteins on 2D gels, they are not meant to be accurate for non-denatured proteins. In native proteins, it is possible that some of the potentially charged residues are not solvent accessible and therefore may not contribute to the protein's overall charge. Furthermore, microenvironments within the protein may allow amino acids to interact and influence the pKa of individual amino acid residues.

Acknowledgments

The 2D gel image of rat fibroblast proteins was graciously provided by I. Matus. This work was supported in part by the NHLBI Proteomics Center contract NIH-N01 HV-28182.

References

1. Gorg A, Weiss W, Dunn MJ. Current two-dimensional electrophoresis technology for proteomics. Proteomics. 2004;4:3665–3685.

2. Robinson NE, Robinson AB. Deamidation of human proteins. Proc Natl Acad Sci U S A. 2001;98:12409–12413.

3. Halligan BD, Ruotti V, Jin W, Laffoon S, Twigger SN, Dratz EA. ProMoST (Protein Modification Screening Tool): a web-based tool for mapping protein modifications on two-dimensional gels. Nucleic Acids Res. 2004;32:W638–W644.

4. Cantor CR, Schimmel PR. Biophysical chemistry. San Francisco: W. H. Freeman; 1980.

5. Fevold HL. In: "Amino acids and proteins; theory, methods, application". Greenberg DM, editor. Springfield, Ill: Thomas; 1951. p. ix, 950.

6. Dunn MJ. Two-dimensional gel electrophoresis of proteins. J Chromatogr. 1987;418:145–185.

7. Freed JK, Smith JR, Li P, Greene AS. Isolation of signal transduction complexes using biotin and crosslinking methodologies. Proteomics. 2007;7:2371–2374.

8. Perkins DN, Pappin DJ, Creasy DM, Cottrell JS. Probability-based protein identification by searching sequence databases using mass spectrometry information. Electrophoresis. 1999;20:3551–3567.

10. Lipman DJ, Pearson WR. Rapid and sensitive protein similarity searches. Science. 1985;227:1435–1441.

Source: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2754287/
