PhosphoVariant
Protein selection
Variation selection
Kinase selection
Specificity option

 

PredPhospho
Input sequence
Kinase selection
Specificity option

 

Result analysis

1. When the user’s query is a protein ID.

Possible PhosphoVariant(s) in the protein.

Predicted phosphorylation sites

Confirmed phosphorylation sites in the protein.

2. When the user’s query is a variation ID.

Variation information

Possible changed phosphorylation site(s) by this variation

Predicted phosphorylation sites flanking on the variation in the original protein

Confirmed phosphorylation sites flanking on the variation.

 

 

Protein selection

Enter a Swiss-Prot or reference sequence protein ID to obtain the phosphovariants predicted in that protein. Select the database first, then enter the protein ID.

 

Variation selection

Enter a SwissVariant or dbSNP ID to obtain detailed information about the variation and to judge whether it is a phosphovariant. Select the database first, then enter the variation ID. For a SwissVariant ID such as ‘VAR_020695’, you may enter only the numeric part, ‘020695’.
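The two accepted forms of a SwissVariant ID can be reduced to one before submitting a query. The following is a minimal sketch of that normalization; the function name `normalize_swissvariant_id` is hypothetical, and how the server itself parses the ID is not documented here.

```python
import re

def normalize_swissvariant_id(raw):
    """Return the numeric part of a SwissVariant ID.

    Accepts either the full form ('VAR_020695') or the bare number
    ('020695'). Leading zeros are kept, because the result is a string.
    This is an illustrative helper, not part of the PhosphoVariant server.
    """
    match = re.fullmatch(r"(?:VAR_)?(\d+)", raw.strip(), flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"not a SwissVariant ID: {raw!r}")
    return match.group(1)

print(normalize_swissvariant_id("VAR_020695"))
print(normalize_swissvariant_id("020695"))
```

Both calls yield the same string, so either form can be typed into the ID field.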

Input sequence

Enter the protein sequence of interest, written in the 20 standard amino acid letters, in the text area. The sequence may include numbers and spaces.

The following example is the sequence of bovine myelin basic protein (MBP). The known phosphorylation sites of this protein are listed in the table under the sequence.

Example:

AAQKRPSQRSKYLASASTMDHARHGFLPRHRDTGILDSLGRFFGSDRGAPKRGSGKDGHHAARTTHYGSLPQKAQGHRPQ 80

DENPVVHFFKNIVTPRTPPPSQGKGRGLSLSRFSWGAEGQKPGFGYGGRASDYKSAHKGLKGHDAQGTLSKIFKLGGRDS 160

RSGSPMARR
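Because the input may contain position numbers and spaces, as in the example above, a sequence can be cleaned before further processing. The sketch below assumes that digits and whitespace are simply discarded and that anything outside the 20 standard letters is an error; the function name `clean_sequence` is hypothetical, and the server's actual handling may differ.

```python
# The 20 standard amino acid one-letter codes.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")

def clean_sequence(text):
    """Strip digits and whitespace, then check the remaining letters.

    Assumption: numbers and spaces carry no meaning and are dropped.
    Raises ValueError if a non-standard letter remains.
    """
    letters = [c.upper() for c in text if not (c.isdigit() or c.isspace())]
    invalid = sorted({c for c in letters if c not in AMINO_ACIDS})
    if invalid:
        raise ValueError(f"unexpected characters: {invalid}")
    return "".join(letters)

print(clean_sequence("AAQKRPSQRS 10\nKYLASASTMD 20"))
```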

Kinase selection

Manning et al. identified 518 protein kinase genes in the human genome sequence using hidden Markov model (HMM) profiles and confirmed the identities of more than 90% of them by cDNA cloning. They also classified the protein kinase superfamily into 9 broad groups, subdivided into 134 families and 204 subfamilies, by comparing the sequences of kinase catalytic domains. We grouped the phosphorylated site sequences by their kinases and built the classifiers in a kinase-specific manner. Because the phosphorylated sequence data currently available in public databases are limited, we could build classifiers for seven kinase groups (AGC, Atypical, CAMK, CK1, CMGC, STE, and TK) and eighteen kinase families (AKT, CAMK2, CAMKL, CDK, CK1, CK2, GSK, IKK, JakA, MAPK, PDGFR, PIKK, PKA, PKC, RSK, Src, STE20, and Syk).

You can choose the kinase groups or kinase families whose phosphorylation sites you want to predict.

 

 

Specificity option

We provide a specificity option for each kinase model, implemented by adjusting the cutoff of the SVMs. The ‘95%’, ‘97%’, ‘98%’, and ‘99%’ options mean that each kinase model’s specificity is above 95%, 97%, 98%, and 99%, respectively. A higher specificity level predicts phosphorylation sites or phosphovariants with higher specificity but lower sensitivity. You can choose the specificity level in the select field.
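The relation between a score cutoff and specificity can be illustrated with a small sketch. It assumes a site is predicted as phosphorylated when its score is strictly above the cutoff, and it picks the cutoff from the scores of known negative examples; the scores and function names are hypothetical, and this is not necessarily how the PredPhospho cutoffs were derived.

```python
import math

def specificity(negative_scores, cutoff):
    """Fraction of negatives correctly rejected (score <= cutoff)."""
    rejected = sum(1 for s in negative_scores if s <= cutoff)
    return rejected / len(negative_scores)

def cutoff_for_specificity(negative_scores, target):
    """Smallest observed negative score that reaches the target specificity.

    With the negatives sorted ascending, rejecting the lowest k of n
    gives specificity k/n, so k = ceil(target * n) is enough.
    """
    ordered = sorted(negative_scores)
    k = max(1, math.ceil(target * len(ordered)))
    return ordered[k - 1]

# Hypothetical SVM decision values for known non-phosphorylated sites.
negatives = [-2.0, -1.5, -1.0, -0.5, -0.2, 0.0, 0.1, 0.3, 0.8, 1.2]
for target in (0.95, 0.97, 0.98, 0.99):
    cutoff = cutoff_for_specificity(negatives, target)
    print(target, cutoff, specificity(negatives, cutoff))
```

Raising the target can only move the cutoff upward, which is why fewer true sites pass it and sensitivity drops.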

Possible PhosphoVariant(s) in the protein.

This section lists the phosphovariants predicted in the query protein under the options you selected. Click a variation ID to see detailed information about that variation.

 

 

Predicted phosphorylation sites

The phosphorylation sites predicted by PredPhospho are marked in the original sequence. The table that follows shows the kinase information and prediction score for each phosphorylation site.

 

 

Confirmed phosphorylation sites in the protein

Confirmed phosphorylation sites are phosphorylation sites whose existence has been demonstrated. They are marked in the original sequence. The table that follows shows the kinase that recognizes each site. We follow the kinase nomenclature of Manning et al. and write kinase names hierarchically, as in CMGC::CDK::CDC2 (CMGC is the kinase group, CDK the family, and CDC2 the subfamily). The subfamily name is sometimes omitted because some subfamily names could not be found in the original article.
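A hierarchical kinase name of this form can be split into its levels when processing the result table. The following is a minimal sketch; the `KinaseName` type and `parse_kinase_name` function are hypothetical, and the name AGC::PKA is used only to show a missing subfamily.

```python
from typing import NamedTuple, Optional

class KinaseName(NamedTuple):
    group: str
    family: Optional[str]
    subfamily: Optional[str]

def parse_kinase_name(name):
    """Split 'GROUP::FAMILY::SUBFAMILY' into its parts.

    Missing trailing levels (for example an omitted subfamily)
    are returned as None.
    """
    parts = [p.strip() for p in name.split("::")]
    if not 1 <= len(parts) <= 3 or not all(parts):
        raise ValueError(f"malformed kinase name: {name!r}")
    padded = parts + [None] * (3 - len(parts))
    return KinaseName(*padded)

print(parse_kinase_name("CMGC::CDK::CDC2"))
print(parse_kinase_name("AGC::PKA"))
```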

 

 

Variation information

This section shows brief information about the query variation.

           

 

 

Possible changed phosphorylation site(s) by this variation

This table shows the phosphorylation sites changed by the query variation.

           

As a result of the variation, some kinases (Added kinase) newly recognize the phosphorylation site, while other kinases (Removed kinase) lose their recognition sites. If the affected phosphorylation sites are confirmed ones, the related predicted phosphovariants are more reliable. For the types of phosphovariants, please refer to this.
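The Added kinase and Removed kinase columns amount to a set comparison between the kinases predicted for a site before and after the variation. The sketch below illustrates that comparison; the function name and the kinase lists are hypothetical examples, not server output.

```python
def compare_kinases(original_kinases, variant_kinases):
    """Return the kinases gained and lost at a site after a variation.

    'added'   : predicted only for the variant sequence
    'removed' : predicted only for the original sequence
    """
    original = set(original_kinases)
    variant = set(variant_kinases)
    return {
        "added": sorted(variant - original),
        "removed": sorted(original - variant),
    }

# Hypothetical predictions for one site, before and after a variation.
result = compare_kinases(["PKA", "PKC"], ["PKC", "CK2"])
print(result)
```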

 

 

Predicted phosphorylation sites flanking on the variation in the original protein

The phosphorylation sites predicted by PredPhospho are marked in a 21-amino-acid peptide of the original sequence centered on the query variation. The table that follows shows the kinase information and prediction score for each phosphorylation site.
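A 21-amino-acid peptide centered on a variation corresponds to the varied residue plus 10 residues on each side. The following sketch extracts such a window from a sequence; the function name is hypothetical, positions are assumed to be 1-based, and shortening the window near the protein termini is an assumption rather than documented server behavior.

```python
def flanking_window(sequence, position, flank=10):
    """Return the peptide centered on a 1-based position.

    The window holds up to `flank` residues on each side (21 residues
    in total for flank=10) and is shortened at the sequence ends.
    """
    if not 1 <= position <= len(sequence):
        raise ValueError("position is outside the sequence")
    start = max(position - 1 - flank, 0)       # 0-based, inclusive
    end = min(position + flank, len(sequence))  # 0-based, exclusive
    return sequence[start:end]

# A toy 26-letter string, used only to make the indexing easy to follow.
toy = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
print(flanking_window(toy, 13))
print(flanking_window(toy, 3))
```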

           

 

 

Confirmed phosphorylation sites flanking on the variation.

Confirmed phosphorylation sites are phosphorylation sites whose existence has been demonstrated. They are marked in a 21-amino-acid peptide of the original sequence centered on the query variation. The table that follows shows the kinase that recognizes each site.