This prediction tool applies four rule-based and machine-learning methods (using CART) to predict whether NXS/T sites in a sequence are glycosylated. It can also predict a modified sequence of the protein of interest.
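As a rough illustration of what an "NXS/T site" is, the sketch below lists candidate sites in a sequence with a regular expression. It is not the tool's prediction method (the rule-based and CART models are not reproduced here); it only enumerates candidates. The function name `find_sequons`, the toy sequence, and the usual convention that X is any residue except proline are assumptions for this example.

```python
import re

# Candidate N-glycosylation sequon: N, any residue except P, then S or T.
# Assumption: X excludes proline, as in the usual N-X-S/T convention.
# The lookahead lets overlapping candidates (e.g. in "NNSS") all be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")

def find_sequons(sequence):
    """Return 1-based positions of the N in every N-X-S/T candidate site."""
    seq = sequence.upper()
    return [m.start() + 1 for m in SEQUON.finditer(seq)]

# Hypothetical toy sequence, not a real protein.
print(find_sequons("MNGTANPSKNNSS"))  # [2, 10, 11]
```

The N at position 6 is skipped because it is followed by proline. Whether a listed candidate is actually glycosylated is the question the prediction methods address.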
As input, the N-glycosylation prediction utility requires the UniProt ID of the protein of interest (for example, P06280), and the modified-sequence prediction requires the sequence in FASTA format.
Please note that only one FASTA record can be uploaded at a time. Below is an example result from the tool:
This tool allows users to investigate the distribution of secondary structures, as well as the sequence annotations (features) from UniProtKB, using structural information from the Protein Data Bank (PDB) and annotations from UniProtKB.
The tool currently accepts the following features: Metal Site, Active Site, and N-glycosylation. Users must supply the data set as a UniProtKB flat file. For quick analysis, the data set should contain fewer than 20 UniProtKB proteins. The tool can handle up to 100 UniProtKB proteins, in which case users should provide an email address at which to be notified when the job is complete.
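Before uploading, the number of proteins in a flat file can be checked locally. The sketch below relies on the UniProtKB flat-file convention that every entry ends with a `//` line. The function names, the category labels, and the stub input are hypothetical, and how the tool treats files with more than 100 entries is an assumption.

```python
def count_entries(flat_file_text):
    """Count entries in UniProtKB flat-file text (each entry ends with '//')."""
    return sum(1 for line in flat_file_text.splitlines() if line.strip() == "//")

def classify_job(n_entries, quick_limit=20, max_limit=100):
    """Map an entry count to the handling described above."""
    if n_entries == 0:
        return "empty"
    if n_entries < quick_limit:
        return "quick"      # fewer than 20 proteins: quick analysis
    if n_entries <= max_limit:
        return "email"      # up to 100 proteins: provide an email address
    return "too_large"      # assumption: larger files are not accepted

# Stub input with two placeholder entries (not real UniProtKB records).
stub = "ID   EXAMPLE1\n//\nID   EXAMPLE2\n//\n"
n = count_entries(stub)
print(n, classify_job(n))  # 2 quick
```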
The result is reported in a table listing the UniProtKB accession number, UniProtKB position, corresponding PDB ID, mapped PDB position, and secondary structure. A pie chart is generated to illustrate the result.
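The pie chart presumably summarizes how often each secondary structure occurs among the reported rows. A minimal sketch of that tally is shown below. The row layout, the field names, the structure labels, and the placeholder values are all assumptions for illustration and are not output from the tool.

```python
from collections import Counter

def secondary_structure_shares(rows):
    """Return the percentage of rows falling in each secondary-structure class."""
    counts = Counter(row["secondary_structure"] for row in rows)
    total = sum(counts.values())
    return {ss: round(100 * n / total, 1) for ss, n in counts.items()}

# Placeholder rows with hypothetical field names and values.
rows = [
    {"accession": "ACC1", "position": 10, "pdb_id": "XXXX", "pdb_position": 8, "secondary_structure": "helix"},
    {"accession": "ACC1", "position": 42, "pdb_id": "XXXX", "pdb_position": 40, "secondary_structure": "helix"},
    {"accession": "ACC2", "position": 7, "pdb_id": "YYYY", "pdb_position": 7, "secondary_structure": "strand"},
    {"accession": "ACC2", "position": 90, "pdb_id": "YYYY", "pdb_position": 88, "secondary_structure": "coil"},
]
print(secondary_structure_shares(rows))
# {'helix': 50.0, 'strand': 25.0, 'coil': 25.0}
```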
Below is an example result from the tool:
Instructions to define a motif
The motif should be given in capital letters and can be defined as a regular expression (regex), as described in the table below:
^ | Matches the beginning of a line, e.g. the expression ^ATG matches ATGCGT but not CCATGTT. |
$ | Matches the end of a line, e.g. the expression ATG$ matches TGCATG but not CCATGTT. |
. | Matches any single character except a newline, e.g. the expression A.G matches ATG, AtG, A4G, and also A-G or A G. |
[ABC] | Matches the character A, B or C, e.g. the expression T[ABC]G matches TAG, TBG or TCG, but not TG or TABG. |
[0-9] | Matches any digit from 0 to 9. |
[^AB] | Matches any character except A and B, e.g. the expression N[^P]T matches NLT and NAT but not NPT. |
* | Matches 0 or more occurrences of the preceding expression, e.g. the expression A(CG)*T matches AT, ACGT, ACGCGT... |
+ | Matches 1 or more occurrences of the preceding expression, e.g. the expression A(CG)+T matches ACGT, ACGCGCGCGT... but not AT. |
? | Matches 0 or 1 occurrence of the preceding expression, e.g. the expression A(CG)?T matches AT or ACGT. |
{n} | Matches exactly n occurrences of the preceding expression, e.g. the expression N[S|T]{3} matches NSSS, NSTS, NTTT, NTTS. |
{n,m} | Matches at least n and at most m occurrences of the preceding expression. |
{n,} | Matches n or more occurrences of the preceding expression. |
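[S|T] | Matches either S or T, e.g. the expression N[S|T]T matches NST and NTT but not NT. In most regex engines the | inside brackets is also matched literally, so [ST] is the stricter equivalent. |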
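The rules in the table can be tried out locally before submitting a motif. The sketch below checks several of the table's examples with Python's `re` module. This assumes the tool's regex engine behaves like Python's for these constructs, which is not confirmed by the text above. The helper `matches` is hypothetical.

```python
import re

def matches(pattern, text):
    """True if the motif occurs anywhere in text (re.search semantics)."""
    return re.search(pattern, text) is not None

# (pattern, text, expected result), following the examples in the table.
checks = [
    ("^ATG", "ATGCGT", True),
    ("^ATG", "CCATGTT", False),
    ("ATG$", "TGCATG", True),
    ("ATG$", "CCATGTT", False),
    ("A.G", "A-G", True),
    ("T[ABC]G", "TBG", True),
    ("T[ABC]G", "TABG", False),
    ("N[^P]T", "NLT", True),
    ("N[^P]T", "NPT", False),
    ("A(CG)*T", "AT", True),
    ("A(CG)+T", "AT", False),
    ("A(CG)?T", "ACGT", True),
    ("N[ST]{3}", "NSTS", True),
    ("N[ST]T", "NT", False),
]
for pattern, text, expected in checks:
    assert matches(pattern, text) is expected, (pattern, text)

# Inside brackets, | is a literal character, so [S|T] also accepts "|".
print(matches("N[S|T]T", "N|T"))  # True
print(matches("N[ST]T", "N|T"))   # False
```

Protein sequences do not contain the `|` character, so `[S|T]` and `[ST]` give the same hits on sequence data. The difference only matters if the searched text can contain a literal `|`.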