What is ProteinSpecific predictor?
ProteinSpecific predictor is a tool developed by the Clinical and Translational Bioinformatics research group at the Vall d'Hebron Institute of Research. It predicts the functional significance of nonsynonymous variants in more than 80 proteins involved in diseases with a hereditary component.
How can I predict my variant?
You can submit your variant on our query page by indicating the native amino acid, the residue position and the mutated amino acid. Afterwards, you will be redirected to the prediction page.
Why is my variant not accepted?
We use UniProt as the reference database, a comprehensive, high-quality and freely accessible resource of protein sequence and functional information. In particular, we use the canonical isoform, generally the most prevalent one. If you are working with another isoform, or with a different reference database for protein sequences such as NCBI or Ensembl, the sequence may differ slightly, so the native amino acid or the position of your variant may not match our reference.
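As a rough illustration of why a variant can be rejected, the sketch below checks a variant against a reference sequence. The function name `check_variant`, the error messages and the toy sequence are hypothetical; the actual validation performed by the query page is not documented here and may differ.

```python
# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")

def check_variant(sequence, native, position, mutated):
    """Return None if the variant fits the sequence, else a reason string.

    Hypothetical helper: this is an assumed check, not the tool's own code.
    """
    if native not in AMINO_ACIDS or mutated not in AMINO_ACIDS:
        return "unknown amino acid code"
    if native == mutated:
        return "native and mutated amino acids are identical"
    if not 1 <= position <= len(sequence):
        return "position outside the sequence"
    found = sequence[position - 1]  # residue positions are 1-based
    if found != native:
        return f"sequence has {found} at position {position}, not {native}"
    return None

# Made-up sequence for illustration only, not a real UniProt entry.
toy_sequence = "MKTAYIAKQR"
print(check_variant(toy_sequence, "K", 2, "E"))  # variant matches the sequence
print(check_variant(toy_sequence, "R", 2, "E"))  # native residue mismatch
```

A mismatch of the second kind is what you would typically see when the variant was numbered on a non-canonical isoform or on a sequence from another database.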
Which metrics does a prediction have?
We provide you with three metrics:
Label: each variant is classified as pathogenic or neutral according to its functional consequence.
Score: the numerical score of the functional consequence of the variant. It ranges continuously from 0 to 1, where 0 corresponds to a neutral variant and 1 to a pathogenic variant. The threshold between pathogenic and neutral variants is 0.5.
Reliability: measures the accuracy of the prediction. It ranges continuously from 0 to 1, where 1 corresponds to a fully reliable prediction.
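The relationship between the score and the label can be sketched as follows. The function `label_from_score` is hypothetical, and the treatment of a score of exactly 0.5 is an assumption (here it is labelled neutral), since only the threshold value itself is stated above.

```python
def label_from_score(score, threshold=0.5):
    """Map a score in [0, 1] to a label using the 0.5 threshold.

    Assumption: scores strictly above the threshold are pathogenic;
    the behaviour at exactly 0.5 is not specified by the tool.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    return "pathogenic" if score > threshold else "neutral"

print(label_from_score(0.91))
print(label_from_score(0.12))
```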
How are these predictions calculated?
These predictions are calculated by a machine learning algorithm previously trained with a set of already known variants. To develop the predictor, we followed these steps:
Collect the pathogenic and neutral variants of the proteins
Search the features able to discriminate between pathogenic and neutral variants
Build the model by training the neural network algorithm with a set of features of known variants
Estimate the model performance by cross-validation to ensure reliable results
Riera et al., Human Mutation, 2016
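To make the evaluation step more concrete, here is a minimal sketch of leave-one-out cross-validation scored with the Matthews Correlation Coefficient. It is not the published pipeline: a toy nearest-centroid classifier on a single made-up feature stands in for the neural network, and all function names and data are hypothetical.

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def nearest_centroid_predict(train, x):
    """Toy stand-in classifier: pick the class whose feature mean is closest."""
    best_label, best_dist = None, float("inf")
    for label in sorted({y for _, y in train}):
        values = [f for f, y in train if y == label]
        dist = abs(x - sum(values) / len(values))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def leave_one_out_mcc(data):
    """Hold out each variant once, train on the rest, and score with MCC."""
    tp = tn = fp = fn = 0
    for i, (x, y) in enumerate(data):
        train = data[:i] + data[i + 1:]  # every variant except the held-out one
        pred = nearest_centroid_predict(train, x)
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 0 and y == 0:
            tn += 1
        elif pred == 1 and y == 0:
            fp += 1
        else:
            fn += 1
    return mcc(tp, tn, fp, fn)

# Made-up (feature, label) pairs: 1 = pathogenic, 0 = neutral.
toy_data = [(0.1, 0), (0.2, 0), (0.3, 0), (0.7, 1), (0.8, 1), (0.9, 1)]
print(leave_one_out_mcc(toy_data))
```

MCC ranges from -1 to 1, with 0 corresponding to random predictions, which makes it a convenient single-number summary when the pathogenic and neutral classes are unbalanced.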
What is the performance of ProteinSpecific predictor?
The ProteinSpecific predictor has been evaluated by leave-one-out cross-validation and compared to state-of-the-art predictors. The Matthews Correlation Coefficient (MCC) per gene and predictor is:
On average, the performance metrics per predictor are:
Can I download all the predictions for my protein?
You can download all the pre-calculated predictions to run your own queries. The file is in CSV format and contains the following columns:
| 1 | Gene | HGNC official gene symbol |
| 2 | Protein | UniProt accession number |
| 3 | Variant | Nonsynonymous variant from the canonical isoform |
| 4 | Prediction | Predicted functional consequence of the variant |
| 5 | Score | Numerical score of the pathogenic prediction |
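A downloaded file could be queried along these lines. This is only a sketch: it assumes a comma-delimited file with a header row whose names match the columns listed above, and the sample rows (gene symbol, accession and variants) are invented placeholders, not real predictions.

```python
import csv
import io

# Invented sample data; the header names are assumed from the column list.
sample = """Gene,Protein,Variant,Prediction,Score
GENE1,P00000,K2E,pathogenic,0.91
GENE1,P00000,A4V,neutral,0.12
"""

def load_predictions(handle):
    """Read prediction rows and convert the Score column to float."""
    rows = []
    for row in csv.DictReader(handle):
        row["Score"] = float(row["Score"])
        rows.append(row)
    return rows

rows = load_predictions(io.StringIO(sample))
# Example query: list the variants predicted as pathogenic.
pathogenic = [r["Variant"] for r in rows if r["Prediction"] == "pathogenic"]
print(pathogenic)
```

For a real download, replace `io.StringIO(sample)` with an open file handle, for example `open(path, newline="")`, where `path` points to your copy of the file.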