What is NeoVar predictor?

NeoVar predictor is a tool developed by the Clinical and Translational Bioinformatics research group at the Vall d'Hebron Institute of Research. It predicts the functional significance of missense variants in more than 50 proteins involved in neonatal diseases.

How can I analyze my variant?

You can submit your variant on our query page by indicating the native amino acid, the residue position and the mutated amino acid. Afterwards, you will be redirected to the prediction page.

Why is my variant not accepted?

We map your variant to the canonical isoform of the protein provided by UniProt (UniProt Consortium, 2023). If the native amino acid of your variant does not match the amino acid found at that position in the canonical isoform, the program returns an error message. This may happen if your sequence is based on a different isoform from UniProt, NCBI or Ensembl.
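The check described above can be sketched in a few lines of Python. This is only an illustration, not the code that runs on the server: the function name `check_variant`, the toy sequence and the error messages are all hypothetical.

```python
# Minimal sketch of a native-residue check (hypothetical helper, not the NeoVar code).
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard one-letter codes


def check_variant(sequence, native, position, mutated):
    """Return None if the variant fits the sequence, otherwise an error message."""
    native, mutated = native.upper(), mutated.upper()
    if native not in AMINO_ACIDS or mutated not in AMINO_ACIDS:
        return "unknown amino acid code"
    if native == mutated:
        return "native and mutated amino acids are identical"
    if not 1 <= position <= len(sequence):
        return "position outside the sequence"
    found = sequence[position - 1]  # positions are 1-based
    if found != native:
        return f"expected {found} at position {position}, got {native}"
    return None


# Toy sequence standing in for a canonical isoform (made up for this example).
toy_sequence = "MKTAYIAKQR"
ok = check_variant(toy_sequence, "A", 4, "V")
mismatch = check_variant(toy_sequence, "G", 4, "V")
```

Here `ok` is `None` because position 4 of the toy sequence is an alanine, while `mismatch` holds a message reporting that the submitted native residue does not match the sequence. If your variant is rejected, comparing your reference sequence with the UniProt canonical isoform at that position is usually the quickest way to find the cause.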

What information do we provide for a pathogenicity prediction?

We provide:

  • Predictor: the best predictor for the protein among PON-P2, PolyPhen-2, SIFT and CADD.

  • Label: the variant is classified as pathogenic or neutral according to its functional consequence.

  • Score: the numerical score of the functional consequence of the variant.

Score Plot

How are the pathogenicity predictions calculated?

These predictions are calculated by selecting the best predictor for each protein among PON-P2, PolyPhen-2, SIFT and CADD. To develop the predictor, we followed these steps:

  1. Collect the pathogenic and neutral variants of the proteins

  2. Run the predictors for this set of variants

  3. Estimate the performance of each predictor

  4. Select the best predictor per protein
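As an illustration of the last step, the sketch below picks one predictor per protein from a table of performance estimates. It assumes that the Matthews Correlation Coefficient (MCC) is the selection criterion and that a predictor with no estimate for a protein is skipped; both are assumptions made for this example. The protein names and MCC values are made up.

```python
# Minimal sketch of per-protein predictor selection (assumed criterion: highest MCC).
def select_best_predictors(mcc_by_protein):
    """Map each protein to the predictor with the highest MCC.

    Predictors whose MCC is None (no estimate available) are ignored.
    """
    best = {}
    for protein, scores in mcc_by_protein.items():
        available = {name: s for name, s in scores.items() if s is not None}
        if available:
            best[protein] = max(available, key=available.get)
    return best


# Made-up MCC values for two placeholder proteins.
toy_mcc = {
    "PROT_A": {"PON-P2": 0.71, "PolyPhen-2": 0.64, "SIFT": 0.58, "CADD": None},
    "PROT_B": {"PON-P2": 0.40, "PolyPhen-2": 0.66, "SIFT": 0.61, "CADD": 0.52},
}
best = select_best_predictors(toy_mcc)
```

With these toy values, `best` assigns PON-P2 to `PROT_A` and PolyPhen-2 to `PROT_B`, which shows how the chosen predictor can differ from one protein to the next.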

What is the performance of NeoVar predictor?

The NeoVar predictor has been evaluated and compared with state-of-the-art predictors. The Matthews Correlation Coefficient (MCC) per gene and predictor is:


On average, the performance metrics per predictor are:

Predictor    Sensitivity  Specificity  Accuracy  MCC    Coverage
NeoVar       0.946        0.907        0.924     0.803  71%
PON-P2       0.866        0.892        0.905     0.741  49%
PolyPhen-2   0.899        0.846        0.857     0.652  99%
CADD         0.961        0.67         0.788     0.571  27%
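For readers who want to compute the same kind of metrics on their own data, here is a minimal sketch using the standard definitions based on a confusion matrix, with pathogenic as the positive class. Coverage is assumed here to be the fraction of variants for which a predictor returns a prediction; the function name and the counts are invented for illustration.

```python
from math import sqrt


def performance(tp, fn, tn, fp, unpredicted=0):
    """Standard binary classification metrics from confusion-matrix counts.

    tp/fn: pathogenic variants predicted pathogenic/neutral.
    tn/fp: neutral variants predicted neutral/pathogenic.
    unpredicted: variants for which no prediction was returned (assumed
    definition of coverage: predicted / all variants).
    """
    predicted = tp + fn + tn + fp
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / predicted,
        # MCC is conventionally set to 0 when a row or column of the matrix is empty.
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
        "coverage": predicted / (predicted + unpredicted),
    }


# Invented counts: 50 pathogenic and 50 neutral variants predicted, 25 left unpredicted.
metrics = performance(tp=45, fn=5, tn=40, fp=10, unpredicted=25)
```

With these counts, sensitivity is 0.9, specificity is 0.8, accuracy is 0.85 and coverage is 0.8. MCC is included because, unlike accuracy, it remains informative when the pathogenic and neutral classes differ in size.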

Can I download all the predictions for my protein?

Yes, you can download all the pre-calculated predictions here. The file is in CSV format and has the following columns:

#  Field       Description
1  Gene        HGNC official gene symbol
2  Protein     UniProt accession number
3  Variant     Missense variant from the canonical isoform
4  Prediction  Predicted functional consequence of the variant
5  Score       Numerical score of the pathogenicity prediction
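Once downloaded, the file can be read with Python's standard `csv` module. The sketch below assumes that the file has a header row whose names match the field names listed above, that variants are written in a form such as `A4V`, and that the labels are `pathogenic` and `neutral`. The rows shown are invented placeholders, not real predictions.

```python
import csv
import io


def load_predictions(handle):
    """Read prediction rows from an open CSV file and convert Score to float."""
    rows = []
    for row in csv.DictReader(handle):
        row["Score"] = float(row["Score"])
        rows.append(row)
    return rows


# Invented content standing in for a downloaded file (assumed header names).
toy_csv = io.StringIO(
    "Gene,Protein,Variant,Prediction,Score\n"
    "GENE1,P00000,A4V,pathogenic,0.91\n"
    "GENE1,P00000,K8R,neutral,0.12\n"
)
predictions = load_predictions(toy_csv)

# Keep only the variants labelled as pathogenic.
pathogenic = [r["Variant"] for r in predictions if r["Prediction"] == "pathogenic"]
```

For a real file, replace the `io.StringIO` object with `open(path, newline="")`, where `path` points to your downloaded copy, and check the header row first in case the column names differ from the ones assumed here.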

Is there any additional information provided with the pathogenicity prediction?

Yes. The results page reports additional information about the variant, organized into the following sections: