Our software and server predict the functional effects of protein missense mutations by combining
sequence conservation within hidden Markov models (HMMs), which represent the alignment of homologous sequences and
conserved protein domains, with "pathogenicity weights", which represent the overall tolerance of the protein/domain to mutations.
For more information on our coding predictions, please refer to the following publications:
Our software and server accept one of the following formats (see here for annotating VCF files):
<protein> <substitution>
dbSNP rs identifiers
where <protein> is the protein identifier and <substitution> is the amino acid substitution in the conventional one-letter format. Multiple substitutions can be entered on a single line and should be separated by commas. Our server accepts SwissProt/TrEMBL, RefSeq and Ensembl protein identifiers, e.g.:
P43026 L441P
ENSP00000325527 N548I,E1073K,C2307S
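The input format described above can be checked before submission with a small script. The following is a minimal sketch only, not part of our software: the function name `parse_input_line` and the returned dictionary layout are hypothetical, and it assumes that dbSNP identifiers take the form `rs` followed by digits and that substitutions use the 20 standard one-letter amino acid codes.

```python
import re

# Assumption: substitutions use the 20 standard one-letter amino acid codes,
# e.g. "L441P" (reference residue, position, substituted residue).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
SUBSTITUTION = re.compile(rf"^([{AMINO_ACIDS}])(\d+)([{AMINO_ACIDS}])$")
RSID = re.compile(r"^rs\d+$")


def parse_input_line(line):
    """Parse one input line (hypothetical helper, for illustration only).

    Accepts either a dbSNP rs identifier on its own, or a protein identifier
    followed by one or more comma-separated substitutions.
    """
    tokens = line.split()
    if len(tokens) == 1 and RSID.match(tokens[0]):
        return {"type": "rsid", "id": tokens[0]}
    if len(tokens) != 2:
        raise ValueError(f"expected '<protein> <substitution>': {line!r}")

    protein, substitutions = tokens
    parsed = []
    for substitution in substitutions.split(","):
        match = SUBSTITUTION.match(substitution)
        if match is None:
            raise ValueError(f"malformed substitution: {substitution!r}")
        # Store as (reference residue, position, substituted residue).
        parsed.append((match.group(1), int(match.group(2)), match.group(3)))
    return {"type": "protein", "id": protein, "substitutions": parsed}


if __name__ == "__main__":
    for example in ("P43026 L441P", "ENSP00000325527 N548I,E1073K,C2307S"):
        print(parse_input_line(example))
```

The sketch validates only the shape of each line; it does not check that the protein identifier exists or that the reference residue matches the protein sequence.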
Unfortunately, due to disk space constraints, we are unable to annotate Variant Call Format (VCF) files on your behalf. However, the consequences of all VCF variants
can be derived using the Ensembl Variant Effect Predictor (VEP).
Once your VCF file has been annotated, the following script (available here) can parse these annotations and provide a list of protein
consequences, which can then be used as input to our server/software.
Additional help on using our script is available by typing the following command:
python parseVCF.py --help
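To illustrate the kind of conversion involved, the sketch below turns VEP annotations into protein consequences. It is not parseVCF.py and does not reflect how that script works internally. It assumes VEP's default tab-delimited output (14 columns, with Consequence, Protein_position, Amino_acids and Extra in columns 7, 10, 11 and 14) and that VEP was run with its --protein option so that the Extra column carries an ENSP=... entry. The function name `vep_to_protein_consequences` and the sample record are made up for illustration.

```python
def vep_to_protein_consequences(lines):
    """Collect '<protein> <substitution>' strings from VEP tab-delimited lines.

    Illustrative sketch; assumes the default VEP column order and an
    'ENSP=' entry in the Extra column (VEP run with --protein).
    """
    results = []
    for line in lines:
        # Skip blank lines and the '#'-prefixed header/metadata lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 14:
            continue
        consequence, position, amino_acids, extra = (
            fields[6], fields[9], fields[10], fields[13]
        )
        # Only missense variants yield a single amino acid substitution.
        if "missense_variant" not in consequence.split(","):
            continue
        # The Extra column is a ';'-separated list of key=value pairs.
        extra_map = dict(
            item.split("=", 1) for item in extra.split(";") if "=" in item
        )
        protein = extra_map.get("ENSP")
        if protein is None or "/" not in amino_acids:
            continue
        reference, substituted = amino_acids.split("/", 1)
        results.append(f"{protein} {reference}{position}{substituted}")
    return results


if __name__ == "__main__":
    # Made-up record for illustration; not real annotation data.
    sample = "\t".join([
        "1_100_A/T", "1:100", "T", "ENSG00000000001", "ENST00000000001",
        "Transcript", "missense_variant", "1700", "1643", "548", "N/I",
        "aAt/aTt", "-", "IMPACT=MODERATE;ENSP=ENSP00000325527",
    ])
    print("\n".join(vep_to_protein_consequences([sample])))
```

Each returned string follows the `<protein> <substitution>` input format. Grouping several substitutions for the same protein onto one comma-separated line is left out to keep the sketch short.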