Instructions on how to install our MKL-based algorithm, capable of predicting the effects of both coding and non-coding variants using nucleotide-based HMMs, can be found on our fathmm-MKL GitHub repository.
Instructions on how to install our original algorithm, specifically designed for non-synonymous single nucleotide variants (nsSNVs), can be found on our fathmm GitHub repository.
Our software is licenced under the GNU General Public License (v3).
Inherited Disease (weighted)
We use the Human Gene Mutation Database (HGMD) and SwissProt/TrEMBL to train our inherited disease model. Because the pathogenic training data come from HGMD, we are unable to circulate them. However, when using SwissProt/TrEMBL, we observe performance similar to that reported in our publication. The SwissProt/TrEMBL dataset used in our inherited disease model (along with the associated pathogenic variants) can be found here.
Our pathogenic training dataset (i.e. cancer-associated mutations) can be found here. Variation
data is recorded in the header of each sequence: variants starting with "cs" (cancer-associated variants) are those used in our training whereas those starting with "rs" (neutral polymorphisms) are ignored.
Note: The neutral dataset used in our cancer model is the same one used in our inherited disease model (see above).
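The header convention described above can be illustrated with a short Python sketch that separates "cs" variants from "rs" variants. This is only an illustration under stated assumptions: the exact header layout is not specified here, so the sketch assumes FASTA-style header lines beginning with ">" and identifiers made of the prefix followed by digits. The function names and the example header are hypothetical.

```python
import re

# Assumption: variant identifiers are "cs" or "rs" followed by digits.
# The real header layout may differ; adjust the pattern to match the files.
VARIANT_ID = re.compile(r"\b(cs|rs)\d+\b")


def split_header_variants(header):
    """Return (cancer_ids, neutral_ids) found in a single header line."""
    cancer, neutral = [], []
    for match in VARIANT_ID.finditer(header):
        if match.group(1) == "cs":
            cancer.append(match.group(0))   # cancer-associated, used in training
        else:
            neutral.append(match.group(0))  # neutral polymorphism, ignored
    return cancer, neutral


def training_variants(fasta_lines):
    """Map each header (without the leading '>') to its "cs" identifiers."""
    result = {}
    for line in fasta_lines:
        if line.startswith(">"):
            cancer, _ignored = split_header_variants(line)
            result[line[1:].strip()] = cancer
    return result
```

For example, calling split_header_variants on the hypothetical header ">P04637 cs101:R175H rs202:P72R" separates the "cs101" identifier from the "rs202" identifier.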
SwissProt/TrEMBL (2014_05) variant data can be found here and the corresponding disease concepts used in our analysis can be found here.
Note: The neutral dataset used in our disease-specific model is the same one used in our inherited disease model (see above).
If you use these datasets for your analysis, please cite the following publication: