Resources

Aug 6

SHAP (SHapley Additive exPlanations)
A Python library implementing game-theoretic feature attribution methods for interpreting model outputs. Widely used for tabular, text, and biological data.
https://github.com/slundberg/shap
LIME (Local Interpretable Model-agnostic Explanations)
A model-agnostic technique that explains individual predictions by approximating a complex model locally with an interpretable one.
https://github.com/marcotcr/lime
Captum
An interpretability library for PyTorch models, developed by Facebook (now Meta), supporting Integrated Gradients, DeepLIFT, and other attribution methods.
https://captum.ai/
Alibi
An open-source Python library providing algorithms for explaining and interpreting black-box models, with support for counterfactuals and anchors.
https://github.com/SeldonIO/alibi
Explainability Tools for Graph Neural Networks
- GNNExplainer: Explains predictions made by graph neural networks by identifying the subgraph and node features that matter most for a given prediction.
  https://github.com/RexYing/gnn-explainer
- GraphLIME: Local interpretable model explanations for GNNs, which can be applied to biological networks.
  https://github.com/lujiarui/GraphLIME
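
To make the "game-theoretic feature attribution" idea behind SHAP concrete, here is a minimal sketch that computes exact Shapley values for a tiny model by enumerating every feature coalition. It is not the SHAP library's implementation, which relies on approximations and model-specific algorithms. The names `toy_model`, `instance`, `baseline`, and `value_fn` are hypothetical, and replacing absent features with a baseline value is an assumed convention, one of several in use.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values by enumerating all coalitions (cost grows as 2^n)."""
    players = range(n_features)
    phi = [0.0] * n_features
    for i in players:
        others = [j for j in players if j != i]
        for size in range(len(others) + 1):
            # Shapley weight for coalitions of this size: |S|! (n-|S|-1)! / n!
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for coalition in combinations(others, size):
                s = frozenset(coalition)
                # Marginal contribution of feature i when added to coalition s.
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


# Hypothetical toy model with an interaction between features 0 and 2.
def toy_model(x):
    return 2.0 * x[0] + 1.0 * x[1] + 3.0 * x[0] * x[2]


instance = [1.0, 2.0, 1.0]   # the prediction being explained (assumed values)
baseline = [0.0, 0.0, 0.0]   # reference input standing in for "feature absent"


def value_fn(coalition):
    # Features in the coalition take the instance value; the rest take the baseline.
    mixed = [instance[j] if j in coalition else baseline[j]
             for j in range(len(instance))]
    return toy_model(mixed)


phi = shapley_values(value_fn, len(instance))
print(phi)  # approximately [3.5, 2.0, 1.5]
```

The attributions sum to the difference between the model output on the instance and on the baseline (7.0 here), and the interaction term is split evenly between features 0 and 2. Exhaustive enumeration is only feasible for a handful of features, which is why practical tools approximate it.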

Docker
Containerization platform to create reproducible computational environments encapsulating code, dependencies, and system libraries.
https://www.docker.com/
Snakemake
A workflow management system for building reproducible and scalable data analyses. Widely used in bioinformatics pipelines.
https://snakemake.readthedocs.io/en/stable/
Nextflow
A workflow system for writing portable, reproducible pipelines that run across different computational environments, including cloud platforms and HPC clusters.
https://www.nextflow.io/
Git & GitHub
Git is a distributed version control system for tracking changes; GitHub is a hosting platform for collaborating on Git repositories. Together they support reproducible research.
https://git-scm.com/
https://github.com/
Zenodo
An open repository for sharing datasets, code, and preprints with DOI assignment, supporting open reproducible science.
https://zenodo.org/

Interpretable Machine Learning Book (Molnar, 2020)
Comprehensive open-access book covering theory and application of interpretability methods, with examples in R and Python.
https://christophm.github.io/interpretable-ml-book/
Practical tutorials on SHAP & LIME
- Medium tutorial series by the original SHAP author: https://towardsdatascience.com/interpreting-machine-learning-models-with-shap-values-263a1b33fa7a
- Hands-on LIME tutorial: https://towardsdatascience.com/explain-your-model-with-the-lime-python-package-21fd6e7f71e5
Reproducible Research in Bioinformatics (Carpentries Workshop)
Introduces tools and principles for reproducible workflows including version control and containerization.
https://carpentries.org/lessons/
FAIR Principles Tutorial
Video and reading materials on implementing the FAIR (Findable, Accessible, Interoperable, Reusable) data principles to enhance reproducibility.
https://www.go-fair.org/fair-principles/

OpenML
Repository of datasets with metadata and task definitions, including biological and biomedical data useful for benchmarking interpretability methods.
https://www.openml.org/
recount2
Uniformly processed RNA-seq data from more than 70,000 human samples, well suited to reproducible gene expression modeling.
https://jhubiostatistics.shinyapps.io/recount/
STRING Database
Known and predicted protein-protein interactions, useful for integrating biological priors in interpretable network models.
https://string-db.org/
BioGRID
Comprehensive repository of genetic and protein interaction data across multiple species.
https://thebiogrid.org/

BioStars
A Q&A forum focused on bioinformatics and computational biology, where interpretability and reproducibility questions are actively discussed.
https://www.biostars.org/
r/MachineLearning and r/Bioinformatics on Reddit
Active communities discussing state-of-the-art ML techniques and bioinformatics workflows.
https://www.reddit.com/r/MachineLearning/
https://www.reddit.com/r/bioinformatics/
ISMB (Intelligent Systems for Molecular Biology)
Annual conference with workshops and sessions on interpretable AI and reproducible computational biology.
https://www.iscb.org/ismb2024
FAIRsharing.org
A curated resource on standards, databases, and policies supporting reproducible research and FAIR data practices.
https://fairsharing.org/
