Strengths and Limitations of AI Predictions

Mar 11

While the predicted structures and associated confidence metrics such as pLDDT provide valuable guidance, it remains essential to critically assess when these predictions can be trusted and to recognize their inherent limitations. We highlight common pitfalls in interpretation and underscore scenarios where experimental validation remains indispensable.

When are AI predictions reliable?

The high accuracy of models such as AlphaFold2 largely derives from robust evolutionary information encoded in multiple sequence alignments (MSAs) and extensive training on known protein structures [Jumper et al., 2021]. Regions exhibiting high predicted confidence scores typically correspond to well-folded, conserved domains with ample sequence homologs, where the AI can effectively infer structural constraints.
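As a minimal sketch of how per-residue confidence can be inspected in practice: AlphaFold-style PDB files store pLDDT in the B-factor column, so the scores can be read with plain column slicing. The function names and the two-residue example record below are illustrative assumptions; the bands follow the convention commonly used by the AlphaFold Database (above 90 very high, 70 to 90 confident, 50 to 70 low, below 50 very low).

```python
def plddt_per_residue(pdb_text):
    """Return {(chain, residue_number): pLDDT} read from CA atoms.

    Assumes an AlphaFold-style PDB file, where the B-factor field
    (columns 61-66 of an ATOM record) holds the residue's pLDDT.
    """
    scores = {}
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            chain = line[21]
            resnum = int(line[22:26])
            scores[(chain, resnum)] = float(line[60:66])
    return scores


def confidence_band(plddt):
    """Map a pLDDT score to a qualitative confidence band."""
    if plddt > 90:
        return "very high"
    if plddt > 70:
        return "confident"
    if plddt > 50:
        return "low"
    return "very low"


def format_ca_record(serial, resname, chain, resnum, plddt):
    """Build a fixed-width ATOM record for a CA atom (dummy coordinates).

    Hypothetical helper, used only to create example input below.
    """
    return "ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f%6.2f%6.2f" % (
        serial, " CA", resname, chain, resnum, 0.0, 0.0, 0.0, 1.0, plddt)


# Hypothetical two-residue fragment, for illustration only.
example_pdb = "\n".join([
    format_ca_record(1, "MET", "A", 1, 95.2),
    format_ca_record(2, "LYS", "A", 2, 43.7),
])

scores = plddt_per_residue(example_pdb)
for (chain, resnum), value in sorted(scores.items()):
    print(chain, resnum, value, confidence_band(value))
```

Reading only CA atoms gives one score per residue, which is sufficient here because AlphaFold assigns the same pLDDT to every atom of a residue.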

Several studies have demonstrated that AI-predicted structures can achieve root-mean-square deviations (RMSDs) comparable to medium-resolution experimental data, particularly for globular domains with rich homologous sequence data [Tunyasuvunakool et al., 2021]. Furthermore, when structural templates from homologous proteins are available, predictions often improve in accuracy [Baek et al., 2021]. Thus, researchers can generally have high confidence in predictions for protein regions with deep and diverse MSAs.
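For reference, the RMSD quoted in such comparisons is the square root of the mean squared distance between matched atoms. The sketch below covers only that final step and assumes the two coordinate sets have already been optimally superposed and matched atom for atom; the function name and the toy coordinates are illustrative.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched coordinate lists.

    Assumes both structures are already superposed; no fitting is done here.
    Each list holds (x, y, z) tuples in angstroms, in the same atom order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy CA traces: every atom is displaced by exactly 1 angstrom along y.
predicted = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
experimental = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0)]

print(rmsd(predicted, experimental))
```

Because RMSD averages over all matched atoms, a single badly placed loop can inflate the value for an otherwise accurate domain, so per-domain comparisons are often more informative than a whole-chain figure.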

However, accuracy decreases markedly for proteins lacking extensive evolutionary context, such as orphan proteins or rapidly evolving viral sequences, where homologous sequences are sparse or absent [Ovchinnikov et al., 2017]. In such cases, predictions should be interpreted cautiously, particularly if confidence metrics indicate uncertainty.
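One rough way to quantify evolutionary context is the effective number of sequences (Neff) in an alignment, which down-weights near-duplicate homologs. The sketch below is a simple pairwise version that assumes equal-length aligned sequences and an 80% identity threshold; the function names, the default threshold, and the toy alignment are assumptions for illustration, not the procedure of any particular prediction pipeline.

```python
def sequence_identity(seq_a, seq_b):
    """Fraction of alignment columns where the two sequences are identical.

    Gaps ('-') are treated as an ordinary symbol, so a sequence is always
    100% identical to itself.
    """
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return matches / len(seq_a)


def effective_sequences(msa, identity_threshold=0.8):
    """Neff: each sequence is weighted by 1 / (number of sequences at or
    above the identity threshold to it, counting itself)."""
    if len({len(seq) for seq in msa}) != 1:
        raise ValueError("MSA must be non-empty with equal-length sequences")
    neff = 0.0
    for seq in msa:
        neighbours = sum(
            1 for other in msa
            if sequence_identity(seq, other) >= identity_threshold
        )
        neff += 1.0 / neighbours
    return neff


# Toy alignment: three near-identical sequences plus one divergent homolog.
toy_msa = [
    "MKTAYIAKQR",
    "MKTAYIAKQR",
    "MKTAYIAKQK",
    "MRSGWLVNEH",
]

print(len(toy_msa), effective_sequences(toy_msa))
```

In this toy case the four raw sequences collapse to an Neff of about 2, which illustrates why raw alignment depth can overstate the amount of independent evolutionary information. The pairwise loop scales quadratically, so large alignments would need a clustering-based approach instead.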

Common pitfalls and interpretation challenges

Despite significant advances, several limitations of AI-based predictions warrant attention. One critical pitfall is the tendency to overinterpret confidence metrics. For instance, high per-residue confidence scores do not necessarily guarantee accurate domain orientations in multi-domain proteins or assemblies [Jumper et al., 2021]. The Predicted Aligned Error (PAE) matrix can help assess inter-domain uncertainties, but thorough evaluation requires domain expertise.
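To make the inter-domain check concrete, here is a minimal sketch: given a PAE matrix as a square list of lists (in angstroms) and domain boundaries chosen by the user, average the off-diagonal blocks linking the two domains. The function name, the 0-based half-open residue ranges, and the toy matrix are assumptions; PAE output from a real prediction run would first need to be loaded into this form.

```python
def mean_interdomain_pae(pae, domain_a, domain_b):
    """Average PAE over the blocks linking two domains.

    pae       square matrix (list of lists), values in angstroms
    domain_a  (start, end) residue indices, 0-based, end exclusive
    domain_b  (start, end) residue indices, 0-based, end exclusive

    PAE is not guaranteed to be symmetric, so both pae[i][j] and
    pae[j][i] are included in the average.
    """
    values = []
    for i in range(*domain_a):
        for j in range(*domain_b):
            values.append(pae[i][j])
            values.append(pae[j][i])
    return sum(values) / len(values)


# Toy 4-residue example: two 2-residue "domains" that are each
# internally well defined but poorly placed relative to each other.
toy_pae = [
    [0.5, 1.0, 20.0, 22.0],
    [1.0, 0.5, 21.0, 23.0],
    [19.0, 20.0, 0.5, 1.5],
    [22.0, 21.0, 1.5, 0.5],
]

print(mean_interdomain_pae(toy_pae, (0, 2), (2, 4)))
```

A high inter-domain average alongside low values within each domain suggests that the individual folds may be reliable while their relative orientation is not, which is the situation described above.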

Additionally, intrinsically disordered or highly flexible regions often yield low confidence predictions. Rather than reflecting model failure, such results may indicate genuine biological disorder, which traditional static structural models cannot capture [Uversky, 2019]. This nuance must be appreciated to avoid erroneous conclusions about protein structure and function.
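A simple way to flag candidate disordered stretches is to look for contiguous runs of low pLDDT rather than isolated residues. The sketch below assumes a list of per-residue pLDDT scores, a cutoff of 50, and a minimum run length of 5 residues; all three are illustrative choices rather than established thresholds for calling disorder.

```python
def low_confidence_segments(plddt, cutoff=50.0, min_length=5):
    """Return (start, end) residue ranges, 1-based and inclusive, for
    contiguous runs where pLDDT stays below the cutoff."""
    segments = []
    run_start = None
    for index, score in enumerate(plddt, start=1):
        if score < cutoff:
            if run_start is None:
                run_start = index  # a new low-confidence run begins here
        else:
            if run_start is not None and index - run_start >= min_length:
                segments.append((run_start, index - 1))
            run_start = None
    # Close a run that extends to the end of the chain.
    if run_start is not None and len(plddt) - run_start + 1 >= min_length:
        segments.append((run_start, len(plddt)))
    return segments


# Hypothetical scores: residues 3-8 form a low-confidence stretch,
# residue 11 is an isolated dip that is too short to report.
toy_scores = [92, 88, 45, 40, 38, 35, 42, 47, 85, 91, 30, 95]

print(low_confidence_segments(toy_scores))
```

Segments reported this way are candidates for flexible or disordered regions, not confirmed ones; orthogonal evidence such as disorder predictors or experimental data would be needed to distinguish genuine disorder from a poorly modeled folded region.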

Moreover, AI models currently have limited capacity to account for ligand-induced conformational changes, post-translational modifications, or complex allosteric effects unless these features are explicitly included in the training data or modeling procedure [Evans et al., 2022]. This constraint reduces predictive reliability in contexts where such phenomena are critical.

Another source of error stems from biases inherent in training datasets, which overrepresent well-studied protein families and underrepresent membrane proteins, complexes, or novel folds. Such imbalance can limit the generalizability of AI predictions, emphasizing the need for cautious interpretation in these underrepresented categories [Callaway, 2022].

The indispensable role of experimental validation

While AI predictions offer valuable structural hypotheses, experimental validation remains the gold standard. Techniques such as X-ray crystallography, cryo-electron microscopy (cryo-EM), and nuclear magnetic resonance (NMR) spectroscopy provide direct experimental evidence of structure and can confirm or refine computational models.

Integration of AI models with experimental data has already shown promising synergy. For example, AI-predicted structures can serve as initial templates for fitting cryo-EM density maps, accelerating the structural determination process [Terwilliger et al., 2022]. However, discrepancies between predictions and experimental data frequently highlight areas where AI models require refinement.

Complementary biochemical and biophysical assays, including mutagenesis and ligand-binding studies, remain essential to validate functional interpretations derived from AI structures. Together, computational and experimental approaches create a robust framework for understanding protein behavior.

AI-driven protein structure prediction represents a transformative advance in structural biology, delivering high-quality models that facilitate hypothesis generation and experimental design. Nonetheless, these predictions are not infallible. Critical assessment of confidence metrics, recognition of biological and technical limitations, and rigorous experimental validation are necessary to maximize the utility of AI-generated models. Structural biologists equipped with this critical perspective can harness AI predictions effectively to accelerate discovery while maintaining scientific rigor.

References

Baek, M., et al. (2021). Accurate prediction of protein structures and interactions using a three-track neural network. Science, 373(6557), 871–876. https://doi.org/10.1126/science.abj8754
Callaway, E. (2022). ‘It will change everything’: DeepMind’s AI makes gigantic leap in solving protein structures. Nature. https://pubmed.ncbi.nlm.nih.gov/33257889/
Evans, R., et al. (2022). Protein complex prediction with AlphaFold-Multimer. bioRxiv. https://doi.org/10.1101/2021.10.04.463034
Jumper, J., et al. (2021). Highly accurate protein structure prediction with AlphaFold. Nature, 596(7873), 583–589. https://doi.org/10.1038/s41586-021-03819-2
Ovchinnikov, S., et al. (2017). Protein structure determination using metagenome sequence data. Science, 355(6322), 294–298. https://doi.org/10.1126/science.aah4043
Terwilliger, T.C., et al. (2022). Improvement of cryo-EM maps by density modification. Nature Methods, 19(3), 249–255. https://pubmed.ncbi.nlm.nih.gov/32807957/
Tunyasuvunakool, K., et al. (2021). Highly accurate protein structure prediction for the human proteome. Nature, 596(7873), 590–596. https://doi.org/10.1038/s41586-021-03828-1
Uversky, V.N. (2019). Intrinsically disordered proteins and their “mysterious” (meta)physics. Frontiers in Physics, 7, 10. https://doi.org/10.3389/fphy.2019.00010

Kamayani Gupta
