Mechanistic Inference from Statistical Models at Different Data-Size Regimes

Danilo M. Lustosa, Anat Milo

Research output: Contribution to journal; Review article; peer-review

11 Scopus citations


The chemical sciences are witnessing an influx of statistics into the catalysis literature. These developments are propelled by modern technological advancements that are leading to fast and reliable data production, mining, and management. In organic chemistry, models encoded with information-rich parameters have facilitated the formulation of mechanistic hypotheses across different data-size regimes. Herein, we aim to demonstrate through selected examples that the integration of statistical principles into homogeneous catalysis can streamline not only reaction optimization protocols but also mechanistic investigation procedures. Namely, we highlight how different aspects of molecular modeling, data set design, data visualization, and nuanced data restructuring can contribute to improving chemical reactivity and selectivity, while furthering our understanding of reaction mechanisms. By mapping out these techniques at different data set sizes, we hope to encourage the broad application of data-driven approaches for mechanistic studies regardless of the accessible amount of data.

Original language: English
Pages (from-to): 7886-7906
Number of pages: 21
Journal: ACS Catalysis
Issue number: 13
State: Published - 1 Jul 2022


Keywords

  • cheminformatics
  • data set design
  • data visualization
  • machine learning
  • mechanism
  • molecular descriptors
  • statistics

ASJC Scopus subject areas

  • Catalysis
  • General Chemistry

