AI3SD Video: Statistics Are a Girl's Best Friend: Expanding the Mechanistic Study Toolbox with Data Science

  • Anat Milo (Creator)
  • Jeremy Frey (Contributor)
  • Samantha Kanza (Contributor)
  • Mahesan Niranjan (Contributor)



The value of amassing and standardizing chemical data for improving the efficiency of chemical discovery is becoming increasingly clear. Machine learning analyses of these data focus on finding correlations, trends, and patterns to uncover needles of knowledge in the haystack of chemical reactions. However, in many cases, especially in academic settings, we do not have the means to produce large data sets, so by necessity we remain in the Small Data regime. In this talk, I will present our work in organocatalysis, which applies machine learning strategies to small data sets as a means to uncover underlying mechanisms. We aim to show that whereas Big Data serves to identify hidden correlations, Small Data encourages the discovery of causation. In this sense, Small Data is not just a necessity but is key to bridging the gap between human intuition and machine learning.
Date made available: 2021
Publisher: University of Southampton
