TY - GEN
T1 - Self-consistency analysis of physical property and molecular descriptor databases using a variety of prediction techniques - Id# 255431
AU - Shacham, Mordechai
AU - Elly, Michael
AU - Paster, Inga
AU - Brauner, Neima
PY - 2012/12/1
Y1 - 2012/12/1
N2 - The dominant descriptor version of the targeted QSPR method (TQSPR1, Shacham and Brauner, Chem. Eng. Sci., 66, 2606, 2011) is used to predict 16 constant properties for close to 1800 compounds in order to assess the range of applicability of the prediction technique and to analyze the consistency of a property and molecular descriptor database. It is demonstrated that the TQSPR1 method can model the properties within the data uncertainty level for most groups of compounds included in the database. Four common causes of poor prediction accuracy were identified: (1) inconsistencies in the 3D molecular structure of the compounds; (2) use of molecular descriptors outside their range of applicability; (3) mixing compounds in different phases for properties defined at a standard state; and (4) use of descriptors whose asymptotic behavior does not match that of the modeled property in long-range extrapolation to high-carbon-number compounds. The results of this study can be used to improve the consistency of the physical property and descriptor databases, and thereby to increase the robustness of QSPRs derived for a variety of properties and compounds.
AB - The dominant descriptor version of the targeted QSPR method (TQSPR1, Shacham and Brauner, Chem. Eng. Sci., 66, 2606, 2011) is used to predict 16 constant properties for close to 1800 compounds in order to assess the range of applicability of the prediction technique and to analyze the consistency of a property and molecular descriptor database. It is demonstrated that the TQSPR1 method can model the properties within the data uncertainty level for most groups of compounds included in the database. Four common causes of poor prediction accuracy were identified: (1) inconsistencies in the 3D molecular structure of the compounds; (2) use of molecular descriptors outside their range of applicability; (3) mixing compounds in different phases for properties defined at a standard state; and (4) use of descriptors whose asymptotic behavior does not match that of the modeled property in long-range extrapolation to high-carbon-number compounds. The results of this study can be used to improve the consistency of the physical property and descriptor databases, and thereby to increase the robustness of QSPRs derived for a variety of properties and compounds.
UR - http://www.scopus.com/inward/record.url?scp=84871774451&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84871774451
SN - 9780816910731
T3 - AIChE Annual Meeting, Conference Proceedings
BT - AIChE 2012 - 2012 AIChE Annual Meeting, Conference Proceedings
T2 - 2012 AIChE Annual Meeting, AIChE 2012
Y2 - 28 October 2012 through 2 November 2012
ER -