Quantitative structure and bioactivity relationship study on HCV NS5B polymerase inhibitors
Wang, M.L.; Zhong, M.; Yan, A.X.*; Li, L.; Yu, C.Y.
SAR and QSAR in Environmental Research, 2014, 25(1), 1-15.
Several QSAR (quantitative structure–activity relationship) models were developed to predict the inhibitory activity of 333 hepatitis C virus (HCV) NS5B polymerase inhibitors. All of the inhibitors are HCV polymerase non-nucleoside analogue inhibitors (NNIs) that fit into the pocket of the NNI III allosteric binding site. For each molecule, global descriptors and 2D property autocorrelation descriptors were calculated with the program ADRIANA.Code. Pearson correlation analysis was used to select the significant descriptors for model building. The whole dataset was split into a training set and a test set either randomly or with a Kohonen self-organizing map (SOM). The inhibitory activity was then modelled with multilinear regression (MLR) analysis and with a support vector machine (SVM). The best model (Model 2B) achieved a correlation coefficient of 0.91 on its test set. Several molecular descriptors, namely molecular complexity (Complexity), the number of hydrogen bond donors (HDon) and the solubility of the molecule in water (log S), were found to be important factors in determining the bioactivity of the HCV NS5B inhibitors. Other molecular properties, such as electrostatic and charge properties, also played important roles in the interaction between the ligand and the protein. The selected molecular descriptors were further confirmed by analysing the interactions between two representative inhibitors and the polymerase in their crystal structures.
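The descriptor selection and random splitting steps described above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the thresholds `min_r` and `max_inter_r`, and the toy data in the usage note are assumptions made for illustration, and the abstract does not state which cut-offs were actually used. The sketch ranks descriptors by their Pearson correlation with activity, drops descriptors that are highly correlated with one already kept, and draws a 232/101 random split of 333 compounds.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def select_descriptors(columns, activity, min_r=0.3, max_inter_r=0.9):
    """Pick descriptor names by Pearson correlation.

    columns: dict mapping descriptor name -> list of values (one per compound).
    min_r and max_inter_r are hypothetical thresholds, not values from the paper.
    """
    # Rank descriptors by absolute correlation with the activity values.
    ranked = sorted(
        columns,
        key=lambda name: abs(pearson(columns[name], activity)),
        reverse=True,
    )
    kept = []
    for name in ranked:
        if abs(pearson(columns[name], activity)) < min_r:
            break  # all remaining descriptors are even more weakly correlated
        # Skip descriptors that are nearly redundant with one already kept.
        if all(abs(pearson(columns[name], columns[k])) < max_inter_r for k in kept):
            kept.append(name)
    return kept


def random_split(n_items, n_train, seed=0):
    """Return (train_indices, test_indices) for a seeded random split."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return sorted(indices[:n_train]), sorted(indices[n_train:])
```

For example, `select_descriptors({"a": [2, 4, 6, 8, 10, 12], "a_copy": [1, 2, 3, 4, 5, 6.1]}, [1, 2, 3, 4, 5, 6])` keeps only `"a"`, because `"a_copy"` is almost perfectly correlated with it, and `random_split(333, 232)` yields index lists of length 232 and 101. A SOM-based split, as used for Models 2A and 2B, would instead pick compounds from each occupied neuron of the map and is not shown here.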
QSAR model performance on the dataset of 333 HCV NS5B polymerase inhibitors
| Model name | Algorithm | Descriptors | Splitting method | Training set size | Training set r | Training set s | Test set size | Test set r | Test set s |
|---|---|---|---|---|---|---|---|---|---|
| Model 1A | MLR | 9 CORINA global, 8 CORINA 2D | Random | 232 | 0.86 | 0.53 | 101 | 0.85 | 0.49 |
| Model 1B | SVM | 9 CORINA global, 8 CORINA 2D | Random | 232 | 0.93 | 0.39 | 101 | 0.86 | 0.47 |
| Model 2A | MLR | 9 CORINA global, 8 CORINA 2D | Kohonen’s self-organizing map (SOM) | 232 | 0.86 | 0.51 | 101 | 0.88 | 0.55 |
| Model 2B | SVM | 9 CORINA global, 8 CORINA 2D | Kohonen’s self-organizing map (SOM) | 232 | 0.95 | 0.31 | 101 | 0.91 | 0.49 |
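As a rough guide to how statistics such as r and s in the table could be obtained from a model's predictions, the sketch below computes both for a list of experimental and predicted activities. The table does not define s, so the sketch assumes that r is the Pearson correlation coefficient between experimental and predicted values and that s is the standard deviation of the prediction errors with an n − 1 denominator. The function name `evaluate` is hypothetical.

```python
import math


def evaluate(experimental, predicted):
    """Return (r, s) for paired experimental and predicted activities.

    r: Pearson correlation coefficient between the two lists.
    s: sqrt(sum of squared errors / (n - 1)); this definition is an
       assumption, since the source table does not define s.
    """
    n = len(experimental)
    if n != len(predicted) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_e = sum(experimental) / n
    mean_p = sum(predicted) / n
    cov = sum((e - mean_e) * (p - mean_p) for e, p in zip(experimental, predicted))
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_e == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant values")
    r = cov / math.sqrt(var_e * var_p)
    s = math.sqrt(sum((p - e) ** 2 for e, p in zip(experimental, predicted)) / (n - 1))
    return r, s
```

Calling `evaluate` once on the training set and once on the test set gives the two pairs of values reported per model. Comparing them is a quick check for overfitting: in the table, both SVM models show a larger gap between training and test r than the corresponding MLR models.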