Finding the structural requirements of diverse HIV-1 protease inhibitors using multiple QSAR modelling for lead identification
Multiple Quantitative Structure-Activity Relationship (QSAR) analysis is widely used in drug discovery for lead identification. Human Immunodeficiency Virus (HIV) protease is one of the key targets for the treatment of Acquired Immunodeficiency Syndrome (AIDS). One of the major challenges in the design of HIV-1 protease inhibitors (HIV PRIs) is to increase their inhibitory activity against the enzyme to a level at which the problems associated with drug resistance may be considerably delayed. Herein, chemometric analyses were performed on 346 structurally diverse HIV PRIs with experimental bioactivities against a subtype B mutant, both to develop highly predictive QSAR models and to identify the structural determinants of higher affinity for HIV PR. The QSAR models were developed using OCHEM-based machine learning tools (ASNN, FSMLR, KNN, RF, MANN and XGBoost), with descriptors calculated by eight different software packages. In parallel, Monte Carlo optimization-based QSAR modelling was performed using SMILES and graph-based descriptors to understand fragment and topochemical contributions. To validate the true predictive ability of all these models, an additional set of 104 compounds with known experimental activities, occupying a slightly different chemical space, was employed. This ligand-based study serves as a benchmark for the further development of HIV protease inhibitors with improved activities.
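As an illustration of the external validation step, the sketch below computes two statistics commonly reported for a held-out compound set: the coefficient of determination and the root-mean-square error between observed and predicted activities. It is a minimal sketch and not the workflow used in this study. The function names `external_r2` and `rmse` are hypothetical, the activity values are invented for illustration only, and the choice of the external-set mean as the reference in the denominator is an assumption (some QSAR conventions use the training-set mean instead).

```python
import math


def external_r2(observed, predicted):
    """Coefficient of determination on an external set.

    Assumption: the reference value is the mean of the external-set
    observations; other conventions use the training-set mean.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted activities."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(ss_res / len(observed))


if __name__ == "__main__":
    # Invented activity values (e.g. on a pKi-like scale), for illustration only.
    observed = [6.0, 7.0, 8.0, 9.0]
    predicted = [6.5, 7.0, 7.5, 9.0]
    print(f"external R2 = {external_r2(observed, predicted):.3f}")
    print(f"RMSE        = {rmse(observed, predicted):.3f}")
```

A model that performs well on its training compounds but poorly on such an external set is likely overfitted or is being applied outside its applicability domain, which is why a held-out set covering a slightly different chemical space is a stricter test than internal cross-validation alone.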