The potential of discovery learning models to empower students' critical thinking skills

Muhammad Minan Chusni 1, Sulistyo Saputro 1, Suranto 1 and Sentot Budi Rahardjo 1

Published under licence by IOP Publishing Ltd. Journal of Physics: Conference Series, Volume 1464, The 1st International Conference on Education and Technology (ICETECH), 8 August 2019, Madiun, Indonesia. Citation: Muhammad Minan Chusni et al 2020 J. Phys.: Conf. Ser. 1464 012036. DOI 10.1088/1742-6596/1464/1/012036

Author affiliations

1 Program Studi Doktor Pendidikan IPA, Universitas Sebelas Maret, Jl. Ir. Sutami 36 A Surakarta, Jawa Tengah 57126 Indonesia

Critical thinking skills have become core competencies among educational goals. This article examines the potential of discovery learning models applied in science learning to empower students' critical thinking skills. The method is qualitative, with a literature review on discovery learning models and critical thinking skills as the main source. The analysis of the literature identifies the stages of the discovery learning model as orientation, hypothesis generation, hypothesis testing, conclusion, and regulation. The discovery learning model has the potential to empower critical thinking skills beginning at the hypothesis generation stage, which aims to provide a rational argument for a real phenomenon presented in the orientation phase. This continues through interpreting, analyzing, evaluating, and concluding from the experimental results of the hypothesis testing stage, until the right conclusion is obtained from those results.

Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

Discovery Learning for the 21st Century: What is it and how does it compare to traditional learning in effectiveness in the 21st Century

Dana Mussa

Related Papers

In D. Abrahamson & M. Kapur (Eds.), Practicing discovery-based learning: Evaluating new horizons [Special issue]. Instructional Science, 46(1), 1-10.

Dor Abrahamson

Whereas some educational designers believe that students should learn new concepts through explorative problem solving within dedicated environments that constrain key parameters of their search and then support their progressive appropriation of empowering disciplinary forms, others are critical of the ultimate efficacy of this discovery-based pedagogical philosophy, citing an inherent structural challenge of students constructing historically achieved conceptual structures from their ingenuous notions. This special issue presents six educational research projects that, while adhering to principles of discovery-based learning, are motivated by complementary philosophical stances and theoretical constructs. The editorial introduction frames the set of projects as collectively exemplifying the viability and breadth of discovery-based learning, even as these projects: (a) put to work a span of design heuristics, such as productive failure, surfacing implicit know-how, playing epistemic games, problem posing, or participatory simulation activities; (b) vary in their target content and skills, including building electric circuits, solving algebra problems, driving safely in traffic jams, and performing martial-arts maneuvers; and (c) employ different media, such as interactive computer-based modules for constructing models of scientific phenomena or mathematical problem situations, networked classroom collective "video games," and intercorporeal master–student training practices. The authors of these papers consider the potential generativity of their design heuristics across domains and contexts.

Keywords: Attitude · Epistemic forms and games · Explorative practice · Problem posing · Productive failure · Situated intermediary learning objectives

literature review on discovery learning

Muhammad Antareza

Andrew Johnson

This chapter describes the essential elements of discovery learning, a form of student-centered learning. Instructional videos are included with this paper.

Journal of Educational Psychology

Louis Alfieri

Paul Twelker

In D. Abrahamson & M. Kapur (Eds.), Practicing discovery-based learning: Evaluating new horizons [Special issue]. Instructional Science

Kiera Chase , Dor Abrahamson

Forty 4th and 9th grade students participated individually in tutorial interviews centered on a problem-solving activity designed for learning basic algebra mechanics through diagrammatic modeling of an engaging narrative about a buccaneering giant burying and unearthing her treasure on a desert island. Participants were randomly assigned to experimental (Discovery) and control (No-Discovery) conditions. Mixed-method analyses revealed greater learning gains for Discovery participants. Elaborating on a heuristic activity architecture for technology-based guided-discovery learning (Chase and Abrahamson 2015), we reveal a network of interrelated inferential constraints that learners iteratively calibrate as they each refine and reflect on their evolving models. We track the emergence of these constraints by analyzing annotated transcriptions of two case-study student sessions and argue for their constituting role in conceptual development.

Eurasian Journal of Educational …

Ali Günay Balım

Jurnal Penelitian Pendidikan IPA

Rosnidar Rosnidar

This research aims to determine how applying discovery learning models increases students' interest and learning outcomes for harmonic vibration material at MAN 4 Aceh Besar. The method is quasi-experimental with a pretest-posttest control group design. The instruments used are questionnaires and test questions. The results showed that the average N-gain of student learning interest was 0.79 (high category) in the experimental class and 0.28 (low category) in the control class. The results for each indicator fell into the very positive category in the experimental class and the positive category in the control class. Based on the analysis of both classes, it can be concluded that average student learning interest increased more in the experimental class than in the control class, especially on indicators of student engagement. The average N-gain of student learning outcomes in the experimental class was 0.61, moderate categori...

Lisa Hammershaimb

Research in Science Education

Smile Pretty

Jurnal Sains Edukatika Indonesia (JSEI)

2715-4661 (Print)

2656-4890 (Online)

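LITERATURE REVIEW: IMPLEMENTATION OF THE SCIENCE-BASED DISCOVERY-LEARNING MODEL IN NATURAL SCIENCE LEARNING [original title: LITERATURE REVIEW: IMPLEMENTASI MODEL PEMBELAJARAN DISCOVERY-LEARNING BERBASIS SAINS PADA PEMBELAJARAN IPA]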

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License .

Deep learning in drug discovery: an integrative review and future challenges

1 Faculty of Computers and Artificial Intelligence, University of Sadat City, Sadat City, Egypt

Enas Elgeldawi

2 Computer Science Department, Faculty of Science, Minia University, Minia, Egypt

Heba Aboul Ella

4 Faculty of Pharmacy and Drug Technology, Chinese University in Egypt (CUE), Cairo, Egypt

Yaseen A. M. M. Elshaier

5 Faculty of Pharmacy, University of Sadat City, Sadat City, Menoufia Egypt

Mamdouh M. Gomaa

Aboul Ella Hassanien

3 Faculty of Computers and Artificial Intelligence, Cairo University, Cairo, Egypt

Recently, the use of artificial intelligence (AI) in drug discovery has received much attention because it significantly shortens the time and reduces the cost of developing new drugs. Deep learning (DL)-based approaches are increasingly used in all stages of drug development as DL technology advances and drug-related data grow. Therefore, this paper presents a systematic literature review (SLR) that integrates recent DL technologies and applications in drug discovery, including drug–target interactions (DTIs), drug–drug similarity interactions (DDIs), drug sensitivity and responsiveness, and drug side-effect predictions. We review more than 300 articles published between 2000 and 2022. The benchmark data sets, the databases, and the evaluation measures are also presented. In addition, this paper provides an overview of how explainable AI (XAI) supports drug discovery problems. Drug dosing optimization and success stories are discussed as well. Finally, digital twinning (DT) and open issues are suggested as future research challenges for drug discovery problems. Challenges to be addressed and future research directions are identified, and an extensive bibliography is included.

Introduction

Drug discovery is the examination of how various drugs interact with the body and how a medication needs to act on the body to have a therapeutic effect. Drug discovery strategies comprise different approaches, such as physiology-based and target-based approaches, and rely on information about the ligand and the target. In this regard, our attention is directed to certain topics, especially drug (ligand)–target interactions, drug sensitivity and response, drug–drug interaction, and drug–drug similarity. For certain diseases such as cancer, or in pandemic situations such as COVID-19, a combination of more than one drug is required to improve the prognosis and alleviate the pathogenesis. Despite all the recent advances in pharmaceuticals, medication development is still a labor-intensive and costly process. As a result, several computational algorithms have been proposed to speed up the drug discovery process (Betsabeh and Mansoor 2021).

As DL models progress and drug data sets grow, many new DL-based approaches are appearing at every stage of the drug development process (Kim et al. 2021). In addition, following the development of DL approaches, large pharmaceutical corporations have migrated toward AI, abandoning outmoded, ineffective procedures to increase the benefit to patients while also increasing their own profits (Nag et al. 2022). Despite DL's impressive performance, drug discovery remains a critical and challenging task, and there is room for researchers to develop algorithms that improve drug discovery performance. Therefore, this paper presents an SLR that integrates recent DL technologies and applications in drug discovery. This review is the first to incorporate recent DL models and applications for the different categories of drug discovery problems, such as DTIs, DDIs, drug sensitivity and response, and drug side-effect predictions, and to present new challenging topics such as XAI and DT and how they help advance drug discovery. In addition, the paper provides researchers with the most frequently used datasets in the field.

The paper is built on six building blocks, as shown in Fig. 1. More than 300 articles are presented in this paper, and they are divided across these building blocks. The papers were selected using the following criteria:

  • Papers published from 2000 to 2022.
  • Papers published by IEEE, ACM, Elsevier, and Springer were given higher priority.

Fig. 1 The main building blocks of the paper

The following analytical questions are discussed and answered in the paper:

  • AQ1: What DL algorithms have been used to predict the different categories of drug discovery problems?
  • AQ2: Which deep learning methods are mostly used in drug dosing optimization?
  • AQ3: Are there any success stories about drug discovery and DL?
  • AQ4: What about the newest technologies such as XAI and DT in drug discovery?
  • AQ5: What are the future and open works related to drug discovery and DL?

The remainder of this review paper is organized as follows: Sect. 2 presents a review of related studies; Sect. 3 gives an overview of the various DL techniques. Section 4 presents the organization of DL applications in drug discovery problems by explaining each drug discovery problem category and reviewing the literature on the DL techniques used. Section 5 discusses the numerous benchmark data sets and databases that have been employed in the drug development process. Section 6 presents the evaluation metrics used for each drug discovery problem category. Drug dose optimization, success stories, and XAI are introduced in Sects. 7, 8, and 9. DT and open problems are suggested as future research challenges in Sects. 10 and 11. Section 12 presents a discussion of the analytical questions. Finally, Sect. 13 concludes the paper.

Review of related studies

Although drug discovery is a large field with different research categories, there are only a few review studies of it, and each related study has focused on a single research category, such as reviewing DL applications for DTIs. This section reviews these related studies, and a summary is presented in Table 1.

Table 1 Related studies on DL for drug discovery

Kim et al. (2021) presented a survey of DL models for the prediction of drug–target interaction (DTI) and new medication development. They start by providing a thorough summary of the many representations of drugs and proteins, DL applications, and widely used benchmark data sets for training and testing models. A strength of this study is that it identifies several obstacles to the future of de novo drug creation and DL-based DTI prediction. However, its major drawback is that it did not consider the latest technologies in DL applications for DTIs, such as XAI and DT.

Rifaioglu et al. (2019) presented recent ML applications in virtual screening (VS), together with the techniques, tools, databases, and resources used to create the models. They outline what VS is and how crucial it is to the process of finding new drugs. As strengths of this study, they highlighted the DL technologies that are accessible as open-access programming libraries, tool kits, and frameworks that can be employed for computational drug discovery (including DTI prediction) in the foreseeable future, and they provided instances of VS investigations that resulted in the discovery of novel bioactive chemicals and medications. However, they did not consider drug dose optimization in their literature review.

Sachdev and Gupta (2019) presented the various feature-based chemogenomic methods for DTI prediction. They offer a thorough, current review of the different feature-based methodologies, and describe relevant datasets, tools, methods for determining drug or target properties, and evaluation measures. Although the study is regarded as the first integrated review that concentrates solely on feature-based DTI techniques, it did not consider the latest technologies in DL applications for DTIs, such as XAI and DT.

Deep learning (DL) techniques

Detecting spam, recommending videos, classifying images, and retrieving multimedia concepts are just a few of the applications in which machine learning (ML) has recently gained favor in research. Deep learning (DL) is one of the most extensively used ML methods in these applications. The ongoing appearance of new DL studies is due to the unpredictability of data acquisition and the incredible progress made in hardware technologies. DL is based on conventional neural networks but outperforms them significantly. Furthermore, DL uses transformations and graph technology to build multi-layer learning models (Kim et al. 2021). ML and DL have changed the way problems are tackled. Deep learning models come in various shapes and sizes and can effectively solve problems that are too complex for standard approaches. This section reviews the various deep learning models (Sarker 2021).

Classic neural networks

As shown in Fig. 2, multi-layer perceptrons (MLPs) are the form in which fully connected neural networks are most frequently recognized. The approach involves converting the algorithm into simple two-digit (binary) data inputs (Mukhamediev et al. 2021). This paradigm allows both linear and nonlinear functions to be included. The linear function is a single line with a constant multiplier applied to its inputs. The sigmoid curve, the hyperbolic tangent, and the rectified linear unit are three common nonlinear functions. This model is best suited to classification and regression problems with real-valued data, and more generally wherever a flexible model is needed.

Fig. 2 Multilayer Perceptron or ANN
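To make the combination of linear and nonlinear functions concrete, the following is a minimal sketch of a forward pass through a small fully connected network. The layer sizes, weights, and function names are hypothetical and chosen only for illustration; they do not come from the reviewed studies.

```python
import math


def sigmoid(x: float) -> float:
    """Sigmoid curve: squashes any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def relu(x: float) -> float:
    """Rectified linear unit: passes positives, clips negatives to 0."""
    return max(0.0, x)


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(w . x + b) for every neuron."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Hypothetical 2-3-1 network with hand-picked weights (illustration only).
hidden_w = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]]
hidden_b = [0.0, 0.1, -0.1]
out_w = [[0.7, -0.5, 0.2]]
out_b = [0.05]

x = [1.0, 2.0]
hidden = dense(x, hidden_w, hidden_b, math.tanh)  # hyperbolic tangent layer
y = dense(hidden, out_w, out_b, sigmoid)          # sigmoid output in (0, 1)
```

Each neuron first applies the linear function (a weighted sum plus a bias) and then one of the nonlinear functions named above; swapping `math.tanh` for `relu` changes only the activation.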

Convolutional neural networks (CNN)

As shown in Fig. 3, the classic convolutional neural network (CNN) is an advanced, high-potential variant of the ANN that was developed to manage escalating levels of complexity, as well as data preprocessing and compilation. It is based on how the neurons in an animal's visual cortex are arranged (Amashita et al. 2018). CNNs are among the most flexible algorithms for processing both image and non-image data. A CNN is organized into four parts:

  • An input layer, usually a 2D array of neurons, for analyzing basic visual data such as picture pixels.
  • An output layer: some CNNs analyze the images at their inputs using a one-dimensional output layer of neurons coupled to distributed convolutional layers.
  • A third layer, called the sampling layer, which restricts the number of neurons taking part in the corresponding network layers.
  • One or more connected layers that join the sampling layer to the output layer.

Fig. 3 Convolutional Neural Networks (CNN)

This network concept can help extract relevant visual data in pieces or smaller units. In a CNN, each neuron is responsible for a group of neurons in the preceding layer.

After the input data has been fed into the convolutional model, the CNN is constructed in four steps:

  • Convolution: This step produces feature maps from the supplied data, to which a function is then applied.
  • Max-Pooling: This step helps the CNN detect an image despite changes in the supplied input.
  • Flattening: The data is flattened in this stage so that the CNN can analyze it.
  • Full Connection: Sometimes referred to as a "hidden layer", this step is where the loss function for the model is formed.
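The four steps above can be sketched on a tiny example. The snippet below is an illustrative, dependency-free walk-through; the 5x5 image, the edge-detecting kernel, the ReLU activation, and the dense-layer weights are all assumptions made for the example, not values from the reviewed work.

```python
def conv2d(image, kernel):
    """Convolution step: slide the kernel over the image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu_map(fmap):
    """Function applied to the feature map (ReLU is assumed here)."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Max-pooling step: keep the largest value in each size x size window."""
    return [
        [
            max(fmap[i + a][j + b] for a in range(size) for b in range(size))
            for j in range(0, len(fmap[0]) - size + 1, size)
        ]
        for i in range(0, len(fmap) - size + 1, size)
    ]


def flatten(fmap):
    """Flattening step: turn the 2D map into a 1D vector."""
    return [v for row in fmap for v in row]


def fully_connected(vector, weights, bias):
    """Full-connection step: one dense neuron over the flattened vector."""
    return sum(w * v for w, v in zip(weights, vector)) + bias


# Hypothetical 5x5 image containing a vertical stripe.
image = [[0, 0, 1, 1, 0] for _ in range(5)]
kernel = [[-1, 1], [-1, 1]]  # responds to left-to-right intensity increases

feature_map = relu_map(conv2d(image, kernel))   # 4x4 feature map
pooled = max_pool(feature_map, size=2)          # 2x2 after pooling
vector = flatten(pooled)                        # length-4 vector
score = fully_connected(vector, [0.5, 0.5, 0.5, 0.5], 0.0)
```

In a trained network the kernel and dense weights would be learned from data; they are fixed here only so that each step can be followed by hand.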

Image recognition, image analysis, image segmentation, video analysis, and natural language processing (NLP) (Chauhan et al. 2018; Tajbakhsh et al. 2016; Mohamed et al. 2020; Zhang et al. 2018) are among the tasks that CNNs can perform.

Recurrent neural networks (RNNs)

RNNs were first created to help with sequence prediction. These networks rely solely on data streams of varying lengths as inputs. To make its most recent prediction, the RNN uses the knowledge of its previous state as an input value. As a result, it can provide a network with short-term memory (Tehseen et al. 2019). The Long Short-Term Memory (LSTM) method shown in Fig. 4, for example, is renowned for its adaptability.

Fig. 4 LSTM Network

LSTMs and gated RNNs are two forms of RNN design that aid in the study of problems. LSTMs are useful for predicting data in time sequences using memory; their three gates are Input, Output, and Forget. Gated RNNs are likewise particularly helpful for temporal sequence prediction using memory-based data. Both types of algorithms can be used to address a range of problems, including image classification (Chandra and Sharma 2017), sentiment analysis (Failed 2018), video classification (Abramovich et al. 2018), language translation (Hermanto et al. 2015), and more.
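As a rough illustration of how the Input, Output, and Forget gates interact, the sketch below implements one step of the standard LSTM cell equations for scalar inputs and states. The parameter names (`wf`, `uf`, `bf`, and so on) and their values are hypothetical; real implementations use weight matrices and learn them from data.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, p):
    """One LSTM time step with scalar input, hidden state, and cell state."""
    forget = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])   # Forget gate
    inp = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])      # Input gate
    out = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])      # Output gate
    candidate = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])
    c = forget * c_prev + inp * candidate   # keep part of old memory, add new
    h = out * math.tanh(c)                  # expose part of the memory
    return h, c


# Hypothetical parameters: every weight and bias set to 0.1 for illustration.
names = ("wf", "uf", "bf", "wi", "ui", "bi", "wo", "uo", "bo", "wg", "ug", "bg")
params = {name: 0.1 for name in names}

h, c = 0.0, 0.0
for x in [0.5, -1.0, 2.0]:   # a short input sequence
    h, c = lstm_step(x, h, c, params)
```

The cell state `c` carries memory across time steps, which is what allows the prediction at each step to depend on earlier inputs.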

Generative adversarial networks: GAN

As shown in Fig. 5, a GAN combines two DL neural networks, a Generator and a Discriminator. The Generator network creates fake data, while the Discriminator helps to distinguish between real and fake data (Alankrita et al. 2021).

Fig. 5 GAN: Generative Adversarial Networks

The two networks compete with one another: the Discriminator keeps distinguishing between real and fake data, while the Generator keeps making fake data look more like real data. For example, if a picture library is required, the Generator network, built as a deconvolution neural network, generates simulated data resembling the authentic photos. An image-detector network is then used to discriminate between fictitious and real images. This competition eventually improves the network's performance. GANs can be employed for creating images and text, enhancing images, and discovering new drugs.
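The competition between the two networks is usually expressed through a pair of loss functions. The sketch below shows the commonly used binary cross-entropy form (with the non-saturating generator loss); it is a generic illustration with hypothetical function names, not the formulation of any specific study cited here.

```python
import math


def discriminator_loss(d_real: float, d_fake: float) -> float:
    """Loss for the Discriminator.

    d_real: Discriminator's probability that a real sample is real.
    d_fake: Discriminator's probability that a generated sample is real.
    The loss is low when d_real is near 1 and d_fake is near 0.
    """
    return -(math.log(d_real) + math.log(1.0 - d_fake))


def generator_loss(d_fake: float) -> float:
    """Loss for the Generator: low when the Discriminator is fooled."""
    return -math.log(d_fake)


# A confident, correct Discriminator has a low loss and leaves the
# Generator with a high loss.
confident_d = discriminator_loss(d_real=0.9, d_fake=0.1)
fooled_d = discriminator_loss(d_real=0.5, d_fake=0.5)
weak_generator = generator_loss(d_fake=0.1)
strong_generator = generator_loss(d_fake=0.9)
```

Training alternates between lowering the Discriminator's loss and lowering the Generator's loss, so an improvement in one network raises the loss of the other.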

Self-organizing maps (SOM)

As shown in Fig. 6, Self-Organizing Maps use unsupervised data to reduce the number of random variables in a model (Kohonen 1990). Given that every synapse is linked to both its input and output nodes, the output dimension in this DL approach is set as a two-dimensional model. As each data point competes for representation in the Self-Organizing Map, the weights of the closest nodes, the Best Matching Units (BMUs), are adjusted. The size of the weight adjustment varies with how close a node is to the BMU. Because weights are themselves a node attribute, the value represents the node's position in the network. SOMs are well suited to evaluating datasets and exploratory projects that do not have a Y-axis value.

Fig. 6 Self-Organizing Maps (SOM)
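A single SOM training step can be sketched as follows: find the BMU for a sample, then pull the BMU and its grid neighbours toward that sample. The Gaussian neighbourhood, the learning rate, and the two-node grid are assumptions made for illustration.

```python
import math


def best_matching_unit(weights, sample):
    """Return the grid position whose weight vector is closest to the sample.

    weights maps a grid position (row, col) to that node's weight vector.
    """
    return min(
        weights,
        key=lambda pos: sum((w - s) ** 2 for w, s in zip(weights[pos], sample)),
    )


def som_update(weights, sample, learning_rate=0.5, radius=1.0):
    """Move the BMU and its neighbours toward the sample (updates in place)."""
    bmu = best_matching_unit(weights, sample)
    for pos, vector in weights.items():
        grid_dist_sq = (pos[0] - bmu[0]) ** 2 + (pos[1] - bmu[1]) ** 2
        # Nodes far from the BMU on the grid are adjusted less.
        influence = math.exp(-grid_dist_sq / (2 * radius ** 2))
        weights[pos] = [
            w + learning_rate * influence * (s - w)
            for w, s in zip(vector, sample)
        ]
    return bmu


# Hypothetical 1x2 map with 2D weight vectors.
som = {(0, 0): [0.0, 0.0], (0, 1): [1.0, 1.0]}
winner = som_update(som, sample=[0.9, 0.9])
```

No target (Y) value appears anywhere in the update, which is why the method suits unlabeled data.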

Boltzmann machines

As shown in Fig. 7, the nodes are connected in a circular pattern because there is no set orientation in this network model. Because of this distinctive structure, the technique is used to generate model parameters. Unlike all the preceding deterministic network models, the Boltzmann Machine model is stochastic. It can be used to monitor systems, build a binary recommendation platform, and analyze specific datasets (Hinton 2011).

Fig. 7 Boltzmann Machines

The architecture of the Boltzmann Machine is a two-layer neural network. The first layer is the visible, or input, layer, and the second is the hidden layer. Both are made up of neuron-like nodes that carry out computations. Nodes are connected across the two layers but are not linked to other nodes in the same layer. As a result, there is no connectivity within a layer, which is one of the Boltzmann machine's limitations. When data is supplied to these nodes, it is transformed into a graph; the nodes process it and learn all the parameters, patterns, and relations in the data before deciding whether to transmit it. For this reason, the Boltzmann Machine is often described as an unsupervised DL model.
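The two-layer structure described above, with links across layers but none within a layer, corresponds to what is usually called a restricted Boltzmann machine. Under that assumption, one stochastic update (a Gibbs step) can be sketched as below; the weights, biases, and function names are hypothetical.

```python
import math
import random


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probabilities(visible, weights, hidden_bias):
    """P(h_j = 1 | v); weights[i][j] links visible unit i to hidden unit j."""
    return [
        sigmoid(b + sum(v * weights[i][j] for i, v in enumerate(visible)))
        for j, b in enumerate(hidden_bias)
    ]


def visible_probabilities(hidden, weights, visible_bias):
    """P(v_i = 1 | h), using the same weights in the other direction."""
    return [
        sigmoid(a + sum(h * weights[i][j] for j, h in enumerate(hidden)))
        for i, a in enumerate(visible_bias)
    ]


def sample_units(probabilities, rng):
    """Stochastic binary units: each turns on with its own probability."""
    return [1 if rng.random() < p else 0 for p in probabilities]


def gibbs_step(visible, weights, visible_bias, hidden_bias, rng):
    """Sample the hidden layer from the visible layer, then the reverse."""
    hidden = sample_units(hidden_probabilities(visible, weights, hidden_bias), rng)
    new_visible = sample_units(visible_probabilities(hidden, weights, visible_bias), rng)
    return new_visible, hidden


rng = random.Random(0)
weights = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]  # 3 visible x 2 hidden units
v1, h1 = gibbs_step([1, 0, 1], weights, [0.0] * 3, [0.0] * 2, rng)
```

Because each unit is sampled rather than computed deterministically, repeated calls with the same input can give different states, which is the stochastic behaviour that sets this model apart from the preceding networks.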

Autoencoders

As shown in Fig. 8, the autoencoder, one of the most popular deep learning algorithms, operates automatically on its inputs, applies an activation function, and decodes the result at the end. Because of the resulting bottleneck, fewer categories of data are produced, and the inherent data structures are used to their fullest extent (Zhai et al. 2018).

There are various types of autoencoders:

  • Sparse: When the hidden layer is larger than the input layer, a generalization technique is used to reduce overfitting. It constrains the loss function and prevents the autoencoder from using all of its nodes simultaneously.
  • Denoising: In this case, a random subset of the inputs is set to 0.
  • Contractive: When the hidden layer is larger than the input layer, a penalty factor is added to the loss function to avoid overfitting and data duplication.
  • Stacked: When another hidden layer is added to an autoencoder, it results in two stages of encoding and an initial stage of decoding.
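Two of the variants above can be illustrated with very little code: the input corruption used by a denoising autoencoder, and a loss with an added penalty as used by a sparse autoencoder. The sketch below assumes an L1 penalty on the hidden activations and a mean-squared reconstruction error; both are common choices rather than details taken from the cited work, and all function names are hypothetical.

```python
import random


def corrupt(inputs, drop_prob, rng):
    """Denoising: randomly set inputs to 0 with probability drop_prob."""
    return [0.0 if rng.random() < drop_prob else x for x in inputs]


def mse(reconstruction, target):
    """Mean-squared reconstruction error."""
    return sum((r - t) ** 2 for r, t in zip(reconstruction, target)) / len(target)


def sparse_loss(reconstruction, target, hidden, l1_weight):
    """Sparse: reconstruction error plus an L1 penalty on hidden activations."""
    return mse(reconstruction, target) + l1_weight * sum(abs(h) for h in hidden)


rng = random.Random(42)
clean = [0.2, 0.8, 0.5, 0.9]
noisy = corrupt(clean, drop_prob=0.5, rng=rng)
# A denoising autoencoder would encode and decode `noisy`, and its
# reconstruction would be scored against `clean`, not against `noisy`.
```

The penalty term discourages the network from activating all hidden nodes at once, which is how the sparse variant limits overfitting when the hidden layer is large.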

Feature identification, building a strong recommendation model, and adding features to very large datasets are some of the problems that autoencoders can solve.

Organization of DL applications in drug discovery problems

The development of safe and effective treatments for humans is the primary goal of drug discovery (Kim et al. 2021). Drug discovery is the problem of finding suitable drugs to treat a disease (i.e., a target protein), which relies on several interactions. This paper divides drug discovery problems into four main categories, as presented in Fig. 9: drug–target interactions, drug–drug similarity, drug combination side effects, and drug sensitivity and response predictions. The following subsections review the literature on DL for these problems, and some of the investigated articles related to each category are summarized in Table 2.

Fig. 9 Drug discovery problem categories

Classification of articles related to drug discovery and DL

Drug–target interactions prediction using DL

Drug repurposing attempts to uncover new uses for drugs that are already approved and on the market. It has attracted much attention because it takes less time, costs less, and has a higher success rate than traditional de novo drug development (Thafar et al. 2022). The discovery of drug–target interactions is the initial step in creating new medications, as well as one of the most crucial aspects of drug screening and drug-guided synthesis (Wang et al. 2020a). Exploring the link between candidate drugs and targets, known as drug–target interactions (DTIs), can help researchers better understand the pathophysiology of targets at the drug level, which in turn supports early detection of disease, treatment prognosis, and drug design (Lian et al. 2021). The success of drug repositioning relies largely on DTI prediction because it reduces the number of potential drug candidates for specific targets. Traditional computational methods use two basic strategies: approaches based on molecular docking and approaches based on drugs. When the 3D structures of target proteins are not available, the effectiveness of molecular docking is limited. When only a few binding molecules are known for a target, drug-based techniques typically produce subpar predictions. DL technologies overcome the restrictions imposed by the high-dimensional structure of drugs and target proteins by using structure-free approaches, which do not need 3D structural data or docking for DTI prediction. Therefore, this section provides a recent comprehensive review of DL-based DTI prediction models (Chen et al. 2012).

As shown in Fig. 10, there are known interactions (solid lines) and unknown interactions (dashed lines) between diseases (proteins) and drugs. DTI prediction forecasts the unknown interactions, that is, which diseases (or target proteins) a new drug might treat. According to their input features, we divide the latest DL models used to predict DTIs into three categories: drug-based models, structure (graph)-based models, and drug-protein(disease)-based models.

DL models used for predicting the DTIs are grouped into three categories: a drug-based models, b structure (graph)-based models, and c drug-protein(disease)-based models

Drug-based models

Figure 10a shows drug-based models, which assume that a potential drug will be similar to the drugs already known for the target proteins. They predict DTIs using information about the drugs known to act on the target. These models use similarity search strategies, which postulate that structurally similar compounds have similar biological functions (Thafar et al. 2019; Matsuzaka and Uesawa 2019). Such methods have been used for decades to select compounds from vast compound libraries, relying on massive computational jobs or manual calculation. Deep neural network models are gradually narrowing the gap between in silico prediction and empirical study, and DL technology can shorten these time-consuming procedures and manual operations.
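The similarity search principle behind drug-based models can be illustrated with a minimal sketch. It assumes each compound is represented by the set of bit positions set in a binary fingerprint and uses the Tanimoto coefficient as the similarity measure; the fingerprints, compound names, and the `rank_candidates` helper are hypothetical and not taken from any of the cited studies.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) coefficient between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Hypothetical fingerprints: each compound is the set of bit indices that are set.
known_actives = {
    "drug_A": {1, 4, 7, 9, 12},
    "drug_B": {2, 4, 8, 9, 15},
}
candidates = {
    "cand_1": {1, 4, 7, 9, 13},
    "cand_2": {3, 5, 6, 10, 11},
}

def rank_candidates(candidates, known_actives):
    """Score each candidate by its highest similarity to any known active."""
    scores = {
        name: max(tanimoto(fp, ref) for ref in known_actives.values())
        for name, fp in candidates.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

ranking = rank_candidates(candidates, known_actives)
```

Candidates that share many fingerprint bits with a known active rank first, which is the assumption that structurally similar compounds behave similarly expressed as a scoring rule.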

Researchers can now use deep neural networks to analyze drugs and predict drug-related properties, including bioactivities and physicochemical properties, thanks to benchmark packages such as MoleculeNet (Wu et al. 2018) and DeepChem. As a result, basic neural networks such as MLPs and CNNs have been used in numerous drug-based DL approaches (Zeng et al. 2020; Yang et al. 2019; Liu et al. 2017). ADMET investigations often focused on the representation power of molecular descriptors rather than on the model itself (Zhai et al. 2018; Liu et al. 2017; Kim et al. 2016; Tang et al. 2014). Hirohara et al. trained a CNN model on SMILES strings and then used the learned features to discover motifs, that is, significant substructures such as protein-binding sites or unknown functional groups (Hirohara et al. 2018). Wenzel et al. (2019) employed atom pairs and pharmacophoric donor–acceptor pairs as descriptors in multi-task deep neural networks to predict microsomal metabolic liability. Gao et al. (2019) compared six kinds of 2D fingerprints for predicting protein–drug affinity using ML methods such as RF, single-task DNN, and multi-task DNN models. Matsuzaka and Uesawa (2019) trained a CNN model on 2D images of 3D chemical compounds to predict constitutive androstane receptor agonists. They obtained the best performance with snapshots of a 3D ball-and-stick model taken at various angles or coordinates, and the method outperformed seven common prediction methods based on 3D chemical structures.

Since the development of the GCN, drug-related GCN models have built graph representations of molecules that incorporate chemical structure information by aggregating the properties of adjacent atoms (Gilmer et al. 2017).
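As a rough illustration of this neighbor aggregation, the sketch below runs one message-passing step on a toy three-atom graph and then sums the atom states into a molecule-level vector. The adjacency matrix, the two-dimensional atom features, and the function names are assumptions made for illustration; learned weight matrices and nonlinearities, which real GCNs apply at every step, are deliberately omitted.

```python
import numpy as np

# Toy molecular graph for a 3-atom chain (e.g. C-C-O); features are hypothetical.
adjacency = np.array([[0, 1, 0],
                      [1, 0, 1],
                      [0, 1, 0]], dtype=float)
atom_features = np.array([[1.0, 0.0],   # carbon
                          [1.0, 0.0],   # carbon
                          [0.0, 1.0]])  # oxygen

def aggregate_neighbors(adjacency, features):
    """One message-passing step: each atom sums its own and its neighbors' features."""
    with_self_loops = adjacency + np.eye(adjacency.shape[0])
    return with_self_loops @ features

def readout(node_states):
    """Sum atom states into a single fixed-size molecule vector."""
    return node_states.sum(axis=0)

hidden = aggregate_neighbors(adjacency, atom_features)
molecule_vector = readout(hidden)
```

Because the readout is a sum over atoms, the molecule vector has the same size regardless of how many atoms the molecule contains.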

GCNs have been employed as 3D descriptors instead of SMILES strings in many studies, which found that these learned descriptors outperform standard descriptors in prediction tests and are easier to interpret (Shin et al. 2019; Ozturk et al. 2018; Yu et al. 2019). Chemi-net employed GCN models to represent molecules and compared the performance of single-task and multi-task DNNs on its authors' own QSAR datasets (Liu et al. 2019a). Yang et al. (2019) introduced a more advanced model, the directed message passing neural network (D-MPNN), which uses a directed message-passing paradigm. They tested their approach on 19 publicly available and 16 privately held datasets and found that the D-MPNN models outperformed previous models in most cases. In two datasets, they underperformed and were not as robust as typical 3D descriptors when the sample was small or imbalanced. The D-MPNN model was then employed by another research group to correctly predict an antibiotic named halicin, which demonstrated bactericidal effects in mouse models (Stokes et al. 2020). This was the first case in which an antibiotic was found by using DL methods to explore a large-scale chemical space that current experimental methodologies cannot afford to screen. The application of attention-based graph neural networks is another interesting contemporary method (Sun et al. 2020a). Because edge properties can alter a molecule's graph representation, edge weights and node features can be learned together. Accordingly, Shang et al. suggested a multi-relational GCN with edge attention (Shang et al. 2018). They built a dictionary of attention weights for each edge. Because the dictionary is shared across the molecule, the approach can handle a wide range of input sizes.

On the Tox21 and HIV benchmark datasets, they found that this model performed better than the random forest model, which suggests that the model can effectively learn pre-aligned features from the inherent properties of the molecular graph. Withnall et al. (2020) extended the MPNN model with an attention mechanism, yielding the attention MPNN (AMPNN), in which the message-passing step uses a weighted summation. Moreover, they extended the D-MPNN model with the same attention mechanism as the AMPNN and termed the result the edge memory neural network (EMNN). Although it is computationally more intensive than other models, this model fared better than others on the uniformly missing data from the maximal unbiased validation (MUV) benchmark.

Structure (graph)-based models

Unlike the drug-based models, the structure-based models in Fig. 10b take both protein target and drug information as input. Typical molecular docking simulation methods aim to predict geometrically feasible binding between drugs and proteins with known tertiary structures. Both the drug and the target can be expressed as sequences, of atoms and of amino acid residues respectively. Sequence-based descriptors were selected because DL approaches can be applied to them directly, with minimal pre-processing of the input data.

DeepDTA, proposed by Ozturk et al. (2018), outperformed conventional ML approaches such as KronRLS (Nascimento et al. 2016) and SimBoost (Tong et al. 2017) using only sequence information, with a CNN model applied to SMILES strings and amino acid sequences. The Davis kinase binding affinity dataset (Davis et al. 2011) and the KIBA dataset (Sun et al. 2020a) were used in that study. Wen et al. used common and basic features, such as ECFPs and protein sequence composition descriptors, and trained on them with semi-supervised learning via a deep belief network (Wen et al. 2017). Another study, DeepConv-DTI, built a deep CNN model using only an RDKit Morgan fingerprint and protein sequences (Lee et al. 2019). Its authors also used the pooled convolution results to capture local residue patterns of target protein sequences, which yielded high values for critical protein regions such as actual binding sites.

In structure-based regression models, a scoring function ranks protein–drug interactions from their 3D structures. It is parameterized on the training data and used to predict binding affinity values or binding pocket sites of the target proteins. AtomNet incorporated the 3D structural features of protein–drug complexes into CNNs (Wallach et al. 2015). Its authors placed fixed-size 3D grids (i.e., voxels) over protein–drug complexes, with every cell in the grid representing structural properties at that position. Since then, several researchers have examined deep CNN models that use voxels to predict binding pocket location or binding affinity (Wang et al. 2020b; Ashburner et al. 2000; Zhao et al. 2019). Compared with common docking approaches such as AutoDock Vina (Trott and Olson 2010) or Smina (Koes et al. 2013), these models have shown improved performance. This is because CNN models can be trained even with large input sizes and are relatively robust to noise in the input data.
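The voxel representation can be sketched as follows. This is an illustrative occupancy grid only, not AtomNet's actual featurization: the grid size, resolution, atom-type channels, and the `voxelize` function are all assumptions, and each cell simply counts the atoms of a given type that fall inside it.

```python
import numpy as np

def voxelize(coords, channels, n_channels, grid_size=8, resolution=1.0):
    """Count atoms per voxel, with one channel per (hypothetical) atom type.

    coords: (n_atoms, 3) positions relative to the grid center, in angstroms.
    channels: (n_atoms,) integer atom-type index for each atom.
    """
    grid = np.zeros((n_channels, grid_size, grid_size, grid_size))
    half = grid_size * resolution / 2.0
    # Shift coordinates so the box starts at 0, then convert to cell indices.
    idx = np.floor((np.asarray(coords) + half) / resolution).astype(int)
    for (i, j, k), c in zip(idx, channels):
        if 0 <= i < grid_size and 0 <= j < grid_size and 0 <= k < grid_size:
            grid[c, i, j, k] += 1.0  # atoms outside the box are ignored
    return grid

# Three hypothetical atoms; the last one lies outside the 8-angstrom box.
coords = [[0.2, 0.2, 0.2], [-1.5, 0.0, 0.0], [10.0, 0.0, 0.0]]
channels = [0, 1, 0]
grid = voxelize(coords, channels, n_channels=2)
```

The resulting array has a fixed shape for every complex, which is what allows a 3D CNN to consume it directly.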

Many DTI investigations using GCNs in structure-based approaches have been reported (Feng et al. 2018; Liu et al. 2016). Feng et al. (2018) used both ECFPs and GCNs as drug features. On the Davis et al. (2011), Metz et al. (2011), and KIBA (Tang et al. 2014) benchmark datasets, their methods outperformed prior models such as KronRLS (Nascimento et al. 2016) and SimBoost (Tong et al. 2017). However, they acknowledged that their GCN model could not beat their ECFP model because of time and resource constraints in implementing the GCN. In a different DTI study, Torng et al. employed an unsupervised graph model to learn fixed-size representations of protein binding sites (Torng and Altman 2019). The pre-trained GCN model was then used to train the protein pocket GCN, while the drug GCN model was trained on automatically generated features. They concluded that their model effectively captured protein–drug binding interactions without relying on target–drug complexes.

Because the attention mechanism gives a model properties that make it interpretable, attention-based DTI prediction approaches have emerged (Hirohara et al. 2018; Liu et al. 2016; Perozzi et al. 2014).

Gao et al. (2017) employed compressed vectors with LSTM RNNs for protein sequences and a GCN for drug structures. They concentrated on demonstrating their method's capacity to deliver biological insights into DTI predictions. To do so, two-way attention mechanisms were employed to calculate the binding of drug–target pairs (DTPs), allowing flexible interpretation of high-level information from target proteins, such as GO terms. Shin et al. (2019) introduced the Molecule Transformer DTI (MT-DTI) approach, which uses the self-attention mechanism for drug representations. Using parameters pre-trained on 97 million PubChem compounds, the MT-DTI model was fine-tuned and evaluated on two publicly available benchmark datasets, Davis (Davis et al. 2011) and KIBA (Tang et al. 2014). However, the attention mechanism was not used to represent the protein targets, because computing it over target sequences would take too long, and pre-training was not possible owing to a lack of target information.
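The self-attention operation referred to above can be sketched as scaled dot-product attention over a short sequence of token embeddings. This is a generic single-head sketch, not the MT-DTI implementation: the embedding sizes, random projection matrices, and function names are assumptions for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, w_q, w_k, w_v):
    """Scaled dot-product self-attention over a sequence of token embeddings."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)  # row i: how much token i attends to each token
    return weights @ v, weights

rng = np.random.default_rng(0)
tokens = rng.normal(size=(5, 8))   # 5 hypothetical SMILES tokens, 8-dim embeddings
w_q, w_k, w_v = (rng.normal(size=(8, 4)) for _ in range(3))
output, attention = self_attention(tokens, w_q, w_k, w_v)
```

The attention matrix grows quadratically with sequence length, which is consistent with the remark that applying the mechanism to long protein sequences is costly.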

On the other hand, AttentionDTA, presented by Zhao et al., incorporates an attention mechanism into a CNN model to establish weighted connections between drug and protein sequences (Zhao et al. 2019). They showed that these attention-based drug and protein representations perform well on the affinity prediction task with an MLP model. DeepDTIs used external, experimental DTPs to infer the probability of interaction for any given DTP. Four of the top ten predicted DTIs had previously been identified, and one was found to have a weak binding affinity for the glucocorticoid receptor (Huang et al. 2018). DeepCPI was also used to predict drug–target interactions, and its predicted small-molecule interactions with the glucagon-like peptide 1 receptor, the glucagon receptor, and the vasoactive intestinal peptide receptor were tested experimentally (Wan et al. 2019).

Drug–protein(disease)-based models

According to polypharmacology, most drugs have multiple effects on both primary and secondary targets. The biological networks involved, as well as the drug's dose, influence these effects. As a result, the drug–protein(disease)-based models shown in Fig. 10c are particularly useful when evaluating protein promiscuity or drug selectivity (Cortes-Ciriano et al. 2015). Furthermore, multi-task neural networks are well suited to learning the properties of many sorts of data simultaneously (Camacho et al. 2018). Several DL model applications, such as those based on drug-induced gene-expression patterns and DTI-related heterogeneous networks, leverage relational information from distinct views. A network-based strategy employs heterogeneous networks that include various node and edge types (Luo et al. 2017; David et al. 2019). The nodes in these networks have a local similarity, which is a significant aspect of these models. Given a similarity network with drugs as nodes and drug–drug similarity values as edge weights, DTIs can be predicted from the network's connections and topological features. Machine learning techniques that use heterogeneous networks as prediction frameworks include the support vector machine (Bleakley and Yamanishi 2009; Keum and Nam 2017), the regularized least squares (RLS) model (Liu et al. 2016; Xia et al. 2010; Hao et al. 2016), and the random walk with restart model (Lian et al. 2021; Nascimento et al. 2016). Owing to the increased interest in DL technologies, network-based DTI prediction studies have employed DL to improve association prediction by evaluating the topological similarity of bipartite (drug and target) and tripartite (drug, target, and disease) linked networks (Hassan-Harrirou et al. 2020; Lamb et al. 2006; Korkmaz 2020; Townshend et al. 2012; Vazquez et al. 2020). Zong et al. (2017) used the DeepWalk approach to collect local latent information, compute topology-based similarity in tripartite networks, and demonstrate the technology's promise as a drug repurposing solution.
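Of the classical network-based techniques listed above, random walk with restart is easy to sketch. The example below is a generic version on a toy drug–target graph, not the implementation of any cited study: the graph, the restart probability of 0.3, and the function name are assumptions chosen for illustration.

```python
import numpy as np

def random_walk_with_restart(adjacency, seed, restart=0.3, tol=1e-10, max_iter=1000):
    """Stationary visiting probabilities of a walker that restarts at `seed`."""
    col_sums = adjacency.sum(axis=0, keepdims=True)
    # Column-normalize so each column is a probability distribution over next nodes.
    transition = adjacency / np.where(col_sums == 0, 1.0, col_sums)
    p0 = np.zeros(adjacency.shape[0])
    p0[seed] = 1.0
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1 - restart) * (transition @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p

# Hypothetical graph: nodes 0-1 are drugs, nodes 2-3 are targets.
# Known interactions: drug0-target2, drug1-target2, drug1-target3.
adjacency = np.array([[0, 0, 1, 0],
                      [0, 0, 1, 1],
                      [1, 1, 0, 0],
                      [0, 1, 0, 0]], dtype=float)
scores = random_walk_with_restart(adjacency, seed=0)
```

Starting from drug 0, target 3 receives a nonzero score even though no direct edge connects them, because it is reachable through the shared target and drug 1. Such scores are what a network-based method would rank to propose unknown interactions.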

Some network-based DTI prediction studies used relationship-based features obtained by training an AE. Zhao et al. (2020) developed a DTI-CNN prediction model that uses low-dimensional but rich deep features learned from a heterogeneous network with the stacked AE technique. To construct the topological similarity matrix of drugs and targets, Wang et al. used a deep AE and positive pointwise mutual information (PPMI) in their analysis (Wang et al. 2020b). In another investigation, Peng et al. (2020) employed a denoising autoencoder to select network-based features and reduce the representation dimensions.
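The pointwise information measure mentioned for Wang et al. is read here as positive pointwise mutual information (PPMI), which is an assumption about the cited method. Under that assumption, a minimal sketch of turning a co-occurrence count matrix into a PPMI matrix looks as follows; the matrix values and the `ppmi` function name are hypothetical.

```python
import numpy as np

def ppmi(cooccurrence):
    """Positive pointwise mutual information of a co-occurrence count matrix.

    PMI(i, j) = log( p(i, j) / (p(i) * p(j)) ); negative and undefined values become 0.
    """
    counts = np.asarray(cooccurrence, dtype=float)
    total = counts.sum()
    row = counts.sum(axis=1, keepdims=True)
    col = counts.sum(axis=0, keepdims=True)
    expected = row @ col / total  # counts expected if rows and columns were independent
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(counts / expected)
    pmi[~np.isfinite(pmi)] = 0.0
    return np.maximum(pmi, 0.0)

# Hypothetical co-occurrence counts between two nodes of a network.
matrix = ppmi([[2, 0],
               [0, 2]])
```

Pairs that co-occur more often than independence would predict get a positive weight, and all other pairs are set to zero, which yields a sparse matrix suitable as autoencoder input.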

By learning to denoise, the denoising autoencoder copes with high-dimensional, noisy, and incomplete input data, allowing the encoder to learn more robustly. These approaches, however, have the drawback that it is difficult to make predictions for new drugs or targets, which is known as the "cold start" problem in recommendation systems (Bedi et al. 2015). The size and form of the network have a big impact on these models, so if the network is not big enough, they cannot capture the drugs or targets that are not in the network (Lamb et al. 2006).

Various investigations have also used gene expression patterns as chemogenomic features to predict DTIs. This research presumes that drugs with similar expression patterns have similar effects on the same targets (Hizukuri et al. 2015; Sawada et al. 2018).

The revised version of CMAP, the LINCS-L1000 database, has been integrated into DL-based DTI models in recent works (Subramanian et al. 2017; Thafar et al. 2020; Karpov et al. 2020; Arus-Pous et al. 2020). Xie et al. developed a binary classification model using a deep neural network, based on the LINCS drug perturbation and gene knockout data (Xie et al. 2018).

On the other hand, Lee and Kim used expression signature genes as drug and target features. They used node2vec to train on the rich data by examining three elements of protein function, including pathway-level memberships and PPI (Lee and Kim 2019). In DTIGCCN, Shao and Zhang employed a GCN model to extract drug and target features from LINCS data and a CNN model to predict DTPs by extracting latent features (Shao et al. 2020). The Gaussian kernel function was found to aid in the production of high-quality graphs, and as a result, this hybrid model scored better on classification tests.

DeepDTnet employs a heterogeneous drug–gene–disease network that embeds fifteen types of chemical, genomic, phenotypic, and cellular network properties to identify known drug targets. DeepDTnet predicted topotecan as a new direct inhibitor of the human retinoic acid receptor-related orphan receptor, and this prediction was confirmed experimentally (Zeng et al. 2020).

Drug sensitivity and response prediction using DL

Drug response is the clinical outcome of treatment with the drug of interest (https://www.sciencedirect.com/topics/drug-response). The main idea of drug response prediction is shown in Fig. 11. The DL method takes the heterogeneous network of drug and protein interactions as input and predicts the response scores. Despite the widespread use of deep neural network (DNN) approaches in various domains, including related fields such as computational chemistry (Gómez-Bombarelli et al. 2018), DNNs have only lately made their way into drug response prediction. This is due to the typically low ratio of samples to measurements per sample, which makes traditional feedforward neural networks unsuitable: overparameterization, overfitting, and poor generalization are common outcomes on such datasets. However, more public data has become available recently, and newly built DNN models have shown promise. This section therefore summarizes current DL computational problems and drug response prediction breakthroughs.

Drug binding with proteins and drug sensitivity (response) scores prediction

Neural networks have been used to predict drug response since the 1990s. El-Deredy et al. (1997) showed that a neural network trained on tumor nuclear magnetic resonance (NMR) spectra can predict drug response in gliomas and offer information on the metabolic pathways involved in drug response.

In 2018, Chang et al. (2018) created the CDRscan model, which uses a CNN architecture trained on 1000 drug response experiments per compound. Compared with traditional ML algorithms such as RF and SVM, their model performed much better. CDRscan's ability to incorporate genomic data and molecular fingerprints is one of the reasons it outperformed these baseline models. Furthermore, its convolutional design has proven useful in various machine learning areas. An autoencoder is a neural network that compresses its input and then attempts to reconstruct the original data from the compressed form. As shown by Way and Greene (2018), this is very useful for feature extraction: they condensed a gene expression profile of 5000 dimensions into at most 100 dimensions, some of which corresponded to significant characteristics such as the patient's sex or melanoma status. Using variational autoencoders, Dincer et al. (2018) created DeepProfile, a technique that learns an eight-dimensional representation of gene expression in AML patients, which is then fitted with a Lasso linear model for treatment response prediction and gives better results than a model without feature extraction.

Ding et al. ( 2018 ) proposed a deep autoencoder model for representation learning of cancer cells from input data consisting of gene expression, CNV, and somatic mutations.

In 2019, MOLI (Multi-omics Late Integration) (Sharifi-Noghabi et al. 2019) was proposed as a deep learning model that incorporates multi-omics data and somatic mutations to characterize a cell line. Three separate subnetworks of MOLI learn representations for each type of omics data. A final network classifies a cell as responder or non-responder based on the concatenated features. These methods share two characteristics: they integrate multiple input data types (multi-omics), and they treat drug response as a binary classification. Although combining several forms of omics data can improve the learning of cell line status, it may limit the method's applicability to other cell lines or patients because the model requires data beyond gene expression.

Furthermore, a certain threshold of the IC50 values should be set before binary classification of the drug response, which may vary depending on the experimental condition, such as drug or tumor types. Twin CNN for drugs in SMILES format (TCNNS) (Liu et al. 2019b ) takes a one-hot encoded representation of drugs and feature vectors of cell lines as the inputs for two encoding subnetworks of a One-Dimensional (1D) CNN. One-hot encodings of drugs in TCNNS are Simplified Molecular Input Line Entry System (SMILES) strings which describe a drug compound's chemical composition. Binary feature vectors of cell lines represent 735 mutation states or CNVs of a cell. KekuleScope (Cortés-Ciriano and Bender 2019 ) adopts transfer learning, using a pre-trained CNN on ImageNet data. The pre-trained CNN is trained with images of drug compounds represented as Kekulé structures to predict the drug response.
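The one-hot encoding of SMILES strings described for TCNNS can be sketched as below. This is a simplified character-level version under stated assumptions: the alphabet, the maximum length, and the `one_hot_smiles` function are hypothetical, and multi-character tokens such as `Cl` or `Br` would need a proper tokenizer in practice.

```python
import numpy as np

def one_hot_smiles(smiles, alphabet, max_len):
    """Encode a SMILES string as an (alphabet size x max_len) binary matrix.

    Column p has a single 1 in the row of the character at position p;
    positions beyond the end of the string stay all-zero (padding).
    """
    index = {ch: i for i, ch in enumerate(alphabet)}
    matrix = np.zeros((len(alphabet), max_len), dtype=np.float32)
    for pos, ch in enumerate(smiles[:max_len]):
        matrix[index[ch], pos] = 1.0  # raises KeyError for characters outside the alphabet
    return matrix

alphabet = ["C", "O", "N", "(", ")", "="]   # hypothetical, far smaller than a real one
encoded = one_hot_smiles("CCO", alphabet, max_len=8)   # ethanol
```

A 1D CNN can then slide its filters along the position axis of this matrix, treating the alphabet rows as input channels.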

Yuan et al. (2019) proposed GNNDR, a GNN-based technique that has a high learning capacity and predicts drug response by combining protein–protein interaction (PPI) information with genomic characteristics. The value of including protein information was demonstrated empirically, and the proposed method offers a viable avenue for the discovery of anti-cancer drugs. Rampášek et al. (2019) examined semi-supervised variational autoencoders for the prediction of monotherapy response. In contrast to many conventional ML methodologies, they developed a joint model for predicting drug response that took advantage of gene expression before and after treatment in cell lines, and it showed improved performance on a variety of FDA-approved drugs. Chiu et al. (2019) trained a deep drug response predictor after pre-training autoencoders on mutation data and expression features from the TCGA dataset. The use of pre-training distinguishes their strategy from others. Rather than relying only on the labeled gene expression profiles obtained from drug response experiments, the pre-training process can use unlabeled data from outside sources such as TCGA, which significantly increases the number of samples available and improves performance.

Chiu et al. (2019) and Li et al. (2019) combined autoencoders with deep neural networks to predict drug responses in cell lines and in genomically characterized malignancies. To predict cell line responses to drug combinations, the work cited as https://string-db.org/cgi/download.pl?sessionId=uKr0odAK9hPs used deep neural encoders to link genetic characteristics with drug profiles.

In 2020, Wei et al. (2020) predicted drug risk levels based on adverse drug reactions (ADRs), using SMOTE and machine learning techniques in their studies. The proposed framework was used to investigate the mechanism of ADRs, to estimate degrees of drug risk, and to support decision-making during the changeover from prescription to over-the-counter drugs. They demonstrated that the best combination built with this architecture was PRR-SMOTE-RF and that its macro-ROC curve indicated strong classification performance. They suggested that this framework could be used by drug regulatory organizations, including the FDA and CFDA, as a simple but dependable method for ADR signal detection and drug classification, as well as an auxiliary basis for experts deciding on the status change of Rx drugs to OTC drugs. They proposed that more ML or DL classification algorithms be tested in the future and that computational complexity be factored into the comparison. Kuenzi et al. (2020) built DrugCell, an interpretable DL model of human cancer cells trained on the responses of 1235 tumor cell lines to 684 drugs. Tumor genotypes induce states in cellular subsystems, which are combined with drug structure to predict therapeutic outcome while also learning the molecular mechanisms underlying the response. Predictions made by DrugCell in cell lines are accurate and help to stratify clinical outcomes. Analysis of DrugCell mechanisms led to the design of synergistic drug combinations, which were tested using combinatorial CRISPR, in vitro drug–drug screening, and patient-derived xenografts. DrugCell provides a blueprint for building interpretable predictive medicine models.
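SMOTE, the oversampling step named in the PRR-SMOTE-RF combination, creates synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbors. The following is a minimal sketch of that idea, not the configuration used by Wei et al.: the data, the value of `k`, and the `smote` function are assumptions for illustration.

```python
import numpy as np

def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic samples by interpolating between minority neighbors."""
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(minority))
        dists = np.linalg.norm(minority - minority[i], axis=1)
        neighbors = np.argsort(dists)[1:k + 1]   # skip index 0, the sample itself
        j = rng.choice(neighbors)
        gap = rng.random()                       # position along the segment i -> j
        synthetic.append(minority[i] + gap * (minority[j] - minority[i]))
    return np.array(synthetic)

# Hypothetical minority-class feature vectors (e.g. high-risk drugs).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_samples = smote(minority, n_new=4)
```

Each synthetic point lies on a line segment between two real minority samples, so the class is enlarged without simply duplicating existing rows before a classifier such as RF is trained.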

Artificial Neural Networks (ANNs) that operate on graphs as inputs are known as Graph Neural Networks (GNNs). Deep GNNs were recently employed for learning low-dimensional representations of biomolecular networks (Hamilton 2020; Wu et al. 2020). Ahmed et al. (2020) developed GNN models with two separate GNN methods, using gene expression (GE) and a gene co-expression network, that is, a network that represents the relationship between the expression of gene pairs.

The CNN is one of the neural network models adopted for drug response prediction. The CNN has been actively used for image, video, text, and sound data because of its strong ability to preserve the local structure of data and learn hierarchies of features. In 2021, several methods were developed for drug response prediction, each of which uses different input data for prediction (Baptista et al. 2021).

Nguyen et al. (2021) proposed a method to predict drug response called GraphDRP, which integrates two subnetworks for drug and cell line features, like the CNN in Liu et al. (2019b) and Qiu et al. (2021). Using gene expression data from cancer cell lines and drug response data, the authors identified predictor genes for drugs of interest and provided a reliable and accurate drug response prediction model. After pre-selecting genes with the Pearson correlation coefficient, they employed the ElasticNet regression model to predict drug response and refine the gene selection. To obtain a more trustworthy collection of predictor genes, they ran a regression on each drug twice, once using the IC50 and once using the area under the curve (AUC, or activity area). The Pearson correlation coefficient for each of the 12 drugs they examined was greater than 0.6. With 17-AAG, IC50 has the highest Pearson correlation coefficient of 0.811. In contrast, AUC has the highest Pearson correlation coefficient of 0.81.

Even though the model developed in this study has excellent predictive performance on GDSC, it still has certain flaws. First, the properties of cancer cell lines may differ significantly from those of in vivo malignancies, and it remains to be determined whether the model will be useful in a clinical trial. Second, the authors primarily use gene expression data to predict drug response, whereas drug response is influenced not only by gene expression levels but also by structural changes such as gene mutations. To improve the prediction capacity of the model, more research is needed to use such data and integrate it into the model.
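The Pearson-based gene pre-selection step can be illustrated with a small sketch. The data below are synthetic, the `preselect_genes` function and the choice of `top_k` are hypothetical, and the subsequent ElasticNet fit on the selected genes is not shown.

```python
import numpy as np

def preselect_genes(expression, response, top_k):
    """Rank genes by |Pearson r| with the drug response and keep the top_k.

    expression: (n_cell_lines, n_genes) matrix; response: (n_cell_lines,) vector,
    e.g. IC50 or AUC values. Returns the selected gene indices and all r values.
    """
    x = expression - expression.mean(axis=0)
    y = response - response.mean()
    r = (x * y[:, None]).sum(axis=0) / np.sqrt((x ** 2).sum(axis=0) * (y ** 2).sum())
    order = np.argsort(-np.abs(r))
    return order[:top_k], r

# Synthetic example: 50 cell lines, 6 genes; gene 3 drives the response by construction.
rng = np.random.default_rng(0)
expression = rng.normal(size=(50, 6))
response = 3.0 * expression[:, 3] + 0.1 * rng.normal(size=50)
selected, r = preselect_genes(expression, response, top_k=2)
```

Running the selection once per response measure and comparing the selected sets would mirror the idea of regressing each drug twice to obtain a more trustworthy gene list.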

In 2022, Ren et al. (2022) proposed deep-learning-based graph regularized matrix factorization (DeepGRMF), which integrates neural networks, graph models, and matrix-factorization approaches to predict cell response to drugs. It uses a variety of information, including drug chemical structures, their effects on cellular signaling mechanisms, and the states of cancer cells. DeepGRMF trains drug embeddings so that drugs with similar structures and mechanisms of action (MOAs) lie close together in the embedding space. Likewise, DeepGRMF learns embeddings for cells so that cells with similar biological states and drug responses are linked. On the Cancer Cell Line Encyclopedia (CCLE) and Genomics of Drug Sensitivity in Cancer (GDSC) datasets, DeepGRMF outperforms competing models in prediction performance. On the Cancer Genome Atlas (TCGA) dataset, the proposed model could predict the effectiveness of a treatment plan on lung cancer patients' outcomes. The limited expressiveness of the authors' VAE-based chemical structure representation may explain why prediction for new cell lines is more accurate than drug sensitivity prediction for new drugs. Graph neural networks have recently been shown to represent chemical structures better and could be investigated in the future. Pouryahya et al. (2022) proposed a new network-based clustering approach for predicting drug response based on optimal mass transport (OMT) theory. Gene-expression profiles and cheminformatic drug characteristics were used to cluster cell lines and drugs, and data networks were used to represent the data. An RF model was then trained for each pair of cell-line and drug clusters. Prediction models built on such clusters of homogeneous data are expected to improve the accuracy of drug sensitivity prediction and its biological interpretability.

Drug–drug interactions (DDIs) side effect prediction using DL

Drugs are chemical compounds that people consume and that interact with protein targets to produce a change. Drugs may alter the human body positively or negatively. Drug side effects are the undesirable changes that drugs cause in the human body. These adverse effects range from moderate headaches to life-threatening reactions such as cardiac arrest, malignancy, and death, and they differ depending on the person's age, gender, stage of illness, and other factors (Kuijper et al. 2019). In the laboratory, several tests are conducted to determine whether drugs have unfavorable side effects. However, these tests are both expensive and time-consuming. Recently, many computational algorithms for detecting adverse drug effects have been created, and computational methodologies are replacing laboratory experiments.

On the other hand, these methods do not provide adequate data to predict drug–drug interactions (DDIs). The phenomenon of DDIs is illustrated in Fig. 12. A drug's entire effect on the human body consists of the desired effects resulting from its interaction with the intended target and the undesirable effects arising from its interactions with off-targets. Even though a drug has a strong binding affinity for one target, it also binds to several other proteins with varied affinities, which might cause adverse consequences (Liu et al. 2021). Predicting DDIs can help reduce the likelihood of adverse reactions and optimize the drug development and post-market monitoring processes (Arshed et al. 2022). Side effects of DDIs are often regarded as the leading cause of drug failure in pharmacological development. When drugs have major side effects, they are quickly withdrawn from the market. As a result, predicting side effects is a fundamental requirement in the drug discovery process, both to keep drug development costs and timelines in check and to launch drugs that benefit patient recovery.

Drug binding with proteins and DDI side effects

Furthermore, the average drug research and development cost is $2.6 billion (Liu et al. 2019). As a result, determining the possibility of negative consequences is important for lowering the expense and risk of drug development, and researchers use various computational tools to speed up the process. In pharmacology and clinical application, DDI prediction is a difficult problem, and correctly detecting possible DDIs in clinical studies is crucial for patients and the public. Researchers have recently achieved a series of successes using deep learning as an AI technique to predict DDIs from drug structural properties and graph theory (Han et al. 2022). AI has successfully detected potential drug interactions, allowing doctors to make informed decisions before prescribing drug combinations to patients with complex or multiple conditions (Fokoue et al. 2016).

Therefore, this section comprehensively reviews the researchers' most popular DL algorithms to predict DDIs.

In 2016, Fokoue et al. (2017) proposed Tiresias, a framework for discovering DDIs. Tiresias takes a large amount of drug-related data as input and generates DDI predictions. The approach begins by semantically integrating the input data into a knowledge graph that represents drug properties and interactions together with additional entities such as enzymes, chemical structures, and pathways. Numerous similarity measures between drugs are then computed from the knowledge graph in a scalable, distributed setting. A large-scale logistic regression model uses the computed similarity measures to predict DDIs. According to the findings, Tiresias helps identify new interactions both among existing drugs and between newly developed and existing drugs. Its main drawback is the need for large-scale drug data, which makes the model costly.

In 2017, Reza et al. (2017) developed a computational technique for predicting DDIs based on functional similarities among drugs. The model was built on several major biological elements: carriers, enzymes, transporters, and targets (CETT). The approach was applied to 2189 approved drugs, for which the associated CETTs were obtained and encoded as binary vectors used to find DDIs. In total, 2,394,767 potential drug–drug interactions were assessed, and over 250,000 previously unidentified possible DDIs were found. Among the similarity measures used, inner product-based similarity measures (IPSMs) gave good predictive values for detecting DDIs. A key limitation of this approach was the lack of pharmacological data, which led to erroneous detections among the potential DDI pairs.
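The idea of encoding each drug's CETT profile as a binary vector and comparing two drugs by an inner product can be sketched as follows. This is a minimal illustration, not the authors' implementation: the feature vocabulary, the two toy drugs, and the function names `binary_profile` and `inner_product_similarity` are assumptions introduced here, and the unnormalized inner product is only one member of the IPSM family.

```python
def binary_profile(drug_features, vocabulary):
    """Encode a drug's CETT set as a 0/1 vector over a fixed feature vocabulary."""
    return [1 if feature in drug_features else 0 for feature in vocabulary]


def inner_product_similarity(u, v):
    """Unnormalized inner product: counts the CETT features two drugs share."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical toy data (not taken from the reviewed study).
vocabulary = ["CYP3A4", "CYP2D6", "P-gp", "ALB"]
drug_a = {"CYP3A4", "P-gp"}
drug_b = {"CYP3A4", "CYP2D6", "P-gp"}

vec_a = binary_profile(drug_a, vocabulary)
vec_b = binary_profile(drug_b, vocabulary)
score = inner_product_similarity(vec_a, vec_b)
```

A higher score means that the two drugs share more carriers, enzymes, transporters, or targets; in practice the score would usually be normalized before being thresholded.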

In 2018, Ryu et al. (2018) proposed a model that predicts more DDI types using the drugs' chemical structures as inputs and applied multi-task learning to DDI type prediction. In the same vein, Decagon (Zitnik et al. 2018) models polypharmacy side effects using a relational GNN. To learn representations of intricate nonlinear pharmacological interactions, Chu et al. (2018) used an auto-encoder for factorization. To predict DDIs, Liu et al. (2019c) presented DDI-MDAE, a multimodal deep auto-encoder based on a shared latent representation. Recently, interest in using graph neural networks (GNNs) to predict DDIs has increased. Different algorithms for aggregating the feature vectors of a node's neighbors lead to different versions of GNNs. Asada et al. (2018) used a graph convolutional network (GCN) to encode molecular structures in order to extract DDIs from text. Furthermore, Ma et al. (2018) incorporated attentive multi-view graph auto-encoders into a unified model.

Chen (2018) devised a model for predicting adverse drug reactions (ADRs). The predictive model used support vector machine (SVM), logistic regression (LR), random forest (RF), and gradient-boosted tree (GBT) classifiers. It employed the DEMO dataset, which contains attributes such as the patient's age, weight, and sex, and the DRUG dataset, which includes features such as the drug's name, role, and dosage. Males make up 46% of the sample and females 54%. The model achieved fair prediction accuracy on a representative sample set. However, the results showed that the model is accurate only when a large amount of data is available.

To predict possible DDIs, Kastrin et al. (2018) employed statistical learning approaches. The DDIs were represented as a complex network, with nodes representing drugs and links representing their potential interactions. Link prediction on the DDI networks was framed as a binary classification task. A large DDI database was randomly selected for prediction. Several supervised and unsupervised ML approaches, such as SVM, classification trees, boosting, and RF, were applied for edge prediction in various DDI networks. Compared with unsupervised techniques, the supervised link prediction strategy produced encouraging results. To detect links between drugs, the method requires Unified Medical Language System (UMLS) filtering, which posed a difficulty for the authors. Furthermore, the system only considers static network snapshots, which is problematic because the DDI network is a dynamic system.
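To make the link-prediction framing concrete, the sketch below scores every pair of drugs that is not yet connected in a toy DDI network by the Jaccard coefficient of their neighbour sets, which is one common unsupervised link-prediction score. It is an illustrative assumption rather than the procedure of Kastrin et al.; the drug labels, the edge list, and the function names are hypothetical.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {drug: set of interacting drugs}."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def jaccard_link_scores(edges):
    """Score each unconnected drug pair by the overlap of its neighbour sets."""
    adjacency = build_adjacency(edges)
    known = {frozenset(edge) for edge in edges}
    scores = {}
    for a, b in combinations(sorted(adjacency), 2):
        if frozenset((a, b)) in known:
            continue  # already a known interaction, nothing to predict
        union = adjacency[a] | adjacency[b]
        shared = adjacency[a] & adjacency[b]
        scores[(a, b)] = len(shared) / len(union) if union else 0.0
    return scores


# Hypothetical toy network: nodes are drugs, edges are known interactions.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
scores = jaccard_link_scores(edges)
```

In a supervised setting, scores of this kind would serve as input features to a binary classifier (interaction versus no interaction) rather than being thresholded directly.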

In 2019, Lee et al. (2019) proposed a deep learning system for accurately predicting the effects of DDIs. To learn the pharmacological effects of a variety of DDIs, the method used a combination of auto-encoders and a deep feed-forward neural network, trained with a mix of well-known techniques. The results revealed that using SSP alone improves GSP and TSP prediction accuracy, and that the autoencoder is more powerful than PCA at reducing profile features. In addition, the model outperformed existing approaches and identified numerous novel DDIs. Yue et al. (2020) combined several graph embedding methods for the DDI task, while Karim et al. (2019) modeled DDI prediction as link prediction with the help of a knowledge graph. Andreea and Huang (2019) presented a co-attention deep learning model based solely on side-effect data and molecular drug structure. CASTER (Huang et al. 2020), also based on drug chemical structures, is a dictionary learning framework for predicting DDIs. Chu et al. (2019) proposed using semi-supervised learning to extract meaningful information for DDI prediction from both labeled and unlabeled drug data. Shtar et al. (2019) used a mix of computational techniques to predict drug interactions, including artificial neural networks and graph node factor propagation methods such as adjacency matrix factorization (AMF) and adjacency matrix factorization with propagation (AMFP). The model was trained on a DrugBank release containing 1142 drugs and 45,297 drug–drug interactions and tested on the most recent DrugBank release, with 1442 drugs and 248,146 drug–drug interactions. AMF and AMFP were also used to develop an ensemble-based classifier, and the outcomes were assessed using the receiver operating characteristic (ROC) curve. The findings indicated that the proposed ensemble classifier delivers important information for drug development and noisy information for drug prescription. In addition, the drug embeddings learned while training the models on the interaction network have been made available. To predict adverse drug events caused by DDIs, Hou et al. (2019) proposed a deep neural network architecture. The model is based on a database of 5000 drug codes obtained from DrugBank. Using the computed features, it identifies 80 different types of DDIs. The model, which takes 4432 drug characteristics as input, was built with TensorFlow-GPU.

For medicines used to treat inflammatory bowel disease (IBD), the trained model predicts their interactions with an accuracy of 88 percent. The findings also revealed that the model performs best when many datasets are used. Wang et al. (2019) proposed a DNN model for detecting adverse drug reactions. The model predicts ADRs using chemical, biological, and biomedical knowledge of drugs. Drug data from the SIDER database was also incorporated into the model. The system's performance was improved by using distributed representations of the target drugs in a vector space, obtained with a word-embedding approach, to capture associations between drugs. The system's fundamental flaw was that it only worked well with the standard SIDER database.

In 2020, numerous AI-based methods were developed for DDI event prediction, including methods that evaluate chemical structural similarity using graph neural networks (Huang et al. 2020). Attempts have also been made to predict DDIs from different data sources, for example by using similarity features to build drug features for the DDI event prediction task (Deng et al. 2020).

Bai et al. (2020) proposed a deep learning technique that uses word embeddings, part-of-speech tags, and distance embeddings to perform the DDI extraction task, supporting the drug development cycle and drug repurposing. According to experimental data, the technique can better avoid instance misclassifications with minimal pre-processing. Moreover, the model employs an attention mechanism to emphasize the significance of each hidden state in the Bi-LSTM layers.

Feng et al. (2020) proposed DPDDI, an effective and robust approach that consists of a feature extractor based on a graph convolutional network (GCN) and a predictor based on a DNN. DPDDI predicts potential DDIs using data from the DDI network alone, without considering drug characteristics (i.e., drug chemical and biological properties). DPDDI is a useful tool for predicting DDIs and should also benefit other DDI-related scenarios, such as recognizing unanticipated side effects and guiding drug combinations. The disadvantage of this approach is that it ignores drug characteristics.

Zaikis and Vlahavas (2020) proposed a multi-level GNN framework for predicting links between biological entities. The framework is a bi-level network in which the higher level represents the network of interactions between biological entities, while the lower level represents the individual biological entities, such as drugs and proteins. The proposed model's accuracy, however, still needs to be improved.

In 2021, to address DDI prediction, Lin et al. (2021) proposed an end-to-end system called Knowledge Graph Neural Network (KGNN). KGNN extends spatial GNN algorithms to the knowledge graph by selectively aggregating neighborhood information, allowing it to learn the knowledge graph's topological structure, its semantic relations, and the neighborhoods of drugs and drug-related entities. When multiple medications are used correctly, medical risks are reduced and the benefits of drug synergy are maximized. For multi-typed DDI pharmacological effect prediction, Yue et al. (2021) used knowledge graph summarization. Lyu et al. (2021) also introduced a Multimodal Deep Neural Network (MDNN) framework for DDI event prediction. By applying a graph neural network to the drug knowledge graph, MDNN effectively exploits topological information and semantic relations. MDNN additionally learns a joint representation of structure information and heterogeneous features, which effectively exploits the complementarity across the modalities of the multimodal data. Karim et al. (2019) built a knowledge graph and used CNN and LSTM models to extract local and global pharmacological properties across the network. DANN-DDI is a deep attention neural network framework proposed by Liu et al. (2021). To predict unknown DDIs, it carefully incorporates different pharmacological properties. Chun and Yi-Ping Phoebe (2021) developed a hybrid deep learning (DL) model to provide descriptive predictions of pharmacological adverse reactions. It was one of the first hybrid DL models that could be interpreted. The model includes a graph CNN with conception models to improve the learning efficiency of chemical drug properties and bidirectional long short-term memory (BiLSTM) recurrent neural networks to link drug structure to adverse effects. After the outputs of the two networks (GCNN and BiLSTM) are concatenated, a fully connected network is used to predict pharmacological adverse reactions. Regardless of the classification threshold, the model obtains an AUC of 0.846. It has a precision score of 0.925. Even though a small drug dataset was used for adverse drug reaction (ADR) prediction, the Bilingual Evaluation Understudy (BLEU) scores were 0.973, 0.938, 0.927, and 0.318, indicating considerable achievements. Furthermore, the model can correctly form words to describe pharmacological adverse reactions and link them to the drug's name and molecular structure. The predicted relationship between drug structure and ADRs can guide safety pharmacology research at the preclinical stage and make ADR detection easier early in the drug development process. It can also aid in the detection of unknown ADRs in existing medications. Mohsen and Hossein () proposed a deep neural network model for extracting DDIs from medical literature. This model employs a novel attention mechanism to better distinguish important words from other terms based on word similarity and position relative to the candidate drugs. Before recognizing the type of DDI, the method computes attention weights over the outputs of a bidirectional long short-term memory (Bi-LSTM) model in the deep network architecture. The approach was tested on the standard DDIExtraction 2013 dataset. According to the experimental findings, it achieved an F1-score of 78.30, which is comparable to the best results reported for existing approaches.

In 2022, Pietro et al. (2022) introduced DruGNN, a GNN-based technique for predicting DDI side effects. The prediction is a multi-class, multi-label node classification problem in which each DDI corresponds to a class. To predict the side effects of novel drugs, they use a combined inductive-transductive learning system that takes advantage of drug and gene features (induction path) and knowledge of known drug side effects (transduction path). The entire procedure is adaptable because the underlying machine learning framework can still be used if the graph dataset is enlarged to include more node properties and associations. Zhang et al. (2022) proposed CNN-DDI, a new semi-supervised algorithm for predicting DDIs that uses a CNN architecture. They first extracted interaction features from drug categories, targets, pathways, and enzymes as feature vectors. They then proposed a novel convolutional neural network as a predictor of DDI-related events based on this feature representation. The predictor consists of five convolutional layers, two fully connected layers, and a softmax layer. The results reveal that CNN-DDI is superior to other state-of-the-art techniques, but it takes longer to run. Jing et al. (2022) presented DTSyn, a dual-transformer-based approach that can identify probable cancer drug combinations. It uses a multi-head attention mechanism to extract chemical substructure-gene, chemical-chemical, and chemical-cell-line associations. DTSyn is the first model that incorporates two transformer blocks to extract associations among genes, drugs, and cell lines, allowing a better understanding of drug action mechanisms. Despite DTSyn's excellent performance, its balanced accuracy on independent datasets is still limited. Collecting more training data is expected to solve the problem. Another issue is that the fine-granularity transformer was only trained on 978 signature genes, which could result in some chemical-target interactions being lost.

Furthermore, DTSyn used expression data as the only cell line features. To represent the cell line more fully, additional omics data, including methylation and genetic data, may be added in the future. He et al. (2022) proposed MFFGNN, a new end-to-end learning framework for DDI prediction that can effectively combine information from drug molecular graphs, SMILES sequences, and DDI graphs. The MFFGNN model uses a molecular graph feature extraction module to extract global and local features from molecular graphs.

The authors ran extensive experiments on a variety of real-world datasets. According to the findings, MFFGNN consistently outperforms other state-of-the-art models. Furthermore, the multi-type feature fusion module uses a gating mechanism to limit the amount of neighborhood information passed to each node.

Drug–drug similarity prediction using DL

Drug similarity studies presume that drugs with comparable pharmacological properties have similar mechanisms of action and side effects and are used to treat similar conditions (Brown 2017; Zeng et al. 2019).

Pharmacological similarity between drugs is critical for various purposes, including identifying drug targets, predicting side effects, predicting drug–drug interactions, and repositioning drugs. Chemical structure features (Lu et al. 2017; O’Boyle 2016), protein targets (Vilar 2016; Wang et al. 2014), side-effect profiles (Campillos et al. 2008; Tatonetti et al. 2012), and gene expression profiles (Iorio et al. 2010) together provide a multi-perspective view for identifying similar drugs. Combining them can compensate for gaps in individual data sources and offer fresh perspectives on drug repositioning and other uses. The main idea of drug–drug similarity is presented in Fig. 13. Each vector represents a drug's features, and the links reflect the similarity between two drugs.

Fig. 13 Drug–drug similarity main idea

Drug similarity measures

Similarity estimates are computed from four sources: chemical structure, target protein sequence, target protein function, and drug-induced pathways.

Chemical structure similarity

DrugBank (2019) provides the chemical structures of small-molecule drugs in SDF molecular format. Invalid SDFs, such as those with an NA value or fewer than three columns in the atom or bond blocks, can be recognized and eliminated. For valid compounds, atom pair descriptors can be computed. The pairwise similarity of compounds, δc(di, dj), was evaluated from the atom pairs using the Tanimoto coefficient, which is defined as the number of atom pairs shared by two compounds divided by the number of atom pairs in their union (Eq. 1):

δc(di, dj) = |APi ∩ APj| / |APi ∪ APj|    (1)

where APi and APj are the atom pairs of drugs di and dj, respectively. The numerator is the number of atom pairs common to both compounds, while the denominator is the total number of distinct atom pairs across both compounds.
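The Tanimoto coefficient described above, the number of shared atom pairs divided by the number of atom pairs in the union, can be sketched in a few lines. This is a minimal illustration that assumes each compound's atom pairs are available as a set of hashable descriptors; the string descriptors and the function name `tanimoto` are hypothetical, and real atom-pair descriptors would come from a cheminformatics toolkit.

```python
def tanimoto(atom_pairs_i, atom_pairs_j):
    """Tanimoto coefficient: |AP_i ∩ AP_j| / |AP_i ∪ AP_j| on sets of atom pairs."""
    ap_i, ap_j = set(atom_pairs_i), set(atom_pairs_j)
    union = ap_i | ap_j
    if not union:
        return 0.0  # convention chosen here for two empty descriptor sets
    return len(ap_i & ap_j) / len(union)


# Hypothetical atom-pair descriptors for two toy compounds.
ap_drug_1 = {"C-C:1", "C-O:1", "C-N:2"}
ap_drug_2 = {"C-C:1", "C-O:1", "C-C:2", "O-N:3"}

similarity = tanimoto(ap_drug_1, ap_drug_2)  # 2 shared pairs out of 5 distinct pairs
```

The score ranges from 0 (no atom pairs in common) to 1 (identical atom-pair sets).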

Target protein sequence-based similarity

DrugBank provides target sequences in FASTA format for all small-molecule drugs. The basic dynamic programming approach of Needleman and Wunsch (1970) for global alignment can be used to compare protein sequences pairwise. The percentage of pairwise sequence identity (Raghava 2006) can then be taken as the corresponding sequence similarity. Equation 2 was used to calculate drug–drug similarity based on target sequence similarities:

where δt(di, dj) denotes the target-based similarity between drugs di and dj, Ti is the set of proteins targeted by drug di, Tj is the set of proteins targeted by drug dj, and S(x, y) is a symmetric sequence-based similarity measure between two targeted proteins x ∈ Ti and y ∈ Tj. Overall, Eq. 2 calculates the average of the best matches, in which each target of the first drug is matched only to the most similar target of the second drug, and vice versa.
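The best-match average described above can be sketched as follows. Because Eq. 2 itself is not reproduced in this text, the sketch assumes a common form of the best-match average: the sum of each target's best match in the other drug's target set, taken in both directions, divided by |Ti| + |Tj|. The toy identity values, the target labels, and the function names are hypothetical.

```python
def best_match_average(targets_i, targets_j, sim):
    """Average of best matches between two target sets, taken in both directions."""
    if not targets_i or not targets_j:
        return 0.0  # convention chosen here when a drug has no known target
    best_i = sum(max(sim(x, y) for y in targets_j) for x in targets_i)
    best_j = sum(max(sim(x, y) for x in targets_i) for y in targets_j)
    return (best_i + best_j) / (len(targets_i) + len(targets_j))


# Hypothetical symmetric sequence identities between three toy targets.
PAIR_IDENTITY = {
    frozenset(("T1", "T2")): 0.5,
    frozenset(("T1", "T3")): 0.2,
    frozenset(("T2", "T3")): 0.8,
}


def toy_identity(x, y):
    """Symmetric lookup; a target is fully identical to itself."""
    return 1.0 if x == y else PAIR_IDENTITY[frozenset((x, y))]


delta_t = best_match_average(["T1"], ["T2", "T3"], toy_identity)
# best matches: T1 -> 0.5; T2 -> 0.5 and T3 -> 0.2; (0.5 + 0.7) / 3 = 0.4
```

Taking the best match in both directions keeps the measure symmetric, so swapping the two drugs gives the same score.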

Target protein functional similarity

Protein targets that are enriched for comparable biological functions and have similar sequences imply shared pharmacological mechanisms and downstream effects (Passi et al. 2018). Accordingly, each protein is associated with a set of Gene Ontology (GO) terms from all three categories: cellular component (CC), molecular function (MF), and biological process (BP). GO terms that were either very specialized (with fewer than 15 linked genes) or very general (with more than 100 genes) were filtered out. DrugBank (2019) provided the Human Protein–Protein Interaction (PPI) network. Wang et al. (2007) proposed using the topology of the GO graph structure to determine the semantic similarity of GO terms; this approach was used to determine how functionally similar two drugs are, denoted δf(di, dj). The pairwise semantic similarities between the GO terms associated with di and dj were aggregated into a single semantic similarity measure using a best-match average and entered into a final similarity matrix.

Drug-induced pathway similarity

A drug pair that triggers similar or overlapping pathways suggests that the drugs' mechanisms of action are similar, which is useful information for drug similarity and repositioning research (Zeng et al. 2015). KEGG (Kanehisa and Goto 2000) was used to find the pathways induced by each small-molecule drug. The pairwise similarity of any two pathways was calculated from the overlap of their constituent genes using the Dice similarity coefficient. A pathway-based similarity score, δp(di, dj), was then calculated for each drug pair di and dj using Eq. 3:

where Pi and Pj are the sets of pathways induced by drugs di and dj, respectively; x and y are two pathways, each represented by the set of its constituent genes; and DSC(x, y) = 2|x ∩ y| / (|x| + |y|) is the Dice similarity coefficient, which measures how much the two pathways overlap. When no gene is shared by any two pathways induced by the drug pair being compared, the similarity is set to 0.0. Overall, Eq. 3 implies that if two drugs induce one or more identical pathways, the maximum pathway-based similarity is achieved.
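The pathway-based score can be sketched as below. Since Eq. 3 is not reproduced in this text, the sketch assumes that δp(di, dj) is the maximum Dice coefficient over all pairs of pathways induced by the two drugs, which is consistent with the statement that one shared identical pathway yields the maximum similarity. The gene labels and function names are hypothetical.

```python
def dice(x, y):
    """Dice similarity coefficient 2|x ∩ y| / (|x| + |y|) between two gene sets."""
    x, y = set(x), set(y)
    if not x and not y:
        return 0.0  # convention chosen here for two empty pathways
    return 2 * len(x & y) / (len(x) + len(y))


def pathway_similarity(pathways_i, pathways_j):
    """Assumed form of Eq. 3: best Dice overlap over all pathway pairs (0.0 if none)."""
    return max((dice(x, y) for x in pathways_i for y in pathways_j), default=0.0)


# Hypothetical pathways, each given as a set of constituent genes.
pathways_drug_1 = [{"g1", "g2", "g3"}]
pathways_drug_2 = [{"g2", "g3", "g4", "g5"}, {"g6"}]

delta_p = pathway_similarity(pathways_drug_1, pathways_drug_2)  # 2*2 / (3+4) = 4/7
```

If the two drugs induce an identical pathway, the best Dice value is 1.0; if no genes are shared by any pathway pair, the score is 0.0.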

DL for drug similarity prediction

This section comprehensively reviews the DL algorithms that researchers most commonly use to predict drug similarity. Wang et al. (2019) introduced a gated recurrent unit (GRU) model that employs similarity to predict drug–disease interactions. In this approach, the Chemistry Development Kit (CDK) converted SMILES strings into 2D chemical fingerprints, and the Jaccard score of the 2D chemical fingerprints was used to compare two drugs.

Hirohara et al. (2018) employed a CNN to learn molecular representations. In this approach, the molecule's SMILES notation is given to the network as input and fed into the convolutional layers. The Tox21 dataset was used.

Cheng et al. (2019) conducted similarity analysis based on the Anatomical Therapeutic Chemical (ATC) classification system, using the commonalities between the ATC codes of drug pairs. The authors created interaction networks, performed drug pair similarity analyses, and developed a network-based methodology for identifying clinically effective treatment combinations for a specific condition.

Xin et al. (2016) presented a Ranking-based k-Nearest Neighbour (Re-KNN) technique for drug repositioning. The method's key feature is that it combines the Ranking SVM (Support Vector Machine) algorithm with the traditional KNN algorithm. The similarity computations they used were chemical structural similarity, target-based similarity, side-effect similarity, and topological similarity. The Tanimoto score was then used to determine the similarity between two profiles.

Seo et al. (2020) proposed an approach that combined drug–drug interactions from DrugBank, network-based drug–drug interactions, single nucleotide polymorphisms, and the anatomical hierarchy of side effects, as well as indications, targets, and chemical structures.

Zeng et al. (2019) developed a measure of clinical drug–drug similarity derived from clinical data, using electronic health records (EHRs) to analyse and establish drug–diagnosis connections. Using the Bonferroni-adjusted hypergeometric P value, they created connections between drugs and diagnoses in an EMR dataset. The distances between drugs were assessed using the Jaccard similarity coefficient, and a k-means algorithm was used to form drug clusters.

Dai et al. (2020) reviewed and summarized representative methods for patient similarity and discussed their applications. The authors discussed the value and applications of patient similarity networks. They also discussed ways to measure the similarity or distance between each pair of patients and classified them as unsupervised, supervised, or semi-supervised.

Yan et al. (2019) created BiRWDDA, a new computational methodology for drug repositioning that combines a bi-random walk with multiple similarity measures to uncover potential associations between diseases and drugs. First, multiple drug–drug and disease–disease similarities are computed. The information entropy of these similarities is then evaluated to select the appropriate drug and disease similarities. Four drug–drug similarity measures and three disease–disease similarity measures were calculated from drug- and disease-related characteristics to create a heterogeneous network. The four drug–drug similarities were based, respectively, on the drug's protein sequence information, drug interactions extracted from DrugBank (scored using the Jaccard coefficient), chemical structure (from canonical SMILES obtained from DrugBank), and side effects.

Yi et al. (2021) constructed a deep gated recurrent unit model to predict potential drug–disease interactions using a range of similarity measures and a Gaussian interaction profile kernel. The similarity measure is used to derive distinguishing features of drugs from their chemical fingerprints. Meanwhile, the Gaussian interaction profile kernel is used to derive efficient disease features from established disease–disease relationships. A deep gated recurrent unit model is then built to predict potential drug–disease interactions. The experimental results showed that the algorithm could be used to predict novel drug indications or disease treatments and to speed up drug repositioning and associated drug research and discovery.

To predict DDIs, Yan et al. (2022) proposed a semi-supervised learning technique (DDI-IS-SL). DDI-IS-SL uses the cosine similarity method to calculate drug feature similarity by combining chemical, biological, and phenotypic data. The integrated drug information includes drug chemical structures, drug–target interactions, drug enzymes, drug transporters, drug pathways, drug indications, drug side effects, harmful effects of drug discontinuation, and known DDIs.
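Cosine similarity between two drug feature vectors can be sketched as follows. How DDI-IS-SL builds its integrated feature vectors is not detailed in this section, so the sketch assumes that each drug is represented by a single binary vector obtained by concatenating its chemical, biological, and phenotypic features. The vectors and the function name `cosine_similarity` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for an all-zero feature vector
    return dot / (norm_u * norm_v)


# Hypothetical concatenated binary feature vectors for two toy drugs.
drug_x = [1, 0, 1, 1]
drug_y = [1, 1, 0, 1]

cos_xy = cosine_similarity(drug_x, drug_y)  # 2 / (sqrt(3) * sqrt(3)) = 2/3
```

For non-negative feature vectors such as these, the score lies between 0 (no shared features) and 1 (identical direction).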

Heba et al. ( 2021 ) used DrugBank to develop a machine learning framework based on similarities called "SMDIP" (Similarity-based ML for Drug Interaction Prediction), where they calculated drug–drug similarity utilizing a Russell–Rao metric for the biological and structural data that is currently accessible on DrugBank to represent the limited feature area. The DDI classification is carried out using logistic regression, emphasizing finding the main predictors of similarity. The DDI key features are subjected to six machine learning models (NB: naive Bayes; LR: logistic regression; KNN: k-nearest neighbours; ANN: neural network; RFC: random forest classifier; SVM: support vector machine).

For large-scale DDI prediction, Vilar et al. (2014) provided a procedure combining five drug similarity fingerprints: two-dimensional structural fingerprints, interaction profile fingerprints, target profile fingerprints, ADE profile fingerprints, and three-dimensional pharmacophoric approaches.

Song et al. (2022) used similarity theory and a convolutional neural network to create global structural similarity features. They employed a transformer to extract local chemical sub-structure semantic features for drugs and proteins. The Tanimoto coefficient, Levenshtein distance, and a CNN are all used in this study to create the global structural similarity features of drugs and proteins.

Benchmark datasets and databases

Drug development and discovery have been based on a range of direct and indirect data sources, which have regularly demonstrated strong predictive capability in finding confirmed repositioning candidates and in other applications of computer-aided drug design. This section reviews the most important available benchmark datasets and databases used in drug discovery, organized by the problem category for which researchers may need them. Thirty-five datasets are summarized in Table 3.

Evaluation metrics

Performance measures are required for evaluating machine learning models (Benedek et al. 2021). They serve as a tool for comparing different techniques and identifying the best one for a given task. This section describes the metrics defined for the four categories of drug discovery problems.

Table 4 shows the metrics employed in drug discovery problems. Understanding these metrics aids in assessing the effectiveness of various prediction systems. True positives (TP) are drug side effects that exist and were correctly identified by the model. False positives (FP) are side effects that do not exist but were predicted by the model. True negatives (TN) are side effects that do not exist and that the model correctly did not predict. False negatives (FN) are side effects that exist but that the model failed to predict.
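As an illustration of how such counts translate into scores, the sketch below computes four standard metrics (accuracy, precision, recall, and F1) from TP, FP, TN, and FN. These are widely used definitions and are not necessarily the complete set listed in Table 4; the function name and the example counts are hypothetical.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard metrics from confusion-matrix counts; 0.0 when a ratio is undefined."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for a side-effect predictor evaluated on 100 drug pairs.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
```

With these counts, accuracy is 85/100, precision is 40/50, recall is 40/45, and F1 is 80/95.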

The important metrics for drug discovery problems

Drug dosing optimization

Drugs are vital to human health, and choosing the proper treatment and dose for the right patient is a constant challenge for clinicians. Even when taken as studied and prescribed, drugs have adverse effect profiles with varying response rates. As a result, all medications must be well managed, especially those used to treat critical illnesses or those with a narrow exposure window between efficacy and toxicity. Clinicians follow standard guidelines for the first dose, which is not always optimal or safe for every patient, especially if the medicine has not been evaluated at various doses in various patient types. Precision dosing can transform health care by increasing the benefits of drug therapy while reducing its risks. While precision dosing will probably have a significant impact for some pharmaceuticals, it may not be necessary or practical to apply it to all drugs or therapeutic classes. Recognizing the characteristics that make medications suitable targets for precision dosing will therefore help direct resources to where they will have the most impact. Applying precision dosing to high-priority drugs and therapeutic classes could be crucial in achieving better health care performance, safety, and cost-effectiveness (Tyson et al. 2020).

Because of standard, fixed dosing procedures or gaps in knowledge, imprecise drug dosing in specific subpopulations increases the risk of adverse effects due to supratherapeutic or subtherapeutic concentrations (Watanabe et al. 2018). Currently, the Food and Drug Administration (FDA) only requires a drug to be statistically superior to placebo or non-inferior to the existing standard of treatment. This does not guarantee that the medicine will benefit most patients in clinical trials, especially for malignancies that are difficult to treat, such as diffuse intrinsic pontine glioma (DIPG) and unresectable meningioma, where rates of therapy response can be exceedingly low (Fleischhack et al. 2019).

Several aspects are essential for dose optimization ( https://friendsofcancerresearch.org/wpcontent/uploads/Optimizing_Dosing_in_Oncology_Drug_Development.pdf ), that is, for finding the most effective dose, which varies based on the product, the target population, and the available data:

  • Therapeutic properties: Drug features such as small molecule vs. large molecule and agonist vs. antagonist impact how drugs interact with the body regarding safety and efficacy. The therapeutic characteristics impact the first doses used in dose-finding studies and the procedures used to determine which doses should be used in registrational trials.
  • Patient populations: Patient demographics vary depending on tumour type, stage of disease, and comorbidities. Understanding how these factors influence the drug's efficacy may justify modifying the dose accordingly, especially in the context of expanded clinical trial populations.
  • Supplemental versus original approval: Differences in disease features and patient demographics between tumour types and treatment settings, such as monotherapy versus combination therapy, must be considered when assessing whether additional dose exploration is required for a supplemental application. In cases when more dose exploration is required, the research design can include previous exposure-response knowledge from the initial approval.

Drug discovery and XAI

The field of XAI addresses one of the most serious flaws in ML and DL algorithms: the lack of model interpretability and explainability. Understanding how and why a prediction is formed becomes increasingly crucial as algorithms grow more sophisticated and forecast with greater accuracy. Without interpretability and explainability, it would be impossible to trust the predictions of real-world AI applications. Human-comprehensible explanations will increase system safety while encouraging trust in, and sustained acceptance of, machine learning technologies. XAI has been studied as a way to circumvent the limitations that AI technologies face because of their black-box nature. In contrast to AI approaches such as DL, which only make decisions, XAI can also provide justifications for the model's decisions (Zhang et al. 2022). XAI approaches have attracted attention (Lipton 2018; Murdoch et al. 2019) as a way to compensate for the lack of interpretability of some ML models and to aid human decision-making and reasoning (Goebel et al. 2018). The purpose of presenting relevant explanations alongside mathematical models is to help users understand them better by (1) making the decision-making process more transparent (Doshi-Velez and Kim 2017), (2) avoiding correct predictions that are made for the wrong reasons (Lapuschkin et al. 2019), (3) avoiding biases and discrimination that are unjust or unethical (Miller 2019), and (4) closing the gap between ML and other scientific disciplines. Effective XAI can also help scientists navigate the scientific process (Goebel et al. 2018), enabling people to fine-tune their understanding and opinions of the process under investigation (Chander et al. 2018). This section provides an overview of recent XAI research in drug discovery.

XAI has a place in drug development. While the precise definition of XAI is still debated (Guidotti et al. 2018), the following characteristics of XAI are clearly beneficial in drug design applications (Lipton 2018):

  • Transparency: understanding how the system came to a specific result.
  • Justification: explaining why the model's response is suitable.
  • Informativeness: providing new information to human decision-makers.
  • Uncertainty estimation: determining the reliability of a prediction.
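
To make the last characteristic concrete, one common way to gauge the reliability of a prediction is to train an ensemble and look at the spread of its outputs. The following is a minimal sketch on synthetic data, not a method taken from the cited works; the helper names `fit_linear` and `ensemble_predict` and all numbers are illustrative assumptions.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit of y ~ X @ w + b; returns the weights with the bias last."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return w

def ensemble_predict(X_train, y_train, X_new, n_models=50, seed=0):
    """Bootstrap ensemble: mean prediction and its standard deviation."""
    rng = np.random.default_rng(seed)
    n = X_train.shape[0]
    Xb_new = np.hstack([X_new, np.ones((X_new.shape[0], 1))])
    preds = []
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)  # resample the training set with replacement
        w = fit_linear(X_train[idx], y_train[idx])
        preds.append(Xb_new @ w)
    preds = np.array(preds)
    return preds.mean(axis=0), preds.std(axis=0)

# Synthetic training data (assumption): one feature in [0, 1], noisy linear response.
rng = np.random.default_rng(1)
X_train = rng.uniform(0.0, 1.0, size=(40, 1))
y_train = 2.0 * X_train[:, 0] + rng.normal(0.0, 0.1, size=40)

# One query inside the training range and one far outside it.
X_new = np.array([[0.5], [5.0]])
mean, std = ensemble_predict(X_train, y_train, X_new)
print(mean, std)
```

The spread is expected to be larger for the query far from the training data, which is the kind of signal a user could use to decide how much to trust a prediction.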

The molecular explanation of pharmacological activity is already possible with XAI (Xu et al. 2017; Ciallella and Zhu 2019), as are applications in drug safety and organic synthesis planning (Dey et al. 2018). Over time, XAI will be important in processing and interpreting increasingly complex chemical data, as well as in creating new pharmaceutical ideas, all while preventing human bias (Boobier et al. 2017). The development of application-specific XAI techniques that respond quickly to specific scientific questions about human biology and pathophysiology may be boosted by pressing drug discovery challenges such as the coronavirus pandemic.

AI tools can increase their prediction performance by increasing model complexity. As a result, these models become opaque, with no clear view of how they operate. Because of this opacity, AI models are not widely used in critical sectors such as medical care. XAI therefore focuses on understanding what goes into an AI model's prediction in order to meet the demand for transparency in AI tools. AI model interpretability approaches can be categorized by the algorithms used, the scale of interpretation, and the kind of information (Adadi and Mohammed 2018). With respect to the objectives of interpretability, approaches can be grouped into white-box model development, black-box model explanation, model fairness enhancement, and predictive sensitivity testing (Guidotti et al. 2018).

The gradient-based attribution technique (Simonyan et al. 2014) attributes a prediction to the network's input features. Because this strategy is commonly employed to explain a DNN's predictions, it may be a suitable solution for various black-box DNN models in DDI prediction (Quan et al. 2016; Sun et al. 2018). In addition, DeepLIFT is a common technique applied on top of DNN models and has been shown to outperform gradient-based techniques (Shrikumar et al. 2017). Alternatively, Guided Backpropagation may be applied to network architectures (Springenberg 2015) in which a convolutional layer with increased stride is used instead of max pooling in a CNN to deal with loss of precision. This method could be employed in CNN-based DDI prediction, as shown in Zeng et al. (2015).
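
The idea behind gradient-based attribution can be illustrated with a very small example. The sketch below is not the implementation of any cited work: it assumes a toy one-hidden-layer network with random weights and computes the gradient of the output with respect to each input feature by the chain rule. The function names `forward` and `saliency` are hypothetical.

```python
import numpy as np

def forward(x, W1, b1, w2, b2):
    """Toy network: scalar output = w2 . tanh(W1 x + b1) + b2."""
    h = np.tanh(W1 @ x + b1)
    return float(w2 @ h + b2), h

def saliency(x, W1, b1, w2, b2):
    """Gradient of the output with respect to each input feature."""
    _, h = forward(x, W1, b1, w2, b2)
    # Chain rule: d(out)/dx = W1^T (w2 * (1 - h^2)), since tanh'(z) = 1 - tanh(z)^2.
    return W1.T @ (w2 * (1.0 - h ** 2))

# Random weights stand in for a trained model (assumption).
rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))
b1 = rng.normal(size=4)
w2 = rng.normal(size=4)
b2 = 0.1

x = np.array([0.2, -0.5, 1.0])
grad = saliency(x, W1, b1, w2, b2)
ranking = np.argsort(-np.abs(grad))  # features ordered by absolute gradient
print(grad, ranking)
```

Features with a large absolute gradient are those to which the output is locally most sensitive; in a DDI setting the inputs would be drug or text features rather than three arbitrary numbers.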

Furthermore, Tao et al. (2016) implemented rationalization for neural networks that process natural language. This method aims to extract rationales, that is, small pieces of the input text. Its design comprises two parts, a generator and an encoder, which search for text subsets that are closely connected to the predicted outcome. Because NLP-based models are used to extract DDIs (Quan et al. 2016), these methods should be examined for use in improving model clarity.

Aside from that, XAI research has produced methods for developing white-box models, including linear, decision tree, rule-based, and advanced but transparent models. However, these approaches receive less attention because of their weak predictive ability, particularly in NLP-based tasks such as DDI extraction. Several ideas to address AI fairness have also been proposed. Nonetheless, only a small number of these studies have looked at the fairness of non-tabular data, such as the text-based data used in DDI extraction. Many DDI experiments have used the word embedding method (Quan et al. 2016; Zhang 2020; Bolukbasi 2016). As a result, efforts to ensure fairness in DDI research deserve more consideration. To ensure the reliability of AI models, numerous methods also examine the sensitivity of the models. Zügner et al. (2018) used adversarial example-based sensitivity analysis to explore graph-structured data. The technique perturbs the links between nodes or the node attributes to attack node classification models. Because graph-based methods are frequently used in DDI research (Lin et al. 2021; Sun et al. 2020b), similar methods might be applied to DDI prediction models. In RNNs, word embedding perturbations (Miyato et al. 1605 ) are also worth addressing. Notably, the input reduction strategy used by Feng et al. (2018) to expose hypersensitivity in NLP models could be applied to DDI extraction studies. The DDI study of Schwarz et al. (2021) attempted to provide model interpretability using attention scores derived at all levels of modeling. These scores are used to determine the significance of the similarity matrices to the drug representation vectors and to identify the drug properties that contribute to improved encoding. This method makes use of data that travels through all layers of the network.

Graph neural networks (GNNs) and their explainability are rapidly evolving in the field of graph data. GNNExplainer (Ying et al. 2019) uses mask optimization to learn soft masks for edges and node attributes in order to explain the predictions. The soft masks are initialized at random and treated as trainable variables. GNNExplainer then combines the masks with the original graph through element-wise multiplication. After that, the masks are optimized by maximizing the mutual information between the predictions for the original graph and the predictions for the newly obtained graph. Even though various regularization terms, such as element-wise entropy, encourage the optimized masks to be discrete, the resulting masks remain soft.
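
The masking step described above can be sketched in a few lines. This is only an illustration of the element-wise product and of an element-wise entropy term, under the assumption of a small dense adjacency matrix; it omits the GNN and the optimization loop, and the helper names are hypothetical rather than part of GNNExplainer's code.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def apply_soft_mask(adjacency, mask_logits):
    """Element-wise product of the adjacency matrix with a soft mask in (0, 1)."""
    mask = sigmoid(mask_logits)
    return adjacency * mask, mask

def mask_entropy(mask, eps=1e-12):
    """Mean element-wise binary entropy; low values indicate near-discrete masks."""
    m = np.clip(mask, eps, 1.0 - eps)
    return float(np.mean(-m * np.log(m) - (1.0 - m) * np.log(1.0 - m)))

# Toy graph with three nodes (assumption).
adjacency = np.array([[0, 1, 1],
                      [1, 0, 0],
                      [1, 0, 0]], dtype=float)

# Mask parameters start at random values and would be trained in practice.
rng = np.random.default_rng(0)
mask_logits = rng.normal(size=adjacency.shape)

masked_adj, mask = apply_soft_mask(adjacency, mask_logits)
print(masked_adj)
print(mask_entropy(mask))
```

Because the sigmoid never reaches exactly 0 or 1, the mask stays soft even when an entropy penalty pushes it toward the extremes.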

In addition, because GNNExplainer's masks are optimized for each input graph separately, the explanations may lack a global view. PGExplainer (Luo et al. 2020) learns approximated discrete edge masks to explain the predictions. To predict the edge masks, it trains a parameterized mask predictor. It starts by concatenating node embeddings to obtain an embedding for each edge in the input graph. The predictor then uses the edge embeddings to predict the probability of each edge being selected, which is regarded as its importance score. The reparameterization trick is then used to sample the approximated discrete masks. Finally, the mask predictor is trained by maximizing the mutual information between the original and the new predictions. GraphMask (Schlichtkrull et al. 2010 ) is a post hoc method that explains the relevance of edges in each GNN layer. Like PGExplainer, it uses a classifier to predict whether an edge can be dropped without affecting the original predictions. A binary concrete distribution (Louizos et al. 1712 ) and the reparameterization trick are used to approximate discrete masks. The classifier is trained by minimizing a divergence term, which measures the difference between network predictions, over the entire dataset. ZORRO (Thorben et al. 2021) employs discrete masks to identify important input nodes and node features. A greedy method is used to select nodes or node features from the input graph step by step. At each step, ZORRO selects the node or feature with the highest fidelity score. The objective function, the fidelity score, measures how closely the new predictions match the model's original predictions when the selected nodes/features are kept fixed and the remaining nodes/features are replaced with random noise values. The non-differentiability of discrete masks is not a limitation here because no training process is involved.
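
The greedy, fidelity-driven selection described for ZORRO can be sketched on a much simpler problem. The example below is an illustration only: instead of a GNN it assumes a toy classifier over a flat feature vector, and the functions `model`, `fidelity`, and `greedy_select` are hypothetical names, not ZORRO's API.

```python
import numpy as np

def model(x):
    """Toy classifier (assumption): class 1 if a weighted sum is positive."""
    weights = np.array([3.0, 0.1, -2.0, 0.05])
    return int(x @ weights > 0)

def fidelity(x, selected, n_samples=200, seed=0):
    """Share of noisy inputs whose prediction matches the original one.

    Selected features keep their values; all others are replaced by noise.
    """
    rng = np.random.default_rng(seed)
    original = model(x)
    keep = np.zeros(x.shape[0], dtype=bool)
    keep[np.array(list(selected), dtype=int)] = True
    hits = 0
    for _ in range(n_samples):
        noise = rng.normal(size=x.shape[0])
        hits += model(np.where(keep, x, noise)) == original
    return hits / n_samples

def greedy_select(x, k):
    """At each step, add the feature that yields the highest fidelity."""
    selected = []
    for _ in range(k):
        candidates = [j for j in range(x.shape[0]) if j not in selected]
        best = max(candidates, key=lambda j: fidelity(x, selected + [j]))
        selected.append(best)
    return selected

x = np.array([1.0, 1.0, -1.0, 1.0])
chosen = greedy_select(x, k=2)
print(chosen, fidelity(x, chosen))
```

No gradients or training are needed, which is why a discrete (hard) selection causes no differentiability problem; the price is that each greedy step only looks one feature ahead.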

Furthermore, ZORRO avoids the problem of "introduced evidence" by using hard masks. The greedy mask selection process, on the other hand, may result in locally optimal explanations. Furthermore, because masks are generated for each graph separately, the explanations may lack a global understanding. Causal Screening (Xiang et al. 2021) investigates the causal attribution of the various edges in the input graph. It identifies an edge mask for the explanatory subgraph. The essential idea behind causal attribution is to look at how predictions change when an edge is added to the current explanatory subgraph, which is called the causal effect. At each step, it examines the causal effects of the candidate edges and selects one to include in the subgraph. It selects edges using the individual causal effect (ICE), which measures the difference in mutual information after an additional edge is added to the subgraph.

Causal Screening, like ZORRO, is a greedy algorithm that generates discrete masks without any training. As a result, it does not suffer from the introduced evidence problem. However, it may lack global understanding and become trapped in locally optimal explanations. SubgraphX (Yuan et al. 2102 ) investigates subgraph-level explanations for deep graph models. It uses the Monte Carlo Tree Search (MCTS) method (Silver et al. 2017) to explore different subgraphs efficiently by pruning nodes, and it chooses the most important subgraph from the leaves of the search tree as the explanation for the prediction.

Furthermore, SubgraphX uses Shapley values as the objective function of the mask generation algorithm. The subgraphs it produces are more understandable to humans and better suited to graph data than those of previous perturbation-based approaches. However, the computational cost is higher because the MCTS algorithm has to explore many different subgraphs.
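
To show what a Shapley value measures, the sketch below computes exact Shapley values for a three-node toy problem by averaging marginal contributions over all orderings. The scoring function `value` is a made-up stand-in for a model's prediction score on a subgraph, and exact enumeration is feasible only for a handful of players; larger problems generally need sampling-based approximations.

```python
from itertools import permutations

def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: t / len(orderings) for p, t in totals.items()}

def value(coalition):
    """Hypothetical score: nodes a and b only matter together, c adds a little."""
    score = 0.0
    if {"a", "b"} <= coalition:
        score += 1.0
    if "c" in coalition:
        score += 0.2
    return score

phi = shapley_values(["a", "b", "c"], value)
print(phi)
```

Here the joint effect of `a` and `b` is split evenly between them, and the three values add up to the score of the full set, which is the property that makes Shapley values attractive for scoring subgraphs with interacting nodes.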

Success stories about using DL in drug discovery

Big pharmaceutical companies have moved toward AI as DL methodologies have advanced, abandoning conventional approaches to maximize patient benefit and company profit. AstraZeneca is a multinational, science-driven pharmaceutical company that has successfully used artificial intelligence in each stage of drug development, from virtual screening to clinical trials. By incorporating AI into medical science, it has been able to understand current diseases better, identify new targets, plan higher-quality clinical trials, and speed up the entire process. AstraZeneca's success illustrates how combining AI with medical science can yield strong results. Its collaborations with other AI-based companies demonstrate its continual efforts to increase AI utilization. One such collaboration is with Ali Health, an Alibaba subsidiary that wants to provide AI-assisted screening and diagnosis systems in China (Nag et al. 2022).

The SARS-CoV-2 outbreak placed many businesses under pressure to develop the best medicine in the shortest possible time. To attain their goals, these businesses have turned to AI in conjunction with the available data. Below are some examples of firms that have succeeded in identifying viable strategies to combat the COVID-19 virus.

Deargen, a South Korean startup, developed the MT-DTI (Molecule Transformer Drug Target Interaction Model), a DL-based drug-protein interaction prediction model. In this approach, the strength of an interaction between a drug and its target protein is predicted using simplified chemical sequences rather than 2D or 3D molecular structures. The model predicted that the FDA-approved antiviral drug atazanavir, a therapy for HIV, is highly likely to bind to and inhibit a critical protein of SARS-CoV-2, the virus that causes COVID-19. It also identified three more antivirals, as well as Remdesivir, a not-yet-approved medicine that is currently being studied in patients. Deargen's ability to uncover antivirals using DL approaches is a significant step forward in pharmaceutical research, making it less time-consuming and more efficient. If such treatments are thoroughly evaluated, there is a good chance that we will be able to stop the epidemic in its tracks (Beck et al. 2020; Scudellari 2020).

Another example is Benevolent AI, a biotechnology company in London that leverages medical information, AI, and machine learning to speed up health-related research. It has identified six medicines so far, one of which, Ruxolitinib, is reported to be in clinical trials for COVID-19 (Gatti et al. 2021). To find prospective medications that might impede the viral replication of SARS-CoV-2, the company has been using a massive repository of medical information together with data extracted from the scientific literature by its AI and ML system. It received FDA permission to use its proposed medication, Baricitinib, in conjunction with Remdesivir, which resulted in a higher recovery rate for hospitalized COVID-19 patients (Richardson et al. 2020).

Skin cancer is a very common form of cancer around the globe. As the incidence of skin cancer continues to rise, it is becoming increasingly crucial to diagnose it at an early stage; research demonstrates that early identification and therapy improve the survival rate of skin cancer patients. With the advancement of medical research and AI, several skin cancer smartphone applications have been introduced to the market, allowing people with worrisome lesions to use a specialized technique to determine whether they should seek medical care. According to studies, over 235 dermatology smartphone apps were developed between 2014 and 2017 (Flaten et al. 2020). Previously, these apps worked by sending a snapshot of the lesion over the internet to a health care provider. Now, thanks to AI algorithms running on the smartphone itself, these applications can detect and classify images of lesions as high or low risk, immediately assess the patient's risk, and offer advice. SkinVision (Carvalho et al. 2019) is an example of a successful application.

Future challenges

Digital twinning in drug discovery

The development and implementation of emerging Industry 4.0 technologies allow for the creation of digital twins (DTs), which promote the transformation of the industrial sector into a more agile and intelligent one. A DT is a digital representation of a real entity that interacts with the original through dynamic, two-way links. Today, DTs are being used in a variety of industries. Even though the pharmaceutical sector has come to accept digitization in order to embrace Industry 4.0, a comprehensive implementation of DT in pharmaceutical manufacturing is yet to be achieved. As a result, it is vital to assess the pharmaceutical industry's progress in applying DT solutions (Chen et al. 1088 ).

New digital technologies are essential in today's competitive marketplaces to promote innovation, increase efficiency, and increase profitability (Legner et al. 2017 ). AI (Venkatasubramanian 2019 ), Internet of Things (IoT) devices (Venkatasubramanian 2019 ; Oztemel and Gursev 2018 ), and DTs have all piqued the interest of governments, agencies, academic institutions, and corporations (Bao et al. 2018 ). Industry 4.0 is a concept offered by a professional community to increase the level of automation to boost productivity and efficiency in the workplace.

This section provides a quick look at the evolution of DT and its application in pharmaceutical and biopharmaceutical production. We begin with an overview of the technology's principles and a brief history, then present various examples of DTs in pharmacology and drug discovery. After that, we discuss the significant technical and other issues that arise in these kinds of applications.

History and main concepts of digital twin

The idea of making a "twin" of a process or a product dates back to NASA's Apollo project in the late 1960s (Rosen et al. 2015; Mayani et al. 2018; Schleich et al. 2017), when it built two identical spacecraft. In this scenario, the "twin" was employed to mirror its counterpart's behaviour in real time.

The DT, according to Guo et al. (2018), is a type of digital data structure that is generated as a separate entity and linked to the actual system. Michael Grieves presented the original concept of a DT in 2002 at the University of Michigan as part of an industry presentation on product lifecycle management (PLM) (Grieves 2014; Grieves and Vickers 2017; Stark et al. 2019). However, the first actual use of this notion, which gave rise to the current name, occurred in 2010, when NASA (the United States National Aeronautics and Space Administration) attempted to create virtual spacecraft simulators for testing (Glaessgen and Stargel 2012).

In theory, a DT is a digital reproduction or representation of a physical thing, process, or service. It is a computer simulation with unique features that dynamically connects the physical and digital worlds. The purpose of a DT is to model, evaluate, and improve a physical object in virtual space until it matches the predicted performance, at which time it can be created, or enhanced if already built, in the real world (Kamel et al. 2021; Marr 2017).

Since then, DT technology has gained popularity in both business and academia. The main components of a DT are shown in Fig. 14. The theoretical model comprises three parts: the real entity in the actual world, the digital entity in the virtual space, and the interconnection between them (Glaessgen and Stargel 2012).

Fig. 14 Main components of DT

In an ideal world, the digital component would have all the system's information that could be acquired from its physical counterpart (Kritzinger et al. 2018 ). When integrated with AI, IoT, and other recent intelligent systems, a DT can forecast how an object or process will perform.

Digital twin in pharmaceutical manufacturing

Developing a drug is lengthy and costly, requires efforts in biology, chemistry, and manufacturing, and has a low success rate. An estimated 50,000 hits (trial versions of compounds that are subsequently tweaked to develop a medication) are evaluated to develop one successful drug. Only one in every 12 therapeutic compounds that enter clinical trials in humans makes it to market. Toxicity and lack of efficacy (a medication's capacity to provide a patient with relief and slow the progression of a disease) contribute to more than 60% of all drug failures (Subramanian 2020).

Making the appropriate decisions about which targets, hits, leads, and compounds to pursue is important to a drug's successful market introduction. However, these decisions are based on in vitro systems (experiments in a test tube or petri dish) and in vivo systems (experiments in animals), both of which correlate only weakly with clinical outcomes (Mak et al. 2014). A perfect decision support system for drug discovery would answer the following questions:

  • What is the magnitude of any target's influence on the desired clinical result?
  • Is the potential compound changing the target enough to change clinical outcomes?
  • Is the chemical sufficiently selective and free of side effects or harmful consequences?
  • Is the ineffectiveness attributable to the drug's failure to reach its target?
  • Has the trial chosen the appropriate dose and dosing regimen?
  • Are there any surrogate markers or biomarkers (such as cholesterol, which serves as a proxy for the illness's root cause) that can forecast a drug's success or failure?
  • Have the correct patients been chosen for the study?
  • Is it possible to identify hyper- and hypo-responders before the study begins?

Therapeutic failures are prevalent and difficult to address, given the complex process of developing drugs based on the points above. This issue must be addressed by combining data and observations from many stages of the drug development process and developing a system that can forecast an experiment's outcome or a chemical modification's influence on a therapeutic molecule. This highlights the significance of DT in the field of drug discovery.

In the United States, funding organizations such as DARPA, NSF, and DOE have aggressively supported bioprocess modeling at the genomic and cellular levels, resulting in high-profile programs such as BioSPICE (Kumar and Feidler 2003). These groups have shown that smaller models built to answer specific questions can greatly influence drug development efficiency. This makes it possible to apply the predictive methodology to various stages of the drug discovery and research process, including target validation, lead optimization, candidate selection, biomarker identification, the design of assays and screens, and the improvement of clinical trials.

The pharmaceutical business is embracing the overall digitization trend in tandem with the US FDA's ambition to establish an agile, adaptable pharmaceutical manufacturing sector that delivers high-quality pharmaceuticals without considerable regulatory scrutiny (O’Connor et al. 2016 ). Industries are beginning to implement Industry 4.0 and DT principles and use them for development and research (Barenji et al. 2019 ; Steinwandter et al. 2019 ; Lopes et al. 2019 ; Kumar et al. 2020 ; Reinhardt et al. 2020 ). Pharma 4.0 (Ierapetritou et al. 2016 ) is a digitalization initiative that integrates Industry 4.0 with International Council for Harmonisation (ICH) criteria to model a combined operational model and production control plan.

As shown in Fig. 15, live monitoring of the system by Process Analytical Technology (PAT), data collection from the machinery and from the intermediate and finished products, and global modelling and data analysis software are some of the key requirements for achieving smart manufacturing with DT (Barenji et al. 2019). Quality-by-Design (QbD) and Continuous Manufacturing (CM) (Boukouvala et al. 2012), flowsheet modeling (Kamble et al. 2013), and PAT implementations (James et al. 2006) have all been used by the pharmaceutical industry to achieve this. Although some of these tools have been thoroughly examined, the full integration and development of DTs is still a work in progress.

Fig. 15 Main categories of smart manufacturing with DT

The pharmaceutical industry has used PAT in different programs across the steps involved in producing drugs (Nagy et al. 2013). Even though this has increased the use of PAT instruments, their implementation is mostly limited to research and development rather than large-scale manufacturing (Papadakis et al. 2018). In the small number of cases where they have been used in manufacturing, they have succeeded in decreasing production costs and enhancing product quality monitoring (Simon et al. 2019). The development of various PAT approaches, together with their successful implementation, is a vital component of a monitoring and control scheme (Boukouvala et al. 2012) and has provided a foundation for obtaining essential data from the physical component.

Papadakis et al. (2018) recently provided a framework for identifying efficient reaction paths for pharmaceutical manufacturing (Rantanen and Khinast 2015). Its modeling workflow comprises reaction route discovery, analysis of reactions and separations, process simulation, assessment, optimization, and use (Sajjia et al. 2017).

Data-driven modeling methods require the collection and use of a large number of experiments to develop models, and the resulting models depend solely on the datasets provided. Artificial neural networks (ANN) (Pandey et al. 2006; Cao et al. 2018), multivariate statistical analysis, and Monte Carlo methods (Badr and Sugiyama 2020) are all commonly used in pharmaceutical manufacturing. These methods are less computationally costly, but their predictions outside the dataset space are frequently unsatisfactory because the trained models lack an understanding of the underlying physics. Using IoT devices in pharmaceutical manufacturing lines results in massive volumes of collected data. The virtual component must receive this collection of process data and critical quality attributes (CQAs) quickly and effectively. Additionally, several pharmaceutical process models need material properties for accurate prediction. As a result, a central database is necessary to give the virtual component access to all datasets (Lin-Gibson and Srinivasan 2019).
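
The weakness of purely data-driven models outside the dataset space can be illustrated with a toy example. In the sketch below, the "true" process is an assumed saturating curve and the data-driven model is a cubic polynomial fitted only on inputs between 0 and 1; neither comes from the cited studies.

```python
import numpy as np

def true_process(x):
    """Hypothetical saturating response standing in for the real process."""
    return x / (1.0 + x)

# Training data cover only the range [0, 1].
x_train = np.linspace(0.0, 1.0, 20)
y_train = true_process(x_train)

# Purely data-driven model: cubic polynomial fitted by least squares.
coeffs = np.polyfit(x_train, y_train, deg=3)

def predict(x):
    return np.polyval(coeffs, x)

x_inside = np.array([0.25, 0.5, 0.75])   # within the training range
x_outside = np.array([3.0, 4.0, 5.0])    # well beyond the training range

err_inside = np.max(np.abs(predict(x_inside) - true_process(x_inside)))
err_outside = np.max(np.abs(predict(x_outside) - true_process(x_outside)))
print(err_inside, err_outside)
```

The polynomial reproduces the curve closely where it has seen data but has no notion of saturation, so its error grows sharply outside that range. A model that encodes the underlying physics would not have this particular failure.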

Digital twin in biopharmaceutical manufacturing

The synthesis of large-molecule-based entities in various combinations, which have applications in the treatment of inflammatory, microbial, and cancer conditions, is the focus of biopharmaceutical manufacturing (Glaessgen and Stargel 2012; Narayanan et al. 2020). The demand for biologic-based medications has risen in recent years, necessitating greater production efficiency and efficacy (Kamel et al. 2021). As a result, many businesses are switching from batch to continuous production and implementing intelligent manufacturing systems (Lin-Gibson and Srinivasan 2019). A DT, which incorporates the physical plant, data collection, data analysis, and system control, can aid in decision-making, risk analysis, product creation, and process prediction (Tao et al. 2018).

The components and structures of biological products are intimately connected to treatment effectiveness (Read et al. 2010) and are very sensitive to the cell line and operating conditions. A thorough virtual description of the actual plant in a simulation environment is required to apply DT in biopharmaceutical manufacturing (Tao et al. 2018). This means that each unit operation in an integrated model's simulation should accurately reflect the crucial process dynamics. Previous reviews (Narayanan et al. 2020; Tang et al. 2020; Farzan et al. 2017; Baumann and Hubbuch 2017; Smiatek et al. 2020; Olughu et al. 2019) focused on process modelling methodologies for both upstream and downstream operations.

Data from a biopharmaceutical monitoring system are typically heterogeneous in data types and time scales. A considerable amount of data is collected during biopharmaceutical manufacturing thanks to the deployment of real-time PAT sensors. As a result, data pre-processing is required to deal with missing data, visualize data, and reduce dimensions (Gangadharan et al. 2019). For batch biopharmaceutical production, Casola et al. (2019) presented data mining-based techniques for stemming, classifying, filtering, and clustering historical real-time data. Lee et al. (2012) combined different spectroscopic techniques and used data fusion to forecast the composition of raw materials.
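
Two of the pre-processing steps mentioned above, handling missing data and reducing dimensions, can be sketched as follows. The readings are synthetic, column-mean imputation and PCA are chosen only as simple representatives of each step, and the helper names are hypothetical; the cited studies may use different techniques.

```python
import numpy as np

def impute_column_means(X):
    """Replace NaN entries with the mean of the observed values in each column."""
    X = X.copy()
    col_means = np.nanmean(X, axis=0)
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = col_means[cols]
    return X

def pca_reduce(X, n_components):
    """Project centered data onto its leading principal components via SVD."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

# Synthetic sensor readings (rows: time points, columns: sensors), with gaps.
X = np.array([[1.0, 2.0, 3.0],
              [2.0, np.nan, 6.1],
              [3.0, 6.0, np.nan],
              [4.0, 8.1, 12.0]])

X_filled = impute_column_means(X)
X_reduced = pca_reduce(X_filled, n_components=2)
print(X_filled)
print(X_reduced.shape)
```

In a real monitoring system the imputation strategy would need to respect the different time scales of the sensors, which this sketch ignores.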

AI-driven digital twins in today's pharmaceutical drug discovery

In the pharmaceutical industry, clinical studies pose challenges that make drug development incomplete, sluggish, uncertain, and potentially dangerous. For example, clinical trials do not truly reflect reality: only a small portion of a large and diverse population, among the many billions of humans on the planet, is represented, so it is not possible to see how each person will respond to a medicine. The rigorous physical and mental health requirements of clinical trials can also lead to failure because of a lack of qualified participants. Pharmaceutical firms struggle to recruit the precise number and kind of participants needed to comply with the stringent requirements of clinical trial designs. Also, in most trials some participants receive a placebo instead of the actual drug, because this helps show how sick individuals fare when they are not given the experimental medication; this implies that at least some trial participants do not receive it. These issues can be addressed by using digital twins, which can imitate a range of patient features, giving a fair representation of how a medicine affects a larger population. AI-enabled digital twinning may shorten trial setup by revealing how a patient fares against the various inclusion and exclusion criteria, so that patients can be identified rapidly; and because digital twins can predict a patient's reaction, placebos would no longer be required. Therefore, every patient in the trial can be assured of receiving the new treatment, and digital twins can reduce the dangerous impact of drugs in the early stages by decreasing the number of patients who need to be tested in the real world. Figure 16 illustrates a framework in which all possible combinations are run: all treatment protocols are tested on a digital twin of the patient to discover an appropriate treatment protocol for this patient. Doing this quickly and accurately can provide the best-quality treatment without experimenting on the patient, which saves effort and cost and improves accuracy in determining an appropriate treatment protocol.
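
The idea of testing every treatment protocol on a patient's digital twin can be sketched as an exhaustive search. Everything in the example below is a hypothetical illustration: the twin is reduced to a dictionary of made-up parameters, `simulate_twin` is an invented response model, and the drug names, doses, and toxicity limit are assumptions rather than values from the framework in the figure.

```python
from itertools import product

def simulate_twin(twin, drug, dose_mg):
    """Hypothetical response model: returns (efficacy, toxicity), each in [0, 1]."""
    sensitivity = twin["sensitivity"][drug]
    efficacy = dose_mg * sensitivity / (1.0 + dose_mg * sensitivity)
    toxicity = min(1.0, twin["toxicity_per_mg"][drug] * dose_mg)
    return efficacy, toxicity

def best_protocol(twin, drugs, doses_mg, max_toxicity=0.3):
    """Try every drug/dose pair on the twin; keep the most effective tolerable one."""
    best, best_efficacy = None, -1.0
    for drug, dose in product(drugs, doses_mg):
        efficacy, toxicity = simulate_twin(twin, drug, dose)
        if toxicity <= max_toxicity and efficacy > best_efficacy:
            best, best_efficacy = (drug, dose), efficacy
    return best, best_efficacy

# Made-up patient-specific parameters for the twin (assumption).
twin = {
    "sensitivity": {"drug_A": 0.05, "drug_B": 0.02},
    "toxicity_per_mg": {"drug_A": 0.004, "drug_B": 0.001},
}

protocol, efficacy = best_protocol(twin, ["drug_A", "drug_B"], [25, 50, 100, 200])
print(protocol, round(efficacy, 3))
```

Exhaustive enumeration is only practical for a small protocol space; with many drugs, doses, and schedules, the number of combinations grows quickly and a smarter search would be needed.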


Open problems

This section discusses important issues to consider regarding progression from preclinical to clinical and implementation in practice that necessitate new ML solutions to assist transparent, usable, and data-driven decision-making procedures to accelerate drug discovery and decrease the number of failures in clinical development phases.

  • Complex disorders, such as viral infections and advanced malignancies, frequently necessitate drug combinations (Julkunen et al. 2020; White et al. 2021). For example, kinase inhibitor combinations or single compounds that block several kinases may improve therapeutic efficacy and duration while combating treatment resistance in cancer (Attwood et al. 2021). While several ML models have been created to predict responses to pairwise drug–dose combinations, systematically predicting higher-order combination effects involving more than two medicines or targets is still a problem. In cancer cell lines, tensor learning methods have permitted reliable prediction of paired drug combination dose-response matrices (Smiatek et al. 2020). This computationally efficient learning approach could use extensive pharmacogenomic data to determine which drug combinations, including higher-order combinations among novel therapeutic compounds and doses, are most promising for additional in vitro or in vivo testing in many kinds of preclinical models.
  • While both potential toxicity and targeted efficacy are important criteria for clinical development success, most existing ML models for predicting therapy response emphasize efficacy as the primary outcome. As a result, careful examination and prediction of harmful effects in simulated and preclinical settings are required to strike an acceptable balance between efficacy and toxicity and so accelerate the next stages of drug development (Narayanan et al. 2020). Applying single-cell data and ML algorithms to develop combinations of anticancer drugs has shown the potential to boost the likelihood of clinical success (Tao et al. 2018). Transfer learning and in silico cell-type deconvolution techniques (Avila et al. 2020) may offer effective ways to reduce the need to generate large amounts of single-cell data to predict combination therapy responders and toxic effects, as well as the recommended dosage that optimizes both efficacy and safety.
  • In addition, patient data and clinical profiles must be used to validate in silico therapy response predictions. Validating ML predictions against such real-world data is crucial for progress in medicine, for establishing practical value, and for guiding clinical decisions; for example, a no-go decision can be made early if a substance has harmful effects. Many of the issues currently encountered when using machine learning for drug discovery, particularly in clinical development, arise because current AI algorithms do not meet the requirements of clinical research. As a result, ML model validation requires systematic, comprehensive, and high-quality clinical data sets. Discovery methods must be thoroughly evaluated for accuracy and reproducibility using community-agreed performance measures in various settings, not just on a small collection of exemplary data sets. Sharing and exploiting private patient information is possible with systems that isolate the code from the data or use the model-to-data approach (Guinney and Saez-Rodriguez 2018), which makes it possible for federated learning to use patient-level data for model construction and thorough assessment.
  • Even though there are many applications in drug discovery, the majority of ML and particularly DL models remain "black boxes", and interpretation by a human specialist is sometimes difficult (Jiménez-Luna et al. 2020). Mathematical models implemented as online decision support tools must be understandable to users in order to earn their confidence. Comprehensible, accessible, and explainable models should clearly state their optimization goals, such as synergy, efficacy, and/or toxicity.
  • DTI prediction is a notable example of a drug discovery research field. It has been ongoing for more than 10 years and aims to enhance the effectiveness of computational models using various technologies. The most recent computational approaches for predicting DTIs are DL technologies. These use non-structure-based approaches that do not need 3D structural data or docking, and thereby overcome the limitations imposed by the high-dimensional structures of the drug and target protein. Despite DL's outstanding performance, regression within DTI prediction remains a critical and difficult issue, and researchers could develop several strategies to improve prediction accuracy. Furthermore, data scarcity and the lack of a standardized benchmark database are still considered current research gaps.
  • While DL approaches show promise in detecting drug responses, especially when dealing with large amounts of data, drug response prediction research is still in its early stages, and more efficient and relevant models are needed.
  • While DL techniques have been shown to be effective in detecting DDIs, especially when dealing with large amounts of data, more promising algorithms that focus on complex molecular reactions need to be developed.
  • Only a few studies in the drug discovery field have investigated the explainability of their models, leaving much room for improvement. The explanations that XAI generates for human decision-making must be non-trivial, non-artificial, and helpful to the scientific community. For now, ensuring that XAI techniques achieve their goals and produce trustworthy responses requires a combined effort among DL specialists, chemoinformaticians, chemists, biologists, data scientists, and other subject matter experts. We therefore believe that more developed methodologies for explaining black-box models in drug discovery areas such as DDIs, drug–target interactions, drug sensitivity, and drug side effects must be considered in the future to ensure model fairness or rigorous sensitivity evaluation of models. Further exploration of the capabilities and constraints of the existing chemical language for defining these models will be critical. The development of novel interpretable molecular representations for DL and the deployment of self-explanatory algorithms alongside sufficiently accurate predictions will be a critical area of research in the coming years. Because there are currently no methods that combine all the stated advantageous XAI characteristics (transparency, justification, informativeness, and uncertainty estimation), consensus techniques that draw on the advantages of many XAI approaches and boost model dependability will play a major role in the short and medium term. Currently, there is no open-community platform for exchanging and refining XAI software and model interpretations in drug discovery. Consequently, future research into XAI in drug development has much potential.
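
One simple way to picture the consensus techniques mentioned in the last item is to average the feature rankings produced by several attribution methods. The sketch below is a minimal illustration under assumptions: the method names, feature names, and scores are hypothetical, and mean-rank aggregation is only one of many possible consensus rules, not a method taken from the cited literature.

```python
def consensus_ranking(attributions):
    """Combine per-method feature attributions by averaging their ranks.

    attributions: dict mapping method name -> dict of feature -> score,
    where a larger absolute score means a more important feature.
    Returns (features ordered from most to least important, mean ranks).
    """
    features = sorted(next(iter(attributions.values())))
    mean_rank = {f: 0.0 for f in features}
    for scores in attributions.values():
        # Rank 1 is the feature with the largest absolute attribution.
        ordered = sorted(features, key=lambda f: -abs(scores[f]))
        for rank, f in enumerate(ordered, start=1):
            mean_rank[f] += rank / len(attributions)
    return sorted(features, key=lambda f: mean_rank[f]), mean_rank


# Hypothetical attribution scores from three explanation methods.
example = {
    "method_a": {"logP": 0.8, "mol_weight": 0.1, "h_donors": -0.5},
    "method_b": {"logP": 0.6, "mol_weight": 0.3, "h_donors": -0.4},
    "method_c": {"logP": 0.2, "mol_weight": 0.1, "h_donors": -0.7},
}
order, ranks = consensus_ranking(example)
print(order)  # ['logP', 'h_donors', 'mol_weight']
```

Features on which the methods disagree strongly end up with intermediate mean ranks, which can serve as a rough flag that an explanation should be checked by a domain expert.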

This section briefly summarizes how the analytical questions proposed in Sect. 2 are answered throughout the paper.

Several DL algorithms have been used to address the different categories of drug discovery problems, as detailed in Sect. 4 with respect to the main categories shown in Fig. 8. In addition, a summary of a sample of these algorithms, with their methods, advantages, and weaknesses, is presented in Table 2.

Recognizing the characteristics that make medications suitable targets for precision dosing will help direct resources to where they will have the most impact. Employing DL in drug dosing optimization is a major challenge, but one that can improve health care performance, safety, and cost-effectiveness, as presented in Sect. 7.

With the advancement of DL methods, large pharmaceutical companies have moved toward AI. One example is AstraZeneca, a multinational pharmaceutical company that has successfully used AI in every stage of drug development. Several success stories are presented in Sect. 9.

The topic of XAI addresses one of the most serious flaws in ML and DL algorithms: model interpretability and explainability. It would be impossible to trust the predictions of real-world AI applications without interpretability and explainability. Section 8 presents the literature that addresses this issue.

A digital twin (DT) is a virtual representation of a living thing that is connected to the real thing in dynamic, reciprocal ways. Today, DTs are being used in a variety of industries. Even though the pharmaceutical sector has come to accept digitization in order to embrace Industry 4.0, there is not yet a comprehensive implementation of DT in pharmaceutical manufacturing. Success stories of employing DT in drug discovery are presented in Sect. 10.

Throughout the paper, we show how DL has succeeded across drug discovery problems; however, important challenges remain for future research. Section 11 covers these challenges.

Figure 17 presents the percentage of the different DL applications for each building block of our study. The largest segment is devoted to drug discovery and DL, because this is the core of our research.

Fig. 17 Percentages of DL applications for each category

Despite all the breakthroughs in pharmacology, developing new drugs still requires a great deal of time and money. As DL technology advances and the amount of drug-related data grows, new DL-based approaches are appearing at every stage of the drug development process. In addition, large pharmaceutical corporations have moved toward AI in the wake of the development of DL approaches.

Although drug discovery is a large field with several research categories, there are few review studies of the field, and each related study has focused on only one research category, such as reviewing DL applications for DTIs. The main goal of our research is therefore to present a systematic literature review (SLR) that integrates recent DL technologies and applications for the different categories of drug discovery problems, including drug–target interactions (DTIs), drug–drug similarity interactions (DDIs), drug sensitivity and responsiveness, and drug side-effect prediction, together with the associated benchmark data sets and databases. Related topics such as XAI and DT, and how they support drug discovery problems, are also discussed. In addition, drug dosing optimization and success stories are presented. Finally, we suggest open problems as future research challenges.

Although DL has proved its strength in drug discovery problems, it remains a promising open research area for interested researchers. In this paper, they can find an overview of the use of DL in various drug discovery problems, along with success stories and open areas for future research.

Given the recent success of DL approaches and their use by pharmaceutical companies in identifying new medications, it seems clear that current DL techniques will be highly regarded in the next generation of large-scale data investigation and evaluation for drug discovery and development.

Author contributions

Ask wrote the main text, HA wrote the digital twinning part, EE wrote the deep learning part, YAMME wrote the data sets part, MMG wrote the similarity part, and AEH suggested the idea of the review and supervised the work.

Open access funding provided by The Science, Technology & Innovation Funding Authority (STDF) in cooperation with The Egyptian Knowledge Bank (EKB).

Declarations

The authors declare no competing interests.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Heba Askr, Email: [email protected] .

Enas Elgeldawi, Email: [email protected] .

Heba Aboul Ella, Email: haboelella@ecu.edu.eg .

Yaseen A. M. M. Elshaier, Email: [email protected] .

Mamdouh M. Gomaa, Email: [email protected] .

Aboul Ella Hassanien, Email: aboitcairo@cu.edu.eg .

  • Abramovich I, Ben-Yehuda T, Cohen R. Low-complexity video classification using recurrent neural networks. IEEE Int Conf Sci Electr Eng Israel (ICSEE) 2018; 2018 :1–4. doi: 10.1109/ICSEE.2018.8646076. [ CrossRef ] [ Google Scholar ]
  • Adadi A, Mohammed B. Peeking inside the black-box: a survey on explainable artificial intelligence (XAI) IEEE Access. 2018; 6 :2169–3536. [ Google Scholar ]
  • Ahmed KT, Park S, Jiang Q, et al. Network-based drug sensitivity prediction. BMC Med Genomics. 2020; 13 :193. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Alankrita A, Mamta M, Gopi B. Generative adversarial network: an overview of theory and applications. Int J Inf Manag Data Insights. 2021; 1 (1):100004. [ Google Scholar ]
  • Amashita R, Nishio M, Do RKG, et al. Convolutional neural networks: an overview and application in radiology. Insights Imaging. 2018; 9 :611–629. doi: 10.1007/s13244-018-0639-9. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Andreea D, Yu-Hsiang H, Petar V, Pietro L, Jian T (2019) Drug–drug adverse effect prediction with graph co-attention. https://arxiv.org/abs/1905.00534
  • Arshed MA, Mumtaz S, Riaz O, Sharif W, Abdullah S. A deep learning framework for multi drug side effects prediction with drug chemical substructure. Int J Innovat Sci Technol. 2022; 4 (1):19–31. [ Google Scholar ]
  • Arus-Pous J, Patronov A, Bjerrum EJ, Tyrchan C, Reymond JL, Chen H, Engkvist O. SMILES-based deep generative scaffold decorator for de-novo drug design. J Cheminform. 2020; 12 :1–18. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Asada M, Miwa M, Sasaki Y (2018) Enhancing drug–drug interaction extraction from texts by molecular structure information. In: proceedings of the 56th annual meeting of the association for computational linguistics. 2, pp 680–685, 10.18653/v1/P18-2108
  • Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al. Gene ontology: tool for the unification of biology. Nat Genet. 2000; 25 :25–29. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Attwood MM, Fabbro D, Sokolov AV, et al. Trends in kinase drug discovery: targets, indications and inhibitor design. Nat Rev Drug Discov. 2021; 20 (11):839–861. [ PubMed ] [ Google Scholar ]
  • Avila C, Alquicira-Hernandez J, Powell JE, et al. Benchmarking of cell type deconvolution pipelines for transcriptomics data. Nat Commun. 2020; 11 (1):5650. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Azad AKM, Dinarvand M, Nematollahi A, Swift J, Lutze-Mann L, Vafaee F. A comprehensive integrated drug similarity resource for in-silico drug repositioning and beyond. Brief Bioinform. 2021; 22 (3):bbaa126. doi: 10.1093/bib/bbaa126. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Badr S, Sugiyama H. A PSE perspective for the efficient production of monoclonal antibodies: integration of process, cell, and product design aspects. Curr Opin Chem Eng. 2020; 27 :121–128. [ Google Scholar ]
  • Bao J, Guo D, Li J, Zhang J. The modelling and operations for the digital twin in the context of manufacturing. Enterp Inf Syst. 2018; 13 :534–556. [ Google Scholar ]
  • Baptista D, Ferreira PG, Rocha M. Deep learning for drug response prediction in cancer. Briefings Bioinform. 2021; 22 :360–379. [ PubMed ] [ Google Scholar ]
  • Barenji RV, Akdag Y, Yet B, Oner L. Cyber-physical-based PAT (CPbPAT) framework for Pharma 4.0. Int J Pharm. 2019; 567 :118445. [ PubMed ] [ Google Scholar ]
  • Baumann P, Hubbuch J. Downstream process development strategies for effective bioprocesses: Trends, progress, and combinatorial approaches. Eng Life Sci. 2017; 17 :1142–1158. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Beck BR, Shin B, Choi Y, Park S, Kang K. Predicting commercially available antiviral drugs that may act on the novel coronavirus (SARS-CoV-2) through a drug–target interaction deep learning model. Comput Struct Biotechnol J. 2020; 18 :784–790. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Bedi P, Sharma C, Vashisth P, Goel D, Dhanda M (2015) Handling cold start problem in Recommender Systems by using Interaction Based Social Proximity factor. In: Proceeding of the 2015 international conference on advances in computing, communications and informatics, Kerala, India, 10–13 August 2015; pp 1987–1993
  • Benedek R, Stephen B, Andriy N, Michael U, Sebastian N, Eliseo P (2021) A unified view of relational deep learning for drug pair scoring. CoRR. https://arxiv.org/abs/2111.02916 .
  • Betsabeh T, Mansoor ZJ. Using drug–drug and protein-protein similarities as feature vector for drug–target binding prediction. Chemom Intell Lab Syst. 2021; 217 :104405. doi: 10.1016/j.chemolab.2021.104405. [ CrossRef ] [ Google Scholar ]
  • Bleakley K, Yamanishi Y. Supervised prediction of drug–target interactions using bipartite local models. Bioinformatics. 2009; 25 :2397–2403. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Bolukbasi T (2016) Man is to computer programmer as woman is to homemaker? debiasing word embeddings. Advances in neural information processing systems, 2016; 29. In Identifying gender and sexuality of data subjects. https://cis.pubpub.org/pub/debiasing-word-embeddings-2016 .
  • Bongini P, Pancino N, Dimitri GM, Bianchini M, Scarselli F, Lio P (2022) Modular multi-source prediction of drug side-effects with DruGNN. http://arxiv.org/abs/2202.08147 . [ PubMed ]
  • Boobier S, Osbourn A, Mitchell JB. Can human experts predict solubility better than computers? J Cheminform. 2017; 9 :63. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Boukouvala F, Niotis V, Ramachandran R, Muzzio FJ, Ierapetritou MG. An integrated approach for dynamic flowsheet modeling and sensitivity analysis of a continuous tablet manufacturing process. Comput Chem Eng. 2012; 42 :30–47. [ Google Scholar ]
  • Brown AS, Patel CJ. MeSHDD: literature-based drug-drug similarity for drug repositioning. J Am Med Inf Assoc. 2017; 24 (3):614–618. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Camacho DM, Collins KM, Powers RK, Costello JC, Collins JJ. Next-generation machine learning for biological networks. Cell. 2018; 173 :1581–1592. [ PubMed ] [ Google Scholar ]
  • Campillos M, et al. Drug target identification using side-effect similarity. Science. 2008; 321 (5886):263–266. doi: 10.1126/science.1158140. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Cao H, Mushnoori S, Higgins B, Kollipara C, Fermier A, Hausner D, Jha S, Singh R, Ierapetritou M, Ramachandran R. A systematic framework for data management and integration in a continuous pharmaceutical manufacturing processing line. Processes. 2018; 6 :53. [ Google Scholar ]
  • Casola G, Siegmund C, Mattern M, Sugiyama H. Data mining algorithm for pre-processing biopharmaceutical drug product manufacturing records. Comput Chem Eng. 2019; 124 :253–269. [ Google Scholar ]
  • Chabner BA. NCI-60 cell line screening: a radical departure in its time. J Natl Cancer Inst. 2016 doi: 10.1093/jnci/djv388. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Chander A, Srinivasan R, Chelian S, Wang J, Uchino K (2018) Working with beliefs: AI transparency in the enterprise. In: Joint proceedings of the ACM IUI 2018 workshops co-located with the 23rd acm conference on intelligent user interfaces 2068 (eds Said, A. and Komatsu, T.) (CEUR-WS.org, 2018)
  • Chandra B, Sharma RK. On improving recurrent neural network for image classification. Int Jt Conf Neural Netw (IJCNN) 2017; 2017 :1904–1907. doi: 10.1109/IJCNN.2017.7966083. [ CrossRef ] [ Google Scholar ]
  • Chang Y, Park H, Yang HJ, Lee S, Lee KY, Kim TS, Jung J, Shin JM. Cancer drug response profile scan (CDRscan): a deep learning model that predicts drug effectiveness from cancer genomic signature. Sci Rep. 2018; 8 :1–11. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Chauhan R, Ghanshala KK, Joshi RC. Convolutional neural network (CNN) for image detection and recognition. First Int Conf Secure Cyber Comput Commun (ICSCCC) 2018; 2018 :278–282. doi: 10.1109/ICSCCC.2018.8703316. [ CrossRef ] [ Google Scholar ]
  • Chen AW. Predicting adverse drug reaction outcomes with machine learning. Int J Commun Med Public Health. 2018; 5 (3):901–904. [ Google Scholar ]
  • Chen JY, Mamidipalli S, Huan T. Happi: an online database of comprehensive human annotated and predicted protein interactions. BMC Genomics. 2009; 10 (1):S16. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Chen X, Liu M-X, Yan G-Y. Drug–target interaction prediction by random walk on the heterogeneous network. Mol BioSyst. 2012; 8 :1970–1978. doi: 10.1039/C2MB00002D. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Chen Y, Yang O, Sampat C, Bhalode P, Ramachandran R, Ierapetritou M. Digital twins in pharmaceutical and biopharmaceutical manufacturing: a literature review. Processes. 2020; 8 (9):1088. doi: 10.3390/pr8091088. [ CrossRef ] [ Google Scholar ]
  • Cheng F, Kovács IA, Barabási AL. Network-based prediction of drug combinations. Nat Commun. 2019; 10 (1):1–11. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Chiu Y-C, Chen H-IH, Zhang T, Zhang S, Gorthi A, Wang L-J, Huang Y, Chen Y. Predicting drug response of tumors from integrated genomic profiles by deep neural networks. BMC Med Genomics. 2019; 12 :119. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Chu X, Lin Y, Gao J, Wang J, Wang Y, Wang L (2018) Multi-label robust factorization autoencoder and its application in predicting drug–drug interactions. arXiv:1811.00208 .
  • Chu X, Lin Y, Wang Y, Wang L, Wang J, Gao J (2019) MLRDA: a multitask semi-supervised learning framework for drug–drug interaction prediction. In: proceedings of the international joint conference on artificial intelligence, pp 4518–4524
  • Ciallella HL, Zhu H. Advancing computational toxicology in the big data era by artificial intelligence: data-driven and mechanism-driven modeling for chemical toxicity. Chem Res Toxicol. 2019; 32 :536–547. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Cortes-Ciriano I, Ain QU, Subramanian V, Lenselink EB, Méndez-Lucio O, IJzerman AP, Wohlfahrt G, Prusis P, Malliavin TE, van Westen GJP, et al. Polypharmacology modelling using proteochemometrics (PCM): recent methodological developments, applications to target families, and future prospects. Medchemcomm. 2015; 6 :24–50. [ Google Scholar ]
  • Cortés-Ciriano I, Bender A. KekuleScope: prediction of cancer cell line sensitivity and compound potency using convolutional neural networks trained on compound images. J Cheminform. 2019; 11 :1–16. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Dai L, Zhu H, Liu D (2020) Patient similarity: methods and applications. http://arxiv.org/abs/2012.01976
  • David L, Arús-Pous J, Karlsson J, Engkvist O, Bjerrum EJ, Kogej T, Kriegl JM, Beck B, Chen H. Applications of deep-learning in exploiting large-scale and heterogeneous compound data in industrial pharmaceutical research. Front Pharmacol. 2019; 10 :1303. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Davis MI, Hunt JP, Herrgard S, Ciceri P, Wodicka LM, Pallares G, Hocker M, Treiber DK, Zarrinkar PP. Comprehensive analysis of kinase inhibitor selectivity. Nat Biotechnol. 2011; 29 :1046–1051. [ PubMed ] [ Google Scholar ]
  • De Carvalho TM, Noels E, Wakkee M, Udrea A, Nijsten T. Development of smartphone apps for skin cancer risk assessment: progress and promise. JMIR Dermatol. 2019; 2 (1):e13376. [ Google Scholar ]
  • De Kuijper GM, Risselada A, van Dijken R. Handbook of intellectual disabilities. Cham: Springer; 2019. Monitoring drug side-effects; pp. 275–301. [ Google Scholar ]
  • “deepchem/deepchem: Democratizing Deep-Learning for Drug Discovery”; Quantum Chemistry, Materials Science and Biology; Available online: https://github.com/deepchem/deepchem (accessed on 15 April 2022).
  • Dey S, Luo H, Fokoue A, Hu J, Zhang P. Predicting adverse drug reactions through interpretable deep learning framework. BMC Bioinform. 2018; 19 :476. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Dincer AB, Celik S, Hiranuma N, Lee S-I. DeepProfile: deep learning of cancer molecular profiles for precision medicine. bioRxiv. 2018 doi: 10.1101/278739. [ CrossRef ] [ Google Scholar ]
  • Ding MQ, Chen L, Cooper GF, Young JD, Lu X. Precision oncology beyond targeted therapy: combining omics data with machine learning matches the majority of cancer cells to effective therapeutics. Mol Cancer Res. 2018; 16 :269–278. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Doshi-Velez F, Kim B (2017) Towards a rigorous science of interpretable machine learning. https://arxiv.org/abs/1702.08608
  • DrugBank (2019) DrugBank Release Version 5.1.3, chemical structures. https://www.drugbank.com
  • Dua D, Graff C (2017) UCI machine learning repository. https://archive.ics.uci.edu/ml/index.php
  • El-Deredy W, et al. Pretreatment prediction of the chemotherapeutic response of human glioma cell cultures using nuclear magnetic resonance spectroscopy and artificial neural networks. Cancer Res. 1997; 57 :4196–4199. [ PubMed ] [ Google Scholar ]
  • Farzan P, Mistry B, Ierapetritou MG. Review of the important challenges and opportunities related to modeling of mammalian cell bioreactors. AIChE J. 2017; 63 :398–408. [ Google Scholar ]
  • Fatehifar M, Karshenas H. Drug–drug interaction extraction using a position and similarity fusion-based attention mechanism. J Biomed Inf. 2021; 115 :103707. doi: 10.1016/j.jbi.2021.103707. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Feng S, et al (2018) Pathologies of neural models make interpretations difficult. http://arxiv.org/abs/1804.07781
  • Feng Q, Dueva E, Cherkasov A, Ester M (2018) PADME: a deep learning-based framework for drug–target interaction prediction. arXiv 2018; arXiv:1807.09741
  • Feng YH, Zhang SW, Shi JY. DPDDI: a deep predictor for drug–drug interactions. BMC Bioinform. 2020; 21 :419. doi: 10.1186/s12859-020-03724-x. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Ferdousi R, Safdari R, Omidi Y. Computational prediction of drug–drug interactions based on drugs functional similarities. J Biomed Inform. 2017 doi: 10.1016/j.jbi.2017.04.021. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Finn RD, et al. Pfam: the protein families database. Nucleic Acids Res. 2013; 42 (D1):D222–D230. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Flaten HK, St Claire C, Schlager E, Dunnick CA, Dellavalle RP. Growth of mobile applications in dermatology. Dermatol Online J. 2020; 24 (2):13–16. [ PubMed ] [ Google Scholar ]
  • Fleischhack G, Massimino M, Warmuth-Metz M, Khuhlaeva E, Janssen G, Graf N, et al. Nimotuzumab and radiotherapy for treatment of newly diagnosed diffuse intrinsic pontine glioma (DIPG): a phase III clinical study. J Neurooncol. 2019; 143 :107–113. doi: 10.1007/s11060-019-03140-z. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Fokoue A, Sadoghi M, Hassanzadeh O, Zhang P (2016) Predicting drug–drug interactions through large-scale similarity-based link prediction. In: European semantic web conference 2016 May 29; pp 774–789
  • Fushman D, Shooshan SE, Rodriguez L, Aronson AR, Lang F, Rogers W, Tonning J. A dataset of 200 structured product labels annotated for adverse drug reactions. Sci Data. 2018; 5 :180001. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Gangadharan N, Turner R, Field R, Oliver SG, Slater N, Dikicioglu D. Metaheuristic approaches in biopharmaceutical process development data analysis. Bioprocess Biosyst Eng. 2019; 42 :1399–1408. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Gao Z, et al. PDTD: a web-accessible protein database for drug target identification. BMC Bioinf. 2008; 9 (1):104. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Gao KY, Fokoue A, Luo H, Iyengar A, Dey S, Zhang P (2017) Interpretable drug target prediction using deep neural representation. In: Proceedings of the international joint conference on artificial intelligence, Melbourne, Australia, 19–25 August 2017
  • Gao K, Duy Nguyen D, Sresht V, Mathiowetz AM, Tu M, Wei G-W. Are 2D fingerprints still valuable for drug discovery? Phys Chem Chem Phys. 2019; 22 :8373–8390. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Gatti M, Turrini E, Raschi E, Sestili P, Fimognari C. Janus kinase inhibitors and coronavirus disease (COVID)-19: rationale, clinical evidence and safety issues. Pharmaceuticals. 2021; 14 (8):738. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Gaulton A, et al. ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res. 2011; 40 (D1):D1100–D1107. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Gilmer J, Schoenholz SS, Riley PF, Vinyals O, Dahl GE. Neural message passing for quantum chemistry. 34th Int Conf Mach Learn ICML. 2017; 3 :2053–2070. [ Google Scholar ]
  • Glaessgen EH, Stargel DS (2012) The digital twin paradigm for future NASA and US Air Force vehicles. In: Proceedings of the 53rd AIAA/ASME/ASCE/AHS/ASC structures, structural dynamics and materials conference, Honolulu, HI, USA. https://ntrs.nasa.gov/citations/20120008178
  • Goebel R, et al. Explainable AI: the new 42? In: Holzinger A, Kieseberg P, Tjoa A, Weippl E, et al., editors. Machine learning and knowledge extraction. CD-MAKE Lecture Notes in Computer Science. New York: Springer; 2018. [ Google Scholar ]
  • Gómez-Bombarelli R, et al. Automatic chemical design using a data-driven continuous representation of molecules. ACS Cent Sci. 2018; 4 :268–276. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Grieves M, Vickers J. Digital twin: mitigating unpredictable undesirable emergent behavior in complex systems. Cham: Springer; 2017. pp. 85–113. [ Google Scholar ]
  • Guidotti R, et al. A survey of methods for explaining black box models. ACM Comput Surv. 2018; 51 :93. [ Google Scholar ]
  • Guinney J, Saez-Rodriguez J. Alternative models for sharing confidential biomedical data. Nat Biotechnol. 2018; 36 (5):391–392. [ PubMed ] [ Google Scholar ]
  • Gunther S, et al. SuperTarget and Matador: resources for exploring drug–target relationships. Nucleic Acids Res. 2007; 36 :D919–D922. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Hamilton WL. Graph representation learning. Synth Lect Artif Intell Mach Learn. 2020; 14 :1–159. [ Google Scholar ]
  • Han X, Xie R, Li X, Li J. SmileGNN: drug–drug interaction prediction based on the smiles and graph neural network. Life (basel). 2022; 12 (2):319. doi: 10.3390/life12020319. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Hao M, Wang Y, Bryant SH. Improved prediction of drug–target interactions using regularized least squares integrating with kernel fusion technique. Anal Chim Acta. 2016; 909 :41. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Hassan-Harrirou H, Zhang C, Lemmin T. RosENet: improving binding affinity prediction by leveraging molecular mechanics energies with an ensemble of 3D convolutional neural networks. J Chem Inf Model. 2020; 60 :2791–2802. [ PubMed ] [ Google Scholar ]
  • He C, Liu Y, Li H, Zhang H, Mao Y, Qin X, Liu L, Zhang X. Multi-type feature fusion based on graph neural network for drug-drug interaction prediction. BMC Bioinf. 2022; 23 (1):1–8. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Hecker N, et al. SuperTarget goes quantitative: update on drug–target interactions. Nucleic Acids Res. 2011; 40 (D1):D1113–D1117. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Hermanto A, Adji TB, Setiawan NA. Recurrent neural network language model for English-Indonesian machine translation: experimental study. Int Conf Sci Inf Technol (ICSITech) 2015; 2015 :132–136. doi: 10.1109/ICSITech.2015.7407791. [ CrossRef ] [ Google Scholar ]
  • Hinton G. Boltzmann machines. In: Sammut C, Webb GI, editors. Encyclopedia of machine learning. Boston: Springer; 2011. [ Google Scholar ]
  • Hirohara M, Saito Y, Koda Y, Sato K, Sakakibara Y. Convolutional neural network based on SMILES representation of compounds for detecting chemical motif. BMC Bioinform. 2018; 19 :83–94. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Hizukuri Y, Sawada R, Yamanishi Y. Predicting target proteins for drug candidate compounds based on drug-induced gene expression data in a chemical structure-independent manner. BMC Med Genomics. 2015; 8 :82. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Hou X, You J, Hu P (2019) Predicting drug–drug interactions using deep neural network. In: proceedings of the 11 th international conference on machine learning and computing, pp 168–172
  • http://zinc.docking.org
  • https://bioinf-applied.charite.de/supernatural_new/index.php .
  • https://friendsofcancerresearch.org/wpcontent/uploads/Optimizing_Dosing_in_Oncology_Drug_Development.pdf .
  • https://ncats.nih.gov/tox21
  • https://pharmacodb.pmgenomics.ca/datasets/4
  • https://sites.broadinstitute.org/ccle/
  • https://string-db.org/cgi/download.pl?sessionId=uKr0odAK9hPs
  • https://www.cancer.gov/about-nci/organization/ccct/ctrp
  • https://www.ebi.ac.uk/chebi/
  • https://www.sciencedirect.com/topics/drug-response
  • Hu J, Gao J, Fang X, Liu Z, Wang F, Huang W, Wu H, Zhao G. DTSyn: a dual-transformer-based neural network to predict synergistic drug combinations. bioRxiv. 2022 doi: 10.1101/2022.03.29.486200. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Huang C-T, et al. A large-scale gene expression intensity-based similarity metric for drug repositioning. iScience. 2018; 7 :40–52. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Huang K, Xiao C, Hoang TN, Glass LM, Sun J (2020) Caster: predicting drug interactions with chemical substructure representation. In: AAAI 2020 34th AAAI Conference on Artificial Intelligence, American Association for Artificial Intelligence (AAAI) Press, pp 702–709
  • Ibrahim H, El Kerdawy AM, Abdo A, Eldin AS. Similarity-based machine learning framework for predicting safety signals of adverse drug–drug interactions. Inf Med Unlocked. 2021; 26 :100699. [ Google Scholar ]
  • Ierapetritou M, Muzzio F, Reklaitis G. Perspectives on the continuous manufacturing of powder-based pharmaceutical processes. AIChE J. 2016; 62 :1846–1862. [ Google Scholar ]
  • Iorio F, et al. Discovery of drug mode of action and drug repositioning from transcriptional responses. PNAS. 2010; 107 (33):14621–14626. doi: 10.1073/pnas.1000138107. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Iorio F, Knijnenburg TA, Vis DJ, Bignell GR, Menden MP, Schubert M, Aben N, Gonçalves E, Barthorpe S, Lightfoot H, et al. A landscape of pharmacogenomic interactions in cancer. Cell. 2016; 166 :740–754. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • James M, Stanfield CF, Bir G. A review of process analytical technology (PAT) in the US pharmaceutical industry. Curr Pharm Anal. 2006; 2 :405–414. [ Google Scholar ]
  • Ji ZL, Han LY, Yap CW, Sun LZ, Chen X, Chen YZ. Drug adverse reaction target database (DART) Drug Saf. 2003; 26 (10):685–690. [ PubMed ] [ Google Scholar ]
  • Jiménez-Luna J, Grisoni F, Schneider G. Drug discovery with explainable artificial intelligence. Nat Mach Intell. 2020; 2 (10):573–584. [ Google Scholar ]
  • Julkunen H, Cichonska A, Gautam P, et al. Leveraging multi-way interactions for systematic prediction of pre-clinical drug combination effects. Nat Commun. 2020; 11 (1):6136. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Kamath U, Liu J. Explainable artificial intelligence: an introduction to interpretable machine learning. Cham: Springer; 2021. [ Google Scholar ]
  • Kamble R, Sharma S, Varghese V, Mahadik K. Process analytical technology (PAT) in pharmaceutical development and its application. Int J Pharm Sci Rev Res. 2013; 23 :212–223. [ Google Scholar ]
  • Kamel Boulos MN, Zhang P. Digital twins: from personalised medicine to precision public health. J Person Med. 2021; 11 (8):745. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Kanehisa M, Goto S. KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000; 28 (1):27–30. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Karim MR, Cochez M, Jares JB, Uddin M, Beyan O, Decker S (2019) Drug–drug interaction prediction based on knowledge graph embeddings and convolutional-LSTM network. In: Proceedings of the 10th ACM international conference on bioinformatics, computational biology and health informatics, pp 113–123
  • Karim MR, Cochez M, Jares JB, Uddin M, Beyan O, Decker S (2019) Drug–drug interaction prediction based on knowledge graph embeddings and convolutional-LSTM network. In: Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics 2019, pp 113–123
  • Karpov P, Godin G, Tetko IV. Transformer-CNN: Swiss knife for QSAR modeling and interpretation. J Cheminform. 2020; 12 :17. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Kastrin A, Ferk P, Leskošek B. Predicting potential drug–drug interactions on topological and semantic similarity features using statistical learning. PLoS ONE. 2018; 13 (5):e0196865. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Keum J, Nam H. SELF-BLM: prediction of drug–target interactions via self-training SVM. PLoS ONE. 2017; 12 :e0171839. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Kim S, Thiessen PA, Bolton EE, Chen J, Fu G, Gindulyte A, Han L, He J, He S, Shoemaker BA, et al. PubChem substance and compound databases. Nucleic Acids Res. 2016; 44 :D1202–D1213. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Kim J, Park S, Min D, Kim W. comprehensive survey of recent drug discovery using deep learning. Int J Mol Sci. 2021; 22 :9983. doi: 10.3390/ijms22189983. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Koes DR, Baumgartner MP, Camacho CJ. Lessons learned in empirical scoring with smina from the CSAR 2011 benchmarking exercise. J Chem Inf Model. 2013; 53 :1893–1904. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Kohonen T. The self-organizing map. Proc IEEE. 1990; 78 (9):1464–1480. [ Google Scholar ]
  • Korkmaz S. Deep learning-based imbalanced data classification for drug discovery. J Chem Inf Model. 2020; 60 :4180–4190. [ PubMed ] [ Google Scholar ]
  • Kritzinger W, Karner M, Traar G, Henjes J, Sihn W. Digital Twin in manufacturing: a categorical literature review and classification. IFAC-PapersOnLine. 2018; 51 :1016–1022. [ Google Scholar ]
  • Kuenzi BM, et al. Predicting drug response and synergy using a deep learning model of human cancer cells. J Elsevier Cancer Cell. 2020; 38 (5):1535–6108. doi: 10.1016/j.ccell.2020.09.014. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Kuhn M, et al. A side effect resource to capture phenotypic effects of drugs. Mol Syst Biol. 2010; 6 (1):343. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Kuhn M, et al. STITCH 4: integration of protein–chemical interactions with user data. Nucleic Acids Res. 2013; 42 (D1):D401–D407. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Kumar SP, Feidler JC. BioSPICE: a computational infrastructure for integrative biology. OMICS J Integr Biol. 2003; 7 (3):225. doi: 10.1089/153623103322452350. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Kumar S, Talasila D, Gowrav M, Gangadharappa H. Adaptations of pharma 4.0 from industry 4.0. Drug Invent Today. 2020; 14 :405–415. [ Google Scholar ]
  • Lamb J, Crawford ED, Peck D, Modell JW, Blat IC, Wrobel MJ, Lerner J, Brunet JP, Subramanian A, Ross KN, et al. The connectivity map: using gene-expression signatures to connect small molecules, genes, and disease. Science. 2006; 313 :1929–1935. [ PubMed ] [ Google Scholar ]
  • Lapuschkin S, et al. Unmasking clever Hans predictors and assessing what machines really learn. Nat Commun. 2019; 10 :1096. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Lee CY, Chen YP. Descriptive prediction of drug side-effects using a hybrid deep learning model. Int J Intell Syst. 2021; 36 (6):2491–2510. [ Google Scholar ]
  • Lee H, Kim W. Comparison of target features for predicting drug–target interactions by deep neural network based on large-scale drug-induced transcriptome data. Pharmaceutics. 2019; 11 :377. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Lee HW, Christie A, Xu J, Yoon S. Data fusion-based assessment of raw materials in mammalian cell culture. Biotechnol Bioeng. 2012; 109 :2819–2828. [ PubMed ] [ Google Scholar ]
  • Lee G, Park C, Ahn J. Novel deep learning model for more accurate prediction of drug–drug interaction effects. BMC Bioinform. 2019; 20 (1):415. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Lee I, Keum J, Nam H. DeepConv-DTI: prediction of drug–target interactions via deep learning with convolution on protein sequences. PLoS Comput Biol. 2019; 15 :1–21. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Legner C, Eymann T, Hess T, Matt C, Böhmann T, Drews P, Mädche A, Urbach N, Ahlemann F. Digitalization: opportunity and challenge for the business and information systems engineering community. Bus Inf Syst Eng. 2017; 59 :301–308. [ Google Scholar ]
  • Lei T, Barzilay R, Jaakkola T (2016) Rationalizing neural predictions. In: 2016 conference on empirical methods in natural language processing, 2016; Austin, Texas: Association for computational linguistics, pp 107—117. https://aclanthology.org/D16-1011
  • Li M, Wang Y, Zheng R, Shi X, Wu F, Wang J, et al. (2019) Deepdsc: a deep learning method to predict drug sensitivity of cancer cell lines. IEEE/ACM transactions on computational biology and bioinformatics [ PubMed ]
  • Lian M, Du W, Wang X, Yao Q. Drug–target interaction prediction based on multi-similarity fusion and sparse dual-graph regularized matrix factorization. IEEE Access. 2021; 9 :99718–99730. doi: 10.1109/ACCESS.2021.3096830. [ CrossRef ] [ Google Scholar ]
  • Lin X, Quan Z, Wang Z-J, Ma T, Zeng X (2021) KGNN: knowledge graph neural network for drug–drug interaction prediction. In: Proceedings of the twenty-ninth international joint conference on artificial intelligence, Jaban; IJCAI'20
  • Lin-Gibson S, Srinivasan V. Recent industrial roadmaps to enable smart manufacturing of biopharmaceuticals. IEEE Trans Autom Sci Eng. 2019; 2019 :1–8. [ Google Scholar ]
  • Lipton ZC. The mythos of model interpretability. Queue. 2018; 16 :31–57. [ Google Scholar ]
  • Liu Y, Wu M, Miao C, Zhao P, Li X-L. Neighborhood regularized logistic matrix factorization for drug–target interaction prediction. PLoS Comput Biol. 2016; 12 :e1004760. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Liu B, Ramsundar B, Kawthekar P, Shi J, Gomes J, Luu Nguyen Q, Ho S, Sloane J, Wender P, Pande V. Retrosynthetic reaction prediction using neural sequence-to-sequence models. R ACS Cent Sci. 2017; 3 :1103–1113. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Liu N, Chen CB, Kumara S. Semi-supervised learning algorithm for identifying high-priority drug–drug interactions. IEEE J Biomedic Health Inform. 2019 doi: 10.1109/JBHI.2019.2932740. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Liu K, Sun X, Jia L, Ma J, Xing H, Wu J, Gao H, Sun Y, Boulnois F, Fan J. Chemi-net: a molecular graph convolutional network for accurate drug property prediction. Int J Mol Sci. 2019; 20 :3389. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Liu P, Li H, Li S, Leung KS. Improving prediction of phenotypic drug response on cancer cell lines using deep convolutional network. BMC Bioinform. 2019; 20 :408. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Liu S, Huang Z, Qiu Y, Chen Y-PP, Zhang W. Structural network embedding using multi-modal deep auto-encoders for predicting drug–drug interactions. IEEE Int Conf Bioinform Biomed. 2019; 2019 :445–450. doi: 10.1109/BIBM47256.2019.8983337. [ CrossRef ] [ Google Scholar ]
  • Liu S, Zhang Y, Cui Y, Qiu Y, Deng Y, Zhang W, Zhang Z. Enhancing drug–drug interaction prediction using deep attention neural networks. BioRxiv. 2021 doi: 10.1101/2021.03.16.435553. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Lopes MR, Costigliola A, Pinto R, Vieira S, Sousa JMC. Pharmaceutical quality control laboratory digital twin—a novel governance model for resource planning and scheduling. Int J Prod Res. 2019; 58 :1–15. [ Google Scholar ]
  • Louizos C, Welling M, Kingma DP (2017) Learning sparse neural networks through l 0 regularization. http://arxiv.org/abs/1712.01312 .
  • Lu Y, Guo Y, Korhonen AJB. Link prediction in drug–target interactions network using similarity indices. BMC Bioinf. 2017; 18 (1):39. doi: 10.1186/s12859-017-1460-z. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Luo Y, Zhao X, Zhou J, Yang J, Zhang Y, Kuang W, Peng J, Chen L, Zeng J. A network integration approach for drug–target interaction prediction and computational drug repositioning from heterogeneous information. Nat Commun. 2017; 8 :573. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Luo D, Cheng W, Xu D, Yu W, Zong B, Chen H, Zhang X. Parameterized explainer for graph neural network. Adv Neural Inf Process Syst. 2020; 33 :19620–19631. [ Google Scholar ]
  • Lyu T, Gao J, Tian L, Li Z, Zhang P, Zhang J (2021) MDNN: a multimodal deep neural network for predicting drug–drug interaction events. In: Proceedings of the thirtieth international joint conference on artificial intelligence (IJCAI-21), pp 3536–3542. 10.24963/ijcai.2021/487
  • Ma T, Xiao C, Zhou J, Wang F (2018) Drug similarity integration through attentive Multiview graph auto-encoders. In: IJCAI 2018, proceedings of the 27th international joint conference on artificial intelligence, pp 3477–3483
  • Mahajan D, Kumar D (2018) Sentiment analysis using RNN and Google translator. In: 2018 8th international conference on cloud computing, data science & engineering (Confluence), pp 798–802. 10.1109/CONFLUENCE.2018.8442924
  • Mak IWY, Evaniew N, Ghert M. Lost in translation: animal models and clinical trials in cancer treatment. Am J Transl Res. 2014; 6 :114–118. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Marr B (2017) What is digital twin technology and why is it so important? Forbes. https://www.forbes.com/sites/bernardmarr/2017/03/06/what-is-digital-twin-technology-and-why-is-it-so-important
  • Matsuzaka Y, Uesawa Y. Prediction model with high-performance constitutive androstane receptor (CAR) using DeepSnap-deep learning approach from the tox21 10K compound library. Int J Mol Sci. 2019; 20 :4855. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Maul J-T, Djamei V, Kolios AG, Meier B, Czernielewskiand J, Jungo P. Efficacy and survival of systemic psoriasis treatments: an analysis of the SWISS registry SDNTT. Dermatology. 2016; 232 (6):640–647. [ PubMed ] [ Google Scholar ]
  • Mayani MG, Svendsen M, Oedegaard SI (2018) Drilling digital twin success stories the last 10 years. In: Proceedings of the SPE Norway one day seminar, Bergen, Norway. 10.2118/191336-MS
  • Metz JT, Johnson EF, Soni NB, Merta PJ, Kifle L, Hajduk PJ. Navigating the kinome. Nat Chem Biol. 2011; 7 :200–202. [ PubMed ] [ Google Scholar ]
  • Miller T. Explanation in artificial intelligence: insights from the social sciences. Artif Intell. 2019; 267 :1–38. [ Google Scholar ]
  • Miyato T, Dai AM, Goodfellow I (2016) Adversarial training methods for semisupervised text classification. http://arxiv.org/abs/1605.07725
  • Mohamed C, Nsiri B, Abdelmajid S, Abdelghani EM, Brahim B. Deep convolutional networks for image segmentation: application to optic disc detection. Int Conf Electr Inf Technol (ICEIT) 2020; 2020 :1–3. doi: 10.1109/ICEIT48248.2020.9113204. [ CrossRef ] [ Google Scholar ]
  • Mukhamediev RI, Symagulov A, Kuchin Y, Yakunin K, Yelis M. From classical machine learning to deep neural networks: a simplified scientometric review. Appl Sci. 2021; 11 :5541. doi: 10.3390/app11125541. [ CrossRef ] [ Google Scholar ]
  • Murdoch WJ, Singh C, Kumbier K, Abbasi-Asl R, Yu B. Definitions, methods, and applications in interpretable machine learning. Proc Natl Acad Sci USA. 2019; 116 :22071–22080. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Nag S, Baidya ATK, Mandal A, et al. Deep learning tools for advancing drug discovery and development. 3 Biotech. 2022; 12 :110. doi: 10.1007/s13205-022-03165-8. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Nagy ZK, Fevotte G, Kramer H, Simon LL. Recent advances in the monitoring, modelling, and control of crystallization systems. Chem Eng Res Des. 2013; 91 :1903–1922. [ Google Scholar ]
  • Narayanan H, Luna MF, von Stosch M, Cruz Bournazou MN, Polotti G, Morbidelli M, Butte A, Sokolov M. Bioprocessing in the digital age: the role of process models. Biotechnol J. 2020; 15 :e1900172. [ PubMed ] [ Google Scholar ]
  • Nascimento ACA, Prudêncio RBC, Costa IG. A multiple kernel learning algorithm for drug–target interaction prediction. BMC Bioinforma. 2016; 17 :46. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Needleman SB, Wunsch CD. A general method applicable to the search for similarities in the amino acid sequence of two proteins. J Mol Biol. 1970; 48 (3):443–453. [ PubMed ] [ Google Scholar ]
  • Nguyen T, Nguyen TT, Nguyen T, Le DH. Graph convolutional networks for drug response prediction. IEEE/ACM Trans Comput Biol Bioinform. 2021; 19 :146–154. [ PubMed ] [ Google Scholar ]
  • O’Connor TF, Yu LX, Lee SL. Emerging technology: a key enabler for modernizing pharmaceutical manufacturing and advancing product quality. Int J Pharm. 2016; 509 :492–498. [ PubMed ] [ Google Scholar ]
  • Oboyle NM, Sayle RA. Comparing structural fingerprints using a literature-based similarity benchmark. J Cheminform. 2016; 8 (1):1–14. doi: 10.1186/s13321-016-0148-0. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Olughu W, Deepika G, Hewitt C, Rielly C. Insight into the large-scale upstream fermentation environment using scaled-down models. J Chem Technol Biotechnol. 2019; 94 :647–657. [ Google Scholar ]
  • Oughtred R, Rust J, Chang C, Breitkreutz BJ, Stark C, Willems A, Boucher L, Leung G, Kolas N, Zhang F, Dolma S. The BioGRID database: a comprehensive biomedical resource of curated protein, genetic, and chemical interactions. Protein Sci. 2021; 30 (1):187–200. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Oztemel E, Gursev S. Literature review of Industry 4.0 and related technologies. J Intell Manuf. 2018; 31 :127–182. [ Google Scholar ]
  • Ozturk H, Ozturk A, Ozkirimli E. DeepDTA: Deep drug–target binding affinity prediction. Bioinformatics. 2018; 34 :i821–i829. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Pandey P, Katakdaunde M, Turton R. Modeling weight variability in a pan coating process using Monte Carlo simulations. AAPS Pharm Sci Tech. 2006; 7 :E2–E11. [ PubMed ] [ Google Scholar ]
  • Papadakis E, Woodley JM, Gani R (2018) Perspective on PSE in pharmaceutical process development and innovation. In Process. Systems engineering for pharmaceutical manufacturing. Elsevier, Amsterdam pp 597–656
  • Passi A, et al. RepTB: a gene ontology-based drug repurposing approach for tuberculosis. J Cheminform. 2018; 10 (1):24. doi: 10.1186/s13321-018-0276-9. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Peng J, Li J, Shang X. A learning-based method for drug–target interaction prediction based on feature representation learning and deep neural network. BMC Bioinform. 2020; 21 :1–13. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Perozzi B, Al-Rfou R, Skiena S (2014) DeepWalk: online learning of social representations. In: Proceeding of the ACM SIGKDD international conference on knowledge discovery and data mining, New York, NY, USA, 24–27 August 2014, pp 701–710
  • Poluzzi E, Raschi E, Piccinni C, De Ponti F (2012) data mining techniques in pharmacovigilance: analysis of the publicly accessible FDA adverse event reporting system (AERS). In: Data mining applications in engineering and medicine. London, United Kingdom: IntechOpen. 10.5772/50095
  • Pouryahya M, Oh JH, Mathews JC, Belkhatir Z, Moosmüller C, Deasy JO, Tannenbaum AR. Pan-cancer prediction of cell-line drug sensitivity using network-based methods. Int J Mol Sci. 2022; 23 :1074. doi: 10.3390/ijms23031074. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Qiu K, Lee J, Kim H, Yoon S, Kang K. Machine learning based anti-cancer drug response prediction and search for predictor genes using cancer cell line gene expression. Genomics Inform. 2021 doi: 10.5808/gi.20076. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Quan C, et al. Multichannel convolutional neural network for biological relation extraction. BioMed Res Int. 2016 doi: 10.1155/2016/1850404. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Raghava GP, Barton GJ. Quantification of the variation in percentage identity for protein sequence alignments. BMC Bioinf. 2006; 7 (1):415. doi: 10.1186/1471-2105-7-415. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Rampášek L, et al. Improving drug response prediction via modeling of drug perturbation effects. Bioinformatics. 2019 doi: 10.1093/bioinformatics/btz158. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Rantanen J, Khinast J. The future of pharmaceutical manufacturing sciences. J Pharm Sci. 2015; 104 :3612–3638. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Read EK, Park JT, Shah RB, Riley BS, Brorson KA, Rathore AS. Process analytical technology (PAT) for biopharmaceutical products: Part I. Concepts and applications. Biotechnol Bioeng. 2010; 105 :276–284. [ PubMed ] [ Google Scholar ]
  • Reinhardt IC, Oliveira DJC, Ring DDT. Current perspectives on the development of industry 4.0 in the pharmaceutical sector. J Ind Inf Integr. 2020; 18 :100131. [ Google Scholar ]
  • Ren S, Tao Y, Yu K, et al. De novo prediction of Cell-Drug sensitivities using deep learning-based graph regularized matrix factorization. Pacif Symp Biocomput. 2022 doi: 10.7490/f1000research.1118807.1. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Reza F, Reza S, Yadollah O. Computational prediction of drug–drug interactions based on drugs functional similarities. J Biomed Inform. 2017; 70 :54–64. [ PubMed ] [ Google Scholar ]
  • Richardson P, Grifn I, Tucker C, Smith D, Oechsle O, Phelan A, Rawling M, Savory E, Stebbing J. Baricitinib as potential treatment for 2019-nCoV acute respiratory disease. Lancet (london, England) 2020; 395 (10223):e30. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Rifaioglu AS, Atas H, Martin MJ, Cetin-Atalay R, Atalay V, Dogan T. Recent applications of deep learning and machine intelligence on in silico drug discovery: methods, tools and databases. Brief Bioinform. 2019; 20 :1878–1912. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Rosen R, von Wichert G, Lo G, Bettenhausen KD. About the importance of autonomy and digital twins for the future of manufacturing. IFAC-PapersOnLine. 2015; 48 :567–572. [ Google Scholar ]
  • Ryu JY, Kim HU, Lee SY. Deep learning improves prediction of drug–drug and drug–food interactions. PNAS. 2018; 115 (18):E4304–E4311. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Sachdev K, Gupta MK. A comprehensive review of feature-based methods for drug–target interaction prediction. J Biomed Inform. 2019; 93 :103159. [ PubMed ] [ Google Scholar ]
  • Sajjia M, Shirazian S, Kelly CB, Albadarin AB, Walker G. ANN analysis of a roller compaction process; in the pharmaceutical industry. Chem Eng Technol. 2017; 40 :487–492. [ Google Scholar ]
  • Sarker IH. Deep learning: a comprehensive overview on techniques, taxonomy, applications and research directions. SN Comput Sci. 2021; 2 :420. doi: 10.1007/s42979-021-00815-1. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Sawada R, Iwata M, Tabei Y, Yamato H, Yamanishi Y. Predicting inhibitory and activatory drug targets by chemically and genetically perturbed transcriptome signatures. Sci Rep. 2018; 8 :156. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Schleich B, Anwer N, Mathieu L, Wartzack S. Shaping the digital twin for design and production engineering. CIRP Ann. 2017; 66 :141–144. [ Google Scholar ]
  • Schlichtkrull MS, De Cao N, Titov I (2020) Interpreting graph neural networks for NLP with differentiable edge masking. http://arxiv.org/abs/2010.00577
  • Schwarz K. AttentionDDI: Siamese attention-based deep learning method for drug–drug interaction predictions. BMC Bioinf. 2021; 22 (1):412. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Scudellari M (2020) Five companies using AI to fight coronavirus. https://spectrum.ieee.org/the-human-os/artificial-intelligence/medical-ai/companies-ai-coronavirus
  • Seo S, Lee T, Kim MH, Yoon Y. Prediction of side effects using comprehensive similarity measures. BioMed Res Int. 2020 doi: 10.1155/2020/1357630. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Shang C, Liu Q, Chen KS, Sun J, Lu J, Yi J, Bi J (2018) Edge attention-based multi-relational graph convolutional networks. arXiv 2018; arXiv:1802.04944 .
  • Shao K, Zhang Z, He S, Bo X (2020) DTIGCCN: prediction of drug–target interactions based on GCN and CNN. In: Proceedings of the 2020 IEEE 2 nd international conference on tools with artificial intelligence (ICTAI), Baltimore, MD, USA, 9–11 November 2020, pp 337–342
  • Sharifi-Noghabi H, Zolotareva O, Collins CC, Ester M. MOLI: multi-omics late integration with deep neural networks for drug response prediction. Bioinformatics. 2019; 35 :i501–i509. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Shin B, Park S, Kang K, Ho JC. Self-attention based molecule representation for predicting drug–target interaction. Proc Mach Learn Res. 2019; 106 :1–18. [ Google Scholar ]
  • Shoemaker RH. The NCI60 human tumour cell line anticancer drug screen. Nat Rev Cancer. 2006; 6 :813–823. [ PubMed ] [ Google Scholar ]
  • Shrikumar A, Greenside P, Kundaje A (2017) Learning important features through propagating activation differences. In: Proceedings of the 34th international conference on machine learning 2017; 70, JMLR.org: Sydney, NSW, Australia. pp 3145–3153
  • Shtar G, Rokach L, Shapira B. Detecting drug–drug interactions using artificial neural networks and classic graph similarity measures. PLoS ONE. 2019; 14 (8):e0219796. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Silver D, Schrittwieser J, Simonyan K, Antonoglou I, Huang A, Guez A, Hubert T, Baker L, Lai M, Bolton A, et al. Mastering the game of go without human knowledge. Nature. 2017; 550 (7676):354–359. [ PubMed ] [ Google Scholar ]
  • Simon LL, Kiss AA, Cornevin J, Gani R. Process engineering advances in pharmaceutical and chemical industries: Digital process design, advanced rectification, and continuous filtration. Curr Opin Chem Eng. 2019; 25 :114–121. [ Google Scholar ]
  • Simonyan K, Vedaldi A, Zisserman A (2014) Deep inside convolutional networks: visualising image classification models and saliency maps. In: 2nd international conference on learning representations, ICLR 2014, Banff, AB, Canada, April 14–16, 2014, Workshop Track Proceedings; http://arxiv.org/abs/1312.6034
  • Smiatek J, Jung A, Bluhmki E. Towards a digital bioprocess. Replica: computational approaches in biopharmaceutical development and manufacturing. Trends Biotechnol. 2020; 38 (10):1141–1153. doi: 10.1016/j.tibtech.2020.05.008. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Song T, Zhang X, Ding M, Rodriguez-Paton A, Wang S, Wang G. DeepFusion: a deep learning based multi-scale feature fusion method for predicting drug–target interactions. Methods. 2022; 204 :269–277. [ PubMed ] [ Google Scholar ]
  • Springenberg JT (2015) Striving for simplicity: the all-convolutional Net. CoRR, http://arxiv.org/abs/1412.6806
  • Stark R, Fresemann C, Lindow K. Development and operation of digital twins for technical systems and services. CIRP Ann. 2019; 68 :129–132. [ Google Scholar ]
  • Steinwandter V, Borchert D, Herwig C. Data science tools and applications on the way to Pharma 4.0. Drug Discov Today. 2019; 24 :1795–1805. [ PubMed ] [ Google Scholar ]
  • Stokes JM, Yang K, Swanson K, Jin W, Cubillos-Ruiz A, Donghia NM, MacNair CR, French S, Carfrae LA, Bloom-Ackerman Z, et al. A deep learning approach to antibiotic discovery. Cell. 2020; 180 :688–702.e13. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Subramanian K. Digital twin for drug discovery and development—the virtual liver. J Indian Inst Sci. 2020; 100 :653–662. doi: 10.1007/s41745-020-00185-2. [ CrossRef ] [ Google Scholar ]
  • Subramanian A, Narayan R, Corsello SM, Peck DD, Natoli TE, Lu X, Gould J, Davis JF, Tubelli AA, Asiedu JK, et al. A next generation connectivity map: L1000 platform and the first 1,000,000 profiles. Cell. 2017; 171 :1437–1452.e17. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Sun X, Ma L, Du X, Feng J, Dong K (2018) Deep convolution neural networks for drug–drug interaction extraction. In: 2018 IEEE international conference on bioinformatics and biomedicine (BIBM), pp 1662–1668. 10.1109/BIBM.2018.8621405
  • Sun M, Zhao S, Gilvary C, Elemento O, Zhou J, Wang F. Graph convolutional networks for computational drug development and discovery. Brief Bioinform. 2020; 21 :919–935. [ PubMed ] [ Google Scholar ]
  • Sun M, Wang F, Elemento O, Zhou J. Structure-based drug–drug interaction detection via expressive graph convolutional networks and deep sets. Proc AAAI Conf Artif Intell. 2020; 34 (10):13927–13928. doi: 10.1609/aaai.v34i10.7236. [ CrossRef ] [ Google Scholar ]
  • System HSL (2006) Psychoactive Drug Screening Program. https://www.hsls.pitt.edu/obrc/index.php?page=URL1133202727
  • Tajbakhsh N, et al. Convolutional neural networks for medical image analysis: full training or fine tuning? IEEE Trans Med Imaging. 2016; 35 (5):1299–1312. doi: 10.1109/TMI.2016.2535302. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Tang J, Szwajda A, Shakyawar S, Xu T, Hintsanen P, Wennerberg K, Aittokallio T. Making sense of large-scale kinase inhibitor bioactivity data sets: a comparative and integrative analysis. J Chem Inf Model. 2014; 54 :735–743. [ PubMed ] [ Google Scholar ]
  • Tang P, Xu J, Louey A, Tan Z, Yongky A, Liang S, Li ZJ, Weng Y, Liu S. Kinetic modeling of Chinese hamster ovary cell culture: factors and principles. Crit Rev Biotechnol. 2020; 40 :265–281. [ PubMed ] [ Google Scholar ]
  • Tao F, Cheng J, Qi Q, Zhang M, Zhang H, Sui F. Digital twin-driven product design, manufacturing and service with big data. Int J Adv Manuf Technol. 2018; 94 :3563–3576. [ Google Scholar ]
  • Tatonetti NP, et al. Data-driven prediction of drug effects and interactions. Sci Transl Med. 2012; 4 (125):12531. doi: 10.1126/scitranslmed.3003377. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Tatonetti NP, Patrick PY, Daneshjou R, Altman RB. Data driven prediction of drug effects and interactions. Sci Transl Med. 2012; 4 (125):125ra31–125ra31. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Tehseen Z, Usman Z. Long short-term memory recurrent neural network architectures for Urdu acoustic modelling. Int J Speech Technol. 2019; 22 (1):21–30. doi: 10.1007/s10772-018-09573-7. [ CrossRef ] [ Google Scholar ]
  • Thafar M, Raies AB, Albaradei S, Essack M, Bajic VB. Comparison study of computational prediction tools for drug–target binding affinities. Front Chem. 2019; 7 :782. doi: 10.3389/fchem.2019.00782. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Thafar MA, Olayan RS, Olayan RS, Ashoor H, Ashoor H, Albaradei S, Albaradei S, Bajic VB, Gao X, et al. DTiGEMS: drug–target interaction prediction using graph embedding, graph mining, and similarity-based techniques. J Cheminform. 2020; 12 :1–17. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Thafar MA, Alshahrani M, Albaradei S, et al. Affinity2Vec: drug–target binding affinity prediction through representation learning, graph mining, and machine learning. Sci Rep. 2022; 12 :4751. doi: 10.1038/s41598-022-08787-9. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Thorben F, Megha Kh, Avishek A (2021) Hard masking for explaining graph neural networks. In Submitted to international conference on learning representations https://openreview.net/forum?id=uDN8pRAdsoC
  • Tian X, Xin M, Luo J, Jiang Z. Using the ranking-based KNN approach for drug repositioning based on multiple information. Cham: Springer; 2016. pp. 317–327. [ Google Scholar ]
  • Tong H, Heidemeyer M, Ban F, Cherkasov A, Ester M. SimBoost: A read-across approach for predicting drug–target binding affinities using gradient boosting machines. J Cheminform. 2017; 9 :1–14. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Torng W, Altman RB. Graph convolutional neural networks for predicting drug–target interactions. J Chem Inf Model. 2019; 59 :4131–4149. [ PubMed ] [ Google Scholar ]
  • Townshend RJL, Powers A, Eismann S, Derry A (2021) ATOM3D: tasks on molecules in three dimensions. arXiv 2021: arXiv:2012.04035
  • Trißl S, Rother K, Müller H, et al. Columba: an integrated database of proteins, structures, and annotations. BMC Bioinformatics. 2005; 6 :81. doi: 10.1186/1471-2105-6-81. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Trott O, Olson AJ. AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization and multithreading. J Comput Chem. 2010; 31 :455. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Tyson RJ, Park CC, Powell JR, Patterson JH, Weiner D, Watkins PB, Gonzalez D. Precision dosing priority criteria: drug, disease, and patient population variables. J Front Pharmacol. 2020 doi: 10.3389/fphar.2020.00420. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • U. Consortium UniProt: a hub for protein information. Nucleic Acids Res. 2014; 43 (D1):D204–D212. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Vazquez J, Lopez M, Gibert E, Herrero E, Luque FJ. Merging ligand-based and structure-based methods in drug discovery: an overview of combined virtual screening approaches. Molecules. 2020; 25 :4723. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Venkatasubramanian V. The promise of artificial intelligence in chemical engineering: is it here, finally? AIChE J. 2019; 65 :466–478. [ Google Scholar ]
  • Vermeer NS, Straus SM, Mantel-Teeuwisse AK, Domergue F, Egberts TC, Leufkens HG, De Bruin ML. Traceability of biopharmaceuticals in spontaneous reporting systems: a cross sectional study in the FDA adverse event reporting system (FAERS) and surveillance databases. Drug Saf. 2013; 36 (8):617–625. [ PubMed ] [ Google Scholar ]
  • Vilar S, Hripcsak GJ. Leveraging 3D chemical similarity, target and phenotypic data in the identification of drug-protein and drug-adverse effect associations. J Cheminform. 2016; 8 (1):35. doi: 10.1186/s13321-016-0147-1. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Vilar S, Uriarte E, Santana L, Lorberbaum T, Hripcsak G, Friedman C, Tatonetti NP. Similarity-based modeling in large-scale prediction of drug–drug interactions. Nat Protoc. 2014; 9 (9):2147–2163. doi: 10.1038/nprot.2014.151. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Wallach I, Dzamba M, Heifets A (2015) AtomNet: a deep convolutional neural network for bioactivity prediction in structure-based drug discovery. arXiv 2015: arXiv:1510.02855
  • Wan F, et al. DeepCPI: a deep learning-based framework for large-scale in silico drug screening. Genom Proteomics Bioinform. 2019; 17 :478–495. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Wang JZ, et al. A new method to measure the semantic similarity of GO terms. Bioinformatics. 2007; 23 (10):1274–1281. doi: 10.1093/bioinformatics/btm087. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Wang W, et al. Drug repositioning by integrating target information through a heterogeneous network model. Bioinformatics. 2014; 30 (20):2923–2930. doi: 10.1093/bioinformatics/btu403. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Wang CS, Lin PJ, Cheng CL, Tai SH, Kao Yang YH, Chiang JH. Detecting potential adverse drug reactions using a deep neural network model. J Med Internet Res. 2019; 21 (2):e11016. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Wang T, Yi HC, You ZH, Li LP, Wang YB, Hu L, Wong L (2019) A gated recurrent unit model for drug repositioning by combining comprehensive similarity measures and Gaussian interaction profile kernel. In: International conference on intelligent computing. Springer, Cham. pp 344–353
  • Wang YB, You ZH, Yang S, et al. A deep learning-based method for drug–target interaction prediction based on long short-term memory neural network. BMC Med Inform Decis Mak. 2020; 20 :49. doi: 10.1186/s12911-020-1052-0. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Wang H, Wang J, Dong C, Lian Y, Liu D, Yan Z. A novel approach for drug–target interactions prediction based on multimodal deep autoencoder. Front Pharmacol. 2020; 10 :1–19. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Watanabe JH, McInnis T, Hirsch JD. Cost of prescription drug-related morbidity and mortality. Ann Pharmacother. 2018; 52 :829–837. doi: 10.1177/1060028018765159. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Way GP, Greene CS. Extracting a biologically relevant latent space from cancer transcriptomes with variational autoencoders. Pac Symp Biocomput. 2018; 23 :80–91. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Wei J, Lu Z, Qiu K, Li P, Sun H. Predicting drug risk level from adverse drug reactions using SMOTE and machine learning approaches. IEEE Access. 2020; 8 :185761–185775. doi: 10.1109/ACCESS.2020.3029446. [ CrossRef ] [ Google Scholar ]
  • Weinstein JN. Integromic analysis of the NCI-60 cancer cell lines. Breast Dis. 2004; 19 :11–22. [ PubMed ] [ Google Scholar ]
  • Wen M, Zhang Z, Niu S, Sha H, Yang R, Yun Y, Lu H. Deep-learning-based drug–target interaction prediction. J Proteome Res. 2017; 16 :1401–1409. [ PubMed ] [ Google Scholar ]
  • Wenzel J, Matter H, Schmidt F. Predictive multitask deep neural network models for adme-tox properties: learning from large data sets. J Chem Inf Model. 2019; 59 :1253–1268. [ PubMed ] [ Google Scholar ]
  • White J, Schiffer JT, Bender R, et al. Drug combinations as a first line of defense against coronaviruses and other emerging viruses. Mbio. 2021; 12 (6):e0334721. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Withnall M, Lindelöf E, Engkvist O, Chen H. Building attention and edge message passing neural networks for bioactivity and physical-chemical property prediction. J Cheminform. 2020; 12 :1. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Wu Z, Ramsundar B, Feinberg EN, Gomes J, Geniesse C, Pappu AS, Leswing K, Pande V. MoleculeNet: a benchmark for molecular machine learning. Chem Sci. 2018; 9 :513–530. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Wu Z, Pan S, Chen F, et al. A comprehensive survey on graph neural networks. IEEE Trans Neural Netw Learn Syst. 2020; 32 :4–24. [ PubMed ] [ Google Scholar ]
  • Xia Z, Wu LY, Zhou X, Wong ST. Semi-supervised drug-protein interaction prediction from heterogeneous biological spaces. BMC Syst Biol. 2010; 4 :S6. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Xiang W, Yingxin W, An Z, Xiangnan H, Tat-seng C (2021) Causal screening to interpret graph neural networks. In Submitted to international conference on learning representations. https://www.openreview.net/forum?id=nzKv5vxZfge
  • Xie L, He S, Song X, Bo X, Zhang Z. Deep learning-based transcriptome data classification for drug–target interaction prediction. BMC Genomics. 2018; 19 :13–16. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Xie Y, Peng J, Zhou Y, et al (2019) Integrating protein-protein interaction information into drug response prediction by graph neural encoding. 16 December 2019, Available at Research Square 10.21203/rs.2.18936/v1.
  • Xu Y, Pei J, Lai L. Deep learning-based regression and multiclass models for acute oral toxicity prediction with automatic chemical feature extraction. J Chem Inf Model. 2017; 57 :2672–2685. [ PubMed ] [ Google Scholar ]
  • Yan CK, Wang WX, Zhang G, et al. BiRWDDA: a novel drug repositioning method based on multisimilarity fusion. J Comput Biol. 2019; 26 (11):1230–1242. [ PubMed ] [ Google Scholar ]
  • Yan C, Duan G, Zhang Y, Wu F-X, Pan Y, Wang J. Predicting drug–drug interactions based on integrated similarity and semi-supervised learning. IEEE/ACM Trans Comput Biol Bioinf. 2022; 19 (1):168–179. doi: 10.1109/TCBB.2020.2988018. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Yang K, Swanson K, Jin W, Coley C, Eiden P, Gao H, Guzman-Perez A, Hopper T, Kelley B, Mathea M, et al. Analyzing learned molecular representations for property prediction. J Chem Inf Model. 2019; 59 :3370–3388. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Yi HC, You ZH, Wang L, et al. In silico drug repositioning using deep learning and comprehensive similarity measures. BMC Bioinf. 2021; 22 :293. doi: 10.1186/s12859-020-03882-y. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Yifan D, Xinran X, Yang Q, Jingbo X, Wen Z, Shichao L. A multimodal deep learning framework for predicting drug–drug interaction events. Bioinformatics. 2020; 36 :4316–4322. [ PubMed ] [ Google Scholar ]
  • Ying Z, Bourgeois D, You J, Zitnik M, Leskovec J. Gnnexplainer: generating explanations for graph neural networks. Adv Neural Inf Process Syst. 2019; 32 :9244–9255. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Yu Y, Si X, Hu C, Zhang J. A review of recurrent neural networks: Lstm cells and network architectures. Neural Comput. 2019; 31 :1235–1270. [ PubMed ] [ Google Scholar ]
  • Yu Y, Huang K, Zhang C, Glass LM, Sun J, Xiao C. SumGNN: multi-typed drug interaction prediction via efficient knowledge graph summarization. Bioinformatics. 2021; 37 (18):2988–2995. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Yuan H, Yu H, Wang J, Li K, Ji S (2021) On explain-ability of graph neural networks via subgraph explorations. http://arxiv.org/abs/2102.05152
  • Yue X, Wang Z, Huang J, Parthasarathy S, Moosavinasab S, Huang Y, Lin SM, Zhang W, Zhang P, Sun H. Graph embedding on biomedical networks: methods, applications, and evaluations. Bioinformatics. 2020; 36 (4):1241–1251. doi: 10.1093/bioinformatics/btz718. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Yunsheng B, Ken G, Yizhou S, Wei W (2020) Bi-level graph neural networks for drug–drug interaction prediction. J Comput Eng arXiv:2006.14002
  • Zaikis D, Vlahavas I (2020) Drug–drug interaction classification using attention based neural networks. In: 11th Hellenic conference on artificial intelligence, pp 34–40. 10.1145/3411408.3411461
  • Zeng H, Qiu C, Cui QJD. Drug-path: a database for drug-induced pathways. J Biol Databases Curation. 2015 doi: 10.1093/database/bav061. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Zeng T, Rongjian L, Ravi M, Jieping Y, Shuiwang J. Deep convolutional neural networks for annotating gene expression patterns in the mouse brain. BMC Bioinformatics. 2015; 16 (1):147. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Zeng X, et al. Measure clinical drug–drug similarity using electronic medical records. Int J Med Inf. 2019; 124 :97–103. doi: 10.1016/j.ijmedinf.2019.02.003. [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Zeng X, Zhu S, Lu W, Liu Z, Huang J, Zhou Y, Fang J, Huang Y, Guo H, Li L, et al. Target identification among known drugs by deep learning from heterogeneous networks. Chem Sci. 2020; 11 :1775–1797. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Zhai J, Zhang S, Chen J, He Q (2018) Autoencoder and its various variants. In: 2018 IEEE international conference on systems, man, and cybernetics (SMC), pp 415–419. 10.1109/SMC.2018.00080
  • Zhang Y. Predicting drug–drug interactions using multi-modal deep autoencoders based network embedding and positive-unlabeled learning. Methods. 2020; 179 :37–46. [ PubMed ] [ Google Scholar ]
  • Zhang M-L, Zhou Z-H. Ml-knn: a lazy learning approach to multi-label learning. Pattern Recogn. 2007; 40 (7):2038–2048. [ Google Scholar ]
  • Zhang H, Liu D, Xiong Z (2018) Convolutional neural network-based video super-resolution for action recognition. In: 2018 13th IEEE international conference on automatic face & gesture recognition (FG 2018), pp 746–750. 10.1109/FG.2018.00117
  • Zhang Y, Weng Y, Lund J. Applications of explainable artificial intelligence in diagnosis and surgery. Diagnostics. 2022; 12 :237. doi: 10.3390/diagnostics12020237. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Zhang C, Lu Y, Zang T. CNN-DDI: a learning-based method for predicting drug–drug interactions using convolution neural networks. BMC Bioinf. 2022; 23 :88. doi: 10.1186/s12859-022-04612-2. [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Zhao Y, Zheng K, Guan B, Guo M, Song L, Gao J, Qu H, Wang Y, Shi D, Zhang Y. DLDTI: a learning-based framework for drug–target interaction identification using neural networks and network representation. J Transl Med. 2020; 18 :434. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Zhao Q, Xiao F, Yang M, Li Y, Wang J (2019) AttentionDTA: prediction of drug–target binding affinity using attention model. In: Proceedings of the 2019 IEEE international conference on bioinformatics and biomedicine, San Diego, CA, USA, 18–21 November 2019, pp 64–69
  • Zhou Y, Zhang Y, Lian X, Li F, Wang C, Zhu F, Qiu Y, Chen Y. Therapeutic target database update 2022: facilitating drug discovery with enriched comparative data of targeted agents. Nucleic Acids Res. 2022; 50 :1398–1407. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Zitnik M, Agrawal M, Leskovec J. Modeling polypharmacy side effects with graph convolutional networks. Bioinformatics. 2018; 34 (13):i457–i466. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Zitnik SM, Sosic R, Leskovec J (2018) Biosnap datasets: Stanford biomedical network dataset collection. http://snap.stanford.edu/biodata
  • Zong N, Kim H, Ngo V, Harismendy O. Deep mining heterogeneous networks of biomedical linked data to predict novel drug–target associations. Bioinformatics. 2017; 33 :2337–2344. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Zügner D, Akbarnejad A, Günnemann S (2018) Adversarial attacks on neural networks for graph data. In: Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery and Data Mining. 2018, Association for Computing Machinery: London, United Kingdom. pp 2847–2856

Machine Learning in Drug Discovery: A Review

  • Published: 11 August 2021
  • Volume 55 , pages 1947–1999, ( 2022 )

  • Suresh Dara   ORCID: orcid.org/0000-0002-1626-8701 1 ,
  • Swetha Dhamercherla 1 ,
  • Surender Singh Jadav 2 ,
  • CH Madhu Babu 1 &
  • Mohamed Jawed Ahsan 3  

This review surveys the literature on drug discovery using ML tools and techniques, which are applied in every phase of drug development to accelerate research and reduce the risk and expenditure of clinical trials. Machine learning techniques improve decision-making on pharmaceutical data across applications such as QSAR analysis, hit discovery, and de novo drug design, yielding more accurate outcomes. Target validation, prognostic biomarkers, and digital pathology are treated as problem statements in this review. A major challenge for ML is the limited interpretability of its outcomes, which may restrict its application in drug discovery. In clinical trials, comprehensive and systematic data must be generated to address open problems in validating ML techniques, improving decision-making, promoting awareness of ML approaches, and reducing the risk of failure in drug discovery.

1 Introduction

In computer science, artificial intelligence (AI) is also referred to as machine intelligence, because machines are trained or programmed to perform activities in the manner of a human brain (Poole et al. 1998 ; Vinod and Anand 2021 ; Gopal 2018 ). AI can be characterized as a field concerned with the design and application of numerous algorithms for interpreting data and acquiring knowledge from it. The concept is closely related to many fields, such as pattern recognition, probability theory, statistics, and machine learning, and to procedures such as fuzzy models and neural networks, which are collectively known as “Computational Intelligence” (Vinod and Anand 2021 ; Engelbrecht 2007 ; Konar 2006 ; Duda et al. 2012 ; Webb 2003 ; Friedman et al. 2001 ). AI strategies are applied to many complex tasks, including classification, regression, prediction, and optimization. Machine learning has to be adapted to the data at hand: first, a particular model is specified along with its parameters. The machine then learns those parameters from training data. The fitted model can subsequently make predictions on future data and extract information from it (Alpaydin 2020 ).
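The last three steps of the paragraph above (specify a model with parameters, learn the parameters from training data, predict on new data) can be illustrated with a minimal sketch. It assumes the simplest possible model, a straight line fitted by least squares; the function names and the toy data are hypothetical and are not taken from the cited works.

```python
# Minimal sketch: learn the parameters of a model from training data,
# then use the fitted model to predict an unseen input.
# The model (a straight line) and the data are assumptions for illustration.

def fit_line(xs, ys):
    """Least-squares estimates of slope and intercept for y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(params, x):
    """Apply the fitted parameters to a new input."""
    slope, intercept = params
    return slope * x + intercept

train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [3.0, 5.0, 7.0, 9.0]        # hypothetical data lying on y = 2x + 1
params = fit_line(train_x, train_y)   # learning step
prediction = predict(params, 5.0)     # prediction on unseen data
```

More expressive models follow the same pattern: only the form of the model and the procedure used to fit its parameters change.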

In this review, we focus primarily on the qualities of AI approaches that are appropriate for drug development and discovery (Duch et al. 2007 ). Several recent developments have increased enthusiasm for machine learning approaches in the pharmaceutical industry. Figure  1 shows the various fields of drug discovery and development in which machine learning is used. The phases form a pipeline for realizing a therapeutic concept. Each phase has its own time and cost requirements, and each is carried out to demonstrate the effectiveness of the treatment. Medical information is now mined and assessed using ‘omics’ and ‘smart automation’ tools. Extending these techniques into biology brings both opportunities and challenges for the pharmaceutical industry, since the objective of many pharmaceutical enterprises is to identify a compelling clinical hypothesis. With the results obtained, practitioners or clinicians can develop medications. Machine learning approaches have proved their worth in drug development across the pharmaceutical industry. Combined with practically unlimited storage, growth in the size and variety of datasets provides a foundation for machine learning, which can thereby draw on the enormous volume of data held by pharmaceutical companies. These data take different forms, such as text, images, assay information, biometrics, and high-dimensional omics data (Mamoshina et al. 2018 ).

Thus, the AI field has moved from theoretical work to real-world data. Progress has also been driven by advances in computer hardware, such as graphics processing units (GPUs), which speed up computation. Deep learning, one family of machine learning algorithms (LeCun et al. 2015 ), has recently achieved notable success in public challenges (Chen et al. 2018 ; Hinton 2018 ). Over the past 2 years, the use of ML algorithms within pharmaceutical enterprises has expanded greatly.

Fig. 1 Various fields of drug discovery using machine learning

In the clinical field, new drugs for persistent diseases were long discovered by identifying the active components of existing treatments, with penicillin as a well-known example. Libraries of natural substances and small molecules were screened in cells or intact organisms to detect substances with a therapeutic effect. This procedure is called classical pharmacology.

High-throughput screening of large compound libraries has expanded since the sequencing of the human genome, which enabled cloning strategies and the purification of proteins in large quantities. Screening large compound libraries against biological targets thought to modify a disease is called reverse pharmacology. Hits from these screens are then tested in cells and subsequently in animals for efficacy. Modern drug discovery involves identifying screening hits and optimizing them to improve potency, affinity, and metabolic stability. If a compound satisfies all these requirements, it proceeds to clinical trials and, if successful there, is developed into a drug. The drug development and discovery process comprises target identification and validation, hit discovery, lead optimization, and clinical trials (Vohora and Singh 2018 ). Developing a novel drug costs approximately 2.558 billion USD (DiMasi et al. 2016 ) and is a lengthy procedure, taking about 10–15 years to reach the market (Turner 2010 ). Investors commit large sums to advancing a small number of molecules through clinical trials, yet the success rate remains a disappointing 13%. To overcome this problem, researchers have adopted computer-assisted drug design (CADD) (Hassan Baig et al. 2016 ). In drug discovery, these computational techniques not only predict molecular properties at the theoretical level (selectivity, bioactivity, side effects, and absorption, distribution, metabolism, and excretion) but also generate lead compounds with desirable attributes in silico. In addition, multi-objective optimization techniques can reduce attrition costs at the preclinical stage.

In drug discovery, computational intelligence provides various techniques for analysis and learning, and clarifies how AI can be used to find medications in an automated and integrated manner (Duch et al. 2007 ). Many pharmaceutical companies have therefore shown great enthusiasm for investing in technologies and resources that yield accurate results in drug discovery. This survey reviews AI techniques, including deep learning, for multiple applications in drug discovery and development, and discusses the outcomes expected from computational intelligence in this area (Table 1 ).

1.1 Roadmap

The rest of the article is arranged as follows: Sect.  2 describes the application of AI in drug design. The various machine learning methods for drug discovery are discussed in Sect.  3 . Drug design applications are discussed in Sect.  4 , and Sect.  5 discusses different drug design problems. Sect.  6 presents the research challenges in drug discovery using machine learning, with a few possible suggestions, and Sect.  7 concludes the article and provides some future directions.

2 Application of AI in drug design

This section discusses a few applications of AI related to drug research. Protein structure prediction is one such application in drug design. Many disorders in the human body arise from protein dysfunction. Structure-based drug design strategies are used to identify small molecules for protein targets. Determining a protein's 3D structure requires considerable money and time, and de novo 3D structure prediction still falls short in accuracy. Deep learning and feature extraction tools have been used to predict secondary structure (Spencer et al. 2014 ) and protein residue contacts (Li et al. 2017 ). Feature extraction captures the relationship between sequence and structure. A further goal is to improve the accuracy of 3D protein structure prediction with deep learning. To design drugs that target protein–protein interactions (PPIs), the PPI interface must be investigated (Xue et al. 2015 ).

Artificial intelligence has been used in various applications, such as predicting drug–protein interactions, discovering drug efficacy, and ensuring safety biomarkers. These are discussed in detail below.

2.1 Prediction on drug–protein interactions

A crucial step in in silico drug development is predicting drug–protein interactions from multiple biological sources. Large-scale prediction is complicated by the countless unknown interactions. Semi-supervised training techniques can be used to handle this mix of labelled and unlabelled data, and are reported to give better results than using labelled data alone. In addition, the semi-supervised approach integrates chemical structure, drug–protein interaction network data, and genome sequence data. In that study, predictions on data sets covering ion channels, enzymes, and nuclear receptors were good (Xia et al. 2010 ).
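The idea of combining a few labelled examples with many unlabelled ones can be sketched with simple label propagation over a similarity graph. This is a stand-in chosen for brevity, not the method of the cited study; the node names, edge weights, and labels are hypothetical.

```python
# Sketch of semi-supervised learning by label propagation (an assumed,
# simplified stand-in, not the cited method).
# Nodes are hypothetical drug-protein pairs; edge weights are hypothetical
# similarities. Labelled nodes keep their label (1.0 = known interaction,
# 0.0 = known non-interaction). Each unlabelled node repeatedly takes the
# weighted average of its neighbours' scores.

def propagate(weights, labels, iterations=100):
    scores = {node: labels.get(node, 0.5) for node in weights}
    for _ in range(iterations):
        updated = {}
        for node, neighbours in weights.items():
            if node in labels:                  # clamp labelled nodes
                updated[node] = labels[node]
                continue
            total = sum(neighbours.values())
            updated[node] = sum(w * scores[m] for m, w in neighbours.items()) / total
        scores = updated
    return scores

# A chain p1 - p2 - p3 - p4 in which only the two ends are labelled.
weights = {
    "p1": {"p2": 1.0},
    "p2": {"p1": 1.0, "p3": 1.0},
    "p3": {"p2": 1.0, "p4": 1.0},
    "p4": {"p3": 1.0},
}
labels = {"p1": 1.0, "p4": 0.0}
scores = propagate(weights, labels)
```

In this toy graph the unlabelled pair closer to the known interaction receives the higher score, which is the behaviour the unlabelled data are meant to contribute.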

Drugs exert their therapeutic activity through interactions with proteins. Drug–protein interaction (DPI) databases focus primarily on therapeutic protein targets, while knowledge of non-targets remains limited. Computational techniques can fill this knowledge gap by predicting protein targets for drug molecules. In that study, a pool of 35 predictors relied largely on the similarity between drugs and protein targets. Three types of similarity emerge from these 35 predictors: drug structure, target sequence, and drug profile. Finally, the content of the source databases, and the relationships and implications among them, are of great importance for therapeutic activity (Wang and Kurgan 2020 ).
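A similarity-based prediction of this kind can be sketched as a nearest-neighbour lookup on drug structure. The sketch assumes that each drug is described by a set of structural feature identifiers compared with the Tanimoto coefficient; the drug names, features, and targets are invented for illustration and are not drawn from the surveyed predictors.

```python
# Sketch of similarity-based drug-target prediction.
# Drugs are represented by hypothetical sets of structural feature IDs;
# all names and values below are made up for illustration.

def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two feature sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

known_drugs = {
    "drug_A": {"features": {1, 2, 3, 4}, "targets": {"protein_X"}},
    "drug_B": {"features": {5, 6, 7}, "targets": {"protein_Y"}},
}

def predict_targets(query_features):
    """Return the targets of the most similar known drug, with the similarity."""
    best = max(
        known_drugs,
        key=lambda name: tanimoto(query_features, known_drugs[name]["features"]),
    )
    score = tanimoto(query_features, known_drugs[best]["features"])
    return known_drugs[best]["targets"], score

# The query shares three of five distinct features with drug_A, none with drug_B.
targets, score = predict_targets({1, 2, 3, 9})
```

Target-sequence and drug-profile similarities could be plugged into the same scheme by swapping the similarity function.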

In drug repurposing, detecting unexpected drug–protein interactions is essential. A given drug may thus be useful for repurposing, but side effects are unavoidable, and about 1,000 human proteins can cause critical side effects. A proteome-scale method was used to predict side effects and protein targets, with FINDSITEcomb used to predict drug–protein interactions. The predictions indicated that drugs interact broadly, with a mean of 329 human targets per drug (Zhou et al. 2015 ).

2.2 Discovery of drug efficacy

Usually, a drug's effect is assessed through its biochemical activity. Relating this biochemical activity to therapeutic effectiveness remains a challenge. To fill this gap, a large body of data on the cellular effects of drugs was collected, in that study for drugs classified as psychotropic. The microarray data were analysed with classification trees and random forests, yielding a gene-expression biomarker profile of efficacy. Using this efficacy profile to analyse drug treatments, the classification tree reached an accuracy of 88.9% and the random forest model 83.3%. Genomic data measured at the cellular level can therefore help relate the physiological effects of new drugs to their clinical application. Finally, in vitro gene expression signatures can identify the effectiveness of therapeutic activity, which can help in target validation and drug development (Gunther et al. 2003 ).

In drug development, making the validation of new drugs more profitable requires predicting efficacy and identifying targets. The network proximity between a drug and a disease helps to assess the effectiveness of a treatment and to single out drugs that are therapeutically effective. The study analysed 238 drugs used in 78 diseases to characterize therapeutic effect and to highlight efficacy issues for drugs used in various disorders. A network-based system was used to develop a drug–disease proximity measure that quantifies the interaction between the disease and the drug targets. Network-based proximity thus makes it possible to predict novel drug–disease associations, offering opportunities for detecting adverse effects and for drug repurposing (Guney et al. 2016 ).
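One plausible form of such a proximity measure is the average, over a drug's targets, of the shortest-path distance to the nearest disease gene in an interaction network. The sketch below implements that form as an assumption; it is not necessarily the exact measure of the cited study, and the graph, targets, and disease genes are hypothetical.

```python
from collections import deque

def shortest_distances(graph, source):
    """Breadth-first search distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def closest_proximity(graph, drug_targets, disease_genes):
    """Mean, over drug targets, of the distance to the nearest disease gene.

    Assumes every drug target can reach at least one disease gene.
    """
    total = 0
    for target in drug_targets:
        dist = shortest_distances(graph, target)
        total += min(dist[g] for g in disease_genes if g in dist)
    return total / len(drug_targets)

# Hypothetical interaction network: a simple path A - B - C - D - E.
graph = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}
# Target A is 3 steps from the nearest disease gene (D); target C is 1 step.
proximity = closest_proximity(graph, drug_targets=["A", "C"], disease_genes=["D", "E"])
```

A smaller value means the drug's targets lie closer to the disease genes in the network. In practice the raw distance would normally be compared against a random baseline before being interpreted.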

2.3 Ensuring the safety biomarkers

In drug development, biomarkers support safety assessment, which depends critically on establishing the biological and analytical validity of a given biomarker. Stakeholders can thereby assess whether claims are supported for a particular purpose and whether the required standards are met. To reach agreement among stakeholders on how such evidence is evaluated, an evaluation process is needed that accommodates the unique characteristics of each biomarker and determines how new biomarkers are analysed, integrated, and interpreted, and how improved biomarkers are measured against conventional comparators (Sistare et al. 2010 ).

In this survey, we found that modern medicines are no safer than older drugs, even with longer clinical trial programmes. Once these drugs reach the market, the surveillance carried out is not systematic enough to ensure safety. Earlier drug-related safety signals can help improve drug safety and identify underlying biomarkers of toxicity. Safety markers can, however, differ between target systems. No approach can guarantee that medicines are entirely safe, but a common understanding of benefit and risk assessment can be developed by communicating with the public (Rolan et al. 2007 ).

Various deep learning techniques have been used to predict the PPI interface and show excellent results compared with the SVM technique (Du et al. 2016 ). PPIs are nevertheless complex to exploit in biological studies (Falchi et al. 2014 ; Scott et al. 2016 ). Each PPI interface can involve a mixture of different residues (Cukuroglu et al. 2014 ). PPIs represent a new class of pharmaceutical targets, distinct from conventional targets such as ion channels, G-protein-coupled receptors (GPCRs), and kinases (Higueruelo et al. 2013 ; Santos et al. 2017 ). iFitDock is a docking tool used to investigate hotspots in PPIs. AI techniques have also been used to identify structures and hotspots at the PPI interface (Fig. 2 ).

Fig. 2 Applications of AI in drug discovery, depicting the machine learning mechanisms

3 Machine learning methods to drug discovery

AI has become a high priority in drug design thanks to improvements in ML approaches and the accumulation of pharmacological data. AI does not depend on theoretical advances; rather, its strength lies in turning medical information into reusable methods. ML encompasses approaches such as Random Forest (RF), Naive Bayesian Classification (NBC), Multiple Linear Regression (MLR), Logistic Regression (LR), Linear Discriminant Analysis (LDA), Probabilistic Neural Networks (PNN), Multi-Layer Perceptron (MLP), and Support Vector Machine (SVM) (Lavecchia and Di Giovanni 2013 ). Deep learning in particular is used in drug design for its capability in feature extraction and generalization. Fig.  3 outlines the AI procedures used to address drug discovery questions in this review, together with their applications. Supervised learning techniques, namely classification and regression methods, are used to predict categorical or continuous variables, while unsupervised methods are used to build models that cluster the data.
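The distinction between supervised and unsupervised learning can be made concrete with a small sketch on the same toy measurements. It assumes a nearest-centroid classifier for the supervised case and two-cluster k-means for the unsupervised case; both choices, and the "active"/"inactive" data, are illustrative assumptions rather than methods prescribed by the review.

```python
# Supervised: labelled examples are used to fit a nearest-centroid classifier.
def fit_centroids(values, labels):
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    return {label: sum(vs) / len(vs) for label, vs in groups.items()}

def classify(centroids, value):
    """Assign the label whose centroid is closest to the value."""
    return min(centroids, key=lambda label: abs(value - centroids[label]))

# Unsupervised: no labels are given; two-cluster k-means groups the values.
def kmeans_1d(values, iterations=10):
    centres = [min(values), max(values)]      # simple deterministic start
    for _ in range(iterations):
        clusters = [[], []]
        for v in values:
            nearest = 0 if abs(v - centres[0]) <= abs(v - centres[1]) else 1
            clusters[nearest].append(v)
        centres = [sum(c) / len(c) if c else centres[i]
                   for i, c in enumerate(clusters)]
    return centres

# Hypothetical one-dimensional measurements for six compounds.
values = [0.9, 1.1, 1.0, 4.9, 5.1, 5.0]
labels = ["inactive", "inactive", "inactive", "active", "active", "active"]

centroids = fit_centroids(values, labels)   # uses the labels
predicted = classify(centroids, 4.2)
centres = kmeans_1d(values)                 # ignores the labels
```

Here the clustering recovers the same two groups without seeing any labels, but it cannot name them; attaching meaning to the clusters is left to the analyst.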

In traditional ML models, many features are designed manually, whereas deep learning approaches learn features automatically from the available data, because multi-layer feature extraction converts simple features into complex ones. One advantage of deep learning approaches is their low generalization error, which yields more accurate results. CNN, RNN, Autoencoder, DNN, and RBN are among the deep learning techniques in use. Summaries of deep learning algorithms are available (LeCun et al. 2015; Angermueller et al. 2016; Schmidhuber 2015), and detailed information about deep learning techniques can be found in the deep learning literature (Goodfellow et al. 2016).

In drug discovery and development, many AI algorithms are applied to analyse and predict data. Here, a few popular models, SVM, RF, and MLP, are discussed together with their effective use in drug discovery.

3.1 Support vector machines

The SVM is a supervised learning algorithm mainly used to predict class labels, i.e., binary data. The input to the SVM model is a feature vector \(x \in \mathbb{R}^n\), where \(n\) is the dimension of the feature vector. The output is a class \(Y \in \{-1,1\}\), so the classification task is binary. The parameters \(u\) and \(b\) are learned from the training set, in which \((x^{(i)}, Y^{(i)})\) denotes the \(i^{th}\) sample. \(Y\) can be represented as follows:

\[Y^{(i)} = \begin{cases} -1 & \text{if } u^T x^{(i)} + b \leqslant -1 \\ 1 & \text{if } u^T x^{(i)} + b \geqslant 1 \end{cases}\]

Both cases can be written as the single condition \(Y^{(i)} (u^T x^{(i)} + b) \geqslant 1\), which the SVM algorithm aims to satisfy for every training sample.

In SVM, the separation between the two boundaries should be maximized, i.e., the distance between the two hyperplanes \(u^T x + b= -1\) and \(u^T x + b=1\). This distance is \(\frac{2}{\| u \|}\), so the problem to solve is \(\max_{u} \frac{2}{\| u \|}\), which is equivalent to \(\min_{u} \frac{\| u \|}{2}\).

All samples \(x^{(i)}\) must also be classified correctly, i.e., \(Y^{(i)} (u^T x^{(i)} + b) \geqslant 1, \forall i \in \{1,2,3,\ldots ,N\}\).

Since minimizing \(\| u \|\) is equivalent to minimizing \(\frac{\| u \|^2}{2}\), this produces the quadratic optimization problem

\[\min_{u,b} \frac{\| u \|^2}{2} \quad \text{such that} \quad Y^{(i)} (u^T x^{(i)} + b) \geqslant 1, \ \forall i \in \{1,2,3,\ldots ,N\}\]

The above problem is the hard-margin SVM, which has a solution only when the data are linearly separable. This restriction can be avoided by adding slack variables \(\epsilon ^{(i)}\) to the constraints, with each training sample having its own slack variable. Then,

\[\min_{u,b,\epsilon} \frac{\| u \|^2}{2} + C \sum_{i=1}^{N} \epsilon ^{(i)} \quad \text{such that} \quad Y^{(i)} (u^T x^{(i)} + b) \geqslant 1 - \epsilon ^{(i)}, \ \epsilon ^{(i)} \geqslant 0, \ \forall i \in \{1,2,3,\ldots ,N\}\]

This is the soft-margin SVM, where \(C\) is the penalty of the error term. A function \(\phi\) allows more flexibility by mapping the features from the original space to a high-dimensional space (Noble 2006). The quadratic optimization problem (Eq. 1) is then updated as follows:

\[\min_{u,b,\epsilon} \frac{\| u \|^2}{2} + C \sum_{i=1}^{N} \epsilon ^{(i)} \quad \text{such that} \quad Y^{(i)} (u^T \phi (x^{(i)}) + b) \geqslant 1 - \epsilon ^{(i)}, \ \epsilon ^{(i)} \geqslant 0, \ \forall i \in \{1,2,3,\ldots ,N\}\]

figure 3

Maximum-margin hyperplane and margins for an SVM trained with samples from two classes. Samples on the margin are called the support vectors
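As an illustration of the soft-margin formulation, the following minimal sketch trains a linear SVM by full-batch subgradient descent on the equivalent unconstrained objective, \(\frac{\| u \|^2}{2} + C \sum_i \max (0, 1 - Y^{(i)}(u^T x^{(i)} + b))\), in which the hinge term plays the role of the slack variable. The function names, the learning rate, and the number of epochs are assumptions made for this example; production solvers usually work on the dual problem instead.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def train_linear_svm(samples, labels, C=1.0, lr=0.01, epochs=500):
    """Minimise ||u||^2 / 2 + C * sum(max(0, 1 - Y * (u.x + b))) by subgradient descent."""
    dim = len(samples[0])
    u, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_u = list(u)  # gradient of the regulariser ||u||^2 / 2
        grad_b = 0.0
        for x, y in zip(samples, labels):
            # A sample contributes only when it violates the margin,
            # i.e. when its slack variable would be positive.
            if y * (dot(u, x) + b) < 1:
                for j in range(dim):
                    grad_u[j] -= C * y * x[j]
                grad_b -= C * y
        u = [uj - lr * gj for uj, gj in zip(u, grad_u)]
        b -= lr * grad_b
    return u, b


def svm_predict(u, b, x):
    """Return the predicted class in {-1, 1} for feature vector x."""
    return 1 if dot(u, x) + b >= 0 else -1
```

A larger `C` penalises margin violations more strongly and approaches the hard-margin behaviour; a smaller `C` tolerates more violations in exchange for a wider margin.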

The SVM is widely used in drug discovery with its various kernels (Smola and Schölkopf 2004). Applications include screening for radiation protection and gene interaction using SVM-RBF (Radial Basis Function) (Matsumoto et al. 2016; Guo et al. 2008), assessing target-ligand interactions using regression SVM (Li et al. 2011), identifying drug-target interactions with biased SVM (Wang et al. 2017), and predicting drug sensitivity with ensemble SVM. The linear SVM has been used to identify novel drug targets (Volkamer et al. 2012), to classify anticancer and non-anticancer molecules (Kapoorb et al. 2020), and to study kinase mutation activation (Patil et al. 2021).

The SVM approach (Huang et al. 2018) was used to quantify anti-cancer drugs based on cancer cell properties. To understand the relationship between cancer cell properties and drug resistance, 24 drugs were tested on cancer cell lines (Gupta et al. 2016). In the treatment of oral cancer, the SVM-RBF approach has been used to find therapeutic compounds from a large collection of public databases (Bundela et al. 2015). The RBF is a popular kernel function used in various learning algorithms. For two samples \(S_1\) and \(S_2\), represented as feature vectors in some input space, the RBF kernel is \(K(S_1,S_2)=\exp \left( -\frac{\|S_1-S_2\|^2}{2 \sigma ^2}\right)\), where \(\|S_1-S_2\|^2\) is the squared Euclidean distance between the two vectors and \(\sigma\) is a free parameter. The RBF is used in many hybrid variations with different parameter values.
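The RBF kernel can be evaluated directly from its definition, \(\exp (-\|S_1-S_2\|^2 / (2\sigma ^2))\). The short sketch below is only an illustration; the function name and the default value of `sigma` are assumptions of this example.

```python
import math


def rbf_kernel(s1, s2, sigma=1.0):
    """Return exp(-||s1 - s2||^2 / (2 * sigma^2)) for two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(s1, s2))  # squared Euclidean distance
    return math.exp(-sq_dist / (2.0 * sigma ** 2))
```

The kernel equals 1 for identical samples and decays towards 0 as the samples move apart; a larger \(\sigma\) makes the decay slower, so more distant samples are still treated as similar.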

Radiation therapy is widely used against cancer, but it also has side effects on normal cells and tissues (Morita et al. 2014). The SVM method has therefore been used in virtual screening for compounds with a radiation-protection function (Matsumoto et al. 2016). In that study, the SVM approach worked better than other techniques. When the target protein is known, a suitable compound for the target protein can be sought. The SVM technique is mainly used to predict the outcome of targeted drugs. SVM has been applied to global descriptors of binding sites, which take into account properties such as compactness and size. These descriptors can determine drug scores for novel targets (Volkamer et al. 2012; Li et al. 2011).

In therapeutic research, SVM helps to find active compounds at various stages of the drug development process. In general, compounds are selected for testing over a number of rounds of the design process. The main goal is to find several lead series of active compounds that can be improved in parallel for therapeutic activity (Warmuth et al. 2003). In contrast to artificial neural networks, SVM demonstrated the ability to predict drug similarity for a wide variety of compounds. With the same set of descriptors, the SVM performed better on this task, and the SVM model was also reported to predict enzyme inhibition better than conventional QSAR (Zernov et al. 2003).

At present, the SVM model is regarded as one of the best methodologies for predicting organic and compound properties. Recently, it has become popular in drug discovery applications such as property prediction and compound classification (Maltarollo et al. 2019). In designing new structures, the SVM approach was used to retrieve compounds with higher predicted activity based on ligands (Hartenfeller and Schneider 2010). To improve scoring function performance, the SVM approach was used to model the non-linear relationship between the energy terms from eHiTS and binding data, which considerably improved scoring power and screening power (Kinnings et al. 2011; Zsoldos et al. 2007). The SVM model has frequently been used in virtual screening (Leelananda and Lindert 2016; Liew et al. 2009; Melville et al. 2009), where it gave the best hit rates (the proportion of true hits among the predicted hits) while also decreasing the false-hit rates (the proportion of false hits among the predicted hits) (Ma et al. 2009). Meta-classifiers built with SVM-based methodology can combine different methods to exploit their complementarity and individual strengths (Maltarollo et al. 2019).

3.2 Random forest

Random Forest is a supervised learning algorithm. As the name suggests, it builds a forest of trees, each grown with an element of randomness. A significant advantage of the Random Forest algorithm is that it is applicable to both regression and classification problems. Overfitting commonly occurs in regression and classification tasks and degrades the outcome; the random forest algorithm counters it by combining multiple trees in the forest. Random forests are trained with the bagging technique. The training set comprises

\(X=X_1,X_2,\ldots ,X_n\) with responses \(Y=Y_1,Y_2,\ldots ,Y_n\). Random samples are then repeatedly selected, with replacement, from the training data to fit the trees of the random forest. For \(a=1,2,\ldots ,A\):

Sample, with replacement, \(n\) training examples from \(X, Y\); call these \(X_a, Y_a\).

Train a classification or regression tree \(f_a\) on the \(X_a, Y_a\) data.

After training, predictions for unseen samples \(x'\) are made by averaging the predictions of all individual trees on \(x'\):

\[\hat{f} = \frac{1}{A} \sum_{a=1}^{A} f_a(x')\]

For classification trees, the majority vote is taken instead. The random forest model produces better results because it reduces the variance of the model without increasing the bias. The uncertainty of the prediction can be estimated as the standard deviation of the predictions of all individual regression trees on \(x'\), i.e.,

\[\sigma = \sqrt{\frac{\sum_{a=1}^{A} \left( f_a(x') - \hat{f} \right) ^2}{A-1}}\]

where ‘A’, the number of trees, is a free parameter. Depending on the size and nature of the training data, a large number of trees can be used (Ho 1995). Random forests are also suitable in medicine for determining the right combination of components in a therapy, and the analysis of patient records can help to identify diseases (Polamuri 2017). In ligand-protein binding affinity prediction, random forests can improve the performance of the scoring function (Kinnings et al. 2011; Zsoldos et al. 2007). The representation of chemical structures and the choice of mathematical model are the fundamental issues in a QSAR model (Dudek et al. 2006). Once the descriptors are chosen, the best mathematical model must be established to fit the structure-activity correlation correctly. A random forest algorithm was used to improve the fit of the mathematical model (A Dobchev et al. 2014; Ning and Karypis 2011) (Fig. 4).

figure 4

Random forest: decision trees are generated from the data points, an estimate is extracted from each tree for every sample, and the best outcome is determined through voting
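The bagging procedure described in this section can be sketched in a few lines. The example below is a simplified illustration and not a full random forest: each "tree" is a depth-1 regression tree (a stump) on a single feature, and no random feature selection is performed. The function names, the default `A=25`, and the fixed seed are assumptions of this example.

```python
import random
import statistics


def fit_stump(xs, ys):
    """Fit a depth-1 regression tree: one threshold on a single feature."""
    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not left or not right:
            continue
        lm, rm = statistics.fmean(left), statistics.fmean(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:  # all feature values identical: predict the mean
        mean_y = statistics.fmean(ys)
        return lambda x: mean_y
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm


def bagged_trees(xs, ys, A=25, seed=0):
    """Train A stumps, each on a bootstrap sample (n draws with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    trees = []
    for _ in range(A):
        idx = [rng.randrange(n) for _ in range(n)]
        trees.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return trees


def forest_predict(trees, x_new):
    """Return the averaged prediction and the standard deviation across trees."""
    preds = [f(x_new) for f in trees]
    return statistics.fmean(preds), statistics.stdev(preds)
```

`statistics.stdev` divides by \(A-1\), which matches the standard deviation over the individual trees; for classification, the average would be replaced by a majority vote.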

The selection of molecular descriptors is an important step in virtual screening to identify bioactive molecules during the drug development process, because a poor choice of descriptors leads to predictions with lower accuracy. Hence, the random forest technique was used to improve prediction by automatically selecting the molecular descriptors used for training on kinase ligands, hormone receptors, enzymes, etc. (Cano et al. 2017).

In the pharmaceutical industry, a question that arises naturally during drug development is whether a prediction model trained on heterogeneous data performs as well as a comparable prediction model. In the study by Rahman et al., heterogeneous data were compiled for model training and prediction. Heterogeneity was treated as a latent-distribution problem, and the allocation to distributions was made without covariates by means of an ensemble leaf-node model. The Heterogeneity Aware Random Forest (HARF) builds on an ensemble-based random forest model and assigns specific weights to tree-based categories. HARF gave better results than the classical random forest in cases where the drug response differs between cancer types (Rahman et al. 2017).

Immune network technology is used to identify new compounds among drug molecules. Taking the properties of sulfonamides as an example, the sulfonamides were divided into groups with different predicted effects over a period of time. With a random forest approach, molecular descriptors were selected to achieve better accuracy in the simulation results for compounds designed as drugs (Samigulina and Zarina 2017).

3.3 Multilayer perceptron

The multilayer perceptron (MLP) is a class of feed-forward neural network. An MLP produces a set of outputs from a set of inputs, and it is trained with the backpropagation approach. The model resembles a directed graph, because the nodes of successive layers, from the input nodes to the output nodes, are connected by weights (Pal et al. 1992). After each piece of data is processed, the perceptron adjusts the connection weights in the network according to the error of the actual output compared with the expected output. The degree of error in output node \(j\) for the \(n^{th}\) data point is \(e_j(n) = a_j(n)- Y_j(n)\), where \(a\) is the target value and \(Y\) is the value produced by the perceptron. The weights of each node are then adjusted based on corrections that minimize the error in the entire output, i.e.,

\[E(n) = \frac{1}{2} \sum_{j} e_j^2(n)\]

With the gradient descent approach, the change in each weight is

\[\Delta w_{ji}(n) = -\eta \frac{\partial E(n)}{\partial V_j(n)} Y_i(n)\]

where \(\eta\) is the learning rate, chosen so that the weights converge to a response without oscillations, and \(Y_i\) is the output of the previous neuron.

The derivative to be calculated depends on the induced local field \(V_j\). For an output node, the derivative simplifies to

\[-\frac{\partial E(n)}{\partial V_j(n)} = e_j(n) \phi '(V_j(n))\]

Here \(\phi '\) is the derivative of the activation function, which itself does not vary. The analysis is more difficult for the change of weights to a hidden node, where the relevant derivative is

\[-\frac{\partial E(n)}{\partial V_j(n)} = \phi '(V_j(n)) \sum_{k} -\frac{\partial E(n)}{\partial V_k(n)} w_{kj}(n)\]

where the \(k\)th nodes represent the output layer. To change the weights of a hidden layer, the output-layer derivatives are therefore propagated backwards through the derivative of the activation function. Figure 5 shows the architecture: each neuron performs a specific computation to detect features in the input data. The network learns optimal weights, and the input features are then multiplied by these weights to decide whether a specific neuron fires or not. In this way, the multilayer perceptron uses the backpropagation strategy with the activation function (Rosenblatt 1961). In this review, a multilayer perceptron was used to predict the activity of drugs. One advantage of this model is that it does not require any structural information on the compounds, because it uses experimental data for prediction (Stokes et al. 2020). Additionally, MLP was used for de novo drug design, where the model can automatically generate different compounds with advanced properties (Gómez-Bombarelli et al. 2018).

figure 5

Multilayer Perceptron Architecture
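The weight update described in this section can be illustrated with a small network that has one hidden layer. The sketch below assumes a sigmoid activation, for which \(\phi '(V) = Y(1-Y)\), and folds the bias into each weight row as an extra constant input of 1. The function names, the layer sizes, and the learning rate are assumptions of this example.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def init_weights(n_in, n_hidden, n_out, seed=0):
    """Small random weights; the last entry of each row is the bias weight."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)] for _ in range(n_out)]
    return w_hidden, w_out


def forward(x, w_hidden, w_out):
    """Return the hidden-layer outputs and the network outputs for input x."""
    xb = list(x) + [1.0]
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, xb))) for row in w_hidden]
    hb = hidden + [1.0]
    out = [sigmoid(sum(w * hi for w, hi in zip(row, hb))) for row in w_out]
    return hidden, out


def backprop_step(x, target, w_hidden, w_out, eta=0.5):
    """Apply one gradient-descent update in place; return E(n) before the update."""
    hidden, out = forward(x, w_hidden, w_out)
    xb, hb = list(x) + [1.0], hidden + [1.0]
    e = [a - y for a, y in zip(target, out)]  # e_j = a_j - Y_j
    # Output nodes: -dE/dV_j = e_j * phi'(V_j)
    delta_out = [ej * y * (1.0 - y) for ej, y in zip(e, out)]
    # Hidden nodes: -dE/dV_j = phi'(V_j) * sum_k delta_k * w_kj
    delta_hid = [
        h * (1.0 - h) * sum(delta_out[k] * w_out[k][j] for k in range(len(w_out)))
        for j, h in enumerate(hidden)
    ]
    for k, row in enumerate(w_out):  # delta w_kj = eta * delta_k * Y_j
        for j in range(len(row)):
            row[j] += eta * delta_out[k] * hb[j]
    for j, row in enumerate(w_hidden):  # delta w_ji = eta * delta_j * x_i
        for i in range(len(row)):
            row[i] += eta * delta_hid[j] * xb[i]
    return 0.5 * sum(ej ** 2 for ej in e)
```

Calling `backprop_step` repeatedly on the same sample should drive the returned error down, which is a quick sanity check that the signs of the updates are right.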

In general, an MLP is easy and quick to use, but training it properly is difficult, and an MLP offers no guarantee of reaching the global minimum (Gertrudes et al. 2012).

The secondary structure of proteins is of great value in determining protein function and in drug design. However, determining the secondary structure experimentally is difficult and expensive. In that study, the MLP approach achieved high classification success, and the results on the trained data were reported as a clear improvement in classification (Yavuz et al. 2018).

3.4 Deep learning

Deep learning is a part of machine learning that extracts higher-level features from input data by using multiple layers (Deng and Dong 2014). It is a vast field that currently attracts enormous interest. Recently, deep learning techniques have been used in many research fields and have achieved commercial success. But what exactly is deep learning? In general, deep learning refers to neural network architectures that consist of several layers, with the data being transformed as it passes between these layers. The term is still a popular buzzword, but the technology behind it is real and sophisticated. Deep learning models can be built through a strategy called greedy layer-by-layer training (Bengio et al. 2007). Figure 6 contrasts the deep learning architectures, including pooling layers, which can identify the critical issues and devise an appropriate solution even for complex problems. In this review, deep learning models such as DNN, CNN, RNN, and Autoencoder are presented in drug discovery areas. The pooling layer is another building block of neural networks. Its function is to reduce the spatial size of the representation, which reduces the number of parameters and the amount of computation in the network, and it operates independently on each feature map (channel). Max-pooling layers work well in various networks because they allow the network to recognize features effectively after down-sampling an input representation, and they reduce over-fitting.
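The effect of a max-pooling layer on a single feature map can be shown with a short sketch. The function name and the default 2 × 2 window with stride 2 are assumptions of this example; in a real network the same operation is applied independently to every channel.

```python
def max_pool2d(feature_map, size=2, stride=2):
    """Down-sample one feature map by keeping the maximum of each window."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, stride):
        pooled.append([
            max(feature_map[r + dr][c + dc] for dr in range(size) for dc in range(size))
            for c in range(0, cols - size + 1, stride)
        ])
    return pooled
```

A 4 × 4 feature map becomes a 2 × 2 map, so the following layer sees a quarter of the values while the strongest activation in each region is preserved.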

The DNN architecture evolved as an extension of the artificial neural network (ANN) and contains multiple layers between the input and output nodes (Bengio 2009). A DNN finds the mathematical transformation that turns the input into the output, whether the relationship is linear or non-linear. Each mathematical transformation is regarded as a layer, and a complex DNN has many layers, which is why the network is called ‘deep’. Deep learning models were introduced in QSAR modeling to extract features of chemical structures automatically. Inspired by the results of a Kaggle competition, Dahl et al. extended the investigation to multi-task DNNs. The multi-task DNNs demonstrated strong performance by learning general features through shared parameters (Dahl et al. 2014).

figure 6

Deep Learning Architectures

Intestinal permeability is a desirable property of candidate drugs for oral delivery. Computational techniques offer a rapid and reasonable way to assess the intestinal permeability of molecules. Several studies have focused on predicting the intestinal uptake of chemical compounds, including from peptide sequence data. ML techniques such as artificial neural networks have been adopted to predict the intestinal permeability of peptides. The positive control data consisted of intestinally permeable peptides obtained through the peroral phage technique, and the negative control data were prepared from random sequences. Multiple statistical indicators such as specificity, sensitivity, ROC score, and enrichment curves were used to validate the predictions. The statistical results indicated that the models are of good quality and can discriminate between random and permeable sequences with a high level of confidence. Finally, the ANN models predicted better than random selection. Such models can therefore be applied to the selection of intestinally permeable peptides for generating peptidomimetics (Jung et al. 2007).

Multi-task neural networks have been integrated into a platform called ‘DeepChem’, which supports the use of multi-task neural networks in the drug development process (Ramsundar et al. 2017). The assessment showed that the performance of multi-task deep networks was robust. Overall, deep learning algorithms improved the prediction performance of QSAR models. DNNs have also played a significant role in further research on hit-to-lead optimization.

CNN is a subclass of DNN that is ordinarily used for analyzing visual images (Valueva et al. 2020). CNNs are also called shift-invariant ANNs because of their shared-weight architecture. A CNN is a regularized version of a multilayer perceptron. Multilayer perceptrons are usually fully connected networks, in which each neuron in one layer is connected to all neurons in the following layer. This full connectivity makes such networks prone to overfitting, which the regularization in CNNs counteracts. CNNs were inspired by biological processes, in that the connectivity pattern between neurons resembles the organization of the visual cortex (Venkatesan and Li 2017). Many researchers have used CNN models to predict protein-ligand binding affinity (LeCun et al. 2015; Leelananda and Lindert 2016). The affinity prediction showed the best correlation on the dataset (Jiménez et al. 2018). In protein-ligand interaction, the CNN algorithm predicts binding affinities that can further improve scoring functions, although its predictive capabilities still need to be improved.
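The weight sharing that distinguishes a CNN from a fully connected network can be illustrated with a plain 2D convolution (implemented, as in most deep learning libraries, as a cross-correlation without padding). The function name and the toy image are assumptions of this example.

```python
def conv2d_valid(image, kernel):
    """Slide one shared kernel over the image and return the response map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# A 1 x 2 kernel that responds to a dark-to-bright vertical edge.
edge_kernel = [[-1, 1]]
image = [[0, 0, 1, 1]] * 3
shifted = [[0, 1, 1, 1]] * 3  # the same edge, one column to the left
```

Because the same two weights are applied at every position, shifting the edge in the input shifts the response in the output by the same amount, and the layer needs only two parameters regardless of the image size.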

An RNN is a class of artificial neural network in which the connections between nodes form a directed graph along a temporal sequence. The RNN uses its internal state (memory) to process sequences of inputs (Dupond 2019), which allows it to exhibit temporal dynamic behavior (Miljanovic 2012). The term RNN covers two broad classes of networks with a similar general structure, one with finite impulse and the other with infinite impulse.
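The role of the internal memory can be made concrete with a minimal recurrent cell. The sketch below assumes an Elman-style update, \(h_t = \tanh (W_x x_t + W_h h_{t-1} + b)\), which is one common choice and not necessarily the variant used in the cited works; the function name and weight layout are assumptions of this example.

```python
import math


def rnn_forward(inputs, w_x, w_h, b):
    """Run a simple recurrent cell over a sequence and return every hidden state."""
    hidden_size = len(b)
    h = [0.0] * hidden_size  # the memory starts empty
    states = []
    for x in inputs:
        # The new state depends on the current input and on the previous state.
        h = [
            math.tanh(
                sum(w_x[j][i] * x[i] for i in range(len(x)))
                + sum(w_h[j][k] * h[k] for k in range(hidden_size))
                + b[j]
            )
            for j in range(hidden_size)
        ]
        states.append(h)
    return states
```

Two sequences that end with the same input but differ earlier produce different final states, which is exactly the memory effect that a feed-forward network lacks.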

Determining the secondary and tertiary structure of a protein plays a vital role in understanding its function. Numerous folding-prediction algorithms have been developed to derive protein structures from the encoded protein sequence. Visibelli et al. searched for \(\alpha\)-helix signals in a large dataset, locating specific occurrences of amino acids that characterize the secondary structure in order to determine the boundaries of helical moieties. The \(\alpha\)-helix occurrences were predicted by various ML models equipped with an attention mechanism to validate the hypothesis. This mechanism makes it possible to interpret the weight of each input in the model’s decision. Finally, the experimental outcomes showed that similar subsequences in the input encode secondary structure information (Visibelli et al. 2020).

Developing affordable and effective treatments for humans without prior knowledge of drug-target information is becoming increasingly challenging. deepDTnet is a deep learning technique that embeds 15 types of chemical, genomic, phenotypic, and cellular profiles to accelerate drug repurposing and target identification. deepDTnet showed high accuracy in identifying novel targets for known drugs approved by the U.S. Food and Drug Administration. In experimental validation, topotecan, an approved inhibitor, was found to act directly on human retinoic-acid receptors, which helps to narrow the translational gap in drug development (Zeng et al. 2020).

In virtual screening, RNNs have been used to generate new molecular libraries, which supports the search for anticancer agents through molecular fingerprints (Kadurin et al. 2017). In de novo drug design, the biological activity of the generated molecules must be predicted, and the RNN algorithm has been used to generate molecules in this way (Olivecrona et al. 2017). The RNN is trained on molecules gathered from the ChEMBL dataset, and new molecules are then sampled from the learned conditional probabilities. Various classifiers were applied to the sampled data; the RNN with reinforcement learning reached 95% accuracy on the scoring function (Mnih et al. 2015).

‘Deep Interact’ is an integrative domain-based approach that predicts PPIs with a deep neural network. Its collection of PPIs was extracted from the Kansas University Proteomics Service (KUPS) and the Database of Interacting Proteins (DIP). It is essential to discover and analyze the specificity of interactions between cellular components and of specific molecular protein complexes. The main goal is to complement large-scale high-throughput experiments with in silico approaches to improve the coverage of PPIs. On a Saccharomyces cerevisiae dataset, 34,100 PPIs were validated with promising results: a sensitivity of 86.85%, a precision of 98.31%, a specificity of 98.51%, and an accuracy of 92.67%. The Deep Interact approach was concluded to perform better than existing ML approaches in PPI prediction (Patel et al. 2017).
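The performance measures quoted above all derive from the four cells of a confusion matrix. The sketch below shows the arithmetic; the counts are hypothetical, chosen for a balanced set of 10,000 interacting and 10,000 non-interacting pairs so that they reproduce the reported sensitivity and specificity, and they are not the counts of the cited study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),            # true positive rate (recall)
        "specificity": tn / (tn + fp),            # true negative rate
        "precision": tp / (tp + fp),              # fraction of predicted positives that are real
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts (assumption of this example, not data from Patel et al. 2017).
metrics = classification_metrics(tp=8685, fp=149, tn=9851, fn=1315)
```

With these counts the precision comes out at about 98.3% and the accuracy at about 92.7%, which shows how a sensitivity of 86.85% and a specificity of 98.51% fit together with the other two figures on a balanced dataset.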

An autoencoder is a class of artificial neural network that learns encodings of data through unsupervised learning (Kramer 1991). Its objective is to learn a representation (encoding) of the data, typically for dimensionality reduction, by training the network to ignore signal ‘noise’. In doing so, the autoencoder learns to copy its input to the output layer. An autoencoder has two parts, the encoder and the decoder, and a hidden layer between them that is regarded as the code. The encoder maps the input data to the hidden layer, and the decoder reconstructs the output signal from it. Autoencoders are well suited to dimensionality reduction and to learning generative models of data (Kingma and Welling 2013; Larsen et al. 2015).

The encoder \(\phi\) and the decoder \(\psi\) are defined such that \(\phi :Y\rightarrow E\) and \(\psi : E\rightarrow Y\).

In the simplest case, with one hidden layer, the encoder takes the input \(y \in \mathbb{R}^d=Y\) and maps it to \(h \in \mathbb{R}^p=E\): \(h= \sigma (Wy+b)\). Here, \(h\) is the code, \(W\) is a weight matrix, \(b\) is a bias vector, and \(\sigma\) is the activation function. The weights and biases are usually initialized randomly and updated through the backpropagation technique. The decoder then maps \(h\) to the reconstruction \(y'\), which has the same shape as \(y\): \(y'= \sigma ' (W'h+b')\)

The decoder coefficients \(\sigma ', W', b'\) may differ from the encoder coefficients \(\sigma , W, b\). Autoencoders are trained mainly to decrease the reconstruction error (loss).
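The encoder and decoder mappings can be written out directly from the two formulas above. The sketch below assumes a sigmoid for both \(\sigma\) and \(\sigma '\) and a squared-error reconstruction loss; these choices, together with the function names, are assumptions of this example, and the training loop that would update the weights by backpropagation is omitted.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(W, v, b):
    """Compute W v + b for a weight matrix W (list of rows) and bias vector b."""
    return [sum(wij * vj for wij, vj in zip(row, v)) + bi for row, bi in zip(W, b)]


def encode(y, W, b):
    """h = sigma(W y + b): map the d-dimensional input to the p-dimensional code."""
    return [sigmoid(z) for z in affine(W, y, b)]


def decode(h, W_prime, b_prime):
    """y' = sigma'(W' h + b'): map the code back to the input space."""
    return [sigmoid(z) for z in affine(W_prime, h, b_prime)]


def reconstruction_loss(y, y_prime):
    """Squared error between the input and its reconstruction."""
    return sum((a - c) ** 2 for a, c in zip(y, y_prime))
```

For an input of dimension \(d = 4\) and a code of dimension \(p = 2\), `W` has shape 2 × 4 and `W_prime` has shape 4 × 2, so the code is a compressed representation that the decoder must expand again.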

Here, the feature space \(E\) has lower dimensionality than the input space \(Y\), so \(\phi (y)\) is a compressed representation of the input \(y\). If the hidden layer is larger than or equal to the input layer, the autoencoder has enough capacity to learn the identity function and becomes useless. Nevertheless, experimental results show that autoencoders can still learn useful features from the training set in such cases (Kingma and Welling 2019). In drug discovery, autoencoders have been used as a distinct architecture to generate molecules that were then tested experimentally in mice (Zhavoronkov et al. 2019). In de novo drug design, the autoencoder has been used as a deep learning model for generating molecules; it was combined with classifiers such as the multilayer perceptron to generate new compounds automatically with appropriate properties (Gómez-Bombarelli et al. 2018). Generative models often produce invalid SMILES strings; to overcome this issue, the grammar variational autoencoder was used to generate valid SMILES syntax more effectively (Pu et al. 2017) (Fig. 7).

figure 7

Basic flowchart of an AutoEncoder with an example NCE

4 Drug design applications

The review of drug discovery is further organized by the tasks that ML performs, and applications such as target identification, hit discovery, hit-to-lead, and lead optimization are discussed. Drug design techniques rely on databases, which are in turn built with different ML algorithms. Careful training, validation, and application of ML algorithms in drug discovery give encouraging outcomes by simplifying complicated, error-prone protocols. ML techniques have been introduced in most drug design processes to reduce time as well as manual intervention. The best example is QSAR, in which the collection of large amounts of data and the training of datasets are considered rate-limiting steps in defining ligand-based virtual screening protocols, which are now being replaced by de novo design techniques. The relationship between drug discovery steps and algorithms is presented in Fig. 8.

4.1 Homology modeling/prediction of protein folding

The folding of secondary structures such as \(\beta\)-sheets and \(\alpha\)-helices, which are formed by the interaction of amino acid side chains, is critical for the proper functioning of three-dimensional proteins. The accurate fold of a protein, along with its active ligand-binding site, can be obtained experimentally by X-ray crystallography, NMR spectroscopy, and cryogenic electron microscopy (Cryo-EM).

figure 8

Primary, secondary, tertiary and quaternary structures of the protein highlighted with active site residues. The AmpC beta-lactamase (PDB:6DPZ) as case example is taken and depicted in the above figures

Information about the primary amino acid sequences of proteins/enzymes/receptors, both soluble and insoluble, is stored on the UNIPROT server along with their targets and cellular functions. The main role of a protein is identified from medicinal chemistry, pharmacological, or biochemical studies, and this information is the basis for protein folding prediction by software or experimental studies. The folding predicted for a given amino acid (UNIPROT) sequence is compared with experimentally derived PDB homologues; this has become a promising technique for refining new protein models computationally and is termed “homology modeling”. Homology modeling, or comparative modeling, relies on several algorithms implemented in software modules (PRIME) or web servers (EXPASY, SWISS-MODEL) that predict the secondary structure folding with high accuracy within the provided templates. The resulting homology (template-based) models are then scrutinized by Ramachandran analysis, which can be carried out with commercial modules (PRIME) or web servers (QMEAN, PROCHECK). For illustration, the homology model of the CHIKV nsP2 protease is described here (Fig. 9); it was obtained in silico from the experimentally determined VEEV nsP2 protease template. The in silico tool searches computational databases for homology templates and returns the closest match for practical bioinformatics and medicinal chemistry applications. Figure 8 depicts the arrangement of secondary structures such as \(\alpha\)-helices, \(\beta\)-pleated sheets, and loops in tertiary complexes. The surface view is also useful for recognizing the hotspots on the protein that bind incoming ligands/substrates.
The sequence alignment mode also shows the mutations or differences between primary sequences; it can be used in different chemo-informatics approaches to identify mutations in similar kinds of viruses or other pathogens. Chemo-informatics plays a crucial role as an emerging tool in the current SARS-CoV-2 pandemic for the identification of new drug-like molecules (Fig. 10).
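Once two sequences have been aligned, the differences between them and their percentage identity are simple to compute. The sketch below assumes that the alignment has already been produced by an external tool and that gaps are written as `-`; the function name and the short sequences in the usage note are hypothetical and are not fragments of the nsP2 proteases discussed here.

```python
def compare_aligned(seq_a, seq_b, gap="-"):
    """Return percent identity and the list of substitutions between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    differences = []
    matches = compared = 0
    for pos, (a, b) in enumerate(zip(seq_a, seq_b), start=1):
        if a == gap or b == gap:
            continue  # gapped columns are ignored in this simple identity measure
        compared += 1
        if a == b:
            matches += 1
        else:
            differences.append((pos, a, b))
    identity = 100.0 * matches / compared if compared else 0.0
    return identity, differences
```

For example, `compare_aligned("ACDEFGHIK", "ACDQFGH-K")` reports one substitution (E to Q at alignment position 4) and an identity of 87.5% over the eight ungapped columns. Alignment servers may define identity differently, for instance relative to the full alignment length.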

In addition, selecting the best homology model obtained from the above process is another major task, which can be performed with the SVMQA (Support-vector-machine Protein single-model Quality Assessment), ProQ3, or ERRAT servers, which are operated by deep learning methods. After these steps, the best 3D protein template can be used in any basic medicinal chemistry study to identify hits as part of a structure-based virtual screening protocol.

figure 9

a Overlap of 3TRK with 2HWK; b surface view of 2HWK; c , d off-surface/ribbon diagram of finest 3TRK model; and e homology validation parameter obtained from SWISS-MODEL

figure 10

The overlap of active site residue of the CHIKV (homology model) (red sticks) and VEEV nsP2 protease (green sticks)

To illustrate homology modeling, the Q5XXP4 FASTA sequence, which belongs to the CHIKV nsP2 protease domain, was modeled with the SWISS-MODEL web server using its closest solved homologue, the VEEV nsP2 protease (PDB: 2HWK), as the reference model; the results are presented in Fig. 11. The positions of the active site residues in the best model were then analyzed and found to be similar to those of the VEEV nsP2 protease, as shown in Fig. 10. SWISS-MODEL also reports the percentage similarity along with the structure alignment. Fig. 10 shows the overlap of the similar active site residues of the catalytic site (catalytic dyad Cys and His). It also shows the conformational changes in the new model, which are an essential parameter for drug interaction studies.

4.2 Target identification

Target identification for NCEs is a demanding task owing to the lack of knowledge about their off-targets such as enzymes, ion channels, proteins, or receptors. Binding site recognition for NCEs is another key task in computational/bioinformatics experiments when the protein has more than one active site. In such cases, popular web servers (FTMap) and dedicated modules such as “Sitemap”, which are built on algorithms, can define the preferred binding site and speed up the drug discovery process. Other online programs such as GHECOM, POCASA, Pocketome, SURFNET, ConCavity, LIGSITE, Q-SiteFinder, Fpocket, and PASS predict feasible binding sites within the provided protein templates, and the metaPocket 2.0 program combines the above platforms to identify the most reliable ligand binding sites on templates. Further, AI models such as FD/DCA can predict druggable sites in biological macromolecules. Recently, DeepDTnet has been tested as a new target identifier in drug repurposing. The DeepDTnet strategy integrates multi-disease cellular targets, pathogenic genes (genomics), and the drugs (chemical space) used for their treatment.

4.2.1 Prediction of protein folding

Many diseases can be traced to protein dysfunction, so active molecules against the affected proteins can be identified through structure-based drug design. Determining a 3D structure costs time and money, which makes it important to understand the algorithms used to predict 3D protein structures. The sheer volume of protein sequence data also makes accurate de novo 3D structure prediction difficult. Deep learning approaches, with their feature extraction capabilities, have been applied to predict backbone torsion angles (Li et al. 2017), secondary structure (Spencer et al. 2014), and protein residue contacts (Wang et al. 2017). The final goal is to predict the full 3D protein structure, and deep learning techniques have advanced the field toward that goal.

4.2.2 Prediction of protein–protein structure

PPIs are essential for biological processes and infections (Falchi et al. 2014; Scott et al. 2016). A PPI network can be described as a mathematical representation of the physical contacts between proteins in the cell. The contacts formed between binding regions of proteins have specific biological importance, and they are collected from experimental and bioinformatics sources in PPI databases (Li and Lai 2007; Szklarczyk et al. 2015). The PPI interface is the collection of residues involved in the contact (Cukuroglu et al. 2014). PPIs have become a new class of drug targets, distinct from mainstream pharmaceutical targets such as ion channels and G-protein-coupled receptors (Higueruelo et al. 2013; Santos et al. 2017). This new class extends the target space for small-molecule drugs (Shin et al. 2017). Compared with traditional drug targets, targeting PPIs reduces harmful effects because the regulatory effect is more biologically selective (Valkov et al. 2011). Working with such targets requires an understanding of the PPI interface in the protein–protein complex structure. Because PPI data are scarce, many computational techniques have been developed to predict PPI interfaces (Xue et al. 2015). Template-based techniques make PPI interface prediction straightforward (Zhang et al. 2010). For example, the web server “eFindSite” (Maheshwari and Brylinski 2016) predicts PPI interfaces from template, residue, and sequence-related features using SVM and NBC techniques. If the structures of the two interacting proteins are available, the PPI interface can be predicted more easily (Vakser 2014), mainly by applying the complementarity rules of protein–protein docking (Chen et al. 2003) and SymmDock strategies (Schneidman-Duhovny et al. 2005).
When two unbound proteins bind to form one complex, the conformational change is difficult to predict. For sequence-based prediction, deep learning models have been used to predict PPIs and have improved on machine learning models such as SVM (Du et al. 2016). Druggable sites must be sought within the buried interface area (in the range of 1500–3000 Å²) (Scott et al. 2016). Druggable sites are regarded as hot spots because they contribute a large share of the binding free energy, which makes them attractive to medicinal chemists (Cukuroglu et al. 2014).

figure 11

Illustrating drug discovery design techniques and topics with AI models

Bai et al. (2016) used two techniques, fragment docking and direct coupling analysis, to detect druggable PPI sites. The fragment docking tool, “iFitDock”, was used to probe druggable hot spots in the PPI interface. Neighboring small hot spots are then merged into candidate binding sites. Finally, a scoring function based on the level of evolutionary conservation identifies the best protein–protein binding sites. The objective was to improve computational methods for locating the best hot spots and the key structural features that small-molecule modulators can target in the PPI interface.

4.3 Prediction of protein–protein interactions

Protein–protein interactions (PPIs) are one of the major biological phenomena through which the cell, the basic unit of the body, transports signals, ions, substrates, and the components of energy production needed for the body's pharmacological responses. PPIs also play a critical role in the pathogenesis of diseases such as cancer, especially colorectal carcinoma. The development of colorectal carcinoma in humans depends on both lifestyle and hereditary factors. Its pathogenesis is linked to malignant Adenomatous Polyposis Coli (APC), and its spread through the colorectal region is driven mainly by the interaction of the APC protein with Asef (a guanine nucleotide exchange factor) and of \(\beta\)-catenin with the TCF4 peptide in the carcinoma cells. The APC-Asef and \(\beta\)-catenin-TCF4 PPIs are illustrated in Fig. 12.

figure 12

a, b Protein–protein interactions of APC-Asef (yellow surface/cartoon: APC; cyan surface/cartoon: Asef); and c, d PPI of \(\beta\)-catenin/TCF4 in surface and cartoon forms (yellow surface/cartoon: \(\beta\)-catenin; cyan surface/cartoon: TCF4)

In recent years, PPI-based drug discovery programs have produced promising pharmacological agents. In cancer pathogenesis, APC-Asef PPI inhibitors are the best example: they delivered peptides that served as starting points for medicinal chemistry drug design projects. Understanding host–pathogen protein interactions through PPIs is another demanding task, and one that drives most vaccination programs. The fight against SARS-CoV-2 infection is a key current example: the scientific community targets the SARS-CoV-2 spike protein, which preferentially binds human angiotensin-converting enzyme 2 (hACE2) to enter the alveoli of the lungs and cause severe respiratory obstruction. However, the time and cost of determining PPIs experimentally are rate-limiting barriers. To address this, several databases host web servers (a few of them publicly available) dedicated to PPIs, and these serve as preliminary PPI identification tools that accelerate medicinal chemistry research.

4.4 Hit discovery

Hit discovery is a key early step toward success in drug discovery. In this step, small molecules that bind the target and alter its function are identified as hits. Detecting hits with a range of algorithms is now a robust technique in the drug discovery paradigm. In one such method, applying multivariate classifiers (K-nearest neighbors (K-NN) and support vector machine (SVM)) to high-content screening (HCS) analysis produced a variety of hits against neurological complications.
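To make the K-NN idea concrete, the following is a minimal sketch of a majority-vote nearest-neighbor classifier applied to screening readouts. The two-feature readouts, the labels, and the function name `knn_predict` are hypothetical and chosen only for illustration; they are not taken from the study mentioned above.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    # Sort training points by Euclidean distance to the query.
    neighbors = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]


# Made-up, normalized two-feature readouts for five screened compounds.
train_X = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9), (0.85, 0.75)]
train_y = ["inactive", "inactive", "hit", "hit", "hit"]

label_near_hits = knn_predict(train_X, train_y, (0.8, 0.8), k=3)
label_near_inactives = knn_predict(train_X, train_y, (0.15, 0.15), k=3)
```

In practice, HCS data have many more features per compound, and the choice of `k` and of the distance metric would need to be validated on held-out data.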

4.4.1 Drug repurposing

deepDTnet outperforms other existing target identification techniques and relies on a small set of FDA-approved drugs (732 drugs). It identified a target with a beneficial therapeutic effect, human retinoic-acid-receptor-related orphan receptor gamma t (ROR-\(\gamma\)t), for the existing topoisomerase inhibitor topotecan (TPT). The deepDTnet strategy also repositions several FDA-approved drugs with different chemical scaffolds against GPCRs with new targeted pharmacological actions (see Figs. 13, 14, 15). The deepDTnet algorithm is considered much more advantageous than the NetLapRLS and KBMF2K methods as well as the Naive Bayes, SVM, KNN, and Random Forest algorithms.

figure 13

The FDA approved drugs under drug-target repurposing applications derived by deepDTnet

figure 14

The FDA approved drugs under drug-target repurposing applications derived by deepDTnet (contd.)

“Repurpose” means to reprocess, reuse, or recycle. Drug repurposing is defined as finding new indications for drugs that already exist (Ashburn and Thor 2004; Lotfi Shahreza et al. 2018). It reduces the time and risk involved in drug discovery (Ashburn and Thor 2004). A major reason for using drug repurposing is that a single drug often has multiple targets (Susan et al. 2017), each corresponding to different effects, which gives a high diversity of drug–disease relationships. For example, metformin, an approved medicine for type 2 diabetes, has been reported to extend life expectancy. The essential elements in repurposing are drugs and diseases (Cabreiro et al. 2013; De Haes et al. 2014; Martin-Montalvo et al. 2013); drug targets and disease genes are further elements.

The interactions among these elements (Lotfi Shahreza et al. 2018) can be studied through network analysis based on the different interaction types. Nine kinds of networks are used in drug design: gene regulatory networks, target-disease networks, drug-adverse networks, metabolic networks, protein-protein networks, drug-drug networks, drug-disease networks, disease-disease networks, and drug-target networks (Lotfi Shahreza et al. 2018). In general, the principle of network models is that similar drugs have similar targets and effects (Yamanishi et al. 2008). When data are scarce or fragmented, repurposing integrates multiple networks into a single heterogeneous network. Drug repurposing is then combined with drug-target prediction to propose drug targets (Wang et al. 2014), which in turn help to treat diseases. New targets and indications can be generated with a network diffusion algorithm and a dimensionality reduction approach (Luo et al. 2017).
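The principle that similar drugs tend to share targets can be illustrated with a simple similarity-weighted vote. This is only a sketch of the general idea, not the method of any cited paper: the function name `predict_targets`, the drug and target identifiers, and the similarity values are all made up.

```python
def predict_targets(drug, similarity, known_targets):
    """Score candidate targets for `drug` from the targets of similar drugs.

    similarity: dict mapping drug -> {other_drug: similarity in [0, 1]}.
    known_targets: dict mapping drug -> set of known target identifiers.
    Returns a dict of target -> similarity-weighted score in [0, 1].
    """
    scores = {}
    total_similarity = 0.0
    for other, sim in similarity[drug].items():
        total_similarity += sim
        for target in known_targets.get(other, ()):
            scores[target] = scores.get(target, 0.0) + sim
    if total_similarity == 0:
        return {}
    return {target: s / total_similarity for target, s in scores.items()}


# Hypothetical data: drug_x has no known targets but resembles drug_a.
similarity = {"drug_x": {"drug_a": 0.8, "drug_b": 0.2}}
known_targets = {"drug_a": {"T1"}, "drug_b": {"T1", "T2"}}

scores = predict_targets("drug_x", similarity, known_targets)
```

Here `T1` receives the highest score because both similar drugs bind it, while `T2` is supported only by the less similar `drug_b`. Network diffusion methods generalize this one-step vote to longer paths through a heterogeneous network.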

4.4.2 Virtual screening

Virtual screening is an AI strategy used in the drug discovery process to find small molecules that are likely to bind a drug target. In drug development, virtual screening uses software and algorithms to pick out hits from proprietary chemical collections in an efficient way. After new hits are identified, compounds with unfavorable scaffolds are filtered out (Lavecchia and Di Giovanni 2013). Virtual screening includes strategies such as docking-based methods, similarity searching (Willett 2006), pharmacophore-based methods (Willett 2006), and machine learning methods (Leelananda and Lindert 2016). These techniques fall into two classes: structure-based and ligand-based virtual screening.
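Similarity searching, one of the ligand-based strategies listed above, is often based on the Tanimoto coefficient between molecular fingerprints. The sketch below assumes fingerprints are represented as sets of "on" bit indices; the compound names, the bit sets, and the 0.5 threshold are invented for illustration.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    fp_a, fp_b = set(fp_a), set(fp_b)
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def similarity_search(query_fp, library, threshold=0.5):
    """Return (name, similarity) pairs at or above `threshold`, best first."""
    scored = [(name, tanimoto(query_fp, fp)) for name, fp in library.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Hypothetical fingerprints (sets of on-bit indices).
library = {
    "cmpd_a": {1, 2, 3, 5},
    "cmpd_b": {7, 8},
    "cmpd_c": {1, 2, 3, 4},
}
hits = similarity_search({1, 2, 3, 4}, library, threshold=0.5)
```

With this toy library, `cmpd_c` is an exact match, `cmpd_a` shares three of five bits, and `cmpd_b` is filtered out.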

When the 3D protein structure is available, molecular docking can be widely used (Chen 2015). Many docking-based virtual screening applications have been built successfully (Talele et al. 2010). The approach still has obstacles, notably the scoring function. A scoring function cannot estimate binding affinities accurately because some contributions, including entropy effects, are treated inadequately (Huang and Zou 2010), and accounting for protein flexibility makes the problem more complicated (Chen 2015). Finally, many docking models consider only binding affinity, expressed as a docking score, and neglect other factors such as residence time (Copeland 2010; Xing et al. 2017). In contrast to docking-based virtual screening, ligand-based virtual screening does not depend on the 3D protein structure. Its goal is to predict bioactivity from molecular features (Lavecchia and Di Giovanni 2013).

The aim of virtual screening is to keep improving hit yields while decreasing false hit rates (Leelananda and Lindert 2016; Liew et al. 2009; Melville et al. 2009). To this end, SVM has frequently been used in virtual screening (Ma et al. 2009). DL strategies have been applied because of their strong classification capacity, low generalization error (LeCun et al. 2015; Thomas et al. 2014), and powerful feature extraction ability. For example, in virtual screening the sparse distribution of active molecules wastes a great deal of time in the search process (Ma et al. 2009; Segler et al. 2018). To overcome this issue, focused molecule libraries can be generated from a set of training molecules (Thomas et al. 2014) by treating Simplified Molecular Input Line Entry System (SMILES) strings as a natural language and modeling them with a long short-term memory network architecture. ML techniques such as DNN and gradient boosting trees have been applied to the molecular libraries generated by the RNN. An adversarial autoencoder has been used to model molecular fingerprints to find potential anti-cancer agents (Kadurin et al. 2017).

4.5 High throughput virtual screening and scoring in molecular docking techniques

The routine techniques used after target identification are high-throughput virtual screening (HTVS) and molecular docking, which incorporate free energy perturbation, sampling, and scoring algorithms. Knowledge of the active site of the protein or receptor, where the ligand binds to mimic or antagonize the physiological role, is essential for starting an HTVS protocol. Ligand-based virtual screening (LBVS) is another basic method, which relies on the physicochemical properties of the compounds in chemical databases (Fig. 15).

figure 15

Basic overview of molecular docking sampling and scoring flowchart

4.5.1 Activity scoring

In virtual screening, the scoring function is a fundamental component of molecular docking that assesses binding affinity toward the target (Huang and Zou 2010). Machine learning can map extracted physical, geometric, and chemical features (Khamis et al. 2015) to scores. Such data-driven black-box models predict binding affinities from the interactions and avoid the explicit physical functional forms used in docking, which are very hard to derive (Ain et al. 2015). Random Forest and SVM have been used to improve scoring function performance. For instance, an SVM model can replace a linear additive combination of energy terms: the SVM learns the relationship between experimental binding affinities and the individual energy terms extracted from the docking program eHiTS. This gives better scoring power and screening power (Kinnings et al. 2011; Zsoldos et al. 2007).
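For orientation, the linear additive baseline that such learned models replace can be written in a few lines. This is a generic sketch, not the eHiTS scoring function: the energy term names, the weights, and the pose values are hypothetical.

```python
def linear_score(energy_terms, weights, bias=0.0):
    """Linear additive score: bias plus the weighted sum of energy terms."""
    return bias + sum(weights[name] * value for name, value in energy_terms.items())


def rank_poses(poses, weights):
    """Order pose names from best (lowest score) to worst."""
    return sorted(poses, key=lambda name: linear_score(poses[name], weights))


# Made-up weights and per-pose energy terms (lower is better).
weights = {"vdw": 0.5, "hbond": 1.0, "desolvation": 0.25}
poses = {
    "pose_a": {"vdw": -4.0, "hbond": -2.0, "desolvation": 1.5},
    "pose_b": {"vdw": -1.0, "hbond": 0.0, "desolvation": 2.0},
}
ranking = rank_poses(poses, weights)
```

A learned model such as an SVM or Random Forest takes the same energy terms as input but fits a nonlinear mapping to experimental affinities instead of fixed weights.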

Many researchers began using CNN models, which had demonstrated strong performance in image processing (LeCun et al. 2015), because protein–ligand interactions provide numerous features that a CNN can use to predict protein–ligand affinities. Jiménez et al. used a 3D representation with a CNN model to estimate binding affinities (Jiménez et al. 2018) and obtained good correlation on the data sets. The real strength of deep learning is its ability to derive abstract features from primitive ones, so the input only needs to represent basic features of the compound–protein structure, such as atom types and atom distances (LeCun et al. 2015). DeepVS, a framework based on a CNN model, learned abstract features from such basic features to complement docking programs such as Glide SP (Friesner et al. 2004) and ICM (Abagyan et al. 1994). In summary, activity scoring selects features of the protein–ligand interaction and uses a CNN model to predict binding affinities, which enriches the information in the scoring function and improves its predictive capability.

4.6 Hit to lead

Hit to lead, also referred to as lead generation, is an early phase of drug discovery. It takes the small molecules identified as hits in a high-throughput screen (HTS) and applies limited optimization to find promising lead compounds. In one practical hit-to-lead workflow, chemical synthesis was integrated with a “design layer” mapping algorithm based on Random Forest regression to create new biologically active chemical space from an existing kinase inhibitor library (Desai et al. 2013) (Fig. 16).

figure 16

Abl kinase inhibitor obtained from Hit-to-lead optimization protocol linked with ML algorithms

QSAR analysis is used in hit-to-lead optimization to find potential lead compounds among hit analogs by predicting their bioactivity (Esposito et al. 2004). It uses mathematical models to establish a quantitative mapping between physicochemical or structural properties and biological activity. QSAR analysis comprises the construction of mathematical models, the selection and development of molecular descriptors, the evaluation and interpretation of the models, and their application (Myint and Xie 2010). The mathematical model and the representation of chemical structure are the central issues in QSAR modeling. Once descriptors are chosen, a mathematical model must be found that fits the structure–activity relationship. In 1964, Hansch et al. proposed the Hansch equation, which uses physicochemical descriptors and linear regression models to describe 2D structure–activity relationships and opened a new chapter in QSAR studies (Hansch and Fujita 1964).
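For reference, the Hansch relationship is commonly written in the form below. This is the usual textbook form, given here on the assumption that it is the equation the text refers to. \(C\) is the molar concentration that produces a defined biological response, \(P\) is the octanol–water partition coefficient, \(\sigma\) is the Hammett electronic substituent constant, and \(k_1, \dots, k_4\) are coefficients fitted by regression.

\[ \log\frac{1}{C} = -k_1 (\log P)^2 + k_2 \log P + k_3 \sigma + k_4 \]

The quadratic term in \(\log P\) captures the observation that activity often rises with lipophilicity up to an optimum and then falls.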

In the same year, Free and Wilson proposed the Free-Wilson model, which relates bioactivity to chemical structure under the hypothesis that each substituent makes its own contribution to the activity of the compound (Free and Wilson 1964). In contrast to the Hansch method, the Free-Wilson method encodes the chemical structure directly, so it predicts activity from the structure without any physicochemical parameters. Machine learning procedures such as Random Forest and SVM are also used as the mathematical models (A Dobchev et al. 2014; Dudek et al. 2006; Ning and Karypis 2011).
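The additive hypothesis of the Free-Wilson approach is usually expressed as follows. The notation is the common textbook one and is an assumption here, not taken from the cited paper: \(\mu\) is the activity of the reference (parent) compound, \(a_i\) is the contribution of substituent \(i\), and \(x_i\) indicates whether that substituent is present.

\[ \log\frac{1}{C} = \mu + \sum_{i} a_i x_i, \qquad x_i \in \{0, 1\} \]

The contributions \(a_i\) are obtained by regression over a series of analogs, so the model can only predict combinations of substituents that already occur in the training series.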

Likewise, QSAR modeling uses deep learning techniques that work on chemical strings and extract features automatically. In the Merck Molecular Activity Challenge held in 2012, the team led by George Dahl won using an ensemble of methods including Gaussian process regression, multi-task DNN, and gradient boosting machine (Ma et al. 2015). The Kaggle results drew attention to multi-task DNNs. Dahl et al. continued to work on the multi-task DNN concept and showed excellent performance compared with single-task neural networks.

With a multi-task strategy, neural networks learn shared features across different tasks, provided the tasks are similar (Dahl et al. 2014). Ramsundar et al. (2017) assessed multi-task neural networks in drug development and reported better results than the random forest algorithm. Multi-task neural networks have since been incorporated into the DeepChem platform. Subramanian et al. used Canvas descriptors as input to a DNN and built regression and classification models to predict the binding affinities of human \(\beta\)-secretase-1 inhibitors (Subramanian et al. 2016). The DNN model performed well on the validation set: the classification accuracy was 0.82, and the regression model gave \(R^2\) = 0.74 with a mean absolute error (MAE) of 0.52. The DNN model, which uses 2D descriptors, gave better results than force-field-based strategies, which was attributed to the capabilities of deep learning models. In summary, QSAR models based on deep learning are expected to give better results in future hit-to-lead optimization research.
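The three validation metrics quoted above (classification accuracy, \(R^2\), and MAE) have standard definitions, which can be sketched as follows. The example values are made up and are unrelated to the reported results.

```python
def accuracy(y_true, y_pred):
    """Fraction of class labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up observations and predictions.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
mae = mean_absolute_error(observed, predicted)
r2 = r_squared(observed, predicted)
acc = accuracy([1, 0, 1, 1], [1, 0, 0, 1])
```

Accuracy applies to the classification model (active versus inactive), while \(R^2\) and MAE apply to the regression model that predicts a continuous affinity value.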

4.6.2 De novo drug design

De novo drug design generates novel chemical structures that are tuned to the target of interest (Hartenfeller and Schneider 2010). A popular de novo method, the fragment-based approach, builds a new molecule from scratch. If the resulting molecular structure is impractical or overly complex (Schneider et al. 2017), it becomes risky to develop and its bioactivity becomes difficult to assess. Deep learning models use their learned knowledge and generative capabilities to propose new structures with appropriate properties (Mullard 2017).

In the de novo drug design process, a deep learning model acts as an autoencoder to generate an appropriate representation of new chemical entities (NCEs). Coupling the autoencoder with a multilayer perceptron classifier adds value by generating NCEs with predefined physicochemical properties. The drug or chemical structure is written in SMILES syntax, which can be difficult to handle in many circumstances; the grammar variational autoencoder (VAE) overcomes this problem and accelerates the process (Fig. 17).

figure 17

SMILES/SLN notation of an antiviral compound

Olivecrona et al. extended deep reinforcement learning to develop new molecules with predicted biological activity by fine-tuning an RNN model (Olivecrona et al. 2017). The RNN model is first trained on molecules collected from ChEMBL to learn SMILES syntax. In reinforcement learning, an agent takes actions under given conditions; when an action earns a positive reward, the agent's tendency to take that action is reinforced (Mnih et al. 2015). For activity scoring, an SVM built on the ligands in the training set was used to assign the reward. The deep reinforcement learning model, together with the RNN, was used to generate molecules against the dopamine receptor type 2. Over 95% of the generated structures were predicted to be bioactive by the SVM scoring function. Deep learning can also create novel molecules through autoencoders: to generate new molecules with appropriate properties automatically, Gomez-Bombarelli et al. (2018) combined a multilayer perceptron (MLP) with a variational autoencoder (VAE).

PPI prediction faces several challenges: (i) limited protein information, (ii) too few known PPIs to learn about a specific virus, and (iii) methods that perform poorly because of sequence dissimilarity between viral families. The motivation for the DeNovo method is to predict the PPIs of a novel virus with its host. DeNovo is a sequence-based framework with negative sampling that learns from the PPIs of diverse viruses to predict those of a novel one, exploiting the host proteins they share. To assess generalization, DeNovo was tested on PPIs from various domains. The approach reached 81% accuracy when reducing noisy negative associations and 86% accuracy when predicting for viral proteins used during the training period. DeNovo performed comparably in intra-species and single virus–host prediction cases. It remains difficult to predict PPIs for an infected individual, and the best accuracy was obtained in tests on human–bacteria interactions (Eid et al. 2016).

Combining multi-objective optimization with AI has given promising results for exploring biological and chemical space by automating de novo compound design in a way that resembles human creativity. In this study, a multi-objective approach applies an RNN algorithm to generate novel molecules de novo, balancing trade-offs among several physicochemical properties. Multiple chemical libraries were designed de novo against acetylcholinesterase and neuraminidase. Numerous quality metrics were used to assess chemical feasibility, validity, drug-likeness, and diversity. The generated molecules were docked, and their poses and scores were evaluated against X-ray cognate ligands and similar known molecules. Overall, multi-objective optimization with AI provides an easy-to-use, customizable design technique that is especially effective for lead generation and optimization (Domenico et al. 2020).

The network generally consists of three components: an encoder, a decoder, and a predictor. The encoder converts discrete SMILES strings into a latent space in which molecules are represented as continuous vectors. The decoder converts these vectors back into discrete SMILES strings. The predictor, a multilayer perceptron (MLP), estimates molecular properties from the continuous vectors. Gradient-based optimization can then be applied to the continuous vectors to reach high predicted property values. Two techniques, Bayesian inference and a gradient-based approach, are used to find new molecules with appropriate properties quickly. Their main advantage is that they deliver candidates with high predicted values automatically, in a form whose chemical structure humans can interpret. A decoded string does not correspond to a chemical structure when its SMILES syntax is invalid. To avoid this difficulty, the output can be constrained; Pu et al. used a variational autoencoder (VAE) to characterize SMILES syntax (Pu et al. 2017).
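The gradient-based search in the latent space can be sketched with a toy example. Everything here is a stand-in: `toy_property` replaces a trained MLP predictor with a simple function whose optimum is known, the encoder and decoder are omitted entirely, and the gradient is estimated numerically rather than by backpropagation.

```python
def numerical_gradient(f, z, eps=1e-5):
    """Central-difference estimate of the gradient of f at latent point z."""
    grad = []
    for i in range(len(z)):
        up, down = list(z), list(z)
        up[i] += eps
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad


def gradient_ascent(f, z0, lr=0.1, steps=200):
    """Move a latent vector uphill on the predicted property."""
    z = list(z0)
    for _ in range(steps):
        grad = numerical_gradient(f, z)
        z = [zi + lr * gi for zi, gi in zip(z, grad)]
    return z


def toy_property(z):
    """Hypothetical stand-in for a trained predictor; peaks at (1.0, -2.0)."""
    return -((z[0] - 1.0) ** 2 + (z[1] + 2.0) ** 2)


z_best = gradient_ascent(toy_property, [0.0, 0.0])
```

In a real model, the optimized latent vector would be passed through the decoder to obtain a SMILES string, which then has to be checked for validity as discussed above.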

Kadurin et al. used an adversarial autoencoder (AAE) model, later referred to as druGAN, to generate molecular fingerprints. The AAE outperformed the VAE model in generation ability, reconstruction error, and feature extraction ability (Kadurin et al. 2017). Coley et al. (2018) proposed a way to judge whether a generated molecule is synthetically accessible. A neural network, chosen for its excellent approximation capability, was trained on a reaction database to learn a synthetic complexity metric. The underlying assumption is that a synthetic reaction increases complexity, so the complexity score of the product must be greater than that of the reactant (Andras 2017). Coley et al. built the scoring function by encoding each chemical reaction as a reactant–product pair and imposing the inequality between product and reactant complexity. The neural network was trained on 22 million reactant–product pairs. The resulting score reflects the complexity of the synthesis process. Ultimately, generative models are useful not only for proposing active drugs but also, through retrosynthetic planning and synthetic complexity, for discarding unrealistic molecules.
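The requirement that a product score higher than its reactant can be turned into a training objective with a pairwise hinge loss. The sketch below shows that general technique under stated assumptions: the margin value, the function names, and the use of SMILES string length as a placeholder score are illustrative choices, not details confirmed by the text.

```python
def pairwise_hinge_loss(score, pairs, margin=0.25):
    """Mean hinge loss that is zero when score(product) exceeds
    score(reactant) by at least `margin` for every pair."""
    total = 0.0
    for reactant, product in pairs:
        total += max(0.0, margin + score(reactant) - score(product))
    return total / len(pairs)


def toy_score(smiles):
    """Placeholder complexity score: the length of the SMILES string."""
    return float(len(smiles))


# Hypothetical (reactant, product) pairs written as SMILES strings.
pairs = [
    ("CCO", "CCOC(C)=O"),   # product is longer: inequality satisfied
    ("CCCCCC", "CC"),       # product is shorter: inequality violated
]
loss = pairwise_hinge_loss(toy_score, pairs)
```

During training, a neural network would take the place of `toy_score`, and its parameters would be adjusted to drive this loss toward zero over the reaction database.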

4.7 Lead optimization

Lead optimization is an essential step of the drug discovery process in which the most medicinally active hits are taken forward as leads for medicinal chemistry projects. Its main aim is to eliminate the side effects of existing active analogues through minimal structural modification, yielding a better and safer scaffold. One example is the optimization of autotaxin inhibitors that led to the clinical agent GLPG1690, which has advanced into human clinical trials against pulmonary fibrosis. Another aim is to increase potency through tailor-made approaches to provide a more active analogue. The following sections discuss chemical and physical properties; absorption, distribution, metabolism, and excretion; and toxicity and the ADME/T multi-task neural networks.

4.7.1 Chemical and physical properties

In the drug discovery pipeline, physical and chemical properties are assessed to reduce costly failures. Deep learning models have been used in lead optimization to improve property prediction (Lusci et al. 2013). Duvenaud et al. (2015) extracted information directly from the molecular graph with a CNN-ANN approach, obtaining predictions with an MAE of 0.53 ± 0.07 together with interpretability. Inspired by Duvenaud's work, Coley et al. sought better results for aqueous solubility prediction; they used a tensor-based convolutional technique and obtained a better MAE of 0.424 ± 0.005.

The molecular graph must be attributed because tensor-based techniques integrate atom-level and bond-level features. For predicting aqueous solubility, Coley's model used far more atom-level information than Duvenaud's model (Coley et al. 2017). In estimating pharmacokinetic properties, the Caco-2 permeability coefficient (\(P_{app}\)) correlates well with oral drug absorption and is therefore used to assess candidate drugs (Artursson and Karlsson 1991; Hubatsch et al. 2007). Wang et al. compiled Caco-2 permeability data for 1,272 compounds and built prediction models with 30 descriptors (Wang et al. 2016), using methods such as SVM regression and boosting. On the test set, the boosting model gave the best results, with good predictive capability. The models follow the QSAR principles of the OECD (Organisation for Economic Co-operation and Development), which were applied to ensure reliability and rationality.

4.7.2 Absorption, distribution, metabolism and excretion

Drug absorption is the passage of a drug from its site of administration into the bloodstream. The degree of absorption is examined through the bioavailability parameter. Many studies have addressed the optimization of absorption properties by predicting the bioavailability of molecules (Tian et al. 2011). Tian et al. used an MLR model on 1,014 molecules to predict bioavailability from molecular properties and structural fingerprints. With the genetic function approximation technique, the predictive performance was good, with RMSE = 0.2355 and a correlation coefficient of 0.71. Drug distribution is the transfer of an absorbed drug (Sim 2015) into the intracellular and interstitial fluids of the body. The volume of distribution at steady state (VDss) relates the amount of drug in the body in vivo to its concentration in plasma and is a significant index for evaluating drug distribution. VDss therefore needs to be predicted; Lombardo and Jing built PLS and Random Forest models on 1,096 molecules (Lombardo and Jing 2016). The prediction results were not fully satisfactory, because only about 50% of the molecules were predicted within a twofold error. VDss may be influenced by unknown factors, so predicting it from molecular structure data alone remains a challenge. When a drug enters the human body, it can be metabolized into toxic metabolites. To improve metabolic stability, structural optimization is guided by highly accurate metabolism predictions. Many AI methods have used large amounts of drug metabolism data to predict metabolism by specific enzymes such as UDP-glucuronosyltransferases (UGTs) and cytochrome P450s.
Furthermore, neural networks trained in UGT metabolism at Xenosite (Matlock et al. 2015 ; Zaretzki et al. 2013 ) platform for predicting the UGT metabolism (Dang et al. 2016 ). Eliminating dosage from drugs and also metabolites from the human body referred to as drug excretion. Drug metabolites are wiped out from the human body either with the usage of water (i.e., some drugs can be soluble in water) or it directly eliminated through the absence of metabolism. For retrieving excellent results in unique mechanisms, Lombardo et al. utilized the PCA technique with an expectation pace of 84% (Lombardo et al. 2014 ) accuracy.
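The figures of merit quoted in this subsection (RMSE, correlation coefficient, and the share of molecules predicted within twofold error) can be computed as in the following minimal Python sketch. The function names and the toy values are illustrative assumptions; they are not code or data from the cited studies.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def fraction_within_fold(observed, predicted, fold=2.0):
    """Share of predictions whose ratio to the observed value is within `fold`.

    Assumes strictly positive values (for example, VDss in L/kg).
    """
    hits = sum(1 for o, p in zip(observed, predicted) if max(o / p, p / o) <= fold)
    return hits / len(observed)


if __name__ == "__main__":
    # Hypothetical observed and predicted values, for illustration only.
    obs = [1.0, 1.0, 1.0, 1.0]
    pred = [1.5, 2.0, 3.0, 0.4]
    print(rmse(obs, pred), pearson_r([1, 2, 3], [2, 4, 6]), fraction_within_fold(obs, pred))
```

The fold-error measure is symmetric: a prediction that is half or double the observed value counts as being exactly at the twofold boundary.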

4.7.3 Toxicity and the ADME/T multi-task neural networks

Toxicity observed at the preclinical and clinical stages accounts for the failure of about 33% of candidate molecules in drug discovery, so predicting toxicity while lead molecules are optimized reduces this risk (Guengerich 2010). Toxicity profiles, for example for the kidney and liver, can be predicted with structural alerts and rule-based expert knowledge. Deep learning models have been introduced to improve on these results. Xu et al. built an acute oral toxicity prediction model based on a molecular graph encoding CNN (MGE-CNN), and its predictions were better than those of an SVM model (Youjun et al. 2017). The MGE-CNN model succeeded because molecular encoding, feature extraction, and model development are handled together during neural network training. An advantage is that the molecular fingerprints can adapt to the problem at hand, owing to the flexibility of the MGE-CNN model. To relate the most informative fragments to structural alerts, Xu et al. mapped the toxic features in the fingerprints to those described in ToxAlerts (Sushko et al. 2012). When prediction tasks are related, multi-task neural networks trained on them jointly performed better than single-task neural networks (Mayr et al. 2016), because shared parameters let the tasks learn common features. Drug absorption, distribution, metabolism, and excretion are likewise related, so their prediction can also be improved with multi-task neural networks. Kearnes et al. compared single-task and multi-task models on ADME/T experimental data, and the multi-task model performed better (Kearnes et al. 2016).

4.8 ML in e-Resources for drug discovery

AI and ML algorithms have become the main computational scoring functions wherever a predicted value enters as a parameter in the basic drug discovery paradigm (Stork et al. 2020), as illustrated in Fig. 18. The applications of the ML algorithms in specific e-resources are described in the following sections.

figure 18

ML in e-Resources of drug discovery platform

4.8.1 ML in Pan-assay interference screening (PAINS)

Precise information about hits from primary or secondary biological screening assays of purchasable, commercially available compound databases is among the most important inputs before a medicinal chemistry project starts. Eliminating compounds that show up as hits across many different cellular and biological assays, known as pan-assay interference compounds, reduces the cost and time spent by medicinal chemists. Pan-assay information can be accessed from the PAINS database on request. The Hit Dexter 2.0 web server, compiled from the PubChem library and screening assays, was launched for this purpose. It can be used at the outset to assess the likely biological behaviour of a newly designed compound, so that pan-assay interfering compounds are eliminated at the initial stage (Stork et al. 2019).

4.8.2 ML in drug metabolite and metabolic site prediction

Identifying the site of metabolism of a drug or new chemical entity (NCE) is essential before it is administered to the human body. Drug metabolism can be studied in animal models (preclinical studies), which is a costly, rate-limiting step but is mandatory for the therapeutic approval of new chemical entities. The site of metabolism can be predicted by several modules; among them, the "ADMET Predictor" of Simulations Plus has gained attention and works purely on models built with artificial intelligence algorithms. FAME3 is an online server that predicts the region of a given drug or compound that undergoes metabolism. It draws on several validated databases of phase 1 and phase 2 metabolism, and its performance was assessed with the Matthews correlation coefficient (MCC) (Stork et al. 2020). It is also important to have an overview of the chemical modifications that drugs and NCEs undergo during metabolism, which can be used in calculating the dosage regimen, dosage frequency, toxicity, and other effects, including beneficial side effects. Online services such as GLORY/GLORYx provide precise information about possible new metabolites and their formation with respect to cytochrome P450 enzymes and conjugation reactions (de Bruyn Kops et al. 2019).
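The Matthews correlation coefficient used for validation above is computed from the four cells of a binary confusion matrix. The following Python sketch shows the standard formula; the function names and the example labels are illustrative assumptions and are unrelated to the FAME3 implementation.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels encoded as 0 and 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient, ranging from -1 to +1.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical labels: 1 = atom is a site of metabolism, 0 = it is not.
    truth = [1, 1, 0, 0]
    predicted = [1, 0, 0, 0]
    print(mcc(*confusion_counts(truth, predicted)))
```

Unlike plain accuracy, the MCC stays informative when one class is much rarer than the other, which is typical when only a few atoms of a molecule are metabolized.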

4.8.3 ML in skin sensitivity parameter prediction

Predicting skin sensitivity is one of the essential criteria for assessing the safety of new drugs and compounds, and the response varies from patient to patient. For this purpose, AI models such as a Random Forest on MACCS keys (RF_MACCS) and a support vector machine (SVM) on PaDEL descriptors (SVM_PaDEL) have been trained on approximately 1400 ligands with local lymph node assay (LLNA) data (Stork et al. 2020; Vranic et al. 2019).

4.8.4 ML in natural product identification

The NP Scout online server uses as its predictive model an ML classifier trained on 265000 natural product isolates and synthetic library compounds and validated with the MCC; it indicates the probable identity of newly discovered analogs. Applying NP Scout to a query molecule may provide information about its natural product sources and could become part of natural product-based drug discovery (An et al. 2019).

5 Drug discovery problems

In drug discovery and development, clinicians and researchers face challenges in target validation, in the identification of prognostic biomarkers in clinical trials, and in the analysis of computational pathology data.

5.1 Target validation

The predominant approach in drug discovery is to develop drugs that alter the disease state by modulating the activity of a molecular target. Starting a drug development program therefore requires a target with a therapeutic hypothesis: that modulating the target will change the disease state. Selecting that target on the basis of the available evidence is called target identification. Following this initial decision, in vivo and ex vivo models are used to validate the role of the target in the disease. Final validation comes only through clinical trials, yet early target validation is needed to concentrate efforts on projects that are likely to succeed. Disease data now include metabolomic, transcriptomic, and proteomic profiles as well as patient clinical material. The ability to reuse such clinical data through public databases supports early target identification and target validation. Predicting targets from these data requires appropriate methods that yield statistically valid models.

ML approaches are used in target identification because data-driven target identification experiments are on the increase. The first step is to establish a causal association between the target and the disease, where modulation of the target arises either naturally or through artificial experimental intervention. ML can predict candidate targets from the known properties of established targets and their causal links to disease, and it has been applied from several perspectives. To predict morbidity-associated genes, a decision-tree classifier was trained on protein-protein interaction network and localization data (Costa et al. 2010). Inspection of the decision trees highlighted a few key features: extracellular location, regulation by transcription factors, and metabolic pathways. Jeon et al. built an SVM classifier on genomic data to classify proteins into drug targets and non-drug targets in breast, pancreatic, and ovarian cancer (Jeon et al. 2014). The key features for classification were mRNA expression, network topology, protein-protein interactions, and DNA copy number, and the model identified 122 global cancer targets. A further 462, 266, and 355 targets were identified for pancreatic, breast, and ovarian tumors, respectively. Two predicted targets were validated with peptide inhibitors, which showed pronounced anti-proliferative effects in cell culture. In pancreatic tumor cells, the inhibitors produced about twice the inhibition.

To identify transcriptional changes in Huntington's disease, Ament et al. developed a model of transcription factor binding sites in mouse combined with transcriptome data (Ament et al. 2018). Using regression models with LASSO regularization, they built a genome-scale model for 718 transcription factors in mouse striatum. The transcription factor modules identified may be relevant to treatment in the early phases of Huntington's disease. For tissue-specific anti-aging treatments, Mamoshina et al. (2018) identified molecular targets by comparing the gene-expression signatures of young and old muscle. Among the supervised machine learning models compared, SVMs with linear kernels and feature selection were found to be the most suitable for identifying biomarkers. Targets predicted by ML in this way can be carried forward as candidates for testing therapeutic hypotheses. To identify gene-disease, drug-disease, and target-drug associations, NLP kernel methods have been applied to Medline abstracts (Bravo et al. 2015). Many supervised learning techniques rely on the EU-ADR (European Union Adverse Drug Reaction) database to identify disease genes in Medline. NLP is also used to extract biological entity events (Kim et al. 2017).

ML can also deepen the understanding of biology and thereby reveal novel targets for therapeutic treatment. Alternative splicing, studied in the context of Alzheimer's disease, is one example. A DL model of the splicing code was used to predict alternative splicing (Leung et al. 2014). Integrative splicing models (Jha et al. 2017) combine RNA sequencing data with CLIP-seq data and knock-down results. To identify splicing variations in Alzheimer's disease (Vaquero-Garcia et al. 2016), such code models were integrated with complex and de novo variants. ML can also predict the effects of cancer drugs (Iorio et al. 2016). That work investigated how DNA methylation, somatic mutation data, and genome-wide data affect drug response, using logic models, ANOVA, and machine learning models such as random forests to identify the molecular features that predict drug response.

Gene expression and DNA methylation were recognized as the most predictive data types across cancer types. Using data from RNAi screens in 501 cancer cell lines, ML located molecular features that predict dependencies on 769 genes in cancer cells (Tsherniak et al. 2017). Genetic associations were sought for 171 chemicals, revealing targetable vulnerabilities in oncotypes that are not addressed by current cancer therapy (McMillan et al. 2018). Such predictive models illustrate how ML can inform cancer therapy. A further question for developers is whether a drug can be developed for a given target. For small-molecule drugs, the target protein must be able to bind small molecules. To assess this, a random forest classifier was trained on physicochemical and geometric attributes of surface cavities, including 1,187 non-drug-binding cavities, from a set of 99 proteins (Nayal and Honig 2006). The size and shape of the surface cavities were the most important features. Other approaches predicted drug targets by applying SVMs (Li and Lai 2007; Bakheet and Doig 2009) or a DL model (Bakheet and Doig 2009) to physicochemical properties derived from protein sequences. Drug target proteins also tend to occupy specific positions in the PPI network and to be highly connected (Jeon et al. 2014; Costa et al. 2010; Kandoi et al. 2015). ML algorithms have proposed new candidate targets and reduced the search space, but these targets require further validation. Predicting clinical trial success for a drug target is a more demanding goal for target identification and validation. Rouillard et al. applied ML with multivariate feature selection to omics data for 332 drug targets that had succeeded or failed in phase III clinical trials (Rouillard et al. 2018).

Gene-expression data proved to be a successful predictor: targets with low mean RNA expression and high variance across tissues were more likely to succeed in clinical trials. This supports the view that disease-specific, tissue-selective expression of a drug target matters (Kumar et al. 2016). To predict de novo therapeutic drug targets, ML classifiers were trained on data from an open platform (Koscielny et al. 2017; Ferrero et al. 2017). Genetic data and gene expression were among the key data types for predicting therapeutic drug targets. ML approaches are constrained in such cases by missing and sparse data, including data on failed drug development programs. In practice, bringing a drug to market takes a long time, during which technology advances: new modalities such as biologics (including antibodies) become available, and small-molecule drug design may not remain the same as it is today. A further constraint is that the success or failure of a drug must be predicted from the metadata that is accessible in the public domain.

5.2 Prognostic biomarkers

ML-based biomarker discovery is used to improve clinical trial performance by helping to understand drug mechanisms and to match drugs to suitable patients (Li et al. 2015; van Gool et al. 2017; Kraus 2018). Late-stage clinical trials consume a great deal of time and money. It is therefore useful to build, validate, and apply predictive models in the early stages of clinical trials. ML algorithms allow translational biomarkers to be predicted from preclinical data sets. After validation, the corresponding models and biomarkers can be used to stratify patients, suggest indications, and finally propose the medication. The literature contains many papers on predictive models and biomarkers, yet few of them have been used in clinical trials. Factors such as data access, data quality, model design, model selection, model reproducibility, and software all matter in a clinical setting. To address these issues, community efforts have assessed ML approaches for developing regression and classification models. Some years ago, the US FDA (Food and Drug Administration) led the MicroArray Quality Control (MAQC II) project, which evaluated ML algorithms for predicting clinical end points from gene expression data (Shi et al. 2010). In this project, 36 independent groups analyzed 6 microarray data sets to develop predictive models for classifying samples with respect to clinical end points. The project showed that good modelling practice in a clinical setting depends on data quality, skilled scientists, and control processes. For example, poor prognosis in multiple myeloma patients was defined by a survival cut-off of 24 months, which was applied somewhat arbitrarily. A regression-based approach is more appropriate here, because survival and gene expression are continuous variables. Cox regression models were indeed shown to predict patient risk from a gene expression signature (Zhan et al. 2006).

This highlights the advantage of regression models (Shaughnessy et al. 2007; Zhan et al. 2008; Decaux et al. 2008; Mulligan et al. 2007), which can make predictions in clinical trials without predefined classes. To evaluate regression models, the NCI (National Cancer Institute) challenge asked participants to build drug response prediction models (Costello et al. 2014). Each group had to fit its best model on the training data (35 breast tumor cell lines treated with 31 drugs), and the models were then verified on a blind test set (18 breast tumor cell lines treated with the same 31 drugs). Six types of data profile were provided: RNA sequencing, RNA microarray, reverse phase protein array, SNP (Single Nucleotide Polymorphism) array, DNA methylation status, and exome sequencing. The 44 groups applied regression models such as sparse linear regression, kernel methods, regression trees, and principal component methods. As in MAQC II, some groups performed well and others less so, even though they used similar models. The differences were traced to technical details such as feature selection, quality control, data reduction, tuning of ML parameters, and data splitting strategy, and to the use of biological data such as gene expression to improve the predictive model. Predictive models were also easier to build for some drugs than for others.

The NCI-DREAM challenge data and outcomes have since been used to evaluate improved approaches such as group factor analysis (Bunte et al. 2016), a Random Forest framework (Rahman et al. 2017), and other approaches (Huang et al. 2017; Hejase and Chan 2015). Several papers have published predictive ML models in which biomarkers play a significant role in drug discovery and development. In one case study, tumor cell line screening data were used to build drug sensitivity models for sorafenib and erlotinib (Li et al. 2015). The models were then applied to patients from the BATTLE clinical trial (Kim et al. 2011) to determine whether they were predictive and drug-specific. This case study shows that ML models help in recognizing key determinants of drug sensitivity across tumor tissues. In 2017 the FDA approved the PD1 (Programmed cell Death 1) inhibitor pembrolizumab for tumors that carry a specific genetic biomarker. It was the first FDA approval based on a genetic biomarker rather than on tumor type (Boyiadzis et al. 2018), which highlights the value of biomarker discovery. Recently, ML-based predictive biomarkers have also advanced in areas other than oncology. To improve the understanding of drug response in patients, ML algorithms have been applied to multi-omics data (Tasaki et al. 2018). Gradient regression trees have been used to improve polygenic risk scores (Paré et al. 2017). When tested on UK Biobank data, the SNP model explained 46.9% of the polygenic variance for height and 32.7% for BMI. Genome-wide polygenic scores have also been developed to identify individuals at high risk of conditions such as coronary artery disease, breast cancer, and inflammatory bowel disease (Khera et al. 2018).
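In its simplest additive form, a polygenic risk score is a weighted sum of risk-allele counts over a set of SNPs. The Python sketch below shows only that baseline and not the regression-tree refinement cited above; the SNP identifiers, weights, and genotypes are hypothetical.

```python
def polygenic_score(dosages, weights):
    """Additive polygenic score: sum of weight * allele count over SNPs.

    dosages: dict mapping SNP id -> number of risk alleles (0, 1 or 2)
    weights: dict mapping SNP id -> per-allele effect size
    SNPs missing from `dosages` are treated as 0 copies (an assumption
    of this sketch; real pipelines usually impute missing genotypes).
    """
    return sum(w * dosages.get(snp, 0) for snp, w in weights.items())


def rank_by_score(cohort, weights):
    """Return individual ids ordered from highest to lowest score."""
    scores = {pid: polygenic_score(geno, weights) for pid, geno in cohort.items()}
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    # Hypothetical effect sizes and genotypes, for illustration only.
    weights = {"snp_a": 0.20, "snp_b": -0.10, "snp_c": 0.05}
    cohort = {
        "person_1": {"snp_a": 2, "snp_b": 1},
        "person_2": {"snp_a": 0, "snp_b": 2, "snp_c": 1},
    }
    print(rank_by_score(cohort, weights))
```

Individuals in the upper tail of the resulting score distribution are the ones such a model would flag as high risk.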

Single-cell RNA sequencing is widely used in advanced biomarker discovery and gene clustering. The technique is used to trace developmental lineages, determine cell states, and find novel cell types. An unresolved issue is how to reduce gene expression measurements from thousands of cells to a low-dimensional space. To reduce the high-dimensional data to a low-dimensional form, Ding et al. introduced a probabilistic generative model of single-cell gene expression data that also provides uncertainty estimates (Ding et al. 2018). The probabilistic model was used to examine four single-cell RNA sequencing data sets. It produces a 2D representation of the multi-dimensional data for distinguishing cell patterns in single-cell RNA sequencing data. VAEs (variational autoencoders) have been used to transform single-cell RNA sequencing data into an encoded latent feature space in order to identify hidden tumour subpopulations (Sabrina et al. 2019). The encoded features were assessed for relationships among the cell subpopulations. The strategy is based on unsupervised learning and includes a data pre-processing step. The VASC model was applied to single-cell RNA sequencing data for data visualization (Wang and Jin 2018).

When tested on 20 data sets, the VASC model outperformed the SIMLR (Wang et al. 2017) and ZIFA (Pierson and Yau 2015) dimension reduction models. ML approaches to feature selection have brought large advances in biomarker discovery. Several researchers have used unsupervised deep learning methods to extract meaningful structure from clusters (Tan et al. 2016). A VAE was applied to TCGA (The Cancer Genome Atlas) RNA sequencing data to find specific patterns in the VAE-encoded features (Way and Greene 2017). To improve the identification of carcinoma, Beck et al. (2011) described integrating image analysis with gene expression data to identify squamous cell carcinoma of the lung. A CNN model also predicted cardiac failure from endomyocardial biopsy data with better performance (AUC = 0.97) than the comparison values (AUC = 0.73 and 0.75) (Nirschl et al. 2018). These examples show that ML approaches have been successful in biomarker discovery, although numerous issues still need to be resolved. One is that a classifier must be understandable by end users before it can be adopted clinically. Another is that every approach needs to be validated on multi-institutional, multi-site data sets to establish its generalizability. The community has addressed these issues, with rapid progress in model interpretation and the extraction of biological insight (Finnegan and Song 2017), in optimization and training algorithms (Angermueller et al. 2016), and in model reproducibility (Hutson 2018).
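The AUC values quoted above summarize how well a classifier ranks positive cases above negative ones. A minimal Python sketch of the rank-based definition follows; the function name and the toy scores are illustrative assumptions, not values from the cited study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case receives
    a higher score than a randomly chosen negative case; ties count 0.5.
    labels: iterable of 0/1, scores: iterable of numbers.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical labels (1 = disease) and classifier scores.
    print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

The pairwise loop is quadratic in the number of cases, which is fine for a sketch; sorting-based implementations are preferable for large data sets.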

5.3 Digital pathology

Pathology is a descriptive field: each pathologist interprets what is seen on a glass slide by visual assessment. Glass slides carry a great deal of information, for example which cell types are present in the tissue and their spatial context. This is particularly important in immuno-oncology, where the relationships between tumor cells and immune cells are examined. Pharmaceutical companies must understand how a particular drug affects cells and tissues in the body, and they test thousands of compounds before a candidate is chosen for clinical trials. With the rapid growth in clinical trials, locating biomarkers that identify the patients who are likely to respond to a therapy has become more important. Rapid progress in digital pathology can uncover new biomarkers in a more cost-effective, precise, and high-throughput manner, reducing drug development time and giving patients faster access to therapy. Before deep learning models were applied, many image analysis algorithms were inspired by collaboration with pathologists, and computer scientists had to handcraft the image features used to classify tissue.

Early digital pathology studies aimed to capture the linguistic descriptors that pathologists use for hematoxylin and eosin (H&E) images. Nuclear morphometry is one such application, relating computer-generated features to prognosis (Veltri et al. 2000). Using spatial context, Beck et al. (2011) identified stromal features associated with survival in breast cancer. More recently, Lu et al. (2017) described nuclear orientation features that explain survival in oral cancers and breast cancers (Cheng et al. 2018). In many cases, antibodies are used in immunohistochemical stains to target proteins for imaging. Even without deep learning tools, morphological image analysis can detect tissue structures in complex data. In immuno-oncology, ML approaches generate high-throughput features that describe thousands of cells in their spatial context, a task that is impossible for a pathologist. DL methods detect tissues and cells in the cancer environment more precisely. Many features describe the spatial relationships of cells and tissues at different scales. Lymphocytes have been used as biomarkers to understand heterogeneity in breast cancer (Mani et al. 2016). Cell-cell relationships, together with the locations and densities of CD8+ and PD1+ cells, were examined to identify Merkel cell carcinoma patients who respond to pembrolizumab (Giraldo et al. 2017). A trial typically uses several stains per tissue sample, and with thousands of features examined, the number of cell-cell interaction features grows with each stain. In this situation, ML models and feature selection must be combined to predict the therapeutic response.

CNN models are well suited to digital pathology because a single biopsy provides a large number of pixels for training. DL models automatically learn structured features for a variety of classification tasks (Janowczyk and Madabhushi 2016). One such model, M-CNN (Multi-scale CNN), is a supervised learning technique for phenotyping high-content cellular images (Godinez et al. 2017) that replaces several customized steps of earlier pipelines. By converting image pixel values directly to phenotypes, the M-CNN approach demonstrated higher classification accuracy. In image analysis, numerous DL methods have been applied to tubules (Romo-Bucheli et al. 2016), lymphocytes (Saltz et al. 2018; Corredor et al. 2019), mitotic activity (Romo-Bucheli et al. 2016), and tumours (Sharma et al. 2017; Korbar et al. 2017; Bychkov et al. 2018; Cruz-Roa et al. 2017) in lung and breast cancers. DL models also provide information in other imaging modalities. Deep learning can accelerate data acquisition in MRI (Magnetic Resonance Imaging) (Cohen et al. 2018) and reduce the radiation dose in CT (Computed Tomography) imaging (Chen et al. 2017). Image quality has improved considerably in signal-to-noise ratio and spatial resolution, so applications such as patient stratification, disease prediction, and image quantification have improved correspondingly. In another study, a deep learning framework predicted mutated genes in lung cancer from hematoxylin and eosin (H&E)-stained images (Coudray et al. 2018).

In image analysis, deep learning procedures are built for specific tasks, so combining conventional image analysis with deep learning algorithms can help in problem-solving. DL techniques can outperform other methods on many problems, but DL is not a general image analysis tool because it lacks flexibility. Likewise, classification tasks need many annotations from expert pathologists, and these are expensive to generate. To mitigate this problem, immunohistochemistry staining has been used (Turkki et al. 2016). Community efforts also give pathologists more data on which to build annotations for many use cases. Transparency is another challenge for digital pathology: deep learning methods are known as black boxes, and the basis of their decisions in classification tasks is unclear. For understanding mechanisms in drug development, interpretable outcomes help in locating potential biomarkers and drug targets for predicting response to therapy. Interpretability of the generated features should also improve trust. A further challenge is the large sample size required to apply DL techniques properly to predicting therapeutic response, since DL needs a very large number of samples from clinical trials. Data from several clinical trials can sometimes be combined, but the resulting bias can make the outcomes difficult to interpret. Corredor et al. (2019) and Saltz et al. (2018) combined image analysis with DL models to predict response to therapy: a CNN was used to detect lymphocytes in H&E-stained images, and features were then derived from the subsequent graph. In the future, DL is likely to replace traditional segmentation and nuclear detection algorithms for providing spatial context features (Table 2).

6 Challenges

Drug discovery faces many challenges, and most of them can be addressed with machine learning techniques. Some of these challenges are listed here together with possible remedies.

Many ML methods produce accurate results, although some parameters and architectures cause difficulties during training. In particular, when the training data is insufficient, an algorithm may fail to reach the required accuracy and may become trapped in a local optimum.

To overcome this issue, a deep belief architecture, which is an unsupervised pre-trained model, can be used to improve the parameters so that results are obtained more effectively (Ghasemi et al. 2018).

Transparency is another challenge in drug discovery, because the basis of decisions is unclear in many classification models. In drug development, numerous mechanisms need to be understood in order to interpret the outcomes. Interpretable outcomes help in locating new drug targets, and interpretability is needed to build trust in the assembled features (Vamathevan et al. 2019).

In drug development, techniques such as SVM, MLR, RF, and deep learning can be applied to understand these mechanisms and interpret the outcomes. This helps in locating new drug targets and in building trust in the assembled features through interpretability.

Data for integration is available from many sources, especially from the 'omics' fields. Integration is becoming more challenging day by day, because the data is not only growing but is also highly heterogeneous across pharmaceutical companies (Searls 2005).

Public databases such as ZINC, BindingDB, PubChem, DrugBank, and the REAL chemical database are available, and developers need to create a pipeline architecture to integrate these heterogeneous data sources. Data warehousing tools that work on the ETL (Extract, Transform, and Load) principle include Integrated Genomic Database, Adaptable Clinical Trial Database, DataFoundry, SWISS-PROT, SCOP, and dbEST. Further examples are the Genome Information Management System and BIOMOLQUEST, which draw on PDB, SWISS-PROT, ENZYME, and CATH data (Cornell et al. 2001; Bukhman and Skolnick 2001).
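The Extract, Transform, and Load pattern mentioned above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the two source layouts, the field names, and the use of an InChIKey string as the shared compound identifier are all hypothetical and do not reflect the actual schemas of the databases named above.

```python
from collections import defaultdict


def transform_catalog_row(row):
    """Map a row from a hypothetical compound catalog onto the common schema."""
    return {
        "compound_id": row["inchikey"].strip().upper(),
        "name": row["name"].strip(),
        "mol_weight": float(row["mw"]),
    }


def transform_binding_row(row):
    """Map a row from a hypothetical binding-affinity source onto the common schema."""
    return {
        "compound_id": row["InChIKey"].strip().upper(),
        "target": row["target"],
        "ki_nm": float(row["ki_nM"]),
    }


def load(records):
    """Merge normalized records into one entry per compound."""
    warehouse = defaultdict(lambda: {"name": None, "mol_weight": None, "activities": []})
    for rec in records:
        entry = warehouse[rec["compound_id"]]
        if "name" in rec:
            entry["name"] = rec["name"]
            entry["mol_weight"] = rec["mol_weight"]
        if "target" in rec:
            entry["activities"].append((rec["target"], rec["ki_nm"]))
    return dict(warehouse)


def run_pipeline(catalog_rows, binding_rows):
    """Extract rows are passed in; transform each source, then load."""
    records = [transform_catalog_row(r) for r in catalog_rows]
    records += [transform_binding_row(r) for r in binding_rows]
    return load(records)


if __name__ == "__main__":
    catalog = [{"inchikey": " abc ", "name": "cmpd-1", "mw": "180.16"}]
    binding = [{"InChIKey": "ABC", "target": "T1", "ki_nM": "12.5"}]
    print(run_pipeline(catalog, binding))
```

Keeping one transform function per source confines each source's quirks (key spelling, units, string versus numeric types) to a single place, so adding a further database means adding one function rather than changing the merge step.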

Even homogeneous data can pose integration challenges, starting with experimental and analytical issues such as cross-platform normalization, and the statistical issues grow with highly heterogeneous information (Searls 2005).

ML combined with big data analytics can therefore be used to integrate such data sources. Ontology-based integration tools include the Web Ontology Language (OWL), Extensible Markup Language (XML), RDF Schema and the Resource Description Framework (RDF), and the Unified Medical Language System (A Seoane et al. 2013). Weblink-based integration tools include the Sequence Retrieval System (Etzold et al. 1996), ChEMBL (Gaulton et al. 2012), NCBI Entrez, PubChem, Integr8, DiseaseCard, and EMBL-EBI search and sequence analysis (A Seoane et al. 2013; Madeira et al. 2019). Visualization tools such as Microsoft Power BI, IBM Cognos, Tableau, Zoho Analytics, Sisense, and SAS Business Intelligence are also available. Integration and visualization tools help in identifying bottlenecks and potential problems before they affect important processes (Soukup and Davidson 2002).

In pharmaceutical companies, research spans scales from large molecules to individuals and generally relies on the integration of heterogeneous data, which poses its own challenges across varying contexts and scales (Searls 2005).

A high level of artificial intelligence is needed to manage the various sources, and it must be supported by a better understanding of the gathered data. Modern data connectors are therefore suggested to centralize the dissimilar data; ultimately, these connectors help in allocating the original data.

7 Conclusion and future directions

AI technology, including ML algorithms and deep learning techniques, is used routinely in the pharmaceutical industry. ML techniques in drug development and in health service centers have encountered numerous difficulties, especially with image analysis and omics data. In medical science, ML models make predictions within the known framework of their training data; for example, compound structures can be handled with alternative tools such as PPI inhibitors and macrocycles alongside traditional algorithms. Additionally, deep learning models can take into account chemical structures and QSAR models from pharmaceutical data, which is relevant for finding molecules with appropriate properties and thus for improving the success rate in clinical trials. AI technology has taken a step forward by entering computer-aided drug development and bringing powerful data mining capabilities to it. However, some issues remain:

The performance of deep learning methods directly influences innovation in data mining, because deep neural networks are trained effectively on large volumes of data. The main aim is to tackle this problem through automatic transfer learning.

The “black-box” nature of models remains a source of confusion in deep learning. Local Interpretable Model-agnostic Explanations (LIME) is an example of a counterfactual probe and has been used to open up the black-box model (Voosen 2017). Here, restricted data were required to explain deep learning models (Tishby and Zaslavsky 2015). However, deep learning techniques reveal such information about the data only in the initial stages.

Many parameters are adjusted while training neural networks, but the theoretical and practical frameworks needed to optimize these models are still out of reach.
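To make the LIME idea mentioned above more concrete, the following is a minimal sketch of a local linear surrogate for a one-dimensional black-box model. It is not the LIME library and simplifies the method considerably: the `black_box` function, the symmetric perturbation grid (LIME itself samples randomly), and the Gaussian kernel width are all assumptions chosen for illustration.

```python
import math


def black_box(x):
    """Stand-in for an opaque model; any callable float -> float would do."""
    return x ** 2


def local_linear_explanation(model, x0, width=0.5, n_samples=21):
    """Fit a proximity-weighted line to `model` around the point x0.

    Returns (slope, intercept) of the local surrogate. Requires n_samples >= 2.
    """
    # Perturb the input on a symmetric grid around x0; a grid keeps this
    # sketch deterministic.
    offsets = [width * (2 * i / (n_samples - 1) - 1) for i in range(n_samples)]
    xs = [x0 + d for d in offsets]
    ys = [model(x) for x in xs]

    # Weight each perturbed point by its proximity to x0 (Gaussian kernel).
    ws = [math.exp(-((d / width) ** 2)) for d in offsets]

    # Weighted least-squares fit of y ~ intercept + slope * x.
    w_sum = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / w_sum
    y_bar = sum(w * y for w, y in zip(ws, ys)) / w_sum
    cov = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    var = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    slope = cov / var
    intercept = y_bar - slope * x_bar
    return slope, intercept
```

The slope of the surrogate indicates how strongly the prediction responds to the input near `x0`; with several input features, the analogous coefficients would rank the features by local influence.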

7.1 Future directions

Web technology has been integrated with medical science to improve the predictive power of decision-making and of deep learning algorithms concerning biomarkers, side effects of therapies, and therapeutic benefits. In clinical trials, success is achieved through the use of particular applications, which motivates future investment by pharmaceutical companies. In the future, AI technology is expected to cover all aspects of drug discovery and development. Automated AI needs to coordinate emerging theoretical results with chemistry information, omics data, and medical data. We also anticipate that more evidence will need to be established for drug discovery campaigns.

Abagyan R, Totrov M, Kuznetsov D (1994) Icm–a new method for protein modeling and design: applications to docking and structure prediction from the distorted native conformation. J Comput Chem 15(5):488–506

Ain QU, Aleksandrova A, Roessler FD, Ballester PJ (2015) Machine-learning scoring functions to improve structure-based binding affinity prediction and virtual screening. Wiley Interdisciplinary Reviews. Comput Molec Sci 5(6):405–424

Alpaydin E (2020) Introduction to machine learning. MIT press, Cambridge

Ament SA, Pearl JR, Cantle JP, Bragg RM, Skene PJ, Coffey SR, Bergey DE, Wheeler VC, MacDonald ME, Baliga NS et al (2018) Transcriptional regulatory networks underlying gene expression changes in huntington’s disease. Mol Syst Biol 14(3):e7435

An H, Li M, Gao J, Zhang Z, Ma S, Chen Y (2019) Incorporation of biomolecules in metal-organic frameworks for advanced applications. Coord Chem Rev 384:90–106

Andras P (2017) High-dimensional function approximation with neural networks for large volumes of data. IEEE Trans Neural Netw Learn Syst 29(2):500–508

Angermueller C, Pärnamaa T, Parts L, Stegle O (2016) Deep learning for computational biology. Mol Syst Biol 12(7):878

Artursson P, Karlsson J (1991) Correlation between oral drug absorption in humans and apparent drug permeability coefficients in human intestinal epithelial (caco-2) cells. Biochem Biophys Res Commun 175(3):880–885

Ashburn TT, Thor KB (2004) Drug repositioning: identifying and developing new uses for existing drugs. Nat Rev Drug Discovery 3(8):673–683

Asher M (2017) The drug-maker’s guide to the galaxy. Nature News 549(7673):445

Bai F, Morcos F, Cheng RR, Jiang H, Onuchic JN (2016) Elucidating the druggable interface of protein-protein interactions using fragment docking and coevolutionary analysis. Proc Natl Acad Sci 113(50):E8051–E8058

Bakheet TM, Doig AJ (2009) Properties and identification of human protein drug targets. Bioinformatics 25(4):451–457

Beck AH, Sangoi AR, Leung S, Marinelli RJ, Nielsen TO, Van De Vijver MJ, West RB, Van De Rijn M, Koller D (2011) Systematic analysis of breast cancer morphology uncovers stromal features associated with survival. Sci Trans Med 3(108):108ra113

Bengio Y (2009) Learning deep architectures for AI. Now Publishers Inc, Norwell

Bengio Y, Lamblin P, Popovici D, Larochelle H (2007) Greedy layer-wise training of deep networks. In: Advances in neural information processing systems

Boyiadzis MM, Kirkwood JM, Marshall JL, Pritchard CC, Azad NS, Gulley JL (2018) Significance and implications of fda approval of pembrolizumab for biomarker-defined disease. J Immunother Cancer 6(1):1–7

Bravo À, Piñero J, Queralt-Rosinach N, Rautschka M, Furlong LI (2015) Extraction of relations between genes and diseases from text and large-scale data analysis: implications for translational research. BMC Bioinform 16(1):55

Bukhman YV, Skolnick J (2001) Biomolquest: integrated database-based retrieval of protein structural and functional information. Bioinformatics 17(5):468–478

Bundela S, Sharma A, Bisen PS (2015) Potential compounds for oral cancer treatment: resveratrol, nimbolide, lovastatin, bortezomib, vorinostat, berberine, pterostilbene, deguelin, andrographolide, and colchicine. PLoS ONE 10(11):e0141719

Bunte K, Leppäaho E, Saarinen I, Kaski S (2016) Sparse group factor analysis for biclustering of multiple data sources. Bioinformatics 32(16):2457–2463

Bychkov D, Linder N, Turkki R, Nordling S, Kovanen PE, Verrill C, Walliander M, Lundin M, Haglund C, Lundin J (2018) Deep learning based tissue analysis predicts outcome in colorectal cancer. Sci Rep 8(1):1–11

Cabreiro F, Au C, Leung K-Y, Vergara-Irigaray N, Cochemé HM, Noori T, Weinkove D, Schuster E, Greene NDE, Gems D (2013) Metformin retards aging in c. elegans by altering microbial folate and methionine metabolism. Cell 153(1):228–239

Cano G, Garcia-Rodriguez J, Garcia-Garcia A, Perez-Sanchez H, Benediktsson JA, Thapa A, Barr A (2017) Automatic selection of molecular descriptors using random forest: Application to drug discovery. Expert Syst Appl 72:151–159

Chen Y-C (2015) Beware of docking! Trends Pharmacol Sci 36(2):78–95

Chen H, Engkvist O, Wang Y, Olivecrona M, Blaschke T (2018) The rise of deep learning in drug discovery. Drug Discovery Today 23(6):1241–1250

Chen R, Li L, Weng Z (2003) Zdock: an initial-stage protein-docking algorithm. Proteins Struct Funct Bioinf 52(1):80–87

Chen H, Zhang Y, Kalra MK, Lin F, Chen Y, Liao P, Zhou J, Wang G (2017) Low-dose ct with a residual encoder-decoder convolutional neural network. IEEE Trans Med Imaging 36(12):2524–2535

Cheng L, Lewis JS, Dupont WD, Plummer WD, Janowczyk A, Madabhushi A (2017) An oral cavity squamous cell carcinoma quantitative histomorphometric-based image classifier of nuclear morphology can risk stratify patients for disease-specific survival. Mod Pathol 30(12):1655–1665

Cheng L, Romo-Bucheli D, Wang X, Janowczyk A, Ganesan S, Gilmore H, Rimm D, Madabhushi A (2018) Nuclear shape and orientation features from h&e images predict survival in early-stage estrogen receptor-positive breast cancers. Lab Invest 98(11):1438–1448

Cohen O, Zhu B, Rosen MS (2018) Mr fingerprinting deep reconstruction network (drone). Magn Reson Med 80(3):885–894

Coley CW, Barzilay R, Green WH, Jaakkola TS, Jensen KF (2017) Convolutional embedding of attributed molecular graphs for physical property prediction. J Chem Inf Model 57(8):1757–1772

Coley CW, Rogers L, Green WH, Jensen KF (2018) Scscore: synthetic complexity learned from a reaction corpus. J Chem Inf Model 58(2):252–261

Copeland RA (2010) The dynamics of drug-target interactions: drug-target residence time and its impact on efficacy and safety. Expert Opin Drug Discov 5(4):305–310

Cornell M, Paton NW, Wu S, Goble CA, Miller CJ, Kirby P, Eilbeck K, Brass A, Hayes A, Oliver SG (2001) Gims-a data warehouse for storage and analysis of genome sequence and functional data. In: Proceedings 2nd Annual IEEE International Symposium on Bioinformatics and Bioengineering (BIBE 2001). IEEE, pp 15–22

Corredor G, Wang X, Zhou Y, Lu C, Fu P, Syrigos K, Rimm DL, Yang M, Romero E, Schalper KA et al (2019) Spatial architecture and arrangement of tumor-infiltrating lymphocytes for predicting likelihood of recurrence in early-stage non-small cell lung cancer. Clin Cancer Res 25(5):1526–1534

Costa PR, Acencio ML, Lemke N (2010) A machine learning approach for genome-wide prediction of morbid and druggable human genes based on systems-level data. In: BMC genomics, vol 11. Springer, Berlin, p S9

Costello JC, Heiser LM, Georgii E, Gönen M, Menden MP, Wang NJ, Bansal M, Hintsanen P, Khan SA, Mpindi J-P et al (2014) A community effort to assess and improve drug sensitivity prediction algorithms. Nat Biotechnol 32(12):1202–1212

Coudray N, Ocampo PS, Sakellaropoulos T, Narula N, Snuderl M, Fenyö D, Moreira AL, Razavian N, Tsirigos A (2018) Classification and mutation prediction from non-small cell lung cancer histopathology images using deep learning. Nat Med 24(10):1559–1567

Cruz-Roa A, Gilmore H, Basavanhally A, Feldman M, Ganesan S, Shih NNC, Tomaszewski J, González FA, Madabhushi A (2017) Accurate and reproducible invasive breast cancer detection in whole-slide images: A deep learning approach for quantifying tumor extent. Sci Rep 7:46450

Cukuroglu E, Engin HB, Gursoy A, Keskin O (2014) Hot spots in protein-protein interfaces: towards drug discovery. Prog Biophys Mol Biol 116(2–3):165–173

Dahl GE, Jaitly N, Salakhutdinov R (2014) Multi-task neural networks for qsar predictions. arXiv preprint arXiv:1406.1231

Dang NL, Hughes TB, Krishnamurthy V, Swamidass SJ (2016) A simple model predicts ugt-mediated metabolism. Bioinformatics 32(20):3183–3189

de Bruyn KC, Stork C, Šícho M, Kochev N, Svozil D, Jeliazkova N, Kirchmair J (2019) Glory: generator of the structures of likely cytochrome p450 metabolites based on predicted sites of metabolism. Front Chem 7:402

De Haes W, Frooninckx L, Van Assche R, Smolders A, Depuydt G, Billen J, Braeckman BP, Schoofs L, Temmerman L (2014) Metformin promotes lifespan through mitohormesis via the peroxiredoxin prdx-2. Proc Natl Acad Sci 111(24):E2501–E2509

Decaux O, Lodé L, Magrangeas F, Charbonnel C, Gouraud W, Jézéquel P, Attal M, Harousseau J-L, Moreau P, Bataille R et al (2008) Prediction of survival in multiple myeloma based on gene expression profiles reveals cell cycle and chromosomal instability signatures in high-risk patients and hyperdiploid signatures in low-risk patients: a study of the intergroupe francophone du myelome. J Clin Oncol 26(29):4798–4805

Deng L, Dong Y (2014) Deep learning: methods and applications. Found Trends Sign Process 7(3–4):197–387

Desai B, Dixon K, Farrant E, Feng Q, Gibson KR, van Hoorn WP, Mills J, Morgan T, Parry DM, Ramjee MK et al (2013) Rapid discovery of a novel series of abl kinase inhibitors by application of an integrated microfluidic synthesis and screening platform. J Med Chem 56(7):3033–3047

DiMasi JA, Grabowski HG, Hansen RW (2016) Innovation in the pharmaceutical industry: new estimates of r&d costs. J Health Econ 47:20–33

Ding J, Condon A, Shah SP (2018) Interpretable dimensionality reduction of single cell transcriptome data with deep generative models. Nat Commun 9(1):1–13

Dobchev DA, Pillai G, Karelson M (2014) In silico machine learning methods in drug development. Curr Top Med Chem 14(16):1913–1922

Domenico A, Nicola G, Daniela T, Fulvio C, Nicola A, Orazio N (2020) De novo drug design of targeted chemical libraries based on artificial intelligence and pair based multi-objective optimization. J Chem Inform Model

Du T, Liao L, Wu CH, Sun B (2016) Prediction of residue-residue contact matrix for protein–protein interaction with fisher score features and deep learning. Methods 110:97–105

Duch W, Swaminathan K, Meller J (2007) Artificial intelligence approaches for rational drug design and discovery. Curr Pharm Des 13(14):1497–1508

Duda RO, Hart PE, Stork DG (2012) Pattern classification. John Wiley & Sons, New Jersey

Dudek AZ, Arodz T, Gálvez J (2006) Computational methods in developing quantitative structure-activity relationships (qsar): a review. Comb Chem High Throughput Screen 9(3):213–228

Dupond S (2019) A thorough review on the current advance of neural network structures. Annu Rev Control 14:200–230

Duvenaud DK, Maclaurin D, Iparraguirre J, Bombarell R, Hirzel T, Aspuru-Guzik A, Adams RP (2015) Convolutional networks on graphs for learning molecular fingerprints. In: Advances in neural information processing systems, pp 2224–2232

Eid F-E, ElHefnawi M, Heath LS (2016) Denovo: virus-host sequence-based protein–protein interaction prediction. Bioinformatics 32(8):1144–1150

Engelbrecht AP (2007) Computational intelligence: an introduction. John Wiley & Sons, New Jersey

Esposito EX, Hopfinger AJ, Madura JD (2004) Methods for applying the quantitative structure-activity relationship paradigm. In: Chemoinformatics. Springer, pp 131–213

Etzold T, Ulyanov A, Argos P (1996) [8] srs: information retrieval system for molecular biology data banks. Methods Enzymol 266:114–128

Falchi F, Caporuscio F, Recanatini M (2014) Structure-based design of small-molecule protein–protein interaction modulators: the story so far. Future Med Chem 6(3):343–357

Ferrero E, Dunham I, Sanseau P (2017) In silico prediction of novel therapeutic targets using gene-disease association data. J Transl Med 15(1):182

Finnegan A, Song JS (2017) Maximum entropy methods for extracting the learned features of deep neural networks. PLoS Comput Biol 13(10):e1005836

Free SM, Wilson JW (1964) A mathematical contribution to structure-activity studies. J Med Chem 7(4):395–399

Friedman J, Hastie T, Tibshirani R (2001) The elements of statistical learning, vol 1. Springer series in statistics, New York

Friesner RA, Banks JL, Murphy RB, Halgren TA, Klicic JJ, Mainz DT, Repasky MP, Knoll EH, Shelley M, Perry JK et al (2004) Glide: a new approach for rapid, accurate docking and scoring. 1. method and assessment of docking accuracy. J Med Chem 47(7):1739–1749

Gaulton A, Bellis LJ, Bento AP, Chambers J, Davies M, Hersey A, Light Y, McGlinchey S, Michalovich D, Al-Lazikani B et al (2012) Chembl: a large-scale bioactivity database for drug discovery. Nucleic Acids Res 40(D1):D1100–D1107

Gertrudes JC, Maltarollo VG, Silva RA, Oliveira PR, Honorio KM, Da Silva ABF (2012) Machine learning techniques and drug design. Curr Med Chem 19(25):4289–4297

Ghasemi F, Mehridehnavi A, Fassihi A, Pérez-Sánchez H (2018) Deep neural network in qsar studies using deep belief network. Appl Soft Comput 62:251–258

Giraldo NA, Kaunitz GJ, Cottrell TR, Berry S, Sunshine JC, Nguyen P, Xu H, Orgutsova A, Church CD, Miller NJ et al. (2017) The differential association of pd-1, pd-l1, and cd8+ cells with response to pembrolizumab and presence of merkel cell polyomavirus (mcpyv) in patients with merkel cell carcinoma (mcc)

Godinez WJ, Hossain I, Lazic SE, Davies JW, Zhang X (2017) A multi-scale convolutional neural network for phenotyping high-content cellular images. Bioinformatics 33(13):2010–2019

Goodfellow I, Bengio Y, Courville A (2016) Deep learning, vol 1. MIT press, Cambridge

Gopal M (2018) Applied machine learning. McGraw-Hill Education, Chennai

Guengerich FP (2010) Mechanisms of drug toxicity and relevance to pharmaceutical development. Drug metabolism and pharmacokinetics, p 1010210090

Guney E, Menche J, Vidal M, Barábasi A-L (2016) Network-based in silico drug efficacy screening. Nat Commun 7(1):1–13

Gunther EC, Stone DJ, Gerwien RW, Bento P, Heyes MP (2003) Prediction of clinical drug efficacy by classification of drug-induced genomic expression profiles in vitro. Proc Natl Acad Sci 100(16):9608–9613

Guo Y, Yu L, Wen Z, Li M (2008) Using support vector machine combined with auto covariance to predict protein-protein interactions from protein sequences. Nucleic Acids Res 36(9):3025–3030

Gupta S, Chaudhary K, Kumar R, Gautam A, Nanda JS, Dhanda SK, Brahmachari SK, Raghava GPS (2016) Prioritization of anticancer drugs against a cancer using genomic features of cancer cells: A step towards personalized medicine. Sci Rep 6(1):1–11

Gómez-Bombarelli R, Wei JN, Duvenaud D, Hernández-Lobato JM, Sánchez-Lengeling B, Sheberla D, Aguilera-Iparraguirre J, Hirzel TD, Adams RP, Aspuru-Guzik A (2018) Automatic chemical design using a data-driven continuous representation of molecules. ACS Cent Sci 4(2):268–276

Hansch C, Fujita T (1964) Additions and corrections-analysis. a method for the correlation of biological activity and chemical structure. J Am Chem Soc 86(24):5710

Hartenfeller M, Schneider G (2010) De novo drug design. In: Chemoinformatics and computational chemical biology. Springer, Berlin, pp 299–323

Hassan BM, Ahmad K, Roy S, Mohammad Ashraf J, Adil M, Haris Siddiqui M, Khan S, Amjad Kamal M, Provazník I, Choi I (2016) Computer aided drug design: success and limitations. Curr Pharm Des 22(5):572–581

Hejase HA, Chan C (2015) Improving drug sensitivity prediction using different types of data. CPT: Pharmacometrics Syst Pharmacol 4(2):98–105

Higueruelo AP, Jubb H, Blundell TL (2013) Protein-protein interactions as druggable targets: recent technological advances. Curr Opin Pharmacol 13(5):791–796

Hinton G (2018) Deep learning–a technology with the potential to transform health care. JAMA 320(11):1101–1102

Ho TK (1995) Random decision forests. In: Proceedings of 3rd international conference on document analysis and recognition, vol 1. IEEE, pp 278–282

Huang S, Cai N, Pacheco PP, Narrandes S, Wang Y, Xu W (2018) Applications of support vector machine (svm) learning in cancer genomics. Cancer Genom Proteom 15(1):41–51

Huang C, Mezencev R, McDonald JF, Vannberg F (2017) Open source machine-learning algorithms for the prediction of optimal cancer drug therapies. PLoS ONE 12(10):e0186906

Huang S-Y, Zou X (2010) Inclusion of solvation and entropy in the knowledge-based scoring function for protein–ligand interactions. J Chem Inf Model 50(2):262–273

Hubatsch I, Ragnarsson EGE, Artursson P (2007) Determination of drug permeability and prediction of drug absorption in caco-2 monolayers. Nat Protoc 2(9):2111

Hutson M (2018) Artificial intelligence faces reproducibility crisis

Iorio F, Knijnenburg TA, Vis DJ, Bignell GR, Menden MP, Schubert M, Aben N, Gonçalves E, Barthorpe S, Lightfoot H et al (2016) A landscape of pharmacogenomic interactions in cancer. Cell 166(3):740–754

Janowczyk A, Madabhushi A (2016) Deep learning for digital pathology image analysis: a comprehensive tutorial with selected use cases. J Pathol Inform 7

Jeon J, Nim S, Teyra J, Datti A, Wrana JL, Sidhu SS, Moffat J, Kim PM (2014) A systematic approach to identify novel cancer drug targets using machine learning, inhibitor design and high-throughput screening. Genome Med 6(7):1–18

Jha A, Gazzara MR, Barash Y (2017) Integrative deep models for alternative splicing. Bioinformatics 33(14):i274–i282

Jiménez J, Skalic M, Martinez-Rosell G, De Fabritiis G (2018) K deep: Protein-ligand absolute binding affinity prediction via 3d-convolutional neural networks. J Chem Inf Model 58(2):287–296

Jung E, Kim J, Kim M, Jung DH, Rhee H, Shin J-M, Choi K, Kang S-K, Kim M-K, Yun C-H et al (2007) Artificial neural network models for prediction of intestinal permeability of oligopeptides. BMC Bioinform 8(1):245

Kadurin A, Aliper A, Kazennov A, Mamoshina P, Vanhaelen Q, Khrabrov K, Zhavoronkov A (2017) The cornucopia of meaningful leads: Applying deep adversarial autoencoders for new molecule development in oncology. Oncotarget 8(7):10883

Kandoi G, Acencio ML, Lemke N (2015) Prediction of druggable proteins using machine learning and systems biology: a mini-review. Front Physiol 6:366

Kapoor R, Hagan M, Palta J, Ghosh P (2020) Artificial intelligence methods in computer-aided diagnostic tools and decision support analytics for clinical informatics. Artif Intell Prec Health From Conc Appl, p 31

Kearnes S, Goldman B, Pande V (2016) Modeling industrial admet data with multitask networks. arXiv preprint arXiv:1606.08793

Khamis MA, Gomaa W, Ahmed WF (2015) Machine learning in computational docking. Artif Intell Med 63(3):135–152

Khera AV, Chaffin M, Aragam KG, Haas ME, Roselli C, Choi SH, Natarajan P, Lander ES, Lubitz SA, Ellinor PT et al (2018) Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nat Genet 50(9):1219–1224

Kim ES, Herbst RS, Wistuba II, Lee JJ, Blumenschein GR, Tsao A, Stewart DJ, Hicks ME, Erasmus J, Gupta S et al (2011) The battle trial: personalizing therapy for lung cancer. Cancer Discov 1(1):44–53

Kim J, Kim J, Lee H (2017) An analysis of disease-gene relationship from medline abstracts by digsee. Sci Rep 7(1):1–13

Kingma DP, Welling M (2013) Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114

Kingma DP, Welling M (2019) An introduction to variational autoencoders. arXiv preprint arXiv:1906.02691

Kinnings SL, Liu N, Tonge PJ, Jackson RM, Xie L, Bourne PE (2011) A machine learning-based method to improve docking scoring functions and its application to drug repurposing. J Chem Inf Model 51(2):408–419

Konar A (2006) Computational intelligence: principles, techniques and applications. Springer Science & Business Media, Berlin

Korbar B, Olofson AM, Miraflor AP, Nicka CM, Suriawinata MA, Torresani L, Suriawinata AA, Hassanpour S (2017) Deep learning for classification of colorectal polyps on whole-slide images. J Pathol Inform 8

Koscielny G, An P, Carvalho-Silva D, Cham JA, Fumis L, Gasparyan R, Hasan S, Karamanis N, Maguire M, Papa E et al (2017) Open targets: a platform for therapeutic target identification and validation. Nucleic Acids Res 45(D1):D985–D994

Kramer MA (1991) Nonlinear principal component analysis using autoassociative neural networks. AIChE J 37(2):233–243

Kraus VB (2018) Biomarkers as drug development tools: discovery, validation, qualification and use. Nat Rev Rheumatol 14(6):354–362

Kumar V, Sanseau P, Simola DF, Hurle MR, Agarwal P (2016) Systematic analysis of drug targets confirms expression in disease-relevant tissues. Sci Rep 6:36205

Larsen ABL, Sønderby SK (2015) Generating faces with torch. URL http://torch.ch/blog/2015/11/13/gan.html

Lavecchia A, Di Giovanni C (2013) Virtual screening strategies in drug discovery: a critical review. Curr Med Chem 20(23):2839–2860

LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436–444

Leelananda SP, Lindert S (2016) Computational methods in drug discovery. Beilstein J Org Chem 12(1):2694–2718

Leung MKK, Xiong HY, Lee LJ, Frey BJ (2014) Deep learning of the tissue-regulated splicing code. Bioinformatics 30(12):i121–i129

Li H, Hou J, Adhikari B, Lyu Q, Cheng J (2017) Deep learning methods for protein torsion angle prediction. BMC Bioinform 18(1):417

Li Q, Lai L (2007) Prediction of potential drug targets based on simple sequence properties. BMC Bioinform 8(1):353

Li B, Shin H, Gulbekyan G, Pustovalova O, Nikolsky Y, Hope A, Bessarabova M, Schu M, Kolpakova-Hart E, Merberg D et al (2015) Development of a drug-response modeling framework to identify cell line derived translational biomarkers that can predict treatment outcome to erlotinib or sorafenib. PLoS ONE 10(6):e0130700

Li L, Wang B, Meroueh SO (2011) Support vector regression scoring of receptor-ligand complexes for rank-ordering and virtual screening of chemical libraries. J Chem Inf Model 51(9):2132–2138

Liew CY, Ma XH, Liu X, Yap CW (2009) Svm model for virtual screening of lck inhibitors. J Chem Inf Model 49(4):877–885

Lombardo F, Jing Y (2016) In silico prediction of vol of distribution in humans. extensive data set and the exploration of linear and nonlinear methods coupled with molecular interaction fields descriptors. J Chem Inf Model 56(10):2042–2052

Lombardo F, Obach RS, Varma MV, Stringer R, Berellini G (2014) Clearance mechanism assignment and total clearance prediction in human based upon in silico models. J Med Chem 57(10):4397–4405

Lotfi SM, Ghadiri N, Mousavi SR, Varshosaz J, Green JR (2018) A review of network-based approaches to drug repositioning. Brief Bioinform 19(5):878–892

Luo Y, Zhao X, Zhou J, Yang J, Zhang Y, Kuang W, Peng J, Chen L, Zeng J (2017) A network integration approach for drug-target interaction prediction and computational drug repositioning from heterogeneous information. Nat Commun 8(1):1–13

Lusci A, Pollastri G, Baldi P (2013) Deep architectures and deep learning in chemoinformatics: the prediction of aqueous solubility for drug-like molecules. J Chem Inf Model 53(7):1563–1575

Ma XH, Jia J, Zhu F, Xue Y, Li ZR, Chen YZ (2009) Comparative analysis of machine learning methods in ligand-based virtual screening of large compound libraries. Comb Chem High Throughput Screen 12(4):344–357

Ma J, Sheridan RP, Liaw A, Dahl GE, Svetnik V (2015) Deep neural nets as a method for quantitative structure-activity relationships. J Chem Inf Model 55(2):263–274

Madeira F, Park YM, Lee J, Buso N, Gur T, Madhusoodanan N, Basutkar P, Tivey ARN, Potter SC, Finn RD et al (2019) The embl-ebi search and sequence analysis tools apis in 2019. Nucleic Acids Res 47(W1):W636–W641

Maheshwari S, Brylinski M (2016) Template-based identification of protein–protein interfaces using efindsiteppi. Methods 93:64–71

Maltarollo VG, Kronenberger T, Espinoza GZ, Oliveira PR, Honorio KM (2019) Advances with support vector machines for novel drug discovery. Expert Opin Drug Discov 14(1):23–33

Mamoshina P, Volosnikova M, Ozerov IV, Putin E, Skibina E, Cortese F, Zhavoronkov A (2018) Machine learning on human muscle transcriptomic data for biomarker discovery and tissue-specific drug target identification. Front Genet 9:242

Mani NL, Schalper KA, Hatzis C, Saglam O, Tavassoli F, Butler M, Chagpar AB, Pusztai L, Rimm DL (2016) Quantitative assessment of the spatial heterogeneity of tumor-infiltrating lymphocytes in breast cancer. Breast Cancer Res 18(1):78

Martin-Montalvo A, Mercken EM, Mitchell SJ, Palacios HH, Mote PL, Scheibye-Knudsen M, Gomes AP, Ward TM, Minor RK, Blouin M-J et al (2013) Metformin improves healthspan and lifespan in mice. Nat Commun 4(1):1–9

Matlock MK, Hughes TB, Swamidass SJ (2015) Xenosite server: a web-available site of metabolism prediction tool. Bioinformatics 31(7):1136–1137

Matsumoto A, Aoki S, Ohwada H (2016) Comparison of random forest and svm for raw data in drug discovery: prediction of radiation protection and toxicity case study. Int J Mach Learn Comput 6(2):145

Mayr A, Klambauer G, Unterthiner T, Hochreiter S (2016) Deeptox: toxicity prediction using deep learning. Front Environ Sci 3:80

McMillan EA, Ryu M-J, Diep CH, Mendiratta S, Clemenceau JR, Vaden RM, Kim J-H, Motoyaji T, Covington KR, Peyton M et al (2018) Chemistry-first approach for nomination of personalized treatment in lung cancer. Cell 173(4):864–878

Melville JL, Burke EK, Hirst JD (2009) Machine learning in virtual screening. Comb Chem High Throughput Screen 12(4):332–343

Miljanovic M (2012) Comparative analysis of recurrent and finite impulse response neural networks in time series prediction. Indian J Comput Sci Eng 3(1):180–191

Mnih V, Kavukcuoglu K, Silver D, Rusu AA, Veness J, Bellemare MG, Graves A, Riedmiller M, Fidjeland AK, Ostrovski G et al (2015) Human-level control through deep reinforcement learning. Nature 518(7540):529–533

Morita A, Ariyasu S, Wang B, Asanuma T, Onoda T, Sawa A, Tanaka K, Takahashi I, Togami S, Nenoi M et al (2014) As-2, a novel inhibitor of p53-dependent apoptosis, prevents apoptotic mitochondrial dysfunction in a transcription-independent manner and protects mice from a lethal dose of ionizing radiation. Biochem Biophys Res Commun 450(4):1498–1504

Mulligan G, Mitsiades C, Bryant B, Zhan F, Chng WJ, Roels S, Koenig E, Fergus A, Huang Y, Richardson P et al (2007) Gene expression profiling and correlation with outcome in clinical trials of the proteasome inhibitor bortezomib. Blood 109(8):3177–3188

Myint KZ, Xie X-Q (2010) Recent advances in fragment-based qsar and multi-dimensional qsar methods. Int J Mol Sci 11(10):3846–3866

Nayal M, Honig B (2006) On the nature of cavities on protein surfaces: application to the identification of drug-binding sites. Proteins Struct Funct Bioinf 63(4):892–906

Ning X, Karypis G (2011) In silico structure-activity-relationship (sar) models from machine learning: a review. Drug Dev Res 72(2):138–146

Nirschl JJ, Janowczyk A, Peyster EG, Frank R, Margulies KB, Feldman MD, Madabhushi A (2018) A deep-learning classifier identifies patients with clinical heart failure using whole-slide images of h&e tissue. PLoS ONE 13(4):e0192726

Noble WS (2006) What is a support vector machine? Nat Biotechnol 24(12):1565–1567

Olivecrona M, Blaschke T, Engkvist O, Chen H (2017) Molecular de-novo design through deep reinforcement learning. J Cheminform 9(1):48

Pal SK, Mitra S (1992) Multilayer perceptron, fuzzy sets, classification

Paré G, Mao S, Deng WQ (2017) A machine-learning heuristic to improve gene score prediction of polygenic traits. Sci Rep 7(1):1–11

Patel S, Tripathi R, Kumari V, Varadwaj P (2017) Deepinteract: deep neural network based protein-protein interaction prediction tool. Curr Bioinform 12(6):551–557

Patil K, Jordan EJ, Park JH, Suresh K, Smith CM, Lemmon AA, Mossé Yaël P, Lemmon MA, Radhakrishnan R (2021) Computational studies of anaplastic lymphoma kinase mutations reveal common mechanisms of oncogenic activation. Proc Natl Acad Sci 118(10)

Pierson E, Yau C (2015) Zifa: Dimensionality reduction for zero-inflated single-cell gene expression analysis. Genome Biol 16(1):1–10

Polamuri S (2017) How the random forest algorithm works in machine learning. Retrieved December, 21

Poole D, Mackworth A, Goebel R (1998) Computational intelligence

Pu Y, Wang W, Henao R, Chen L, Gan Z, Li C, Carin L (2017) Adversarial symmetric variational autoencoder. In: Advances in neural information processing systems, pp 4330–4339

Rahman R, Matlock K, Ghosh S, Pal R (2017) Heterogeneity aware random forest for drug sensitivity prediction. Sci Rep 7(1):1–11

Rahman R, Otridge J, Pal R (2017) Integratedmrf: random forest-based framework for integrating prediction from different data types. Bioinformatics 33(9):1407–1410

Ramsundar B, Liu B, Wu Z, Verras A, Tudor M, Sheridan RP, Pande V (2017) Is multitask deep learning practical for pharma? J Chem Inf Model 57(8):2068–2076

Rolan P, Danhof M, Stanski D, Peck C (2007) Current issues relating to drug safety especially with regard to the use of biomarkers: A meeting report and progress update. Eur J Pharm Sci 30(2):107–112

Romo-Bucheli D, Janowczyk A, Gilmore H, Romero E, Madabhushi A (2016) Automated tubule nuclei quantification and correlation with oncotype dx risk categories in er+ breast cancer whole slide images. Sci Rep 6:32706

Rosenblatt F (1961) Principles of neurodynamics. perceptrons and the theory of brain mechanisms. Technical report, Cornell Aeronautical Lab Inc Buffalo NY

Rouillard AD, Hurle MR, Agarwal P (2018) Systematic interrogation of diverse omic data reveals interpretable, robust, and generalizable transcriptomic features of clinically successful therapeutic targets. PLoS Comput Biol 14(5):e1006142

Sabrina R, Sohrab S, Ziv BJ, Ravi P (2019) Dhaka: variational autoencoder for unmasking tumor heterogeneity from single cell genomic data. Bioinformatics

Saltz J, Gupta R, Hou L, Kurc T, Singh P, Nguyen V, Samaras D, Shroyer KR, Zhao T, Batiste R et al (2018) Spatial organization and molecular correlation of tumor-infiltrating lymphocytes using deep learning on pathology images. Cell Rep 23(1):181–193

Samigulina G, Zarina S (2017) Immune network technology on the basis of random forest algorithm for computer-aided drug design. In: International Conference on Bioinformatics and Biomedical Engineering. Springer, pp 50–61

Santos R, Ursu O, Gaulton A, Bento AP, Donadi RS, Bologa CG, Karlsson A, Al-Lazikani B, Hersey A, Oprea TI et al (2017) A comprehensive map of molecular drug targets. Nat Rev Drug Discovery 16(1):19–34

Schmidhuber J (2015) Deep learning in neural networks: an overview. Neural Netw 61:85–117

Schneider G, Funatsu K, Okuno Y, Winkler D (2017) De novo drug design-ye olde scoring problem revisited. Mol Inf 36(1–2):1681031

Schneidman-Duhovny D, Inbar Y, Nussinov R, Wolfson HJ (2005) Patchdock and symmdock: servers for rigid and symmetric docking. Nucleic Acids Res 33(suppl-2):W363–W367

Scott DE, Bayly AR, Abell C, Skidmore J (2016) Small molecules, big targets: drug discovery faces the protein-protein interaction challenge. Nat Rev Drug Discovery 15(8):533

Searls DB (2005) Data integration: challenges for drug discovery. Nat Rev Drug Discovery 4(1):45–58

Segler MHS, Kogej T, Tyrchan C, Waller MP (2018) Generating focused molecule libraries for drug discovery with recurrent neural networks. ACS Cent Sci 4(1):120–131

Seoane JA, Aguiar-Pulido V, Munteanu C, Rivero D, Rabunal J, Dorado J, Pazos A (2013) Biomedical data integration in computational drug design and bioinformatics. Curr Comput Aided Drug Des 9(1):108–117

Sharma H, Zerbe N, Klempert I, Hellwich O, Hufnagl P (2017) Deep convolutional neural networks for automatic classification of gastric carcinoma using whole slide images in digital histopathology. Comput Med Imag Graph 61:2–13

Shaughnessy JD Jr, Zhan F, Burington BE, Huang Y, Colla S, Hanamura I, Stewart JP, Kordsmeier B, Randolph C, Williams DR et al (2007) A validated gene expression model of high-risk multiple myeloma is defined by deregulated expression of genes mapping to chromosome 1. Blood 109(6):2276–2284

Shi L, Campbell G, Jones W, Campagne F, Wen Z, Walker S, Su Z, Chu T, Goodsaid F, Pusztai L, et al. (2010) The maqc-ii project: a comprehensive study of common practices for the development and validation of microarray-based predictive models

Shin W-H, Christoffer CW, Kihara D (2017) In silico structure-based approaches to discover protein-protein interaction-targeting drugs. Methods 131:22–32

Sim, DSM (2015) Drug distribution. In: Pharmacological Basis of Acute Care, Springer, Berlin, pp 27–36

Sistare FD, Dieterle F, Troth S, Holder DJ, Gerhold D, Andrews-Cleavenger D, Baer W, Betton G, Bounous D, Carl K et al (2010) Towards consensus practices to qualify safety biomarkers for use in early drug development. Nat Biotechnol 28(5):446–454

Smola AJ, Schölkopf B (2004) A tutorial on support vector regression. Stat Comput 14(3):199–222

Soukup T, Davidson I (2002) Visual data mining: techniques and tools for data visualization and mining. John Wiley & Sons, New Jersey

Spencer M, Eickholt J, Cheng J (2014) A deep learning network approach to ab initio protein secondary structure prediction. IEEE/ACM Trans Comput Biol Bioinf 12(1):103–112

Stokes A, Hum W, Zaslavsky J (2020) A minimal-input multilayer perceptron for predicting drug-drug interactions without knowledge of drug structure. arXiv preprint arXiv:2005.10644

Stork C, Chen Y, Sicho M, Kirchmair J (2019) Hit dexter 2.0: machine-learning models for the prediction of frequent hitters. J Chem Inf Model 59(3):1030–1043

Stork C, Embruch G, Šícho M, de Bruyn Kops C, Chen Y, Svozil D, Kirchmair J (2020) Nerdd: A web portal providing access to in silico tools for drug discovery. Bioinformatics 36(4):1291–1292

Subramanian G, Ramsundar B, Pande V, Denny RA (2016) Computational modeling of \(\beta\) -secretase 1 (bace-1) inhibitors using ligand based approaches. J Chem Inf Model 56(10):1936–1949

Susan K, Stephanie H, Mathias W, Harald P, Binje V, Paul-Albert K, Maria R, Benjamin R, Svenja P, Chen M et al (2017) The target landscape of clinical kinase drugs. Science 358(6367)

Sushko I, Salmina E, Potemkin VA, Poda G, Tetko IV (2012) Toxalerts: a web server of structural alerts for toxic chemicals and compounds with potential adverse reactions

Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP et al (2015) String v10: protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res 43(D1):D447–D452

Talele TT, Khedkar SA, Rigby AC (2010) Successful applications of computer aided drug discovery: moving drugs from concept to the clinic. Curr Top Med Chem 10(1):127–141

Tan J, Hammond JH, Hogan DA, Greene Casey S (2016) Adage-based integration of publicly available pseudomonas aeruginosa gene expression data with denoising autoencoders illuminates microbe-host interactions. MSystems 1(1)

Tasaki S, Suzuki K, Kassai Y, Takeshita M, Murota A, Kondo Y, Ando T, Nakayama Y, Okuzono Y, Takiguchi M et al (2018) Multi-omics monitoring of drug response in rheumatoid arthritis in pursuit of molecular remission. Nat Commun 9(1):1–12

Thomas U, Andreas M, Günter K, Marvin S, Wegner Jörg K, Hugo C, Sepp H (2014) Deep learning as an opportunity in virtual screening. Proc Deep Learn Workshop NIPS 27:1–9

Tian S, Li Y, Wang J, Zhang J, Hou T (2011) Adme evaluation in drug discovery. 9. prediction of oral bioavailability in humans based on molecular properties and structural fingerprints. Mol Pharm 8(3):841–851

Tishby N, Zaslavsky N (2015) Deep learning and the information bottleneck principle. In: 2015 IEEE Information Theory Workshop (ITW). IEEE, pp 1–5

Tsherniak A, Vazquez F, Montgomery PG, Weir BA, Kryukov G, Cowley GS, Gill S, Harrington WF, Pantel S, Krill-Burger JM et al (2017) Defining a cancer dependency map. Cell 170(3):564–576

Turkki R, Linder N, Kovanen PE, Pellinen T, Lundin J (2016) Antibody-supervised deep learning for quantification of tumor-infiltrating immune cells in hematoxylin and eosin stained breast cancer samples. J Pathol Inform 7

Turner JR (2010) New drug development: an introduction to clinical trials. Springer Science & Business Media, Berlin

Vakser IA (2014) Protein-protein docking: From interaction to interactome. Biophys J 107(8):1785–1793

Valkov E, Sharpe T, Marsh M, Greive S, Hyvönen M (2011) Targeting protein–protein interactions and fragment-based drug discovery. In: Fragment-Based Drug Discovery and X-Ray Crystallography. Springer, pp 145–179

Valueva MV, Nagornov NN, Lyakhov PA, Valuev GV, Chervyakov NI (2020) Application of the residue number system to reduce hardware costs of the convolutional neural network implementation. Math Comput Simul

Vamathevan J, Clark D, Czodrowski P, Dunham I, Ferran E, Lee G, Li B, Madabhushi A, Shah P, Spitzer M et al (2019) Applications of machine learning in drug discovery and development. Nat Rev Drug Discovery 18(6):463–477

van Gool AJ, Bietrix F, Caldenhoven E, Zatloukal K, Scherer A, Litton J-E, Meijer G, Blomberg N, Smith A, Mons B et al (2017) Bridging the translational innovation gap through good biomarker practice. Nat Rev Drug Discovery 16(9):587–588

Vaquero-Garcia J, Barrera A, Gazzara MR, Gonzalez-Vallinas J, Lahens NF, Hogenesch JB, Lynch KW, Barash Y (2016) A new view of transcriptome complexity and regulation through the lens of local splicing variations. Elife 5:e11752

Veltri RW, Partin AW, Miller MC (2000) Quantitative nuclear grade (qng): A new image analysis-based biomarker of clinically relevant nuclear structure alterations. J Cell Biochem 79(S35):151–157

Venkatesan R, Li B (2017) Convolutional neural networks in visual computing: a concise guide. CRC Press, London

Vinod CSS, Anad Hareendran S (2021) Artificial intelligence: a practitioner’s approach. PHI Learning Pvt Ltd, Delhi

Vinod CSS, Anand Hareendran S (2021) Machine learning: a practitioner’s approach. PHI Learning Pvt Ltd, Delhi

Visibelli A, Bongini P, Rossi A, Niccolai N, Bianchini M (2020) A deep attention network for predicting amino acid signals in the formation of [formula: see text]-helices. J Bioinform Comput Biol:2050028

Vohora D, Singh G (2018) Pharmaceutical medicine and translational clinical research. Academic Press, London

Volkamer A, Kuhn D, Grombacher T, Rippmann F, Rarey M (2012) Combining global and local measures for structure-based druggability predictions. J Chem Inf Model 52(2):360–372

Voosen P (2017) The ai detectives

Vranic S, Shimada Y, Ichihara S, Kimata M, Wenting W, Tanaka T, Boland S, Tran L, Ichihara G (2019) Toxicological evaluation of sio2 nanoparticles by zebrafish embryo toxicity test. Int J Mol Sci 20(4):882

Wang N-N, Dong J, Deng Y-H, Zhu M-F, Wen M, Yao Z-J, Ai-Ping L, Wang J-B, Cao D-S (2016) Adme properties evaluation in drug discovery: prediction of caco-2 cell permeability using a combination of nsga-ii and boosting. J Chem Inf Model 56(4):763–773

Wang Q, Feng YH, Huang JC, Wang TJ, Cheng GQ (2017) A novel framework for the identification of drug target proteins: Combining stacked auto-encoders with a biased support vector machine. PLoS ONE 12(4):e0176486

Wang D, Jin G (2018) Vasc: dimension reduction and visualization of single-cell rna-seq data by deep variational autoencoder. Genom Proteom Bioinform 16(5):320–331

Wang C, Kurgan L (2020) Survey of similarity-based prediction of drug-protein interactions. Curr Med Chem 27(35):5856–5886

Wang S, Sun S, Li Z, Zhang R, Jinbo X (2017) Accurate de novo prediction of protein contact map by ultra-deep learning model. PLoS Comput Biol 13(1):e1005324

Wang W, Yang S, Zhang X, Li J (2014) Drug repositioning by integrating target information through a heterogeneous network model. Bioinformatics 30(20):2923–2930

Wang B, Zhu J, Pierson E, Ramazzotti D, Batzoglou S (2017) Visualization and analysis of single-cell rna-seq data by kernel-based similarity learning. Nat Methods 14(4):414–416

Warmuth MK, Liao J, Rätsch G, Mathieson M, Putta S, Lemmen C (2003) Active learning with support vector machines in the drug discovery process. J Chem Inf Comput Sci 43(2):667–673

Way GP, Greene CS (2017) Extracting a biologically relevant latent space from cancer transcriptomes with variational autoencoders. BioRxiv, p 174474

Webb AR (2003) Statistical pattern recognition. John Wiley & Sons, New Jersy

Willett P (2006) Similarity-based virtual screening using 2d fingerprints. Drug Discovery Today 11(23–24):1046–1053

Xia Z, Wu L-Y, Zhou X, Wong STC (2010) Semi-supervised drug-protein interaction prediction from heterogeneous biological spaces. In: BMC systems biology, vol 4. BioMed Central, pp 1–16

Xing J, Wenchao L, Liu R, Wang Y, Xie Y, Zhang H, Shi Z, Jiang H, Liu Y-C, Chen K et al (2017) Machine-learning-assisted approach for discovering novel inhibitors targeting bromodomain-containing protein 4. J Chem Inf Model 57(7):1677–1690

Xue LC, Dobbs D, Bonvin AMJJ, Honavar V (2015) Computational prediction of protein interfaces: A review of data driven methods. FEBS Lett 589(23):3516–3526

Yamanishi Y, Araki M, Gutteridge A, Honda W, Kanehisa M (2008) Prediction of drug–target interaction networks from the integration of chemical and genomic spaces. Bioinformatics 24(13):i232–i240

Yavuz BÇ, Yurtay N, Ozkan O (2018) Prediction of protein secondary structure with clonal selection algorithm and multilayer perceptron. IEEE Access 6:45256–45261

Youjun X, Pei J, Lai L (2017) Deep learning based regression and multiclass models for acute oral toxicity prediction with automatic chemical feature extraction. J Chem Inf Model 57(11):2672–2685

Zaretzki J, Matlock M, Swamidass SJ (2013) Xenosite: accurately predicting cyp-mediated sites of metabolism with neural networks. J Chem Inf Model 53(12):3373–3383

Zeng X, Zhu S, Weiqiang L, Liu Z, Huang J, Zhou Y, Fang J, Huang Y, Guo H, Li L et al (2020) Target identification among known drugs by deep learning from heterogeneous networks. Chem Sci 11(7):1775–1797

Zernov VV, Balakin KV, Ivaschenko AA, Savchuk NP, Pletnev IV (2003) Drug discovery using support vector machines. the case studies of drug-likeness, agrochemical-likeness, and enzyme inhibition predictions. J Chem Inf Comput Sci 43(6):2048–2056

Zhan F, Barlogie B, Mulligan G, Shaughnessy JD Jr, Bryant B (2008) High-risk myeloma: a gene expression-based risk-stratification model for newly diagnosed multiple myeloma treated with high-dose therapy is predictive of outcome in relapsed disease treated with single-agent bortezomib or high-dose dexamethasone. Blood J Am Soc Hematol 111(2):968–969

Zhan F, Huang Y, Colla S, Stewart JP, Hanamura I, Gupta S, Epstein J, Yaccoby S, Sawyer J, Burington B et al (2006) The molecular classification of multiple myeloma. Blood 108(6):2020–2028

Zhang QC, Petrey D, Norel R, Honig BH (2010) Protein interface conservation across structure space. Proc Natl Acad Sci 107(24):10896–10901

Zhavoronkov A, Ivanenkov YA, Aliper A, Veselov MS, Aladinskiy VA, Aladinskaya AV, Terentiev VA, Polykovskiy DA, Kuznetsov MD, Asadulaev A et al (2019) Deep learning enables rapid identification of potent ddr1 kinase inhibitors. Nat Biotechnol 37(9):1038–1040

Zhou H, Gao M, Skolnick J (2015) Comprehensive prediction of drug-protein interactions and side effects for the human proteome. Sci Rep 5(1):1–13

Zsoldos Z, Reid D, Simon A, Sadjad SB, Johnson AP (2007) ehits: a new fast, exhaustive flexible ligand docking system. J Mol Graph Model 26(1):198–212

Download references

Author information

Authors and affiliations.

Department of Computer Science and Engineering, B V Raju Institute of Technology, Narsapur, Medak, 502313, Telangana, India

Suresh Dara, Swetha Dhamercherla & CH Madhu Babu

Centre for Molecular Cancer Research (CMCR) and Vishnu Institute of Pharmaceutical Education and Research (VIPER), Narsapur, Medak, 502313, Telangana, India

Surender Singh Jadav

Department of Pharmaceutical Chemistry, Maharishi Arvind College of Pharmacy, Jaipur, 302023, Rajasthan, India

Mohamed Jawed Ahsan

Corresponding authors

Correspondence to Suresh Dara or Surender Singh Jadav .

About this article

Dara, S., Dhamercherla, S., Jadav, S.S. et al. Machine Learning in Drug Discovery: A Review. Artif Intell Rev 55 , 1947–1999 (2022). https://doi.org/10.1007/s10462-021-10058-4

Published : 11 August 2021

Issue Date : March 2022

DOI : https://doi.org/10.1007/s10462-021-10058-4

  • Artificial intelligence
  • Drug discovery
  • Machine learning
  • Target validation
  • Prognostic biomarkers
  • Digital pathology

Title: Generative AI for Architectural Design: A Literature Review

Abstract: Generative Artificial Intelligence (AI) has pioneered new methodological paradigms in architectural design, significantly expanding the innovative potential and efficiency of the design process. This paper explores the extensive applications of generative AI technologies in architectural design, a trend that has benefited from the rapid development of deep generative models. This article provides a comprehensive review of the basic principles of generative AI and large-scale models and highlights the applications in the generation of 2D images, videos, and 3D models. In addition, by reviewing the latest literature from 2020, this paper scrutinizes the impact of generative AI technologies at different stages of architectural design, from generating initial architectural 3D forms to producing final architectural imagery. The marked trend of research growth indicates an increasing inclination within the architectural design community towards embracing generative AI, thereby catalyzing a shared enthusiasm for research. These research cases and methodologies have not only proven to enhance efficiency and innovation significantly but have also posed challenges to the conventional boundaries of architectural creativity. Finally, we point out new directions for design innovation and articulate fresh trajectories for applying generative AI in the architectural domain. This article provides the first comprehensive literature review about generative AI for architectural design, and we believe this work can facilitate more research work on this significant topic in architecture.

IMAGES

  1. (PDF) Discovery vs Inquiry Learning Model

  2. (PDF) Online tools to support literature-based discovery in the life

  3. (PDF) Smart literature review: a practical topic modelling approach to

  4. literature review article examples Sample of research literature review

  5. Literature Review For Qualitative Research

  6. (PDF) Evaluation of Literature-Based Discovery Systems

COMMENTS

  1. PDF The Effect of Discovery Learning Method Application on Increasing ...

    B. Literature Review: Discovery Learning is a learning method that encourages students to ask questions and formulate their own tentative answers, and to deduce general principles from practical examples or experiences. Another definition states that Discovery Learning is a learning situation in which the principal content of what is to be learned ...

  2. A LITERATURE REVIEW ON 'DISCOVERY LEARNING'

    This research aims to determine how applying discovery learning models increases students' interest and learning outcomes on harmonic vibration material at MAN 4 Aceh Besar. The method is quasi-experimental, with a pretest-posttest control group design. The instruments used are questionnaires and problems.

  3. The potential of discovery learning models to empower students

    The method is qualitative, drawing mainly on a literature review of discovery learning models and critical thinking skills. The analysis of the literature yields a discovery learning model with orientation, hypothesis generation, hypothesis testing, conclusion, and regulation stages.

  4. The role of guidance in children's discovery learning

    Though traditional views emphasize a lack of instructional constraint or scaffolding, more recent evidence suggests that guidance should be included in the process of discovery learning. The present review summarizes three general approaches which have been shown to facilitate guided discovery learning: (1) strategic presentation of materials ...

  5. Full article: Learners' challenges in understanding and performing

    2. Inquiry-based learning (IBL) and experiments. IBL has been considered an essential component of science education for more than 50 years (Stender et al., 2018). Due to its long-lasting importance and the extensive literature on the topic, the term IBL is accompanied by a variety of descriptions and connotations (Abrams et al., 2008; Anderson, 2002; Blanchard et ...

  6. PDF DISCOVERY LEARNING STRATEGIES IN ENGLISH

    feedback, worked examples support discovery learning. Mc Donold Betty (2011) suggests that discovery learning is more effective in a collaborative atmosphere in which students share with each other. A review of the literature suggests that discovery learning occurs whenever the learner is not provided with the target information or

  7. The potential of discovery learning models to

    The method is qualitative, drawing mainly on a literature review of discovery learning models and critical thinking skills. The analysis of the literature yields a discovery learning model with orientation, hypothesis generation, hypothesis testing, conclusion, and regulation stages.

  8. Inquiry-Based Learning: A Review of the Research Literature

    Inspiring Education emphasizing inquiry-based learning has a long ancestry in the West. This spirit of inquiry has a strong historical antecedent in ...

  9. Discovery Learning for the 21st Century: What is it and how does it

    The scope of this review includes literature that defines discovery learning, outlines the theoretical and historical basis for discovery learning, describes practice and applications, and describes WebQuests as a current technologically-based application of discovery learning. This review includes the following topics: § A definition of ...

  10. The potential of discovery learning models to empower students

    Critical thinking skills have become the competencies of educational goals. This article aims to examine the potential of discovery learning models that are applied in science learning to empower students' critical thinking skills. The method used is qualitative with the main source of literature review about discovery learning models and critical thinking skills. The results of the analysis ...

  11. PDF The Use of Discovery Learning in Improving Students ...

    The Use of Discovery Learning in Improving Students' Critical Thinking Ability (A Literature Review). Ida Ayu Made Trisna Dwi Jayanti, Ganesha Education University, Bali. Abstract: Critical thinking ability is an essential ability that students need to compete in the 21st century.

  12. PDF Discovery Learning Strategy in Geographical Education: A Sample of

    Review of International Geographical Education Online ©RIGEO Volume 9, Number 3, Winter 2019 ... covered productively and an environment can be achieved for sustainable learning. Examining the relevant literature, it can be seen that the studies as to how discovery ... Discovery learning strategy is a motivating strategy that is accomplished ...

  13. Learning by Discovery: A Critical Review of Studies: The Journal of

    In this review an attempt is made to determine what findings can be drawn from discovery learning experiments. The results of these experiments are conflicting and often insignificant, but they tend to favor discovery learning methods compared to other teaching methods. However, many results are suspect due to limitations in experimental design ...

  14. Systematic review of adaptive learning research designs, context

    This systematic review of research on adaptive learning used a strategic search process to synthesize research on adaptive learning based on publication trends, instructional context, research methodology components, research focus, adaptive strategies, and technologies. A total of 61 articles on adaptive learning were analyzed to describe the current state of research and identify gaps in the ...

  15. PDF Students' Perception of the Discovery learning Strategy in ...

    Consequently, discovery learning promotes the development of reading comprehension skills. Several studies have been conducted on the implementation of discovery learning in the development of students' language skills in general and reading skills in particular. Mahmoud (2014) reported that the discovery learning strategy is a good way to improve language ...

  16. A Historical Review of Collaborative Learning and Cooperative Learning

    Abstract. Collaborative learning and cooperative learning are two separate approaches developed independently by two groups of scholars around the same period of time in the 1960s and 1970s. Due to their different origins and intertwined paths of development, they have their own distinct features while sharing many similarities.

  17. Literature Review: Implementasi Model Pembelajaran Discovery-learning

    Nurdin, K., Muh, H. S., & Muhammad, M. H. (2019). The implementation of inquiry-discovery learning. IDEAS: Journal on English Language Teaching and Learning, Linguistics and Literature, 7(1). Suendartia, M. (2017). The effect of the learning discovery model on the learning outcomes of natural science of junior high school students in Indonesia.

  18. Systematic Literature Review : Penelitian Discovery Learning

    The results are: (1) the three learning models studied (problem-based learning, discovery learning, and open-ended) had a positive effect on the ability to solve mathematical ...

  19. PDF Inquiry-Based Learning: A Review of the Research Literature

    As education in Alberta is organized around the three E's of 21st century learning, a shift will occur from disseminating information and recalling facts toward developing particular competencies. Teachers will cultivate the natural curiosities of students and plant the seeds of life-long learning.

  20. Review Leveraging machine learning for automatic topic discovery and

    For instance, statistical relational learning supports the discovery of process models, and frequent itemset mining supports organizational mining. ... Utilizing domain knowledge in data-driven process discovery: A literature review. Computers in Industry, 137 (2022), Article 103612.

  21. Literature-based discovery approaches for evidence-based healthcare: a

    Purpose. Literature-Based Discovery (LBD) is a text mining technique used to generate novel hypotheses from vast amounts of literature sources, by identifying links between concepts from disparate sources. One of the main areas where it has been predominantly applied is the healthcare domain, whereby promising results, in the form of novel ...

  22. Deep learning in drug discovery: an integrative review and future

    The remainder of this review paper is organized as follows: Sect. 2 presents a review of related studies; Sect. 3 gives an overview of the various DL techniques. Section 4 presents the organization of DL applications in drug discovery problems by explaining each drug discovery problem category, and gives a literature review of the DL techniques used. Section 5 discusses the numerous benchmark ...

  23. Machine Learning in Drug Discovery: A Review

    This review surveys the literature on drug discovery through ML tools and techniques that are applied in every phase of drug development to accelerate the research process and reduce the risk and expenditure in clinical trials. Machine learning techniques improve the decision-making in pharmaceutical data across various applications like QSAR analysis, hit discoveries, de novo drug ...

  24. Is Meta-training Really Necessary for Molecular Few-Shot Learning

    Few-shot learning has recently attracted significant interest in drug discovery, with a recent, fast-growing literature mostly involving convoluted meta-learning strategies. We revisit the more straightforward fine-tuning approach for molecular data, and propose a regularized quadratic-probe loss based on the Mahalanobis distance. We design a dedicated block-coordinate descent optimizer ...

  25. Generative AI for Architectural Design: A Literature Review

    This article provides the first comprehensive literature review about generative AI for architectural design, and we believe this work can facilitate more research work on this significant topic in architecture. Comments: 32 pages, 20 figures. Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI) Cite as: arXiv:2404.01335 [cs.LG]