Eli Lilly signs $2.75 billion deal with AI drug developer Insilico Medicine

March 29, 2026

Learn how to use AI-powered tools for drug discovery by analyzing molecular structures and predicting drug properties using Python and machine learning.

Introduction

In this tutorial, you'll learn the basic building blocks behind AI-assisted drug discovery: representing molecules in code, calculating molecular properties, and fitting a simple predictive model. Companies like Eli Lilly and Insilico Medicine apply far more sophisticated versions of these ideas at scale. You'll start with a simple Python environment and work through hands-on exercises that use RDKit and scikit-learn to analyze molecular structures.

Prerequisites

Before starting this tutorial, you'll need:

  • A computer with internet access
  • Python 3.7 or higher installed
  • Basic understanding of molecular structures and chemistry concepts
  • Access to a code editor (like VS Code or Jupyter Notebook)

Why these prerequisites? We'll be using Python libraries to analyze molecular data, so having Python installed is essential. Understanding basic chemistry helps you interpret the results, and a code editor will let you write and run the code easily.

Step-by-Step Instructions

1. Set Up Your Python Environment

First, we need to install the required Python packages. Open your terminal or command prompt and run:

pip install rdkit
pip install scikit-learn
pip install pandas
pip install numpy

Why this step? These packages provide the tools we need to work with molecular data (RDKit), perform machine learning (scikit-learn), and handle data analysis (pandas, numpy).
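Before moving on, it can help to confirm that the packages are visible to the Python interpreter you plan to use. The following is a minimal sketch using only the standard library; the helper name find_missing is hypothetical and not part of any of the installed packages. Note that import names can differ from pip package names (scikit-learn is imported as sklearn).

```python
import importlib.util

# Import names for the packages installed above.
# scikit-learn is imported as "sklearn", not "scikit-learn".
REQUIRED_MODULES = ["rdkit", "sklearn", "pandas", "numpy"]


def find_missing(modules):
    """Return the module names that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are installed.")
```

If anything is reported as missing, re-run the matching pip command in the same environment that runs your script.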

2. Import Required Libraries

Create a new Python file and start by importing the necessary libraries:

from rdkit import Chem
from rdkit.Chem import Descriptors
from rdkit.Chem import AllChem
import pandas as pd
import numpy as np

Why this step? These imports give us access to molecular manipulation functions, chemical descriptors, and data handling tools that we'll use throughout the tutorial.

3. Create a Simple Molecule

Now, let's create a simple molecule to work with. We'll use aspirin as an example:

# Create a molecule from SMILES notation
smiles = 'CC(=O)OC1=CC=CC=C1C(=O)O'
mol = Chem.MolFromSmiles(smiles)

# Check if the molecule was created successfully
if mol is not None:
    print('Molecule created successfully!')
    print('Canonical SMILES:', Chem.MolToSmiles(mol))
else:
    print('Failed to create molecule')

Why this step? SMILES (Simplified Molecular Input Line Entry System) is a standard way to represent molecular structures. This lets us work with real chemical compounds in our AI analysis.
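In practice you will often parse more than one SMILES string, and some of them may be malformed. The sketch below shows one way to separate valid from invalid inputs and to print a true molecular formula with RDKit's CalcMolFormula. The helper name parse_smiles and the "broken" entry are made up for illustration.

```python
from rdkit import Chem, RDLogger
from rdkit.Chem import rdMolDescriptors

# Silence RDKit's parser messages for the deliberately invalid entry
RDLogger.DisableLog("rdApp.*")

candidates = {
    "aspirin": "CC(=O)OC1=CC=CC=C1C(=O)O",
    "ethanol": "CCO",
    "broken": "C1CC(",  # hypothetical malformed input: unclosed ring and branch
}


def parse_smiles(named_smiles):
    """Split a name -> SMILES mapping into parsed molecules and failed names."""
    parsed, failed = {}, []
    for name, smi in named_smiles.items():
        mol = Chem.MolFromSmiles(smi)  # returns None when parsing fails
        if mol is None:
            failed.append(name)
        else:
            parsed[name] = mol
    return parsed, failed


parsed, failed = parse_smiles(candidates)
for name, mol in parsed.items():
    print(name, rdMolDescriptors.CalcMolFormula(mol), Chem.MolToSmiles(mol))
print("Could not parse:", failed)
```

Checking for None after every MolFromSmiles call is the important habit here, because RDKit signals a parse failure by returning None rather than raising an exception.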

4. Calculate Molecular Properties

Next, we'll calculate basic molecular properties that AI models use for drug discovery:

# Calculate key descriptors
mw = Descriptors.MolWt(mol)
logp = Descriptors.MolLogP(mol)
num_h_donors = Descriptors.NumHDonors(mol)
num_h_acceptors = Descriptors.NumHAcceptors(mol)

print(f'Molecular Weight: {mw:.2f}')
print(f'LogP (lipophilicity): {logp:.2f}')
print(f'Hydrogen Donors: {num_h_donors}')
print(f'Hydrogen Acceptors: {num_h_acceptors}')

Why this step? These molecular descriptors are features that AI models use to predict how well a compound might work as a drug. They include molecular weight, lipophilicity (LogP, the octanol-water partition coefficient, which is related to but not the same as water solubility), and hydrogen bonding capabilities.
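One common way these four descriptors are used together is Lipinski's rule of five, a rough screen for oral drug-likeness. The tutorial itself does not use this rule, so treat the following as an added illustration. The function name is hypothetical, and the example numbers are illustrative values rather than RDKit output.

```python
def rule_of_five_violations(mw, logp, h_donors, h_acceptors):
    """Return the list of Lipinski rule-of-five criteria a compound violates."""
    violations = []
    if mw > 500:
        violations.append("molecular weight > 500")
    if logp > 5:
        violations.append("LogP > 5")
    if h_donors > 5:
        violations.append("more than 5 hydrogen bond donors")
    if h_acceptors > 10:
        violations.append("more than 10 hydrogen bond acceptors")
    return violations


# Illustrative descriptor values for a small molecule and a large, greasy one
small = rule_of_five_violations(mw=180.16, logp=1.5, h_donors=1, h_acceptors=2)
large = rule_of_five_violations(mw=650.0, logp=6.2, h_donors=6, h_acceptors=12)

print("Small molecule violations:", small)
print("Large molecule violations:", large)
```

You could pass the mw, logp, num_h_donors, and num_h_acceptors values computed above straight into this function. The rule is a heuristic, not a guarantee: some approved drugs violate it.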

5. Generate 3D Structure

AI drug discovery often uses 3D molecular structures. Let's generate one:

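# Generate 3D coordinates on a copy with explicit hydrogens,
# which RDKit needs to produce realistic geometries
mol_3d = Chem.AddHs(mol)
conf_id = AllChem.EmbedMolecule(mol_3d, randomSeed=42)  # -1 means embedding failed

if conf_id != -1:
    AllChem.UFFOptimizeMolecule(mol_3d)  # relax the geometry with the UFF force field
    print('3D molecule generated successfully!')
    # SMILES cannot store coordinates, so print a MOL block instead
    print(Chem.MolToMolBlock(mol_3d))
else:
    print('3D embedding failed')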

Why this step? 3D structures are crucial for understanding how molecules interact with proteins. This is where AI models get detailed spatial information to predict drug-target interactions.
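To see what "spatial information" means concretely, the sketch below embeds aspirin in 3D, then reads atom coordinates and one bond length from the resulting conformer. It is self-contained and uses a fixed random seed so the embedding is repeatable; the choice of seed and of which atoms to inspect is arbitrary.

```python
from rdkit import Chem
from rdkit.Chem import AllChem, rdMolTransforms

# Build aspirin with explicit hydrogens, which 3D embedding expects
mol_h = Chem.AddHs(Chem.MolFromSmiles("CC(=O)OC1=CC=CC=C1C(=O)O"))

conf_id = AllChem.EmbedMolecule(mol_h, randomSeed=42)
if conf_id == -1:
    raise RuntimeError("3D embedding failed")
AllChem.UFFOptimizeMolecule(mol_h)

conf = mol_h.GetConformer(conf_id)

# Print x, y, z coordinates (in angstroms) for the first three atoms
for idx in range(3):
    pos = conf.GetAtomPosition(idx)
    symbol = mol_h.GetAtomWithIdx(idx).GetSymbol()
    print(f"{idx} {symbol}: ({pos.x:.3f}, {pos.y:.3f}, {pos.z:.3f})")

# Distance between atoms 0 and 1, the methyl carbon and the carbonyl carbon
cc_length = rdMolTransforms.GetBondLength(conf, 0, 1)
print(f"C-C bond length: {cc_length:.3f} angstroms")
```

Coordinates like these are the kind of input that structure-based methods work from, whereas the descriptors in the previous step only summarize the 2D graph.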

6. Create a Simple AI Model for Property Prediction

Now we'll create a basic predictive model using the molecular properties we've calculated:

# Create sample data for our model
# In a real scenario, this would come from a large dataset
sample_data = {
    'mw': [180.16, 225.28, 150.16],
    'logp': [1.5, 2.0, 0.8],
    'h_donors': [1, 2, 0],
    'h_acceptors': [2, 3, 1],
    'solubility': [0.01, 0.05, 0.1]
}

df = pd.DataFrame(sample_data)
print(df)

Why this step? This simulates how AI models are trained on large datasets of molecules with known properties. In real drug discovery, AI models learn from millions of compounds to predict new ones.
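Rather than typing descriptor values by hand, a feature table is usually computed from a list of SMILES strings. The following is a minimal sketch of that idea. The featurize helper and the three example molecules are assumptions for illustration, and no solubility column is included because the source provides no measured values for these molecules.

```python
import pandas as pd
from rdkit import Chem
from rdkit.Chem import Descriptors


def featurize(smiles):
    """Compute the tutorial's four descriptors for one SMILES string."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None  # skip strings RDKit cannot parse
    return {
        "smiles": smiles,
        "mw": Descriptors.MolWt(mol),
        "logp": Descriptors.MolLogP(mol),
        "h_donors": Descriptors.NumHDonors(mol),
        "h_acceptors": Descriptors.NumHAcceptors(mol),
    }


# Aspirin, ethanol, and benzene as small example inputs
smiles_list = ["CC(=O)OC1=CC=CC=C1C(=O)O", "CCO", "c1ccccc1"]
rows = [row for row in (featurize(s) for s in smiles_list) if row is not None]

features = pd.DataFrame(rows)
print(features)
```

In a real project, the target column (for example, measured solubility) would be joined to this table from an experimental dataset.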

7. Train a Simple Prediction Model

Let's train a basic machine learning model to predict solubility based on our molecular properties:

# Prepare the data
X = df[['mw', 'logp', 'h_donors', 'h_acceptors']]
y = df['solubility']

# Simple linear regression model
from sklearn.linear_model import LinearRegression
model = LinearRegression()
model.fit(X, y)

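# Make a prediction for our aspirin molecule
# (a DataFrame with the training column names avoids a feature-name warning)
aspirin_features = pd.DataFrame(
    [[mw, logp, num_h_donors, num_h_acceptors]], columns=X.columns
)
prediction = model.predict(aspirin_features)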

print(f'Predicted solubility for aspirin: {prediction[0]:.4f}')

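Why this step? This demonstrates how AI models in drug discovery can predict properties of new compounds before they're synthesized, which saves time and resources in the development process. Keep in mind that this toy model has only three training rows and four features, so it can fit its training data perfectly and its prediction for aspirin is not chemically meaningful. Real models need far more data and validation on compounds held out from training.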
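A quick way to see the limits of such a small dataset is to score the model on the same rows it was trained on. The sketch below refits the three sample rows from this tutorial and checks the training error. With fewer rows than features, linear regression can reproduce the training targets exactly, so a perfect score here says nothing about how the model would do on new molecules.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# The three illustrative rows from the tutorial's sample table
df = pd.DataFrame({
    "mw": [180.16, 225.28, 150.16],
    "logp": [1.5, 2.0, 0.8],
    "h_donors": [1, 2, 0],
    "h_acceptors": [2, 3, 1],
    "solubility": [0.01, 0.05, 0.1],
})

X = df[["mw", "logp", "h_donors", "h_acceptors"]]
y = df["solubility"]

model = LinearRegression().fit(X, y)

# Evaluate on the training rows themselves
train_pred = model.predict(X)
max_abs_error = float(np.max(np.abs(train_pred - y.to_numpy())))
train_r2 = model.score(X, y)

print(f"Largest training error: {max_abs_error:.2e}")
print(f"Training R^2: {train_r2:.6f}")
```

A more honest estimate would come from a larger dataset split into training and test sets, for example with scikit-learn's train_test_split.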

8. Visualize Your Molecule

Finally, let's visualize our molecule to understand its structure:

# Install additional visualization package
# pip install matplotlib

import matplotlib.pyplot as plt
from rdkit.Chem import Draw

# Draw the molecule
img = Draw.MolToImage(mol, size=(300, 300))
plt.figure(figsize=(5, 5))
plt.imshow(img)
plt.axis('off')
plt.title('Aspirin Molecule')
plt.show()

Why this step? Visualizing molecules helps you confirm that a structure was parsed as intended and relate the numbers a model sees to recognizable chemistry. The drawing is for human inspection; the model itself works from descriptors or other numerical representations.
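If you are running on a server or in a script where no plot window can open, you can write the drawing directly to an image file instead. This is a small sketch using RDKit's Draw.MolToFile; the output location in the system temporary directory is an arbitrary choice for illustration.

```python
import os
import tempfile

from rdkit import Chem
from rdkit.Chem import Draw

mol = Chem.MolFromSmiles("CC(=O)OC1=CC=CC=C1C(=O)O")

# Hypothetical output location; change this to wherever you keep figures
out_path = os.path.join(tempfile.gettempdir(), "aspirin.png")

# Render the 2D depiction straight to a PNG file, no matplotlib required
Draw.MolToFile(mol, out_path, size=(300, 300))
print("Saved drawing to", out_path)
```

The saved file can then be embedded in a report or compared side by side with drawings of other candidates.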

Summary

In this tutorial, you've learned how to work with AI-driven drug discovery tools by:

  1. Setting up a Python environment with molecular analysis libraries
  2. Creating and analyzing molecular structures using SMILES notation
  3. Calculating key molecular properties that AI models use
  4. Building a simple machine learning model to predict molecular properties
  5. Visualizing molecular structures

These steps are a highly simplified version of the kind of workflow behind AI drug discovery efforts such as the $2.75 billion partnership between Eli Lilly and Insilico Medicine. You've now seen the basics of how machine learning can help prioritize drug candidates before they are synthesized, with the aim of making discovery faster and more efficient than purely experimental screening.

Source: The Decoder
