GLASS 2.0

About GLASS 2.0

A comprehensive database for protein-ligand interactions with curated experimental data and machine learning-ready datasets.

What is GLASS 2.0?

GLASS 2.0 (G-protein coupled receptor Ligand ASsociated dataSet) is a comprehensive database of standardized, experimentally validated GPCR-ligand interaction data, optimized for modern computational and AI research. Built on extensive database integration and large language model (LLM)-powered text mining of 41M PubMed articles, GLASS 2.0 is 53% larger than its predecessor.

Our database employs rigorous data standardization and deduplication measures to harmonize experimental parameters, values, and units, substantially enhancing data consistency and usability. With 106% more ligands than GPCRdb and carefully curated positive/negative samples, GLASS 2.0 facilitates the development of deep learning algorithms for drug discovery.
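The kind of unit harmonization and deduplication described above can be sketched as follows. This is a minimal illustration, not the GLASS 2.0 pipeline: the record fields (`protein`, `ligand`, `parameter`, `value`, `unit`), the choice of nM as the common unit, and the use of the median to merge replicate measurements are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import median

# Conversion factors to nanomolar (nM is an assumed target unit for this sketch).
TO_NM = {"pM": 1e-3, "nM": 1.0, "uM": 1e3, "mM": 1e6, "M": 1e9}

def standardize(record):
    """Return a copy of the record with its value expressed in nM."""
    factor = TO_NM[record["unit"]]
    return {**record, "value": record["value"] * factor, "unit": "nM"}

def deduplicate(records):
    """Merge records sharing (protein, ligand, parameter) by taking the median value."""
    groups = defaultdict(list)
    for rec in map(standardize, records):
        key = (rec["protein"], rec["ligand"], rec["parameter"])
        groups[key].append(rec["value"])
    return {key: median(values) for key, values in groups.items()}

# Hypothetical records: the first two report the same measurement in different units.
records = [
    {"protein": "P1", "ligand": "L1", "parameter": "Ki", "value": 0.5, "unit": "uM"},
    {"protein": "P1", "ligand": "L1", "parameter": "Ki", "value": 500.0, "unit": "nM"},
    {"protein": "P1", "ligand": "L2", "parameter": "IC50", "value": 2.0, "unit": "nM"},
]
merged = deduplicate(records)
```

Converting to one unit before grouping is what lets the two `Ki` records collapse into a single entry; without that step they would look like distinct values.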

Key Improvements in GLASS 2.0

  • 53% increase in experimental records (1.14M data points)
  • Advanced data standardization & deduplication
  • AI-optimized datasets for ML research
  • LLM-powered literature mining integration
  • Enhanced drug-like molecule coverage (303K)
  • Free downloadable curated datasets

Database Statistics

Current data coverage and scope

  • Proteins: 3,308
  • Ligands: 458,292
  • Interactions: 890,209
  • Data points: 1,147,227

How to Use GLASS 2.0

Navigate and utilize our comprehensive GPCR-ligand interaction database

Interactive Web Interface

1. Search & Browse

Search GPCRs, ligands, or interactions by name, ID, or molecular properties using our advanced filtering system.

2. Explore Details

View detailed protein information, 2D/3D ligand structures, experimental data, and drug-likeness assessments.

3. Analyze & Visualize

Use interactive molecular viewers and QED drug-likeness visualizations to analyze binding data.

Dataset Downloads

1. Complete Datasets

Download full database dumps in TSV and CSV formats with standardized experimental values and identifiers.

2. ML-Ready Data

Access curated classification (607K points) and regression (564K points) datasets optimized for AI research.

3. Structure Files

Download protein structures (PDB) and computationally generated ligand structures (SDF).
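As a rough illustration of working with a tab-separated download, the sketch below parses rows and derives classification labels and regression targets from them. The column names (`protein_id`, `ligand_id`, `parameter`, `value_nM`), the inline sample rows, and the 10 µM activity cutoff are hypothetical and are not taken from the GLASS 2.0 files. The ML-ready datasets are already curated, so this only shows the general idea.

```python
import csv
import io
import math

# Inline stand-in for a downloaded TSV file; column names are hypothetical.
SAMPLE_TSV = (
    "protein_id\tligand_id\tparameter\tvalue_nM\n"
    "P1\tL1\tKi\t10\n"
    "P1\tL2\tKi\t50000\n"
    "P2\tL3\tIC50\t1000\n"
)

ACTIVE_THRESHOLD_NM = 10_000.0  # assumed cutoff (10 uM), not a GLASS 2.0 setting

def load_rows(handle):
    """Parse tab-separated rows into dicts with a float affinity value."""
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        row["value_nM"] = float(row["value_nM"])
        rows.append(row)
    return rows

def to_classification(rows):
    """Label each pair 1 (active) or 0 (inactive) using the assumed threshold."""
    return [
        (r["protein_id"], r["ligand_id"], int(r["value_nM"] <= ACTIVE_THRESHOLD_NM))
        for r in rows
    ]

def to_regression(rows):
    """Convert nM values to a negative log10 molar scale (9 - log10(nM))."""
    return [
        (r["protein_id"], r["ligand_id"], 9.0 - math.log10(r["value_nM"]))
        for r in rows
    ]

rows = load_rows(io.StringIO(SAMPLE_TSV))
labels = to_classification(rows)
targets = to_regression(rows)
```

To read a real download instead of the inline sample, pass an open file handle to `load_rows` and adjust the column names to match the file's header.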

Quick Start

Get started exploring GPCR-ligand interactions