Data-Driven Discovery of Models

Automating Data Science and Predicting Behavior through Empirical Models

What is D3M?

The DARPA Data-Driven Discovery of Models (D3M) program automates data science methods so that domain experts can incorporate their knowledge into the modeling process and create meaningful, valid predictive models of real, complex processes without relying on expert data scientists.

Getting Started

Analytic Platforms for Domain SMEs

Three fully integrated, intuitive, interactive platforms that let domain SMEs curate, select, edit, and explain: (1) data and problems, (2) features and relationships, and (3) models.

Einblick, founded on years of research at MIT and Brown, is changing the way people work and play with data by providing a fast, collaborative approach to understanding the past, predicting the future, and optimizing decisions.

TwoRavens is a machine learning platform that allows a domain expert, working in concert with the system, to complete a high-quality, predictive, and interpretable model without any statistical or machine learning expertise. To do so, the system supports intuitive machine learning and model interpretation, model discovery, and data exploration.

Distil is a mixed-initiative modeling workbench developed by Uncharted Software. Through an interactive analytic-question-first workflow, it enables subject matter experts to discover underlying dynamics of complex systems and generate data-driven models from tabular, time series, image and multispectral satellite image datasets.

AutoML Engines for Data Scientists

Automated machine learning engines that quickly discover pipelines that outperform human experts. They support 20+ problem types and are built on an extensible machine learning library of over 300 automatically discoverable modeling primitives.


AutonML is an automated machine learning system developed by the Carnegie Mellon University Auton Lab to power data scientists with efficient model discovery and advanced data analytics. “AutonML takes your machine learning capacity to the nth power.”

AlphaD3M is an AutoML system, developed by the VIDA Lab (NYU), that automatically searches for models and derives end-to-end pipelines. AlphaD3M leverages recent advances in deep reinforcement learning and adapts to different application domains and problems through incremental learning.


Explore the D3M AutoML Ecosystem

Dataset Search and Augmentation

Learn more about Data Discovery and Augmentation tools that enhance the D3M data preparation and modeling process.

  • NYU Auctus, an open-source dataset search engine

  • ISI Datamart, a publicly available knowledge graph with Wikidata at its core

Metalearning Database

A large database of millions of auto-generated pipelines that solve 20+ problem types across 200+ datasets.

Extend the D3M AutoML Ecosystem

How to add datasets, problems, primitives, and pipelines using the D3M standard JSON schemas. The AutoML engines will then use existing and new primitives to auto-solve the new problems.
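As a rough illustration of what a JSON description for a new dataset might look like, the sketch below builds one as a Python dictionary and serializes it. The field names (`about`, `dataResources`, `columns`, `role`, and so on), the file path, and the helpers `make_dataset_doc` and `missing_fields` are assumptions made for illustration only; they are not taken from this page, and the D3M schemas themselves remain the authority on required fields and allowed values.

```python
import json

# Hypothetical, simplified dataset description. All field names below are
# illustrative assumptions, not the authoritative D3M schema.
def make_dataset_doc(dataset_id, name, columns):
    """Build a minimal dataset description as a plain dictionary.

    `columns` is a list of (column name, column type, role) tuples.
    """
    return {
        "about": {"datasetID": dataset_id, "datasetName": name},
        "dataResources": [
            {
                "resID": "learningData",
                "resPath": "tables/learningData.csv",  # assumed location
                "resType": "table",
                "columns": [
                    {
                        "colIndex": index,
                        "colName": col_name,
                        "colType": col_type,
                        "role": [role],
                    }
                    for index, (col_name, col_type, role) in enumerate(columns)
                ],
            }
        ],
    }


def missing_fields(doc):
    """Return the top-level keys this sketch expects but the document lacks."""
    return [key for key in ("about", "dataResources") if key not in doc]


doc = make_dataset_doc(
    "example_dataset",
    "Example dataset",
    [
        ("d3mIndex", "integer", "index"),
        ("age", "integer", "attribute"),
        ("label", "categorical", "suggestedTarget"),
    ],
)

# Serialize to JSON text, as it would be written to a description file.
print(json.dumps(doc, indent=2))
```

A check such as `missing_fields` is only a stand-in for validating the document against the published JSON schema, which is the safer route before handing a new dataset to an AutoML engine.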

MARVIN Visual Front End to the D3M AutoML Ecosystem

The MARVIN GUI provides a query, exploration, and analytics interface to the datasets, problems, primitives, and pipeline solutions in the D3M ecosystem.

Machine Learning Primitives

Discoverable library of fundamental ML elements used to build models and pipelines.

Other D3M AutoML Engines

Cutting-edge systems leverage state-of-the-art research in ML architecture search and metalearning to comprehensively explore combinations of ML primitives for high-quality models.