## D3M Academic Papers

#### AutoML User Interfaces

Towards Evaluating Exploratory Model Building Process with AutoML Systems

NYU/UTD/Uncharted; Keywords: Human-guided machine learning; Evaluation Methodology; System Evaluation; Machine Learning; Automated Machine Learning; AutoML; Exploratory Visual Analysis; Exploratory Model Building

Towards Human-Guided Machine Learning

ISI/Harvard/UTD; Keywords: Human-guided machine learning; Automated machine learning (AutoML); Task analysis; Scientific workflows.

Distil: A Mixed-Initiative Model Discovery System for Subject Matter Experts

Uncharted; Keywords: Augmented Intelligence; Machine Learning; Mixed-initiative; Visual Analytics

PipelineProfiler: A Visual Analytics Tool for the Exploration of AutoML Pipelines

NYU; Keywords: Automatic Machine Learning; Pipeline Visualization; Model Evaluation

#### AutoML Engines

Learning Across Tasks with Surrogate Model Ensembles for Algorithm and Hyperparameter Optimization

SRI (Eindhoven)

Learning to go with the flow: on the adaptability of automated machine learning to evolving data

SRI (Eindhoven)

MetaTPOT: Enhancing A Tree-based Pipeline Optimization Tool Using Meta-Learning

UC Berkeley; Keywords: AutoML; Meta-Learning; Genetic Programming (GP); TPOT

Metalearning by Exploiting Granular Machine Learning Pipeline Metadata

BYU; Keywords: metalearning; automl; ml pipeline; metamodel; meta-dataset; metafeatures

Modeling and Forecasting Armed Conflict: AutoML with Human-Guided Machine Learning

Harvard/UTD; Keywords: Human-guided machine learning; Automated machine learning (AutoML); conflict forecasting.

On Robustness of Neural Architecture Search under Label Noise.

TAMU; Keywords: deep learning; automated machine learning; neural architecture search; label noise; robust loss function

P4ML: A phased performance-based pipeline planner for automated machine learning

ISI; Keywords: Automating machine learning, planning, pipelines, workflows, AutoML

Preprocessor Selection for Machine Learning Pipelines

BYU; Keywords: metalearning; preprocessor selection; ml pipeline design; automl

RankML: a Meta Learning-Based Approach for Pre-Ranking Machine Learning Pipelines

UC Berkeley; Keywords: AutoML; meta-learning; machine learning pipelines

The ABC of Data: A Classifying Framework for Data Readiness

SRI (Eindhoven)

AlphaD3M: An Open-Source AutoML Library for Multiple ML Tasks

Roque Lopez, Raoni Lourenco, Remi Rampin, Sonia Castelo, Aécio SR Santos, Jorge Henrique Piazentin Ono, Claudio Silva, Juliana Freire; AutoML Conference 2023

Using Pipeline Performance Prediction to Accelerate AutoML Systems

Haoxiang Zhang, Roque López, Aécio Santos, Jorge Piazentin Ono, Aline Bessa, Juliana Freire; Proceedings of the Seventh Workshop on Data Management for End-to-End Machine Learning, 2023; Awards: Best paper award

An ecosystem of applications for modeling political

Aline Bessa, Sonia Castelo, Rémi Rampin, Aécio Santos, Mike Shoemate, Vito D'Orazio, Juliana Freire; Proceedings of the 2021 International Conference on Management of Data

A Hybrid Approach for Automatic Model Recommendation

UC Berkeley; Keywords: Classification; Classifier Families; Meta-Learning; Dataset Metafeatures; Scholarly Big Data; Algorithm Recommendation; Word Embedding; Expert System

A meta-learning recommender system for hyperparameter tuning: predicting when tuning improves SVM classifiers

SRI (Eindhoven)

Auto-Keras: An Efficient Neural Architecture Search System

TAMU; Keywords: Automated Machine Learning; AutoML; Neural Architecture Search; Bayesian Optimization; Network Morphism

AutoGRD: Model Recommendation Through Graphical Dataset Representation

UC Berkeley; Awards: Best paper award CIKM '19; Best industry paper award CIKM '19; Keywords: Meta-learning; Algorithm selection; AutoML; Dataset representation; Classification; Regression; Graph embedding

Automatic Machine Learning Derived from Scholarly Big Data

UC Berkeley; Keywords: Meta-learning; Algorithm selection; AutoML; Dataset representation; Classification

Automatic Machine Learning: Methods, Systems, Challenges

SRI (Eindhoven)

DeepLine: AutoML Tool for Pipelines Generation using Deep Reinforcement Learning and Hierarchical Actions Filtering

UC Berkeley; Keywords: AutoML, classification, deep reinforcement learning

GAMA: Genetic Automated Machine learning Assistant

SRI (Eindhoven)

GAMA: a General Automated Machine learning Assistant

SRI (Eindhoven)

Layered TPOT: Speeding up Tree-based Pipeline Optimization

SRI (Eindhoven)

#### Primitives

Joint Embedding of Graphs

JHU; Keywords: graphs, embedding, feature extraction, statistical inference

Knowledge Augmented Deep Neural Networks for Joint Facial Expression and Action Unit Recognition

RPI; Keywords: computer vision; facial expression and action units; prior knowledge

Label Error Correction and Generation Through Label Relationships

RPI; Keywords: Bayesian network, structure learning, and label denoising

On a 'Two Truths' Phenomenon in Spectral Graph Clustering

JHU; Keywords: Spectral Embedding, Spectral Clustering, Graph, Network, Connectome

On consistent vertex nomination schemes

JHU; Keywords: vertex nomination, Bayes optimal

On estimation and inference in latent structure random graphs

JHU; Keywords: efficiency, Latent structure random graphs, manifold learning, spectral graph inference

On spectral embedding performance and elucidating network structure in stochastic block model graphs

JHU; Keywords: Statistical network analysis; random graphs; stochastic block model; Laplacian spectral embedding; adjacency spectral embedding; Chernoff information; vertex clustering.

Seeded Graph Matching

JHU; Keywords: Hungarian Algorithm, Quadratic Assignment Problem (QAP), Vertex Alignment

Signal-plus-noise matrix models: eigenvector deviations and fluctuations

JHU; Keywords: Random matrix; Signal-plus-noise; Eigenvector perturbation; Principal component analysis; Asymptotic normality.

Simultaneous Dimensionality and Complexity Model Selection for Spectral Graph Clustering

JHU; Keywords: Adjacency spectral embedding, Model-based clustering, Stochastic block model

The two-to-infinity norm and singular subspace geometry with applications to high-dimensional statistics

JHU; Keywords: singular value decomposition, principal component analysis, eigenvector perturbation, spectral methods, Procrustes analysis, high-dimensional statistics

Type-augmented Relation Prediction in Knowledge Graphs

RPI; Keywords: prior knowledge, relation prediction, knowledge graph

Vertex Nomination Via Seeded Graph Matching

JHU; Keywords: vertex nomination, graph matching, seeded graph matching, graph inference, graph mining, stochastic block model

Vertex Nomination, Consistent Estimation, and Adversarial Modification

JHU; Keywords: adversarial machine learning, networks, Random graphs, statistics, Vertex nomination

Vertex nomination: The canonical sampling and the extended spectral nomination schemes

JHU; Keywords: vertex nomination, Markov chain Monte Carlo, spectral partitioning, Mclust

Abstractive Tabular Dataset Summarization via Knowledge Base Semantic Embeddings

Uncharted; Keywords: Dataset Summarization; Type Recommendation; Semantic Embeddings

Alignment Strength and Correlation for Graphs

JHU; Keywords: correlated Bernoulli random graphs, alignment strength, graph correlation, graph matchability, complexity of graph matching

Amortized Monte Carlo Integration

UBC; Awards: Best Paper Honourable Mention ICML 2019

Blendshape-augmented Facial Action Units Detection

RPI; Keywords: computer vision; facial action units; 3D facial blendshapes

Etalumis: Bringing Probabilistic Programming to Scientific Simulators at Scale

UBC; Awards: Best Paper Finalist at Supercomputing 2019

Forecasting Hierarchical Time Series with a Regularized Embedding Space

Uncharted; Keywords: hierarchical time series; grouped time series; time series forecasting; embedding space; neural network

Graph Traversal with Tensor Functionals: A Meta-Algorithm for Scalable Learning

ISI; Keywords: Graph, Learning, Algorithm, Scale, Message Passing, Node Embeddings

#### AutoML Infrastructure

Interactive Data Visualization in Jupyter Notebooks

NYU; Keywords: Visualization

On Evaluation of AutoML Systems

UC Berkeley, BYU, TAMU, CMU, JPL

OpenML-Python: an extensible Python API for OpenML

SRI (Eindhoven)

#### Other

ML Friend: Interactive Prediction Task Recommendation for Event-Driven Time-Series Data

MIT/FeatureLabs

Missing Value Imputation for Mixed Data Through Gaussian Copula

Cornell University; Keywords: mixed data; ordinal data; Gaussian copula; missing values; imputation

Multi-task learning with a natural metric for quantitative structure activity relationship learning.

SRI (Eindhoven)

OBOE: Collaborative Filtering for AutoML Model Selection

Cornell University; Keywords: AutoML; meta-learning; time-constrained; model selection; collaborative filtering

Online high-rank matrix completion

Cornell University

Ordalia: Deep Learning Hyperparameter Search via Generalization Error Bounds Extrapolation

MIT/Brown; Keywords: Deep Learning, Hyperparameters Optimization, Multi-armed Bandits, Automated Machine Learning

Polynomial Matrix Completion for Missing Data Imputation and Transductive Learning

Cornell University

Prediction Factory: automated development and collaborative evaluation of predictive models

MIT/FeatureLabs

Safe Visual Data Exploration

MIT/Brown

Solving the "False Positives" Problem in Fraud Prediction

MIT/FeatureLabs

Sustainability at Scale: Bridging the Intention-Behavior Gap with Sustainable Recommendations

SRI (UCSC)

TRIÈST: Counting Local and Global Triangles in Fully Dynamic Streams with Fixed Memory Size

MIT/Brown

Tandem Inference: An Out-of-Core Streaming Algorithm For Very Large-Scale Relational Inference

SRI (UCSC)

Techniques for Automated Machine Learning

TAMU; Keywords: automated machine learning; neural architecture search; bayesian optimization; reinforcement learning; evolutionary algorithm; gradient-based methods

The Machine Learning Bazaar: Harnessing the ML Ecosystem for Effective System Development

MIT/FeatureLabs

The online performance estimation framework: Heterogeneous Ensemble Learning for Data Streams

SRI (Eindhoven)

Third-Party Data Providers Ruin Simple Mechanisms

UBC (Hebrew University of Jerusalem)

Towards Automated Neural Architecture Discovering for Click-Through Rate Prediction

TAMU; Keywords: Information systems; Recommender systems; Theory of computation; Evolutionary algorithms; Computing methodologies; Neural networks

Towards Interactive Data Exploration

MIT/Brown

TwoRavens for Event Data

Harvard/UTD

Understanding Spatio-Temporal Urban Processes

NYU; Keywords: data quality; data profiling; urban data

A Necessary and Sufficient Stability Notion for Adaptive Generalization

UBC (Hebrew University of Jerusalem)

A New Analysis of Differential Privacy’s Generalization Guarantees

UBC (Hebrew University of Jerusalem)

A Survey on Collecting, Managing, and Analyzing Provenance from Scripts

NYU; Keywords: provenance; scripts; collecting; managing; analyzing; survey

ATM: A Distributed, Collaborative, Scalable System for Automated Machine Learning

MIT/FeatureLabs

Annealed Importance Sampling with q-Paths

ISI; Awards: Best Paper Award: NeurIPS Workshop on Deep Learning through Information Geometry

Assessment of Convolutional Neural Networks for Automated Classification of Chest Radiographs

Stanford

Augmenting Visualizations with Interactive Data Facts to Facilitate Interpretation and Communication

Georgia Institute of Technology

AutoML Pipeline Selection: Efficiently Navigating the Combinatorial Space

Cornell University; Keywords: AutoML; meta-learning; pipeline search; tensor decomposition; submodular optimization; experiment design; greedy algorithms

AutoML using Metadata Language Embeddings

Cornell University

Automatic Feature Selection in Learning Using Privileged Information

Perspecta Labs (formerly Vencore Labs)

BEAMES: Interactive Multimodel Steering, Selection, and Inspection for Regression Tasks

Georgia Institute of Technology

Bounded-Leakage Differential Privacy

UBC (Hebrew University of Jerusalem)

Causal Relational Learning

SRI (UCSC)

Collective Bio-Entity Recognition in Scientific Documents using Hinge-Loss Markov Random Fields

SRI (UCSC)

Conflict Forecasting and Prediction

Harvard/UTD; Keywords: conflict forecasting; predictive models; machine learning; international conflict; civil war; terrorism

Contrastive Entity Linkage: Mining Variational Attributes From Large Catalogs for Entity Linkage

SRI (UCSC)

Correlation Sketches for Approximate Join-Correlation Queries

NYU; Keywords: Dataset search; Correlation; Join-Correlation estimation; Sketching algorithms

Deep Sets

CMU

Detecting Patterns of Physiological Response to Hemodynamic Stress via Unsupervised Deep Learning

CMU

Fairness in Relational Domains

SRI (UCSC)

Feature Selection in Learning Using Privileged Information

Perspecta Labs (formerly Vencore Labs)

Gaggle: Visual Analytics for Model Space Navigation

Georgia Institute of Technology

Geono-Cluster: Interactive Visual Cluster Analysis for Biologists

Georgia Institute of Technology

Hidden Stratification Causes Clinically Meaningful Failures in Machine Learning for Medical Imaging

Stanford

Lifted Hinge-Loss Markov Random Field

SRI (UCSC)

Updated: 07 June 2021, 333 papers