
Jan-Willem van de Meent
Assistant Professor

Northeastern University
College of Computer and Information Science
Office 922, 177 Huntington Avenue
Boston, MA 02115


+1 (617) 373 7696

I am an assistant professor in the College of Computer and Information Science at Northeastern. I work on probabilistic programming frameworks that provide building blocks for model development in data science, machine learning, and artificial intelligence. I am one of the creators of Anglican, a probabilistic programming system that is closely integrated with Clojure. I am currently developing Probabilistic Torch, a library for deep generative models that extends PyTorch.

News

OCT 2018 ∙ A draft of our book An Introduction to Probabilistic Programming is now publicly available [arXiv]. This book is intended as a graduate-level introduction to probabilistic programming languages and methods for inference in probabilistic programs.

OCT 2018 ∙ Thanks to all speakers and attendees for making PROBPROG 2018 a success!

AUG 2018 ∙ Congrats to Sarthak Jain on the acceptance of his paper Learning Disentangled Representations of Texts with Application to Biomedical Abstracts at EMNLP [arXiv].

AUG 2018 ∙ I am teaching DS 5230 Unsupervised Machine Learning and Data Mining this Fall [website].

MAY 2018 ∙ New pre-print by our students Babak, Hao, and Sarthak on learning structured disentangled representations [arXiv].

DEC 2017 ∙ We have open-sourced Probabilistic Torch [github], a library for deep generative models that extends PyTorch. This release accompanies our paper at NIPS [paper].

DEC 2017 ∙ Our extended abstract “Inference Trees: Adaptive Inference with Exploration” was accepted at the NIPS workshop on Advances in Approximate Bayesian Inference [website].

SEP 2017 ∙ Our paper “Learning Disentangled Representations with Semi-Supervised Deep Generative Models” has been accepted for publication at NIPS [arXiv].

Group

Babak Esmaeli
Ph.D. Candidate
Jered McInerney
Ph.D. Candidate
Iris Seaman
Ph.D. Candidate
Eli Sennesh
Ph.D. Candidate
Hao Wu
Ph.D. Candidate
Heiko Zimmermann
Ph.D. Candidate

Working Papers

Foundations and Trends in Machine Learning (draft), 2018
An Introduction to Probabilistic Programming
This document is designed to be a first-year graduate-level introduction to probabilistic programming. It not only provides a thorough background for anyone wishing to use a probabilistic programming system, but also introduces the techniques needed to design and build these systems. It is aimed at people who have an undergraduate-level understanding of either or, ideally, both probabilistic machine learning and programming languages.
arXiv preprint, 2018
Modeling Theory of Mind for Autonomous Agents with Probabilistic Programs
As autonomous agents become more ubiquitous, they will eventually have to reason about the mental state of other agents, including those agents' beliefs, desires and goals, so-called theory of mind reasoning. We introduce a collection of increasingly complex theory of mind models of a "chaser" pursuing a "runner", which are implemented as nested probabilistic programs. We show that planning can be performed using nested importance sampling methods, resulting in rational behaviors from both agents, and show that allocating additional computation to perform nested reasoning about agents results in lower-variance estimates of expected utility.
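As a toy illustration of this kind of nested reasoning, the sketch below has a chaser on a line estimate the expected utility of each move by simulating a runner that, in turn, weighs its own moves by the distance they put between it and the chaser. All dynamics, utilities, and sample sizes here are invented for illustration; this is not the paper's model.

import random

def sample_runner_move(runner_pos, chaser_pos):
    # Inner (nested) reasoning: the runner favors moves that increase
    # its distance from the chaser, sampled in proportion to distance.
    moves = [-1, 0, 1]
    weights = [abs(runner_pos + m - chaser_pos) + 1e-6 for m in moves]
    return random.choices(moves, weights=weights)[0]

def chaser_expected_utility(chaser_move, chaser_pos, runner_pos, n=1000):
    # Outer estimate: average negative distance after both agents move.
    new_chaser = chaser_pos + chaser_move
    total = 0.0
    for _ in range(n):
        new_runner = runner_pos + sample_runner_move(runner_pos, new_chaser)
        total -= abs(new_runner - new_chaser)
    return total / n

chaser_pos, runner_pos = 0, 3
best = max([-1, 0, 1],
           key=lambda m: chaser_expected_utility(m, chaser_pos, runner_pos))
print("chaser moves:", best)  # typically +1, i.e. toward the runner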
arXiv preprint, 2018
Structured Representations for Reviews: Aspect-Based Variational Hidden Factor Models
We present Variational Aspect-Based Latent Dirichlet Allocation (VALDA), a family of autoencoding topic models that learn aspect-based representations of reviews. VALDA defines a user-item encoder that maps bag-of-words vectors for combined reviews associated with each paired user and item onto structured embeddings, which in turn define per-aspect topic weights. We model individual reviews in a structured manner by inferring an aspect assignment for each sentence in a given review, where the per-aspect topic weights obtained by the user-item encoder serve to define a mixture over topics, conditioned on the aspect. The result is an autoencoding neural topic model for reviews, which can be trained in a fully unsupervised manner to learn topics that are structured into aspects.
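A minimal sketch of this architecture in PyTorch might look as follows; the layer choices, soft aspect assignments, and all sizes are assumptions for illustration, not the paper's implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

V, K, A = 5000, 20, 3  # assumed vocabulary size, topics per aspect, aspects

class AspectTopicModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(2 * V, A * K)            # user + item bag-of-words
        self.aspect_clf = nn.Linear(V, A)                 # per-sentence aspect scores
        self.topics = nn.Parameter(torch.randn(A, K, V))  # per-aspect topic-word logits

    def forward(self, user_bow, item_bow, sentence_bow):
        # Per-aspect topic weights from the combined user-item reviews.
        theta = F.softmax(self.encoder(torch.cat([user_bow, item_bow], -1))
                          .view(-1, A, K), dim=-1)              # (batch, A, K)
        # Soft aspect assignment for each sentence.
        alpha = F.softmax(self.aspect_clf(sentence_bow), dim=-1)  # (batch, A)
        # Word distribution: a mixture over topics, conditioned on the aspect.
        beta = F.softmax(self.topics, dim=-1)                   # (A, K, V)
        return torch.einsum('ba,bak,akv->bv', alpha, theta, beta)

model = AspectTopicModel()
user, item, sent = torch.rand(4, V), torch.rand(4, V), torch.rand(4, V)
probs = model(user, item, sent)                              # distribution over words
loss = -(sent * probs.clamp_min(1e-9).log()).sum(-1).mean()  # reconstruction term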
NeurIPS Workshop on Critiquing and Correcting Trends in Machine Learning, 2018
Can VAEs Generate Novel Examples?
We investigate to what extent widely employed variational autoencoder (VAE) architectures can generate examples that were not previously seen in the training data. We consider both generalization to new examples of previously seen classes, and generalization to classes that were withheld from the training set. In both cases, we find that reconstructions are closely approximated by nearest neighbors for higher-dimensional parameterizations. When generalizing to unseen classes, however, lower-dimensional parameterizations offer a clear advantage.
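The sketch below shows the kind of nearest-neighbor check this analysis relies on, using placeholder arrays in place of trained VAE reconstructions.

import numpy as np

def nearest_neighbor_distance(reconstructions, train_data):
    # For each reconstruction, the L2 distance to its closest training
    # example; small values suggest the model is effectively memorizing.
    dists = np.linalg.norm(
        reconstructions[:, None, :] - train_data[None, :, :], axis=-1)
    return dists.min(axis=1)

train = np.random.rand(1000, 784)                     # placeholder training set
recon = train[:10] + 0.01 * np.random.randn(10, 784)  # placeholder reconstructions
print(nearest_neighbor_distance(recon, train).mean())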
arXiv preprint, 2018
Composing Modeling and Inference Operations with Probabilistic Program Combinators
We introduce a combinator library for the Probabilistic Torch framework. Combinators are functions that accept models and return transformed models. We assume that models are dynamic, but that model composition is static, in the sense that combinator application takes place prior to evaluating the model on data. Model combinators use classic functional constructs such as map and reduce to define a computation at a coarsened level of representation. Inference combinators alter the evaluation strategy using operations such as importance resampling and application of a transition kernel, whilst preserving proper weighting.
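The sketch below illustrates the combinator idea in plain Python, representing a model as a function that returns a (value, log-weight) pair. The names and representation are illustrative; this is not Probabilistic Torch's actual API.

import math, random

def compose(f, g):
    # Model combinator: run g, pass its value to f, and accumulate the
    # log importance weights of both stages.
    def model(x):
        y, lw_g = g(x)
        z, lw_f = f(y)
        return z, lw_f + lw_g
    return model

def resample(model, n=100):
    # Inference combinator: importance resampling over n evaluations of
    # the model, preserving proper weighting via the average weight.
    def resampled(x):
        particles = [model(x) for _ in range(n)]
        weights = [math.exp(lw) for _, lw in particles]
        value, _ = random.choices(particles, weights=weights)[0]
        return value, math.log(sum(weights) / n)
    return resampled

noisy = lambda x: (x + random.gauss(0, 1), -abs(x))  # toy weighted model
pipeline = resample(compose(noisy, noisy))
print(pipeline(0.0))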
arXiv preprint, 2018
Structured Disentangled Representations
Deep latent-variable models learn representations of high-dimensional data in an unsupervised manner. A number of recent efforts have focused on learning representations that disentangle statistically independent axes of variation by introducing modifications to the standard objective function. These approaches generally assume a simple diagonal Gaussian prior and as a result are not able to reliably disentangle discrete factors of variation. We propose a two-level hierarchical objective to control the relative degree of statistical independence between blocks of variables and individual variables within blocks.
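One plausible form of such a two-level objective, written here as an illustrative sketch rather than the paper's exact decomposition, penalizes dependence between blocks and dependence within blocks with separate weights β and γ:

L = E_{q(z|x)}[log p(x|z)] − β KL( q(z) ‖ ∏_b q(z_b) ) − γ Σ_b KL( q(z_b) ‖ ∏_i q(z_{b,i}) ) − Σ_{b,i} KL( q(z_{b,i}) ‖ p(z_{b,i}) )

Here z is partitioned into blocks z_b with components z_{b,i}; the β term measures dependence between blocks, the γ terms measure dependence within each block, and the final sum regularizes each dimension towards the prior.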
arXiv preprint, 2018
Inference Trees: Adaptive Inference with Exploration
We introduce inference trees (ITs), a new adaptive Monte Carlo inference method building on ideas from Monte Carlo tree search. Unlike most existing methods which are implicitly based on pure exploitation, ITs explicitly aim to balance exploration and exploitation in the inference process, alleviating common pathologies and ensuring consistency. More specifically, ITs use bandit strategies to adaptively sample from hierarchical partitions of the parameter space, while simultaneously learning these partitions in an online manner.
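As a toy illustration of the bandit component, the sketch below uses UCB-style scores to decide which region of a fixed partition to sample next; the target density is invented, and unlike ITs the partition here is not refined online.

import math, random

target = lambda x: math.exp(-0.5 * (x - 2.0) ** 2)  # toy unnormalized density
partitions = [(-4.0, 0.0), (0.0, 4.0)]              # fixed toy partition
counts = [1e-6] * len(partitions)
rewards = [0.0] * len(partitions)

for t in range(1, 1001):
    # UCB score: mean weight in the region plus an exploration bonus.
    scores = [rewards[i] / counts[i] + math.sqrt(2 * math.log(t) / counts[i])
              for i in range(len(partitions))]
    i = max(range(len(partitions)), key=scores.__getitem__)
    lo, hi = partitions[i]
    x = random.uniform(lo, hi)
    counts[i] += 1
    rewards[i] += target(x)

print([round(c) for c in counts])  # most samples land in the region with mass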

Selected Papers

EMNLP, 2018
Learning Disentangled Representations of Texts with Application to Biomedical Abstracts
We propose a method for learning disentangled representations of texts that code for distinct and complementary aspects, with the aim of affording efficient model transfer and interpretability. To induce disentangled embeddings, we propose an adversarial objective based on the (dis)similarity between triplets of documents with respect to specific aspects. Our motivating application is embedding biomedical abstracts describing clinical trials in a manner that disentangles the populations, interventions, and outcomes in a given trial. We show that our method learns representations that encode these clinically salient aspects, and that these can be effectively used to perform aspect-specific retrieval.
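A minimal sketch of a triplet objective of this general kind appears below; the encoder, margin, and plain (non-adversarial) hinge formulation are simplifying assumptions, not the paper's exact objective.

import torch
import torch.nn.functional as F

def aspect_triplet_loss(anchor, positive, negative, margin=1.0):
    # Documents similar with respect to an aspect should be closer in
    # that aspect's embedding space than dissimilar ones.
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return F.relu(d_pos - d_neg + margin).mean()

emb = torch.nn.Linear(300, 32)  # toy aspect-specific encoder
docs = torch.randn(8, 300)      # placeholder document features
loss = aspect_triplet_loss(emb(docs), emb(docs + 0.1), emb(torch.randn(8, 300)))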
NIPS, 2017
Learning Disentangled Representations with Semi-Supervised Deep Generative Models
We propose to learn disentangled representations using model architectures that generalise from standard VAEs, employing a general graphical model structure in the encoder and decoder. This allows us to train partially-specified models that make relatively strong assumptions about a subset of interpretable variables and rely on the flexibility of neural networks to learn representations for the remaining variables.
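The sketch below conveys the idea of a partially specified model: a VAE whose decoder conditions on an interpretable label y that is observed for some examples and inferred for others. Networks, shapes, and the Gumbel-softmax relaxation are illustrative assumptions, not the paper's architecture.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SemiVAE(nn.Module):
    def __init__(self, x_dim=784, y_dim=10, z_dim=32):
        super().__init__()
        self.enc_y = nn.Linear(x_dim, y_dim)              # q(y | x)
        self.enc_z = nn.Linear(x_dim + y_dim, 2 * z_dim)  # q(z | x, y)
        self.dec = nn.Linear(y_dim + z_dim, x_dim)        # p(x | y, z)

    def forward(self, x, y=None):
        if y is None:  # unsupervised case: infer a relaxed label
            y = F.gumbel_softmax(self.enc_y(x))
        mu, logvar = self.enc_z(torch.cat([x, y], -1)).chunk(2, -1)
        z = mu + (0.5 * logvar).exp() * torch.randn_like(mu)
        return torch.sigmoid(self.dec(torch.cat([y, z], -1)))

model = SemiVAE()
recon = model(torch.rand(4, 784))  # label inferred
recon = model(torch.rand(4, 784),
              F.one_hot(torch.tensor([3, 1, 4, 1]), 10).float())  # label observed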
NIPS, 2016
Bayesian Optimization for Probabilistic Programs
We present the first general purpose framework for marginal maximum a posteriori estimation of probabilistic program variables. By using a series of code transformations, the evidence of any probabilistic program, and therefore of any graphical model, can be optimized with respect to an arbitrary subset of its sampled variables. To carry out this optimization, we develop the first Bayesian optimization package to directly exploit the source code of its target, leading to innovations in problem-independent hyperpriors, unbounded optimization, and implicit constraint satisfaction.
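The toy sketch below conveys the outer loop only: Bayesian optimization of a Monte Carlo estimate of the marginal likelihood with respect to a single variable. The model, the GP surrogate settings, and the UCB acquisition are simplifying assumptions and bear no relation to the package's code transformations.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def log_marginal(theta, n=500):
    # Toy Monte Carlo estimate of log p(y | theta), integrating out x.
    x = np.random.randn(n)
    ll = -0.5 * (1.0 - theta * x) ** 2
    return np.log(np.exp(ll).mean())

thetas = list(np.random.uniform(-2, 2, 5))
values = [log_marginal(t) for t in thetas]
gp = GaussianProcessRegressor()
for _ in range(20):
    gp.fit(np.array(thetas)[:, None], values)
    cand = np.random.uniform(-2, 2, 100)[:, None]
    mu, sd = gp.predict(cand, return_std=True)
    t = float(cand[np.argmax(mu + sd)])  # UCB acquisition
    thetas.append(t)
    values.append(log_marginal(t))
print("estimated marginal MAP:", thetas[int(np.argmax(values))])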
IFL, 2016
Design and Implementation of Probabilistic Programming Language Anglican
We describe the design and implementation of Anglican, a probabilistic programming language that is closely integrated with Clojure.
AISTATS, 2016
Black-Box Policy Search with Probabilistic Programs
In this work we show how to represent policies as programs: that is, as stochastic simulators with tunable parameters. To learn the parameters of such policies we develop connections between black box variational inference and existing policy search approaches. We then explain how such learning can be implemented in a probabilistic programming system. We demonstrate both conciseness of policy representation and automatic policy parameter learning for a set of canonical reinforcement learning problems.
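The sketch below shows the simplest version of this idea: a policy written as a stochastic program with a tunable parameter, trained with the score-function (likelihood-ratio) gradient that underlies black box variational inference. The one-step environment is a toy assumption.

import math, random

def policy(theta):
    # Stochastic policy program: choose action 1 with prob sigmoid(theta).
    p = 1.0 / (1.0 + math.exp(-theta))
    a = 1 if random.random() < p else 0
    return a, p

reward = lambda a: 1.0 if a == 1 else 0.0  # toy one-step environment

theta = 0.0
for step in range(2000):
    a, p = policy(theta)
    # Score-function update: d/dtheta log p(a | theta) = a - p.
    theta += 0.1 * reward(a) * (a - p)
print(theta)  # drifts positive: the policy learns to choose action 1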
AISTATS, 2015
Particle Gibbs with Ancestor Sampling for Probabilistic Programs
Particle Markov chain Monte Carlo techniques rank among current state-of-the-art methods for probabilistic program inference. A drawback of these techniques is that they rely on importance resampling, which results in degenerate particle trajectories and a low effective sample size for variables sampled early in a program. We here develop a formalism to adapt ancestor resampling, a technique that mitigates particle degeneracy, to the probabilistic programming setting.
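The sketch below demonstrates the degeneracy problem itself: after repeated resampling steps, nearly all particles share a single ancestor for variables sampled early in the program. The weights are random stand-ins; ancestor resampling is what the paper adapts to mitigate this collapse.

import random

N, T = 100, 50
ancestors = list(range(N))  # index of each particle's t=0 ancestor
for t in range(T):
    weights = [random.random() for _ in range(N)]  # stand-in importance weights
    idx = random.choices(range(N), weights=weights, k=N)
    ancestors = [ancestors[i] for i in idx]
print("distinct t=0 ancestors:", len(set(ancestors)))  # typically 1 or 2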
AISTATS, 2014
A New Approach to Probabilistic Programming Inference
We demonstrate a new approach to inference in expressive probabilistic programming languages based on particle Markov chain Monte Carlo. It applies to Turing-complete probabilistic programming languages and supports accurate inference in models that make use of complex control flow, including stochastic recursion. It also includes primitives from Bayesian nonparametric statistics. Our experiments show that this approach can be more efficient than previously introduced single-site Metropolis-Hastings methods.
Biophysical Journal, 2014
Empirical Bayes Methods Enable Advanced Population-Level Analyses of Single-Molecule FRET Experiments