Jan-Willem van de Meent

Working Papers

  1. van de Meent, J.-W., Paige, B., Yang, H., & Wood, F. (2018). An Introduction to Probabilistic Programming. ArXiv:1809.10756 [Cs, Stat].

    This document is designed to be a first-year graduate-level introduction to probabilistic programming. It not only provides a thorough background for anyone wishing to use a probabilistic programming system, but also introduces the techniques needed to design and build these systems. It is aimed at people who have an undergraduate-level understanding of either or, ideally, both probabilistic machine learning and programming languages. We start with a discussion of model-based reasoning and explain why conditioning as a foundational computation is central to the fields of probabilistic machine learning and artificial intelligence. We then introduce a simple first-order probabilistic programming language (PPL) whose programs define static-computation-graph, finite-variable-cardinality models. In the context of this restricted PPL we introduce fundamental inference algorithms and describe how they can be implemented in the context of models denoted by probabilistic programs. In the second part of this document, we introduce a higher-order probabilistic programming language, with a functionality analogous to that of established programming languages. This affords the opportunity to define models with dynamic computation graphs, at the cost of requiring inference methods that generate samples by repeatedly executing the program. Foundational inference algorithms for this kind of probabilistic programming language are explained in the context of an interface between program executions and an inference controller. This document closes with a chapter on advanced topics which we believe to be, at the time of writing, interesting directions for probabilistic programming research; directions that point towards a tight integration with deep neural network research and the development of systems for next-generation artificial intelligence applications.

    @article{vandemeent2018introduction,
      archiveprefix = {arXiv},
      eprinttype = {arxiv},
      eprint = {1809.10756},
      primaryclass = {cs, stat},
      title = {An {{Introduction}} to {{Probabilistic Programming}}},
      journal = {arXiv:1809.10756 [cs, stat]},
      author = {{van de Meent}, Jan-Willem and Paige, Brooks and Yang, Hongseok and Wood, Frank},
      month = sep,
      year = {2018},
      keywords = {Statistics - Machine Learning,Computer Science - Artificial Intelligence,Computer Science - Programming Languages,Computer Science - Machine Learning},
      file = {/Users/janwillem/Zotero/storage/D3M245LQ/van de Meent - 2018 - An Introduction to Probabilistic Programming.pdf}
    }
    
  2. Bateni, P., Barber, J., van de Meent, J.-W., & Wood, F. (2020). Improving Few-Shot Visual Classification with Unlabelled Examples. ArXiv:2006.12245 [Cs, Stat].

    We propose a transductive meta-learning method that uses unlabelled instances to improve few-shot image classification performance. Our approach combines a regularized Mahalanobis-distance-based soft k-means clustering procedure with a state of the art neural adaptive feature extractor to achieve improved test-time classification accuracy using unlabelled data. We evaluate our method on transductive few-shot learning tasks, in which the goal is to jointly predict labels for query (test) examples given a set of support (training) examples. We achieve new state of the art in-domain performance on Meta-Dataset, and improve accuracy on mini- and tiered-ImageNet as compared to other conditional neural adaptive methods that use the same pre-trained feature extractor.

    @article{bateni2020improving,
      title = {Improving {{Few}}-{{Shot Visual Classification}} with {{Unlabelled Examples}}},
      author = {Bateni, Peyman and Barber, Jarred and {van de Meent}, Jan-Willem and Wood, Frank},
      year = {2020},
      month = jun,
      author+an = {3=highlight},
      archiveprefix = {arXiv},
      eprint = {2006.12245},
      eprinttype = {arxiv},
      file = {/Users/janwillem/Zotero/storage/JXXWVCPB/Bateni - 2020 - Improving Few-Shot Visual Classification with Unlabelled Examples.pdf},
      journal = {arXiv:2006.12245 [cs, stat]},
      keywords = {Computer Science - Computer Vision and Pattern Recognition,Computer Science - Machine Learning,Statistics - Machine Learning},
      primaryclass = {cs, stat}
    }
    
  3. Biza, O., Platt, R., van de Meent, J.-W., & Wong, L. L. S. (2020). Learning Discrete State Abstractions with Deep Variational Inference. ArXiv:2003.04300 [Cs, Stat].

    Abstraction is crucial for effective sequential decision making in domains with large state spaces. In this work, we propose a variational information bottleneck method for learning approximate bisimulations, a type of state abstraction. We use a deep neural net encoder to map states onto continuous embeddings. The continuous latent space is then compressed into a discrete representation using an action-conditioned hidden Markov model, which is trained end-to-end with the neural network. Our method is suited for environments with high-dimensional states and learns from a stream of experience collected by an agent acting in a Markov decision process. Through a learned discrete abstract model, we can efficiently plan for unseen goals in a multi-goal Reinforcement Learning setting. We test our method in simplified robotic manipulation domains with image states. We also compare it against previous model-based approaches to finding bisimulations in discrete grid-world-like environments.

    @article{biza2020learning,
      title = {Learning Discrete State Abstractions with Deep Variational Inference},
      author = {Biza, Ondrej and Platt, Robert and {van de Meent}, Jan-Willem and Wong, Lawson L. S.},
      year = {2020},
      month = mar,
      archiveprefix = {arXiv},
      eprint = {2003.04300},
      eprinttype = {arxiv},
      file = {/Users/janwillem/Zotero/storage/MZIEZZSQ/Biza - 2020 - Learning discrete state abstractions with deep variational inference.pdf},
      journal = {arXiv:2003.04300 [cs, stat]},
      keywords = {Computer Science - Machine Learning,Statistics - Machine Learning},
      primaryclass = {cs, stat}
    }
    
  4. Farnoosh, A., Rezaei, B., Sennesh, E. Z., Khan, Z., Dy, J., Satpute, A., Hutchinson, J. B., van de Meent, J.-W., & Ostadabbas, S. (2020). Deep Markov Spatio-Temporal Factorization. ArXiv:2003.09779 [Cs, Stat].

    We introduce deep Markov spatio-temporal factorization (DMSTF), a deep generative model for spatio-temporal data. Like other factor analysis methods, DMSTF approximates high-dimensional data by a product between time-dependent weights and spatially dependent factors. These weights and factors are in turn represented in terms of lower-dimensional latent variables that we infer using stochastic variational inference. The innovation in DMSTF is that we parameterize weights in terms of a deep Markovian prior, which is able to characterize nonlinear temporal dynamics. We parameterize the corresponding variational distribution using a bidirectional recurrent network. This results in a flexible family of hierarchical deep generative factor analysis models that can be extended to perform time series clustering, or perform factor analysis in the presence of a control signal. Our experiments, which consider simulated data, fMRI data, and traffic data, demonstrate that DMSTF outperforms related methods in terms of reconstruction accuracy and can perform forecasting in a variety of domains with nonlinear temporal transitions.

    @article{farnoosh2020deep,
      title = {Deep {{Markov Spatio}}-{{Temporal Factorization}}},
      author = {Farnoosh, Amirreza and Rezaei, Behnaz and Sennesh, Eli Zachary and Khan, Zulqarnain and Dy, Jennifer and Satpute, Ajay and Hutchinson, J. Benjamin and {van de Meent}, Jan-Willem and Ostadabbas, Sarah},
      year = {2020},
      month = mar,
      archiveprefix = {arXiv},
      eprint = {2003.09779},
      eprinttype = {arxiv},
      file = {/Users/janwillem/Zotero/storage/4PQ2QMUG/Farnoosh - 2020 - Deep Markov Spatio-Temporal Factorization.pdf},
      journal = {arXiv:2003.09779 [cs, stat]},
      keywords = {Computer Science - Machine Learning,Statistics - Machine Learning},
      primaryclass = {cs, stat}
    }
    
  5. Bozkurt, A., Esmaeili, B., Brooks, D. H., Dy, J. G., & van de Meent, J.-W. (2020). Rate-Regularization and Generalization in VAEs. ArXiv:1911.04594 [Cs, Stat].

    Variational autoencoders (VAEs) optimize an objective that comprises a reconstruction loss (the distortion) and a KL term (the rate). The rate is an upper bound on the mutual information, which is often interpreted as a regularizer that controls the degree of compression. We here examine whether inclusion of the rate term also improves generalization. We perform rate-distortion analyses in which we control the strength of the rate term, the network capacity, and the difficulty of the generalization problem. Lowering the strength of the rate term paradoxically improves generalization in most settings, and reducing the mutual information typically leads to underfitting. Moreover, we show that generalization performance continues to improve even after the mutual information saturates, indicating that the gap on the bound (i.e. the KL divergence relative to the inference marginal) affects generalization. This suggests that the standard spherical Gaussian prior is not an inductive bias that typically improves generalization, prompting further work to understand what choices of priors improve generalization in VAEs.

    @article{bozkurt2020rate-regularization,
      title = {Rate-{{Regularization}} and {{Generalization}} in {{VAEs}}},
      author = {Bozkurt, Alican and Esmaeili, Babak and Brooks, Dana H. and Dy, Jennifer G. and {van de Meent}, Jan-Willem},
      year = {2020},
      month = jul,
      archiveprefix = {arXiv},
      eprint = {1911.04594},
      eprinttype = {arxiv},
      file = {/Users/janwillem/Zotero/storage/V8VYWG7D/Bozkurt - 2020 - Rate-Regularization and Generalization in VAEs.pdf},
      journal = {arXiv:1911.04594 [cs, stat]},
      keywords = {Computer Science - Machine Learning,Statistics - Machine Learning},
      primaryclass = {cs, stat}
    }
    
  6. Sennesh, E., Khan, Z., Wang, Y., Farnoosh, A., Ostadabbas, S., Dy, J., Satpute, A. B., Hutchinson, J. B., & van de Meent, J.-W. (2020). Neural Topographic Factor Analysis for fMRI Data. ArXiv:1906.08901 [Cs, Eess, Stat].

    Neuroimaging studies produce gigabytes of spatio-temporal data for a small number of participants and stimuli. Recent work increasingly suggests that the common practice of averaging across participants and stimuli leaves out systematic and meaningful information. We propose Neural Topographic Factor Analysis (NTFA), a deep generative model that parameterizes factors in terms of embeddings for participants and stimuli. We evaluate NTFA on data from an in-house pilot experiment, as well as two publicly available datasets. We demonstrate that inferring representations for participants and stimuli improves predictive generalization to unseen data when compared to existing methods. NTFA infers meaningful embeddings without supervision, circumventing the assumptions that experimenters defined stimulus conditions and that subject variance is error. We also demonstrate that the inferred latent factor representations are useful for downstream tasks such as multivoxel pattern analysis and functional connectivity.

    @article{sennesh2020neural,
      title = {Neural {{Topographic Factor Analysis}} for {{fMRI Data}}},
      author = {Sennesh, Eli and Khan, Zulqarnain and Wang, Yiyu and Farnoosh, Amirreza and Ostadabbas, Sarah and Dy, Jennifer and Satpute, Ajay B. and Hutchinson, J. Benjamin and {van de Meent}, Jan-Willem},
      year = {2020},
      month = mar,
      archiveprefix = {arXiv},
      eprint = {1906.08901},
      eprinttype = {arxiv},
      file = {/Users/janwillem/Zotero/storage/ARTEZFVV/Sennesh - 2020 - Neural Topographic Factor Analysis for fMRI Data.pdf},
      journal = {arXiv:1906.08901 [cs, eess, stat]},
      keywords = {Computer Science - Machine Learning,Electrical Engineering and Systems Science - Image and Video Processing,Statistics - Machine Learning},
      primaryclass = {cs, eess, stat}
    }
    

Conference

  1. Wu, H., Zimmermann, H., Sennesh, E., Le, T. A., & van de Meent, J.-W. (2020). Amortized Population Gibbs Samplers with Neural Sufficient Statistics. In Proceedings of the International Conference on Machine Learning (pp. 10205–10215).

    Amortized variational methods have proven difficult to scale to structured problems, such as inferring positions of multiple objects from video images. We develop amortized population Gibbs (APG) samplers, a class of scalable methods that frame structured variational inference as adaptive importance sampling. APG samplers construct high-dimensional proposals by iterating over updates to lower-dimensional blocks of variables. We train each conditional proposal by minimizing the inclusive KL divergence with respect to the conditional posterior. To appropriately account for the size of the input data, we develop a new parameterization in terms of neural sufficient statistics. Experiments show that APG samplers can be used to train highly-structured deep generative models in an unsupervised manner, and achieve substantial improvements in inference accuracy relative to standard autoencoding variational methods.

    @inproceedings{wu2020amortized,
      author = {Wu, Hao and Zimmermann, Heiko and Sennesh, Eli and Le, Tuan Anh and van de Meent, Jan-Willem},
      booktitle = {Proceedings of the International Conference on Machine Learning},
      pages = {10205--10215},
      title = {Amortized Population Gibbs Samplers with Neural Sufficient Statistics},
      year = {2020}
    }
    
  2. McInerney, D. J., Dabiri, B., Touret, A.-S., Young, G., van de Meent, J.-W., & Wallace, B. C. (2020). Query-Focused EHR Summarization to Aid Imaging Diagnosis. Machine Learning for Healthcare.

    Electronic Health Records (EHRs) provide vital contextual information to radiologists and other physicians when making a diagnosis. Unfortunately, because a given patient’s record may contain hundreds of notes and reports, identifying relevant information within these in the short time typically allotted to a case is very difficult. We propose and evaluate models that extract relevant text snippets from patient records to provide a rough case summary intended to aid physicians considering one or more diagnoses. This is hard because direct supervision (i.e., physician annotations of snippets relevant to specific diagnoses in medical records) is prohibitively expensive to collect at scale. We propose a distantly supervised strategy in which we use groups of International Classification of Diseases (ICD) codes observed in ‘future’ records as noisy proxies for ‘downstream’ diagnoses. Using this we train a transformer-based neural model to perform extractive summarization conditioned on potential diagnoses. This model defines an attention mechanism that is conditioned on potential diagnoses (queries) provided by the diagnosing physician. We train (via distant supervision) and evaluate variants of this model on EHR data from a local hospital and MIMIC-III (the latter to facilitate reproducibility). Evaluations performed by radiologists demonstrate that these distantly supervised models yield better extractive summaries than do unsupervised approaches. Such models may aid diagnosis by identifying sentences in past patient reports that are clinically relevant to a potential diagnosis.

    @article{mcinerney2020query-focused,
      title = {Query-{{Focused EHR Summarization}} to {{Aid Imaging Diagnosis}}},
      author = {McInerney, Denis Jered and Dabiri, Borna and Touret, Anne-Sophie and Young, Geoffrey and {van de Meent}, Jan-Willem and Wallace, Byron C.},
      year = {2020},
      month = apr,
      archiveprefix = {arXiv},
      eprint = {2004.04645},
      eprinttype = {arxiv},
      journal = {Machine Learning for Healthcare},
      keywords = {Computer Science - Machine Learning,Statistics - Machine Learning},
      primaryclass = {cs, stat}
    }
    
  3. Esmaeili, B., Wu, H., Jain, S., Bozkurt, A., Siddharth, N., Paige, B., Brooks, D. H., Dy, J., & van de Meent, J.-W. (2019). Structured Disentangled Representations. Artificial Intelligence and Statistics.

    Deep latent-variable models learn representations of high-dimensional data in an unsupervised manner. A number of recent efforts have focused on learning representations that disentangle statistically independent axes of variation by introducing modifications to the standard objective function. These approaches generally assume a simple diagonal Gaussian prior and as a result are not able to reliably disentangle discrete factors of variation. We propose a two-level hierarchical objective to control relative degree of statistical independence between blocks of variables and individual variables within blocks. We derive this objective as a generalization of the evidence lower bound, which allows us to explicitly represent the trade-offs between mutual information between data and representation, KL divergence between representation and prior, and coverage of the support of the empirical data distribution. Experiments on a variety of datasets demonstrate that our objective can not only disentangle discrete variables, but that doing so also improves disentanglement of other variables and, importantly, generalization even to unseen combinations of factors.

    @article{esmaeili2018structured,
      archiveprefix = {arXiv},
      eprinttype = {arxiv},
      eprint = {1804.02086},
      primaryclass = {cs, stat},
      title = {Structured {{Disentangled Representations}}},
      journal = {Artificial Intelligence and Statistics},
      author = {Esmaeili, Babak and Wu, Hao and Jain, Sarthak and Bozkurt, Alican and Siddharth, N. and Paige, Brooks and Brooks, Dana H. and Dy, Jennifer and {van de Meent}, Jan-Willem},
      month = apr,
      year = {2019},
      keywords = {Computer Science - Machine Learning,Statistics - Machine Learning}
    }
    
  4. Esmaeili, B., Huang, H., Wallace, B. C., & van de Meent, J.-W. (2019). Structured Neural Topic Models for Reviews. Artificial Intelligence and Statistics.

    We present Variational Aspect-based Latent Topic Allocation (VALTA), a family of autoencoding topic models that learn aspect-based representations of reviews. VALTA defines a user-item encoder that maps bag-of-words vectors for combined reviews associated with each paired user and item onto structured embeddings, which in turn define per-aspect topic weights. We model individual reviews in a structured manner by inferring an aspect assignment for each sentence in a given review, where the per-aspect topic weights obtained by the user-item encoder serve to define a mixture over topics, conditioned on the aspect. The result is an autoencoding neural topic model for reviews, which can be trained in a fully unsupervised manner to learn topics that are structured into aspects. Experimental evaluation on a large number of datasets demonstrates that aspects are interpretable, yield higher coherence scores than non-structured autoencoding topic model variants, and can be utilized to perform aspect-based comparison and genre discovery.

    @article{esmaeili2018structuredb,
      archiveprefix = {arXiv},
      eprinttype = {arxiv},
      eprint = {1812.05035},
      primaryclass = {cs},
      title = {Structured {{Neural Topic Models}} for {{Reviews}}},
      journal = {Artificial Intelligence and Statistics},
      author = {Esmaeili, Babak and Huang, Hongyi and Wallace, Byron C. and {van de Meent}, Jan-Willem},
      month = dec,
      year = {2019},
      keywords = {Computer Science - Computation and Language,Computer Science - Machine Learning}
    }
    
  5. Jain, S., Banner, E., van de Meent, J.-W., Marshall, I. J., & Wallace, B. C. (2018). Learning Disentangled Representations of Texts with Application to Biomedical Abstracts. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing.

    We propose a method for learning disentangled representations of texts that code for distinct and complementary aspects, with the aim of affording efficient model transfer and interpretability. To induce disentangled embeddings, we propose an adversarial objective based on the (dis)similarity between triplets of documents with respect to specific aspects. Our motivating application is embedding biomedical abstracts describing clinical trials in a manner that disentangles the populations, interventions, and outcomes in a given trial. We show that our method learns representations that encode these clinically salient aspects, and that these can be effectively used to perform aspect-specific retrieval. We demonstrate that the approach generalizes beyond our motivating application in experiments on two multi-aspect review corpora.

    @inproceedings{jain_emnlp_2018,
      archiveprefix = {arXiv},
      eprinttype = {arxiv},
      eprint = {1804.07212},
      title = {Learning {{Disentangled Representations}} of {{Texts}} with {{Application}} to {{Biomedical Abstracts}}},
      booktitle = {Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing},
      author = {Jain, Sarthak and Banner, Edward and {van de Meent}, Jan-Willem and Marshall, Iain J. and Wallace, Byron C.},
      year = {2018}
    }
    
  6. Siddharth, N., Paige, B., van de Meent, J.-W., Desmaison, A., Goodman, N. D., Kohli, P., Wood, F., & Torr, P. (2017). Learning Disentangled Representations with Semi-Supervised Deep Generative Models. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, & R. Garnett (Eds.), Advances in Neural Information Processing Systems 30 (pp. 5927–5937).

    Variational autoencoders (VAEs) learn representations of data by jointly training a probabilistic encoder and decoder network. Typically these models encode all features of the data into a single variable. Here we are interested in learning disentangled representations that encode distinct aspects of the data into separate variables. We propose to learn such representations using model architectures that generalise from standard VAEs, employing a general graphical model structure in the encoder and decoder. This allows us to train partially-specified models that make relatively strong assumptions about a subset of interpretable variables and rely on the flexibility of neural networks to learn representations for the remaining variables. We further define a general objective for semi-supervised learning in this model class, which can be approximated using an importance sampling procedure. We evaluate our framework’s ability to learn disentangled representations, both by qualitative exploration of its generative capacity, and quantitative evaluation of its discriminative ability on a variety of models and datasets.

    @inproceedings{siddharth_nips_2017,
      title = {Learning Disentangled Representations with Semi-Supervised Deep Generative Models},
      author = {Siddharth, N. and Paige, Brooks and van de Meent, Jan-Willem and Desmaison, Alban and Goodman, Noah D. and Kohli, Pushmeet and Wood, Frank and Torr, Philip},
      booktitle = {Advances in Neural Information Processing Systems 30},
      editor = {Guyon, I. and Luxburg, U. V. and Bengio, S. and Wallach, H. and Fergus, R. and Vishwanathan, S. and Garnett, R.},
      pages = {5927--5937},
      year = {2017}
    }
    
  7. Tolpin, D., van de Meent, J.-W., Yang, H., & Wood, F. (2016). Design and Implementation of Probabilistic Programming Language Anglican. Proceedings of the 28th Symposium on the Implementation and Application of Functional Programming Languages, 6:1–6:12. https://doi.org/10.1145/3064899.3064910

    Anglican is a probabilistic programming system designed to interoperate with Clojure and other JVM languages. We introduce the programming language Anglican, outline our design choices, and discuss in depth the implementation of the Anglican language and runtime, including macro-based compilation, extended CPS-based evaluation model, and functional representations for probabilistic paradigms, such as a distribution, a random process, and an inference algorithm. We show that a probabilistic functional language can be implemented efficiently and integrated tightly with a conventional functional language with only moderate computational overhead. We also demonstrate how advanced probabilistic modelling concepts are mapped naturally to the functional foundation.

    @inproceedings{tolpin_ifl_2016,
      author = {Tolpin, David and van de Meent, Jan-Willem and Yang, Hongseok and Wood, Frank},
      title = {Design and Implementation of Probabilistic Programming Language Anglican},
      booktitle = {Proceedings of the 28th Symposium on the Implementation and Application of Functional Programming Languages},
      series = {IFL 2016},
      year = {2016},
      isbn = {978-1-4503-4767-9},
      location = {Leuven, Belgium},
      pages = {6:1--6:12},
      articleno = {6},
      numpages = {12},
      url = {http://doi.acm.org/10.1145/3064899.3064910},
      doi = {10.1145/3064899.3064910},
      acmid = {3064910},
      publisher = {ACM},
      address = {New York, NY, USA}
    }
    
  8. Rainforth, T., Le, T. A., van de Meent, J.-W., Osborne, M. A., & Wood, F. (2016). Bayesian Optimization for Probabilistic Programs. Advances in Neural Information Processing Systems, 280–288.
    @inproceedings{rainforth_nips_2016,
      title = {Bayesian {O}ptimization for {P}robabilistic {P}rograms},
      author = {Rainforth, Tom and Le, Tuan Anh and van de Meent, Jan-Willem and Osborne, Michael A and Wood, Frank},
      booktitle = {Advances in Neural Information Processing Systems},
      pages = {280--288},
      year = {2016},
      annote = {https://github.com/probprog/bopp}
    }
    
  9. Rainforth, T., Naesseth, C. A., Lindsten, F., Paige, B., van de Meent, J.-W., Doucet, A., & Wood, F. (2016). Interacting Particle Markov Chain Monte Carlo. Proceedings of The 33rd International Conference on Machine Learning, 2616–2625.

    We introduce interacting particle Markov chain Monte Carlo (iPMCMC), a PMCMC method that introduces a coupling between multiple standard and conditional sequential Monte Carlo samplers. Like related methods, iPMCMC is a Markov chain Monte Carlo sampler on an extended space. We present empirical results that show significant improvements in mixing rates relative to both non-interacting PMCMC samplers and a single PMCMC sampler with an equivalent total computational budget. An additional advantage of the iPMCMC method is that it is suitable for distributed and multi-core architectures.

    @inproceedings{rainforth_icml_2016,
      booktitle = {Proceedings of The 33rd International Conference on Machine Learning},
      pages = {2616--2625},
      title = {{Interacting Particle Markov Chain Monte Carlo}},
      author = {Rainforth, Tom and Naesseth, Christian A. and Lindsten, Fredrik and Paige, Brooks and van de Meent, Jan-Willem and Doucet, Arnaud and Wood, Frank},
      year = {2016}
    }
    
  10. van de Meent, J.-W., Paige, B., Tolpin, D., & Wood, F. (2016). Black-Box Policy Search with Probabilistic Programs. Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, 1195–1204.

    In this work, we explore how probabilistic programs can be used to represent policies in sequential decision problems. In this formulation, a probabilistic program is a black-box stochastic simulator for both the problem domain and the agent. We relate classic policy gradient techniques to recently introduced black-box variational methods which generalize to probabilistic program inference. We present case studies in the Canadian traveler problem, Rock Sample, and a benchmark for optimal diagnosis inspired by Guess Who. Each study illustrates how programs can efficiently represent policies using moderate numbers of parameters.

    @inproceedings{vandemeent_aistats_2016,
      booktitle = {Proceedings of the 19th International Conference on Artificial Intelligence and Statistics},
      author = {van de Meent, Jan-Willem and Paige, Brooks and Tolpin, David and Wood, Frank},
      pages = {1195--1204},
      title = {{Black-Box Policy Search with Probabilistic Programs}},
      year = {2016}
    }
    
  11. Tolpin, D., van de Meent, J.-W., Paige, B., & Wood, F. (2015). Output-Sensitive Adaptive Metropolis-Hastings for Probabilistic Programs. In A. Appice, P. P. Rodrigues, V. Santos Costa, J. Gama, A. Jorge, & C. Soares (Eds.), Machine Learning and Knowledge Discovery in Databases (Vol. 9285, pp. 311–326). Springer International Publishing. https://doi.org/10.1007/978-3-319-23525-7_19
    @incollection{tolpin_ecml_2015,
      year = {2015},
      isbn = {978-3-319-23524-0},
      booktitle = {Machine Learning and Knowledge Discovery in Databases},
      volume = {9285},
      series = {Lecture Notes in Computer Science},
      editor = {Appice, Annalisa and Rodrigues, Pedro Pereira and Santos Costa, Vítor and Gama, João and Jorge, Alípio and Soares, Carlos},
      doi = {10.1007/978-3-319-23525-7_19},
      title = {Output-Sensitive Adaptive Metropolis-Hastings for Probabilistic Programs},
      url = {http://dx.doi.org/10.1007/978-3-319-23525-7_19},
      publisher = {Springer International Publishing},
      keywords = {Probabilistic programming; Adaptive MCMC},
      author = {Tolpin, David and van de Meent, Jan-Willem and Paige, Brooks and Wood, Frank},
      pages = {311--326},
      language = {English}
    }
    
  12. van de Meent, J.-W., Yang, H., Mansinghka, V., & Wood, F. (2015). Particle Gibbs with Ancestor Sampling for Probabilistic Programs. Artificial Intelligence and Statistics.

    Particle Markov chain Monte Carlo techniques rank among current state-of-the-art methods for probabilistic program inference. A drawback of these techniques is that they rely on importance resampling, which results in degenerate particle trajectories and a low effective sample size for variables sampled early in a program. We here develop a formalism to adapt ancestor resampling, a technique that mitigates particle degeneracy, to the probabilistic programming setting. We present empirical results that demonstrate nontrivial performance gains.

    @inproceedings{vandemeent_aistats_2015,
      archiveprefix = {arXiv},
      arxivid = {1501.06769},
      author = {van de Meent, Jan-Willem and Yang, Hongseok and Mansinghka, Vikash and Wood, Frank},
      booktitle = {Artificial Intelligence and Statistics},
      eprint = {1501.06769},
      title = {{Particle Gibbs with Ancestor Sampling for Probabilistic Programs}},
      year = {2015}
    }
    
  13. Wood, F., van de Meent, J.-W., & Mansinghka, V. (2014). A new approach to probabilistic programming inference. Artificial Intelligence and Statistics, 1024–1032.
    @inproceedings{wood_aistats_2014,
      author = {Wood, Frank and van de Meent, Jan-Willem and Mansinghka, Vikash},
      booktitle = {Artificial Intelligence and Statistics},
      pages = {1024--1032},
      title = {{A new approach to probabilistic programming inference}},
      year = {2014}
    }
    
  14. van de Meent, J.-W., Bronson, J. E., Wood, F., Gonzalez, R. L., & Wiggins, C. H. (2013). Hierarchically-coupled hidden Markov models for learning kinetic rates from single-molecule data. Proceedings of the 30th International Conference on Machine Learning, 28(2), 361–369.

    We address the problem of analyzing sets of noisy time-varying signals that all report on the same process but confound straightforward analyses due to complex inter-signal heterogeneities and measurement artifacts. In particular we consider single-molecule experiments which indirectly measure the distinct steps in a biomolecular process via observations of noisy time-dependent signals such as a fluorescence intensity or bead position. Straightforward hidden Markov model (HMM) analyses attempt to characterize such processes in terms of a set of conformational states, the transitions that can occur between these states, and the associated rates at which those transitions occur; but require ad-hoc post-processing steps to combine multiple signals. Here we develop a hierarchically coupled HMM that allows experimentalists to deal with inter-signal variability in a principled and automatic way. Our approach is a generalized expectation maximization hyperparameter point estimation procedure with variational Bayes at the level of individual time series that learns a single interpretable representation of the overall data generating process.

    @article{vandemeent_icml_2013,
      archiveprefix = {arXiv},
      arxivid = {1305.3640},
      author = {van de Meent, Jan-Willem and Bronson, Jonathan E and Wood, Frank and Gonzalez, Ruben L. and Wiggins, Chris H.},
      eprint = {1305.3640},
      journal = {Proceedings of the 30th International Conference on Machine Learning},
      month = may,
      number = {2},
      pages = {361--369},
      title = {{Hierarchically-coupled hidden Markov models for learning kinetic rates from single-molecule data}},
      volume = {28},
      year = {2013}
    }
    

Journal

  1. Emmett, K. J., Rosenstein, J. K., van de Meent, J.-W., Shepard, K. L., & Wiggins, C. H. (2015). Statistical Inference for Nanopore Sequencing with a Biased Random Walk Model. Biophysical Journal, 108(April), 1852–1855. https://doi.org/10.1016/j.bpj.2015.03.013

    Nanopore sequencing promises long read-lengths and single-molecule resolution, but the stochastic motion of the DNA molecule inside the pore is a current barrier to high accuracy reads. We develop a method of statistical inference that explicitly accounts for this error and demonstrate that high accuracy (>99.9%) sequence inference is feasible even under highly diffusive motion by using a hidden Markov model to jointly analyze multiple stochastic reads. Using this model, we place bounds on achievable inference accuracy under a range of experimental parameters.

    @article{emmett_bpj_2015,
      author = {Emmett, Kevin J. and Rosenstein, Jacob K. and van de Meent, Jan-Willem and Shepard, Ken L. and Wiggins, Chris H.},
      journal = {Biophysical Journal},
      volume = {108},
      issue = {April},
      pages = {1852--1855},
      doi = {10.1016/j.bpj.2015.03.013},
      title = {{Statistical Inference for Nanopore Sequencing with a Biased Random Walk Model}},
      year = {2015}
    }
    
  2. Johnson, S., van de Meent, J.-W., Phillips, R., Wiggins, C. H., & Linden, M. (2014). Multiple LacI-mediated loops revealed by Bayesian statistics and tethered particle motion. Nucleic Acids Research, gku563–. https://doi.org/10.1093/nar/gku563

    The bacterial transcription factor LacI loops DNA by binding to two separate locations on the DNA simultaneously. Despite being one of the best-studied model systems for transcriptional regulation, the number and conformations of loop structures accessible to LacI remain unclear, though the importance of multiple coexisting loops has been implicated in interactions between LacI and other cellular regulators of gene expression. To probe this issue, we have developed a new analysis method for tethered particle motion, a versatile and commonly used in vitro single-molecule technique. Our method, vbTPM, performs variational Bayesian inference in hidden Markov models. It learns the number of distinct states (i.e. DNA-protein conformations) directly from tethered particle motion data with better resolution than existing methods, while easily correcting for common experimental artifacts. Studying short (roughly 100 bp) LacI-mediated loops, we provide evidence for three distinct loop structures, more than previously reported in single-molecule studies. Moreover, our results confirm that changes in LacI conformation and DNA-binding topology both contribute to the repertoire of LacI-mediated loops formed in vitro, and provide qualitatively new input for models of looping and transcriptional regulation. We expect vbTPM to be broadly useful for probing complex protein-nucleic acid interactions.

    @article{johnson_nar_2014,
      archiveprefix = {arXiv},
      arxivid = {1402.0894},
      author = {Johnson, S. and van de Meent, J.-W. and Phillips, R. and Wiggins, C. H. and Linden, M.},
      doi = {10.1093/nar/gku563},
      eprint = {1402.0894},
      issn = {0305-1048},
      journal = {Nucleic Acids Research},
      pages = {gku563--},
      title = {{Multiple LacI-mediated loops revealed by Bayesian statistics and tethered particle motion}},
      year = {2014}
    }
    
  3. Gansell, A. R., van de Meent, J.-W., Zairis, S., & Wiggins, C. H. (2014). Stylistic clusters and the Syrian/South Syrian tradition of first-millennium BCE Levantine ivory carving: A machine learning approach. Journal of Archaeological Science, 44, 194–205. https://doi.org/10.1016/j.jas.2013.11.005

    Thousands of first-millennium BCE ivory carvings have been excavated from Neo-Assyrian sites in Mesopotamia (primarily Nimrud, Khorsabad, and Arslan Tash), hundreds of miles from their Levantine production contexts. At present, their specific manufacture dates and workshop localities are unknown. Relying on subjective, visual methods, scholars have grappled with their classification and regional attribution for over a century. This study combines visual approaches with machine learning techniques to offer data-driven perspectives on the classification and attribution of this Iron Age corpus. The study sample consists of 162 sculptures of female figures that have been conventionally attributed to three main regional carving traditions: "Phoenician," "North Syrian," and "Syrian/South Syrian". We have developed an algorithm that clusters the ivories based on a combination of descriptive and anthropometric data. The resulting categories, which are based on purely statistical criteria, show good agreement with conventional art historical classifications, while revealing new insights, especially with regard to the "Syrian/South Syrian" tradition. Specifically, we have determined that objects of the Syrian/South Syrian tradition might be more closely related to Phoenician objects than to North Syrian objects. We also reconsider the classification of a subset of "Phoenician" objects, and we confirm Syrian/South Syrian stylistic subgroups, the geographic distribution of which might illuminate Neo-Assyrian acquisition networks. Additionally, we have identified the features in our cluster assignments that might be diagnostic of regional traditions. In short, our study both corroborates traditional visual classifications and demonstrates how machine learning techniques may be employed to retrieve complementary information not accessible through an exclusively visual analysis. © 2013 Elsevier Ltd.

    @article{gansell_jas_2014,
      archiveprefix = {arXiv},
      arxivid = {1401.0871},
      author = {Gansell, Amy Rebecca and van de Meent, Jan-Willem and Zairis, Sakellarios and Wiggins, Chris H.},
      doi = {10.1016/j.jas.2013.11.005},
      eprint = {1401.0871},
      issn = {10959238},
      journal = {Journal of Archaeological Science},
      keywords = {Attribution,Clustering,Iron Age,Ivory sculpture,Levant,Machine learning,Mutual information},
      month = apr,
      pages = {194--205},
      publisher = {Elsevier Ltd},
      title = {{Stylistic clusters and the Syrian/South Syrian tradition of first-millennium BCE Levantine ivory carving: A machine learning approach}},
      volume = {44},
      year = {2014}
    }
    
  4. van de Meent, J.-W., Bronson, J. E., Wiggins, C. H., & Gonzalez, R. L. (2014). Empirical Bayes methods enable advanced population-level analyses of single-molecule FRET experiments. Biophysical Journal, 106(6), 1327–1337. https://doi.org/10.1016/j.bpj.2013.12.055

    Many single-molecule experiments aim to characterize biomolecular processes in terms of kinetic models that specify the rates of transition between conformational states of the biomolecule. Estimation of these rates often requires analysis of a population of molecules, in which the conformational trajectory of each molecule is represented by a noisy, time-dependent signal trajectory. Although hidden Markov models (HMMs) may be used to infer the conformational trajectories of individual molecules, estimating a consensus kinetic model from the population of inferred conformational trajectories remains a statistically difficult task, as inferred parameters vary widely within a population. Here, we demonstrate how a recently developed empirical Bayesian method for HMMs can be extended to enable a more automated and statistically principled approach to two widely occurring tasks in the analysis of single-molecule fluorescence resonance energy transfer (smFRET) experiments: 1), the characterization of changes in rates across a series of experiments performed under variable conditions; and 2), the detection of degenerate states that exhibit the same FRET efficiency but differ in their rates of transition. We apply this newly developed methodology to two studies of the bacterial ribosome, each exemplary of one of these two analysis tasks. We conclude with a discussion of model-selection techniques for determination of the appropriate number of conformational states. The code used to perform this analysis and a basic graphical user interface front end are available as open source software.

    @article{vandemeent_bpj_2014,
      author = {van de Meent, Jan-Willem and Bronson, Jonathan E and Wiggins, Chris H and Gonzalez, Ruben L},
      doi = {10.1016/j.bpj.2013.12.055},
      issn = {1542-0086},
      journal = {Biophysical Journal},
      month = mar,
      number = {6},
      pages = {1327--1337},
      pmid = {24655508},
      title = {{Empirical Bayes methods enable advanced population-level analyses of single-molecule FRET experiments.}},
      volume = {106},
      year = {2014}
    }
    
  5. van de Meent, J.-W., Sederman, A. J., Gladden, L. F., & Goldstein, R. E. (2010). Measurement of cytoplasmic streaming in single plant cells by magnetic resonance velocimetry. Journal of Fluid Mechanics, 642, 5–14. https://doi.org/10.1017/S0022112009992187
    @article{vandemeent_jfm_2010,
      author = {van de Meent, Jan-Willem and Sederman, Andy J. and Gladden, Lynn F. and Goldstein, Raymond E.},
      doi = {10.1017/S0022112009992187},
      issn = {0022-1120},
      journal = {Journal of Fluid Mechanics},
      pages = {5--14},
      title = {{Measurement of cytoplasmic streaming in single plant cells by magnetic resonance velocimetry}},
      volume = {642},
      year = {2010}
    }
    
  6. Sultan, E., van de Meent, J.-W., Somfai, E., Morozov, A. N., & van Saarloos, W. (2010). Polymer rheology simulations at the meso- and macroscopic scale. Europhysics Letters, 90(6), 64002. https://doi.org/10.1209/0295-5075/90/64002
    @article{sultan_epl_2010,
      author = {Sultan, Eric and van de Meent, Jan-Willem and Somfai, Ellak and Morozov, Alexander N. and van Saarloos, Wim},
      doi = {10.1209/0295-5075/90/64002},
      issn = {0295-5075},
      journal = {Europhysics Letters},
      month = jun,
      number = {6},
      pages = {64002},
      title = {{Polymer rheology simulations at the meso- and macroscopic scale}},
      volume = {90},
      year = {2010}
    }
    
  7. van de Meent, J.-W., Tuval, I., & Goldstein, R. (2008). Nature’s Microfluidic Transporter: Rotational Cytoplasmic Streaming at High Péclet Numbers. Physical Review Letters, 101(17), 178102. https://doi.org/10.1103/PhysRevLett.101.178102
    @article{vandemeent_prl_2008,
      author = {van de Meent, Jan-Willem and Tuval, Idan and Goldstein, Raymond},
      doi = {10.1103/PhysRevLett.101.178102},
      issn = {0031-9007},
      journal = {Physical Review Letters},
      month = oct,
      number = {17},
      pages = {178102},
      title = {{Nature’s Microfluidic Transporter: Rotational Cytoplasmic Streaming at High P\'{e}clet Numbers}},
      volume = {101},
      year = {2008}
    }
    
  8. Goldstein, R. E., Tuval, I., & van de Meent, J.-W. (2008). Microfluidics of cytoplasmic streaming and its implications for intracellular transport. Proceedings of the National Academy of Sciences of the United States of America, 105(10), 3663–3667. https://doi.org/10.1073/pnas.0707223105

    Found in many large eukaryotic cells, particularly in plants, cytoplasmic streaming is the circulation of their contents driven by fluid entrainment from particles carried by molecular motors at the cell periphery. In the more than two centuries since its discovery, streaming has frequently been conjectured to aid in transport and mixing of molecular species in the cytoplasm and, by implication, in cellular homeostasis, yet no theoretical analysis has been presented to quantify these processes. We show by a solution to the coupled dynamics of fluid flow and diffusion appropriate to the archetypal "rotational streaming" of algal species such as Chara and Nitella that internal mixing and the transient dynamical response to changing external conditions can indeed be enhanced by streaming, but to an extent that depends strongly on the pitch of the helical flow. The possibility that this may have a developmental consequence is illustrated by the coincidence of the exponential growth phase of Nitella and the point of maximum enhancement of those processes.

    @article{goldstein_pnas_2008,
      author = {Goldstein, Raymond E and Tuval, Idan and van de Meent, Jan-Willem},
      doi = {10.1073/pnas.0707223105},
      issn = {1091-6490},
      journal = {Proceedings of the National Academy of Sciences of the United States of America},
      month = mar,
      number = {10},
      pages = {3663--3667},
      pmid = {18310326},
      title = {{Microfluidics of cytoplasmic streaming and its implications for intracellular transport.}},
      volume = {105},
      year = {2008}
    }
    
  9. van de Meent, J.-W., Morozov, A., Somfai, E., Sultan, E., & van Saarloos, W. (2008). Coherent structures in dissipative particle dynamics simulations of the transition to turbulence in compressible shear flows. Physical Review E, 78(1), 015701. https://doi.org/10.1103/PhysRevE.78.015701
    @article{vandemeent_pre_2008,
      author = {van de Meent, Jan-Willem and Morozov, Alexander and Somfai, Ell\'{a}k and Sultan, Eric and van Saarloos, Wim},
      doi = {10.1103/PhysRevE.78.015701},
      issn = {1539-3755},
      journal = {Physical Review E},
      month = jul,
      number = {1},
      pages = {015701},
      title = {{Coherent structures in dissipative particle dynamics simulations of the transition to turbulence in compressible shear flows}},
      volume = {78},
      year = {2008}
    }
    
  10. Fenistein, D., van de Meent, J.-W., & van Hecke, M. (2006). Core Precession and Global Modes in Granular Bulk Flow. Physical Review Letters, 96(11), 118001. https://doi.org/10.1103/PhysRevLett.96.118001
    @article{fenistein_prl_2006,
      author = {Fenistein, Denis and van de Meent, Jan-Willem and van Hecke, Martin},
      doi = {10.1103/PhysRevLett.96.118001},
      issn = {0031-9007},
      journal = {Physical Review Letters},
      month = mar,
      number = {11},
      pages = {118001},
      title = {{Core Precession and Global Modes in Granular Bulk Flow}},
      volume = {96},
      year = {2006}
    }
    
  11. Fenistein, D., van de Meent, J., & van Hecke, M. (2004). Universal and Wide Shear Zones in Granular Bulk Flow. Physical Review Letters, 92(9), 094301. https://doi.org/10.1103/PhysRevLett.92.094301
    @article{fenistein_prl_2004,
      author = {Fenistein, Denis and van de Meent, Jan and van Hecke, Martin},
      doi = {10.1103/PhysRevLett.92.094301},
      issn = {0031-9007},
      journal = {Physical Review Letters},
      month = mar,
      number = {9},
      pages = {094301},
      title = {{Universal and Wide Shear Zones in Granular Bulk Flow}},
      volume = {92},
      year = {2004}
    }
    

Workshop

  1. Sennesh, E., Ścibior, A., Wu, H., & van de Meent, J.-W. (2019). Model and Inference Combinators for Deep Probabilistic Programming. POPL Workshop on Languages for Inference (LAFI).
    @inproceedings{vandemeent_lafi_2019,
      author = {Sennesh, Eli and \'Scibior, Adam and Wu, Hao and {van de Meent}, Jan-Willem},
      booktitle = {POPL Workshop on Languages for Inference (LAFI)},
      title = {{Model and Inference Combinators for Deep Probabilistic Programming}},
      year = {2019}
    }
    
  2. Seaman, I. R., van de Meent, J.-W., & Wingate, D. (2018). Modeling Theory of Mind for Autonomous Agents with Probabilistic Programs. ICML 2019 Workshop on Imitation, Intention, and Interaction (I3).

    As autonomous agents become more ubiquitous, they will eventually have to reason about the mental state of other agents, including those agents’ beliefs, desires and goals - so-called theory of mind reasoning. We introduce a collection of increasingly complex theory of mind models of a "chaser" pursuing a "runner", known as the Chaser-Runner model. We show that our implementation is a relatively straightforward theory of mind model that can capture a variety of rich behaviors, which, in turn, increase runner detection rates relative to basic (non-theory-of-mind) models. In addition, our paper demonstrates that (1) using a planning-as-inference formulation based on nested importance sampling results in agents simultaneously reasoning about other agents’ plans and crafting counter-plans, (2) probabilistic programming is a natural way to describe models in which each agent uses complex primitives such as path planners to make decisions, and (3) allocating additional computation to perform nested reasoning about agents results in lower-variance estimates of expected utility.

    @article{seaman2019modeling,
      archiveprefix = {arXiv},
      eprinttype = {arxiv},
      eprint = {1812.01569},
      primaryclass = {cs},
      title = {Modeling {{Theory}} of {{Mind}} for {{Autonomous Agents}} with {{Probabilistic Programs}}},
      journal = {ICML 2019 Workshop on Imitation, Intention, and Interaction (I3)},
      author = {Seaman, Iris Rubi and {van de Meent}, Jan-Willem and Wingate, David},
      month = dec,
      year = {2018},
      keywords = {Computer Science - Artificial Intelligence}
    }
    
  3. Bozkurt, A., Esmaeili, B., Brooks, D. H., Dy, J., & van de Meent, J.-W. (2018). Can VAEs Generate Novel Examples? NeurIPS Workshop on Critiquing and Correcting Trends in Machine Learning.
    @inproceedings{vandemeent_nipscract_2018,
      author = {Bozkurt, Alican and Esmaeili, Babak and Brooks, Dana H. and Dy, Jennifer and {van de Meent}, Jan-Willem},
      booktitle = {NeurIPS Workshop on Critiquing and Correcting Trends in Machine Learning},
      title = {{Can VAEs Generate Novel Examples?}},
      year = {2018}
    }
    
  4. Sennesh, E., Ścibior, A., Wu, H., & van de Meent, J.-W. (2018). Composing Modeling and Inference Operations with Probabilistic Program Combinators. NeurIPS BNP Workshop.
    @inproceedings{vandemeent_nipsbnp_2018,
      author = {Sennesh, Eli and \'Scibior, Adam and Wu, Hao and {van de Meent}, Jan-Willem},
      booktitle = {NeurIPS BNP Workshop},
      title = {{Composing Modeling and Inference Operations with Probabilistic Program Combinators}},
      year = {2018}
    }
    
  5. Janz, D., Paige, B., Rainforth, T., van de Meent, J.-W., & Wood, F. (2016). Probabilistic structure discovery in time series data. NIPS 2016 Workshop on Artificial Intelligence for Data Science.
    @article{janz_nipsw_2016,
      title = {Probabilistic structure discovery in time series data},
      author = {Janz, David and Paige, Brooks and Rainforth, Tom and {van de Meent}, Jan-Willem and Wood, Frank},
      journal = {NIPS 2016 Workshop on Artificial Intelligence for Data Science},
      year = {2016}
    }
    
  6. van de Meent, J.-W., Paige, B., Tolpin, D., & Wood, F. (2016). An Interface for Black Box Learning in Probabilistic Programs. POPL Workshop on Probabilistic Programming Semantics.
    @inproceedings{vandemeent_poplw_2016,
      author = {{van de Meent}, Jan-Willem and Paige, Brooks and Tolpin, David and Wood, Frank},
      booktitle = {POPL Workshop on Probabilistic Programming Semantics},
      title = {{An Interface for Black Box Learning in Probabilistic Programs}},
      year = {2016}
    }
    
  7. Rainforth, T., van de Meent, J.-W., & Wood, F. (2015). Bayesian Optimization for Probabilistic Programs. NIPS Workshop on Black Box Learning and Inference.
    @inproceedings{rainforth_nipsw_2015,
      author = {Rainforth, Tom and {van de Meent}, Jan-Willem and Wood, Frank},
      booktitle = {NIPS Workshop on Black Box Learning and Inference},
      title = {{Bayesian Optimization for Probabilistic Programs}},
      year = {2015}
    }
    
  8. Tolpin, D., Paige, B., van de Meent, J.-W., & Wood, F. (2015). Path Finding under Uncertainty through Probabilistic Inference. Proceedings of the 25th International Conference on Automated Planning and Scheduling, Workshop on Planning and Learning (ICAPS WPAL), 1502.07314.

    We introduce a new approach to solving path-finding problems under uncertainty by representing them as probabilistic models and applying domain-independent inference algorithms to the models. This approach separates problem representation from the inference algorithm and provides a framework for efficient learning of path-finding policies. We evaluate the new approach on the Canadian Traveler Problem, which we formulate as a probabilistic model, and show how probabilistic inference allows high performance stochastic policies to be obtained for this problem.

    @inproceedings{tolpin_icapsw_2015,
      author = {Tolpin, David and Paige, Brooks and {van de Meent}, Jan-Willem and Wood, Frank},
      booktitle = {Proceedings of the 25th International Conference on Automated Planning and Scheduling, Workshop on Planning and Learning (ICAPS WPAL)},
      archiveprefix = {arXiv},
      arxivid = {1502.07314},
      eprint = {1502.07314},
      pages = {1502.07314},
      title = {{Path Finding under Uncertainty through Probabilistic Inference}},
      year = {2015}
    }
    
  9. van de Meent, J.-W., Yang, H., & Wood, F. (2014). Particle Gibbs with Ancestor Resampling for Probabilistic Programs. 3rd NIPS Workshop on Probabilistic Programming.
    @inproceedings{vandemeent_nipsw_2014,
      author = {{van de Meent}, Jan-Willem and Yang, Hongseok and Wood, Frank},
      booktitle = {3rd {NIPS} Workshop on Probabilistic Programming},
      title = {{Particle Gibbs with Ancestor Resampling for Probabilistic Programs}},
      year = {2014}
    }
    

Reports

  1. Rainforth, T., Zhou, Y., Lu, X., Teh, Y. W., Wood, F., Yang, H., & van de Meent, J.-W. (2018). Inference Trees: Adaptive Inference with Exploration. ArXiv:1806.09550 [Stat].

    We introduce inference trees (ITs), a new class of inference methods that build on ideas from Monte Carlo tree search to perform adaptive sampling in a manner that balances exploration with exploitation, ensures consistency, and alleviates pathologies in existing adaptive methods. ITs adaptively sample from hierarchical partitions of the parameter space, while simultaneously learning these partitions in an online manner. This enables ITs to not only identify regions of high posterior mass, but also maintain uncertainty estimates to track regions where significant posterior mass may have been missed. ITs can be based on any inference method that provides a consistent estimate of the marginal likelihood. They are particularly effective when combined with sequential Monte Carlo, where they capture long-range dependencies and yield improvements beyond proposal adaptation alone.

    @article{rainforth2018inference,
      archiveprefix = {arXiv},
      eprinttype = {arxiv},
      eprint = {1806.09550},
      primaryclass = {stat},
      title = {Inference {{Trees}}: {{Adaptive Inference}} with {{Exploration}}},
      shorttitle = {Inference {{Trees}}},
      journal = {arXiv:1806.09550 [stat]},
      author = {Rainforth, Tom and Zhou, Yuan and Lu, Xiaoyu and Teh, Yee Whye and Wood, Frank and Yang, Hongseok and {van de Meent}, Jan-Willem},
      month = jun,
      year = {2018},
      keywords = {Statistics - Computation,Statistics - Machine Learning}
    }
    
  2. van de Meent, J.-W., Paige, B., & Wood, F. (2014). Tempering by Subsampling. ArXiv e-Prints, 1401.7145.

    In this paper we demonstrate that tempering Markov chain Monte Carlo samplers for Bayesian models by recursively subsampling observations without replacement can improve the performance of baseline samplers in terms of effective sample size per computation. We present two tempering by subsampling algorithms, subsampled parallel tempering and subsampled tempered transitions. We provide an asymptotic analysis of the computational cost of tempering by subsampling, verify that tempering by subsampling costs less than traditional tempering, and demonstrate both algorithms on Bayesian approaches to learning the mean of a high dimensional multivariate Normal and estimating Gaussian process hyperparameters.

    @article{vandemeent_arxiv_2014,
      author = {van de Meent, Jan-Willem and Paige, Brooks and Wood, Frank},
      journal = {ArXiv e-prints},
      archiveprefix = {arXiv},
      eprint = {1401.7145},
      pages = {1401.7145},
      title = {{Tempering by Subsampling}},
      year = {2014}
    }