Jesse Anderton

Current Research

Practitioners of Machine Learning and related fields commonly seek embeddings of object collections into some Euclidean space. These embeddings are useful for dimensionality reduction, for data visualization, as concrete representations of abstract notions of similarity for similarity search, or as features for a downstream learning task such as web search or sentiment analysis. A wide array of such techniques exists, ranging from traditional (PCA, MDS) to trendy (word2vec, deep learning).

Most existing techniques depend on exact numerical input of some form: object features, distance estimates, etc. For datasets derived from user behavior or opinion, however, such quantitative data is often too sparse, too noisy, or too difficult to interpret for us to rely on the exact values. Rather than trusting the numerical values of similarity or distance estimates, one can rely on their ordering, e.g. by sorting all items by their estimated similarity to a given query item and discarding the (unreliable) numerical similarities themselves. The relationship between objects is thereby reduced to a set of ordinal triplets, such as "object a is closer to object b than to object c." One can then seek an embedding which preserves this ordinal information, and try to prove guarantees on how closely such an embedding must recover the true distances or similarities between items.
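To make the notion of an ordinal triplet concrete, here is a minimal sketch in Python. It is an illustration only, not the method from any of the papers below: the function names (`triplets_from_distances`, `fraction_preserved`) and the toy points are assumptions chosen for this example. It enumerates every triplet "a is closer to b than to c" implied by a distance matrix, then measures what fraction of those triplets a candidate embedding preserves.

```python
import numpy as np


def pairwise_distances(points):
    """Euclidean distance matrix for an (n, d) array of points."""
    diff = points[:, None, :] - points[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def triplets_from_distances(dist):
    """All triplets (a, b, c) of distinct items with dist[a, b] < dist[a, c].

    Brute force over all n^3 index combinations, so only suitable for tiny n.
    """
    n = dist.shape[0]
    triplets = []
    for a in range(n):
        for b in range(n):
            for c in range(n):
                if len({a, b, c}) == 3 and dist[a, b] < dist[a, c]:
                    triplets.append((a, b, c))
    return triplets


def fraction_preserved(points, triplets):
    """Fraction of the given triplets that hold among the embedded points."""
    if not triplets:
        return 1.0
    dist = pairwise_distances(points)
    kept = sum(bool(dist[a, b] < dist[a, c]) for a, b, c in triplets)
    return kept / len(triplets)


# Hypothetical toy data: four points in the plane with no tied distances.
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [4.0, 4.0]])
triplets = triplets_from_distances(pairwise_distances(points))

# Uniform scaling changes every distance but no ordering, so all triplets
# still hold: ordinal data alone cannot determine the scale of an embedding.
scaled = 2.5 * points
```

The scaled copy illustrates why guarantees for ordinal embeddings are naturally stated in terms of distance ratios or orderings rather than absolute distances.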

My research has explored several key questions for producing large-scale ordinal embeddings, including:

What is the minimal subset of ordinal data sufficient to preserve ordering perfectly?
How and when can distance information be inferred from ordinal information? For instance, when can we say that if some set of triplets holds between objects a, b, c, ... in ℝ^d, then (e.g.) the ratio of distances ||a - b|| / ||a - c|| must lie within some small interval?
How can we obtain ordinal embeddings of n objects into ℝ^d in O(n d log n) time or better, with good accuracy guarantees?

I have answered these and other questions, and am now working to publish the results and apply them to large-scale user behavior datasets.

Publications

Scaling Up Ordinal Embedding: A Landmark Approach. [Under review]
Jesse Anderton and Javed A. Aslam. 2019.
Pure Exploration in Infinitely-Armed Bandit Models with Fixed-Confidence [PDF]
Maryam Aziz, Jesse Anderton, Emilie Kaufmann, and Javed A. Aslam
Algorithmic Learning Theory (ALT) 2018.
A Modification of LambdaMART to Handle Noisy Crowdsourced Assessments [PDF]
Pavel Metrikov, Jie Wu, Jesse Anderton, Virgil Pavlu, and Javed A. Aslam
In Proceedings of the 2013 Conference on the Theory of Information Retrieval (ICTIR '13).
An analysis of crowd workers mistakes for a specific and complex relevance assessment task [PDF]
Jesse Anderton, Maryam Bashir, Virgil Pavlu, and Javed A. Aslam
In Proceedings of the 22nd ACM International Conference on Information & Knowledge Management (CIKM '13).
A document rating system for preference judgements [PDF]
Maryam Bashir, Jesse Anderton, Jie Wu, Peter B. Golbus, Virgil Pavlu, and Javed A. Aslam
In Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval (SIGIR '13).
Northeastern University Runs at the TREC12 Crowdsourcing Track [PDF]
Maryam Bashir, Jesse Anderton, Jie Wu, Matthew Ekstrand-Abueg, Peter B. Golbus, Virgil Pavlu, Javed A. Aslam
In Online Proceedings of TREC, 2012.

Preprints

(Summaries available upon request)

Revealing the basis: Ordinal embedding through geometry [PDF]
Jesse Anderton, Virgil Pavlu, and Javed A. Aslam
arXiv.
Triple Selection for Ordinal Embedding
Jesse Anderton, Virgil Pavlu, and Javed A. Aslam
Measuring Human-perceived Similarity in Heterogeneous Collections [PDF]
Jesse Anderton, Pavel Metrikov, Virgil Pavlu, and Javed A. Aslam
arXiv.

In Preparation

(Summaries available upon request)

Fast Geometric Ordinal Embedding
Jesse Anderton, Michaël Perrot, Damien Garreau, and Ulrike von Luxburg
Density-Sensitive Coverings and Packings
Jesse Anderton and Javed A. Aslam
Adaptively Pruning Features for Boosted Decision Trees
Maryam Aziz, Jesse Anderton, and Javed A. Aslam
Offline evaluation of music recommendations
Zahra Nazari, Jesse Anderton, and Benjamin Carterette
Improved ranking for music recommendations
Jesse Anderton, Zahra Nazari, Joe Cauteruccio, Jason Uh, Benjamin Carterette, and Fernando Diaz
