Scaling Approximate Inference and Approximation-Aware Learning
Tue 10.17.17
The last decade has seen an enormous increase in our ability to gather and manage large amounts of data; business, healthcare, education, economics, science, and almost every other sector of society are accumulating data at unprecedented levels. The basic premise is that with more data, even if uncertain and of lower quality, we can make better-informed decisions. To make any decision, we need to perform “inference” over the data, i.e., either draw new conclusions or find support for existing hypotheses, thus allowing us to favor one course of action over another. However, general reasoning under uncertainty is highly intractable, and many state-of-the-art systems today perform approximate inference by resorting to sampling. Thus, for many modern applications (such as information extraction, knowledge aggregation, question answering, computer vision, and machine intelligence), inference is a key bottleneck, and new methods for tractable approximate inference are needed.
This project addresses the challenge of scaling inference by generalizing two highly scalable approximate inference methods and complementing them with scalable methods for parameter learning that are “approximation-aware”: instead of treating (i) learning and (ii) inference as separate steps, the project uses the approximation methods developed for inference to also learn the model. The research hypothesis is that this approach increases end-to-end prediction accuracy while simultaneously improving scalability. Concretely, the project develops the theory, together with scalable algorithms and optimization methods, for at least the following four sub-problems: (1) approximating general probabilistic conjunctive queries with standard relational databases (see the first sketch below); (2) learning the probabilities in an uncertain database from feedback on rankings of the output tuples of general queries; (3) approximating exact probabilistic inference in undirected graphical models with linearized update equations (see the second sketch below); and (4) complementing the latter with a robust framework for learning linearized potentials from partially labeled data.
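To make sub-problem (1) concrete: under the standard tuple-independent semantics of probabilistic databases, the probability of a simple Boolean query such as ∃x R(x) factorizes as 1 − ∏(1 − p_i) over the marginal probabilities of the contributing tuples, exactly the kind of aggregate a standard relational engine can compute; for general conjunctive queries, exact evaluation can be #P-hard, which is what motivates the approximation. The following is a minimal sketch of the tractable base case only; the relation R and its probabilities are invented for illustration and are not taken from the project.

```python
from functools import reduce

# Minimal sketch: P(EXISTS x. R(x)) over a tuple-independent
# probabilistic relation R.  The query is false only if *every*
# tuple is absent, so P(query) = 1 - prod_i (1 - p_i).
# The relation and its probabilities are illustrative.
R = [("a", 0.9), ("b", 0.5), ("c", 0.2)]  # (value, P(tuple present))

def prob_exists(tuples):
    """Probability that at least one tuple is present (independent tuples)."""
    return 1.0 - reduce(lambda acc, vp: acc * (1.0 - vp[1]), tuples, 1.0)

print(prob_exists(R))  # 1 - 0.1 * 0.5 * 0.8 = 0.96
```

Sub-problem (3) refers to replacing the multiplicative message-passing updates of loopy belief propagation with linear updates over centered (residual) beliefs, in the spirit of linearized belief propagation. In its simplest matrix form, omitting the echo-cancellation term, one iterates B ← E + A·B·H over centered beliefs B, centered priors E, the graph’s adjacency matrix A, and a centered coupling matrix H; a sufficient condition for convergence is that the spectral radii satisfy ρ(A)·ρ(H) < 1. The sketch below is an illustrative implementation under these assumptions, not the project’s actual formulation; the toy matrices are invented.

```python
import numpy as np

def linearized_bp(A, E, H, iters=100):
    """Iterate the linearized update B <- E + A @ B @ H to a fixed point.

    A: (n, n) symmetric adjacency matrix of the graphical model.
    E: (n, k) centered prior beliefs (each row sums to 0).
    H: (k, k) centered coupling matrix, scaled so rho(A) * rho(H) < 1.
    Returns (n, k) centered posterior beliefs.
    """
    B = E.copy()
    for _ in range(iters):
        B = E + A @ B @ H  # linear, so cheap (sparse) matrix products suffice
    return B

# Toy example: a 3-node chain with 2 classes and weak homophilous coupling.
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
E = np.array([[ 0.1, -0.1],   # node 0 leans towards class 0
              [ 0.0,  0.0],   # node 1 is unlabeled
              [-0.1,  0.1]])  # node 2 leans towards class 1
H = 0.1 * np.array([[ 1., -1.],
                    [-1.,  1.]])
print(linearized_bp(A, E, H))
```

Because each iteration is a fixed number of matrix products rather than normalized message updates, this style of linear system is also what makes the learning step in sub-problem (4), fitting the potentials behind H from partially labeled data, tractable at scale.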