|
|
The Mathematical Data Science Lab in the Khoury College of Computer Sciences at Northeastern University is led by Dr. Ehsan Elhamifar. Our research focuses on computer vision, machine learning and AI. Current research in the lab includes in-the-wild video understanding, procedure learning from videos and multimodal data, low-shot, weakly- and self-supervised learning, fine-grained and human-object interaction recognition, adversarial attacks and defenses for fine-grained and multi-label models, and structured data summarization.
|
Learning Procedural Activities from Videos and Multimodal Data
Many human activities (e.g., cooking recipes, assembling and repairing devices, surgeries) are procedural. Such activities consist of multiple steps that need to be performed in certain ways to achieve the desired outcomes. We develop recognition, segmentation and anticipation methods to understand and learn procedural tasks using videos and multimodal data. We use these methods to develop virtual (e.g., AR/VR) assistants that guide users with different skill levels through tasks.
|
In-the-Wild Video Understanding
Real-world videos are often long and untrimmed and consist of many actions with long-range temporal dependencies. We develop activity understanding methods for such in-the-wild videos. We design architectures that capture long-range temporal dependencies of actions, run efficiently on very long videos, and handle large appearance and motion variations of actions across videos. We develop video understanding techniques that learn from different levels of supervision and from a small number of training or annotated videos.
|
Low-Shot, Weakly-Supervised and Self-Supervised Learning
While deep neural networks (DNNs) have become the state of the art for many tasks, they typically require a large number of annotated training samples to work well. This is a particular bottleneck in tasks where few annotated training samples are available or where labeling is laborious or costly. We develop methods to efficiently train DNNs and learn data representations for a variety of tasks using minimal supervision and/or training data. We investigate zero- and few-shot, weakly- and self-supervised methods using new sequence alignment, deep attention and multimodal learning methods.
|
Human-Object Interaction (HOI) and Fine-Grained Recognition
Some recognition problems require distinguishing classes that are visually very similar (fine-grained recognition) or classes that are described as a composition of different entities, such as a verb/action performed on an object (HOI recognition). The high visual similarity between classes in fine-grained recognition, together with the large number of possible HOIs and the lack of sufficient data for many of them, makes recognition very challenging. We investigate and develop methods for fine-grained and HOI recognition that address these challenges by designing new architectures and efficient training methods.
|
Adversarial Attacks and Defenses
Despite achieving high performance across many tasks, deep neural networks (DNNs) are vulnerable to adversarial attacks, which are imperceptible perturbations of input data that lead to drastically different predictions. We study vulnerabilities of DNNs for important yet less-studied tasks such as fine-grained recognition and multi-label learning. We develop efficient and generalizable attacks and then investigate how to make DNNs robust to these attacks by designing effective defense mechanisms.
|
Structured Summarization of Large Data
We are constantly capturing data using various sensors. Not all such data provide useful, actionable information for learning and decision making. We design robust and scalable data summarization methods that handle structured dependencies in massive and complex data, adapt to tasks, and require minimal or no supervision. We combine efficient and scalable optimization and deep learning to develop algorithms, analyze their performance and apply them to real-world tasks such as summarization of long videos.
|
|