Data


MOSCATO dataset: Predicting Multiple Object State Change through Actions

  • MOSCATO is a new benchmark for predicting the evolving states of multiple objects through long videos consisting of multiple actions. In each video, multiple objects change state, and the state of each object may change several times depending on the actions performed (a minimal annotation sketch follows this entry).

  • The dataset can be downloaded from the link below. When using the dataset in your work, you should cite the following paper:

    P. Zameni, Y. Shen and E. Elhamifar, MOSCATO: Predicting Multiple Object State Change through Actions,
    International Conference on Computer Vision (ICCV), 2025.

    Dataset Page
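  • To make the annotation structure concrete, here is a minimal Python sketch of one way to represent per-object state changes and query an object's state at a given time. The object names, states, timestamps, and the overall format are invented for illustration; the actual MOSCATO schema is documented on the dataset page.

    from bisect import bisect_right

    # Hypothetical annotation: for each object, a time-ordered list of
    # (time_sec, new_state) pairs, each induced by an action in the video.
    annotations = {
        "egg": [(3.0, "whole"), (12.5, "cracked"), (40.0, "cooked")],
        "pan": [(8.0, "empty"), (35.0, "holding-egg")],
    }

    def state_at(obj, t):
        """Return the annotated state of `obj` at time t (seconds),
        or None before its first annotated state."""
        changes = annotations[obj]
        times = [time for time, _ in changes]
        i = bisect_right(times, t)  # number of state changes at or before t
        return changes[i - 1][1] if i else None

    print(state_at("egg", 20.0))  # -> "cracked"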

Multi-task Egocentric Kitchen Activities (MEKA) dataset

  • MEKA is an egocentric dataset for multi-task activity understanding and multi-task temporal action segmentation (MT-TAS). Built upon the EgoPER dataset, it contains multi-task videos, where each video consists of interleaved actions/steps from several tasks (illustrated in the sketch below).

  • The dataset can be downloaded from the link below. When using the dataset in your work, you should cite the following paper:

    Y. Shen and E. Elhamifar, Error Detection in Egocentric Procedural Task Videos,
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2025.

    Dataset Page
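  • As an illustration of what "interleaved" means here, the following Python sketch groups a stream of hypothetical segment annotations back into per-task timelines, which is the grouping an MT-TAS model must implicitly perform. The segment format and the task/step names are invented; consult the dataset page for the actual schema.

    from collections import defaultdict

    # Hypothetical segments: (start_sec, end_sec, task, step), interleaved
    # across tasks within a single video.
    segments = [
        (0.0, 14.0, "make coffee", "grind beans"),
        (14.0, 30.0, "make oatmeal", "boil water"),
        (30.0, 48.0, "make coffee", "brew"),
        (48.0, 60.0, "make oatmeal", "add oats"),
    ]

    # Recover one timeline per task from the interleaved stream.
    per_task = defaultdict(list)
    for start, end, task, step in segments:
        per_task[task].append((start, end, step))

    for task, steps in sorted(per_task.items()):
        print(task, steps)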

EgoPER Dataset for Procedural Error Understanding

  • EgoPER is an egocentric dataset for error understanding in procedural videos. It contains multimodal data (RGB, depth, audio, gaze, and hands) along with annotations of steps and bounding boxes of objects and active objects. The dataset contains both normal and error videos from 5 different cooking tasks (a sample loading sketch follows this entry).

  • The dataset can be downloaded from the link below. When using the dataset in your work, you should cite the following paper:

    S. Lee, Z. Lu, Z. Zhang, M. Hoai and E. Elhamifar, Error Detection in Egocentric Procedural Task Videos,
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024.

    Dataset Page
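  • The Python sketch below shows one plausible way such step and object-box annotations could be loaded and the erroneous steps filtered out. The file names and JSON schema are assumptions made for illustration only; the actual EgoPER layout is described on the dataset page.

    import json
    from pathlib import Path

    def load_annotations(video_dir):
        """Load hypothetical step and object-box annotations for one video."""
        video_dir = Path(video_dir)
        with open(video_dir / "steps.json") as f:
            # e.g. [{"step": ..., "start": ..., "end": ..., "is_error": ...}]
            steps = json.load(f)
        with open(video_dir / "boxes.json") as f:
            # e.g. {frame: [{"label": ..., "active": ..., "bbox": [...]}]}
            boxes = json.load(f)
        return steps, boxes

    def error_steps(steps):
        """Return only the steps annotated as erroneous."""
        return [s for s in steps if s.get("is_error")]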

ProceL Dataset for Learning from Instructional Videos

  • ProceL is a multimodal procedural learning dataset for research on instructional video understanding.

  • The dataset consists of 47.3 hours of annotated video across 720 videos spanning 12 diverse tasks. For every task, an instruction grammar is built, and each video is annotated with the beginning and ending times of every key-step in the grammar (see the sketch at the end of this entry). The dataset can be downloaded from the link below. When using the dataset in your work, you should cite the following paper:

    E. Elhamifar and Z. Naing, Unsupervised Procedure Learning via Joint Dynamic Summarization,
    International Conference on Computer Vision (ICCV), 2019.

    Dataset Page
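  • To illustrate how key-step annotations relate to the task grammar, the short Python sketch below aggregates annotated time per key-step and reports which grammar steps a given video skips. The grammar and segments are invented stand-ins; the real annotation files are described on the dataset page.

    from collections import defaultdict

    # Hypothetical key-step grammar for one task, and one video's annotated
    # (begin_sec, end_sec, key_step) segments.
    grammar = ["position jack", "raise car", "remove old tire", "mount spare"]
    video_segments = [
        (12.0, 30.5, "position jack"),
        (30.5, 55.0, "raise car"),
        (60.0, 95.0, "remove old tire"),
    ]

    # Total annotated time per key-step, and grammar steps never observed.
    duration = defaultdict(float)
    for begin, end, step in video_segments:
        duration[step] += end - begin

    missing = [s for s in grammar if s not in duration]
    print(dict(duration), missing)  # "mount spare" is skipped in this video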