EgoPER Dataset for Procedural Error Understanding
- EgoPER is an egocentric dataset for error understanding in procedural videos. It contains multimodal data (RGB, depth, audio, gaze and hands) along with annotations of steps and bounding boxes of objects and active objects. The dataset contains both normal and error videos from 5 different cooking tasks.
The dataset can be downloaded from the link below. When using the dataset in your work, you should cite the following paper:
S. Lee, Z. Lu, Z. Zhang, M. Hoai and E. Elhamifar, Error Detection in Egocentric Procedural Task Videos, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
Dataset Page
ProceL Dataset for Learning from Instructional Videos
- ProceL is a multimodal procedural learning dataset for research on instructional video understanding.
- The dataset consists of 720 annotated videos, 47.3 hours in total, covering 12 diverse tasks. For every task, an instruction grammar is built, and each video is annotated with the start and end times of every key-step in the grammar. The dataset can be downloaded from the link below. When using the dataset in your work, you should cite the following paper:
E. Elhamifar and Z. Naing, Unsupervised Procedure Learning via Joint Dynamic Summarization, International Conference on Computer Vision (ICCV), 2019.
Dataset Page