Egocentric Procedural ERror (EgoPER) Dataset
Dataset & Code
Shih-Po Lee1 Zijia Lu1 Zekun Zhang2 Minh Hoai2 Ehsan Elhamifar1
1Northeastern University 2Stony Brook University

What is EgoPER

The dataset contains egocentric procedural task videos together with other modalities such as audio, depth, and hand tracking, covering 5 different cooking tasks: pinwheels, coffee, quesadilla, tea, and oatmeal. Besides the correct/normal videos, the EgoPER dataset contains erroneous/abnormal videos spanning 5 error categories: slip, correction, modification, addition, and omission. A minimal sketch of working with the frame-wise annotations follows the lists below.

Characteristics

  • 28 Hours of Recording
  • 5 Cooking Tasks
  • Erroneous Videos
  • 5 Error Types
  • Multiple Modalities
  • Frame-wise Step Labels
  • Object Bounding Boxes
  • Active Object Labels

Challenges

  • Temporal Action Segmentation
  • Procedural Error Detection
  • Action Recognition
  • Active Object Detection
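
The frame-wise step labels make EgoPER directly usable for temporal action segmentation. As a rough illustration, the sketch below groups hypothetical per-frame step labels into temporal segments; the file name and JSON keys are placeholders for illustration, not the dataset's actual schema (please refer to our GitHub repository for the real loaders).

```python
# Minimal sketch (not the official loader): converts hypothetical frame-wise
# step labels into (label, start_frame, end_frame) segments, the form commonly
# used for temporal action segmentation. The file name and JSON keys below are
# assumptions, not the dataset's actual annotation format.
import json
from itertools import groupby


def frames_to_segments(frame_labels):
    """Group consecutive identical per-frame labels into segments."""
    segments, start = [], 0
    for label, run in groupby(frame_labels):
        length = sum(1 for _ in run)
        segments.append((label, start, start + length - 1))
        start += length
    return segments


if __name__ == "__main__":
    # Hypothetical annotation file containing one step label per video frame.
    with open("annotations/pinwheels_example.json") as f:
        ann = json.load(f)
    print(frames_to_segments(ann["frame_step_labels"]))
```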

Error Taxonomy

1) Step Omission
corresponds to skipping one or multiple steps, e.g., not checking the water temperature in the kettle, or not putting bananas on the tortilla.
2) Step Addition
corresponds to performing unnecessary extra steps that are not in the task graph, e.g., pouring sugar onto the tortilla or sprinkling cinnamon into the mug.
3) Step Modification
corresponds to performing a step in a different way than the one specified by the recipe, e.g., scooping nut butter with a spoon or pouring water without a circular motion. This does not necessarily change the outcome of the step.
4) Step Slip
corresponds to executing a step in a way that fails to achieve the goal of the step, e.g., adding water to a different bowl from the one containing the oats, or dropping the tortilla on the floor.
5) Step Correction
corresponds to performing an action to mitigate the effect of a slip error, e.g., transferring water from the second bowl to the one containing the oats, or discarding the tortilla that fell on the floor and picking up a new one.
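
For readers building error detectors on top of this taxonomy, here is a small illustrative sketch, assuming a per-frame error-type annotation (which is not necessarily how EgoPER stores its labels): it encodes the five categories and reduces them to the binary normal-vs-erroneous labels often used for frame-wise error detection.

```python
# Illustrative only: one possible encoding of the five EgoPER error categories,
# plus a reduction of per-frame error-type annotations to binary labels
# (1 = erroneous, 0 = normal). The per-frame annotation format is an assumption.
from enum import Enum


class ErrorType(Enum):
    OMISSION = "omission"
    ADDITION = "addition"
    MODIFICATION = "modification"
    SLIP = "slip"
    CORRECTION = "correction"


def to_binary_labels(frame_error_types):
    """Map per-frame error-type strings (None for normal frames) to 0/1."""
    valid = {e.value for e in ErrorType}
    return [1 if t in valid else 0 for t in frame_error_types]


# Example: two frames belong to a slip and its subsequent correction.
print(to_binary_labels([None, "slip", "correction", None]))  # [0, 1, 1, 0]
```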

Active Object Detection

Dataset Statistics

Download & Code

Please access the dataset, annotations, and code from our GitHub repository.

Citation

If you find EgoPER useful, please cite our CVPR 2024 paper:
@InProceedings{Lee_2024_CVPR,
    author    = {Lee, Shih-Po and Lu, Zijia and Zhang, Zekun and Hoai, Minh and Elhamifar, Ehsan},
    title     = {Error Detection in Egocentric Procedural Task Videos},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {18655-18666}
}

Collaborators & Acknowledgements