Ego4D Challenge 2023
At CVPR 2023, we will host 14 challenges, including 2 new challenges (EgoTracks & PACO Zero-Shot), representing each of Ego4D’s five benchmarks. These are:
Episodic Memory:
- Visual queries with 2D localization (VQ2D) and visual queries with 3D localization (VQ3D): Given an egocentric video clip and an image crop depicting the query object, return the most recent occurrence of the object in the input video, either as a contiguous sequence of bounding boxes (2D + temporal localization) or as the 3D displacement vector from the camera to the object in the environment.
- Natural language queries (NLQ): Given a video clip and a query expressed in natural language, localize the temporal window within the video history where the answer to the question is evident.
- Moments queries (MQ): Given an egocentric video and an activity name (i.e., a “moment”), localize all instances of that activity in the past video.
- EgoTracks: Given an egocentric video and a visual template of an object, localize the bounding box containing the object in each frame of the video along with a confidence score representing the presence of the object. [NEW for 2023]
- PACO Zero-Shot: Retrieve the bounding box of a specific object instance from a dataset, based on a textual query describing the instance. The query is composed of object and part attributes that describe the object of interest. [NEW for 2023]
Hands and Objects:
- Temporal localization: Given an egocentric video clip, temporally localize the key frames that indicate an object state change.
- Object state change classification: Given an egocentric video clip, indicate the presence or absence of an object state change.
Audio-Visual Diarization and Social Interactions:
- Audio-visual speaker diarization: Given an egocentric video clip, identify which person spoke and when they spoke.
- Speech transcription: Given an egocentric video clip, transcribe the speech of each person.
- Talking to me: Given an egocentric video clip, identify whether someone in the scene is talking to the camera wearer.
- Looking at me: Given an egocentric video clip, identify whether someone in the scene is looking at the camera wearer.
Forecasting:
- Short-term hand object prediction: Given a video clip, predict the next active objects and, for each of them, the next action and the time to contact.
- Long-term activity prediction: Given a video clip, predict the sequence of activities that will happen in the future. For example, after kneading dough, list the actions that the baker will do next.
Other Ego4D challenges that are not part of the CVPR 2023 workshop remain open for submissions on the EvalAI website but are not eligible for prizes.
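Several of the tasks above ask for a temporal window (NLQ, MQ) or a bounding box (VQ2D, EgoTracks, PACO Zero-Shot). As an illustration only, the sketch below shows intersection-over-union (IoU), a standard way to measure how well a predicted interval or box overlaps a reference one. The function names and tuple layouts are assumptions made for this example; the official metrics for each challenge are those specified on its EvalAI page and may differ.

```python
from typing import Tuple

# Hypothetical layouts chosen for this sketch (not an official Ego4D format).
Interval = Tuple[float, float]            # (start_sec, end_sec)
Box = Tuple[float, float, float, float]   # (x1, y1, x2, y2), with x2 >= x1 and y2 >= y1


def temporal_iou(pred: Interval, gt: Interval) -> float:
    """Overlap between two time intervals, from 0.0 (disjoint) to 1.0 (identical)."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def box_iou(pred: Box, gt: Box) -> float:
    """Overlap between two axis-aligned boxes, from 0.0 (disjoint) to 1.0 (identical)."""
    inter_w = max(0.0, min(pred[2], gt[2]) - max(pred[0], gt[0]))
    inter_h = max(0.0, min(pred[3], gt[3]) - max(pred[1], gt[1]))
    inter = inter_w * inter_h
    area_pred = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_gt = (gt[2] - gt[0]) * (gt[3] - gt[1])
    union = area_pred + area_gt - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    # A predicted window of 0-10 s against a reference window of 5-15 s.
    print(temporal_iou((0.0, 10.0), (5.0, 15.0)))
    # Two 2x2 boxes offset by one unit in each direction.
    print(box_iou((0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)))
```

A prediction is often counted as correct when its IoU with the reference exceeds a chosen threshold; the threshold and any averaging scheme are task-specific and are not stated here.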
Ego4D challenge participants will use Ego4D’s annotated data set of more than 3,670 hours of video data, capturing the daily-life scenarios of more than 900 unique individuals from nine different countries around the world. Unique train, validation and unannotated test sets are available to download per challenge at https://ego4d-data.org/docs/.
This year's challenge will use Ego4D v2.0, which contains roughly 2x the train and val annotations for Forecasting, Hands & Objects and NLQ, a number of corrections and usability enhancements, and two new related dataset enhancements (PACO & EgoTracks). The test set remains the same as in previous versions of the challenge. More details can be found here. We have also updated the baselines for the NLQ, MQ, VQ2D and forecasting tasks, leveraging the additional training data available in the Ego4D v2.0 release.
Participate in the contest by registering on the EvalAI challenge page and creating a team. All participants must register as part of a “participating team” on EvalAI to ensure that the submission limits are honored. Participants will upload their predictions in the format specified for the specific challenge, and submissions will be evaluated on AWS instances by comparing them against the ground truth. Instructions for training, local evaluation, and online submission are provided on EvalAI. Please refer to the individual EvalAI page for each challenge for submission guidelines, task specifications, and evaluation criteria.
The challenge will launch on March 1, 2023, with the leaderboard closing on May 19, 2023. Winners will be announced at the Joint International 3rd Ego4D and 11th EPIC Workshop at CVPR 2023. Top-performing teams may be invited to speak at the workshop.
Competition Rules and Prize Information
Competition rules can be found here. Additionally, we are thrilled that FAIR is able to offer the following prizes per challenge:
- First place: $1500
- Second place: $1000
- Third place: $500
In addition to the submission on EvalAI, participants must submit a report describing their method to the workshop CMT (link TBD). Alongside your method and results, please remember to include examples of positive and negative results (limitations) of your model. These validation reports will be evaluated by challenge hosts from the Ego4D consortium before winners are determined. For all successful submissions, the challenge validation reports, the research code from winning entries, and the names of participants on the winning teams must also be shared publicly with the research community.
The Ego4D challenge would not have been possible without the infrastructure and support of the EvalAI team. Thank you!
- Suyog Jain
- Rohit Girdhar
- Andrew Westbury
- Santhosh Kumar Ramakrishnan
- Chen Zhao
- Merey Ramazanova
- Satwik Kottur
- Mengmeng Xu
- Vincent Cartillier
- Yifei Huang
- Qichen Fu
- Siddhant Bansal
- Hao Jiang
- Vamsi Ithapu
- Jachym Kolar
- Christian Fuegen
- Leda Sari
- Eric Zhongcong Xu
- Zachary Chavis
- Wenqi Jia
- Miao Liu
- Antonino Furnari
- Francesco Ragusa
- Tushar Nagarajan
- Dima Damen
- Giovanni Maria Farinella
- Michael Wray
- Hao Tang
- Kevin Liang
- Weiyao Wang
- Vladan Petrovic
- Anmol Kalia
- Vignesh Ramanathan
- Dhruv Mahajan
- Gene Byrne
- Matt Feiszli
- Kristen Grauman
Past Challenges / Winners
ECCV Workshop 2022 (Oct 24, 2022)
CVPR Workshop 2022 (June 19, 2022)