Ego2Web: Grounding Web Agents in Egocentric Video Benchmarks
A new benchmark **connects egocentric video perception with web execution** tasks, bridging the gap between real-world vision and the web navigation capabilities of AI agents. The dataset spans **e-commerce, media retrieval, knowledge lookup, and maps**, with more than 50% of tasks in e-commerce; tasks are generated via an LLM pipeline and verified by humans. Ego2WebJudge, an **automated evaluation method, scores agent performance** by having an LLM assess task keypoints against video evidence and screenshots.
More in Pivot 5
OpenAI Shuts Down Sora, Refocuses on Code AGI and Model Spud
OpenAI **discontinues Sora and all video generation products** to redeploy compute resources toward competing with Anthropic in enterprise coding and knowledge work. CEO Sam Altman **narrows his role to focus on capital, supply chains, and data centers**; Fidji Simo's product division becomes the 'AGI Deployment' team. A new model, 'Spud', has completed pre-training, with expectations that it will 'accelerate the economy'; **Disney cancels its $1B investment partnership** following the Sora shutdown.