SkillHub

arxiv-gamedevbench-evaluating-agentic-capabili

v1.0.0

Learned from the arXiv paper "GameDevBench: Evaluating Agentic Capabilities Through Game Development". Use this skill to scaffold Node.js experiments based on the paper's method.

Sourced from ClawHub, authored by WANGJUNJIE

Installation

Please help me install the skill `arxiv-gamedevbench-evaluating-agentic-capabili` from SkillHub official store.

npx skills add wanng-ide/arxiv-gamedevbench-evaluating-agentic-capabili

Source

  • Paper key: 44f3ad505bee7a5c25a60d2a3686cb7e
  • Title: GameDevBench: Evaluating Agentic Capabilities Through Game Development
  • Categories: cs.AI, cs.CL, cs.SE

Learned insight

Despite rapid progress on coding agents, progress on their multimodal counterparts has lagged behind. A key challenge is the scarcity of evaluation testbeds that combine the complexity of software development with the need for deep multimodal understanding. Game development provides such a testbed as agents must navigate large, dense codebases while manipulating intrinsically multimodal assets such as shaders, sprites, and animations within a visual game scene. We present GameDevBench, the first

Node.js implementation entry

node {baseDir}/scripts/run.js
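
The `{baseDir}` placeholder in the command above stands for the directory where the skill is installed. As a minimal sketch, assuming the entry script sits at `scripts/run.js` under that directory, the expansion and launch could look like the following. The helper names `resolveEntry` and `runEntry` are hypothetical and are not part of the skill itself.

```javascript
const path = require("node:path");
const { spawnSync } = require("node:child_process");

// Hypothetical helper: expand the {baseDir} placeholder into a concrete
// path to the skill's entry script.
function resolveEntry(baseDir) {
  return path.join(baseDir, "scripts", "run.js");
}

// Hypothetical helper: launch the entry script with the same Node.js binary
// that is running this process, and capture its output as UTF-8 text.
function runEntry(baseDir, args = []) {
  const entry = resolveEntry(baseDir);
  return spawnSync(process.execPath, [entry, ...args], { encoding: "utf8" });
}
```

Calling `runEntry` with the skill's install directory should behave like running the entry command by hand; the returned object exposes `status`, `stdout`, and `stderr` for inspection.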