On building a multi-agent Blender scene generation system (Part 1)

Repository URL: multi-agent-scene-simulator

Inspired by Anthropic’s blog post “How We Built Our Multi-Agent Research System”, this project explores a similar agentic architecture: a lead agent breaks down user requirements into sub-tasks, and sub-agents work on those sub-tasks. Each sub-agent is constrained to operate within its specific task scope. In Anthropic’s multi-agent research system, for example, the search sub-agents are only capable of browsing the internet.

I decided to apply this design to a different domain: Blender scene generation.

Tech stack involved:

blender-mcp – Enables communication between agents and Blender
DSPy – Helps systematically optimize agents

Design: implemented at the “agentic” layer

Lead Agent (lead_agent.py): Decomposes scene requirements into sub-tasks
Blender Code Generator (blender_code_generator.py): Converts natural language instructions into Python code
Scene Evaluator (scene_evaluator.py): Assesses scene quality and identifies missing components

Execution flow

Agentic execution logic is handled by the “services” layer (services/executor.py)
The Context Manager stores each agent’s execution traces, such as tool results, generated sub-tasks, and generated code.
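
As a rough illustration of how an executor and a trace store could fit together, here is a minimal sketch. The names TraceEntry, ContextManager.record, run_scene, and the stub agents are all assumptions made for this example; they are not taken from services/executor.py, and the real agents would be LLM calls rather than plain functions.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class TraceEntry:
    kind: str      # e.g. "subtask", "code", "tool_result"
    content: str


@dataclass
class ContextManager:
    """Hypothetical trace store; not the repository's actual class."""
    traces: List[TraceEntry] = field(default_factory=list)

    def record(self, kind: str, content: str) -> None:
        self.traces.append(TraceEntry(kind, content))

    def by_kind(self, kind: str) -> List[str]:
        return [t.content for t in self.traces if t.kind == kind]


def run_scene(
    requirement: str,
    decompose: Callable[[str], List[str]],   # stands in for the lead agent
    generate_code: Callable[[str], str],     # stands in for the code generator
    execute: Callable[[str], str],           # stands in for the Blender tool call
    ctx: ContextManager,
) -> ContextManager:
    # Each sub-task is generated, executed, and recorded in order.
    for subtask in decompose(requirement):
        ctx.record("subtask", subtask)
        code = generate_code(subtask)
        ctx.record("code", code)
        ctx.record("tool_result", execute(code))
    return ctx


# Stub agents so the sketch runs without an LLM or Blender.
def fake_decompose(req: str) -> List[str]:
    return [f"add {part.strip()}" for part in req.split(" and ")]


def fake_generate(subtask: str) -> str:
    return f"# bpy code for: {subtask}"


def fake_execute(code: str) -> str:
    return "ok"


ctx = run_scene("a table and a lamp", fake_decompose, fake_generate,
                fake_execute, ContextManager())
```

Passing the agents in as callables keeps the loop testable with stubs; swapping in real agents would not change the recording logic.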

Lead agent prompt optimization process

Scene description sources: Crawled from blenderkit.com
LLM as a judge: Uses a vision-capable LLM to return a prompt-image match score from 0 (not a match) to 1 (perfect match).
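
The judge score can be wrapped as a metric function. The sketch below follows the (example, pred, trace=None) calling convention that DSPy metrics use, but everything else is an assumption: the judge callable, the "description" and "image" keys, and the 0.5 threshold are hypothetical, and plain dicts stand in for DSPy objects so the block runs on its own.

```python
from typing import Any, Callable, Mapping, Optional, Union


def clamp_score(raw: float) -> float:
    """Force a judge's raw output into the 0..1 range."""
    return max(0.0, min(1.0, float(raw)))


def make_metric(
    judge: Callable[[str, bytes], float],  # hypothetical vision-LLM call
    threshold: float = 0.5,                # assumed pass/fail cutoff
) -> Callable[..., Union[float, bool]]:
    def metric(
        example: Mapping[str, Any],
        pred: Mapping[str, Any],
        trace: Optional[Any] = None,
    ) -> Union[float, bool]:
        score = clamp_score(judge(example["description"], pred["image"]))
        # When a trace is passed, return a pass/fail decision instead of
        # the raw score; otherwise return the score for evaluation.
        if trace is not None:
            return score >= threshold
        return score

    return metric


# A stub judge that misbehaves on purpose to show the clamping.
overconfident = make_metric(lambda description, image: 1.7)
negative = make_metric(lambda description, image: -0.2)

sample = {"description": "a wooden table"}
render = {"image": b""}
high = overconfident(sample, render)
low = negative(sample, render)
passed = overconfident(sample, render, trace=[])
```

Clamping guards against a judge that replies outside the expected range, which would otherwise skew averaged scores.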

Thoughts

Inconsistent scene generation

The executor’s current feedback loop does not give sub-agents enough context about the whole execution flow. I define a sufficient feedback loop as one that contains all agent traces, tool execution results, AND the current environment state.
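
One way to picture such a feedback payload is a single structure that carries all three ingredients. This is only a sketch under assumptions: build_feedback, its field names, and the max_items truncation are hypothetical, and the scene state is a hand-written dict rather than something read from Blender.

```python
import json
from typing import Any, Dict, List


def build_feedback(
    agent_traces: List[str],
    tool_results: List[str],
    scene_state: Dict[str, Any],
    max_items: int = 5,  # assumed cap to keep the prompt short
) -> str:
    """Bundle traces, tool results, and scene state into one JSON string."""
    payload = {
        "agent_traces": agent_traces[-max_items:],
        "tool_results": tool_results[-max_items:],
        "scene_state": scene_state,
    }
    return json.dumps(payload, indent=2, sort_keys=True)


feedback = build_feedback(
    agent_traces=["subtask: add a table", "code: # bpy code for table"],
    tool_results=["ok"],
    scene_state={"objects": ["Table"], "lights": 0},
)
```

A sub-agent receiving this string would see not only what was attempted but also what currently exists in the scene, which is the piece missing from a trace-only loop.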

Optimization issue

DSPy’s optimizers currently work only on a single module. In a multi-agent setup, this approach is hard to adapt because our primary goal is to optimize only the lead agent, whose responsibility is breaking down scene requirements into sub-tasks. The code generator sub-agents, on the other hand, simply act as “soldiers”: they generate code and do not need optimization.
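
To make the goal concrete, the sketch below treats the sub-agents as a fixed pipeline and varies only the lead agent's prompt, scoring each candidate end to end. This is a plain brute-force search written for illustration, not a DSPy optimizer; pick_lead_prompt, toy_pipeline, and toy_metric are hypothetical names, and the toy pipeline merely counts words.

```python
from typing import Any, Callable, List, Tuple


def pick_lead_prompt(
    candidates: List[str],
    dataset: List[Any],
    run_pipeline: Callable[[str, Any], Any],  # lead prompt + example -> output
    metric: Callable[[Any, Any], float],      # example + output -> score
) -> Tuple[str, float]:
    """Return the candidate prompt with the best average end-to-end score."""
    if not candidates or not dataset:
        raise ValueError("need at least one candidate prompt and one example")
    best_prompt, best_score = candidates[0], float("-inf")
    for prompt in candidates:
        # Sub-agents stay fixed inside run_pipeline; only the prompt varies.
        total = sum(metric(ex, run_pipeline(prompt, ex)) for ex in dataset)
        avg = total / len(dataset)
        if avg > best_score:
            best_prompt, best_score = prompt, avg
    return best_prompt, best_score


# Toy stand-ins so the sketch runs without any LLM.
def toy_pipeline(prompt: str, example: Any) -> int:
    return len(prompt.split())  # pretend longer prompts yield more sub-tasks


def toy_metric(example: Any, output: int) -> float:
    return 1.0 if output >= example["min_subtasks"] else 0.0


best, score = pick_lead_prompt(
    ["Split the scene.", "Split the scene into one sub-task per object."],
    [{"min_subtasks": 5}],
    toy_pipeline,
    toy_metric,
)
```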