scene_service.scene_graph.captioner

Node captioner — pluggable interface for generating captions.

V1 implementation: caption = label (no VLM call, no crops). Future versions will accept crop images and call a VLM to produce richer captions like “a black office chair with wheels near a desk”.

Classes

NodeCaptioner()

Generate a natural-language caption for a SceneGraphNode.

class scene_service.scene_graph.captioner.NodeCaptioner

Bases: object

Generate a natural-language caption for a SceneGraphNode.

The interface is a single async method caption_node so that future implementations can do I/O (VLM calls, crop reads) without blocking the event loop.

V1 simply copies node.label into node.caption.
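The V1 behaviour described above can be sketched as follows. This is an illustrative stand-in, not the module's actual source: the SceneGraphNode dataclass below is a hypothetical minimal version with only the label and caption fields, and the NodeCaptioner body is inferred from the description.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class SceneGraphNode:
    """Hypothetical stand-in for the real node type (assumed fields only)."""
    label: str
    caption: Optional[str] = None


class NodeCaptioner:
    """Sketch of the V1 behaviour: the caption is a copy of the label."""

    async def caption_node(self, node: SceneGraphNode) -> SceneGraphNode:
        node.caption = node.label  # no VLM call, no crops
        return node


async def main() -> SceneGraphNode:
    node = SceneGraphNode(label="chair")
    # The call must be awaited even though V1 performs no I/O.
    return await NodeCaptioner().caption_node(node)


result = asyncio.run(main())
print(result.caption)  # chair
```

Because the method mutates and returns the same object, callers can either use the return value or keep working with the node they passed in.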

async caption_node(node: SceneGraphNode) → SceneGraphNode

Set node.caption and return the mutated node.

Override this method to plug in a VLM-based captioner.
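One way such an override might look is sketched below. Everything here is an assumption made for illustration: SceneGraphNode and NodeCaptioner are minimal stand-ins rather than imports from scene_service, and fake_vlm_describe is a hypothetical placeholder for whatever VLM client a real implementation would call.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class SceneGraphNode:
    """Hypothetical stand-in for the real node type (assumed fields only)."""
    label: str
    caption: Optional[str] = None


class NodeCaptioner:
    """Stand-in for the V1 base class: caption = label."""

    async def caption_node(self, node: SceneGraphNode) -> SceneGraphNode:
        node.caption = node.label
        return node


async def fake_vlm_describe(label: str) -> str:
    """Hypothetical placeholder for a real VLM request."""
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    return f"a {label} (described by a VLM)"


class VLMNodeCaptioner(NodeCaptioner):
    """Sketch of an override that asks a (fake) VLM for the caption."""

    async def caption_node(self, node: SceneGraphNode) -> SceneGraphNode:
        try:
            node.caption = await fake_vlm_describe(node.label)
        except Exception:
            # If the VLM call fails, fall back to the V1 behaviour.
            return await super().caption_node(node)
        return node


async def caption_all(labels: list[str]) -> list[SceneGraphNode]:
    captioner = VLMNodeCaptioner()
    nodes = [SceneGraphNode(label=label) for label in labels]
    # Because caption_node is async, several nodes can be captioned concurrently.
    return list(await asyncio.gather(*(captioner.caption_node(n) for n in nodes)))


captioned = asyncio.run(caption_all(["chair", "desk"]))
```

Keeping the same signature and the mutate-and-return contract means callers written against the V1 captioner should not need to change when a subclass is swapped in.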