scene_service.ingest¶

Ingest layer — async tasks that pull observations into the registry.

v2 (current): scene runs in its own docker container with rclpy on the host DDS bus. ROS2 subscribers are the primary ingest path; VLM perception consumes the latest RGB frame each tick and produces Detections that go through state/data_assoc.

v1 (removed): MCP-based polling of primitive caps via atlas. MCP is for pilot only; scene’s own data fetching uses the fast direct path.

class scene_service.ingest.ConceptGraphsDetector(*, rgb_fetcher_msg: Callable[[], Any | None], depth_fetcher_msg: Callable[[], Any | None], camera_info_fetcher: Callable[[], _CamIntrinsics | None], chassis_pose_fn: Callable[[], tuple[float, float, float, float] | None], on_detections: Callable[[list[Detection]], Awaitable[None]], registry: ObjectRegistry, world_frame_fn: Callable[[], str] | None = None, period_s: float = 0.6, confidence_threshold: float = 0.3, camera_height_m: float = 1.1, max_detections: int = 30, yolo_weights_path: str | None = None, sam_weights_path: str | None = None, clip_model_name: str | None = None, clip_pretrained: str | None = None, cfg_overrides: dict | None = None, hub: Any = None, camera_frame: str = 'head_front_camera_rgb_optical_frame')[source]¶

Bases: object

Per-frame detector that runs the canonical concept-graphs merge pipeline. Owns a persistent MapObjectList and projects it into scene’s ObjectRegistry once per tick so the web UI/MCP API stay consistent.

on_detections is kept for backwards compat but no longer used — we write to the registry directly via _project_to_registry. The caller still passes a registry via this side channel.

export_3d_snapshot(max_points_per_obj: int = 256) → dict[source]¶

async start() → None[source]¶

async stop() → None[source]¶

class scene_service.ingest.SubscribersHub(specs: list[TopicSpec], *, node_name: str = 'scene_subscribers')[source]¶

Bases: object

Owns the rclpy.Node + spin thread + per-kind latest slots. Lifecycle:

hub = SubscribersHub(specs=[…]) await hub.start() # spawns spin thread, creates subs … await hub.stop() # destroys node, joins thread

add_spec(spec: TopicSpec) → bool[source]¶: Add a new (kind, topic) subscription to a hub already up. Used by the background reconciler to absorb topics that come online after start(). Returns True if added, False if the kind was already known.

has(kind: str) → bool[source]¶

has_kinds() → set[str][source]¶

latest(kind: str) → tuple[Any, float, int][source]¶: Returns (msg, stamp_unix, count). msg is None until the first callback arrives; readers should branch on stamp_unix > 0.

lookup_transform_4x4(target_frame: str, source_frame: str = 'map')[source]¶: Return a 4x4 numpy homogeneous transform that maps points from target_frame into source_frame. None when tf isn’t available. Used by perception to project camera-optical depth points directly into map frame without composing odom- frame chassis pose by hand.

lookup_xy_yaw(target_frame: str = 'base_link', source_frame: str = 'map') → tuple[float, float, float, float] | None[source]¶: Return (x, y, z, yaw) of target_frame expressed in source_frame via tf2, or None when the transform isn’t available yet. This is the SLAM-corrected pose: it goes through /tf (rtabmap publishes map→odom; chassis publishes odom→base_link), so it tracks rviz exactly. Reading /odom instead leaves the consumer in odom-frame and drifts away from the map once SLAM corrects.

async start() → None[source]¶

async stop() → None[source]¶

class scene_service.ingest.TopicSpec(kind: str, topic: str, msg_type: str, qos_profile: str = 'default')[source]¶

Bases: object

One observation kind’s wiring. kind is the abstract name (rgb / depth / lidar2d / pose / odom); topic is the concrete ROS topic; msg_type selects which sensor_msgs / nav_msgs class to import. Optional qos_profile lets a config override the default reliable-latched profile when needed (e.g. /amcl_pose needs RELIABLE + KEEP_LAST(1)).

kind: str¶

msg_type: str¶

qos_profile: str = 'default'¶

topic: str¶

class scene_service.ingest.VLMObjectDetector(*, rgb_fetcher: Callable[[], bytes | None], chassis_pose_fn: Callable[[], tuple[float, float, float, float] | None], on_detections: Callable[[list[Detection]], Awaitable[None]], period_s: float = 3.0, camera_frame_id: str = 'head_front_camera_rgb_optical_frame', intrinsics: _CamIntrinsics | None = None)[source]¶

Bases: object

Runs the RGB-poll → VLM-call → Detection-list pipeline as one asyncio task. Calls back into on_detections with a batch of Detection objects at each successful tick.

The detector reads camera/snapshot via the existing PrimitivePoller machinery (passed in as rgb_fetcher) so we don’t duplicate the atlas connect logic.

async start() → None[source]¶

async stop() → None[source]¶

Modules

`perception_concept_graphs`	ConceptGraphs perception — the real one, not just the detection frontend.
`perception_vlm`	VLM-based object detection: take an RGB frame, ask an OpenAI-compatible vision model to enumerate visible objects with approximate image coordinates, parse the JSON response, and (when depth is available) reproject to world coordinates.
`ros_subscribers`	ROS2 subscribers — scene's primary ingest path.