scene_service.ingest¶
Ingest layer — async tasks that pull observations into the registry.
v2 (current): scene runs in its own docker container with rclpy on the host DDS bus. ROS2 subscribers are the primary ingest path; VLM perception consumes the latest RGB frame each tick and produces Detections that go through state/data_assoc.
v1 (removed): MCP-based polling of primitive caps via atlas. MCP is for pilot only; scene’s own data fetching uses the fast direct path.
- class scene_service.ingest.ConceptGraphsDetector(*, rgb_fetcher_msg: Callable[[], Any | None], depth_fetcher_msg: Callable[[], Any | None], camera_info_fetcher: Callable[[], _CamIntrinsics | None], chassis_pose_fn: Callable[[], tuple[float, float, float, float] | None], on_detections: Callable[[list[Detection]], Awaitable[None]], registry: ObjectRegistry, world_frame_fn: Callable[[], str] | None = None, period_s: float = 0.6, confidence_threshold: float = 0.3, camera_height_m: float = 1.1, max_detections: int = 30, yolo_weights_path: str | None = None, sam_weights_path: str | None = None, clip_model_name: str | None = None, clip_pretrained: str | None = None, cfg_overrides: dict | None = None, hub: Any = None, camera_frame: str = 'head_front_camera_rgb_optical_frame')[source]¶
Bases:
objectPer-frame detector that runs the canonical concept-graphs merge pipeline. Owns a persistent MapObjectList and projects it into scene’s ObjectRegistry once per tick so the web UI/MCP API stay consistent.
on_detections is kept for backwards compat but no longer used — we write to the registry directly via _project_to_registry. The caller still passes a registry via this side channel.
- class scene_service.ingest.SubscribersHub(specs: list[TopicSpec], *, node_name: str = 'scene_subscribers')[source]¶
Bases:
objectOwns the rclpy.Node + spin thread + per-kind latest slots. Lifecycle:
hub = SubscribersHub(specs=[…]) await hub.start() # spawns spin thread, creates subs … await hub.stop() # destroys node, joins thread
- add_spec(spec: TopicSpec) bool[source]¶
Add a new (kind, topic) subscription to a hub already up. Used by the background reconciler to absorb topics that come online after start(). Returns True if added, False if the kind was already known.
- latest(kind: str) tuple[Any, float, int][source]¶
Returns (msg, stamp_unix, count). msg is None until the first callback arrives; readers should branch on stamp_unix > 0.
- lookup_transform_4x4(target_frame: str, source_frame: str = 'map')[source]¶
Return a 4x4 numpy homogeneous transform that maps points from target_frame into source_frame. None when tf isn’t available. Used by perception to project camera-optical depth points directly into map frame without composing odom- frame chassis pose by hand.
- lookup_xy_yaw(target_frame: str = 'base_link', source_frame: str = 'map') tuple[float, float, float, float] | None[source]¶
Return (x, y, z, yaw) of target_frame expressed in source_frame via tf2, or None when the transform isn’t available yet. This is the SLAM-corrected pose: it goes through /tf (rtabmap publishes map→odom; chassis publishes odom→base_link), so it tracks rviz exactly. Reading /odom instead leaves the consumer in odom-frame and drifts away from the map once SLAM corrects.
- class scene_service.ingest.TopicSpec(kind: str, topic: str, msg_type: str, qos_profile: str = 'default')[source]¶
Bases:
objectOne observation kind’s wiring. kind is the abstract name (rgb / depth / lidar2d / pose / odom); topic is the concrete ROS topic; msg_type selects which sensor_msgs / nav_msgs class to import. Optional qos_profile lets a config override the default reliable-latched profile when needed (e.g. /amcl_pose needs RELIABLE + KEEP_LAST(1)).
- kind: str¶
- msg_type: str¶
- qos_profile: str = 'default'¶
- topic: str¶
- class scene_service.ingest.VLMObjectDetector(*, rgb_fetcher: Callable[[], bytes | None], chassis_pose_fn: Callable[[], tuple[float, float, float, float] | None], on_detections: Callable[[list[Detection]], Awaitable[None]], period_s: float = 3.0, camera_frame_id: str = 'head_front_camera_rgb_optical_frame', intrinsics: _CamIntrinsics | None = None)[source]¶
Bases:
objectRuns the RGB-poll → VLM-call → Detection-list pipeline as one asyncio task. Calls back into on_detections with a batch of Detection objects at each successful tick.
The detector reads camera/snapshot via the existing PrimitivePoller machinery (passed in as rgb_fetcher) so we don’t duplicate the atlas connect logic.
Modules
ConceptGraphs perception — the real one, not just the detection frontend. |
|
VLM-based object detection: take an RGB frame, ask an OpenAI-compatible vision model to enumerate visible objects with approximate image coordinates, parse the JSON response, and (when depth is available) reproject to world coordinates. |
|
ROS2 subscribers — scene's primary ingest path. |