**Note:** `sudo` is required on macOS for USB access to the RealSense camera.
The app will:

- Connect to your RealSense camera
- Start detecting people using MediaPipe Pose
- Print detection status and Twist commands to the console
- Expose an HTTP API on `http://localhost:5050` for OpenClaw
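Once the app is up, you can confirm the API is reachable with the `/health` endpoint (listed in the endpoint table below):

```bash
# Quick liveness check for the HTTP API
curl http://localhost:5050/health
```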
## Console Output

```
[DETECTION] Person #1: distance=1.35m, x=0.12m (conf=92%)
[TARGET] Following Person #1 (target: 1.0m)
[TWIST] linear_x=0.15 m/s, angular_z=-0.08 rad/s
```
## HTTP API Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/start` | POST | Start following (optional `description` param) |
| `/stop` | POST | Stop following |
| `/set_target` | POST | Set target by description (uses VLM) |
| `/set_distance` | POST | Set target follow distance in meters |
| `/status` | GET | Get current status |
| `/snapshot` | GET | Get annotated camera frame |
| `/mission` | POST | Start autonomous mission with goal |
| `/mission` | GET | Get current mission status |
| `/mission/cancel` | POST | Cancel current mission |
| `/analyze` | POST | Analyze scene with custom VLM prompt |
| `/events` | GET/POST | Configure event webhooks |
| `/events/test` | POST | Test webhook connectivity |
| `/teleop` | POST | Natural language movement command |
| `/move` | POST | Move forward/backward by distance or time |
| `/turn` | POST | Turn left/right by angle |
| `/velocity` | POST | Set raw velocity command |
| `/sequence` | POST | Execute command sequence |
| `/manual/status` | GET | Get manual control status |
| `/find_and_follow` | POST | Search for, find, approach, and track an object |
| `/find_object` | POST | Find object in scene via VLM |
| `/approach_object` | POST | Find and approach an object |
| `/look_for` | POST | Scan environment for object |
| `/objects` | GET | List all visible objects |
| `/health` | GET | Health check endpoint |
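A minimal follow workflow using these endpoints. The `/set_target` payload key is assumed to be `description` (mirroring the optional `/start` parameter), and the snapshot is assumed to be a JPEG; adjust both if your build differs:

```bash
# Start following the nearest detected person
curl -s -X POST http://localhost:5050/start

# Check what the robot is currently doing
curl -s http://localhost:5050/status

# Retarget by description via the VLM
# (key name "description" is an assumption, mirroring /start's param)
curl -s -X POST http://localhost:5050/set_target \
  -H "Content-Type: application/json" \
  -d '{"description": "person in the red shirt"}'

# Save the annotated camera frame (assumed JPEG) to disk
curl -s http://localhost:5050/snapshot -o snapshot.jpg

# Stop following
curl -s -X POST http://localhost:5050/stop
```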
## Autonomous Missions
Start multi-step missions that execute independently:
```bash
# Follow until condition
curl -X POST http://localhost:5050/mission \
  -H "Content-Type: application/json" \
  -d '{"goal": "follow the person in red until they sit down"}'

# Find a specific person
curl -X POST http://localhost:5050/mission \
  -H "Content-Type: application/json" \
  -d '{"goal": "find a person wearing a hat"}'

# Patrol/scan the area
curl -X POST http://localhost:5050/mission \
  -H "Content-Type: application/json" \
  -d '{"goal": "patrol the area and report who you see"}'

# Check mission status
curl http://localhost:5050/mission

# Cancel mission
curl -X POST http://localhost:5050/mission/cancel
```
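For scripting, you can poll the mission endpoint until the mission finishes. This sketch simply prints the raw JSON every two seconds rather than assuming a particular response schema:

```bash
# Poll mission status every 2 seconds (Ctrl-C to stop)
while true; do
  curl -s http://localhost:5050/mission
  echo
  sleep 2
done
```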
## Scene Analysis
Use the VLM to analyze the current scene with custom questions:
```bash
curl -X POST http://localhost:5050/analyze \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Is the person heading toward the exit?"}'
```
Example prompts:

- "Is the person sitting or standing?"
- "What is the person doing?"
- "Are there obstacles between me and the target?"
- "How many people are facing the camera?"
## Event Webhooks
The robot can post events to OpenClaw or other services:
```bash
# Get current webhook config
curl http://localhost:5050/events

# Configure webhook
curl -X POST http://localhost:5050/events \
  -H "Content-Type: application/json" \
  -d '{"webhook_url": "http://localhost:18789/webhook", "enabled": true}'

# Test webhook
curl -X POST http://localhost:5050/events/test
```
Events posted automatically:

- `person_lost` - Target lost for >2 seconds
- `person_found` - Person detected after being lost
- `mission_completed` - Mission finished successfully
- `mission_failed` - Mission encountered an error
- `target_reached` - Robot at target follow distance
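To eyeball these payloads during development, a throwaway listener on a free port works. This sketch uses BSD `nc` syntax (as on macOS; GNU netcat needs `nc -l -p 9000`) and prints raw requests without sending a proper HTTP response; port 9000 is an arbitrary choice:

```bash
# Point the robot's webhook at the listener (port 9000 is arbitrary)
curl -X POST http://localhost:5050/events \
  -H "Content-Type: application/json" \
  -d '{"webhook_url": "http://localhost:9000/webhook", "enabled": true}'

# Dump incoming webhook requests, one connection at a time (BSD nc)
while true; do nc -l 9000; done
```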
## Manual Teleoperation

Control the robot with natural language commands via the `/teleop` endpoint:
```bash
# Natural language command (parsed and executed)
curl -X POST http://localhost:5050/teleop \
  -H "Content-Type: application/json" \
  -d '{"command": "move forward 1 meter, turn left, go forward for 5 seconds, stop"}'

# Move by distance
curl -X POST http://localhost:5050/move \
  -H "Content-Type: application/json" \
  -d '{"distance": 1.0}'

# Move for duration
curl -X POST http://localhost:5050/move \
  -H "Content-Type: application/json" \
  -d '{"duration": 3.0, "velocity": 0.3}'

# Turn (positive=left, negative=right)
curl -X POST http://localhost:5050/turn \
  -H "Content-Type: application/json" \
  -d '{"angle": 90}'

# Command sequence
curl -X POST http://localhost:5050/sequence \
  -H "Content-Type: application/json" \
  -d '{"commands": [
    {"type": "move", "distance": 1.0},
    {"type": "turn", "angle": -90},
    {"type": "move", "duration": 5, "velocity": 0.3}
  ]}'
```
Supported teleop commands:

- "move forward/backward X meters" - Move by distance
- "go forward for X seconds" - Move for duration
- "turn left/right" - Turn 90 degrees
- "turn left/right X degrees" - Turn a specific angle
- "turn around" - Turn 180 degrees
- "stop" / "halt" - Stop all movement
- Chain commands with "then", "and", or commas
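The `/velocity` and `/manual/status` endpoints from the table have no examples above. In the sketch below, the `linear_x`/`angular_z` key names are an assumption borrowed from the Twist fields in the console output, not confirmed API parameters:

```bash
# Raw velocity command (key names assumed from the Twist console output)
curl -X POST http://localhost:5050/velocity \
  -H "Content-Type: application/json" \
  -d '{"linear_x": 0.2, "angular_z": 0.0}'

# Check whether a manual command is currently executing
curl http://localhost:5050/manual/status
```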
## Object Detection
Find and approach objects (not just people) using VLM:
```bash
# Find an object
curl -X POST http://localhost:5050/find_object \
  -H "Content-Type: application/json" \
  -d '{"object": "red chair"}'

# Approach an object
curl -X POST http://localhost:5050/approach_object \
  -H "Content-Type: application/json" \
  -d '{"object": "water bottle", "distance": 0.3}'

# Scan for an object (rotate to search)
curl -X POST http://localhost:5050/look_for \
  -H "Content-Type: application/json" \
  -d '{"object": "trash can"}'

# List all visible objects
curl http://localhost:5050/objects
```
Can find any describable object:

- Furniture: chairs, tables, couches, desks
- Electronics: laptops, phones, monitors, TVs
- Household: bottles, cups, bags, boxes
- Other: doors, plants, toys, books, balls, etc.
The VLM provides position estimates (left/center/right) and distance estimates (close/medium/far), which are converted into movement commands.
## Find and Follow (Smart Object Tracking)

The `/find_and_follow` endpoint is the most powerful object-tracking command:
```bash
# Basic - follow until target is lost
curl -X POST http://localhost:5050/find_and_follow \
  -H "Content-Type: application/json" \
  -d '{"object": "person", "distance": 1.0}'

# Continuous mode - search for new targets when lost
curl -X POST http://localhost:5050/find_and_follow \
  -H "Content-Type: application/json" \
  -d '{"object": "person", "distance": 1.0, "continuous": true}'
```
Parameters:

- `object`: Description of what to find (use "person" for people)
- `distance`: Target follow distance in meters (default: 0.5)
- `track`: Keep tracking after reaching the target (default: true)
- `continuous`: Search for a new target when the current one is lost (default: false)
- `max_search_rotations`: Maximum rotations while searching (default: 1.5)
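A request exercising every parameter (the values are illustrative):

```bash
# Follow a person at 1.2 m, keep tracking, re-acquire when lost,
# and allow up to 2 full rotations while searching
curl -X POST http://localhost:5050/find_and_follow \
  -H "Content-Type: application/json" \
  -d '{"object": "person", "distance": 1.2, "track": true, "continuous": true, "max_search_rotations": 2}'
```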
Behavior:

1. Checks whether the object is visible in the current camera view
2. If not found, rotates to search (30° increments, up to 540° by default)
3. Once found, turns to face the object and approaches it
4. When at the target distance, continues tracking (adjusting position as the object moves)
5. If the target is lost:
   - Default: the mission ends and the robot stops
   - Continuous mode: the robot searches for a new target and continues
Continuous mode is useful for scenarios like:

- Security patrol: "Follow anyone who enters the area"
- Reception: "Greet and follow visitors as they arrive"
- Demo: "Keep following people until I say stop"
This means you can say "find and follow the red ball" even if the ball is behind the robot; it will search for, locate, and pursue it.
## OpenClaw Integration

### 1. Initial OpenClaw Setup
If you haven't set up OpenClaw yet:
```bash
# Install OpenClaw
npm install -g openclaw

# Run onboarding (sets up workspace, gateway, etc.)
openclaw onboard
```
### 2. Configure OpenAI API Key
OpenClaw needs an OpenAI API key for the chat model:
```bash
# Run configuration wizard
openclaw configure --section model
```
Follow the prompts to enter your OpenAI API key.
### 3. Set the Model to `gpt-4o-mini`

Edit `~/.openclaw/openclaw.json` and ensure the model is set to a chat model (not a code completion model):
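A sketch of the relevant entry; the surrounding key layout varies between OpenClaw versions, so treat this as illustrative rather than a complete config:

```json
{
  "model": "openai/gpt-4o-mini"
}
```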
**Note:** The default `openai/codex-mini-latest` is a code completion model and won't work for chat.
### 4. Install the Follow-Robot Skill
```bash
# Copy skill to OpenClaw workspace
cp -r skill/follow-robot ~/.openclaw/workspace/skills/
```
### 5. Add Robot to `TOOLS.md`

Add the following to `~/.openclaw/workspace/TOOLS.md` so the agent knows about the robot:
```markdown
## Active Devices

### 🤖 Follow Robot (RealSense Camera)

A robot follower system running at `http://localhost:5050` with a RealSense depth camera.

**When to use:** Any question about robots, following, tracking, people, distance, or commands like "start following", "stop", "who do you see", "how far away".

**Quick commands (use exec with curl):**

- Status: `curl -s http://localhost:5050/status`
- Start: `curl -s -X POST http://localhost:5050/start`
- Stop: `curl -s -X POST http://localhost:5050/stop`
- Set distance: `curl -s -X POST http://localhost:5050/set_distance -H "Content-Type: application/json" -d '{"distance": 1.5}'`

See `skills/follow-robot/SKILL.md` for full documentation.
```
### 6. Restart the Gateway
```bash
# Clear any cached sessions and restart
rm -f ~/.openclaw/agents/main/sessions/sessions.json
openclaw gateway restart
```
### 7. Open WebChat

```bash
openclaw webchat
```

Or navigate to `http://127.0.0.1:18789/` in your browser.
## Chat Commands
Once everything is configured, chat naturally:
"How far away is the person?"
"Start following"
"Stop following"
"Set the follow distance to 1.5 meters"
"What's the robot status?"
"Follow the person in the red shirt"
## Troubleshooting OpenClaw

**"NO_REPLY" in chat:**

- Clear the session cache: `rm -f ~/.openclaw/agents/main/sessions/sessions.json`
- Restart the gateway: `openclaw gateway restart`

**Agent doesn't respond to robot commands:**

- Ensure `TOOLS.md` has the robot section (step 5 above)
- Verify the skill is installed: `openclaw skills list | grep follow`

**API key issues:**

- Re-run `openclaw configure --section model`
- Check that `~/.openclaw/.env` contains your `OPENAI_API_KEY`