Seeing Clearly Indoors: How InDro Robotics Uses Stereolabs ZED X for AMR Navigation


Autonomous Mobile Robots (AMRs) are only as good as their perception stack. At InDro Robotics, we build real systems for real environments. That means we evaluate hardware extensively and pivot fast when something better comes along.

Integrating robots is a complex job. Individual components may have amazing specs, but how each component performs in the overall solution is what really matters.

This is the story of why the Stereolabs ZED X became a cornerstone of our indoor AMR navigation pipeline, and what our engineers discovered along the way.

 

The Challenge: Indoor Navigation Without Compromise

Indoor AMR navigation is deceptively hard. Warehouses, showrooms, and logistics facilities present a mix of dynamic obstacles, variable lighting, long cable runs, and tight latency requirements. The sensor stack has to deliver reliable depth, low drift, and tight integration with the compute platform, all without adding bulk or complexity to the robot.

Our team had previously run Intel RealSense cameras for this job. They work, but they exposed two hard limitations that became impossible to ignore at scale:

  • High-bandwidth USB degrades quickly over distance. On larger platforms where the sensor is mounted far from the compute unit, USB simply could not deliver.
  • Latency and GPU access via USB are bottlenecks. For high-definition depth streaming with feature tracking, the pipeline needs direct GPU write capability, and USB doesn't give you that.

We needed a better architecture. The ZED X, paired with the NVIDIA Jetson AGX Orin via GMSL2, solved both problems cleanly.

 

The Setup: ZED X + Jetson AGX Orin + GMSL2

Our indoor AMR navigation stack is now built around the Stereolabs ZED X stereo camera, connected to the NVIDIA Jetson AGX Orin via the GMSL2 Duo Capture Card. GMSL2 (Gigabit Multimedia Serial Link 2) is the same interface standard used in automotive ADAS systems. It's designed for exactly this: long cable runs, high-bandwidth video, and low-latency data transfer with direct GPU access.

The Duo Capture Card supports up to two ZED X stereo cameras simultaneously (two stereo pairs, four image sensors in total), which gives us flexibility across different robot form factors and field-of-view requirements.

Key specs that matter for this application: global shutter (eliminates motion blur in fast-moving environments), built-in IMU, neural depth processing, and a ROS2 wrapper that makes integration into existing Isaac ROS pipelines straightforward.

 

Four Ways We're Putting the ZED X to Work

1. Navigation & 3D Obstacle Avoidance

The primary use case feeds depth data from the ZED X directly into nvblox — NVIDIA's Isaac ecosystem tool for 3D mapping. This lets us build real-time 3D obstacle costmaps that serve as the foundation for the AMR's navigation stack.

What really stands out is the synchronization between the depth data and the RGB images from the same lens. That tight sync enables semantic segmentation workflows where we can selectively filter specific objects out of the scene or trigger defined robot behaviors when specific object classes are detected.
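
To make the idea concrete, here is a minimal sketch of turning a depth image into a 2D obstacle grid while skipping pixels that a segmentation model has flagged. This is not nvblox, which does the real work in full 3D on the GPU; it is a simplified NumPy illustration. The function name, the pinhole intrinsics, the grid size, and the height band are all assumptions made up for the example.

```python
import numpy as np

def depth_to_costmap(depth_m, fx, fy, cx, cy, ignore_mask=None,
                     cell_m=0.1, grid_cells=40,
                     min_height_m=-0.4, max_height_m=0.6):
    """Project a depth image (metres) into a 2D occupancy grid.

    Assumed camera frame: x right, y down, z forward. Grid rows index
    forward distance; columns index lateral offset, centred on the camera.
    """
    h, w = depth_m.shape
    us, vs = np.meshgrid(np.arange(w), np.arange(h))
    valid = np.isfinite(depth_m) & (depth_m > 0)
    if ignore_mask is not None:
        # Drop pixels the segmentation step marked as "not an obstacle".
        valid &= ~np.asarray(ignore_mask, dtype=bool)
    # Pinhole back-projection of the remaining pixels.
    z = depth_m[valid]
    x = (us[valid] - cx) * z / fx
    y = (vs[valid] - cy) * z / fy
    height = -y  # y points down, so height relative to the camera is -y
    keep = (height >= min_height_m) & (height <= max_height_m)
    x, z = x[keep], z[keep]
    rows = np.floor(z / cell_m).astype(int)
    cols = np.floor(x / cell_m).astype(int) + grid_cells // 2
    inside = (rows >= 0) & (rows < grid_cells) & (cols >= 0) & (cols < grid_cells)
    grid = np.zeros((grid_cells, grid_cells), dtype=np.uint8)
    grid[rows[inside], cols[inside]] = 1  # 1 = occupied
    return grid

# Usage: a single obstacle pixel 2 m straight ahead of the camera.
depth = np.full((4, 4), np.nan, dtype=np.float32)
depth[2, 2] = 2.0
grid = depth_to_costmap(depth, fx=100.0, fy=100.0, cx=2.0, cy=2.0,
                        cell_m=0.5, grid_cells=10)

# Same frame, but the segmentation mask says that pixel should be ignored.
mask = np.zeros((4, 4), dtype=bool)
mask[2, 2] = True
filtered = depth_to_costmap(depth, fx=100.0, fy=100.0, cx=2.0, cy=2.0,
                            ignore_mask=mask, cell_m=0.5, grid_cells=10)
```

Because depth and RGB are pixel-aligned, the mask from a 2D segmentation model can be applied directly to the depth image, which is what makes this kind of filtering cheap.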

WIN: LiDAR reduction in feature-rich environments: In well-lit, feature-rich indoor spaces such as product showrooms, research labs, and structured facilities, the ZED X handles the full navigation task without needing a secondary LiDAR sensor. That's a meaningful reduction in hardware cost and integration complexity on platforms where every gram and dollar matters.

 

2. Motion Tracking & Isaac ROS Integration

The Stereolabs SDK on the Jetson brings up a capable motion tracking system quickly, and that time-to-autonomy is a genuine competitive advantage when you're running rapid design iterations.

The SDK fuses the built-in IMU with Visual Odometry (VO) out of the box. When combined with Isaac ROS and NITROS (NVIDIA Isaac Transport for ROS, which lets GPU-accelerated nodes pass data to each other without extra memory copies), the result is excellent, low-drift motion tracking with minimal tuning required.
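
For intuition about why fusing an IMU with visual odometry reduces drift, here is a minimal complementary filter for heading. It is a textbook technique used purely as an illustration, not the algorithm the Stereolabs SDK runs internally. The function names and the alpha weight are hypothetical.

```python
import math

def wrap_angle(a):
    """Wrap an angle in radians to [-pi, pi)."""
    return (a + math.pi) % (2 * math.pi) - math.pi

def fuse_yaw(prev_yaw, gyro_z, dt, vo_yaw=None, alpha=0.98):
    """One step of a complementary filter for heading.

    prev_yaw: previous fused heading (rad)
    gyro_z:   IMU angular rate about the vertical axis (rad/s)
    dt:       time since the previous step (s)
    vo_yaw:   heading from visual odometry (rad), or None when no VO
              update arrived this step
    alpha:    weight on the gyro prediction (hypothetical tuning value)
    """
    # The gyro gives a smooth, high-rate prediction that drifts over time.
    predicted = wrap_angle(prev_yaw + gyro_z * dt)
    if vo_yaw is None:
        return predicted
    # VO is slower but anchored to the scene; nudge the prediction toward it
    # along the shortest angular difference to avoid wrap-around jumps.
    error = wrap_angle(vo_yaw - predicted)
    return wrap_angle(predicted + (1.0 - alpha) * error)

# Usage: gyro-only step, then a step corrected by a VO heading.
yaw_gyro_only = fuse_yaw(0.0, gyro_z=1.0, dt=0.1)
yaw_corrected = fuse_yaw(0.0, gyro_z=0.0, dt=0.1, vo_yaw=1.0, alpha=0.5)
```

The gyro term keeps the estimate responsive between camera frames, while the VO term bounds the accumulated gyro drift.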

 

WIN: Fast time-to-autonomy: Prototyping a new robot platform? The ZED X + Isaac ROS + NITROS stack gets you to testable autonomy faster than building your own depth and odometry pipeline from scratch. For teams moving fast, that's not a nice-to-have; it's the whole game.

 

3. Human Detection & 3D-Aware AI Perception

Beyond navigation, the ZED SDK ships with robust human detection and body tracking tools, including an 18-keypoint skeleton system using the COCO18 representation. Our team uses this for both safety-critical applications and, occasionally, more creative ones (standing desk posture analysis, anyone?).

The bigger application is 3D-aware human detection for warehouse logistics. Standard 2D detection tells you a person is in the frame. The ZED X tells you where they are in 3D space, how far away they are, and gives you a tracking ID across frames. That unlocks zone-based trigger logic: stop the robot when someone enters a defined 3D zone, alert them via onboard speaker, or adjust speed based on proximity.
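
The zone logic can be sketched in a few lines once each detection carries a 3D position and a tracking ID. The sketch below assumes positions are already expressed in the robot's frame. The TrackedPerson type, the zone radii, and the action names are hypothetical and are not part of the Stereolabs SDK.

```python
import math
from dataclasses import dataclass

@dataclass
class TrackedPerson:
    track_id: int
    x: float  # metres, forward of the robot
    y: float  # metres, left of the robot
    z: float  # metres, up

def zone_action(people, stop_radius_m=1.0, slow_radius_m=3.0):
    """Pick an action from the nearest person's distance on the ground plane.

    Returns (action, track_id), where action is 'stop', 'slow', or 'go'
    and track_id identifies the person who triggered it (None for 'go').
    """
    if not people:
        return "go", None
    nearest = min(people, key=lambda p: math.hypot(p.x, p.y))
    dist = math.hypot(nearest.x, nearest.y)
    if dist <= stop_radius_m:
        return "stop", nearest.track_id
    if dist <= slow_radius_m:
        return "slow", nearest.track_id
    return "go", None

# Usage: one person far away, one inside the slow-down zone.
people = [TrackedPerson(7, 6.0, 1.0, 0.0), TrackedPerson(12, 2.0, 0.5, 0.0)]
action, who = zone_action(people)
```

Returning the tracking ID alongside the action is what lets downstream code address a specific person, for example by playing an alert only once per ID.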

 

WIN: 2D to 3D detection without building your own stack: Going from basic 2D bounding boxes to reliable 3D-aware human detection is a significant engineering lift if you build it yourself. The Stereolabs SDK collapses that to a configuration exercise, which dramatically reduces false positives and accelerates deployment of safe human-robot interaction features.

 

4. Future: AI-Driven Intention Prediction

The 18-keypoint body tracking data opens a longer-term roadmap for our AMR platforms: predicting human intention from body movement. A warehouse worker reaching toward a shelf, a pedestrian stepping into a robot's path, a technician signaling a stop: these are all readable from body kinematics if you have a capable perception front-end.

The ZED X gives us that front-end. The AI models to interpret it are the next layer, and that's work we're actively developing.
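
As a placeholder for what a learned model would eventually do, the simplest possible baseline for the "pedestrian stepping into the robot's path" case is constant-velocity extrapolation of a tracked body position. This is a hand-written heuristic for illustration only, not the approach under development. The function name, corridor dimensions, and time horizon are assumptions.

```python
def will_enter_path(pos, vel, half_width_m=0.5, lookahead_m=4.0,
                    horizon_s=2.0, step_s=0.1):
    """Predict whether a person will step into the robot's corridor.

    pos, vel: (x, y) position in metres and velocity in m/s, in the robot
              frame (x forward, y left), e.g. from a tracked torso keypoint.
    The corridor is the rectangle 0 <= x <= lookahead_m, |y| <= half_width_m.
    Returns the first sampled time (s) at which the extrapolated position
    is inside the corridor, or None if that never happens within horizon_s.
    """
    steps = int(round(horizon_s / step_s))
    for i in range(steps + 1):
        t = i * step_s
        x = pos[0] + vel[0] * t
        y = pos[1] + vel[1] * t
        if 0.0 <= x <= lookahead_m and abs(y) <= half_width_m:
            return t
    return None

# Usage: a person 2 m ahead and 1.5 m to the left, walking toward the path.
t_enter = will_enter_path(pos=(2.0, 1.5), vel=(0.0, -1.0))
# The same person walking away from the path.
t_away = will_enter_path(pos=(2.0, 1.5), vel=(0.0, 1.0))
```

A model working from the full skeleton could react earlier than this baseline, for instance to a shift in posture before the person actually starts moving.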

 

Why GMSL2 Changes the Hardware Calculus

This deserves its own section because it's often underappreciated. The interface between the camera and the compute platform is not a minor implementation detail; it's a fundamental constraint on what your perception stack can do.

USB: works until it doesn't

High-bandwidth USB is fine for benchtop demos and small robots. But it has two hard ceilings: cable length (signal degrades beyond a few meters) and GPU access (no direct write path, which means extra latency and CPU overhead for depth processing). On compact platforms, you might never hit these limits. On larger robots where the sensor is mounted far from the compute unit, USB becomes a hard blocker.

GMSL2: built for production robotics

GMSL2 was designed for automotive ADAS: long cable runs, vibration tolerance, high-bandwidth serial data, and direct GPU write via the Jetson's capture card interface. For our larger AMR platforms, it went from a nice architectural choice to a mandatory requirement. No GMSL2, no go.
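
To get a feel for the data rates involved, a back-of-the-envelope calculation helps. The sketch below computes the raw, uncompressed bandwidth of a stereo stream and compares it with nominal link rates. The resolution, frame rate, and 16 bits per pixel are illustrative assumptions, not measurements from our robots, and nominal link rates overstate the usable payload.

```python
def raw_stream_gbps(width, height, fps, bits_per_pixel, sensors=2):
    """Uncompressed video bandwidth in gigabits per second."""
    return width * height * fps * bits_per_pixel * sensors / 1e9

# Nominal signalling rates; usable payload is lower after protocol overhead.
USB_5GBPS_LINK = 5.0   # USB 3.2 Gen 1
GMSL2_LINK = 6.0       # GMSL2 forward channel

# Assumed stream: two 1920x1200 sensors at 60 fps, 16 bits per pixel.
stereo_gbps = raw_stream_gbps(1920, 1200, 60, 16)

usb_utilisation = stereo_gbps / USB_5GBPS_LINK
gmsl2_utilisation = stereo_gbps / GMSL2_LINK
print(f"raw stereo stream: {stereo_gbps:.2f} Gbps")
print(f"share of nominal USB 5 Gbps link: {usb_utilisation:.0%}")
print(f"share of nominal GMSL2 6 Gbps link: {gmsl2_utilisation:.0%}")
```

Raw bandwidth is only part of the picture. The cable-length and GPU-path limits described above apply even when the stream fits within the link rate on paper.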

 

WIN: Direct GPU write = lower latency, better depth accuracy: The GMSL2 + Jetson combination provides extremely fast communication with direct GPU write capabilities. The result is the low latency needed for high-definition streaming, visibly improved depth accuracy, and more robust feature tracking, especially under motion.

 

ROS2 Support: The Developer Experience Matters

Hardware wins on specs. It loses in the field on integration complexity. Stereolabs has invested heavily in their SDK and driver stack, and it shows: continuous updates, active support, and a well-maintained ROS2 wrapper that exposes all camera data (depth, RGB, IMU, body tracking, point clouds) as native ROS2 topics.

For teams building on Isaac ROS, that wrapper is a significant time saver. You're not writing custom drivers or fighting with message formats; you're building the application logic that differentiates your robot.
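
As an example of what "native ROS2 topics" means in practice, a depth image arrives as a standard sensor_msgs/Image message, so decoding it takes a few lines of NumPy. The sketch below assumes the depth image uses the 32FC1 encoding (one 32-bit float per pixel) and uses a stand-in object in place of a real message so it runs without ROS installed. The helper name is hypothetical; in a real node this function would be called from a subscriber callback.

```python
import numpy as np
from types import SimpleNamespace

def depth_msg_to_array(msg):
    """Convert a sensor_msgs/Image-like message (32FC1) to a 2D float array.

    Uses only the standard Image fields: height, width, step (bytes per
    row, which may include padding), encoding, is_bigendian, and data.
    """
    if msg.encoding != "32FC1":
        raise ValueError(f"unsupported encoding: {msg.encoding}")
    dtype = np.dtype(">f4" if msg.is_bigendian else "<f4")
    buf = np.frombuffer(bytes(msg.data), dtype=np.uint8)
    # Drop any row padding before reinterpreting the bytes as floats.
    rows = buf.reshape(msg.height, msg.step)[:, : msg.width * 4]
    return np.ascontiguousarray(rows).view(dtype).reshape(msg.height, msg.width)

# Usage with a stand-in message: 2x3 image, rows padded to 16 bytes.
src = np.arange(6, dtype="<f4").reshape(2, 3)
padded = b"".join(row.tobytes() + b"\x00" * 4 for row in src)
fake_msg = SimpleNamespace(height=2, width=3, step=16, encoding="32FC1",
                           is_bigendian=0, data=padded)
depth = depth_msg_to_array(fake_msg)
```

Honouring the step field matters because row padding is allowed by the message definition, even though many drivers publish tightly packed rows.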

 

Bottom Line

The Stereolabs ZED X is not the cheapest stereo camera on the market. But for serious indoor AMR development where latency, cable constraints, GPU integration, and perception capability all matter, it earns its place in the stack.

Here's what it delivered for InDro:

  • Replaced Intel RealSense on platforms where USB hit its limits
  • Enabled LiDAR-free navigation in feature-rich indoor environments
  • Accelerated time-to-autonomy on new platform prototypes
  • Unlocked 3D-aware human detection without a custom perception stack
  • Provided a clean, production-grade path to Isaac ROS integration

 

We carry the ZED X — along with the full Stereolabs product lineup — in the InDro Store. If you're building an AMR, mobile manipulator, or any robot that needs to navigate and perceive reliably in complex indoor environments, it's worth a serious look.

Shop Stereolabs cameras HERE.
