Simulation Engine

The JOSHUA Simulation Engine provides a comprehensive physics simulation environment built on MuJoCo (Multi-Joint dynamics with Contact). It supports multiple operational modes for development, testing, validation, and training—from interactive GUI manipulation to GPU-accelerated reinforcement learning with thousands of parallel environments.

Key Capabilities: The simulation engine serves as the foundation for sim-to-real transfer workflows. Train policies in GPU-accelerated simulation, validate in high-fidelity MuJoCo environments, and deploy directly to physical hardware—all driven by the same configuration system.

Architecture Overview

The simulation subsystem is organized into several layers that integrate with the broader JOSHUA framework:

simulation/
  mujoco_engine.py          # Core MuJoCo wrapper (MjModel + MjData lifecycle)
  sim_interactive.py        # Interactive GUI viewer mode
  sim_passive.py            # Trajectory replay mode
  sim_mirror.py             # ROS2 digital twin mode
  sim_offscreen.py          # Headless rendering mode
  train_mjx.py              # GPU-accelerated PPO training via MJX
  models/
    so_arm100/
      so_arm100.xml         # Base MJCF model
      scene.xml             # Scene with environment (table, lighting)
      scene_mjx.xml         # MJX-optimized scene
      pick_place.xml        # Pick-and-place task model
      pick_place_mjx.xml    # MJX pick-and-place variant
      assets/               # STL mesh files for visual fidelity
    ant/
      ant.xml               # Quadruped locomotion model
    isaac/
      so100.usd             # NVIDIA Isaac Sim USD asset
  trajectories/
    so_arm100_example.csv   # Example trajectory data
  isaac_sim/
    tasks/                  # Isaac Sim task definitions
    terms/                  # Reward, termination, observation terms

MuJoCo Engine

The core simulation wrapper is implemented in simulation/mujoco_engine.py. It encapsulates the full MuJoCo simulation lifecycle, providing a clean Python interface for model loading, physics stepping, control input, sensor readout, and rendering.

Model & Data Lifecycle

The engine manages two fundamental MuJoCo objects:

import mujoco
from simulation.mujoco_engine import MuJoCoEngine

# Initialize engine with an MJCF model
engine = MuJoCoEngine(model_path="simulation/models/so_arm100/scene.xml")

# The engine manages:
#   engine.model  -> mujoco.MjModel (compiled physics description)
#   engine.data   -> mujoco.MjData  (mutable simulation state)

# Reset simulation to initial state
engine.reset()

# Run simulation loop
for _ in range(1000):
    engine.set_ctrl([0.0, -0.5, 0.3, 0.0, 0.2, 0.0])  # Set actuator controls
    engine.step()                                         # Advance physics
    qpos = engine.get_qpos()                              # Read joint positions
    sensor = engine.get_sensor("wrist_force")             # Read sensor data

Core Methods

The MuJoCoEngine class exposes the following primary methods:

step()
    Advances the simulation by one timestep. Calls mujoco.mj_step(model, data) internally, computing forward dynamics including contact forces, actuator responses, and constraint resolution.

set_ctrl(values)
    Sets actuator control signals. Accepts a list or numpy array matching the number of actuators defined in the model. Values are applied at the next step() call.

get_qpos()
    Returns the current generalized joint positions (data.qpos) as a numpy array. Includes free joints (7 position coordinates each: 3 translation + 4 quaternion) and hinge/slide joints (1 coordinate each).

get_sensor(name)
    Reads a named sensor value from the simulation state. Supports all MuJoCo sensor types: force, torque, accelerometer, gyroscope, touch, joint position, and joint velocity sensors.

render()
    Renders the current simulation state. In interactive and passive modes, updates the GUI window. In offscreen mode, returns a pixel array via mujoco.Renderer.

reset()
    Resets the simulation to its initial state. Calls mujoco.mj_resetData(model, data) and restores all joint positions, velocities, and control signals to the defaults defined in the MJCF model.

Offscreen Rendering

The engine supports headless (offscreen) rendering through mujoco.Renderer, enabling camera image generation without a display. This is essential for data generation pipelines, automated testing in CI environments, and vision-based training data collection.

# Offscreen rendering for data generation
engine = MuJoCoEngine(model_path="simulation/models/so_arm100/scene.xml")

# Create offscreen renderer with specified resolution
renderer = mujoco.Renderer(engine.model, height=480, width=640)

# Render current state to pixel array
renderer.update_scene(engine.data)
pixels = renderer.render()  # Returns numpy array (H, W, 3) uint8

# Use for computer vision training, visual validation, etc.
import cv2
cv2.imwrite("frame_0001.png", cv2.cvtColor(pixels, cv2.COLOR_RGB2BGR))

Simulation Modes

JOSHUA provides four distinct simulation modes, each designed for a specific stage of the development workflow. The mode is selected through the SimulationConfig.mode field in the configuration file.

Interactive Mode

Interactive mode launches the full MuJoCo GUI viewer, providing a rich graphical environment for direct robot manipulation, physics testing, and visual debugging. The viewer supports mouse-based interaction with bodies and joints, real-time parameter adjustment, and visualization of contacts, forces, and constraints.

# Interactive mode - launches MuJoCo GUI viewer
# Configuration preset: so_arm100_sim_interactive

simulation {
  mode: INTERACTIVE
  model_path: "simulation/models/so_arm100/scene.xml"
}

# Programmatic usage:
from simulation.sim_interactive import run_interactive
run_interactive(model_path="simulation/models/so_arm100/scene.xml")
Development Workflow: Interactive mode is the primary tool for iterating on MJCF model definitions. Use it to verify joint ranges, actuator gains, contact properties, and visual appearance before running automated training or validation pipelines.

Passive Mode

Passive mode replays pre-recorded trajectories from CSV files, providing a visualization and validation tool without any active control. Joint positions are read from the trajectory file and applied frame-by-frame to the MuJoCo model, allowing visual verification of recorded motions.

# Passive mode - trajectory replay from CSV
# Configuration preset: sim_passive

simulation {
  mode: PASSIVE
  model_path: "simulation/models/so_arm100/scene.xml"
  trajectory_path: "simulation/trajectories/so_arm100_example.csv"
}

# CSV format: each row is a timestep, columns are joint positions
# timestamp, joint_1, joint_2, joint_3, joint_4, joint_5, joint_6
# 0.000, 0.00, -0.50,  0.30, 0.00, 0.20, 0.00
# 0.002, 0.01, -0.49,  0.31, 0.00, 0.21, 0.01
# ...

Mirror Mode (ROS2 Digital Twin)

Mirror mode creates a real-time digital twin that subscribes to ROS2 encoder topics from the physical robot and mirrors its state in the MuJoCo simulation. This enables live visualization of the real robot's motion, side-by-side comparison for debugging, and sim-to-real calibration.

# Mirror mode - ROS2 digital twin
# Configuration preset: sim_mirror

simulation {
  mode: MIRROR
  model_path: "simulation/models/so_arm100/scene.xml"

  mirror_topic_mappings {
    mappings {
      ros_topic: "/encoder/joint_1"
      mujoco_actuator: "joint_1_position"
      offset: 0.0        # Offset for real-to-sim coordinate alignment
    }
    mappings {
      ros_topic: "/encoder/joint_2"
      mujoco_actuator: "joint_2_position"
      offset: -0.087     # Compensate for encoder zero vs sim origin
    }
    mappings {
      ros_topic: "/encoder/joint_3"
      mujoco_actuator: "joint_3_position"
      offset: 0.052
    }
    mappings {
      ros_topic: "/encoder/joint_4"
      mujoco_actuator: "joint_4_position"
      offset: 0.0
    }
    mappings {
      ros_topic: "/encoder/joint_5"
      mujoco_actuator: "joint_5_position"
      offset: -0.034
    }
    mappings {
      ros_topic: "/encoder/joint_6"
      mujoco_actuator: "joint_6_position"
      offset: 0.0
    }
  }
}
Coordinate Mapping: The offset values in topic mappings compensate for the difference between the physical robot's encoder zero positions and the MuJoCo model's joint origins. These offsets are typically determined during the calibration process and ensure accurate digital twin visualization.
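A sketch of how such an offset might be applied when an encoder message arrives (a pure helper with illustrative names; the actual mirror node lives in simulation/sim_mirror.py):

```python
# Hypothetical helper mirroring the mapping table above:
# sim target = encoder reading + calibration offset, clamped
# to the joint range so a miscalibrated reading cannot push
# the sim joint past its limits.
OFFSETS = {
    "/encoder/joint_1": 0.0,
    "/encoder/joint_2": -0.087,
    "/encoder/joint_3": 0.052,
    "/encoder/joint_4": 0.0,
    "/encoder/joint_5": -0.034,
    "/encoder/joint_6": 0.0,
}

def encoder_to_sim(topic: str, encoder_rad: float,
                   joint_range=(-3.14, 3.14)) -> float:
    """Convert a raw encoder angle (radians) into the MuJoCo
    actuator target for the joint mapped to this topic."""
    target = encoder_rad + OFFSETS[topic]
    lo, hi = joint_range
    return min(max(target, lo), hi)
```

In the real mirror node, the returned value would be written to the mapped position actuator's control slot each time a message arrives on the topic.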

Offscreen Mode

Offscreen mode runs the simulation without any GUI (headless), rendering frames to pixel arrays in memory. This mode is designed for automated pipelines where visual output is consumed programmatically rather than displayed to a user.

# Offscreen mode - headless rendering for data generation
# No GUI window, renders to memory buffers

simulation {
  mode: OFFSCREEN
  model_path: "simulation/models/so_arm100/scene.xml"
  render_width: 640
  render_height: 480
}

# Programmatic usage:
from simulation.sim_offscreen import run_offscreen

frames = run_offscreen(
    model_path="simulation/models/so_arm100/scene.xml",
    trajectory_path="simulation/trajectories/so_arm100_example.csv",
    width=640,
    height=480
)
# frames: list of numpy arrays (H, W, 3) for each timestep

Robot Models (MJCF XML)

Robot models are defined using MuJoCo's MJCF (MuJoCo XML Format), a declarative specification for multi-body dynamics including kinematics, collision geometry, actuators, sensors, visual appearance, and contact properties. JOSHUA includes several pre-built models.

SO-ARM100

The SO-ARM100 is the primary robotic arm model used across JOSHUA's development and training pipelines. Multiple MJCF variants are provided for different use cases:

so_arm100.xml
    Base robot model defining links, joints, actuators, and collision geometry.
    Use case: component inclusion in scenes.

scene.xml
    Complete scene with the arm, table surface, lighting, and camera positions.
    Use case: interactive and passive simulation.

scene_mjx.xml
    MJX-optimized scene with simplified contacts and convex collision geometry.
    Use case: GPU-accelerated training via MJX.

pick_place.xml
    Pick-and-place task scene with target objects and goal positions.
    Use case: task-specific training and evaluation.

pick_place_mjx.xml
    MJX-optimized pick-and-place variant for GPU parallel training.
    Use case: large-scale RL training for manipulation.

The SO-ARM100 model includes STL mesh assets for high visual fidelity. These meshes are stored in the assets/ directory alongside the MJCF files and are referenced via the <mesh> element in the XML. Visual meshes provide detailed rendering while simplified collision geometry ensures efficient physics computation.

<!-- Example MJCF structure for SO-ARM100 -->
<mujoco model="so_arm100">
  <compiler meshdir="assets/" />

  <asset>
    <mesh name="base_link" file="base_link.stl" />
    <mesh name="shoulder" file="shoulder_link.stl" />
    <mesh name="upper_arm" file="upper_arm_link.stl" />
    <mesh name="forearm" file="forearm_link.stl" />
    <mesh name="wrist" file="wrist_link.stl" />
    <mesh name="gripper" file="gripper_link.stl" />
  </asset>

  <worldbody>
    <body name="base" pos="0 0 0">
      <geom type="mesh" mesh="base_link" />
      <body name="shoulder" pos="0 0 0.05">
        <joint name="joint_1" type="hinge" axis="0 0 1" range="-3.14 3.14" />
        <geom type="mesh" mesh="shoulder" />
        <!-- ... additional links and joints ... -->
      </body>
    </body>
  </worldbody>

  <actuator>
    <position name="joint_1_position" joint="joint_1" kp="50" />
    <position name="joint_2_position" joint="joint_2" kp="50" />
    <!-- ... additional actuators ... -->
  </actuator>
</mujoco>

Ant Quadruped

The Ant model is a standard quadruped locomotion benchmark widely used in the reinforcement learning community. It serves as a validation target for JOSHUA's RL training infrastructure, ensuring the MJX pipeline produces correct results on a well-understood problem before training on custom robot models.

NVIDIA Isaac Sim USD

For industrial-grade simulation scenarios, JOSHUA provides the so100.usd asset for use with NVIDIA Isaac Sim. USD (Universal Scene Description) is the interchange format for Isaac Sim, supporting high-fidelity physically-based rendering, domain randomization, and integration with NVIDIA's simulation ecosystem.

GPU-Accelerated Training (MuJoCo-XLA)

JOSHUA integrates MuJoCo-XLA (MJX) for massively parallel, GPU-accelerated reinforcement learning. MJX compiles MuJoCo's physics engine into XLA (Accelerated Linear Algebra) operations, enabling the entire simulation to run on GPU through JAX.

Performance: MJX enables running 2048 parallel environments simultaneously on a single GPU, achieving orders-of-magnitude speedup compared to CPU-based simulation. This is critical for sample-efficient reinforcement learning, where millions of environment interactions are required for policy convergence.

MJX Architecture

The MJX training pipeline compiles the MuJoCo physics model into JAX-compatible operations that execute entirely on GPU, avoiding CPU-GPU data transfer bottlenecks:

                     MuJoCo-XLA (MJX) Training Pipeline
  +------------------------------------------------------------------+
  |  MJCF XML Model                                                  |
  |       |                                                          |
  |       v                                                          |
  |  MJX Compilation (mujoco.mjx)                                    |
  |  Converts MjModel to XLA-compatible operations                   |
  |       |                                                          |
  |       v                                                          |
  |  JAX JIT Compilation                                             |
  |  Compiles step() + reward() into fused GPU kernels               |
  |       |                                                          |
  |       v                                                          |
  |  Vectorized Environment (2048 parallel instances)                |
  |  jax.vmap() maps single-env logic across batch dimension         |
  |       |                                                          |
  |       v                                                          |
  |  PPO Training Loop (Actor-Critic Network)                        |
  |  Collects rollouts -> Computes advantages -> Updates policy      |
  |       |                                                          |
  |       v                                                          |
  |  Checkpoint Saving (msgpack + JSON metadata)                     |
  +------------------------------------------------------------------+
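The vectorization stage above boils down to jax.vmap over a per-environment step function. A toy illustration of the batching pattern in plain JAX, with the physics replaced by a stand-in point-mass update so the snippet is self-contained (the real pipeline converts the model via mujoco.mjx and vmaps mjx.step instead):

```python
import jax
import jax.numpy as jnp

def step_one(state, action):
    # Stand-in for the single-environment physics step: integrate a
    # trivial point mass and emit a reward. In the real pipeline this
    # role is played by mjx.step plus the task's reward function.
    pos, vel = state
    vel = vel + 0.01 * action
    pos = pos + 0.01 * vel
    reward = -jnp.abs(pos)
    return (pos, vel), reward

# jax.vmap maps the single-env logic across the batch dimension, and
# jax.jit fuses the whole batched step into one GPU kernel launch.
step_batch = jax.jit(jax.vmap(step_one))

num_envs = 2048
states = (jnp.zeros(num_envs), jnp.zeros(num_envs))  # (pos, vel) per env
actions = jnp.ones(num_envs)

states, rewards = step_batch(states, actions)
print(rewards.shape)  # (2048,)
```

Because state, action, and reward all carry a leading batch axis, the rollout collection and PPO update stages operate on whole batches without any per-environment Python loop.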

PPO Training Pipeline

The training implementation (simulation/train_mjx.py) uses Proximal Policy Optimization (PPO) with an Actor-Critic architecture, implemented entirely in JAX/Flax for GPU execution:

# GPU-accelerated training with MJX
# Configuration preset: ant_train_mjx

import jax
import jax.numpy as jnp
from flax import linen as nn
from simulation.train_mjx import train_mjx

# Actor-Critic Network Architecture
class ActorCritic(nn.Module):
    action_dim: int

    @nn.compact
    def __call__(self, x):
        # Shared feature extraction
        x = nn.Dense(256)(x)
        x = nn.relu(x)
        x = nn.Dense(256)(x)
        x = nn.relu(x)

        # Actor head (policy)
        mean = nn.Dense(self.action_dim)(x)
        log_std = self.param('log_std', nn.initializers.zeros, (self.action_dim,))

        # Critic head (value function)
        value = nn.Dense(1)(x)

        return mean, log_std, value

# Training configuration
training_config = {
    "num_envs": 2048,              # Parallel environments on GPU
    "num_steps": 10,               # Steps per rollout
    "num_epochs": 4,               # PPO epochs per update
    "minibatch_size": 256,         # Minibatch size for updates
    "learning_rate": 3e-4,         # Adam learning rate
    "gamma": 0.99,                 # Discount factor
    "gae_lambda": 0.95,            # GAE lambda
    "clip_epsilon": 0.2,           # PPO clipping parameter
    "total_timesteps": 50_000_000, # Total training timesteps
}

# Launch training
train_mjx(
    model_path="simulation/models/ant/ant.xml",
    config=training_config
)
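The gamma and gae_lambda entries above feed Generalized Advantage Estimation. As a minimal, standalone numpy reference for the GAE recursion used in PPO-style training (not the project's actual implementation, which runs in JAX on-device):

```python
import numpy as np

def compute_gae(rewards, values, dones, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation over one rollout.

    rewards, dones: shape (T,); values: shape (T + 1,), including the
    bootstrap value for the state after the final step.
    """
    T = len(rewards)
    advantages = np.zeros(T)
    gae = 0.0
    # Walk the rollout backwards, accumulating discounted TD errors.
    for t in reversed(range(T)):
        nonterminal = 1.0 - dones[t]
        delta = rewards[t] + gamma * values[t + 1] * nonterminal - values[t]
        gae = delta + gamma * lam * nonterminal * gae
        advantages[t] = gae
    returns = advantages + values[:-1]  # value-function regression targets
    return advantages, returns
```

The advantages are then typically normalized per minibatch and plugged into the clipped surrogate objective governed by clip_epsilon.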

Training Monitoring & Checkpoints

The MJX training pipeline includes an integrated viewer for real-time monitoring of training progress. During training, you can observe the policy's behavior across environments, track reward curves, and evaluate convergence.

# Checkpoint structure
checkpoints/
  ant_mjx_step_10000000/
    params.msgpack        # Flax model parameters (msgpack serialized)
    config.json           # Training hyperparameters and metadata
    metrics.json          # Training metrics (rewards, losses, entropy)

# Loading a checkpoint for evaluation
from flax.serialization import msgpack_restore

with open("checkpoints/ant_mjx_step_10000000/params.msgpack", "rb") as f:
    params = msgpack_restore(f.read())  # restores the parameter pytree without needing a target template

# Resume training or deploy policy to physical robot

NVIDIA Isaac Sim Integration

JOSHUA integrates with NVIDIA Isaac Sim for industrial-grade simulation scenarios that require advanced rendering, domain randomization, and GPU-accelerated physics via PhysX. Isaac Sim provides photorealistic rendering through RTX ray tracing, making it suitable for vision-heavy sim-to-real transfer.

Tasks & Environments

Isaac Sim environments are structured around tasks that define the complete simulation scenario, including the robot, environment, observations, rewards, and termination conditions. JOSHUA includes the following pre-built tasks:

Ant Locomotion (robot: Ant)
    Quadruped locomotion with velocity tracking and energy minimization rewards.

Trileg Robot (robot: Trileg)
    Three-legged platform balancing and locomotion task.

Each task is composed of modular terms that define different aspects of the environment:

# Isaac Sim task structure
isaac_sim/
  tasks/
    ant_locomotion/
      __init__.py
      ant_env.py           # Environment definition
      ant_env_cfg.py       # Environment configuration
    trileg_robot/
      __init__.py
      trileg_env.py
      trileg_env_cfg.py
  terms/
    observations.py        # Observation term implementations
    rewards.py             # Reward term implementations
    terminations.py        # Termination condition implementations
    resets.py              # Environment reset implementations
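The terms files compose a task from small, independently testable functions. As a sketch of what a velocity-tracking reward term and an energy penalty in terms/rewards.py might look like (signatures and names are illustrative; Isaac Lab's actual term API differs in detail):

```python
import numpy as np

def velocity_tracking_reward(base_lin_vel, target_vel, sigma=0.25):
    """Reward in (0, 1] for matching a commanded planar velocity.

    base_lin_vel: (num_envs, 2) measured xy base velocity
    target_vel:   (num_envs, 2) commanded xy velocity
    """
    err = np.sum((base_lin_vel - target_vel) ** 2, axis=-1)
    return np.exp(-err / sigma)

def energy_penalty(joint_torques, joint_vels, scale=1e-3):
    """Penalize mechanical power |tau * qdot|, encouraging efficient
    gaits. Both inputs have shape (num_envs, num_joints)."""
    return -scale * np.sum(np.abs(joint_torques * joint_vels), axis=-1)
```

A task's total reward is then a weighted sum of such terms, which keeps reward shaping declarative: swapping or re-weighting a term does not touch the environment code.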

RL Training with Isaac Sim

Isaac Sim environments integrate with standard RL training frameworks. JOSHUA supports training via both stable-baselines3 (for standard algorithms with minimal configuration) and custom PPO implementations (for maximum control over the training loop):

# Training with stable-baselines3
from stable_baselines3 import PPO
from isaac_sim.tasks.ant_locomotion import AntEnv

env = AntEnv(num_envs=1024, device="cuda")

model = PPO(
    "MlpPolicy",
    env,
    learning_rate=3e-4,
    n_steps=2048,
    batch_size=64,
    n_epochs=10,
    verbose=1,
    device="cuda"
)

model.learn(total_timesteps=10_000_000)
model.save("isaac_ant_policy")

# Custom PPO training is also supported via simulation/train_mjx.py patterns
# adapted for Isaac Sim's environment interface

USD Asset Pipeline

Isaac Sim uses USD (Universal Scene Description) as its native scene format. JOSHUA's so100.usd asset imports the robot model into Isaac Sim scenes for use with the tasks above.
Trajectory Data

JOSHUA uses a simple CSV format for recording and replaying robot trajectories. Trajectory files capture time-stamped joint positions that can be replayed in passive simulation mode or used as demonstration data for imitation learning.

CSV Format

Each row represents a single timestep. The first column is the timestamp (in seconds from episode start), and subsequent columns contain joint positions in radians:

# simulation/trajectories/so_arm100_example.csv
# Trajectory recording for SO-ARM100 (6-DOF)
#
# Format: timestamp, joint_1, joint_2, joint_3, joint_4, joint_5, joint_6
# Units:  seconds,   radians, radians, radians, radians, radians, radians

timestamp,joint_1,joint_2,joint_3,joint_4,joint_5,joint_6
0.000, 0.000, -0.523,  0.349, 0.000, 0.174, 0.000
0.002, 0.003, -0.520,  0.352, 0.001, 0.176, 0.002
0.004, 0.008, -0.515,  0.358, 0.003, 0.180, 0.005
0.006, 0.015, -0.507,  0.367, 0.006, 0.186, 0.009
0.008, 0.024, -0.496,  0.378, 0.010, 0.194, 0.014
...
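A small loader for this format might look as follows (stdlib csv plus numpy; the function name is illustrative, and passive mode's own reader lives in simulation/sim_passive.py):

```python
import csv
import numpy as np

def load_trajectory(path):
    """Read a trajectory CSV into (timestamps, qpos).

    Skips '#' comment lines and the header row; returns timestamps of
    shape (T,) and joint positions of shape (T, num_joints), in
    seconds and radians respectively.
    """
    timestamps, qpos = [], []
    with open(path, newline="") as f:
        for row in csv.reader(f):
            if not row or row[0].lstrip().startswith("#"):
                continue  # comment line
            if row[0].strip() == "timestamp":
                continue  # header row
            vals = [float(c) for c in row]
            timestamps.append(vals[0])
            qpos.append(vals[1:])
    return np.array(timestamps), np.array(qpos)
```

Passive mode then applies each row to the model (via qpos or position-actuator targets) at the recorded timestamps.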

Usage

Config

The simulation engine is configured through the SimulationConfig Protocol Buffers message, which is part of JOSHUA's unified configuration system. All simulation parameters—mode, model path, topic mappings, and rendering settings—are specified in the .pbtxt configuration file.

SimulationConfig Proto

// Protocol Buffers schema for simulation configuration
message SimulationConfig {
  // Simulation mode
  enum Mode {
    INTERACTIVE = 0;   // Full MuJoCo GUI viewer
    PASSIVE = 1;       // Trajectory replay from CSV
    MIRROR = 2;        // ROS2 digital twin
    OFFSCREEN = 3;     // Headless rendering
  }
  Mode mode = 1;

  // Path to MJCF XML model file
  string model_path = 2;

  // Trajectory file path (used in PASSIVE mode)
  string trajectory_path = 3;

  // Mirror mode topic mappings (used in MIRROR mode)
  MirrorTopicMappings mirror_topic_mappings = 4;

  // Offscreen rendering dimensions (used in OFFSCREEN mode)
  int32 render_width = 5;
  int32 render_height = 6;
}

// Mapping between ROS2 encoder topics and MuJoCo actuators
message MirrorTopicMappings {
  repeated TopicMapping mappings = 1;
}

message TopicMapping {
  string ros_topic = 1;        // ROS2 topic name (e.g., "/encoder/joint_1")
  string mujoco_actuator = 2;  // MuJoCo actuator name (e.g., "joint_1_position")
  float offset = 3;            // Real-to-sim coordinate offset (radians)
}

Example Configurations

Interactive Simulation

# Interactive mode - SO-ARM100 with full MuJoCo GUI
# Preset: so_arm100_sim_interactive

operation_mode: SIMULATION

simulation {
  mode: INTERACTIVE
  model_path: "simulation/models/so_arm100/scene.xml"
}

Passive Trajectory Replay

# Passive mode - replay recorded trajectory
# Preset: sim_passive

operation_mode: SIMULATION

simulation {
  mode: PASSIVE
  model_path: "simulation/models/so_arm100/scene.xml"
  trajectory_path: "simulation/trajectories/so_arm100_example.csv"
}

Mirror Digital Twin

# Mirror mode - ROS2 digital twin of physical robot
# Preset: sim_mirror

operation_mode: SIMULATION

simulation {
  mode: MIRROR
  model_path: "simulation/models/so_arm100/scene.xml"
  mirror_topic_mappings {
    mappings { ros_topic: "/encoder/joint_1" mujoco_actuator: "joint_1_position" offset: 0.0 }
    mappings { ros_topic: "/encoder/joint_2" mujoco_actuator: "joint_2_position" offset: -0.087 }
    mappings { ros_topic: "/encoder/joint_3" mujoco_actuator: "joint_3_position" offset: 0.052 }
    mappings { ros_topic: "/encoder/joint_4" mujoco_actuator: "joint_4_position" offset: 0.0 }
    mappings { ros_topic: "/encoder/joint_5" mujoco_actuator: "joint_5_position" offset: -0.034 }
    mappings { ros_topic: "/encoder/joint_6" mujoco_actuator: "joint_6_position" offset: 0.0 }
  }
}

MJX GPU Training

# MJX training mode - GPU-accelerated PPO on Ant quadruped
# Preset: ant_train_mjx

operation_mode: TRAINING

training {
  framework: MJX
  model_path: "simulation/models/ant/ant.xml"
  num_envs: 2048
  total_timesteps: 50000000
  checkpoint_dir: "checkpoints/ant_mjx"
}

Presets Reference

JOSHUA includes pre-built configuration presets for common simulation scenarios. These presets can be loaded directly or used as starting points for custom configurations.

so_arm100_sim_interactive (Interactive)
    SO-ARM100 arm in the full MuJoCo GUI viewer with scene environment (table, lighting). For manual testing and model iteration.

sim_passive (Passive)
    Trajectory replay from CSV for the SO-ARM100. Visualizes pre-recorded motions for validation and review.

sim_mirror (Mirror)
    ROS2 digital twin for the SO-ARM100. Subscribes to encoder topics with calibrated offsets for real-time mirroring.

ant_train_mjx (MJX Training)
    GPU-accelerated PPO training on the Ant quadruped benchmark. 2048 parallel environments with integrated viewer.

so_arm100_train_mjx (MJX Training)
    GPU-accelerated training for SO-ARM100 manipulation tasks using the MJX-optimized scene variant.

so_arm100_pick_place_mjx (MJX Training)
    Pick-and-place task training with the MJX-optimized model. Includes object spawning and goal conditioning.

isaac_ant_locomotion (Isaac Sim)
    Ant locomotion training in NVIDIA Isaac Sim with PhysX physics and RTX rendering.

isaac_trileg (Isaac Sim)
    Trileg robot balancing and locomotion task in Isaac Sim with the USD asset pipeline.
Selecting a Preset: Load a preset by specifying its name in the JOSHUA launch command: joshua --preset so_arm100_sim_interactive. Presets can be overridden with additional command-line flags or by providing a custom .pbtxt configuration file that extends the preset values.