
LAV2_base_moe

rsl-rl config for the LAV2 base task with a Mixture-of-Experts trunk.

Mirrors lav2/runner/skrl/cfg/LAV2_base_moe.py but adapted to the rsl-rl v5 native dict format. Uses :class:`~lav2.runner.rsl_rl.models.moe.MoEMLPModel` as the actor model class and :class:`~lav2.runner.rsl_rl.algorithms.moe.PPO_MoE` for post-update MoE bias correction.

Usage (Isaac Lab Hydra entry point)::

    "rsl_rl_cfg_entry_point": "lav2.runner.rsl_rl.cfg.LAV2_base_moe:LAV2MoEPPORunnerCfg"

Usage (direct script)::

    from lav2.runner.rsl_rl.cfg.LAV2_base_moe import get_runner_cfg
    cfg = get_runner_cfg(experiment_name="my_run")
    runner = OnPolicyRunner(env, cfg, log_dir, device=device)

Classes:

    BetaDistributionCfg
        Configuration for the Beta output distribution.

    LAV2MoEPPORunnerCfg
        Hydra-compatible config class for the MoE task.

Functions:

    get_runner_cfg
        Return a rsl-rl v5 native config dict for the MoE task.

BetaDistributionCfg

Configuration for the Beta output distribution.
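As a rough illustration of how a Beta output distribution interacts with the action_range parameter below (this affine rescaling is a common convention, assumed here rather than taken from BetaDistributionCfg itself): Beta samples lie in (0, 1) and are mapped onto the configured action bounds.

```python
import random

def beta_to_action(x: float, action_range: tuple[float, float]) -> float:
    """Affinely map a Beta sample x in (0, 1) onto [low, high]."""
    low, high = action_range
    return low + (high - low) * x

# With the default action_range=(-1.0, 1.0), a Beta sample of 0.5
# lands at the midpoint of the action space, and every sample stays
# inside the bounds by construction.
mid = beta_to_action(0.5, (-1.0, 1.0))
sample = beta_to_action(random.betavariate(2.0, 2.0), (-1.0, 1.0))
```

Because the support is bounded, no clipping of actions is needed, which is one common motivation for preferring Beta over Gaussian policies in bounded action spaces.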

LAV2MoEPPORunnerCfg

Bases: RslRlOnPolicyRunnerCfg

Hydra-compatible config class for the MoE task.

get_runner_cfg

get_runner_cfg(
    experiment_name: str = _EXPERIMENT_NAME,
    max_iterations: int = _MAX_ITERATIONS,
    num_experts: int = 4,
    k: int = 2,
    init_std: float = 1.0,
    distribution: str = 'beta',
    action_range: tuple[float, float] = (-1.0, 1.0),
) -> dict

Return a rsl-rl v5 native config dict for the MoE task.

Parameters:

    experiment_name (str, default: _EXPERIMENT_NAME)
        Logging directory name.

    max_iterations (int, default: _MAX_ITERATIONS)
        Number of PPO iterations.

    num_experts (int, default: 4)
        Number of experts in the MoE layer.

    k (int, default: 2)
        Top-k experts routed to per token.

    init_std (float, default: 1.0)
        Initial standard deviation (Gaussian only). Matches the skrl MoE config's log_std=0.

    distribution (str, default: 'beta')
        Output distribution, "beta" or "gaussian".

    action_range (tuple[float, float], default: (-1.0, 1.0))
        Action space bounds (Beta only).
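To make concrete what num_experts and k control, here is a minimal pure-Python sketch of top-k MoE routing. The function name and logic are illustrative only, not the actual MoEMLPModel implementation:

```python
import math

def top_k_route(logits: list[float], k: int) -> dict[int, float]:
    """Pick the k highest-scoring experts and softmax-normalize
    their gate weights so the selected weights sum to 1."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    exps = {i: math.exp(logits[i]) for i in top}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}

# With num_experts=4 and k=2, each input activates only its two
# best-scoring experts; the other two receive zero weight.
weights = top_k_route([0.1, 2.0, -1.0, 1.5], k=2)
```

Because only k of num_experts experts run per input, some experts can starve of gradient signal; the post-update bias correction in PPO_MoE mentioned above presumably addresses this kind of routing imbalance.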