rewards

Functions:

| Name | Description |
| --- | --- |
| `lin_vel_l2` | Penalize base linear velocity using an L2 squared kernel. |
| `ang_vel_l2` | Penalize base angular velocity using an L2 squared kernel. |
| `pos_error_l2` | Penalize the asset's position error from its target position using an L2 squared kernel. |
| `pos_error_tanh` | Penalize the asset's position error from its target position using a tanh kernel. |
| `yaw_error_l2` | Penalize the heading error from the target heading using an L2 squared kernel. |
| `yaw_error_tanh` | Penalize the heading error from the target heading using a tanh kernel. |
| `track_lin_vel_z_exp` | Reward tracking of linear velocity commands (z axis) using an exponential kernel. |
| `track_lin_vel_exp` | Reward tracking of linear velocity commands using an exponential kernel. |
| `track_yaw_vel_exp` | Reward tracking of angular velocity commands (yaw) using an exponential kernel. |
| `bimodal_action_tanh` | Penalize bimodal actions using a tanh kernel. |
| `bimodal_height_tanh` | Penalize bimodal height using a tanh kernel. |
| `contact_impulse` | Penalize excessive contact impulse (rate of change of contact forces). |
| `bimodal_contacts` | Penalize contacts when switching from flight to ground mode. |

lin_vel_l2

lin_vel_l2(env: ManagerBasedRLEnv, asset_cfg: SceneEntityCfg = SceneEntityCfg('robot')) -> torch.Tensor

Penalize base linear velocity using an L2 squared kernel.
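The library implementation operates on batched torch tensors from the environment; as a standalone scalar sketch (an illustration of the kernel, not the actual implementation), the L2 squared kernel simply sums the squared components of the velocity:

```python
# Illustrative sketch (not the library code): the L2 squared kernel
# sums the squared components of the base velocity, so larger
# velocities are penalized quadratically.
def l2_squared(vec):
    """Return the sum of squared components of `vec`."""
    return sum(v * v for v in vec)

# A base linear velocity of (0.3, -0.4, 0.0) m/s yields a penalty
# term of 0.3**2 + 0.4**2, i.e. about 0.25.
penalty = l2_squared([0.3, -0.4, 0.0])
```

The same kernel applies to `ang_vel_l2`, with angular velocity components in place of linear ones.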

ang_vel_l2

ang_vel_l2(env: ManagerBasedRLEnv, asset_cfg: SceneEntityCfg = SceneEntityCfg('robot')) -> torch.Tensor

Penalize base angular velocity using an L2 squared kernel.

pos_error_l2

pos_error_l2(env: ManagerBasedRLEnv, command_name: str) -> torch.Tensor

Penalize the asset's position error from its target position using an L2 squared kernel.

pos_error_tanh

pos_error_tanh(env: ManagerBasedRLEnv, std: float, command_name: str) -> torch.Tensor

Penalize the asset's position error from its target position using a tanh kernel.
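Assuming the tanh kernel takes the common form `tanh(||error|| / std)` (an assumption; the exact form is not shown here), a near-zero error produces a near-zero penalty and the penalty saturates at 1 for errors much larger than `std`:

```python
import math

# Illustrative sketch, assuming the form tanh(error / std): bounded
# in [0, 1), so distant targets cannot dominate the reward signal.
def tanh_kernel(error, std):
    """Map a non-negative error magnitude into [0, 1)."""
    return math.tanh(error / std)
```

The bounded output is the main reason to prefer a tanh kernel over L2 squared when the error can be large, as in `pos_error_tanh` and `yaw_error_tanh`.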

yaw_error_l2

yaw_error_l2(env: ManagerBasedRLEnv, command_name: str) -> torch.Tensor

Penalize the heading error from the target heading using an L2 squared kernel.

yaw_error_tanh

yaw_error_tanh(env: ManagerBasedRLEnv, std: float, command_name: str) -> torch.Tensor

Penalize the heading error from the target heading using a tanh kernel.

track_lin_vel_z_exp

track_lin_vel_z_exp(env: ManagerBasedRLEnv, std: float, command_name: str, is_bimodal: bool = False, asset_cfg: SceneEntityCfg = SceneEntityCfg('robot')) -> torch.Tensor

Reward tracking of linear velocity commands (z axis) using an exponential kernel.

track_lin_vel_exp

track_lin_vel_exp(env: ManagerBasedRLEnv, std: float, command_name: str, asset_cfg: SceneEntityCfg = SceneEntityCfg('robot')) -> torch.Tensor

Reward tracking of linear velocity commands using an exponential kernel.
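A common form for this kind of exponential tracking kernel is `exp(-||v_cmd - v||^2 / std^2)` (an assumption about the exact expression): a perfect match yields a reward of 1, decaying smoothly toward 0 as the tracking error grows, with `std` controlling the decay rate. A standalone scalar sketch:

```python
import math

# Illustrative sketch, assuming the form exp(-||cmd - actual||^2 / std^2).
# Exact tracking gives reward 1.0; the reward falls off smoothly with
# squared error, at a rate set by `std`.
def track_exp(cmd, actual, std):
    err_sq = sum((c - a) ** 2 for c, a in zip(cmd, actual))
    return math.exp(-err_sq / std**2)
```

The same kernel shape applies to `track_lin_vel_z_exp` and `track_yaw_vel_exp`, with the relevant velocity component substituted in.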

track_yaw_vel_exp

track_yaw_vel_exp(env: ManagerBasedRLEnv, std: float, command_name: str, asset_cfg: SceneEntityCfg = SceneEntityCfg('robot')) -> torch.Tensor

Reward tracking of angular velocity commands (yaw) using an exponential kernel.

bimodal_action_tanh

bimodal_action_tanh(env: ManagerBasedRLEnv, std: float, command_name: str, flight_action_name: str = 'control_action', ground_action_name: str = 'track_control_action', ground_weight: float = 1.0, flight_weight: float = 1.0) -> torch.Tensor

Penalize bimodal actions using a tanh kernel.

bimodal_height_tanh

bimodal_height_tanh(env: ManagerBasedRLEnv, std: float, command_name: str, asset_cfg: SceneEntityCfg = SceneEntityCfg('robot'), ground_weight: float = 1.0, flight_weight: float = 1.0) -> torch.Tensor

Penalize bimodal height using a tanh kernel.

contact_impulse

contact_impulse(env: ManagerBasedRLEnv, threshold: float, sensor_cfg: SceneEntityCfg, mode: str = 'threshold') -> torch.Tensor

Penalize excessive contact impulse (rate of change of contact forces).

This function calculates the impulse as the change in contact forces between consecutive time steps and penalizes values that exceed a threshold.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `env` | `ManagerBasedRLEnv` | The learning environment. | required |
| `threshold` | `float` | Maximum acceptable impulse magnitude. Impulses below this are not penalized. | required |
| `sensor_cfg` | `SceneEntityCfg` | Configuration for the contact sensor, specifying which bodies to monitor. | required |
| `mode` | `str` | Penalty calculation mode: `"threshold"` penalizes only the amount exceeding the threshold (continuous); `"binary"` returns 1.0 if any impulse exceeds the threshold, else 0.0 (discrete); `"total"` returns the total impulse magnitude regardless of threshold (for monitoring). | `'threshold'` |

Returns:

| Type | Description |
| --- | --- |
| `torch.Tensor` | Penalty value for each environment: `"threshold"` mode returns the sum of `(impulse - threshold)` over all violations; `"binary"` mode returns 1.0 if a violation exists, 0.0 otherwise; `"total"` mode returns the total impulse magnitude. |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If the sensor history length is less than 2 or the mode is invalid. |

Examples:

>>> # Penalize hard landings (impulse > 50 N·s)
>>> impulse_penalty = contact_impulse(
...     env, threshold=50.0,
...     sensor_cfg=SceneEntityCfg("contact_sensor", body_ids=[arml_id, armr_id]),
...     mode="threshold"
... )
>>> rewards["impulse_penalty"] = impulse_penalty * -1.0
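The `"threshold"` mode described above can be sketched per body as a standalone scalar computation (an illustration only; the library version works on batched sensor tensors):

```python
# Illustrative sketch of "threshold" mode: the impulse is approximated
# as the change in contact force between consecutive time steps, and
# only the excess over `threshold` contributes to the penalty.
def contact_impulse_penalty(forces_prev, forces_curr, threshold):
    """Sum of (impulse - threshold) over all bodies whose impulse exceeds it."""
    penalty = 0.0
    for f_prev, f_curr in zip(forces_prev, forces_curr):
        impulse = abs(f_curr - f_prev)
        if impulse > threshold:
            penalty += impulse - threshold
    return penalty
```

With forces of (10, 5) N at one step and (70, 6) N at the next, only the first body's impulse of 60 exceeds a threshold of 50, giving a penalty of 10. This is why the docstring notes the sensor must keep a history of at least 2 steps: the previous forces are needed to form the difference.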

bimodal_contacts

bimodal_contacts(env: ManagerBasedRLEnv, command_name: str, threshold: float, sensor_cfg: SceneEntityCfg, mode: str = 'threshold', ground_weight: float = 1.0, flight_weight: float = 1.0) -> torch.Tensor

Penalize contacts when switching from flight to ground mode.
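Several of the bimodal terms above take `ground_weight` and `flight_weight` parameters. A plausible standalone sketch of how such mode-dependent weighting could work (an assumption; in the library the active mode would come from the named command, not a boolean flag):

```python
# Hypothetical sketch: scale a base penalty by a per-mode weight,
# so ground-mode and flight-mode violations can be penalized differently.
def bimodal_penalty(base_penalty, in_flight, ground_weight=1.0, flight_weight=1.0):
    """Apply the flight weight in flight mode, the ground weight otherwise."""
    weight = flight_weight if in_flight else ground_weight
    return weight * base_penalty
```

Setting one weight to 0.0 would disable the penalty entirely in that mode, which is one way such a term could tolerate contacts on the ground while penalizing them in flight.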