rewards
Functions:
| Name | Description |
|---|---|
| `lin_vel_l2` | Penalize base linear velocity using an L2 squared kernel. |
| `ang_vel_l2` | Penalize base angular velocity using an L2 squared kernel. |
| `pos_error_l2` | Penalize the deviation of the asset's position from its target position using an L2 squared kernel. |
| `pos_error_tanh` | Penalize the deviation of the asset's position from its target position using a tanh kernel. |
| `yaw_error_l2` | Penalize heading error from the target heading using an L2 squared kernel. |
| `yaw_error_tanh` | Penalize heading error from the target heading using a tanh kernel. |
| `track_lin_vel_z_exp` | Reward tracking of linear velocity commands (z axis) using an exponential kernel. |
| `track_lin_vel_exp` | Reward tracking of linear velocity commands using an exponential kernel. |
| `track_yaw_vel_exp` | Reward tracking of angular velocity commands (yaw) using an exponential kernel. |
| `bimodal_action_tanh` | Penalize bimodal actions using a tanh kernel. |
| `bimodal_height_tanh` | Penalize bimodal height using a tanh kernel. |
| `contact_impulse` | Penalize excessive contact impulse (rate of change of contact forces). |
| `bimodal_contacts` | Penalize contacts when switching from flight to ground mode. |
lin_vel_l2
`lin_vel_l2(env: ManagerBasedRLEnv, asset_cfg: SceneEntityCfg = SceneEntityCfg('robot')) -> torch.Tensor`
Penalize base linear velocity using an L2 squared kernel.
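The L2 squared kernel shared by `lin_vel_l2` and `ang_vel_l2` can be sketched as below. This is a minimal illustration on a plain tensor; how the actual functions read the base velocity from the asset is not shown in this reference, so no asset API is assumed.

```python
import torch

def l2_squared_penalty(vel: torch.Tensor) -> torch.Tensor:
    # Sum of squared velocity components per environment: [num_envs, 3] -> [num_envs].
    return torch.sum(torch.square(vel), dim=1)

# Two environments: one at rest, one moving at 1 m/s in x and 2 m/s in y.
vel = torch.tensor([[0.0, 0.0, 0.0], [1.0, 2.0, 0.0]])
penalty = l2_squared_penalty(vel)  # tensor([0., 5.])
```

The penalty grows quadratically with speed, so large velocities are punished disproportionately, which is the usual motivation for the L2 squared kernel.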
ang_vel_l2
`ang_vel_l2(env: ManagerBasedRLEnv, asset_cfg: SceneEntityCfg = SceneEntityCfg('robot')) -> torch.Tensor`
Penalize base angular velocity using an L2 squared kernel.
pos_error_l2
`pos_error_l2(env: ManagerBasedRLEnv, command_name: str) -> torch.Tensor`
Penalize the deviation of the asset's position from its target position using an L2 squared kernel.
pos_error_tanh
`pos_error_tanh(env: ManagerBasedRLEnv, std: float, command_name: str) -> torch.Tensor`
Penalize the deviation of the asset's position from its target position using a tanh kernel.
yaw_error_l2
`yaw_error_l2(env: ManagerBasedRLEnv, command_name: str) -> torch.Tensor`
Penalize heading error from the target heading using an L2 squared kernel.
yaw_error_tanh
`yaw_error_tanh(env: ManagerBasedRLEnv, std: float, command_name: str) -> torch.Tensor`
Penalize heading error from the target heading using a tanh kernel.
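A minimal sketch of a tanh-kernel heading penalty follows. The angle-wrapping step is an assumption (the source does not show how the error is normalized), and `wrap_to_pi` is written inline here rather than taken from any library.

```python
import math
import torch

def wrap_to_pi(angle: torch.Tensor) -> torch.Tensor:
    # Wrap angles into [-pi, pi) so the shortest angular distance is used.
    return (angle + math.pi) % (2.0 * math.pi) - math.pi

def yaw_error_tanh_sketch(yaw: torch.Tensor, target_yaw: torch.Tensor, std: float) -> torch.Tensor:
    # Penalty in [0, 1): zero at the target heading, saturating toward 1
    # as the wrapped heading error grows relative to std.
    error = torch.abs(wrap_to_pi(target_yaw - yaw))
    return torch.tanh(error / std)

yaw = torch.tensor([0.0, 0.0])
target = torch.tensor([0.0, math.pi / 2])
penalty = yaw_error_tanh_sketch(yaw, target, std=1.0)
```

Unlike the L2 squared kernel, the tanh kernel saturates, so a 90-degree error and a 170-degree error yield nearly the same penalty; `std` controls how quickly that saturation sets in.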
track_lin_vel_z_exp
`track_lin_vel_z_exp(env: ManagerBasedRLEnv, std: float, command_name: str, is_bimodal: bool = False, asset_cfg: SceneEntityCfg = SceneEntityCfg('robot')) -> torch.Tensor`
Reward tracking of linear velocity commands (z axis) using an exponential kernel.
track_lin_vel_exp
`track_lin_vel_exp(env: ManagerBasedRLEnv, std: float, command_name: str, asset_cfg: SceneEntityCfg = SceneEntityCfg('robot')) -> torch.Tensor`
Reward tracking of linear velocity commands using an exponential kernel.
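The exponential tracking kernel used by the `track_*_exp` functions can be sketched as follows. Normalizing the squared error by `std**2` is an assumption about the kernel width; the actual implementation may scale by `std` differently.

```python
import torch

def track_lin_vel_exp_sketch(lin_vel: torch.Tensor, command: torch.Tensor, std: float) -> torch.Tensor:
    # Reward in (0, 1]: 1.0 for perfect tracking, decaying exponentially
    # with the squared velocity error.
    error = torch.sum(torch.square(command - lin_vel), dim=1)
    return torch.exp(-error / std**2)

command = torch.tensor([[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
lin_vel = torch.tensor([[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
reward = track_lin_vel_exp_sketch(lin_vel, command, std=0.5)
```

Because the kernel is bounded in (0, 1], it rewards tracking without dominating the total return, which is why tracking terms are commonly expressed this way rather than as negated L2 penalties.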
track_yaw_vel_exp
`track_yaw_vel_exp(env: ManagerBasedRLEnv, std: float, command_name: str, asset_cfg: SceneEntityCfg = SceneEntityCfg('robot')) -> torch.Tensor`
Reward tracking of angular velocity commands (yaw) using an exponential kernel.
bimodal_action_tanh
`bimodal_action_tanh(env: ManagerBasedRLEnv, std: float, command_name: str, flight_action_name: str = 'control_action', ground_action_name: str = 'track_control_action', ground_weight: float = 1.0, flight_weight: float = 1.0) -> torch.Tensor`
Penalize bimodal actions using a tanh kernel.
bimodal_height_tanh
`bimodal_height_tanh(env: ManagerBasedRLEnv, std: float, command_name: str, asset_cfg: SceneEntityCfg = SceneEntityCfg('robot'), ground_weight: float = 1.0, flight_weight: float = 1.0) -> torch.Tensor`
Penalize bimodal height using a tanh kernel.
contact_impulse
`contact_impulse(env: ManagerBasedRLEnv, threshold: float, sensor_cfg: SceneEntityCfg, mode: str = 'threshold') -> torch.Tensor`
Penalize excessive contact impulse (rate of change of contact forces).
This function computes the impulse as the change in contact forces between consecutive time steps and penalizes values that exceed a threshold.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `env` | `ManagerBasedRLEnv` | The learning environment. | required |
| `threshold` | `float` | Maximum acceptable impulse magnitude. Forces below this are not penalized. | required |
| `sensor_cfg` | `SceneEntityCfg` | Configuration for the contact sensor, specifying which bodies to monitor. | required |
| `mode` | `str` | Penalty calculation mode. `"threshold"`: penalize only the amount exceeding the threshold (continuous). `"binary"`: return 1.0 if any impulse exceeds the threshold, else 0.0 (discrete). `"total"`: return the total impulse magnitude regardless of the threshold (for monitoring). | `'threshold'` |
Returns:

| Type | Description |
|---|---|
| `torch.Tensor` | Penalty value for each environment. `"threshold"` mode: sum of (impulse - threshold) over all violations. `"binary"` mode: 1.0 if a violation exists, 0.0 otherwise. `"total"` mode: total impulse magnitude. |
Raises:

| Type | Description |
|---|---|
| `ValueError` | If the sensor history length is less than 2 or the mode is invalid. |
Examples:
>>> # Penalize hard landings (impulse > 50 N·s)
>>> impulse_penalty = contact_impulse(
... env, threshold=50.0,
... sensor_cfg=SceneEntityCfg("contact_sensor", body_ids=[arml_id, armr_id]),
... mode="threshold"
... )
>>> rewards["impulse_penalty"] = impulse_penalty * -1.0
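The impulse computation and the three modes can be sketched in isolation as below. The history layout (`[num_envs, history, num_bodies, 3]` with index 0 as the most recent sample) is an assumption about the sensor buffer, not the source's actual data structure.

```python
import torch

def contact_impulse_sketch(net_forces_hist: torch.Tensor, threshold: float,
                           mode: str = "threshold") -> torch.Tensor:
    # net_forces_hist: [num_envs, history, num_bodies, 3]; index 0 is assumed
    # to hold the most recent sample.
    if net_forces_hist.shape[1] < 2:
        raise ValueError("Sensor history length must be at least 2.")
    # Impulse proxy: change in contact force between the two latest steps.
    impulse = torch.norm(net_forces_hist[:, 0] - net_forces_hist[:, 1], dim=-1)
    if mode == "threshold":
        # Continuous penalty: only the excess over the threshold counts.
        return torch.sum(torch.clamp(impulse - threshold, min=0.0), dim=1)
    if mode == "binary":
        # Discrete flag: any violating body triggers a penalty of 1.0.
        return torch.any(impulse > threshold, dim=1).float()
    if mode == "total":
        # Monitoring value: total impulse magnitude, threshold ignored.
        return torch.sum(impulse, dim=1)
    raise ValueError(f"Invalid mode: {mode}")

# One environment, one body: force jumps from 0 N to 10 N in a single step.
hist = torch.zeros(1, 2, 1, 3)
hist[:, 0, 0, 0] = 10.0
```

With a threshold of 4, this history yields 6.0 in `"threshold"` mode, 1.0 in `"binary"` mode, and 10.0 in `"total"` mode.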
bimodal_contacts
`bimodal_contacts(env: ManagerBasedRLEnv, command_name: str, threshold: float, sensor_cfg: SceneEntityCfg, mode: str = 'threshold', ground_weight: float = 1.0, flight_weight: float = 1.0) -> torch.Tensor`
Penalize contacts when switching from flight to ground mode.
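The `ground_weight`/`flight_weight` parameters shared by the `bimodal_*` functions suggest a per-environment gating of the penalty by the commanded mode. A hypothetical sketch of that pattern follows; the `in_flight` flag and the weighting scheme are assumptions for illustration, not the source's implementation.

```python
import torch

def bimodal_weighting_sketch(penalty: torch.Tensor, in_flight: torch.Tensor,
                             ground_weight: float = 1.0,
                             flight_weight: float = 1.0) -> torch.Tensor:
    # Scale a per-environment penalty by the weight of the commanded mode:
    # flight_weight where the command is flight, ground_weight otherwise.
    return torch.where(in_flight, penalty * flight_weight, penalty * ground_weight)

penalty = torch.tensor([2.0, 2.0])
in_flight = torch.tensor([True, False])
weighted = bimodal_weighting_sketch(penalty, in_flight,
                                    ground_weight=0.5, flight_weight=2.0)
```

Separate weights let the same penalty term be emphasized in one mode and relaxed in the other, e.g. tolerating contacts on the ground while strongly discouraging them during flight.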