
attitude

Shared model with rotation-manifold projection layers.

Control modes and their action dimensions::

    cmd_ctatt_euler   → 4D   [thrust, roll, pitch, yaw]
    cmd_ctatt_rotvec  → 4D   [thrust, rx, ry, rz]
    cmd_ctatt_quat    → 5D   [thrust, qx, qy, qz, qw]
    cmd_ctatt_rotmat  → 10D  [thrust, r00..r22]
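The layout above can be captured in a small lookup table. The following is an illustrative sketch only: `ACTION_DIMS` and `split_action` are hypothetical names introduced here and are not part of the module.

```python
# Hypothetical helper: mode names and dimensions are copied from the list above.
ACTION_DIMS = {
    "cmd_ctatt_euler": 4,    # [thrust, roll, pitch, yaw]
    "cmd_ctatt_rotvec": 4,   # [thrust, rx, ry, rz]
    "cmd_ctatt_quat": 5,     # [thrust, qx, qy, qz, qw]
    "cmd_ctatt_rotmat": 10,  # [thrust, r00..r22]
}


def split_action(mode, action):
    """Split a flat action into (thrust, rotation components)."""
    expected = ACTION_DIMS[mode]
    if len(action) != expected:
        raise ValueError(f"{mode} expects {expected} values, got {len(action)}")
    return action[0], list(action[1:])


thrust, quat = split_action("cmd_ctatt_quat", [0.5, 0.0, 0.0, 0.0, 1.0])
```

In every mode the thrust occupies index 0, so the rotation component always starts at index 1.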

Projection layers ensure that the rotation component of the policy output is a valid rotation representation (for example a unit quaternion or a matrix in SO(3)). Available variants per rotation type:

============== ============================================================
Euler / rotvec EulerProjection (tanh clamp)
Quaternion     QuatProjection (tanh + L2-normalize)
               QuatPlusProjection (tanh + sigmoid w + L2-normalize)
               QuatOffsetProjection (offset by identity quat)
               QuatPlusOffsetProjection (offset + sigmoid w)
Rotation       RotmatProjection (tanh + SVD-orthogonalize)
matrix         RotmatOffsetProjection (offset by identity matrix)
============== ============================================================

Select a non-default projection by passing ``projection_cls`` when constructing :class:`SharedAttitude`. The config module re-exports the projection classes for compatibility with existing experiments.

Classes:

Name Description
EulerProjection

Bounded projection for euler-angle or rotation-vector actions.

QuatProjection

L2-normalize the 4-D quaternion at indices [1:5].

QuatPlusProjection

Quaternion projection with double-cover elimination.

QuatOffsetProjection

Quaternion projection centred at the identity rotation.

QuatPlusOffsetProjection

Quaternion projection combining offset and double-cover elimination.

RotmatProjection

SVD-orthogonalize the 9-D rotation-matrix at indices [1:10].

RotmatOffsetProjection

Rotation-matrix projection centred at the identity matrix.

SharedAttitude

Shared actor-critic with rotation-manifold policy projection.

EulerProjection

Bases: Module

Bounded projection for euler-angle or rotation-vector actions.

Applies tanh to clamp all components to [-1, 1] so that the environment action scaling (e.g. multiplying by attitude_range) stays within the expected range. No geometric constraint is needed beyond clipping.

References

so3_primer.rotations.modules.ddpg.activations — plain nn.Tanh() used for euler / tangent quadrotor actors.

Methods:

Name Description
forward

Clamp x to [-1, 1] via tanh.

forward

forward(x: Tensor) -> torch.Tensor

Clamp x to [-1, 1] via tanh.
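A minimal sketch of the tanh bound and the downstream scaling it protects. `TanhClamp` is a hypothetical stand-in rather than the actual class, and the `attitude_range` value is an assumed example of an environment-side scale factor.

```python
import torch
from torch import nn


class TanhClamp(nn.Module):
    """Hypothetical stand-in for a tanh-bounded projection layer."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Smoothly squashes every component into [-1, 1].
        return torch.tanh(x)


attitude_range = 0.5  # assumed scale factor applied by the environment
raw = torch.tensor([[3.0, -10.0, 0.0, 0.2]])  # [thrust, roll, pitch, yaw]
bounded = TanhClamp()(raw)
# Scaling the bounded attitude channels can never exceed attitude_range.
scaled_attitude = bounded[..., 1:] * attitude_range
```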

QuatProjection

Bases: Module

L2-normalize the 4-D quaternion at indices [1:5].

Applies tanh to bound all components, then normalizes the quaternion slice to unit length. The thrust channel (index 0) is passed through unchanged.

This is the simplest quaternion projection — the double-cover ambiguity (q and -q represent the same rotation) is left for the policy to resolve.

References

so3_primer.rotations.modules.ddpg.activations.QuatQuadrotor

Methods:

Name Description
forward

Normalize quaternion slice of x.

forward

forward(x: Tensor) -> torch.Tensor

Normalize quaternion slice of x.
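The tanh-then-normalize step might look like the sketch below. `quat_project` is a hypothetical function name, and the sketch reads "all components" as including the thrust channel, which the normalization itself does not touch; the real layer may treat thrust differently.

```python
import torch
import torch.nn.functional as F


def quat_project(x: torch.Tensor) -> torch.Tensor:
    """tanh-bound the action, then unit-normalize the quaternion slice [1:5]."""
    y = torch.tanh(x)
    # Only the quaternion slice is rescaled to unit L2 norm.
    quat = F.normalize(y[..., 1:5], dim=-1)
    return torch.cat([y[..., :1], quat], dim=-1)


# Layout: [thrust, qx, qy, qz, qw]
out = quat_project(torch.tensor([[0.3, 0.5, -1.0, 2.0, 0.1]]))
quat_norm = out[..., 1:5].norm(dim=-1)
```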

QuatPlusProjection

Bases: Module

Quaternion projection with double-cover elimination.

The scalar component w is mapped through a sigmoid, whose range is [0, 1] rather than tanh's [-1, 1], so that the output quaternion always satisfies w ≥ 0, breaking the q ≡ -q ambiguity. The vector components (x, y, z) use standard tanh. This mirrors so3_primer.rotations.modules.ddpg.activations.QuatPlusQuadrotor and so3_primer.rotations.modules.ppo.std.QuatPlusQuadrotorActor.

Methods:

Name Description
forward

Match the primer quadrotor quat-plus head.

forward

forward(x: Tensor) -> torch.Tensor

Match the primer quadrotor quat-plus head.
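One way to picture the double-cover elimination is the sketch below. It assumes the `[thrust, qx, qy, qz, qw]` layout (so w sits at index 4), uses `torch.sigmoid` for the w channel, and bounds thrust with tanh; `quat_plus_project` is a hypothetical name and these details may differ from the primer implementation.

```python
import torch
import torch.nn.functional as F


def quat_plus_project(x: torch.Tensor) -> torch.Tensor:
    """Bound (x, y, z) with tanh, force w >= 0 with a sigmoid, then normalize."""
    thrust = torch.tanh(x[..., :1])   # assumed bound on the thrust channel
    xyz = torch.tanh(x[..., 1:4])     # vector part in [-1, 1]
    w = torch.sigmoid(x[..., 4:5])    # scalar part in (0, 1), never negative
    quat = F.normalize(torch.cat([xyz, w], dim=-1), dim=-1)
    return torch.cat([thrust, quat], dim=-1)


torch.manual_seed(0)
out = quat_plus_project(torch.randn(8, 5))
```

Because normalization only rescales by a positive factor, the sign of w is preserved and every output lies on the w ≥ 0 half of the unit sphere.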

QuatOffsetProjection

QuatOffsetProjection()

Bases: Module

Quaternion projection centred at the identity rotation.

Adds the identity quaternion [0, 0, 0, 1] to the policy output, clamps to [-1, 1], then L2-normalizes. This centres the action distribution on zero-rotation so that a zero policy output commands hover attitude.

References

so3_primer.rotations.modules.ddpg.activations.QuatQuadrotorOffset

Register the identity-quaternion offset buffer.

Methods:

Name Description
forward

Offset x by identity quaternion, then normalize.

forward

forward(x: Tensor) -> torch.Tensor

Offset x by identity quaternion, then normalize.
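The identity-offset idea can be sketched as follows. `QuatOffsetSketch` is a hypothetical class; the sketch assumes a hard clamp via `torch.clamp` and leaves the thrust channel untouched, both of which are assumptions about details the description does not spell out.

```python
import torch
import torch.nn.functional as F
from torch import nn


class QuatOffsetSketch(nn.Module):
    """Hypothetical identity-centred quaternion projection."""

    def __init__(self):
        super().__init__()
        # A buffer moves with the module across devices but is not trained.
        self.register_buffer("offset", torch.tensor([0.0, 0.0, 0.0, 1.0]))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        quat = torch.clamp(x[..., 1:5] + self.offset, -1.0, 1.0)
        quat = F.normalize(quat, dim=-1)
        return torch.cat([x[..., :1], quat], dim=-1)


# A zero policy output maps to the identity quaternion [0, 0, 0, 1].
out = QuatOffsetSketch()(torch.zeros(1, 5))
```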

QuatPlusOffsetProjection

QuatPlusOffsetProjection()

Bases: Module

Quaternion projection combining offset and double-cover elimination.

Offsets the vector part by zero (adds [0, 0, 0]), maps w through a sigmoid channel, then L2-normalizes.

References

Combines so3_primer QuatPlusQuadrotor (sigmoid w) with QuatQuadrotorOffset (identity offset).

Register the zero-offset buffer for the vector part.

Methods:

Name Description
forward

Offset vector part by zero, force w ≥ 0, then normalize.

forward

forward(x: Tensor) -> torch.Tensor

Offset vector part by zero, force w ≥ 0, then normalize.
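A rough sketch of the combined variant, under stated assumptions: the vector part is bounded with tanh after its zero offset, w uses `torch.sigmoid`, and thrust is passed through. `quat_plus_offset_project` is a hypothetical name.

```python
import torch
import torch.nn.functional as F


def quat_plus_offset_project(x: torch.Tensor) -> torch.Tensor:
    """Zero-offset the vector part, force w >= 0, then normalize."""
    vec_offset = torch.zeros(3)                   # zero offset for (x, y, z)
    xyz = torch.tanh(x[..., 1:4] + vec_offset)    # assumed tanh bound
    w = torch.sigmoid(x[..., 4:5])                # in (0, 1), so w >= 0
    quat = F.normalize(torch.cat([xyz, w], dim=-1), dim=-1)
    return torch.cat([x[..., :1], quat], dim=-1)


out = quat_plus_offset_project(torch.zeros(1, 5))
```

Under these assumptions a zero input gives xyz = 0 and w = 0.5 before normalization, which normalizes to the identity quaternion [0, 0, 0, 1].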

RotmatProjection

Bases: Module

SVD-orthogonalize the 9-D rotation-matrix at indices [1:10].

Reshapes the matrix slice to (..., 3, 3), applies tanh to bound elements, then computes the least-squares orthogonal Procrustes projection via SVD. A determinant correction ensures the result lies in SO(3) (det=+1) rather than O(3).

The thrust channel (index 0) is passed through unchanged.

References

so3_primer.rotations.modules.ddpg.activations.MatrixQuadrotor

Methods:

Name Description
forward

SVD-project the matrix slice of x onto SO(3).

forward

forward(x: Tensor) -> torch.Tensor

SVD-project the matrix slice of x onto SO(3).
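The SVD projection with determinant correction can be sketched as below. `project_to_so3` and `rotmat_project` are hypothetical names; the sketch follows the standard orthogonal Procrustes construction R = U diag(1, 1, det(U Vᵀ)) Vᵀ and is not taken from the module's source.

```python
import torch


def project_to_so3(m: torch.Tensor) -> torch.Tensor:
    """Closest rotation (in Frobenius norm) to each 3x3 matrix in m."""
    u, _, vh = torch.linalg.svd(m)
    # det(U @ Vh) is +1 or -1; flip the last singular direction when it is -1
    # so the result has det = +1 (SO(3)) instead of a reflection in O(3).
    det = torch.linalg.det(u @ vh)
    ones = torch.ones_like(det)
    d = torch.stack([ones, ones, torch.sign(det)], dim=-1)
    return (u * d[..., None, :]) @ vh


def rotmat_project(x: torch.Tensor) -> torch.Tensor:
    """Bound the matrix slice [1:10] with tanh, then project it onto SO(3)."""
    m = torch.tanh(x[..., 1:10]).reshape(*x.shape[:-1], 3, 3)
    r = project_to_so3(m)
    return torch.cat([x[..., :1], r.flatten(-2)], dim=-1)


torch.manual_seed(0)
out = rotmat_project(torch.randn(4, 10, dtype=torch.float64))
r = out[..., 1:].reshape(4, 3, 3)
```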

RotmatOffsetProjection

RotmatOffsetProjection()

Bases: Module

Rotation-matrix projection centred at the identity matrix.

Adds the flattened 3×3 identity to the policy output, clamps to [-1, 1], then SVD-projects onto SO(3). A zero policy output therefore commands hover attitude.

References

so3_primer.rotations.modules.ddpg.activations.MatrixQuadrotorOffset

Register the identity-matrix offset buffer.

Methods:

Name Description
forward

Offset x by identity matrix, then SVD-project.

forward

forward(x: Tensor) -> torch.Tensor

Offset x by identity matrix, then SVD-project.
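A sketch of the identity-centred matrix projection, assuming a hard clamp via `torch.clamp` and an untouched thrust channel. `RotmatOffsetSketch` and `project_to_so3` are hypothetical names introduced for illustration.

```python
import torch
from torch import nn


def project_to_so3(m: torch.Tensor) -> torch.Tensor:
    """SVD projection onto SO(3) with a determinant correction."""
    u, _, vh = torch.linalg.svd(m)
    det = torch.linalg.det(u @ vh)
    ones = torch.ones_like(det)
    d = torch.stack([ones, ones, torch.sign(det)], dim=-1)
    return (u * d[..., None, :]) @ vh


class RotmatOffsetSketch(nn.Module):
    """Hypothetical identity-centred rotation-matrix projection."""

    def __init__(self):
        super().__init__()
        # Flattened 3x3 identity, stored as a non-trainable buffer.
        self.register_buffer("offset", torch.eye(3).flatten())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        m = torch.clamp(x[..., 1:10] + self.offset, -1.0, 1.0)
        r = project_to_so3(m.reshape(*x.shape[:-1], 3, 3))
        return torch.cat([x[..., :1], r.flatten(-2)], dim=-1)


# A zero policy output is projected to the identity rotation.
out = RotmatOffsetSketch()(torch.zeros(2, 10))
```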

SharedAttitude

SharedAttitude(observation_space, state_space, action_space, device, clip_actions=False, clip_log_std=True, min_log_std=-20, max_log_std=2, reduction='sum', projection_cls: type[Module] | None = None)

Bases: GaussianMixin, DeterministicMixin, Model

Shared actor-critic with rotation-manifold policy projection.

The trunk is a 2-hidden-layer MLP (128 units, ELU activation) shared between the policy and value heads. The policy mean passes through an optional :class:`nn.Module` projection layer that constrains the rotation component to lie on the correct manifold (e.g. unit quaternion, orthogonal matrix).

The projection is selected by action_dim via :data:`_PROJECTION_MAP` unless an explicit ``projection_cls`` is provided.
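The architecture described above can be approximated in plain PyTorch. The sketch below omits the skrl mixins and the log-std parameter entirely; `SharedTrunkSketch`, `mean_head`, and `value_head` are hypothetical names, and `nn.Tanh()` stands in for a projection layer.

```python
import torch
from torch import nn


class SharedTrunkSketch(nn.Module):
    """Hypothetical shared trunk with policy and value heads."""

    def __init__(self, obs_dim, action_dim, projection=None):
        super().__init__()
        # Two hidden layers of 128 units with ELU, shared by both heads.
        self.trunk = nn.Sequential(
            nn.Linear(obs_dim, 128), nn.ELU(),
            nn.Linear(128, 128), nn.ELU(),
        )
        self.mean_head = nn.Linear(128, action_dim)
        self.value_head = nn.Linear(128, 1)
        self.projection = projection  # optional nn.Module, may be None

    def forward(self, obs):
        features = self.trunk(obs)
        mean = self.mean_head(features)
        if self.projection is not None:
            mean = self.projection(mean)  # only the policy mean is projected
        return mean, self.value_head(features)


model = SharedTrunkSketch(obs_dim=12, action_dim=4, projection=nn.Tanh())
mean, value = model(torch.randn(3, 12))
```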

Initialize shared trunk, heads, and optional projection layer.

Parameters:

Name Type Description Default

observation_space

Environment observation space.

required

state_space

Environment state space.

required

action_space

Environment action space.

required

device

Torch device for parameters and computation.

required

clip_actions

Whether to clip actions to the action space bounds (passed to :class:`GaussianMixin`).

False

clip_log_std

Whether to clip the log standard deviation (passed to :class:`GaussianMixin`).

True

min_log_std

Minimum log-std value.

-20

max_log_std

Maximum log-std value.

2

reduction

Reduction mode for the log-probability.

'sum'

projection_cls

type[Module] | None

Optional override for the rotation-manifold projection layer. When None (default), the projection is auto-detected from action_space via :data:`_PROJECTION_MAP`.

None
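The override-or-default selection could work along these lines. Everything here is illustrative: the placeholder classes only borrow the projection names, `PROJECTION_MAP` has assumed contents (the real `_PROJECTION_MAP` may choose different defaults), and `resolve_projection` is a hypothetical helper.

```python
# Placeholder classes standing in for the real projection layers.
class EulerProjection:
    pass


class QuatProjection:
    pass


class QuatPlusProjection:
    pass


class RotmatProjection:
    pass


# Assumed defaults keyed by action dimension; not the module's actual map.
PROJECTION_MAP = {4: EulerProjection, 5: QuatProjection, 10: RotmatProjection}


def resolve_projection(action_dim, projection_cls=None):
    """Use the explicit override if given, else the default for action_dim."""
    cls = projection_cls if projection_cls is not None else PROJECTION_MAP.get(action_dim)
    return cls() if cls is not None else None


default = resolve_projection(5)
override = resolve_projection(5, projection_cls=QuatPlusProjection)
```

Note that a 4D action is ambiguous between Euler angles and rotation vectors, but both share the same tanh-based projection, so keying on the dimension alone is sufficient for the defaults.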

Methods:

Name Description
act

Dispatch to the appropriate mixin based on role.

compute

Compute policy mean or state value.

act

act(inputs: Any, role: str) -> Any

Dispatch to the appropriate mixin based on role.

Parameters:

Name Type Description Default

inputs

Any

Model inputs dict (must contain "observations").

required

role

str

"policy" or "value".

required

Returns:

Type Description
Any

Action outputs for the given role, or None if unknown.

compute

compute(inputs: Any, role: str) -> Any

Compute policy mean or state value.

For the "policy" role the mean passes through the rotation-manifold projection (if configured) and the trunk features are cached for reuse by the value head.

Parameters:

Name Type Description Default

inputs

Any

Model inputs dict (must contain "observations").

required

role

str

"policy" or "value".

required

Returns:

Type Description
Any

(output_tensor, extras_dict) for the given role, or None if role is unknown.
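The feature-caching behaviour described for `compute` might be structured as in the sketch below. It is a simplified, hypothetical model (`CachedComputeSketch`) without the skrl mixins or the projection layer, and it assumes the cache is cleared after the value pass so stale features are not reused.

```python
import torch
from torch import nn


class CachedComputeSketch(nn.Module):
    """Hypothetical role-based compute with a one-shot trunk-feature cache."""

    def __init__(self, obs_dim=12, action_dim=4):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(obs_dim, 128), nn.ELU())
        self.mean_head = nn.Linear(128, action_dim)
        self.value_head = nn.Linear(128, 1)
        self._features = None  # filled by the policy pass

    def compute(self, inputs, role):
        if role == "policy":
            self._features = self.trunk(inputs["observations"])
            return self.mean_head(self._features), {}
        if role == "value":
            # Reuse the cached features when available, else recompute them.
            features = self._features
            if features is None:
                features = self.trunk(inputs["observations"])
            self._features = None  # assumed: drop the cache after one use
            return self.value_head(features), {}
        return None  # unknown role


model = CachedComputeSketch()
inputs = {"observations": torch.randn(3, 12)}
mean, _ = model.compute(inputs, "policy")
value, _ = model.compute(inputs, "value")
```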