Skip to content

Commit a3ad40d

Browse files
greeble-devcartalice-i-cecile
authored
Alternative glTF coordinate conversion (#20394)
## Objective Change glTF coordinate conversion to satisfy some common use cases while dodging the more controversial aspects. This fixes #20621, but at the cost of removing one feature. ## Summary The Bevy glTF loader can optionally convert nodes and meshes from glTF's "+Z forward" semantics to Bevy's "-Z forward". But the current implementation [has issues](#20621), particularly with cameras and lights. It might also cause problems for users who want to re-orient the scene as a whole while preserving the original node semantics. This PR replaces node conversion with a simpler correction to the scene root and mesh entities. The new approach satisfies many use cases and fixes the issues with cameras and lights. But it could be a regression for some users. ## Background There's been confusion over how glTF behaves and what users might want from coordinate conversion. This section recaps the basic concepts, glTF's semantics, the current loader behaviour, and some potential user stories. Or you can skip to the next section if you want to get straight to the changes. <details> <summary>Click to expand</summary> ### Coordinate Systems and Semantics 3D coordinate systems can have semantics assigned to their axes. These semantics are often defined as a forward axis, an up axis, and a [handedness](https://en.wikipedia.org/wiki/Right-hand_rule) - the side axis is implicit in the other choices. Bevy's standard semantics are "-Z = forward, +Y = up, right handed". This standard is codified by the `forward` and `up` methods of `Transform` and `GlobalTransform`, and by the renderer's interpretation of camera and light transforms. There are debates about the standard and whether users should be able to choose different semantics. This PR does not account for those debates, and assumes that users want to follow the current standard. Other engines, DCCs, and file formats can have [different semantics](https://mastodon.social/@acegikmo/113313928426095165). Unlike Bevy, some vary their semantics by object type - a camera's forward axis may not be the same as a light's. Some only specify an up axis, leaving the forward and side axes unspecified. Assets might not follow the standard semantics of their file format. Static mesh hierarchies and skeletal animation rigs may even have per-node or per-joint semantics - a character rig could be +Y forward in the scene, while the head joint is +Z forward. One character rig might have both feet +X forward, while another rig might have the left foot +X forward and the right foot -X forward. This creates complexity, but also creates jobs, so no-one can say if it's good or bad. ### Asset Loaders And Coordinate Conversion Bevy currently has a glTF loader, and I'm assuming it will get in-repo FBX and USD loaders at some point. These loaders are likely to follow a common pattern: - The files contain meshes, which correspond to Bevy `Mesh` assets and skinned meshes. - Bevy meshes can only have a single material, so what the file format considers a single mesh might be multiple Bevy meshes. - The files have a node hierarchy, where nodes roughly correspond to Bevy entities with a `Transform`. - Nodes can optionally be mesh instances, cameras, lights or skinned mesh joints. - The loader outputs the assets and a `Scene` with an entity hierarchy that tries to match the file's node hierarchy. - Some aspects of nodes (e.g. pivot transforms) can't be represented in Bevy within a single entity. - So a 1:1 mapping might not be possible - instead nodes become multiple entities, or some data is lost (e.g. baking down pivot transforms). - Users can choose to spawn the scene, or they can ignore it and use the assets directly. Users may want asset loaders that convert assets to Bevy's standard semantics, so `Transform::forward` matches the asset. But the details of conversion can be contentious - users may want some parts of the scene to be converted differently from other parts, and assets may have ambiguities than can only be resolved by the user. There will never be a simple "it just works" option, although there could be a least worst default that satisfies the biggest group of users. Converting in the loader is not the only option. The user could edit the assets themselves or run a conversion script in DCC. But that's a pain - particularly for users who rely on asset packs and don't have DCC experience. Another option is to implement an asset transform that does coordinate conversion. But having the options right there in the loader is convenient. ### User Stories For coordinate conversion in the loader, some user stories might be: - "I want to spawn a scene on an entity with Bevy semantics and have it look right." - This is probably the most common case - the user wants to do `SceneRoot(load("my.gltf"))` and have it visually match the entity's `Transform::forward()`, and cameras and lights should do the right thing. - The user might not care about the semantics of mesh assets and nodes in the scene - they just want the scene as a whole to look right. - "I want to spawn a scene, and convert some or all of the nodes to Bevy semantics." - The user might have nodes in their scene that they want to animate manually or hook up to other systems that assume Bevy semantics. - That becomes easier if the loader can convert the node's forward to match `Transform::forward()`. - Conversely, some users might want nodes to stay as they are (particularly skeletal animation rigs). - "I want a mesh asset that's converted to Bevy semantics. I'm not using a scene." - Maybe the user is doing `Mesh3d(load("mesh.gltf#Mesh0"))` and wants it to match the entity's forward. - Or this is the first stage of an asset pipeline and the remaining stages expect Bevy semantics. - "I don't want the loader to touch anything." - Maybe they've already converted the file, or want to convert it post-load, or don't want to use Bevy semantics at all. - "I want one of the other conversion stories, but the loader should convert to my chosen semantics rather than Bevy's". - Z-up is not a crime. ### glTF Semantics glTF [scene semantics](https://registry.khronos.org/glTF/specs/2.0/glTF-2.0.html#coordinate-system-and-units) are "+Z = forward, +Y = up, right handed". This is almost the same as Bevy, except that scene forward is +Z instead of Bevy's -Z. Some glTF assets do not follow the spec's scene semantics. The Kenney asset packs use a mix of +Z and -Z forward. At least [one of the Khronos sample assets](https://github.com/KhronosGroup/glTF-Sample-Assets/tree/main/Models/Duck) uses +X forward. That said, the majority of Kenney assets and almost all the Khronos sample assets I tested do follow the spec. glTF [camera node](https://registry.khronos.org/glTF/specs/2.0/glTF-2.0.html#view-matrix) and [light node](https://github.com/KhronosGroup/glTF/blob/main/extensions/2.0/Khronos/KHR_lights_punctual/README.md#adding-light-instances-to-nodes) semantics are different from glTF scene semantics - they're -Z forward, same as Bevy. The glTF spec doesn't explicitly say if non-camera/light nodes and mesh buffers have semantics. I'm guessing that some users will have nodes and meshes that follow the spec's scene semantics, and might want them converted to Bevy semantics. But as noted in the user stories, it's likely that other users will have different needs. glTF and Bevy allow a single node/entity to be both a mesh and a camera or a light. This only makes sense if the user intends the mesh to have the same semantics as cameras and lights. I think it's very unlikely that significant numbers of users will want support for this combination - many other DCCs, file formats and engines don't support it at all. ### How The Bevy glTF Loader Works The loader maps glTF nodes to Bevy entities. It also adds entities for two cases: 1. A single "scene root" entity is added as a parent of the glTF root nodes. - Note that this is not the user's entity with the `SceneRoot` component - the scene root entity is a child of that entity. 2. Mesh primitive entities are added as a child of each glTF mesh node. - In glTF, a single mesh node can contain multiple primitives. - But in Bevy a mesh component can only contain a single primitive, so one entity can't contain multiple primitives. - So, for each primitive, Bevy adds a child entity with a mesh component. A single branch of the resulting scene hierarchy might look like this: - User entity with `SceneRoot` component. - Scene root entity. - glTF root node entity. - glTF intermediate node entities. - glTF mesh node entity (does not contain `Mesh3d` component) - Mesh primitive entities (does contain `Mesh3d` component). ### glTF Loader Changes In 0.17 In Bevy 0.16, the only user story supported by the glTF loader was "no conversion". During the 0.17 cycle, #19633 and some follow up PRs implemented an option that converts nodes, meshes and animation tracks. The changes do satisfy some user stories, including the common "convert scene semantics" (mostly) and "convert mesh semantics". But there's some problems (#20621): - The conversion depends on converting both nodes and meshes. - Some users might want to convert the scene without converting nodes and/or meshes. - Light and camera nodes get complicated. - glTF camera/light nodes already match Bevy semantics, so they need a counter-conversion (since their parent might have been converted). - Animation tracks for lights and cameras are not correctly converted. - (Counterpoint: This is fixable at the cost of some complexity) - Child nodes of lights and cameras are not correctly converted. - (Counterpoint: Also fixable, and probably a niche case?) - The conversion can't support a node that's a mesh instance and also a light and/or a camera. - (Counterpoint: As mentioned earlier, this is probably a very niche or non-existent use case.) </details> ## Solution The big change in this PR is the removal of node conversion. Instead, corrective transforms are applied to the scene root entity and mesh primitive entities. Before this PR: - Scene root entity. - glTF root node entity. <-- CONVERTED - glTF intermediate node entities. <-- CONVERTED - glTF mesh node entity. <-- CONVERTED - Mesh primitive entities. After this PR: - Scene root entity. <-- CORRECTIVE (if scene conversion enabled) - glTF root node entity. - glTF intermediate node entities. - glTF mesh node entity. - Mesh primitive entities. <-- CORRECTIVE (if mesh conversion enabled) The result is visually the same even though the scene internals are different. Cameras and lights now work correctly, including when animated. The new conversion is also simpler. There's no need to convert animations, and the scene part of the conversion only changes a single entity: ```diff +let world_root_transform = convert_coordinates.scene_conversion_transform(); let world_root_id = world - .spawn((Transform::default(), Visibility::default())) + .spawn((world_root_transform, Visibility::default())) .with_children(|parent| { for node in scene.nodes() { ``` Removing node conversion might be a regression for some users. My guess is that most users just want to spawn a scene with the correct orientation and don't worry about individual node transforms, so on balance this PR will be win. But I don't have much evidence to back that up. There might also be a path to adding node conversion back in as an option - see the "Future" section below. The previous conversion option - `GltfPlugin::use_model_forward_direction` - has been split into two separate options for scene and mesh conversion. ```diff struct GltfPlugin { ... - use_model_forward_direction: bool, + convert_coordinates: GltfConvertCoordinates, } ``` ```rust struct GltfConvertCoordinates {     scenes: bool,     meshes: bool, } ``` This might be turn out to be unnecessary flexibility, but I think it's the safer option for now in case users have unexpected needs. Both options are disabled by default. ### Testing I've tested various examples and glTFs with each combination of options, including glTFs with animated cameras and lights. ```sh # Visually the same as current Bevy *without* conversion. cargo run --example scene_viewer "assets/models/faces/faces.glb" cargo run --example scene_viewer "assets/models/faces/faces.glb" --convert-mesh-coordinates # Visually the same as current Bevy *with* conversion. cargo run --example scene_viewer "assets/models/faces/faces.glb" --convert-scene-coordinates cargo run --example scene_viewer "assets/models/faces/faces.glb" --convert-scene-coordinates --convert-mesh-coordinates cargo run --example animated_mesh ``` ## Future <details> <summary>Click to expand</summary> This PR removes node conversion, which is a desirable feature for some users. There are a couple of ways it could be added back as an option. The difficult part of node conversion is how to support camera and light nodes. glTF's camera/light semantics already match Bevy's -Z forward, so simply converting every node from +Z to -Z forward will leave camera and light nodes facing the wrong direction. The obvious solution is to special case camera/light node transforms - this is what the 0.17 conversion tries to do. But it's surprisingly complex to get right due to animation, child nodes, and nodes that can be meshes and cameras and lights. E.g. children of cameras and lights need a counter-conversion applied to their transform and animation tracks. For cameras, an alternative would be to split them multiple entities. The existing entity would correspond to the glTF node and be converted like every other node. But the Bevy `Camera` component would be on a new child entity and have a corrective transform. Before: - Parent glTF node entity. - Camera glTF node entity with `Camera` component and animated transform. - glTF node parented to camera node. After: - Parent glTF node entity. - Camera glTF node entity with animated transform. - New child entity with `Camera` component and corrective transform. - glTF node parented to camera node. Lights are already set up this way, so they only need the corrective transform. This approach is simpler since nodes are treated uniformly. And it's arguably a better reflection of the glTF format - glTF cameras are kind of a separate thing from nodes, and can be given a name that's different to their node's name. So it could be better for some users. The downside is that the glTF node entity might have the wrong semantics from the perspective of some users (although not all). And it will be annoying for users who currently assume the `Camera` component is on the node entity. </details> ## Alternatives <details> <summary>Click to expand</summary> ### What About The Forward Flag Proposal? There's a [proposal](#20135) to allow per-transform semantics, aka the "forward flag". This means the axis of `Transform::forward()` and others would depend on a variable in the `Transform`. In theory the forward flag might avoid the need for coordinate conversion in the loader. But whether that works in practice is unclear, and the proposal appears to be stalled. ### What Do Other Engines Do? [Godot's semantics](https://docs.godotengine.org/en/4.4/tutorials/assets_pipeline/importing_3d_scenes/model_export_considerations.html#d-asset-direction-conventions) are the same as the glTF standard. Godot doesn't offer any conversion options. Unreal's default semantics are "+X forward, +Z up, left handed", except meshes are typically "+Y forward, +Z up". Their glTF importer converts nodes and meshes to Unreal's mesh semantics - this is done by swapping the Y and Z axes, which implicitly flips the X for handedness. So Unreal's approach is actually closer to the current main approach of node + mesh conversion, versus this PR's scene + mesh conversion. The Unreal importer also supports a custom scene/mesh rotation and translation that's applied after normal conversion. There's no option to disable conversion. </details> --------- Co-authored-by: Carter Anderson <[email protected]> Co-authored-by: Alice I Cecile <[email protected]>
1 parent 5f567d7 commit a3ad40d

File tree

7 files changed

+244
-153
lines changed

7 files changed

+244
-153
lines changed
Lines changed: 81 additions & 46 deletions
Original file line numberDiff line numberDiff line change
@@ -1,34 +1,16 @@
1-
use core::f32::consts::PI;
1+
//! Utilities for converting from glTF's [standard coordinate system](https://registry.khronos.org/glTF/specs/2.0/glTF-2.0.html#coordinate-system-and-units)
2+
//! to Bevy's.
3+
use serde::{Deserialize, Serialize};
24

35
use bevy_math::{Mat4, Quat, Vec3};
46
use bevy_transform::components::Transform;
57

68
pub(crate) trait ConvertCoordinates {
7-
/// Converts the glTF coordinates to Bevy's coordinate system.
8-
/// - glTF:
9-
/// - forward: Z
10-
/// - up: Y
11-
/// - right: -X
12-
/// - Bevy:
13-
/// - forward: -Z
14-
/// - up: Y
15-
/// - right: X
16-
///
17-
/// See <https://registry.khronos.org/glTF/specs/2.0/glTF-2.0.html#coordinate-system-and-units>
9+
/// Converts from glTF coordinates to Bevy's coordinate system. See
10+
/// [`GltfConvertCoordinates`] for an explanation of the conversion.
1811
fn convert_coordinates(self) -> Self;
1912
}
2013

21-
pub(crate) trait ConvertCameraCoordinates {
22-
/// Like `convert_coordinates`, but uses the following for the lens rotation:
23-
/// - forward: -Z
24-
/// - up: Y
25-
/// - right: X
26-
///
27-
/// The same convention is used for lights.
28-
/// See <https://registry.khronos.org/glTF/specs/2.0/glTF-2.0.html#view-matrix>
29-
fn convert_camera_coordinates(self) -> Self;
30-
}
31-
3214
impl ConvertCoordinates for Vec3 {
3315
fn convert_coordinates(self) -> Self {
3416
Vec3::new(-self.x, self.y, -self.z)
@@ -48,34 +30,87 @@ impl ConvertCoordinates for [f32; 4] {
4830
}
4931
}
5032

51-
impl ConvertCoordinates for Quat {
52-
fn convert_coordinates(self) -> Self {
53-
// Solution of q' = r q r*
54-
Quat::from_array([-self.x, self.y, -self.z, self.w])
55-
}
33+
/// Options for converting scenes and assets from glTF's [standard coordinate system](https://registry.khronos.org/glTF/specs/2.0/glTF-2.0.html#coordinate-system-and-units)
34+
/// (+Z forward) to Bevy's coordinate system (-Z forward).
35+
///
36+
/// The exact coordinate system conversion is as follows:
37+
/// - glTF:
38+
/// - forward: +Z
39+
/// - up: +Y
40+
/// - right: -X
41+
/// - Bevy:
42+
/// - forward: -Z
43+
/// - up: +Y
44+
/// - right: +X
45+
///
46+
/// Note that some glTF files may not follow the glTF standard.
47+
///
48+
/// If your glTF scene is +Z forward and you want it converted to match Bevy's
49+
/// `Transform::forward`, enable the `rotate_scene_entity` option. If you also want `Mesh`
50+
/// assets to be converted, enable the `rotate_meshes` option.
51+
///
52+
/// Cameras and lights in glTF files are an exception - they already use Bevy's
53+
/// coordinate system. This means cameras and lights will match
54+
/// `Transform::forward` even if conversion is disabled.
55+
#[derive(Copy, Clone, Default, Debug, Serialize, Deserialize)]
56+
pub struct GltfConvertCoordinates {
57+
/// If true, convert scenes by rotating the top-level transform of the scene entity.
58+
///
59+
/// This will ensure that [`Transform::forward`] of the "root" entity (the one with [`SceneInstance`](bevy_scene::SceneInstance))
60+
/// aligns with the "forward" of the glTF scene.
61+
///
62+
/// The glTF loader creates an entity for each glTF scene. Entities are
63+
/// then created for each node within the glTF scene. Nodes without a
64+
/// parent in the glTF scene become children of the scene entity.
65+
///
66+
/// This option only changes the transform of the scene entity. It does not
67+
/// directly change the transforms of node entities - it only changes them
68+
/// indirectly through transform inheritance.
69+
pub rotate_scene_entity: bool,
70+
71+
/// If true, convert mesh assets. This includes skinned mesh bind poses.
72+
///
73+
/// This option only changes mesh assets and the transforms of entities that
74+
/// instance meshes. It does not change the transforms of entities that
75+
/// correspond to glTF nodes.
76+
pub rotate_meshes: bool,
5677
}
5778

58-
impl ConvertCoordinates for Mat4 {
59-
fn convert_coordinates(self) -> Self {
60-
let m: Mat4 = Mat4::from_scale(Vec3::new(-1.0, 1.0, -1.0));
61-
// Same as the original matrix
62-
let m_inv = m;
63-
m_inv * self * m
79+
impl GltfConvertCoordinates {
80+
const CONVERSION_TRANSFORM: Transform =
81+
Transform::from_rotation(Quat::from_xyzw(0.0, 1.0, 0.0, 0.0));
82+
83+
fn conversion_mat4() -> Mat4 {
84+
Mat4::from_scale(Vec3::new(-1.0, 1.0, -1.0))
6485
}
65-
}
6686

67-
impl ConvertCoordinates for Transform {
68-
fn convert_coordinates(mut self) -> Self {
69-
self.translation = self.translation.convert_coordinates();
70-
self.rotation = self.rotation.convert_coordinates();
71-
self
87+
pub(crate) fn scene_conversion_transform(&self) -> Transform {
88+
if self.rotate_scene_entity {
89+
Self::CONVERSION_TRANSFORM
90+
} else {
91+
Transform::IDENTITY
92+
}
93+
}
94+
95+
pub(crate) fn mesh_conversion_transform(&self) -> Transform {
96+
if self.rotate_meshes {
97+
Self::CONVERSION_TRANSFORM
98+
} else {
99+
Transform::IDENTITY
100+
}
101+
}
102+
103+
pub(crate) fn mesh_conversion_transform_inverse(&self) -> Transform {
104+
// We magically know that the transform is its own inverse. We still
105+
// make a distinction at the interface level in case that changes.
106+
self.mesh_conversion_transform()
72107
}
73-
}
74108

75-
impl ConvertCameraCoordinates for Transform {
76-
fn convert_camera_coordinates(mut self) -> Self {
77-
self.translation = self.translation.convert_coordinates();
78-
self.rotate_y(PI);
79-
self
109+
pub(crate) fn mesh_conversion_mat4(&self) -> Mat4 {
110+
if self.rotate_meshes {
111+
Self::conversion_mat4()
112+
} else {
113+
Mat4::IDENTITY
114+
}
80115
}
81116
}

crates/bevy_gltf/src/lib.rs

Lines changed: 7 additions & 17 deletions
Original file line numberDiff line numberDiff line change
@@ -128,7 +128,7 @@
128128
//! See the [glTF Extension Registry](https://github.com/KhronosGroup/glTF/blob/main/extensions/README.md) for more information on extensions.
129129
130130
mod assets;
131-
mod convert_coordinates;
131+
pub mod convert_coordinates;
132132
mod label;
133133
mod loader;
134134
mod vertex_attributes;
@@ -155,7 +155,7 @@ pub mod prelude {
155155
pub use crate::{assets::Gltf, assets::GltfExtras, label::GltfAssetLabel};
156156
}
157157

158-
use crate::extensions::GltfExtensionHandlers;
158+
use crate::{convert_coordinates::GltfConvertCoordinates, extensions::GltfExtensionHandlers};
159159

160160
pub use {assets::*, label::GltfAssetLabel, loader::*};
161161

@@ -198,19 +198,9 @@ pub struct GltfPlugin {
198198
/// Can be modified with the [`DefaultGltfImageSampler`] resource.
199199
pub default_sampler: ImageSamplerDescriptor,
200200

201-
/// _CAUTION: This is an experimental feature with [known issues](https://github.com/bevyengine/bevy/issues/20621). Behavior may change in future versions._
202-
///
203-
/// How to convert glTF coordinates on import. Assuming glTF cameras, glTF lights, and glTF meshes had global identity transforms,
204-
/// their Bevy [`Transform::forward`](bevy_transform::components::Transform::forward) will be pointing in the following global directions:
205-
/// - When set to `false`
206-
/// - glTF cameras and glTF lights: global -Z,
207-
/// - glTF models: global +Z.
208-
/// - When set to `true`
209-
/// - glTF cameras and glTF lights: global +Z,
210-
/// - glTF models: global -Z.
211-
///
212-
/// The default is `false`.
213-
pub use_model_forward_direction: bool,
201+
/// The default glTF coordinate conversion setting. This can be overridden
202+
/// per-load by [`GltfLoaderSettings::convert_coordinates`].
203+
pub convert_coordinates: GltfConvertCoordinates,
214204

215205
/// Registry for custom vertex attributes.
216206
///
@@ -223,7 +213,7 @@ impl Default for GltfPlugin {
223213
GltfPlugin {
224214
default_sampler: ImageSamplerDescriptor::linear(),
225215
custom_vertex_attributes: HashMap::default(),
226-
use_model_forward_direction: false,
216+
convert_coordinates: GltfConvertCoordinates::default(),
227217
}
228218
}
229219
}
@@ -276,7 +266,7 @@ impl Plugin for GltfPlugin {
276266
supported_compressed_formats,
277267
custom_vertex_attributes: self.custom_vertex_attributes.clone(),
278268
default_sampler,
279-
default_use_model_forward_direction: self.use_model_forward_direction,
269+
default_convert_coordinates: self.convert_coordinates,
280270
extensions: extensions.0.clone(),
281271
});
282272
}

crates/bevy_gltf/src/loader/gltf_ext/scene.rs

Lines changed: 3 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -10,10 +10,7 @@ use itertools::Itertools;
1010
#[cfg(feature = "bevy_animation")]
1111
use bevy_platform::collections::{HashMap, HashSet};
1212

13-
use crate::{
14-
convert_coordinates::{ConvertCameraCoordinates as _, ConvertCoordinates as _},
15-
GltfError,
16-
};
13+
use crate::GltfError;
1714

1815
pub(crate) fn node_name(node: &Node) -> Name {
1916
let name = node
@@ -29,8 +26,8 @@ pub(crate) fn node_name(node: &Node) -> Name {
2926
/// on [`Node::transform()`](gltf::Node::transform) directly because it uses optimized glam types and
3027
/// if `libm` feature of `bevy_math` crate is enabled also handles cross
3128
/// platform determinism properly.
32-
pub(crate) fn node_transform(node: &Node, convert_coordinates: bool) -> Transform {
33-
let transform = match node.transform() {
29+
pub(crate) fn node_transform(node: &Node) -> Transform {
30+
match node.transform() {
3431
gltf::scene::Transform::Matrix { matrix } => {
3532
Transform::from_matrix(Mat4::from_cols_array_2d(&matrix))
3633
}
@@ -43,15 +40,6 @@ pub(crate) fn node_transform(node: &Node, convert_coordinates: bool) -> Transfor
4340
rotation: bevy_math::Quat::from_array(rotation),
4441
scale: Vec3::from(scale),
4542
},
46-
};
47-
if convert_coordinates {
48-
if node.camera().is_some() || node.light().is_some() {
49-
transform.convert_camera_coordinates()
50-
} else {
51-
transform.convert_coordinates()
52-
}
53-
} else {
54-
transform
5543
}
5644
}
5745

0 commit comments

Comments
 (0)