What Training Actually Optimizes
A 3DGS trainer initializes Gaussian positions, often from sparse SfM points, then jointly optimizes position, scale, rotation, opacity, and color, with color commonly stored as spherical harmonic coefficients so appearance can vary with view direction. Each step renders one or more training views, compares them against the target images, backpropagates the image loss, and updates the Gaussian parameters.
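To make that loop concrete, here is a minimal runnable sketch in PyTorch. The 2D orthographic `render`, the random `target`, the Gaussian count, and the learning rates are illustrative stand-ins, not the real tile-based rasterizer or the reference hyperparameters. It shows the common parameterization trick: scale lives in log space and opacity as a logit, so activations keep both in valid ranges (rotation is omitted because these toy splats are isotropic).

```python
import torch

torch.manual_seed(0)

N, H, W = 256, 32, 32  # illustrative Gaussian count and image size

# Per-Gaussian parameters, stored unconstrained and activated on use:
# exp keeps scales positive, sigmoid keeps opacity in (0, 1).
means      = torch.nn.Parameter(torch.rand(N, 2))        # positions in [0, 1]^2
log_scales = torch.nn.Parameter(torch.full((N,), -3.0))  # isotropic scale (log)
logit_op   = torch.nn.Parameter(torch.zeros(N))          # opacity (logit)
colors     = torch.nn.Parameter(torch.rand(N, 3))        # RGB (degree-0 "SH")

def render(means, log_scales, logit_op, colors):
    """Toy differentiable splat: evaluate isotropic 2D Gaussians on a pixel
    grid and accumulate weighted color. A real renderer projects anisotropic
    3D Gaussians, sorts by depth, and alpha-composites front to back."""
    ys, xs = torch.meshgrid(torch.linspace(0, 1, H),
                            torch.linspace(0, 1, W), indexing="ij")
    grid = torch.stack([xs, ys], dim=-1)                     # (H, W, 2)
    d2 = ((grid[None] - means[:, None, None]) ** 2).sum(-1)  # (N, H, W)
    var = torch.exp(log_scales)[:, None, None] ** 2
    alpha = torch.sigmoid(logit_op)[:, None, None] * torch.exp(-0.5 * d2 / var)
    img = (alpha[..., None] * colors[:, None, None, :]).sum(0)
    return img.clamp(0.0, 1.0)

target = torch.rand(H, W, 3)  # stands in for one training view's photo

# Per-group learning rates (values here are arbitrary).
opt = torch.optim.Adam([
    {"params": [means],      "lr": 2e-3},
    {"params": [log_scales], "lr": 5e-3},
    {"params": [logit_op],   "lr": 5e-2},
    {"params": [colors],     "lr": 2e-2},
])

for step in range(200):
    opt.zero_grad()
    # L1 image loss; the reference implementation adds a D-SSIM term.
    loss = (render(means, log_scales, logit_op, colors) - target).abs().mean()
    loss.backward()
    opt.step()
```

Separate per-group learning rates mirror how real 3DGS trainers treat positions, opacity, scale, rotation, and color features differently.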
The important difference from many NeRF methods is that 3DGS trains on full rendered images and maintains an explicit set of primitives. The trainer can clone, split, and prune Gaussians during optimization. This density control is powerful, but it can also amplify bad poses, bad masks, or regions where the input images disagree; a sketch of the control loop follows the list below.
- Positions explain where scene evidence lives.
- Scale and rotation explain local shape and projected footprint.
- Opacity controls visibility and can create floaters if unconstrained.
- Color and spherical harmonics explain appearance and view dependence.
- Densification adds capacity; pruning removes weak or redundant primitives.
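The clone/split/prune mechanics can be sketched in the same toy parameterization. The function name, thresholds, and in-place cloning below are illustrative assumptions; the reference trainer works on anisotropic 3D Gaussians, accumulates view-space positional gradients between densification rounds, nudges cloned copies along the gradient, and rebuilds optimizer state whenever the primitive count changes.

```python
import torch

def densify_and_prune(means, log_scales, logit_op, colors, grad_norm,
                      grad_thresh=2e-4, scale_thresh=0.01, min_opacity=0.005):
    """Hypothetical sketch of 3DGS-style density control on plain tensors.
    grad_norm holds each Gaussian's accumulated positional-gradient magnitude;
    all thresholds are illustrative."""
    scales = torch.exp(log_scales)
    high_grad = grad_norm > grad_thresh

    clone = high_grad & (scales <= scale_thresh)  # small: duplicate
    split = high_grad & (scales > scale_thresh)   # large: replace by 2 children
    keep  = ~split                                # split originals are replaced

    parts = {
        "means":  [means[keep],      means[clone]],
        "scales": [log_scales[keep], log_scales[clone]],
        "op":     [logit_op[keep],   logit_op[clone]],
        "colors": [colors[keep],     colors[clone]],
    }
    if split.any():
        for _ in range(2):  # two children per split Gaussian
            jitter = torch.randn(int(split.sum()), means.shape[1]) * scales[split, None]
            parts["means"].append(means[split] + jitter)
            parts["scales"].append(log_scales[split] - 0.47)  # ~log(1.6): shrink children
            parts["op"].append(logit_op[split])
            parts["colors"].append(colors[split])

    means, log_scales = torch.cat(parts["means"]), torch.cat(parts["scales"])
    logit_op, colors  = torch.cat(parts["op"]),    torch.cat(parts["colors"])

    # Prune nearly transparent Gaussians so floaters don't accumulate.
    alive = torch.sigmoid(logit_op) > min_opacity
    return means[alive], log_scales[alive], logit_op[alive], colors[alive]
```

The two branches capture the intuition from the list above: cloning adds capacity where small Gaussians under-cover a high-gradient region, splitting breaks up over-stretched ones, and the opacity-based prune is what keeps unconstrained opacity from leaving floaters behind.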