What Alignment Produces
A 3DGS trainer needs two things before optimization: images and a camera model for each image. Alignment estimates each camera's intrinsics (focal length, principal point, and lens distortion) and extrinsics (position and rotation), along with a sparse point cloud. The sparse cloud is not the final scene, but Gaussians are initialized at its points, so it gives optimization a reasonable place to start.
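To make the roles of intrinsics and extrinsics concrete, here is a minimal sketch of projecting one 3D point into an image. It assumes an undistorted pinhole camera and a world-to-camera convention (x_cam = R · x_world + t); the function name `project_point` and all numeric values are illustrative, not taken from any particular trainer.

```python
def project_point(point_world, R, t, fx, fy, cx, cy):
    """Project a world-space point to pixel coordinates (pinhole, no distortion)."""
    # Extrinsics: rotate and translate the point into the camera frame.
    xc = [sum(R[i][j] * point_world[j] for j in range(3)) + t[i] for i in range(3)]
    if xc[2] <= 0:
        raise ValueError("point is behind the camera")
    # Intrinsics: perspective divide, then scale by focal length and
    # shift by the principal point.
    u = fx * xc[0] / xc[2] + cx
    v = fy * xc[1] / xc[2] + cy
    return u, v


# Illustrative camera: at the world origin, looking down +Z.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
pixel = project_point([1.0, 0.5, 2.0], IDENTITY, [0.0, 0.0, 0.0],
                      fx=1000.0, fy=1000.0, cx=960.0, cy=540.0)
print(pixel)  # (1460.0, 790.0)
```

If alignment gets either the pose or the calibration wrong, the same 3D point lands on the wrong pixel, which is why poor alignment shows up as blur or floaters after training.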
The common open-source route is COLMAP, an incremental structure-from-motion (SfM) pipeline. It detects local features, matches them across image pairs, verifies the matches geometrically, incrementally registers cameras, triangulates points, and bundle-adjusts the result. GLOMAP uses a global SfM strategy that can be faster for large datasets, while Metashape and RealityCapture provide polished visual tools and export paths.
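The COLMAP stages described above map onto a handful of command-line calls. The sketch below only builds the argument lists so they can be inspected before anything runs. The helper name `colmap_commands` and the workspace layout (`database.db`, `sparse/`, `dense/`) are assumptions for illustration, and exhaustive matching is just one of several matchers COLMAP offers; it suits small image sets.

```python
from pathlib import Path


def colmap_commands(image_dir, workspace):
    """Return the COLMAP CLI calls for a basic alignment run, in order."""
    ws = Path(workspace)
    db = ws / "database.db"   # features and matches are stored here
    sparse = ws / "sparse"    # the mapper writes models to sparse/0, sparse/1, ...
    return [
        # 1. Detect local features in every image.
        ["colmap", "feature_extractor",
         "--database_path", str(db), "--image_path", str(image_dir)],
        # 2. Match features across image pairs (includes geometric verification).
        ["colmap", "exhaustive_matcher", "--database_path", str(db)],
        # 3. Register cameras, triangulate points, and bundle-adjust.
        ["colmap", "mapper",
         "--database_path", str(db), "--image_path", str(image_dir),
         "--output_path", str(sparse)],
        # 4. Undistort images so the trainer can assume a simple camera model.
        ["colmap", "image_undistorter",
         "--image_path", str(image_dir), "--input_path", str(sparse / "0"),
         "--output_path", str(ws / "dense"), "--output_type", "COLMAP"],
    ]


for cmd in colmap_commands("images", "work"):
    print(" ".join(cmd))
```

To execute the steps, create the `sparse` directory first and pass each list to `subprocess.run(cmd, check=True)`. Using `sparse/0` assumes the mapper produced a single connected model; if it wrote several, pick the one with the most registered images.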
Alignment typically hands the trainer the following:
- Cameras: one pose and calibration per registered image.
- Sparse points: triangulated feature points used for initialization and diagnostics.
- Undistorted images: often required by trainers so rendering uses a simple camera model.
- Transforms file or COLMAP sparse model: the bridge between alignment and training.
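As a quick diagnostic on the camera output, the sketch below reads poses from COLMAP's text export (`images.txt`) and converts each to a camera position in world space. It relies on COLMAP's documented layout (two lines per image, with the pose stored as a world-to-camera quaternion and translation); the helper names and the sample values are made up for illustration.

```python
def quat_to_rot(qw, qx, qy, qz):
    """Rotation matrix from a unit quaternion in (w, x, y, z) order."""
    return [
        [1 - 2 * (qy * qy + qz * qz), 2 * (qx * qy - qw * qz), 2 * (qx * qz + qw * qy)],
        [2 * (qx * qy + qw * qz), 1 - 2 * (qx * qx + qz * qz), 2 * (qy * qz - qw * qx)],
        [2 * (qx * qz - qw * qy), 2 * (qy * qz + qw * qx), 1 - 2 * (qx * qx + qy * qy)],
    ]


def camera_centers(images_txt):
    """Map image name -> camera center in world coordinates."""
    # Drop comment lines only; blank lines are kept because an image with
    # no 2D observations still occupies its second line.
    lines = [ln for ln in images_txt.splitlines() if not ln.startswith("#")]
    centers = {}
    for pose_line in lines[0::2]:  # every other line holds a pose
        fields = pose_line.split(maxsplit=9)
        if len(fields) < 10:
            continue
        qw, qx, qy, qz, tx, ty, tz = map(float, fields[1:8])
        R = quat_to_rot(qw, qx, qy, qz)
        t = (tx, ty, tz)
        # The stored pose is world-to-camera, so the center is C = -R^T t.
        centers[fields[9]] = tuple(
            -sum(R[j][i] * t[j] for j in range(3)) for i in range(3)
        )
    return centers


# Made-up sample: IMAGE_ID QW QX QY QZ TX TY TZ CAMERA_ID NAME, then 2D points.
SAMPLE = "\n".join([
    "# Image list with two lines of data per image:",
    "1 1 0 0 0 1 2 3 1 a.jpg",
    "10.0 20.0 -1",
    "2 0.7071067811865476 0 0 0.7071067811865476 1 0 0 1 b.jpg",
    "15.0 25.0 -1",
])

for name, center in camera_centers(SAMPLE).items():
    print(name, center)
```

Plotting these centers is a cheap sanity check: they should trace the path the photographer actually walked, and images missing from the result were not registered.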