Research Paper

Gaussian Splatting SLAM

A live SLAM system that uses 3D Gaussians as the unified representation for tracking, mapping, and high-quality rendering.

December 2023 | Monocular SLAM | arXiv:2312.06741

Detailed Reading

Gaussian Splatting SLAM asks whether Gaussians can replace the usual SLAM map representation. Instead of building a sparse map for tracking and a separate dense model for viewing, it uses one Gaussian map for both camera localization and photorealistic reconstruction.

The method tracks by rendering the current Gaussian scene and directly optimizing the camera pose against the live image. Mapping then updates the Gaussian representation incrementally, with geometric verification to reduce the ambiguity that appears when a monocular camera has limited depth information.
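The tracking idea (render the map, compare with the live image, adjust the pose) can be illustrated with a deliberately small toy. The sketch below is not the paper's implementation: it assumes a 2D image, isotropic 2D Gaussians, and a "pose" that is only a 2D shift, and it uses a Gauss-Newton update on an L2 photometric residual. The names `render` and `track` are hypothetical.

```python
import numpy as np

def render(means, sigma, shift, size=32):
    """Splat isotropic 2D Gaussians; also return d(image)/d(shift)."""
    ys, xs = np.mgrid[0:size, 0:size].astype(float)
    img = np.zeros((size, size))
    d_dx = np.zeros((size, size))
    d_dy = np.zeros((size, size))
    for mx, my in means:
        # A Gaussian at map position (mx, my) appears at (mx, my) - shift.
        dx = xs - (mx - shift[0])
        dy = ys - (my - shift[1])
        g = np.exp(-(dx**2 + dy**2) / (2.0 * sigma**2))
        img += g
        # Analytic derivative of each splat with respect to the shift.
        d_dx += -g * dx / sigma**2
        d_dy += -g * dy / sigma**2
    return img, d_dx, d_dy

def track(target, means, sigma, iters=20):
    """Estimate the shift by Gauss-Newton on the photometric residual."""
    shift = np.zeros(2)
    for _ in range(iters):
        img, d_dx, d_dy = render(means, sigma, shift, size=target.shape[0])
        residual = (img - target).ravel()
        jacobian = np.stack([d_dx.ravel(), d_dy.ravel()], axis=1)
        # Solve min ||residual + jacobian @ step||^2 for the update.
        step, *_ = np.linalg.lstsq(jacobian, -residual, rcond=None)
        shift = shift + step
    return shift

# Toy usage: the "live image" is the same map seen from an unknown shift.
means = [(10.0, 12.0), (20.0, 8.0), (16.0, 22.0)]
true_shift = np.array([1.5, -1.0])
live_image, _, _ = render(means, 3.0, true_shift)
estimated_shift = track(live_image, means, 3.0)
```

A real system optimizes a full 6-DoF camera pose through a differentiable 3D rasterizer, and the starting guess must be close enough for the photometric loss to be informative. In this toy that means the unknown shift is smaller than the Gaussian width.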

Its significance is the unified representation. If tracking, mapping, and viewing all use Gaussians, AR systems could in principle build visually rich maps while moving through the world. The hard part is stability: monocular scale ambiguity, pose drift, and accumulated incremental errors all leave little margin for error.

Gaussian Splatting SLAM studies whether a live monocular or RGB-D system can use Gaussians as the central map. The attraction is clear: a Gaussian map is both a geometric proxy for tracking and a photorealistic representation for visualization. The difficulty is that SLAM has to remain stable while the map is incomplete and poses are uncertain.

The method combines tracking, keyframe management, mapping, and Gaussian optimization. New frames are aligned against rendered predictions from the current map; keyframes provide supervision for map updates; Gaussians are added and refined as coverage grows. This creates a feedback loop where better poses improve the map and a better map improves tracking.
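Keyframe management can be sketched with a simple covisibility test: a frame becomes a keyframe only when it sees sufficiently different Gaussians from the last keyframe, and only a bounded window of keyframes is kept for map updates. The class name, the overlap threshold, and the window size below are illustrative assumptions, not values taken from the paper.

```python
from collections import deque

def covisibility(visible_a, visible_b):
    """Intersection-over-union of the Gaussian ids seen from two frames."""
    union = visible_a | visible_b
    return len(visible_a & visible_b) / len(union) if union else 0.0

class KeyframeWindow:
    """Bounded window of keyframes; the oldest entry is dropped when full."""

    def __init__(self, max_size=8, min_overlap=0.9):
        self.frames = deque(maxlen=max_size)
        self.min_overlap = min_overlap

    def maybe_add(self, frame_id, visible_ids):
        """Add the frame as a keyframe unless it overlaps the last one too much."""
        visible = set(visible_ids)
        if self.frames:
            _, last_visible = self.frames[-1]
            if covisibility(visible, last_visible) >= self.min_overlap:
                return False  # view is redundant, keep tracking only
        self.frames.append((frame_id, visible))
        return True

# Toy usage with hand-written sets of visible Gaussian ids.
window = KeyframeWindow(max_size=2, min_overlap=0.5)
decisions = [
    window.maybe_add(0, {1, 2, 3, 4}),
    window.maybe_add(1, {1, 2, 3, 4, 5}),
    window.maybe_add(2, {4, 5, 6, 7}),
]
```

Dropping the oldest keyframe is the simplest possible eviction policy. A practical system would more likely evict the keyframe that is least covisible with the current view.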

A key point to watch for when reading is how the system keeps offline-style 3DGS optimization from becoming too expensive online. It cannot densify and optimize without limit, so it needs local updates, pruning, and careful scheduling. The paper is therefore as much a systems contribution as a representation contribution.
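To make the budget constraint concrete, here is a minimal pruning sketch. It assumes each Gaussian carries a scalar opacity, removes nearly transparent ones, and then enforces a hard cap on the map size. The function name, the opacity threshold, and the cap are hypothetical choices for illustration, not the paper's criteria.

```python
import numpy as np

def prune(opacity, max_count, min_opacity=0.05):
    """Return sorted indices of Gaussians to keep under an opacity floor and a size cap."""
    opacity = np.asarray(opacity, dtype=float)
    keep = np.flatnonzero(opacity >= min_opacity)
    if keep.size > max_count:
        # Over budget: keep only the most opaque Gaussians.
        strongest = np.argsort(opacity[keep])[::-1][:max_count]
        keep = np.sort(keep[strongest])
    return keep

# Toy usage: six Gaussians, budget of three.
opacities = [0.9, 0.01, 0.5, 0.3, 0.02, 0.7]
kept = prune(opacities, max_count=3)
```

Opacity is only one possible signal. An online system could also prune by visibility counts or by age, and would normally run such a pass on a schedule rather than every frame.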

Its importance lies in showing that 3DGS can support continuous capture workflows. The weaknesses remain pose drift, scene dynamics, and sensitivity to initialization or exposure changes. For tool builders, the paper is a guide to turning splats from a batch output into a live scene model.

What The Paper Does

Gaussian Splatting SLAM applies 3DGS to incremental live reconstruction. It is especially notable for targeting monocular SLAM, one of the hardest visual SLAM settings.

The method optimizes camera tracking directly against the Gaussian map and adds geometric verification and regularization for incremental dense reconstruction.
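As a rough formalization, the two ingredients can be written as a photometric tracking objective and a shape regularizer. The notation is an assumption chosen for illustration: \mathcal{G} is the set of Gaussians, T the camera pose, I(\mathcal{G}, T) the rendered image, \bar{I} the observed image, \mathbf{s}_i the scale vector of Gaussian i, and \tilde{s}_i its mean scale. The regularizer shown penalizes strongly elongated Gaussians, which is one way to limit artifacts in poorly constrained directions. Consult the paper for its exact terms and weights.

```latex
\begin{align}
  T^{\ast} &= \arg\min_{T} \; E_{\mathrm{pho}}(T),
  &
  E_{\mathrm{pho}}(T) &= \left\lVert I(\mathcal{G}, T) - \bar{I} \right\rVert_{1},
  \\
  & &
  E_{\mathrm{iso}} &= \sum_{i=1}^{|\mathcal{G}|}
      \left\lVert \mathbf{s}_i - \tilde{s}_i \, \mathbf{1} \right\rVert_{1}.
\end{align}
```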

Core Ideas

  • Uses Gaussians as the only 3D map representation.
  • Tracks camera pose by direct optimization against rendered Gaussians.
  • Adds geometric checks to reduce ambiguity during incremental reconstruction.

Why It Matters

  • It is a key early reference for live Gaussian mapping without relying purely on offline SfM.
  • It pushed 3DGS toward AR and robotics use cases where the camera moves through the world live.
  • It complements SplaTAM by emphasizing the monocular visual-SLAM framing.

Read This If

  • You are interested in live camera tracking with splat maps.
  • You want to compare monocular and RGB-D Gaussian SLAM methods.
  • You are building AR capture or live scanning prototypes.

Limitations And Caveats

  • Monocular live reconstruction is sensitive to scale, motion, and initialization.
  • Runtime constraints can force quality and map-size trade-offs.
  • The method still inherits some ambiguities of incremental dense reconstruction.