Research Paper

Speedy-Splat: Fast 3D Gaussian Splatting with Sparse Pixels and Sparse Primitives

A CVPR 2025 acceleration paper that attacks both pixel work and primitive count to make 3DGS faster, smaller, and quicker to train.

June 2025 | Acceleration | arXiv:2412.00578

Detailed Reading

Speedy-Splat is a systems paper about where 3DGS spends unnecessary work. The original renderer is already much faster than NeRF-style ray marching, but every frame still projects, bins, sorts, and blends large numbers of Gaussians. If a Gaussian is conservatively assigned to too many pixels, or if redundant primitives survive training, the renderer wastes time without improving the image.

The first part of the method improves Gaussian localization in the rasterization pipeline. In practical terms, it tries to make the set of pixels touched by each projected Gaussian match its real contribution more tightly. This matters because splatting cost scales with screen coverage as well as primitive count, so better localization can speed up rendering even when visual output stays nearly unchanged.
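To make the localization idea concrete, here is a minimal Python sketch. It is not the paper's rasterizer. It compares a conservative square bound, built from three standard deviations of the largest covariance eigenvalue, with an opacity-aware axis-aligned box that stops where the Gaussian's alpha falls below a visibility cutoff. The 16-pixel tile size, the 1/255 cutoff, the clamp to the 3-sigma level set, and all function names are assumptions chosen for illustration.

```python
import math

TILE = 16                 # assumed tile edge in pixels
ALPHA_MIN = 1.0 / 255.0   # assumed cutoff below which a splat is treated as invisible


def conservative_radius(cov):
    """Square half-width from 3 standard deviations of the largest eigenvalue."""
    sxx, sxy, syy = cov
    mid = 0.5 * (sxx + syy)
    det = sxx * syy - sxy * sxy
    lam_max = mid + math.sqrt(max(mid * mid - det, 0.0))
    return math.ceil(3.0 * math.sqrt(lam_max))


def tight_half_extents(cov, opacity):
    """Axis-aligned half-extents of the region where alpha >= ALPHA_MIN.

    alpha(d) = opacity * exp(-0.5 * d^T Sigma^-1 d) >= ALPHA_MIN is the ellipse
    d^T Sigma^-1 d <= t with t = 2 * ln(opacity / ALPHA_MIN). Its bounding box
    has half-extents sqrt(t * Sxx) and sqrt(t * Syy).
    """
    sxx, _, syy = cov
    if opacity <= ALPHA_MIN:
        return 0.0, 0.0
    # Clamp to the 3-sigma level set (t = 9) so the box never exceeds the
    # conservative bound. This clamp is an assumption of the sketch.
    t = min(2.0 * math.log(opacity / ALPHA_MIN), 9.0)
    return math.sqrt(t * sxx), math.sqrt(t * syy)


def tiles_covered(center, hx, hy):
    """Number of tiles overlapped by the box center +/- (hx, hy)."""
    cx, cy = center
    x0, x1 = math.floor((cx - hx) / TILE), math.floor((cx + hx) / TILE)
    y0, y1 = math.floor((cy - hy) / TILE), math.floor((cy + hy) / TILE)
    return (x1 - x0 + 1) * (y1 - y0 + 1)


# Hypothetical splat: elongated along the diagonal, fairly transparent.
cov = (400.0, 380.0, 400.0)   # (Sxx, Sxy, Syy) in pixels squared
center = (200.0, 200.0)
opacity = 0.1

r = conservative_radius(cov)
loose_tiles = tiles_covered(center, r, r)
hx, hy = tight_half_extents(cov, opacity)
tight_tiles = tiles_covered(center, hx, hy)
print("conservative tiles:", loose_tiles, "tight tiles:", tight_tiles)
```

The gap between the two counts is largest for thin, diagonal, or low-opacity splats, where an eigenvalue-based square overestimates the footprint the most.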

The second part is sparse-primitives training. Rather than pruning only after a scene is finished, Speedy-Splat integrates pruning into the training process so the representation learns under a smaller budget. This reduces the chance that a large model learns fragile dependencies that disappear when compressed later.
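The effect of pruning during training on the training budget can be illustrated with a toy simulation. This is a sketch under stated assumptions, not the paper's schedule: the growth rate, pruning interval, pruning fraction, and the function name simulate_budget are all hypothetical. The sum of primitive counts over iterations stands in for training cost.

```python
def simulate_budget(total_iters, start_count, grow_every, grow_rate,
                    prune_every, prune_fraction, prune_in_training):
    """Return (final primitive count, summed per-iteration primitive count)."""
    count = start_count
    work = 0  # proxy for training cost: primitives processed over all iterations
    for it in range(1, total_iters + 1):
        if it % grow_every == 0:
            # Densification step: the set grows by a fixed rate (toy model).
            count = int(count * (1.0 + grow_rate))
        if prune_in_training and it % prune_every == 0:
            # In-training pruning: drop a fixed fraction (toy model).
            count = int(count * (1.0 - prune_fraction))
        work += count
    return count, work


# Hypothetical settings, chosen only to show the trend.
settings = dict(total_iters=1000, start_count=1000, grow_every=100,
                grow_rate=0.2, prune_every=250, prune_fraction=0.3)

count_post, work_post = simulate_budget(prune_in_training=False, **settings)
count_in, work_in = simulate_budget(prune_in_training=True, **settings)
print("no in-training pruning:", count_post, work_post)
print("in-training pruning:   ", count_in, work_in)
```

In this toy model the in-training variant processes fewer primitives at every iteration after the first pruning step, which is the mechanism behind the shorter training time. It says nothing about image quality, which depends on how the pruned primitives are chosen.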

The algorithmic idea is contribution-aware removal. A primitive should be judged by what it contributes across rendered views, not merely by opacity, size, or a single-frame heuristic. Good pruning has to protect thin structures, silhouettes, and view-dependent details while deleting primitives that duplicate neighboring evidence.
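A small Python sketch can show why a cross-view contribution score behaves differently from an opacity threshold. It uses accumulated alpha-blending weight as a stand-in score, which is an assumption of this example and not necessarily the score the paper uses. The data layout (per-pixel, depth-sorted lists of Gaussian id and alpha) and all function names are hypothetical.

```python
def blend_weights(alphas):
    """Front-to-back compositing weights: w_i = alpha_i * prod_{j<i} (1 - alpha_j)."""
    weights, transmittance = [], 1.0
    for a in alphas:
        weights.append(a * transmittance)
        transmittance *= (1.0 - a)
    return weights


def contribution_scores(num_gaussians, views):
    """Sum each Gaussian's blending weight over every pixel of every view.

    views: list of views; each view is a list of pixels; each pixel is a
    depth-sorted list of (gaussian_id, alpha) pairs.
    """
    scores = [0.0] * num_gaussians
    for pixels in views:
        for hits in pixels:
            weights = blend_weights([alpha for _, alpha in hits])
            for (gid, _), w in zip(hits, weights):
                scores[gid] += w
    return scores


def prune_lowest(scores, fraction):
    """Return the ids that survive after removing the lowest-scoring fraction."""
    k = int(len(scores) * fraction)
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    removed = set(order[:k])
    return [i for i in range(len(scores)) if i not in removed]


# Toy scene: Gaussian 3 is nearly opaque but always hidden behind Gaussian 0.
view_a = [[(0, 0.99), (3, 0.9)], [(1, 0.5), (2, 0.3)]]
view_b = [[(0, 0.99), (3, 0.9)], [(2, 0.3), (1, 0.5)]]

scores = contribution_scores(4, [view_a, view_b])
kept = prune_lowest(scores, 0.25)
print(scores, kept)
```

An opacity-only rule would drop Gaussian 2 (alpha 0.3) and keep Gaussian 3 (alpha 0.9). The accumulated score does the opposite, because Gaussian 3 is occluded in every view and adds almost nothing to any pixel. A score summed over only a few views can still undervalue primitives that matter from rare viewpoints.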

The reported result is important because it improves three practical metrics at once: frame time, model size, and training time. For viewer developers, this is more valuable than a pure benchmark PSNR improvement because it directly affects download size, GPU memory, and responsiveness on constrained devices.
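The link between primitive count and download size is simple arithmetic, sketched below. It assumes an uncompressed float32 layout with degree-3 spherical harmonics (59 values per Gaussian) and ignores file-format overhead. The scene size and keep fraction are hypothetical, not numbers from the paper.

```python
# Per-Gaussian parameters in an uncompressed layout (assumed):
# position (3) + scale (3) + rotation quaternion (4) + opacity (1)
# + spherical harmonics up to degree 3 for RGB (16 * 3 = 48)
FLOATS_PER_GAUSSIAN = 3 + 3 + 4 + 1 + 48
BYTES_PER_FLOAT = 4  # float32


def model_size_mb(num_gaussians):
    """Raw parameter size in megabytes (10^6 bytes), without container overhead."""
    return num_gaussians * FLOATS_PER_GAUSSIAN * BYTES_PER_FLOAT / 1e6


before = 3_000_000               # hypothetical scene size
after = int(before * 0.25)       # hypothetical keep fraction after pruning

print(model_size_mb(before), "MB before,", model_size_mb(after), "MB after")
```

Because size scales linearly with primitive count, any pruning ratio carries over directly to download size and GPU memory for the parameters, before any separate compression step is applied.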

Read the paper as a reminder that 3DGS performance is not solved just because it runs in real time on a desktop GPU. A production pipeline still needs rasterizer-level efficiency, training-aware compression, and careful tests on scenes with fine geometry. The main caveat is that aggressive pruning can still remove primitives that matter only from rare viewpoints, so speedups should be checked by visual inspection as well as by average metrics.

What The Paper Does

Speedy-Splat identifies two inefficiencies in vanilla 3DGS: too much work is spent on imprecisely localized screen-space splats, and too many primitives survive training even when their visual contribution is small.

The paper combines tighter rendering localization with a pruning method integrated into training, reporting major rendering speedups while reducing model size and training time.

Core Ideas

  • Optimizes the rasterization pipeline so each Gaussian touches fewer pixels, cutting unnecessary per-pixel work.
  • Introduces pruning during training to keep the Gaussian set compact.
  • Targets rendering speed, file size, and training time together rather than one metric in isolation.
  • Reports large average speedups across Mip-NeRF 360, Tanks and Temples, and Deep Blending scenes.

Why It Matters

  • It is directly useful for web and application viewers where millions of primitives are expensive.
  • It complements compression papers by improving the rendering workload itself.
  • It gives a practical path for making high-quality scenes usable on resource-constrained hardware.

Read This If

  • You are optimizing a 3DGS viewer or renderer.
  • You care about deployment size and FPS, not only reconstruction quality.
  • You want to understand which parts of vanilla 3DGS still waste computation.

Limitations And Caveats

  • Pruning decisions can still damage details that are important only from rare views.
  • The method is most relevant to static-scene 3DGS; dynamic pipelines need additional temporal logic.
  • Renderer integration details matter, so results may vary across implementations.