Training Tutorial For 3D Gaussian Splatting

How to turn aligned cameras and sparse points into a clean, portable splat model.

Step 3: Training, A Practical Workflow

What This Page Covers

Training is where 3DGS optimizes millions of Gaussian primitives so rendered views match the input images. It is also where capture and alignment mistakes become visible.

This guide covers trainer choice, dataset preparation, GPU and memory decisions, training schedules, quality checks, export formats, and when to stop or restart.

What Training Actually Optimizes

A 3DGS trainer starts with positions, often initialized from sparse SfM points, then learns Gaussian scale, rotation, opacity, color, and sometimes spherical harmonic coefficients for view-dependent appearance. Each step renders one or more training views, compares them with target images, backpropagates image loss, and updates the Gaussian parameters.

The important difference from many NeRF methods is that 3DGS trains on full images and maintains an explicit primitive set. The trainer can clone, split, and prune Gaussians during optimization. This density control is powerful, but it can also amplify bad poses, bad masks, or areas with inconsistent images.

  • Positions explain where scene evidence lives.
  • Scale and rotation explain local shape and projected footprint.
  • Opacity controls visibility and can create floaters if unconstrained.
  • Color and spherical harmonics explain appearance and view dependence.
  • Densification adds capacity; pruning removes weak or redundant primitives.
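As a concrete sketch of the list above, the per-primitive state a 3DGS trainer optimizes can be written as a small record. The field names here are illustrative, not any trainer's actual data structure:

```python
from dataclasses import dataclass, field

@dataclass
class Gaussian:
    """One splat primitive; every field below is updated during optimization."""
    position: tuple[float, float, float]         # where scene evidence lives
    scale: tuple[float, float, float]            # local footprint along each axis
    rotation: tuple[float, float, float, float]  # orientation as a quaternion
    opacity: float                               # visibility; unconstrained values create floaters
    # Degree-3 spherical harmonics: 16 coefficients per color channel.
    sh_coeffs: list[float] = field(default_factory=lambda: [0.0] * 48)

# Densification grows this set by cloning and splitting; pruning shrinks it.
model: list[Gaussian] = [Gaussian((0, 0, 0), (1, 1, 1), (1, 0, 0, 0), 0.5)]
```

Each training step renders views from primitives like these, compares against target images, and backpropagates into every field.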

Choosing A Trainer

Use the original Inria implementation when you want a reference baseline and maximum comparability with the original paper. Use gsplat when you want a modern Python training stack with efficient rasterization and examples that fit COLMAP captures. Use Nerfstudio Splatfacto when you want an integrated viewer, data processing commands, and a broader NeRF/3DGS ecosystem.

Use OpenSplat when you need a portable native implementation or mixed hardware support. Use Brush or Postshot when you want an interactive desktop workflow and do not want to operate mostly through Python scripts. The best trainer is often the one whose output, hardware assumptions, and debugging tools fit your project rather than the one with the highest benchmark number.

  • Research baseline: Inria 3D Gaussian Splatting.
  • Fast Python and experiments: gsplat.
  • Integrated data and viewer workflow: Nerfstudio Splatfacto.
  • Cross-platform native option: OpenSplat.
  • Creator-friendly local workflow: Brush or Postshot.

Prepare The Dataset

Good dataset preparation is boring and decisive. Keep images, camera files, sparse points, and output directories separate. If your trainer expects COLMAP, preserve the sparse model layout. If your trainer expects nerfstudio data, use ns-process-data and confirm that transforms, image paths, and point initialization are present.

Image resolution is a tradeoff. Higher resolution can improve details but increases GPU memory and training time. A practical workflow is to train a fast preview at lower resolution, inspect geometry and artifacts, then train a higher-quality version only after the input is proven. For large scenes, image downsampling and tile or scene splitting may matter more than trainer choice.

  • Keep raw images untouched and create a processed copy for training.
  • Use consistent image orientation and color space.
  • Confirm the trainer can find the sparse points for initialization.
  • Start with a preview run before committing to a long high-resolution train.
  • Do not delete alignment outputs until training and export are verified.
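A minimal preflight check for a COLMAP-style dataset layout might look like this. The directory names follow the common `images/` plus `sparse/0/` convention, but verify them against your trainer's documentation:

```python
from pathlib import Path

def check_colmap_layout(root: str) -> list[str]:
    """Return a list of problems; an empty list means the layout looks trainable."""
    base = Path(root)
    problems = []
    if not any((base / "images").glob("*")):
        problems.append("no images found in images/")
    sparse = base / "sparse" / "0"
    # COLMAP writes cameras/images/points3D in either binary or text form.
    for name in ("cameras", "images", "points3D"):
        if not (sparse / f"{name}.bin").exists() and not (sparse / f"{name}.txt").exists():
            problems.append(f"missing {name} model file in sparse/0/")
    return problems
```

Running this before a long train catches the most common "trainer silently ignored my sparse points" failures.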

Training Settings That Matter

Most users should begin with the default schedule of the chosen trainer, then tune only after seeing a specific failure. Key settings include image scale, total iterations, densification start and stop, opacity reset, pruning thresholds, spherical harmonic degree, learning rates, and whether camera pose refinement is allowed.

If training produces many floaters, the cause may be bad alignment, transient objects, overly aggressive densification, or opacity that never gets pruned. If the scene is blurry, the cause may be low image resolution, poor alignment, under-training, or overly strong pruning. If the file is huge, reduce unnecessary density after getting a clean source model rather than prematurely lowering quality.

  • Image scale controls detail, memory, and speed.
  • Iterations control convergence but cannot fix missing coverage.
  • Densification controls how the primitive count grows.
  • Pruning controls file size and floaters.
  • Pose optimization can help mild pose noise but can also hide deeper alignment failures.
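The knobs above can be gathered into one config for a preview-then-quality workflow. The key names are illustrative rather than any trainer's actual flags, and the placeholder values are loosely based on commonly published defaults; start from your trainer's own defaults in practice:

```python
# Hypothetical config keys; map them to your trainer's real flags.
preview = {
    "image_scale": 0.5,                 # train at half resolution first
    "iterations": 7_000,                # short preview run
    "densify_from_iter": 500,           # when cloning/splitting starts
    "densify_until_iter": 5_000,        # when the primitive count stops growing
    "opacity_reset_interval": 3_000,
    "prune_opacity_threshold": 0.005,   # drop near-invisible Gaussians
    "sh_degree": 3,
    "optimize_poses": False,            # keep off until alignment is trusted
}

# A quality run reuses the proven preview config and scales up only what matters.
quality = {**preview, "image_scale": 1.0, "iterations": 30_000,
           "densify_until_iter": 15_000}
```

Changing one key at a time between runs is what makes the failure-diagnosis advice in this section workable.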

Monitor The Run

Do not wait until the final PLY to inspect training. Early previews reveal whether the scene is coherent. Within the first part of training, major shapes should appear in the right place. If the scene is duplicated, folded, or exploding, stop and debug alignment or dataset structure instead of burning more GPU time.

A good monitoring routine checks the live viewer, random held-out views, file size, Gaussian count, and GPU memory. If the trainer supports evaluation, keep a small set of images for validation. Metrics are useful, but visual inspection from novel viewpoints catches failure modes that PSNR may hide.

  • Check the scene from viewpoints that were not used for training.
  • Orbit around silhouettes and thin structures.
  • Look for blobs in empty space, broken floors, and mirrored duplicates.
  • Track Gaussian count growth and final file size.
  • Save intermediate checkpoints for comparison if experimenting.
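One piece of that routine can be automated: flag checkpoints where the Gaussian count grew suspiciously fast, a common sign of runaway densification. The doubling threshold here is an arbitrary illustration, not a published heuristic:

```python
def growth_alarms(counts: list[int], max_ratio: float = 2.0) -> list[int]:
    """Return checkpoint indices where the Gaussian count more than
    doubled since the previous checkpoint."""
    return [i for i in range(1, len(counts))
            if counts[i - 1] > 0 and counts[i] / counts[i - 1] > max_ratio]

# Counts logged at successive checkpoints; the jump to 400k trips the alarm.
print(growth_alarms([100_000, 150_000, 400_000, 420_000]))  # -> [2]
```

Pair an alarm like this with a visual check before deciding whether to tighten densification or stop the run.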

Export And Preserve The Source Model

Export a full-quality PLY or trainer-native checkpoint as the source artifact before conversion. PLY is large, but it is the common interchange format for editing, compression, and many viewers. Runtime formats such as SPZ, SOG, SPLAT, and KSPLAT are useful for delivery, but they may be lossy or viewer-specific.

Keep a small manifest with trainer name, commit or version, dataset path, image scale, iterations, output model, and any non-default parameters. This turns a one-off capture into a reproducible asset. It also makes it easier to compare a retrain against the original output later.

  • Archive source PLY or checkpoint before cleanup and compression.
  • Export runtime formats only after the source model is visually approved.
  • Record trainer version, settings, hardware, and training time.
  • Validate exported model in at least one independent viewer.
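The manifest described above can be a single JSON file written next to the exported model. The field names and the example values below are suggestions, not a standard:

```python
import json
from pathlib import Path

def write_manifest(out_dir: str, **fields) -> Path:
    """Write training metadata next to the exported model for reproducibility."""
    path = Path(out_dir) / "training_manifest.json"
    path.write_text(json.dumps(fields, indent=2, sort_keys=True))
    return path

# Hypothetical values for illustration only.
write_manifest(
    ".",
    trainer="gsplat",
    trainer_version="1.0.0",        # pin the exact release or commit
    dataset="captures/garden_v2",   # hypothetical dataset path
    image_scale=1.0,
    iterations=30_000,
    output_model="garden_v2.ply",
    non_default={"prune_opacity_threshold": 0.01},
)
```

A retrain months later can then be compared against the original on equal terms.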

Common Failure Modes

  • Training a bad alignment longer usually creates sharper artifacts, not a better model.
  • Randomly changing many parameters at once makes failures impossible to diagnose.
  • A beautiful training-camera render can still fail from nearby novel views.
  • Over-pruning can destroy thin objects and rare-view details.
  • Keeping only compressed runtime output makes future editing and retraining harder.

Handoff To The Next Step

  • Export a source PLY or checkpoint and record trainer settings.
  • Open the model in a viewer and inspect full orbit, close range, and problem areas.
  • Clean obvious floaters before delivery if the viewer or editor supports safe selection.
  • Choose a runtime format based on target platform: PLY for editing, SPZ or SOG for web/mobile, engine-specific imports when needed.
  • Proceed to rendering only after source quality and file size are understood.
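The format decision in the handoff list can be captured as a small lookup. The mapping reflects this guide's recommendations rather than a universal rule:

```python
# Delivery-format recommendations from this guide; adjust per project.
RUNTIME_FORMAT = {
    "editing": "ply",      # lossless interchange, large files
    "web": "spz",          # compact web/mobile delivery
    "mobile": "sog",
    "engine": "engine-specific import",
}

def pick_format(target: str) -> str:
    """Fall back to PLY for unknown targets, since every other
    format can still be derived from the archived source PLY."""
    return RUNTIME_FORMAT.get(target, "ply")
```

Whatever the target, the archived source PLY or checkpoint remains the asset of record.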

Reference Tutorials And Docs

These sources were used as research input. The guide above is written as a consolidated 3DGS workflow rather than copied from any single tutorial.