Baking Neural Radiance Fields for Real-Time View Synthesis
ICCV 2021 (Oral)
Neural volumetric representations such as Neural Radiance Fields (NeRF) have emerged as a compelling technique for learning to represent 3D scenes from images with the goal of rendering photorealistic images of the scene from unobserved viewpoints. However, NeRF's computational requirements are prohibitive for real-time applications: rendering views from a trained NeRF requires querying a multilayer perceptron (MLP) hundreds of times per ray. We present a method to train a NeRF, then precompute and store (i.e., "bake") it as a novel representation called a Sparse Neural Radiance Grid (SNeRG) that enables real-time rendering on commodity hardware. To achieve this, we introduce 1) a reformulation of NeRF's architecture, and 2) a sparse voxel grid representation with learned feature vectors. The resulting scene representation retains NeRF's ability to render fine geometric details and view-dependent appearance, is compact (averaging less than 90 MB per scene), and can be rendered in real time (at more than 30 frames per second on a laptop GPU). Actual screen captures are shown in our video.
Real-Time Interactive Viewer Demos
Synthetic Rendered Scenes
Real Captured Scenes
Sparse Neural Radiance Grids (SNeRG)
Our method precomputes and stores ("bakes") a NeRF into a Sparse Neural Radiance Grid (SNeRG) data structure. To render a SNeRG in real time, we:
- Use a sparse voxel grid to skip empty space along rays
- Look up a diffuse color for each point sampled along a ray in occupied space, and composite these along the ray
- Look up a 4-dimensional feature vector for each point, and composite these along the ray
- Decode the composited features into a single specular color per pixel using a tiny (2 layers, 16 channels) MLP
- Add the diffuse and specular color components to compute the final RGB color
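The rendering steps above can be sketched for a single ray in NumPy. This is a minimal illustration under several assumptions, not the actual viewer code: the grid is stored as dense arrays with a boolean occupancy mask standing in for a real sparse structure, lookups use the nearest voxel, the MLP is read as two 16-channel ReLU layers followed by a sigmoid RGB output, it receives the composited features concatenated with the raw view direction, its weights are random placeholders, and background handling is omitted. All function and field names (`composite_weights`, `tiny_mlp`, `render_ray`, `grid["occupied"]`, and so on) are hypothetical.

```python
import numpy as np

def composite_weights(sigma, delta):
    """Alpha-compositing weights for samples ordered front to back along a ray."""
    alpha = 1.0 - np.exp(-sigma * delta)
    # Transmittance: fraction of light that survives all earlier samples.
    trans = np.concatenate([[1.0], np.cumprod(1.0 - alpha)[:-1]])
    return trans * alpha

def tiny_mlp(x, params):
    """Assumed layout: two 16-channel ReLU layers, then a sigmoid RGB output."""
    (w0, b0), (w1, b1), (w2, b2) = params
    h = np.maximum(x @ w0 + b0, 0.0)
    h = np.maximum(h @ w1 + b1, 0.0)
    return 1.0 / (1.0 + np.exp(-(h @ w2 + b2)))

def render_ray(origin, direction, grid, params, near=0.0, far=1.0, n_samples=64):
    """Render one ray through a grid covering the unit cube [0, 1)^3."""
    t = np.linspace(near, far, n_samples)
    delta = (far - near) / (n_samples - 1)
    points = origin + t[:, None] * direction

    # Nearest-voxel indices (a simplification); drop samples outside the grid.
    res = grid["occupied"].shape[0]
    idx = np.floor(points * res).astype(int)
    idx = idx[np.all((idx >= 0) & (idx < res), axis=1)]

    # Skip empty space: keep only samples that fall in occupied voxels.
    idx = idx[grid["occupied"][idx[:, 0], idx[:, 1], idx[:, 2]]]
    i, j, k = idx.T

    # Composite diffuse colors and feature vectors with the same weights.
    w = composite_weights(grid["sigma"][i, j, k], delta)
    diffuse = w @ grid["diffuse"][i, j, k]    # shape (3,)
    feature = w @ grid["features"][i, j, k]   # shape (4,)

    # Decode the composited features once per ray (that is, once per pixel).
    specular = tiny_mlp(np.concatenate([feature, direction]), params)
    return diffuse + specular

# Toy scene: random 8^3 grid and random (untrained) MLP weights.
rng = np.random.default_rng(0)
res = 8
grid = {
    "occupied": rng.random((res, res, res)) > 0.5,
    "sigma": rng.random((res, res, res)) * 20.0,
    "diffuse": rng.random((res, res, res, 3)),
    "features": rng.random((res, res, res, 4)),
}
params = [
    (rng.normal(size=(7, 16)), np.zeros(16)),   # 4 features + 3 direction
    (rng.normal(size=(16, 16)), np.zeros(16)),
    (rng.normal(size=(16, 3)), np.zeros(3)),
]
origin = np.array([0.1, 0.1, 0.1])
direction = np.ones(3) / np.sqrt(3.0)
rgb = render_ray(origin, direction, grid, params)
```

The point of the sketch is the cost structure: the per-sample work is reduced to table lookups and a weighted sum, while the MLP runs only once per ray rather than once per sample.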
The website template was borrowed from Michaël Gharbi.