Bounding volumes are an established concept in computer graphics and vision tasks but have seen little
change since their early inception. In this work, we study the use of neural networks as bounding volumes.
Our key observation is that bounding, which so far has primarily been considered a problem of computational
geometry, can be redefined as a problem of learning to classify space into free or occupied. This
learning-based approach is particularly advantageous in high-dimensional spaces, such as animated scenes
with complex queries, where neural networks are known to excel. However, unlocking neural bounding requires
a twist: allowing -- but also limiting -- false positives, while ensuring that the number of false negatives
is strictly zero. We enable such tight and conservative results using a dynamically-weighted asymmetric loss
function. Our results show that our neural bounding produces up to an order of magnitude fewer false
positives than traditional methods. In addition, we propose an extension of our bounding method using early
exits that accelerates query speeds by 25%. We also demonstrate that our approach is applicable to non-deep
learning models that train within seconds. Our project page is at: https://wenxin-liu.github.io/neural_bounding/.
Summary
We learn neural representations of bounding volumes. The key is a dynamically-weighted asymmetric loss
function, which ensures zero false negatives, in other words, no truncated geometry, while keeping false
positives minimal for precise bounding. Our approach is adaptable to any dimension and suitable for both
point and range queries. It can also be applied to optimise non-neural representations, such as k-DOPs (k
discrete oriented polytopes). In results across diverse scenes using 2D, 3D, and 4D queries, our method
outperforms traditional bounding methods, achieving up to 10x tighter bounding.
Comparison of Our Approach vs. Baselines
Left to right: AABox, Sphere and kDOP bounding are not tight, while Neural (common neural networks) cuts
off parts of the geometry (see the edges of the starfish). Ours Neural (our neural networks) is both
tight and conservative (no truncated geometry).
In addition to Ours Neural, our approach can also optimise non-neural representations such as k-DOP, which
we call Ours kDOP. The interactive demo below uses 2D point queries to show a comparison of our methods:
Ours Neural and Ours kDOP, shown in bold, vs. six baseline methods, on
four different objects:
How Does Our Approach Work?
Input is an $n$-dimensional indicator function $\mathcal{I}(\mathbf{x})$ that returns 1 inside and on
the surface of the object, and 0 everywhere else.
To support both point and range queries, on top of the indicator function $\mathcal{I}$, we define a query
function $\mathcal{Q}(\mathsf{R})$ that is 1 if the indicator function returns 1 for at least one point
$\mathbf{x}$ in the region $\mathsf{R}$.
We then approximate $\mathcal{Q}$ using another function $\mathcal{Q}_\theta$ with learnable parameters
$\theta$. The training objective incurs no cost if $\mathcal{Q}_\theta(\mathsf{R}) = \mathcal{Q}(\mathsf{R})$;
otherwise, asymmetric costs are applied: a high cost for a false negative ($\mathcal{Q}(\mathsf{R}) = 1$ and
$\mathcal{Q}_\theta(\mathsf{R}) = 0$), and a lower cost for a false positive ($\mathcal{Q}(\mathsf{R}) = 0$ and
$\mathcal{Q}_\theta(\mathsf{R}) = 1$).
We approximate this objective via a variant of a weighted binary cross-entropy loss.
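To make this concrete, below is a minimal sketch of such a dynamically-weighted asymmetric loss in
PyTorch. The function name asymmetric_bce and the concrete weight values are illustrative assumptions,
not the paper's exact implementation; in particular, the paper adapts the weighting dynamically during
training rather than fixing it.

import torch

def asymmetric_bce(pred, target, w_fn=100.0, w_fp=1.0):
    """Weighted binary cross-entropy with asymmetric costs.

    pred:   predicted hit probabilities in (0, 1)
    target: ground-truth query results (0 or 1)
    w_fn:   weight on false negatives (target = 1, pred -> 0); kept high
            so that no geometry is truncated
    w_fp:   weight on false positives (target = 0, pred -> 1); kept low
            so that some looseness is tolerated

    NOTE: fixed weights are shown for illustration only; the dynamic
    weighting raises the false-negative cost until the number of false
    negatives is strictly zero.
    """
    eps = 1e-7
    pred = pred.clamp(eps, 1.0 - eps)
    fn_cost = -w_fn * target * torch.log(pred)                # cost of misses
    fp_cost = -w_fp * (1.0 - target) * torch.log(1.0 - pred)  # cost of slack
    return (fn_cost + fp_cost).mean()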
In our implementation, we use small MLPs (Multi-Layer Perceptrons) to achieve faster training and
inference. For 2D data, we sample images; for 3D, voxel grids; and for 4D, animated objects represented
as 3D voxel grids stacked in 4D tensors.
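For illustration, such a small MLP for $n$-dimensional point queries might look as follows in PyTorch;
the layer widths and depth here are our assumptions rather than the exact architecture used in the paper.

import torch.nn as nn

class BoundingMLP(nn.Module):
    """Small MLP mapping an n-dimensional query point to the probability
    that it lies inside the bounded object. Sizes are illustrative."""

    def __init__(self, n_dims: int, hidden: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_dims, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),  # hit probability
        )

    def forward(self, x):
        return self.net(x).squeeze(-1)

For 3D point queries one would instantiate BoundingMLP(n_dims=3); a range query could, for example,
concatenate a region's centre and extents into the input vector.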
Results
Our methods generalise across dimensions. Here, we show results for 3D point queries compared to
baseline methods.
Two Types of Conservativeness
We introduce a variant of our approach that inverts the asymmetric costs associated with false negatives
and false positives. Contrasting with our original approach that includes both definite and potential
areas, this variant includes only areas that are definitively inside the object, excluding ambiguous
regions. This is useful for tasks like sphere tracing where avoiding boundary overestimation is
important.
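Under the assumptions of the loss sketch above, this inversion amounts to swapping the asymmetric
weights; the snippet below is a usage illustration, not the paper's code.

import torch

# Reuses the illustrative asymmetric_bce sketch from above.
pred = torch.rand(1024)                    # predicted hit probabilities
target = (torch.rand(1024) > 0.5).float()  # ground-truth query results

# Original approach: forbid false negatives, so occupied space is never
# classified as free (no truncated geometry).
loss_hit = asymmetric_bce(pred, target, w_fn=100.0, w_fp=1.0)

# Variant: invert the costs to forbid false positives, so free space is
# never classified as occupied (no boundary overestimation).
loss_nohit = asymmetric_bce(pred, target, w_fn=1.0, w_fp=100.0)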
In the following interactive demo, we explore these two types of conservativeness: hit
conservativeness (the original approach) and no-hit conservativeness (the variant approach). We show
that both types can be applied in the optimisation of both neural (Ours Neural) and non-neural
(Ours kDOP) representations:
Bounding Volume Hierarchy
We show that we can also stack our neural boundings into hierarchies, similar to classic
bounding boxes in bounding volume hierarchies (BVH).
The results show that 1) our neural BVH does not cut off any geometry features and is fully
conservative, 2) each level in our neural BVH bounds its object group tightly, and 3) each level is
tighter-fitting than classic bounding boxes, achieving improved accuracy over a traditional BVH.
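As a rough sketch of how such a hierarchy could be traversed, assuming a hypothetical node structure
(these names and the traversal code are our assumptions, not the paper's implementation):

from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class NeuralBVHNode:
    # bound(region) evaluates this node's conservative neural bounding
    # volume and returns True if the query region may hit the contents.
    bound: Callable[[object], bool]
    children: List["NeuralBVHNode"] = field(default_factory=list)

def query(node: NeuralBVHNode, region) -> bool:
    """Depth-first traversal with a neural bound at every level. Because
    each bound is conservative (zero false negatives), pruning a subtree
    can never discard a true hit."""
    if not node.bound(region):
        return False  # conservative "no hit": prune the whole subtree
    if not node.children:
        return True   # leaf: possible hit, hand off to the exact test
    return any(query(child, region) for child in node.children)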
Citation
@inproceedings{liu2024neural,
author = {Stephanie Wenxin Liu and Michael Fischer and Paul D. Yoo and Tobias Ritschel},
title = {Neural Bounding},
booktitle = {Proceedings of the Special Interest Group on Computer Graphics and Interactive Techniques Conference Papers '24 (SIGGRAPH Conference Papers '24)},
year = {2024},
location = {Denver, CO, USA},
publisher = {ACM},
address = {New York, NY, USA},
numpages = {10},
doi = {10.1145/3641519.3657442}
}