SATR: Zero-Shot Semantic Segmentation of 3D Shapes

¹KAUST, ²LIX, École Polytechnique
ICCV 2023 (Poster)

SATR performs zero-shot 3D shape segmentation via text descriptions by using a zero-shot 2D object detector. It infers 3D segmentation from multi-view 2D bounding box predictions by exploiting the topological properties of the underlying surface. SATR is able to segment the mesh from both single and multiple queries and provides accurate predictions even for fine-grained categories.


We explore the task of zero-shot semantic segmentation of 3D shapes by using large-scale off-the-shelf 2D image recognition models. Surprisingly, we find that modern zero-shot 2D object detectors are better suited for this task than contemporary text/image similarity predictors or even zero-shot 2D segmentation networks. Our key finding is that accurate 3D segmentation maps can be extracted from multi-view bounding box predictions by using the topological properties of the underlying surface. To this end, we develop the Segmentation Assignment with Topological Reweighting (SATR) algorithm and evaluate it on ShapeNetPart and our proposed FAUST benchmarks. SATR achieves state-of-the-art performance, outperforming a baseline algorithm by 1.3% and 4% average mIoU on the FAUST coarse and fine-grained benchmarks, respectively, and by 5.2% average mIoU on the ShapeNetPart benchmark.
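The results above are reported as average mIoU. As a rough illustration of that metric, here is a minimal sketch of per-shape mean intersection-over-union computed on per-face labels. The function name `shape_miou` and the choice to skip labels that are absent from both prediction and ground truth are assumptions made for this example; the exact evaluation protocol used for the benchmarks may differ.

```python
import numpy as np

def shape_miou(pred, gt, n_labels):
    """Mean IoU over part labels for one shape (hypothetical helper).

    pred, gt: 1D integer arrays with one label per face.
    n_labels: number of part labels, indexed 0..n_labels-1.
    """
    pred = np.asarray(pred)
    gt = np.asarray(gt)
    ious = []
    for k in range(n_labels):
        p, g = pred == k, gt == k
        union = np.logical_or(p, g).sum()
        if union == 0:
            # Assumption: a label absent from both prediction and
            # ground truth is skipped rather than counted as IoU 1.
            continue
        ious.append(np.logical_and(p, g).sum() / union)
    return float(np.mean(ious)) if ious else float("nan")
```

Averaging this value over all shapes in a benchmark would give a dataset-level score comparable in spirit to the numbers quoted above.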


An overview of our method. Meshes are rendered from random viewpoints. The resulting images are processed by GLIP, which detects bounding boxes in them. Each bounding box corresponds to a prompt (segment). For each bounding box, we compute scores for the triangles inside it using Gaussian Geodesic Reweighting and Visibility Smoothing. Aggregating the scores across views yields a segmented mesh.
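The aggregation step in the overview above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: rendering and GLIP detection are assumed to have already produced, for each box, the visible faces inside it and a central face; `face_dist` is a precomputed face-to-face surface distance matrix standing in for geodesic distance; the Gaussian scale is taken from the spread of distances inside the box; and Visibility Smoothing is omitted. All function and key names are hypothetical.

```python
import numpy as np

def gaussian_weights(dist_to_center, sigma):
    """Gaussian falloff in surface distance from the box's central face."""
    return np.exp(-0.5 * (dist_to_center / sigma) ** 2)

def aggregate_scores(n_faces, n_labels, detections, face_dist):
    """Accumulate per-face, per-label scores over all detected boxes.

    detections: list of dicts (hypothetical format) with keys
        'label'  : int, index of the text prompt the box belongs to
        'faces'  : 1D int array, visible faces projecting inside the box
        'center' : int, visible face closest to the box center
    face_dist: (n_faces, n_faces) array of surface distances
               (assumed precomputed; a stand-in for geodesic distance).
    """
    scores = np.zeros((n_faces, n_labels))
    for det in detections:
        faces = np.asarray(det["faces"], dtype=int)
        if faces.size == 0:
            continue
        d = face_dist[det["center"], faces]
        # Assumption: the Gaussian scale follows the spread of
        # distances inside this box.
        sigma = max(float(d.std()), 1e-8)
        # scores[:, label] is a view, so add.at updates scores in place.
        np.add.at(scores[:, det["label"]], faces, gaussian_weights(d, sigma))
    labels = scores.argmax(axis=1)
    labels[scores.sum(axis=1) == 0] = -1  # faces never covered by any box
    return labels, scores
```

Faces far from the center of a box along the surface receive small weights, so a loose bounding box contributes little to triangles that merely happen to project inside it.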

Qualitative Results

Coarse-Grained Zero-Shot Semantic Segmentation on FAUST

Fine-Grained Zero-Shot Semantic Segmentation on FAUST

Zero-Shot Semantic Segmentation on ShapeNetPart

More Results!


        author = {Abdelreheem, Ahmed and Skorokhodov, Ivan and Ovsjanikov, Maks and Wonka, Peter},
        title = {SATR: Zero-Shot Semantic Segmentation of 3D Shapes},
        booktitle = {Proceedings of the International Conference on Computer Vision ({ICCV})},