KeyPointDiffuser

Unsupervised 3D Keypoint Learning via Latent Diffusion Models

Authors: Rhys Newbury, Juyan Zhang, Tin Tran, Hanna Kurniawati, Dana Kulić

Monash University & Australian National University


Pipeline Overview

Overview

KeyPointDiffuser is a generative framework for unsupervised learning of 3D keypoints directly from point cloud data. The model discovers semantically meaningful geometric structures and uses them to condition a diffusion model for high-quality 3D reconstruction and generation.

Unlike prior approaches, the method enables:


Learned Keypoint Attention

Keypoint Attention

Each learned keypoint attends to meaningful geometric regions of the object, producing consistent semantic structure across different instances.
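A common way to realize this kind of attention in unsupervised keypoint models is to define each keypoint as a softmax-weighted average of the input points. This README does not say that KeyPointDiffuser computes keypoints this way, so the following is only a minimal sketch under that assumption. The function name `keypoints_from_attention` and the `logits` input (per-keypoint scores that a network would normally predict) are hypothetical, and random values stand in for learned scores.

```python
import numpy as np

def keypoints_from_attention(points, logits):
    """Assumed formulation, not confirmed by the source.

    points: (N, 3) point cloud.
    logits: (K, N) hypothetical per-keypoint scores over the N points.
    Returns keypoints of shape (K, 3) and attention weights of shape (K, N).
    """
    points = np.asarray(points, dtype=np.float64)
    logits = np.asarray(logits, dtype=np.float64)
    # Numerically stable softmax over the point axis.
    shifted = logits - logits.max(axis=1, keepdims=True)
    weights = np.exp(shifted)
    weights /= weights.sum(axis=1, keepdims=True)
    # Each keypoint is a convex combination of the input points.
    keypoints = weights @ points
    return keypoints, weights

rng = np.random.default_rng(0)
cloud = rng.normal(size=(1024, 3))   # placeholder point cloud
scores = rng.normal(size=(8, 1024))  # placeholder for learned scores
kps, attn = keypoints_from_attention(cloud, scores)
```

Because the weights for each keypoint sum to one, every keypoint in this sketch lies inside the bounding box of the point cloud, and the row `attn[k]` can be visualized as the region that keypoint `k` attends to.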


Consistent Keypoints Across Shapes

Airplanes

Airplane Keypoints

Guitars

Guitar Keypoints

Chairs

Chair Keypoints

The same keypoint IDs remain spatially consistent across object instances, demonstrating strong semantic alignment.


Keypoint Interpolation

The learned latent representation enables smooth interpolation between different object geometries.

The generated shapes transition smoothly between source and target objects, showing that the learned keypoints capture meaningful geometric variation.
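The interpolation described here can be illustrated with a minimal sketch that blends two conditioning arrays linearly. This README does not specify the interpolation scheme or the decoder interface, so linear blending is an assumption and the name `interpolate_keypoints` is hypothetical. In a full pipeline, each intermediate array would presumably be passed to the diffusion model as conditioning, a step not shown here.

```python
import numpy as np

def interpolate_keypoints(source, target, num_steps):
    """Linearly blend two (K, D) arrays (assumed scheme, not from the source).

    Returns an array of shape (num_steps, K, D) whose first entry equals
    `source` and whose last entry equals `target`.
    """
    source = np.asarray(source, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    if source.shape != target.shape:
        raise ValueError("source and target must have the same shape")
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    alphas = np.linspace(0.0, 1.0, num_steps)[:, None, None]
    return (1.0 - alphas) * source + alphas * target

rng = np.random.default_rng(0)
kp_a = rng.normal(size=(8, 3))  # placeholder keypoints for the source shape
kp_b = rng.normal(size=(8, 3))  # placeholder keypoints for the target shape
path = interpolate_keypoints(kp_a, kp_b, num_steps=5)
```

Blending of this kind is only meaningful because keypoint IDs correspond across shapes: index `k` in `kp_a` and index `k` in `kp_b` must refer to the same semantic part.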


Highlights