PhD Thesis Defense: Dmitry Petrov, Structure-Aware Shape and Image Synthesis
Speaker
Dmitry Petrov
Abstract
In recent years, a variety of deep neural network-based architectures have been developed for 3D shape and image generation, with wide-ranging applications in computer-aided design, fabrication, architecture, art, and entertainment. While these methods can capture diverse macro-level appearances, they rarely model the 3D structure or topology of generated objects explicitly, relying instead on the representational power of the network to produce plausible-looking shapes or images. In my work, I introduce shape and image synthesis methods that model complex topological and geometric details, and that support interpretable control over the structure and geometry of generated objects.
(1) I propose ANISE, a new part-aware 3D shape reconstruction method based on neural implicit functions. Given a partial shape observation (an image or a point cloud), it reconstructs the shape as a combination of parts, each with its own geometric representation. I formulate shape reconstruction in two ways: as a union of part implicit functions, or by retrieving parts from a reference database and assembling them into the final shape. This approach allows modifying the final result either by moving parts or by swapping them via their part latent codes.
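The union-of-parts formulation can be illustrated with a toy sketch (this is not the ANISE code; the spherical "parts" stand in for per-part neural implicits decoded from part latent codes). A point is inside the shape if it is inside any part, which corresponds to taking the pointwise minimum over the part implicit functions:

```python
import numpy as np

def part_sphere(center, radius):
    """Implicit function for a spherical 'part' (toy stand-in for a
    per-part neural implicit): negative inside, positive outside."""
    center = np.asarray(center, dtype=float)
    def f(points):
        return np.linalg.norm(points - center, axis=-1) - radius
    return f

def union(parts):
    """Union of part implicits: min_i f_i(x) < 0 iff x is inside some part."""
    def f(points):
        return np.min([p(points) for p in parts], axis=0)
    return f

# Two overlapping spherical parts; moving or swapping one part changes
# only its own implicit, leaving the rest of the shape untouched.
shape = union([part_sphere([0, 0, 0], 1.0), part_sphere([1.5, 0, 0], 1.0)])
pts = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [5.0, 0.0, 0.0]])
inside = shape(pts) < 0  # [True, True, False]
```

The same min-combination applies unchanged when each part is a learned neural implicit rather than an analytic sphere.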
(2) I introduce GEM3D, a two-step 3D shape generation model that first generates a medial skeletal abstraction capturing the shape's coarse structure, and then infers and assembles a collection of locally supported neural implicit functions conditioned on the generated skeleton. This skeleton-based latent grid is more structure-aware than other irregular latent grid approaches, providing more interpretable support for latent codes in 3D space while remaining capable of representing complex, fine-grained topological structures. It also allows editing the resulting surface by manipulating the generated skeleton.
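A minimal sketch of the skeleton-conditioned idea (again not the GEM3D code): each skeleton point carries a latent code, reduced here to a single local radius, so the decoded surface is the envelope of locally supported implicits anchored on the skeleton. Moving skeleton points or changing their codes directly edits the surface:

```python
import numpy as np

def skeletal_implicit(skeleton_pts, radii):
    """Toy skeleton-based implicit: each skeleton point supports a local
    implicit (here a sphere of the given radius); the surface is their
    envelope, so the nearest skeleton point dominates at each query."""
    skeleton_pts = np.asarray(skeleton_pts, dtype=float)
    radii = np.asarray(radii, dtype=float)
    def f(points):
        # pairwise distances: (num_queries, num_skeleton_points)
        d = np.linalg.norm(points[:, None, :] - skeleton_pts[None, :, :], axis=-1)
        return np.min(d - radii[None, :], axis=1)
    return f

# A straight 3-point skeleton decoded into a capsule-like surface.
skel = [[0, 0, 0], [1, 0, 0], [2, 0, 0]]
f = skeletal_implicit(skel, [0.3, 0.3, 0.3])
queries = np.array([[1.0, 0.1, 0.0], [1.0, 1.0, 0.0]])
inside = f(queries) < 0  # [True, False]
```

In the full model the per-point radius is replaced by a learned latent code and the local sphere by a neural decoder, but the skeleton still provides the interpretable spatial support.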
(3) Finally, I propose ShapeWords, an approach for synthesizing images from 3D shape guidance and text prompts. ShapeWords incorporates target 3D shape information within specialized tokens embedded together with the input text, effectively blending 3D shape awareness with textual context to guide the image synthesis process. Unlike conventional shape-guidance methods that rely on depth maps restricted to fixed viewpoints and often overlook full 3D structure or textual context, ShapeWords generates diverse yet consistent images that reflect both the target shape's geometry and the textual description. I show that ShapeWords produces images that are more text-compliant and aesthetically plausible, while also maintaining 3D shape awareness.
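The token-blending mechanism can be sketched as follows (a hypothetical illustration, not the ShapeWords implementation): a shape encoder maps the target geometry into a few embeddings of the same width as the text tokens, and the concatenated sequence conditions a text-to-image model (not shown):

```python
import numpy as np

rng = np.random.default_rng(0)
TOKEN_DIM = 8          # embedding width shared by text and shape tokens (assumed)
NUM_SHAPE_TOKENS = 2   # number of specialized shape tokens (assumed)

def encode_shape(point_cloud, proj):
    """Toy shape encoder: pool the point cloud to a global feature and
    project it to NUM_SHAPE_TOKENS embeddings of width TOKEN_DIM."""
    feat = point_cloud.mean(axis=0)                 # (3,) global feature
    return (proj @ feat).reshape(NUM_SHAPE_TOKENS, TOKEN_DIM)

proj = rng.normal(size=(NUM_SHAPE_TOKENS * TOKEN_DIM, 3))
text_tokens = rng.normal(size=(5, TOKEN_DIM))       # stand-in text embeddings
shape_tokens = encode_shape(rng.normal(size=(100, 3)), proj)

# Conditioning sequence seen by the image model: text tokens + shape tokens,
# so shape awareness and textual context are blended in one sequence.
conditioning = np.concatenate([text_tokens, shape_tokens], axis=0)  # (7, 8)
```

Because the shape tokens live in the text-embedding space, the downstream generator needs no architectural change to consume them.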
Advisor
Evangelos Kalogerakis