ShapeWords: Guiding Text-to-Image Synthesis with 3D Shape-Aware Prompts

CVPR 2025

Dmitry Petrov¹, Pradyumn Goyal¹, Divyansh Shivashok¹, Yuanming Tao¹, Melinos Averkiou²˒³, Evangelos Kalogerakis¹˒²˒⁴

¹UMass Amherst ²CYENS CoE ³University of Cyprus ⁴TU Crete

1️⃣ Select a shape category from the dropdown menu (55 categories in total). We recommend trying the chair (default), car, lamp, and bottle categories.

2️⃣ Write a text prompt using [CATEGORY] as a placeholder, or click the "Random prompt" button to pick from a small set of predefined prompts.

3️⃣ Adjust the guidance strength to control how strongly the shape influences the image. The default value of 0.9 gives the best balance between prompt adherence and shape adherence. A value of 0.0 gives the unguided result, which is based on the input prompt alone.

4️⃣ (Optional) Choose a random seed. For a fixed combination of input prompt and random seed, the unguided image will always be the same.

5️⃣ Choose the guidance 3D shape using the slider, or the navigation or random shape buttons. Shapes come from the ShapeNet dataset (~55K shapes across all categories).

6️⃣ Click the Generate Images button at the bottom to create images that follow both your text prompt and the geometry of the selected 3D shape.
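
The inputs collected in the steps above can be sketched as a small Python structure. This is only an illustration of how the [CATEGORY] placeholder and the settings fit together, not the demo's actual code: `GenerationRequest` and `build_request` are hypothetical names, and the 0 to 1 range for guidance strength is an assumption based on the demo's slider.

```python
from dataclasses import dataclass

PLACEHOLDER = "[CATEGORY]"


@dataclass(frozen=True)
class GenerationRequest:
    """Hypothetical container for one generation run (not the demo's API)."""
    prompt: str               # prompt with the placeholder already filled in
    category: str             # e.g. "chair", "car", "lamp", "bottle"
    shape_index: int          # index of the guidance shape within the category
    guidance_strength: float  # 0.0 = unguided, 0.9 = recommended default
    seed: int                 # fixes the unguided image for a given prompt


def build_request(template, category, shape_index,
                  guidance_strength=0.9, seed=0):
    """Fill in the placeholder and check the settings before generating."""
    if PLACEHOLDER not in template:
        raise ValueError(f"prompt must contain {PLACEHOLDER}")
    # Assumed range, read off the demo's guidance strength slider.
    if not 0.0 <= guidance_strength <= 1.0:
        raise ValueError("guidance_strength must be between 0.0 and 1.0")
    if shape_index < 0:
        raise ValueError("shape_index must be non-negative")
    return GenerationRequest(
        prompt=template.replace(PLACEHOLDER, category),
        category=category,
        shape_index=shape_index,
        guidance_strength=guidance_strength,
        seed=seed,
    )


request = build_request("a photo of a [CATEGORY] on the beach", "chair", 0)
print(request.prompt)
```

Keeping the seed and prompt fixed while changing only `shape_index` or `guidance_strength` makes it easy to compare guided results against the same unguided image.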

About ShapeWords

ShapeWords incorporates target 3D shape information with text prompts to guide image synthesis.
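
One way to picture the role of the guidance strength is as a scale factor on a shape-dependent offset added to the prompt embedding. The sketch below is a simplified mental model under that assumption, not the authors' implementation: `apply_shape_guidance` is a hypothetical helper, the embeddings are plain lists of floats, and how the shape offset is computed is left out entirely. It does reproduce the one behavior stated above: a strength of 0.0 leaves the prompt embedding unchanged, which corresponds to the unguided result.

```python
def apply_shape_guidance(prompt_embedding, shape_offset, strength):
    """Return prompt_embedding + strength * shape_offset, element by element.

    Assumed linear model for illustration only; the real method may differ.
    """
    if len(prompt_embedding) != len(shape_offset):
        raise ValueError("embedding and offset must have the same length")
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0.0 and 1.0")
    return [p + strength * d for p, d in zip(prompt_embedding, shape_offset)]


prompt_embedding = [1.0, 2.0]   # toy stand-in for a text embedding
shape_offset = [0.5, -1.0]      # toy stand-in for a shape-derived offset

# Strength 0.0 returns the prompt embedding unchanged (unguided).
print(apply_shape_guidance(prompt_embedding, shape_offset, 0.0))
# Strength 1.0 applies the full offset.
print(apply_shape_guidance(prompt_embedding, shape_offset, 1.0))
```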

Citation

@misc{petrov2024shapewords,
      title={ShapeWords: Guiding Text-to-Image Synthesis with 3D Shape-Aware Prompts}, 
      author={Dmitry Petrov and Pradyumn Goyal and Divyansh Shivashok and Yuanming Tao and Melinos Averkiou and Evangelos Kalogerakis},
      year={2024},
      eprint={2412.02912},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2412.02912}, 
}