What is CreatiDesign: A Unified Multi-Conditional Diffusion Transformer for Creative Graphic Design

CreatiDesign is a research project that helps you make polished graphic designs from many types of inputs at once. You can give it product photos, simple layout boxes, and short text like a slogan, and it will build a full poster-style image that follows your plan.

It was created by researchers from Fudan University and ByteDance Intelligent Creation. The team built a single model that listens to all your inputs and keeps each one in the right place with the right look.

CreatiDesign: A Unified Multi-Conditional Diffusion Transformer for Creative Graphic Design Overview

CreatiDesign focuses on clear control. Each element you give it is kept where it should be, without one input fighting another.

Item	Detail
Type	Research project and open-source code
Purpose	Create full graphic designs from many inputs (images, layouts, texts) with strong control
Model	Unified multi-conditional diffusion transformer
Key Features	Multi-conditional generation, precise element control with attention masks, large dataset, benchmark, zero-shot editing
Inputs	Product or subject images, semantic layout boxes/positions, short text (titles, slogans)
Output	High-quality graphic design images (e.g., ads, posters, social posts)
Dataset	400K design samples with multi-condition labels
Benchmark	1,000 curated test cases for strict checks
Institutions	Fudan University and ByteDance Intelligent Creation
Authors	Hui Zhang, Dexiang Hong, Maoke Yang, Yutao Cheng, Zhao Zhang, Jie Shao, Xinglong Wu, Zuxuan Wu, Yu-Gang Jiang
Year	2025 (arXiv preprint arXiv:2505.19114)
Project Page	https://huizhang0812.github.io/CreatiDesign/
GitHub	https://github.com/HuiZhang0812/CreatiDesign

CreatiDesign: A Unified Multi-Conditional Diffusion Transformer for Creative Graphic Design Key Features

One model for many inputs. It handles product images, layout boxes, and short text at the same time.
Strong control over each element. A special attention mask keeps each input in its own area without mixing.
Big training data. A 400K-sample dataset supports training, and a separate 1,000-sample benchmark measures the results.
Works for editing too. You can use it to tweak designs without extra training.
Built to keep your idea intact. Tests show better subject match and better layout match against past methods.

Unified Multi-Condition Driven Architecture

How It Works

You provide three kinds of inputs. These are main images (like a product), layout hints (where things go), and small text chunks.
The model reads them together. It plans how to arrange items so the final image fits your plan.
A “multimodal attention mask” stops inputs from stepping on each other. This keeps the subject sharp, the text in the right zone, and the layout tidy.
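
The masking idea can be illustrated with a toy example. The sketch below is not the authors' implementation. It assumes a made-up token order (image grid tokens first, then one group of tokens per condition) and a simplified rule: each condition exchanges information only with the image tokens inside its own layout box, and conditions never attend to each other. The names Condition and build_mask are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Condition:
    """One conditioning input and the image region it may influence (hypothetical)."""
    name: str
    num_tokens: int
    box: tuple  # (x0, y0, x1, y1) in grid cells, x1 and y1 exclusive


def build_mask(grid_w, grid_h, conditions):
    """Return (mask, offsets); mask[q][k] is True where query q may attend to key k.

    Assumed token order: image grid tokens in row-major order,
    followed by the tokens of each condition in the order given.
    """
    n_img = grid_w * grid_h
    total = n_img + sum(c.num_tokens for c in conditions)
    mask = [[False] * total for _ in range(total)]

    # Image tokens always attend to each other.
    for q in range(n_img):
        for k in range(n_img):
            mask[q][k] = True

    offsets = {}
    start = n_img
    for c in conditions:
        offsets[c.name] = start
        cond_ids = range(start, start + c.num_tokens)
        x0, y0, x1, y1 = c.box
        region = [y * grid_w + x for y in range(y0, y1) for x in range(x0, x1)]
        for t in cond_ids:
            # Tokens of one condition attend to each other ...
            for u in cond_ids:
                mask[t][u] = True
            # ... and exchange information only with image tokens in their box.
            for r in region:
                mask[t][r] = True
                mask[r][t] = True
        start += c.num_tokens
    return mask, offsets


conds = [
    Condition("product", num_tokens=2, box=(0, 0, 2, 2)),
    Condition("slogan", num_tokens=1, box=(2, 0, 4, 1)),
]
mask, offsets = build_mask(grid_w=4, grid_h=2, conditions=conds)
print(offsets)
```

In this toy setup the product tokens can reach only the left 2x2 block of the image grid and the slogan token only the top-right cells, which is the kind of separation the mask is meant to enforce.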

Installation & Setup

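Run the commands below in your terminal to set up the environment and run the evaluation scripts. requirements.txt and the scripts are referenced by relative path, so run the pip and python commands from the directory of your local clone of the GitHub repository that contains them (typically the repository root).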

Environment setup

conda create -n creatidesign python=3.10 -y
conda activate creatidesign
conda install pytorch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 pytorch-cuda=12.1 -c pytorch -c nvidia

Requirements installation

pip install -r requirements.txt
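
After the install, a quick sanity check can confirm that the interpreter and PyTorch match the versions used in the commands above. This is a minimal sketch, not part of the project; the function name check_environment is hypothetical, and it only reports what it finds.

```python
import importlib.util
import sys


def check_environment(expected_python=(3, 10), expected_torch="2.4.1"):
    """Collect basic facts about the current interpreter and PyTorch install."""
    report = {
        "python_ok": sys.version_info[:2] == expected_python,
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "torch_ok": False,
        "cuda_available": False,
    }
    if report["torch_installed"]:
        import torch  # imported lazily so the check also runs without PyTorch

        # torch.__version__ may carry a build suffix such as "+cu121".
        report["torch_ok"] = torch.__version__.split("+")[0] == expected_torch
        report["cuda_available"] = bool(torch.cuda.is_available())
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If torch_ok or cuda_available comes back False inside the creatidesign environment, re-running the conda install line is a reasonable first step.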

To run the benchmark and evaluations:

Generate images:

python test_creatidesign_benchmark.py

Evaluate multi-subject preservation:

python eval/subject.py

Evaluate semantic layout alignment:

python eval/layout.py

Run the text checks:

python eval/text.py
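
The generation and evaluation commands can also be chained from one small driver script. The sketch below is a convenience wrapper, not something shipped with the project; it assumes the script paths shown above and that it is started from the same directory as those scripts. The names build_commands and run_all are hypothetical.

```python
import subprocess
import sys

# Script paths as they appear in the commands above (relative to the working directory).
GENERATE = "test_creatidesign_benchmark.py"
EVAL_SCRIPTS = ["eval/subject.py", "eval/layout.py", "eval/text.py"]


def build_commands(python=sys.executable):
    """Return the generation command followed by the three evaluation commands."""
    return [[python, script] for script in [GENERATE, *EVAL_SCRIPTS]]


def run_all(execute=False):
    """Print each command in order; run it only when execute is True."""
    for cmd in build_commands():
        print("command:", " ".join(cmd))
        if execute:
            # check=True raises CalledProcessError on a non-zero exit,
            # so evaluation never runs on a failed generation step.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(execute="--execute" in sys.argv)
```

By default the script only prints the commands; passing --execute runs them in order and stops at the first failure.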

Datasets and Benchmark

The team built a fully automated pipeline to gather 400K labeled design samples. The data includes ads, movie posters, brand promos, and social content.

The benchmark has 1,000 curated test cases. It checks whether subjects are preserved, whether layouts are followed, and whether image quality holds up.

Graphic Design Datasets

CreatiDesign: A Unified Multi-Conditional Diffusion Transformer for Creative Graphic Design Use Cases

Marketing and ads: Place a product photo, add headline text, and lock key items to set spots for quick campaign assets.
Social media posts: Keep a logo and title in fixed areas and auto-fill backgrounds that fit the brand.
Posters for movies, events, and promos: Control hero images, taglines, badges, and credits with simple layout boxes.
Batch content for A/B tests: Reuse the same layout and vary product shots or taglines to compare.
Quick edits: Swap a subject or move a text box without training a new model.
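
The A/B-testing idea above (one layout, many subject and tagline combinations) can be sketched as a small job-list generator. The field names, file names, and pixel boxes here are hypothetical and are not the project's input format; the point is only the enumeration pattern.

```python
from itertools import product

# Fixed layout shared by every variant: element name -> (x0, y0, x1, y1) in pixels.
LAYOUT = {"subject": (64, 256, 576, 768), "title": (64, 64, 960, 200)}


def make_variants(subject_images, taglines, layout=LAYOUT):
    """Pair every subject image with every tagline under one shared layout."""
    return [
        {"id": f"v{i:03d}", "subject_image": img, "title_text": text, "layout": layout}
        for i, (img, text) in enumerate(product(subject_images, taglines), start=1)
    ]


variants = make_variants(
    ["bottle_front.png", "bottle_side.png"],
    ["Fresh every day", "Taste the difference", "New look, same recipe"],
)
print(len(variants))  # 6 job specs: 2 images x 3 taglines
```

Because the layout is held constant, any difference between variants comes from the subject shot or the tagline, which is what an A/B comparison needs.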

Free Lunch: Expanding to Editing Tasks

Performance & Showcases

CreatiDesign shows stronger subject keeping and tighter layout match than prior single-condition and multi-condition baselines. It also holds text and decorations in their zones with fewer mistakes.

Qualitative comparison with State-of-the-Art Relevant Methods

The project also reports broad tests that score how well subjects are preserved and how well the layout lines up with the plan. These checks give a clear view of real design quality and control.

Quantitative comparison with State-of-the-Art Relevant Methods

The Technology Behind It

Unified design model: One transformer-based diffusion model reads images, layouts, and text together, so a single network handles all conditions jointly instead of separate modules for each.
Attention masks: These masks tell the model which parts of the image each input can affect. This keeps the design balanced.
Minimal base changes: The team kept core changes small, which helps stability and training.

Step-by-Step: Run the Benchmark

Here is a short guide to test the model’s skills with the provided scripts.

Create the environment and install packages.

Use the “Environment setup” and “Requirements installation” commands above.

Generate benchmark images.

Run:

python test_creatidesign_benchmark.py

Evaluate results.

For subject keeping:

python eval/subject.py

For layout match:

python eval/layout.py

For text checks:

python eval/text.py

Tips for Best Results

Prepare clean product shots with clear edges. This helps the model keep the subject strong.
Keep layout boxes simple and non-overlapping. Clear plans lead to cleaner designs.
Use short, readable text for titles and slogans. This avoids clutter and helps placement.
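
The non-overlap tip is easy to check before running anything. The helper below is a minimal sketch that assumes boxes are given as (x0, y0, x1, y1) in normalized 0 to 1 coordinates; that convention and the function names are assumptions for illustration, not the project's format.

```python
def boxes_overlap(a, b):
    """True if two (x0, y0, x1, y1) boxes share interior area; touching edges do not count."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    return ax0 < bx1 and bx0 < ax1 and ay0 < by1 and by0 < ay1


def find_overlaps(layout):
    """Return name pairs (in sorted name order) whose boxes overlap."""
    names = sorted(layout)
    return [
        (m, n)
        for i, m in enumerate(names)
        for n in names[i + 1:]
        if boxes_overlap(layout[m], layout[n])
    ]


layout = {
    "title": (0.05, 0.05, 0.95, 0.20),
    "product": (0.10, 0.25, 0.60, 0.90),
    "badge": (0.55, 0.80, 0.75, 0.95),
}
print(find_overlaps(layout))  # [('badge', 'product')]
```

Here the badge cuts into the lower-right corner of the product box, so moving it right or shrinking it would give a cleaner plan.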

FAQ

Who is CreatiDesign for?

It is for marketers, designers, content teams, and researchers who need fast, controlled design creation. It also helps teams run tests across many layout and content options.

Can I edit an existing design with it?

Yes. It supports editing tasks without extra training, so you can replace a subject or adjust text and keep the rest in place.

What do I need to run it?

You need a Python 3.10 environment with PyTorch 2.4.1 and CUDA 12.1 support. Follow the “Installation & Setup” commands above.

What kinds of inputs work best?

Use a clear subject image, simple layout boxes with positions, and short text strings. The cleaner the inputs, the better the final design.

Where can I read more about the team and project updates?

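See the project page (https://huizhang0812.github.io/CreatiDesign/) and the GitHub repository (https://github.com/HuiZhang0812/CreatiDesign).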

How It Helps Your Workflow

CreatiDesign speeds up mockups and keeps brand rules steady across many assets. It can turn a rough plan into share-ready designs with fewer manual tweaks.

Image source: CreatiDesign: A Unified Multi-Conditional Diffusion Transformer for Creative Graphic Design