
Add PixArt-Alpha modular pipeline #14087

Open
cgloriacc wants to merge 1 commit into huggingface:main from cgloriacc:modular-pixart-alpha

@cgloriacc

What does this PR do?

This ports the PixArt-Alpha text-to-image pipeline into Modular Diffusers, following the same structure as the existing qwenimage and stable_diffusion_3 modular pipelines.

New block files under src/diffusers/modular_pipelines/pixart_alpha/:

  • encoders.py — the T5 text-encoder step. It emits the prompt embeddings and attention mask, plus the negative pair when the guider needs classifier-free guidance, and cleans captions with bs4/ftfy.
  • before_denoise.py — per-prompt input expansion, timestep setup, latent preparation, and PixArt micro-conditions. Resolution and aspect-ratio conditions are emitted only when the model's sample size is 128.
  • denoise.py — the denoise loop built on the guider abstraction. It also handles the PixArt learned-sigma split, taking the first chunk when out_channels is twice in_channels.
  • decoders.py — VAE decode and image post-processing.
  • modular_blocks_pixart_alpha.py and modular_pipeline.py — the blocks assembled into PixArtAlphaAutoBlocks and PixArtAlphaModularPipeline, registered in the modular pipeline mapping.
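The two conditional behaviors described for before_denoise.py and denoise.py can be sketched in plain Python. This is a minimal illustration, not the code in this PR: the function names `micro_conditions` and `split_learned_sigma` are hypothetical, and nested lists stand in for tensors, with the channel axis represented as the inner list.

```python
from typing import Dict, List, Optional


def micro_conditions(
    sample_size: int, height: int, width: int, batch_size: int
) -> Dict[str, Optional[List[List[float]]]]:
    """Hypothetical helper: emit resolution and aspect-ratio conditions
    only when the model's sample size is 128, otherwise emit None."""
    if sample_size != 128:
        return {"resolution": None, "aspect_ratio": None}
    return {
        # One [height, width] row per sample in the batch.
        "resolution": [[float(height), float(width)] for _ in range(batch_size)],
        # One [height / width] row per sample in the batch.
        "aspect_ratio": [[height / width] for _ in range(batch_size)],
    }


def split_learned_sigma(
    noise_pred: List[List[float]], in_channels: int
) -> List[List[float]]:
    """Hypothetical helper: noise_pred is a list of samples, each a list of
    channels. When the model predicts twice as many channels as it takes in,
    keep only the first half (the noise estimate) and drop the learned sigma."""
    out_channels = len(noise_pred[0])
    if out_channels == 2 * in_channels:
        return [sample[:in_channels] for sample in noise_pred]
    return noise_pred
```

With real tensors, the split would be a chunk along the channel dimension that keeps the first chunk. The sketch only shows the gating conditions.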

A pipeline-level test is added under tests/modular_pipelines/pixart_alpha/.

Coordination

Fixes #13301. This is tracked under the Modular Diffusers umbrella #13295, which @sayakpaul approved.

Tests run

Run on CPU with CUDA_VISIBLE_DEVICES="", which matches the CPU container the modular fast-test CI uses in pr_modular_tests.yml.

CUDA_VISIBLE_DEVICES="" python -m pytest tests/modular_pipelines/pixart_alpha/test_modular_pipeline_pixart_alpha.py -v
Full pytest output — 11 passed, 3 skipped
========================= test session starts =========================
platform linux -- Python 3.10.20, pytest-9.1.1, pluggy-1.6.0 -- /path/to/python
cachedir: .pytest_cache
rootdir: /path/to/diffusers
configfile: pyproject.toml
plugins: requests-mock-1.10.0, anyio-4.14.1, xdist-3.8.0, timeout-2.4.0
collected 14 items

...::test_inference_batch_consistent PASSED                                   [  7%]
...::test_inference_batch_single_identical PASSED                             [ 14%]
...::test_to_device SKIPPED (test requires a hardware accelerator)            [ 21%]
...::test_inference_is_not_nan_cpu PASSED                                     [ 28%]
...::test_inference_is_not_nan SKIPPED (test requires a hardware accelerator) [ 35%]
...::test_num_images_per_prompt PASSED                                        [ 42%]
...::test_components_auto_cpu_offload_inference_consistent SKIPPED (...)      [ 50%]
...::test_save_from_pretrained PASSED                                         [ 57%]
...::test_load_expected_components_from_pretrained PASSED                     [ 64%]
...::test_load_expected_components_from_save_pretrained PASSED                [ 71%]
...::test_modular_index_consistency PASSED                                    [ 78%]
...::test_workflow_map PASSED                                                 [ 85%]
...::test_pipeline_call_signature PASSED                                      [ 92%]
...::test_float16_inference PASSED                                            [100%]

========================== warnings summary ===========================
<frozen importlib._bootstrap>:241
  DeprecationWarning: builtin type SwigPyPacked has no __module__ attribute
<frozen importlib._bootstrap>:241
  DeprecationWarning: builtin type SwigPyObject has no __module__ attribute

huggingface_hub/utils/_validators.py:205: UserWarning (9 tests)
  The local_dir_use_symlinks argument is deprecated and ignored in hf_hub_download.

scheduling_dpmsolver_multistep.py:438: DeprecationWarning (12 warnings)
  __array__ implementation doesn't accept a copy keyword (numpy 2.0 migration).

============== 11 passed, 3 skipped, 23 warnings in 72.82s ==============

Why the 3 skips. test_to_device, test_inference_is_not_nan, and test_components_auto_cpu_offload_inference_consistent all carry the @require_accelerator decorator, so they skip on a CPU-only run. This is expected because the modular fast-test CI runs on CPU. The not-NaN check still runs through its CPU companion test_inference_is_not_nan_cpu, which passed.
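For readers unfamiliar with the skip mechanism, here is a minimal stdlib sketch of how an accelerator guard produces those skips on CPU. It assumes the decorator boils down to a `unittest.skipUnless` check on the current device. `DEVICE`, `require_accelerator_sketch`, `Demo`, and `run_demo` are hypothetical names for illustration, not the diffusers testing utilities themselves.

```python
import unittest

# Assumption: the device the test session resolved to; "cpu" mirrors a
# run with CUDA_VISIBLE_DEVICES="".
DEVICE = "cpu"


def require_accelerator_sketch(test_case):
    """Hypothetical stand-in for an accelerator guard: skip unless the
    session is running on something other than the CPU."""
    return unittest.skipUnless(
        DEVICE != "cpu", "test requires a hardware accelerator"
    )(test_case)


class Demo(unittest.TestCase):
    @require_accelerator_sketch
    def test_inference_is_not_nan(self):
        # Never executed on a CPU-only run; reported as skipped instead.
        self.fail("should have been skipped on CPU")

    def test_inference_is_not_nan_cpu(self):
        # CPU companion test: always runs.
        self.assertTrue(True)


def run_demo():
    """Run the demo case and return (tests run, tests skipped, success)."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(Demo)
    result = unittest.TestResult()
    suite.run(result)
    return result.testsRun, len(result.skipped), result.wasSuccessful()
```

Calling `run_demo()` with `DEVICE = "cpu"` reports the guarded test as skipped rather than failed, which is the same pattern as the 3 skips in the output above.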

Why the warnings, none of which come from this PR's code. The SwigPyPacked/SwigPyObject warnings come from a SWIG-based dependency at import time on Python 3.10. The local_dir_use_symlinks warning comes from huggingface_hub while the test downloads the tiny checkpoint. The numpy 2.0 array warning is raised inside the existing scheduling_dpmsolver_multistep.py scheduler; it is pre-existing and surfaces only because the pipeline uses DPMSolverMultistepScheduler.

The repo-consistency and quality checks that pr_modular_tests.yml runs also pass locally:

make quality
python utils/check_copies.py
python utils/check_dummies.py
python utils/check_support_list.py
python utils/check_forward_call_docstrings.py
make deps_table_check_updated
make modular-autodoctrings

Notes

  • Only the text2image workflow is exposed, since PixArt-Alpha is a text-to-image-only model with no image-to-image or inpainting variants.
  • Resolution binning is not ported in this first version. The classic pipeline exposes it through use_resolution_binning, and I'm happy to add it as a follow-up.
  • The tiny test checkpoint currently lives at idealclx/tiny-pixart-alpha-modular. Per modular.md gotcha #6, tiny test models must move under hf-internal-testing before merge. Could a maintainer move it there? I'll update the test path afterwards.
  • No per-model docs page is added, which matches the other modular pipelines. The modular docs only cover the framework classes.

Who can review?

@yiyixuxu @sayakpaul @asomoza

  This ports the PixArt-Alpha text-to-image pipeline into the Modular Diffusers
  framework. It adds the text-encoder, before-denoise, denoise (guider), and decode
  blocks, assembles them into PixArtAlphaAutoBlocks and PixArtAlphaModularPipeline,
  and adds a pipeline-level test.
Development

Successfully merging this pull request may close these issues.

Modular Pipeline: support for PixArtAlphaPipeline