- 2025.9.15: 🔥 Add NF4, INT8, and DF11 versions of BAGEL-RecA! Thanks to @theunlikely!
- 2025.9.14: 🔥 Add ComfyUI guide! Try BAGEL-RecA in ComfyUI!
- 2025.9.11: Harmon training code is released!
- 2025.9.10: BAGEL training code is released! Harmon training code will be released soon.
- 2025.9.9: Our finetuned weights and arXiv paper are available! We expect to release the training code tomorrow.
- Online Demo: Try out our enhanced BAGEL-RecA demo on Hugging Face Spaces!
- ComfyUI: see ComfyUI-BAGEL. Usage is exactly the same as the original ComfyUI-BAGEL, except that you replace the BAGEL weight `models/bagel/BAGEL-7B-MoT/ema.safetensors` with the RecA-tuned one. The ComfyUI-BAGEL repo already supports NF4 and INT8 conversion of BAGEL.

  ```shell
  wget https://huggingface.co/sanaka87/BAGEL-RecA/resolve/main/model_bf16.safetensors
  mv model_bf16.safetensors models/bagel/BAGEL-7B-MoT/ema.safetensors
  ```

  NF4 and INT8 weights of BAGEL-RecA can also be downloaded from BAGEL-RecA, and a DF11 version is available as well (heartfelt thanks to @theunlikely!).
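After swapping in the RecA-tuned weight, you can sanity-check the file without loading the full model: the safetensors format begins with an 8-byte little-endian header length followed by a JSON index, so a few lines of stdlib Python (a minimal sketch, not part of the official tooling) can list the tensor names:

```python
import json
import struct

def safetensors_keys(path):
    """Read only the JSON header of a .safetensors file and return its tensor names."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # u64 little-endian header size
        header = json.loads(f.read(header_len))
    # "__metadata__" is an optional non-tensor entry in the header
    return sorted(k for k in header if k != "__metadata__")

# Example, using the ComfyUI path from above:
# print(safetensors_keys("models/bagel/BAGEL-7B-MoT/ema.safetensors")[:5])
```

If this raises a JSON error or returns an empty list, the download was likely an HTML error page rather than the actual checkpoint.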
- Local Setup: Follow the instructions in the BAGEL Installation Guide to set up the environment, then run `BAGEL/inference.ipynb` to test the model locally!
- Full Training & Evaluation: For detailed instructions on installation, training, and evaluation, please refer to the respective repository READMEs:
  - BAGEL Installation Guide: Complete guide for BAGEL model training and evaluation.
  - Harmon Installation Guide: Comprehensive instructions for Harmon model training and evaluation.
  - Benchmark Evaluation Guide: Multi-benchmark evaluation scripts and setup instructions.
A collection of RecA models on Hugging Face with benchmark performance:
| Model Name | Parameters | GenEval | DPGBench | ImgEdit | GEdit |
|---|---|---|---|---|---|
| BAGEL-RecA (supports INT8, NF4) | 14B | 82.4 (+3.6) | 85.29 (+1.26) | 3.75 (+0.37) | 7.27 (+0.33) |
| Harmon-0.5B-RecA | 0.5B | 78.7 (+11.1) | 84.67 (+4.55) | - | - |
| Harmon-1.5B-RecA | 1.5B | 85.7 (+12.8) | 87.21 (+6.28) | - | - |
| Show-o-RecA | 1.3B | 61.9 (+5.3) | 75.70 (+5.05) | - | - |
| Show-o-512x512-RecA | 1.3B | 72.3 (+6.1) | 84.94 (+2.73) | - | - |
| Harmon-1.5B-RecA-plus | 1.5B | 90.0 | 88.15 | - | - |
| OpenUni-RecA | 3.6B | 74.1 (+12.2) | 82.75 (+3.73) | - | - |
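The parenthesized deltas in the table are absolute gains over each untuned base model, so the base scores can be recovered by subtraction. A minimal sketch (GenEval numbers copied from the table above):

```python
# (tuned score, gain over base) pairs copied from the GenEval column above;
# the "+x" deltas are absolute improvements over each untuned base model
geneval = {
    "BAGEL-RecA": (82.4, 3.6),
    "Harmon-1.5B-RecA": (85.7, 12.8),
    "Show-o-RecA": (61.9, 5.3),
}

# Recover each base model's score by subtracting the gain
base_scores = {name: round(tuned - gain, 1) for name, (tuned, gain) in geneval.items()}
for name, base in base_scores.items():
    print(f"{name}: base {base} -> tuned {geneval[name][0]}")
```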
Unlocking the Massive Zero-shot Potential in Unified Multimodal Models through Self-supervised Learning.
RecA achieves state-of-the-art performance on generation benchmarks with remarkable efficiency. Despite using only 1.5B parameters, RecA surpasses models with 7B-24B parameters, reaching GenEval 0.86 and DPGBench 87.21 without GPT-4o distillation data or reinforcement learning. RecA also significantly improves BAGEL's editing performance across all categories. Further two-stage fine-tuning with GPT-4o-Image distillation data raises these scores to 0.90 and 88.15, respectively.
We've tested RecA on various base architectures, including Show-o, OpenUni, Harmon, and BAGEL, consistently observing significant performance improvements across all models and benchmarks.
Our method demonstrates superior image editing capabilities compared to state-of-the-art models including ICEdit, FLUX-Kontext, and GPT-4o:
- [x] Release our model weights on Hugging Face.
- [x] Release BAGEL training code.
- [x] Release Harmon training code.
- [x] Add ComfyUI guide.
- [ ] Release Show-o and OpenUni training code.
- [ ] Further scale up BAGEL training.
- [ ] Add support for new UMM architectures like Show-o2.
The majority of RecA is licensed under the Apache License; however, portions of the project are available under their own license terms: BAGEL and Show-o are licensed under Apache, while Harmon and OpenUni are licensed under the S-Lab license. If you later add other third-party code, please keep this license information updated, and please let us know if that component is licensed under anything other than Apache, CC-BY-NC, MIT, or CC0.
For feedback or collaboration opportunities, feel free to reach out!
If you have any general questions, feel free to email us at sanaka@berkeley.edu and xdwang@eecs.berkeley.edu. For code or implementation questions, you can email us or open an issue in this codebase (we recommend opening an issue, since your question may help others).
If you find our work inspiring or use our codebase in your research, please consider giving a star ⭐ and a citation.
```bibtex
@article{xie2025reconstruction,
  title={Reconstruction Alignment Improves Unified Multimodal Models},
  author={Xie, Ji and Darrell, Trevor and Zettlemoyer, Luke and Wang, XuDong},
  journal={arXiv preprint arXiv:2509.07295},
  year={2025}
}
```