
RecA: Reconstruction Alignment Improves Unified Multimodal Models

🚀 Just 6 × 80GB A100s for 4.5 hours is enough to boost BAGEL's performance across all tasks! Our RecA-tuned BAGEL outperforms FLUX-Kontext in image editing!

Paper · alphaXiv · Hugging Face Collection · HF Demo · Project Page

Ji Xie¹, Trevor Darrell¹, Luke Zettlemoyer², XuDong Wang¹*
¹UC Berkeley; ²University of Washington

🔥 News

  • 2025.9.15: 🔥 Added NF4, INT8, and DF11 versions of BAGEL-RecA! Thanks to @theunlikely!
  • 2025.9.14: 🔥 Added a ComfyUI guide! Try BAGEL-RecA in ComfyUI!
  • 2025.9.11: Harmon training code is released!
  • 2025.9.10: BAGEL training code is released! Harmon training code will be released soon.
  • 2025.9.9: Our finetuned weights and arXiv paper are available! We expect to release the training code tomorrow.

📑 Table of Contents

  • 🔧 Quick Start
  • 🏆 Model Zoo
  • 🍭 Results
  • 🎨 Edit Comparison
  • 🚧 TODO
  • License
  • 📮 Contact
  • 📄 Citation

🔧 Quick Start!

  1. Online Demo: Try out our enhanced BAGEL-RecA demo on Hugging Face Spaces!


  2. ComfyUI: See ComfyUI-BAGEL. Usage is exactly the same as the original ComfyUI-BAGEL, except that you replace the BAGEL weight models/bagel/BAGEL-7B-MoT/ema.safetensors with the RecA-tuned one (a Python download sketch is also provided after this list). The ComfyUI-BAGEL repo already supports NF4 and INT8 conversion of BAGEL.
# Download the RecA-tuned BAGEL weights (use resolve/, not blob/, so wget fetches the raw file)
wget https://huggingface.co/sanaka87/BAGEL-RecA/resolve/main/model_bf16.safetensors
# Swap it in for the original BAGEL checkpoint used by ComfyUI-BAGEL
mv model_bf16.safetensors models/bagel/BAGEL-7B-MoT/ema.safetensors

The NF4 and INT8 weights can also be downloaded from the BAGEL-RecA repository.

A DF11 version of BAGEL-RecA is available as well (heartfelt thanks to @theunlikely!).

  3. Local Setup: Follow the instructions in the BAGEL Installation Guide to set up the environment, then run BAGEL/inference.ipynb to test the model locally!

  4. Full Training & Evaluation: For detailed instructions on installation, training, and evaluation, please refer to the respective sub-repository READMEs.
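
If you prefer to script the weight download from step 2, the sketch below uses the huggingface_hub and safetensors Python packages. The repository ID, filename, and destination path come from the commands above; the copy step and the sanity check are illustrative assumptions, not part of the official tooling.

from pathlib import Path
import shutil

from huggingface_hub import hf_hub_download  # pip install huggingface_hub
from safetensors import safe_open            # pip install safetensors

# Fetch the RecA-tuned bf16 checkpoint from the Hugging Face Hub (cached locally).
ckpt = hf_hub_download(repo_id="sanaka87/BAGEL-RecA", filename="model_bf16.safetensors")

# Optional sanity check: confirm the file parses as a safetensors checkpoint.
with safe_open(ckpt, framework="pt", device="cpu") as f:
    print(f"{len(list(f.keys()))} tensors in checkpoint")

# Copy it to the path ComfyUI-BAGEL expects (adjust if your ComfyUI tree differs).
dst = Path("models/bagel/BAGEL-7B-MoT/ema.safetensors")
dst.parent.mkdir(parents=True, exist_ok=True)
shutil.copy(ckpt, dst)
print(f"Copied {ckpt} -> {dst}")

The same hf_hub_download call should work for the NF4/INT8/DF11 variants; only the filename argument would change to match the file listed in the BAGEL-RecA repository.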

🏆 Model Zoo

A collection of RecA models on Hugging Face with benchmark performance:

| Model Name | Parameters | GenEval | DPGBench | ImgEdit | GEdit |
|---|---|---|---|---|---|
| BAGEL-RecA (supports INT8, NF4) | 14B | 82.4 (+3.6) | 85.29 (+1.26) | 3.75 (+0.37) | 7.27 (+0.33) |
| Harmon-0.5B-RecA | 0.5B | 78.7 (+11.1) | 84.67 (+4.55) | - | - |
| Harmon-1.5B-RecA | 1.5B | 85.7 (+12.8) | 87.21 (+6.28) | - | - |
| Show-o-RecA | 1.3B | 61.9 (+5.3) | 75.70 (+5.05) | - | - |
| Show-o-512x512-RecA | 1.3B | 72.3 (+6.1) | 84.94 (+2.73) | - | - |
| Harmon-1.5B-RecA-plus | 1.5B | 90.0 | 88.15 | - | - |
| OpenUni-RecA | 3.6B | 74.1 (+12.2) | 82.75 (+3.73) | - | - |

🍭 Results

Unlocking the Massive Zero-shot Potential in Unified Multimodal Models through Self-supervised Learning.

RecA achieves state-of-the-art performance on generation benchmarks with remarkable efficiency. Despite using only 1.5B parameters, RecA surpasses models with 7B-24B parameters, achieving GenEval 0.86 and DPGBench 87.21 without GPT-4o distillation data or reinforcement learning. RecA also improves BAGEL's editing performance significantly across all categories. Further two-stage fine-tuning with GPT-4o-Image distillation data enhances the score to 0.90 and 88.15 respectively.

We've tested RecA on various base architectures, including Show-o, OpenUni, Harmon, and BAGEL, consistently observing significant performance improvements across all models and benchmarks.

🎨 Edit Comparison

Our method demonstrates superior image editing capabilities compared to state-of-the-art models including ICEdit, FLUX-Kontext, and GPT-4o:


🚧 TODO

  • Release our model weights on Hugging Face.
  • Release BAGEL training code.
  • Release Harmon training code.
  • Add ComfyUI guide.
  • Release Show-o and OpenUni training code.
  • Further scale up BAGEL training.
  • Add support for new UMM architectures like Show-o2.

License

The majority of RecA is licensed under the Apache License; however, portions of the project are available under their own license terms: BAGEL and Show-o are licensed under Apache, while Harmon and OpenUni are licensed under the S-Lab license. If you later add other third-party code, please keep this license information updated, and let us know if that component is licensed under something other than Apache, CC-BY-NC, MIT, or CC0.

📮 Contact

For feedback or collaboration opportunities, feel free to reach out!

If you have any general questions, feel free to email us at sanaka@berkeley.edu and xdwang@eecs.berkeley.edu. For code or implementation-related questions, please email us or open an issue in this repository (we recommend opening an issue, since your questions may help others).

📄 Citation

If you find our work inspiring or use our codebase in your research, please consider giving a star ⭐ and a citation.

@article{xie2025reconstruction,
  title={Reconstruction Alignment Improves Unified Multimodal Models},
  author={Xie, Ji and Darrell, Trevor and Zettlemoyer, Luke and Wang, XuDong},
  journal={arXiv preprint arXiv:2509.07295},
  year={2025}
}

If you find this project helpful, please consider giving it a star!

