- Yandex releases Alchemist, an open, model-guided dataset enhancing text-to-image model quality.
- The dataset contains training samples evaluated and selected by a pre-trained diffusion model.
- Alchemist helps enhance image aesthetics and complexity by up to 20%.
Yandex has released Alchemist, an open dataset designed to enhance the output quality of text-to-image (T2I) models. The dataset, consisting of only 3,350 image-text pairs, enables researchers and developers worldwide to significantly enhance the performance of their T2I models.
The quality gap in text-to-image generation
Despite the progress made by models like DALL-E 3, Imagen 3, and Stable Diffusion 3, achieving consistently high-quality image generation remains a major challenge. Large-scale pre-training provides models with broad contextual understanding, but it is often not enough to produce detailed visuals that fully match the text prompts. Supervised fine-tuning (SFT), a process that refines AI models after their initial training, is a crucial step to bridge this gap, but its success depends heavily on the quality of the fine-tuning data.
Public datasets currently used for SFT often focus on niche styles, such as anime, or draw from large-scale web data using only simple filtering techniques. Human curation is expensive, non-scalable, and often overlooks the most impactful training samples. Moreover, recent T2I models use proprietary datasets with minimal transparency, making results difficult to reproduce and slowing global AI progress in the field.
A smarter way to curate datasets
Alchemist addresses the quality gap with an innovative approach. Instead of relying on human labeling or simplistic aesthetic scoring, a pre-trained diffusion model is used to evaluate and select the training data samples that are most likely to enhance the output quality of T2I models.
The creation of the Alchemist dataset includes a multi-stage filtering pipeline that begins with approximately 10 billion images. The process includes removing inappropriate content, filtering out low-quality or corrupted images, and eliminating duplicates. Then, the pre-trained diffusion model is used to rank and select images that are most likely to enhance visual complexity, aesthetic appeal, and stylistic richness of the model output. Finally, the captions are rewritten using a vision-language model to ensure they are clear and well-suited for fine-tuning. The resulting model-guided dataset contains 3,350 high-quality image-text pairs.
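The stages described above can be illustrated with a minimal Python sketch. It is not Yandex's implementation: the `Sample` fields, the exact-hash deduplication, the `min_quality` threshold, and the `score_fn` and `recaption_fn` callables are all hypothetical stand-ins for the safety classifiers, quality filters, diffusion-model scorer, and vision-language captioner that a real pipeline would use.

```python
import hashlib
from dataclasses import dataclass, replace
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Sample:
    """One candidate image-text pair (all fields are illustrative)."""
    image_bytes: bytes
    caption: str
    is_safe: bool = True   # assumed output of a content-safety filter
    quality: float = 1.0   # assumed output of a low-level quality filter


def deduplicate(samples: Iterable[Sample]) -> List[Sample]:
    """Drop exact duplicates by content hash, keeping first occurrences.

    A real pipeline would likely use perceptual or embedding-based
    near-duplicate detection; exact hashing keeps the sketch simple.
    """
    seen = set()
    unique = []
    for sample in samples:
        digest = hashlib.sha256(sample.image_bytes).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(sample)
    return unique


def build_dataset(
    samples: Iterable[Sample],
    score_fn: Callable[[Sample], float],
    recaption_fn: Callable[[Sample], str],
    k: int,
    min_quality: float = 0.5,
) -> List[Sample]:
    """Filter, deduplicate, rank by a model-derived score, then recaption."""
    # Stage 1: remove inappropriate content.
    safe = [s for s in samples if s.is_safe]
    # Stage 2: remove low-quality or corrupted images.
    clean = [s for s in safe if s.quality >= min_quality]
    # Stage 3: eliminate duplicates.
    unique = deduplicate(clean)
    # Stage 4: rank with the scorer and keep the k highest-scoring samples.
    selected = sorted(unique, key=score_fn, reverse=True)[:k]
    # Stage 5: rewrite the captions of the selected samples.
    return [replace(s, caption=recaption_fn(s)) for s in selected]


# Toy usage: byte length stands in for the diffusion-model score,
# and upper-casing stands in for caption rewriting.
pool = [
    Sample(b"aaaa", "cat"),
    Sample(b"aaaa", "cat duplicate"),
    Sample(b"bbbbbb", "dog", is_safe=False),
    Sample(b"cc", "tree", quality=0.1),
    Sample(b"ddddd", "city"),
    Sample(b"e", "sky"),
]
result = build_dataset(
    pool,
    score_fn=lambda s: len(s.image_bytes),
    recaption_fn=lambda s: s.caption.upper(),
    k=2,
)
print([s.caption for s in result])  # ['CITY', 'CAT']
```

In this sketch the inexpensive filters run first so that the comparatively costly model-based scoring is applied only to candidates that survive them, which mirrors the ordering of the stages described above.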
Proven quality gains across T2I models
The effectiveness of Alchemist was evaluated across five open-source Stable Diffusion models: SD1.5, SD2.1, SDXL, SD3.5 Medium, and SD3.5 Large. Three variants of each model were compared: one fine-tuned on Alchemist, one fine-tuned on a similarly sized subset of a widely used open dataset, and the model's standard baseline. When assessing the Alchemist-tuned models:
- Human evaluators noted improvements in image aesthetics and complexity (often by margins of 12–20%).
- Prompt accuracy was not compromised: images stayed true to their text descriptions.
- The models generally scored higher across all technical quality benchmarks.
Remarkably, during the evaluation stage, fine-tuning with Alchemist's 3,350 samples delivered better results than fine-tuning with larger versions of the dataset containing 7,000 and 19,000 samples, demonstrating that high-quality, carefully selected data is often preferable to larger datasets.
Yandex has already used the dataset to fine-tune its proprietary text-to-image generative model, YandexART v2.5, and plans to continue leveraging it for future model updates.
Dataset availability
Yandex is releasing Alchemist under an open license on Hugging Face, along with fine-tuned versions of five publicly available Stable Diffusion models. The preprint detailing its creation is available on arxiv.org.
Alchemist is the company's latest publicly available release and an important contribution to the text-to-image generation field. Yandex believes that sharing such resources is essential, as they empower global collaboration and accelerate technological progress.
About Yandex
Yandex is a global technology company that builds intelligent products and services powered by machine learning. Its goal is to help consumers and businesses better navigate the online and offline world. Since 1997, Yandex has delivered world-class, locally relevant search and information services and developed market-leading on-demand transportation services, navigation products, and other mobile applications for millions of consumers worldwide.
Contacts
Yandex Press Office
pr@yandex-team.com