Evolutionary Optimization of Model Merging Recipes
Leverages evolutionary techniques to automatically combine diverse open-source models and create new foundation models tailored to specific application domains
Sakana AI introduces Evolutionary Optimization of Model Merging Recipes, an approach that uses evolutionary techniques to merge diverse open-source models into new foundation models tailored to specific application domains. By drawing on the collective intelligence of existing models, the method achieves state-of-the-art performance in Japanese language comprehension, math reasoning, and image description tasks.
With Evolutionary Model Merge, Sakana AI searches a pool of over 500k models across various modalities for effective new combinations. The method goes beyond what human intuition would suggest, merging models from disparate domains and yielding unexpected gains in non-English language comprehension and vision tasks.
The evolved Japanese language and vision-language models handle culturally specific content well, outperforming previous models on benchmarks without being explicitly optimized for them. Notably, the Japanese math language model rivals larger counterparts in performance while requiring minimal compute resources.
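To make the idea of evolving a merging recipe concrete, here is a minimal toy sketch in Python. It is not Sakana AI's implementation: the "models" are small nested lists of numbers, the recipe is one interpolation weight per layer, and the search is a simple elitist evolution strategy chosen for brevity. The names `merge`, `toy_score`, and `evolve_recipe`, along with all parameter values, are hypothetical and exist only for illustration.

```python
import random


def merge(model_a, model_b, weights):
    """Blend two models layer by layer: w * a + (1 - w) * b for each parameter."""
    return [
        [w * pa + (1.0 - w) * pb for pa, pb in zip(layer_a, layer_b)]
        for layer_a, layer_b, w in zip(model_a, model_b, weights)
    ]


def toy_score(merged, target):
    """Stand-in for a task benchmark: negative squared distance to a target."""
    return -sum(
        (p - t) ** 2
        for layer, target_layer in zip(merged, target)
        for p, t in zip(layer, target_layer)
    )


def evolve_recipe(model_a, model_b, score, generations=50,
                  population=16, elite=4, sigma=0.1, seed=0):
    """Search for per-layer merge weights with a simple elitist strategy."""
    rng = random.Random(seed)
    n_layers = len(model_a)
    # Start from random recipes, one weight in [0, 1] per layer.
    pop = [[rng.random() for _ in range(n_layers)] for _ in range(population)]
    for _ in range(generations):
        # Rank recipes by the score of the merged model they produce.
        pop.sort(key=lambda w: score(merge(model_a, model_b, w)), reverse=True)
        parents = pop[:elite]
        children = []
        while len(parents) + len(children) < population:
            parent = rng.choice(parents)
            # Gaussian mutation, clipped so weights stay in [0, 1].
            child = [min(1.0, max(0.0, w + rng.gauss(0.0, sigma)))
                     for w in parent]
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda w: score(merge(model_a, model_b, w)))


if __name__ == "__main__":
    # Hypothetical two-layer models: A is good at layer 0, B at layer 1.
    model_a = [[1.0, 1.0], [0.0, 0.0]]
    model_b = [[0.0, 0.0], [1.0, 1.0]]
    target = [[1.0, 1.0], [1.0, 1.0]]
    recipe = evolve_recipe(model_a, model_b,
                           lambda m: toy_score(m, target))
    print("evolved per-layer weights:", recipe)
```

In this toy setup the search should push the first weight toward 1 and the second toward 0, taking each layer from whichever source model is stronger there. A real system would replace `toy_score` with evaluation on an actual task and would operate on full model checkpoints.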