Researchers at the Allen Institute for AI and UC Berkeley have developed EMO, a mixture-of-experts model that maintains near-full performance using only 12.5% of its experts, making it more practical for memory-constrained settings.