6B total / 1B active Nano (MoE) and 26B total / 3B active Mini (MoE)
Apache 2.0
U.S.-trained MoE family built on the AFMoE architecture; 128K context window; trained on 10T tokens. Nano (6B/1B) targets on-device AI and chat with personality; Mini (26B/3B) targets high-throughput reasoning, function calling, and agent workflows; strong results on MMLU and BFCL V3 (Berkeley Function-Calling Leaderboard).
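If the weights ship as standard Hugging Face Transformers checkpoints, function calling would follow the usual chat-template flow. The sketch below assumes exactly that; the repo id `example-org/mini-moe-instruct` and the tool-enabled chat template are placeholders, not details confirmed by the release.

```python
# Minimal sketch, not taken from the source: loading a chat-tuned MoE checkpoint with
# Hugging Face Transformers and making a function-calling request. The repo id is a
# placeholder, and whether the model's chat template accepts a `tools` argument is an
# assumption to verify against the released model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "example-org/mini-moe-instruct"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # requires `accelerate`
)

def get_weather(city: str) -> str:
    """
    Get the current weather for a city.

    Args:
        city: Name of the city to look up.
    """
    return f"Sunny in {city}"

messages = [{"role": "user", "content": "What's the weather in Austin right now?"}]

# apply_chat_template turns the function signature and docstring into a JSON tool
# schema and renders it into the prompt, provided the template supports tools.
input_ids = tokenizer.apply_chat_template(
    messages,
    tools=[get_weather],
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```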