2B-2B, 9B-2B, 9B-9B (Gemma 2 Series) and Small/Base/Large/ML/XL (T5-compatible Series; encoder-decoder)
Apache 2.0
First encoder-decoder models adapted from Gemma 2 via a novel adaptation technique: pretrained decoder-only models are converted into an encoder-decoder architecture using UL2 or PrefixLM training; achieves comparable or better performance than Gemma 2 counterparts while dominating the quality-efficiency frontier; T5Gemma 2B-2B IT gains +12 points on MMLU and +12.7% on GSM8K over Gemma 2 2B; flexible unbalanced configurations (e.g., a 9B encoder paired with a 2B decoder) optimize for specific tasks such as summarization; excels at tasks requiring deep input understanding (translation, QA, summarization); released in pretrained and instruction-tuned variants; demonstrates that the encoder-decoder architecture remains competitive despite decoder-only dominance and that pretrained decoder-only models can be successfully adapted to encoder-decoder form.
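The PrefixLM objective named above differs from ordinary causal (decoder-only) training mainly in its attention mask: the input prefix attends bidirectionally like an encoder, while the target continuation stays causal. A minimal NumPy sketch of the two mask patterns (function names are illustrative, not from the T5Gemma codebase):

```python
import numpy as np

def causal_mask(n: int) -> np.ndarray:
    # Decoder-only training: position i may attend only to positions <= i.
    return np.tril(np.ones((n, n), dtype=bool))

def prefix_lm_mask(n: int, prefix_len: int) -> np.ndarray:
    # PrefixLM: the first `prefix_len` tokens (the input) attend to each
    # other bidirectionally, like an encoder; the remaining tokens (the
    # target) keep the causal pattern, like a decoder.
    mask = causal_mask(n)
    mask[:prefix_len, :prefix_len] = True
    return mask

# For a 5-token sequence with a 3-token prefix, token 0 can now see
# tokens 1 and 2 (bidirectional prefix) but still not tokens 3 and 4.
m = prefix_lm_mask(5, 3)
```

This is only a sketch of the masking idea; the actual adaptation also initializes both the encoder and decoder stacks from the pretrained decoder-only checkpoint before continuing training with UL2 or PrefixLM.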