TranslateGemma 27B

Publisher: Google
Tags: chat, multilingual, vision

  • Parameters: 27B
  • Context length: 128K
  • Benchmarks: 7
  • Quantizations: 4
  • HF downloads: 7K
  • Architecture: Dense
  • Released: 2026-01-13
  • Layers: 62
  • KV Heads: 16
  • Head Dim: 128
  • Family: gemma

TranslateGemma model card

Resources and Technical Documentation:

Terms of Use: Terms
Authors: Google Translate

Model Information

Summary description and brief definition of inputs and outputs.

Description

TranslateGemma is a family of lightweight, state-of-the-art open translation models from Google, built on the Gemma 3 family of models.
TranslateGemma models are designed to handle translation tasks across 55 languages. Their relatively small size makes it possible to deploy them in environments with limited resources such as laptops, desktops, or your own cloud infrastructure, democratizing access to state-of-the-art translation models and helping foster innovation for everyone.

Inputs and outputs

  • Input:

    • Text string, representing the text to be translated
    • Images, normalized to 896 x 896 resolution and encoded to 256 tokens each
    • Total input context of 2K tokens
  • Output:

    • Text translated into the target language
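Given the figures above (256 tokens per image, 2K tokens of total input context), a quick budget check can be sketched as follows; the per-image and total figures come from this card, while the template-overhead allowance and the helper name are assumptions for illustration:

```python
# Rough input-budget check based on the figures above:
# each image costs 256 tokens and the total input context is 2K tokens.
# TEMPLATE_OVERHEAD is an assumed allowance for chat-template tokens,
# not an official figure.
IMAGE_TOKENS = 256
INPUT_BUDGET = 2048
TEMPLATE_OVERHEAD = 16

def fits_input_budget(n_text_tokens: int, n_images: int = 0) -> bool:
    """Return True if the request should fit in TranslateGemma's input context."""
    used = n_text_tokens + n_images * IMAGE_TOKENS + TEMPLATE_OVERHEAD
    return used <= INPUT_BUDGET

print(fits_input_budget(500))              # True: text-only request
print(fits_input_budget(100, n_images=8))  # False: 8 images alone cost 2048 tokens
```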

Usage

TranslateGemma is designed to work with a specific chat template that supports direct translation of a text input, or text-extraction-and-translation from an image input. This chat template has been implemented with Hugging Face transformers' chat templating system and is compatible with the apply_chat_template() function provided by the Gemma tokenizer and Gemma 3 processor. Notable differences from other models' chat templates include:

  • TranslateGemma supports only User and Assistant roles.

  • TranslateGemma's User role is highly opinionated:

    • The content property must be provided as a list with exactly one entry.

    • The content list entry must provide:

      • A "type" property where the value must be either "text" or "image".
      • A "source_lang_code" property as a string
      • A "target_lang_code" property as a string
    • The content list entry should provide one of these:

      • A "url" property, if the entry's type is "image", from which the image will be loaded
      • A "text" property, if the entry's type is "text", containing only the text to translate
    • The "source_lang_code" and "target_lang_code" property values can take one of two forms:

      • A bare language code, such as "cs"
      • A language code combined with a region subtag, such as "de-DE"
    • If a "source_lang_code" or "target_lang_code" value is not supported by the model, an error will be raised when the template is applied.
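The rules above can be collected into a small pre-flight validator. This is an illustrative sketch only: the helper name and the set of accepted language codes are assumptions, and the real check happens inside the model's chat template when it is applied.

```python
# Illustrative validator for the TranslateGemma user-message rules above.
# SUPPORTED_CODES is a placeholder subset; the model enforces its own list
# when apply_chat_template() runs.
SUPPORTED_CODES = {"cs", "de-DE", "en", "fr"}

def validate_message(msg: dict) -> None:
    if msg.get("role") not in ("user", "assistant"):
        raise ValueError("only user and assistant roles are supported")
    if msg["role"] != "user":
        return
    content = msg.get("content")
    if not isinstance(content, list) or len(content) != 1:
        raise ValueError("content must be a list with exactly one entry")
    entry = content[0]
    if entry.get("type") not in ("text", "image"):
        raise ValueError('"type" must be either "text" or "image"')
    for key in ("source_lang_code", "target_lang_code"):
        if not isinstance(entry.get(key), str):
            raise ValueError(f'"{key}" must be provided as a string')
        if entry[key] not in SUPPORTED_CODES:
            raise ValueError(f"unsupported language code: {entry[key]}")
    payload = "text" if entry["type"] == "text" else "url"
    if payload not in entry:
        raise ValueError(f'a "{payload}" property is required for type "{entry["type"]}"')
```

A well-formed message such as the ones in the examples below passes silently; a malformed one raises `ValueError` before any model call is made.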

Additionally, TranslateGemma may respond well to other prompting techniques that support use cases beyond the provided chat template, such as Automatic Translation Post-Editing. As these are not officially supported, such prompts should be crafted manually using the special control tokens and structures specified in the Gemma 3 Technical Report, and sent directly to the tokenizer or processor instead of through the apply_chat_template() function. The TranslateGemma team is interested in hearing about your experiences with alternate prompts; please reach out with any questions and feedback.
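As a sketch of that manual approach, a raw prompt can be assembled with Gemma's standard turn markers and fed straight to the tokenizer. The instruction wording below is invented for illustration and is not an officially supported prompt; consult the Gemma 3 Technical Report for the exact control structure.

```python
# Manual prompt construction using Gemma's <start_of_turn>/<end_of_turn>
# markers, bypassing apply_chat_template(). The post-editing instruction
# text is illustrative only, not an official TranslateGemma prompt.
def build_raw_prompt(user_text: str) -> str:
    return (
        "<start_of_turn>user\n"
        f"{user_text}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

prompt = build_raw_prompt(
    "Post-edit this draft German translation of a Czech sentence: ..."
)
# tokenizer(prompt, return_tensors="pt") can then be passed to model.generate()
```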

With Pipelines

from transformers import pipeline
import torch

pipe = pipeline(
    "image-text-to-text",
    model="google/translategemma-27b-it",
    device="cuda",
    dtype=torch.bfloat16
)

# ---- Text Translation ----
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "source_lang_code": "cs",
                "target_lang_code": "de-DE",
                "text": "V nejhorším případě i k prasknutí čočky.",
            }
        ],
    }
]

output = pipe(text=messages, max_new_tokens=200)
print(output[0]["generated_text"][-1]["content"])

# ---- Text Extraction and Translation ----
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "source_lang_code": "cs",
                "target_lang_code": "de-DE",
                "url": "https://c7.alamy.com/comp/2YAX36N/traffic-signs-in-czech-republic-pedestrian-zone-2YAX36N.jpg",
            },
        ],
    }
]

output = pipe(text=messages, max_new_tokens=200)
print(output[0]["generated_text"][-1]["content"])

With direct initialization

import torch
from transformers import AutoModelForImageTextToText, AutoProcessor

model_id = "google/translategemma-27b-it"
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(model_id, device_map="auto", dtype=torch.bfloat16)


# ---- Text Translation ----
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "source_lang_code": "cs",
                "target_lang_code": "de-DE",
                "text": "V nejhorším případě i k prasknutí čočky.",
            }
        ],
    }
]

inputs = processor.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_dict=True, return_tensors="pt"
).to(model.device, dtype=torch.bfloat16)
input_len = len(inputs['input_ids'][0])

with torch.inference_mode():
    generation = model.generate(**inputs, max_new_tokens=200, do_sample=False)

generation = generation[0][input_len:]
decoded = processor.decode(generation, skip_special_tokens=True)
print(decoded)

# ---- Text Extraction and Translation ----
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "source_lang_code": "cs",
                "target_lang_code": "de-DE",
                "url": "https://c7.alamy.com/comp/2YAX36N/traffic-signs-in-czech-republic-pedestrian-zone-2YAX36N.jpg",
            },
        ],
    }
]

inputs = processor.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_dict=True, return_tensors="pt"
).to(model.device, dtype=torch.bfloat16)
input_len = len(inputs['input_ids'][0])

with torch.inference_mode():
    generation = model.generate(**inputs, max_new_tokens=200, do_sample=False)

generation = generation[0][input_len:]
decoded = processor.decode(generation, skip_special_tokens=True)
print(decoded)
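For translating several strings in one pass, the message construction can be factored out. `build_text_message` is a convenience helper invented here for illustration, not part of the transformers API; it simply produces the one-entry user message shown in the examples above.

```python
# Hypothetical helper that builds the single-entry user message
# TranslateGemma's chat template expects; loop it over inputs and
# reuse the pipeline or model from the examples above.
def build_text_message(text: str, src: str, tgt: str) -> list[dict]:
    return [{
        "role": "user",
        "content": [{
            "type": "text",
            "source_lang_code": src,
            "target_lang_code": tgt,
            "text": text,
        }],
    }]

sentences = ["V nejhorším případě i k prasknutí čočky."]
batches = [build_text_message(s, "cs", "de-DE") for s in sentences]
# for messages in batches:
#     output = pipe(text=messages, max_new_tokens=200)
#     print(output[0]["generated_text"][-1]["content"])
```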

Quantizations & VRAM

  • Q4_K_M (4.5 bpw): 15.9 GB VRAM required, 94% quality
  • Q6_K (6.5 bpw): 22.8 GB VRAM required, 97% quality
  • Q8_0 (8 bpw): 27.9 GB VRAM required, 100% quality
  • FP16 (16 bpw): 55.3 GB VRAM required, 100% quality
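The VRAM figures above roughly follow from bits-per-weight times parameter count. A back-of-the-envelope check (weights only; KV cache, activations, and other runtime overhead are not included, so actual usage is higher than this estimate):

```python
# Back-of-the-envelope weight size: params * bits-per-weight / 8 bits per byte.
# Reported in GiB (2**30 bytes); runtime overhead is deliberately excluded,
# which is why the card's VRAM figures are somewhat larger.
PARAMS = 27e9

def weight_gib(bpw: float) -> float:
    return PARAMS * bpw / 8 / 2**30

for name, bpw in [("Q4_K_M", 4.5), ("Q6_K", 6.5), ("Q8_0", 8.0), ("FP16", 16.0)]:
    print(f"{name}: ~{weight_gib(bpw):.1f} GiB of weights")
```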

Benchmarks (7)

  • IFEval: 75.5
  • BBH: 51.1
  • BigCodeBench: 42.8
  • MMLU-PRO: 40.3
  • MATH: 27.9
  • MUSR: 16.9
  • GPQA: 16.0

Run with Ollama

$ ollama run translategemma
