TranslateGemma 27B
Model Card
Resources and Technical Documentation:
- Technical Report
- Responsible Generative AI Toolkit
- TranslateGemma on Kaggle
- TranslateGemma on Vertex Model Garden
Terms of Use: Terms
Authors: Google Translate
Model Information
Summary description and brief definition of inputs and outputs.
Description
TranslateGemma is a family of lightweight, state-of-the-art open translation models from Google, based on the Gemma 3 family of models.
TranslateGemma models are designed to handle translation tasks across 55 languages. Their relatively small size makes it possible to deploy them in environments with limited resources such as laptops, desktops, or your own cloud infrastructure, democratizing access to state-of-the-art translation models and helping foster innovation for everyone.
Inputs and outputs
- Input:
  - Text string, representing the text to be translated
  - Images, normalized to 896 x 896 resolution and encoded to 256 tokens each
  - Total input context of 2K tokens
- Output:
  - Text translated into the target language
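As a rough illustration of the input budget (assuming the "2K" context means 2048 tokens, which is an assumption, not a documented constant), each image consumes 256 of those tokens, leaving the remainder for text:

```python
CONTEXT_TOKENS = 2048  # assumed value of the "2K" input context
IMAGE_TOKENS = 256     # tokens consumed per encoded image

def remaining_text_budget(num_images: int) -> int:
    """Tokens left for text after reserving space for images (illustrative only)."""
    budget = CONTEXT_TOKENS - num_images * IMAGE_TOKENS
    if budget < 0:
        raise ValueError("images alone exceed the input context")
    return budget
```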
Usage
TranslateGemma is designed to work with a specific chat template that supports direct translation of a text input, or text-extraction-and-translation from an image input. This chat template has been implemented with Hugging Face transformers' chat templating system and is compatible with the apply_chat_template() function provided by the Gemma tokenizer and Gemma 3 processor. Notable differences from other models' chat templates include:
- TranslateGemma supports only the User and Assistant roles.
- TranslateGemma's User role is highly opinionated:
  - The content property must be provided as a list with exactly one entry.
  - The content list entry must provide:
    - A "type" property whose value must be either "text" or "image".
    - A "source_lang_code" property as a string.
    - A "target_lang_code" property as a string.
  - The content list entry must also provide exactly one of:
    - A "url" property, if the entry's type is "image", from which the image will be loaded.
    - A "text" property, if the entry's type is "text", containing only the text to translate.
  - The "source_lang_code" and "target_lang_code" property values can take one of two forms:
    - An ISO 639-1 Alpha-2 language code, e.g., en; or
    - A "regionalized" variant: an ISO 639-1 Alpha-2 language code and an ISO 3166-1 Alpha-2 country code pair separated by a dash or an underscore, e.g., en_US or en-GB, similar to the Unicode Common Locale Data Repository format.
  - If a "source_lang_code" or "target_lang_code" value is not supported by the model, an error will be raised when the template is applied.
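The two accepted code forms can be normalized with a small helper. This is a hypothetical sketch (normalize_lang_code is not part of TranslateGemma's API), and it only checks the shape of the code, not whether the model actually supports that language:

```python
import re

def normalize_lang_code(code: str) -> str:
    """Normalize a language code to the dash-separated, CLDR-like form.
    Accepts 'en', 'en_US', or 'en-GB' style codes. Hypothetical helper;
    does not validate against the model's supported-language list."""
    m = re.fullmatch(r"([A-Za-z]{2})(?:[-_]([A-Za-z]{2}))?", code)
    if m is None:
        raise ValueError(f"unrecognized language code: {code!r}")
    lang, region = m.groups()
    return lang.lower() if region is None else f"{lang.lower()}-{region.upper()}"
```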
Additionally, TranslateGemma may respond well to other prompting techniques to support use cases that go beyond the provided chat template, such as Automatic Translation Post-Editing. As these are not officially supported, prompts for them should be crafted manually using the special control tokens and structures specified in the Gemma 3 Technical Report, and sent directly to the tokenizer or processor instead of through the apply_chat_template() function. The TranslateGemma team is interested in hearing about your experiences with alternate prompts; please reach out with any questions and feedback.
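The User-role rules above can be captured in a small convenience function. make_translation_message is a hypothetical helper for illustration, not part of the official API:

```python
def make_translation_message(source_lang: str, target_lang: str,
                             text: str = None, image_url: str = None) -> dict:
    """Build a TranslateGemma user message following the template rules above.
    Exactly one of text or image_url must be given. (Hypothetical helper.)"""
    if (text is None) == (image_url is None):
        raise ValueError("provide exactly one of text or image_url")
    entry = {"source_lang_code": source_lang, "target_lang_code": target_lang}
    if text is not None:
        entry["type"] = "text"
        entry["text"] = text
    else:
        entry["type"] = "image"
        entry["url"] = image_url
    # The content property must be a list with exactly one entry.
    return {"role": "user", "content": [entry]}
```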
With Pipelines
from transformers import pipeline
import torch
pipe = pipeline(
    "image-text-to-text",
    model="google/translategemma-27b-it",
    device="cuda",
    dtype=torch.bfloat16,
)
# ---- Text Translation ----
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "source_lang_code": "cs",
                "target_lang_code": "de-DE",
                "text": "V nejhorším případě i k prasknutí čočky.",
            }
        ],
    }
]
output = pipe(text=messages, max_new_tokens=200)
print(output[0]["generated_text"][-1]["content"])
# ---- Text Extraction and Translation ----
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "source_lang_code": "cs",
                "target_lang_code": "de-DE",
                "url": "https://c7.alamy.com/comp/2YAX36N/traffic-signs-in-czech-republic-pedestrian-zone-2YAX36N.jpg",
            },
        ],
    }
]
output = pipe(text=messages, max_new_tokens=200)
print(output[0]["generated_text"][-1]["content"])
With direct initialization
import torch
from transformers import AutoModelForImageTextToText, AutoProcessor
model_id = "google/translategemma-27b-it"
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(model_id, device_map="auto", dtype=torch.bfloat16)
# ---- Text Translation ----
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "source_lang_code": "cs",
                "target_lang_code": "de-DE",
                "text": "V nejhorším případě i k prasknutí čočky.",
            }
        ],
    }
]
inputs = processor.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_dict=True, return_tensors="pt"
).to(model.device, dtype=torch.bfloat16)

input_len = len(inputs["input_ids"][0])

with torch.inference_mode():
    generation = model.generate(**inputs, do_sample=False)
    generation = generation[0][input_len:]

decoded = processor.decode(generation, skip_special_tokens=True)
print(decoded)
# ---- Text Extraction and Translation ----
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "source_lang_code": "cs",
                "target_lang_code": "de-DE",
                "url": "https://c7.alamy.com/comp/2YAX36N/traffic-signs-in-czech-republic-pedestrian-zone-2YAX36N.jpg",
            },
        ],
    }
]
inputs = processor.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_dict=True, return_tensors="pt"
).to(model.device, dtype=torch.bfloat16)

input_len = len(inputs["input_ids"][0])  # recompute the prompt length for this input

with torch.inference_mode():
    generation = model.generate(**inputs, do_sample=False)
    generation = generation[0][input_len:]

decoded = processor.decode(generation, skip_special_tokens=True)
print(decoded)
Run with Ollama
ollama run translategemma