Google/translategemma-27b-it
Lightweight open translation model from Google (based on Gemma 3) supporting 55 languages. Served via the vLLM-optimized Infomaniak-AI checkpoint.
Overview
TranslateGemma is a family of lightweight, state-of-the-art open translation models from Google, based on the Gemma 3 family. TranslateGemma models handle translation across 55 languages and are small enough to deploy on laptops, desktops, and modest cloud GPU environments.
Original Models
Optimized vLLM Models
Why use the vLLM-optimized models?
The original Google models have compatibility issues with standard inference engines like vLLM. The optimized versions from Infomaniak-AI (detailed changes):
- vLLM Compatibility: Originals require custom JSON parameters (source_lang_code/target_lang_code). The optimized version uses string delimiters.
- RoPE Simplification: Originals use a complex RoPE configuration for sliding attention. Optimized uses a standard linear RoPE format (factor: 8.0).
- EOS Token Fix: Corrects the EOS token from <end_of_turn> to <eos>.
Prerequisites
Docker
docker pull vllm/vllm-openai:v0.14.1-cu130
Deployment Configurations
Verified for both 4B and 27B:
docker run -itd --name google-translategemma-27b-it \
--ipc=host \
--network host \
--shm-size 16G \
--gpus all \
-v ~/.cache/huggingface:/root/.cache/huggingface \
vllm/vllm-openai:v0.14.1-cu130 \
Infomaniak-AI/vllm-translategemma-27b-it \
--served-model-name translategemma-27b-it \
--gpu-memory-utilization 0.8 \
--host 0.0.0.0 \
--port 8000
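The container starts detached, and loading the model weights can take a while, so the API may not answer immediately. A minimal readiness check, assuming the server is reachable at http://localhost:8000 and exposes the OpenAI-compatible /v1/models endpoint; the function names, timeout, and polling interval are illustrative choices, not part of the original guide:

```python
import time
import urllib.error
import urllib.request


def http_probe(url: str) -> bool:
    """Return True if a GET request to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error: not ready yet.
        return False


def wait_until_ready(base_url="http://localhost:8000", timeout_s=600.0,
                     interval_s=5.0, probe=http_probe, sleep=time.sleep):
    """Poll `{base_url}/v1/models` until it responds or the timeout expires.

    `probe` and `sleep` are injectable so the loop can be exercised
    without a running server (an assumption made for testability).
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if probe(f"{base_url}/v1/models"):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

Calling wait_until_ready() before sending the first translation request avoids connection errors during model load.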
Client Usage
Tips:
- Prompt Delimiters: Encode language metadata directly in the content string: <<<source>>>{src_lang}<<<target>>>{tgt_lang}<<<text>>>{text}
- Language Codes: ISO 639-1 Alpha-2 (e.g. en, zh) and regional variants (e.g. en_US, zh_CN).
- Context Limit: ~2K tokens.
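The delimiter format from the tips above can be assembled with a small helper. This is a sketch: the build_prompt name and the language-code pattern are assumptions for illustration. The pattern only accepts the two shapes shown in the tips (such as en or zh_CN), so it may be stricter than what the model actually accepts.

```python
import re

# Assumed code shapes, taken from the examples in the tips:
# two lowercase letters, optionally followed by "_" and a two-letter region.
_LANG_CODE = re.compile(r"[a-z]{2}(_[A-Z]{2})?")


def build_prompt(src_lang: str, tgt_lang: str, text: str) -> str:
    """Encode source language, target language, and text into one content string."""
    for code in (src_lang, tgt_lang):
        if not _LANG_CODE.fullmatch(code):
            raise ValueError(f"unexpected language code: {code!r}")
    return f"<<<source>>>{src_lang}<<<target>>>{tgt_lang}<<<text>>>{text}"
```

For example, build_prompt("en", "zh_CN", "Hello") produces the string to place in the user message's content field.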
cURL
curl -X POST http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "translategemma-27b-it",
"messages": [{
"role": "user",
"content": "<<<source>>>en<<<target>>>zh<<<text>>>We distribute two models for language identification, which can recognize 176 languages."
}]
}'
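The same request can be sent from Python using only the standard library. This is a sketch that assumes the server from the deployment step is listening on http://localhost:8000 and returns the usual OpenAI-style chat completion response; build_request and translate are hypothetical helper names.

```python
import json
import urllib.request


def build_request(text, src, tgt, model="translategemma-27b-it"):
    """Build the chat-completions payload with the delimiter-encoded prompt."""
    content = f"<<<source>>>{src}<<<target>>>{tgt}<<<text>>>{text}"
    return {"model": model, "messages": [{"role": "user", "content": content}]}


def translate(text, src, tgt, base_url="http://localhost:8000"):
    """POST the payload and return the translated text from the first choice."""
    body = json.dumps(build_request(text, src, tgt)).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        reply = json.load(resp)
    # Assumes the OpenAI-compatible response shape: choices[0].message.content
    return reply["choices"][0]["message"]["content"]
```

With the container running, translate("Good morning.", "en", "zh") would return the model's translation as a string. Given the ~2K token context limit, longer documents likely need to be split into smaller pieces before calling it.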