Translation Accuracy Table

Comprehensive accuracy metrics for our advanced translation system (accuracy data based on English → target language testing)

  • SBERT (Sentence-BERT) Metrics
  • 80+ Languages
  • LaBSE (Language-agnostic BERT) Model

About SBERT & LaBSE

Our accuracy measurements use SBERT (Sentence-BERT) with the LaBSE (Language-agnostic BERT Sentence Embedding) model to evaluate translation quality.

What is LaBSE?

LaBSE is a multilingual sentence embedding model developed by Google Research that creates universal sentence representations across 100+ languages. It maps similar meanings to similar vectors in a shared multilingual space, making it ideal for cross-lingual semantic comparison.

How We Measure Accuracy

We encode both the source and translated sentences using LaBSE, then calculate the cosine similarity between them. This gives us a score between -1 and 1, where higher scores indicate better semantic similarity and translation quality.
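
In practice, this measurement can be computed with the sentence-transformers library. The snippet below is a minimal sketch of the comparison; the example sentences are illustrative and not part of the evaluation set:

    from sentence_transformers import SentenceTransformer, util

    # Load the LaBSE model from the sentence-transformers hub (downloads on first use).
    model = SentenceTransformer("sentence-transformers/LaBSE")

    source = "In my younger and more vulnerable years my father gave me some advice."
    translation = "En mis años más jóvenes y vulnerables, mi padre me dio un consejo."

    # Encode both sentences into the shared multilingual embedding space.
    embeddings = model.encode([source, translation], convert_to_tensor=True)

    # Cosine similarity lies in [-1, 1]; higher means the meanings are closer.
    score = util.cos_sim(embeddings[0], embeddings[1]).item()
    print(f"SBERT/LaBSE similarity: {score:.3f}")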

Experiment Setup

These accuracy scores were generated by our automated evaluation pipeline, which processes translation bindings across multiple language pairs and translation modes to provide comprehensive quality metrics. All accuracy data is based on English → target language testing.

Evaluation Script

The scores were calculated using our score_bindings.py script, which processes translation bindings and computes SBERT similarity scores using the LaBSE model.
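
The script itself is not reproduced here, but its core scoring loop can be sketched as follows. This sketch assumes each bindings file contains a list of source/translation pairs; the field names and the example file name are hypothetical, following the naming pattern described under Data Sources:

    import json

    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("sentence-transformers/LaBSE")

    def score_bindings(path):
        # Hypothetical format: a JSON list of {"source": ..., "translation": ...} objects.
        with open(path, encoding="utf-8") as f:
            bindings = json.load(f)

        sources = [b["source"] for b in bindings]
        translations = [b["translation"] for b in bindings]

        src_emb = model.encode(sources, convert_to_tensor=True)
        tgt_emb = model.encode(translations, convert_to_tensor=True)

        # Cosine similarity of each source sentence with its own translation
        # (the diagonal of the pairwise similarity matrix), averaged per file.
        scores = util.cos_sim(src_emb, tgt_emb).diagonal()
        return scores.mean().item()

    print(score_bindings("sbert-output/en_de_smart_standard_bindings.json"))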

Test Text

The accuracy evaluation was performed using a passage from F. Scott Fitzgerald's "The Great Gatsby" as the test text. This literary excerpt was chosen for its:

  • Complex sentence structures and varied vocabulary
  • Nuanced language and cultural references
  • Resemblance to the type of content users might translate
  • Sufficient length to provide meaningful evaluation data

"In my younger and more vulnerable years my father gave me some advice that I've been turning over in my mind ever since.

'Whenever you feel like criticizing anyone,' he told me, 'just remember that all the people in this world haven't had the advantages that you've had.'

He didn't say any more, but we've always been unusually communicative in a reserved way, and I understood that he meant a great deal more than that. In consequence, I'm inclined to reserve all judgements, a habit that has opened up many curious natures to me and also made me the victim of not a few veteran bores. The abnormal mind is quick to detect and attach itself to this quality when it appears in a normal person, and so it came about that in college I was unjustly accused of being a politician, because I was privy to the secret griefs of wild, unknown men. Most of the confidences were unsought—frequently I have feigned sleep, preoccupation, or a hostile levity when I realized by some unmistakable sign that an intimate revelation was quivering on the horizon; for the intimate revelations of young men, or at least the terms in which they express them, are usually plagiaristic and marred by obvious suppressions. Reserving judgements is a matter of infinite hope. I am still a little afraid of missing something if I forget that, as my father snobbishly suggested, and I snobbishly repeat, a sense of the fundamental decencies is parcelled out unequally at birth."

— Excerpt from "The Great Gatsby" by F. Scott Fitzgerald

Data Sources

The complete dataset is available for download.

Individual binding files for each language pair are also available. You can access them directly by URL: /sbert-output/en_[language]_[mode]_[precision]_bindings.json
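
As an illustration, the helper below builds those paths for every mode/precision combination. The lowercase mode and precision names and the example language code are assumptions; the actual file names follow the pattern above:

    from itertools import product

    MODES = ("simple", "smart", "sentence")      # assumed lowercase in file names
    PRECISIONS = ("standard", "precision")

    def binding_path(language, mode, precision):
        # Mirrors the documented pattern: /sbert-output/en_[language]_[mode]_[precision]_bindings.json
        return f"/sbert-output/en_{language}_{mode}_{precision}_bindings.json"

    for mode, precision in product(MODES, PRECISIONS):
        print(binding_path("de", mode, precision))
    # e.g. /sbert-output/en_de_simple_standard_bindings.json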

Coverage

The experiment covers 80+ language pairs (English → target languages), with each language evaluated across three translation modes (Simple, Smart, Sentence) and two precision settings (Standard, Precision), for more than 480 (80 × 3 × 2) scored configurations in total. This comprehensive testing ensures robust quality assessment across diverse linguistic contexts.

Score Interpretation

Score Range    | Quality Level | Use Case
0.90 - 1.00    | Excellent     | Professional translation, subtitle alignment
0.85 - 0.90    | Very Good     | Educational tools, academic applications
0.75 - 0.85    | Good          | General use, data mining
0.60 - 0.75    | Fair          | Requires review, may need improvement
< 0.60         | Poor          | Likely mistranslated or unrelated
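
If these bands need to be applied programmatically, the table translates directly into a small lookup helper (illustrative only; the thresholds simply mirror the table above):

    def quality_level(score: float) -> str:
        # Thresholds taken from the score interpretation table.
        if score >= 0.90:
            return "Excellent"
        if score >= 0.85:
            return "Very Good"
        if score >= 0.75:
            return "Good"
        if score >= 0.60:
            return "Fair"
        return "Poor"

    print(quality_level(0.87))  # -> Very Good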

Translation Modes

Simple Mode

Basic translation approach using standard machine translation techniques. Provides reliable results for most common language pairs.

Smart Mode

Enhanced translation with context awareness and advanced language understanding. Better handling of idiomatic expressions and cultural nuances.

Sentence Mode

Sentence-level translation optimization that considers the complete sentence structure. Ideal for maintaining grammatical coherence and natural flow.

Precision Mode

Uses certified professional translation services with enhanced quality control and additional processing steps. Higher computational cost, but delivers superior translation fidelity.

SBERT (Sentence-BERT) Accuracy Scores

Per-language scores are provided in the downloadable dataset files described under Data Sources above.

Technical Details

Research Background

LaBSE was introduced in the paper "Language-Agnostic BERT Sentence Embedding" by Feng et al. (2020). It has been widely adopted in the NLP community for cross-lingual semantic similarity tasks and translation quality assessment.

Evaluation Methodology

Our evaluation process involves encoding both source and target sentences using the LaBSE model, then computing cosine similarity between the resulting embeddings. This provides a quantitative measure of semantic preservation during translation.

Limitations

While SBERT scores provide valuable insights into translation quality, they should be considered alongside other metrics and human evaluation. The scores primarily measure semantic similarity rather than grammatical correctness or stylistic appropriateness.