The Intento 2021 State of Machine Translation Report – Your Cheatsheet to the MT Landscape

Intento, the leading AI integration platform, has released its annual State of Machine Translation report, giving those working in and around the MT landscape an in-depth analysis of the current vendors and the best strategies for leveraging their offerings. The report was produced in collaboration with TAUS, the central authority in language data, which offers the largest industry-shared repository of data, deep know-how in language engineering, and a network of Human Language Project workers around the globe.

The 2021 edition delivers everything you need to know to choose the best-fit MT engines for your language pair and industry sector.

It provides:

  • The performance of different MT engines across 7 industries (Education, Finance, Healthcare, Hospitality, Legal, Entertainment, and General) and 13 key language pairs.
  • The latest data on 24 commercial MT engines (Alibaba eCommerce and General, Amazon, Apptek, Baidu, DeepL, Elia, Globalese, Google, GTCom, IBM Watson, Microsoft, ModernMT, Naver, Kawamura / NICT, Pangeanic, PROMT, Rozetta, Systran, Tilde, Tencent, Yandex, Youdao, and XL8)
  • Along with 5 open-source pre-trained models (M2M-100-1.2B, M2M-100-418M, mBART50-EN2M, mBART50-M2M, and OpusMT)
  • The principal scores to rely on when studying MT outcomes, such as similarity scores (COMET, BERTScore, PRISM, TER, and hLEPOR)
  • A thorough comparison of scores: hLEPOR, BERTScore, PRISM, and COMET.
  • Coverage of language support, which jumped from 16k to 100k language pairs in 2021
  • Price comparisons

This year’s report is chock-full of novel insights and will consist of two parts. First, ‘Automatic semantic similarity scoring’ demonstrates various changes to the MT landscape over the past year, including information on all new players on the market.

The second part will provide a deep-dive linguistic analysis for 3 language pairs (EN → ES, EN → IT, EN → NL). Essential takeaways from this breakdown include:

  • The comparison of texts between 5 industry sectors: Education, Financial, Healthcare, Legal, Travel (ES).
  • Key conclusions on how automatic metrics relate to human estimation of translations.
  • Recommendations on the best-fit MT engines for analyzed language pairs and industry sectors.
  • Insights on how to enhance the power of all MT engines available on the market.

Intento is trusted by global companies to help select, deploy, and improve the best-fit machine translation and other AI services, including sentiment analysis, voice synthesis, image tagging, and optical character recognition. The report aims to provide an expert view of the constantly changing MT landscape and to save internationally facing businesses both human and financial capital. A deep understanding of the MT landscape benefits your company whatever your experience with machine translation, as the report offers significant insights for implementing AI and machine translation across departments to boost productivity and growth.

“Working with MT is like living on an erupting volcano. We had 16,000 language pairs available from 34 MT providers just a year ago, and today it’s about 100,000 from 46. We don’t have datasets to evaluate them all, but by working with TAUS we get a look into 13 language pairs and 7 domains,” says Konstantin Savenkov, Intento CEO. “The level of quality we see from stock models in 2021 is unprecedented. However, real-world business applications demand even more, and simply knowing the best stock model is not enough to succeed with MT. Make sure you have domain adaptation, glossaries, tone of voice control, and other tools on your belt.”

Savenkov continues, “This year, together with TAUS, we had a particular focus on using high-quality domain-specific data. It took more time to prepare, but the results should be relevant to a wider audience and applicable to more use cases than before. One key highlight we see from this year is the emergence of new semantic similarity metrics, such as COMET.”

“The availability of high-quality, domain-specific language data powering MT models has become ever more significant as AI-enabled automatic translation becomes more and more common. We are pleased to have offered test datasets in 13 language pairs and 7 domains to Intento to be used in their State of the MT 2021 Report. We believe the findings will shed light on many use cases, providing guidance on which MT engines are best suited to users’ given requirements and, above all, demonstrating the value of high-quality, domain-specific data in increasing the quality of the final output,” says Jaap van der Meer, TAUS Director.

Copyright © 2024 MarTech Series. All Rights Reserved.