Train Custom MT Models

In this article:

  1. How does MT Customization work?

  2. Available providers to train

  3. Create a project

  4. How to make sure your data is correct

  5. Training process

Currently, MT Studio supports only Google Chrome and Mozilla Firefox browsers.

How Does MT Customization Work?

When you train a custom MT model, you customize a provider's existing MT model to translate specific content in a specific way.

To customize a model, you give it parallel text segments: source texts paired with their translations into the target language. The model processes the segments and learns to translate words and phrases the way you want.
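To make "parallel text segments" concrete, here is a minimal sketch that writes a few source/target pairs as two-column TSV text. The example sentences and the `segments_to_tsv` helper are hypothetical and not part of MT Studio. The sketch writes no header row; whether MT Studio expects one is not stated here.

```python
import csv
import io

# Hypothetical example segments: (source, target) pairs, English -> German.
segments = [
    ("Open the settings menu.", "Öffnen Sie das Einstellungsmenü."),
    ("Save your changes.", "Speichern Sie Ihre Änderungen."),
]

def segments_to_tsv(pairs):
    """Serialize (source, target) pairs as two-column, tab-separated text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    for source, target in pairs:
        writer.writerow([source, target])
    return buffer.getvalue()

tsv_text = segments_to_tsv(segments)
```

Each line of `tsv_text` holds one segment: the source text, a tab, and its translation.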

Compared to stock MT models, custom models typically produce translations that are closer to your reference translations.

With Intento MT Studio projects, you can train multiple models from MT providers and evaluate them in one simple interface.

Available Providers to Train

To train models, you need to have a direct contract with the MT provider.

To train models through Intento, you need a connected account for each provider whose models you want to customize.

With Intento MT Studio, you can train custom models from the following providers:

Provider: Google AutoML
Connected Accounts: To connect a Google account, follow the instructions here.
Training Costs: The price of training a Google model depends on the amount of training data. The minimum is approximately $100, and the maximum per training run is capped at $300.
Minimum Number of Segments for Training: 1,000
Training Time: 3-6 hours

Provider: ModernMT
Connected Accounts: To connect a ModernMT account, follow the instructions here.
Training Costs: Free
Minimum Number of Segments for Training: No limitations
Training Time: Seconds
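As a quick pre-flight check, the minimums listed above can be compared against the size of your translation memory before you create a project. The sketch below is illustrative only: `MIN_SEGMENTS` and `providers_ready` are hypothetical names, not part of MT Studio, and the values are copied from the provider list above.

```python
# Minimum segment counts taken from the provider list above.
# None means the provider states no minimum.
MIN_SEGMENTS = {
    "Google AutoML": 1000,
    "ModernMT": None,
}

def providers_ready(segment_count, minimums=MIN_SEGMENTS):
    """Map each provider to True if the dataset meets its stated minimum."""
    return {
        name: minimum is None or segment_count >= minimum
        for name, minimum in minimums.items()
    }
```

For example, `providers_ready(500)` reports that a 500-segment dataset is large enough for ModernMT but not for Google AutoML.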

Create a Project

To create a project:

  1. Visit Intento MT Studio.

  2. Select Create project → Train & Evaluate models:

    1-Train & Evaluate models.png
  3. Enter a name for your new project and the source and target languages:

    2-source and target languages.png
  4. Upload files with parallel data (translation memory) and process them:

    3-parallel data.png
  5. Choose a provider in the Provider field.

  6. Choose an account in the Connected account field.

    If you're having trouble with accounts, contact your organization's admin or Intento support.

  7. To train a model from another provider, select Add provider and choose another provider and connected account.

  8. Select Create project.

How to Make Sure Your Data Is Correct

MT Studio works with TMX, CSV, and TSV files. To make sure your data is correct:

  • CSV and TSV files must contain exactly two columns, source and target, and no other columns.

  • TMX files must contain exactly two languages:

    • The first will be the source language.

    • The second will be the target language.
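The rules above can be checked locally before you upload a file. The following is a minimal sketch, not MT Studio's own validation: `check_delimited` and `tmx_languages` are hypothetical helpers, and the TMX check assumes language codes are stored on `tuv` elements in an `xml:lang` (or older `lang`) attribute and are spelled consistently throughout the file.

```python
import csv
import io
import xml.etree.ElementTree as ET

# ElementTree expands the xml:lang attribute to this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def check_delimited(text, delimiter=","):
    """Return problems found in CSV/TSV text; an empty list means none."""
    problems = []
    rows = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    for number, row in enumerate(rows, start=1):
        if len(row) != 2:
            problems.append(f"row {number}: expected 2 columns, found {len(row)}")
    return problems

def tmx_languages(tmx_content):
    """Return language codes in a TMX document, in order of first appearance."""
    languages = []
    for tuv in ET.fromstring(tmx_content).iter("tuv"):
        # TMX 1.4 uses xml:lang; older files may use a plain lang attribute.
        lang = tuv.get(XML_LANG) or tuv.get("lang")
        if lang and lang not in languages:
            languages.append(lang)
    return languages
```

A file passes these checks when `check_delimited` returns an empty list (pass `delimiter="\t"` for TSV) or when `tmx_languages` returns exactly two codes. To check a file on disk, read its contents first, for example with `pathlib.Path(...).read_text(encoding="utf-8")` for CSV/TSV or `read_bytes()` for TMX.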

Training Process

After you start training, MT Studio shows the status of your models: 

  • Completed, if the model is ready.

  • In progress, if it's still training. You can cancel training while it's in progress.

  • Error, if training has failed. A redo arrow appears with the status; select the arrow to start training again.

4-Error.png

Once your models are trained, you can use them to translate content for in-depth evaluation.