# Instructions to run inference with SeamlessM4T models

SeamlessM4T models currently support five tasks:
- Speech-to-speech translation (S2ST)
- Speech-to-text translation (S2TT)
- Text-to-speech translation (T2ST)
- Text-to-text translation (T2TT)
- Automatic speech recognition (ASR)

Inference requires a `Translator` object instantiated with a multitask UnitY model, one of:
- `multitask_unity_large`
- `multitask_unity_medium`

and the vocoder `vocoder_36langs`.

```python
import torch
import torchaudio
from seamless_communication.models.inference import Translator

# Initialize a Translator object with a multitask model, vocoder on the GPU.
translator = Translator("multitask_unity_large", "vocoder_36langs", torch.device("cuda:0"))
```

Once the `Translator` is created, `predict()` can be called as many times as needed on any of the supported tasks.

Given an input audio file at `<path_to_input_audio>` or an input text `<input_text>` in `<src_lang>`, we can translate into `<tgt_lang>` as follows:

## S2ST and T2ST:

```python
# S2ST
translated_text, wav, sr = translator.predict(<path_to_input_audio>, "s2st", <tgt_lang>)

# T2ST
translated_text, wav, sr = translator.predict(<input_text>, "t2st", <tgt_lang>, src_lang=<src_lang>)
```
Note that `<src_lang>` must be specified for T2ST.

The generated units are synthesized and the output audio file is saved with:

```python
wav, sr = translator.synthesize_speech(<speech_units>, <tgt_lang>)

# Save the translated audio generation.
torchaudio.save(
    <path_to_save_audio>,
    wav[0].cpu(),
    sample_rate=sr,
)
```

## S2TT, T2TT and ASR:

```python
# S2TT
translated_text, _, _ = translator.predict(<path_to_input_audio>, "s2tt", <tgt_lang>)

# ASR
# This is equivalent to S2TT with `<tgt_lang>=<src_lang>`.
transcribed_text, _, _ = translator.predict(<path_to_input_audio>, "asr", <src_lang>)

# T2TT
translated_text, _, _ = translator.predict(<input_text>, "t2tt", <tgt_lang>, src_lang=<src_lang>)
```
Note that `<src_lang>` must be specified for T2TT.

# Inference using the CLI, from the root directory of the repository:

The model can be specified with, e.g., `--model_name multitask_unity_large`:

S2ST:
```
python scripts/m4t/predict/predict.py <path_to_input_audio> s2st <tgt_lang> --output_path <path_to_save_audio> --model_name multitask_unity_large
```

S2TT:
```
python scripts/m4t/predict/predict.py <path_to_input_audio> s2tt <tgt_lang>
```

T2TT:
```
python scripts/m4t/predict/predict.py <input_text> t2tt <tgt_lang> --src_lang <src_lang>
```

T2ST:
```
python scripts/m4t/predict/predict.py <input_text> t2st <tgt_lang> --src_lang <src_lang> --output_path <path_to_save_audio>
```

ASR:
```
python scripts/m4t/predict/predict.py <path_to_input_audio> asr <tgt_lang>
```
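
The per-task rules from the Python API section (text-input tasks need `src_lang`, speech-input tasks do not) can be gathered into one small wrapper around `predict()`. The sketch below is illustrative only: `run_task` and its parameter names are hypothetical and not part of `seamless_communication`. It assumes `predict()` takes the input, the task name, and the target language positionally, plus a `src_lang` keyword, as in the calls shown earlier.

```python
from typing import Any, Optional, Tuple

# Task names as passed to `Translator.predict()` in this README.
ALL_TASKS = {"s2st", "s2tt", "t2st", "t2tt", "asr"}
# Tasks whose input is a text string; these require src_lang.
TEXT_INPUT_TASKS = {"t2st", "t2tt"}


def run_task(
    translator: Any,
    task: str,
    model_input: str,
    tgt_lang: str,
    src_lang: Optional[str] = None,
) -> Tuple[Any, Any, Any]:
    """Hypothetical helper: validate the arguments, then call translator.predict()."""
    task = task.lower()
    if task not in ALL_TASKS:
        raise ValueError(f"Unsupported task {task!r}; expected one of {sorted(ALL_TASKS)}")
    if task in TEXT_INPUT_TASKS:
        # T2ST and T2TT: the source language cannot be inferred, so it is mandatory.
        if src_lang is None:
            raise ValueError(f"src_lang must be specified for {task.upper()}")
        return translator.predict(model_input, task, tgt_lang, src_lang=src_lang)
    # S2ST, S2TT and ASR: the input is an audio path and no src_lang is passed.
    return translator.predict(model_input, task, tgt_lang)
```

With a `Translator` constructed as shown earlier, a call such as `run_task(translator, "t2tt", "Hello", "fra", src_lang="eng")` would forward to `predict()`; the language codes here are only illustrative. Failing early on a missing `src_lang` keeps the error next to the caller's mistake instead of surfacing it inside the model code.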