
Minor README.md update (#198)

* Minor README.md update

Adding a line about the vocoders for SeamlessStreaming and Seamless models

* revise

* Update README.md

* revise

* revise

* Update README.md

Co-authored-by: Kaushik Ram Sadagopan <krs@fb.com>

* punctuation and grammar

---------

Co-authored-by: Yilin Yang <yilinyang721@gmail.com>
Co-authored-by: Yilin Yang <12211426+yilinyang7@users.noreply.github.com>
Co-authored-by: Kaushik Ram Sadagopan <krs@fb.com>
Abinesh Ramakrishnan 1 year ago
parent
commit
f3f0931b79
2 changed files with 6 additions and 1 deletions
  1. 3 0
      README.md
  2. 3 1
      src/seamless_communication/cli/streaming/README.md

+ 3 - 0
README.md

@@ -134,6 +134,9 @@ Please note that SeamlessExpressive is made available under its own [License]()
 | ------------------ | ------- | --------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------ |
 | SeamlessStreaming  | 2.5B    | [🤗 Model card](https://huggingface.co/facebook/SeamlessStreaming) - [monotonic decoder checkpoint](https://huggingface.co/facebook/SeamlessStreaming/resolve/main/seamless_streaming_monotonic_decoder.pt) - [streaming UnitY2 checkpoint](https://huggingface.co/facebook/SeamlessStreaming/resolve/main/seamless_streaming_unity.pt)  | [metrics](https://dl.fbaipublicfiles.com/seamless/metrics/streaming/seamless_streaming.zip)  |
 
+### Seamless models
+The Seamless model is simply the SeamlessStreaming model with the non-expressive `vocoder_v2` swapped out for the expressive `vocoder_pretssel`.
+Please check out the [section above](#seamlessexpressive-models) on how to acquire the `vocoder_pretssel` checkpoint.
 
 ## Evaluation
 

+ 3 - 1
src/seamless_communication/cli/streaming/README.md

@@ -42,4 +42,6 @@ The Seamless model is an unified model for streaming expressive speech-to-speech
 streaming_evaluate --task s2st --data-file <path_to_data_tsv_file> --audio-root-dir <path_to_audio_root_directory> --output <path_to_evaluation_output_directory> --tgt-lang <3_letter_lang_code> --expressive
 ```
 
-Note: In the current version of our paper, we use vocoder_pretssel_16khz for the evaluation , so in order to reproduce those results please add this arg to the above command: `--vocoder-name vocoder_pretssel_16khz`
+The Seamless model uses the 24 kHz vocoder (`vocoder_pretssel`) by default. In the current version of our paper, we use the 16 kHz version (`vocoder_pretssel_16khz`) for the evaluation, so to reproduce those results, please add this argument to the above command: `--vocoder-name vocoder_pretssel_16khz`.
+
+To acquire the `vocoder_pretssel` or `vocoder_pretssel_16khz` checkpoints, please check out [this section](../../README.md#seamlessexpressive-models).
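
For reference, the evaluation command from the streaming README combined with the 16 kHz vocoder flag might be assembled as in the sketch below. The helper name `build_eval_cmd`, the paths, and the `spa` language code are placeholders introduced here for illustration; only the `streaming_evaluate` flags come from the README text in this diff. The helper prints the command rather than running it, so the arguments can be checked first.

```shell
# Hypothetical helper (not part of the repository): prints the evaluation
# command with the 16 kHz vocoder selected, instead of executing it.
build_eval_cmd() {
    data_file=$1    # path to the data TSV file
    audio_root=$2   # path to the audio root directory
    output_dir=$3   # path to the evaluation output directory
    tgt_lang=$4     # 3-letter target language code

    # printf reuses the '%s ' format for every argument, so the whole
    # command is emitted on one line, separated by spaces.
    printf '%s ' streaming_evaluate --task s2st \
        --data-file "$data_file" \
        --audio-root-dir "$audio_root" \
        --output "$output_dir" \
        --tgt-lang "$tgt_lang" \
        --expressive \
        --vocoder-name vocoder_pretssel_16khz
    printf '\n'
}

# Placeholder arguments; replace them with real paths and a language code.
build_eval_cmd data.tsv audio_root eval_output spa
```

Dropping the final `--vocoder-name` argument would leave the default 24 kHz `vocoder_pretssel` in effect, as the README text describes.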