# unity.cpp ## Introduction [GGML](https://github.com/ggerganov/ggml) is an open source library in C to enable large model inference on various hardware platforms. We implemented unity.cpp in ggml. Now it supports SeamlessM4T model for X2T tasks - Speech-to-text translation (S2TT), Acoustic speech recognition (ASR), Text-to-text translation (T2TT). The project is still active in development. Contributions are welcome! ## Build To build the interactive console for S2TT & ASR & T2TT, ``` cd seamless_communication/ggml mkdir build; cd build cmake -DGGML_OPENBLAS=ON \ -DBUILD_SHARED_LIBS=On \ -DCMAKE_BUILD_TYPE=Release \ -DCMAKE_CXX_FLAGS="-g2 -fno-omit-frame-pointer" \ .. make -j4 unity # Interactive Console ``` For more build commands see [Makefile](Makefile). ## CLI usage ### S2TT Command to launch an interactive console for S2TT & ASR, note that the model already includes vocabulary needed to detokenize. ``` OPENBLAS_NUM_THREADS=8 ./bin/unity --model seamlessM4T_medium.ggml ``` In the console, enter "wav_file tgt_lang" - the path of local waveform file and target language, separated by space. Note that the first run would include some “warm up” time so could be slow. ### T2TT Launching command: ``` OPENBLAS_NUM_THREADS=8 ./bin/unity --model nllb-200_dense_1b.ggml --text ``` In the console, enter "input_text tgt_lang" - input text and target langauge, separated by space. Note that the language code should align with [NLLB BCP-47 code](https://github.com/facebookresearch/flores/blob/main/flores200/README.md#languages-in-flores-200), NOT 3-letter language code as S2TT task with Seamless. Unifying this is on todo list. ### Model downloads Converted ggml models could be downloaded from |SeamlessM4T_large | SeamlessM4T_medium | NLLB_dense_1b | NLLB_distill_600m | |-------- | -------- | ------- | ------- | | [model](dl.fbaipublicfiles.com/seamless/models/seamlessM4T_large.ggml) | [model](dl.fbaipublicfiles.com/seamless/models/seamlessM4T_medium.ggml) | [model](dl.fbaipublicfiles.com/seamless/models/nllb-200_dense_1b.ggml) | [model](dl.fbaipublicfiles.com/seamless/models/nllb-200_dense_distill_600m.ggml) For more details of NLLB models, please check https://github.com/facebookresearch/fairseq/tree/nllb. ## Fairseq2 model conversion Models from fairseq2 checkpoints could be converted to ggml automatically with [ggml_convert.py](ggml_convert.py). ``` python ggml_convert.py -m MODEL_NAME ``` where MODEL_NAME corresponds to asset cards in fairseq2 / seamless_communication, e.g. seamlessM4T_medium, seamlessM4T_large ## Python bindings We also utilize ggml python bindings for better dev experience. For examples of running unity.cpp in python, refer to tests in [test_unity_cpp.py](test_unity_cpp.py). ## [Optional]Dependencies ### OpenBLAS We strongly suggest building with OpenBLAS, as we've seen 8x speedup on test machine. ### libsndfile This is needed only for the console to load waveform, but not the library.