Tuan Tran | 9f6ade6ee4 | Make a public-facing watermarked vocoder (PretsselVocoder) (#97) | 2 years ago
Anna Sun | 2ccf28ad24 | [streaming] Port changes for streaming demo (#130) | 2 years ago
Yilin Yang | 00118c21cc | Enabling 24khz vocoder for demo/OSS (#132) | 2 years ago
Kaushik Ram Sadagopan | 7537081d50 | Return durations from the variance adaptor. (#134) | 2 years ago
Abinesh Ramakrishnan | b1027e9858 | Silero VAD Agent (#120) | 2 years ago
Pierre Andrews | 87e10d101b | mintox - Add option to consume pretranscribed text + log mintox for cloudwatch (#131) | 2 years ago
Kaushik Ram Sadagopan | 5a2d61655f | Make unit_extractor configurable by dtype. (#128) | 2 years ago
Abinesh Ramakrishnan | bc88690d56 | Ability to change tgt_lang dynamically during streaming inference. (#121) | 2 years ago
Kaushik Ram Sadagopan | 5dd9722b8d | Introduce MMASpeechToTextDecoderAgent and related agents for online_text_decoder. (#113) | 2 years ago
Ruslan Mavlyutov | c9f611a0b2 | Fix inconsistency between model vocab info and associated tokenizers (inherit directly from the tokenizers) (#126) | 2 years ago
Yilin Yang | b9f101b2b7 | Loosen the PretsselModel test check by allowing one unit difference between runs (#127) | 2 years ago
Kaushik Ram Sadagopan | 239a9440a9 | Offline w2v-bert encoder agent with parity. (#110) | 2 years ago
Kaushik Ram Sadagopan | 521a374213 | Online feature extractor SimulEval agent. (#107) | 2 years ago
Ruslan Mavlyutov | ca1ebf90ea | Training recipes for M4T-nano/-micro; adjustments to fairseq2/sc updates; fixing MyPy warnings (#59) | 2 years ago
Can Balioglu | 00b066c6f8 | Fix breaking fairseq2 API changes (#119) | 2 years ago
Yilin Yang | 0cc7bc610a | Add integrated test for ProsodyUnitY/seamless_expressivity model (#99) | 2 years ago
Can Balioglu | 0bdc7b60ac | Revise, clean up MinTox implementation. Part 1 (#96) | 2 years ago
Can Balioglu | 2393016090 | Move to generic loaders (#111) | 2 years ago
Kaushik Ram Sadagopan | 5198e0586c | Add seamless_streaming assets. (#106) | 2 years ago
Yilin Yang | 1a91d39931 | Rename gcmvn_fbank to prosody_encoder_input (#105) | 2 years ago
Kaushik Ram Sadagopan | e3c40244e1 | Fix bug in eval script for S2ST, T2ST tasks. (#102) | 2 years ago
Yilin Yang | ed18e69190 | Implement PretsselModel & its inference (#89) | 2 years ago
Can Balioglu | 29935570cd | Start using fairseq2's NGramRepeatBlockProcessor (#100) | 2 years ago
Can Balioglu | a618cd43f0 | Update Shaw attention init (#101) | 2 years ago
Kaushik Ram Sadagopan | 4e93254fa5 | Skip loading text_encoder for S2X tasks, skip loading T2U model for X2T tasks. (#95) | 2 years ago
Kaushik Ram Sadagopan | fcaf953981 | Clean up M4T v2 and vocoder v2 checkpoints. (#94) | 2 years ago
Kaushik Ram Sadagopan | d6ff19b20b | Introduce monotonic_decoder. (#73) | 2 years ago
Changhan Wang | 1cead95b37 | Add PRETSSEL HiFiGAN Vocoder (#78) | 2 years ago
Kaushik Ram Sadagopan | a11376477b | Add use_text_decoder parameter to UnitYConfig. (#88) | 2 years ago
Can Balioglu | 05419775be | Introduce Prosody encoder (#87) | 2 years ago