  1. <!DOCTYPE html>
  2. <html>
  3. <head>
  4. <meta charset="UTF-8">
  5. <title>A Holistic Cascade System, Benchmark, and Human Evaluation Protocol for Expressive Speech-to-Speech Translation </title>
  6. <link rel="stylesheet" type="text/css" href="styles.css">
  7. <script src="jquery-3.5.js"></script>
  8. <script src="wavesurfer.js"></script>
  9. </head>
  10. <body>
  11. <div class="container">
  12. <div id="text1"> A Holistic Cascade System, Benchmark, and Human Evaluation Protocol <br> for Expressive Speech-to-Speech Translation </div>
  13. <div id="intro">
  14. <br>
  15. <p>
  16. Wen-Chin Huang<sup>1&#8224;&#8225;</sup>, Benjamin Peloquin<sup>2&#8225;</sup>, Justine Kao<sup>2</sup>, Changhan Wang<sup>2</sup> <br>
  17. Hongyu Gong<sup>2</sup>, Elizabeth Salesky<sup>3&#8224;</sup>, Yossi Adi<sup>2</sup>, Ann Lee<sup>2</sup>, Peng-Jen Chen<sup>2</sup> <br>
  18. </p>
  19. <p>
  20. <sup>1</sup>Nagoya University, <sup>2</sup>Meta AI, <sup>3</sup>Johns Hopkins University <br>
  21. <font size="-1">(&#8224; = Work done while interning at Meta AI; &#8225; = Equal contribution.)</font>
  22. </p>
  23. </div>
  24. </div>
  25. <div class="content-container">
  26. <p>
  27. We propose a holistic cascade system for expressive S2ST, combining multiple prosody transfer techniques previously considered only in isolation.
  28. We curate a benchmark expressivity test set in the TV series domain (Heroes) and explore a second dataset in the audiobook domain (Mined audiobook).
  29. Finally, we present a human evaluation protocol to assess multiple expressive dimensions across speech pairs.
  30. Experimental results indicate that bilingual annotators can assess the quality of expressive preservation in S2ST systems, and the holistic modeling approach outperforms single-aspect systems.
  31. </p>
  32. <p>
  33. On this page, we present synthesized examples from both the Heroes and Mined audiobook benchmark datasets, covering different expressive dimensions.
  34. </p>
  35. <h3> Demo </h3>
  36. <ul>
  37. <li><a style="color:rgb(90, 4, 83)" href="#mined_audiobook_benchmark">Mined audiobook benchmark</a></li>
  38. <ul>
  39. <li><a style="color:rgb(90, 4, 83)" href="#mined_audiobook_benchmark">Synthesize speech-to-text output</a></li>
  40. </ul>
  41. <li><a style="color:rgb(90, 4, 83)" href="#heores_benchmark">Heroes benchmark</a></li>
  42. <ul>
  43. <li><a style="color:rgb(90, 4, 83)" href="#heroes_s2t">Synthesize speech-to-text output</a></li>
  44. <li><a style="color:rgb(90, 4, 83)" href="#heroes_gt">Synthesize ground truth text</a></li>
  45. </ul>
  46. </ul>
  47. </div>
  48. <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/font-awesome/4.3.0/css/font-awesome.min.css"><div id="mined_audiobook_benchmark" class="content-container">
  49. <div class="content-title">
  50. <font size="+5">Results on the mined audiobook benchmark</font>
  51. </div>
  52. <table border="0" class="inlineTable">
  53. <tr>
  54. <th colspan="7">
  55. <font size="+2">Synthesize speech-to-text output</font>
  56. </th>
  57. </tr>
  58. <tr>
  59. <th></th>
  60. <th colspan="2">Ground Truth</th>
  61. <th colspan="2">Predictions</th>
  62. </tr>
  63. <tr>
  64. <th></th>
  65. <th>Source (Spanish)</th>
  66. <th>Target (English)</th>
  67. <th>Holistic Cascade (Global transfer + local transfer)</th>
  68. <th>Vanilla TTS</th>
  69. </tr>
  70. <tr>
  71. <th></th>
  72. <th>
  73. <div id="miserables_25_hugo_64kb_416396_421764_goldenmilestone_01_boreham_64kb_673394_679384_mined_src__waveform"></div>
  74. <button id="miserables_25_hugo_64kb_416396_421764_goldenmilestone_01_boreham_64kb_673394_679384_mined_src__button" class="play-button-demo btn btn-primary" onclick="miserables_25_hugo_64kb_416396_421764_goldenmilestone_01_boreham_64kb_673394_679384_mined_src.playPause()"><i class="fa fa-play"></i> Play / <i class="fa fa-pause"></i> Pause </button>
  75. <script> var miserables_25_hugo_64kb_416396_421764_goldenmilestone_01_boreham_64kb_673394_679384_mined_src = WaveSurfer.create({ container: '#miserables_25_hugo_64kb_416396_421764_goldenmilestone_01_boreham_64kb_673394_679384_mined_src__waveform', waveColor: 'violet', progressColor: 'purple' }); miserables_25_hugo_64kb_416396_421764_goldenmilestone_01_boreham_64kb_673394_679384_mined_src.load('./audio/reference/mined_audiobook/es/miserables_25_hugo_64kb_416396_421764-goldenmilestone_01_boreham_64kb_673394_679384.wav'); </script>
  76. </th>
  77. <th>
  78. <div id="miserables_25_hugo_64kb_416396_421764_goldenmilestone_01_boreham_64kb_673394_679384_mined_tgt__waveform"></div>
  79. <button id="miserables_25_hugo_64kb_416396_421764_goldenmilestone_01_boreham_64kb_673394_679384_mined_tgt__button" class="play-button-demo btn btn-primary" onclick="miserables_25_hugo_64kb_416396_421764_goldenmilestone_01_boreham_64kb_673394_679384_mined_tgt.playPause()"><i class="fa fa-play"></i> Play / <i class="fa fa-pause"></i> Pause </button>
  80. <script> var miserables_25_hugo_64kb_416396_421764_goldenmilestone_01_boreham_64kb_673394_679384_mined_tgt = WaveSurfer.create({ container: '#miserables_25_hugo_64kb_416396_421764_goldenmilestone_01_boreham_64kb_673394_679384_mined_tgt__waveform', waveColor: 'violet', progressColor: 'purple' }); miserables_25_hugo_64kb_416396_421764_goldenmilestone_01_boreham_64kb_673394_679384_mined_tgt.load('./audio/reference/mined_audiobook/en/miserables_25_hugo_64kb_416396_421764-goldenmilestone_01_boreham_64kb_673394_679384.wav'); </script>
  81. </th>
  82. <th>
  83. <div id="miserables_25_hugo_64kb_416396_421764_goldenmilestone_01_boreham_64kb_673394_679384_mined_s2t_gpdf__waveform"></div>
  84. <button id="miserables_25_hugo_64kb_416396_421764_goldenmilestone_01_boreham_64kb_673394_679384_mined_s2t_gpdf__button" class="play-button-demo btn btn-primary" onclick="miserables_25_hugo_64kb_416396_421764_goldenmilestone_01_boreham_64kb_673394_679384_mined_s2t_gpdf.playPause()"><i class="fa fa-play"></i> Play / <i class="fa fa-pause"></i> Pause </button>
  85. <script> var miserables_25_hugo_64kb_416396_421764_goldenmilestone_01_boreham_64kb_673394_679384_mined_s2t_gpdf = WaveSurfer.create({ container: '#miserables_25_hugo_64kb_416396_421764_goldenmilestone_01_boreham_64kb_673394_679384_mined_s2t_gpdf__waveform', waveColor: 'violet', progressColor: 'purple' }); miserables_25_hugo_64kb_416396_421764_goldenmilestone_01_boreham_64kb_673394_679384_mined_s2t_gpdf.load('./audio/S2T_text/mined_audiobook/G_P_D_F/miserables_25_hugo_64kb_416396_421764-goldenmilestone_01_boreham_64kb_673394_679384.wav'); </script>
  86. </th>
  87. <th>
  88. <div id="miserables_25_hugo_64kb_416396_421764_goldenmilestone_01_boreham_64kb_673394_679384_mined_s2t_nnnn__waveform"></div>
  89. <button id="miserables_25_hugo_64kb_416396_421764_goldenmilestone_01_boreham_64kb_673394_679384_mined_s2t_nnnn__button" class="play-button-demo btn btn-primary" onclick="miserables_25_hugo_64kb_416396_421764_goldenmilestone_01_boreham_64kb_673394_679384_mined_s2t_nnnn.playPause()"><i class="fa fa-play"></i> Play / <i class="fa fa-pause"></i> Pause </button>
  90. <script> var miserables_25_hugo_64kb_416396_421764_goldenmilestone_01_boreham_64kb_673394_679384_mined_s2t_nnnn = WaveSurfer.create({ container: '#miserables_25_hugo_64kb_416396_421764_goldenmilestone_01_boreham_64kb_673394_679384_mined_s2t_nnnn__waveform', waveColor: 'violet', progressColor: 'purple' }); miserables_25_hugo_64kb_416396_421764_goldenmilestone_01_boreham_64kb_673394_679384_mined_s2t_nnnn.load('./audio/S2T_text/mined_audiobook/N_N_N_N/miserables_25_hugo_64kb_416396_421764-goldenmilestone_01_boreham_64kb_673394_679384.wav'); </script>
  91. </th>
  92. </tr>
  93. <tr>
  94. <th>Input text:</th>
  95. <th>próxima a cometer una mala acción contemplando el sueño de un justo</th>
  96. <th>which is on the point of committing a bad action contemplating the sleep of a</th>
  97. <th>He is about to commit a bad action, contemplating the dream of a just man.</th>
  98. <th>He is about to commit a bad action, contemplating the dream of a just man.</th>
  99. </tr>
  100. <tr>
  101. <th></th>
  102. <th>
  103. <div id="cuentosdehadas03_14_grimm_64kb_165350_168986_chanticleer_15_unknown_64kb_28958_32068_mined_src__waveform"></div>
  104. <button id="cuentosdehadas03_14_grimm_64kb_165350_168986_chanticleer_15_unknown_64kb_28958_32068_mined_src__button" class="play-button-demo btn btn-primary" onclick="cuentosdehadas03_14_grimm_64kb_165350_168986_chanticleer_15_unknown_64kb_28958_32068_mined_src.playPause()"><i class="fa fa-play"></i> Play / <i class="fa fa-pause"></i> Pause </button>
  105. <script> var cuentosdehadas03_14_grimm_64kb_165350_168986_chanticleer_15_unknown_64kb_28958_32068_mined_src = WaveSurfer.create({ container: '#cuentosdehadas03_14_grimm_64kb_165350_168986_chanticleer_15_unknown_64kb_28958_32068_mined_src__waveform', waveColor: 'violet', progressColor: 'purple' }); cuentosdehadas03_14_grimm_64kb_165350_168986_chanticleer_15_unknown_64kb_28958_32068_mined_src.load('./audio/reference/mined_audiobook/es/cuentosdehadas03_14_grimm_64kb_165350_168986-chanticleer_15_unknown_64kb_28958_32068.wav'); </script>
  106. </th>
  107. <th>
  108. <div id="cuentosdehadas03_14_grimm_64kb_165350_168986_chanticleer_15_unknown_64kb_28958_32068_mined_tgt__waveform"></div>
  109. <button id="cuentosdehadas03_14_grimm_64kb_165350_168986_chanticleer_15_unknown_64kb_28958_32068_mined_tgt__button" class="play-button-demo btn btn-primary" onclick="cuentosdehadas03_14_grimm_64kb_165350_168986_chanticleer_15_unknown_64kb_28958_32068_mined_tgt.playPause()"><i class="fa fa-play"></i> Play / <i class="fa fa-pause"></i> Pause </button>
  110. <script> var cuentosdehadas03_14_grimm_64kb_165350_168986_chanticleer_15_unknown_64kb_28958_32068_mined_tgt = WaveSurfer.create({ container: '#cuentosdehadas03_14_grimm_64kb_165350_168986_chanticleer_15_unknown_64kb_28958_32068_mined_tgt__waveform', waveColor: 'violet', progressColor: 'purple' }); cuentosdehadas03_14_grimm_64kb_165350_168986_chanticleer_15_unknown_64kb_28958_32068_mined_tgt.load('./audio/reference/mined_audiobook/en/cuentosdehadas03_14_grimm_64kb_165350_168986-chanticleer_15_unknown_64kb_28958_32068.wav'); </script>
  111. </th>
  112. <th>
  113. <div id="cuentosdehadas03_14_grimm_64kb_165350_168986_chanticleer_15_unknown_64kb_28958_32068_mined_s2t_gpdf__waveform"></div>
  114. <button id="cuentosdehadas03_14_grimm_64kb_165350_168986_chanticleer_15_unknown_64kb_28958_32068_mined_s2t_gpdf__button" class="play-button-demo btn btn-primary" onclick="cuentosdehadas03_14_grimm_64kb_165350_168986_chanticleer_15_unknown_64kb_28958_32068_mined_s2t_gpdf.playPause()"><i class="fa fa-play"></i> Play / <i class="fa fa-pause"></i> Pause </button>
  115. <script> var cuentosdehadas03_14_grimm_64kb_165350_168986_chanticleer_15_unknown_64kb_28958_32068_mined_s2t_gpdf = WaveSurfer.create({ container: '#cuentosdehadas03_14_grimm_64kb_165350_168986_chanticleer_15_unknown_64kb_28958_32068_mined_s2t_gpdf__waveform', waveColor: 'violet', progressColor: 'purple' }); cuentosdehadas03_14_grimm_64kb_165350_168986_chanticleer_15_unknown_64kb_28958_32068_mined_s2t_gpdf.load('./audio/S2T_text/mined_audiobook/G_P_D_F/cuentosdehadas03_14_grimm_64kb_165350_168986-chanticleer_15_unknown_64kb_28958_32068.wav'); </script>
  116. </th>
  117. <th>
  118. <div id="cuentosdehadas03_14_grimm_64kb_165350_168986_chanticleer_15_unknown_64kb_28958_32068_mined_s2t_nnnn__waveform"></div>
  119. <button id="cuentosdehadas03_14_grimm_64kb_165350_168986_chanticleer_15_unknown_64kb_28958_32068_mined_s2t_nnnn__button" class="play-button-demo btn btn-primary" onclick="cuentosdehadas03_14_grimm_64kb_165350_168986_chanticleer_15_unknown_64kb_28958_32068_mined_s2t_nnnn.playPause()"><i class="fa fa-play"></i> Play / <i class="fa fa-pause"></i> Pause </button>
  120. <script> var cuentosdehadas03_14_grimm_64kb_165350_168986_chanticleer_15_unknown_64kb_28958_32068_mined_s2t_nnnn = WaveSurfer.create({ container: '#cuentosdehadas03_14_grimm_64kb_165350_168986_chanticleer_15_unknown_64kb_28958_32068_mined_s2t_nnnn__waveform', waveColor: 'violet', progressColor: 'purple' }); cuentosdehadas03_14_grimm_64kb_165350_168986_chanticleer_15_unknown_64kb_28958_32068_mined_s2t_nnnn.load('./audio/S2T_text/mined_audiobook/N_N_N_N/cuentosdehadas03_14_grimm_64kb_165350_168986-chanticleer_15_unknown_64kb_28958_32068.wav'); </script>
  121. </th>
  122. </tr>
  123. <tr>
  124. <th>Input text:</th>
  125. <th>entonces el escribió una carta a su madre</th>
  126. <th>and writes a letter to his mother</th>
  127. <th>Then he wrote a letter to his mother.</th>
  128. <th>Then he wrote a letter to his mother.</th>
  129. </tr>
  130. <tr>
  131. <th></th>
  132. <th>
  133. <div id="losmiserables5_56_hugo_64kb_454551_458949_lesmiserables_vol5_36_hugo_64kb_476280_483550_mined_src__waveform"></div>
  134. <button id="losmiserables5_56_hugo_64kb_454551_458949_lesmiserables_vol5_36_hugo_64kb_476280_483550_mined_src__button" class="play-button-demo btn btn-primary" onclick="losmiserables5_56_hugo_64kb_454551_458949_lesmiserables_vol5_36_hugo_64kb_476280_483550_mined_src.playPause()"><i class="fa fa-play"></i> Play / <i class="fa fa-pause"></i> Pause </button>
  135. <script> var losmiserables5_56_hugo_64kb_454551_458949_lesmiserables_vol5_36_hugo_64kb_476280_483550_mined_src = WaveSurfer.create({ container: '#losmiserables5_56_hugo_64kb_454551_458949_lesmiserables_vol5_36_hugo_64kb_476280_483550_mined_src__waveform', waveColor: 'violet', progressColor: 'purple' }); losmiserables5_56_hugo_64kb_454551_458949_lesmiserables_vol5_36_hugo_64kb_476280_483550_mined_src.load('./audio/reference/mined_audiobook/es/losmiserables5_56_hugo_64kb_454551_458949-lesmiserables_vol5_36_hugo_64kb_476280_483550.wav'); </script>
  136. </th>
  137. <th>
  138. <div id="losmiserables5_56_hugo_64kb_454551_458949_lesmiserables_vol5_36_hugo_64kb_476280_483550_mined_tgt__waveform"></div>
  139. <button id="losmiserables5_56_hugo_64kb_454551_458949_lesmiserables_vol5_36_hugo_64kb_476280_483550_mined_tgt__button" class="play-button-demo btn btn-primary" onclick="losmiserables5_56_hugo_64kb_454551_458949_lesmiserables_vol5_36_hugo_64kb_476280_483550_mined_tgt.playPause()"><i class="fa fa-play"></i> Play / <i class="fa fa-pause"></i> Pause </button>
  140. <script> var losmiserables5_56_hugo_64kb_454551_458949_lesmiserables_vol5_36_hugo_64kb_476280_483550_mined_tgt = WaveSurfer.create({ container: '#losmiserables5_56_hugo_64kb_454551_458949_lesmiserables_vol5_36_hugo_64kb_476280_483550_mined_tgt__waveform', waveColor: 'violet', progressColor: 'purple' }); losmiserables5_56_hugo_64kb_454551_458949_lesmiserables_vol5_36_hugo_64kb_476280_483550_mined_tgt.load('./audio/reference/mined_audiobook/en/losmiserables5_56_hugo_64kb_454551_458949-lesmiserables_vol5_36_hugo_64kb_476280_483550.wav'); </script>
  141. </th>
  142. <th>
  143. <div id="losmiserables5_56_hugo_64kb_454551_458949_lesmiserables_vol5_36_hugo_64kb_476280_483550_mined_s2t_gpdf__waveform"></div>
  144. <button id="losmiserables5_56_hugo_64kb_454551_458949_lesmiserables_vol5_36_hugo_64kb_476280_483550_mined_s2t_gpdf__button" class="play-button-demo btn btn-primary" onclick="losmiserables5_56_hugo_64kb_454551_458949_lesmiserables_vol5_36_hugo_64kb_476280_483550_mined_s2t_gpdf.playPause()"><i class="fa fa-play"></i> Play / <i class="fa fa-pause"></i> Pause </button>
  145. <script> var losmiserables5_56_hugo_64kb_454551_458949_lesmiserables_vol5_36_hugo_64kb_476280_483550_mined_s2t_gpdf = WaveSurfer.create({ container: '#losmiserables5_56_hugo_64kb_454551_458949_lesmiserables_vol5_36_hugo_64kb_476280_483550_mined_s2t_gpdf__waveform', waveColor: 'violet', progressColor: 'purple' }); losmiserables5_56_hugo_64kb_454551_458949_lesmiserables_vol5_36_hugo_64kb_476280_483550_mined_s2t_gpdf.load('./audio/S2T_text/mined_audiobook/G_P_D_F/losmiserables5_56_hugo_64kb_454551_458949-lesmiserables_vol5_36_hugo_64kb_476280_483550.wav'); </script>
  146. </th>
  147. <th>
  148. <div id="losmiserables5_56_hugo_64kb_454551_458949_lesmiserables_vol5_36_hugo_64kb_476280_483550_mined_s2t_nnnn__waveform"></div>
  149. <button id="losmiserables5_56_hugo_64kb_454551_458949_lesmiserables_vol5_36_hugo_64kb_476280_483550_mined_s2t_nnnn__button" class="play-button-demo btn btn-primary" onclick="losmiserables5_56_hugo_64kb_454551_458949_lesmiserables_vol5_36_hugo_64kb_476280_483550_mined_s2t_nnnn.playPause()"><i class="fa fa-play"></i> Play / <i class="fa fa-pause"></i> Pause </button>
  150. <script> var losmiserables5_56_hugo_64kb_454551_458949_lesmiserables_vol5_36_hugo_64kb_476280_483550_mined_s2t_nnnn = WaveSurfer.create({ container: '#losmiserables5_56_hugo_64kb_454551_458949_lesmiserables_vol5_36_hugo_64kb_476280_483550_mined_s2t_nnnn__waveform', waveColor: 'violet', progressColor: 'purple' }); losmiserables5_56_hugo_64kb_454551_458949_lesmiserables_vol5_36_hugo_64kb_476280_483550_mined_s2t_nnnn.load('./audio/S2T_text/mined_audiobook/N_N_N_N/losmiserables5_56_hugo_64kb_454551_458949-lesmiserables_vol5_36_hugo_64kb_476280_483550.wav'); </script>
  151. </th>
  152. </tr>
  153. <tr>
  154. <th>Input text:</th>
  155. <th>le he destinado un sitio de honor habéis conquistado a mi abuelo</th>
  156. <th>I have fixed upon a corner of Honor for that you have conquered my grandfather you suit him</th>
  157. <th>I have assigned him a place of honor, you have conquered my grandfather.</th>
  158. <th>I have assigned him a place of honor, you have conquered my grandfather.</th>
  159. </tr>
  160. </table>
  161. </div>
  162. <div id="heores_benchmark" class="content-container">
  163. <div class="content-title">
  164. <font size="+5">Results on the Heroes benchmark</font>
  165. </div>
  166. <table border="0" class="inlineTable" id="heroes_s2t">
  167. <tr>
  168. <th colspan="5">
  169. <font size="+2">Synthesize speech-to-text system output</font>
  170. </th>
  171. </tr>
  172. <tr>
  173. <th></th>
  174. <th colspan="4">Predictions</th>
  175. </tr>
  176. <tr>
  177. <th></th>
  178. <th>Vanilla TTS</th>
  179. <th>Holistic Cascade (Global transfer + local transfer)</th>
  180. <th>Ablation (Global transfer only)</th>
  181. <th>Ablation (Local transfer only)</th>
  182. </tr>
  183. <tr>
  184. <th colspan="6" style="text-align:left">
  185. <div size="+2">Input text (speech-to-text output): It's like a Greek tragedy or something.</div>
  186. </th>
  187. </tr>
  188. <tr>
  189. <th></th>
  190. <th>
  191. <div id="heroes_s3_6_0253_s2t_nnnn__waveform"></div>
  192. <button id="heroes_s3_6_0253_s2t_nnnn__button" class="play-button-demo btn btn-primary" onclick="heroes_s3_6_0253_s2t_nnnn.playPause()"><i class="fa fa-play"></i> Play / <i class="fa fa-pause"></i> Pause </button>
  193. <script> var heroes_s3_6_0253_s2t_nnnn = WaveSurfer.create({ container: '#heroes_s3_6_0253_s2t_nnnn__waveform', waveColor: 'violet', progressColor: 'purple' }); heroes_s3_6_0253_s2t_nnnn.load('./audio/S2T_text/heroes/N_N_N_N/heroes_s3_6_0253.wav'); </script>
  194. </th>
  195. <th>
  196. <div id="heroes_s3_6_0253_s2t_gpdf__waveform"></div>
  197. <button id="heroes_s3_6_0253_s2t_gpdf__button" class="play-button-demo btn btn-primary" onclick="heroes_s3_6_0253_s2t_gpdf.playPause()"><i class="fa fa-play"></i> Play / <i class="fa fa-pause"></i> Pause </button>
  198. <script> var heroes_s3_6_0253_s2t_gpdf = WaveSurfer.create({ container: '#heroes_s3_6_0253_s2t_gpdf__waveform', waveColor: 'violet', progressColor: 'purple' }); heroes_s3_6_0253_s2t_gpdf.load('./audio/S2T_text/heroes/G_P_D_F/heroes_s3_6_0253.wav'); </script>
  199. </th>
  200. <th>
  201. <div id="heroes_s3_6_0253_s2t_gnnn__waveform"></div>
  202. <button id="heroes_s3_6_0253_s2t_gnnn__button" class="play-button-demo btn btn-primary" onclick="heroes_s3_6_0253_s2t_gnnn.playPause()"><i class="fa fa-play"></i> Play / <i class="fa fa-pause"></i> Pause </button>
  203. <script> var heroes_s3_6_0253_s2t_gnnn = WaveSurfer.create({ container: '#heroes_s3_6_0253_s2t_gnnn__waveform', waveColor: 'violet', progressColor: 'purple' }); heroes_s3_6_0253_s2t_gnnn.load('./audio/S2T_text/heroes/G_N_N_N/heroes_s3_6_0253.wav'); </script>
  204. </th>
  205. <th>
  206. <div id="heroes_s3_6_0253_s2t_npdf__waveform"></div>
  207. <button id="heroes_s3_6_0253_s2t_npdf__button" class="play-button-demo btn btn-primary" onclick="heroes_s3_6_0253_s2t_npdf.playPause()"><i class="fa fa-play"></i> Play / <i class="fa fa-pause"></i> Pause </button>
  208. <script> var heroes_s3_6_0253_s2t_npdf = WaveSurfer.create({ container: '#heroes_s3_6_0253_s2t_npdf__waveform', waveColor: 'violet', progressColor: 'purple' }); heroes_s3_6_0253_s2t_npdf.load('./audio/S2T_text/heroes/N_P_D_F/heroes_s3_6_0253.wav'); </script>
  209. </th>
  210. </tr>
  211. <tr>
  212. <th colspan="6" style="text-align:left">
  213. <div size="+2">Input text (speech-to-text output): Abby Collins, “National Security.”</div>
  214. </th>
  215. </tr>
  216. <tr>
  217. <th></th>
  218. <th>
  219. <div id="heroes_s3_16_0124_s2t_nnnn__waveform"></div>
  220. <button id="heroes_s3_16_0124_s2t_nnnn__button" class="play-button-demo btn btn-primary" onclick="heroes_s3_16_0124_s2t_nnnn.playPause()"><i class="fa fa-play"></i> Play / <i class="fa fa-pause"></i> Pause </button>
  221. <script> var heroes_s3_16_0124_s2t_nnnn = WaveSurfer.create({ container: '#heroes_s3_16_0124_s2t_nnnn__waveform', waveColor: 'violet', progressColor: 'purple' }); heroes_s3_16_0124_s2t_nnnn.load('./audio/S2T_text/heroes/N_N_N_N/heroes_s3_16_0124.wav'); </script>
  222. </th>
  223. <th>
  224. <div id="heroes_s3_16_0124_s2t_gpdf__waveform"></div>
  225. <button id="heroes_s3_16_0124_s2t_gpdf__button" class="play-button-demo btn btn-primary" onclick="heroes_s3_16_0124_s2t_gpdf.playPause()"><i class="fa fa-play"></i> Play / <i class="fa fa-pause"></i> Pause </button>
  226. <script> var heroes_s3_16_0124_s2t_gpdf = WaveSurfer.create({ container: '#heroes_s3_16_0124_s2t_gpdf__waveform', waveColor: 'violet', progressColor: 'purple' }); heroes_s3_16_0124_s2t_gpdf.load('./audio/S2T_text/heroes/G_P_D_F/heroes_s3_16_0124.wav'); </script>
  227. </th>
  228. <th>
  229. <div id="heroes_s3_16_0124_s2t_gnnn__waveform"></div>
  230. <button id="heroes_s3_16_0124_s2t_gnnn__button" class="play-button-demo btn btn-primary" onclick="heroes_s3_16_0124_s2t_gnnn.playPause()"><i class="fa fa-play"></i> Play / <i class="fa fa-pause"></i> Pause </button>
  231. <script> var heroes_s3_16_0124_s2t_gnnn = WaveSurfer.create({ container: '#heroes_s3_16_0124_s2t_gnnn__waveform', waveColor: 'violet', progressColor: 'purple' }); heroes_s3_16_0124_s2t_gnnn.load('./audio/S2T_text/heroes/G_N_N_N/heroes_s3_16_0124.wav'); </script>
  232. </th>
  233. <th>
  234. <div id="heroes_s3_16_0124_s2t_npdf__waveform"></div>
  235. <button id="heroes_s3_16_0124_s2t_npdf__button" class="play-button-demo btn btn-primary" onclick="heroes_s3_16_0124_s2t_npdf.playPause()"><i class="fa fa-play"></i> Play / <i class="fa fa-pause"></i> Pause </button>
  236. <script> var heroes_s3_16_0124_s2t_npdf = WaveSurfer.create({ container: '#heroes_s3_16_0124_s2t_npdf__waveform', waveColor: 'violet', progressColor: 'purple' }); heroes_s3_16_0124_s2t_npdf.load('./audio/S2T_text/heroes/N_P_D_F/heroes_s3_16_0124.wav'); </script>
  237. </th>
  238. </tr>
  239. <tr>
  240. <th colspan="6" style="text-align:left">
  241. <div size="+2">Input text (speech-to-text output): You weren’t going to find out what the powers were.</div>
  242. </th>
  243. </tr>
  244. <tr>
  245. <th></th>
  246. <th>
  247. <div id="heroes_s3_11_0045_s2t_nnnn__waveform"></div>
  248. <button id="heroes_s3_11_0045_s2t_nnnn__button" class="play-button-demo btn btn-primary" onclick="heroes_s3_11_0045_s2t_nnnn.playPause()"><i class="fa fa-play"></i> Play / <i class="fa fa-pause"></i> Pause </button>
  249. <script> var heroes_s3_11_0045_s2t_nnnn = WaveSurfer.create({ container: '#heroes_s3_11_0045_s2t_nnnn__waveform', waveColor: 'violet', progressColor: 'purple' }); heroes_s3_11_0045_s2t_nnnn.load('./audio/S2T_text/heroes/N_N_N_N/heroes_s3_11_0045.wav'); </script>
  250. </th>
  251. <th>
  252. <div id="heroes_s3_11_0045_s2t_gpdf__waveform"></div>
  253. <button id="heroes_s3_11_0045_s2t_gpdf__button" class="play-button-demo btn btn-primary" onclick="heroes_s3_11_0045_s2t_gpdf.playPause()"><i class="fa fa-play"></i> Play / <i class="fa fa-pause"></i> Pause </button>
  254. <script> var heroes_s3_11_0045_s2t_gpdf = WaveSurfer.create({ container: '#heroes_s3_11_0045_s2t_gpdf__waveform', waveColor: 'violet', progressColor: 'purple' }); heroes_s3_11_0045_s2t_gpdf.load('./audio/S2T_text/heroes/G_P_D_F/heroes_s3_11_0045.wav'); </script>
  255. </th>
  256. <th>
  257. <div id="heroes_s3_11_0045_s2t_gnnn__waveform"></div>
  258. <button id="heroes_s3_11_0045_s2t_gnnn__button" class="play-button-demo btn btn-primary" onclick="heroes_s3_11_0045_s2t_gnnn.playPause()"><i class="fa fa-play"></i> Play / <i class="fa fa-pause"></i> Pause </button>
  259. <script> var heroes_s3_11_0045_s2t_gnnn = WaveSurfer.create({ container: '#heroes_s3_11_0045_s2t_gnnn__waveform', waveColor: 'violet', progressColor: 'purple' }); heroes_s3_11_0045_s2t_gnnn.load('./audio/S2T_text/heroes/G_N_N_N/heroes_s3_11_0045.wav'); </script>
  260. </th>
  261. <th>
  262. <div id="heroes_s3_11_0045_s2t_npdf__waveform"></div>
  263. <button id="heroes_s3_11_0045_s2t_npdf__button" class="play-button-demo btn btn-primary" onclick="heroes_s3_11_0045_s2t_npdf.playPause()"><i class="fa fa-play"></i> Play / <i class="fa fa-pause"></i> Pause </button>
  264. <script> var heroes_s3_11_0045_s2t_npdf = WaveSurfer.create({ container: '#heroes_s3_11_0045_s2t_npdf__waveform', waveColor: 'violet', progressColor: 'purple' }); heroes_s3_11_0045_s2t_npdf.load('./audio/S2T_text/heroes/N_P_D_F/heroes_s3_11_0045.wav'); </script>
  265. </th>
  266. </tr>
  267. </table>
  268. <table border="0" class="inlineTable" id="heroes_gt">
  269. <tr>
  270. <th colspan="7">
  271. <font size="+2">Synthesize ground truth text</font>
  272. </th>
  273. </tr>
  274. <tr>
  275. <th colspan="2">Predictions</th>
  276. </tr>
  277. <tr>
  278. <th>Vanilla TTS</th>
  279. <th>Holistic Cascade (Global transfer + local transfer)</th>
  280. </tr>
  281. <tr>
  282. <th colspan="6" style="text-align:left">
  283. <div size="+2">Input text (ground truth): It started with their father. Delusions of grandeur, paranoia.</div>
  284. </th>
  285. </tr>
  286. <tr>
  287. <th>
  288. <div id="heroes_s2_8_0204_ref_nnnn__waveform"></div>
  289. <button id="heroes_s2_8_0204_ref_nnnn__button" class="play-button-demo btn btn-primary" onclick="heroes_s2_8_0204_ref_nnnn.playPause()"><i class="fa fa-play"></i> Play / <i class="fa fa-pause"></i> Pause </button>
  290. <script> var heroes_s2_8_0204_ref_nnnn = WaveSurfer.create({ container: '#heroes_s2_8_0204_ref_nnnn__waveform', waveColor: 'violet', progressColor: 'purple' }); heroes_s2_8_0204_ref_nnnn.load('./audio/GT_text/heroes/N_N_N_N/heroes_s2_8_0204.wav'); </script>
  291. </th>
  292. <th>
  293. <div id="heroes_s2_8_0204_ref_gpdf__waveform"></div>
  294. <button id="heroes_s2_8_0204_ref_gpdf__button" class="play-button-demo btn btn-primary" onclick="heroes_s2_8_0204_ref_gpdf.playPause()"><i class="fa fa-play"></i> Play / <i class="fa fa-pause"></i> Pause </button>
  295. <script> var heroes_s2_8_0204_ref_gpdf = WaveSurfer.create({ container: '#heroes_s2_8_0204_ref_gpdf__waveform', waveColor: 'violet', progressColor: 'purple' }); heroes_s2_8_0204_ref_gpdf.load('./audio/GT_text/heroes/G_P_D_F/heroes_s2_8_0204.wav'); </script>
  296. </th>
  297. </tr>
  298. <tr>
  299. <th colspan="6" style="text-align:left">
  300. <div size="+2">Input text (ground truth): I have been thinking about you and wondering how you've been since...</div>
  301. </th>
  302. </tr>
  303. <tr>
  304. <th>
  305. <div id="heroes_s3_8_0163_ref_nnnn__waveform"></div>
  306. <button id="heroes_s3_8_0163_ref_nnnn__button" class="play-button-demo btn btn-primary" onclick="heroes_s3_8_0163_ref_nnnn.playPause()"><i class="fa fa-play"></i> Play / <i class="fa fa-pause"></i> Pause </button>
  307. <script> var heroes_s3_8_0163_ref_nnnn = WaveSurfer.create({ container: '#heroes_s3_8_0163_ref_nnnn__waveform', waveColor: 'violet', progressColor: 'purple' }); heroes_s3_8_0163_ref_nnnn.load('./audio/GT_text/heroes/N_N_N_N/heroes_s3_8_0163.wav'); </script>
  308. </th>
  309. <th>
  310. <div id="heroes_s3_8_0163_ref_gpdf__waveform"></div>
  311. <button id="heroes_s3_8_0163_ref_gpdf__button" class="play-button-demo btn btn-primary" onclick="heroes_s3_8_0163_ref_gpdf.playPause()"><i class="fa fa-play"></i> Play / <i class="fa fa-pause"></i> Pause </button>
  312. <script> var heroes_s3_8_0163_ref_gpdf = WaveSurfer.create({ container: '#heroes_s3_8_0163_ref_gpdf__waveform', waveColor: 'violet', progressColor: 'purple' }); heroes_s3_8_0163_ref_gpdf.load('./audio/GT_text/heroes/G_P_D_F/heroes_s3_8_0163.wav'); </script>
  313. </th>
  314. </tr>
  315. <tr>
  316. <th colspan="6" style="text-align:left">
  317. <div size="+2">Input text (ground truth): Only someone with Peter's abilities could get where the virus is stored.</div>
  318. </th>
  319. </tr>
  320. <tr>
  321. <th>
  322. <div id="heroes_s2_11_0095_ref_nnnn__waveform"></div>
  323. <button id="heroes_s2_11_0095_ref_nnnn__button" class="play-button-demo btn btn-primary" onclick="heroes_s2_11_0095_ref_nnnn.playPause()"><i class="fa fa-play"></i> Play / <i class="fa fa-pause"></i> Pause </button>
  324. <script> var heroes_s2_11_0095_ref_nnnn = WaveSurfer.create({ container: '#heroes_s2_11_0095_ref_nnnn__waveform', waveColor: 'violet', progressColor: 'purple' }); heroes_s2_11_0095_ref_nnnn.load('./audio/GT_text/heroes/N_N_N_N/heroes_s2_11_0095.wav'); </script>
  325. </th>
  326. <th>
  327. <div id="heroes_s2_11_0095_ref_gpdf__waveform"></div>
  328. <button id="heroes_s2_11_0095_ref_gpdf__button" class="play-button-demo btn btn-primary" onclick="heroes_s2_11_0095_ref_gpdf.playPause()"><i class="fa fa-play"></i> Play / <i class="fa fa-pause"></i> Pause </button>
  329. <script> var heroes_s2_11_0095_ref_gpdf = WaveSurfer.create({ container: '#heroes_s2_11_0095_ref_gpdf__waveform', waveColor: 'violet', progressColor: 'purple' }); heroes_s2_11_0095_ref_gpdf.load('./audio/GT_text/heroes/G_P_D_F/heroes_s2_11_0095.wav'); </script>
  330. </th>
  331. </tr>
  332. </table>
  333. </div>
  334. <div class="content-container">
  335. Template based on <a style="color:rgb(22, 38, 67)" href="https://speechbot.github.io/"> Textless NLP</a> and <a
  336. style="color:rgb(22, 38, 67)" href="https://daps.cs.princeton.edu/projects/HiFi-GAN/index.php"> HiFi-GAN</a>
  337. pages.
  338. </div>
  339. </body>
  340. </html>