# SeamlessAlignExpressive
Building on our past work with WikiMatrix, CCMatrix, NLLB, SpeechMatrix and SeamlessM4T, we are introducing the first expressive speech alignment procedure. Starting from raw data, the procedure automatically discovers pairs of audio segments that share not only the same meaning but also the same overall expressivity. To showcase it, we are releasing metadata for creating a benchmarking dataset called SeamlessAlignExpressive, which can be used to validate the quality of our alignment method. SeamlessAlignExpressive is the first large-scale collection of multilingual audio alignments for benchmarking expressive translation.
## Format
The metadata files are space-separated and gzip-compressed. Each file corresponds to one alignment direction. File naming convention: each side is a two-letter language code followed by an 'A', e.g. `frA`, `enA`, `deA`.
For example, the direction `deA-enA` holds the information for reconstructing German speech to English speech alignments.
Each line has 9 columns:
- `direction`: alignment direction, e.g. `enA-deA`
- `side`: side, e.g. `enA` or `deA`
- `line_no`: alignment number
- `cc_warc`: reference to the public CC WARC file containing the public audio URL
- `duration`: duration of the original file
- `audio_speech_segment_url`: public audio reference
- `audio_speech_start_frame`: start frame when the audio is resampled at 16 kHz
- `audio_speech_end_frame`: end frame when the audio is resampled at 16 kHz
- `laser_score`: score of the alignment
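
The column layout above can be read into a typed record. The following is a minimal sketch, not an official loader: the names `AlignmentRow` and `parse_row` are hypothetical, and it assumes that no field contains whitespace, that `line_no` and the two frame columns are integers, and that `laser_score` is a decimal number. `duration` is kept as text because its unit is not stated here.

```python
from dataclasses import dataclass

# Frame offsets refer to audio resampled at 16 kHz (stated in the column list).
SAMPLE_RATE_HZ = 16_000


@dataclass
class AlignmentRow:
    """One metadata line; field names mirror the 9 documented columns."""

    direction: str
    side: str
    line_no: int  # assumption: integer alignment number
    cc_warc: str
    duration: str  # kept as text: the unit is not documented here
    audio_speech_segment_url: str
    audio_speech_start_frame: int  # assumption: integer frame index
    audio_speech_end_frame: int  # assumption: integer frame index
    laser_score: float  # assumption: decimal number

    @property
    def start_seconds(self) -> float:
        """Segment start in seconds at the 16 kHz sample rate."""
        return self.audio_speech_start_frame / SAMPLE_RATE_HZ

    @property
    def end_seconds(self) -> float:
        """Segment end in seconds at the 16 kHz sample rate."""
        return self.audio_speech_end_frame / SAMPLE_RATE_HZ


def parse_row(line: str) -> AlignmentRow:
    """Split one metadata line on whitespace and convert the numeric columns."""
    fields = line.split()
    if len(fields) != 9:
        raise ValueError(f"expected 9 columns, got {len(fields)}")
    direction, side, line_no, cc_warc, duration, url, start, end, score = fields
    return AlignmentRow(
        direction=direction,
        side=side,
        line_no=int(line_no),
        cc_warc=cc_warc,
        duration=duration,
        audio_speech_segment_url=url,
        audio_speech_start_frame=int(start),
        audio_speech_end_frame=int(end),
        laser_score=float(score),
    )
```

With a parsed `row`, the segment length in seconds is `row.end_seconds - row.start_seconds`. Splitting on any whitespace is a deliberate choice in this sketch, so the same code works whether the separator is a space or a tab.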
## Data
- [deA-enA](https://dl.fbaipublicfiles.com/seamless/data/seamless_align_expressive/seamless.dataset.metadata.public.deA-enA.tsv.gz)
- [enA-esA](https://dl.fbaipublicfiles.com/seamless/data/seamless_align_expressive/seamless.dataset.metadata.public.enA-esA.tsv.gz)
- [enA-frA](https://dl.fbaipublicfiles.com/seamless/data/seamless_align_expressive/seamless.dataset.metadata.public.enA-frA.tsv.gz)
- [enA-itA](https://dl.fbaipublicfiles.com/seamless/data/seamless_align_expressive/seamless.dataset.metadata.public.enA-itA.tsv.gz)
- [enA-zhA](https://dl.fbaipublicfiles.com/seamless/data/seamless_align_expressive/seamless.dataset.metadata.public.enA-zhA.tsv.gz)
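
The links above all follow one URL pattern, so a file's address can be derived from its direction. The sketch below illustrates that, along with reading an already downloaded file line by line. The helper names `metadata_url` and `iter_metadata_lines` are hypothetical, and UTF-8 text encoding is an assumption.

```python
import gzip
from typing import Iterator

# Prefix shared by the five download links listed in this section.
BASE_URL = "https://dl.fbaipublicfiles.com/seamless/data/seamless_align_expressive"

# Directions for which a metadata file is listed in this section.
DIRECTIONS = ("deA-enA", "enA-esA", "enA-frA", "enA-itA", "enA-zhA")


def metadata_url(direction: str) -> str:
    """Build the download URL for one of the listed directions."""
    if direction not in DIRECTIONS:
        raise ValueError(f"no metadata file listed for direction {direction!r}")
    return f"{BASE_URL}/seamless.dataset.metadata.public.{direction}.tsv.gz"


def iter_metadata_lines(path: str) -> Iterator[str]:
    """Yield the non-empty lines of a local, gzip-compressed metadata file."""
    # Assumption: the files are UTF-8 text once decompressed.
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line:
                yield line
```

Streaming the file one line at a time avoids loading the whole decompressed file into memory.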