
Indicate the SONAR device in the mutox example and explain the dataset columns (#412)

* Indicate the SONAR device in the mutox example

* Add explanations of scores and columns in the readme
David Dale 1 year ago
parent
commit
95abd7d350

+ 15 - 0
src/seamless_communication/cli/toxicity/mutox/README.md

@@ -49,6 +49,9 @@ We use multi-modal and multilingual
 al., 2023). For the classifier, we use variable input sizes for the 3
 feedforward layers (1024, 512, and 128).
 
+The predictions of the classifier can be interpreted as logits (i.e. applying a sigmoid transform to them yields probabilities).
+A value of 0 can be used as a threshold, as it corresponds to a predicted toxicity probability of 50%.
+
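The logit interpretation added to the README in this hunk can be illustrated with a minimal sketch. The helper names `logit_to_probability` and `is_toxic` are hypothetical and not part of `seamless_communication`; treating a score of exactly 0 as non-toxic is also an assumption, since the README only says that 0 can serve as a threshold.

```python
import math


def logit_to_probability(logit: float) -> float:
    """Map a raw classifier score (logit) to a probability with a sigmoid."""
    # Branch on the sign so that math.exp never overflows for large |logit|.
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    z = math.exp(logit)
    return z / (1.0 + z)


def is_toxic(logit: float, threshold: float = 0.0) -> bool:
    """Binary decision: a logit above 0 means a probability above 50%."""
    # Assumption: a score exactly equal to the threshold counts as non-toxic.
    return logit > threshold


if __name__ == "__main__":
    for score in (-3.0, 0.0, 2.0):
        print(score, round(logit_to_probability(score), 4), is_toxic(score))
```

Thresholding the logit at 0 is equivalent to thresholding the probability at 0.5, so the sigmoid is only needed when an actual probability has to be reported.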
 ## Classifier Quick Start
 
 This introduces the MuTox speech toxicity model, which relies on computing the
@@ -73,6 +76,18 @@ You can also check the mutox example notebook in this directory.
 The dataset is available in this [tsv file](https://dl.fbaipublicfiles.com/seamless/datasets/mutox.tsv). The dataset is licensed under the MIT license (see MIT_LICENSE
 file at the root of seamless_communication).
 
+The columns of the dataset are:
+- `id`: a string id of the segment;
+- `lang`: 3-letter language code;
+- `partition`: one of `train`, `dev`, or `devtest`;
+- `public_url_segment`: a string formatted as `url:start:end`, where start and end are indicated in milliseconds;
+- `audio_file_transcript`: text transcription of the segment;
+- `contains_toxicity`, `toxicity_types`, `perlocutionary_effects`: annotation results as strings (see the paper for their explanation);
+- `label`: an integer label, equal to 1 if `contains_toxicity` equals `Yes` and 0 otherwise;
+- `etox_result`: toxic word (or multiple words, separated by `|`) detected by the Etox matcher;
+- `detoxify_score`: toxicity probabilities predicted by the Detoxify system (float numbers between 0 and 1);
+- `mutox_speech_score`, `mutox_text_score`, `mutox_zero_shot_speech_score`, `mutox_zero_shot_text_score`: MuTox predictions as unbounded float numbers (they can be interpreted as logits, i.e. values that become probabilities after a sigmoid transformation).
+
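Two of the columns described in this hunk pack several values into one string. The sketch below shows one way to unpack them; it is not code from the repository. The helper names and the sample value are hypothetical, and it assumes that `start` and `end` are integer millisecond offsets and that `url` may itself contain colons (as in `https://`), which is why the split is done from the right.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    url: str
    start_ms: int
    end_ms: int


def parse_public_url_segment(value: str) -> Segment:
    """Split a `url:start:end` string into its three parts."""
    # The URL contains colons of its own, so only the last two are separators.
    url, start, end = value.rsplit(":", 2)
    # Assumption: offsets are whole milliseconds.
    return Segment(url=url, start_ms=int(start), end_ms=int(end))


def split_etox_result(value: str) -> List[str]:
    """Split an `etox_result` cell into individual words on `|`."""
    # An empty cell yields an empty list rather than [''].
    return [word for word in value.split("|") if word]


if __name__ == "__main__":
    # Hypothetical sample value, not taken from the dataset.
    segment = parse_public_url_segment("https://example.com/audio.mp3:1500:4200")
    print(segment.url, segment.end_ms - segment.start_ms)
    print(split_etox_result("word1|word2"))
```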
 ## Citation
 
 ```bibtex

+ 1 - 0
src/seamless_communication/cli/toxicity/mutox/mutox_example.ipynb

@@ -180,6 +180,7 @@
     "t2vec_model = TextToEmbeddingModelPipeline(\n",
     "    encoder=\"text_sonar_basic_encoder\",\n",
     "    tokenizer=\"text_sonar_basic_encoder\",\n",
+    "    device=device,\n",
     ")\n",
     "text_column='lang_txt'\n",
     "classifier = load_mutox_model(\n",