@@ -16,6 +16,8 @@ GLM-130B is an open bilingual (English & Chinese) bidirectional dense model with
- **Reproducibility:** all results (30+ tasks) can be easily reproduced with open-sourced code and model checkpoints.
- **Cross-Platform:** supports training and inference on NVIDIA, Hygon DCU, Ascend 910, and Sunway (will be released soon).
+For smaller models, please find monolingual GLM (English: 10B/2B/515M/410M/335M/110M, Chinese: 10B/335M) and a 1B multilingual GLM (104 languages).
+
## Getting Started
### Environment Setup
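The smaller GLM checkpoints added in the hunk above can be tried without the GLM-130B inference stack. Below is a minimal sketch using Hugging Face `transformers`; it assumes the `THUDM/glm-10b` repo id on the Hub and the `build_inputs_for_generation` / `eop_token_id` helpers bundled in that checkpoint's remote code (names may differ for other sizes).

```python
# Minimal sketch: loading one of the smaller monolingual GLM checkpoints.
# The repo id "THUDM/glm-10b" and the build_inputs_for_generation /
# eop_token_id helpers are assumptions taken from the checkpoint's
# bundled remote code (hence trust_remote_code=True).
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("THUDM/glm-10b", trust_remote_code=True)
model = AutoModelForSeq2SeqLM.from_pretrained("THUDM/glm-10b", trust_remote_code=True)
model = model.half().cuda() if torch.cuda.is_available() else model.float()
model.eval()

# GLM is trained with blank infilling: it autoregressively generates
# text to fill the [MASK] span in the prompt.
inputs = tokenizer("Tsinghua University is located in [MASK].", return_tensors="pt")
inputs = tokenizer.build_inputs_for_generation(inputs, max_gen_length=64)
inputs = inputs.to(model.device)
outputs = model.generate(**inputs, max_length=256, eos_token_id=tokenizer.eop_token_id)
print(tokenizer.decode(outputs[0].tolist()))
```

This path only covers the smaller GLMs; GLM-130B itself is loaded through the repo's own scripts described under Getting Started.
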
@@ -342,7 +344,7 @@ We compare GLM-130B to the largest existing Chinese monolingual language model E
<details>
<summary><b>Acknowledgement</b></summary>
-
+
<br/>
This project is supported by the National Science Fund for Distinguished Young Scholars (No. 61825602).