2.4 million - Korean Test Questions Structured Analysis Processing Data
200,000 Sets of Multi-country Landmark Buildings Image Caption Data
100,000 Fine-Tuning text data set for English LLM General Domain SFT
480,000 corrected texts in German, Spanish, French, and Italian
250,000 English Animals Medical dataset
20,846 Groups Image Caption Data of Cookbook
32 million - Science Subjects Questions Text Parsing And Processing Data
1 million - Chinese Code Questions Text Parsing And Processing Data
Japanese OKWAVE Q&A platform Text Parsing and Processing Data
6.5TB multi-programming language code dataset
10.44 million - English Test Questions Text Parsing And Processing Data
140,000,000 - Chinese Judgment Documents Text Parsing And Processing Data
114,000 - Chinese Contest Questions Text Parsing And Processing Data
200,000 text data in German, Spanish, French, and Italian
130,000 Chinese standard text parsing and processing data