791 Hours - Mandarin Conversational Speech Data by Microphone



https://www.datatang.co.jp


791 Hours - Mandarin Conversational Speech Data by Microphone was collected from dialogues on given topics, covering dozens of general domains. The recordings are transcribed with text content, speaker ID, gender, and other attributes. The dataset was collected from a large and geographically diverse pool of speakers (1,126 people in total), which is intended to improve model performance in real-world, complex tasks. Its quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, protecting user privacy and legal rights throughout data collection, storage, and usage. All of our datasets comply with GDPR, CCPA, and PIPL.


Data Specifications

Format: 48 kHz, 16-bit, uncompressed WAV, mono

Recording environment: quiet indoor environment, without echo

Recording content: dozens of topics are specified, and speakers converse on those topics while being recorded

Demographics: 1,126 speakers; balanced gender ratio; ages range from 18 to 60

Annotation: individual sentences are extracted and annotated with start and end timestamps, speaker identification, and spoken text content; noise is also annotated

Device: microphone

Language: Mandarin

Application scenarios: speech recognition; voiceprint recognition

Accuracy rate: character accuracy rate of 99%
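
The Format line above (48 kHz, 16-bit, uncompressed WAV, mono) can be checked on received files before training. The following is a minimal sketch using Python's standard-library `wave` module. The function names, the constants, and the `write_silence` helper are hypothetical and for illustration only; they are not part of any Datatang tooling, and the actual file layout of the delivered dataset is not described on this page.

```python
import wave

# Values taken from the "Format" line of the specification above.
EXPECTED_SAMPLE_RATE_HZ = 48_000
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16-bit samples
EXPECTED_CHANNELS = 1            # mono


def check_wav_format(path):
    """Return a list of mismatches against the spec; an empty list means the file matches."""
    problems = []
    # The wave module reads uncompressed PCM WAV files; other encodings raise wave.Error.
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != EXPECTED_SAMPLE_RATE_HZ:
            problems.append(f"sample rate is {wav.getframerate()} Hz")
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH_BYTES:
            problems.append(f"sample width is {8 * wav.getsampwidth()} bit")
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"channel count is {wav.getnchannels()}")
    return problems


def write_silence(path, sample_rate_hz, channels=1, seconds=0.01):
    """Hypothetical helper: write a short silent 16-bit WAV file to try the check on."""
    frame_count = int(sample_rate_hz * seconds)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate_hz)
        wav.writeframes(b"\x00\x00" * frame_count * channels)
```

Calling `check_wav_format("sample.wav")` (where `sample.wav` is a placeholder path) returns an empty list when the file matches the stated format, and a list of mismatch descriptions otherwise.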


Related Datasets

600 Hours - Greek Real-world Casual Conversation and Monologue speech dataset

600 Hours - Norwegian Real-world Casual Conversation and Monologue speech dataset

Gujarati(India) Scripted dialogue speech dataset

Spanish(Mexico) Real-world Casual Conversation and Monologue speech dataset


Contact: Head office, Waterras Annex 6F, 2-105 Kanda-Awajicho, Chiyoda-ku, Tokyo 101-0063; Tel: 03-6256-8911; [email protected]


© 2021 Datatang Inc. All Rights Reserved.
