Tel:03-6256-8911

    Data Solutions

    https://www.datatang.co.jp

    Computer Vision
    Speech Recognition
    Dataset name | Type | Volume | Features
    1,000 Images Caption Data of Diverse Scenes | Image | 1000 images | Image caption dataset of diverse scenes. The scene distribution includes natural scenery, urban streets, exhibitions, home environments, etc. Each image has a 3-5 sentence English description.
    1,000 Images Caption Data of OCR in Natural Scenes | Image | 1000 images | OCR caption dataset covering 14 languages. Image subjects include bus stops, posters, road signs, etc. Each image has a 3-5 sentence English description.
    1,000 Images Caption Data of Human Face | Image | 1000 images | Human face image caption dataset covering various head postures, facial expressions, etc. Each image has a 3-5 sentence English description.
    1,000 Images Caption Data of Gestures | Image | 1000 images | Gesture image caption dataset covering different angles and gesture categories. Each image has a 3-5 sentence English description.
    1,000 Images Human Facial Skin Defects Data | Image | 1000 images | Facial skin defect dataset, including acne, acne scars, dark spots, wrinkles, and dark circles.
    1,000 Videos Caption Data of Human Motion | Video | 1000 videos | Human motion video caption dataset in CCTV and non-CCTV scenes. Human motions include walking, drinking, yawning, fitness, etc. Each video includes an English caption.
    1,000 People Multi-race 7 Expressions Recognition Data | Image | 1000 people | Dataset of 7 facial expressions: normal, happy, amazed, sad, angry, disgusted, and scared.
    1,000 Videos Multi-race Micro-expression (FACS) Data | Video | 1000 videos | Dataset of 57 facial micro-expressions, including inner brow raiser (AU1), outer brow raiser (AU2), upper lid raiser (AU5), etc.
    50 People - DMS Data | Video | 50 people | DMS dataset of dangerous behavior, fatigue behavior, and visual movement behavior. The dataset covers various subject age groups, time periods, vehicle types, and camera positions.
    50 People - 2D Face Anti-Spoofing Data | Image & Video | 50 people | 2D face anti-spoofing dataset. Real face data includes facial action videos, facial images, and lip language videos. Anti-spoofing data includes fake facial action videos, fake lip language videos, and fake facial images.
    1,000 Images Gesture Recognition Data | Image | 1000 images | Gesture recognition dataset of 18 gesture categories, including number 1, OK, LOVE, etc. Annotations include 21 hand landmarks and multiple gesture labels.
    3,000 Images Natural Scene OCR Data | Image | 3000 images | Natural scene OCR dataset of Asian languages (Japanese, Korean, etc.) and European languages (French, German, etc.). Annotations are line-level quadrilateral bounding boxes with text transcriptions.
    500 Images Handwriting OCR Data | Image | 500 images | Handwriting OCR data in English and Japanese. Annotations are line-level quadrilateral bounding boxes with text transcriptions.
    50 People - 3D Face Anti-Spoofing Data | Image | 50 people | 3D face anti-spoofing dataset. Real face data includes facial images. Anti-spoofing data includes fake facial images. Each image corresponds to a depth image, a depth values file, and a camera parameters file.
    1,000 People Multi-race and Multi-pose Face Images Data | Image | 1000 people | Facial recognition dataset of multiple races. Each subject has 29 facial images: 14 indoor multi-pose images, 14 outdoor multi-pose images, and 1 ID image. The annotations include labels for race, gender, age, and facial pose.
    Dataset name | Collection device | Volume | Features
    2 Hours - 4 Countries English Speech Synthesis Corpus | Microphone | 2 hours, 4 people | Speakers: 4 people from the United States, the United Kingdom, Australia, and New Zealand; Format: 48,000 Hz, 24-bit, uncompressed WAV, mono; Recording environment: professional recording studio
    20 Hours - France French Reading & Conversational Speech Data by Mobile Phone | Mobile Phone | 20 hours | Format: 16 kHz, 16-bit, uncompressed WAV, mono; Recording condition: low background noise (indoor), no echo; Content category: reading, conversation; Recording device: Android smartphone, iPhone; Country: France; Language: French; Annotation: transcription text; Accuracy: Word Accuracy Rate (WAR) of at least 97%
    20 Hours - German Reading & Conversational Speech Data by Mobile Phone | Mobile Phone | 20 hours | Format: 16 kHz, 16-bit, uncompressed WAV, mono; Recording condition: low background noise (indoor), no echo; Content category: reading, conversation; Recording device: Android smartphone, iPhone; Country: Germany; Language: German; Annotation: transcription text; Accuracy: Word Accuracy Rate (WAR) of at least 97%
    20 Hours - Italian Reading & Conversational Speech Data by Mobile Phone | Mobile Phone | 20 hours | Format: 16 kHz, 16-bit, uncompressed WAV, mono; Recording condition: low background noise (indoor), no echo; Content category: reading, conversation; Recording device: Android smartphone, iPhone; Country: Italy; Language: Italian; Annotation: transcription text; Accuracy: Word Accuracy Rate (WAR) of at least 97%
    20 Hours - Spain Spanish Reading & Conversational Speech Data by Mobile Phone | Mobile Phone | 20 hours | Format: 16 kHz, 16-bit, uncompressed WAV, mono; Recording condition: low background noise (indoor), no echo; Content category: reading, conversation; Recording device: Android smartphone, iPhone; Country: Spain; Language: Spanish; Annotation: transcription text; Accuracy: Word Accuracy Rate (WAR) of at least 97%
    20 Hours - European Portuguese Reading & Conversational Speech Data by Mobile Phone | Mobile Phone | 20 hours | Format: 16 kHz, 16-bit, uncompressed WAV, mono; Recording condition: low background noise (indoor), no echo; Content category: reading, conversation; Recording device: Android smartphone, iPhone; Country: Portugal; Language: Portuguese; Annotation: transcription text; Accuracy: Word Accuracy Rate (WAR) of at least 97%
    20 Hours - Japanese Reading & Conversational Speech Data by Mobile Phone | Mobile Phone | 20 hours | Format: 16 kHz, 16-bit, uncompressed WAV, mono; Recording condition: low background noise (indoor), no echo; Content category: reading, conversation; Recording device: Android smartphone, iPhone; Country: Japan; Language: Japanese; Annotation: transcription text; Accuracy: Word Accuracy Rate (WAR) of at least 97%
    20 Hours - Korean Reading & Conversational Speech Data by Mobile Phone | Mobile Phone | 20 hours | Format: 16 kHz, 16-bit, uncompressed WAV, mono; Recording condition: low background noise (indoor), no echo; Content category: reading, conversation; Recording device: Android smartphone, iPhone; Country: Korea; Language: Korean; Annotation: transcription text; Accuracy: Word Accuracy Rate (WAR) of at least 97%
    10 Hours - Pashto Conversational Speech Data by Telephone | Telephone | 10 hours | Format: 8 kHz, 8-bit, a-law/u-law PCM, mono; Content category: dialogue on given topics; Recording condition: low background noise (indoor); Recording device: telephone; Country: Afghanistan (AFG); Language (region) code: ps-AF; Language: Pashto; Speakers: 224 people in total, 92% male and 8% female; Annotation: transcription text, timestamp, speaker ID, gender; Accuracy: Word Accuracy Rate (WAR) of at least 95%
    Interspeech Accented English Speech Recognition Competition Data | Mobile Phone | 200 hours, 528 people | /
    Note: Please apply only for datasets relevant to your research field. You may apply for at most 6 Computer Vision datasets.
    Note: Please apply only for datasets relevant to your research field. You may apply for at most 4 Speech Recognition datasets.
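
The speech datasets above quote a Word Accuracy Rate (WAR) for their transcriptions. The page does not say how WAR is computed, so the following is only a minimal sketch under a common assumption: WAR = 1 - (word-level edit distance / number of reference words). The function name `word_accuracy_rate` and the whitespace tokenization are illustrative choices, not part of any Datatang tooling; languages written without spaces, such as Japanese, would need a separate tokenizer first.

```python
def word_accuracy_rate(reference: str, hypothesis: str) -> float:
    """Return 1 - (word-level edit distance / reference word count).

    Assumption: words are separated by whitespace. The result can be
    negative when the hypothesis contains many inserted words.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return 1.0 - prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of six reference words.
    print(word_accuracy_rate("the cat sat on the mat", "the cat sat on a mat"))
```

Under this definition, a transcription meets the stated 97% threshold when `word_accuracy_rate(reference, transcription) >= 0.97`, measured against a trusted reference transcript.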

    Process from application to delivery

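    Download and fill in the agreement
    Obtain the agreement
    Stamp the agreement with your seal
    Send the application by email
    Review and feedback on the result
    Data preparation
    Data delivery
    Application complete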

    Partner institutions

    Datatang reserves the right of final interpretation of the research support data program.