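天气预报怎么说 (What does the weather forecast say?)
你好, 很高兴认识你 (Hello, nice to meet you.)
我明天过来接你 (I will come to pick you up tomorrow.)
为我们的合作成功干杯 (A toast to the success of our cooperation.)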
The MCCS dataset is the first large-scale Mandarin Chinese Cued Speech dataset. It covers 23 major categories of scenarios (e.g., communication, transportation, and shopping) and 72 subcategories (e.g., meeting, dating, and introduction). It was recorded by four skilled native Mandarin Chinese Cued Speech cuers using the cameras of mobile phones. The Cued Speech videos are recorded at 30 fps with a resolution of 1280x720. We provide the raw Cued Speech videos, a text file (with 1,000 sentences), and the corresponding annotations, which come in two kinds: continuous video annotations made with ELAN and discrete audio annotations made with Praat.
We are excited to announce the release of MCCSD-2024, an expanded version of the Mandarin Chinese Cued Speech Dataset.
This new version includes 7 native Chinese cuers and a total of 7,000 sentences (1,000 for each cuer), significantly increasing the diversity and scale of the dataset.
The dataset is divided into two main groups:
This new version aims to provide a more comprehensive resource for researchers studying Cued Speech, particularly in the context of cross-modal learning and accessibility technologies.
The inclusion of data from hearing-impaired individuals offers unique insights into the variations and adaptations in Cued Speech production.
The dataset maintains the same high-quality standards as the original MCCSD, with videos recorded at 30fps and 1280x720 resolution.
Annotations are provided in both ELAN and Praat formats, ensuring compatibility with a wide range of research tools.
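As a rough illustration of working with the ELAN annotations, the sketch below reads tiers from an ELAN `.eaf` file (which is XML) using only the Python standard library and maps annotation times to frame indices at the stated 30 fps. The tier name `hand_shape`, the labels, and the embedded sample document are hypothetical placeholders; the actual tier layout of the MCCSD annotation files is not described here and may differ.

```python
import xml.etree.ElementTree as ET

FPS = 30  # frame rate stated for the dataset videos


def read_eaf_tiers(eaf_xml: str) -> dict:
    """Return {tier_id: [(start_ms, end_ms, label), ...]} from ELAN .eaf XML text."""
    root = ET.fromstring(eaf_xml)
    # Map time-slot ids to millisecond values; unaligned slots carry no TIME_VALUE.
    slots = {
        ts.get("TIME_SLOT_ID"): int(ts.get("TIME_VALUE"))
        for ts in root.iter("TIME_SLOT")
        if ts.get("TIME_VALUE") is not None
    }
    tiers = {}
    for tier in root.iter("TIER"):
        segments = []
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            start = slots.get(ann.get("TIME_SLOT_REF1"))
            end = slots.get(ann.get("TIME_SLOT_REF2"))
            if start is None or end is None:
                continue  # skip annotations without aligned times
            label = ann.findtext("ANNOTATION_VALUE") or ""
            segments.append((start, end, label.strip()))
        tiers[tier.get("TIER_ID")] = sorted(segments)
    return tiers


def ms_to_frame(ms: int, fps: int = FPS) -> int:
    """Convert a time in milliseconds to a zero-based frame index."""
    return ms * fps // 1000


# Hypothetical minimal .eaf content; real files contain more metadata.
SAMPLE_EAF = """<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="0"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="400"/>
    <TIME_SLOT TIME_SLOT_ID="ts3" TIME_VALUE="900"/>
  </TIME_ORDER>
  <TIER TIER_ID="hand_shape">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1" TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>1</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a2" TIME_SLOT_REF1="ts2" TIME_SLOT_REF2="ts3">
        <ANNOTATION_VALUE>2</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>"""

if __name__ == "__main__":
    for tier_id, segments in read_eaf_tiers(SAMPLE_EAF).items():
        for start, end, label in segments:
            print(tier_id, label, ms_to_frame(start), ms_to_frame(end))
```

For a file on disk, the same function could be called with the file's text, for example `read_eaf_tiers(open(path, encoding="utf-8").read())`, where `path` points to one of the provided `.eaf` files.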
To access MCCSD-2024, please follow the same procedure as for the original dataset:
For any questions regarding MCCSD-2024, please feel free to contact us at the email addresses provided below.
The MCCSD contains 1,000 Mandarin Chinese Cued Speech sentences. We currently provide it only to universities and research institutions for research purposes. Please complete the following steps to obtain the dataset:
If you are interested in our CS generation work, check out the following links for more details: GitHub
If you use this MCCS dataset for your research, please consider citing the following papers:
If you have any questions about the dataset and our research works, please feel free to contact us:
Li Liu avrillliu@hkust-gz.edu.cn
Wentao Lei wentaolei@hkust-gz.edu.cn
Feel free to visit the homepage of Prof. Liu for more details about our group and research topics.
Special thanks to the Tencent Charity Venture Capital Program for its support.