After years of technical work, our research group has built a number of datasets for natural language processing, which we are happy to share with researchers in China and abroad. Download links for some of these datasets, along with the related papers, are listed below.

       1. Summarization Datasets

            <1> Dataset for Multimodal Summarization with Multimodal Output

                     Related paper: [EMNLP18]

                     Download links: [OneDrive Link]  [GoogleDrive Link]

            <2> Dataset for Multimodal Sentence Summarization

                     Related paper: [IJCAI18]

                     Download link: [GoogleDrive Link]

            <3> Dataset for Multimodal Summarization

                     Related paper: [EMNLP17]

                     Download link: [OneDrive Link]

            <4> Datasets for Cross-Lingual Summarization

                     Download link: [GitHub Link]

            <5> Datasets for Customer Service Dialogue Summarization

                     Download link: [GitHub Link]

            <6> Other resources

                     ROUGE: [How-to-install-and-use-official-ROUGE-toolkit]

 

       2. Sentiment Analysis Datasets

            <1> Dataset for Personalized Review Summarization

                     Related paper: [AAAI19]

                     Download link: [OneDrive Link]

            <2> Dataset for Document-level Multi-aspect Sentiment Classification

                     Related paper: [COLING18]

                     Download link: [OneDrive Link]

            <3> Dataset for Document-level Sentiment Classification

                     Related paper: [TALLIP18]

                     Download link: [OneDrive Link]

 

       3. Representation Learning Datasets

            <1> Dataset for Chinese Sentence Representation

                     Related paper: [EMNLP17]

                     Download link: [GitHub Link]

            <2> Dataset for Phrase Representation

                     Related paper: [TALLIP17]

                     Download link: [GitHub Link]

            <3> fMRI activation maps visualized for 672 representative words drawn from different categories of Tongyici Cilin (同义词词林)

                     Download link: [Baidu Link]   Password: gyco

       4. Speech Translation Dataset

                     Related paper: [INTERSPEECH19]

                     Download link: [Baidu Link]   Password: bva0


 

If you have any questions, please contact zlu@nlpr.ia.ac.cn