基于深度学习的方言识别系统实现
无需注册登录,支付后按照提示操作即可获取该资料.
基于深度学习的方言识别系统实现(任务书,开题报告,论文18000字)
摘要
在过去的2017年,人工智能已成为科技前沿技术,深度学习也逐渐成为一个热门话题,长期以来,我们在语音识别领域声学模型的建模通常都是使用GMM-HMM模型,该模型具有可靠的精度,并且有成熟的算法来进行参数训练,但因为GMM模型属于浅层模型,随着数据量增加它的建模能力明显不足。而深度神经网络(DNN)因其对复杂数据有更好的建模与学习能力,成为语音识别领域研究的热点。中国话将普通话作为官方语言, 但是各地区、各民族的方言种类众多,国内对于语音识别技术已日趋成熟,但是方言识别还甚少研究,针对方言的独特发音特点和其声学特征的明显差异,本设计拟研究一种基于深度学习的方言识别技术。本文深入研究了基于深度学习的语音识别,分析两个模型的优点以及不足,主要进行了以下工作:
(1)对基于隐马尔科夫模型(HMM)的语音识别算法进行深入研究,并使用MATLAB实现语音识别,对语音库存放的语音信号进行训练得到语言模型和声学模型。实验解码结果表明,在小词汇量汉语语音识别中,可以对语音信号进行正确识别。
(2)针对HMM模型的不足,对深度神经网络DNN深入研究,实现了大词方言词汇语音识别系统的构建,对方言语音进行DNN声学模型训练,DNN模型在小词汇量语音识别系统中具有比较好好的识别效果。
(3)噪声干扰一直是语音识别的难点,在进行声学模型训练的过程中,通过在训练和测试语音加入白噪声、汽车背景噪声、自助餐背景噪声进行DNN训练,并与多种模型对比,可以用于提高恢复噪声损坏的输入。
关键词:语音识别方言识别深度学习 DNN
Abstract
In the past 2017, artificial intelligence has become a cutting-edge technology, and deep learning has gradually become a hot topic. For a long time, the modeling of acoustic models in the field of speech recognition has always used the GMM-HMM model, which is reliable. Accuracy, and there are mature algorithms for parameter training, but because the GMM model is a shallow model, its modeling ability is obviously insufficient with increasing data volume. Deep neural network (DNN) has become a hot spot in the field of speech recognition because of its better modeling and learning ability for complex data. Mandarin speaks Putonghua as the official language. However, there are many kinds of dialects in different regions and nationalities. Voice recognition technology has matured in China. However, dialect recognition is rarely researched. The differences between the distinctive pronunciation characteristics of dialects and their acoustic characteristics are obvious. This design plans to study a dialect recognition technology based on deep learning. This paper deeply studies the speech recognition based on deep learning, analyzes the advantages and disadvantages of the two models, and mainly performs the following work:
(1) In-depth research on speech recognition algorithms based on Hidden Markov Models (HMM), and using MATLAB to achieve speech recognition, the voice signals placed in the voice inventory are trained to obtain language models and acoustic models. The experimental decoding results show that in the small vocabulary Chinese speech recognition, the speech signal can be correctly identified.
(2) According to the deficiency of the HMM model, the deep neural network DNN is deeply studied, and the construction of the vocabulary speech recognition system of the big vocabulary is realized. The DNN acoustic model is trained on the dialect speech, and the DNN model is compared in the small vocabulary speech recognition system. Good recognition effect.
(3) Noise disturbance has always been a difficult point in speech recognition. In the process of acoustic model training, DNN training is performed by adding white noise, car background noise, buffet background noise in training and test speech, and compared with various models. Used to improve the recovery of noise damage input.
Keywords: speech recognitiondialect recognitiondeep learningDNN
目 录
摘要 I
Abstract II
第1章 绪论 1
1.1 研究课题背景及研究的目的和意义 1
1.2 国内外的研究现状 2
1.3 论文组织结构 3
第2章 语音识别的基本原理 4
2.1 语音信号的预处理 4
2.1.1 预加重 5
2.1.2 分帧 6
2.1.3 加窗 6
2.1.4 端点检测 7
2.2 语音的特征提取 9
2.3 本章小结 10
第3章 识别方法 11
3.1 动态时间规整 11
3.1.1 DTW概述 11
3.1.2 动态时间规整DTW 13
3.1.3 DTW在语音中的运用 14
3.2 隐马尔科夫模型 14
3.2.1 HMM基本概念 14
3.2.2 HMM的三个基本问题和解决方案 16
3.2.3 基于HMM的孤立词语音识别系统 17
3.3 深度学习理论 18
3.3.1 深度神经网络声学模型 18
3.3.2 声学模型网络结构 19
3.4 前馈全连接深度神经网络介绍-DNN训练算法 19
3.5 各方法性能比较分析 19
3.6 本章小结 20
第4章 基于深度学习的方言识别系统设计 21
4.1 系统需求分析 21
4.2 开发环境 21
4.3 方法集成与功能实现 21
4.4 系统测试 23
4.5 本章小结 25
第5章 总结与展望 26
5.1 课题工作总结 26
5.2 展望 26
参考文献 27
附录A 28
附录B 31
致 谢 33