Sunday, June 14, 2015

Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups

Hinton, Geoffrey, et al.


Summary

This article introduces many techniques for speech recognition using deep neural networks. Some of them seem specialized and rarely appear in other domains.

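A Restricted Boltzmann Machine (RBM) is the model used in generative pre-training. For binary visible units v and binary hidden units h, with visible biases a_i, hidden biases b_j, and weights w_{ij}, the energy function is given by

$$E(\mathbf{v}, \mathbf{h}) = -\sum_{i \in \text{visible}} a_i v_i - \sum_{j \in \text{hidden}} b_j h_j - \sum_{i,j} v_i h_j w_{ij}$$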
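To make the energy function concrete, here is a minimal sketch in plain Python. It assumes the standard binary RBM energy, E(v, h) = -(sum of a_i v_i) - (sum of b_j h_j) - (sum of v_i h_j w_ij). The function name `rbm_energy` and the list-of-lists weight layout `W[i][j]` are my own choices for illustration, not something taken from the paper.

```python
def rbm_energy(v, h, W, a, b):
    """Energy of a joint configuration (v, h) of a binary RBM.

    v: visible states, h: hidden states,
    W[i][j]: weight between visible unit i and hidden unit j (assumed layout),
    a: visible biases, b: hidden biases.
    """
    visible_term = sum(a_i * v_i for a_i, v_i in zip(a, v))
    hidden_term = sum(b_j * h_j for b_j, h_j in zip(b, h))
    interaction = sum(
        v[i] * h[j] * W[i][j]
        for i in range(len(v))
        for j in range(len(h))
    )
    # Lower energy corresponds to a more probable configuration.
    return -(visible_term + hidden_term + interaction)
```

For example, with two visible units, one hidden unit, v = [1, 0], h = [1], a = [0.5, 0.5], b = [1.0], and W = [[2.0], [3.0]], the three terms are 0.5, 1.0, and 2.0, so the energy is -3.5.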

Contrastive Divergence (CD) is an efficient learning procedure for RBMs.
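As a rough illustration of how one CD step works, the sketch below performs a single CD-1 update for a binary RBM on one training vector. It is an assumption-laden toy version, not the authors' code: the helper names (`hidden_probs`, `visible_probs`, `cd1_update`), the learning rate, and the choice to use probabilities rather than sampled states for the reconstruction are all mine.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probs(v, W, b):
    # p(h_j = 1 | v) = sigmoid(b_j + sum_i v_i * w_ij)
    return [
        sigmoid(b[j] + sum(v[i] * W[i][j] for i in range(len(v))))
        for j in range(len(b))
    ]


def visible_probs(h, W, a):
    # p(v_i = 1 | h) = sigmoid(a_i + sum_j h_j * w_ij)
    return [
        sigmoid(a[i] + sum(h[j] * W[i][j] for j in range(len(h))))
        for i in range(len(a))
    ]


def cd1_update(v0, W, a, b, lr, rng):
    """One CD-1 step on a single data vector v0; updates W, a, b in place."""
    # Positive phase: hidden probabilities driven by the data.
    ph0 = hidden_probs(v0, W, b)
    h0 = [1.0 if rng.random() < p else 0.0 for p in ph0]
    # Negative phase: one step of reconstruction from sampled hidden states.
    v1 = visible_probs(h0, W, a)
    ph1 = hidden_probs(v1, W, b)
    # Update: <v h>_data - <v h>_reconstruction
    for i in range(len(a)):
        for j in range(len(b)):
            W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
        a[i] += lr * (v0[i] - v1[i])
    for j in range(len(b)):
        b[j] += lr * (ph0[j] - ph1[j])
    return v1


# Toy usage: 4 visible units, 2 hidden units, small random weights.
rng = random.Random(0)
W = [[rng.gauss(0.0, 0.01) for _ in range(2)] for _ in range(4)]
a = [0.0] * 4
b = [0.0] * 2
reconstruction = cd1_update([1.0, 1.0, 0.0, 0.0], W, a, b, lr=0.1, rng=rng)
```

In practice the update is averaged over a mini-batch and usually combined with momentum and weight decay; this sketch leaves those out to keep the core idea visible.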

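To model real-valued data such as acoustic features, the binary visible units are replaced by linear units with Gaussian noise, giving a Gaussian-Bernoulli RBM (GRBM). With \sigma_i denoting the standard deviation of the Gaussian noise for visible unit i, the energy function is given by

$$E(\mathbf{v}, \mathbf{h}) = \sum_{i \in \text{visible}} \frac{(v_i - a_i)^2}{2\sigma_i^2} - \sum_{j \in \text{hidden}} b_j h_j - \sum_{i,j} \frac{v_i}{\sigma_i} h_j w_{ij}$$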

We can stack RBMs to make a Deep Belief Network (DBN).

A DNN that is pre-trained generatively as a DBN is called a DBN-DNN.
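The idea of reusing a stack of pre-trained RBMs as a feed-forward network can be sketched as follows. This assumes each RBM contributes its weights and hidden biases as one sigmoid layer, and that a softmax output layer is added on top before discriminative fine-tuning. The function names and the `(W, bias)` tuple layout are hypothetical, and the fine-tuning step itself (backpropagation) is not shown.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(z):
    # Subtract the max for numerical stability.
    m = max(z)
    exps = [math.exp(x - m) for x in z]
    total = sum(exps)
    return [e / total for e in exps]


def affine(x, W, bias):
    # W[i][j] connects input i to output j (assumed layout).
    return [
        bias[j] + sum(x[i] * W[i][j] for i in range(len(x)))
        for j in range(len(bias))
    ]


def dbn_dnn_forward(x, hidden_layers, output_layer):
    """Forward pass through sigmoid layers initialized from stacked RBMs.

    hidden_layers: list of (W, hidden_bias) pairs, one per pre-trained RBM.
    output_layer: (W_out, b_out) for the added softmax layer.
    """
    activation = x
    for W, bias in hidden_layers:
        activation = [sigmoid(z) for z in affine(activation, W, bias)]
    W_out, b_out = output_layer
    return softmax(affine(activation, W_out, b_out))


# Toy usage with all-zero parameters: 3 inputs -> 2 hidden -> 2 classes.
hidden_layers = [([[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])]
output_layer = ([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
posteriors = dbn_dnn_forward([0.2, -1.0, 0.7], hidden_layers, output_layer)
```

With all-zero parameters every class receives the same score, so the toy output is a uniform distribution over the two classes. With real pre-trained weights, the output would be the starting point that fine-tuning then improves.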

The article also compares results from the Microsoft, Google, and IBM research groups.



This article is an important introduction to speech recognition using DNNs. Several of the techniques appear to have been used in this field for a long time, but I have not fully grasped them yet. Still, the paper is worth studying carefully.










