Wolter, Moritz: Frequency Domain Methods in Recurrent Neural Networks for Sequential Data Processing. - Bonn, 2021. - Dissertation, Rheinische Friedrich-Wilhelms-Universität Bonn.
Online-Ausgabe in bonndoc: https://nbn-resolving.org/urn:nbn:de:hbz:5-63361
@phdthesis{handle:20.500.11811/9245,
urn = {https://nbn-resolving.org/urn:nbn:de:hbz:5-63361},
author = {Wolter, Moritz},
title = {Frequency Domain Methods in Recurrent Neural Networks for Sequential Data Processing},
school = {Rheinische Friedrich-Wilhelms-Universität Bonn},
year = 2021,
month = jul,

note = {Machine learning algorithms now make it possible for computers to solve problems that were long thought impossible to automate. Neural speech processing, convolutional neural networks, and other recent advances are powered by frequency-domain methods like the fast Fourier transform (FFT).
This cumulative thesis presents applications of frequency-domain methods in recurrent machine learning. It starts by exploring the combination of the short-time Fourier transform (STFT) and recurrent neural networks. This combination allows faster training through windowing and end-to-end optimization of the window function, while low-pass filtering the Fourier coefficients can reduce the model size. Fourier coefficients are complex numbers and are therefore best processed in $\mathbb{C}$. The development of a complex recurrent memory cell is an additional contribution of this text. Moving a modern RNN cell into the complex domain requires various design choices regarding the gating mechanism, the state transition matrix, and the activation functions. The design process introduces a new complex gate activation function, the modSigmoid. Afterwards, we explore the interplay of state transition matrices and cell activation functions and confirm that unbounded non-linearities require unitary or orthogonal state transition matrices to remain stable.
General-purpose machine learning models often produce blurry video predictions. Working with the phase of frames in their frequency-domain representation makes it possible to do better. Image registration methods allow the extraction of transformation parameters between frames. For single pre-segmented objects in input video frames, modifying the phase accordingly can help to predict future frames.
The FFT represents all inputs in the same fixed Fourier basis. The fast wavelet transform (FWT), in contrast, works with infinitely many wavelets, any of which can serve as a basis. This text proposes a loss function that allows wavelet optimization and integrates the FWT into convolutional and recurrent neural networks. Replacing dense linear weight matrices with sparse diagonal matrices and fast wavelet transforms allows substantial parameter reductions, in some cases without loss of performance. Finally, the last chapter finds that wavelet quantization can reduce the memory required to store and transmit a convolutional neural network.},

url = {https://hdl.handle.net/20.500.11811/9245}
}
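
The STFT-plus-RNN combination described in the abstract can be illustrated with a short preprocessing sketch. The following Python code is a minimal, hypothetical example (the function name, window size, hop, and number of retained bins are illustrative choices, not the thesis' settings): it windows a signal with a Hann window, transforms each window, and keeps only the lowest-frequency coefficients, which shortens the sequence an RNN must unroll over and shrinks its per-step input.

import numpy as np

def stft_lowpass_features(signal, window_size=64, hop=32, keep=16):
    # Hann-windowed short-time Fourier transform; windowing trades
    # sequence length for feature dimension, so the RNN takes fewer,
    # larger steps over the signal.
    window = np.hanning(window_size)
    n_frames = 1 + (len(signal) - window_size) // hop
    frames = np.stack([signal[i * hop : i * hop + window_size] * window
                       for i in range(n_frames)])
    spectra = np.fft.rfft(frames, axis=-1)  # complex Fourier coefficients
    return spectra[:, :keep]                # low-pass: drop high-frequency bins

x = np.sin(np.linspace(0, 200 * np.pi, 4096))
feats = stft_lowpass_features(x)
print(feats.shape)  # (127, 16): a short complex-valued sequence for an RNN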
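
The complex gate activation mentioned in the abstract can be sketched as follows. This is one plausible form, assuming the modSigmoid squashes a weighted combination of the real and imaginary parts of a complex pre-activation through a real sigmoid; the weights alpha and beta stand in for trainable parameters, and their values here are illustrative.

import numpy as np

def sigmoid(x):
    # Real-valued logistic sigmoid.
    return 1.0 / (1.0 + np.exp(-x))

def mod_sigmoid(z, alpha=0.5, beta=0.5):
    # Map a complex pre-activation z to a real gate value in (0, 1).
    # alpha and beta are assumed trainable weights (illustrative values).
    return sigmoid(alpha * z.real + beta * z.imag)

z = np.array([1 + 2j, -0.5 - 1j, 0.3 + 0j])  # gate pre-activations
h = np.array([2 - 1j, 1 + 1j, -1 + 0.5j])    # complex hidden state
print(mod_sigmoid(z) * h)  # real gate scales the complex state element-wise

A real-valued gate sidesteps the question of what a "complex gate" between zero and one would mean, while still letting both components of the pre-activation influence the gating decision.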
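
For the video-prediction paragraph, classic phase correlation is one image registration method that extracts a translation from the phase of two frames. The NumPy sketch below (with a hypothetical function name) recovers an integer shift via the Fourier shift theorem; the thesis' phase-based prediction is more general, so this is only a minimal illustration.

import numpy as np

def phase_correlation_shift(frame_a, frame_b):
    # The normalized cross-power spectrum of a shifted image pair is a
    # pure phase ramp; its inverse FFT peaks at the translation.
    cross = np.fft.fft2(frame_b) * np.conj(np.fft.fft2(frame_a))
    cross /= np.abs(cross) + 1e-12  # discard magnitude, keep phase only
    corr = np.fft.ifft2(cross).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    h, w = frame_a.shape
    if dy > h // 2:  # unwrap to signed shifts
        dy -= h
    if dx > w // 2:
        dx -= w
    return dy, dx

rng = np.random.default_rng(0)
a = rng.random((32, 32))
b = np.roll(a, shift=(3, 5), axis=(0, 1))  # frame moved by (3, 5) pixels
print(phase_correlation_shift(a, b))       # -> (3, 5)

Applying the recovered shift to the last observed frame, for example by multiplying its spectrum with the corresponding phase ramp, extrapolates the motion without the blur that pixel-space regression tends to produce.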
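
Finally, the parameter-reduction idea, replacing a dense weight matrix with diagonal matrices and fast wavelet transforms, can be made concrete with a toy example. The sketch below fixes the Haar wavelet for simplicity, whereas the thesis optimizes the wavelet itself via the proposed loss; the layer structure used here (a single diagonal between a forward and an inverse transform) is an illustrative assumption, not the thesis' exact architecture.

import numpy as np

def haar_fwt(x):
    # One level of the fast wavelet transform: O(n) work instead of the
    # O(n^2) of a dense matrix-vector product.
    even, odd = x[0::2], x[1::2]
    return (even + odd) / np.sqrt(2.0), (even - odd) / np.sqrt(2.0)

def haar_ifwt(approx, detail):
    # Inverse single-level Haar transform (perfect reconstruction).
    x = np.empty(2 * approx.size)
    x[0::2] = (approx + detail) / np.sqrt(2.0)
    x[1::2] = (approx - detail) / np.sqrt(2.0)
    return x

def structured_layer(x, d):
    # Stand-in for a dense layer: a trainable diagonal d applied in the
    # wavelet domain, i.e. n parameters instead of n^2.
    approx, detail = haar_fwt(x)
    coeffs = np.concatenate([approx, detail]) * d
    return haar_ifwt(coeffs[:approx.size], coeffs[approx.size:])

x = np.arange(8, dtype=float)
print(np.allclose(structured_layer(x, np.ones(8)), x))  # True: identity for d = 1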

The following license files are associated with this item:

Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)