# Simplified minimal gated unit variations for recurrent neural networks

@article{Heck2017SimplifiedMG,
  title   = {Simplified minimal gated unit variations for recurrent neural networks},
  author  = {Joel Heck and Fathi M. Salem},
  journal = {2017 IEEE 60th International Midwest Symposium on Circuits and Systems (MWSCAS)},
  year    = {2017},
  pages   = {1593-1596}
}

Recurrent neural networks with various types of hidden units have been used to solve a diverse range of problems involving sequence data. Two of the most recent proposals, gated recurrent units (GRU) and minimal gated units (MGU), have shown comparable promising results on example public datasets. In this paper, we introduce three model variants of the minimal gated unit which further simplify that design by reducing the number of parameters in the forget-gate dynamic equation. These three…
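As a rough sketch of the idea in the abstract, the code below implements one MGU step together with three progressively simplified forget-gate variants. The variant names (MGU1–MGU3) and the exact equations are reconstructed from the common MGU formulation and are an assumption here, not a verbatim transcription of the paper; check the original for the authoritative definitions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mgu_step(x_t, h_prev, Wf, Uf, bf, Wh, Uh, bh, variant="MGU"):
    """One step of a minimal gated unit, with simplified forget-gate variants.

    The simplifications drop terms from the forget-gate equation, reducing
    the parameter count while keeping the candidate-state and update
    equations unchanged.
    """
    if variant == "MGU":      # full gate: input, recurrent, and bias terms
        f = sigmoid(Wf @ x_t + Uf @ h_prev + bf)
    elif variant == "MGU1":   # drop the input term W_f x_t
        f = sigmoid(Uf @ h_prev + bf)
    elif variant == "MGU2":   # drop the bias as well
        f = sigmoid(Uf @ h_prev)
    elif variant == "MGU3":   # keep only the bias
        f = sigmoid(bf)
    else:
        raise ValueError(f"unknown variant: {variant}")
    # Candidate state and convex-combination update, shared by all variants.
    h_cand = np.tanh(Wh @ x_t + Uh @ (f * h_prev) + bh)
    return (1.0 - f) * h_prev + f * h_cand
```

Because the update is a convex combination of the previous state and a tanh-bounded candidate, each component of the new state stays within `max(|h_prev|, 1)` for every variant.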

#### 41 Citations

Radically Simplifying Gated Recurrent Architectures Without Loss of Performance

- Computer Science
- 2019 IEEE International Conference on Big Data (Big Data)
- 2019

This study demonstrates that it is possible to radically simplify the MGU without significant loss of performance for some tasks and datasets, and an extraordinarily simple Forget Gate architecture performs just as well as an MGU on the given task.

Gates are not what you need in RNNs

- Computer Science
- ArXiv
- 2021

This paper proposes a new recurrent cell called Residual Recurrent Unit (RRU), which beats traditional cells and does not employ a single gate, based on the residual shortcut connection together with linear transformations, ReLU, and normalization.

A Review of Recurrent Neural Networks: LSTM Cells and Network Architectures

- Computer Science, Medicine
- Neural Computation
- 2019

The LSTM cell and its variants are reviewed to explore the learning capacity of the LSTM cell, and LSTM networks are divided into two broad categories: LSTM-dominated networks and integrated LSTM networks.

SLIM LSTMs

- Computer Science
- ArXiv
- 2018

This paper systematically introduces variants of the LSTM RNNs, referred to as SLIM LSTMs, which express aggressively reduced parameterizations to achieve computational saving and/or speedup in (training) performance, while necessarily retaining (validation accuracy) performance comparable to the standard LSTM RNN.

Performance of Three Slim Variants of The Long Short-Term Memory (LSTM) Layer

- Computer Science
- 2019 IEEE 62nd International Midwest Symposium on Circuits and Systems (MWSCAS)
- 2019

Computational analysis of the validation accuracy of a convolutional-plus-recurrent neural network architecture designed for sentiment analysis, comparing the standard LSTM layer with three Slim LSTM layers, finds that some realizations of the Slim LSTM layers can potentially perform as well as the standard LSTM layer for the considered architecture.

CARU: A Content-Adaptive Recurrent Unit for the Transition of Hidden State in NLP

- Computer Science
- ICONIP
- 2020

This article introduces a novel RNN unit inspired by GRU, namely the Content-Adaptive Recurrent Unit (CARU). The design of CARU contains all the features of GRU but requires fewer training…

Toponym Resolution with Deep Neural Networks

- 2017

Toponym resolution, i.e. inferring the geographic coordinates of a given string that represents a placename, is a fundamental problem in the context of several applications related…

Applying Machine Learning to the Task of Generating Search Queries

- Computer Science
- SSI
- 2020

In this paper we research two modifications of recurrent neural networks – Long Short-Term Memory networks and networks with Gated Recurrent Unit with the addition of an attention mechanism to both… Expand

Airborne particle pollution predictive model using Gated Recurrent Unit (GRU) deep neural networks

- Computer Science, Environmental Science
- Earth Science Informatics
- 2020

A forecasting model using gated recurrent unit (GRU) and long short-term memory (LSTM) networks, which are types of deep recurrent neural network (RNN), is presented, showing that this type of deep network is feasible for predicting the non-linearities of this type of data.

Gated Recurrent Networks for Video Super Resolution

- Computer Science
- 2020 28th European Signal Processing Conference (EUSIPCO)
- 2021

This work proposes a new Gated Recurrent Convolutional Neural Network for VSR, adapting some of the key components of a Gated Recurrent Unit, which outperforms current learning-based VSR models in terms of perceptual quality and temporal consistency.

#### References

SHOWING 1-10 OF 17 REFERENCES

Minimal gated unit for recurrent neural networks

- Computer Science, Engineering
- Int. J. Autom. Comput.
- 2016

This work proposes a gated unit for RNNs, named the minimal gated unit (MGU), since it contains only one gate, making it a minimal design among all gated hidden units.

Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling

- Computer Science
- ArXiv
- 2014

Advanced recurrent units that implement a gating mechanism, such as the long short-term memory (LSTM) unit and the recently proposed gated recurrent unit (GRU), are evaluated on sequence modeling; the GRU is found to be comparable to the LSTM.

An Empirical Exploration of Recurrent Network Architectures

- Computer Science
- ICML
- 2015

It is found that adding a bias of 1 to the LSTM's forget gate closes the gap between the LSTM and the recently introduced Gated Recurrent Unit (GRU) on some but not all tasks.
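The intuition behind the bias-of-1 trick can be sketched numerically: with weights near zero at initialization, the forget gate's output is roughly `sigmoid(bias)`, so the bias alone sets how much cell state survives each step early in training. The snippet below is an illustration of that arithmetic, not code from the cited paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# With near-zero weights, f ≈ sigmoid(bias). A bias of 0 gives f ≈ 0.5,
# so half the cell state is lost per step; a bias of 1 gives f ≈ 0.73,
# so far more state (and gradient signal) survives long time lags.
steps = 20
for bias in (0.0, 1.0):
    f = sigmoid(bias)
    retained = f ** steps  # fraction of initial cell state left after `steps`
    print(f"bias={bias}: f={f:.2f}, retained after {steps} steps ≈ {retained:.6f}")
```

After 20 steps, a bias of 1 retains roughly three orders of magnitude more of the initial cell state than a bias of 0, which is why the initialization helps before the gates have learned anything task-specific.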

Long Short-Term Memory

- Computer Science, Medicine
- Neural Computation
- 1997

A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.

Adam: A Method for Stochastic Optimization

- Computer Science, Mathematics
- ICLR
- 2015

This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.

On the Properties of Neural Machine Translation: Encoder–Decoder Approaches

- Computer Science, Mathematics
- SSST@EMNLP
- 2014

It is shown that the neural machine translation performs relatively well on short sentences without unknown words, but its performance degrades rapidly as the length of the sentence and the number of unknown words increase.

Gradient-based learning applied to document recognition

- Computer Science
- 1998

This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task, and convolutional neural networks are shown to outperform all other techniques.

Long short-term memory

- Neural Computation
- 1997

Reduced Parameterization in Gated Recurrent Neural Networks

- Memorandum 7.11.2016
- 2016

Keras: Theano-based deep learning library

- 2015