A C++ implementation of the long short-term memory recursive neural network model (LSTM-RNN) described in
[1] Phong Le and Willem Zuidema (2015). Compositional Distributional Semantics with Long Short Term Memory. In Proceedings of Joint Conference on Lexical and Computational Semantics (*SEM).
Written and maintained by Phong Le (p.le [at] uva.nl)
Note: this is similar to Tree-LSTM, which was proposed at the same time.
This package contains three components:
- `src` - the C++ source code of the LSTM-RNN,
- `data` - the Stanford Sentiment Treebank and GloVe word embeddings,
- `Release` - for compiling the source code.
- Install OpenBLAS at `/opt/OpenBLAS` (see the build sketch below).
- Go to `Release` and execute `make`. It should work with gcc 4.9 or later. (If OpenBLAS is not installed at `/opt/OpenBLAS`, replace `/opt/OpenBLAS` with the correct path in `makefile` and `src/subdir.mk`.)
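For reference, a minimal sketch of building OpenBLAS from source into `/opt/OpenBLAS` (assuming git, make, and a Fortran compiler are available; exact steps may vary by system):

```
git clone https://github.com/xianyi/OpenBLAS.git
cd OpenBLAS
make
sudo make install PREFIX=/opt/OpenBLAS
```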
The following instructions are for replicating the second experiment reported in [1]. Some small changes are needed for your own use cases.
- `data/trees` contains the train/dev/test files of the Stanford Sentiment Treebank (SST).
- `data/dic/glove-300d-840B` contains the 300-D GloVe word embeddings (vectors of words not in the SST are removed).
If you want to use other word embeddings, follow this format (see the sketch after this list):
- create `words.lst` containing all frequent words (e.g. words appearing at least 2 times), one word per line. The first line is always `#UNKNOWN#`.
- create `wembs.txt` containing word vectors. The first line is `<number-of-vectors> <dimension>`. Each following line is `<word> <vector>`. Vectors of words in `words.lst` but not in `wembs.txt` will be randomly initialised.
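As an illustration only (the values below are made up; real GloVe vectors would be 300-dimensional), a two-word vocabulary with 3-dimensional vectors could look like this. `words.lst`:

```
#UNKNOWN#
the
movie
```

and `wembs.txt`:

```
2 3
the 0.12 -0.05 0.33
movie -0.41 0.08 0.27
```

Here `#UNKNOWN#` has no entry in `wembs.txt`, so its vector would be randomly initialised.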
In `Release`, open `train.config`, which stores the default parameter values. It should look like:
```
dim 50
lambda 1e-3
lambdaL 1e-3
lambdaC 1e-3
dropoutRate 0
normGradThresh 1e10
learningRateDecay 0.
paramLearningRate 0.01
wembLearningRate 0.01
classLearningRate 0.01
evalDevStep 1
maxNEpoch 10
batchSize 5
nThreads 1
dataDir ../data/trees/
dicDir ../data/dic/glove-300d-840B/
compositionType LSTM
functionType tanh
```
You can try the traditional RNN by setting `compositionType NORMAL`, and try other activation functions by setting `functionType <NAME>`, where `<NAME>` can be `softsign`, `sigmoid`, or `rlu`.
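For example, to train a traditional RNN with the softsign activation, a sketch of the two lines to change in `train.config` (all other lines stay as above):

```
compositionType NORMAL
functionType softsign
```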
Execute

```
./lstm-rnn --train train.config model.dat
```

The resulting model will be stored in `model.dat`.
To test a trained model, execute

```
./lstm-rnn --test <test-file> <model-file>
```
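For instance, assuming the SST test split is stored as `../data/trees/test.txt` (the exact file name is an assumption; check the contents of `data/trees`), testing the model trained above could look like:

```
./lstm-rnn --test ../data/trees/test.txt model.dat
```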