This is a fork of the TensorFlow implementation of the model proposed in Real-Time Sign Language Detection using Human Pose Estimation, published in SLRTP 2020.
In this fork, we add data loading and training for Holistic pose estimation.
This model is used in the Real-Time Sign Language Detection for Videoconferencing demo published in ECCV 2020.
This repository includes pre-trained models for both Python and JavaScript.
You can use the included models to perform inference or fine-tuning.
To load a model in Python, use tensorflow.python.keras.models.load_model('models/py/model.h5').
To load a model in the browser, use tf.loadLayersModel('models/js/model.json') from tfjs.
You can use the train.py script to train the model from scratch using a tfrecord dataset file.
python -m train --dataset_path="data.tfrecord" --device="/GPU:0"
The dataset is represented as a tfrecord file where each video has 4 properties:
fps: Int64List - the framerate of the video
pose_data: BytesList - human pose estimation, as a tensor of the shape (frames, 1, points, dimensions)
pose_confidence: BytesList - human pose estimation confidence, as a tensor of the shape (frames, 1, points)
is_signing: BytesList - a bytes object representing whether the user was signing or not in every frame
Please see examples/create_tfrecord.py for an example of creating this record.
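As a rough illustration of the four properties listed above, the sketch below checks the documented shapes and packs one video into a tf.train.Example. It is written independently of examples/create_tfrecord.py and may differ from it. The helper names check_video_shapes and video_to_example are hypothetical, and two details are assumptions rather than facts from this repository: that the pose tensors are serialized with tf.io.serialize_tensor, and that is_signing is stored as one byte per frame (0 or 1).

```python
from typing import Any, Sequence, Tuple


def check_video_shapes(
    pose_shape: Sequence[int],
    confidence_shape: Sequence[int],
    num_labels: int,
) -> Tuple[int, int, int]:
    """Validate the documented shapes and return (frames, points, dimensions)."""
    if len(pose_shape) != 4 or pose_shape[1] != 1:
        raise ValueError(
            f"pose_data must be (frames, 1, points, dimensions), got {tuple(pose_shape)}"
        )
    frames, _, points, dimensions = pose_shape
    if tuple(confidence_shape) != (frames, 1, points):
        raise ValueError(
            f"pose_confidence must be {(frames, 1, points)}, got {tuple(confidence_shape)}"
        )
    if num_labels != frames:
        raise ValueError(f"is_signing needs {frames} per-frame labels, got {num_labels}")
    return frames, points, dimensions


def video_to_example(
    fps: int,
    pose_data: Any,            # NumPy array shaped (frames, 1, points, dimensions)
    pose_confidence: Any,      # NumPy array shaped (frames, 1, points)
    is_signing: Sequence[int], # one 0/1 label per frame
):
    """Pack one video into a tf.train.Example (a sketch, not the repository's code)."""
    # Imported lazily so the shape check above can be used without TensorFlow.
    import tensorflow as tf

    check_video_shapes(pose_data.shape, pose_confidence.shape, len(is_signing))

    def bytes_feature(value: bytes):
        return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

    features = {
        "fps": tf.train.Feature(int64_list=tf.train.Int64List(value=[int(fps)])),
        # Assumption: tensors are stored with tf.io.serialize_tensor.
        "pose_data": bytes_feature(tf.io.serialize_tensor(pose_data).numpy()),
        "pose_confidence": bytes_feature(tf.io.serialize_tensor(pose_confidence).numpy()),
        # Assumption: one byte per frame, 1 while signing and 0 otherwise.
        "is_signing": bytes_feature(bytes(int(flag) for flag in is_signing)),
    }
    return tf.train.Example(features=tf.train.Features(feature=features))
```

The returned message could then be written to a file with tf.io.TFRecordWriter by passing example.SerializeToString() to the writer's write method. Running the shape check before serialization catches a missing singleton axis early, before the record reaches training.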
The provided models were trained on the Public DGS Corpus.
To create the data files using the Public DGS Corpus, TODO
@inproceedings{moryossef2020sign,
title={Real-Time Sign Language Detection using Human Pose Estimation},
author={Amit Moryossef and Ioannis Tsochantaridis and Roee Aharoni and Sarah Ebling and S. Narayanan},
booktitle={SLRTP},
year={2020},
}
# If you are using the Public DGS Corpus
@inproceedings{hanke2020extending,
title={{E}xtending the {P}ublic {DGS} {C}orpus in Size and Depth},
author={Hanke, Thomas and Schulder, Marc and Konrad, Reiner and Jahn, Elena},
booktitle={Proceedings of the LREC2020 9th Workshop on the Representation and Processing of Sign Languages: Sign Language Resources in the Service of the Language Community, Technological Challenges and Application Perspectives},
pages={75--82},
year={2020}
}