NLP
Toolkit for training/converting LibreTranslate-compatible language models 🚂
Free and Open Source Machine Translation API. Self-hosted, offline capable and easy to set up.
Training open neural machine translation models
Open Source Neural Machine Translation and (Large) Language Models in PyTorch
Open-source offline translation library written in Python
A machine translation reading list maintained by Tsinghua Natural Language Processing Group
Open Source Neural Machine Translation in Torch (deprecated)
Toolkit for Machine Learning, Natural Language Processing, and Text Generation, in TensorFlow. This is part of the CASL project: http://casl-project.ai/
Unsupervised Word Segmentation for Neural Machine Translation and Text Generation
A modular RL library to fine-tune language models to human preferences
Neural machine translation and sequence learning using TensorFlow
Multilingual word vectors in 78 languages
Open-Source Neural Machine Translation in TensorFlow
An open-source neural machine translation toolkit developed by Tsinghua Natural Language Processing Group
Paper list of simultaneous translation / streaming translation, including text-to-text machine translation and speech-to-text translation.
Natural Language Processing Pipeline - Sentence Splitting, Tokenization, Lemmatization, Part-of-speech Tagging and Dependency Parsing
Open neural machine translation models and web services
Whisper command-line client based on CTranslate2, compatible with the original OpenAI client.
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
Text compression for generating keyboard expansions
Hausa-NMT: Empirical Study of Neural Machine Translation for English-Hausa-English
AfriSenti-SemEval Shared Task 12: Sentiment Analysis for African languages: https://afrisenti-semeval.github.io/
Facebook Low Resource (FLoRes) MT Benchmark
Fast inference engine for Transformer models
Generate transcripts for audio and video content with a user-friendly UI, powered by OpenAI's Whisper, with automatic translations and automatic video downloads through yt-dlp integration
Obtain Word Alignments using Pretrained Language Models (e.g., mBERT)
A collection of links and notes on forced alignment tools
RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry