Fast and portable character string processing in R (with the Unicode ICU)
Updated
Jul 11, 2024 - C++
A minimalist single-header library for building pattern-matchers, lexers, and parsers.
Colibri core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipgrams (i.e. patterns with one or more gaps, either of fixed or dynamic size) in a quick and memory-efficient way. At the core is the tool ``colibri-patternmodeller`` which allows you to build, view, manipulate a…
Frog is an integration of memory-based natural language processing (NLP) modules developed for Dutch. All NLP modules are based on Timbl, the Tilburg memory-based learning software package.
A large scale feature extraction tool for text-based machine learning
A repository to bind mecab for Python 3.5+. Not using swig nor pybind. (Not Maintained Now)
mime is a scripting tool for text processing, inspired by Emacs Keyboard Macros.
A Graphics Library that renders in text mode
Yet another way to type Amharic on a standard English keyboard.
A Regex📋 implementation in C++ using Thompson's NFA algorithm
Numero is a library for converting between Arabic numbers and their English numeral
UNIX line counting utilities
C++ ASCII not poisonous parsing library
OOP based PWG-DQ User Interface (CLI) Development in Python
Transforms a list of documents to the input accepted by pisa-engine
Example of cleaning the text-file for unreasonable symbols
Functions for working with Russian numerals
A C++ program that can provide a Google-like summary of a document given a list of positions of words and phrases to highlight.
New version of the specs pipeline stage based on what's in current CMS pipelines