A Modern C++ Data Sciences Toolkit
Updated Apr 17, 2023 - C++
Porter stemming library (C++)
Plagiarism detector in C++ for naive text file matching
A C++ implementation of the Aho-Corasick automaton, which can be applied to full-text indexing
RcppJagger is a wrapper package for Jagger
The app finds and lists the 10 longest letter combinations, along with their frequency differences.
A text analysis tool for PDF files.
Example of cleaning a text file of unwanted symbols
MinHash text analyzer developed for an Algorithmics course.
A C++ project implementing a self-balancing AVL tree for efficient word frequency counting. This program analyzes text files, finds unique words, tracks word occurrences, and prints results in alphabetical order for ease of viewing.
A case study for a word search application in text
A command-line tool that analyzes your text for expressions that are undesirable in academic writing.
Markov chain N-gram text generator designed to run fast for large N. The goal is fast generation with 6-grams or longer.
Frequency dictionary implementation based on custom hashtable
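Several of the projects listed above count word frequencies and print them in alphabetical order. The following is a minimal sketch of that idea, not code from any listed repository. It assumes words are runs of ASCII letters compared case-insensitively, and it swaps in `std::map` (an ordered container, typically a red-black tree) in place of a custom AVL tree or hashtable. The name `count_words` is hypothetical.

```cpp
#include <cctype>
#include <map>
#include <string>

// Hypothetical helper: count how often each word occurs in `text`.
// Assumption: a word is a maximal run of alphabetic characters, lowercased.
// std::map keeps its keys sorted, so iterating the result is alphabetical.
std::map<std::string, int> count_words(const std::string& text) {
    std::map<std::string, int> freq;
    std::string word;
    for (unsigned char c : text) {
        if (std::isalpha(c)) {
            // Extend the current word with the lowercased letter.
            word += static_cast<char>(std::tolower(c));
        } else if (!word.empty()) {
            // A non-letter ends the current word: record it.
            ++freq[word];
            word.clear();
        }
    }
    // Record a trailing word that ends exactly at the end of the input.
    if (!word.empty()) {
        ++freq[word];
    }
    return freq;
}
```

Iterating over the returned map visits the words in alphabetical order, so printing each key with its count gives a sorted frequency listing. A hashtable-based dictionary would trade that ordering for faster average lookups.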