Stars
Code for ACL paper "Zero-Shot Text Classification via Self-Supervised Tuning"
Easy to use, state-of-the-art Neural Machine Translation for 100+ languages
Things you can do with the token embeddings of an LLM
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.
Questions? Contact me at @DhruvAtreja1
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
Empowering RAG with a memory-based data interface for all-purpose applications!
Trained models & code to predict toxic comments on all 3 Jigsaw Toxic Comment Challenges. Built using ⚡ PyTorch Lightning and 🤗 Transformers. For access to our API, please email us at contact@unita…
Notebooks for training universal 0-shot classifiers on many different tasks
Experiments with several semantic matching models and a comparison of their results.
🧑‍🚀 Summary of the world's best LLM resources.
A Large-scale Chinese Short-Text Conversation Dataset and Chinese pre-training dialog models
Repository for the paper "Large Language Model-Based Agents for Software Engineering: A Survey".
Code implementation of synthetic continued pretraining
A webpage proxy that sends requests through Chromium (Puppeteer). It can be used to bypass Cloudflare anti-bot / anti-DDoS protection from any application (like curl)
🌦️ A catalogue and categorization of AI-based weather forecasting models.
A standalone version of the readability lib
Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Wo…
Analysis of Bot Protection systems with available countermeasures 🚿. How to defeat anti-bot systems 👻 and get around browser fingerprinting scripts 🕵️‍♂️ when scraping the web?
A multithreaded 🕸️ web crawler that recursively crawls a website and creates a 🔽 markdown file for each page, designed for LLM RAG
[Survey] Awesome List of Mixup Augmentation and Beyond (https://arxiv.org/abs/2409.05202)