⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡
Updated Oct 8, 2024 - Python
Large-scale LLM inference engine
Official Implementation of EAGLE-1 (ICML'24) and EAGLE-2 (EMNLP'24)
Scalable and robust tree-based speculative decoding algorithm
Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024
[COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding
REST: Retrieval-Based Speculative Decoding, NAACL 2024
[NeurIPS'23] Speculative Decoding with Big Little Decoder
Code for our paper "Speculative Decoding: Exploiting Speculative Execution for Accelerating Seq2seq Generation" (EMNLP 2023 Findings)
SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration
Implementation of the paper "Fast Inference from Transformers via Speculative Decoding" (Leviathan et al., 2023)
Minimal C implementation of speculative decoding based on llama2.c
PipeInfer: Accelerating LLM Inference using Asynchronous Pipelined Speculation
Accelerating LLM inference with techniques like speculative decoding, quantization, and kernel fusion, focusing on implementing state-of-the-art research papers.
Dynasurge: Dynamic Tree Speculation for Prompt-Specific Decoding
Verification of the effect of speculative decoding in Japanese.
Implementation of Speculative Sampling in "Accelerating Large Language Model Decoding with Speculative Sampling"
Unofficial implementation of Token Recycling self-speculative decoding method.
Reproducibility Project for [NeurIPS'23] Speculative Decoding with Big Little Decoder