jerad fields jeradf

Starred repositories

geealbers / purple-states

Traditional U.S. electoral maps not only illustrate polarization, they can exacerbate it. No state is strictly red or blue; they are all shades of purple.

JavaScript 46 4 Updated Nov 9, 2024

KellerJordan / modded-nanogpt

NanoGPT (124M) in 5 minutes

Python 1,517 135 Updated Nov 25, 2024

microsoft / autogen

A programming framework for agentic AI 🤖

Jupyter Notebook 34,804 5,037 Updated Nov 25, 2024

facebookresearch / spiritlm

Inference code for the paper "Spirit-LM Interleaved Spoken and Written Language Model".

Python 812 52 Updated Oct 28, 2024

huggingface / optimum

🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy-to-use hardware optimization tools

Python 2,586 469 Updated Nov 25, 2024

vocodedev / vocode-core

🤖 Build voice-based LLM agents. Modular + open source.

Python 2,936 494 Updated Nov 15, 2024

PaddlePaddle / PaddleSlim

PaddleSlim is an open-source library for deep model compression and architecture search.

Python 1,564 345 Updated Nov 20, 2024

SforAiDl / KD_Lib

A PyTorch Knowledge Distillation library for benchmarking and extending works in the domains of Knowledge Distillation, Pruning, and Quantization.

Python 609 58 Updated Mar 1, 2023

modal-labs / quillman

A voice chat app

Python 1,072 122 Updated Nov 15, 2024

Nkluge-correa / TeenyTinyLlama

A pair of tiny foundational models trained in Brazilian Portuguese.🦙🦙

Python 26 5 Updated Sep 27, 2024

OpenNLPLab / HGRN

[NeurIPS 2023 spotlight] Official implementation of HGRN in our NeurIPS 2023 paper - Hierarchically Gated Recurrent Neural Network for Sequence Modeling

Python 61 4 Updated Apr 24, 2024

dtsip / in-context-learning

Jupyter Notebook 199 41 Updated May 10, 2024

jaymody / picoGPT

An unnecessarily tiny implementation of GPT-2 in NumPy.

Python 3,254 417 Updated Apr 24, 2023

neuralmagic / compressed-tensors

A safetensors extension to efficiently store sparse quantized tensors on disk

Python 51 2 Updated Nov 25, 2024

mobiusml / hqq

Official implementation of Half-Quadratic Quantization (HQQ)

Python 705 70 Updated Nov 22, 2024

hahnyuan / RPTQ4LLM

Reorder-based post-training quantization for large language models

Python 181 11 Updated May 17, 2023

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 30,786 4,675 Updated Nov 25, 2024

microsoft / DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 35,582 4,139 Updated Nov 25, 2024

HuangOwen / Awesome-LLM-Compression

Awesome LLM compression research papers and tools.

1,212 80 Updated Nov 25, 2024

mit-han-lab / llm-awq

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Python 2,543 206 Updated Oct 16, 2024

Vahe1994 / AQLM

Official PyTorch repository for Extreme Compression of Large Language Models via Additive Quantization https://arxiv.org/pdf/2401.06118.pdf and PV-Tuning: Beyond Straight-Through Estimation for Ext…

Python 1,175 178 Updated Nov 23, 2024

FEMA / openfema-samples

Code, dataset, and analysis samples that utilize the OpenFEMA API.

Jupyter Notebook 29 8 Updated Sep 17, 2024

vllm-project / llm-compressor

Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

Python 734 60 Updated Nov 25, 2024

OpenGVLab / EfficientQAT

EfficientQAT: Efficient Quantization-Aware Training for Large Language Models

Python 226 17 Updated Oct 8, 2024

microsoft / T-MAC

Low-bit LLM inference on CPU with lookup table

C++ 593 45 Updated Nov 19, 2024

OpenGVLab / OmniQuant

[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.

Python 732 56 Updated Oct 8, 2024

huggingface / datatrove

Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.

Python 2,059 149 Updated Nov 25, 2024

IST-DASLab / gptq

Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".

Python 1,947 155 Updated Mar 27, 2024

SqueezeAILab / SqueezeLLM

[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization

Python 652 43 Updated Aug 13, 2024

horseee / LLM-Pruner

[NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Support Llama-3/3.1, Llama-2, LLaMA, BLOOM, Vicuna, Baichuan, TinyLlama, etc.

Python 881 106 Updated Oct 7, 2024

voice-assistant

voice-activity-detection

time-series

data-quality

model-serving

Medium

recsys

ml-infrastructure

Machine learning

Deep learning