  • BUAA
  • Beijing China
  • X @Lick

142 results for source starred repositories

🥇 A curated list of awesome large language models in finance (FinLLMs), including papers, models, datasets, and codebases. A list of financial large models, especially Chinese-English bilingual ones.

9 1 Updated Apr 24, 2024

Modeling, training, eval, and inference code for OLMo

Python 4,594 465 Updated Nov 5, 2024

Notes on the knowledge and interview questions relevant to large language model (LLM) algorithm (application) engineers

HTML 3,527 407 Updated Oct 22, 2024

Learning material for CMU10-714: Deep Learning System

Jupyter Notebook 214 34 Updated May 12, 2024

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…

Python 32,134 4,756 Updated Nov 4, 2024

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.

4,977 278 Updated Nov 1, 2024

TinySTL is a subset of the STL (it cuts some containers and algorithms) and also a superset of the STL (it adds some other containers and algorithms)

C++ 2,321 634 Updated Oct 27, 2018

From beginner to expert: this project aims to be the clearest and most systematic Chinese-language guide to prompts

3 Updated Aug 23, 2024

VideoSys: An easy and efficient system for video generation

Python 1,756 118 Updated Nov 5, 2024

Efficient Triton Kernels for LLM Training

Python 3,373 189 Updated Nov 5, 2024

A curated list for Efficient Large Language Models

Python 1,235 93 Updated Oct 30, 2024

SwissArmyTransformer is a flexible and powerful library to develop your own Transformer variants.

Python 990 95 Updated Oct 30, 2024

A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations

Python 729 37 Updated Oct 30, 2024

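This project shares the technical principles behind large models along with hands-on experience (large-model engineering and deploying large-model applications)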

HTML 10,344 1,025 Updated Nov 3, 2024

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 35,340 4,098 Updated Nov 5, 2024

A self-study guide to computer science

HTML 57,562 6,867 Updated Nov 2, 2024

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Python 2,574 203 Updated Nov 5, 2024

FlashAttention tutorial written in Python, Triton, CUDA, and CUTLASS

Cuda 193 14 Updated Jun 18, 2024

📖 A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, continuous batching, FlashAttention, PagedAttention, etc.

2,771 192 Updated Nov 1, 2024

4-bit quantization of LLaMA using GPTQ

Python 2,993 458 Updated Jul 13, 2024

Development repository for the Triton language and compiler

C++ 13,298 1,628 Updated Nov 5, 2024

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

3,560 149 Updated Sep 25, 2024

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ 8,570 973 Updated Nov 5, 2024

Universal LLM Deployment Engine with ML Compilation

Python 19,116 1,571 Updated Nov 2, 2024

Inference Llama 2 in one file of pure C

C 17,430 2,084 Updated Aug 6, 2024

Several optimization methods for half-precision general matrix multiplication (HGEMM) using Tensor Cores with the WMMA API and MMA PTX instructions.

Cuda 293 67 Updated Sep 8, 2024

Mamba SSM architecture

Python 13,087 1,113 Updated Nov 5, 2024

Making large AI models cheaper, faster and more accessible

Python 38,770 4,340 Updated Nov 5, 2024

🎉 Modern CUDA Learn Notes with PyTorch: CUDA Cores, Tensor Cores, fp32/tf32, fp16/bf16, fp8/int8, flash_attn, rope, sgemm, hgemm, sgemv, warp/block reduce, elementwise, softmax, layernorm, rmsnorm.

Cuda 1,380 152 Updated Nov 5, 2024