- Vancouver, BC, Canada
- 13:19 (UTC -07:00)
- rwightman.com
- @wightmanr
Highlights
- Pro
Stars
Supercharge Your PyTorch Image Models: Bag of Tricks to 8x Faster Inference with ONNX Runtime & Optimizations
Scenic: A JAX Library for Computer Vision Research and Beyond
A reactive notebook for Python — run reproducible experiments, execute as a script, deploy as an app, and version with git.
Fast and flexible image augmentation library. Paper about the library: https://www.mdpi.com/2078-2489/11/2/125
Robot Utility Models are trained on a diverse set of environments and objects, and can then be deployed in novel environments with novel objects without any further data or training.
Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
PyTorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from Meta AI
MambaOut: Do We Really Need Mamba for Vision?
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
A native PyTorch Library for large model training
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
OCR, layout analysis, reading order, table recognition in 90+ languages
docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets.
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
TensorDict is a PyTorch-dedicated tensor container.
Large Language Models (LLMs) applications and tools running on Apple Silicon in real-time with Apple MLX.
Python APTED algorithm for the Tree Edit Distance
A dashboard for exploring timm learning rate schedulers
🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
A PyTorch port of Google TensorFlow.js PoseNet (Real-time Human Pose Estimation)
DataComp: In search of the next generation of multimodal datasets
StableLM: Stability AI Language Models
An open-source framework for training large multimodal models.
Implementation of MusicLM, Google's new SOTA model for music generation using attention networks, in PyTorch
The simplest, fastest repository for training/finetuning medium-sized GPTs.