Hugging Face – Posts

All HF Hub posts

mlabonne

posted an update 2 days ago

3629

Large models are surprisingly bad storytellers.

I asked 8 LLMs to "Tell me a bedtime story about bears and waffles."

Claude 3.5 Sonnet and GPT-4o gave me the worst stories: no conflict, no moral, zero creativity.

In contrast, smaller models were quite creative and wrote stories involving talking waffle trees and bears ostracized for their love of waffles.

Here you can see a comparison between Claude 3.5 Sonnet and NeuralDaredevil-8B-abliterated. They both start with a family of bears but quickly diverge in terms of personality, conflict, etc.

I mapped it to the hero's journey to have some kind of framework. Prompt engineering can definitely help here, but it's still disappointing that the larger models don't create better stories right off the bat.

Do you know why smaller models outperform the frontier models here?

21 replies

lamhieu

posted an update 2 days ago

2698

🎉 The Ghost 8B Beta model outperforms prominent models such as Llama 3 8B Instruct and GPT 3.5 Turbo on the lc_winrate score. It also outperforms Claude 3 Opus, Claude 3 Sonnet, GPT-4, and Mistral Large on the AlpacaEval 2.0 winrate score.

Ghost 8B Beta is a large language model developed with goals that include excellent multilingual support, superior knowledge capabilities, and cost-effectiveness. The model comes in two context length versions, 8k and 128k, along with multilingual function tools support by default.
The languages supported are 🇺🇸 English, 🇫🇷 French, 🇮🇹 Italian, 🇪🇸 Spanish, 🇵🇹 Portuguese, 🇩🇪 German, 🇻🇳 Vietnamese, 🇰🇷 Korean and 🇨🇳 Chinese.

Explore the Potential:
To learn more about this groundbreaking language model, visit the official website or explore the online demo platforms:
- Ghost 8B Beta (β, 8k) on Spaces: lamhieu/ghost-8b-beta-8k
- Ghost 8B Beta (β, 128k) on Spaces: lamhieu/ghost-8b-beta-128k
- Official website: https://ghost-x.org/docs/models/ghost-8b-beta

34 replies

isidentical

posted an update 3 days ago

3408

Announcing the second open model in our Aura series of media models at @fal : fal/AuraFlow

Try it using diffusers or ComfyUI from publicly available weights, and read more about it in our blog https://blog.fal.ai/auraflow.

1 reply

bokesyo

posted an update about 4 hours ago

302

It's time to switch from bge to Memex! We introduce Memex: an OCR-free Visual Document Embedding Model as Your Personal Librarian.

The model takes only images as document-side inputs and produces vectors representing document pages. Memex is trained on over 200k query-visual document pairs, including textual documents, visual documents, arXiv figures, plots, charts, industry documents, textbooks, ebooks, and openly available PDFs. Its performance is on par with our ablation text embedding model on text-oriented documents, and it has an advantage on visually intensive documents.

Our model is capable of:

😋 Helping you read a long visually intensive or text-oriented PDF document and find the pages that answer your question.

🤗 Helping you build a personal library and retrieve book pages from a large collection of books.

🤩 Running locally: it has only 2.8B parameters and has the potential to run on your PC.

🐵 Working like a human: it reads and comprehends with vision and remembers multimodal information in its "hippocampus".
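As a rough illustration of the page-retrieval idea above, here is a minimal sketch that ranks document pages against a query by cosine similarity. It assumes the page and query embeddings have already been computed; the three-dimensional vectors and the helper names `cosine_similarity` and `top_k_pages` are hypothetical and are not part of the Memex API.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k_pages(query_vec, page_vecs, k=2):
    """Return the indices of the k pages most similar to the query."""
    scored = [
        (cosine_similarity(query_vec, vec), idx)
        for idx, vec in enumerate(page_vecs)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]


# Toy stand-ins for page embeddings (one vector per page image).
page_vecs = [
    [0.9, 0.1, 0.0],  # page 0
    [0.1, 0.8, 0.1],  # page 1
    [0.0, 0.1, 0.9],  # page 2
]
# Toy stand-in for the embedding of a text query.
query_vec = [0.1, 0.9, 0.0]

ranked = top_k_pages(query_vec, page_vecs, k=2)
print(ranked)
```

In a real setup the vectors would come from the embedding model and would have far more dimensions; the ranking step itself stays the same.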

The model is open-sourced at RhapsodyAI/minicpm-visual-embedding-v0

Everyone is welcome to try our online demo at bokesyo/minicpm-visual-embeeding-v0-demo

alvdansen

posted an update about 17 hours ago

1179

New model drop...🥁

FROSTING LANE REDUX

The v1 of this model was released during a big model push, so I think it got lost in the shuffle. I revisited it for a project and realized it wasn't inventive enough around certain concepts, so I decided to retrain.

alvdansen/frosting_lane_redux

I think the original model was really strong on its own, but because it was trained on fewer images I found that it was producing a very lackluster range of facial expressions, so I wanted to improve that.

The hardest part of creating models like this, I find, is maintaining the detailed linework without overfitting. It takes a really balanced dataset, and I repeat the data 12 times during the process, stopping at the last 10-20 epochs.

It is very difficult to predict the exact amount of time needed, so for me it is crucial to do epoch stops. Every model has a different threshold for ideal success.

DmitryRyumin

posted an update about 17 hours ago

617

🔥🎭🌟 New Research Alert - ECCV 2024 (Avatars Collection)! 🌟🎭🔥
📄 Title: RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models 🔝

📝 Description: RodinHD generates high-fidelity 3D avatars from portrait images using a novel data scheduling strategy and weight consolidation regularization to capture intricate details such as hairstyles.

👥 Authors: Bowen Zhang, @yiji , @chunyuwang , Ting Zhang, @jiaolong , Yansong Tang, Feng Zhao, Dong Chen, and Baining Guo

📄 Paper: RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models (2407.06938)

🌐 Github Page: https://rodinhd.github.io/
📁 Repository: https://github.com/RodinHD/RodinHD

📺 Video: https://www.youtube.com/watch?v=ULvHt7dZx-Q

🚀 CVPR-2023-24-Papers: https://github.com/DmitryRyumin/CVPR-2023-24-Papers

🚀 WACV-2024-Papers: https://github.com/DmitryRyumin/WACV-2024-Papers

🚀 ICCV-2023-Papers: https://github.com/DmitryRyumin/ICCV-2023-Papers

📚 More Papers: more cutting-edge research presented at other conferences in the DmitryRyumin/NewEraAI-Papers curated by @DmitryRyumin

🚀 Added to the Avatars Collection: DmitryRyumin/avatars-65df37cdf81fec13d4dbac36

🔍 Keywords: #RodinHD #3DAvatars #DiffusionModels #HighFidelity #PortraitTo3D #MachineLearning #ComputerVision #DeepLearning #AI #ECCV2024

Niansuh

posted an update 1 day ago

603

Get ChatGPT 3.5, 4, and 4o API key for only $10 per month. Unlimited usage with no limits on quota.

Demo: NiansuhAI/Copilot

Niansuh

posted an update 3 days ago

2949

Use GPT-4, GPT-4 Turbo Preview, GPT-3.5 Turbo, BingAI, and other models. The interface is similar to ChatGPT, with a speedy API endpoint.

NiansuhAI/Copilot

4 replies

nroggendorff

posted an update about 13 hours ago

381

Updated https://huggingface.co/blog/nroggendorff/train-with-llama-architecture so you can "train" your own tokenizer from your dataset.

1 reply

as-cle-bert

posted an update about 19 hours ago

313

Responsibly building AI also means knowing its impact on the environment and the hidden carbon costs associated with it🌱
If you're interested in the subject, you can check out my latest community article, where I try to unravel AI's carbon footprint and potential solutions to reduce it🌻: https://huggingface.co/blog/as-cle-bert/is-ai-carbon-footprint-worrisome
Enjoy!🤗
