Quant-Requests are open.
I apologize for disrupting your experience.
If you want to, and are able to, you can support my various endeavors here (Ko-fi).
Eventually I want to set up proper infrastructure for these; in the meantime I'll make do with the resources at hand.
Welcome to my GGUF-IQ-Imatrix Model Quantization Requests card!
Please read everything.
This card is meant only to request GGUF-IQ-Imatrix quants for models that meet the requirements below.
Requirements to request GGUF-Imatrix model quantizations:
For the model:
- Maximum model parameter size of ~~11B~~ 12B. Note that models larger than 8B parameters may take longer to process and upload than smaller ones.
At the moment I am unable to accept requests for larger models due to hardware/time limitations.
- Preferably Mistral- or Llama-3-based models in the creative/roleplay niche.
If you need quants for a bigger model, you can try requesting them at mradermacher's. He's doing amazing work.
Important:
- Fill out the request template as outlined in the next section.
How to request a model quantization:
Open a New Discussion titled "Request: Model-Author/Model-Name" (without the quotation marks), for example, "Request: Nitral-AI/Infinitely-Laydiculous-7B". Include the following template in your new discussion post; you can copy and paste it as is and fill in the required information by replacing the {{placeholders}} (example request here):
**[Required] Model name:** <br>
{{replace-this}}
**[Required] Model link:** <br>
{{replace-this}}
**[Required] Brief description:** <br>
{{replace-this}}
**[Required] An image/direct image link to represent the model (square-shaped):** <br>
{{replace-this}}
**[Optional] Additional quants (if you want any):** <br>
<!-- Keep in mind that anything below IQ3/Q3 isn't recommended, -->
<!-- since for these smaller models the results will likely be -->
<!-- highly incoherent, rendering them unusable for your needs. -->
Default list of quants for reference:
"IQ3_M", "IQ3_XXS",
"Q4_K_M", "Q4_K_S", "IQ4_XS",
"Q5_K_M", "Q5_K_S",
"Q6_K",
"Q8_0"