The development branch includes BarkAI text-to-speech. The user enters a prompt and the AI replies using three models: first the image model generates an image, then the language model generates a text reply, and finally Bark TTS reads the reply aloud.
YouTube Demo: https://www.youtube.com/watch?v=2GaCATKfVDA
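The reply flow described above can be pictured as three stages run in order. The following is only a minimal sketch of that ordering: the names `respond`, `generate_image`, `generate_text`, and `speak` are hypothetical placeholders, not the project's actual code, and the stages are passed in as plain functions so the sketch runs without any models installed.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Reply:
    image: bytes   # output of the image model
    text: str      # output of the language model
    audio: bytes   # speech produced by the TTS stage


def respond(prompt: str,
            generate_image: Callable[[str], bytes],
            generate_text: Callable[[str], str],
            speak: Callable[[str], bytes]) -> Reply:
    """Run the three stages in the order described above."""
    image = generate_image(prompt)   # 1. image model
    text = generate_text(prompt)     # 2. language model
    audio = speak(text)              # 3. TTS reads the text reply aloud
    return Reply(image=image, text=text, audio=audio)


if __name__ == "__main__":
    # Stand-in stages (hypothetical) so the sketch runs on its own.
    reply = respond(
        "a lighthouse at dusk",
        generate_image=lambda p: b"<image for: " + p.encode() + b">",
        generate_text=lambda p: f"Here is your picture of {p}.",
        speak=lambda t: t.encode(),
    )
    print(reply.text)
```

Note that in this sketch the TTS stage receives the text reply rather than the original prompt, matching the description that Bark TTS reads the message aloud.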
Installation for Windows 10/11 with an Nvidia GPU
- Download Automatic1111 sd.webui.zip from the v1.0.0-pre release at https://github.com/AUTOMATIC1111/stable-diffusion-webui/releases/tag/v1.0.0-pre and extract the zip file.
- After running update.bat, edit webui.bat and add the line: set COMMANDLINE_ARGS=--api
- Start Automatic1111 by running run.bat. This should start the Automatic1111 WebUI and load the model to be used with the GUI program.
- Download each part of the model/GUI 7-Zip archive from https://github.com/graylan0/ModeZion/releases/tag/v1, then extract the GUI/Model folder with 7-Zip.
- Open the extracted Model/GUI folder.
- Run llama-stable-gui.exe.
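If the GUI cannot produce images, it may help to confirm that the WebUI was really started with the --api flag from the steps above. The following is a rough sketch of such a check, not part of this project. It assumes the WebUI is listening on its default local address, http://127.0.0.1:7860, and uses the `/sdapi/v1/txt2img` endpoint that Automatic1111 exposes when the API is enabled. The helper names `build_txt2img_request` and `decode_images` are made up for illustration.

```python
import base64
import json
import urllib.request

# Assumption: the WebUI listens on its default local address.
WEBUI_URL = "http://127.0.0.1:7860"


def build_txt2img_request(prompt: str, steps: int = 20) -> urllib.request.Request:
    """Build a POST request for the txt2img endpoint enabled by --api."""
    payload = json.dumps({"prompt": prompt, "steps": steps}).encode("utf-8")
    return urllib.request.Request(
        WEBUI_URL + "/sdapi/v1/txt2img",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def decode_images(response_body: bytes) -> list[bytes]:
    """Decode the base64-encoded images in a txt2img JSON response."""
    data = json.loads(response_body)
    return [base64.b64decode(img) for img in data.get("images", [])]


if __name__ == "__main__":
    request = build_txt2img_request("a lighthouse at dusk")
    try:
        with urllib.request.urlopen(request, timeout=300) as response:
            images = decode_images(response.read())
        print(f"API reachable, received {len(images)} image(s)")
    except OSError as error:
        # URLError subclasses OSError, so this covers a WebUI that is not running.
        print(f"Could not reach the WebUI API: {error}")
```

A connection error here usually means run.bat is not running yet, or the WebUI was started without the COMMANDLINE_ARGS line in place.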