The repo provides the `main` binary built with FastLLM (https://github.com/ztxz16/fastllm.git) for the latest mainstream ARMv8-based Android mobile devices (i.e. smartphones and tablets). The binary has been tested and confirmed compatible with Qualcomm Snapdragon 8+ Gen 1, Snapdragon 8 Gen 2, and MediaTek Dimensity 8100 devices.
Install the Termux application on your Android device. Make sure your device has more than 6GB of RAM.
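A quick way to check the available memory from inside Termux is to read `/proc/meminfo` (a minimal sketch, assuming the kernel exposes it as on stock Android builds; this check is not part of the official setup):

```bash
# Print the total RAM reported by the kernel (value is in kB).
grep MemTotal /proc/meminfo
```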
Download a supported model file from HuggingFace. You only need the file with the `.flm` suffix. It is best to download it directly on your target Android device.
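If you prefer the command line, the model can also be fetched directly inside Termux with `curl`. This is an optional sketch; the HuggingFace URL below is a placeholder, not a real link, so substitute the actual address of the `.flm` file you chose:

```bash
# Install curl in Termux if it is not already present.
pkg install -y curl
# Download the model file; replace the placeholder URL with the real one.
curl -L -o chatglm2-6b-int4.flm \
  "https://huggingface.co/<user>/<repo>/resolve/main/chatglm2-6b-int4.flm"
```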
Download the `main` binary file. Copy or move the `main` file and the model file to a storage path on your target device, e.g. `downloads`.
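After this step, the downloads folder of your device should contain both files, for example:

```
downloads/
├── main
└── chatglm2-6b-int4.flm
```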
Open Termux and execute the command: `termux-setup-storage`.
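`termux-setup-storage` asks for the storage permission and creates a `storage` symlink in the Termux home directory. Before moving on, you can confirm that the shared downloads folder and the two files are visible; the model filename here is the example used below:

```bash
# After granting the permission prompt, shared storage is
# reachable under ~/storage.
ls storage/downloads
# Expect to see: chatglm2-6b-int4.flm  main
```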
Execute the command: `mv storage/downloads/<model_filename>.flm . && mv storage/downloads/main . && chmod 777 main`. Replace `<model_filename>.flm` with the filename of your model, e.g. `chatglm2-6b-int4.flm`. (Note that this example command works ONLY if you put the aforementioned two files under the `downloads` directory, which is STRONGLY recommended for common users!)
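Filled in with the example model name, the full command reads:

```bash
# Move both files into the Termux home directory and make the
# binary executable (777 is what the instructions above use).
mv storage/downloads/chatglm2-6b-int4.flm . && \
mv storage/downloads/main . && \
chmod 777 main
```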
Run the streamlined inference of the language model with the command: `./main -p <model_filename>.flm`.
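With the example filename, the launch command is simply:

```bash
# Start inference with the downloaded model.
./main -p chatglm2-6b-int4.flm
```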
You're ready for mobile on-device inference with the latest GLM(s)!
- If you encounter an error like `FORTIFY: read: count XXXXXXXX > SSIZE_MAX`, try adding `-l` at the end of the command to enable low memory mode inference. For instance, the Qwen-7B-Chat-int4.flm model has been observed to require low memory mode on devices with ≤12GB of RAM.
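With the Qwen model mentioned above, the low memory invocation looks like this:

```bash
# Append -l to enable low memory mode inference.
./main -p Qwen-7B-Chat-int4.flm -l
```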