AidGenSE Model Upload and Configuration

Introduction

This document describes how to upload models and use AidGenSE to start an OpenAI-compatible HTTP service.

Prerequisites

Before uploading and loading models, please ensure the following conditions are met:

  • AidGenSE and its related dependencies are installed (see AidGenSE Installation)
  • A complete model folder is available (including .gguf or .bin model files and configuration files)
  • You understand the model types and limitations supported by the current platform (see AidGenSE Support Status)

Model Resource Preparation

GGUF Format Models

Recommended file structure:

bash
qwen2.5-1.5b-instruct-q4_k_m/
|__ qwen2.5-1.5b-instruct-q4_k_m.gguf
|__ qwen2.5-1.5b-instruct-q4_k_m.json    # Model configuration file (needs manual creation)

Configuration File Template:

json
{
  "backend_type": "llamacpp",
  "model": {
    "path": "<absolute path to model file>"
  }
}
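For example, if the folder above is placed under /opt/aidlux/app/aid-openai-api/res/models/ (the location used elsewhere in this guide; your install path may differ), the filled-in configuration file would look like:

```json
{
  "backend_type": "llamacpp",
  "model": {
    "path": "/opt/aidlux/app/aid-openai-api/res/models/qwen2.5-1.5b-instruct-q4_k_m/qwen2.5-1.5b-instruct-q4_k_m.gguf"
  }
}
```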

Qualcomm QNN Bin Format Models

Recommended structure:

bash
qwen2.5-1.5b-instruct-8550-bin/
|__ *.serialized.bin    # multiple context binary files
|__ qwen2.5-1.5b-instruct-htp.json    # Main configuration file
|__ qwen2.5-1.5b-instruct-tokenizer.json
|__ htp_backend_ext_config.json

💡Note

The large model resources on Model Farm are organized according to this standard structure. Information about Qualcomm large model configuration files can be found in Qualcomm Genie.

Special Notes

qwen2.5-1.5b-instruct-htp.json is the main configuration file; the file paths it references must be set as absolute paths. Example:

json
"tokenizer": {
  "path": "/opt/aidlux/app/aid-openai-api/res/models/qwen2.5-1.5B-instruct-8550-bin/qwen2.5-1.5b-instruct-tokenizer.json"
},
"ctx-bins": [
  "/opt/aidlux/app/aid-openai-api/res/models/qwen2.5-1.5B-instruct-8550-bin/qwen2.5-1.5b-instruct_qnn229_qcs8550_4096_1_of_3.serialized.bin",
  "/opt/aidlux/app/aid-openai-api/res/models/qwen2.5-1.5B-instruct-8550-bin/qwen2.5-1.5b-instruct_qnn229_qcs8550_4096_2_of_3.serialized.bin",
  "/opt/aidlux/app/aid-openai-api/res/models/qwen2.5-1.5B-instruct-8550-bin/qwen2.5-1.5b-instruct_qnn229_qcs8550_4096_3_of_3.serialized.bin"
],
"extensions": "/opt/aidlux/app/aid-openai-api/res/models/qwen2.5-1.5B-instruct-8550-bin/htp_backend_ext_config.json"
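Keeping these absolute paths correct by hand is error-prone whenever the model folder moves. As a sketch, a small Python helper (absolutize is a hypothetical name; the field names follow the example above) can rewrite the three path fields against a new base directory:

```python
import json
import os

def absolutize(cfg: dict, base_dir: str) -> dict:
    """Rewrite the tokenizer, ctx-bins, and extensions paths of a
    Genie-style main configuration to absolute paths under base_dir."""
    out = dict(cfg)  # shallow copy; other keys pass through unchanged
    out["tokenizer"] = {
        "path": os.path.join(base_dir, os.path.basename(cfg["tokenizer"]["path"]))
    }
    out["ctx-bins"] = [
        os.path.join(base_dir, os.path.basename(p)) for p in cfg["ctx-bins"]
    ]
    out["extensions"] = os.path.join(base_dir, os.path.basename(cfg["extensions"]))
    return out

cfg = {
    "tokenizer": {"path": "qwen2.5-1.5b-instruct-tokenizer.json"},
    "ctx-bins": ["model_1_of_3.serialized.bin"],
    "extensions": "htp_backend_ext_config.json",
}
print(json.dumps(absolutize(cfg, "/opt/aidlux/app/aid-openai-api/res/models/qwen2.5-1.5B-instruct-8550-bin"), indent=2))
```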

APLUX aidem Encrypted Format

Recommended file structure:

bash
qwen2.5-1.5b-instruct-q4_k_m/
|__ *.aidem    # multiple encrypted model files
|__ qwen2.5-1.5b-instruct.json    # Model configuration file (needs manual creation)

Configuration File Template:

json
{
  "backend_type": "genie",
  "model": {
    "path": "<absolute path to model file>"
  }
}

Configuration File Setup

Configure model resources by editing api-cfg.json:

bash
vi /opt/aidlux/app/aid-openai-api/api-cfg.json

Add a model entry:

json
{
    "prompt_template_list": {
        ...
        "<conversation template type>": "<conversation template format>"
    },
    "model_cfg_list": [
        ...
        {
            "model_id": "<model ID>",
            "model_create": "<timestamp>",
            "model_owner": "<creator>",
            "cfg_path": "<absolute path to model config file>",
            "prompt_template_type": "<conversation template type>"
        }
    ]
}

Field Descriptions:

  • model_id: Unique model identifier used to specify the model at runtime, e.g., qwen2.5-1.5b-instruct-q4_k_m
  • model_create: Model registration timestamp in milliseconds; can be generated with printf "%s%03d\n" "$(date +%s)" "$((10#$(date +%N) / 1000000))"
  • model_owner: Model owner name, customizable, e.g., "aplux"
  • cfg_path: Absolute path to the model's main configuration file
  • prompt_template_type: Conversation prompt template type; common values are "qwen1.5", "qwen2", and "deepseek", and it must match the format the model was trained with
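A complete model_cfg_list entry, including the millisecond model_create timestamp, can also be produced programmatically. A minimal sketch, where make_model_entry is a hypothetical helper and the default values are only placeholders:

```python
import time

def make_model_entry(model_id, cfg_path, owner="aplux", template_type="qwen2"):
    """Build one model_cfg_list entry; model_create is a millisecond
    timestamp, equivalent to the printf command shown above."""
    return {
        "model_id": model_id,
        "model_create": str(int(time.time() * 1000)),
        "model_owner": owner,
        "cfg_path": cfg_path,
        "prompt_template_type": template_type,
    }

entry = make_model_entry(
    "qwen2.5-1.5b-instruct-q4_k_m",
    "/opt/aidlux/app/aid-openai-api/res/models/qwen2.5-1.5b-instruct-q4_k_m/qwen2.5-1.5b-instruct-q4_k_m.json",
)
print(entry["model_create"])
```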

View Added Models Using Commands

bash
# Check whether the model has been added to aidllm
aidllm list api

# Example output:
qwen2.5-7b-instruct
aplux_qwen2-7B
qwen2.5-7b-instruct-q4_k_m
qwen2.5-7B-8550
qwen2.5-1.5b-instruct-q4_k_m
qwen2.5-1.5B-instruct-8550-bin

# Specify and run the model to verify that it starts correctly
aidllm start api -m qwen2.5-1.5b-instruct-q4_k_m
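Once the service is running, it can be exercised with an OpenAI-style chat completion request. The sketch below only builds the request payload; the endpoint path /v1/chat/completions and the service port are assumptions about the OpenAI-compatible interface, not values confirmed by this document:

```python
import json

def chat_request(model_id: str, prompt: str) -> dict:
    # Payload shape for an OpenAI-compatible /v1/chat/completions endpoint
    return {
        "model": model_id,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

payload = chat_request("qwen2.5-1.5b-instruct-q4_k_m", "Hello!")
print(json.dumps(payload))
# Send it with, e.g. (port placeholder, not specified by this document):
#   curl -X POST http://127.0.0.1:<port>/v1/chat/completions \
#        -H "Content-Type: application/json" -d '<payload JSON above>'
```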