AidGenSE Development Documentation

Introduction

AidGenSE is a generative AI HTTP service that wraps the AidGen SDK and is compatible with the OpenAI HTTP protocol. Developers can call generative AI models over HTTP and quickly integrate them into their own applications.

💡Note

All large language models supported by Model Farm achieve inference acceleration on Qualcomm NPUs through AidGen.

Support Status

Model Format and Backend Support

Model Format	CPU	GPU	NPU
.gguf	✅	✅	❌
.bin	❌	❌	✅
.aidem	❌	❌	✅

✅: Supported ❌: Not supported

Operating System Support

Linux	Android
✅	🚧

✅: Supported 🚧: Planned support

AidGenSE Service Installation and Operation

Installation

bash

# Install AidGenSE and the AidGen SDK packages
sudo aid-pkg update
sudo aid-pkg -i aidgense
sudo aid-pkg -i aidgen-sdk
sudo aid-pkg -i aidgen-qnn236
sudo aid-pkg -i aidgen-qnn240

Model Query & Retrieval

bash

# View supported models
aidllm remote-list api

Example output:

text

Current Soc : 8550

Name                                 Url                                          CreateTime
-----                                ---------                                    ---------
qwen2.5-0.5B-Instruct-8550           aplux/qwen2.5-0.5B-Instruct-8550             2025-03-05 14:52:23
qwen2.5-3B-Instruct-8550             aplux/qwen2.5-3B-Instruct-8550               2025-03-05 14:52:37
Qwen2.5-VL-3B-392x392-8550           aplux/Qwen2.5-VL-3B-392x392-8550             2025-12-02 16:48:32
Qwen2.5-VL-3B-672x672-8550           aplux/Qwen2.5-VL-3B-672x672-8550             2025-12-02 16:48:05
Qwen2.5-VL-3B-Instruct-q4_k_m        aplux/Qwen2.5-VL-3B-Instruct-q4_k_m          2026-03-10 11:00:27
...
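
When model selection is scripted, a listing like the one above can be turned into records. The following is a minimal sketch that parses text laid out like the example output. The function name `parse_remote_list` is hypothetical, and the sketch assumes that columns are separated by two or more spaces, which holds for the sample but is not a documented guarantee.

```python
import re

SAMPLE = """\
Current Soc : 8550

Name                                 Url                                          CreateTime
-----                                ---------                                    ---------
qwen2.5-0.5B-Instruct-8550           aplux/qwen2.5-0.5B-Instruct-8550             2025-03-05 14:52:23
qwen2.5-3B-Instruct-8550             aplux/qwen2.5-3B-Instruct-8550               2025-03-05 14:52:37
"""

def parse_remote_list(text):
    """Return (soc, models) parsed from `aidllm remote-list api` style text."""
    soc = None
    models = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("Current Soc"):
            soc = line.split(":", 1)[1].strip()
            continue
        # Assumption: columns are separated by runs of two or more spaces.
        fields = re.split(r"\s{2,}", line)
        if len(fields) != 3:
            continue  # e.g. a trailing "..." line
        if fields[0] == "Name" or set(fields[0]) == {"-"}:
            continue  # header row or dashed separator
        models.append({"name": fields[0], "url": fields[1], "created": fields[2]})
    return soc, models

if __name__ == "__main__":
    soc, models = parse_remote_list(SAMPLE)
    print(soc, [m["url"] for m in models])
```

The `url` field of each record is the value passed to `aidllm pull api`.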

bash

# Download a model by its Url value
aidllm pull api [Url] # e.g. aplux/qwen2.5-3B-Instruct-8550

# View downloaded models
aidllm list api

# Delete a downloaded model by its Name value
sudo aidllm rm api [Name] # e.g. qwen2.5-3B-Instruct-8550

Starting the Service

bash

# Start the OpenAI-compatible API service for a downloaded model
aidllm start api -m <model_name> # e.g. qwen2.5-3B-Instruct-8550

# Check status
aidllm status api

# Stop service
aidllm stop api

# Restart service
aidllm restart api

💡Note

The API service listens on port 8888 by default.
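
Before sending requests from a script, it can help to confirm that something is listening on the port. The sketch below only attempts a TCP connection. It does not verify that the listener is AidGenSE, and the helper name `is_port_open` is hypothetical.

```python
import socket

def is_port_open(host="127.0.0.1", port=8888, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False

if __name__ == "__main__":
    print("Port 8888 reachable:", is_port_open())
```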

Chat Testing

Web UI Method

bash

# Install UI frontend service
sudo aidllm install ui

# Start UI service
aidllm start ui

# Check UI service status
aidllm status ui

# Stop UI service
aidllm stop ui

💡Note

After the UI service starts, open http://ip:51104 in a browser, where ip is the IP address of the device running the service.

API Method (Large Language Model)

Call the /v1/chat/completions endpoint via HTTP POST with a messages list to converse with the large language model. Set "stream": true to enable streaming output, which returns generated content token by token.

Python call example:

python

import json
import requests

def stream_chat_completion(messages, model="qwen2.5-3B-Instruct-8550"):

    url = "http://127.0.0.1:8888/v1/chat/completions"
    headers = {
        "Content-Type": "application/json"
    }
    payload = {
        "model": model,
        "messages": messages,
        "stream": True    # Enable streaming
    }

    # Make request with stream=True
    response = requests.post(url, headers=headers, json=payload, stream=True)
    response.raise_for_status()

    # Read the response line by line and parse the SSE format
    for line in response.iter_lines():
        if not line:
            continue
        line_data = line.decode('utf-8')
        # Each SSE line starts with the "data: " prefix
        if line_data.startswith("data: "):
            data = line_data[len("data: "):]
            # End marker
            if data.strip() == "[DONE]":
                break
            try:
                chunk = json.loads(data)
            except json.JSONDecodeError:
                # Print and skip when parsing fails
                print("Unable to parse JSON:", data)
                continue

            # Extract the model output token
            content = chunk["choices"][0]["delta"].get("content")
            if content:
                print(content, end="", flush=True)

if __name__ == "__main__":
    # Example conversation
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello."}
    ]
    print("Assistant:", end=" ")
    stream_chat_completion(messages)
    print()  # New line
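
The parsing steps in the example above can be isolated into pure functions so that they can be exercised without a running service. This is a minimal sketch: the helper names `parse_sse_line` and `collect_reply` are hypothetical, and the sample lines are hand-written in the shape the example reads (`choices[0].delta.content`), not captured service output.

```python
import json

def parse_sse_line(line):
    """Return the text carried by one SSE line, "" if none, or None at [DONE]."""
    text = line.decode("utf-8") if isinstance(line, bytes) else line
    if not text.startswith("data: "):
        return ""
    data = text[len("data: "):].strip()
    if data == "[DONE]":
        return None
    try:
        chunk = json.loads(data)
    except json.JSONDecodeError:
        return ""  # skip lines that are not valid JSON
    if not isinstance(chunk, dict):
        return ""
    choices = chunk.get("choices") or [{}]
    delta = choices[0].get("delta") or {}
    return delta.get("content") or ""

def collect_reply(lines):
    """Join the content of all chunks up to the [DONE] marker."""
    parts = []
    for line in lines:
        piece = parse_sse_line(line)
        if piece is None:
            break
        parts.append(piece)
    return "".join(parts)

if __name__ == "__main__":
    # Hand-written sample lines for illustration only.
    sample = [
        b'data: {"choices": [{"delta": {"role": "assistant"}}]}',
        b'data: {"choices": [{"delta": {"content": "Hel"}}]}',
        b"",
        b'data: {"choices": [{"delta": {"content": "lo."}}]}',
        b"data: [DONE]",
    ]
    print(collect_reply(sample))
```

Passing `response.iter_lines()` to `collect_reply` would return the whole reply as one string instead of printing it piece by piece.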

API Method (Vision Language Model)

AidGenSE supports vision language models (VLMs), which can understand and describe images. To send text and images together, set the content field of a message to an array. Text parts take the form {"type": "text", "text": "..."}, and image parts take the form {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,..."}}, where the URL is a data URL carrying the base64-encoded image.

Python call example:

python

import base64
import json
import requests

def encode_image_to_base64(image_path):
    """Encode image file to base64 string."""
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")

def stream_chat_completion(messages, model="Qwen2.5-VL-3B-392x392-8550"):

    url = "http://127.0.0.1:8888/v1/chat/completions"
    headers = {
        "Content-Type": "application/json"
    }
    payload = {
        "model": model,
        "messages": messages,
        "stream": True    # Enable streaming
    }

    # Make request with stream=True
    response = requests.post(url, headers=headers, json=payload, stream=True)
    response.raise_for_status()

    # Read line by line and parse SSE format
    for line in response.iter_lines():
        if not line:
            continue
        line_data = line.decode('utf-8')
        # Each SSE line starts with the "data: " prefix
        if line_data.startswith("data: "):
            data = line_data[len("data: "):]
            # End marker
            if data.strip() == "[DONE]":
                break
            try:
                chunk = json.loads(data)
            except json.JSONDecodeError:
                # Print and skip when parsing fails
                print("Unable to parse JSON:", data)
                continue

            # Extract the model output token
            content = chunk["choices"][0]["delta"].get("content")
            if content:
                print(content, end="", flush=True)

if __name__ == "__main__":
    # Encode image to base64
    image_path = "/path/to/your/image.jpg"
    image_base64 = encode_image_to_base64(image_path)

    # Example conversation with image
    messages = [
        {
            "role": "system",
            "content": "You are a helpful assistant."
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Please describe the content of this image."
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "data:image/jpeg;base64," + image_base64
                    }
                }
            ]
        }
    ]
    print("Assistant:", end=" ")
    stream_chat_completion(messages)
    print()  # New line
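
The user message in the example above can also be assembled by a small helper that takes raw image bytes and a MIME type. This is a sketch only: `build_image_message` is a hypothetical name, and the bytes in the demo are a placeholder rather than a real image.

```python
import base64

def build_image_message(prompt, image_bytes, mime="image/jpeg"):
    """Build a user message holding one text part and one base64 image part."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {
                "type": "image_url",
                "image_url": {"url": f"data:{mime};base64,{encoded}"},
            },
        ],
    }

if __name__ == "__main__":
    # Placeholder bytes stand in for real image data in this illustration.
    message = build_image_message("Describe this image.", b"\x89PNG", "image/png")
    print(message["content"][1]["image_url"]["url"])
```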

Image Format Restrictions

MIME type: Only image/jpeg and image/png are supported. For a PNG image, change the MIME type in the URL from image/jpeg to image/png.
Encoding: Only base64 encoding is supported, in the format data:image/jpeg;base64,<base64_string>.
Image dimensions: The maximum length of a single side is 7680 pixels, and the total pixel count must not exceed 33,177,600 (7680 × 4320, i.e. 8K UHD resolution). The minimum supported size is 1×1 pixels.
Not supported: Passing a remote image URL for the service to download and analyze.
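
These limits can be checked on the client before a request is sent. The sketch below encodes the numbers stated above. The helper names are hypothetical, and choosing the MIME type from the file extension is an assumption made for convenience, since the actual type depends on the file content.

```python
import os

MAX_SIDE = 7680            # maximum length of a single side, in pixels
MAX_PIXELS = 33_177_600    # maximum total pixel count (7680 * 4320)
MIME_BY_EXTENSION = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
}

def mime_for_path(path):
    """Return the supported MIME type for a file name, or None if unsupported."""
    ext = os.path.splitext(path)[1].lower()
    return MIME_BY_EXTENSION.get(ext)

def dimensions_allowed(width, height):
    """Check a width and height against the documented size limits."""
    if width < 1 or height < 1:
        return False
    if max(width, height) > MAX_SIDE:
        return False
    return width * height <= MAX_PIXELS

if __name__ == "__main__":
    print(mime_for_path("photo.PNG"), dimensions_allowed(7680, 4320))
```

The width and height would come from whatever image library the application already uses; reading them is outside the scope of this sketch.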
