Stand up a GenAI server on your own available hardware. This project provides a web interface to interact with LLaMA and Stable Diffusion models (among others) for text, image, video, and 3D generation.
Many people have a nice GPU on hand and would rather use it than pay for subscriptions from OpenAI, Anthropic, etc. Here, you can host your own AI models, albeit with many limitations compared to those services. You can then access the server from anywhere with a web browser, such as your phone or another computer.
The frontend for this is hosted on GitHub Pages, while the backend is hosted on your available server machine.
After setup, the frontend will be available at https://[username].github.io/PersonalAIserver. Follow the instructions below to set up the backend.
For the frontend, only a GitHub Action needs to be set up to build the project page.
In your forked repository, navigate to Settings -> Pages -> Source, and set it to "GitHub Actions".
The build will automatically run when you push to the repository, thanks to the .github/workflows/main.yml file.
If you want to run the frontend locally, follow the instructions below. This requires Node.js, installable via Node Version Manager (nvm).
nvm install 20
npm install -D @shadcn/ui
npx shadcn@latest init # Select the default style, any color, and don't use CSS variables.
npx shadcn@latest add alert button card input textarea # Accept defaults
npm install lucide-react
npm install -D @tailwindcss/typography
npm install clsx tailwind-merge
npm install
npm install sharp
npm run dev
This should serve the frontend at http://localhost:3000, which you can access with a web browser.
This uses conda for package management, but feel free to use any other package manager.
In your terminal at the backend directory, run the following commands:
conda create -n personalai python=3.11
conda activate personalai
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
pip install -r requirements.txt
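Before downloading any models, it can help to confirm that the CUDA build of PyTorch actually sees your GPU. A quick check with the environment above active:

```python
# Quick sanity check that the CUDA build of PyTorch is installed and a GPU is visible.
import torch

print(torch.__version__)                  # typically ends in +cu124 for the wheel above
print(torch.cuda.is_available())          # should print True
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # e.g. "NVIDIA GeForce RTX 4090"
```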
Create a users.json file in the backend directory with the following format:
{
"username1": {
"username": "username1",
"password": "password1",
"disabled": false
},
"username2": {
"username": "username2",
"password": "password2",
"disabled": false
}
...
}
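For illustration only, this is roughly how a backend might read users.json to check a login; the actual lookup lives in backend/backend.py and may differ:

```python
# Hypothetical example of reading users.json; backend/backend.py has its own logic.
import json
from pathlib import Path

users = json.loads(Path("users.json").read_text())

def is_valid_login(username: str, password: str) -> bool:
    user = users.get(username)
    return user is not None and not user["disabled"] and user["password"] == password

print(is_valid_login("username1", "password1"))  # True with the example file above
```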
Create a .secret file in the backend directory with a random string of your choosing. This is used to sign the JWT tokens. I recommend generating one with:
import secrets
secret_key = secrets.token_hex(32)
print(secret_key)
and then copying the output.
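For context, a secret like this is typically used to sign and verify JWTs along these lines. The sketch below assumes a python-jose style HS256 flow, which is common in FastAPI apps; the backend's exact implementation may differ:

```python
# Illustrative only: signing and verifying a JWT with the contents of .secret.
import time
from pathlib import Path
from jose import jwt  # python-jose; the backend may use a different JWT library

SECRET_KEY = Path(".secret").read_text().strip()

token = jwt.encode(
    {"sub": "username1", "exp": int(time.time()) + 30 * 60},  # expires in 30 minutes
    SECRET_KEY,
    algorithm="HS256",
)
print(jwt.decode(token, SECRET_KEY, algorithms=["HS256"]))
```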
In your forked repository, go to Settings -> Secrets and variables -> Actions and create a repository secret named NEXT_PUBLIC_API_URL, with the value being the URL of the backend server (more details below). Example: https://api.example.com.
In backend/backend.py, adjust the CORSMiddleware near line 40 to include the URLs of your frontend.
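For reference, a FastAPI CORSMiddleware block generally looks like the sketch below (the CORSMiddleware reference suggests the backend is a FastAPI app). The exact code in backend/backend.py may differ, but the allow_origins list is the part to edit:

```python
# Sketch of a FastAPI CORS configuration; add your own frontend URLs to allow_origins.
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()
app.add_middleware(
    CORSMiddleware,
    allow_origins=[
        "https://<username>.github.io",  # GitHub Pages frontend
        "http://localhost:3000",         # local frontend dev server
    ],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)
```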
Run huggingface-cli login and use your personal access token created above as the password.
Start the server with python backend/backend.py --public, or remove --public to run the server on localhost only. Running on localhost requires running the frontend locally as well.
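Before exposing the server publicly, you can check that it is reachable locally. The snippet below assumes the backend listens on port 8000 (the port used in the tunnel config later) and that FastAPI's built-in /docs route is enabled; adjust if your setup differs:

```python
# Hedged smoke test: confirm the backend responds on localhost before tunneling.
import requests

resp = requests.get("http://localhost:8000/docs", timeout=5)
print(resp.status_code)  # 200 means the server is up and reachable
```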
Many options exist for hosting the backend at a public URL, but I recommend using Cloudflare Tunnel. Cloudflare Tunnel routes traffic from the internet to your server without exposing a port or your local IP. It also supports HTTPS, SSL encryption, and DDoS protection, among other security features you would otherwise need to set up and maintain yourself.
Other options exist as well.
For Cloudflare Tunnel, you will need a domain name and Cloudflare account.
1. Install cloudflared on your server machine.
2. Run cloudflared tunnel login. This step opens a browser window to log in with your Cloudflare account.
3. Run cloudflared tunnel create genai-api. This step generates a tunnel ID; note it for the next steps.
4. Create a tunnel configuration file (e.g. ~/.cloudflared/config.yml) with the following content:
tunnel: <your-tunnel-id>
credentials-file: /home/user/.cloudflared/<tunnel-id>.json
ingress:
- hostname: <your-api-url>
service: http://localhost:8000
- service: http_status:404
The <your-api-url> should be the domain name you have registered with Cloudflare's nameservers, and can be a subdomain. For example, if you own example.com, you could use genai.example.com or api.example.com.
5. Create the DNS record with cloudflared tunnel route dns <tunnel-id> <your-api-url>. This creates the DNS route so Cloudflare can direct traffic from the internet to your server without exposing a port or your local IP.
In backend/backend.py, you may change the models used. Find the model on Hugging Face you'd like to use for text/image generation and adjust the ModelManager model configs (line ~127). By default, this uses the Llama 3.2 1B Instruct model for text generation and the Stable Diffusion 3.5 Medium model for image generation.
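As a rough sketch of what swapping models involves, the snippet below loads a text model with transformers and an image model with diffusers. The class choices and dtypes here are assumptions; the ModelManager in backend/backend.py may load things differently:

```python
# Illustrative model loading; mirror any changes in the ModelManager configs (~line 127).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from diffusers import StableDiffusion3Pipeline

text_model_id = "meta-llama/Llama-3.2-1B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(text_model_id)
text_model = AutoModelForCausalLM.from_pretrained(
    text_model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

image_pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3.5-medium", torch_dtype=torch.bfloat16
).to("cuda")
```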
Below are the approximate VRAM requirements for each model tested.
| Model Type | Model Name | VRAM Usage (GB) | Notes |
|---|---|---|---|
| Text | meta-llama/Llama-3.2-1B-Instruct | ~8 | Base model for text generation |
| Image | stabilityai/stable-diffusion-3.5-medium | ~13 | Works well on RTX 4090 |
| Image | stabilityai/stable-diffusion-3.5-large | ~20-30 | Exceeds 4090 VRAM for long prompts |
Note: VRAM usage may vary depending on image resolution, length of text prompts, and other parameters. Values shown are approximate for default settings.
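To measure actual peak VRAM for your own prompts and resolutions, PyTorch's memory statistics can be used, for example:

```python
# Measure peak GPU memory for whatever generation you run in between.
import torch

torch.cuda.reset_peak_memory_stats()
# ... run a text or image generation here ...
peak_gb = torch.cuda.max_memory_allocated() / 1024**3
print(f"Peak VRAM: {peak_gb:.1f} GB")
```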
Quantized transformer models are supported to further reduce VRAM usage, but are not used by default (see backend/backend.py lines 178-192).
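For reference, 4-bit quantization of the text model with bitsandbytes looks roughly like this; backend/backend.py has its own quantization path (lines 178-192), which may differ in details:

```python
# Hedged example: load the text model in 4-bit to cut VRAM usage.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
```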
The large Stable Diffusion model can barely fit in my 4090's VRAM with quantization.
Chatbot responses are extremely quick on my 4090, with image generation taking up to 30 seconds for 100 steps at 1024x1024.
The system prompt for the language model can be found in backend/system_prompt.txt.
At the moment, it is a spin on the system prompt for Anthropic's Claude 3.5 Sonnet, released on October 22, 2024: https://docs.anthropic.com/en/release-notes/system-prompts#claude-3-5-sonnet
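For illustration, a system prompt file like this is typically injected as the first message of each chat; the backend's exact wiring may differ:

```python
# Hypothetical sketch of feeding backend/system_prompt.txt into a chat request.
from pathlib import Path

system_prompt = Path("backend/system_prompt.txt").read_text()
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "Hello!"},
]
# For Llama-style models, messages like these are usually passed through
# tokenizer.apply_chat_template(messages, tokenize=False) before generation.
```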