# iOS GenAI Sampler
A collection of Generative AI examples on iOS.
## Usage
- Rename `APIKey.sample.swift` to `APIKey.swift`, and put your OpenAI API key into the `apiKeyOpenAI` property's value (see the sketch after this list).
- Build and run.
- Run on a physical iPhone or iPad. (The realtime sample doesn't work on simulators.)
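
For reference, the edited `APIKey.swift` might look roughly like this (the enum wrapper is an assumption; only the `apiKeyOpenAI` property name comes from the steps above):

```swift
// APIKey.swift — renamed from APIKey.sample.swift.
// The enum wrapper is an assumption; only the property name is given by the Usage steps.
import Foundation

enum APIKey {
    static let apiKeyOpenAI = "sk-..."  // put your OpenAI API key here
}
```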
## Contents
### GPT-4o Multimodal Examples
#### Text chat
A basic text chat example.
It shows both normal and streaming implementations.
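
A minimal sketch of the streaming path, assuming a plain `URLSession` call to the Chat Completions API (the app's actual client code may be organized differently):

```swift
import Foundation

// Streams a GPT-4o chat completion and prints tokens as they arrive.
func streamChat(prompt: String, apiKey: String) async throws {
    var request = URLRequest(url: URL(string: "https://api.openai.com/v1/chat/completions")!)
    request.httpMethod = "POST"
    request.setValue("Bearer \(apiKey)", forHTTPHeaderField: "Authorization")
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    let body: [String: Any] = [
        "model": "gpt-4o",
        "stream": true,
        "messages": [["role": "user", "content": prompt]],
    ]
    request.httpBody = try JSONSerialization.data(withJSONObject: body)

    // The streaming endpoint returns server-sent events; each payload line is "data: {json}".
    let (bytes, _) = try await URLSession.shared.bytes(for: request)
    for try await line in bytes.lines {
        guard line.hasPrefix("data: "), line != "data: [DONE]" else { continue }
        let chunk = Data(line.dropFirst(6).utf8)
        if let json = try JSONSerialization.jsonObject(with: chunk) as? [String: Any],
           let delta = (json["choices"] as? [[String: Any]])?.first?["delta"] as? [String: Any],
           let content = delta["content"] as? String {
            print(content, terminator: "")  // append to the chat UI incrementally
        }
    }
}
```

The non-streaming variant is essentially the same request without `"stream": true`, decoded once from the full response.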
#### Image understanding
A multimodal example that provides a description of an image by GPT-4o.
Output sample
The image shows a person sitting at a table holding a smartphone. The person is looking at the phone and appears to be in the process of recording or viewing a video of themselves on the device. The person is wearing a dark hoodie with the "OpenAI" logo on it.
On the table, there is a black mug with the OpenAI logo on it. To the right side of the image, there is a close-up view of the phone screen showing the reflection of the person.
The setting appears to be indoors, with a lamp and a chair visible in the background. The lighting is warm, creating a comfortable atmosphere.
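
A request like the one behind this sample can be sketched as follows, assuming the image is passed as a base64 data URL (the repo's actual networking layer may differ):

```swift
import UIKit

// Asks GPT-4o to describe a UIImage and returns the text of the first choice.
func describeImage(_ image: UIImage, apiKey: String) async throws -> String {
    guard let jpeg = image.jpegData(compressionQuality: 0.8) else { return "" }
    let body: [String: Any] = [
        "model": "gpt-4o",
        "messages": [[
            "role": "user",
            "content": [
                ["type": "text", "text": "Describe this image."],
                ["type": "image_url",
                 "image_url": ["url": "data:image/jpeg;base64,\(jpeg.base64EncodedString())"]],
            ],
        ]],
    ]
    var request = URLRequest(url: URL(string: "https://api.openai.com/v1/chat/completions")!)
    request.httpMethod = "POST"
    request.setValue("Bearer \(apiKey)", forHTTPHeaderField: "Authorization")
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    request.httpBody = try JSONSerialization.data(withJSONObject: body)

    let (data, _) = try await URLSession.shared.data(for: request)
    let json = try JSONSerialization.jsonObject(with: data) as? [String: Any]
    let message = (json?["choices"] as? [[String: Any]])?.first?["message"] as? [String: Any]
    return message?["content"] as? String ?? ""
}
```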
#### Video summarization
A multimodal example that provides a summary of a video by GPT-4o.
Output sample
The video frames appear to be from a presentation, likely related to Apple's WWDC21 event.
- The first frame shows three animated Memoji characters partially illuminated.
- The second frame displays an Apple MacBook with the WWDC21 logo and four icons representing different applications.
- The following frames depict a person, likely a presenter, providing an explanation. The environment suggests it is a tech-focused presentation, with cameras and an iMac visible in the background.
- There is gradual text overlay appearing next to the presenter with topics including "Minimum focus distance," "10-bit HDR video," "Video Effects in Control Center," "Performance best practices," and "IOSurface compression."
- The final frame shows a black screen with the text "AVFoundation capture classes."
The frames collectively depict a segment from an Apple developer session, where technical details and best practices related to video capturing and effects are being discussed.
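
One plausible way to feed a video into GPT-4o is to sample a handful of frames and send them like still images; the sketch below uses `AVAssetImageGenerator` for that sampling step (the frame count and sizing are assumptions, not the repo's actual values):

```swift
import AVFoundation
import UIKit

// Extracts `count` evenly spaced frames from a video for a GPT-4o summarization request.
func sampleFrames(from url: URL, count: Int = 8) async throws -> [UIImage] {
    let asset = AVURLAsset(url: url)
    let duration = try await asset.load(.duration)

    let generator = AVAssetImageGenerator(asset: asset)
    generator.appliesPreferredTrackTransform = true
    generator.maximumSize = CGSize(width: 768, height: 768)  // keep request payloads small

    var frames: [UIImage] = []
    for i in 0..<count {
        let seconds = duration.seconds * Double(i) / Double(count)
        let (cgImage, _) = try await generator.image(at: CMTime(seconds: seconds, preferredTimescale: 600))
        frames.append(UIImage(cgImage: cgImage))
    }
    return frames
}
```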
#### Realtime video understanding
A multimodal example that provides a description of a video in realtime by GPT-4o.
https://www.youtube.com/watch?v=bF5CW3b47Ss
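
For the realtime case, one approach is to throttle the live camera stream to roughly one frame per second and JPEG-encode each kept frame for the same image-understanding request; the interval below is an assumption:

```swift
import AVFoundation
import CoreImage

// Receives live camera frames, keeps about one per second, and hands back JPEG data.
final class FrameSampler: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    private let context = CIContext()
    private var lastSent = Date.distantPast
    var onFrame: ((Data) -> Void)?

    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        guard Date().timeIntervalSince(lastSent) >= 1.0,
              let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
        lastSent = Date()

        let ciImage = CIImage(cvPixelBuffer: pixelBuffer)
        if let jpeg = context.jpegRepresentation(of: ciImage,
                                                 colorSpace: CGColorSpaceCreateDeviceRGB()) {
            onFrame?(jpeg)  // forward to the GPT-4o request off the capture queue
        }
    }
}
```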
### Local LLM Examples
#### Phi-3
A local LLM example using Phi-3 - GGUF.
#### Gemma
A local LLM example using Gemma 2B Instruct - GGUF.
#### Mistral 7B
A local LLM example using Mistral-7B v0.1 - GGUF.
### Apple Translation Framework Examples
#### Simple Overlay
A simple overlay translation with a one-line implementation.
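
A minimal sketch of that one-liner, assuming a SwiftUI view that shows the system translation sheet via `translationPresentation`:

```swift
import SwiftUI
import Translation

// Tapping the text presents Apple's system translation overlay for it.
struct OverlayTranslationView: View {
    @State private var showsTranslation = false
    private let text = "Hello, world!"

    var body: some View {
        Text(text)
            .translationPresentation(isPresented: $showsTranslation, text: text)
            .onTapGesture { showsTranslation = true }
    }
}
```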
#### Custom UI Translation (Available on iOS 18 branch)
A custom UI translation example using `TranslationSession`.
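
A rough sketch of the custom-UI flow on iOS 18, where `translationTask` supplies a `TranslationSession` and you render the result yourself (the language pair and layout here are assumptions):

```swift
import SwiftUI
import Translation

// Translates a fixed string with TranslationSession and shows the result in a custom UI.
struct CustomTranslationView: View {
    @State private var configuration: TranslationSession.Configuration?
    @State private var translated = ""
    private let source = "Hello, world!"

    var body: some View {
        VStack(spacing: 16) {
            Text(translated.isEmpty ? source : translated)
            Button("Translate") {
                configuration = TranslationSession.Configuration(
                    source: Locale.Language(identifier: "en"),
                    target: Locale.Language(identifier: "ja"))
            }
        }
        .translationTask(configuration) { session in
            if let response = try? await session.translate(source) {
                translated = response.targetText
            }
        }
    }
}
```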
#### Translation Availabilities (Available on iOS 18 branch)
Showing translation availability for each language pair using `LanguageAvailability`.
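
The availability check itself can be sketched like this for a single pair (the sample repeats it across language pairs):

```swift
import Foundation
import Translation

// Checks whether English-to-Japanese translation is installed, downloadable, or unsupported.
func checkAvailability() async {
    let availability = LanguageAvailability()
    let status = await availability.status(
        from: Locale.Language(identifier: "en"),
        to: Locale.Language(identifier: "ja"))

    switch status {
    case .installed:   print("Ready to translate on device")
    case .supported:   print("Supported, but the language pack must be downloaded first")
    case .unsupported: print("This language pair isn't supported")
    @unknown default:  print("Unknown status")
    }
}
```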
### Core ML Stable Diffusion Examples
#### Stable Diffusion v2.1
On-Device Image Generation using Stable Diffusion v2.1.
#### Stable Diffusion XL
On-Device Image Generation using Stable Diffusion XL.
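
These samples build on Apple's Core ML Stable Diffusion Swift package; a hedged sketch of the v2.1 path is below, assuming the converted Core ML resources are bundled at `resourcesURL` (the XL sample uses the package's XL pipeline type instead, and the step count and seed here are arbitrary):

```swift
import CoreGraphics
import CoreML
import StableDiffusion

// Generates one image on-device with a Core ML Stable Diffusion pipeline.
func generate(prompt: String, resourcesURL: URL) throws -> CGImage? {
    let pipeline = try StableDiffusionPipeline(
        resourcesAt: resourcesURL,
        controlNet: [],
        configuration: MLModelConfiguration(),
        reduceMemory: true)
    try pipeline.loadResources()

    var config = StableDiffusionPipeline.Configuration(prompt: prompt)
    config.stepCount = 25
    config.seed = 42

    let images = try pipeline.generateImages(configuration: config) { _ in true }  // true = keep going
    return images.compactMap { $0 }.first
}
```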
### Whisper Examples
#### WhisperKit
On-Device Speech Recognition using WhisperKit.
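
A sketch of the transcription call; note that WhisperKit's model selection and the exact return type of `transcribe(audioPath:)` have shifted between versions, so treat this as a shape rather than a drop-in snippet:

```swift
import WhisperKit

// Transcribes an audio file on-device and returns the combined text.
func transcribe(audioPath: String) async throws -> String {
    let whisperKit = try await WhisperKit()                    // loads a default Whisper model
    let results = try await whisperKit.transcribe(audioPath: audioPath)
    return results.map(\.text).joined(separator: " ")          // assumes the [TranscriptionResult] API
}
```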
### Upcoming Features
- Other OpenAI APIs (e.g. Embeddings, Images, Audio, etc.)
- Local LLMs
- Other Whisper models
- Google Gemini (iOS SDK)
- Other Stable Diffusion models
- iOS 18 / Apple Intelligence
  - Genmoji
  - Writing Tools
  - Image Playground