-
Ollama is new but a very powerful and simple way to run open-source LLMs on your own Mac with Metal support (they plan to support other OSes next). It's a Go program exposing a simple API to interact with different local LLM models; here is the documentation:

I want to create a simple frontend using the Vercel AI SDK. I've looked a bit into the documentation and I guess I'm going to use https://sdk.vercel.ai/docs/api-reference/use-completion ? But I couldn't find a way to add more parameters to specify the model to use. I would appreciate some guidance, and I'll share what I'm able to achieve, thanks!

EDIT: If nothing exists, I'm happy to put in some of my time to push a PR into the Vercel AI SDK otherwise. Cheers.
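EDIT 2: To make it concrete, this is roughly the client-side shape I'm after (only a sketch; the `body` option and the /api/completion route handler are my assumptions for how the model name could be forwarded, which is exactly the part I couldn't confirm in the docs):

'use client'

import { useCompletion } from 'ai/react'

export default function Completion () {
  // Assumption: extra `body` fields reach the route handler, which would then
  // pass `model` on to the local Ollama server
  const { completion, input, handleInputChange, handleSubmit } = useCompletion({
    api: '/api/completion',
    body: { model: 'llama2' }
  })

  return (
    <form onSubmit={handleSubmit}>
      <input value={input} onChange={handleInputChange} placeholder="Ask something..." />
      <p>{completion}</p>
    </form>
  )
}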
Replies: 7 comments
-
A PR would be accepted. You'll likely need to create a wrapper for the Ollama REST API, like here: https://github.com/vercel/ai/blob/main/packages/core/streams/anthropic-stream.ts
-
I've created a create-next-ollama-app starter by cloning the langchain example.
-
Why not use ollama-node, @brunnolou? I think that might give you something closer to the rest of the examples.
-
Here is a starter kit for the AI SDK & Ollama using ModelFusion (a library that I'm working on) as glue: https://github.com/lgrammel/modelfusion-ollama-nextjs-starter
-
This "worked" for me. It should handle both chat completions and normal completions. Submitted in #935

import {
  AIStream,
  readableFromAsyncIterable,
  type AIStreamCallbacksAndOptions,
  createCallbacksTransformer,
  createStreamDataTransformer
} from 'ai'

// Chat message interfaces
interface ChatMessage {
  role: 'system' | 'user' | 'assistant'
  content: string
  images?: string[]
}

interface ChatRequestParams {
  model: string
  messages: ChatMessage[]
  stream?: boolean
  format?: string
  options?: Record<string, unknown>
  template?: string
}

interface ChatResponse {
  model: string
  created_at: string
  message?: ChatMessage
  done: boolean
  total_duration?: number
  load_duration?: number
  prompt_eval_count?: number
  prompt_eval_duration?: number
  eval_count?: number
  eval_duration?: number
}

interface ChatCompletionChunk {
  model: string
  created_at: string
  message: ChatMessage
  done: boolean
}

export interface CompletionResponse {
  model: string
  created_at: string
  response: string
  done: boolean
  context?: number[]
  total_duration?: number
  load_duration?: number
  prompt_eval_count?: number
  prompt_eval_duration?: number
  eval_count?: number
  eval_duration?: number
}

export interface CompletionChunk {
  model: string
  created_at: string
  response: string
  done: boolean
}

// Union of the possible Ollama stream payloads
type OllamaStreamData = ChatResponse | ChatCompletionChunk | CompletionResponse | CompletionChunk

// Function to send chat requests
export async function sendChatRequest (data: ChatRequestParams): Promise<Response> {
  const url = `https://your_ollama_instance/api/chat`
  const response = await fetch(url, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      model: data.model,
      messages: data.messages,
      stream: data?.stream,
      format: data?.format,
      options: data?.options,
      template: data?.template
    })
  })
  return response
}

function parseOllamaStream (): (data: string) => OllamaStreamData | undefined {
  return data => {
    try {
      return JSON.parse(data)
    } catch (error) {
      if (error instanceof SyntaxError) {
        console.warn('Received non-JSON data:', data)
      } else {
        throw error
      }
    }
  }
}

async function * streamable<T> (stream: AsyncIterable<T>) {
  for await (const chunk of stream) {
    yield chunk
  }
}

// A modified version of the streamable function specifically for chat messages
async function * chatStreamable (
  stream: AsyncIterable<ChatResponse>,
) {
  for await (const response of stream) {
    if (response.message) {
      yield response.message
    }
    if (response.done) {
      // Additional final response data can be handled here if necessary
      return
    }
  }
}

export function OllamaStream (
  res: Response | AsyncIterable<OllamaStreamData>,
  cb?: AIStreamCallbacksAndOptions
): ReadableStream {
  if ('body' in res) {
    if (!res.body) {
      throw new Error('The provided Response has no body to stream.')
    }
    const asyncIterable = chunksToAsyncIterator(res.body, parseOllamaStream())
    return readableFromAsyncIterable(asyncIterable)
      .pipeThrough(createCallbacksTransformer(cb))
      .pipeThrough(createStreamDataTransformer(cb?.experimental_streamData))
  } else if (Symbol.asyncIterator in res) {
    return readableFromAsyncIterable(streamable(res))
      .pipeThrough(createCallbacksTransformer(cb))
      .pipeThrough(createStreamDataTransformer(cb?.experimental_streamData))
  } else {
    throw new Error('The provided resource is neither a Response nor an AsyncIterable.')
  }
}

// Helper function to convert a ReadableStream (from the fetch Response) into an
// AsyncIterable of text chunks, splitting Ollama's newline-delimited JSON
async function * chunksToAsyncIterator (
  stream: ReadableStream<Uint8Array>,
  parseFn: (data: string) => OllamaStreamData | undefined
): AsyncIterable<string> {
  let buffer = ''
  const reader = stream.getReader()
  const textDecoder = new TextDecoder()
  try {
    while (true) {
      const { done, value } = await reader.read()
      if (done) break
      buffer += textDecoder.decode(value, { stream: true })
      let boundary = buffer.indexOf('\n')
      while (boundary !== -1) {
        const dataToParse = buffer.substring(0, boundary)
        buffer = buffer.substring(boundary + 1)
        const parsedData = parseFn(dataToParse)
        if (parsedData && 'message' in parsedData && parsedData.message) {
          yield parsedData.message.content
        } else if (parsedData && 'response' in parsedData) {
          yield parsedData.response
        }
        boundary = buffer.indexOf('\n')
      }
    }
  } finally {
    reader.releaseLock()
  }
}

Then I called it with:

import { StreamingTextResponse } from 'ai'
import { OllamaStream, sendChatRequest } from '@lib/ollama/ollamaStream'

// <...snip...>

const ollamaResponse = await sendChatRequest(data)
const stream = OllamaStream(ollamaResponse)
const response = new StreamingTextResponse(stream)
return response
-
Ollama now has built-in compatibility with the OpenAI Chat Completions API (blog post: https://ollama.com/blog/openai-compatibility). Usage:

// app/api/chat/route.ts
import OpenAI from 'openai'
import { OpenAIStream, StreamingTextResponse } from 'ai'

export const runtime = 'edge'

const openai = new OpenAI({
  baseURL: 'http://localhost:11434/v1',
  apiKey: 'ollama', // required but not used
})

export async function POST(req: Request) {
  const { messages } = await req.json()

  const response = await openai.chat.completions.create({
    model: 'llama2',
    stream: true,
    messages,
  })

  const stream = OpenAIStream(response)
  return new StreamingTextResponse(stream)
}
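On the frontend, the stock useChat hook from the AI SDK should then work against this route unchanged. A minimal sketch (the component name and markup are just for illustration):

'use client'

import { useChat } from 'ai/react'

export default function Chat () {
  // useChat POSTs to /api/chat by default, i.e. the route handler above
  const { messages, input, handleInputChange, handleSubmit } = useChat()

  return (
    <form onSubmit={handleSubmit}>
      {messages.map(m => (
        <p key={m.id}>{m.role}: {m.content}</p>
      ))}
      <input value={input} onChange={handleInputChange} />
    </form>
  )
}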
-
With the Vercel AI SDK 3.1, there is a community provider for Ollama that works with the new AI functions: https://github.com/sgomez/ollama-ai-provider
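For example, something along these lines (a minimal sketch assuming a local Ollama server with the llama3 model pulled; check the repo's README for the exact provider API):

import { ollama } from 'ollama-ai-provider'
import { generateText } from 'ai'

// Generate a one-off completion from the local Ollama model
const { text } = await generateText({
  model: ollama('llama3'),
  prompt: 'Why is the sky blue?'
})

console.log(text)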