on runner startup, do all the ollama model warmup in one place #225

lukemarsden · 2024-03-19T11:07:13Z

Runners start multiple ollama processes.
The ollama processes share the model cache on the filesystem in the container.
If requests cause multiple ollama processes to warm up in parallel, they tread on each other's cache.

I've seen the model cache get corrupted in this case. We should just warm up the ollama model cache once, based on the warmup models, before spinning up multiple ollama instances.
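
A rough sketch of the proposed ordering, not the runner's actual code: pull every warmup model serially so the shared cache only ever has one writer, and start the ollama instances only after all pulls have succeeded. The names `warm_up_then_start`, `pull_with_ollama_cli` and `start_instance` are hypothetical. The default pull step shells out to `ollama pull` and assumes a single ollama server is already running and reachable for the duration of the warmup.

```python
import subprocess
from typing import Callable, Iterable, List


def pull_with_ollama_cli(model: str) -> None:
    """Pull one model with the `ollama pull` CLI.

    Assumption: exactly one ollama server is running at this point.
    """
    subprocess.run(["ollama", "pull", model], check=True)


def warm_up_then_start(
    warmup_models: Iterable[str],
    instance_count: int,
    start_instance: Callable[[int], None],
    pull_model: Callable[[str], None] = pull_with_ollama_cli,
) -> List[str]:
    """Warm the shared cache serially, then start the instances.

    Returns the models that were pulled, in order.
    """
    pulled: List[str] = []
    # De-duplicate while keeping order, so each model is pulled once.
    for model in dict.fromkeys(warmup_models):
        pull_model(model)  # one writer at a time; raises on failure
        pulled.append(model)
    # Only reached once every warmup model is in the cache.
    for index in range(instance_count):
        start_instance(index)
    return pulled
```

Because a failed pull raises before the second loop, no instance is started against a partially populated cache. `start_instance` is left as a callback because how the runner launches each ollama process is not described in this issue.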
