
server `/embeddings` API doesn't handle cases when physical batch size < prompt length #7422

Closed
wsxiaoys opened this issue May 20, 2024 · 1 comment


wsxiaoys (Contributor) commented May 20, 2024

HTTP Request

POST http://localhost:30888/embeddings HTTP/1.1
Content-Type: application/json

{
  "content": "## For more information about docker support in SkyPilot, please refer to the `image_id` section above.envs:MY_BUCKET:skypilot-temp-gcs-testMY_LOCAL_PATH:tmp-workdirMODEL_SIZE:13bfile_mounts:# Uses rsync to sync local files/directories to all nodes of the cluster.## If a relative path is used, it\"s evaluated relative to the location from# which `sky` is called.## If symlinks are present, they are copied as symlinks, and their targets# must also be synced using file_mounts to ensure correctness./remote/dir1/file:/local/dir1/file/remote/dir2:/local/dir2# Create a S3 bucket named sky-dataset, uploads the contents of# /local/path/datasets to the bucket, and marks the bucket as persistent# (it will not be deleted after the completion of this task).# Symlinks and their contents are NOT copied.## Mounts the bucket at /datasets-storage on every node of the cluster./datasets-storage:name:sky-dataset# Name of storage, optional when source is bucket URIsource:/local/path/datasets# Source path, can be local or s3/gcs URL. Optional, do not specify to create an empty bucket.store:s3# Could be either \"s3\", \"gcs\" or \"r2\"; default: None. Optional.persistent:True# Defaults to True; can be set to false to delete bucket after cluster is downed. Optional.mode:MOUNT# Either MOUNT or COPY. Defaults to MOUNT. Optional.# Copies a cloud object store URI to the cluster. Can be private buckets./datasets\n-s3:s3://my-awesome-dataset# Demoing env var usage./checkpoint/${MODEL_SIZE}:~/${MY_LOCAL_PATH}/mydir:name:${MY_BUCKET}# Name of the bucket.mode:MOUNT# Setup script (optional) to execute on every `sky launch`.# This is executed before the \"run\" commands.## The \"|\" separator indicates a multiline string. 
To specify a single command:#   setup: pip install -r requirements.txtsetup:|echo Begin setup.pip install -r requirements.txtecho Setup complete.# Main program (optional, but recommended) to run on every node of the cluster.run:|echo Beginning task.python train.py# Demoing env var usage.echo Env var MODEL_SIZE has value: ${MODEL_SIZE}\n\n"
}

Commands used to start the server:

# Returns an empty embedding / 500 error
llama-server -m /Users/meng/Projects/models/nomic/ggml/model.gguf --port 30888 --ctx-size 4096 --embedding -ngl 9999 -cb -ub 512

# Returns the correct embedding, since the input prompt has a length of 613 tokens
llama-server -m /Users/meng/Projects/models/nomic/ggml/model.gguf --port 30888 --ctx-size 4096 --embedding -ngl 9999 -cb -ub 614
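To make the constraint concrete: for embedding requests the whole prompt has to fit in a single physical batch, so `-ub` must be at least the prompt length in tokens. A minimal Python sketch of how a client could pick a safe `-ub` value before launching the server (the helper name is mine, not part of llama.cpp):

```python
def choose_ubatch(prompt_tokens: int, default_ub: int = 512) -> int:
    """Pick a physical batch size (-ub) large enough for an embedding prompt.

    Assumption (from this issue): the whole prompt must fit in one physical
    batch, so -ub has to be at least the prompt length in tokens; otherwise
    llama-server rejects the request with a 500 error.
    """
    return max(default_ub, prompt_tokens)

# The failing repro above: a 613-token prompt with the default -ub of 512.
print(choose_ubatch(613))  # 613 -- restarting with -ub >= 613 succeeds
print(choose_ubatch(100))  # 512 -- short prompts keep the default
```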

Error output

HTTP/1.1 500 Internal Server Error
Access-Control-Allow-Origin: 
Connection: close
Content-Length: 120
Content-Type: application/json; charset=utf-8
Server: llama.cpp

{
  "error": {
    "code": 500,
    "message": "input is too large to process. increase the physical batch size",
    "type": "server_error"
  }
}

Seems related: #6996
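Until the server splits long embedding prompts itself, one possible client-side workaround is to chunk the input so each piece fits in the physical batch, embed the chunks separately, and pool the resulting vectors. A minimal sketch under those assumptions (the helper is illustrative, not a llama.cpp API; naive chunk boundaries that ignore sentence structure):

```python
def chunk_tokens(tokens: list, max_ubatch: int = 512) -> list:
    """Split a token sequence into chunks that each fit in one physical batch.

    Each chunk can then be sent to /embeddings on its own; the client pools
    (e.g. averages) the per-chunk vectors into one embedding.
    """
    return [tokens[i:i + max_ubatch] for i in range(0, len(tokens), max_ubatch)]

# The 613-token prompt from this issue splits into a full batch plus a tail.
chunks = chunk_tokens(list(range(613)), max_ubatch=512)
print([len(c) for c in chunks])  # [512, 101]
```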

wsxiaoys (Contributor, Author) commented:

Seems to be working as intended (WAI) according to #7389.
