
feat(bedrock.py): Add Cloudflare AI Gateway support #3467

Open · wants to merge 2 commits into base: main
Conversation

Manouchehri (Collaborator)

Title

This adds Cloudflare AI Gateway support for Bedrock.

Relevant issues

Resolves #1040.

Type

🆕 New Feature
🚄 Infrastructure

Changes

We add a boto3 event hook that rewrites the request URL after SigV4 signing, but before the request is sent.
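As a rough sketch of how such a hook can work (the function `swap_endpoint` and the registration snippet below are illustrative, not LiteLLM's actual code): botocore emits a `before-send` event after the request has been signed, so a handler registered there can rewrite the scheme and host to point at the Cloudflare AI Gateway while keeping the signed path and query intact.

```python
# Sketch only: rewrite a signed Bedrock request URL to go through a
# Cloudflare AI Gateway base URL, preserving the original path and query.
from urllib.parse import urlparse, urlunparse

def swap_endpoint(url: str, gateway_base: str) -> str:
    """Replace the scheme://host prefix of a signed request URL with the
    gateway base (which carries its own path prefix), keeping the
    original request path and query string."""
    parsed = urlparse(url)
    base = urlparse(gateway_base)
    return urlunparse((
        base.scheme,
        base.netloc,
        base.path.rstrip("/") + parsed.path,
        parsed.params,
        parsed.query,
        parsed.fragment,
    ))

# Registering the hook on a boto3 client would look roughly like this
# (event name and handler signature per botocore's event system):
#
# client.meta.events.register(
#     "before-send.bedrock-runtime.*",
#     lambda request, **kwargs: setattr(
#         request, "url", swap_endpoint(request.url, GATEWAY_URL)
#     ),
# )
```

Because `before-send` fires after signing, the SigV4 signature (computed against the original AWS host and path) is untouched; the gateway forwards the request to the real Bedrock endpoint, where the signature still validates.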

Testing

```yaml
model_list:
  - model_name: claude-3-haiku-20240307
    litellm_params:
      model: bedrock/anthropic.claude-3-haiku-20240307-v1:0
      aws_region_name: us-east-1
      max_tokens: 4096
      aws_bedrock_runtime_endpoint: https://gateway.ai.cloudflare.com/v1/ACCOUNT_ID_HERE/GATEWAY_ID_HERE/aws-bedrock/bedrock-runtime/us-east-1
```
```shell
curl -v "${OPENAI_API_BASE}/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "claude-3-haiku-20240307",
    "max_tokens": 100,
    "temperature": 1.0,
    "messages": [
      {
        "role": "user",
        "content": "Tell a joke."
      }
    ]
  }'
```

Notes

Cloudflare AI Gateway seems to break streaming support at the moment; I'm not sure why.

```shell
curl -v "${OPENAI_API_BASE}/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "claude-3-haiku-20240307",
    "max_tokens": 100,
    "temperature": 1.0,
    "stream": true,
    "messages": [
      {
        "role": "user",
        "content": "Tell a joke."
      }
    ]
  }'
```

The stream returns this error:

```text
data: {"error": {"message": "Header length of 3216834560 exceeded the maximum of 131072\n\nTraceback (most recent call last):\n  File \"/usr/local/lib/python3.11/site-packages/litellm/proxy/proxy_server.py\", line 3159, in async_data_generator\n    async for chunk in response:\n  File \"/usr/local/lib/python3.11/site-packages/litellm/utils.py\", line 10971, in __anext__\n    raise e\n  File \"/usr/local/lib/python3.11/site-packages/litellm/utils.py\", line 10903, in __anext__\n    chunk = next(self.completion_stream)\n            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/usr/local/lib/python3.11/site-packages/botocore/eventstream.py\", line 602, in __iter__\n    for event in self._event_generator:\n  File \"/usr/local/lib/python3.11/site-packages/botocore/eventstream.py\", line 611, in _create_raw_event_generator\n    yield from event_stream_buffer\n  File \"/usr/local/lib/python3.11/site-packages/botocore/eventstream.py\", line 544, in __next__\n    return self.next()\n           ^^^^^^^^^^^\n  File \"/usr/local/lib/python3.11/site-packages/botocore/eventstream.py\", line 536, in next\n    self._prelude = self._parse_prelude()\n                    ^^^^^^^^^^^^^^^^^^^^^\n  File \"/usr/local/lib/python3.11/site-packages/botocore/eventstream.py\", line 480, in _parse_prelude\n    self._validate_prelude(prelude)\n  File \"/usr/local/lib/python3.11/site-packages/botocore/eventstream.py\", line 471, in _validate_prelude\n    raise InvalidHeadersLength(prelude.headers_length)\nbotocore.eventstream.InvalidHeadersLength: Header length of 3216834560 exceeded the maximum of 131072\n", "type": "None", "param": "None", "code": 500}}
```
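Some context on that traceback (a sketch for illustration, not LiteLLM code): Bedrock's streaming responses use AWS event-stream framing, in which every message begins with a 12-byte prelude holding the total message length, the headers length, and a CRC of those first eight bytes. botocore rejects any headers length above 131072, so if the gateway returns something that is not event-stream framed (for example a plain error body), the first bytes of the response get read as a huge bogus headers length, which is consistent with the `InvalidHeadersLength` error above.

```python
# Minimal illustration of the event-stream prelude check that is failing.
import struct

MAX_HEADERS_LENGTH = 128 * 1024  # botocore's limit, 131072 bytes

def parse_prelude(data: bytes) -> tuple[int, int]:
    """Read (total_length, headers_length) from the first 8 prelude bytes,
    rejecting implausible headers lengths the way botocore does."""
    total_length, headers_length = struct.unpack(">II", data[:8])
    if headers_length > MAX_HEADERS_LENGTH:
        raise ValueError(
            f"Header length of {headers_length} exceeded the maximum "
            f"of {MAX_HEADERS_LENGTH}"
        )
    return total_length, headers_length
```

Feeding arbitrary non-event-stream bytes into a parser like this will almost always trip the length check, so the bug is likely in how the gateway relays (or buffers) the streamed response body rather than in the hook itself.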

Pre-Submission Checklist (optional but appreciated):

  • I have included relevant documentation updates (stored in /docs/my-website)

OS Tests (optional but appreciated):

  • Tested on Windows
  • Tested on MacOS
  • Tested on Linux


@ishaan-jaff (Contributor) left a comment:

please add a test for this @Manouchehri - want to make sure no regressions occur. I believe you can mock bedrock calls

@Manouchehri (Collaborator, Author) replied:

> please add a test for this @Manouchehri - want to make sure no regressions occur. I believe you can mock bedrock calls

I added a test, let me know if it fails. =)

@Manouchehri (Collaborator, Author) commented May 7, 2024:

```shell
cd litellm/tests/
poetry run pytest test_bedrock_completion.py::test_completion_bedrock_cloudflare_ai_gateway -s -v
```

Confirmed, it passes. =)

```diff
@@ -607,6 +621,11 @@ def init_bedrock_client(
     else:
         endpoint_url = f"https://bedrock-runtime.{region_name}.amazonaws.com"

+    real_endpoint_url = None
```
A Contributor commented on this diff:

please can we not have 2 floating variables both with 'endpoint_url' :D

is it possible for us to have cloudflare logic in 'cloudflare.py' and just have that function wrap this?

having the cloudflare logic in here, looks like it complicates this

Contributor:

we could enforce types on functions in here, so any wrapper function can always know what it's going to get
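A hypothetical sketch of the typed-wrapper idea (all names here are illustrative, not LiteLLM's actual API): a small annotated function, which could live in `cloudflare.py`, could own the endpoint decision so `init_bedrock_client` keeps a single `endpoint_url` variable and any caller knows exactly what it gets back.

```python
# Illustrative only: a typed helper that resolves the Bedrock endpoint and
# reports whether requests should be routed through Cloudflare AI Gateway.
from typing import Optional, Tuple

def resolve_bedrock_endpoint(
    aws_bedrock_runtime_endpoint: Optional[str],
    region_name: str,
) -> Tuple[str, bool]:
    """Return (endpoint_url, is_cloudflare_gateway)."""
    default = f"https://bedrock-runtime.{region_name}.amazonaws.com"
    if aws_bedrock_runtime_endpoint is None:
        return default, False
    is_gateway = "gateway.ai.cloudflare.com" in aws_bedrock_runtime_endpoint
    return aws_bedrock_runtime_endpoint, is_gateway
```

With the return type pinned down, the Cloudflare-specific URL rewriting could branch on the boolean in one place instead of tracking two endpoint variables side by side.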

Collaborator (Author):

Hmm, isn't having a floating variable the easiest way of doing this? :)

We could separate the logic out, but it's very Bedrock + Cloudflare AI Gateway specific. e.g. the code for Azure OpenAI + Cloudflare AI Gateway is totally different.

Successfully merging this pull request may close these issues.

[Feature]: Cloudflare AI Gateway support for Bedrock