Add Cohere Rerank 3 Support #466

Draft
wants to merge 8 commits into
base: main

ystoneman

Issue #, if available: #280

Description of changes: Currently, this solution only supports cross-encoder/ms-marco-MiniLM-L-12-v2. I want to add Cohere Rerank 3 as an option.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@ystoneman
Author

@bigadsoleiman @azaylamba, do you have any idea why my changes to add Cohere Rerank 3 support aren't working?

When I open the Cross-Encoder Model dropdown menu on the /rag/cross-encoders page, I still see only the cross-encoder/ms-marco-MiniLM-L-12-v2 model.

I made sure to define the COHERE_API_KEY in Secrets Manager.

In the browser dev tools, I can see the array only has the one cross-encoder:

[{…}]
0: {provider: 'sagemaker', name: 'cross-encoder/ms-marco-MiniLM-L-12-v2', default: true, __typename: 'CrossEncoderData'}
length: 1
[[Prototype]]: Array(0)

@azaylamba
Contributor

Does your config.json include the Cohere model after you run npm run config? The dropdown menu loads its list from the config object. See get_cross_encoder_models() in lib/shared/layers/python-sdk/python/genai_core/cross_encoder.py
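
As a rough illustration of this check, the sketch below lists the cross-encoder entries from a parsed config.json. It assumes the list lives under rag.crossEncoderModels, as in the config posted later in this thread. The helper name list_cross_encoders is hypothetical and is not the project's get_cross_encoder_models(), whose actual implementation is not shown here.

```python
import json


def list_cross_encoders(config: dict) -> list:
    """Return the cross-encoder entries from a parsed config.json.

    Hypothetical helper for illustration only.
    Assumption: the models are stored under rag.crossEncoderModels.
    """
    return config.get("rag", {}).get("crossEncoderModels", [])


# A config fragment shaped like the one shared in this thread.
sample = json.loads(
    """
    {"rag": {"crossEncoderModels": [
        {"provider": "sagemaker",
         "name": "cross-encoder/ms-marco-MiniLM-L-12-v2",
         "default": true},
        {"provider": "cohere", "name": "rerank-english-v3.0"}
    ]}}
    """
)

# Collect the provider of every configured cross-encoder.
providers = [model["provider"] for model in list_cross_encoders(sample)]
print(providers)
```

If the Cohere entry is missing from this list, the dropdown would presumably not show it either, since it is populated from the same config object.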

@ystoneman
Author

ystoneman commented Apr 22, 2024

Is the following cdk-nag error something anyone else has encountered while modifying this project, @bigadsoleiman, @massi-ang, or @azaylamba?


- Adjust chunk size limit for this warning via build.chunkSizeWarningLimit.
✓ built in 23.95s
/home/ubuntu/environment/aws-genai-llm-chatbot/node_modules/cdk-nag/src/nag-suppressions.ts:98
    pathArray.forEach((p) => {
              ^
Error: Suppression path "/cloud9GenAIChatBotStack/RagEngines/SageMaker/Model/MultiAB24A/CodeBuildRole/DefaultPolicy/Resource" did not match any resource. This can occur when a resource does not exist or if a suppression is applied before a resource is created.
    at /home/ubuntu/environment/aws-genai-llm-chatbot/node_modules/cdk-nag/src/nag-suppressions.ts:115:15
    at Array.forEach (<anonymous>)
    at Function.addResourceSuppressionsByPath (/home/ubuntu/environment/aws-genai-llm-chatbot/node_modules/cdk-nag/src/nag-suppressions.ts:98:15)
    at new AwsGenAILLMChatbotStack (/home/ubuntu/environment/aws-genai-llm-chatbot/lib/aws-genai-llm-chatbot-stack.ts:273:25)
    at Object.<anonymous> (/home/ubuntu/environment/aws-genai-llm-chatbot/bin/aws-genai-llm-chatbot.ts:13:1)
    at Module._compile (node:internal/modules/cjs/loader:1369:14)
    at Module.m._compile (/home/ubuntu/environment/aws-genai-llm-chatbot/node_modules/ts-node/src/index.ts:1618:23)
    at Module._extensions..js (node:internal/modules/cjs/loader:1427:10)
    at Object.require.extensions.<computed> [as .ts] (/home/ubuntu/environment/aws-genai-llm-chatbot/node_modules/ts-node/src/index.ts:1621:12)
    at Module.load (node:internal/modules/cjs/loader:1206:32)

Subprocess exited with error 1

This happened after I re-ran npm run config and then npx cdk deploy. I had it use the same prefix as before because the stack already existed, so I provided the existing VPC ID. And I selected "no" for VPC endpoints since those were already created by the previous deployment.

Re-running everything did make the cohere rerank model show up in the config file, but now I'm having this new issue.

@azaylamba
Contributor

@ystoneman Can you share the config.json file?

@ystoneman
Author

Thanks for following up @azaylamba. Here's my config.json:

{
  "prefix": "cloud9",
  "vpc": {
    "vpcId": "vpc-0906dfbea13ffd463",
    "createVpcEndpoints": false
  },
  "privateWebsite": false,
  "certificate": "",
  "domain": "",
  "cfGeoRestrictEnable": false,
  "cfGeoRestrictList": [],
  "bedrock": {
    "enabled": true,
    "region": "us-east-1"
  },
  "llms": {
    "sagemaker": [],
    "huggingfaceApiSecretArn": ""
  },
  "rag": {
    "enabled": true,
    "engines": {
      "aurora": {
        "enabled": false
      },
      "opensearch": {
        "enabled": true
      },
      "kendra": {
        "enabled": false,
        "createIndex": false,
        "external": [],
        "enterprise": false
      }
    },
    "embeddingsModels": [
      {
        "provider": "sagemaker",
        "name": "intfloat/multilingual-e5-large",
        "dimensions": 1024
      },
      {
        "provider": "sagemaker",
        "name": "sentence-transformers/all-MiniLM-L6-v2",
        "dimensions": 384
      },
      {
        "provider": "bedrock",
        "name": "amazon.titan-embed-text-v1",
        "dimensions": 1536
      },
      {
        "provider": "bedrock",
        "name": "amazon.titan-embed-image-v1",
        "dimensions": 1024
      },
      {
        "provider": "bedrock",
        "name": "cohere.embed-english-v3",
        "dimensions": 1024,
        "default": true
      },
      {
        "provider": "bedrock",
        "name": "cohere.embed-multilingual-v3",
        "dimensions": 1024
      },
      {
        "provider": "openai",
        "name": "text-embedding-ada-002",
        "dimensions": 1536
      }
    ],
    "crossEncoderModels": [
      {
        "provider": "sagemaker",
        "name": "cross-encoder/ms-marco-MiniLM-L-12-v2",
        "default": true
      },
      {
        "provider": "cohere",
        "name": "rerank-english-v3.0"
      }
    ]
  }
}

@massi-ang
Collaborator

Hi @ystoneman, this issue occurs because a resource referenced in the nag-suppression rules is no longer present, specifically /cloud9GenAIChatBotStack/RagEngines/SageMaker/Model/MultiAB24A/CodeBuildRole/DefaultPolicy/Resource. The current logic applies this suppression rule when OpenSearch or Aurora is selected as a RAG engine. I suppose that in your case you have disabled the SageMaker cross-encoder and use only the external Cohere reranker, which might explain this issue.
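
The mismatch described here can be sketched as two independent checks on the config: whether the suppression rule would apply, and whether any SageMaker-hosted model is configured at all. This is only a sketch under assumptions: both function names are hypothetical, and whether the SageMaker model resource actually gets created depends on project logic that is not shown in this thread.

```python
def suppression_rule_applies(config: dict) -> bool:
    """True when OpenSearch or Aurora is enabled as a RAG engine.

    Mirrors the condition described in the comment; hypothetical helper.
    """
    engines = config.get("rag", {}).get("engines", {})
    return any(
        engines.get(name, {}).get("enabled", False)
        for name in ("aurora", "opensearch")
    )


def has_sagemaker_models(config: dict) -> bool:
    """True when any embeddings or cross-encoder model uses the sagemaker provider.

    Assumption: these two lists are what drive creation of the SageMaker model.
    """
    rag = config.get("rag", {})
    models = rag.get("embeddingsModels", []) + rag.get("crossEncoderModels", [])
    return any(model.get("provider") == "sagemaker" for model in models)


# Illustrative config: OpenSearch enabled, but no SageMaker-hosted models.
sample = {
    "rag": {
        "engines": {"aurora": {"enabled": False}, "opensearch": {"enabled": True}},
        "embeddingsModels": [{"provider": "bedrock", "name": "cohere.embed-english-v3"}],
        "crossEncoderModels": [{"provider": "cohere", "name": "rerank-english-v3.0"}],
    }
}

# The suppression would be requested for a resource that is never created.
mismatch = suppression_rule_applies(sample) and not has_sagemaker_models(sample)
print(mismatch)
```

For the config.json posted above, both checks return True, so this particular mismatch would not be expected from the config alone.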

@ystoneman ystoneman changed the title Add Cohere Rerank 3 Support to Add Cohere Rerank 3 Support May 7, 2024
@ystoneman
Author

> Hi @ystoneman, this issue is due to some resource that was included in the nag-suppression rules to not be present any more. In particular /cloud9GenAIChatBotStack/RagEngines/SageMaker/Model/MultiAB24A/CodeBuildRole/DefaultPolicy/Resource. The current logic applies this suppression rule when opensearch or auroradb are selected as RAG Engines. Now I suppose that in your case you have disabled the SM cross encoder and only use the external cohere reranker which might explain this issue.

Hi @massi-ang, thanks for your response. I don't think I'm disabling the SageMaker cross-encoder, because both cross-encoders are specified in the config.json.

My desired behavior is that the SageMaker cross-encoder still gets deployed, but I want to provide the ability to toggle between that and the Cohere rerank-english-v3.0 external API endpoint in the cross-encoder dropdown in the UI.
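
To make the desired toggle concrete, here is a minimal sketch of dispatching a rerank request by the provider field of the selected cross-encoder. Everything in it is an assumption for illustration: the registry, the rank_passages function, and the two stub rankers are hypothetical stand-ins and do not call SageMaker or the Cohere API.

```python
from typing import Callable, Dict, List

# A ranker takes a query and passages and returns one score per passage.
Ranker = Callable[[str, List[str]], List[float]]


def rank_passages(
    registry: Dict[str, Ranker], provider: str, query: str, passages: List[str]
) -> List[float]:
    """Route the request to the ranker registered for the given provider."""
    if provider not in registry:
        raise ValueError(f"Unsupported cross-encoder provider: {provider}")
    scores = registry[provider](query, passages)
    if len(scores) != len(passages):
        raise ValueError("Ranker must return exactly one score per passage")
    return scores


# Stub rankers standing in for the real SageMaker endpoint and Cohere API calls.
def fake_sagemaker_ranker(query: str, passages: List[str]) -> List[float]:
    return [float(len(p)) for p in passages]


def fake_cohere_ranker(query: str, passages: List[str]) -> List[float]:
    return [float(p.count(query)) for p in passages]


registry = {"sagemaker": fake_sagemaker_ranker, "cohere": fake_cohere_ranker}

# The provider would come from the model chosen in the UI dropdown.
scores = rank_passages(registry, "cohere", "rerank", ["rerank me", "unrelated"])
print(scores)
```

Keeping both providers in one registry means the SageMaker cross-encoder can stay deployed while the external endpoint is selected per request.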

Could you please clarify if there's a better way to handle this scenario in the suppression rules?
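
One possible direction, offered only as a sketch, is to apply a path-based suppression only when the target path actually exists, which would avoid the "did not match any resource" error. The function below is a hypothetical, framework-free illustration. In a real CDK app the existing paths would have to be collected from the construct tree, and the leading-slash normalization is an assumption about how paths are compared.

```python
from typing import Iterable, List, Tuple


def split_suppression_paths(
    requested: Iterable[str], existing: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Split requested suppression paths into (present, missing).

    Assumption: paths are compared after stripping any leading slash.
    """
    known = {path.lstrip("/") for path in existing}
    present: List[str] = []
    missing: List[str] = []
    for path in requested:
        (present if path.lstrip("/") in known else missing).append(path)
    return present, missing


# The path from the error above, plus a hypothetical path that does exist.
requested = [
    "/cloud9GenAIChatBotStack/RagEngines/SageMaker/Model/MultiAB24A/CodeBuildRole/DefaultPolicy/Resource",
    "/cloud9GenAIChatBotStack/Example/Resource",
]
# Hypothetical set of construct paths found in the stack.
existing = ["cloud9GenAIChatBotStack/Example/Resource"]

present, missing = split_suppression_paths(requested, existing)
print("apply:", present)
print("skip:", missing)
```

Only the paths in present would then be passed to the suppression call, and the ones in missing could be logged to help diagnose why the resource was not created.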

3 participants