Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Retire openapi3, use openapi-service-client instead #7514

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

vblagoje
Copy link
Member

@vblagoje vblagoje commented Apr 9, 2024

Why:

The pull request eviscerates the underlying openapi3 library used under the hood in OpenAPIServiceConnector and replaces it with out own openapi library. The main motivations for replacement of the old library are:

  • openapi3 seems to be in maintenance mode (PRs are piling up)
  • it had some bugs we can't fix and control in the future

These bugs were related to their mechanism of checking for responses being valid as defined by schema. In theory that works great but one could not make this validation step optional and they had a few bugs in there + developers sometimes make tiny mistakes defining schemas in OpenAPI specs. Validation doesn't matter at all for LLMs and we can add this request/response validation later or not at all. A third party lib like pydanticv2 can do the validation of both.

What:

  • Replaced OpenAPI library usage with openapi-service-client library abstractions.
  • Replaced OpenAPIServiceToFunctions OpenAPI schema -> LLM tools translation with use of get_tools_definitions invocation
  • Improved error handling by using a try-except block with logging and returning a JSON error object upon failure.
  • Eliminated a complex and now unnecessary authentication mechanism (migrated to openapi-service-client library)
  • Removed code that dealt with parsing raw JSON responses into model objects.
  • Modified the unit tests to reflect the above changes and introduced an integration test that relies on an actual API key from the environment.
  • Many tests were removed because we added exhaustive testing to openapi-service-client library and the tests testing openapi integration in previous versions of these components do need to be duplicated

How can it be used:

  • The functionality and UX is as before
  • Users will likely need to update their OutputAdapter like this:
# An OutputAdapter filter we'll use to setup function calling
def prepare_fc_params(openai_functions_schema: Dict[str, Any]) -> Dict[str, Any]:
    return {
        "tools": [{
            "type": "function",
            "function": openai_functions_schema
        }],
        "tool_choice": {
            "type": "function",
            "function": {"name": openai_functions_schema["name"]}
        }
    }

to:

def prepare_fc_params(openai_functions_schema: Dict[str, Any]) -> Dict[str, Any]:
    return {"tools": [openai_functions_schema]}

How did you test it:

  • The existing unit tests were refactored to mock OpenAPIServiceClient instead of OpenAPI. The mocking and assertions were adapted to the new implementation.
  • An integration test was added, using a real API key (retrieved from an environment variable), to connect with SerperDev service
  • Our serper.dev integration notebook has been updated for manual test
  • Our dynamic service demo notebook

Notes for the reviewer:

  • In the diff, note how openapi3 is replaced with openapi-service-client to ensure that all of the intended functionality is preserved.
  • Run the old serper.dev notebook and a new one and notice the slight changes from above
  • Run the old dynamic service demo notebook and compare it to new one, notice the minimal change - only in tools definition

@github-actions github-actions bot added topic:tests topic:build/distribution 2.x Related to Haystack v2.0 type:documentation Improvements on the docs labels Apr 9, 2024
@coveralls
Copy link
Collaborator

coveralls commented Apr 9, 2024

Pull Request Test Coverage Report for Build 9076174278

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 9 unchanged lines in 2 files lost coverage.
  • Overall coverage increased (+0.1%) to 90.512%

Files with Coverage Reduction New Missed Lines %
components/converters/openapi_functions.py 3 91.89%
components/connectors/openapi_service.py 6 85.0%
Totals Coverage Status
Change from base Build 9074873557: 0.1%
Covered Lines: 6420
Relevant Lines: 7093

💛 - Coveralls

@vblagoje vblagoje force-pushed the integrate_openapi_service_client branch from aabe2d7 to b109d3c Compare April 9, 2024 13:45
@vblagoje vblagoje force-pushed the integrate_openapi_service_client branch from b109d3c to c0a9024 Compare April 16, 2024 13:49
@vblagoje vblagoje force-pushed the integrate_openapi_service_client branch 4 times, most recently from b959918 to 56a1440 Compare May 2, 2024 09:39
@vblagoje vblagoje force-pushed the integrate_openapi_service_client branch from 8f19fcd to e0ae77e Compare May 7, 2024 08:18
@vblagoje vblagoje marked this pull request as ready for review May 7, 2024 14:34
@vblagoje vblagoje requested review from a team as code owners May 7, 2024 14:34
@vblagoje vblagoje requested review from dfokina, anakin87 and masci and removed request for a team May 7, 2024 14:34
@vblagoje
Copy link
Member Author

vblagoje commented May 7, 2024

@anakin87 removing you because @masci and I talked offline about this one and he's already familiar with the context. We also likely need to go back and forth a bit here as well

@vblagoje vblagoje removed the request for review from anakin87 May 7, 2024 14:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2.x Related to Haystack v2.0 topic:build/distribution topic:tests type:documentation Improvements on the docs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants