Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document Proxy-Wasm ABI v0.2.1. #42

Open
wants to merge 23 commits into
base: main
Choose a base branch
from

Conversation

PiotrSikora
Copy link
Contributor

Fixes #41.

Fixes proxy-wasm#41.

Signed-off-by: Piotr Sikora <piotr@aviatrix.com>
@PiotrSikora
Copy link
Contributor Author

Please take a look, I'm sure there are some errors, but it should be fairly complete.

I'll wait a week or two before merging this to make sure all mistakes are caught and fixed. Afterwards, I'll push v0.1.0 and v0.2.0 for completeness, since it's going to be trivial to do those.

Copy link

@codefromthecrypt codefromthecrypt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started making some comments, then realized they would be the same through the whole doc. Hopefully this gives a good idea of what may be helpful.

Called when the Wasm module is first loaded.


### `proxy_on_memory_allocate`

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe somewhere say that either proxy_on_memory_allocate or malloc may be absent (not exported by the guest), but not both?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a little tricky. One of those callbacks is required in the current Envoy implementation, but technically plugins don't need to export this if they don't retrieve strings/vectors/buffers/maps/properties from the host.

abi-versions/v0.2.1/README.md Show resolved Hide resolved
abi-versions/v0.2.1/README.md Show resolved Hide resolved
abi-versions/v0.2.1/README.md Show resolved Hide resolved
abi-versions/v0.2.1/README.md Show resolved Hide resolved


### `wasi_snapshot_preview1.fd_write`

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same.. summary comment if this is required, which would be odd. in fact, _start and this may be better put into an "optional support" section unless they are required.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in general, it would be better to keep all things like this (things that are not defined in this ABI) as a historical appendix. That envoy's host exports a few functions is interesting info, so is that some compilers only have a wasi target. However, even though these topics affect portability, these are different than the ABI in general which is used by hosts besides envoy such as mosn cc @antJack

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But those are neither historical not optional, and intercepting fd_write is actually quite useful when integrating with existing 3rd party libraries that print to stdout and/or stderr.

Ideally, all Proxy-Wasm plugins should use proper logging facilitates and use proxy_log, but that's not always possible.

abi-versions/v0.2.1/README.md Show resolved Hide resolved
Copy link

@mpwarres mpwarres left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great, thanks for writing it!

abi-versions/v0.2.1/README.md Outdated Show resolved Hide resolved
abi-versions/v0.2.1/README.md Outdated Show resolved Hide resolved
abi-versions/v0.2.1/README.md Outdated Show resolved Hide resolved
abi-versions/v0.2.1/README.md Outdated Show resolved Hide resolved
abi-versions/v0.2.1/README.md Outdated Show resolved Hide resolved
abi-versions/v0.2.1/README.md Outdated Show resolved Hide resolved
abi-versions/v0.2.1/README.md Outdated Show resolved Hide resolved
abi-versions/v0.2.1/README.md Outdated Show resolved Hide resolved
abi-versions/v0.2.1/README.md Outdated Show resolved Hide resolved
abi-versions/v0.2.1/README.md Outdated Show resolved Hide resolved
- `SERIALIZATION_FAILURE` TODO
- `INVALID_MEMORY_ACCESS` when `path_data`, `path_size`,
`return_value_data` and/or `return_value_size` point to invalid
memory address.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is incomplete without a description of the encoding of the input / output I think.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, but I don't think that should hold up this PR; personally I'd be fine with it landing in a follow-on.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, I'm treating proxy_{get,set}_property in v0.2.1 as a opaque pass-thru to CEL (hence the comment at the beginning of this section), and I was hoping not to document existing paths, since the set isn't stable across host implementations.

Maybe we could redirect users to the documentation in proxies for CEL attributes?

In any case, I think CEL makes sense for a lot of use-cases (e.g. when used in expressions, or to access unspecified objects like Envoy's filter state or dynamic metadata), but I don't think it works great as a generic property getter for predefined values.

Notably:

  1. Querying using strings vs enums is wasteful, and inconsistent with rest of the Proxy-Wasm ABI.
  2. The set isn't stable, so it's impossible to enforce it at the ABI layer, and we had issues with that in the past.
  3. There is a some overlap with existing functions (e.g. access to header maps), which is source of some confusion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it works great as a generic property getter for predefined values.

huge +1


## HTTP fields and gRPC metadata

TODO

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Data encoding is almost as important as the function declarations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I'll add this shortly. I left TODO because I didn't want to delay this PR any longer, but I intend to complete this sections before merging this PR.

Signed-off-by: Piotr Sikora <piotr@aviatrix.com>
Signed-off-by: Piotr Sikora <piotr@aviatrix.com>
Signed-off-by: Piotr Sikora <piotr@aviatrix.com>
Signed-off-by: Piotr Sikora <piotr@aviatrix.com>
Signed-off-by: Piotr Sikora <piotr@aviatrix.com>
Signed-off-by: Piotr Sikora <piotr@aviatrix.com>
Signed-off-by: Piotr Sikora <piotr@aviatrix.com>
Signed-off-by: Piotr Sikora <piotr@aviatrix.com>
Signed-off-by: Piotr Sikora <piotr@aviatrix.com>
Signed-off-by: Piotr Sikora <piotr@aviatrix.com>
Signed-off-by: Piotr Sikora <piotr@aviatrix.com>
Signed-off-by: Piotr Sikora <piotr@aviatrix.com>
Signed-off-by: Piotr Sikora <piotr@aviatrix.com>
Signed-off-by: Piotr Sikora <piotr@aviatrix.com>
Signed-off-by: Piotr Sikora <piotr@aviatrix.com>
Signed-off-by: Piotr Sikora <piotr@aviatrix.com>

#### `proxy_header_map_type_t`

- `HTTP_REQUEST_HEADERS` = `0`
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @PiotrSikora! Quick question: has it been considered to make URI arguments a map type as well? How are proxy-wasm users currently expected to manipulate that part of the request? Through platform/implementation-specific properties maybe? Would it make sense to introduce a HTTP_REQUEST_URI_ARGS type?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @thibaultcha, it's been a while!

I don't recall a map with URI query parameters ever being discussed, no.

Right now, the only option is to modify the :path pseudo-header which contains the query parameters.

Adding HTTP_REQUEST_URI_ARGS map that allows access to individual parameters could work, but I'm not sure if there is much value there vs modifying :path directly. Most APIs I know would only separate path and query strings, but not give access to individual parameters. Are many users asking for it?

In any case, this PR is about documenting v0.2.1 as-is, so this is not something that can be introduced here.


Plugin must return one of the following values:
- `CONTINUE` to forward `HTTP_RESPONSE_BODY` buffer downstream.
- `PAUSE` to pause processing.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the Rust SDK examples we recently noticed that returning PAUSE expects the host implementation to buffer the response body until eof, at which point it will call on_response_body once more (with a full body and eof=true). We saw this described in this example but not written explicitly so anywhere else. Would this be the right place to clarify this behavior?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eh, that is a terrible example, and it works only for HTTP responses with a small body, but it would deadlock with anything larger than the configured buffer size.

Hosts are expected to buffer request/response body while the processing is paused, but they can limit the amount of data that may be buffered. I've added small note about it, PTAL.

The issue is that v0.2.1 doesn't have a way to notify the plugin when that limit is reached. We could add a note that calling proxy_on_{request,response}_body with the same buffer size twice in a row implies that the buffer is full, but that's a bit hacky and not currently implemented. @mpwarres any thoughts?

Note that this is fixed in vNEXT with a much richer set of actions (see: vNEXT#proxy_on_http_request_body), but that version is neither reviewed nor implemented yet.

Also, see my previous comment about buffering (proxy-wasm/proxy-wasm-cpp-host#143 (comment)).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could add a note that calling proxy_on_{request,response}_body with the same buffer size twice in a row implies that the buffer is full, but that's a bit hacky and not currently implemented.

I suppose that is not what the proxy-wasm-rust-sdk example expects, and I don't know how much flexibility there is for v0.2.1 and this corner-case, but what if we made it clear body handlers can always be invoked with more chunks so long as eof: false? In short doing something like vNEXT WaitForEndOrFull, but implied.

  1. Handlers return again the first time which enables buffering.
  2. The same handler expects another invocation with a chunk that may or may not fit in the body buffer.
  3. Next body handler invocation:
    3a. If eof: true, the full body fits in the buffer on this invocation.
    3b. If eof: false, the buffer may or may not be full, but either way the handler can consider this body to only be handled by chunks, and expect more invocations.

In this context, hosts are expected to print appropriate error logs for a full body buffer or a second again return value. Filters are them expected to keep track of body handlers invocations themselves on a per-need basis. Or rather, body handlers are expected to always handle chunked bodies as a fail-safe.

Maybe this misses cases I haven't thought of, or maybe it matches existing implementations, I don't know. Plus of course I am unsure how much flexibility there is for existing host and filters of v0.2.1.

By the way, what would happen to the proxy-wasm-rust-sdk example on Envoy, would the body not fit the buffer? I ought to check but I don't know my way around Envoy very well...

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall the actions need to extend a bit and make it more explicit? the Pause will actually convert into Http::FilterDataStatus::StopIterationAndBuffer status implicitly which will buffer the response body. more context: tetratelabs/proxy-wasm-go-sdk#414

mpwarres
mpwarres previously approved these changes Jul 7, 2023
Copy link

@mpwarres mpwarres left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM mod comments, acknowledging some TODOs could be addressed in later PRs.

abi-versions/v0.2.1/README.md Show resolved Hide resolved
abi-versions/v0.2.1/README.md Outdated Show resolved Hide resolved
- `SERIALIZATION_FAILURE` TODO
- `INVALID_MEMORY_ACCESS` when `path_data`, `path_size`,
`return_value_data` and/or `return_value_size` point to invalid
memory address.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, but I don't think that should hold up this PR; personally I'd be fine with it landing in a follow-on.

abi-versions/v0.2.1/README.md Outdated Show resolved Hide resolved
abi-versions/v0.2.1/README.md Outdated Show resolved Hide resolved
abi-versions/v0.2.1/README.md Show resolved Hide resolved
abi-versions/v0.2.1/README.md Outdated Show resolved Hide resolved
abi-versions/v0.2.1/README.md Outdated Show resolved Hide resolved
Signed-off-by: Piotr Sikora <piotr@aviatrix.com>
Signed-off-by: Piotr Sikora <piotr@aviatrix.com>
Signed-off-by: Piotr Sikora <piotr@aviatrix.com>
Signed-off-by: Piotr Sikora <piotr@aviatrix.com>
Signed-off-by: Piotr Sikora <piotr@aviatrix.com>
Signed-off-by: Piotr Sikora <piotr@aviatrix.com>
Copy link
Contributor

@mathetake mathetake left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

haven't reviewed all, but flushing the first comments. Thank you so much for your work! @PiotrSikora

Called to allocate continuous memory buffer of `memory_size` using
the in-VM memory allocator.

Plugin must return `memory_data` pointing to the start of the allocated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should clarify what Plugin means here (this is the first appearance I guess), maybe adding a Terminology subsection?

Comment on lines +113 to +114
> This callback has been deprecated in favor of [`proxy_on_memory_allocate`],
> and it's called only in its absence.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should clarify when this deprecation happens. Do you want to create an issue for something like "support policy"?

#### `proxy_on_vm_start`

* params:
- `i32 (uint32_t) plugin_context_id`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are we missing a doc for this param? maybe at least say "unused" or something like that?


Plugin must return one of the following values:
- `true` to indicate that the configuration was processed successfully.
- `false` to indicate that the configuration processing failed.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should document the desired semantic when false to answer the question like will the same Wasm VM be used?


Plugin must return one of the following values:
- `true` to indicate that the configuration was processed successfully.
- `false` to indicate that the configuration processing failed.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same about semantics

Comment on lines +637 to +638
- `INVALID_MEMORY_ACCESS` when `return_pairs_data` and/or
`return_pairs_size` point to invalid memory address.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- `INVALID_MEMORY_ACCESS` when `return_pairs_data` and/or
`return_pairs_size` point to invalid memory address.
- `INVALID_MEMORY_ACCESS` when `return_serialized_pairs_data` and/or
`return_serialized_pairs_size` point to invalid memory address.

- `i32 (`[`proxy_status_t`]`) status`

Retrieves value (`return_value_data`, `return_value_size`) of the header
key (`key_data`, `key_value`) from the header map `map_id`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment on header map

- `i32 (`[`proxy_status_t`]`) status`

Adds key (`key_data`, `key_size`) with value (`value_data`,
`value_size`) to the header map `map_id`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment on header map

- `i32 (`[`proxy_status_t`]`) status`

Adds or replaces key's (`key_data`, `key_value`) value with the provided
value (`value_data`, `value_size`) in the header map `map_id`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment on header map

* returns:
- `i32 (`[`proxy_status_t`]`) status`

Removes the key (`key_data`, `key_value`) from the header map `map_id`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment on header map

point to invalid memory address.


## TCP/UDP/QUIC streams
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should limit this to only TCP streams since UDP/QUIC are not supposed in v0.2.1.

invalid memory address.


## Common stream operations
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we clarify what Common mean here? This includes not only TCP but also HTTP, right?

Called when upstream connection is closed.


## HTTP
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency, shouldn't this be HTTP Steams (as in Common stream operations)


> **Note**
> The same unique `stream_context_id` is shared by
> HTTP request and response.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trailers as well?

Plugin must return one of the following values:
- `CONTINUE` to forward `HTTP_RESPONSE_BODY` buffer downstream.
- `PAUSE` to pause processing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we clarify that we cannot use proxy_send_local_response at this point since the headers might have already arrived at the downstream?

Plugin must return one of the following values:
- `CONTINUE` to forward `HTTP_RESPONSE_TRAILERS` fields downstream.
- `PAUSE` to pause processing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment on proxy_send_local_response above.

- `i32 (const uint8_t *) serialized_trailers_data`
- `i32 (size_t) serialized_trailers_size`
- `i32 (uint32_t) timeout`
- `i32 (uint32_t *) return_call_id`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we document how return_call_id will be used?

* returns:
- `i32 (`[`proxy_status_t`]`) status`

Resolves existing shared queue using the provided VM ID (`vm_id_data`,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we document what VM ID mean here? this is the first appearance.

- `SERIALIZATION_FAILURE` TODO
- `INVALID_MEMORY_ACCESS` when `path_data`, `path_size`,
`return_value_data` and/or `return_value_size` point to invalid
memory address.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it works great as a generic property getter for predefined values.

huge +1

@mathetake
Copy link
Contributor

cc @spacewander @shukitchan as non-Envoy implementors


When `fd_id` is `STDOUT`, then it's logged at the `INFO` level.

When `fd_id` is `STDERR`, then it's logged at the `ERROR` level.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some well-known loggers write log to stderr by default, like logrus (https://github.com/sirupsen/logrus/blob/dd1b4c2e81afc5c255f216a722b012ed26be57df/logger.go#L89).
It might cause trouble if all the logs are considered in ERROR level.

`buffer_id`.

Returned `status` value is:
- `OK` on success.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we doc the behavior when the return buffer is empty? (For example, request without a body)

@shukitchan shukitchan mentioned this pull request Jul 30, 2023

### Functions exposed by the host

#### `proxy_http_call`
Copy link

@younes-io younes-io Jul 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@PiotrSikora Is the function proxy_http_call the replacement of proxy_dispatch_http_call in the old spec ? What about the code ? Will dispatch_http_call disappear ? I'm asking these questions because I'm developing proxies and it's not very clear to me. Thank you Piotr

* returns:
- `i32 (`[`proxy_status_t`]`) status`

Defines a new metric of type `metric_type` with a name (`name_data`,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of questions about behaviors that are not specified here:

  • What happens if I give it a metric name that was previously defined? Do I get the preexisting id in return_metric_id?

    • What happens if I pass a pre-existing name but with a different metric_type?
      • Are the names namespaced by type and I get a new metric id?
      • Do I get a BAD_ARGUMENT error?
      • Do I get the preexisting id and the type is ignored?
      • Do I get the preexisting id and the type is updated?
  • This call creates a metric; how do I dispose of a metric? For other functions such as proxy_set_property setting the value to NULL can be assumed to mean "unset", but how to "undefine" and release memory from metrics?


Plugin must return one of the following values:
- `CONTINUE` to forward `HTTP_REQUEST_HEADERS` fields downstream.
- `PAUSE` to pause processing.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

notice a case that the action pause is not actual pausing the processing: tetratelabs/proxy-wasm-go-sdk#425

`HTTP_REQUEST`.

Additionally, instead of forwarding request upstream, a HTTP response
can be sent using [`proxy_send_local_response`].
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like there are several variations of local responses like:

Is it possible to make here proxy_send_local_response more explicitly to indicate which type specifically it will trigger or specifying clearly in the doc?

@htuch
Copy link

htuch commented Jan 19, 2024

@PiotrSikora any update on when this might be merged?

@PiotrSikora
Copy link
Contributor Author

@PiotrSikora any update on when this might be merged?

I'll try to wrap this this week.

@jedisct1
Copy link
Contributor

🎉

@jedisct1
Copy link
Contributor

@PiotrSikora Is there anything blocking this? Reference documentation for proxy-wasm would be super useful for new implementations.

@PiotrSikora
Copy link
Contributor Author

@PiotrSikora Is there anything blocking this? Reference documentation for proxy-wasm would be super useful for new implementations.

Nothing major, I need to address some comments and do a minor cleanup.

It's finally on top of my TODO list, so I should land it in a day or two.

Sets all key-value pairs in the header map `map_id` to the provided
[serialized] map (`serialized_pairs_data`, `serialized_pairs_size`).

Returned `status` value is:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the expected behavior when the underlying function fails to set the header map pairs? (For example, if attempting to set response headers during proxy_on_response_body, after the headers have already been sent).

What should be the result when failing to set the header map:

  • set status to OK and fail silently?
  • set status to BAD_ARGUMENT, for trying to use a map_id that is invalid in the given context?
  • issue a wasm trap, which triggers a panic in the client SDK?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Properly document v0.2.1 ABI