
Proposal: Support streaming in the Twirp spec #3

Open
jacksontj opened this issue Jan 16, 2018 · 59 comments
Labels: feature request, pending (Keeps issues from going stale), proposal

Comments

@jacksontj

I think this is a great addition for RPC in Go (as well as other languages!). The main thing that I expect will limit adoption is the lack of streaming support. There are mechanisms to do streaming on http/1.x (a great example is the grpc-gateway stuff), where they effectively just use chunked encoding with blobs to do the streaming.

@spenczar
Contributor

Yeah, let's get this discussion going. I agree that this is an important feature.

One important constraint: we need to continue to work on HTTP 1.1. I think that means we can't ever support full-duplex, bidirectional streaming, but I'd be very happy to be proven wrong.

Could you (and others) write a bit about the use cases for streaming RPCs? At Twitch, we haven't really needed any, so I'd like concrete users to build for, to guide the design of the user API. The underlying implementation seems pretty clear, but I want to make sure the generated code feels nice to use.

@marwan-at-work
Contributor

My use case is usually file uploads from one backend service to another. I've written a CI/CD system where I have multiple containers build different docker images and stream data from all these containers to a central backend that tars all of them and uploads them as a github release asset. Those could be many megabytes and obviously had to be chunked and sent out as streams. RPC streams were perfect for my use case : )

@tommie

tommie commented Jan 17, 2018

I don't have any needs for Twirp, but I do use gRPC streaming.

In all cases so far, I've never needed bi-directional streaming on the same RPC. The one place I can think of is if you have a single RPC that sends state deltas and receives state deltas. But in that case, you're really looking for classic long-polling or WebSockets, not the RPC style of doing things. You can always invent an identifier on top of the RPC mechanism that would allow concurrency-safe pub-sub on two uni-directional streams.

So if the only concern is support for bi-directionality, I'd say start with mutually exclusive directions, WLOG. That also cuts the question of full duplex short. I mean, on the wire, HTTP/1.1 could definitely support full-duplex streaming, but I assume the problem is that the XHR API requires a request to be finished before reading the response, and is hence practically unusable. (Is it more common for HTTP server APIs to support full-duplex bidirectional streaming? Thinking about the Go HTTP library got me to "yes".)

@thelazyfox

We should also consider websockets. Client support is pretty widely available these days and we'd get bidirectional streaming support out of the box.

The biggest argument I can think of against it is that LB/proxy support would be poorer, which is certainly unfortunate. AWS ALBs, Nginx, and HAProxy all support WebSockets, and I imagine those would be the most commonly used proxy tools with Twirp.

@spenczar added this to the v6 milestone Jan 24, 2018
@wora

wora commented Jan 26, 2018

Streaming support can be added using a wrapper message, such as:

// A wrapper for (stream Foo)
message StreamFoo {
  repeated Foo messages = 1;
  Status status = 2;
}

Pros:

  • It has the correct trailer semantics.
  • Works for both client and server streaming.
  • Works for all HTTP/* protocols.
  • Works for HTTP JSON as well.
  • Standard error payload.
  • Extremely simple to use.

Cons:

  • Needs HTTP/2 for bidi streaming.

@jacksontj
Author

@spenczar thanks for the quick response (and sorry for mine being so slow :) ). I'll try to find some time here in the next couple of weeks to get a proposal (with PoC) together to review here.

@spenczar
Contributor

If we could get away without using websockets, I'd be happier. Support exists in proxies and stuff, but it can be shaky. You often have to do some configuration. Nobody has to configure their proxy or load balancer to be able to serve HTTP 1 requests, though.

WebSocket libraries usually aren't as battle-tested as HTTP libraries, either, so we'd likely be running people into more bugs.

I think we don't need to use websockets to get what we want. I think we can do unidirectional streaming in pure HTTP 1, and say that bidirectional streaming is only available in http/2. I think that's going to be fine for almost everyone's use cases, and will make things simpler for everyone.

I like @wora's proposal. To flesh it out a bit:

Streams are represented as a sequence of messages, terminated by a single "trailer" message. The trailer holds a status code. This trailer has two functions. First, it signals the end of the messages, so the recipient knows the stream is done. Second, it lets the sender send an error, even if things started out well.

For an rpc like this:

rpc UploadFile(stream FileChunk) returns (UploadFileResponse);

We'd create a helper type:

message FileChunkStream {
  repeated FileChunk messages = 1;
  Trailer trailer = 2;
}

The "Trailer" type can be figured out later - I won't get into it here. It doesn't really matter for the core design.

A sender writes out the serialization of the Stream helper type to the wire, which would be either an HTTP request body or a response body. The recipient would need a streaming deserializer for this to work.

A streaming deserializer would work in JSON, because each message is balanced by matching { and } characters, so you know when you've received a complete message.

It would work in Protobuf because repeated messages use length prefixing in front of each message - it looks like <tag><len><encoded_msg><tag><len><encoded_msg>..., where <len> is the size of the encoded_msg in bytes, so a reader can always know how many bytes to chomp.

Most JSON and Protobuf deserializers are not so low-level that they let you emit each message on the fly like this, though. Most would want the whole thing to be loaded into memory. We don't want that, so we would probably need to implement our own deserializer in generated code. It's hard to tell how much work that would be - we should take a crack at it to get a feel for the difficulty before accepting or declining this particular proposal.
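
For illustration, here is a rough Go sketch of such a streaming reader (the names are made up for the example, this is not generated Twirp code, and it skips wire-type checks):

package streamread

import (
    "bufio"
    "encoding/binary"
    "io"
)

// readMessages consumes a body of the form <tag><len><encoded_msg><tag><len>...
// and hands each encoded message to emit as soon as it has been fully read,
// without buffering the whole stream in memory.
func readMessages(r io.Reader, emit func(msgBytes []byte) error) error {
    br := bufio.NewReader(r)
    for {
        // Each entry starts with a varint tag (field number + wire type 2).
        if _, err := binary.ReadUvarint(br); err != nil {
            if err == io.EOF {
                return nil // clean end of stream
            }
            return err
        }
        // Then a varint length, followed by that many bytes of encoded message.
        n, err := binary.ReadUvarint(br)
        if err != nil {
            return err
        }
        buf := make([]byte, n)
        if _, err := io.ReadFull(br, buf); err != nil {
            return err
        }
        if err := emit(buf); err != nil {
            return err
        }
    }
}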

@rhysh
Contributor

rhysh commented Jan 28, 2018

Using HTTP/1.1 for unidirectional streaming, with bidi available on HTTP/2, sounds great. The message design / wire protocol sounds good.

Doing streaming processing shouldn't be too hard. For JSON, we might specify that messages are separated by \n. That would let us easily find the boundaries, be very readable on the wire, and for the Go implementation we could use encoding/json.Decoder out of the box (probably decoding to a json.RawMessage before handing off to jsonpb.Unmarshal). The protobuf framing is pretty straightforward too, though I don't know of any currently-open-source code for Go that does it.
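
As a minimal sketch of that decoding loop in Go (the callback and names are placeholders; generated code would presumably unmarshal each json.RawMessage into the concrete message type):

package streamjson

import (
    "encoding/json"
    "io"
)

// decodeAll reads one JSON value at a time from r and passes its raw bytes to
// emit; encoding/json.Decoder already tokenizes a concatenated or
// newline-separated stream of values, so no custom splitting is needed.
func decodeAll(r io.Reader, emit func(raw json.RawMessage) error) error {
    dec := json.NewDecoder(r)
    for {
        var raw json.RawMessage
        if err := dec.Decode(&raw); err == io.EOF {
            return nil
        } else if err != nil {
            return err
        }
        if err := emit(raw); err != nil {
            return err
        }
    }
}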

I'm interested to hear about the Go API design. Currently the client and server use a single interface, which is a really useful property. Will we be able to keep it when adding streaming APIs?

@spenczar
Contributor

Glad it sounds good, @rhysh!

I was wary of using \n when I first thought about your suggestion because I didn't want to have to escape them, but then I learned that a literal newline is forbidden inside JSON strings (it has to be escaped), so it actually shouldn't be too hard at all. I like that idea.

Could we use ,\n as the separator instead, though? Then the full recorded stream would be valid JSON, which seems like a nice property.

I'll make a separate issue for discussing the Go API design. I have some thoughts there, but yes - it's tricky.

@jmank88
Contributor

jmank88 commented Jan 29, 2018

Regarding streaming JSON, there is an RFC describing JSON text sequences ("application/json-seq"): https://tools.ietf.org/html/rfc7464. Basically, always lead with an RS and end with an LF (the LF is sometimes required to ensure no truncation). I've implemented a Go library here https://github.com/jmank88/jsonseq, which adapts std encoding/json by default.
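
For reference, the framing itself is simple enough to sketch in a few lines of Go (illustrative only; the linked jsonseq library is the real implementation):

package jsonseqsketch

import (
    "bufio"
    "encoding/json"
)

const (
    rs = 0x1E // ASCII Record Separator: starts every record
    lf = 0x0A // ASCII Line Feed: ends every record
)

// writeRecord emits one RFC 7464 record: RS, the JSON text, then LF.
func writeRecord(w *bufio.Writer, v interface{}) error {
    b, err := json.Marshal(v)
    if err != nil {
        return err
    }
    if err := w.WriteByte(rs); err != nil {
        return err
    }
    if _, err := w.Write(b); err != nil {
        return err
    }
    return w.WriteByte(lf)
}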

@achille-roussel

What would you think about using an array representation for streaming JSON? Writing [ and ] at the beginning and end, and separating with ,?
That way there's no need to support a custom "line-split" format: any JSON decoder can read the stream. Decoders that support streaming JSON arrays can produce items as they arrive; the others would load the entire array into memory but still work.

@spenczar
Contributor

spenczar commented Jan 29, 2018

Yes, @achille-roussel, that's what I am thinking of too. I think we should send a JSON array which is a sequence of messages, then the trailer.

(I'm not sure what I was thinking when I suggested ,\n. The spec is quite clear that , is the only legal separator in arrays.) (Wrong again! The spec permits this.)

That's an interesting RFC, @jmank88, and it does seem to be designed for this case. But I have a few concerns:

  • Stuff will look pretty weird when printed to most terminals.
  • How good is support across languages? It's an obscure RFC.
  • It's no longer valid as a plain JSON document, which would be a nice feature as @achille-roussel said.

@jmank88
Contributor

jmank88 commented Jan 29, 2018

The primary advantage of json-seq is recovery without dropping the stream (https://tools.ietf.org/html/rfc7464#section-2.3), but that may not be necessary here.

@rhysh
Contributor

rhysh commented Jan 30, 2018

Some options for the JSON representation of a message stream of {"k":4}, followed by {"k":1}, and concluding with {"k":5}:

Space-separated

{"k":4} {"k":1} {"k":5}

Newline-separated (\n)

{"k":4}
{"k":1}
{"k":5}

Dense JSON Array

[{"k":4},{"k":1},{"k":5}]

JSON Array with Helpful Newlines

[
{"k":4}
,{"k":1}
,{"k":5}
]

Of these, I most prefer the ones with newlines. They're easy to work with on the command line (and manual debugging is important for the JSON protocol). Client and server implementations can safely split on the newlines, process a few bytes on either end of the line, and pass to a non-stream-aware JSON decoder.

But the difference between "Newline-separated" and "JSON Array with Helpful Newlines" is whether the wire protocol for a unary RPC is different from a streaming RPC with a single value. We'll need to consider this for the protobuf wire format as well.

I think that gRPC's wire format doesn't differentiate between streaming and unary RPCs. That design implies more complexity in some places, but allows simplification in others.

@achille-roussel

+1 on having one item per line when JSON is used; it doesn't add much to the payload size and makes it much easier to debug.

@wora

wora commented Jan 30, 2018

The entire response must be a valid JSON object, including the messages and the trailer. Otherwise, it won't be compatible with OpenAPI, and will create all kinds of compatibility issues. An incremental parser for JSON should be trivial work, at least outside the browser. I don't think we need to worry too much about the complexity, since writing such a parser is a one-time cost.

@spenczar
Contributor

I like your "JSON Array with Helpful Newlines" ("JAHN") proposal a lot, @rhysh. We would need one more constraint on serialization, which is no newlines within a serialized object. I want to forbid this:

[
{
  "k": 4
}
,{"k": 1}
,{"k": 5}
]

@wora, I don't totally agree that we should discount the difficulty of making a deserializer. Simplicity in making clients is important for Twirp's success.

That said, I think it's okay for streaming to be a slightly "advanced" feature. Client generators that can only handle deserializing streaming APIs by holding everything in memory (like @achille-roussel described) will still work, they'll just be less good than well-made deserializers.

@wora

wora commented Jan 30, 2018

If you only send the array, where do you put the trailer?

Adding the object wrapper does not add any complexity to the parser. The output looks like this:

{"messages":[
]}

It still uses a fixed first line with a few extra bytes.

@spenczar
Contributor

Oh yes, I was thinking of a mixed-type array:

[
{"k":1}
,{"k":2}
,{"k":3}
,{"status": "ok"}
]

Mixed-type arrays can be obnoxious to deserialize, though; I agree that wrapping is probably better, so it becomes

{"messages":[
{"k":1}
,{"k":2}
,{"k":3}
],"trailer": {"status": "ok"}
}
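
For illustration, a rough Go sketch of an encoder that streams that wrapper shape out incrementally (the function signature and channel types are made up for the example, not generated Twirp code):

package streamenc

import (
    "encoding/json"
    "fmt"
    "io"
)

// encodeStream writes {"messages":[ ... ],"trailer":{...}} one message per
// line, never holding more than one message in memory at a time.
func encodeStream(w io.Writer, msgs <-chan interface{}, trailer interface{}) error {
    if _, err := io.WriteString(w, "{\"messages\":[\n"); err != nil {
        return err
    }
    first := true
    for m := range msgs {
        b, err := json.Marshal(m)
        if err != nil {
            return err
        }
        sep := "\n,"
        if first {
            sep, first = "", false
        }
        if _, err := fmt.Fprintf(w, "%s%s", sep, b); err != nil {
            return err
        }
    }
    tb, err := json.Marshal(trailer)
    if err != nil {
        return err
    }
    _, err = fmt.Fprintf(w, "\n],\"trailer\": %s\n}\n", tb)
    return err
}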

@rhysh
Contributor

rhysh commented Jan 30, 2018

@spenczar your last example LGTM

Thanks for clarifying that we're not relying on HTTP Trailers; those have spotty support in HTTP/1.1 software, which would roughly translate into Twirp depending on HTTP/2.

Having the top level be a JSON object rather than an array is probably good for JavaScript too; I think there's an attack involving overriding the constructor for Array, which allows intercepting XHR responses that use an array at the top level.

What are your thoughts on how unary RPCs look compared with streams that contain a single element? As a parallel, in protobuf messages we're allowed to change between "repeated" and non-repeated ("optional" in proto2) fields without breaking wire compatibility.

@wora

wora commented Jan 30, 2018

Changing between repeated and non-repeated is a breaking change for JSON. That is how JSON works: an array of ints is different from an int.

@spenczar
Contributor

@rhysh I don't mind them looking different. Unary RPCs are different from streams. We can handle that difference, and so can our users - in fact, I think they'd be happier with this. It's much simpler and clearer.

@wora, I think Rhys is suggesting using the same wrapper for unary RPCs, which would be compatible. I think that would be a mistake, though.

@rhysh
Contributor

rhysh commented Jan 30, 2018

Right, single-field -> repeated-field is a breaking change for JSON. Is it worthwhile to make unary RPC -> streaming RPC be a non-breaking change for JSON?

Is it worthwhile to make unary RPC -> streaming RPC a non-breaking change for protobuf?

@rhysh
Contributor

rhysh commented Jan 30, 2018

OK, thanks Spencer.

@Merovius

To chime in here as a stranger:

First, my use-case for streaming is bi-directional synchronization. Essentially the delta-encoding mentioned above. Currently, the protocol I'm using is:

  • The client sends a request to the server, containing its last known timestamp.
  • The server responds with its last known timestamp and a set of updates the client doesn't know.
  • The client sends a second request, containing a set of updates the server doesn't know.

The advantage of this is that the client needs to be able to send requests to the server, but not vice versa (i.e. perfect for the web). The downside is that the total set of updates needs to be kept in RAM at both ends, even though the individual updates could be persisted on the go and forgotten. Streaming would help - but note that the stream needs to go in both directions. It could be two separate RPCs, one for a server->client stream and one for a client->server stream. But it would increase the API surface - currently there is message SyncMessage { Timestamp last = 1; Update update = 2; } and an rpc Sync(SyncMessage) returns (SyncMessage) (where both of these could just be made streaming).

In regard to the protocol: I know that it may seem complicated, but IMO HTTP multipart bodies are far simpler than what's discussed here. Instead of arguing about which encoding scheme's spec to bend, and requiring everyone to write their own encoders and decoders to manage field orders, merging, and so on, it seems simpler to say a streaming RPC sends/responds with a multipart body, where each part is one request/response message. Yes, implementing multipart HTTP itself could be complicated - but I'd expect most HTTP libraries, proxies and the like to already be aware of it (i.e. it reuses more existing efforts). Go has multipart support in the stdlib, which together with http.Flusher could be used to implement streaming pretty simply AFAICT.
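
For illustration, a rough sketch of what that could look like with Go's stdlib (the handler shape and payloads are made up for the example; this is not a Twirp proposal):

package multipartsketch

import (
    "mime/multipart"
    "net/http"
    "net/textproto"
)

func streamHandler(w http.ResponseWriter, r *http.Request) {
    mw := multipart.NewWriter(w)
    w.Header().Set("Content-Type", "multipart/mixed; boundary="+mw.Boundary())
    flusher, _ := w.(http.Flusher)

    for _, msg := range []string{`{"k":4}`, `{"k":1}`, `{"k":5}`} {
        hdr := textproto.MIMEHeader{"Content-Type": {"application/json"}}
        part, err := mw.CreatePart(hdr) // one part per stream message
        if err != nil {
            return
        }
        part.Write([]byte(msg))
        if flusher != nil {
            flusher.Flush() // push this part to the client right away
        }
    }
    mw.Close() // writes the closing boundary
}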


Just my 2¢ :) Feel free to ignore me :)

@danmux

danmux commented May 8, 2019

I think it would end up being simpler to support only http/2 - it is likely that the client and server are both Twirp, so I would have thought it will in general not be a huge issue.

@marioizquierdo
Contributor

Note: the streaming API is no longer tied to Twirp protocol v6, which has been archived. This doesn't mean that streaming will never happen in Twirp, but it means that there are no short-term plans to release it as v6. This issue may remain open while we keep looking for alternatives.

@github-actions

github-actions bot commented Dec 3, 2020

This issue is stale because it has been open 60 days with no activity. Remove stale label / comment or this will be closed in 5 days

@github-actions

github-actions bot commented Feb 2, 2021

This issue is stale because it has been open 60 days with no activity. Remove stale label / comment or this will be closed in 5 days

@tv42

tv42 commented Feb 2, 2021

Bad stalebot.

@casimcdaniels

Bad stalebot.

Worst github bot ever created

@3ventic added the "pending (Keeps issues from going stale)" label Mar 16, 2021
@omar391

omar391 commented Nov 24, 2022

Any progress on this?

@swiftyspiffy
Member

We looked this issue over once again, and came to the conclusion that Spencer's detailed comment in #70 seems to be still relevant today.

@spenczar:
I'm sorry for the long silence on this topic. Part of the reason I've gone quiet is just that I've been focused on other work, but part of it is that I have always had my doubts about the suitability of streaming in Twirp, and have been grappling with that uncertainty, trying to find a design that maintains simplicity.

Streams are a dangerous feature for users

@peter-edge has expressed my deepest concern extremely well. One of Twirp's best features today is that it is very hard to use it badly. You kind of can't screw things up too much if you hand it to an inexperienced developer as a toolkit. And while you can make mistakes in API design and message design, those mistakes stay local: they largely aren't architectural or infrastructural.

Streams are different. They imply expectations about how connection state is managed. Load balancing them is really hard, which means that a decision to use streams for an API has ramifications that ripple out to the infrastructure of a system. So, streams introduce some architectural risk when recommending Twirp.

How plausible is it that users trip on that pothole? Unfortunately, I think it is quite likely. Streaming is the sort of feature whose downsides can be hard to see at first, but which sounds great at a glance. "Lower latency and the ability to push data to the client? Sign me up!" But people could easily reach for it too early in order to future-proof their APIs, "just in case we need streams later," and walk themselves into an architectural hole which is very difficult to get out of.

Streams are complex and hard to implement well

In addition, streams add significant complexity to the Twirp project. The alpha branch that implements them has to parse protobuf message binary encoding directly - it can't really lean on the proto library for fiddly details of the encoding, but relies on a few low-level utility functions. This kind of terrifies me. I'm worried about maintaining something like that, for the health of the project.

It also imposes a much, much heavier burden on implementers in other languages. One of the best things about Twirp has been how quick and simple it is to write a generator in new languages. We saw Ruby, JavaScript, Java, and Rust implementations appear within days of the project first being released! I doubt we would have seen any third-party implementations if they needed to do low-level manipulation of byte streams to pull out binary-encoded protobuf tag numbers.

The complexity also translates into a clumsy API. It's difficult to do this in a general way that feels really ergonomic while covering the many ways a streaming connection can break. I am still not thrilled with the Go API we designed, and expect that other language implementations would be just as difficult to get really right.

Streams are required only rarely

This is all risk associated with implementing streams. What is the reward, on the other side?

Frankly, at Twitch we have hundreds of Twirp services, and have not once found a compelling need for streams. I can think of a small number of backend systems that do use streaming communication between a client and server, but all of them have extra requirements which make Twirp unusable: some talk to hardware appliances or stream video data, which cannot be represented in Protobuf in any reasonable way. Some have extreme performance requirements which Twirp doesn't aim to meet. None of our designs would work for those niche applications.

Meanwhile, most simple streaming RPCs can be implemented in terms of request-response RPCs, so long as they don't have extreme demands on latency or number of open connections (although HTTP/2 request multiplexing mostly resolves those issues anyway!). Pagination and polling are the obvious tools here, but even algorithms like rsync are surprisingly straightforward to implement in a request-response fashion, and probably more efficient than you think if you're using http/2, since the transport is streaming anyway.

Sometimes you really do need streams for your system, like if you are replicating a data stream. Nothing else is really going to work there. But Twirp does not need to be all things for all use-cases. There is room for specialized solutions, but streams are a special thing, perhaps too specialized for Twirp, and they have resounding architectural impact.

Maybe we shouldn't do streams

Given the above, I worry about adding streams to Twirp. I think it would be a step in the wrong direction.

I know others have been relying upon streams already, though, with some adventurous souls even using them in production. I'd like to better understand these use cases. What am I missing in the above analysis?

I think to move forward we'd need to see what's changed to improve on the core issues he pointed out: streams are dangerous, hard to implement well, and rarely required. Any new perspectives are welcome.

To be clear, the specific implementation in that issue itself is not the important part; rather, it's the idea of streams being implemented at all. Also, Spencer hasn't been active on this project for a while, so let's use this issue tracker going forward.

@paralin

paralin commented Dec 12, 2022

Those arguments are... weak in my opinion. People use pubsub all the time with, for example, Firebase. Sure, scalability might be a bit harder to solve and inexperienced developers can get it wrong. But those aren't reasons not to add it at all.

I've implemented the streaming side here: https://github.com/aperturerobotics/starpc - with no special Protobuf hacks or anything else. Just using message framing (length prefix). It can carry multiple streams over a single connection.

It does require a two-way connection like WebSockets or WebRTC.

@tommie

tommie commented Dec 13, 2022

it can't really lean on the proto library for fiddly details of the encoding, but relies on a few low-level utility functions.

Why not? Are you referring to

t.P(` err = messages.EncodeMessage(msg)`)
?

That seems like a choice, not a necessity. You could use google.protobuf.Any or bytes in a normal message, and add a second layer of encoding. Or you could use concatenation with a Protobuf header that just contains the length of the following message.

I doubt we would have seen any third party implementations if they needed to do low-level manipulation of byte streams to pull out binary-encoded protobuf tag numbers.

(Disregarding how this relates to my thoughts above.) Perhaps each feature has its own time. Perhaps now that those third-party implementations exist, it is the right time to take the next step?

But Twirp does not need to be all things for all use-cases.

Your point here is that Twirp is designed for some use-cases, and you are happy to not extend that. This is one aspect we (as prospective users) cannot argue against. This is your choice, and it's a fair question to be asking for any FR.

On the streaming side, we already have gRPC, gRPC-Web and Connect.

@spenczar
Contributor

spenczar commented Dec 13, 2022

It's been a long time since I looked at this, wow. And I haven't worked at Twitch for several years now, but I'll chime in anyway, since I still believe that comment to be a good summary of my personal views.


@paralin

Those arguments are... weak in my opinion. People use pubsub all the time with, for example, firebase. Sure scalability might be a bit harder to solve and inexperienced developers can get it wrong. But those aren't reasons to not add it at all.

I don't really understand this argument, since pubsub is independent of streaming. AMQP is probably the best-known pubsub protocol, and is based on polling requests from the client. You could implement something that looks a lot like RabbitMQ with Twirp without any changes to the current version.

I've implemented the streaming side here: https://github.com/aperturerobotics/starpc - with no special Protobuf hacks or anything else. Just using message framing (length prefix). It can carry multiple streams over a single connection.

This is cool! It's great that there are multiple options out there.


@tommie:

it can't really lean on the proto library for fiddly details of the encoding, but relies on a few low-level utility functions.

Why not? Are you referring to

t.P(` err = messages.EncodeMessage(msg)`)

?
That seems like a choice, not a necessity. You could use google.protobuf.Any or bytes in a normal message, and add a second layer of encoding. Or you could use concatenation with a Protobuf header that just contains the length of the following message.

The issue was that the stream was structured like a repeated oneof, and we wanted to write each message in the repeated array on the fly. At least as of that writing (over 4 years ago!), the github.com/golang/protobuf library could only write a complete repeated field out - a full array. No streaming writes of partial data. So, I had to implement that write of a single chunk in the repeated field myself.
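
For what it's worth, with today's protobuf-go module that single-chunk write can be sketched with the protowire package (field number 1 is chosen to match the "messages" field in the wrapper examples above; this is illustrative, not the alpha-branch code):

package streamwrite

import (
    "io"

    "google.golang.org/protobuf/encoding/protowire"
    "google.golang.org/protobuf/proto"
)

// writeChunk frames one already-built message as a single entry of a
// length-delimited repeated field (field 1) and writes it straight to the wire.
func writeChunk(w io.Writer, m proto.Message) error {
    encoded, err := proto.Marshal(m)
    if err != nil {
        return err
    }
    var buf []byte
    buf = protowire.AppendTag(buf, 1, protowire.BytesType) // field 1, wire type 2
    buf = protowire.AppendBytes(buf, encoded)              // varint length + payload
    _, err = w.Write(buf)
    return err
}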

Inventing a new encoding wrapped in opaque bytes would of course work, but it introduces a different sort of complexity - complexity in specification. Twirp gets a lot of mileage out of having a very simple spec, so that's a nontrivial cost.


Both of these comments are responding to the least important part of my original comment, I think. The implementation complexity is way less important than the architectural complexity. Streams are still very dangerous weapons!

@paralin

paralin commented Dec 13, 2022

Inventing a new encoding wrapped in opaque bytes would of course work, but it introduces a different sort of complexity - complexity in specification. Twirp gets a lot of mileage out of having a very simple spec, so that's a nontrivial cost.

Have you ever used Steam by Valve Software? It uses a Protobuf message header of a known fixed size containing the length and the message type ID of the following Protobuf message in the stream. It's not complicated and not dangerous, and quite a common construction.

Every time an architecture that looks slightly different from REST is proposed, project developers tend to scream "that's dangerous!" But some of the largest systems in the world use this structure. It's OK to diverge from REST semantics sometimes.

I understand if Twirp doesn't want to go this route, but the argument that streams are dangerous is in itself unfounded.

@QuantumGhost
Contributor

Maybe we should make Twirp not generate stubs for streaming RPCs in Protobuf files, and let users write their own streaming RPC library that co-exists with Twirp?

@gurleensethi

What about SSE (server-sent events)? I know this doesn't cover the case of bi-directional communication, but at least we can have server-to-client streaming (sort of), and it works over HTTP.
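
For illustration, a minimal SSE handler in Go (the payloads and names are placeholders, not a Twirp wire format):

package ssesketch

import (
    "fmt"
    "net/http"
)

func sseHandler(w http.ResponseWriter, r *http.Request) {
    w.Header().Set("Content-Type", "text/event-stream")
    w.Header().Set("Cache-Control", "no-cache")
    flusher, ok := w.(http.Flusher)
    if !ok {
        http.Error(w, "streaming unsupported", http.StatusInternalServerError)
        return
    }
    for _, msg := range []string{`{"k":4}`, `{"k":1}`, `{"k":5}`} {
        fmt.Fprintf(w, "data: %s\n\n", msg) // one SSE event per message
        flusher.Flush()                     // push it to the client immediately
    }
}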

@TroyKomodo

TroyKomodo commented Aug 8, 2023

Why can you not make use of http.Hijacker?
Why do you have to stick to HTTP standards?

So for reference:

  1. Handle the request as usual.
  2. If the request is a streaming request (either a bidi or a uni stream), hijack the connection to get back a raw io.ReadWriter.
  3. Then just use SSE or something over the protocol, or whatever Twirp's actual binary serialization is (I don't use the library :P ) - see the sketch below.
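
A rough sketch of that hijack step in Go (illustrative only; after the hijack, any framing - length prefixes, SSE, whatever - is up to the application):

package hijacksketch

import (
    "net/http"
)

func hijackHandler(w http.ResponseWriter, r *http.Request) {
    hj, ok := w.(http.Hijacker)
    if !ok {
        http.Error(w, "hijacking unsupported", http.StatusInternalServerError)
        return
    }
    conn, bufrw, err := hj.Hijack()
    if err != nil {
        http.Error(w, err.Error(), http.StatusInternalServerError)
        return
    }
    defer conn.Close()

    // From here on we're off the normal HTTP path: write the response start
    // line ourselves, then speak whatever bidirectional framing both ends agree on.
    bufrw.WriteString("HTTP/1.1 200 OK\r\nContent-Type: application/octet-stream\r\n\r\n")
    bufrw.Flush()
}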
