
remote write 2.0 - benchmarking framework; integrated with prombench #13995

Open

cstyan opened this issue Apr 26, 2024 · 15 comments

@cstyan
Member

cstyan commented Apr 26, 2024

Proposal

We need a more formal and repeatable way of benchmarking changes within remote write. It makes sense to include this as a (non-blocking) task for the remote write 2.0 tracking issue.

We can extend the avalanche project and build a /dev/null-esque sink that accepts remote write metrics, introduces latency, etc. These could be used within prombench to provide a way of benchmarking changes to remote write in a realistic environment: k8s, multiple pods, etc.
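As a rough illustration, such a sink can be very small. Here is a minimal sketch, assuming a standalone Go binary with made-up flag names (this is not an existing tool): it accepts remote write POSTs, optionally injects latency, and throws the payload away.

// Hypothetical /dev/null-esque remote write sink (illustrative only).
// It accepts remote write POSTs, optionally adds latency, and discards the body.
package main

import (
    "flag"
    "io"
    "log"
    "net/http"
    "time"
)

func main() {
    addr := flag.String("listen-address", ":9999", "address to listen on")
    latency := flag.Duration("latency", 0, "artificial latency added to every request")
    flag.Parse()

    http.HandleFunc("/api/v1/write", func(w http.ResponseWriter, r *http.Request) {
        if *latency > 0 {
            time.Sleep(*latency)
        }
        // Discard the (snappy-compressed protobuf) payload without decoding it.
        io.Copy(io.Discard, r.Body)
        r.Body.Close()
        w.WriteHeader(http.StatusNoContent)
    })

    log.Printf("devnull sink listening on %s", *addr)
    log.Fatal(http.ListenAndServe(*addr, nil))
}

Keeping the sink this dumb would make it cheap to run many replicas inside prombench, so measurements stay focused on the sender side.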

@berryboylb

Hello @cstyan, my name is Olorunfemi Daramola and I'm a software engineer. I'm interested in this project; is it still open?

@cstyan
Member Author

cstyan commented May 3, 2024

There's a bit of work ongoing already, but nothing's finished yet. This is also open as a project for the upcoming LFX mentorship session.

@berryboylb

Since it's on LFX, could you send the link so I can apply as a mentee?

@cstyan
Member Author

cstyan commented May 3, 2024

This is the general LFX website: https://lfx.linuxfoundation.org/tools/mentorship/. IIRC, applications for the summer session aren't open yet.

@berryboylb

I wanted the exact link to apply for this project, but since you said applications aren't open yet, would you do me a favor and update this thread with the link when it is?

@arkhamHack

arkhamHack commented May 4, 2024

Hi @cstyan, my name is Avigyan Sinha. I am pretty interested in this project for LFX; could you recommend some resources to get started with this?

@A-LPHARM

Hi @cstyan, my name is Emeka Uzowulu. I am interested in this task and would love to participate.

@Ellipse0934

I have been thinking about this idea, specifically about how we can extend Avalanche, and had a few thoughts.

  1. The Avalanche project is derelict. It needs a bit of maintenance and general updates immediately. I'll be happy to help, but I will need a volunteer with commit rights to discuss a little before filing PRs. I'll try to ping a previous maintainer, even though there is no active maintainer at the moment.
    a. Minor fixes and updates to the docs. Add usage documentation prometheus-community/avalanche#62 (will also resolve "Docker container does not exists" prometheus-community/avalanche#38). I'll file a PR for this today; no discussion needed (but welcome, of course).
    b. Change remote write behavior to an infinitely running task, to stay consistent with scraping mode.
    c. Test and review PRs (add support for histogram metrics prometheus-community/avalanche#51, [PR] Support custom HTTP headers prometheus-community/avalanche#58, Add a configuration for http client conifg prometheus-community/avalanche#61), then ping someone for a secondary review and approval. I'll do this in the next few days and post my comments; then perhaps someone can chime in.
  2. If the above gets done, then the idea should be to convert Avalanche into a standalone library with a variety of configurable writers. Not only should there be configurability to test string-interning performance or compression differences relevant to remote write 2.0, but also the percentage of OOO series, the deviation of OOO blocks, etc. (a rough sketch of one such knob follows after this list). If the pattern of writes affects some feature, then it should be the goal of Avalanche to help test that pattern.

In the TSDB OOO feature talk given in 2022, there are a number of claims about how OOO samples affect performance: the percentage of active series getting OOO samples, the cost being more on CPU than memory, and finally no noticeable difference in read/write speeds. I personally dislike it when a talk makes a number of claims and, a few years down the line, there is no available code to test those claims and it's very difficult to test them yourself. Maintaining an elaborate test suite for every claim is difficult, but a versatile benchmarking system with examples can be extremely useful to users, so that they can test claims themselves.
In fact, there are tons of remote write consumers who may want a nice Avalanche library so they can test a number of scenarios themselves.
  3. The idea of a versatile Avalanche library has previously been suggested (prometheus/test-infra#559) but gained no traction. I'm totally behind turning Avalanche into a versatile library and prombench having configurable and reproducible benchmark runs via Avalanche.
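To make the "pattern of writes" idea above concrete, here is a rough sketch of what a configurable out-of-order injection knob could look like. None of these types or functions exist in Avalanche today; this is purely illustrative:

// Hypothetical sketch: inject out-of-order samples into a generated stream.
// Nothing here exists in Avalanche; it only illustrates the kind of knob
// ("fraction of samples arriving out of order") discussed above.
package gen

import (
    "math/rand"
    "time"
)

type Sample struct {
    SeriesID  int
    Value     float64
    Timestamp time.Time
}

type OOOConfig struct {
    Fraction float64       // fraction of samples shifted backwards in time, e.g. 0.05
    MaxDelay time.Duration // how far back an out-of-order sample may land
}

// Perturb rewinds the timestamps of a configurable fraction of samples so that,
// when scraped or remote-written in order, they arrive out of order at the TSDB.
func Perturb(samples []Sample, cfg OOOConfig, rnd *rand.Rand) {
    if cfg.MaxDelay <= 0 {
        return
    }
    for i := range samples {
        if rnd.Float64() < cfg.Fraction {
            delay := time.Duration(rnd.Int63n(int64(cfg.MaxDelay)))
            samples[i].Timestamp = samples[i].Timestamp.Add(-delay)
        }
    }
}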

@bwplotka changed the title from "remote write 2.0 - benchmarking" to "remote write 2.0 - benchmarking framework; integrated with prombench" Jun 6, 2024
@bwplotka
Member

bwplotka commented Jun 6, 2024

Nice, @Ellipse0934! Feel free to help with the Avalanche issues, and if you contribute consistently I'm sure the maintainers would love to have you on board officially too.

Per 2: what is the idea exactly?

Is the idea to convert Avalanche into a PRW-capable receiver? It could be done, I guess, but it could equally be a separate project/repo (e.g. something we planned to start but never finished). Given the low activity on Avalanche, I think it would make sense NOT to increase Avalanche's scope, but to create a dedicated small tool in the Prometheus ecosystem. We have something similar in the compliance tests - https://github.com/prometheus/compliance/tree/main/remote_write_sender - a dedicated, small receiver.

What would be productive is perhaps an exact plan/design for the benchmark (the end result). Given that, it will be easier to decide where to put which functionality, perhaps even starting with something small and tailored in prombench.

@Ellipse0934

Per 2: what is the idea exactly?

I mean allowing more granular control over Avalanche. For example, right now, although we can set the number of series, labels, etc., I feel it's still too coarse. I want to support a larger number of patterns so that the performance effect can be measured more easily: if someone wants to specifically test Prometheus's OOO performance, that should be possible; if someone wants to test only a specific compression algorithm for their workload, it should be possible to set up such a test without too much effort.

Is the idea to convert Avalanche into a PRW-capable receiver?

No, I was not thinking along those lines. For now I think that passing flags/additional PRW config into test-infra/prombench to customize workloads would be more beneficial.

but to create a dedicated small tool in the Prometheus ecosystem

This is a good idea. I'll do a PoC and get back rather than trying to pull attention towards Avalanche.

@cstyan
Member Author

cstyan commented Jun 6, 2024

Note that some work here will be done via the related LFX project; I believe LFX selection notifications go out next week, on the 12th (PST).

The Avalanche project is derelict. It needs a bit of maintenance and general updates immediately. I'll be happy to help but I will need a volunteer with commit rights to discuss a little before filing PR's. I'll try to ping a previous maintainer even though there is no active maintainer at the moment.

An oversight on my part; I assumed that since I was in prometheus-team I also had merge rights for projects on prometheus-community. I'll get that resolved, thanks for pointing it out.

Change remote write behavior to an infinite running task to stay consistent with scraping mode.

Not sure what you mean here? It already runs forever (in essence): it will restart if there is an error or a config reload, but otherwise it keeps running until Prometheus itself shuts down.

What are the proposed use cases for having avalanche as a library?

@Ellipse0934

Not sure what you mean here? It already runs forever

My bad, I meant Avalanche's remote write mode. This was also pointed out in Avalanche#41.

@cstyan
Member Author

cstyan commented Jun 6, 2024

I see. I had not envisioned using anything like a "remote write directly from avalanche" mode as a way of benchmarking remote write itself, just using avalanche as a way of generating various scrape loads' worth of data to stress different parts of remote write.

@Ellipse0934

Ellipse0934 commented Jun 7, 2024

Okay, so you meant only adding the capability for Avalanche to generate more realistic data? Then it's scraped by Prometheus and sent to another Prometheus as part of an e2e benchmark suite?

Whether or not remote write is used directly from Avalanche, this capability (different data patterns) is important and should come first in any case.

Currently, with avalanche --metric-count=5 --label-count=2, the data looks like:

# HELP avalanche_metric_mmmmm_0_0 A tasty metric morsel
# TYPE avalanche_metric_mmmmm_0_0 gauge
avalanche_metric_mmmmm_0_0{cycle_id="0",label_key_kkkkk_0="label_val_vvvvv_0",label_key_kkkkk_1="label_val_vvvvv_1",series_id="0"} 51
avalanche_metric_mmmmm_0_0{cycle_id="0",label_key_kkkkk_0="label_val_vvvvv_0",label_key_kkkkk_1="label_val_vvvvv_1",series_id="1"} 1
avalanche_metric_mmmmm_0_0{cycle_id="0",label_key_kkkkk_0="label_val_vvvvv_0",label_key_kkkkk_1="label_val_vvvvv_1",series_id="2"} 35
avalanche_metric_mmmmm_0_0{cycle_id="0",label_key_kkkkk_0="label_val_vvvvv_0",label_key_kkkkk_1="label_val_vvvvv_1",series_id="3"} 11
avalanche_metric_mmmmm_0_0{cycle_id="0",label_key_kkkkk_0="label_val_vvvvv_0",label_key_kkkkk_1="label_val_vvvvv_1",series_id="4"} 39

This looks non-representative of a realistic workload; test-infra/tools/fake-webserver looks better.
Even better was using data from a real source, as in the string interning PR implementation:

scraping over 100 pods from a real kubernetes namespace, which includes various Mimir microservices. I ran the benchmark for 2 hours.

I still feel that benchmarks should be reproducible, and hence either the data or the generator scripts should be part of the toolkit.

When I used the phrase "making avalanche more versatile/scalable", I was first thinking of writing a small DSL to generate data:

capacity: 3000
max_shards: 10
name: 'test'
writers:
    - name: 'http_requests_total'
      type: counter
      value:
        rate:
            between: [30, 50]
    - name: 'cpu_temp'
      type: gauge
      value:
        bounds: [0, 110]
        rate: [-2,4]
        avg: 65
    - name: 'DB_error'
      type: counter
      value:
         inc:
           expr: [t, 300, 'div', 0, 'max', 200, 'min'] # reverse polish notation, t = time

But then I felt it would still be hard to model many problems for which a general-purpose programming language is much better suited. Hence the suggestion that Avalanche should be able to act as a library. But perhaps all of this is overkill, and we just need to add a small set of writer patterns and config options to Avalanche to make the workload more realistic.
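For comparison, here is roughly what the same two workloads could look like through a hypothetical library-style API instead of the DSL above (all names are invented for illustration; this is not an existing Avalanche API):

// Hypothetical sketch of a library-style API replacing the DSL above.
// Nothing here exists in Avalanche; it only shows how the same workloads
// could be expressed in Go instead of YAML.
package main

import (
    "math"
    "math/rand"
    "time"
)

// ValueFunc returns a sample value for a given wall-clock time.
type ValueFunc func(t time.Time) float64

// RandomRateCounter returns a counter that increases by a random amount
// between min and max on every evaluation.
func RandomRateCounter(min, max float64, rnd *rand.Rand) ValueFunc {
    var total float64
    return func(time.Time) float64 {
        total += min + rnd.Float64()*(max-min)
        return total
    }
}

// BoundedGauge returns a gauge oscillating around avg, clamped to [lo, hi].
func BoundedGauge(lo, hi, avg, amplitude float64) ValueFunc {
    return func(t time.Time) float64 {
        v := avg + amplitude*math.Sin(float64(t.Unix())/60)
        return math.Max(lo, math.Min(hi, v))
    }
}

func main() {
    rnd := rand.New(rand.NewSource(42))
    httpRequests := RandomRateCounter(30, 50, rnd) // like the 'http_requests_total' entry
    cpuTemp := BoundedGauge(0, 110, 65, 10)        // like the 'cpu_temp' entry
    _ = httpRequests(time.Now())
    _ = cpuTemp(time.Now())
}

The library form trades the declarative config file for the full expressiveness of Go, which is exactly the tradeoff being weighed here.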

@cstyan
Member Author

cstyan commented Jun 7, 2024

This looks non-representative of a realistic workload; test-infra/tools/fake-webserver looks better.
Even better was using data from a real source, as in the string interning PR implementation.

I'm not very familiar with the fake-webserver used in prombench atm, or the metrics it generates, but it looks to me like it's just a few request latency histograms, which isn't enough to be realistic either. Whichever of the two programs we end up extending, we need many more series and potentially even pseudo-random sets of labels. Testing the worst case in terms of label usage is better than testing against the current avalanche-generated metrics, which only change one or two labels AFAICT.

I would prefer to have the tool we build/extend be as "intelligent" as possible in terms of its data generation. Obviously we will need a few more knobs to turn in terms of telling it what data to generate, but I don't want to have to write a whole config file in order to use it.
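For the "pseudo-random sets of labels" point above, a tiny sketch of a worst-case label generator might look like this (hypothetical; not part of Avalanche or prombench):

// Hypothetical sketch: generate pseudo-random label sets to stress label churn.
// Illustrative only; values change on every call, which maximises new-series
// creation and string-interning pressure on the remote write path.
package gen

import (
    "fmt"
    "math/rand"
)

// RandomLabels returns n label pairs with fresh random values on each call.
func RandomLabels(n int, rnd *rand.Rand) map[string]string {
    labels := make(map[string]string, n)
    for i := 0; i < n; i++ {
        labels[fmt.Sprintf("label_%d", i)] = fmt.Sprintf("val_%x", rnd.Uint64())
    }
    return labels
}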
