Operations timeout while inserting data into ScyllaDB cluster at very low throughput #18632
Comments
Hi @amitesh88 - you can't compare the io_properties IOPS to CQL ops in any way - ScyllaDB does a whole lot more 'raw' I/Os for every CQL transaction: commitlog I/O, or compaction, for example.
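The point above can be illustrated with a rough back-of-the-envelope calculation. The amplification factor here is an assumption chosen for illustration, not a measured ScyllaDB value - the real ratio depends on commitlog settings, compaction strategy, and workload:

```python
# Illustrative only: the amplification factor is an assumed value,
# not a measured ScyllaDB number.
def effective_cql_ops(raw_write_iops: float, amplification: float) -> float:
    """Estimate sustainable CQL write ops from a raw disk IOPS budget.

    Each CQL write can cost several raw disk I/Os: a commitlog append,
    the eventual memtable flush, and repeated compaction rewrites.
    """
    return raw_write_iops / amplification

# Using this cluster's measured 42064 write IOPS and an assumed
# amplification of 5 raw I/Os per CQL write:
print(round(effective_cql_ops(42064, 5)))  # -> 8413, far below 42064
```

So even a modest amplification factor shrinks the CQL throughput the disk budget can sustain well below the raw `write_iops` figure.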
@amitesh88 - as you can see, the numbers quoted above and the iotune results are vastly different. I'd also compare with fio. If fio is substantially better, I'd manually change the numbers to higher values and try again. See scylladb/seastar#1297 for reference.
Using fio, I am getting the result below:
That's a bit low - I expected more. Can you share the full fio command line and results?
Below is the command with output:

```shell
fio --filename=/var/lib/scylla/a --direct=1 --rw=randrw --refill_buffers --size=1G --norandommap --randrepeat=0 --ioengine=libaio --bs=5kb --rwmixread=0 --iodepth=16 --numjobs=16 --runtime=60 --group_reporting --name=scylla_io_2
```

Run status group 0 (all jobs):
Disk stats (read/write):
Very strange. This is what I'm getting on my laptop: And of course, if I switch to a 4KB bs, it's slightly better.
Please check the advanced dashboard in per-shard view mode to see if some shard is the bottleneck. |
Thanks a lot |
Yes, you can use the monitor with open source Scylla. |
I found the issue: it was due to the partition key, which was not letting the data be divided equally across the nodes - that's why I was getting the timeouts.
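The skew described above can be sketched with a small simulation. The key names, the hash function, and the node count are illustrative assumptions, not the user's actual schema or ScyllaDB's real token ring:

```python
from collections import Counter
import hashlib

NODES = 3  # assumed: a 3-node cluster, as in this issue

def owner(partition_key: str) -> int:
    """Stand-in for the token ring: hash the key, map it to a node."""
    h = int.from_bytes(hashlib.md5(partition_key.encode()).digest()[:8], "big")
    return h % NODES

# Low-cardinality partition key: every row hashes identically, so all
# writes pile onto one replica set, producing one giant partition.
skewed = Counter(owner("xxx_240512") for _ in range(9000))

# High-cardinality partition key: rows spread across all nodes.
uniform = Counter(owner(f"user-{i}") for i in range(9000))

print("skewed:", dict(skewed))    # all 9000 writes land on a single node
print("uniform:", dict(uniform))  # roughly 3000 writes per node
```

With the skewed key, the three "replicas" of that single hot partition are the only nodes doing write work, so they can time out while the cluster-wide CPU average stays low - consistent with the ~15% utilisation reported in this issue.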
I have a 3-node ScyllaDB cluster:
32 CPUs, 64 GB RAM, Scylla version 5.4.3
io_properties.yaml:

```yaml
read_iops: 36764
read_bandwidth: 769690880
write_iops: 42064
write_bandwidth: 767818944
```
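For easier comparison with fio output, the bandwidth figures above (which are in bytes/s) can be converted to MiB/s. This is just unit arithmetic on the numbers quoted in this issue:

```python
# Convert the io_properties.yaml bandwidth figures (bytes/s) to MiB/s.
MIB = 1024 ** 2

read_bandwidth = 769_690_880   # from io_properties.yaml above
write_bandwidth = 767_818_944

print(f"read:  {read_bandwidth / MIB:.1f} MiB/s")   # ~734 MiB/s
print(f"write: {write_bandwidth / MIB:.1f} MiB/s")  # ~732 MiB/s
```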
When the application increased write operations from 1200 to 10000 TPS, which is far less than the claimed write_iops, it got the error below:
Error inserting Data : Operation timed out for xxx_xxx.xxx_xxxxx_240512 - received only 1 responses from 2 CL=QUORUM.
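The "received only 1 responses from 2 CL=QUORUM" part of the error can be decoded with the standard quorum formula. The function below is a sketch of that arithmetic, assuming the keyspace's replication factor is 3 (which the "2" in the message implies):

```python
def quorum(replication_factor: int) -> int:
    # QUORUM requires a strict majority of replicas to acknowledge.
    return replication_factor // 2 + 1

# "received only 1 responses from 2 CL=QUORUM" implies RF=3:
assert quorum(3) == 2

# One replica acknowledged in time; the second timed out, so the write
# fails at QUORUM even though the cluster as a whole is lightly loaded.
```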
On the ScyllaDB nodes, the only relevant log line is:
[shard 8:comp] large_data - Writing large partition xxx_xxx.xxx_xxxxx_240512: xxx (37041816 bytes) to me-3gg2_13mq_3jyhc2r2wxx7hvxxw4-big-Data.db
CPU utilisation on each node is barely 15%, yet the application failed to write.
Note: the RF of system_auth and the other keyspaces already equals the number of nodes.
Need insights on this.
Thanks in advance.