Replies: 3 comments
-
The logic is in https://github.com/seaweedfs/seaweedfs/blob/master/weed/shell/command_ec_encode.go. The constraint is that when parallelizing the encoding, a server should not try to encode multiple volumes at the same time. This needs some careful coding.
-
Understandable, because you definitely don't want to lose data. I did some tests on our 8-server cluster with RAID0 sets. When I create RAID0 sets of 6 HDDs each on a volume server with 90 drives, I get 15 RAID0 devices. This brought the encoding time down from 25 minutes to 8 minutes. I use the following configuration:
The maintenance script that runs every 17 minutes:
It would be very helpful if the encoding went faster. Do you have any tips on how I can speed it up further?
-
The Reed-Solomon library used mentions building it with
-
On our current cluster we store around 250TB per day. We do this with a replication of 010 and copy_2 = 720; this combination works very well.
But we are seeing issues when the erasure coding starts (ec.encode -fullPercent=95 -quietFor=1h): it can't keep up with the data being written to the cluster. It takes around 25 minutes per volume to encode and distribute the data.
Is it possible to make this faster? It would be nice to have an option to run it multi-threaded with, say, x threads per server.
To minimise the performance impact, it would also be great if compactionMBps worked for ec.encode.
Best regards,
bvanelst