is_bucket_to_bucket backup for s3.sink only #5054
base: master
Conversation
modified: weed/replication/sink/azuresink/azure_sink.go
modified: weed/replication/sink/b2sink/b2_sink.go
modified: weed/replication/sink/filersink/filer_sink.go
modified: weed/replication/sink/gcssink/gcs_sink.go
modified: weed/replication/sink/localsink/local_sink.go
modified: weed/replication/sink/replication_sink.go
modified: weed/replication/sink/s3sink/s3_sink.go

Added is_bucket_to_bucket option for replication.toml sinks
Changed the initializer of s3.sink to be compatible with the is_bucket_to_bucket option in replication.toml
Added bucket creation and deletion functions in s3.sink
Added bucket-handling logic for when is_bucket_to_bucket is enabled in s3.sink
Modified filer_backup to set the source path to /buckets when is_bucket_to_bucket is enabled. The rationale is that it should back up only buckets, not other paths
Added a log message to show the default values when is_bucket_to_bucket is true
is_bucket_to_bucket option
is_bucket_to_bucket only for s3.sink
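The commits above can be sketched as a small helper. This is a hypothetical illustration, not the PR's actual code: the function name and the warning wording are assumptions, but the behavior follows the commit messages (force the source path to /buckets and only accept excluded paths under it).

```go
package main

import (
	"fmt"
	"log"
	"strings"
)

// applyBucketToBucket sketches the described behavior: when the option is
// enabled, the backup source is pinned to "/buckets" and excluded paths that
// do not live under it are dropped with a warning.
func applyBucketToBucket(sourcePath string, excludedPaths []string, enabled bool) (string, []string) {
	if !enabled {
		return sourcePath, excludedPaths
	}
	// Only bucket data should be backed up, so the source is forced here.
	sourcePath = "/buckets"
	var kept []string
	for _, p := range excludedPaths {
		if strings.HasPrefix(p, "/buckets") {
			kept = append(kept, p)
		} else {
			log.Printf("ignoring excluded path %q: not under /buckets", p)
		}
	}
	return sourcePath, kept
}

func main() {
	src, excluded := applyBucketToBucket("/", []string{"/buckets/tmp", "/home/x"}, true)
	fmt.Println(src, excluded)
}
```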
Please explain more about what the problem is.
Hi,
Any suggestions?
Limited the sourcePath replacement and excludedPaths checks to the `s3` sink only. It won't affect other sinks; if the option is enabled for them, they will log a warning and ignore it
Changed is_bucket_to_bucket to sync_s3_to_s3. Now only sink.s3 has sync_s3_to_s3; other sinks are not affected by it
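A minimal replication.toml sketch of how the renamed option might be set. Only `sync_s3_to_s3` comes from this PR; the other keys are the usual `[sink.s3]` settings, and the exact layout here is an assumption, not the PR's documented config:

```toml
[sink.s3]
enabled = true
aws_access_key_id = ""      # or use the shared credentials file
aws_secret_access_key = ""
region = "us-east-2"
bucket = ""                 # left empty; replaced per source bucket when syncing
directory = ""              # left empty so keys are built from the bucket root
endpoint = ""
sync_s3_to_s3 = true        # the option introduced by this PR
```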
As you requested, I changed the code.
How about this option? prefix_to_trim=["/buckets","/home"]
Or add a pair of prefix_from and prefix_to, for more generic operations.
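The suggested prefix_from/prefix_to pair could be sketched like this. The function name and option names are hypothetical, taken only from the suggestion above:

```go
package main

import (
	"fmt"
	"strings"
)

// mapPrefix rewrites a key according to a prefix_from/prefix_to pair:
// keys starting with prefixFrom are rewritten to start with prefixTo,
// other keys pass through unchanged.
func mapPrefix(key, prefixFrom, prefixTo string) string {
	if strings.HasPrefix(key, prefixFrom) {
		return prefixTo + strings.TrimPrefix(key, prefixFrom)
	}
	return key
}

func main() {
	// Trimming "/buckets" maps a filer path onto a bucket-relative key.
	fmt.Println(mapPrefix("/buckets/photos/a.jpg", "/buckets", "")) // /photos/a.jpg
}
```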
I didn't understand your request.
How about naming this feature/option "support_multiple_buckets"?
Actually it will not support multiple buckets in all scenarios. I think "syncing S3" is more meaningful. Because when we name it
Any ideas or suggestions?
The variable name is still confusing, and I couldn't tell what it does.
Your suggestion
That sounds like a misunderstanding too, as you mentioned. So I'm very confused.
Hi. I followed this thread. As it is a valuable feature in my opinion, I have some suggestions and alternative solutions. Hope they are helpful:
If you take a look at my changes on
Also, these suggestions are helpful:
We can have a different sink for this option, or even a different command for it, but I suggest using this built-in option in
Why is this only limited to the s3 sink?
Because you have an S3 on SeaweedFS and another sink which is also an S3. So to sort of mirror the S3 you can use this option, syncing them together. The backup S3 will be available when the SeaweedFS S3 is down. You can mirror other sinks too, but you have to check the structure of the destination backup sink.
Is it duplicating https://github.com/seaweedfs/seaweedfs/wiki/Gateway-to-Remote-Object-Storage ?
While I was checking
Then I headed to running the
In both attempts I encountered this error:
It seems it tries to read the remote storage from the filer, but it's not present. But when I run
Something important to mention: I did all the weed shell commands in master. So actually I wasn't able to test it out to fully compare it to the current
I would appreciate it if you could guide me through this.
So I checked the
After making
Please check this issue and do a little code review on what I did. Thanks,
So the performance issue of
Now this feature (
What problem are we solving?
With filer.backup, the s3 sink backed up all the buckets into a single bucket under one path.
So I added an option called is_bucket_to_bucket to solve this problem.
How are we solving the problem?
I added an option called is_bucket_to_bucket; if it is enabled, the sourcePath changes to /buckets, and the excludedPaths must also start with /buckets.
Note that this solution only backs up the content of /buckets, not other paths.
When you enable is_bucket_to_bucket, it replaces sourcePath with /buckets and then validates the excludedPaths. It also sets the directory to an empty string to avoid problems with key building, and sets the bucket name to an empty string in the initializer; that empty name is ignored and replaced with the specific bucket name.
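The per-object bucket replacement described above can be sketched as follows. This is an illustration under assumptions, not the PR's actual code: the helper name is hypothetical, and it only shows how the first path element under /buckets can stand in for the empty configured bucket name.

```go
package main

import (
	"fmt"
	"strings"
)

// bucketForKey splits a filer path under /buckets into a destination bucket
// (the first path element) and the remaining object key. With the configured
// sink bucket left empty, each object is routed to its own source bucket.
func bucketForKey(key string) (bucket, object string) {
	trimmed := strings.TrimPrefix(key, "/buckets/")
	parts := strings.SplitN(trimmed, "/", 2)
	bucket = parts[0]
	if len(parts) == 2 {
		object = parts[1]
	}
	return
}

func main() {
	b, o := bucketForKey("/buckets/photos/2023/a.jpg")
	fmt.Println(b, o) // photos 2023/a.jpg
}
```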
How is the PR tested?
I tested it with 100k PUT and DELETE operations using warp.
Checks