Feature Request: Add --worker Flag for Path-Hash-Based Partitioned Transfers in rclone #8400
Comments
This is a great idea. So great that I'm actually already in the middle of implementing it :-) Here is the proposal I made - comments welcome.
Hash Filter
This proposal describes a new flag Uses include:
The flag takes two parameters expressed as a fraction, so Note that rclone will still have to traverse all directories to select these files. The first parameter can be replaced with
@zackees if you want to have a go with v1.70.0-beta.8612.dfe1aacc2.fix-8400-hash-filter on branch fix-8400-hash-filter
Great. Thanks for this! Here is my feedback Add Justification
Feature Request: Add --worker Flag for Partitioned, Hash-Based File Transfers
I'm going to do this myself in my Python API to increase throughput. I thought I'd write a feature request for completeness. Feel free to close this feature request if not applicable. I may be able to implement this feature myself in rclone if this is something you are interested in.
Overview
I'd like to request a new feature that allows rclone to transfer only a portion of a server's content. This feature would enable users to run multiple rclone instances concurrently, with each instance responsible for a distinct subset of files. The goal is to facilitate distributed transfers and avoid duplicate work when syncing or copying large datasets.
Proposed Approach
Introduce a new flag, --worker, where the argument is formatted as worker_id:(n_workers-1). For example:
rclone copy ... --worker 0:1
rclone copy ... --worker 1:1
In the above example, two workers are deployed, and each will handle roughly 50% of the files.
How It Works
For each file to be transferred, rclone will calculate a hash based on the file's path (e.g., using MD5). Then, using the worker parameters, it determines if the current worker should process the file based on the following pseudocode:
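A minimal Python sketch of the selection rule described above, not rclone's actual implementation. It assumes the MD5 digest of the UTF-8 path is read as a big-endian integer and reduced modulo the number of workers; the function names `parse_worker` and `should_process` are hypothetical and exist only for this illustration.

```python
import hashlib


def parse_worker(spec: str) -> tuple[int, int]:
    """Parse 'worker_id:(n_workers-1)' into (worker_id, n_workers)."""
    worker_id_text, last_id_text = spec.split(":")
    worker_id = int(worker_id_text)
    # The second field is the highest worker id, so the count is one more.
    n_workers = int(last_id_text) + 1
    if not 0 <= worker_id < n_workers:
        raise ValueError(f"worker id out of range in {spec!r}")
    return worker_id, n_workers


def should_process(path: str, worker_id: int, n_workers: int) -> bool:
    """Return True if this worker owns the file at `path`."""
    # Assumption: hash the path only, never the file contents.
    digest = hashlib.md5(path.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % n_workers == worker_id


if __name__ == "__main__":
    worker_id, n_workers = parse_worker("0:1")
    for path in ["a/one.txt", "a/two.txt", "b/three.txt"]:
        print(path, should_process(path, worker_id, n_workers))
```

Because the decision depends only on the path, every instance reaches the same answer for a given file without any coordination, and each file belongs to exactly one worker. The split is only roughly even, since it follows the distribution of the hash values rather than file sizes.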