parallel_rsync

Differences

This shows you the differences between two versions of the page.

Old: parallel_rsync [08.08.2016 21:34] Pascal Suter
New: parallel_rsync [12.09.2016 14:09] Pascal Suter
Line 29:
 after waiting too long for Option 1 to finish on a system that carried tons of backups of other systems, I tried this option: \\
 if you have tons of files and want to skip the lengthy process of producing a file list via rsync, you can create a list of directories using find and then simply run one rsync per directory. this gives you full parallelism at the beginning but might end with a few long-running rsyncs if you don't dig deep enough when building your initial directory list. still, this might save a lot of time.
-  find /source/./ -maxdepth 5 -type d | perl -pe 's|^.*?/\./|\1|' > /tmp/filelist
+  find /source/./ -maxdepth 5 -type d | perl -pe 's|^.*?/\./|\1|' > /tmp/rawfilelist
 with the ''-maxdepth'' option you can set how deep you want to dive into your directory tree. the goal is to get directories with a rather small number of files so you don't have to wait too long for the last couple of rsyncs to finish. also note the added ''/./'' at the end of the source path. that's important, as it marks the point from which rsync treats the paths as relative. also check out the ''--relative'' option in the rsync man page, I stole the idea from there ;)
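 a minimal sketch of what the directory list looks like, assuming GNU find. the temporary demo tree, the ''$src.list'' output file and the use of sed in place of the perl one-liner are assumptions of this example, not part of the original command. ''-mindepth 1'' is also an addition here, it keeps the source root itself from turning into an empty line.

```shell
#!/bin/sh
# Hypothetical demo tree: a temporary directory stands in for /source.
src=$(mktemp -d)
mkdir -p "$src/a/b/c" "$src/d"

# List directories down to depth 2 below the "/./" marker, then strip
# everything up to and including "/./" so each entry is relative to
# the source root. sed is greedy, so this assumes the path contains
# only one "/./".
find "$src/./" -mindepth 1 -maxdepth 2 -type d \
  | sed 's|^.*/\./||' \
  | sort > "$src.list"

# Show the resulting list: one relative directory per line.
cat "$src.list"
```

 ''a/b/c'' sits at depth 3 and is therefore not listed on its own. it would be handled by whichever rsync processes ''a/b''.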
  
Line 50:
 ===== Step 3: make sure we didn't miss anything =====
 probably the best feature of rsync is that it resumes aborted jobs nicely and can be run several times across the same source and target without harm. so let's use this property to fix everything we have missed or done wrong by simply running a single-threaded rsync at the end. this can take some time, and I know no way around that.
-  rsync -aHvx /source/ /target/
+  rsync -aHvx --delete /source/ /target/
  
  
  • parallel_rsync.txt
  • Last modified: 20.05.2020 19:44
  • by Pascal Suter