Differences
This shows you the differences between two versions of the page.
parallel_rsync [20.05.2020 19:39] – [the bash function] Pascal Suter | parallel_rsync [20.05.2020 19:44] (current) – [Before we get startet] Pascal Suter | ||
---|---|---|---|
Line 33: | Line 33: | ||
# | # | ||
# version 1: initial release in 2017 | # version 1: initial release in 2017 | ||
- | # version 2: removed the need to escape filenames by using null delimiter + xargs to run commands such as mkdir and rsync, | + | # version 2: May 2020, removed the need to escape filenames by using |
- | # added ability to resume without rescanning (argument $5) and to skip already synced directories (argument $6) | + | # |
+ | # added ability to resume without rescanning (argument $5) and to skip | ||
+ | # | ||
# | # | ||
Line 43: | Line 45: | ||
# $4 = numjobs | # $4 = numjobs | ||
# $5 = dirlist file (optional) --> will allow to resume without re-scanning the entire directory structure | # $5 = dirlist file (optional) --> will allow to resume without re-scanning the entire directory structure | ||
- | | + | |
source=$1 | source=$1 | ||
destination=$2 | destination=$2 | ||
Line 212: | Line 214: | ||
rm -rf / | rm -rf / | ||
- | ===== Before we get started | + | ===== Doing it manually ===== |
+ | Initially I did this manually to copy data from an old storage system to a new one. When I later had to write a script to archive large directories with lots of small files, I decided to write the above function. So for those who are interested in reading more about the basic method and don't like my bash script, here is the manual way this all originated from :) | ||
+ | |||
+ | ==== Before we get started ==== | ||
One important note right at the beginning: while parallelizing is certainly nice, we have to consider that spinning hard disks don't like concurrent file access. So be prepared to never see your hard disks' theoretical throughput reached if you copy lots of small files. | One important note right at the beginning: while parallelizing is certainly nice, we have to consider that spinning hard disks don't like concurrent file access. So be prepared to never see your hard disks' theoretical throughput reached if you copy lots of small files. | ||
Make sure you don't run too many parallel rsyncs by checking your CPU load with top. If you see the " | Make sure you don't run too many parallel rsyncs by checking your CPU load with top. If you see the " |