Sync files: rsync and rclone

rsync and rclone are two popular open-source tools for synchronizing files and directories. They overlap in purpose but differ enough that each suits different use cases: rsync is typically used between local paths and SSH-reachable hosts, while rclone is aimed at cloud storage providers.
Features
rsync
rsync is a file synchronization tool that is widely used for backups, remote file management, and other file synchronization tasks. It is a powerful tool that can handle large amounts of data and can synchronize files and directories across multiple locations. Some of the key features of rsync include:
- Incremental synchronization: rsync transfers only the files that have changed since the last synchronization, and for changed files its delta-transfer algorithm sends only the differing parts, which can significantly reduce the time and bandwidth required.
- Compression: with -z, rsync compresses data in transit, which reduces the amount of data transferred and can speed up synchronization over slow links.
- Encryption: rsync does not encrypt data itself; when it runs over SSH (-e ssh), the SSH transport encrypts the data in transit.
- Authentication: over SSH, rsync relies on SSH keys or passwords to authenticate users; an rsync daemon can also require its own user names and passwords.
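The features above map onto a handful of rsync flags. The following is a minimal sketch, assuming a hypothetical helper named `rsync_cmd` that only prints the command line instead of running it, so the flags can be reviewed first. The paths and host are placeholders.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of rsync): prints an rsync command line.
#   -a                 archive mode: recursive, keeps times, permissions, symlinks
#   -z                 compress data in transit
#   -e ssh             use SSH as transport (supplies encryption and authentication)
#   --dry-run          show what would be transferred without copying anything
#   --itemize-changes  print one line per file that differs
rsync_cmd() {
  local src=$1 dst=$2
  printf '%s\n' "rsync -az --dry-run --itemize-changes -e ssh $src $dst"
}

rsync_cmd /source/files/ user@destination-host:/destination/files/
```

Dropping `--dry-run` from the printed command turns the preview into a real transfer.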
rclone
rclone is a command-line tool that is designed to be easy to use and can be used for a wide range of file synchronization tasks. It supports a wide range of cloud storage providers, including Google Drive, Dropbox, Amazon S3, and Microsoft OneDrive. Some of the key features of rclone include:
- Easy to use: rclone is easy to install and use, making it a popular choice for users who are new to file synchronization.
- Multiple source and destination options: rclone can synchronize files and directories between multiple locations, including local storage, cloud storage, and remote servers.
- Compression: rclone's compress remote can wrap another remote and store files compressed, which reduces the amount of data stored and transferred.
- Encryption: rclone's crypt remote can wrap another remote and encrypt files before they are uploaded, which provides an additional layer of security for sensitive data.
- Syncing options: rclone provides several synchronization modes, including one-way copy (rclone copy), one-way mirroring (rclone sync), and bidirectional synchronization (rclone bisync).
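The synchronization modes differ mainly in which rclone subcommand is used. A rough sketch follows, assuming a hypothetical helper `rclone_subcommand` and the mode names `copy`, `mirror` and `two-way`, which are illustrative labels rather than rclone terms. The remotes `source:` and `destination:` are placeholders, and `bisync` is assumed to be available in the installed rclone version.

```shell
#!/usr/bin/env bash
# Hypothetical helper: maps an illustrative mode name to an rclone subcommand.
rclone_subcommand() {
  case "$1" in
    copy)    echo "copy"   ;;  # one-way; never deletes files on the destination
    mirror)  echo "sync"   ;;  # one-way; deletes destination files missing from the source
    two-way) echo "bisync" ;;  # bidirectional; propagates changes in both directions
    *)       echo "unknown mode: $1" >&2; return 1 ;;
  esac
}

# Print (not run) a dry-run command for the chosen mode.
echo "rclone $(rclone_subcommand mirror) --dry-run source:/source/files/ destination:/destination/files/"
```

Because `rclone sync` deletes files on the destination, previewing with `--dry-run` first is a reasonable habit.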
Examples
Rsync
Case 1: simple copy to destination-host.
rsync -rltvv -e ssh /source/files/ user@destination-host:/destination/files/
Case 2: the same copy, with these differences:
- runs uninterrupted in the background (nohup keeps it alive after logout)
- the destination requires sudo to access /destination/files/
- if password authentication is used, run fg to bring the job to the foreground and enter the password, then press Control+Z to suspend it and run bg to resume it in the background
nohup rsync -rltvv --exclude='*.zip' --exclude='*.tar.gz' --exclude='*.tgz' --exclude='*.jar' --exclude='*.deb' -e ssh --rsync-path="sudo rsync" /source/files/ user@destination-host:/destination/files/ > ~/log.txt 2>&1 &
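When a background job logs to a file, the order of the redirections decides whether error messages reach the log. The sketch below illustrates this with a stand-in function `noisy` (hypothetical, used instead of rsync so it is safe to run anywhere) and a temporary directory.

```shell
#!/usr/bin/env bash
# Stand-in for a chatty command: one line on stdout, one on stderr.
noisy() { echo "to stdout"; echo "to stderr" >&2; }

log_dir=$(mktemp -d)

# stdout is sent to the file first, then stderr is pointed at the same file:
# both lines are logged.
noisy > "$log_dir/both.txt" 2>&1

# Reversed order: stderr is duplicated onto the current stdout (the terminal)
# before stdout is redirected, so only the stdout line is logged.
noisy 2>&1 > "$log_dir/stdout_only.txt"

grep -c 'to' "$log_dir/both.txt"         # prints 2
grep -c 'to' "$log_dir/stdout_only.txt"  # prints 1
```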
Case 3: the same as above, but run as up to 4 parallel rsync processes, one per top-level entry of /source/files/
ls -d /source/files/* | xargs -P4 -I% rsync -Prltvv --exclude='*.zip' --exclude='*.tar.gz' --exclude='*.tgz' --exclude='*.jar' --exclude='*.deb' -e ssh --rsync-path="sudo -S rsync" % user@destination-host:/destination/files/
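The fan-out only helps when xargs receives more than one input line, so it is worth checking what the pipeline would launch. A small sketch, assuming a throwaway directory with three illustrative subdirectories and `echo` in place of the real rsync call:

```shell
#!/usr/bin/env bash
# Build a throwaway tree with three top-level entries (names are illustrative).
src=$(mktemp -d)
mkdir "$src/job-a" "$src/job-b" "$src/job-c"

# One command per top-level entry, at most 4 running at a time.
# `echo` stands in for rsync so nothing is transferred; `sort` makes the
# output order stable, since parallel jobs may finish in any order.
planned=$(ls -d "$src"/* \
  | xargs -P4 -I% echo rsync -rlt % user@destination-host:/destination/files/ \
  | sort)

printf '%s\n' "$planned"
```

Listing the directory itself (with no trailing `*`) would produce a single input line and therefore a single rsync process.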
Rclone
[Optional] Configure remotes
- Via GUI (https://rclone.org/gui/):
rclone rcd --rc-web-gui
- Via CLI:
rclone config
# n) New remote
# - name: source
# - type: sftp
# - host: source.com
# - pass: *** ENCRYPTED ***
# - user: ubuntu
# - shell_type: unix
# - md5sum_command: md5sum
# - sha1sum_command: sha1sum
Now we have two remotes, source: and destination:
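The interactive answers end up as a section in rclone's configuration file. The sketch below writes an equivalent section to a temporary file by hand, as an illustration of the resulting layout. It assumes the `source` values shown above and leaves out `pass`, because rclone stores passwords in obscured form (see `rclone obscure`) rather than as plain text.

```shell
#!/usr/bin/env bash
# Temporary config file so the real rclone configuration is not touched.
conf=$(mktemp)

# One INI-style section per remote; keys mirror the answers given to `rclone config`.
cat > "$conf" <<'EOF'
[source]
type = sftp
host = source.com
user = ubuntu
shell_type = unix
md5sum_command = md5sum
sha1sum_command = sha1sum
EOF

# rclone reads the config path from RCLONE_CONFIG (or from the --config flag).
export RCLONE_CONFIG="$conf"
# With rclone installed, `rclone listremotes` should now list "source:".
```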
Examples
Copy, excluding certain file types
rclone copy -P -v --exclude='*.zip' --exclude='*.tar.gz' --exclude='*.tgz' --exclude='*.jar' --exclude='*.deb' source:/source/files/ destination:/destination/files/
Copy only files modified within the last 10 days
rclone copy -P -v --max-age 10d source:/source/files/ destination:/destination/files/
Exclude certain file extensions when copying the Jenkins jobs directory
rclone copy -P -v --exclude='*.zip' --exclude='*.tar.gz' --exclude='*.tgz' --exclude='*.jar' --exclude='*.deb' source:/var/lib/jenkins/jobs/ destination:/var/lib/jenkins/jobs/
Create cron job using Ansible
- name: 'Sync files'
  hosts: all
  become: true
  tasks:
    - name: Cron job to sync artifacts from source to local folder
      ansible.builtin.cron:
        name: "source_sync"
        state: present
        weekday: "*"
        month: "*"
        day: "*"
        hour: "*/5"
        minute: "0"
        user: "ubuntu"
        job: "sudo -E timeout 58m rclone sync -L -P -v --transfers=30 --checkers 20 source-remote:/var/lib/jenkins/jobs/ /mnt/data/jenkins/jobs/ 2>/tmp/source-errors.log >/tmp/source-log.log"
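To check the schedule, it can help to see how the module's fields line up in a crontab entry. The sketch below assumes a hypothetical helper `cron_line` that joins the fields in crontab order; the job string is abbreviated with `...` for readability.

```shell
#!/usr/bin/env bash
# Hypothetical helper: joins the schedule fields and the job in crontab order
# (minute, hour, day of month, month, day of week, command).
cron_line() {
  local minute=$1 hour=$2 day=$3 month=$4 weekday=$5 job=$6
  printf '%s %s %s %s %s %s\n' "$minute" "$hour" "$day" "$month" "$weekday" "$job"
}

# Schedule values taken from the playbook; quoting keeps the shell from expanding "*".
cron_line "0" "*/5" "*" "*" "*" "sudo -E timeout 58m rclone sync ..."
```

With hour `*/5` and minute `0`, the job starts at 00:00, 05:00, 10:00, 15:00 and 20:00, so the 58-minute timeout ends each run well before the next one begins.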