-Title: s3ql, a locally mounted cloud file system - nice free software
+Title: S3QL, a locally mounted cloud file system - nice free software
Tags: english, debian, personvern, sikkerhet
-Date: 2014-04-09 11:20
+Date: 2014-04-09 11:30
-<p>For a while now, I have been looking for a sensible off site backup
-solution to use at home. My requirements are cheap and locally
-encrypted (in other words, I keep the keys, the storage provider do
-not have access to my private files). One idea me and my friends have
-had over the years have been to use Google mail as storage, writing a
-Linux block device storing blocks as emails in the mail service
-provided by Google, and thus get heaps of free space. On top of this
-one can add encryption, RAID and volume management to have lots of
-(fairly slow, I admit that) cheap and encrypted storage. But I never
-found time to implement such system. But the last few weeks I have
-looked at a system called
+<p>For a while now, I have been looking for a sensible offsite backup
+solution for use at home. My requirements are simple: it must be
+cheap and locally encrypted (in other words, I keep the encryption
+keys, and the storage provider does not have access to my private
+files). One idea my friends and I had many years ago, before the
+cloud storage providers showed up, was to use Google mail as storage,
+writing a Linux block device storing blocks as emails in the mail
+service provided by Google, and thus get heaps of free space. On top
+of this one can add encryption, RAID and volume management to have
+lots of (fairly slow, I admit that) cheap and encrypted storage. But
+I never found time to implement such a system. For the last few weeks
+I have looked at a system called
<a href="https://bitbucket.org/nikratio/s3ql/">S3QL</a>, a locally
mounted network backed file system with the features I need.</p>
<p>S3QL is a fuse file system with a local cache and cloud storage,
handling several different storage providers, any with Amazon S3,
-Google Drive or OpenStack API. There are heaps of such providers. It
-can also use a local directory as storage, which combined with sshfs
-allow for file storage on any ssh server. S3QL include support for
-encryption, compression, de-duplication, snapshots and immutable file
-systems, allowing me to mount the remote storage as a local mount
-point, look at and use the files as if they were local, while the
-content is stored in the cloud as well. This allow me to have a
-backup that should survive fire. The file system can not be shared
-between several machines at the same time, as only one can mount it at
-the time, but any machine with the encryption key and access to the
-storage service can mount it if it is unmounted.</p>
+Google Storage or OpenStack API. There are heaps of such storage
+providers. S3QL can also use a local directory as storage, which
+combined with sshfs allows for file storage on any ssh server. S3QL
+includes support for encryption, compression, de-duplication, snapshots
+and immutable file systems, allowing me to mount the remote storage as
+a local mount point, look at and use the files as if they were local,
+while the content is stored in the cloud as well. This allows me to
+have a backup that should survive fire. The file system cannot be
+shared between several machines at the same time, as only one can
+mount it at a time, but any machine with the encryption key and
+access to the storage service can mount it if it is unmounted.</p>
<p>It is simple to use. I'm using it on Debian Wheezy, where the
package is included already. So to get started, run <tt>apt-get
install s3ql</tt>. Next, pick a storage provider. I ended up picking
Greenqloud, after reading their nice recipe on
<a href="https://greenqloud.zendesk.com/entries/44611757-How-To-Use-S3QL-to-mount-a-StorageQloud-bucket-on-Debian-Wheezy">how
-to use s3ql with their Amazon S3 service</a>, because I trust the laws
-in Iceland more than those in USA when it come to keeping my data safe
-and private, and thus would rather spend money on a company in
-Iceland. Another nice recipe is available from the article
+to use S3QL with their Amazon S3 service</a>, because I trust the laws
+in Iceland more than those in the USA when it comes to keeping my
+personal data safe and private, and thus would rather spend money on a
+company in Iceland. Another nice recipe is available from the article
<a href="http://www.admin-magazine.com/HPC/Articles/HPC-Cloud-Storage">S3QL
Filesystem for HPC Storage</a> by Jeff Layton in the HPC section of
Admin magazine. When the provider is picked, figure out how to get
<p><blockquote><pre>
# mkdir -m 700 /var/lib/s3ql-cache
-# mkfs.s3ql --cachedir /var/lib/s3ql-cache --authfile /root/.s3ql/authinfo2 --ssl s3c://s.greenqloud.com:443/bucket-name
+# mkfs.s3ql --cachedir /var/lib/s3ql-cache --authfile /root/.s3ql/authinfo2 \
+ --ssl s3c://s.greenqloud.com:443/bucket-name
Enter backend login:
Enter backend password:
Before using S3QL, make sure to read the user's guide, especially
<p>The next step is mounting the file system to make the storage available.
<p><blockquote><pre>
-# mount.s3ql --cachedir /var/lib/s3ql-cache --authfile /root/.s3ql/authinfo2 --ssl --allow-root s3c://s.greenqloud.com:443/bucket-name /s3ql
+# mount.s3ql --cachedir /var/lib/s3ql-cache --authfile /root/.s3ql/authinfo2 \
+ --ssl --allow-root s3c://s.greenqloud.com:443/bucket-name /s3ql
Using 4 upload threads.
Downloading and decompressing metadata...
Reading metadata...
..contents..
..ext_attributes..
Mounting filesystem...
-# df -h /mnt
+# df -h /s3ql
Filesystem Size Used Avail Use% Mounted on
s3c://s.greenqloud.com:443/bucket-name 1.0T 0 1.0T 0% /s3ql
#
which is very close to my upload speed, and downloading the same
Debian installation ISO gave me 610 kiB/s, close to my download speed.
Both were measured using <tt>dd</tt>. So for me, the bottleneck is my
-network, not the file system code.</p>
+network, not the file system code. I do not know what a good cache
+size would be, but suspect that the cache should be larger than your
+working set.</p>
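<p>As a rough sketch of how such a dd measurement could be scripted
(this is not the exact command used for the numbers above; the helper
name <tt>copy_rate</tt> is made up for illustration, and
<tt>conv=fsync</tt> assumes GNU dd), one can time a copy and divide
the file size by the elapsed seconds:</p>

```shell
# Hypothetical helper, not from the original post: copy SRC to DST with
# dd and print the average rate in KiB/s.
copy_rate() {
    src=$1
    dst=$2
    bytes=$(wc -c < "$src")
    start=$(date +%s)
    # conv=fsync is a GNU dd option; it asks dd to fsync the output file
    # before exiting, so the timing covers more than the page cache.
    dd if="$src" of="$dst" bs=1M conv=fsync 2>/dev/null || return 1
    end=$(date +%s)
    elapsed=$((end - start))
    # Avoid division by zero for copies that finish within one second.
    [ "$elapsed" -gt 0 ] || elapsed=1
    echo $((bytes / 1024 / elapsed))
}
```

<p>Calling it as <tt>copy_rate some.iso /s3ql/some.iso</tt>, where
<tt>some.iso</tt> is a placeholder for a large file that does not
compress well, prints a rate in KiB/s. With a local cache in front of
the storage, the number may only reflect the network once the data
actually has to pass through to the storage backend.</p>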
-I mentioned that only one machine can mount the file system at the
+<p>I mentioned that only one machine can mount the file system at the
time. If another machine tries, it is told that the file system is
-busy:
+busy:</p>
<p><blockquote><pre>
-# mount.s3ql --cachedir /var/lib/s3ql-cache --authfile /root/.s3ql/authinfo2 --ssl --allow-root s3c://s.greenqloud.com:443/bucket-name /s3ql
+# mount.s3ql --cachedir /var/lib/s3ql-cache --authfile /root/.s3ql/authinfo2 \
+ --ssl --allow-root s3c://s.greenqloud.com:443/bucket-name /s3ql
Using 8 upload threads.
Backend reports that fs is still mounted elsewhere, aborting.
#
<p>The file content is uploaded when the cache is full, while the
metadata is uploaded once every 24 hours by default. To ensure the
file system content is flushed to the cloud, one can either umount the
-file system, or ask s3ql to flush the cache and metadata using
+file system, or ask S3QL to flush the cache and metadata using
s3qlctrl:
<p><blockquote><pre>
<a href="http://crowncloud.net/">Crowncloud</A>. The latter even
accepts payment in Bitcoin. Pick one that suits your needs. Some of
them provide several GiB of free storage, but the price models are
-quire different and you will have to figure out what suit you
+quite different and you will have to figure out what suits you
best.</p>
<p>While researching this blog post, I had a look at research papers
Chen, Benjamin McClelland, David Sherrill, Alfred Torrez, Parks Fields
and Pamela Smith. Please have a look.</p>
+<p>Given my problems with different file systems earlier, I decided to
+check out the mounted S3QL file system to see if it would be usable as
+a home directory (in other words, that it provides POSIX semantics when
+it comes to locking, umask handling, etc.). Running
+<a href="http://people.skolelinux.org/pere/blog/Testing_if_a_file_system_can_be_used_for_home_directories___.html">my
+test code to check file system semantics</a>, I was happy to discover that
+no errors were found. So the file system can be used for home
+directories, if one chooses to do so.</p>
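<p>To give an idea of what one such check can look like, here is a
minimal sketch of a umask test. It is not the linked test code, which
covers more than this; the function name <tt>check_umask</tt> is made
up for illustration, and <tt>stat -c</tt> assumes GNU coreutils as
found on Debian:</p>

```shell
# Hypothetical check, far simpler than the linked test code: verify
# that a file created under umask 077 in DIR ends up with mode 600.
check_umask() {
    dir=$1
    probe="$dir/umask-probe.$$"
    # Run in a subshell so the caller's umask is left untouched.
    mode=$(
        umask 077
        : > "$probe"
        stat -c %a "$probe"
    )
    rm -f "$probe"
    # Succeed only if the file system honoured the umask.
    [ "$mode" = "600" ]
}
```

<p>Running <tt>check_umask /s3ql && echo ok</tt> against the mount
point would print <tt>ok</tt> only if the file system honours the
umask. Locking would need a separate check.</p>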
+
<p>If you do not want a locally mounted file system, and want something
that works without the Linux fuse file system, I would like to mention the
<a href="http://www.tarsnap.com/">Tarsnap service</a>, which also