- <title>A fist full of non-anonymous Bitcoins</title>
- <link>http://people.skolelinux.org/pere/blog/A_fist_full_of_non_anonymous_Bitcoins.html</link>
- <guid isPermaLink="true">http://people.skolelinux.org/pere/blog/A_fist_full_of_non_anonymous_Bitcoins.html</guid>
- <pubDate>Wed, 29 Jan 2014 14:10:00 +0100</pubDate>
- <description><p>Bitcoin is an incredible use of peer-to-peer communication and
-encryption, allowing direct and immediate money transfer without any
-central control. It is sometimes claimed to be ideal for illegal
-activity, which I believe is quite a long way from the truth. At least
-I would not conduct illegal money transfers using a system where the
-details of every transaction are kept forever. This point is
-investigated in
-<a href="https://www.usenix.org/publications/login">USENIX ;login:</a>
-from December 2013, in the article
-"<a href="https://www.usenix.org/system/files/login/articles/03_meiklejohn-online.pdf">A
-Fistful of Bitcoins - Characterizing Payments Among Men with No
-Names</a>" by Sarah Meiklejohn, Marjori Pomarole, Grant Jordan, Kirill
-Levchenko, Damon McCoy, Geoffrey M. Voelker, and Stefan Savage. They
-analyse the transaction log in the Bitcoin system, using it to find
-addresses belonging to individuals and organisations and follow the flow
-of money from both Bitcoin theft and trades on Silk Road to where the
-money ends up. This is how they wrap up their article:</p>
-
-<p><blockquote>
-<p>"To demonstrate the usefulness of this type of analysis, we turned
-our attention to criminal activity. In the Bitcoin economy, criminal
-activity can appear in a number of forms, such as dealing drugs on
-Silk Road or simply stealing someone else’s bitcoins. We followed the
-flow of bitcoins out of Silk Road (in particular, from one notorious
-address) and from a number of highly publicized thefts to see whether
-we could track the bitcoins to known services. Although some of the
-thieves attempted to use sophisticated mixing techniques (or possibly
-mix services) to obscure the flow of bitcoins, for the most part
-tracking the bitcoins was quite straightforward, and we ultimately saw
-large quantities of bitcoins flow to a variety of exchanges directly
-from the point of theft (or the withdrawal from Silk Road).</p>
-
-<p>As acknowledged above, following stolen bitcoins to the point at
-which they are deposited into an exchange does not in itself identify
-the thief; however, it does enable further de-anonymization in the
-case in which certain agencies can determine (through, for example,
-subpoena power) the real-world owner of the account into which the
-stolen bitcoins were deposited. Because such exchanges seem to serve
-as chokepoints into and out of the Bitcoin economy (i.e., there are
-few alternative ways to cash out), we conclude that using Bitcoin for
-money laundering or other illicit purposes does not (at least at
-present) seem to be particularly attractive."</p>
-</blockquote></p>
-
-<p>These researchers are not the first to analyse the Bitcoin
-transaction log. The 2011 paper
-"<a href="http://arxiv.org/abs/1107.4524">An Analysis of Anonymity in
-the Bitcoin System</A>" by Fergal Reid and Martin Harrigan is
-summarized like this:</p>
-
-<p><blockquote>
-"Anonymity in Bitcoin, a peer-to-peer electronic currency system, is a
-complicated issue. Within the system, users are identified by
-public-keys only. An attacker wishing to de-anonymize its users will
-attempt to construct the one-to-many mapping between users and
-public-keys and associate information external to the system with the
-users. Bitcoin tries to prevent this attack by storing the mapping of
-a user to his or her public-keys on that user's node only and by
-allowing each user to generate as many public-keys as required. In
-this chapter we consider the topological structure of two networks
-derived from Bitcoin's public transaction history. We show that the
-two networks have a non-trivial topological structure, provide
-complementary views of the Bitcoin system and have implications for
-anonymity. We combine these structures with external information and
-techniques such as context discovery and flow analysis to investigate
-an alleged theft of Bitcoins, which, at the time of the theft, had a
-market value of approximately half a million U.S. dollars."
-</blockquote></p>
-
-<p>I hope these references can help kill the urban myth that Bitcoin
-is anonymous. It isn't really a good fit for illegal activities. Use
-cash if you need to stay anonymous, at least until regular DNA
-sampling of notes and coins becomes the norm. :)</p>
-
-<p>As usual, if you use bitcoin and want to show your support of my
+ <title>S3QL, a locally mounted cloud file system - nice free software</title>
+ <link>http://people.skolelinux.org/pere/blog/S3QL__a_locally_mounted_cloud_file_system___nice_free_software.html</link>
+ <guid isPermaLink="true">http://people.skolelinux.org/pere/blog/S3QL__a_locally_mounted_cloud_file_system___nice_free_software.html</guid>
+ <pubDate>Wed, 9 Apr 2014 11:30:00 +0200</pubDate>
+ <description><p>For a while now, I have been looking for a sensible offsite backup
+solution for use at home. My requirements are simple: it must be
+cheap and locally encrypted (in other words, I keep the encryption
+keys and the storage provider does not have access to my private files).
+One idea my friends and I had many years ago, before the cloud
+storage providers showed up, was to use Google mail as storage,
+writing a Linux block device storing blocks as emails in the mail
+service provided by Google, and thus get heaps of free space. On top
+of this one can add encryption, RAID and volume management to have
+lots of (fairly slow, I admit) cheap and encrypted storage. But
+I never found time to implement such a system. The last few weeks I
+have instead looked at a system called
+<a href="https://bitbucket.org/nikratio/s3ql/">S3QL</a>, a locally
+mounted network backed file system with the features I need.</p>
+
+<p>S3QL is a fuse file system with a local cache and cloud storage,
+handling several different storage providers, any with an Amazon S3,
+Google Storage or OpenStack API. There are heaps of such storage
+providers. S3QL can also use a local directory as storage, which
+combined with sshfs allows for file storage on any ssh server. S3QL
+includes support for encryption, compression, de-duplication, snapshots
+and immutable file systems, allowing me to mount the remote storage as
+a local mount point, look at and use the files as if they were local,
+while the content is stored in the cloud as well. This allows me to
+have a backup that should survive fire. The file system can not be
+shared between several machines at the same time, as only one can
+mount it at a time, but any machine with the encryption key and
+access to the storage service can mount it if it is unmounted.</p>
+
+<p>It is simple to use. I'm using it on Debian Wheezy, where the
+package is included already. So to get started, run <tt>apt-get
+install s3ql</tt>. Next, pick a storage provider. I ended up picking
+Greenqloud, after reading their nice recipe on
+<a href="https://greenqloud.zendesk.com/entries/44611757-How-To-Use-S3QL-to-mount-a-StorageQloud-bucket-on-Debian-Wheezy">how
+to use S3QL with their Amazon S3 service</a>, because I trust the laws
+in Iceland more than those in the USA when it comes to keeping my personal
+data safe and private, and thus would rather spend money on a company
+in Iceland. Another nice recipe is available from the article
+<a href="http://www.admin-magazine.com/HPC/Articles/HPC-Cloud-Storage">S3QL
+Filesystem for HPC Storage</a> by Jeff Layton in the HPC section of
+Admin magazine. When the provider is picked, figure out how to get
+the API key needed to connect to the storage API. With Greenqloud,
+the key did not show up until I had added payment details to my
+account.</p>
+
+<p>Armed with the API access details, it is time to create the file
+system. First, create a new bucket in the cloud. This bucket is the
+file system storage area. I picked a bucket name reflecting the
+machine that was going to store data there, but any name will do.
+I'll refer to it as <tt>bucket-name</tt> below. In addition, one needs
+the API login and password, and a locally created password. Store it
+all in ~root/.s3ql/authinfo2 like this:</p>
+
+<p><blockquote><pre>
+[s3c]
+storage-url: s3c://s.greenqloud.com:443/bucket-name
+backend-login: API-login
+backend-password: API-password
+fs-passphrase: local-password
+</pre></blockquote></p>
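+<p>Since this file holds both the API credentials and the file system
+passphrase, it should be readable by its owner only. As far as I know
+S3QL refuses to use an authinfo2 file that group or others can read,
+but treat that as an assumption and check the user's guide for your
+version. Here is a minimal sketch of how the file could be created
+safely. The function name <tt>write_authinfo</tt> and its argument
+order are made up for illustration and are not part of S3QL:</p>

```shell
#!/bin/sh
# Hypothetical helper (not part of S3QL): write an authinfo2 file that
# only its owner can read.
# Usage: write_authinfo FILE STORAGE-URL LOGIN PASSWORD PASSPHRASE
write_authinfo() {
    file=$1
    mkdir -p "$(dirname "$file")"
    # Use a subshell with a restrictive umask, so the secrets are never
    # on disk with wider permissions, not even for a moment.
    (
        umask 077
        cat > "$file" <<EOF
[s3c]
storage-url: $2
backend-login: $3
backend-password: $4
fs-passphrase: $5
EOF
    )
    # Also tighten the mode in case the file existed already.
    chmod 600 "$file"
}
```

+<p>It could be called like <tt>write_authinfo /root/.s3ql/authinfo2
+s3c://s.greenqloud.com:443/bucket-name API-login API-password
+local-password</tt>, using the same placeholder values as above.</p>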
+
+<p>I create my local passphrase using <tt>pwgen 50</tt> or similar,
+but any sensible way to create a fairly random password should do it.
+Armed with these details, it is now time to run mkfs, entering the API
+details and password to create it:</p>
+
+<p><blockquote><pre>
+# mkdir -m 700 /var/lib/s3ql-cache
+# mkfs.s3ql --cachedir /var/lib/s3ql-cache --authfile /root/.s3ql/authinfo2 \
+ --ssl s3c://s.greenqloud.com:443/bucket-name
+Enter backend login:
+Enter backend password:
+Before using S3QL, make sure to read the user's guide, especially
+the 'Important Rules to Avoid Loosing Data' section.
+Enter encryption password:
+Confirm encryption password:
+Generating random encryption key...
+Creating metadata tables...
+Dumping metadata...
+..objects..
+..blocks..
+..inodes..
+..inode_blocks..
+..symlink_targets..
+..names..
+..contents..
+..ext_attributes..
+Compressing and uploading metadata...
+Wrote 0.00 MB of compressed metadata.
+# </pre></blockquote></p>
+
+<p>The next step is mounting the file system to make the storage available.</p>
+
+<p><blockquote><pre>
+# mount.s3ql --cachedir /var/lib/s3ql-cache --authfile /root/.s3ql/authinfo2 \
+ --ssl --allow-root s3c://s.greenqloud.com:443/bucket-name /s3ql
+Using 4 upload threads.
+Downloading and decompressing metadata...
+Reading metadata...
+..objects..
+..blocks..
+..inodes..
+..inode_blocks..
+..symlink_targets..
+..names..
+..contents..
+..ext_attributes..
+Mounting filesystem...
+# df -h /s3ql
+Filesystem                              Size  Used Avail Use% Mounted on
+s3c://s.greenqloud.com:443/bucket-name  1.0T     0  1.0T   0% /s3ql
+#
+</pre></blockquote></p>
+
+<p>The file system is now ready for use. I use rsync to store my
+backups in it, and as the metadata used by rsync is downloaded at
+mount time, no network traffic (and storage cost) is triggered by
+running rsync. To unmount, one should not use the normal umount
+command, as this will not flush the cache to the cloud storage, but
+instead run the umount.s3ql command like this:</p>
+
+<p><blockquote><pre>
+# umount.s3ql /s3ql
+#
+</pre></blockquote></p>
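+<p>The mount, rsync and unmount steps can be tied together in a small
+script. What follows is only a sketch under my own assumptions: the
+rsync options, the <tt>backup/</tt> target directory and the function
+names are illustrations, not the exact setup I run. Setting
+<tt>DRY_RUN=1</tt> prints the commands instead of running them, which
+is handy for checking the script before trusting it with real data:</p>

```shell
#!/bin/sh
# Sketch of one backup cycle: mount, rsync, then unmount with umount.s3ql.
# STORAGE_URL and MOUNTPOINT reuse the placeholder values from the text.
STORAGE_URL="s3c://s.greenqloud.com:443/bucket-name"
MOUNTPOINT="/s3ql"

# Run a command, or just print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: backup_cycle SOURCE-DIRECTORY
backup_cycle() {
    src=$1
    run mount.s3ql --cachedir /var/lib/s3ql-cache \
        --authfile /root/.s3ql/authinfo2 --ssl --allow-root \
        "$STORAGE_URL" "$MOUNTPOINT" || return 1
    # The rsync options and the backup/ subdirectory are assumptions.
    run rsync -a --delete "$src/" "$MOUNTPOINT/backup/"
    status=$?
    # Unmount with umount.s3ql even if rsync failed, so the cache is
    # flushed to the cloud storage.
    run umount.s3ql "$MOUNTPOINT" || return 1
    return $status
}
```

+<p>Running <tt>DRY_RUN=1; backup_cycle /home</tt> prints the three
+commands in order without touching the file system.</p>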
+
+<p>There is a fsck command available to check the file system and
+correct any problems detected. This can be used if the local server
+crashes while the file system is mounted, to reset the "already
+mounted" flag. This is what it looks like when processing a working
+file system:</p>
+
+<p><blockquote><pre>
+# fsck.s3ql --force --ssl s3c://s.greenqloud.com:443/bucket-name
+Using cached metadata.
+File system seems clean, checking anyway.
+Checking DB integrity...
+Creating temporary extra indices...
+Checking lost+found...
+Checking cached objects...
+Checking names (refcounts)...
+Checking contents (names)...
+Checking contents (inodes)...
+Checking contents (parent inodes)...
+Checking objects (reference counts)...
+Checking objects (backend)...
+..processed 5000 objects so far..
+..processed 10000 objects so far..
+..processed 15000 objects so far..
+Checking objects (sizes)...
+Checking blocks (referenced objects)...
+Checking blocks (refcounts)...
+Checking inode-block mapping (blocks)...
+Checking inode-block mapping (inodes)...
+Checking inodes (refcounts)...
+Checking inodes (sizes)...
+Checking extended attributes (names)...
+Checking extended attributes (inodes)...
+Checking symlinks (inodes)...
+Checking directory reachability...
+Checking unix conventions...
+Checking referential integrity...
+Dropping temporary indices...
+Backing up old metadata...
+Dumping metadata...
+..objects..
+..blocks..
+..inodes..
+..inode_blocks..
+..symlink_targets..
+..names..
+..contents..
+..ext_attributes..
+Compressing and uploading metadata...
+Wrote 0.89 MB of compressed metadata.
+#
+</pre></blockquote></p>
+
+<p>Thanks to the cache, working on files that fit in the cache is very
+quick, about the same speed as local file access. Uploading large
+amounts of data is for me limited by the bandwidth out of and into my
+house. Uploading 685 MiB with a 100 MiB cache gave me 305 kiB/s,
+which is very close to my upload speed, and downloading the same
+Debian installation ISO gave me 610 kiB/s, close to my download speed.
+Both were measured using <tt>dd</tt>. So for me, the bottleneck is my
+network, not the file system code. I do not know what a good cache
+size would be, but suspect that the cache should be larger than your
+working set.</p>
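+<p>To get a feeling for what these rates mean for a first full
+backup, the transfer time can be estimated with a bit of shell
+arithmetic. The function name <tt>estimate_seconds</tt> is made up for
+this sketch, and it ignores protocol overhead, compression and
+de-duplication, so take the result as a rough estimate only:</p>

```shell
#!/bin/sh
# Usage: estimate_seconds SIZE-IN-MiB RATE-IN-KiB-PER-SECOND
# Prints the number of whole seconds needed, rounded up.
estimate_seconds() {
    size_kib=$(( $1 * 1024 ))
    # Adding (rate - 1) before the integer division rounds up.
    echo $(( (size_kib + $2 - 1) / $2 ))
}
```

+<p>With the numbers above, <tt>estimate_seconds 685 305</tt> prints
+2300 (a bit over 38 minutes for the upload), and <tt>estimate_seconds
+685 610</tt> prints 1150 (around 19 minutes for the download).</p>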
+
+<p>I mentioned that only one machine can mount the file system at a
+time. If another machine tries, it is told that the file system is
+busy:</p>
+
+<p><blockquote><pre>
+# mount.s3ql --cachedir /var/lib/s3ql-cache --authfile /root/.s3ql/authinfo2 \
+ --ssl --allow-root s3c://s.greenqloud.com:443/bucket-name /s3ql
+Using 8 upload threads.
+Backend reports that fs is still mounted elsewhere, aborting.
+#
+</pre></blockquote></p>
+
+<p>The file content is uploaded when the cache is full, while the
+metadata is uploaded once every 24 hours by default. To ensure the
+file system content is flushed to the cloud, one can either unmount the
+file system, or ask S3QL to flush the cache and metadata using
+s3qlctrl:</p>
+
+<p><blockquote><pre>
+# s3qlctrl upload-meta /s3ql
+# s3qlctrl flushcache /s3ql
+#
+</pre></blockquote></p>
+
+<p>If you are curious about how much space your data uses in the
+cloud, and how much compression and deduplication cut down on the
+storage usage, you can use s3qlstat on the mounted file system to get
+a report:</p>
+
+<p><blockquote><pre>
+# s3qlstat /s3ql
+Directory entries: 9141
+Inodes: 9143
+Data blocks: 8851
+Total data size: 22049.38 MB
+After de-duplication: 21955.46 MB (99.57% of total)
+After compression: 21877.28 MB (99.22% of total, 99.64% of de-duplicated)
+Database size: 2.39 MB (uncompressed)
+(some values do not take into account not-yet-uploaded dirty blocks in cache)
+#
+</pre></blockquote></p>
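+<p>The report can also be fed to a small filter to see the total
+saving in one number. This sketch assumes the line format shown above
+("Total data size:" and "After compression:" followed by a value in
+MB), which might differ in other S3QL versions. The function name
+<tt>s3ql_savings</tt> is made up for the occasion:</p>

```shell
#!/bin/sh
# Read s3qlstat output on stdin and print how much space de-duplication
# and compression saved in total. Assumes the line format shown above.
s3ql_savings() {
    awk '
        /^Total data size:/   { total = $4 }
        /^After compression:/ { stored = $3 }
        END {
            if (total > 0)
                printf "%.2f MB saved (%.2f%%)\n",
                       total - stored, 100 * (total - stored) / total
        }
    '
}
```

+<p>Used as <tt>s3qlstat /s3ql | s3ql_savings</tt>, the report above
+gives "172.10 MB saved (0.78%)", which is not much, as my backup so
+far consists mostly of data that is already compressed.</p>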
+
+<p>I mentioned earlier that there are several possible suppliers of
+storage. I did not try to locate them all, but am aware of at least
+<a href="https://www.greenqloud.com/">Greenqloud</a>,
+<a href="https://cloud.google.com/storage/">Google Storage</a>,
+<a href="http://aws.amazon.com/s3/">Amazon S3 web services</a>,
+<a href="http://www.rackspace.com/">Rackspace</a> and
+<a href="http://crowncloud.net/">Crowncloud</a>. The latter even
+accepts payment in Bitcoin. Pick one that suits your needs. Some of
+them provide several GiB of free storage, but the pricing models are
+quite different and you will have to figure out what suits you
+best.</p>
+
+<p>While researching this blog post, I had a look at research papers
+and posters discussing the S3QL file system. There are several, which
+told me that the file system is being critically examined by the
+research community, and this increased my confidence in using it. One nice
+poster is titled
+"<a href="http://www.lanl.gov/orgs/adtsc/publications/science_highlights_2013/docs/pg68_69.pdf">An
+Innovative Parallel Cloud Storage System using OpenStack’s Swift Object
+Store and Transformative Parallel I/O Approach</a>" by Hsing-Bung
+Chen, Benjamin McClelland, David Sherrill, Alfred Torrez, Parks Fields
+and Pamela Smith. Please have a look.</p>
+
+<p>Given my problems with different file systems earlier, I decided to
+check out the mounted S3QL file system to see if it would be usable as
+a home directory (in other words, that it provided POSIX semantics when
+it comes to locking and umask handling etc). Running
+<a href="http://people.skolelinux.org/pere/blog/Testing_if_a_file_system_can_be_used_for_home_directories___.html">my
+test code to check file system semantics</a>, I was happy to discover that
+no error was found. So the file system can be used for home
+directories, if one chooses to do so.</p>
+
+<p>If you do not want a locally mounted file system, and want something that
+works without the Linux fuse file system, I would like to mention the
+<a href="http://www.tarsnap.com/">Tarsnap service</a>, which also
+provides locally encrypted backup using a command line client. It has
+a nicer access control system, where one can split out read and write
+access, allowing some systems to write to the backup and others to
+only read from it.</p>
+
+<p>As usual, if you use Bitcoin and want to show your support of my