X-Git-Url: https://pere.pagekite.me/gitweb/homepage.git/blobdiff_plain/31de1a000c6ada9e33245ff2a06f6b774c225147..fed535c75d10975267558fb37bf358abe4be5b16:/blog/tags/raid/index.html diff --git a/blog/tags/raid/index.html b/blog/tags/raid/index.html index 4d38db1feb..a4a612c2a4 100644 --- a/blog/tags/raid/index.html +++ b/blog/tags/raid/index.html @@ -20,6 +20,100 @@

Entries tagged "raid".

+
+
+ Some notes on fault tolerant storage systems +
+
+ 1st November 2017 +
+
+

If you care about how fault tolerant your storage is, you might +find these articles and papers interesting. They have shaped how I +think when designing a storage system.

+ + + +

Several of these research papers are based on data collected from +hundreds of thousands or millions of disks, and their findings are eye +opening. The short story is simply: do not implicitly trust RAID or +redundant storage systems. Details matter. And unfortunately there +are few options on Linux addressing all the identified issues. Both +ZFS and Btrfs are doing a fairly good job, but have legal and +practical issues of their own. I wonder how cluster file systems like +Ceph do in this regard. After all, there is an old saying, you know +you have a distributed system when the crash of a computer you have +never heard of stops you from getting any work done. The same holds +true if fault tolerance does not work.

+ +

Just remember, in the end, it does not matter how redundant, or how +fault tolerant your storage is, if you do not continuously monitor its +status to detect and replace failed disks.

+ +
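As a rough illustration of such monitoring, here is a minimal Python sketch that looks for degraded Linux software RAID (md) arrays in the text of /proc/mdstat. It assumes the usual status form, a member count like [2/1] followed by a flag string like [U_]. The function names and the sample text are made up for this example and are not taken from any tool mentioned in this post. Hardware RAID, ZFS and Btrfs report their status in other ways and are not covered.

```python
import re

# An array line starts with the md device name, e.g. "md0 : active raid1 ...".
ARRAY_RE = re.compile(r"^(md\S+)\s*:")
# The status appears on a following line, e.g. "... [2/1] [U_]".
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")


def degraded_arrays(mdstat_text):
    """Return {array_name: flags} for every array that is missing a member."""
    degraded = {}
    current = None
    for line in mdstat_text.splitlines():
        array = ARRAY_RE.match(line)
        if array:
            current = array.group(1)
            continue
        status = STATUS_RE.search(line)
        if status and current is not None:
            expected = int(status.group(1))
            active = int(status.group(2))
            flags = status.group(3)
            # Fewer active members than expected, or a "_" flag, means degraded.
            if active < expected or "_" in flags:
                degraded[current] = flags
            current = None
    return degraded


def check_host(path="/proc/mdstat"):
    """Read the md status file on this machine and report degraded arrays."""
    with open(path) as mdstat:
        return degraded_arrays(mdstat.read())


# Hypothetical sample, written by hand for this example only.
SAMPLE = """\
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      976630336 blocks super 1.2 [2/2] [UU]

md1 : active raid1 sdb2[1](F) sda2[0]
      1048576 blocks super 1.2 [2/1] [U_]

unused devices: <none>
"""
```

Calling degraded_arrays(SAMPLE) reports md1 with the flags U_, while md0 is left out because both of its members are active. A check like this only helps if something runs it regularly and alerts a human when the result is not empty.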

As usual, if you use Bitcoin and want to show your support of my +activities, please send Bitcoin donations to my address +15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b.

+ +
+
+ + + Tags: english, raid, sysadmin. + + +
+
+
+
How to figure out which RAID disk to replace when it fail @@ -98,6 +192,17 @@ disk(s) is failing when the RAID is running short on disks.

Archive