As some of you may know, I have one big baby called gluttony, now with 8x8TB disks (I am at `/dev/sdj` now!!), totalling 32TB of usable storage.
What some of you won't know is that there's been quite a bit of planning and research to make this work, and in this post I'll be explaining the steps I follow, and how you can do it yourself.
This post is written with knowledge gained from my own experience and from working at an ISP that offered virtual disks of 2TB+ to clients, and will primarily focus on storage; whatever else you run on your NAS is not my problem.
The process I'll be describing here starts from scratch with only money on the table; you can skip ahead if you already made bad choices there.
Also, fair warning: this guide will not go in-depth about setting this all up; it assumes some knowledge of the Unix ecosystem.
1. Chassis and Expansion plan
So, you have money set aside for this Epic Journey of having a NAS, and now you'll have to choose.
What chassis are you going to be using? What expansion options will it offer? Are you going to be using a pre-built NAS? Will you be building one completely from scratch? Will you be using a rack chassis? Or just a desktop chassis?
If you're going the rack-chassis route, I can recommend the Norco RPC-4220/4224; they're 4U ATX rack chassis with 20/24 hot-swap bays which can connect via SAS. I use the RPC-4220!
But here's where you're going to make the first, semi-definite choices.
- Your maximum amount of storage
- Your expansion options
- Your disk type (SAS? SATA? M.2? 2.5″? 3.5″?)
The choices should depend on what you'll be storing and what you'll be using it for.
- Indefinitely storing anime: 4-8TB should be a good start, but be ready to expand
- Indefinitely storing western media: 8-16TB should give you around a year of content
- Storing family photos: 4TB
- I am a Data Hoarder: No amount is enough. The Chassis Hungers for More.
The amount of storage you're planning to use will also eventually determine what size of disks you can use to reach that goal. And the size of disks you'll be using will determine what RAID setups you can comfortably use.
So for anime, one can comfortably go with 2TB-4TB disks in about 2-4 slots. For storing family photos, even 1TB would be fine.
But if you're a data hoarder and going for gold, I recommend 8TB disks: they're the biggest disks that you can comfortably use and are in the general range of 0.02-0.03 €/GB.
Using bigger mechanical disks becomes a problem, because the bigger the disk, the longer the resilvering process will take, the more stress will be put on the disks, and the higher the chance that another disk will fail too.
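To put rough numbers on that: in the worst case a rebuild has to rewrite a whole disk, so the time grows linearly with disk size. Here's a back-of-the-envelope sketch. The 150 MB/s sustained throughput is an assumption of mine, not a measurement, and `rebuild_hours` is just a made-up helper name; a rebuild on an array that's also serving data will usually be slower than this.

```python
def rebuild_hours(disk_tb: float, throughput_mb_s: float = 150.0) -> float:
    """Best-case hours to rewrite one full disk at a steady throughput.

    throughput_mb_s is an assumed sustained rate, not a measured one.
    """
    if throughput_mb_s <= 0:
        raise ValueError("throughput must be positive")
    disk_bytes = disk_tb * 10**12            # disk vendors use decimal TB
    seconds = disk_bytes / (throughput_mb_s * 10**6)
    return seconds / 3600


if __name__ == "__main__":
    for size_tb in (2, 4, 8, 16):
        print(f"{size_tb:>2} TB: ~{rebuild_hours(size_tb):.1f} h")
```

Doubling the disk size doubles the window in which a second failure can hurt you, which is the whole argument for not going bigger than you have to.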
2. Choosing RAID and Filesystem
So now you have a chassis and a bunch of disks selected, what now?
Now it's time to select a RAID strategy and possibly a filesystem.
So, first off, DON'T USE HARDWARE RAID, because if your RAID controller breaks, you need to find the Exact Same RAID controller, which can be a problem, even more so if you use refurbished hardware.
So what options do we have left?
- LVM (mdraid, actually under the hood)
- mdraid
- BTRFS
- ZFS
LVM
I have zero experience with LVM's RAID, so I can't tell you anything about it, and I will not.
mdraid
mdraid is the only option on the list that transparently does RAID like HW RAID would. It's on the block device level and you can use any FS on top of it. Its management tools work; there aren't really any special features.
I haven't had much experience with it, but it's been quite stable as far as I've seen.
ZFS
Ah, Good Ol' ZFS. Well do I have things to tell you about ZFS.
So let's start with the features:
- Snapshotting
- Delta snapshot backups (via `zfs send` and `zfs receive`)
- Subvolumes
- CoW (moooo, Copy-on-Write)
- RAID1/RAIDZ/RAIDZ2
Very good features.
Now let's dissect what this will mean for running your NAS on ZFS.
First there's ARC (the Adaptive Replacement Cache, ZFS's in-memory cache). ARC will eat your memory, worse than a web browser does, BUT it'll make your disks look fast, so it's not too big of a deal, and you can adjust its usage too. It's just something to be aware of.
Then there's how ZFS internally works, if you're planning to expand over time, this is very important.
ZFS works with `zpool`'s and `vdev`'s.
A `zpool` is basically the mount or disk™ you'll be seeing inside your OS and is basically a bunch of `vdev`'s together (kind of a bunch of `vdev`'s in RAID0), while a `vdev` is a set of disks in a particular RAID setup, acting as a single (virtual, yes that's where the `v` comes from) device.
Now this is an important distinction because a `vdev` is immutable (except for replacing disks). This means you're unable to remove disks from a `vdev` (yes, removing a disk is Very Hard, resulting in me calling `zfs` a poison) or add disks to one.
So if you choose ZFS, be aware that you need to add disks a whole set at a time: if you're using RAID1/mirror, you need to add 2 disks at a time, if you're using RAIDZ you need to add 3 disks at a time, etc.
Every time you add a new set of disks to your pool, you'll be adding a new `vdev` in that RAID setup. You can also technically mix and match, but uhh. I don't recommend that.
If you're just throwing a portion of your income at your NAS this can be very frustrating, since you might need to wait a few months before you can add new storage, with the disks you already bought lying around in the meantime.
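To make the vdev arithmetic concrete, here's a minimal sketch that models a pool as a list of vdevs and sums up their usable space. The `Vdev` class, `pool_usable_tb` and the layout names are made up for illustration (this is not a ZFS API), and it ignores metadata overhead and padding, so real numbers will come out a bit lower.

```python
from dataclasses import dataclass

# Disks' worth of redundancy per parity layout; "mirror" is handled separately.
PARITY_DISKS = {"raidz": 1, "raidz2": 2, "raidz3": 3}


@dataclass(frozen=True)
class Vdev:
    layout: str      # "mirror", "raidz", "raidz2" or "raidz3"
    disks: int       # number of disks in this vdev
    disk_tb: float   # size of each disk (assumed identical within a vdev)

    def usable_tb(self) -> float:
        if self.layout == "mirror":
            if self.disks < 2:
                raise ValueError("a mirror needs at least 2 disks")
            return self.disk_tb          # every disk holds the same data
        parity = PARITY_DISKS[self.layout]
        if self.disks <= parity:
            raise ValueError("not enough disks for this layout")
        return (self.disks - parity) * self.disk_tb


def pool_usable_tb(vdevs) -> float:
    """A pool stripes over its vdevs, so usable space simply adds up."""
    return sum(v.usable_tb() for v in vdevs)


if __name__ == "__main__":
    pool = [Vdev("mirror", 2, 8)]
    print(pool_usable_tb(pool))
    pool.append(Vdev("mirror", 2, 8))    # expanding = adding one more whole vdev
    print(pool_usable_tb(pool))
```

The point to take away is the `pool.append(...)` line: in this model the only way to grow is another complete vdev, which is exactly why you end up buying disks in sets.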
If you choose ZFS, make sure you have a FreeBSD (or worst case Solaris) live USB lying around. ZFS on Linux is really stable right now, but some recovery options might not be available yet.
BTRFS
OH BTRFS! People hate it! I don't! I love it :)
So another List Of Features™:
- Snapshotting
- Delta snapshot backups (via `btrfs send` and `btrfs receive`)
- Subvolumes
- CoW (mooooo, Copy-on-Write)
- RAID1 (on steroids!!!)
Very good features.
Now BTRFS doesn't have as long a track record as ZFS does, but it's become quite stable and I haven't heard any new Horror Stories™. I've been running it for a long while now and Enjoy It!
While it technically supports the other RAID levels too, the parity ones (RAID5/6) are still not marked as OK™ and I personally don't recommend them.
I personally feel like BTRFS is perfect for the hobbyist NAS operator; it's very much a Just Throw Disks at It filesystem.
Their RAID1 works a bit differently than RAID1 is defined.
Normal™ RAID1 takes N disks and replicates data N times, however! BTRFS RAID1 takes N disks and replicates data 2 times.
For this reason BTRFS also has `raid1c3` and `raid1c4`, which replicate data 3 and 4 times, respectively.
This particularity means that you can throw 3 disks at it, and it'll make sure it's balanced in such a way there's always 2 copies of data on 2 different disks! There's a calculator which shows you how it works here: web archive - carfax.org.uk
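If you just want a quick estimate for plain RAID1 (2 copies), the usual rule of thumb is: you get half of the total space, unless one disk is bigger than all the others combined, in which case the others are the limit. A minimal sketch of that rule follows; `btrfs_raid1_usable` is a made-up helper and my own simplification, it ignores chunk granularity and metadata, and it does not cover `raid1c3`/`raid1c4`.

```python
def btrfs_raid1_usable(disk_sizes_tb):
    """Approximate usable space (TB) for BTRFS RAID1 (2 copies) on mixed disks."""
    if len(disk_sizes_tb) < 2:
        raise ValueError("RAID1 needs at least 2 disks")
    total = sum(disk_sizes_tb)
    largest = max(disk_sizes_tb)
    # Every block needs its second copy on a *different* disk, so the largest
    # disk can only be filled as far as all the other disks together can
    # hold its copies.
    return min(total / 2, total - largest)


if __name__ == "__main__":
    for disks in ([8] * 8, [4, 4, 4], [8, 2, 2]):
        print(disks, "->", btrfs_raid1_usable(disks), "TB usable")
```

This is why an odd number of disks, or a mix of sizes, is fine here: as long as no single disk dwarfs the rest, nothing goes to waste.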
So that was a whole bunch of -stuff-. Now, what RAID should you choose?
Disk Size | RAID1 | RAID4/5/6 or RAIDZ/2/3 |
---|---|---|
<4TB | ✅ | ✅ |
4TB-8TB | ✅ | ❗ |
>8TB | ❗ | ❗ |
This is eater's RAID Recommendation Table; everything with a ❗ is not recommended by eater™.
As covered in the previous part, the bigger the disk, the higher the chance of failures cascading, which is why I do not recommend parity-based RAID setups on disks of 4TB and up.
It's not that it's not possible, it's just for peace of mind.
The choice of which parity RAID level you're gonna use is totally up to you, just be aware of ZFS's limitations regarding those!
3. End Of All
So, you have your disks, you have chosen RAID and a filesystem. What's next?
Well, you set up Linux or FreeBSD or whatever you wanna run on your NAS and do what you want :D
But here are a few tips & tricks I've learned.
- Put a label with the WWN on your disk caddy or enclosure. This will save you time figuring out which disk is which (you can find the WWN of your disk either by `ls /dev/disk/by-id/wwn*` or `lsblk -o NAME,WWN`). A WWN is a completely unique number for that disk, which means it'll always be the same, everywhere, and if you're lucky it's even printed on the disk itself!
- If you're planning to expand, set a warning at 60% of pool usage, and start planning for expansion (buying disks etc.). Why? At 80% ZFS in particular will -notably- degrade in performance, and if you're early your rebalancing task will be less load on your other disks too. So make sure you have new disks in your pool before hitting 80%!
- Make sure you have `smartd` enabled and a monitoring system for it, so you can respond as soon as possible to a failing disk
- If you're planning to have a Big Tank™ try to divide your disk into at least 2 subvolumes, 1 part that you want to backup, and 1 part that's okay to lose. This will allow you to have a small backup server, but still have backups.
- If your array is failing, don't panic! Ask around via your usual channels for help if you can't figure it out, or ask the mailing lists and IRC, people are willing to help!
- MAKE BACKUPS OF IMPORTANT DATA
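For the 60%/80% tip, a minimal sketch of what such a check could look like is below. The function names are made up, and it uses Python's generic `shutil.disk_usage`, which only sees what the mounted filesystem reports; that's an assumption that works for a rough alarm, but `zpool list` or `btrfs filesystem usage` will give you a more accurate picture of the pool itself.

```python
import shutil

WARN_AT = 0.60   # start planning (and buying disks)
CRIT_AT = 0.80   # ZFS in particular gets notably slower from here on


def usage_level(used_bytes: int, total_bytes: int) -> str:
    """Classify how full a pool is: 'ok', 'warning' or 'critical'."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    ratio = used_bytes / total_bytes
    if ratio >= CRIT_AT:
        return "critical"
    if ratio >= WARN_AT:
        return "warning"
    return "ok"


def check_mount(path: str) -> str:
    """Check a mounted filesystem using the OS-reported usage numbers."""
    usage = shutil.disk_usage(path)
    return usage_level(usage.used, usage.total)


if __name__ == "__main__":
    # "/" is only a placeholder; swap in your pool's mount point.
    print(check_mount("/"))
```

Hook the result into whatever monitoring you already have for `smartd`, so both kinds of alerts end up in the same place.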
Ye uh,, that's about it.
Thanks for reading. I didn't proofread, so you have now done it for me. If you have any open questions don't be afraid to ask! Me, or anyone else really.