Growing existing vdev and no spare SATA ports - can I use external USB dock to resilver new disks?

Martin Maisey · May 22, 2017

Hi,

I'm running a home NAS on an an HP Microserver Gen8. The main pool is formed from 4xWD RED 2TB drives as a stripe of two mirrored vdevs in the drive bays, with the fifth internal SATA port occupied with a Crucial SSD drive for ZIL. The sole PCIe slot is occupied with an HyperX Predator 240GB Gen2 PCIe x4 SSD for L2ARC. 'zpool status' output below. Boot is from an externally attached USB3 SSD.

Everything's currently healthy. However, the pool is getting a little short on space, so I've ordered 2xWD RED 8TB drives with the aim of expanding one of the mirrored vdevs to a mirror of the 8TB drives to give 10TB usable.

I've obviously read https://doc.freenas.org/9.3/freenas_storage.html#replacing-drives-to-grow-a-zfs-pool which describes two ways of doing this. The safest way is obviously to do the replacement with the original 2TB drives online, in order to prevent the array going into a degraded state during the resilver. However, I haven't got any SATA drive bays to do this with.

I do have an external USB3 SATA drive dock, and was wondering whether it would be sensible to put each new 8TB drive into this and do the resilver that way, then swap the drive into the internal bay when complete? This raises two questions:

1/ The documentation page above says to use 'an eSATA port and a hard drive dock'. It doesn't mention USB anywhere, and Googling turns up lots of people saying USB with ZFS on FreeNAS is scary. Unfortunately, the Gen8 doesn't have an eSATA port. Is using USB acceptable for this type of short term operation where I have two good disks on internal SATA ports in the mirrored vdev to fall back to if something goes wrong? Are there risks from doing it this way and if so what are they?

2/ When the resilver operation is complete, the replaced 2TB drive should be offlined automatically, as I understand. If I then power down, swap the newly resilvered disk into the internal bay, and power up, will FreeNAS pick up and start using the new 8TB drive without any problems, despite the fact I've moved it between controllers? Googling, I *think* the answer is yes, but wanted to be sure.

I guess an alternative approach would be to remove the L2ARC from the pool, swap in an eSATA PCIe card and do the resilvering from that, then reverse the process. But that's a lot of mucking around and opening the case that I'd rather not do if I don't have to.

Cheers,

Martin

---

Code:

[root@fileserver] ~# zpool status

  pool: Pool1
state: ONLINE
  scan: scrub repaired 0 in 6h4m with 0 errors on Sun May  7 06:04:47 2017

config:

NAME											STATE	 READ WRITE CKSUM
Pool1										   ONLINE	   0	 0	 0
  mirror-0									  ONLINE	   0	 0	 0
	gptid/910019e1-9859-11e2-9858-a0b3cce00e31  ONLINE	   0	 0	 0
	gptid/916edfff-9859-11e2-9858-a0b3cce00e31  ONLINE	   0	 0	 0
  mirror-1									  ONLINE	   0	 0	 0
	gptid/e4c86e4a-9abf-11e2-984b-a0b3cce00e31  ONLINE	   0	 0	 0
	gptid/e5384f3a-9abf-11e2-984b-a0b3cce00e31  ONLINE	   0	 0	 0
logs
  gptid/f57beb86-2994-11e5-8a07-d0bf9c460284	ONLINE	   0	 0	 0
cache
  gptid/aa1c5dd7-2994-11e5-8a07-d0bf9c460284	ONLINE	   0	 0	 0

errors: No known data errors

  pool: freenas-boot
state: ONLINE
  scan: scrub repaired 0 in 0h0m with 0 errors on Tue May  9 03:45:33 2017
config:

NAME		STATE	 READ WRITE CKSUM
freenas-boot  ONLINE	   0	 0	 0
  da0p2	 ONLINE	   0	 0	 0

errors: No known data errors

SweetAndLow · May 22, 2017

You should probably remove that l2arc and slog device because I doubt they are doing anything and the l2arc is probably making everything slower.

Don't use the usb to resilver. Remove that slog and do the replace. When done just don't put the slog back and remove the l2arc while you are at it.

Hardware specs, freenas version and workload you use this for would be good to know.

Sent from my Nexus 5X using Tapatalk

Ericloewe · May 22, 2017

That said, a good USB adapter should be fine in that situation. Good luck figuring out which ones are good, though.

Arwen · May 22, 2017

I'd go for removing the SLOG, (even if temporary), and using that SATA port for the replacement.

Martin Maisey · May 22, 2017

Hardware specs: it's the G1610T Celeron version of the HP Microserver Gen8. 16GB ECC RAM and 2 x bonded Intel Gigabit Ethernet to the switch. I'm running FreeNAS-9.10.1-U4 (ec9a7d3).

I run a variety of lab VM workloads (depends on what I'm doing at the time) on FreeNAS NFS storage mounted on two Proxmox servers. Plus general home NAS use including storage of files I don't want to lose (e.g. RAWs and HD home video originals) and media/transcoding via the Plex plugin. I have one small VM running locally on the FreeNAS box under VirtualBox, but it doesn't really do any work - it's just providing quorum for the two servers in the Proxmox cluster. I do snapshot replication to another Microserver running ZFS on Linux in the attic for irreplaceable data, but if I screwed up the expansion it would be a pain to have to re-rip my media collection which is too big to replicate.

On slog/L2ARC:

slog is recommended for virtualisation workloads using NFS - https://www.ixsystems.com/blog/o-slog-not-slog-best-configure-zfs-intent-log/ and https://wiki.freebsd.org/ZFSTuningGuide#NFS_tuning , plus personal experience of dog slow performance before I added it.
The PCIe M.2 L2ARC makes read intensive workloads feel *much* snappier on the VMs. As it should: it's capable of delivering about 120k random 4k read IOPS for the hottest 240GB of storage, compared to ~400 IOPS from all of the RED 2TB drives combined!

So all in all, I think both are appropriate and I'd like to keep them long term.

Thanks for input, all. Sounds like there's a consensus that removing the ZIL temporarily and opening the case to use the internal SATA port for the resilver is the way to go. Hopefully the SATA cable will stretch while I balance the new drive on the top of the open chassis during the resilver - it's currently connected to a 2.5" SSD, and if I recall correctly, there's not a lot of give!

As per my question 2/ above, when that's done and I swap the newly silvered drive into the old drive's proper bay, will FreeNAS transparently continue using it, or will I have to do anything special?

Also, for the record, could someone offer a view on what the worst case scenario would be if I did use what ends up being a dodgy USB controller, given there are already two good mirrors on SATA attached disks? I couldn't find anything out there addressing this case - and even for the more general case of using USB storage, while there are a lot of people saying "don't", there doesn't seem to be much information on *why*.

While obviously it would be good to keep a SATA bay spare for expansions like this (or for a hot spare in case of a failure), I think there are a lot of people using platforms like the Microserver where there are only four bays available, and unfortunately that's not really viable. Being able to expand storage safely without opening the case would be useful to many, I imagine.

SweetAndLow · May 22, 2017

You don't have enough memory for the l2arc. Adding the l2arc users memory from your main arc. It might be faster to not have it. I'll let you figure out how to test it and prove things one way or the other. That topic doesn't really help your disk replacement though. You can try the USB thing. I suspect it might be really slow and you will get disk time outs. That's about the worst I can think of. If the resilver fails I'm not sure what status your pool will be in. I bet it will just keep trying to resilver.

Sent from my Nexus 5X using Tapatalk

danb35 · May 22, 2017

Martin Maisey said:
when that's done and I swap the newly silvered drive into the old drive's proper bay, will FreeNAS transparently continue using it

Yes.

Dice · May 22, 2017

SweetAndLow said:
. I bet it will just keep trying to resilver.

would it continue to resilver, say if the USB-idea is discontinued in favor of a traditional +1 sata port replacement?

joeschmuck · May 22, 2017

Martin Maisey said:
Being able to expand storage safely without opening the case would be useful to many, I imagine.

They call that eSATA. It's easy to expand but you need a chassis for the drives and the proper cables to make it work. Nothing difficult at all and it's much faster and looks cleaner that USB drives.

I also vote you discontinue the SLOG and L2ARC, I don't see then doing you any favors. You should expand your RAM if you can.

Stux · May 22, 2017

Using USB *may* work. Awful lot of stress to put on dodgy USB drivers/hardware.

I'd suggest temporarily removing the SLOG and using that 5th port. Even if you leave the case cracked and treat it like an esata ;)

Martin Maisey · May 23, 2017

Thanks again all for further comments. I'm no longer planning to use USB, given the chorus of voices saying don't and that there seems to be some debate about the state the pool might be left in if the resilver didn't work (eek). But thought it was worth understanding why and might help others in future.

@joeschmuck - RAM's already maxed, unfortunately. And, as mentioned earlier, the HP Microserver doesn't have a built-in eSATA port in its Gen8 incarnation (earlier versions did).

The L2ARC had helped significantly with some workloads I was running a while back (as validated by arc_summary.py and some benchmarking at the time), but I'm not running those any more. This discussion has prompted me to look at arc_summary.py again and it looks like the random access workloads I'm running now seem to have a working set that fits within the approximately ~10GB of available ARC, so the hit rate on the L2ARC is way down and it doesn't appear to be doing anything for me anymore. As far as I can see (?), the L2ARC probably isn't doing a huge amount of harm - historical stats have the headers occupying between 200-300MB of RAM, and I think a lot of the dire warnings on oversized ARC might date from when the L2ARC block header was about 400 bytes of RAM as opposed to 70 it appears to be now - but I agree that it does seem sensible to remove it.

That frees up a PCIe slot for an eSATA port multiplier card, which I can use to attach a 4 bay eSATA enclosure for the rebuild. Given I have both of those in the attic not doing anything currently, it feels like that may be the best option. It's a suggested approach from the manual, and it means I don't have to open the case again if I'm doing anything like this in future. But I know a few people above are saying use the internal SATA port: is there a reason for that as opposed to going the eSATA route, or are they both OK?

Regarding slog: I haven't got time to do a huge amount of benchmarking right now, but I've just run a microbenchmark extracting the linux source TAR inside a Proxmox VM with and without slog. From a couple of runs each way, it seems to be reliably ~3.25x faster with slog. That isn't a small difference, and backs up my qualitative experience and the advice in the links I posted above to use an slog when doing virtualisation with NFS. I can't see any reason to get rid of it, other than the potential for losing a few seconds of in-flight writes if the drive fails *and* the server crashes before the writes can be flushed, which I'm happy with - ZFS is meant to keep the filesystems consistent in this event with the pool version I'm running. Am I missing something?

SweetAndLow · May 23, 2017

Martin Maisey said:
Thanks again all for further comments. I'm no longer planning to use USB, given the chorus of voices saying don't and that there seems to be some debate about the state the pool might be left in if the resilver didn't work (eek). But thought it was worth understanding why and might help others in future.

@joeschmuck - RAM's already maxed, unfortunately. And, as mentioned earlier, the HP Microserver doesn't have a built-in eSATA port in its Gen8 incarnation (earlier versions did).

The L2ARC had helped significantly with some workloads I was running a while back (as validated by arc_summary.py and some benchmarking at the time), but I'm not running those any more. This discussion has prompted me to look at arc_summary.py again and it looks like the random access workloads I'm running now seem to have a working set that fits within the approximately ~10GB of available ARC, so the hit rate on the L2ARC is way down and it doesn't appear to be doing anything for me anymore. As far as I can see (?), the L2ARC probably isn't doing a huge amount of harm - historical stats have the headers occupying between 200-300MB of RAM, and I think a lot of the dire warnings on oversized ARC might date from when the L2ARC block header was about 400 bytes of RAM as opposed to 70 it appears to be now - but I agree that it does seem sensible to remove it.

That frees up a PCIe slot for an eSATA port multiplier card, which I can use to attach a 4 bay eSATA enclosure for the rebuild. Given I have both of those in the attic not doing anything currently, it feels like that may be the best option. It's a suggested approach from the manual, and it means I don't have to open the case again if I'm doing anything like this in future. But I know a few people above are saying use the internal SATA port: is there a reason for that as opposed to going the eSATA route, or are they both OK?

Regarding slog: I haven't got time to do a huge amount of benchmarking right now, but I've just run a microbenchmark extracting the linux source TAR inside a Proxmox VM with and without slog. From a couple of runs each way, it seems to be reliably ~3.25x faster with slog. That isn't a small difference, and backs up my qualitative experience and the advice in the links I posted above to use an slog when doing virtualisation with NFS. I can't see any reason to get rid of it, other than the potential for losing a few seconds of in-flight writes if the drive fails *and* the server crashes before the writes can be flushed, which I'm happy with - ZFS is meant to keep the filesystems consistent in this event with the pool version I'm running. Am I missing something?

The slog is fine if you are doing sync writes, which it sounds like you are with NFS + your virtualzation workflow.

You mentioned losing writes if your slog dies. That isn't true. If your slog dies your performance just tanks. The data will always be written to disk from memory. The slog is just there for a non volitile backup incase the one I'm memory disappears like in a power outage. If there is a power loss then the transaction can be executed from the slog.

Sent from my Nexus 5X using Tapatalk

Martin Maisey · May 23, 2017

SweetAndLow said:
The slog is fine if you are doing sync writes, which it sounds like you are with NFS + your virtualzation workflow.

You mentioned losing writes if your slog dies. That isn't true. If your slog dies your performance just tanks. The data will always be written to disk from memory. The slog is just there for a non volitile backup incase the one I'm memory disappears like in a power outage. If there is a power loss then the transaction can be executed from the slog.

Sent from my Nexus 5X using Tapatalk

Yup, that's why I said if the drive fails *and* the server crashes before writes are flushed ;-)

I think I read somewhere that with very old pool versions (<=v15) losing a non-redundant slog could wipe out the pool... wouldn't think many people are running with that nowadays, though.

Stux · May 23, 2017

SweetAndLow said:
The slog is fine if you are doing sync writes, which it sounds like you are with NFS + your virtualzation workflow.

You mentioned losing writes if your slog dies. That isn't true. If your slog dies your performance just tanks. The data will always be written to disk from memory. The slog is just there for a non volitile backup incase the one I'm memory disappears like in a power outage. If there is a power loss then the transaction can be executed from the slog.

Sent from my Nexus 5X using Tapatalk

Which is why enterprise slogs have Power Loss Protection capacitors. The SLOG only gets used upon reboot to write the data which was written to the SLOG to the pool, and that is precisely when the lack of PLP in consumer SSDs can bite you.

You can get the same performance gain by disabling sync writes ;)

People were suggesting using the internal sata port because you didn't have an esata option

Martin Maisey · May 23, 2017

Stux said:
Which is why enterprise slogs have Power Loss Protection capacitors. The SLOG only gets used upon reboot to write the data which was written to the SLOG to the pool, and that is precisely when the lack of PLP in consumer SSDs can bite you.

I'm using an Intel SSD 320 for the slog precisely because it's got power loss protection capacitors ;). Not the fastest drive, but you don't really need it for slog...

Hooked up a non-name eSATA enclosure via a Startech PEXESAT32 PCIe adapter, but for some reason couldn't read the SMART data for the drives, which made me a bit nervous. Not sure if it's FreeBSD drivers, the card or the enclosure at fault, but other people seem to have hit similar problems before. So went with the internal SATA option in the end (which has no problems reading the SMART data) to be on the safe side - turned out there was enough wiggle on the cable :). First drive sitting there resilvering now.

Martin Maisey · May 23, 2017

I was vaguely considering getting one of these to get a few ;) more bays and some more RAM (£338 inc. VAT, with 2xE5260 quad-core Xeon, 48GB ECC RAM, and 12 caddies), but not sure my wife would appreciate it, even if it's in the garage. Or the electricity bill. And I'd have to buy a rack.

Still a bit tempted, mind.

danb35 · May 23, 2017

Martin Maisey said:
And I'd have to buy a rack.

Well, you wouldn't have to buy a rack. It could just as easily go on a shelf. And at least on this side of the pond, used enterprise gear is probably the best bang for your buck, though power requirements are a definite factor.

Robert Trevellyan · May 23, 2017

Martin Maisey said:
a few people above are saying use the internal SATA port: is there a reason for that as opposed to going the eSATA route, or are they both OK?

Aside from what @Stux pointed out, eSATA is decent, if you're careful, but port multipliers are frowned upon. I would argue that for pool expansion or reconfiguration, it's an acceptable risk. For a live pool, not so much.

EDIT: maybe I should clarify. I've personally used eSATA for expansion and reconfiguration several times without issue. You have to be careful because eSATA ports can be a bit wobbly.

Martin Maisey · May 24, 2017

The expansion ran really smoothly and the pool is happy. I'd forgotten, but I'd actually ordered 4TB disks in the end, not 8GB. The latter felt a bit big given that I have mirrored vdevs rather than RAIDZ2 or 3.

The unpaid support on this forum is rather amazing and much appreciated. Much better than most SMB solutions with paid-for support, as you don't have to fight through a layer of scripted first line before you can get in touch with genuine experts!

I did have one further question regarding slog. I read through @danb35's configuration guide linked above, and that pointed me at @cyberjock's presentation. Both fantastic documents - it would really good if they were linked from the official documentation, BTW, as I hadn't seen them before. On page 14 of the presentation, it says "Failure of a ZIL drive (unless using mirrored ZIL drives) will prevent the zpool from mounting on bootup. This is because part of the mounting process checks the ZIL for uncommitted transactions and commits them before mounting the zpool."

In this situation, would there be any way to tell FreeNAS/ZFS to mount the pool anyway (albeit with lost in-flight transactions) - e.g. a force mount etc.? I'm less worried about the availability impact of the pool not being automatically being mounted on boot, but if using an unmirrored ZIL means introduced a SPoF that could result in the committed data in the pool becoming inaccessible, that would be a worry.

danb35 · May 24, 2017

Martin Maisey said:
read through @danb35's configuration guide linked above

That isn't my guide; it was written by @UncleFester. I just wiki-fied it and made a few edits.

Martin Maisey said:
On page 14 of the presentation, it says "Failure of a ZIL drive (unless using mirrored ZIL drives) will prevent the zpool from mounting on bootup.

I'm 99% sure this isn't the case any more.

Important Announcement for the TrueNAS Community.

Growing existing vdev and no spare SATA ports - can I use external USB dock to resilver new disks?

Dabbler

Sweet'NASty

Server Wrangler

MVP

Dabbler

Sweet'NASty

Hall of Famer

Wizard

Old Man

MVP

Dabbler

Sweet'NASty

Dabbler

MVP

Dabbler

Dabbler

Hall of Famer

Pony Wrangler

Dabbler

Hall of Famer

Important Announcement for the TrueNAS Community.

Related topics on forums.truenas.com for thread: "Growing existing vdev and no spare SATA ports - can I use external USB dock to resilver new disks?"

Similar threads