Recommended 3TB+ drives for raidz1/2

Status
Not open for further replies.

nattan

Explorer
Joined
May 19, 2013
Messages
57
I am looking for some serious storage and have come across this:
http://www.seagate.com/files/www-co...dd/en-us/docs/archive-hdd-dS1834-3-1411us.pdf

It currently comes out to around $0.033 per GB with a quick sales promo, so I would need to decide fairly soon.

The plan would be to replace my current 3x3TB RAID-Z with a 6x8TB RAID-Z2. Granted, the price per GB is very nice, but would you guys go with this drive, or would you spend more and get server-grade drives? I wouldn't say the data is very critical; it can always be replaced/backed up (just time involved).
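For anyone wanting to sanity-check the numbers in this post, here is a rough sketch of the arithmetic. It assumes decimal units (1 TB = 1000 GB) and ignores ZFS metadata, padding, and free-space headroom, so real usable space will be lower. The function names are made up for illustration.

```python
# Rough numbers only: decimal units, no ZFS overhead accounted for.

def drive_cost(price_per_gb: float, size_tb: float) -> float:
    """Cost of one drive at a given price per GB."""
    return price_per_gb * size_tb * 1000

def raidz_usable_tb(disks: int, size_tb: float, parity: int) -> float:
    """Idealized usable space of one RAID-Z vdev: data disks times disk size."""
    if disks <= parity:
        raise ValueError("need more disks than the parity level")
    return (disks - parity) * size_tb

per_drive = drive_cost(0.033, 8)  # quoted promo price, 8 TB drive
print(f"per drive: ${per_drive:.0f}, six drives: ${6 * per_drive:.0f}")
print("current 3x3TB RAID-Z1:", raidz_usable_tb(3, 3, 1), "TB")
print("planned 6x8TB RAID-Z2:", raidz_usable_tb(6, 8, 2), "TB")
```

Under those assumptions the move is from about 6 TB to about 32 TB of idealized usable space.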
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
The problem is that those drives are on the slow side (which is saying a lot, by HDD standards) and don't support TLER (and might have overzealous power saving mechanisms that might cause trouble).
 

Jailer

Not strong, but bad
Joined
Sep 12, 2014
Messages
4,977
Newegg is having a pretty good sale today and tomorrow only for a 3 pack of WD Red 3TB for $299.
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
How slow is slow? Less than 50MB/s? Just read a long reddit post on it, probably going to pass on it now.

http://www.reddit.com/r/DataHoarder/comments/2vxboz/seagate_archive_hdd_v2_8tb_st8000as0002_write/

YES, less than 50MB/s write speed. I have an 8TB Seagate Archive used as a single-disk pool for backups in
FreeNAS 9.3. Massive writes, like my 1TB backups, ran between 30MB/s and 60MB/s. That said, read
performance was about 150MB/s.
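To put those rates in perspective, here is a small sketch of how long a 1TB transfer takes at each speed. It assumes a constant sustained rate and decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes), which real backups will not match exactly.

```python
def transfer_hours(size_bytes: float, rate_mb_per_s: float) -> float:
    """Hours needed to move size_bytes at a constant rate in MB/s."""
    return size_bytes / (rate_mb_per_s * 1e6) / 3600

ONE_TB = 1e12
for rate in (30, 60, 150):
    print(f"{rate} MB/s: {transfer_hours(ONE_TB, rate):.1f} h")
```

So a 1TB backup lands somewhere around 4.6 to 9.3 hours at the write speeds quoted, versus under 2 hours at the read speed.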

These drives have a small (I think 20GB), faster, non-shingled region of the disk that acts as a write cache. If your writes are smaller
than that, AND you allow enough time between writes for the Archive drive to flush it, then write
performance would not be as bad. Basically, it's a specialized case.

But please note the lack of the TLER (Time-Limited Error Recovery) feature. Not having TLER is usually bad for
RAID.

Of course, using multiple Seagate Archives in a RAID-Zx increases the write cache if ZFS spreads the writes across all
the disks. Meaning a RAID-Z1 with 3 disks would have an apparent write cache of 40GB (3 x 20GB, minus 1 disk for the
parity).
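The estimate above can be sketched as follows. Note that the 20GB per-disk figure is only the guess made in this post, and the calculation assumes ZFS spreads writes evenly across the data disks; neither is confirmed.

```python
def apparent_write_cache_gb(disks: int, parity: int,
                            cache_per_disk_gb: float = 20) -> float:
    """Combined non-shingled cache of the data disks in one RAID-Z vdev.

    cache_per_disk_gb defaults to the poster's guess of 20 GB.
    """
    if disks <= parity:
        raise ValueError("need more disks than the parity level")
    return (disks - parity) * cache_per_disk_gb

print(apparent_write_cache_gb(3, 1))  # the 3-disk RAID-Z1 example
print(apparent_write_cache_gb(6, 2))  # the 6-disk RAID-Z2 from the first post
```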
 

mattbbpl

Patron
Joined
May 30, 2015
Messages
237
The problem is that those drives are on the slow side (which is saying a lot, by HDD standards) and don't support TLER (and might have overzealous power saving mechanisms that might cause trouble).

How important is TLER in a Raidz2/3 setup? Normally the experts on this site (such as you and Cyberjock) are in approximate agreement on the critical FreeNAS success factors, but it seems to me that you are a staunch supporter of TLER as a critical factor while Cyberjock views it more-or-less a "nice to have".

I'm not looking to drag you and Cyber into a debate here, but as the strong supporter of TLER with your drives I'd like your opinion as to why you deem it to be so important. Does the lack of it put your VDev at risk of corruption (even if only in theory and not actually seen in the wild)? Is it primarily a performance issue, causing a pause in the drive reading while the drive attempts to recover the sector before grabbing the data from parity?

If the concern is the former, then I'd rather steer clear of TLER-less drives. If the concern is the latter, then the benefits of these drives (and other consumer drives) may outweigh the performance risks in a home read-heavy environment (at least in my estimation).
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
TLER doesn't affect data safety (if your pool is in bad enough shape to really need the sector that the drive is trying to read, you probably won't get far anyway). It's just about performance and associated factors in the presence of an unreadable sector.
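If you want to see whether a given drive exposes TLER at all, smartmontools can query it: the feature is reported as SCT Error Recovery Control (`scterc`), with timeouts given in tenths of a second. Below is a minimal sketch that only builds the command line; the device path `/dev/ada0` is a placeholder, and drives without the feature will simply report that it is unsupported.

```python
def scterc_command(device, read_ds=None, write_ds=None):
    """Build a smartctl command to query (or set) SCT Error Recovery Control.

    read_ds / write_ds are timeouts in deciseconds (70 means 7.0 seconds).
    With no timeouts given, the command only reports the current setting.
    """
    arg = "scterc"
    if read_ds is not None and write_ds is not None:
        arg += f",{read_ds},{write_ds}"
    return ["smartctl", "-l", arg, device]

# Placeholder device name; substitute your own disk.
print(" ".join(scterc_command("/dev/ada0")))          # query
print(" ".join(scterc_command("/dev/ada0", 70, 70)))  # set 7 s read/write
```

The resulting list can be handed to `subprocess.run` as root on a box with smartmontools installed.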
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
Does anyone have experience with a non-TLER disk performing its long error recovery with ZFS?

I foresee 3 possibilities:
  1. Nothing, because ZFS either waits long enough or uses redundancy to get another copy.
  2. Declares a read error shown with "zpool status" (then uses redundancy to get another copy).
  3. Declares a read error shown with "zpool status", gets a redundant copy and repairs the "bad" block.
I have personally seen ZFS do both 2 & 3. I don't know why it did not do number 3 in all cases...
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
ZFS will always correct errors, if redundancy still exists. Whatever else happens is up to the disk - the sector may be reallocated or the error may be deemed a fluke and the sector won't be reallocated. Nothing ZFS can do at the disk level, since LBA abstracts out the inner workings of the drive firmware - allowing for such "monstrosities" as SSDs or shingled recording drives.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
How important is TLER in a Raidz2/3 setup? Normally the experts on this site (such as you and Cyberjock) are in approximate agreement on the critical FreeNAS success factors, but it seems to me that you are a staunch supporter of TLER as a critical factor while Cyberjock views it more-or-less a "nice to have".

It's possible for ZFS to drop a disk that's being particularly shitty about reading (imagine sector after sector of 30+ second delays). TLER doesn't necessarily guarantee that won't happen, but it makes it less likely.
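A back-of-the-envelope sketch of that scenario: it assumes the bad sectors are read one after another and that each one runs into the drive's full recovery timeout. The 30-second figure comes from the example above; the 7-second figure is an assumption based on the timeout commonly configured on TLER-capable NAS drives. The sector count is made up.

```python
def worst_case_stall_s(bad_sectors: int, recovery_s_per_sector: float) -> float:
    """Total time spent waiting if every bad sector hits the full timeout."""
    return bad_sectors * recovery_s_per_sector

bad_sectors = 20  # hypothetical count, for illustration only
for label, per_sector in (("no TLER, 30 s each", 30.0), ("TLER, 7 s each", 7.0)):
    stall = worst_case_stall_s(bad_sectors, per_sector)
    print(f"{label}: {stall / 60:.1f} minutes of stalled reads")
```

Either way the pool stalls; TLER just bounds how long each stall lasts, which is what makes timeouts and dropped disks less likely.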

For a filer, it basically comes down to whether or not it's acceptable for protocol processing to stall and what happens if a disk drops out of the array.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
For home use I consider TLER to be relatively unimportant. The cost and the fact that your data is safe (the system may just not be particularly usable for you until you offline the failed disk) are the primary considerations.

For work, TLER is important *if* you need high uptime and cannot tolerate a failing disk potentially slowing your pool to a crawl (which may take services such as CIFS, NFS, or iSCSI offline).

I've worked with enough systems to see the worst case scenario of not having TLER.

I've seen my system become as useful as a brick when a disk started failing. As soon as I offlined the failing disk the system immediately returned to normal. So I had to take a few minutes to offline the failed disk so the server could stream a movie. Not worth paying $100+ per disk for TLER in my opinion.

I also saw someone with about 50 VMs on 4 ESXi servers over iSCSI start hiccuping really badly because a single disk in his 30+ disk zpool started having problems (he had no TLER). His production came to a screeching halt because the storage was too slow. Offlining the failing disk immediately restored the zpool to regular performance.

I have also seen users with TLER and lots of VMs experience the same thing as the guy in the above paragraph. It was still devastating to his production, and he still had to do something about it, and immediately. Just like the guy in the above paragraph, offlining the failing disk immediately restored regular performance.

The catch is that the last guy had more legroom in terms of being able to function. He was still up, but things were incredibly slow. Unusably slow, but nothing went offline. The other guy had one ESXi host detach from the iSCSI disk, which kicked all of its VMs offline.

So yes, there is value in having TLER. But there is also a cost with having TLER. So the real questions are:

1. How long will it take you to offline a failed disk if a failing disk were to suddenly affect your zpool catastrophically?
2. How much money is potentially going to be lost if your iSCSI disks, CIFS shares, or NFS shares end up getting kicked offline or timing out?
3. How drastic are the long-term consequences internally (aka, will your boss be furious if their email even sneezes)?

These are tradeoffs that you have to make. Generally, anyone here to build a FreeNAS server has already determined that the consequences of going offline aren't significant. If they were, they wouldn't be here in the forums; they'd be calling iXsystems for a TrueNAS High-Availability system and a 24x7 support contract, so that if something sneezes they can call a phone number and let a level 2 support technician figure out what is going on on a Saturday night. So with that knowledge and wisdom I think it's safe to say that TLER isn't a primary concern for the vast majority of users around here. There are exceptions, no doubt. But it's definitely not the norm.
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
Does anyone have experience with a non-TLER disk performing its long error recovery with ZFS?
I have anecdotal evidence of a non-TLER disk that was having issues causing VirtualBox VMs to be aborted.

EDIT: I think "circumstantial" would be more apt than "anecdotal". I described it here.
 

mattbbpl

Patron
Joined
May 30, 2015
Messages
237
Wow, thanks guys. Between all the responses here, I now have a clear picture of the risks of going without TLER and can make an informed decision on whether it's important for my case (it's not - I'm concerned with data integrity, and downtime is little more than an inconvenience). Hopefully having this information laid out in a short space will help some others make the same decision as well.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Usually what it boils down to is that if you expect enterprise-style NAS behaviour (the thing doesn't just stall for random periods of time), then TLER == highly desirable feature. This is important anywhere you might have protocol freakouts (iSCSI, NFS, CIFS all time out) and where the client might not recover gracefully.

TLER is generally not an issue on archival pools, so there you get the biggest cheapest consumer grade SATA disks you can and then you go RAIDZ3 with a warm spare on that, and that works very well.
 