New Multi-Actuator Hard Drives from Seagate

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
As soon as a home user can buy a set of these drives we will see complaints about how difficult it is to configure TrueNAS to support these new drives. TrueNAS is broken and a bad product because it does exactly what it is programmed to do and doesn't adequately prevent footgun moments despite people having to load, aim, and pull the trigger.
Edited for added cynicism. ;)

As someone who hates the stealth-SMR trend enough to have called it five years earlier, I'll be bookmarking your post to refer to it in another five years or so when the vendors quietly slip "DM-MA" drives into the consumer channels.
 
Last edited:

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
One question I have is that the data sheet states "Interface Ports=Single".

SATA disks only have 1 port/path. But SAS disks have 2 ports/paths on the same connector. This is to allow disk arrays to have 2 controllers for redundancy, (same as Fibre Channel has). So, does this mean it's the normal SAS "single connector with dual ports/paths"? Or not?

If it is the stock SAS, then these might be good disks for disk arrays. Say you have 10 disks, and 2 controllers. Each controller gets 1 half of each disk. After RAIDing, it exposes the result on the SAN to clients. Each controller gets "full" speed of its half of the disk. Plus, its own dedicated communication channel to each disk.


As for LUNs, SCSI has supported LUNs from the very early days. Some hardware RAID controllers used to export their LUNs as SCSI LUNs. In the original SCSI standard, LUNs were limited to 8 per target, (3 encoding bits):

Wikipedia - SCSI CDB - Command Descriptor Block


This does help with the "badblocks" issue of how to test larger disks without wasting weeks or months. These would test in half the time of a similarly sized disk, (with the same RPM).
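As a rough sketch of how that could work in practice (assuming both actuators show up as separate da devices - the device names below are hypothetical - and that badblocks from e2fsprogs is installed), you could just burn in both halves at the same time:

Code:
# Sketch only: burn in both logical units of one dual-actuator drive in
# parallel. Device names are hypothetical, and -w is destructive, so never
# run this against a disk that holds data.
import subprocess

DEVICES = ["/dev/da1", "/dev/da2"]  # the two LUs of a single physical drive

procs = [subprocess.Popen(["badblocks", "-b", "4096", "-ws", dev])
         for dev in DEVICES]
for proc in procs:
    proc.wait()

print("exit codes:", [p.returncode for p in procs])

Each pass only has to sweep half the platters, so wall-clock time should land at roughly half that of a single-actuator drive of the same capacity.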
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
One question I have is that the data sheet states "Interface Ports=Single".

SATA disks only have 1 port/path. But SAS disks have 2 ports/paths on the same connector. This is to allow disk arrays to have 2 controllers for redundancy, (same as Fibre Channel has). So, does this mean it's the normal SAS "single connector with dual ports/paths"? Or not?
Single path SAS. The secondary SAS port is used internally for communication between the two drive SoCs in this model.

If it is the stock SAS, then these might be good disks for disk arrays. Say you have 10 disks, and 2 controllers. Each controller gets 1 half of each disk. After RAIDing, it exposes the result on the SAN to clients. Each controller gets "full" speed of its half of the disk. Plus, its own dedicated communication channel to each disk.
Unfortunately not for these. But even if they were, you'd have to avoid locating two halves of a mirror on the same unit. The middleware could be written to detect the presence of multiple LUs on a single physical device and work around it, but Seagate would need to provide some sample drives (or simulated ones via firmware) for validation.

Edit: Given the per-unit pricing I'm seeing through channel on these (north of USD$600) I don't anticipate any home users jumping in the pool too soon. Scratch that, looks like it was a placeholder item. Seemed awful expensive over the regular Exos X14 - pretty sure the "2X14" model number shouldn't mean "double the price."
 
Last edited:

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Given the per-unit pricing I'm seeing through channel on these (north of USD$600) I don't anticipate any home users jumping in the pool too soon.
Where did you find them? I was looking and couldn't find them for sale anywhere.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Where did you find them? I was looking and couldn't find them for sale anywhere.
Scratch that, looks like it was a placeholder item. Seemed awful expensive over the regular Exos X14 - pretty sure the "2X14" model number shouldn't mean "double the price."
 
Joined
Jul 2, 2019
Messages
648
you effectively fit two hard drives in the space that one hard drive used and it only takes a little more power than one regular drive.
I think everyone sees that the problem is the shared drive enclosure (e.g., helium leak), shared motor, etc. That has to be balanced against the ability to get more drives - AND more potential throughput - in the same space as "traditional" drives.

I think that time will tell with this configuration. That said, I can't recall the last time that I had a spindle motor fail but I don't have that many drives :smile:
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
I think that time will tell with this configuration. That said, I can't recall the last time that I had a spindle motor fail but I don't have that many drives
At work, I have some servers with about 300 drives running, and the last time I had a total failure was in 2017. That was a WD Red (maybe Pro, I don't remember) 4TB that overheated so badly it also caused the two drives adjacent to it in the server to fail. Craziest drive failure I ever saw. I wasn't standing there watching, but I did hunt through the logs and there were temperature alerts on that drive. As I recall, it got over 140°C. Most drive faults I have experienced in the last five years have been bad sectors or some relatively minor fault where replacing the drive was a preventative measure against a potential future problem that I didn't want to worry about.
Similar results in my home network, but on a smaller scale.

For work, I see these drives as a way to get more IO out of the equipment I already have because trying to cram more drives in the limited space I have to work with is a problem. Realistically, I doubt I will be able to get these drives for a year, not at a quantity that I can actually use operationally.
I am still hopeful that I might get a set of sample drives from Seagate.
 
Last edited:

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
So in terms of handling the failure domain story, I'm thinking that to get the most out of these drives, one would need to pair two of them in crossed mirrors, using two different backplane paths as well (one for each drive) so as not to make things worse.

So one disk (presenting da1 & da2 to the OS) attached to backplane 1:

Code:
 MIRROR-A
   da1
 MIRROR-B
   da2


A second disk (presenting da3 and da4 to the OS) attached to backplane 2:

Code:
 MIRROR-A
   da3
 MIRROR-B
   da4


This resulting in an overall pool looking like:

Code:
ZPOOL1
 MIRROR-A
   da1
   da3
 MIRROR-B
   da2
   da4


With this setup, you can have backplane 1 or backplane 2 fail (not both) and still have a working pool of two degraded mirrors.

You can also have one "drive" (both of its actuators) fail and still have a working pool of two degraded mirrors.

I guess if you continue like that, you could also come up with a plan to spread disks across backplanes and make a 2-vdev RAIDZ2 pool with evens in one and odds in the other (likewise, you could keep adding mirrors with the same alternating odds-and-evens pattern).
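To sketch that RAIDZ2 idea (continuing the da numbering from above - four dual-actuator drives, two per backplane, each drive presenting one odd/even pair like da1/da2), the layout would look something like:

Code:
ZPOOL2
 RAIDZ2-A
   da1
   da3
   da5
   da7
 RAIDZ2-B
   da2
   da4
   da6
   da8

Losing one physical drive costs each RAIDZ2 vdev one member; losing a whole backplane costs each vdev two, which a 4-wide RAIDZ2 can still (barely) survive.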

I'm not sure that this would be easy to keep compliant, but at least there should be a way to do it.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Basically, Seagate has released hyper-threaded drives. Now we need to code the software to schedule them.

I'm not sure that this would be easy to keep compliant, but at least there should be a way to do it.

Similarly to how we have physical/logical cores now, the middleware stacks for storage appliances like TrueNAS will need to be aware of these drives and able to handle them correctly. E.g.: da0 and da1 are detected as multiple LUs that resolve to the same target/serial/WWN - mark them as being in a shared failure domain and design pools accordingly if possible (at large scale, you could automatically create a set of two vdevs, each at half-size, for any given set of drives). If a user tries to implement a pool that will compromise redundancy (co-locating mirror vdevs on the same physical device, attempting to create RAIDZn with insufficient physical devices), alert and offer an override via checkbox/command line. OpenZFS itself could certainly write some detection logic in the zpool create command as well, and the middleware would then just need to interpret it.
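A minimal sketch of that detection idea (not actual TrueNAS middleware code - the inventory dict below is a hypothetical stand-in for whatever serial/WWN data the middleware already collects per disk):

Code:
# Sketch: group da devices by the physical drive they belong to, then warn
# when a proposed vdev puts two LUs of the same drive together.
from collections import defaultdict

# Hypothetical inventory: device name -> (drive serial/WWN, LUN)
INVENTORY = {
    "da0": ("ZLW0001", 0),
    "da1": ("ZLW0001", 1),  # second actuator of the same physical drive
    "da2": ("ZLW0002", 0),
    "da3": ("ZLW0002", 1),
}

def shared_failure_domains(inventory):
    """Map each physical drive serial to the set of devices it backs."""
    domains = defaultdict(set)
    for dev, (serial, _lun) in inventory.items():
        domains[serial].add(dev)
    return domains

def colocated(vdev_members, inventory):
    """Return serials that appear more than once in a proposed vdev."""
    counts = defaultdict(int)
    for dev in vdev_members:
        counts[inventory[dev][0]] += 1
    return [serial for serial, n in counts.items() if n > 1]

print("Failure domains:", dict(shared_failure_domains(INVENTORY)))

proposed_mirror = ["da0", "da1"]  # both halves of ZLW0001 - redundancy is a lie
offenders = colocated(proposed_mirror, INVENTORY)
if offenders:
    print("WARNING: mirror members share a physical drive:", offenders)

The same grouping step would also cover the "RAIDZn with insufficient physical devices" case; only the final check changes.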

For anyone building a pool or other software/mdraid style setup with MA drives though, they'll need to be aware and design around it themselves or else face the music later when a single physical drive failure takes out a "mirrored" vdev.

I'm far more concerned about the implications of if/when these drives start making their way onto SATA plugs, where the concept of LUs doesn't exist. The controller could simply do an internal stripe of the LBAs across both sets of platters (odd-numbered LBAs on spindle1, even-numbered on spindle2) and hope for some performance benefit there.

Q: Is a single-volume presented drive in future scope (i.e., the drive itself will load balance/optimize between either the first or second half of the drive)?
A: Maybe; it is possible. There are tail-latency issues with a configuration like this, but it is the simplest way to get plug-and-play performance. There needs to be enough market to justify the firmware development/complexity.
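As a toy model of that hypothetical interleave (purely illustrative - not based on anything Seagate has actually published about the firmware):

Code:
# Toy model of the hypothetical SATA-style interleave: odd-numbered LBAs on
# spindle 1, even-numbered LBAs on spindle 2. A long sequential run then
# alternates spindles on every block, keeping both actuators busy.
def spindle_for_lba(lba: int) -> int:
    return 1 if lba % 2 else 2

print([spindle_for_lba(lba) for lba in range(8)])  # [2, 1, 2, 1, 2, 1, 2, 1]

Presumably the tail-latency concern in the Q&A above comes from random I/O, where any single request can only ever be serviced by whichever actuator owns that LBA.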

I'm reminded of the WD Black2 drive, which mashed 120GB of MLC NAND and a 1TB HDD together into a single contiguous LBA range. The first 120GB worth of LBAs mapped to the NAND; everything after that was on the HDD. Formatting it as a single large partition would certainly result in "unexpected performance characteristics", so it needed manual setup. If you did that, though, you got the same idea - a small SSD for your operating system and a larger HDD for capacity, in a single 2.5" footprint. Good idea for laptops, but the price point made it unappealing. Even now I can only find the drives for about US$70 or so, and that's just silly when I could buy a 500GB 860 EVO for US$55.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Reading the tech paper, there are definitely some other issues at hand with specific SCSI commands crossing LU boundaries due to being device-wide.

SAS LUN behavior leads to ambiguity where some commands affect the individual LUN and others affect the device (both LUNs). It’s important for users of the drive to note these differences to successfully deploy the drive. High priority commands (HPC) such as Read and Write are LUN-based. Low priority commands (LPC) are a mix of LUN and device-based. Some examples of the more impactful device-based commands are noted in the table below:
Command                | LUN/Device | Details
Test Unit Ready (0x00) | Device     | Command will only report ready if both LUNs are ready
Power Modes            | Device     | Idle A, B, C, and Standby modes are all device-based
Format Unit (0x04)     | Device     | Format to either LUN initiates the data loss format of both LUNs
Flush Cache            | Device     | Cache is shared, so this command will affect both LUNs
Start/Stop Unit (0x1B) | Device     | Start/Stop Unit affects the single motor in the drive
Sanitize               | Device     | Sanitize sent to either LUN sanitizes the entire device

You can query the device vs. LUN effect of each command on the drive by issuing the REPORT SUPPORTED OPERATION CODES command and noting the multiple logical units (MLU) field for each command. Seagate is actively working on the inclusion of these nuances into a T10 proposal to standardize the usage of the MLU field on multi-actuator drives. You can access the T10 proposal here: http://www.t10.org/cgi-bin/ac.pl?t=d&f=18-102r1.pdf

I don't expect low-level formats or sanitize ops to be issued frequently, but "flush cache" is kind of important to ZFS and that could make things work Not So Very Well.
 

AlexGG

Contributor
Joined
Dec 13, 2018
Messages
171
"flush cache" is kind of important to ZFS and that could make things work Not So Very Well.

Maybe it will make things Not So Very Fast, but it does not violate the original guarantee. The guarantee states that when the Flush Cache command is issued, the cache is flushed. It does not say anything about the drive flushing its cache at some other time on its own initiative. The MA drive flushes the LUN-specific cache on command, but also at some other times, which by sheer coincidence match the times the flush command was issued against the other LUN.
 
Joined
Jul 2, 2019
Messages
648
OpenZFS itself could certainly write some detection logic
There is a very good point here - I don't think it would be iXsystems who would be responsible but Oracle?
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Maybe it will make things Not So Very Fast, but it does not violate the original guarantee. The guarantee states that when the Flush Cache command is issued, the cache is flushed. It does not say anything about the drive flushing its cache at some other time on its own initiative. The MA drive flushes the LUN-specific cache on command, but also at some other times, which by sheer coincidence match the times the flush command was issued against the other LUN.
It doesn't violate the data assurance guarantee, no, but it could definitely be responsible for all kinds of ugly performance issues. It's true that if "both halves" of a disk are in the same pool (a 2x2 mirror equivalent) they'll likely be receiving requests to commit their cached data to disk at the same time, but the potential for interference does exist. I'd much prefer that the command could be sent at the LU level, but that would likely also require the cache to be logically split in the same manner.

Like I said: there's coding to be done. Although this seems like a much more reasonable goal than "make SMR not hot garbage."
 

ChrisRJ

Wizard
Joined
Oct 23, 2020
Messages
1,919
I am inclined to believe this kind of drive came into being because at least one of the big hyperscalers asked for it. It seems to be made for a very specific combination of requirements, and at scale it can make sense to have something like this. For the rest of us, given the current price trajectory of SSDs (and "adjacent" devices) and how you can combine them with regular spinning drives for performance, I don't see how these special drives will ever be relevant.
 

robbiek01

Cadet
Joined
Oct 23, 2022
Messages
2
I bought 8x Exos 2X14. I have 6 of them in my TrueNAS server in 2-drive mirrors, and I can confirm that TrueNAS-13.0-U2 only detects one actuator and reports 6.37 TiB of drive space.

I purchased them off a fellow TrueNAS user where I live who works for a hyperscaler. After purchasing 2 HBAs that could not get the drives to spin up, I purchased a Supermicro LSI 9300-8i HBA because that's what was recommended by Seagate; it recognizes the drives, but only half the storage.

I'm gutted, to say the least - the old adage "If it seems too good to be true, it usually is" applies. I spent 1220 euros on 8 drives and an HBA. I should have stuck with SATA drives, and will in the future; my SAS days are behind me.

I have 4 SATA bays left in my server. I will populate them with SATA and decommission the SAS drives for sale over on eBay.
 

rvassar

Guru
Joined
May 2, 2018
Messages
972
I bought 8x Exos 2X14. I have 6 of them in my TrueNAS server in 2-drive mirrors, and I can confirm that TrueNAS-13.0-U2 only detects one actuator and reports 6.37 TiB of drive space.

I purchased them off a fellow TrueNAS user where I live who works for a hyperscaler. After purchasing 2 HBAs that could not get the drives to spin up, I purchased a Supermicro LSI 9300-8i HBA because that's what was recommended by Seagate; it recognizes the drives, but only half the storage.

I'm gutted, to say the least - the old adage "If it seems too good to be true, it usually is" applies. I spent 1220 euros on 8 drives and an HBA. I should have stuck with SATA drives, and will in the future; my SAS days are behind me.

I have 4 SATA bays left in my server. I will populate them with SATA and decommission the SAS drives for sale over on eBay.

I wouldn't give up on SAS. The dual actuator drives are the problem here. If you had single actuator drives you'd almost certainly be content.

Consider: SAS has an entire next generation coming out, 24G SAS-4. There are no plans at present to extend SATA any further; it's a dead end. This makes your SAS config somewhat more future-proof. Additionally, all SAS controllers can communicate with SATA devices. You can plug SATA drives into your SAS controller and, as long as you keep to the 1 meter cable restriction, they'll work just fine. SATA controllers do not talk to SAS devices, of course. SAS also allows longer cable runs, expanders permit hundreds of devices to be connected, and there are other useful types of SAS devices, like LTO tape drives, etc.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
I bought 8x Exos 2X14. I have 6 of them in my TrueNAS server in 2-drive mirrors, and I can confirm that TrueNAS-13.0-U2 only detects one actuator and reports 6.37 TiB of drive space.
I wouldn't mind seeing smartctl and dmesg output here. It might be a case of the middleware only presenting the first LU.

Is there a potential to try this drive under a different OS (TN SCALE perhaps, or a live Linux distro) in case it's a matter of a driver/kernel needing to speak to two LUs behind one SAS device?

If not, then there might be enterprising/crazy individuals here who'd be interested in picking one up to experiment with themselves.
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
Without doing any research, I'd guess that if these dual-actuator drives are SAS, then they need a tweak in the SAS controller. Sometimes a SAS controller is configured to scan only LUN 0 of each target, since it generally adds a noticeable amount of time to scan all the LUNs. But the SAS configuration could have an option to scan LUN 0 -> LUN X, with X being configurable. Thus, change X from 0 to 1.

That could expose the second half of the disk.

Just keep in mind HOW you configure the pool. If using Mirrors, obviously don't use both halves in the same Mirror. And in the case of RAID-Z1, either keep the halves in separate vDevs or separate pools, because loss of a whole disk would take out 2 sub-disks.
 