New Build, new drives have over a BILLION bad blocks

DGenerateKane · Jul 17, 2017

I just got a used rackmount server off ebay last week (specs in sig), and I'm currently testing the first four of the eight drives I'm putting in it. The last four only just arrived today, otherwise they'd be in it getting tested as well. I followed Fester's Guide (https://www.familybrown.org/dokuwiki/doku.php?id=fester:hvalid_hdd) for hard drive tests step by step, and I'm a bit alarmed at the number of bad blocks on three of the drives during the destructive bad block test. I'm still running the tests so I can't post any results yet, just what I can see in the tmux sessions.

ada0 is the only one I'm not concerned with yet, as it doesn't have any bad blocks. Yet.

ada1 I thought was fine as it finished the bad blocks test itself, and was at the "reading and comparing" stage of the test. The next time I looked it was just outputting a ton of numbers in seemingly sequential order (too fast for me to tell for sure), which I had already determined from looking at ada3 and ada4 meant it was listing bad blocks, dozens per second. ada3 and ada4 already did that, and eventually aborted due to too many errors. I restarted the tests on both ada3 and ada4 and so far they are error free, and are much further along in the test than they got before spewing bad blocks everywhere. ada1 just aborted its first attempt at a badblocks test. I'm not sure if I should retry the test, or wait for the other three to finish theirs before trying something else, like checking the SMART data. At the rate they are going, and assuming they don't abort due to too many bad blocks, I estimate at least 20 more hours. ada0 is 30 hours in, and is only 50% of the way through the reading and comparing stage. I would like to shut it down and put the last four drives in so I can test them as well, as I'm not sure if hot swapping is enabled in the BIOS. Stupid of me not to check that ahead of time.

This is the last bit visible in the tmux session for ada1:

Code:

1198666168
1198666169
1198666170
1198666171
1198666172
1198666173
1198666174
Too many bad blocks, aborting test
done
Testing with pattern 0x55: Too many bad blocks, aborting test073741823/0/0 error
done
Reading and comparing: Too many bad blocks, aborting test073741823/0/0 errors)
done
Testing with pattern 0xff: Too many bad blocks, aborting test073741823/0/0 error
done
Reading and comparing: Too many bad blocks, aborting test073741823/0/0 errors)
done
Testing with pattern 0x00: Too many bad blocks, aborting test073741823/0/0 error
done
Reading and comparing: Too many bad blocks, aborting test073741823/0/0 errors)
done
Pass completed, 1073741823 bad blocks found. (1073741823/0/0 errors)

I scrolled up as far as I could and it is just a crap ton more blocks listed as bad. If these drives are bad, why have two of them gone a lot further in the test with 0 errors so far this time? Oh, for reference the specific command I used was badblocks -b 4096 -vws /dev/daX for the destructive bad blocks test, in case it matters. Is there a different test I should do? Or are those three drives really bad and need to be RMA'd?
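
Since the numbers scrolled by too fast to tell whether they were sequential, here is a minimal sketch for checking that offline. It assumes the block numbers have been copied out of the tmux scrollback into a list; the helper name `consecutive_runs` is made up for illustration and is not part of badblocks. It also checks one arithmetic fact about the reported total: 1073741823 is exactly 2**30 - 1, which might mean the figure is a built-in limit rather than a real count, though that is only a guess.

```python
from typing import Iterable, List, Tuple


def consecutive_runs(blocks: Iterable[int]) -> List[Tuple[int, int]]:
    """Collapse block numbers into inclusive (start, end) runs of consecutive values."""
    runs: List[Tuple[int, int]] = []
    for block in sorted(set(blocks)):
        if runs and block == runs[-1][1] + 1:
            # Extends the current run by one block.
            runs[-1] = (runs[-1][0], block)
        else:
            # Gap found (or first block): start a new run.
            runs.append((block, block))
    return runs


# The last block numbers visible in the tmux session for ada1.
tail = [1198666168, 1198666169, 1198666170, 1198666171,
        1198666172, 1198666173, 1198666174]

runs = consecutive_runs(tail)
print(runs)  # one run here means the listed blocks are strictly consecutive

# The total printed by "Pass completed" is exactly 2**30 - 1.
reported_total = 1073741823
print(reported_total == 2**30 - 1)
```

One long unbroken run covering a huge span of the disk would look less like scattered media defects and more like every request failing from some point onward, but the sketch only describes the pattern; it does not say what caused it.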

Spearfoot · Jul 17, 2017

How are you connecting the drives?

Are you using both the motherboard AHCI and SCU SATA ports? Looks like you have 6 of the AHCI SATA ports and 4 of the SCU-based SATA ports, correct?

The SCU ports require a special cable, AFAIK this would be an SFF-8087 forward breakout cable. If the 'failing' drives are connected to a SCU port, try connecting them to the AHCI ports instead and see how they do.

You might have a bad SFF-8087 cable; sometimes just re-seating the cable can clear up problems... if you're lucky.

DGenerateKane · Jul 17, 2017

They're all connected to one of the backplanes, which are connected to the HBA with, I assume, an SFF-8087 cable. The seller provided everything except the drives. Since the drives can be installed without opening the case, I haven't had a need to open it. I guess I will now. Should I abort all the tests now and do it, or wait?

Spearfoot · Jul 17, 2017

You have an HBA? So you're not using any of the motherboard's SATA ports (AHCI or SCU)? If you're using an HBA, what brand and model is it? LSI HBAs should be flashed to the firmware version FreeNAS expects; it will fuss and you'll see an alert if this is out of whack. For LSI 2008-based HBA cards (LSI-9211/9210, IBM M1015, Dell H200/H310) this would be firmware version P20.00.07.00.
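
A minimal sketch of comparing a firmware version string against the P20.00.07.00 value mentioned above. It assumes versions are always written as "P" followed by four dot-separated numbers; `parse_fw` and `firmware_matches` are made-up helper names, and the sketch does not cover how the installed version is read from the card.

```python
EXPECTED = "P20.00.07.00"  # firmware version quoted in this post


def parse_fw(version: str) -> tuple:
    """Turn 'P20.00.07.00' into (20, 0, 7, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.strip().lstrip("Pp").split("."))


def firmware_matches(installed: str, expected: str = EXPECTED) -> bool:
    """True when the installed version equals the expected one."""
    return parse_fw(installed) == parse_fw(expected)


# "P19.00.00.00" is a hypothetical installed version, used only for illustration.
print(firmware_matches("P19.00.00.00"))
print(parse_fw("P19.00.00.00") < parse_fw(EXPECTED))
```

Comparing tuples of integers avoids the pitfalls of comparing the raw strings, where leading zeros or a lowercase "p" could cause a false mismatch.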

Yes, I would abort the tests; down the server; open it up; and check all of the connections.

DGenerateKane · Jul 17, 2017

Well, I assume it is. Like I said I haven't opened it up to look at it. But the listing said it had "1x LSI 9211-8i HBA JBOD FREENAS UNRAID." I don't have an alert about it either. I'll shut it down and open it up.

Stux · Jul 17, 2017

You need to determine if the issue is the HD, or something else.

I would connect the HDs directly to the SATA ports at first... perhaps even in an open chassis scenario, and work out if it's the HD or the PSU, or the mobo, or the HBA, etc.

Maybe even try testing the drives in a different machine.

If it's the drives, and it could be... especially if ~~OOPS~~ UPS lived up to their name.

DGenerateKane · Jul 20, 2017

So I opened it up, and the cables look brand new to me. I've never worked with a rackmount before, and looking at the cabling I'm lost on how I'd hook up power to any drives. The only machine I have that I could possibly test them in won't be free for several days, nor do I know how I'd run a badblocks test on a Windows machine. So I decided to try again with all eight drives, as well as three old drives I had lying around that don't have any data on them. I connected the eight new drives to the first backplane, and the other three to the second backplane.

Unfortunately, now I can't run a badblocks test on any drive. For every drive I try, the command badblocks –b 4096 –vws /dev/daX gives me the error: badblocks: invalid first block - –vws. I tried searching for "badblocks: invalid first block" but I was unable to find a single result for it. So I'm at a loss on what to do right now. I might try moving the HBA card to another PCIe slot. Oh, after this happened, I did flash the latest IT firmware to the card, though it wasn't that outdated: it was on P20.00.02.00, and like I said before, FreeNAS was not giving me an alert about it.
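
The error text echoes "–vws" with a long dash, which suggests (but does not prove) that the options were typed or pasted with en dashes instead of plain ASCII hyphens. If so, badblocks would see them as positional arguments rather than options, which would fit an "invalid first block" complaint. A minimal sketch for spotting such characters in a pasted command, assuming the dashes in the command as posted are U+2013; `find_lookalike_dashes` and `with_ascii_hyphens` are made-up helper names:

```python
import unicodedata

# Dash-like Unicode characters that are not the ASCII hyphen-minus "-".
LOOKALIKE_DASHES = {"\u2010", "\u2011", "\u2012", "\u2013", "\u2014", "\u2212"}


def find_lookalike_dashes(command: str):
    """Return (index, character, Unicode name) for every non-ASCII dash found."""
    return [(i, ch, unicodedata.name(ch))
            for i, ch in enumerate(command) if ch in LOOKALIKE_DASHES]


def with_ascii_hyphens(command: str) -> str:
    """Replace every look-alike dash with an ASCII hyphen."""
    return "".join("-" if ch in LOOKALIKE_DASHES else ch for ch in command)


# The command as it appears in the post, with en dashes written as escapes.
as_posted = "badblocks \u2013b 4096 \u2013vws /dev/daX"

for index, ch, name in find_lookalike_dashes(as_posted):
    print(index, repr(ch), name)

print(with_ascii_hyphens(as_posted))
```

Note that the cleaned-up command is still the destructive write test, so it should only be pointed at drives with no data on them.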

I've also noticed a lot of errors in the console that may or may not be related. These were right after I booted. da0-da7 are the new drives FYI, which were sourced from both Amazon and Newegg, and manufactured a month apart.
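
The log that follows is long and repetitive, so a per-device tally may be easier to read than the raw dump. This is a minimal sketch, assuming the lines keep the "(daN:mpsN:...): SCSI sense: ..." shape seen in this post; `tally_sense_errors` and `decode_read16` are made-up helper names. The second helper decodes a READ(16) CDB (opcode 0x88, 8-byte big-endian LBA, 4-byte transfer length), and the 512-byte sector size used in the usage lines is an assumption that happens to agree with the "length 114688" field in the log.

```python
import re
from collections import Counter

# Matches e.g. "(da7:mps0:0:15:0): SCSI sense: ABORTED COMMAND asc:47,3 (...)"
SENSE_RE = re.compile(
    r"\((da\d+):mps\d+:[\d:]+\): SCSI sense: (.+?) asc:([0-9a-f]+),([0-9a-f]+)"
)


def tally_sense_errors(lines):
    """Count SCSI sense reports per (device, sense key, 'asc,ascq')."""
    counts = Counter()
    for line in lines:
        match = SENSE_RE.search(line)
        if match:
            device, key, asc, ascq = match.groups()
            counts[(device, key, f"{asc},{ascq}")] += 1
    return counts


def decode_read16(cdb_hex: str):
    """Return (lba, block_count) from a READ(16) CDB given as spaced hex bytes."""
    raw = bytes.fromhex(cdb_hex)  # spaces between byte pairs are accepted
    if len(raw) != 16 or raw[0] != 0x88:
        raise ValueError("not a READ(16) CDB")
    lba = int.from_bytes(raw[2:10], "big")       # bytes 2-9: logical block address
    blocks = int.from_bytes(raw[10:14], "big")   # bytes 10-13: transfer length
    return lba, blocks


# Three lines copied from the console output in this post.
sample = [
    "Jul 20 13:13:28 APPA (da2:mps0:0:10:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)",
    "Jul 20 13:13:28 APPA (da3:mps0:0:11:0): SCSI sense: ABORTED COMMAND asc:47,3 (Information unit iuCRC error detected)",
    "Jul 20 13:13:28 APPA (da5:mps0:0:13:0): SCSI sense: ABORTED COMMAND asc:47,3 (Information unit iuCRC error detected)",
]

for key, count in sorted(tally_sense_errors(sample).items()):
    print(key, count)

lba, blocks = decode_read16("88 00 00 00 00 04 8c 3f fe 20 00 00 00 e0 00 00")
print(hex(lba), blocks, blocks * 512)  # 512-byte sectors are an assumption
```

If the same iuCRC sense code shows up on most or all drives, that may point toward something they share (cable, backplane, HBA) rather than the individual drives, in line with the advice earlier in the thread, though a tally alone cannot confirm it.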

Code:

Jul 20 13:13:28 APPA (da2:mps0:0:10:0): READ(6). CDB: 08 00 02 00 10 00
Jul 20 13:13:28 APPA (da2:mps0:0:10:0): CAM status: CCB request completed with an error
Jul 20 13:13:28 APPA (da2:mps0:0:10:0): Retrying command
Jul 20 13:13:28 APPA (da2:mps0:0:10:0): READ(6). CDB: 08 00 02 00 10 00
Jul 20 13:13:28 APPA (da2:mps0:0:10:0): CAM status: SCSI Status Error
Jul 20 13:13:28 APPA (da2:mps0:0:10:0): SCSI status: Check Condition
Jul 20 13:13:28 APPA (da2:mps0:0:10:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
Jul 20 13:13:28 APPA (da2:mps0:0:10:0): Retrying command (per sense data)
Jul 20 13:13:28 APPA (da3:mps0:0:11:0): READ(6). CDB: 08 00 00 10 10 00
Jul 20 13:13:28 APPA (da3:mps0:0:11:0): CAM status: SCSI Status Error
Jul 20 13:13:28 APPA (da3:mps0:0:11:0): SCSI status: Check Condition
Jul 20 13:13:28 APPA (da3:mps0:0:11:0): SCSI sense: ABORTED COMMAND asc:47,3 (Information unit iuCRC error detected)
Jul 20 13:13:28 APPA (da3:mps0:0:11:0): Retrying command (per sense data)
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): READ(6). CDB: 08 00 00 10 10 00
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): CAM status: SCSI Status Error
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): SCSI status: Check Condition
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): SCSI sense: ABORTED COMMAND asc:47,3 (Information unit iuCRC error detected)
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): Retrying command (per sense data)
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): READ(6). CDB: 08 00 02 00 10 00
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): CAM status: SCSI Status Error
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): SCSI status: Check Condition
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): SCSI sense: ABORTED COMMAND asc:47,3 (Information unit iuCRC error detected)
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): Retrying command (per sense data)
Jul 20 13:13:28 APPA	 (da7:mps0:0:15:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fc 20 00 00 00 e0 00 00 length 114688 SMID 876 terminated ioc 804b scsi 0 state 0 xfer 0
Jul 20 13:13:28 APPA	 (da7:mps0:0:15:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fe 20 00 00 00 e0 00 00 length 114688 SMID 877 terminated ioc 804b (da7:mps0:0:15:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fc 20 00 00 00 e0 00 00
Jul 20 13:13:28 APPA scsi 0 state 0 xfer 0
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): CAM status: CCB request completed with an error
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): Retrying command
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fe 20 00 00 00 e0 00 00
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): CAM status: CCB request completed with an error
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): Retrying command
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): READ(6). CDB: 08 00 02 20 e0 00
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): CAM status: SCSI Status Error
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): SCSI status: Check Condition
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): SCSI sense: ABORTED COMMAND asc:47,3 (Information unit iuCRC error detected)
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): Retrying command (per sense data)
Jul 20 13:13:28 APPA	 (da7:mps0:0:15:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fc 20 00 00 00 e0 00 00 length 114688 SMID 879 terminated ioc 804b scsi 0 state 0 xfer 0
Jul 20 13:13:28 APPA	 (da7:mps0:0:15:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fe 20 00 00 00 e0 00 00 length 114688 SMID 880 terminated ioc 804b (da7:mps0:0:15:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fc 20 00 00 00 e0 00 00
Jul 20 13:13:28 APPA scsi 0 state 0 xfer 0
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): CAM status: CCB request completed with an error
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): Retrying command
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fe 20 00 00 00 e0 00 00
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): CAM status: CCB request completed with an error
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): Retrying command
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): READ(6). CDB: 08 00 02 20 e0 00
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): CAM status: SCSI Status Error
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): SCSI status: Check Condition
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): SCSI sense: ABORTED COMMAND asc:47,3 (Information unit iuCRC error detected)
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): Retrying command (per sense data)
Jul 20 13:13:28 APPA	 (da7:mps0:0:15:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fc 20 00 00 00 e0 00 00 length 114688 SMID 882 terminated ioc 804b scsi 0 state 0 xfer 0
Jul 20 13:13:28 APPA	 (da7:mps0:0:15:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fe 20 00 00 00 e0 00 00 length 114688 SMID 883 terminated ioc 804b (da7:mps0:0:15:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fc 20 00 00 00 e0 00 00
Jul 20 13:13:28 APPA scsi 0 state 0 xfer 0
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): CAM status: CCB request completed with an error
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): Retrying command
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fe 20 00 00 00 e0 00 00
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): CAM status: CCB request completed with an error
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): Retrying command
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): READ(6). CDB: 08 00 02 20 e0 00
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): CAM status: SCSI Status Error
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): SCSI status: Check Condition
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): SCSI sense: ABORTED COMMAND asc:47,3 (Information unit iuCRC error detected)
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): Retrying command (per sense data)
Jul 20 13:13:28 APPA	 (da7:mps0:0:15:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fc 20 00 00 00 e0 00 00 length 114688 SMID 885 terminated ioc 804b scsi 0 state 0 xfer 0
Jul 20 13:13:28 APPA	 (da7:mps0:0:15:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fe 20 00 00 00 e0 00 00 length 114688 SMID 886 terminated ioc 804b (da7:mps0:0:15:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fc 20 00 00 00 e0 00 00
Jul 20 13:13:28 APPA scsi 0 state 0 xfer 0
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): CAM status: CCB request completed with an error
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): Retrying command
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fe 20 00 00 00 e0 00 00
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): CAM status: CCB request completed with an error
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): Retrying command
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): READ(6). CDB: 08 00 02 20 e0 00
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): CAM status: SCSI Status Error
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): SCSI status: Check Condition
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): SCSI sense: ABORTED COMMAND asc:47,3 (Information unit iuCRC error detected)
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): Retrying command (per sense data)
Jul 20 13:13:28 APPA	 (da7:mps0:0:15:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fc 20 00 00 00 e0 00 00 length 114688 SMID 888 terminated ioc 804b scsi 0 state 0 xfer 0
Jul 20 13:13:28 APPA	 (da7:mps0:0:15:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fe 20 00 00 00 e0 00 00 length 114688 SMID 889 terminated ioc 804b (da7:mps0:0:15:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fc 20 00 00 00 e0 00 00
Jul 20 13:13:28 APPA scsi 0 state 0 xfer 0
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): CAM status: CCB request completed with an error
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): Error 5, Retries exhausted
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fe 20 00 00 00 e0 00 00
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): CAM status: CCB request completed with an error
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): Error 5, Retries exhausted
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): READ(6). CDB: 08 00 02 20 e0 00
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): CAM status: SCSI Status Error
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): SCSI status: Check Condition
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): SCSI sense: ABORTED COMMAND asc:47,3 (Information unit iuCRC error detected)
Jul 20 13:13:28 APPA (da7:mps0:0:15:0): Error 5, Retries exhausted
Jul 20 13:13:28 APPA	 (da6:mps0:0:14:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fe 20 00 00 00 e0 00 00 length 114688 SMID 894 terminated ioc 804b scsi 0 state 0 xfer 0
Jul 20 13:13:28 APPA (da6:mps0:0:14:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fe 20 00 00 00 e0 00 00
Jul 20 13:13:28 APPA (da6:mps0:0:14:0): CAM status: CCB request completed with an error
Jul 20 13:13:28 APPA (da6:mps0:0:14:0): Retrying command
Jul 20 13:13:28 APPA (da6:mps0:0:14:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fc 20 00 00 00 e0 00 00
Jul 20 13:13:28 APPA (da6:mps0:0:14:0): CAM status: SCSI Status Error
Jul 20 13:13:28 APPA (da6:mps0:0:14:0): SCSI status: Check Condition
Jul 20 13:13:28 APPA (da6:mps0:0:14:0): SCSI sense: ABORTED COMMAND asc:47,3 (Information unit iuCRC error detected)
Jul 20 13:13:28 APPA (da6:mps0:0:14:0): Retrying command (per sense data)
Jul 20 13:13:28 APPA (da6:mps0:0:14:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fe 20 00 00 00 e0 00 00
Jul 20 13:13:28 APPA (da6:mps0:0:14:0): CAM status: SCSI Status Error
Jul 20 13:13:28 APPA (da6:mps0:0:14:0): SCSI status: Check Condition
Jul 20 13:13:28 APPA (da6:mps0:0:14:0): SCSI sense: ABORTED COMMAND asc:47,3 (Information unit iuCRC error detected)
Jul 20 13:13:28 APPA (da6:mps0:0:14:0): Retrying command (per sense data)
Jul 20 13:13:28 APPA	 (da5:mps0:0:13:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fc 20 00 00 00 e0 00 00 length 114688 SMID 901 terminated ioc 804b scsi 0 state 0 xfer 0
Jul 20 13:13:28 APPA	 (da5:mps0:0:13:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fe 20 00 00 00 e0 00 00 length 114688 SMID 902 terminated ioc 804b (da5:mps0:0:13:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fc 20 00 00 00 e0 00 00
Jul 20 13:13:28 APPA scsi 0 state 0 xfer 0
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): CAM status: CCB request completed with an error
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): Retrying command
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fe 20 00 00 00 e0 00 00
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): CAM status: CCB request completed with an error
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): Retrying command
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): READ(6). CDB: 08 00 02 20 e0 00
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): CAM status: SCSI Status Error
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): SCSI status: Check Condition
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): SCSI sense: ABORTED COMMAND asc:47,3 (Information unit iuCRC error detected)
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): Retrying command (per sense data)
Jul 20 13:13:28 APPA	 (da5:mps0:0:13:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fc 20 00 00 00 e0 00 00 length 114688 SMID 904 terminated ioc 804b scsi 0 state 0 xfer 0
Jul 20 13:13:28 APPA	 (da5:mps0:0:13:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fe 20 00 00 00 e0 00 00 length 114688 SMID 905 terminated ioc 804b (da5:mps0:0:13:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fc 20 00 00 00 e0 00 00
Jul 20 13:13:28 APPA scsi 0 state 0 xfer 0
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): CAM status: CCB request completed with an error
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): Retrying command
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fe 20 00 00 00 e0 00 00
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): CAM status: CCB request completed with an error
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): Retrying command
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): READ(6). CDB: 08 00 02 20 e0 00
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): CAM status: SCSI Status Error
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): SCSI status: Check Condition
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): SCSI sense: ABORTED COMMAND asc:47,3 (Information unit iuCRC error detected)
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): Retrying command (per sense data)
Jul 20 13:13:28 APPA	 (da5:mps0:0:13:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fc 20 00 00 00 e0 00 00 length 114688 SMID 907 terminated ioc 804b scsi 0 state 0 xfer 0
Jul 20 13:13:28 APPA	 (da5:mps0:0:13:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fe 20 00 00 00 e0 00 00 length 114688 SMID 908 terminated ioc 804b (da5:mps0:0:13:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fc 20 00 00 00 e0 00 00
Jul 20 13:13:28 APPA scsi 0 state 0 xfer 0
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): CAM status: CCB request completed with an error
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): Retrying command
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fe 20 00 00 00 e0 00 00
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): CAM status: CCB request completed with an error
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): Retrying command
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): READ(6). CDB: 08 00 02 20 e0 00
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): CAM status: SCSI Status Error
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): SCSI status: Check Condition
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): SCSI sense: ABORTED COMMAND asc:47,3 (Information unit iuCRC error detected)
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): Retrying command (per sense data)
Jul 20 13:13:28 APPA	 (da5:mps0:0:13:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fc 20 00 00 00 e0 00 00 length 114688 SMID 910 terminated ioc 804b scsi 0 state 0 xfer 0
Jul 20 13:13:28 APPA	 (da5:mps0:0:13:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fe 20 00 00 00 e0 00 00 length 114688 SMID 911 terminated ioc 804b (da5:mps0:0:13:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fc 20 00 00 00 e0 00 00
Jul 20 13:13:28 APPA scsi 0 state 0 xfer 0
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): CAM status: CCB request completed with an error
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): Retrying command
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fe 20 00 00 00 e0 00 00
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): CAM status: CCB request completed with an error
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): Retrying command
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): READ(6). CDB: 08 00 02 20 e0 00
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): CAM status: SCSI Status Error
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): SCSI status: Check Condition
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): SCSI sense: ABORTED COMMAND asc:47,3 (Information unit iuCRC error detected)
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): Retrying command (per sense data)
Jul 20 13:13:28 APPA	 (da5:mps0:0:13:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fc 20 00 00 00 e0 00 00 length 114688 SMID 913 terminated ioc 804b scsi 0 state 0 xfer 0
Jul 20 13:13:28 APPA	 (da5:mps0:0:13:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fe 20 00 00 00 e0 00 00 length 114688 SMID 914 terminated ioc 804b (da5:mps0:0:13:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fc 20 00 00 00 e0 00 00
Jul 20 13:13:28 APPA scsi 0 state 0 xfer 0
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): CAM status: CCB request completed with an error
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): Error 5, Retries exhausted
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fe 20 00 00 00 e0 00 00
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): CAM status: CCB request completed with an error
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): Error 5, Retries exhausted
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): READ(6). CDB: 08 00 02 20 e0 00
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): CAM status: SCSI Status Error
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): SCSI status: Check Condition
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): SCSI sense: ABORTED COMMAND asc:47,3 (Information unit iuCRC error detected)
Jul 20 13:13:28 APPA (da5:mps0:0:13:0): Error 5, Retries exhausted
Jul 20 13:13:28 APPA (da4:mps0:0:12:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fe 20 00 00 00 e0 00 00
Jul 20 13:13:28 APPA (da4:mps0:0:12:0): CAM status: SCSI Status Error
Jul 20 13:13:28 APPA (da4:mps0:0:12:0): SCSI status: Check Condition
Jul 20 13:13:28 APPA (da4:mps0:0:12:0): SCSI sense: ABORTED COMMAND asc:47,3 (Information unit iuCRC error detected)
Jul 20 13:13:28 APPA (da4:mps0:0:12:0): Retrying command (per sense data)
Jul 20 13:13:28 APPA	 (da3:mps0:0:11:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fc 20 00 00 00 e0 00 00 length 114688 SMID 924 terminated ioc 804b scsi 0 state 0 xfer 0
Jul 20 13:13:28 APPA	 (da3:mps0:0:11:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fe 20 00 00 00 e0 00 00 length 114688 SMID 925 terminated ioc 804b (da3:mps0:0:11:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fc 20 00 00 00 e0 00 00
Jul 20 13:13:28 APPA scsi 0 state 0 xfer 0
Jul 20 13:13:28 APPA (da3:mps0:0:11:0): CAM status: CCB request completed with an error
Jul 20 13:13:28 APPA (da3:mps0:0:11:0): Retrying command
Jul 20 13:13:28 APPA (da3:mps0:0:11:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fe 20 00 00 00 e0 00 00
Jul 20 13:13:28 APPA (da3:mps0:0:11:0): CAM status: CCB request completed with an error
Jul 20 13:13:28 APPA (da3:mps0:0:11:0): Retrying command
Jul 20 13:13:28 APPA (da3:mps0:0:11:0): READ(6). CDB: 08 00 02 20 e0 00
Jul 20 13:13:28 APPA (da3:mps0:0:11:0): CAM status: SCSI Status Error
Jul 20 13:13:28 APPA (da3:mps0:0:11:0): SCSI status: Check Condition
Jul 20 13:13:28 APPA (da3:mps0:0:11:0): SCSI sense: ABORTED COMMAND asc:47,3 (Information unit iuCRC error detected)
Jul 20 13:13:28 APPA (da3:mps0:0:11:0): Retrying command (per sense data)
Jul 20 13:13:28 APPA	 (da3:mps0:0:11:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fc 20 00 00 00 e0 00 00 length 114688 SMID 927 terminated ioc 804b scsi 0 state 0 xfer 0
Jul 20 13:13:28 APPA	 (da3:mps0:0:11:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fe 20 00 00 00 e0 00 00 length 114688 SMID 928 terminated ioc 804b (da3:mps0:0:11:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fc 20 00 00 00 e0 00 00
Jul 20 13:13:28 APPA scsi 0 state 0 xfer 0
Jul 20 13:13:28 APPA (da3:mps0:0:11:0): CAM status: CCB request completed with an error
Jul 20 13:13:28 APPA (da3:mps0:0:11:0): Retrying command
Jul 20 13:13:28 APPA (da3:mps0:0:11:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fe 20 00 00 00 e0 00 00
Jul 20 13:13:28 APPA (da3:mps0:0:11:0): CAM status: CCB request completed with an error
Jul 20 13:13:28 APPA (da3:mps0:0:11:0): Retrying command
Jul 20 13:13:28 APPA (da3:mps0:0:11:0): READ(6). CDB: 08 00 02 20 e0 00
Jul 20 13:13:28 APPA (da3:mps0:0:11:0): CAM status: SCSI Status Error
Jul 20 13:13:28 APPA (da3:mps0:0:11:0): SCSI status: Check Condition
Jul 20 13:13:28 APPA (da3:mps0:0:11:0): SCSI sense: ABORTED COMMAND asc:47,3 (Information unit iuCRC error detected)
Jul 20 13:13:28 APPA (da3:mps0:0:11:0): Retrying command (per sense data)
Jul 20 13:13:28 APPA	 (da0:mps0:0:8:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fe 20 00 00 00 e0 00 00 length 114688 SMID 946 terminated ioc 804b scsi 0 state 0 xfer 0
Jul 20 13:13:28 APPA (da0:mps0:0:8:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fe 20 00 00 00 e0 00 00
Jul 20 13:13:28 APPA (da0:mps0:0:8:0): CAM status: CCB request completed with an error
Jul 20 13:13:28 APPA (da0:mps0:0:8:0): Retrying command
Jul 20 13:13:28 APPA (da0:mps0:0:8:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fc 20 00 00 00 e0 00 00
Jul 20 13:13:28 APPA (da0:mps0:0:8:0): CAM status: SCSI Status Error
Jul 20 13:13:28 APPA (da0:mps0:0:8:0): SCSI status: Check Condition
Jul 20 13:13:28 APPA (da0:mps0:0:8:0): SCSI sense: ABORTED COMMAND asc:47,3 (Information unit iuCRC error detected)
Jul 20 13:13:28 APPA (da0:mps0:0:8:0): Retrying command (per sense data)
Jul 20 13:13:28 APPA (da0:mps0:0:8:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fe 20 00 00 00 e0 00 00
Jul 20 13:13:28 APPA (da0:mps0:0:8:0): CAM status: SCSI Status Error
Jul 20 13:13:28 APPA (da0:mps0:0:8:0): SCSI status: Check Condition
Jul 20 13:13:28 APPA (da0:mps0:0:8:0): SCSI sense: ABORTED COMMAND asc:47,3 (Information unit iuCRC error detected)
Jul 20 13:13:28 APPA (da0:mps0:0:8:0): Retrying command (per sense data)
Jul 20 13:13:28 APPA GEOM_RAID5: Module loaded, version 1.3.20140711.62 (rev f91e28e40bf7)
Jul 20 13:13:28 APPA ipmi0: <IPMI System Interface> port 0xca2,0xca3 on acpi0
Jul 20 13:13:28 APPA ipmi0: KCS mode found at io 0xca2 on acpi
Jul 20 13:13:28 APPA ipmi0: IPMI device rev. 1, firmware rev. 3.40, version 2.0
Jul 20 13:13:28 APPA ipmi0: Number of channels 2
Jul 20 13:13:28 APPA ipmi0: Attached watchdog
Jul 20 13:13:28 APPA hwpmc: SOFT/16/64/0x67<INT,USR,SYS,REA,WRI> TSC/1/64/0x20<REA> IAP/4/48/0x3ff<INT,USR,SYS,EDG,THR,REA,WRI,INV,QUA,PRC> IAF/3/48/0x67<INT,USR,SYS,REA,WRI>
Jul 20 13:13:28 APPA kernel: igb0: link state changed to UP
Jul 20 13:13:28 APPA kernel: igb0: link state changed to UP
Jul 20 13:13:28 APPA kernel: igb1: link state changed to UP
Jul 20 13:13:28 APPA kernel: igb1: link state changed to UP
Jul 20 13:13:28 APPA ums0 numa-domain 0 on uhub2
Jul 20 13:13:28 APPA ums0: <Winbond Electronics Corp Hermon USB hidmouse Device, class 0/0, rev 1.10/0.01, addr 3> on usbus0
Jul 20 13:13:28 APPA ums0: 3 buttons and [Z] coordinates ID=0
Jul 20 13:13:30 APPA ntpd[1925]: ntpd 4.2.8p10-a (1): Starting
Jul 20 13:13:32 APPA root: /etc/rc: WARNING: failed precmd routine for vmware_guestd
Jul 20 13:13:39 APPA root: /etc/rc: WARNING: failed precmd routine for minio
Jul 20 13:13:45 APPA (da1:mps0:0:9:0): READ(10). CDB: 28 00 00 00 01 00 00 01 00 00
Jul 20 13:13:45 APPA (da1:mps0:0:9:0): CAM status: SCSI Status Error
Jul 20 13:13:45 APPA (da1:mps0:0:9:0): SCSI status: Check Condition
Jul 20 13:13:45 APPA (da1:mps0:0:9:0): SCSI sense: ABORTED COMMAND asc:47,3 (Information unit iuCRC error detected)
Jul 20 13:13:45 APPA (da1:mps0:0:9:0): Retrying command (per sense data)
Jul 20 13:13:45 APPA (da1:mps0:0:9:0): READ(10). CDB: 28 00 00 00 02 00 00 01 00 00
Jul 20 13:13:45 APPA (da1:mps0:0:9:0): CAM status: SCSI Status Error
Jul 20 13:13:45 APPA (da1:mps0:0:9:0): SCSI status: Check Condition
Jul 20 13:13:45 APPA (da1:mps0:0:9:0): SCSI sense: ABORTED COMMAND asc:47,3 (Information unit iuCRC error detected)
Jul 20 13:13:45 APPA (da1:mps0:0:9:0): Retrying command (per sense data)
Jul 20 13:13:46 APPA (da1:mps0:0:9:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fd 00 00 00 01 00 00 00
Jul 20 13:13:46 APPA (da1:mps0:0:9:0): CAM status: SCSI Status Error
Jul 20 13:13:46 APPA (da1:mps0:0:9:0): SCSI status: Check Condition
Jul 20 13:13:46 APPA (da1:mps0:0:9:0): SCSI sense: ABORTED COMMAND asc:47,3 (Information unit iuCRC error detected)
Jul 20 13:13:46 APPA (da1:mps0:0:9:0): Retrying command (per sense data)
Jul 20 13:13:46 APPA (da1:mps0:0:9:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fe 00 00 00 01 00 00 00
Jul 20 13:13:46 APPA (da1:mps0:0:9:0): CAM status: SCSI Status Error
Jul 20 13:13:46 APPA (da1:mps0:0:9:0): SCSI status: Check Condition
Jul 20 13:13:46 APPA (da1:mps0:0:9:0): SCSI sense: ABORTED COMMAND asc:47,3 (Information unit iuCRC error detected)
Jul 20 13:13:46 APPA (da1:mps0:0:9:0): Retrying command (per sense data)
Jul 20 13:13:46 APPA (da1:mps0:0:9:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f ff 00 00 00 01 00 00 00
Jul 20 13:13:46 APPA (da1:mps0:0:9:0): CAM status: SCSI Status Error
Jul 20 13:13:46 APPA (da1:mps0:0:9:0): SCSI status: Check Condition
Jul 20 13:13:46 APPA (da1:mps0:0:9:0): SCSI sense: ABORTED COMMAND asc:47,3 (Information unit iuCRC error detected)
Jul 20 13:13:46 APPA (da1:mps0:0:9:0): Retrying command (per sense data)
Jul 20 13:13:46 APPA (da1:mps0:0:9:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fc 00 00 00 01 00 00 00
Jul 20 13:13:46 APPA (da1:mps0:0:9:0): CAM status: SCSI Status Error
Jul 20 13:13:46 APPA (da1:mps0:0:9:0): SCSI status: Check Condition
Jul 20 13:13:46 APPA (da1:mps0:0:9:0): SCSI sense: ABORTED COMMAND asc:47,3 (Information unit iuCRC error detected)
Jul 20 13:13:46 APPA (da1:mps0:0:9:0): Retrying command (per sense data)
Jul 20 13:13:46 APPA (da1:mps0:0:9:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fd 00 00 00 01 00 00 00
Jul 20 13:13:46 APPA (da1:mps0:0:9:0): CAM status: SCSI Status Error
Jul 20 13:13:46 APPA (da1:mps0:0:9:0): SCSI status: Check Condition
Jul 20 13:13:46 APPA (da1:mps0:0:9:0): SCSI sense: ABORTED COMMAND asc:47,3 (Information unit iuCRC error detected)
Jul 20 13:13:46 APPA (da1:mps0:0:9:0): Retrying command (per sense data)
Jul 20 13:13:46 APPA (da1:mps0:0:9:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fe 00 00 00 01 00 00 00
Jul 20 13:13:46 APPA (da1:mps0:0:9:0): CAM status: SCSI Status Error
Jul 20 13:13:46 APPA (da1:mps0:0:9:0): SCSI status: Check Condition
Jul 20 13:13:46 APPA (da1:mps0:0:9:0): SCSI sense: ABORTED COMMAND asc:47,3 (Information unit iuCRC error detected)
Jul 20 13:13:46 APPA (da1:mps0:0:9:0): Retrying command (per sense data)
Jul 20 13:13:46 APPA (da2:mps0:0:10:0): READ(10). CDB: 28 00 00 00 00 00 00 01 00 00
Jul 20 13:13:46 APPA (da2:mps0:0:10:0): CAM status: SCSI Status Error
Jul 20 13:13:46 APPA (da2:mps0:0:10:0): SCSI status: Check Condition
Jul 20 13:13:46 APPA (da2:mps0:0:10:0): SCSI sense: ABORTED COMMAND asc:47,3 (Information unit iuCRC error detected)
Jul 20 13:13:46 APPA (da2:mps0:0:10:0): Retrying command (per sense data)
Jul 20 13:13:46 APPA (da2:mps0:0:10:0): READ(10). CDB: 28 00 00 00 01 00 00 01 00 00
Jul 20 13:13:46 APPA (da2:mps0:0:10:0): CAM status: SCSI Status Error
Jul 20 13:13:46 APPA (da2:mps0:0:10:0): SCSI status: Check Condition
Jul 20 13:13:46 APPA (da2:mps0:0:10:0): SCSI sense: ABORTED COMMAND asc:47,3 (Information unit iuCRC error detected)
Jul 20 13:13:46 APPA (da2:mps0:0:10:0): Retrying command (per sense data)
Jul 20 13:13:47 APPA (da2:mps0:0:10:0): READ(10). CDB: 28 00 00 00 02 00 00 01 00 00
Jul 20 13:13:47 APPA (da2:mps0:0:10:0): CAM status: SCSI Status Error
Jul 20 13:13:47 APPA (da2:mps0:0:10:0): SCSI status: Check Condition
Jul 20 13:13:47 APPA (da2:mps0:0:10:0): SCSI sense: ABORTED COMMAND asc:47,3 (Information unit iuCRC error detected)
Jul 20 13:13:47 APPA (da2:mps0:0:10:0): Retrying command (per sense data)
Jul 20 13:13:47 APPA (da2:mps0:0:10:0): READ(16). CDB: 88 00 00 00 00 04 8c 3f fd 00 00 00 01 00 00 00
Jul 20 13:13:47 APPA (da2:mps0:0:10:0): CAM status: SCSI Status Error
Jul 20 13:13:47 APPA (da2:mps0:0:10:0): SCSI status: Check Condition
Jul 20 13:13:47 APPA (da2:mps0:0:10:0): SCSI sense: ABORTED COMMAND asc:47,3 (Information unit iuCRC error detected)
Jul 20 13:13:47 APPA (da2:mps0:0:10:0): Retrying command (per sense data)
Jul 20 13:13:47 APPA (da2:mps0:0:10:0): READ(10). CDB: 28 00 00 00 02 00 00 01 00 00
Jul 20 13:13:47 APPA (da2:mps0:0:10:0): CAM status: SCSI Status Error
Jul 20 13:13:47 APPA (da2:mps0:0:10:0): SCSI status: Check Condition
Jul 20 13:13:47 APPA (da2:mps0:0:10:0): SCSI sense: ABORTED COMMAND asc:47,3 (Information unit iuCRC error detected)
Jul 20 13:13:47 APPA (da2:mps0:0:10:0): Retrying command (per sense data)
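
For anyone trying to read the CDB bytes in the log above, here is a rough sketch of how they decode. It assumes the standard SCSI layouts for READ(10) (opcode 28) and READ(16) (opcode 88). The function name decode_cdb is made up for illustration and is not part of FreeNAS.

```sh
# Decode the LBA and transfer length from a CDB hex string as printed by CAM.
# Hypothetical helper; only READ(10) and READ(16) are handled.
decode_cdb() {
    # Split the hex string into positional parameters ($1 is CDB byte 0).
    set -- $1
    case "$1" in
        28) # READ(10): bytes 2-5 = LBA, bytes 7-8 = transfer length
            lba=$((0x$3$4$5$6))
            len=$((0x$8$9))
            ;;
        88) # READ(16): bytes 2-9 = LBA, bytes 10-13 = transfer length
            lba=$((0x$3$4$5$6$7$8$9${10}))
            len=$((0x${11}${12}${13}${14}))
            ;;
        *)  echo "unsupported opcode: $1" >&2
            return 1
            ;;
    esac
    echo "lba=$lba blocks=$len"
}

decode_cdb "28 00 00 00 01 00 00 01 00 00"
decode_cdb "88 00 00 00 00 04 8c 3f fd 00 00 00 01 00 00 00"
```

For those two lines from the log this gives lba=256 blocks=256 and lba=19532872960 blocks=256, so the failing reads are at both very low and very high block addresses rather than clustered around one bad region.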

Stux · Jul 20, 2017

DGenerateKane said:
Unfortunately now I can't run a badblocks test on any drive. Every drive I try the command badblocks –b 4096 –vws /dev/daX gives me the error: badblocks: invalid first block

Did you run:

sysctl kern.geom.debugflags=0x10

first? That allows writing to the start of the drives, i.e. the partition/boot blocks.

DGenerateKane · Jul 20, 2017

Yes, and it returned kern.geom.debugflags: 0 -> 16

Edit: On a whim I tried a non-destructive block test on a drive and it started just fine. I cancelled it and tried a destructive test. Same error. So I've started non-destructive tests on all drives and am hoping for no errors.
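
A quick sanity check on that sysctl output: 16 is just 0x10 in decimal, so the setting did take effect. A minimal sketch of the check follows. The function name is hypothetical, and on the NAS itself the value would come from sysctl -n kern.geom.debugflags.

```sh
# Succeeds if the 0x10 bit is set in the given kern.geom.debugflags value.
# Hypothetical helper for illustration.
geom_raw_writes_enabled() {
    [ $(( $1 & 0x10 )) -ne 0 ]
}

# Value reported in the post above ("0 -> 16"):
if geom_raw_writes_enabled 16; then
    echo "0x10 flag is set"
else
    echo "0x10 flag is not set"
fi
```

So the debug flag does not look like the reason the destructive test refuses to start.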

danb35 · Jul 21, 2017

Spearfoot said:
LSI HBAs should be flashed to the firmware version FreeNAS expects; it will fuss and you'll see an alert if this is out of whack.

This apparently is no longer the case per another recent thread here--I'll see if I can find it.

Edit: Here's the post.

leenux_tux · Jul 21, 2017

If you have a spare machine, and it's Windows, you can still do a test.

Download a bootable Linux CD image or USB image and write to the appropriate media.
Open up your Windows box and disconnect the drives that are currently connected (just to be on the safe side).
Connect up your suspect drives.
Boot up and select either the CD or USB as bootable media, whichever you decided on.
Once booted into Linux you can run your tests.

DGenerateKane · Jul 21, 2017

Man, most of the drives are showing errors. da0 and da1 have already aborted with over a billion read errors each, da2 has 5 read errors, da3 has just two read errors so far, da4 has 4 read errors and 32 corruption errors, da5 has one read error, da6 is the only one with no errors so far, and da7 has 5 read errors. Yet da8-10 have none so far, and da8 and da9, which have a much smaller capacity, have already completed their tests. So either all the drives are bad, or the backplane itself is bad. Or the cable connecting that backplane. Or the port on the HBA that the backplane connects to. Ugh.
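
One way to narrow down drive vs. shared path is to count the iuCRC errors per device and target in the system log. An iuCRC error is reported on the transport, so it typically points at the link (cable, backplane, HBA port) rather than the platters, and errors on every device behind the same connection would fit that. A minimal sketch, assuming log lines in the same format as the ones posted earlier; the function name count_crc_errors is hypothetical:

```sh
# Count "iuCRC error detected" lines per device/target from syslog on stdin.
# Hypothetical helper; assumes the "(daN:mpsN:bus:target:lun):" format above.
count_crc_errors() {
    awk '/iuCRC error detected/ {
            # Field 5 looks like "(da1:mps0:0:9:0):"
            split($5, id, ":")
            dev = substr(id[1], 2)          # strip the leading "("
            count[dev " target " id[4]]++
         }
         END { for (k in count) print k, count[k] }' | sort
}

# Usage on the NAS (log path is the FreeBSD default):
# count_crc_errors < /var/log/messages
```

Each output line is the device, its target number and the number of CRC errors logged for it.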

leenux_tux · Jul 21, 2017

Take two that look OK and two of the worst ones, stick them in another box and run the tests again. Then compare results. That might give you a better understanding of where the issue really is.


BigDave · Jul 21, 2017

Assuming you have FreeNAS installed on this chassis, I would first check by booting up
with a single hard drive in one of the bays. Check to see if the drive is listed in the
Volume Manager. If it's there, configure the SSH service and see if you can run a
short SMART test on the drive, and check the HBA's connectivity from the CLI with
dmesg | grep mps while waiting for the smart test to complete.

DGenerateKane · Jul 21, 2017

BigDave said:
Assuming you have FreeNAS installed on this chassis, I would first check by booting up
with a single hard drive in one of the bays. Check to see if the drive is listed in the
Volume Manager. If it's there, configure the SSH service and see if you can run a
short SMART test on the drive, and check the HBA's connectivity from the CLI with
dmesg | grep mps while waiting for the smart test to complete.

I did that. I have no idea what I'm looking at though.

Code:

mps0: <Avago Technologies (LSI) SAS2008> port 0x8000-0x80ff mem 0xdf600000-0xdf603fff,0xdf580000-0xdf5bffff irq 32 at device 0.0 numa-domain 0 on pci2
mps0: Firmware: 20.00.07.00, Driver: 21.01.00.00-fbsd
mps0: IOCCapabilities: 1285c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,EventReplay,HostDisc>
mps0: SAS Address for SATA device = 3e1f1462ab87546a
mps0: SAS Address from SATA device = 3e1f1462ab87546a
ses0 at mps0 bus 0 scbus0 target 16 lun 0
ses1 at mps0 bus 0 scbus0 target 20 lun 0
da0 at mps0 bus 0 scbus0 target 8 lun 0

BigDave · Jul 21, 2017

What you are looking at is a working HBA card with the proper firmware and a single drive
(da0) connected to it. Did you run a short SMART test on that drive?
If the drive can complete the test, it proves that the backplane bay it is currently
plugged into IS WORKING! :D
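
If the raw dmesg output is hard to read, the interesting parts can be pulled out with a few lines of awk. This is only a sketch based on the output format posted above; the function name summarize_mps is hypothetical.

```sh
# Summarize "dmesg | grep mps" output: firmware/driver versions and the
# devices attached to the HBA. Hypothetical helper for illustration.
summarize_mps() {
    awk '/^mps[0-9]+: Firmware:/ {
            gsub(",", "", $3)               # drop the trailing comma
            print "firmware=" $3, "driver=" $5
         }
         /^(da|ses)[0-9]+ at mps[0-9]+ / {
            # e.g. "da0 at mps0 bus 0 scbus0 target 8 lun 0"
            print "device=" $1, "target=" $8
         }'
}

# Usage on the NAS:
# dmesg | grep mps | summarize_mps
```

The da lines are disks and the ses lines are enclosure services devices, so a single da0 line means one disk is visible to the HBA.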

Try the drive in other bays and stay away from hotswapping for now.
Don't assume this machine was assembled properly; download manuals
for the chassis AND the backplanes and check model numbers!
Post a few pics with the lid off, they may be of some help.

DGenerateKane · Jul 21, 2017

OK, at this point I don't know what to do. I've put four of them in a Windows machine I have (using all the SATA ports on the board) and installed FreeNAS to a flash drive. I just tried a destructive block test and I get the same invalid first block error. The non-destructive block test also fails now, on the actual machine as well as the desktop. I'm pulling my hair out here. Does anyone know anything about that error and how I can get around it? The drives can't all be bad, because the invalid first block error also shows up on the three drives I already had lying around. Unless something in the hardware or software has destroyed every drive I connected to the chassis.

I never hotswapped any of the drives, everything was done offline. I'll take some pics if you really think it will help.

DGenerateKane · Jul 21, 2017

I don't know how useful they'll be.


BigDave · Jul 21, 2017

As I was told, the first test a drive is subjected to when it comes out of the box is a conveyance test.
This test is for the electrical hardware of the drive.
If the drive passes that, then a long SMART test is performed, and this tests the mechanical parts
of the drive.
As they say in Texas, "Son, you can't put the cart before the horse!"

https://linux.die.net/man/8/badblocks

edit: Added link

DGenerateKane · Jul 21, 2017

I tried to. I used the same command on the desktop that I used before on the chassis and now it won't run. I can't remember the exact wording, but basically it said I was giving 3 device names, and then it listed –t conveyance /dev/ada0 as the three device names. Which they clearly aren't. I'll try again though.

edit: just tried it again on the drive in the chassis.

Code:

root@APPA:~ # smartctl –t conveyance /dev/da0
smartctl 6.5 2016-05-07 r4318 [FreeBSD 11.0-STABLE amd64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

ERROR: smartctl takes ONE device name as the final command-line argument.
You have provided 3 device names:
–t
conveyance
/dev/da0

Use smartctl -h to get a usage summary
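
The first of the three "device names" smartctl lists is `–t`, which looks like an en dash (U+2013) rather than an ASCII hyphen, so smartctl never sees an option at all. The same thing would plausibly explain "invalid first block" from badblocks, which takes the device, last block and first block as positional arguments after its options: with en dashes, `–b` would be read as the device, 4096 as the last block and `–vws` as the first block. That is an inference from the output above, not something confirmed in this thread. A small sketch for catching it; the helper names find_unicode_dashes and fix_dashes are hypothetical:

```sh
# Build the dash characters from their UTF-8 bytes so the script does not
# depend on the encoding of this file.
endash=$(printf '\342\200\223')   # U+2013 EN DASH
emdash=$(printf '\342\200\224')   # U+2014 EM DASH

# Print every argument that starts with a Unicode dash instead of ASCII "-".
find_unicode_dashes() {
    for arg in "$@"; do
        case "$arg" in
            "$endash"*|"$emdash"*) printf 'suspicious option: %s\n' "$arg" ;;
        esac
    done
}

# Print the command line with Unicode dashes replaced by ASCII hyphens.
fix_dashes() {
    printf '%s\n' "$*" | sed -e "s/$endash/-/g" -e "s/$emdash/-/g"
}
```

For example, fix_dashes 'smartctl –t conveyance /dev/da0' prints smartctl -t conveyance /dev/da0, which can then be retyped or pasted back into the shell.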
