panic: I/O to pool 'main' appears to be hung on vdev

Richard Durso

Explorer
Joined
Jan 30, 2014
Messages
70
Yes, drive replacement is in progress. Working with Newegg on it.

I have two things I could use some help understanding:

1) For the duration of the test (now 48+ hours), I haven't experienced any slowdowns or performance issues, nor any random reboots or vdev hangs. If the drive has problems bad enough that it can't complete the tests, I don't understand why it's performing better.

2) At the FreeNAS level, I have no indication that anything is wrong with the drive. No alerts, no emails (other than the panic with the hung vdev). Is FreeNAS not picking up on something it should?
 

rvassar

Guru
Joined
May 2, 2018
Messages
972
It's situations like this when I wish I could run "uptime" on the drive controller.

I suspect a portion of the controller on the drive is hung, and/or the platter spindle has stopped, and the self-test is actually halted. When the drive works, there's nothing to report. When it hangs, the device driver doesn't return from the I/O call, and other kernel threads are blocked from access. That then causes a panic, either via a kernel timer or via a memory mapping call that fails for want of a swap device, etc. One way to troubleshoot it would be to move your boot pool to one SATA controller and the questionable device to a completely separate one (à la a PCIe HBA), and then make sure there's no swap configured on the questionable device. This would minimize the odds of the panic happening before ZFS gets suspicious and fails the drive.
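To make the "I/O call doesn't return" part concrete, here is a rough userland sketch of how you might notice a read that never comes back. It is only an illustration, not something FreeNAS ships: `timed_call` and `read_first_block` are hypothetical helper names, and the 30-second timeout in the usage note is an arbitrary assumption. It also can't un-stick anything, since a thread blocked inside the kernel can't be cancelled from Python, and it won't prevent a panic.

```python
import threading
import time
from typing import Callable, Optional


def timed_call(fn: Callable[[], object], timeout_s: float) -> Optional[float]:
    """Run fn in a daemon thread.

    Returns the elapsed seconds if fn finished within timeout_s,
    or None if it was still running when the timeout expired.
    Hypothetical helper for illustration only.
    """
    done = threading.Event()
    result = {}

    def worker() -> None:
        start = time.monotonic()
        try:
            fn()
        except Exception as exc:  # remember the error for the caller
            result["error"] = exc
        result["elapsed"] = time.monotonic() - start
        done.set()

    # daemon=True so a stuck worker does not keep the interpreter alive
    threading.Thread(target=worker, daemon=True).start()

    if not done.wait(timeout_s):
        return None  # the call has not returned: possible hung I/O
    if "error" in result:
        raise result["error"]
    return result["elapsed"]


def read_first_block(path: str, size: int = 4096) -> bytes:
    """Read a few KiB from the start of a device or file, unbuffered."""
    with open(path, "rb", buffering=0) as dev:
        return dev.read(size)
```

Run as root, something like `timed_call(lambda: read_first_block("/dev/ada2"), 30.0)` would return a latency in seconds on a healthy drive and `None` if the read is still outstanding after 30 seconds.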

Disclaimer: I do stuff like this all the time, but I'm an engineer / scientist. If you don't feel comfortable, please don't try this. One old-school way of detecting this is to get a regular-size screwdriver with a large plastic handle. Place the tip on an exposed portion of the disk case / body where there's no exposed circuitry (Live electronics! Safety first!), then press the end of the screwdriver handle against your head just behind your jawbone, next to your ear canal, so that sound conducts through your skull. You should be able to tell from the difference in sound between drives if the suspect one isn't spinning, or is in an endless head-step calibration loop, etc.
 

Richard Durso

The replacement drive is on its way; it will get properly tested first. In the meantime I marked the device "offline".

It also looks like the test completed, got this email:
New alerts:
* Device: /dev/ada2, Self-Test Log error count increased from 0 to 1

# smartctl -l selftest /dev/ada2
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.2-STABLE amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure        70%            15447  3129280168
# 2  Extended offline    Completed: read failure        10%            15430  875268464
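
For anyone who wants to check this log from a script rather than wait for the alert email, here is a minimal sketch that pulls the failed self-tests out of the text printed by `smartctl -l selftest`. The names `SelfTest`, `parse_selftest_log` and `failed_tests` are made up for illustration, and the regex assumes the test description is always two words (as in "Short offline" and "Extended offline"), which may not hold for every drive or smartctl version.

```python
import re
from typing import List, NamedTuple, Optional


class SelfTest(NamedTuple):
    num: int
    description: str
    status: str
    remaining_pct: int
    lifetime_hours: int
    first_error_lba: Optional[int]  # None when smartctl prints "-"


# Assumption: description is exactly two words separated by one space.
ROW = re.compile(
    r"^#\s*(\d+)\s+(\S+ \S+)\s+(.+?)\s+(\d+)%\s+(\d+)\s+(\S+)\s*$"
)

# The two log entries posted above.
SAMPLE = """\
# 1 Short offline Completed: read failure 70% 15447 3129280168
# 2 Extended offline Completed: read failure 10% 15430 875268464
"""


def parse_selftest_log(text: str) -> List[SelfTest]:
    """Parse self-test rows; lines that are not rows are skipped."""
    tests = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if not m:
            continue
        num, desc, status, remaining, hours, lba = m.groups()
        tests.append(SelfTest(
            int(num), desc, status, int(remaining), int(hours),
            int(lba) if lba.isdigit() else None,
        ))
    return tests


def failed_tests(tests: List[SelfTest]) -> List[SelfTest]:
    """Keep only the entries whose status mentions a failure."""
    return [t for t in tests if "failure" in t.status.lower()]


if __name__ == "__main__":
    for t in failed_tests(parse_selftest_log(SAMPLE)):
        print(t.description, t.status, "first bad LBA:", t.first_error_lba)
```

In practice the text would come from running the smartctl command itself; the sample string is only there so the sketch runs on its own.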

I appreciate everyone's help.
 

Snow

Patron
Joined
Aug 1, 2014
Messages
309
Man, I knew Newegg was having trouble competing with Amazon, but I did not think it had gotten this bad. You dodged a bullet on that one! Just one thing to add: take a look at my sig and, if you have the time, read Uncle Fester's FreeNAS Beginner's Guide under the link. There is so much good info in there, and it will help you keep your system in tip-top shape so you can miss the next bullet :P. The link also has tons of other info on everything FreeNAS.

The Guide
 