SOLVED: Help understanding a S.M.A.R.T. report

iliak

Contributor
Joined
Dec 18, 2018
Messages
148
I have a pool of 8 SSDs, and I am having trouble understanding this report, because the server still shows the pool as HEALTHY.
On the one hand the health status is OK, but on the other hand some self-test segments have failed.

Drive model info (800 GB):
Code:
 smartctl -a /dev/da2
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.2-STABLE amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               SmrtStor
Product:              HXA2D20800GA6001
Revision:             KTK0
Compliance:           SPC-4
User Capacity:        800,166,076,416 bytes [800 GB]
Logical block size:   512 bytes
Physical block size:  4096 bytes
LU is resource provisioned, LBPRZ=1
Rotation Rate:        Solid State Device
Form Factor:          2.5 inches
Logical Unit id:      0x5000516100131bf0
Serial number:        FG001DRL
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Mon Feb  4 22:25:56 2019 PST
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Enabled

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Percentage used endurance indicator: 0%
Current Drive Temperature:     34 C
Drive Trip Temperature:        70 C

Manufactured in week 25 of year 2012
Specified cycle count over device lifetime:  0
Accumulated start-stop cycles:  91
Elements in grown defect list: 0

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0        0         0         0          0      14763.456           0
write:         0        0         0         0          0      19174.361           0

Non-medium error count:       54

SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Default           Completed, segment failed   -    1878                 1 [0x4 0x3e 0x3]
# 2  Default           Completed, segment failed   -    1878                 1 [0x4 0x3e 0x3]
# 3  Default           Completed, segment failed   -    1878                 1 [0x4 0x3e 0x3]
# 4  Default           Completed, segment failed   -    1878                 1 [0x4 0x3e 0x3]
# 5  Default           Completed, segment failed   -    1878                 1 [0x4 0x3e 0x3]
# 6  Default           Completed, segment failed   -    1878                 1 [0x4 0x3e 0x3]
# 7  Default           Completed, segment failed   -    1877                 1 [0x4 0x3e 0x3]
# 8  Default           Completed, segment failed   -    1877                 1 [0x4 0x3e 0x3]
# 9  Default           Completed, segment failed   -    1877                 1 [0x4 0x3e 0x3]
#10  Default           Completed, segment failed   -    1877                 1 [0x4 0x3e 0x3]
#11  Default           Completed, segment failed   -    1877                 1 [0x4 0x3e 0x3]
#12  Default           Completed, segment failed   -    1877                 1 [0x4 0x3e 0x3]
#13  Default           Completed, segment failed   -    1877                 1 [0x4 0x3e 0x3]
#14  Default           Completed, segment failed   -    1877                 1 [0x4 0x3e 0x3]
#15  Default           Completed, segment failed   -    1877                 1 [0x4 0x3e 0x3]
#16  Default           Completed, segment failed   -    1877                 1 [0x4 0x3e 0x3]
#17  Default           Completed, segment failed   -    1877                 1 [0x4 0x3e 0x3]
#18  Default           Completed, segment failed   -    1877                 1 [0x4 0x3e 0x3]
#19  Default           Completed, segment failed   -    1876                 1 [0x4 0x3e 0x3]
#20  Default           Completed, segment failed   -    1876                 1 [0x4 0x3e 0x3]

Long (extended) Self Test duration: 2880 seconds [48.0 minutes]
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
HXA2D20800GA6001
I am not familiar with that type of drive, but the error tends to indicate that it needs to be replaced. If it is under warranty, definitely get it changed out by the manufacturer.
I am having trouble understanding this report, because the server still shows the pool as HEALTHY
You can have a drive fault while the pool remains healthy, because that is what ZFS is designed to do. The pool will only be degraded if a drive is fully offline or is returning data errors. The self-test failures in this report indicate that the drive has lost some capability, but it is probably not returning any data errors yet; the error counter log in the report still shows zero uncorrected read and write errors.
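For reference, the bracketed triple at the end of each self-test row is SCSI sense data. In the SPC tables, sense key 0x4 is HARDWARE ERROR and ASC/ASCQ 0x3e/0x03 is LOGICAL UNIT FAILED SELF-TEST, which is consistent with a drive-level fault that ZFS has not yet seen as a data error. Below is a minimal Python sketch for pulling those rows out of saved `smartctl -a` output. The function names and the two small lookup tables are illustrative only (they are not part of smartmontools), and the row pattern is written against the SAS-style log shown in this thread, so other drives may need a different pattern.

```python
import re

# Meanings taken from the SCSI Primary Commands (SPC) sense tables.
# Only the codes that appear in this thread are listed; this is not a full table.
SENSE_KEYS = {0x4: "HARDWARE ERROR"}
ASC_ASCQ = {(0x3E, 0x03): "LOGICAL UNIT FAILED SELF-TEST"}

# One row of the SAS self-test log, for example:
# "# 1  Default  Completed, segment failed   -    1878    1 [0x4 0x3e 0x3]"
ROW = re.compile(
    r"^#\s*(?P<num>\d+)\s+(?P<test>\S+)\s+(?P<status>.+?)\s+"
    r"(?P<segment>\S+)\s+(?P<hours>\d+)\s+(?P<lba>\S+)\s+"
    r"\[(?P<sk>0x[0-9a-fA-F]+) (?P<asc>0x[0-9a-fA-F]+) (?P<ascq>0x[0-9a-fA-F]+)\]\s*$"
)


def parse_self_test_log(text):
    """Return one dict per self-test row found in smartctl output."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m is None:
            continue  # header lines, blank lines, other report sections
        rows.append({
            "num": int(m["num"]),
            "test": m["test"],
            "status": m["status"],
            "hours": int(m["hours"]),
            "sk": int(m["sk"], 16),
            "asc": int(m["asc"], 16),
            "ascq": int(m["ascq"], 16),
        })
    return rows


def describe(row):
    """Render one parsed row with the sense codes spelled out where known."""
    key = SENSE_KEYS.get(row["sk"], "unknown sense key")
    detail = ASC_ASCQ.get((row["asc"], row["ascq"]), "unknown ASC/ASCQ")
    return f"test #{row['num']} at {row['hours']} h: {row['status']} ({key}: {detail})"


def failed_tests(rows):
    """Keep only the rows whose status text reports a failure."""
    return [r for r in rows if "failed" in r["status"].lower()]
```

Feeding the report from the first post through `parse_self_test_log` and then `failed_tests` should return all 20 rows, which is the part of the report that `SMART Health Status: OK` does not reflect.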
 

iliak

Contributor
Joined
Dec 18, 2018
Messages
148
I am not familiar with that type of drive but the error tends to indicate that it needs to be replaced.
Here is a link for the model

OK, I'll replace it. Just to check that I am doing it right (since this is my first time):
1. Deactivate (offline) the drive
2. Replace it
3. Add the drive back to the pool?
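
The three steps above can be sketched as the underlying ZFS commands. This is only a rough outline under some assumptions: the pool name `tank` and the helper `replacement_plan` are hypothetical, the device names are placeholders, and step 3 is taken to mean `zpool replace` (not `zpool add`, which would attach the disk as a new top-level vdev instead of swapping it in). On FreeNAS the usual route is the Offline and Replace buttons in the web UI, which also take care of partitioning and labels, so treat this as a description of what happens rather than a recipe. The helper only builds the command strings and does not run anything.

```python
def replacement_plan(pool, old_dev, new_dev):
    """Return the commands for swapping one disk, in order, without executing them."""
    return [
        # Confirm the current state and identify the failing member first.
        f"zpool status {pool}",
        # Step 1: take the failing disk out of service.
        f"zpool offline {pool} {old_dev}",
        # Step 2 happens here: physically swap the disk.
        # Step 3: tell ZFS to rebuild (resilver) onto the new disk.
        f"zpool replace {pool} {old_dev} {new_dev}",
        # Watch the resilver until the pool reports healthy again.
        f"zpool status {pool}",
    ]


if __name__ == "__main__":
    # "tank" and "da8" are made-up values; da2 is the device from the report.
    for command in replacement_plan("tank", "da2", "da8"):
        print(command)
```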

Update: the drive has been replaced with a new one and the error is gone. Thanks.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
I appreciate the link.
Happy it's working now.
If the drive is under warranty, I would send it back for replacement.
 