Our FreeNAS is having trouble

xmbillion

Dabbler
Joined
Dec 16, 2018
Messages
16
Hi,
Please take a look at the attached image.
Earlier we found that FreeNAS was unusable. When we checked, the HBA card was bad, and the system OS disk was bad as well.
We have now replaced both the HBA card and the OS disk.
But reads and writes are very slow.
And there may be an error:
July 9, 2019 19:47 - Volume CLOUDDATA-112 state is ONLINE: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected.
So does this mean a disk is bad?
 

Attachments

  • 微信截图_20190710221008.png
    微信截图_20190710221008.png
    19.2 KB · Views: 207

xmbillion

Dabbler
Joined
Dec 16, 2018
Messages
16
Code:
pool: CLOUDDATA-112                                                                                                               
 state: ONLINE                                                                                                                     
status: One or more devices has experienced an unrecoverable error.  An                                                             
        attempt was made to correct the error.  Applications are unaffected.                                                       
action: Determine if the device needs to be replaced, and clear the errors                                                         
        using 'zpool clear' or replace the device with 'zpool replace'.                                                             
   see: http://illumos.org/msg/ZFS-8000-9P                                                                                         
  scan: scrub repaired 0 in 1 days 09:01:36 with 0 errors on Tue Jun 18 00:01:39 2019                                               
config:                                                                                                                             
                                                                                                                                    
        NAME                                            STATE     READ WRITE CKSUM                                                 
        CLOUDDATA-112                                   ONLINE       0     0     0                                                 
          raidz1-0                                      ONLINE       0     0     0                                                 
            gptid/720bdafe-8724-11e8-918a-ecf4bbc6db00  ONLINE       0     0    10                                                 
            gptid/72e7f823-8724-11e8-918a-ecf4bbc6db00  ONLINE       0     0    10                                                 
            gptid/74fcf83b-8724-11e8-918a-ecf4bbc6db00  ONLINE       0     0    24                                                 
            gptid/76ebec83-8724-11e8-918a-ecf4bbc6db00  ONLINE       0     0    14                                                 
          raidz1-1                                      ONLINE       0     0     0                                                 
            gptid/79549b06-8724-11e8-918a-ecf4bbc6db00  ONLINE       0     0    16                                                 
            gptid/7a33f945-8724-11e8-918a-ecf4bbc6db00  ONLINE       0     0    29                                                 
            gptid/7b111e0e-8724-11e8-918a-ecf4bbc6db00  ONLINE       0     0    22                                                 
            gptid/7d3232fb-8724-11e8-918a-ecf4bbc6db00  ONLINE       0     0    32                                                 
        cache                                                                                                                       
          gptid/b04cfbcf-a21d-11e9-81dd-ecf4bbc6db00    ONLINE       0     0     0                                                 
                                                                                                                                    
errors: No known data errors                                                                                                       
                                                                                                                                    
  pool: freenas-boot                                                                                                               
 state: ONLINE                                                                                                                     
  scan: none requested                                                                                                             
config:                                                                                                                             
                                                                                                                                    
        NAME        STATE     READ WRITE CKSUM                                                                                     
        freenas-boot  ONLINE       0     0     0                                                                                   
          da0p2     ONLINE       0     0     0                                                                                     
                                                                                                                                    
errors: No known data errors            
 

melloa

Wizard
Joined
May 22, 2016
Messages
1,749
I don't see an error on your pool in the picture above.

Try running
Code:
smartctl -t long /dev/daX 
for each disk and post results.

But reads and writes are very slow.

Could be a bad disk or you are out of space, so also post results for
Code:
zpool list
 
Joined
Oct 18, 2018
Messages
969
Hi @xmbillion. You'll want to run long SMART tests with smartctl -t long /dev/<device> if you haven't already done so, and after that provide the output of smartctl -a /dev/<device> for each of those drives. Specifically, you're looking for lines like the following, which show whether the drives are failing or experiencing issues.

Code:
5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
...
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
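
To pull just those four rows out of a full report, a small filter along these lines might help. This is only a sketch: `smart_key_attrs` is a made-up helper name, and it assumes the ATA-style attribute table shown above, where the attribute ID is the first column and the raw value is the tenth. On drives that do not print this table (SAS drives, for example, report a different layout) it would simply print nothing.

```shell
# Hypothetical helper: keep only the four attribute rows named above.
# Reads `smartctl -a` output on stdin and prints: ID, attribute name, raw value.
# Assumes the 10-column ATA attribute table layout shown in the post.
smart_key_attrs() {
  awk 'NF >= 10 && ($1 == 5 || $1 == 197 || $1 == 198 || $1 == 199) { print $1, $2, $10 }'
}

# Usage (the device name is a placeholder):
#   smartctl -a /dev/da1 | smart_key_attrs
```

A non-zero raw value in any of these rows is the thing to report back in the thread.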
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
Your pool has a bunch of checksum errors, and it looks like you might have multiple disks failing. You didn't set up SMART tests, or you would have gotten emails about this. Go set up emails and SMART tests, then run smartctl -t long /dev/daX for each disk. Also post the output of smartctl -a /dev/daX for each drive. You should back up your data and get ready to replace drives.
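
The checksum errors mentioned here are the CKSUM column of the `zpool status` output posted earlier. As a rough sketch, the members with a non-zero count could be listed like this. `cksum_errors` is a hypothetical helper name, and the filter assumes the five-column device rows (NAME STATE READ WRITE CKSUM) with plain integer counts, as in that output.

```shell
# Hypothetical helper: list pool members whose CKSUM column is non-zero.
# Reads `zpool status` output on stdin. Only rows with exactly five fields
# and numeric READ/WRITE/CKSUM values are treated as device rows.
cksum_errors() {
  awk 'NF == 5 && $3 ~ /^[0-9]+$/ && $4 ~ /^[0-9]+$/ && $5 ~ /^[0-9]+$/ && $5 + 0 > 0 { print $1, $5 }'
}

# Usage:
#   zpool status CLOUDDATA-112 | cksum_errors
```

Running it again after some time shows whether the counts are still growing.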
 

xmbillion

Dabbler
Joined
Dec 16, 2018
Messages
16
Code:
[root@freenas ~]# smartctl -t long /dev/da1                                                                                         
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.1-STABLE amd64] (local build)                                                             
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org                                                         
                                                                                                                                    
Extended Background Self Test has begun                                                                                             
Please wait 529 minutes for test to complete.                                                                                       
Estimated completion time: Thu Jul 11 08:17:27 2019                                                                                 
                                                                                                                                    
Use smartctl -X to abort test

Code:
NAME            SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT                                                 
CLOUDDATA-112  43.5T  7.61T  35.9T         -    56%    17%  1.00x  ONLINE  /mnt                                                     
freenas-boot    464G   757M   463G         -      -     0%  1.00x  ONLINE  - 
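
For the "out of space" question, the CAP column above is the one to read: 17% for CLOUDDATA-112. A minimal sketch of checking it automatically follows. `pools_over_cap` is a hypothetical helper, the default limit of 80 is an arbitrary choice, and the filter assumes the column order shown above, where CAP is the seventh column.

```shell
# Hypothetical helper: print pools whose CAP column is at or above a limit.
# Reads `zpool list` output on stdin; the first line is treated as the header.
# The limit is the first argument (default 80, an arbitrary choice).
pools_over_cap() {
  awk -v limit="${1:-80}" 'NR > 1 { cap = $7; sub(/%/, "", cap); if (cap + 0 >= limit + 0) print $1, $7 }'
}

# Usage:
#   zpool list | pools_over_cap 80
```

With the output above and a limit of 80, nothing is printed, so the pool is not short of space.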
 

xmbillion

Dabbler
Joined
Dec 16, 2018
Messages
16
Oh, we already have this running:
smartctl -t long /dev/da1
 

melloa

Wizard
Joined
May 22, 2016
Messages
1,749
Oh, we already have this running:
smartctl -t long /dev/da1

529 minutes or ~9 hours.

Also get some drives burning in to be ready to replace the failing ones ... how many? OGK.

Go setup emails

Also don't forget this.

As there are many hands helping with this, I'm stepping out. Good luck :cool:
 

xmbillion

Dabbler
Joined
Dec 16, 2018
Messages
16
Please look at our first image. It looks to me like all of the disks may have trouble. Maybe the issue was just the bad HBA card.

About email:
Our FreeNAS system is on the local network only.
So to set up email, does it need to be able to send email out to the internet?
 

Attachments

  • 微信截图_20190711002254.png
    微信截图_20190711002254.png
    74.1 KB · Views: 209
  • 微信截图_20190711002305.png
    微信截图_20190711002305.png
    66 KB · Views: 222
Joined
Oct 18, 2018
Messages
969
Our FreeNAS system is on the local network only.
So to set up email, does it need to be able to send email out to the internet?
Yes, check out the User Guide for your version.

You may also want to look at the user guide section on SMART tests.

Oh, we already have this running:
smartctl -t long /dev/da1
Start this for all of your drives, not just one. The tests can run in parallel, just keep using smartctl -t long /dev/{device}
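
Starting the test on every drive could be scripted roughly as follows. This is a sketch only: `start_long_tests` and the `SMARTCTL` override are made up for illustration, and the device names in the usage line are placeholders that need to match the drives actually in the pool.

```shell
# Hypothetical helper: start a long self-test on every device named.
# Each smartctl call returns right away; the test itself runs inside the drive,
# which is why the drives can be tested in parallel.
# SMARTCTL may be set to another command for a dry run.
start_long_tests() {
  for dev in "$@"; do
    "${SMARTCTL:-smartctl}" -t long "/dev/${dev}"
  done
}

# Usage (device names are placeholders):
#   start_long_tests da1 da2 da3
```

Once the estimated completion time has passed, smartctl -a on each device shows the result in the self-test log.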
 

xmbillion

Dabbler
Joined
Dec 16, 2018
Messages
16
Code:
Transport protocol:   SAS (SPL-3)                                                                                                   
Local Time is:        Thu Jul 11 20:18:37 2019 HKT                                                                                 
SMART support is:     Available - device has SMART capability.                                                                     
SMART support is:     Enabled                                                                                                       
Temperature Warning:  Enabled                                                                                                       
                                                                                                                                    
=== START OF READ SMART DATA SECTION ===                                                                                           
SMART Health Status: OK                                                                                                             
                                                                                                                                    
Current Drive Temperature:     32 C                                                                                                 
Drive Trip Temperature:        60 C                                                                                                 
                                                                                                                                    
Manufactured in week 01 of year 2018                                                                                               
Specified cycle count over device lifetime:  10000                                                                                 
Accumulated start-stop cycles:  119                                                                                                 
Specified load-unload count over device lifetime:  300000                                                                           
Accumulated load-unload cycles:  499                                                                                               
Elements in grown defect list: 0                                                                                                   
                                                                                                                                    
Vendor (Seagate) cache information                                                                                                 
  Blocks sent to initiator = 1212365856                                                                                             
  Blocks received from initiator = 3182164856                                                                                       
  Blocks read from cache and sent to initiator = 16049530                                                                           
  Number of read and write commands whose size <= segment size = 7612597                                                           
  Number of read and write commands whose size > segment size = 496608                                                             
                                                                                                                                    
Vendor (Seagate/Hitachi) factory information                                                                                       
  number of hours powered up = 9175.47                                                                                             
  number of minutes until next internal SMART test = 10                                                                             
                                                                                                                                    
Error counter log:                                                                                                                 
           Errors Corrected by           Total   Correction     Gigabytes    Total                                                 
               ECC          rereads/    errors   algorithm      processed    uncorrected                                           
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors                                                 
read:   301396762        0         0  301396762          0        620.731           0                                               
write:         0        0         0         0          0       3832.644           0                                                 
                                                                                                                                    
Non-medium error count:       32                                                                                                   
                                                                                                                                    
                                                                                                                                    
[GLTSD (Global Logging Target Save Disable) set. Enable Save with '-S on']                                                         
Self-test execution status:             0% of test remaining                                                                       
SMART Self-test log                                                                                                                 
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]                                         
     Description                              number   (hours)                                                                     
# 1  Background long   Self test in progress ...   -     NOW                 - [-   -    -]                                         
                                                                                                                                    
Long (extended) Self Test duration: 31744 seconds [529.1 minutes]
 
Joined
Oct 18, 2018
Messages
969
Hi, please send the output of smartctl -a /dev/{device}. The above looks like just a summary. It is also helpful if you specify exactly what command you entered that gave you the output. :)
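
Since the report posted above is in the SAS layout, a short filter could make the per-drive output easier to compare. The sketch below keeps only a few lines whose labels are copied from that report. `sas_key_fields` is a hypothetical helper, and which fields are worth watching is an assumption here, not something the thread has settled.

```shell
# Hypothetical helper: keep a few summary lines from SAS-style smartctl output.
# Reads `smartctl -a` output on stdin. The labels are taken verbatim from the
# report posted above; the choice of fields is an assumption.
sas_key_fields() {
  grep -E 'SMART Health Status|Elements in grown defect list|Non-medium error count|^(read|write):'
}

# Usage (the device name is a placeholder):
#   smartctl -a /dev/da1 | sas_key_fields
```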
 

xmbillion

Dabbler
Joined
Dec 16, 2018
Messages
16
Code:
[root@freenas ~]# zpool status                                                                                                     
  pool: CLOUDDATA-112                                                                                                               
 state: ONLINE                                                                                                                     
status: One or more devices has experienced an unrecoverable error.  An                                                             
        attempt was made to correct the error.  Applications are unaffected.                                                       
action: Determine if the device needs to be replaced, and clear the errors                                                         
        using 'zpool clear' or replace the device with 'zpool replace'.                                                             
   see: http://illumos.org/msg/ZFS-8000-9P                                                                                         
  scan: scrub repaired 0 in 1 days 09:01:36 with 0 errors on Tue Jun 18 00:01:39 2019                                               
config:                                                                                                                             
                                                                                                                                    
        NAME                                            STATE     READ WRITE CKSUM                                                 
        CLOUDDATA-112                                   ONLINE       0     0     0                                                 
          raidz1-0                                      ONLINE       0     0     0                                                 
            gptid/720bdafe-8724-11e8-918a-ecf4bbc6db00  ONLINE       0     0    13                                                 
            gptid/72e7f823-8724-11e8-918a-ecf4bbc6db00  ONLINE       0     0    11                                                 
            gptid/74fcf83b-8724-11e8-918a-ecf4bbc6db00  ONLINE       0     0    25                                                 
            gptid/76ebec83-8724-11e8-918a-ecf4bbc6db00  ONLINE       0     0    16                                                 
          raidz1-1                                      ONLINE       0     0     0                                                 
            gptid/79549b06-8724-11e8-918a-ecf4bbc6db00  ONLINE       0     0    17                                                 
            gptid/7a33f945-8724-11e8-918a-ecf4bbc6db00  ONLINE       0     0    35                                                 
            gptid/7b111e0e-8724-11e8-918a-ecf4bbc6db00  ONLINE       0     0    24                                                 
            gptid/7d3232fb-8724-11e8-918a-ecf4bbc6db00  ONLINE       0     0    34                                                 
        cache                                                                                                                       
          gptid/b04cfbcf-a21d-11e9-81dd-ecf4bbc6db00    ONLINE       0     0     0                                                 
                                                                                                                                    
errors: No known data errors                                                                                                       
                                                                                                                                    
  pool: freenas-boot                                                                                                               
 state: ONLINE                                                                                                                     
  scan: none requested                                                                                                             
config:                                                                                                                             
                                                                                                                                    
        NAME        STATE     READ WRITE CKSUM                                                                                     
        freenas-boot  ONLINE       0     0     0                                                                                   
          da0p2     ONLINE       0     0     0                                                                                     
                                                                                                                                    
errors: No known data errors
 

xmbillion

Dabbler
Joined
Dec 16, 2018
Messages
16
Hi, please send the output of smartctl -a /dev/{device}. The above looks like just a summary. It is also helpful if you specify exactly what command you entered that gave you the output. :)
So far we have only run the test on one disk; the tests on the other disks have not completed yet.
 
Joined
Oct 18, 2018
Messages
969
What are the values for
Code:
5 Reallocated_Sector_Ct
197 Current_Pending_Sector
198 Offline_Uncorrectable
199 UDMA_CRC_Error_Count
 

xmbillion

Dabbler
Joined
Dec 16, 2018
Messages
16
Code:
smartctl -a /dev/da2
Transport protocol:   SAS (SPL-3)                                                                                                   
Local Time is:        Fri Jul 12 15:11:17 2019 HKT                                                                                 
SMART support is:     Available - device has SMART capability.                                                                     
SMART support is:     Enabled                                                                                                       
Temperature Warning:  Enabled                                                                                                       
                                                                                                                                    
=== START OF READ SMART DATA SECTION ===                                                                                           
SMART Health Status: OK                                                                                                             
                                                                                                                                    
Current Drive Temperature:     33 C                                                                                                 
Drive Trip Temperature:        60 C                                                                                                 
                                                                                                                                    
Manufactured in week 01 of year 2018                                                                                               
Specified cycle count over device lifetime:  10000                                                                                 
Accumulated start-stop cycles:  116                                                                                                 
Specified load-unload count over device lifetime:  300000                                                                           
Accumulated load-unload cycles:  475                                                                                               
Elements in grown defect list: 0                                                                                                   
                                                                                                                                    
Vendor (Seagate) cache information                                                                                                 
  Blocks sent to initiator = 1544922848                                                                                             
  Blocks received from initiator = 1379678312                                                                                       
  Blocks read from cache and sent to initiator = 14254707                                                                           
  Number of read and write commands whose size <= segment size = 10154820                                                           
  Number of read and write commands whose size > segment size = 592427                                                             
                                                                                                                                    
Vendor (Seagate/Hitachi) factory information                                                                                       
  number of hours powered up = 8717.82                                                                                             
  number of minutes until next internal SMART test = 44                                                                             
                                                                                                                                    
Error counter log:                                                                                                                 
           Errors Corrected by           Total   Correction     Gigabytes    Total                                                 
               ECC          rereads/    errors   algorithm      processed    uncorrected                                           
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors                                                 
read:   386454913        0         0  386454913          0        791.000           0                                               
write:         0        0         0         0          0       5115.914           0                                                 
                                                                                                                                    
Non-medium error count:       13                                                                                                   
                                                                                                                                    
                                                                                                                                    
[GLTSD (Global Logging Target Save Disable) set. Enable Save with '-S on']                                                         
Self-test execution status:             0% of test remaining                                                                       
SMART Self-test log                                                                                                                 
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]                                         
     Description                              number   (hours)                                                                     
# 1  Background long   Self test in progress ...   -     NOW                 - [-   -    -]                                         
                                                                                                                                    
Long (extended) Self Test duration: 32158 seconds [536.0 minutes]

da3
Transport protocol:   SAS (SPL-3)                                                                                                   
Local Time is:        Fri Jul 12 15:12:13 2019 HKT                                                                                 
SMART support is:     Available - device has SMART capability.                                                                     
SMART support is:     Enabled                                                                                                       
Temperature Warning:  Enabled                                                                                                       
                                                                                                                                    
=== START OF READ SMART DATA SECTION ===                                                                                           
SMART Health Status: OK                                                                                                             
                                                                                                                                    
Current Drive Temperature:     34 C                                                                                                 
Drive Trip Temperature:        60 C                                                                                                 
                                                                                                                                    
Manufactured in week 28 of year 2017                                                                                               
Specified cycle count over device lifetime:  10000                                                                                 
Accumulated start-stop cycles:  148                                                                                                 
Specified load-unload count over device lifetime:  300000                                                                           
Accumulated load-unload cycles:  614                                                                                               
Elements in grown defect list: 0                                                                                                   
                                                                                                                                    
Vendor (Seagate) cache information                                                                                                 
  Blocks sent to initiator = 1325587112                                                                                             
  Blocks received from initiator = 1277942440                                                                                       
  Blocks read from cache and sent to initiator = 18499815                                                                           
  Number of read and write commands whose size <= segment size = 11590944                                                           
  Number of read and write commands whose size > segment size = 517869                                                             
                                                                                                                                    
Vendor (Seagate/Hitachi) factory information                                                                                       
  number of hours powered up = 11358.55                                                                                             
  number of minutes until next internal SMART test = 1                                                                             
                                                                                                                                    
Error counter log:                                                                                                                 
           Errors Corrected by           Total   Correction     Gigabytes    Total                                                 
               ECC          rereads/    errors   algorithm      processed    uncorrected                                           
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors                                                 
read:   345409179        0         0  345409179          0        678.701           0                                               
write:         0        0         0         0          0       5056.825           0                                                 
                                                                                                                                    
Non-medium error count:       41                                                                                                   
                                                                                                                                    
                                                                                                                                    
[GLTSD (Global Logging Target Save Disable) set. Enable Save with '-S on']                                                         
Self-test execution status:             0% of test remaining                                                                       
SMART Self-test log                                                                                                                 
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]                                         
     Description                              number   (hours)                                                                     
# 1  Background long   Self test in progress ...   -     NOW                 - [-   -    -]                                         
                                                                                                                                    
Long (extended) Self Test duration: 33451 seconds [557.5 minutes]

da4
Transport protocol:   SAS (SPL-3)                                                                                                   
Local Time is:        Fri Jul 12 15:12:35 2019 HKT                                                                                 
SMART support is:     Available - device has SMART capability.                                                                     
SMART support is:     Enabled                                                                                                       
Temperature Warning:  Enabled                                                                                                       
                                                                                                                                    
=== START OF READ SMART DATA SECTION ===                                                                                           
SMART Health Status: OK                                                                                                             
                                                                                                                                    
Current Drive Temperature:     33 C                                                                                                 
Drive Trip Temperature:        60 C                                                                                                 
                                                                                                                                    
Manufactured in week 28 of year 2017                                                                                               
Specified cycle count over device lifetime:  10000                                                                                 
Accumulated start-stop cycles:  139                                                                                                 
Specified load-unload count over device lifetime:  300000                                                                           
Accumulated load-unload cycles:  626                                                                                               
Elements in grown defect list: 0                                                                                                   
                                                                                                                                    
Vendor (Seagate) cache information                                                                                                 
  Blocks sent to initiator = 1284404944                                                                                             
  Blocks received from initiator = 971605112                                                                                       
  Blocks read from cache and sent to initiator = 13730224                                                                           
  Number of read and write commands whose size <= segment size = 10945692                                                           
  Number of read and write commands whose size > segment size = 480231                                                             
                                                                                                                                    
Vendor (Seagate/Hitachi) factory information                                                                                       
  number of hours powered up = 11435.80                                                                                             
  number of minutes until next internal SMART test = 44                                                                             
                                                                                                                                    
Error counter log:                                                                                                                 
           Errors Corrected by           Total   Correction     Gigabytes    Total                                                 
               ECC          rereads/    errors   algorithm      processed    uncorrected                                           
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors                                                 
read:   326889410        0         0  326889410          0        657.615           0                                               
write:         0        0         0         0          0       4901.442           0                                                 
                                                                                                                                    
Non-medium error count:       18                                                                                                   
                                                                                                                                    
                                                                                                                                    
[GLTSD (Global Logging Target Save Disable) set. Enable Save with '-S on']                                                         
Self-test execution status:             0% of test remaining                                                                       
SMART Self-test log                                                                                                                 
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]                                         
     Description                              number   (hours)                                                                     
# 1  Background long   Self test in progress ...   -     NOW                 - [-   -    -]                                         
                                                                                                                                    
Long (extended) Self Test duration: 34230 seconds [570.5 minutes]

da5
Transport protocol:   SAS (SPL-3)                                                                                                   
Local Time is:        Fri Jul 12 15:12:56 2019 HKT                                                                                 
SMART support is:     Available - device has SMART capability.                                                                     
SMART support is:     Enabled                                                                                                       
Temperature Warning:  Enabled                                                                                                       
                                                                                                                                    
=== START OF READ SMART DATA SECTION ===                                                                                           
SMART Health Status: OK                                                                                                             
                                                                                                                                    
Current Drive Temperature:     32 C                                                                                                 
Drive Trip Temperature:        60 C                                                                                                 
                                                                                                                                    
Manufactured in week 28 of year 2017                                                                                               
Specified cycle count over device lifetime:  10000                                                                                 
Accumulated start-stop cycles:  139                                                                                                 
Specified load-unload count over device lifetime:  300000                                                                           
Accumulated load-unload cycles:  616                                                                                               
Elements in grown defect list: 0                                                                                                   
                                                                                                                                    
Vendor (Seagate) cache information                                                                                                 
  Blocks sent to initiator = 1333510552                                                                                             
  Blocks received from initiator = 652819440                                                                                       
  Blocks read from cache and sent to initiator = 18808570                                                                           
  Number of read and write commands whose size <= segment size = 11423975                                                           
  Number of read and write commands whose size > segment size = 453473                                                             
                                                                                                                                    
Vendor (Seagate/Hitachi) factory information                                                                                       
  number of hours powered up = 11672.70                                                                                             
  number of minutes until next internal SMART test = 57                                                                             
                                                                                                                                    
Error counter log:                                                                                                                 
           Errors Corrected by           Total   Correction     Gigabytes    Total                                                 
               ECC          rereads/    errors   algorithm      processed    uncorrected                                           
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors                                                 
read:   345160529        0         0  345160529          0        682.757           0                                               
write:         0        0         0         0          0       4736.355           0                                                 
                                                                                                                                    
Non-medium error count:       25                                                                                                   
                                                                                                                                    
                                                                                                                                    
[GLTSD (Global Logging Target Save Disable) set. Enable Save with '-S on']                                                         
Self-test execution status:             0% of test remaining                                                                       
SMART Self-test log                                                                                                                 
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]                                         
     Description                              number   (hours)                                                                     
# 1  Background long   Self test in progress ...   -     NOW                 - [-   -    -]                                         
                                                                                                                                    
Long (extended) Self Test duration: 34342 seconds [572.4 minutes]

 

xmbillion

What are the values for
Code:
5 Reallocated_Sector_Ct
197 Current_Pending_Sector
198 Offline_Uncorrectable
199 UDMA_CRC_Error_Count
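The four attributes listed above are ATA SMART attribute IDs. The drives in this thread report over SAS, and smartctl does not print an ATA attribute table for SCSI/SAS devices. The closest counterparts in the SAS output (a rough correspondence, not an exact mapping) are "Elements in grown defect list", the "Total uncorrected errors" column of the error counter log, and "Non-medium error count". A minimal Python sketch for pulling those fields out of a pasted report could look like this; the function name `parse_sas_health` and the dictionary keys are hypothetical, and the sample text is copied from one of the drive reports in this thread:

```python
import re

# Sample lines copied from one drive report in this thread.
SAMPLE = """\
Elements in grown defect list: 0
Error counter log:
read:   345409179        0         0  345409179          0        678.701           0
write:         0        0         0         0          0       5056.825           0
Non-medium error count:       41
"""


def parse_sas_health(text):
    """Extract a few health counters from smartctl output for a SAS drive.

    Hypothetical helper: it assumes the field labels and the 7-column
    error counter rows look exactly like the output pasted in this thread.
    """
    result = {}

    match = re.search(r"Elements in grown defect list:\s*(\d+)", text)
    if match:
        result["grown_defects"] = int(match.group(1))

    match = re.search(r"Non-medium error count:\s*(\d+)", text)
    if match:
        result["non_medium_errors"] = int(match.group(1))

    # Each row: fast ECC, delayed ECC, rereads/rewrites, total corrected,
    # algorithm invocations, gigabytes processed, total uncorrected.
    for op in ("read", "write"):
        match = re.search(rf"^{op}:\s+(.*)$", text, re.MULTILINE)
        if match:
            fields = match.group(1).split()
            if len(fields) == 7:
                result[f"{op}_corrected"] = int(fields[3])
                result[f"{op}_uncorrected"] = int(fields[6])

    return result


if __name__ == "__main__":
    for key, value in parse_sas_health(SAMPLE).items():
        print(f"{key}: {value}")
```

A non-zero grown defect list or uncorrected error total is usually the more worrying sign. The large "Errors Corrected by ECC fast" figures on these Seagate drives are corrected reads, not failures.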
Code:
da6
Transport protocol:   SAS (SPL-3)
Local Time is:        Fri Jul 12 15:13:15 2019 HKT
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Enabled

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Current Drive Temperature:     36 C
Drive Trip Temperature:        60 C

Manufactured in week 16 of year 2018
Specified cycle count over device lifetime:  10000
Accumulated start-stop cycles:  258
Specified load-unload count over device lifetime:  300000
Accumulated load-unload cycles:  620
Elements in grown defect list: 0

Vendor (Seagate) cache information
  Blocks sent to initiator = 1560232744
  Blocks received from initiator = 863937448
  Blocks read from cache and sent to initiator = 14910799
  Number of read and write commands whose size <= segment size = 9879633
  Number of read and write commands whose size > segment size = 536416

Vendor (Seagate/Hitachi) factory information
  number of hours powered up = 8776.88
  number of minutes until next internal SMART test = 9

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:   387124578        0         0  387124578          0        798.839           0
write:         0        0         0         0          0       4850.732           0

Non-medium error count:       18


[GLTSD (Global Logging Target Save Disable) set. Enable Save with '-S on']
Self-test execution status:             0% of test remaining
SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background long   Self test in progress ...   -     NOW                 - [-   -    -]

Long (extended) Self Test duration: 31236 seconds [520.6 minutes]

da7
Transport protocol:   SAS (SPL-3)
Local Time is:        Fri Jul 12 15:13:35 2019 HKT
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Enabled

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Current Drive Temperature:     35 C
Drive Trip Temperature:        60 C

Manufactured in week 01 of year 2018
Specified cycle count over device lifetime:  10000
Accumulated start-stop cycles:  116
Specified load-unload count over device lifetime:  300000
Accumulated load-unload cycles:  475
Elements in grown defect list: 0

Vendor (Seagate) cache information
  Blocks sent to initiator = 1633536984
  Blocks received from initiator = 1133474096
  Blocks read from cache and sent to initiator = 18971338
  Number of read and write commands whose size <= segment size = 10445208
  Number of read and write commands whose size > segment size = 574091

Vendor (Seagate/Hitachi) factory information
  number of hours powered up = 8717.92
  number of minutes until next internal SMART test = 59

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:   450476459        0         0  450476459          0        836.371           0
write:         0        0         0         0          0       4985.113           0

Non-medium error count:       26


[GLTSD (Global Logging Target Save Disable) set. Enable Save with '-S on']
Self-test execution status:             0% of test remaining
SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background long   Self test in progress ...   -     NOW                 - [-   -    -]

Long (extended) Self Test duration: 33698 seconds [561.6 minutes]

da8
Transport protocol:   SAS (SPL-3)
Local Time is:        Fri Jul 12 15:13:54 2019 HKT
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Enabled

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Current Drive Temperature:     33 C
Drive Trip Temperature:        60 C

Manufactured in week 28 of year 2017
Specified cycle count over device lifetime:  10000
Accumulated start-stop cycles:  143
Specified load-unload count over device lifetime:  300000
Accumulated load-unload cycles:  610
Elements in grown defect list: 0

Vendor (Seagate) cache information
  Blocks sent to initiator = 1301011736
  Blocks received from initiator = 451182336
  Blocks read from cache and sent to initiator = 14202836
  Number of read and write commands whose size <= segment size = 10687752
  Number of read and write commands whose size > segment size = 438716

Vendor (Seagate/Hitachi) factory information
  number of hours powered up = 11122.28
  number of minutes until next internal SMART test = 59

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors

read:   343821052        0         0  343821052          0        666.118           0                                               

write:         0        0         0         0          0       4635.329           0                                                 

                                                                                                                                    

Non-medium error count:       20                                                                                                   

                                                                                                                                    

                                                                                                                                    

[GLTSD (Global Logging Target Save Disable) set. Enable Save with '-S on']                                                         

Self-test execution status:             0% of test remaining                                                                       

SMART Self-test log                                                                                                                 

Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]                                         

     Description                              number   (hours)                                                                     

# 1  Background long   Self test in progress ...   -     NOW                 - [-   -    -]                                         

                                                                                                                                    

Long (extended) Self Test duration: 34097 seconds [568.3 minutes]
 

Fredda

Guru
Joined
Jul 9, 2019
Messages
608
Hi, please post the full output of `smartctl -a /dev/{device}`. The above looks like just a summary.

It looks like the drive is SAS rather than SATA; SAS SMART data looks that way. It also looks like the vendor information has been cut off, so the first ~10 lines of the output are missing.
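A rough sketch of that check, in case it helps: the Python below guesses whether a saved `smartctl -a` report came from a SAS/SCSI or a SATA/ATA drive by looking for section markers. The function name `guess_transport` and the choice of marker strings are assumptions based on typical smartctl output, not an official interface, so treat the result as a hint only.

```python
def guess_transport(report: str) -> str:
    """Guess 'sas', 'sata', or 'unknown' from the text of a smartctl -a report."""
    # Markers that appear in SCSI/SAS reports such as the one posted above.
    sas_markers = (
        "SMART Health Status:",
        "Error counter log:",
        "Elements in grown defect list:",
    )
    # Markers that typically appear in ATA/SATA reports (assumed, see lead-in).
    sata_markers = (
        "SMART overall-health self-assessment",
        "Vendor Specific SMART Attributes",
    )
    if any(marker in report for marker in sas_markers):
        return "sas"
    if any(marker in report for marker in sata_markers):
        return "sata"
    return "unknown"
```

Feeding it the report posted above should return `"sas"`, because that report contains both `SMART Health Status:` and `Error counter log:`.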
 
Joined
Oct 18, 2018
Messages
969
It looks like the drive is SAS rather than SATA; SAS SMART data looks that way. It also looks like the vendor information has been cut off, so the first ~10 lines of the output are missing.
Ah yeah, thanks. I scanned too quickly.

Non-medium error count: 18
A non-zero Non-medium error count points to possible issues with cables or the controller rather than with the disk media. Check out this forum post or this smartmontools page. Your drives don't show any `Total uncorrected errors`, which is a good sign. Are possibly all of your drives reporting some Non-medium error count? If that does indicate a communication issue, the controller is a more likely culprit than a single cable, since it is something every drive has in common.
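To compare the drives without reading every report by hand, something like the Python sketch below could pull the two numbers discussed here out of saved `smartctl -a` text. The helper names `summarize` and `all_drives_affected` are made up for illustration, and the parsing assumes the SAS report layout shown earlier in this thread (an `Error counter log` whose last column is the total uncorrected errors). It is a heuristic, not a diagnosis.

```python
import re


def summarize(report: str) -> dict:
    """Extract the non-medium error count and the summed uncorrected errors."""
    result = {"non_medium": None, "uncorrected": 0}
    match = re.search(r"Non-medium error count:\s+(\d+)", report)
    if match:
        result["non_medium"] = int(match.group(1))
    # Error counter log rows start with read:/write:/verify:.
    # The last column of each row is "Total uncorrected errors".
    for line in report.splitlines():
        row = re.match(r"\s*(read|write|verify):\s+(.*)", line)
        if row:
            result["uncorrected"] += int(row.group(2).split()[-1])
    return result


def all_drives_affected(reports: dict) -> bool:
    """True if every report shows a non-zero non-medium error count.

    `reports` maps a device name to the text of its smartctl report.
    """
    counts = [summarize(text)["non_medium"] for text in reports.values()]
    return bool(counts) and all(c is not None and c > 0 for c in counts)
```

If `all_drives_affected` returns `True` for every disk in the pool, that fits the idea of a shared component such as the controller being at fault, but it does not prove it.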

Perhaps someone else has better ideas though.
 