83n
Cadet
- Joined
- Sep 14, 2020
- Messages
- 3
Hey, I am hoping I can get some assistance or advice on some errors I am getting while scrubbing my main pool. My setup is as follows:
Server:
Model: IBM System x3550 M4 Server (7914-32M)
CPU: 2x Intel Xeon CPU E5-2643 0 (3.30GHz 4 Core, 8 Thread)
Memory: 384 GiB (24x Hynix 16 GB 2Rx4 PC3L 10600R)
RAID: ServeRAID M5110 (2x RAID 1 volumes, 1 for OS and 1 for Jails)
NIC: 4x Intel I350 Gigabit (Onboard), 1x Emulex OneConnect 10Gb NIC (Dual port)
HBA: LSI SAS 9201-16e (Flashed IT)
Storage :
Enclosures: 2x NetApp DS4243 (1 IOM6 in each enclosure)
Disks: 48x Seagate ST33000650SS 3TB 7.2K RPM (512b sectors, reformatted from 528b sectors)
FreeNAS:
OS Version: FreeNAS-11.3-U4.1
Pool Configuration:
HEALTHY: (55%) Used / 37.09 TiB Free (8x 6 disk RAIDZ2)
Name | Type | Used | Available | Compression | Compression Ratio | Readonly | Dedup | Comments
---|---|---|---|---|---|---|---|---
Data | dataset | 46.79 TiB | 37.09 TiB | lz4 | 1.00x | false | off | |
HEALTHY: (5%) Used / 506.94 GiB Free (Simple pool of a hardware RAID 1 volume)
Name | Type | Used | Available | Compression | Compression Ratio | Readonly | Dedup | Comments
---|---|---|---|---|---|---|---|---
Services | dataset | 31.69 GiB | 506.94 GiB | lz4 | 1.21x | false | off | |
I am getting the following errors while running a scrub:
Code:
Sep 14 00:24:27 freenas (da14:mps0:0:30:0): READ(10). CDB: 28 00 c0 60 ef c8 00 01 00 00
Sep 14 00:24:27 freenas (da14:mps0:0:30:0): CAM status: SCSI Status Error
Sep 14 00:24:27 freenas (da14:mps0:0:30:0): SCSI status: Check Condition
Sep 14 00:24:27 freenas (da14:mps0:0:30:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
Sep 14 00:24:27 freenas (da14:mps0:0:30:0): Info: 0xc060f09f
Sep 14 00:24:27 freenas (da14:mps0:0:30:0): Field Replaceable Unit: 129
Sep 14 00:24:27 freenas (da14:mps0:0:30:0): Command Specific Info: 0xa1615189
Sep 14 00:24:27 freenas (da14:mps0:0:30:0): Actual Retry Count: 255
Sep 14 00:24:27 freenas (da14:mps0:0:30:0): Descriptor 0x80: 00 00 03 11 00 81 01 fc db 09 01 1e 00 00
Sep 14 00:24:27 freenas (da14:mps0:0:30:0): Error 5, Unretryable error
Sep 14 00:24:30 freenas ZFS: vdev state changed, pool_guid=11123641492454680720 vdev_guid=10686807130487373415
Sep 14 06:30:07 freenas (da23:mps0:0:39:0): READ(16). CDB: 88 00 00 00 00 01 06 5d c5 00 00 00 01 00 00 00
Sep 14 06:30:07 freenas (da23:mps0:0:39:0): CAM status: SCSI Status Error
Sep 14 06:30:07 freenas (da23:mps0:0:39:0): SCSI status: Check Condition
Sep 14 06:30:07 freenas (da23:mps0:0:39:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
Sep 14 06:30:07 freenas (da23:mps0:0:39:0): Info: 0x1065dc5c3
Sep 14 06:30:07 freenas (da23:mps0:0:39:0): Field Replaceable Unit: 129
Sep 14 06:30:07 freenas (da23:mps0:0:39:0): Command Specific Info: 0xa1615189
Sep 14 06:30:07 freenas (da23:mps0:0:39:0): Actual Retry Count: 255
Sep 14 06:30:07 freenas (da23:mps0:0:39:0): Descriptor 0x80: 00 00 03 11 00 81 02 e9 5a 0(da23:mps0:0:39:0): Error 5, Unretryable error
Sep 14 06:30:09 freenas ZFS: vdev state changed, pool_guid=11123641492454680720 vdev_guid=18107924374852554239
Sep 14 18:00:46 freenas (da24:mps0:0:41:0): READ(10). CDB: 28 00 d3 0b e7 20 00 01 00 00
Sep 14 18:00:46 freenas (da24:mps0:0:41:0): CAM status: SCSI Status Error
Sep 14 18:00:46 freenas (da24:mps0:0:41:0): SCSI status: Check Condition
Sep 14 18:00:46 freenas (da24:mps0:0:41:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
Sep 14 18:00:46 freenas (da24:mps0:0:41:0): Info: 0xd30be819
Sep 14 18:00:46 freenas (da24:mps0:0:41:0): Field Replaceable Unit: 129
Sep 14 18:00:46 freenas (da24:mps0:0:41:0): Command Specific Info: 0xa69d0701
Sep 14 18:00:46 freenas (da24:mps0:0:41:0): Actual Retry Count: 255
Sep 14 18:00:46 freenas (da24:mps0:0:41:0): Descriptor 0x80: 00 00 03 11 00 81 02 1b 5b 05 01 34 00 00
Sep 14 18:00:46 freenas (da24:mps0:0:41:0): Error 5, Unretryable error
Sep 14 18:00:52 freenas ZFS: vdev state changed, pool_guid=11123641492454680720 vdev_guid=7468315239559028326
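For anyone reading the log above, here is a rough Python sketch that decodes the three CDBs and checks where the reported bad block falls inside each read. It assumes the standard READ(10)/READ(16) layouts and that the `Info:` value is the LBA of the first unreadable block; the function names `decode_read_cdb` and `failing_offset` are made up for this illustration.

```python
def decode_read_cdb(cdb_hex: str) -> tuple[int, int]:
    """Return (start_lba, block_count) for a READ(10) or READ(16) CDB."""
    cdb = bytes.fromhex(cdb_hex)  # fromhex skips the spaces between bytes
    if cdb[0] == 0x28 and len(cdb) == 10:
        # READ(10): LBA in bytes 2-5, transfer length in bytes 7-8
        return int.from_bytes(cdb[2:6], "big"), int.from_bytes(cdb[7:9], "big")
    if cdb[0] == 0x88 and len(cdb) == 16:
        # READ(16): LBA in bytes 2-9, transfer length in bytes 10-13
        return int.from_bytes(cdb[2:10], "big"), int.from_bytes(cdb[10:14], "big")
    raise ValueError(f"not a READ(10)/READ(16) CDB: {cdb_hex}")


def failing_offset(cdb_hex: str, info_lba: int) -> int:
    """Offset of the reported bad LBA within the read (assumes Info is an LBA)."""
    lba, count = decode_read_cdb(cdb_hex)
    offset = info_lba - lba
    if not 0 <= offset < count:
        raise ValueError("reported LBA is outside the requested range")
    return offset


# CDB and Info values copied from the log above
ERRORS = {
    "da14": ("28 00 c0 60 ef c8 00 01 00 00", 0xC060F09F),
    "da23": ("88 00 00 00 00 01 06 5d c5 00 00 00 01 00 00 00", 0x1065DC5C3),
    "da24": ("28 00 d3 0b e7 20 00 01 00 00", 0xD30BE819),
}

for disk, (cdb, info) in ERRORS.items():
    lba, count = decode_read_cdb(cdb)
    print(f"{disk}: {count} blocks from LBA {lba}, "
          f"bad LBA {info} (offset {failing_offset(cdb, info)})")
```

Under those assumptions, each error is a single 256-block read with the bad LBA inside the requested range, which would be consistent with one unreadable spot per disk rather than a failed transfer of the whole request.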
When I run zpool status I can see that data has been repaired, but it is not showing any data errors:
Code:
zpool status
  pool: Data
 state: ONLINE
  scan: scrub in progress since Sun Sep 13 23:00:09 2020
        70.1T scanned at 918M/s, 43.5T issued at 570M/s, 70.1T total
        3M repaired, 62.14% done, 0 days 13:33:01 to go
config:

        NAME                                            STATE     READ WRITE CKSUM
        Data                                            ONLINE       0     0     0
          raidz2-0                                      ONLINE       0     0     0
            gptid/8a1c1896-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/8abe167d-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/8db855ce-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/8d986eb0-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/8d8d8520-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/8e007041-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
          raidz2-1                                      ONLINE       0     0     0
            gptid/8af2eae2-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/8c272ae9-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/8e61292d-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/8efdc51b-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/8ec09204-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/8f1d7f99-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
          raidz2-2                                      ONLINE       0     0     0
            gptid/8ae937b6-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/8b961045-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/8dc37f9e-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/8f5d9220-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/95e13ddb-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/98634c94-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
          raidz2-3                                      ONLINE       0     0     0
            gptid/986ffa00-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/98cb54af-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/99991cfd-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/99c3ca51-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/9a0c41ac-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/9b943f72-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
          raidz2-4                                      ONLINE       0     0     0
            gptid/9a9a28fd-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/9b5a082f-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/9c1316e1-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/9d33e2ac-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/9cd956b2-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/9cf3e01b-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
          raidz2-5                                      ONLINE       0     0     0
            gptid/9d21a32e-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/9d63f987-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/a1a3dbf6-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/a49ae247-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/a4b35860-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/96712300-eae3-11ea-ba6e-0090fa96a2c8  ONLINE       0     0     0
          raidz2-6                                      ONLINE       0     0     0
            gptid/a6910bd6-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/a714fce5-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/a72a55bf-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/a7e34b60-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/a8ccbb39-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/aad7536b-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
          raidz2-7                                      ONLINE       0     0     0
            gptid/ab1c7045-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/ab3fb99c-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/ab609dde-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/ab557025-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/ab8b1cc1-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0
            gptid/aba30cf0-dfda-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0

errors: No known data errors

  pool: Services
 state: ONLINE
  scan: scrub repaired 0 in 0 days 00:15:26 with 0 errors on Sat Sep 12 23:15:28 2020
config:

        NAME                                          STATE     READ WRITE CKSUM
        Services                                      ONLINE       0     0     0
          gptid/56bdab3e-dfdb-11ea-a498-40f2e90ac8fc  ONLINE       0     0     0

errors: No known data errors

  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0 in 0 days 00:00:44 with 0 errors on Sun Sep 13 03:45:44 2020
config:

        NAME          STATE     READ WRITE CKSUM
        freenas-boot  ONLINE       0     0     0
          da48p2      ONLINE       0     0     0

errors: No known data errors
This is what I see from smartctl for the 3 disks that have thrown errors and one that has not:
Code:
root@freenas[~]# smartctl /dev/da14 -a
smartctl 7.0 2018-12-30 r4883 [FreeBSD 11.3-RELEASE-p11 amd64] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               IBM-SSG
Product:              S7AQ3P0
Revision:             A058
Compliance:           SPC-4
User Capacity:        3,000,592,982,016 bytes [3.00 TB]
Logical block size:   512 bytes
Rotation Rate:        7200 rpm
Form Factor:          3.5 inches
Logical Unit id:      0x5000c5005716b697
Serial number:        Z298BSRN0000C4036ZRA
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Mon Sep 14 21:16:19 2020 AEST
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Enabled

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Current Drive Temperature:     35 C
Drive Trip Temperature:        65 C

Elements in grown defect list: 0

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:   2285659877        0         0  2285659877          1       4294.049           1
write:           0        0         0           0          0      22915.078           0
verify: 3160043153        0         0  3160043153          0      12539.994           0

Non-medium error count:       41

SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background short  Completed                   -   33751                 - [-   -    -]
# 2  Background short  Completed                   -   33727                 - [-   -    -]
# 3  Background short  Completed                   -   33703                 - [-   -    -]
# 4  Background short  Completed                   -   33690                 - [-   -    -]
# 5  Background short  Completed                   -   33667                 - [-   -    -]
# 6  Background short  Completed                   -   33665                 - [-   -    -]
# 7  Background short  Completed                   -   33641                 - [-   -    -]
# 8  Background short  Completed                   -   33617                 - [-   -    -]
# 9  Background short  Completed                   -   33593                 - [-   -    -]
#10  Background short  Completed                   -   33568                 - [-   -    -]
#11  Background short  Completed                   -   33544                 - [-   -    -]
#12  Background short  Completed                   -   33520                 - [-   -    -]
#13  Background short  Completed                   -   33496                 - [-   -    -]
#14  Background short  Completed                   -   33472                 - [-   -    -]
#15  Background short  Completed                   -   33447                 - [-   -    -]
#16  Background short  Completed                   -   33423                 - [-   -    -]
#17  Background short  Completed                   -   33400                 - [-   -    -]
#18  Background short  Completed                   -   33376                 - [-   -    -]
#19  Background short  Completed                   -   33352                 - [-   -    -]
#20  Background short  Completed                   -   33327                 - [-   -    -]

Long (extended) Self-test duration: 27600 seconds [460.0 minutes]

root@freenas[~]# smartctl /dev/da23 -a
smartctl 7.0 2018-12-30 r4883 [FreeBSD 11.3-RELEASE-p11 amd64] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               IBM-SSG
Product:              S7AQ3P0
Revision:             A058
Compliance:           SPC-4
User Capacity:        3,000,592,982,016 bytes [3.00 TB]
Logical block size:   512 bytes
Rotation Rate:        7200 rpm
Form Factor:          3.5 inches
Logical Unit id:      0x5000c50040f7dc8f
Serial number:        Z291R4PN00009227YSFB
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Mon Sep 14 21:16:29 2020 AEST
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Enabled

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Current Drive Temperature:     38 C
Drive Trip Temperature:        65 C

Elements in grown defect list: 20

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:   2809016694        0         0  2809016694          1    1334324.714           1
write:           0        0         0           0          0     124564.651           0
verify: 2369234760        0         0  2369234760          0    5534693.455           0

Non-medium error count:       79

SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background short  Completed                   -   65535                 - [-   -    -]
# 2  Background short  Completed                   -   65535                 - [-   -    -]
# 3  Background short  Completed                   -   65535                 - [-   -    -]
# 4  Background short  Completed                   -   65535                 - [-   -    -]
# 5  Background short  Completed                   -   65535                 - [-   -    -]
# 6  Background short  Completed                   -   65535                 - [-   -    -]
# 7  Background short  Completed                   -   65535                 - [-   -    -]
# 8  Background short  Completed                   -   65535                 - [-   -    -]
# 9  Background short  Completed                   -   65535                 - [-   -    -]
#10  Background short  Completed                   -   65535                 - [-   -    -]
#11  Background short  Completed                   -   65535                 - [-   -    -]
#12  Background short  Completed                   -   65535                 - [-   -    -]
#13  Background short  Completed                   -   65535                 - [-   -    -]
#14  Background short  Completed                   -   65535                 - [-   -    -]
#15  Background short  Completed                   -   65535                 - [-   -    -]
#16  Background short  Completed                   -   65535                 - [-   -    -]
#17  Background short  Completed                   -   65535                 - [-   -    -]
#18  Background short  Completed                   -   65535                 - [-   -    -]
#19  Background short  Completed                   -   65535                 - [-   -    -]
#20  Background short  Completed                   -   65535                 - [-   -    -]

Long (extended) Self-test duration: 27600 seconds [460.0 minutes]

root@freenas[~]# smartctl /dev/da24 -a
smartctl 7.0 2018-12-30 r4883 [FreeBSD 11.3-RELEASE-p11 amd64] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               IBM-SSG
Product:              S7AQ3P0
Revision:             A058
Compliance:           SPC-4
User Capacity:        3,000,592,982,016 bytes [3.00 TB]
Logical block size:   512 bytes
Rotation Rate:        7200 rpm
Form Factor:          3.5 inches
Logical Unit id:      0x5000c50040f7f11b
Serial number:        Z291QJJ00000921939NB
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Mon Sep 14 21:16:35 2020 AEST
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Enabled

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Current Drive Temperature:     37 C
Drive Trip Temperature:        65 C

Elements in grown defect list: 0

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:   3824800365        0         0  3824800365          1    1256600.361           1
write:           0        0         0           0          0     117323.045           0
verify:  331869279        0         0   331869279          0    6185375.410           0

Non-medium error count:       81

SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background short  Completed                   -   65535                 - [-   -    -]
# 2  Background short  Completed                   -   65535                 - [-   -    -]
# 3  Background short  Completed                   -   65535                 - [-   -    -]
# 4  Background short  Completed                   -   65535                 - [-   -    -]
# 5  Background short  Completed                   -   65535                 - [-   -    -]
# 6  Background short  Completed                   -   65535                 - [-   -    -]
# 7  Background short  Completed                   -   65535                 - [-   -    -]
# 8  Background short  Completed                   -   65535                 - [-   -    -]
# 9  Background short  Completed                   -   65535                 - [-   -    -]
#10  Background short  Completed                   -   65535                 - [-   -    -]
#11  Background short  Completed                   -   65535                 - [-   -    -]
#12  Background short  Completed                   -   65535                 - [-   -    -]
#13  Background short  Completed                   -   65535                 - [-   -    -]
#14  Background short  Completed                   -   65535                 - [-   -    -]
#15  Background short  Completed                   -   65535                 - [-   -    -]
#16  Background short  Completed                   -   65535                 - [-   -    -]
#17  Background short  Completed                   -   65535                 - [-   -    -]
#18  Background short  Completed                   -   65535                 - [-   -    -]
#19  Background short  Completed                   -   65535                 - [-   -    -]
#20  Background short  Completed                   -   65535                 - [-   -    -]

Long (extended) Self-test duration: 27600 seconds [460.0 minutes]

root@freenas[~]# smartctl /dev/da40 -a
smartctl 7.0 2018-12-30 r4883 [FreeBSD 11.3-RELEASE-p11 amd64] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               IBM-SSG
Product:              S7AQ3P0
Revision:             A058
Compliance:           SPC-4
User Capacity:        3,000,592,982,016 bytes [3.00 TB]
Logical block size:   512 bytes
Rotation Rate:        7200 rpm
Form Factor:          3.5 inches
Logical Unit id:      0x5000c50056669933
Serial number:        Z297A1VP00009339Z4S2
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Mon Sep 14 21:16:48 2020 AEST
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Enabled

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Current Drive Temperature:     39 C
Drive Trip Temperature:        65 C

Elements in grown defect list: 0

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:    188487665        0         0   188487665          0     759299.408           0
write:           0        0         0           0          0      92571.794           0
verify: 3513884870        0         0  3513884870          2    2819465.820           1

Non-medium error count:       95

SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background short  Completed                   -   35608                 - [-   -    -]
# 2  Background short  Completed                   -   35584                 - [-   -    -]
# 3  Background short  Completed                   -   35559                 - [-   -    -]
# 4  Background short  Completed                   -   35546                 - [-   -    -]
# 5  Background short  Completed                   -   35523                 - [-   -    -]
# 6  Background short  Completed                   -   35522                 - [-   -    -]
# 7  Background short  Completed                   -   35498                 - [-   -    -]
# 8  Background short  Completed                   -   35474                 - [-   -    -]
# 9  Background short  Completed                   -   35449                 - [-   -    -]
#10  Background short  Completed                   -   35425                 - [-   -    -]
#11  Background short  Completed                   -   35401                 - [-   -    -]
#12  Background short  Completed                   -   35377                 - [-   -    -]
#13  Background short  Completed                   -   35353                 - [-   -    -]
#14  Background short  Completed                   -   35328                 - [-   -    -]
#15  Background short  Completed                   -   35304                 - [-   -    -]
#16  Background short  Completed                   -   35280                 - [-   -    -]
#17  Background short  Completed                   -   35257                 - [-   -    -]
#18  Background short  Completed                   -   35233                 - [-   -    -]
#19  Background short  Completed                   -   35209                 - [-   -    -]
#20  Background short  Completed                   -   35184                 - [-   -    -]

Long (extended) Self-test duration: 27600 seconds [460.0 minutes]
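To compare the four disks side by side without reading the full dumps, something like the following Python sketch could pull out the few counters that seem most relevant. It only does text matching on `smartctl -a` output shaped like the above; `summarize_smart` and `SAMPLE` are hypothetical names, and the sample text is trimmed from the da23 output.

```python
import re


def summarize_smart(text: str) -> dict:
    """Extract defect and error counters from SAS-style `smartctl -a` text."""
    defects = re.search(r"Elements in grown defect list:\s*(\d+)", text)
    non_medium = re.search(r"Non-medium error count:\s*(\d+)", text)
    # In the error counter log, the last column of each row is
    # "Total uncorrected errors" for that operation.
    uncorrected = {
        op: int(fields.split()[-1])
        for op, fields in re.findall(r"^(read|write|verify):\s+(.+)$", text, re.M)
    }
    return {
        "grown_defects": int(defects.group(1)) if defects else None,
        "non_medium_errors": int(non_medium.group(1)) if non_medium else None,
        "uncorrected": uncorrected,
    }


# Trimmed from the da23 output above
SAMPLE = """\
Elements in grown defect list: 20
read:   2809016694        0         0  2809016694          1    1334324.714           1
write:           0        0         0           0          0     124564.651           0
verify: 2369234760        0         0  2369234760          0    5534693.455           0
Non-medium error count:       79
"""

print(summarize_smart(SAMPLE))
```

Run against each disk's output, this makes it easier to see that da14, da23 and da24 each report one uncorrected read error, while da40 reports one uncorrected verify error and only da23 has entries in its grown defect list.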
Does anyone have a suggestion on what might be causing the errors and/or what should be done about them?
I have also noticed the scrub seems to vary in speed quite a lot. It was running at:
scan: scrub in progress since Sun Sep 13 23:00:09 2020 39.0T scanned at 1.40G/s, 35.6T issued at 1.28G/s, 70.1T total 2M repaired, 50.81% done, 0 days 07:40:12 to go
But then it slowed down substantially:
scan: scrub in progress since Sun Sep 13 23:00:09 2020 39.0T scanned at 601M/s, 37.8T issued at 583M/s, 70.1T total 2M repaired, 53.95% done, 0 days 16:07:10 to go
And then it seemed to be speeding up again:
scan: scrub in progress since Sun Sep 13 23:00:09 2020 68.1T scanned at 919M/s, 39.4T issued at 531M/s, 70.1T total 3M repaired, 56.17% done, 0 days 16:50:09 to go
And now it shows that it has scanned all 70.1T but is only 64.78% complete:
scan: scrub in progress since Sun Sep 13 23:00:09 2020 70.1T scanned at 906M/s, 45.4T issued at 587M/s, 70.1T total 3M repaired, 64.78% done, 0 days 12:14:51 to go
I assume the slowing down and speeding up is due to the sizes of the files being scrubbed and the load on the system, but the numbers just feel a bit strange.
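As a sanity check on those numbers, here is a small Python sketch that recomputes the percentage and time remaining from one of the scan lines. It rests on two assumptions inferred from the output rather than confirmed: that the units are powers of 1024, and that the "done" percentage follows the issued figure rather than the scanned figure. `scrub_progress` and `SCAN_RE` are names made up for this example.

```python
import re

# Assumption: zpool status sizes use binary (1024-based) units
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}

SCAN_RE = re.compile(
    r"(?P<scanned>[\d.]+[KMGT]) scanned at [\d.]+[KMGT]/s, "
    r"(?P<issued>[\d.]+[KMGT]) issued at (?P<rate>[\d.]+[KMGT])/s, "
    r"(?P<total>[\d.]+[KMGT]) total"
)


def to_bytes(value: str) -> float:
    """Convert a figure such as '45.4T' or '587M' to bytes."""
    return float(value[:-1]) * UNITS[value[-1]]


def scrub_progress(line: str) -> tuple[float, float]:
    """Return (percent_done, hours_remaining) based on the issued figures."""
    m = SCAN_RE.search(line)
    if m is None:
        raise ValueError("not a scrub progress line")
    issued, total = to_bytes(m["issued"]), to_bytes(m["total"])
    rate = to_bytes(m["rate"])  # bytes per second
    return 100 * issued / total, (total - issued) / rate / 3600


LINE = ("scan: scrub in progress since Sun Sep 13 23:00:09 2020 "
        "70.1T scanned at 906M/s, 45.4T issued at 587M/s, 70.1T total "
        "3M repaired, 64.78% done, 0 days 12:14:51 to go")

pct, hours = scrub_progress(LINE)
print(f"{pct:.2f}% done, about {hours:.1f} hours to go")
```

Under those assumptions the result lands within rounding of the reported 64.78% and 12:14:51, and the same ratio holds for the earlier scan lines, so the percentage looks tied to "issued" and not to "scanned", which had already reached the full 70.1T.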
The only other issue I have had that may be related is some intermittent system hangs. When this occurs, the system locks up at the console, the WebUI crashes and Plex transcoding freezes (but can be restarted), while SSH and Samba continue to work.
I get the following error in the message log:
Code:
Sep 13 02:44:46 freenas collectd[2072]: Traceback (most recent call last):
  File "/usr/local/lib/collectd_pyplugins/disktemp.py", line 60, in read
    with Client() as c:
  File "/usr/local/lib/python3.7/site-packages/middlewared/client/client.py", line 250, in __init__
    self._ws.connect()
  File "/usr/local/lib/python3.7/site-packages/middlewared/client/client.py", line 93, in connect
    rv = super(WSClient, self).connect()
  File "/usr/local/lib/python3.7/site-packages/ws4py/client/__init__.py", line 223, in connect
    bytes = self.sock.recv(128)
socket.timeout: timed out
Sep 13 02:49:46 freenas collectd[2072]: Traceback (most recent call last):
  File "/usr/local/lib/collectd_pyplugins/disktemp.py", line 60, in read
    with Client() as c:
  File "/usr/local/lib/python3.7/site-packages/middlewared/client/client.py", line 250, in __init__
    self._ws.connect()
  File "/usr/local/lib/python3.7/site-packages/middlewared/client/client.py", line 93, in connect
    rv = super(WSClient, self).connect()
  File "/usr/local/lib/python3.7/site-packages/ws4py/client/__init__.py", line 223, in connect
    bytes = self.sock.recv(128)
socket.timeout: timed out
I had a disk that was logging increasing SMART errors, which I thought was causing the hangs, so I replaced it. The hangs seemed to stop for a while but have now happened again:
(Red=Hang Blue=Replace disk Green=Manual reboot)
Any help would be greatly appreciated!