FreeNAS problem - hard disk error? - Volume1 state is DEGRADED

bernd67

Explorer
Joined
Jan 7, 2017
Messages
88
Hello everyone,

I am getting the messages and information below on an older FreeNAS 11.2 system.
My questions:

- Has data already been lost (on both disks?)? If so, is there a way to find out which files are affected?
- Is it a software or a hardware problem?
- Is one of the hard disks defective? ada1, as I suspect?
- While researching the error messages I also found references to problems with SATA cables. Could that be the cause here?
- Where can I find the results of the scheduled SMART tests?
- How should I proceed from here?
- If ada1 is defective, how do I go about replacing it?


Thanks for your help!



/dev/ada1: Unable to detect device type


----------


smartctl -a /dev/ada0 :

=== START OF INFORMATION SECTION ===
Device Model: ST2000DL004 HD204UI
Serial Number: S2H7JX0D100323
LU WWN Device Id: 5 0000f0 0100b2303
Firmware Version: 1AQ10001
User Capacity: 2,000,398,934,016 bytes [2.00 TB]
Sector Size: 512 bytes logical/physical
Rotation Rate: 5400 rpm
Form Factor: 3.5 inches
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ATA8-ACS T13/1699-D revision 6
SATA Version is: SATA 2.6, 3.0 Gb/s
Local Time is: Mon Aug 28 09:24:48 2023 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 100 100 051 Pre-fail Always - 0
2 Throughput_Performance 0x0026 055 055 000 Old_age Always - 18317
3 Spin_Up_Time 0x0023 067 066 025 Pre-fail Always - 10029
4 Start_Stop_Count 0x0032 096 096 000 Old_age Always - 4360
5 Reallocated_Sector_Ct 0x0033 252 252 010 Pre-fail Always - 0
7 Seek_Error_Rate 0x002e 252 252 051 Old_age Always - 0
8 Seek_Time_Performance 0x0024 252 252 015 Old_age Offline - 0
9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 27299
10 Spin_Retry_Count 0x0032 252 252 051 Old_age Always - 0
11 Calibration_Retry_Count 0x0032 252 252 000 Old_age Always - 0
12 Power_Cycle_Count 0x0032 098 098 000 Old_age Always - 2854
181 Program_Fail_Cnt_Total 0x0022 100 100 000 Old_age Always - 452002
191 G-Sense_Error_Rate 0x0022 100 100 000 Old_age Always - 1
192 Power-Off_Retract_Count 0x0022 252 252 000 Old_age Always - 0
194 Temperature_Celsius 0x0002 064 062 000 Old_age Always - 31 (Min/Max 14/38)
195 Hardware_ECC_Recovered 0x003a 100 100 000 Old_age Always - 0
196 Reallocated_Event_Count 0x0032 252 252 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 252 252 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 252 252 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0036 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x002a 100 100 000 Old_age Always - 0
223 Load_Retry_Count 0x0032 252 252 000 Old_age Always - 0
225 Load_Cycle_Count 0x0032 100 100 000 Old_age Always - 4367

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 27297 -
# 2 Extended offline Completed without error 00% 26715 -
# 3 Extended offline Completed without error 00% 26694 -
# 4 Extended offline Completed without error 00% 26643 -
# 5 Extended offline Completed without error 00% 26622 -
# 6 Extended offline Completed without error 00% 25988 -
# 7 Extended offline Completed without error 00% 25967 -
# 8 Extended offline Completed without error 00% 25940 -
# 9 Extended offline Completed without error 00% 25919 -
#10 Extended offline Completed without error 00% 25268 -
#11 Extended offline Completed without error 00% 25247 -
#12 Extended offline Completed without error 00% 25196 -
#13 Extended offline Completed without error 00% 25175 -
#14 Extended offline Completed without error 00% 24524 -
#15 Extended offline Completed without error 00% 24503 -
#16 Extended offline Completed without error 00% 24475 -
#17 Extended offline Completed without error 00% 24454 -
#18 Extended offline Completed without error 00% 23803 -
#19 Extended offline Completed without error 00% 23782 -
#20 Extended offline Completed without error 00% 23732 -
#21 Extended offline Completed without error 00% 23711 -

SMART Selective self-test log data structure revision number 0
Note: revision number not 1 implies that no selective self-test has ever been run
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Completed [00% left] (0-65535)
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
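
For a quick read of an attribute table like the one above, a small filter can pull out the four attributes that most often point at failing media or cabling (IDs 5, 197, 198 and 199). This is only a sketch: the function name `smart_key_attrs` is made up for illustration, and it assumes the ten-column attribute layout shown in the listing above.

```sh
#!/bin/sh
# Hypothetical helper (not part of FreeNAS or smartmontools).
# Reads smartctl attribute output on stdin and prints NAME=RAW_VALUE
# for attribute IDs 5, 197, 198 and 199.
# The check on column 3 (the 0x.... flag) skips other numeric tables,
# such as the selective self-test spans.
smart_key_attrs() {
    awk '$3 ~ /^0x/ && ($1 == 5 || $1 == 197 || $1 == 198 || $1 == 199) {
        print $2 "=" $10
    }'
}

# Example (needs root and a disk that answers SMART queries):
#   smartctl -A /dev/ada0 | smart_key_attrs
```

For ada0 in the listing above, all four raw values are 0, which fits the "Completed without error" entries in its self-test log. As far as I know, scheduled tests end up in that same log, which `smartctl -l selftest /dev/ada0` prints on its own.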



# smartctl -a /dev/ada1
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.1-STABLE amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

/dev/ada1: Unable to detect device type

-----


The volume Volume1 state is DEGRADED: One or more devices has experienced an error resulting in data corruption. Applications may be affected.

The volume Volume1 state is DEGRADED: One or more devices is currently being resilvered. The pool will continue to function, possibly in a degraded state.

The volume Volume1 state is ONLINE: One or more devices is currently being resilvered. The pool will continue to function, possibly in a degraded state.

The volume Volume1 state is DEGRADED: One or more devices has been removed by the administrator. Sufficient replicas exist for the pool to continue functioning in a degraded state.


-----


[root@freenas ~]# zpool status
pool: Volume1
state: DEGRADED
status: One or more devices has experienced an error resulting in data
corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
entire pool from backup.
see: http://illumos.org/msg/ZFS-8000-8A
scan: resilvered 527G in 7908 days 02:07:19 with 4869 errors on Mon Aug 28 00:09:49 2023
config:

NAME STATE READ WRITE CKSUM
Volume1 DEGRADED 32.9K 0 0
mirror-0 DEGRADED 127K 0 0
14645395981702454298 REMOVED 0 0 0 was /dev/gptid/eb0192da-0536-11e4-bcda-f46d045fb858
gptid/506c477d-bbe9-11ec-ac0e-f46d045fb858 ONLINE 0 0 127K

errors: 4832 data errors, use '-v' for a list

pool: freenas-boot
state: ONLINE
scan: scrub repaired 0 in 0 days 00:08:55 with 0 errors on Sat Aug 26 03:53:56 2023
config:

NAME STATE READ WRITE CKSUM
freenas-boot ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
gptid/e3aae27d-9309-11e7-b6c3-54a050800643 ONLINE 0 0 0
da0p2 ONLINE 0 0 0

errors: No known data errors
[root@freenas ~]#
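
To see at a glance which rows of a `zpool status` config table need attention, the output can be filtered for anything that is not ONLINE or that carries a non-zero READ, WRITE or CKSUM counter. A minimal sketch; `zpool_problem_vdevs` is a hypothetical name, and the parsing assumes the five-column table layout shown above.

```sh
#!/bin/sh
# Hypothetical helper: reads `zpool status` text on stdin and prints every
# row of a config table that is not ONLINE or has a non-zero error counter.
zpool_problem_vdevs() {
    awk '
        $1 == "NAME" && $2 == "STATE" { in_table = 1; next }  # table header
        NF == 0 || $1 == "errors:"    { in_table = 0 }        # table ends
        in_table && NF >= 5 &&
            ($2 != "ONLINE" || $3 != "0" || $4 != "0" || $5 != "0") {
            print $1, $2, "read=" $3, "write=" $4, "cksum=" $5
        }
    '
}

# Example:
#   zpool status Volume1 | zpool_problem_vdevs
```

Applied to the Volume1 table above, every row is reported: one mirror member is REMOVED, and the member that is still ONLINE shows 127K checksum errors. The freenas-boot table yields nothing.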


---------

Checking status of zfs pools:
NAME SIZE ALLOC FREE EXPANDSZ FRAG CAP DEDUP HEALTH ALTROOT
Volume1 1.82T 797G 1.04T - 16% 42% 1.00x DEGRADED /mnt
freenas-boot 14.2G 9.83G 4.42G - - 69% 1.00x ONLINE -

pool: Volume1
state: DEGRADED
status: One or more devices has been removed by the administrator.
Sufficient replicas exist for the pool to continue functioning in a
degraded state.
action: Online the device using 'zpool online' or replace the device with
'zpool replace'.
scan: scrub repaired 0 in 0 days 02:39:07 with 0 errors on Sun Aug 27 02:39:08 2023
config:

NAME STATE READ WRITE CKSUM
Volume1 DEGRADED 0 0 0
mirror-0 DEGRADED 0 0 0
gptid/eb0192da-0536-11e4-bcda-f46d045fb858 ONLINE 0 0 0
9069279119992821473 REMOVED 0 0 0 was /dev/gptid/506c477d-bbe9-11ec-ac0e-f46d045fb858

errors: No known data errors

Checking status of gmirror(8) devices:
Name Status Components
mirror/swap0 DEGRADED ada1p1 (ACTIVE)
 

bernd67

Explorer
Joined
Jan 7, 2017
Messages
88
Hello forum,

Is it a software problem, a hardware problem, or both?

Which of the two disks should I replace, ada0 or ada1?
How do I find the correct disk in the storage unit?
Is there any chance of repairing the ZFS pool, or do I have to restore from the (older?) backup?

Thanks for your help.

Checking status of zfs pools:
NAME SIZE ALLOC FREE EXPANDSZ FRAG CAP DEDUP HEALTH ALTROOT
Volume1 1.82T 799G 1.04T - 16% 42% 1.00x DEGRADED /mnt
freenas-boot 14.2G 9.83G 4.42G - - 69% 1.00x ONLINE -
pool: Volume1
state: DEGRADED
status: One or more devices has experienced an error resulting in data
corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
entire pool from backup.
see: http://illumos.org/msg/ZFS-8000-8A
scan: resilvered 527G in 7908 days 02:07:19 with 4869 errors on Mon Aug 28 00:09:49 2023
config:
NAME STATE READ WRITE CKSUM
Volume1 DEGRADED 33.6K 0 0
mirror-0 DEGRADED 130K 0 0
14645395981702454298 REMOVED 0 0 0 was /dev/gptid/eb0192da-0536-11e4-bcda-f46d045fb858
gptid/506c477d-bbe9-11ec-ac0e-f46d045fb858 ONLINE 0 0 130K
errors: 4833 data errors, use '-v' for a list
Checking status of gmirror(8) devices:
Name Status Components
mirror/swap0 DEGRADED ada0p1 (ACTIVE)
 

bernd67

Explorer
Joined
Jan 7, 2017
Messages
88
I tried to wipe ada1:
Serial number S2H7J90B8316YX

Exception Value:
Unknown disk ada1

[MiddlewareError: Failed to wipe da1p2: dd: /dev/da1p2: Operation not permitted
]
: ./freenasUI/middleware/notifier.py in _do_disk_wipe_quick, line 3613

Why can't I wipe it? Is it a hardware error?


Additional Information:

[root@freenas ~]# glabel status
Name Status Components
gptid/ee3bb4c5-9966-11e7-8b3f-f46d045fb858 N/A da0p1
gptid/e3a239df-9309-11e7-b6c3-54a050800643 N/A da1p1
gptid/e3aae27d-9309-11e7-b6c3-54a050800643 N/A da1p2
gptid/506c477d-bbe9-11ec-ac0e-f46d045fb858 N/A ada0p2
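
To tie a gptid from `zpool status` back to a device node, the `glabel status` table can be searched for the label. A small sketch; `label_to_partition` is a hypothetical name, and it assumes the three-column Name / Status / Components layout shown above.

```sh
#!/bin/sh
# Hypothetical helper: looks up a label (first argument) in `glabel status`
# output read from stdin and prints the partition that carries it.
# Prints nothing if the label is not present.
label_to_partition() {
    awk -v label="$1" '$1 == label { print $3 }'
}

# Example:
#   glabel status | label_to_partition gptid/506c477d-bbe9-11ec-ac0e-f46d045fb858
```

With the table above, gptid/506c477d-bbe9-11ec-ac0e-f46d045fb858 resolves to ada0p2, while gptid/eb0192da-0536-11e4-bcda-f46d045fb858 returns nothing because that label is not listed. To match a device node to a physical drive, the serial number printed by `smartctl -i /dev/ada0` can be compared with the sticker on the disk.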

pool: Volume1
state: DEGRADED
status: One or more devices has experienced an error resulting in data
corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
entire pool from backup.
see: http://illumos.org/msg/ZFS-8000-8A
scan: resilvered 527G in 7908 days 02:07:19 with 4869 errors on Mon Aug 28 00:09:49 2023
config:

NAME STATE READ WRITE CKSUM
Volume1 DEGRADED 42.8K 0 0
mirror-0 DEGRADED 167K 0 0
14645395981702454298 REMOVED 0 0 0 was /dev/gptid/eb0192da-0536-11e4-bcda-f46d045fb858
gptid/506c477d-bbe9-11ec-ac0e-f46d045fb858 ONLINE 0 0 167K

errors: Permanent errors have been detected in the following files:

<metadata>:<0x40d>
<metadata>:<0x265>
<metadata>:<0x169>
<metadata>:<0x39f>
<metadata>:<0x3ca>
<metadata>:<0x3e5>
<metadata>:<0x1ef>
<metadata>:<0x3f8>
/mnt/Volume1/DS1/Sta....../.....bak1

Volume1/.system/configs-06bd65d158464db59125089083a20821:<0x0>
Volume1/.system/syslog-06bd65d158464db59125089083a20821:<0x0>
/var/db/system/syslog-06bd65d158464db59125089083a20821/log/samba4
/var/db/system/syslog-06bd65d158464db59125089083a20821/log/middlewared.log.2
/var/db/system/syslog-06bd65d158464db59125089083a20821/log/samba4/log.smbd
/var/db/system/syslog-06bd65d158464db59125089083a20821/log/samba4/samba.backtraces
/var/db/system/syslog-06bd65d158464db59125089083a20821/log/dmesg.yesterday
/var/db/system/syslog-06bd65d158464db59125089083a20821/log/middlewared.log.1
/var/db/system/syslog-06bd65d158464db59125089083a20821/log/middlewared.log
/var/db/system/cores/smbd.core
/var/db/system/cores/syslog-ng.core
/var/db/system/cores/devd.core
/var/db/system/cores/collectd.core
/var/db/system/cores/zfsd.core
Volume1/jails/TestJail:<0x0>
Volume1/jails/TestJail:<0x53b6e>
Volume1/jails/TestJail:<0x53b7d>
Volume1/jails/TestJail:<0x53b7f>
Volume1/jails/TestJail:<0x556a0>
Volume1/jails/TestJail:<0x53b7d>
Volume1/jails/TestJail:<0x53b7f>
Volume1/jails/TestJail:<0x556a0>
Volume1/jails/TestJail:<0x556a1>
Volume1/jails/TestJail:<0x556a2>
Volume1/jails/TestJail:<0x556a3>
Volume1/jails/TestJail:<0x556a4>
Volume1/jails/TestJail:<0x556a5>
Volume1/jails/TestJail:<0x556a6>
Volume1/jails/TestJail:<0x556a7>
Volume1/jails/TestJail:<0x556a8>
Volume1/jails/TestJail:<0x556a9>
Volume1/jails/TestJail:<0x556aa>
Volume1/jails/TestJail:<0x556ab>
Volume1/jails/TestJail:<0x556ac>
Volume1/jails/TestJail:<0x556ad>
Volume1/jails/TestJail:<0x556ae>
Volume1/jails/TestJail:<0x556af>
Volume1/jails/TestJail:<0x556b0>
Volume1/jails/TestJail:<0x556b1>
Volume1/jails/TestJail:<0x556b2>
Volume1/jails/TestJail:<0x556b3>
Volume1/jails/TestJail:<0x556b4>
Volume1/jails/TestJail:<0x556b5>
Volume1/jails/TestJail:<0x556b6>
Volume1/jails/TestJail:<0x556b7>
Volume1/jails/TestJail:<0x556b8>
Volume1/jails/TestJail:<0x556ba>
Volume1/jails/TestJail:<0x556bb>
Volume1/jails/TestJail:<0x556bc>
Volume1/jails/TestJail:<0x556bd>
Volume1/jails/TestJail:<0x556be>
Volume1/jails/TestJail:<0x556bf>
/mnt/Volume1/jails/TestJail/var/log
/mnt/Volume1/jails/TestJail/var/backups
/mnt/Volume1/jails/TestJail/var/spool/clientmqueue
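
To turn a "Permanent errors" list like the one above into a plain list of paths to check against a backup, the `zpool status -v` output can be reduced to the entries that start with `/`. A minimal sketch; `zpool_damaged_paths` is a hypothetical name. Entries such as `<metadata>:<0x40d>` or `Volume1/jails/TestJail:<0x0>` are object references rather than paths and are skipped.

```sh
#!/bin/sh
# Hypothetical helper: reads `zpool status -v` text on stdin and prints only
# the entries of the "Permanent errors" list that are absolute paths.
zpool_damaged_paths() {
    awk '
        /^[[:space:]]*errors: Permanent errors/ { in_list = 1; next }
        /^[[:space:]]*pool:/                    { in_list = 0 }  # next pool
        in_list && /^[[:space:]]*\// {
            sub(/^[[:space:]]+/, "")   # strip indentation, keep the full path
            print
        }
    '
}

# Example:
#   zpool status -v Volume1 | zpool_damaged_paths
```

The object references that this filter skips still count as damage; they just cannot be restored file by file.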

pool: freenas-boot
state: ONLINE
scan: scrub repaired 0 in 0 days 00:08:47 with 0 errors on Sun Sep 3 03:53:48 2023
config:

NAME STATE READ WRITE CKSUM
freenas-boot ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
gptid/e3aae27d-9309-11e7-b6c3-54a050800643 ONLINE 0 0 0
da0p2 ONLINE 0 0 0

errors: No known data errors
 