Pool RAIDZ3 degraded

deigu0

Cadet
Joined
Feb 22, 2023
Messages
4
Hello,

I have an IBM HS-1235t XYRATEX server with the following features:
- 2 x Intel(R) Xeon(R) E5645 CPU
- 96GB RAM
- 12 disks of 2TB IBM SAS
- TrueNAS-13.0-U6.1 installed on an SD card (CompactFlash)

The ZFS pool is a 16TB RAIDZ3 built from the twelve 2TB disks. The first time I set it up, it degraded shortly afterwards.
I erased the volume and formatted all the disks to create a new volume.
After I created it again, it ran for about a month and then degraded again.
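
For reference, the 16TB figure is consistent with twelve 2TB disks in RAIDZ3. The sketch below is only a back-of-the-envelope check: it assumes a single RAIDZ3 vdev and ignores ZFS metadata, padding and reserved space, so the real usable space will be somewhat lower. The function name `raidz_usable_bytes` is made up for this illustration.

```python
# Rough RAIDZ capacity estimate: data disks times disk size.
# Assumption: one vdev, no allowance for ZFS overhead.

def raidz_usable_bytes(disks: int, parity: int, disk_bytes: int) -> int:
    """Raw data capacity of a single RAIDZ vdev: (disks - parity) * disk size."""
    if not 0 < parity < disks:
        raise ValueError("parity must be between 1 and disks - 1")
    return (disks - parity) * disk_bytes

TB = 10**12   # decimal terabyte, as printed on drive labels
TIB = 2**40   # binary tebibyte, the unit many tools use when reporting capacity

usable = raidz_usable_bytes(disks=12, parity=3, disk_bytes=2 * TB)
print(f"{usable / TB:.0f} TB = {usable / TIB:.2f} TiB")  # prints "18 TB = 16.37 TiB"
```

So 9 data disks give 18 TB, which is roughly 16 TiB once expressed in binary units.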

1704832448430.png


I have checked all the disks with SMART and they all report OK.

What can I do to fix the degraded volume, or at least make sure it does not degrade again?

Thanks,
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
How are the drives connected? Is the SAS HBA properly cooled? What's the firmware version?
Have you checked the RAM? (Not long ago another user eventually traced his recurring issues to a defective RAM module, and your system is quite old.)
 
Joined
Jun 15, 2022
Messages
674
What SAS Host Bus Adapter(s)?

Are they RAID cards in non-RAID mode but not flashed to IT mode?

Are you running any VM stuff anywhere on the hardware?
 

deigu0

Cadet
Joined
Feb 22, 2023
Messages
4
Hi, the server has an LSI SAS9201-8i with firmware 11.00.00.00-IT.
1706209886015.png

Is it possible to update the firmware of this controller?

The disks also have a firmware update pending, from EC5C to EC5H.

1706210919813.png


If I update the SAS controller firmware and the disk firmware, will the RAIDZ be more stable?

Thanks,
 

deigu0

Cadet
Joined
Feb 22, 2023
Messages
4
Hi,
The server has 12 disks connected. The disk model is ST4000NM0043 and they are currently on firmware EC5C; on the web I found a firmware update to version EC5H.
I checked the memory modules in another server and they are OK.

If I update the disk firmware, will the problem with the degraded state be solved?

Thanks,
 
Joined
Jun 15, 2022
Messages
674
Is it a true LSI card or one new off eBay for a very, very reasonable price (from China)? Chinese knock-offs can be "very good," especially under Windows, though often the stress ZFS puts them under shows their weaknesses. There are links in my signature to what knock-off cards look like.

Did you burn in the drives (see links in my signature)? SMART tests and badblocks testing often find problems up front and save a lot of work on the back end.

What does a SMART report show? I've seen drives have issues due to heat, even though they're running within OEM specifications.
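
Alongside the SMART report, the per-device error counters in `zpool status` can hint at whether one disk or something shared (cable, HBA) is the problem. Below is a minimal sketch, assuming the usual `NAME STATE READ WRITE CKSUM` column layout; the function names are hypothetical and the state list is only the common ones, so treat it as a starting point rather than a finished tool.

```python
import subprocess

# Assumption: these are the device states we expect to see in the STATE column.
KNOWN_STATES = {"ONLINE", "DEGRADED", "FAULTED", "OFFLINE", "UNAVAIL", "REMOVED"}

def devices_with_problems(status_text: str) -> list[tuple[str, str, str, str, str]]:
    """Return (name, state, read, write, cksum) for every row that is not clean."""
    flagged = []
    for line in status_text.splitlines():
        parts = line.split()
        # Device rows look like: NAME STATE READ WRITE CKSUM [note...]
        if len(parts) < 5 or parts[1] not in KNOWN_STATES:
            continue
        name, state, read, write, cksum = parts[:5]
        # Counters may be abbreviated (e.g. "1.2K"), so compare them as strings.
        if state != "ONLINE" or (read, write, cksum) != ("0", "0", "0"):
            flagged.append((name, state, read, write, cksum))
    return flagged

def current_pool_status() -> str:
    """Run `zpool status` (assumes the command is on PATH, e.g. in the TrueNAS shell)."""
    result = subprocess.run(["zpool", "status"], capture_output=True, text=True, check=True)
    return result.stdout

# Example use on the NAS itself:
#   for row in devices_with_problems(current_pool_status()):
#       print(*row)
```

If the flagged devices cluster on one cable or one HBA port, that points away from the disks themselves.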

As a note, the problem could be as simple as poor cables. I have no problem with you buying cheap cables, but they should be treated like very fragile $500 cables. Handled that way they are usually fine, but bend them or be rough with them and they are often shot. With cheap cables you probably also don't want maximum-length runs; stay at about 2/3 of the maximum length, because the cable lacks shielding.

For me (and it will be different for others), firmware has not solved any issues except for hardware RAID incompatibility with "new to the market" hardware such as a server mainboard. ZFS does not use hardware RAID so that's a non-issue.
 