Understanding write CDB error messages

Littlejd97

Cadet
Joined
Dec 10, 2021
Messages
6
I am running a virtualized instance of TrueNAS on top of Proxmox with regular drive passthrough (not passing through an HBA or anything).

Recently I’ve been noticing write failures during large write operations, such as downloading a 100GB video game. Looking at the messages log file, I see the following many times:

Code:
Dec 24 07:53:20 truenas (0:5:0/23): UNMAP. CDB: 42 00 00 00 00 00 00 00 18 00
Dec 24 07:53:20 truenas (0:5:0/23): Tag: 0x52440200, type 1
Dec 24 07:53:20 truenas (0:5:0/23): ctl_process_done: 108 seconds


These can get into the 500+ second range, which causes some of my iSCSI connections to begin to time out and crash OSes. I want to start narrowing down the problem, but I can’t really decode this error to identify a particular drive or anything else that is having problems.

I would like to understand what is in the error to identify the problematic drive.

Other notes: I have two pools: this one, which is based on mechanical drives, and another made of NVMe SSDs in RAID 0. The NVMe pool can take the write operation just fine.

The VM also has 30GB of RAM and 3 CPU cores.

Thanks
 

Littlejd97

Cadet
Joined
Dec 10, 2021
Messages
6
Also, all drives pass SMART long tests and appear healthy to ZFS.
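
For anyone else trying to read these lines, here is a rough sketch of how the CDB bytes can be decoded by hand. It is Python I put together for illustration, not something TrueNAS ships. The field layout follows the 10-byte UNMAP command as I understand it from the SCSI block command set: byte 0 is the opcode (0x42), bytes 7-8 are the parameter list length, and the parameter list is an 8-byte header plus 16-byte block descriptors. The meaning of the `(0:5:0/23)` prefix (initiator ID, target port, LUN, mapped LUN) is my reading of how FreeBSD's CTL layer prints it, so treat that part as an assumption. The function names `decode_unmap_cdb` and `parse_nexus` are made up for this example.

```python
import re

# Matches the "(a:b:c/d):" prefix seen in the log lines above.
# ASSUMPTION: the four numbers are initiator ID, target port, LUN and
# mapped LUN, based on my reading of FreeBSD CTL; verify before relying on it.
NEXUS_RE = re.compile(r"\((\d+):(\d+):(\d+)/(\d+)\):")


def parse_nexus(line):
    """Extract the four numbers from the '(a:b:c/d):' prefix, or None."""
    m = NEXUS_RE.search(line)
    if m is None:
        return None
    initid, port, lun, mapped_lun = (int(g) for g in m.groups())
    return {
        "initiator_id": initid,
        "target_port": port,
        "lun": lun,
        "mapped_lun": mapped_lun,
    }


def decode_unmap_cdb(cdb_hex):
    """Decode a 10-byte UNMAP CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x42:
        raise ValueError("not a 10-byte UNMAP CDB")
    # Bytes 7-8: big-endian length of the parameter list sent with the command.
    param_len = int.from_bytes(cdb[7:9], "big")
    return {
        "opcode": "UNMAP",
        "anchor": bool(cdb[1] & 0x01),       # ANCHOR bit
        "group_number": cdb[6] & 0x1F,       # low 5 bits of byte 6
        "parameter_list_length": param_len,
        # 8-byte header, then one 16-byte descriptor per LBA range to unmap.
        "block_descriptors": max(param_len - 8, 0) // 16,
        "control": cdb[9],
    }


if __name__ == "__main__":
    line = "Dec 24 07:53:20 truenas (0:5:0/23): UNMAP. CDB: 42 00 00 00 00 00 00 00 18 00"
    print(parse_nexus(line))
    print(decode_unmap_cdb(line.split("CDB:")[1].strip()))
```

For the line in my log this gives a parameter list length of 24 bytes, i.e. a single block descriptor. If the layout above is right, the message describes an UNMAP (the SCSI counterpart of TRIM) sent by an iSCSI initiator to a LUN that took a long time to complete. Nothing in it names a physical disk, so the slow drive would have to be found some other way.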
 

Littlejd97

Cadet
Joined
Dec 10, 2021
Messages
6
I forgot to update here. There was some sort of issue with a BIOS update for my motherboard. I rolled back to the previous version, and my chipset SATA controller has been much more stable.
 