UPDATE: 1/15/2021 - 10:45AM Eastern
----
We have finished validation that this data integrity issue is resolved. A 12.0-U1.1 hotfix release is being pushed *today* which all users will be highly encouraged to upgrade to, even if they are not observing any issues with data corruption.
UPDATE: 1/14/2021 - 10:21AM Eastern
----
We believe we've identified the issue and have a fixed kernel module up for testing on the Jira ticket. Feedback so far is positive that this resolves the issue, and we will be working to issue a 12.0-U1.1 hotfix release soon after a bit more validation time in the field.
--- Original Post ---
TrueNAS Community,
I wanted to post an update today on the general state of TrueNAS 12.0-U1 quality, as well as a heads-up on some of the issues we are tracking for resolution in TrueNAS 12.0-U2.
So far we’ve seen over 40k systems upgrade to 12.0, and the response has been very positive. That has made this one of the fastest-adopted TrueNAS (and FreeNAS) releases ever. General quality at the U1 stage far surpasses what we saw in the 11.3 series, and we’re eager for the launch of 12.0-U2 with even more polish and improvements. In the meantime, we’ve gotten a few reports of regressions from 11.3-U5 that we wanted to take a moment to address.
First, we received reports of performance regressions, which were eventually tracked down to bugs in our upstream FreeBSD 12.2 network drivers, specifically for Intel and Chelsio devices. Both of these issues have been investigated and are resolved in the upcoming 12.0-U2. A special thank-you to everybody who helped us track that down and determine what hardware was impacted.
https://jira.ixsystems.com/browse/NAS-107593
Second, we are also currently tracking a handful of users reporting issues with data integrity in some very specific virtualization environments. TrueNAS ZFS is reporting that there is no corruption on the disks or pool, but inside the VM the local filesystem may report needing to run a filesystem repair (fsck or scandisk), depending upon the guest OS. The issue has been seen on a few hypervisors and with iSCSI and NFS. It may be related to the type of VM & filesystem running as a guest. For more details, please refer to this ticket:
https://jira.ixsystems.com/browse/NAS-108627
As you may already know, we take reports of issues with data integrity very seriously, and you can rest assured that the entire iX engineering team is treating this as an "all hands on deck" situation until we come to a resolution. We have held up publishing the TrueNAS Enterprise 12.0 update train and we will hold up the next update to 12.0-U2 until we can be confident that we have a validated fix in place. That release is scheduled to arrive in early February.
The good news is that this issue appears to be extremely rare, all the more so considering how hard it has been to reproduce internally. Out of tens of thousands of systems running 12.0, we've seen it only on a handful of very specific virtualization workloads, and on nothing related to traditional SMB/NFS/Plugins scenarios.
The bad news is that the rarity of this issue makes it that much more difficult to troubleshoot and resolve. If you suspect you've seen something similar, please update the Jira ticket referenced above with your details and attach a debug file (System -> Advanced -> Generate Debug).
If you have seen a similar issue or will be running a production-critical environment, we would recommend staying on or rolling back to 11.3-U5 until we can further diagnose and fix this issue.
I'm going to sticky this post for the time being and update as we have more information to share about the status of the integrity issue.
Thanks for your patience, everybody. On behalf of the entire iX team, we really do appreciate it.