Temperature Warning for m.2 SSD NVMe

lotustechie

Dabbler
Joined
Jun 3, 2020
Messages
27
I'm receiving a temperature warning for the m.2 controller when I power on the device.
Device: /dev/nvme0, Critical Warning (0x02): Temperature.

I couldn't find any other posts about this issue. I'm using an ASRock B450 Pro4 motherboard and a 250 GB Crucial P2 NVMe M.2 SSD. It's a brand-new machine that I just built, and I'm completely new to FreeBSD.
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
What does smartctl -a /dev/nvme0 show?
 

lotustechie

Dabbler
Joined
Jun 3, 2020
Messages
27
It reports a temperature of 0°C.

Maybe the board/controller doesn't support SMART or it needs to be enabled in the BIOS?
=== START OF INFORMATION SECTION ===
Model Number: CT250P2SSD8
Serial Number: 2012E296505C
Firmware Version: P2CR010
PCI Vendor/Subsystem ID: 0xc0a9
IEEE OUI Identifier: 0x6479a7
Total NVM Capacity: 250,059,350,016 [250 GB]
Unallocated NVM Capacity: 0
Controller ID: 1
Number of Namespaces: 1
Namespace 1 Size/Capacity: 250,059,350,016 [250 GB]
Namespace 1 Formatted LBA Size: 512
Namespace 1 IEEE EUI-64: 6479a7 fff0001969
Local Time is: Fri Jun 5 00:12:20 2020 EDT
Firmware Updates (0x12): 1 Slot, no Reset required
Optional Admin Commands (0x001f): Security Format Frmw_DL NS_Mngmt Self_Test
Optional NVM Commands (0x005e): Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp
Maximum Data Transfer Size: 64 Pages
Warning Comp. Temp. Threshold: 70 Celsius
Critical Comp. Temp. Threshold: 85 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     4.50W       -        -    0  0  0  0        0       0
 1 +     2.70W       -        -    1  1  1  1        0       0
 2 +     2.16W       -        -    2  2  2  2        0       0
 3 -   0.0700W       -        -    3  3  3  3     1000    1000
 4 -   0.0020W       -        -    4  4  4  4     5000   55000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         1
 1 -    4096       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: FAILED!
- temperature is above or below threshold

SMART/Health Information (NVMe Log 0x02)
Critical Warning: 0x02
Temperature: 0 Celsius
Available Spare: 100%
Available Spare Threshold: 5%
Percentage Used: 0%
Data Units Read: 492,617 [252 GB]
Data Units Written: 12,137 [6.21 GB]
Host Read Commands: 1,094,694
Host Write Commands: 134,455
Controller Busy Time: 27
Power Cycles: 11
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
No, you’re getting results from smartctl. This particular NVMe doesn’t appear to have a working temperature sensor.
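
For reference, here is a minimal sketch of how the 0x02 in the Critical Warning field breaks down. It assumes the bit assignments from the NVMe SMART / Health Information log page definition, where bit 1 is the temperature flag. The function name and the short descriptions are illustrative only and are not taken from smartctl's source:

```python
# Bit meanings for the Critical Warning byte (byte 0 of NVMe log page 0x02).
# The descriptions are paraphrased for illustration (an assumption, not
# smartctl's exact wording).
CRITICAL_WARNING_BITS = {
    0: "available spare below threshold",
    1: "temperature above or below threshold",
    2: "reliability degraded",
    3: "media in read-only mode",
    4: "volatile memory backup failed",
}

def decode_critical_warning(value):
    """Return the description of every warning bit set in value."""
    return [text for bit, text in CRITICAL_WARNING_BITS.items()
            if value & (1 << bit)]

# 0x02 is the value shown in the smartctl output earlier in the thread.
print(decode_critical_warning(0x02))
```

With 0x02 only the temperature bit is set, which matches the "temperature is above or below threshold" line in the smartctl output.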
 

BenMoores

Cadet
Joined
Nov 21, 2020
Messages
4
I can also confirm that these Crucial P2 NVMe SSDs don't appear to report temperature correctly, but I've done a little more digging.

I've got two CT250P2SSD8 drives with firmware P2CR010. Both consistently report temperatures of around 0-3 degrees Celsius in TrueNAS. In Windows, using the Crucial Storage Executive tool, I get temperatures of around 30 degrees Celsius.

It may just be coincidence, but from the nvmecontrol output it looks like the drives are reporting temperature in tenths of a degree Celsius, e.g. a value of 123 -> 12.3 Celsius.

E.g.
nvmecontrol logpage -p 2 -x nvme0 | grep "000:"
000: 64011900 00000005 00000000 00000000 00000000 00000000 00000000 00000000

Log page 2 is the SMART data, and the raw 'composite temperature' is held in bytes 1 and 2. In the above case, 0x119 = 281.

Subsequent readings give values like 0x114, 0x115, 0x116 and 0x12a, corresponding to values from 276 to 298. If my guess is correct, this would be 27.6 to 29.8 Celsius.
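
The decoding described above can be sketched in Python. This is only an illustration under stated assumptions: it assumes nvmecontrol prints the page as 32-bit words on a little-endian host, so the word 64011900 holds the bytes 00 19 01 64 in memory order, and the function name is hypothetical. It prints the raw value, the tenths-of-a-degree guess, and, for comparison, the value obtained if the field is read as Kelvin, which is how the NVMe specification defines the composite temperature:

```python
import struct

def composite_temperature_raw(first_word_hex):
    """Extract bytes 1-2 (little-endian) from the first 32-bit word of the dump."""
    word = int(first_word_hex, 16)
    # Assumption: the dump shows little-endian 32-bit words, so packing the
    # word as "<I" restores the in-memory byte order (00 19 01 64).
    page_bytes = struct.pack("<I", word)
    return int.from_bytes(page_bytes[1:3], "little")

line = "000: 64011900 00000005 00000000 00000000 00000000 00000000 00000000 00000000"
raw = composite_temperature_raw(line.split()[1])

print(raw)        # 281 (0x119)
print(raw / 10)   # tenths-of-a-degree guess, in Celsius
print(raw - 273)  # Kelvin interpretation, converted to Celsius
```

Under the same byte-order assumption, the 0x64 in the first word and the 0x05 in the second line up with the 100% available spare and 5% spare threshold shown by smartctl, which suggests the byte offsets are being read correctly.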

It may just be coincidence that the numbers line up with my room temperature, but given that the Windows tool does report correct values, there must be a method of getting the real temperature.

There doesn't appear to be any method of applying vendor-specific corrections to NVMe data reported by smartmontools.

I'm going to try contacting Crucial to see if they have any opinion, but I don't hold out much hope.
 

BenMoores

Cadet
Joined
Nov 21, 2020
Messages
4
This is the response from Crucial/Micron. I had only told them I was having trouble with temperature reports in Linux, without any of the details...

Firstly, we apologize for the inconvenience with your SSD. We would like to inform you that there is no actual risk to the health of the drive; this is just an error in the P2 SSD data being reported to the host system. Our engineering team is working on this issue and a fix will be included in the next firmware update, but there is no ETA at this time.
 

CSP-on-FN

Dabbler
Joined
Apr 16, 2015
Messages
15
Thank you Ben,

I recently installed the 500GB version of this Crucial P2 SSD drive - model CT500P2SSD8 - via a PCIe adapter, and mine has the exact same Firmware version as yours (no surprise).

When I ran smartctl -a /dev/nvme0 it reported the drive temperature (when idle) as somewhere below zero degrees Celsius, and when the drive was active, it would 'climb' to around +4 degrees Celsius!

I've got a well-cooled server case, and a quick finger test on the heatsink on my NVMe/M.2 drive assured me that the drive is very comfortable - at around body temperature - so I suspected that this might just be a SMART 'reporting' error .... and your post here confirms my suspicions, thanks.

(I was fairly sure I hadn't wrongly applied the thermal pads when I installed the drive, and I know that, if I had goofed with the thermal pads, the actual drive circuit board could have been sizzling away at some really high temperature underneath that cool heatsink!)

So I'll watch out for a Firmware update from Crucial.
Thanks again.
Colin P.
 

BenMoores

Cadet
Joined
Nov 21, 2020
Messages
4
Thanks Colin, I'm hoping the update won't take too long, as I'd really like to get rid of the 'critical warnings' about drive temperature.

Looking at the graphs from a normal drive and the SSD I can see the data isn't complete rubbish, but I might need to improve my case ventilation :)

[Attached image: 1607363845301.png]
 

BenMoores

Cadet
Joined
Nov 21, 2020
Messages
4
And new firmware has been released...

P2CR012 is a firmware upgrade for 250GB and 500GB P2 drives. It contains the following improvements:
Fixes a SMART temperature reporting error
Improves performance under certain workloads

However, the firmware package doesn't contain the ISO for making a bootable USB drive, which the documentation says it should. A quick search also shows that people are still having trouble with it.

I might wait for the next update.
 

CSP-on-FN

Dabbler
Joined
Apr 16, 2015
Messages
15
Thank you again Ben,
This is very helpful, and I'll check the link you provided.

I agree with you in any case that - as this P2 drive is functioning (for me) trouble-free - it won't hurt to wait for the next update!
Colin P.
 