SOLVED SSD Life - Are setting changes required for good SSD Life?

NASbox

Guru
Joined
May 8, 2012
Messages
644
I added this drive so that my boot pool is mirrored (the database is also on this drive) and I notice that smart attribute 231 SSD_Life_Left is dropping 1 point every 35-40 days. The drive is 120GB and it is mostly empty:

#>zfs list freenas-boot
NAME USED AVAIL REFER MOUNTPOINT
freenas-boot 13.0G 94.6G 176K none


It seems a bit excessive given the amount of work being done these days - or am I off base? Do I need to set a tunable for optimum drive life?

20200625_050000~KINGSTON_SA400S37120G
Code:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x0032   100   100   000    Old_age   Always       -       100
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       3856
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       16
148 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      -       0
149 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      -       0
167 Write_Protect_Mode      0x0000   100   100   000    Old_age   Offline      -       0
168 SATA_Phy_Error_Count    0x0012   100   100   000    Old_age   Always       -       0
169 Bad_Block_Rate          0x0000   100   100   000    Old_age   Offline      -       0
170 Bad_Blk_Ct_Erl/Lat      0x0000   100   100   010    Old_age   Offline      -       0/0
172 Erase_Fail_Count        0x0032   100   100   000    Old_age   Always       -       0
173 MaxAvgErase_Ct          0x0000   100   100   000    Old_age   Offline      -       0
181 Program_Fail_Count      0x0032   100   100   000    Old_age   Always       -       0
182 Erase_Fail_Count        0x0000   100   100   000    Old_age   Offline      -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
192 Unsafe_Shutdown_Count   0x0012   100   100   000    Old_age   Always       -       11
194 Temperature_Celsius     0x0022   043   054   000    Old_age   Always       -       43 (Min/Max 31/54)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
199 SATA_CRC_Error_Count    0x0032   100   100   000    Old_age   Always       -       0
218 CRC_Error_Count         0x0032   100   100   000    Old_age   Always       -       0
231 SSD_Life_Left           0x0000   094   094   000    Old_age   Offline      -       94
233 Flash_Writes_GiB        0x0032   100   100   000    Old_age   Always       -       1568
241 Lifetime_Writes_GiB     0x0032   100   100   000    Old_age   Always       -       3205
242 Lifetime_Reads_GiB      0x0032   100   100   000    Old_age   Always       -       392
244 Average_Erase_Count     0x0000   100   100   000    Old_age   Offline      -       120
245 Max_Erase_Count         0x0000   100   100   000    Old_age   Offline      -       157
246 Total_Erase_Count       0x0000   100   100   000    Old_age   Offline      -       24281


20200804_050000~KINGSTON_SA400S37120G
Code:
SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x0032   100   100   000    Old_age   Always       -       100
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       4812
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       16
148 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      -       0
149 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      -       0
167 Write_Protect_Mode      0x0000   100   100   000    Old_age   Offline      -       0
168 SATA_Phy_Error_Count    0x0012   100   100   000    Old_age   Always       -       0
169 Bad_Block_Rate          0x0000   100   100   000    Old_age   Offline      -       0
170 Bad_Blk_Ct_Erl/Lat      0x0000   100   100   010    Old_age   Offline      -       0/0
172 Erase_Fail_Count        0x0032   100   100   000    Old_age   Always       -       0
173 MaxAvgErase_Ct          0x0000   100   100   000    Old_age   Offline      -       0
181 Program_Fail_Count      0x0032   100   100   000    Old_age   Always       -       0
182 Erase_Fail_Count        0x0000   100   100   000    Old_age   Offline      -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
192 Unsafe_Shutdown_Count   0x0012   100   100   000    Old_age   Always       -       11
194 Temperature_Celsius     0x0022   043   056   000    Old_age   Always       -       43 (Min/Max 31/56)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
199 SATA_CRC_Error_Count    0x0032   100   100   000    Old_age   Always       -       0
218 CRC_Error_Count         0x0032   100   100   000    Old_age   Always       -       0
231 SSD_Life_Left           0x0000   093   093   000    Old_age   Offline      -       93
233 Flash_Writes_GiB        0x0032   100   100   000    Old_age   Always       -       2178
241 Lifetime_Writes_GiB     0x0032   100   100   000    Old_age   Always       -       4236
242 Lifetime_Reads_GiB      0x0032   100   100   000    Old_age   Always       -       471
244 Average_Erase_Count     0x0000   100   100   000    Old_age   Offline      -       140
245 Max_Erase_Count         0x0000   100   100   000    Old_age   Offline      -       157
246 Total_Erase_Count       0x0000   100   100   000    Old_age   Offline      -       28205
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
Moving the database to a pool will reduce wear on the drives.

Perhaps we are misinterpreting the SMART attribute 231 'SSD_Life_Left'. These drives have low power-on hours -- hard to believe they are wearing out so quickly.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,702
231 SSD_Life_Left is dropping 1 point every 35-40 days
If you multiply that out, 40 x 200 means you will hit zero somewhere in 21+ years... I don't see where the problem is.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,740
This might help put things in perspective:

 

NASbox

Guru
Joined
May 8, 2012
Messages
644
Hi @Spearfoot, @sretalla, @Jurgen Segaer, @Patrick M. Hausen, thanks for the replies.

Moving the database to a pool will reduce wear on the drives.

Perhaps we are misinterpreting the SMART attribute 231 'SSD_Life_Left'. These drives have low power-on hours -- hard to believe they are wearing out so quickly.
It's actually the same drive... just showing the change after about 40 days.
If you multiply that out, 40 x 200 means you will hit zero somewhere in 21+ years... I don't see where the problem is.
Should this not be 40x100?
Actually the value is 93 (the 231 is the SMART attribute ID#), about 9 years left.
Yes, I think you are right
This might help put things in perspective:
Thanks... In my case it's a small drive... just looked up the specs, and it's only rated for a lifetime 40TBW.
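To put the 40TBW figure next to the two reports posted earlier in the thread, here is a rough sketch of the arithmetic. It assumes that attribute 241 (Lifetime_Writes_GiB) counts host writes in GiB and that the 40TBW rating is in decimal terabytes; neither is confirmed in the thread, and the helper names are made up for illustration.

```python
GIB = 2**30
# 40 TBW rating quoted in the thread; decimal terabytes is an assumption.
RATED_TBW_BYTES = 40 * 10**12

# (Power_On_Hours, Lifetime_Writes_GiB) from the two SMART reports in the first post.
first = (3856, 3205)
second = (4812, 4236)


def daily_write_gib(a, b):
    """Average GiB written per day of power-on time between two reports."""
    hours = b[0] - a[0]
    written = b[1] - a[1]
    return written / (hours / 24)


def rating_used_fraction(lifetime_writes_gib):
    """Share of the rated TBW already written (assumes 241 = host writes)."""
    return lifetime_writes_gib * GIB / RATED_TBW_BYTES


rate = daily_write_gib(first, second)
used = rating_used_fraction(second[1])
print(f"average write rate: {rate:.1f} GiB/day")
print(f"share of 40 TBW rating written so far: {used:.1%}")
```

The TBW rating is a warranty figure rather than a predicted failure point, so this is only a second yardstick alongside SSD_Life_Left, not a replacement for it.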

For some reason the default scrub interval is 7 days... not quite sure why it needs to be that short. I just changed it to 28 days to see if that lightens up the drive use a bit.

I guess it's not an issue... I was thinking that endurance was higher, and that maybe some setting needed to change to enable proper wear levelling, since the drive is 120GB and it has about 11GB used (and could be much less if I cleared up a few of the unneeded boot environments).
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,740
Despite the wording a "scrub" does not write to your disk unless there are corrupt blocks already. It simply reads all the data and verifies the checksums and the metadata structure.
 

NASbox

Guru
Joined
May 8, 2012
Messages
644
Despite the wording a "scrub" does not write to your disk unless there are corrupt blocks already. It simply reads all the data and verifies the checksums and the metadata structure.
Thanks for that.... so I am assuming that means the scrub won't adversely affect drive life.

Would you recommend I put scrub back to 7 days?
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,740

NASbox

Guru
Joined
May 8, 2012
Messages
644
Just ran a scrub.... not sure what "11.7G issued" means. I've posted info from the scrub below and it appears that about 1/2 that much was actually written. The 11.7GB is about the size of the pool.

Code:
FN#>zpool status freenas-boot
  pool: freenas-boot
 state: ONLINE
  scan: scrub in progress since Tue Aug 11 06:10:21 2020
        12.9G scanned at 232M/s, 11.7G issued at 210M/s, 12.9G total
        0 repaired, 90.46% done, 0 days 00:00:06 to go
config:

        NAME        STATE     READ WRITE CKSUM
        freenas-boot  ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            ada2p2  ONLINE       0     0     0
            ada3p2  ONLINE       0     0     0

errors: No known data errors
FN#>zpool status freenas-boot
  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0 in 0 days 00:01:01 with 0 errors on Tue Aug 11 06:11:22 2020
config:

        NAME        STATE     READ WRITE CKSUM
        freenas-boot  ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            ada2p2  ONLINE       0     0     0
            ada3p2  ONLINE       0     0     0

errors: No known data errors


Before scrub:
Code:
SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x0032   100   100   000    Old_age   Always       -       100
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       4979
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       18
148 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      -       0
149 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      -       0
167 Write_Protect_Mode      0x0000   100   100   000    Old_age   Offline      -       0
168 SATA_Phy_Error_Count    0x0012   100   100   000    Old_age   Always       -       0
169 Bad_Block_Rate          0x0000   100   100   000    Old_age   Offline      -       0
170 Bad_Blk_Ct_Erl/Lat      0x0000   100   100   010    Old_age   Offline      -       0/0
172 Erase_Fail_Count        0x0032   100   100   000    Old_age   Always       -       0
173 MaxAvgErase_Ct          0x0000   100   100   000    Old_age   Offline      -       0
181 Program_Fail_Count      0x0032   100   100   000    Old_age   Always       -       0
182 Erase_Fail_Count        0x0000   100   100   000    Old_age   Offline      -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
192 Unsafe_Shutdown_Count   0x0012   100   100   000    Old_age   Always       -       12
194 Temperature_Celsius     0x0022   040   056   000    Old_age   Always       -       40 (Min/Max 31/56)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
199 SATA_CRC_Error_Count    0x0032   100   100   000    Old_age   Always       -       0
218 CRC_Error_Count         0x0032   100   100   000    Old_age   Always       -       0
231 SSD_Life_Left           0x0000   093   093   000    Old_age   Offline      -       93
233 Flash_Writes_GiB        0x0032   100   100   000    Old_age   Always       -       2294
241 Lifetime_Writes_GiB     0x0032   100   100   000    Old_age   Always       -       4416
242 Lifetime_Reads_GiB      0x0032   100   100   000    Old_age   Always       -       486
244 Average_Erase_Count     0x0000   100   100   000    Old_age   Offline      -       142
245 Max_Erase_Count         0x0000   100   100   000    Old_age   Offline      -       158
246 Total_Erase_Count       0x0000   100   100   000    Old_age   Offline      -       28662


After scrub:
Code:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x0032   100   100   000    Old_age   Always       -       100
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       4979
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       18
148 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      -       0
149 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      -       0
167 Write_Protect_Mode      0x0000   100   100   000    Old_age   Offline      -       0
168 SATA_Phy_Error_Count    0x0012   100   100   000    Old_age   Always       -       0
169 Bad_Block_Rate          0x0000   100   100   000    Old_age   Offline      -       0
170 Bad_Blk_Ct_Erl/Lat      0x0000   100   100   010    Old_age   Offline      -       0/0
172 Erase_Fail_Count        0x0032   100   100   000    Old_age   Always       -       0
173 MaxAvgErase_Ct          0x0000   100   100   000    Old_age   Offline      -       0
181 Program_Fail_Count      0x0032   100   100   000    Old_age   Always       -       0
182 Erase_Fail_Count        0x0000   100   100   000    Old_age   Offline      -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
192 Unsafe_Shutdown_Count   0x0012   100   100   000    Old_age   Always       -       12
194 Temperature_Celsius     0x0022   040   056   000    Old_age   Always       -       40 (Min/Max 31/56)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
199 SATA_CRC_Error_Count    0x0032   100   100   000    Old_age   Always       -       0
218 CRC_Error_Count         0x0032   100   100   000    Old_age   Always       -       0
231 SSD_Life_Left           0x0000   093   093   000    Old_age   Offline      -       93
233 Flash_Writes_GiB        0x0032   100   100   000    Old_age   Always       -       2295
241 Lifetime_Writes_GiB     0x0032   100   100   000    Old_age   Always       -       4416
242 Lifetime_Reads_GiB      0x0032   100   100   000    Old_age   Always       -       499
244 Average_Erase_Count     0x0000   100   100   000    Old_age   Offline      -       142
245 Max_Erase_Count         0x0000   100   100   000    Old_age   Offline      -       158
246 Total_Erase_Count       0x0000   100   100   000    Old_age   Offline      -       28663
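
One way to compare the two reports is to diff the raw values row by row. The sketch below is a minimal, assumed approach (the `raw_values` and `deltas` helpers are hypothetical, not part of smartctl); it copies only the wear-related rows from the before and after tables and prints how much each raw value changed across the scrub.

```python
# Wear-related rows copied from the "Before scrub" and "After scrub" reports.
BEFORE = """\
231 SSD_Life_Left           0x0000   093   093   000    Old_age   Offline      -       93
233 Flash_Writes_GiB        0x0032   100   100   000    Old_age   Always       -       2294
241 Lifetime_Writes_GiB     0x0032   100   100   000    Old_age   Always       -       4416
242 Lifetime_Reads_GiB      0x0032   100   100   000    Old_age   Always       -       486
246 Total_Erase_Count       0x0000   100   100   000    Old_age   Offline      -       28662
"""

AFTER = """\
231 SSD_Life_Left           0x0000   093   093   000    Old_age   Offline      -       93
233 Flash_Writes_GiB        0x0032   100   100   000    Old_age   Always       -       2295
241 Lifetime_Writes_GiB     0x0032   100   100   000    Old_age   Always       -       4416
242 Lifetime_Reads_GiB      0x0032   100   100   000    Old_age   Always       -       499
246 Total_Erase_Count       0x0000   100   100   000    Old_age   Offline      -       28663
"""


def raw_values(table):
    """Map attribute ID -> (name, integer raw value) for each attribute row."""
    values = {}
    for line in table.splitlines():
        fields = line.split()
        # An attribute row has at least 10 columns and starts with a numeric ID.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        try:
            values[int(fields[0])] = (fields[1], int(fields[9]))
        except ValueError:
            continue  # raw values that are not plain integers (e.g. "0/0") are skipped
    return values


def deltas(before, after):
    """Return {ID: (name, after - before)} for IDs present in both tables."""
    b, a = raw_values(before), raw_values(after)
    return {i: (b[i][0], a[i][1] - b[i][1]) for i in b if i in a}


for attr_id, (name, change) in sorted(deltas(BEFORE, AFTER).items()):
    print(f"{attr_id:>3} {name:<22} {change:+d}")
```

With the figures posted here, Lifetime_Reads_GiB rises by 13 (roughly the 12.9G the scrub scanned) while Lifetime_Writes_GiB does not move, which is consistent with a scrub being a read operation when nothing needs repair.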
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,740

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,702
Should this not be 40x100?
Sure, I jumped to the conclusion after misreading your line... no problem. My comment still stands... 9 years left... where's the problem?

I think it's taking into account a combination of power-on hours and TBW to get to the rate of deterioration. About 10 points a year isn't bad at all.
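
As a rough check on those numbers, the sketch below projects the remaining life linearly from the figures quoted in the thread (a current value of 93 and a drop of 1 point every 35-40 days). It assumes the counter keeps falling at a constant rate, which the thread does not establish, and the function names are only illustrative.

```python
# Figures from the thread: attribute 231 currently reads 93 and has been
# dropping about 1 point every 35-40 days.
life_left_points = 93
days_per_point_range = (35, 40)


def years_remaining(points, days_per_point):
    """Linear projection: points left times days per point, expressed in years."""
    return points * days_per_point / 365.0


def points_per_year(days_per_point):
    """How many points the counter loses per year at the given pace."""
    return 365.0 / days_per_point


for days in days_per_point_range:
    print(f"{days} days/point: {points_per_year(days):.1f} points/year, "
          f"about {years_remaining(life_left_points, days):.1f} years to reach 0")
```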
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,702
not sure what "11.7G issued" means
That sounds like about the amount of bits that would really be behind 12.9GB of data including parity... so "actual data checked".
 