BOOT-Pool continuous writing [HELP]

ThEnGI

Contributor
Joined
Oct 14, 2023
Messages
140
If you run the script using -dump email it will email me a copy of your drive SMART data and I can look into it.
Done

As for your original problem, have you looked into your SWAP file? Is any being used? If yes, then where is your SWAP file located?
When I run htop, no swap is used.

And then 1GB/h of log
so let's write 1TB/h (perhaps a typo?)

Regarding the TB/h: I was replying to Whattteva, who simplified things by saying it was enough to use an enterprise-grade SSD with a high TBW.

So, the problem is you have 1GB/h being written to the boot pool, is that correct? That's your sda graph, sda is your boot pool? And it's the wear level report. I got all that but started reading other posts and got mixed up. What drive is your application pool on?

sda is the 128GB boot SSD, with constant writing of 256 kB/s, or about 1GB/h. Apps are on "FAST" (2TB NVMe).

If you have 1GB/h being written, can you not determine which files are being written with some monitoring, as a hint? All logging goes to the /var/log directory, so I would think the file(s) are there. That might help the ticket. And that was before you had Kubernetes active, if I understand correctly from post 28. That's a lot of data and cannot possibly be expected behavior.
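One way to get that hint is to snapshot file sizes twice and diff the snapshots. A rough sketch (demonstrated against a scratch directory so it is self-contained; on the NAS you would point it at /var/log and wait a real interval between the two snapshots):

```shell
# Diff two size snapshots of a directory to see which files are growing.
# A scratch directory stands in for /var/log here.
dir=$(mktemp -d)
printf 'stable\n'  > "$dir/stable.log"
printf 'growing\n' > "$dir/growing.log"
du -ab "$dir"/* | sort > snap1.txt
printf 'more data\n' >> "$dir/growing.log"   # simulate a chatty logger
du -ab "$dir"/* | sort > snap2.txt
# Lines unique to the second snapshot = files whose size changed.
changed=$(comm -13 snap1.txt snap2.txt | awk -F'\t' '{print $2}')
echo "$changed"
rm -rf "$dir" snap1.txt snap2.txt
```

Only the file that grew between snapshots is printed, which narrows the culprit down quickly.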

The problem occurred with or without Kubernetes, if that was the question.

To answer the other question, since I don't see it answered: there isn't really a need per se to mirror the boot pool. You can do it for uptime, which I do, as I don't want downtime. As long as you download the config every so often, you can just reinstall SCALE on a new boot drive, restore the config file, and you are back the same as before.
:frown:
 

chuck32

Guru
Joined
Jan 14, 2023
Messages
623
And now I see another user has come in and started posting stuff, I assumed it was the OP but actually looking now, it's not.

I don't want to hijack the thread. I could create a separate one, but I guess it's somewhat related to the OP's problem?
Sorry for the confusion!

I thought I'd add on here, since I'm seeing the same writes (250 KiB-ish) on the boot pool as the OP, and I realized the writes on the system dataset are even higher.
 

sfatula

Guru
Joined
Jul 5, 2022
Messages
608
Your boot pool is more like my apps pool as far as activity. Not good! Very interested to follow your ticket. I am watching it.

I download my config file monthly; reinstalling is really easy. A mirror is even better, but you still want to download the config file every so often just in case. You do want it! I'd hate to have a boot pool failure and no spare drive. But if I had a spare drive, why not mirror it? Unless there are no spare ports.
 

sfatula

Guru
Joined
Jul 5, 2022
Messages
608
Sorry for the confusion!

I thought I'd add on here, since I'm seeing the same writes (250 KiB-ish) on the boot pool as the OP, and I realized the writes on the system dataset are even higher.
The confusion was on my part. As someone who reads and tries to help where they can, I don't always take the time to pay enough attention.

You have Home Assistant; I don't know how long you've used it, but it is very chatty, and that is normal. I am running it in a VM also, via HASSOS. It is forever logging the states of different devices in the logger. I log mine to MariaDB, a separate app on SCALE.
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
Sorry for the confusion!

I thought I'd add on here, since I'm seeing the same writes (250 KiB-ish) on the boot pool as the OP, and I realized the writes on the system dataset are even higher.
@ThEnGI's issue is that his system dataset is not on the boot pool, yet he is experiencing such an abnormal write volume.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
I found that my Cobia system also writes to /dev/sda continuously:
Bildschirmfoto 2023-11-21 um 22.18.18.png

To rule out a problem with the reporting - ESXi seems to agree (the system drive is a VMDK):
Bildschirmfoto 2023-11-21 um 22.18.54.png

And ... *drumroll* ... I present one possible culprit:
Code:
root@truenas[/var/log/netdata]# tail -f /var/log/netdata/access.log
2023-11-21 22:36:45: 955648: 3199 '[localhost]:57448' 'DATA' (sent/all = 5561/53715 bytes -90%, prep/sent/total = 0.70/0.51/1.21 ms) 200 '/api/v1/allmetrics?format=json'
2023-11-21 22:36:45: 955649: 3174 '[localhost]:57464' 'CONNECTED'
2023-11-21 22:36:45: 955649: 3174 '[localhost]:57464' 'DISCONNECTED'
2023-11-21 22:36:45: 955649: 3174 '[localhost]:57464' 'DATA' (sent/all = 5313/53680 bytes -90%, prep/sent/total = 0.60/0.49/1.10 ms) 200 '/api/v1/allmetrics?format=json'
2023-11-21 22:36:47: 955650: 3200 '[localhost]:57476' 'CONNECTED'
2023-11-21 22:36:47: 955650: 3200 '[localhost]:57476' 'DISCONNECTED'
2023-11-21 22:36:47: 955650: 3200 '[localhost]:57476' 'DATA' (sent/all = 5341/53685 bytes -90%, prep/sent/total = 1.49/1.31/2.80 ms) 200 '/api/v1/allmetrics?format=json'
2023-11-21 22:36:47: 955651: 3199 '[localhost]:57482' 'CONNECTED'
2023-11-21 22:36:47: 955651: 3199 '[localhost]:57482' 'DISCONNECTED'
2023-11-21 22:36:47: 955651: 3199 '[localhost]:57482' 'DATA' (sent/all = 5313/53678 bytes -90%, prep/sent/total = 0.76/0.48/1.24 ms) 200 '/api/v1/allmetrics?format=json'
[...]


What the heck? Why log when the middleware continuously polls netdata? And besides - this does not belong on the boot drive.
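For what it's worth, stock netdata can usually be told to silence that access log in netdata.conf. Whether the iX-bundled build honors this, and whether the middleware relies on the log, I can't say; the section and key names have also changed between netdata versions, so treat this as an assumption to verify against the netdata documentation for your version:

```ini
# /etc/netdata/netdata.conf -- older netdata versions use "access log"
# under [global]; newer ones moved logging options to a [logs] section.
[global]
    access log = none
```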
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
Rough calculation, overestimating the data from above: 0.5 MBytes/s amounts to about 43 GB per day, or roughly 15 TB per year. So nothing that will ruin my SSD in the short run, but iX really should re-evaluate

- the amount of logging, especially in a production release
- the location of the log files
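Checking the arithmetic at a flat 0.5 MB/s (a sketch; the real rate fluctuates, so these are ballpark figures):

```shell
# Convert a sustained write rate of 0.5 MB/s into GB/day and TB/year.
gb_per_day=$(awk 'BEGIN { printf "%.1f", 0.5 * 86400 / 1000 }')
tb_per_year=$(awk 'BEGIN { printf "%.1f", 0.5 * 86400 * 365 / 1e6 }')
echo "${gb_per_day} GB/day, ${tb_per_year} TB/year"
```

That lands well under the endurance rating of most SSDs, but it is still wear for no benefit.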
 

sfatula

Guru
Joined
Jul 5, 2022
Messages
608
File a ticket for the log spam with netdata, certainly. Completely unnecessary logging there.
 

chuck32

Guru
Joined
Jan 14, 2023
Messages
623
@ThEnGI issue is that his system dataset is not in the boot pool, yet he is experiencing such abnormal writes volume.
It's the same for me: OP reported around 250 KiB written to the boot pool, and that's the same number I have. Additionally, the writes on my VM pool are an order of magnitude higher. I also don't have the system dataset on the boot pool.
The assigned dev on my ticket said he may have an idea. With @Patrick M. Hausen also reporting issues, I'm confident there will be some update in the future. It's not single users with a misconfiguration at this point, I'd say.

You have Home Assistant; I don't know how long you've used it, but it is very chatty, and that is normal. I am running it in a VM also, via HASSOS. It is forever logging the states of different devices in the logger. I log mine to MariaDB, a separate app on SCALE.
HA is what got me into this whole home server mess ;)
I also thought about MariaDB, because the DB as of now does not seem to be persistent forever. But since I probably don't really need all that history, I haven't gotten around to it. So you used a SCALE app directly to log?
It would still be on the same pool though, so for this particular problem I probably won't gain anything.
 

ThEnGI

Contributor
Joined
Oct 14, 2023
Messages
140
I already opened the ticket; it was set as low priority :frown:
It must be said that it is not urgent, and one's own problems always feel high priority.
@joeschmuck is investigating the values reported by the script, as they are (perhaps) not real. 500GiB written should not result in a 20% reduction in the life of a 128GB SSD.

@chuck32 I didn't check whether the disk where the "system dataset" resides has anomalous writes, but since I average 3-5 MiB/s of writes (due to Docker/Kubernetes), 250 KiB/s makes no difference. The SSD is 2TB (15.625 times larger than the boot pool), so it can easily absorb these extra writes.
 
Last edited:

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
@joeschmuck is investigating the values reported by the script, as they are (perhaps) not real. 500GiB
The value is correct; I verified it last night and confirmed 534.67GB was written (had to do the math manually :smile:). However, the wear level on your drive is not correct due to some off-the-wall SMART reporting. I will chat with you over PM about the wear level, not here, but I do have a customization fix for it.
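For anyone who wants to reproduce that check: SATA SSDs usually expose a Total_LBAs_Written SMART attribute, and the conversion is just LBAs × sector size. The LBA count below is a hypothetical example chosen to land near the 534.67GB figure above, not a value read from the OP's drive; the real number comes from `smartctl -A /dev/sdX`:

```shell
# Convert SMART Total_LBAs_Written to bytes written.
# NOTE: example value only -- read the real one with `smartctl -A /dev/sdX`.
# Most SATA SSDs report 512-byte units, but some vendors use other unit
# sizes, which is exactly the off-the-wall reporting that breaks wear math.
lbas=1044277344
result=$(awk -v l="$lbas" 'BEGIN { printf "%.2f", l * 512 / 1e9 }')
echo "${result} GB written"
```

If a vendor reports in some other unit, the same arithmetic with 512 gives nonsense, so always check the drive's documentation first.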
 

sfatula

Guru
Joined
Jul 5, 2022
Messages
608
HA is what got me into this whole home server mess ;)
I also thought about MariaDB, because the DB as of now does not seem to be persistent forever. But since I probably don't really need all that history, I haven't gotten around to it. So you used a SCALE app directly to log?
It would still be on the same pool though, so for this particular problem I probably won't gain anything.
It really is by default. I am using MariaDB; I don't use iX or TrueCharts apps, but rather docker containers via what are called custom apps on Cobia. I am not on Cobia yet. I don't like SQLite for the most part and find MariaDB better for my purposes, which include several containers on SCALE as well as my own creations, plus the ease with which I can access the data from GUIs like MySQL Workbench or DBeaver. I'd rather have one centralized place to store everything. I have at least a year of HA history logged now; it's actually useful data. It can all be configured in HA.
 

ThEnGI

Contributor
Joined
Oct 14, 2023
Messages
140
OT:
While waiting for a response from iX, I was writing down the next upgrades to do.
Regarding PCIe, I have two x16 slots (16 + 4 lanes) and one x1 slot available.
I still haven't quite figured out what to do with the x1 slot; maybe a 2.5GbE card?
Regarding the x16 connector, I was thinking of mounting an LSI 9300-8i (8-disk) controller in the slot with 4 lanes; is the bandwidth sufficient?

That would leave me an x16 slot in which to install a 4x NVMe adapter, thus obtaining 10 HDD + 6 NVMe, but with 2.5GbE connectivity.
Or I could sacrifice the NVMe disks and install a 10GbE NIC.

What is the best solution?

EDIT: I would like to avoid link aggregation to keep the LAN simple.

/OT
 
Last edited:

ThEnGI

Contributor
Joined
Oct 14, 2023
Messages
140
OT (again)

I'm getting "high" (average below 10%) IO wait, but I didn't understand exactly what it represents...
If I understand correctly, it's the pools that are "slowing down" the system, correct?

And how is the "system load average" value measured in Reporting?
On the dashboard I have a value that varies between 1/5/10%; in the report it's around 1.2 (processes?).
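For reference, the load average is not a percentage: it's the average number of runnable (or uninterruptibly waiting) tasks, so it has to be read against the core count. A toy normalization, assuming a hypothetical 8-core CPU (check yours with `nproc`):

```shell
# Load average is a task count, not a percentage. A load of 1.2 on an
# 8-core machine (hypothetical count) means the run queue averaged 1.2
# tasks, i.e. roughly 15% of total CPU capacity.
pct=$(awk 'BEGIN { printf "%.0f", 1.2 / 8 * 100 }')
echo "${pct}% of capacity"
```

That would line up with a dashboard CPU figure in the 1-10% range: a load of 1.2 is light on a multi-core box.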

I have 6 active Dockers, so I'm not surprised by a bit of load on the CPU

END OT
 

chuck32

Guru
Joined
Jan 14, 2023
Messages
623
I still haven't quite figured out what to do with the x1 slot; maybe a 2.5GbE card?
Regarding the x16 connector, I was thinking of mounting an LSI 9300-8i (8-disk) controller in the slot with 4 lanes; is the bandwidth sufficient?

That would leave me an x16 slot in which to install a 4x NVMe adapter, thus obtaining 10 HDD + 6 NVMe, but with 2.5GbE connectivity.
Or I could sacrifice the NVMe disks and install a 10GbE NIC.
Maybe post in the appropriate subforum to get proper attention.

Regarding the bandwidth, it's an x8 card in an x16 slot; why wouldn't it be sufficient?
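To put rough numbers on it (a sketch: ~0.985 GB/s usable per PCIe 3.0 lane after 128b/130b encoding, and 250 MB/s per HDD is a generous sequential estimate), even the x4-lane slot has headroom for eight spinning disks:

```shell
# Compare a PCIe 3.0 x4 electrical link against 8 HDDs running flat out.
link=$(awk 'BEGIN { printf "%.2f", 4 * 0.985 }')   # x4 link, GB/s
disks=$(awk 'BEGIN { printf "%.2f", 8 * 0.250 }')  # 8 HDDs x 250 MB/s
echo "x4 link ${link} GB/s vs ${disks} GB/s of disk throughput"
```

The link would only become the bottleneck with SSDs behind the HBA.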

From what I read you should stay away from 2.5 GbE: either go 10 GbE or stay at 1 GbE. The performance/hardware for 2.5 GbE is subpar (further reading).

Even with HDDs (depending on your pool layout) you can easily max out 2.5 GbE speeds, so 10 GbE wouldn't be a waste.

I'd probably sacrifice the NVMe disks for 10 GbE. Aren't there 16-port HBAs anyway? You could use 2.5" SATA SSDs then.
 

ThEnGI

Contributor
Joined
Oct 14, 2023
Messages
140
OK regarding the 2.5GbE.

The idea is:
1x PCIe 3.0 x16 slot (16 lanes): 8-port HBA
1x PCIe 3.0 x16 slot (4 lanes): 10GbE NIC (X520-DA1)
1x PCIe 3.0 x1 slot (1 lane): 1 NVMe (via adapter), at reduced speed

In my case there are only 10 3.5" bays and 2 2.5" bays, and the two 2.5" drives are the boot disks. An 8-port HBA is enough to cover all the disks (the motherboard has 6 SATA ports). Maybe I will use 2.5"-to-3.5" adapters and SATA SSDs, which is not a bad idea.

I'm writing here because it's not urgent and I needed to get the post up; at the moment the priorities are a UPS, a second HDD, and a second NVMe.
 

ThEnGI

Contributor
Joined
Oct 14, 2023
Messages
140
They closed the ticket, reporting that the swap is on the boot disk.
How did it end up there? How can I move it?
I looked in the GUI, but it only lets me change the size.
 

chuck32

Guru
Joined
Jan 14, 2023
Messages
623
They closed my ticket for the same reason. I have 128 GB of memory (with usually 30 GB free, depending on which VMs are spun up), so I don't really see a reason to use swap at all.

How did it end up there?
Probably said yes upon installation to creating a swap partition. IIRC I ended up with swap on an earlier installation (Bluefin) even when I said no.

I looked in the GUI but it only lets me change the size
Did you try setting it to 0?

I'm thinking about reinstalling when I swap out my PSU / want to install the next Cobia update, if it cannot be removed as-is. This thread, however, suggests I may get away with replacing the drives with themselves, since I mirrored my boot pool.
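Before changing anything, it's worth confirming what swap is actually active. A sketch (/proc/swaps is standard Linux; the swapoff line is commented out on purpose, and the device name there is only an example):

```shell
# List active swap devices; /proc/swaps has a header line, so skip it.
swaps=$(tail -n +2 /proc/swaps | awk '{print $1}')
msg="${swaps:-no swap active}"
echo "$msg"
# To disable a device for the running session only (lasts until reboot):
#   swapoff /dev/sdXN   # use a device shown above; double-check first!
```

Note that `swapoff` does not survive a reboot, and how SCALE recreates swap partitions on upgrade or reinstall is a separate question.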
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
Just for context - my boot-pool contains the system dataset and is being written to at an average of 317.32 KiB over whatever time period is involved.
1702330862535.png

It's continuous.
1702330890101.png
 

ThEnGI

Contributor
Joined
Oct 14, 2023
Messages
140
To remove the swap from the boot drive, do I need to reinstall TrueNAS?
 