Can't figure out why scrub is so slow / replication failing during slow scrub

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
If replication is over the network, maybe that's involved... which NIC do you have in there?

The user manual indicates these options were available:
Broadcom 5720 Base-T
Intel I350 Base-T
Broadcom 57800 SFP+
Broadcom 57800 Base-T
Intel X540 Base-T
 

Ulysse_31

Dabbler
Joined
Aug 22, 2023
Messages
49
Were you looking?
Like I said: we never had to ... apart from the fact that when we were working on the server while a scrub was running, we spotted speeds around 200 MB/s ... I just see that doing the same on TrueNAS, replications fail with the message above ... and I must pause the scrub in order to get my replications working ^^'
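For reference, pausing and resuming is just this, with "tank" standing in for the real pool name:

zpool scrub -p tank # pause the running scrub so replication can proceed
zpool scrub tank # resume the scrub where it left off, once replication is done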

I think you're confused. Or are we not talking about the same pool?

Not sure if we are talking about the same pool => I am talking about the one under Solaris; the configuration is "older", so SAS is 6 Gb/s ... the pool started with a similar config (1x RAIDZ2) and was expanded gradually as its usage increased (each time it hit 90%, we added a new drive bay)
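If it helps, the growth is visible in the pool's own history; a quick sketch, "tank" again being a placeholder for the real pool name:

zpool history tank | grep -w add # lists each "zpool add" with its timestamp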

Don't know. No evidence given so far provides a valid reason other than your pool isn't coping with the workload (which is demanding more IOPS than it has) as far as I can see.
What information would you need? ^^ Just ask
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
Not sure if we are talking about the same pool => I am talking about the one under Solaris; the configuration is "older", so SAS is 6 Gb/s ... the pool started with a similar config (1x RAIDZ2) and was expanded gradually as its usage increased (each time it hit 90%, we added a new drive bay)
So it's not a RAID 0 pool, it's 3 RAIDZ2 VDEVs, each subject to the IOPS limitations mentioned earlier, meaning you get 3x the IOPS of one RAIDZ2 VDEV (which is roughly the IOPS of one disk).
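You can confirm the topology yourself; a quick sketch, with "tank" as a stand-in for your pool name:

zpool status tank # should list three raidz2 vdevs (raidz2-0, raidz2-1, raidz2-2) under the pool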
 

Ulysse_31

Dabbler
Joined
Aug 22, 2023
Messages
49
If replication is over the network, maybe that's involved... which NIC do you have in there?

The user manual indicates these options were available:
Broadcom 5720 Base-T
Intel I350 Base-T
Broadcom 57800 SFP+
Broadcom 57800 Base-T
Intel X540 Base-T
The "culprit" truenas server is using intel X520 SFP+ 10Gbit network card ... I really would have liked if the issue would simply be a network interface issue ^^"
 

Ulysse_31

Dabbler
Joined
Aug 22, 2023
Messages
49
The "culprit" truenas server is using intel X520 SFP+ 10Gbit network card ... I really would have liked if the issue would simply be a network interface issue ^^"
Intel X520 SFP+ with dual SFP+ in an LACP lagg, to be more precise ...
 

Ulysse_31

Dabbler
Joined
Aug 22, 2023
Messages
49
Not being able to edit and correct errors in posts is a bit annoying ... but hey ... => dual 10 Gbit SFP+ in an LACP lagg
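For completeness, the lagg itself checks out from the shell; "lagg0" is simply what it's called here:

ifconfig lagg0 # should report laggproto lacp and the two member ports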
 

Ulysse_31

Dabbler
Joined
Aug 22, 2023
Messages
49
So it's not a RAID 0 pool, it's 3 RAIDZ2 VDEVs, each subject to the IOPS limitations mentioned earlier, meaning you get 3x the IOPS of one RAIDZ2 VDEV (which is roughly the IOPS of one disk).
Well ... sorry for the abuse of language ... but I'm used to calling that a striped RAIDZ2 ^^' ... and also an "aggregate of RAIDZ2" ... or even "a RAID 0 of RAIDZ2" (since a RAID 0 is an aggregate ^^') ... anyway ^^'
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
Well ... sorry for the abuse of language ... but I'm used to calling that a striped RAIDZ2 ^^' ... and also an "aggregate of RAIDZ2" ... or even "a RAID 0 of RAIDZ2" (since a RAID 0 is an aggregate ^^') ... anyway ^^'
So you meant to say a RAIDZ2 pool with 3 VDEVs.

And like I said, those 3 VDEVs mean you get the IOPS of 3x 1 VDEV, which is 3 disks (about 600 IOPS, depending on your disks).
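You can watch that ceiling in action during a scrub; a quick sketch, with "tank" as a stand-in for your pool name:

zpool iostat -v tank 5 # per-VDEV read ops/s, refreshed every 5 seconds

If each RAIDZ2 VDEV tops out at roughly the ops/s of a single disk, that's the limit I mean.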

intel X520 SFP+ with dual SFP+ in LACP lagg to be more precise
OK, so that should not be the problem here.

Back to the list of what can be wrong... are you sure you don't see any CAM status messages in dmesg?
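Something like this should turn them up if they're there:

dmesg | grep -i cam
grep -i "cam status" /var/log/messages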
 

Ulysse_31

Dabbler
Joined
Aug 22, 2023
Messages
49
Hmmm ... looking into dmesg (/var/log/messages), the logs for the last 10 days (August 12 to now) do not have any mention of "CAM" in them.
On the other hand, I found a recurring event (around every 30 secs), coming from what seems to be the FreeBSD ses (enclosure) driver:

Aug 12 04:44:59 <host> ses3: da1,pass3,da18,pass22 in 'Drive Slot 0', SAS Slot: 2 phys at slot 0
Aug 12 04:44:59 <host> ses3: phy 0: SAS device type 1 phy 0 Target ( SSP )
Aug 12 04:44:59 <host> ses3: phy 0: parent 5204747299636a7f addr 50000399986b15e2
Aug 12 04:44:59 <host> ses3: phy 1: SAS device type 1 phy 1 Target ( SSP )
Aug 12 04:44:59 <host> ses3: phy 1: parent 5204747299636aff addr 50000399986b15e3
Aug 12 04:44:59 <host> ses3: da2,pass4,da16,pass20 in 'Drive Slot 1', SAS Slot: 2 phys at slot 1
Aug 12 04:44:59 <host> ses3: phy 0: SAS device type 1 phy 0 Target ( SSP )
Aug 12 04:44:59 <host> ses3: phy 0: parent 5204747299636a7f addr 5000039a280a59e2
Aug 12 04:44:59 <host> ses3: phy 1: SAS device type 1 phy 1 Target ( SSP )
Aug 12 04:44:59 <host> ses3: phy 1: parent 5204747299636aff addr 5000039a280a59e3
Aug 12 04:44:59 <host> ses3: da3,pass5,da17,pass21 in 'Drive Slot 2', SAS Slot: 2 phys at slot 2
Aug 12 04:44:59 <host> ses3: phy 0: SAS device type 1 phy 0 Target ( SSP )
Aug 12 04:44:59 <host> ses3: phy 0: parent 5204747299636a7f addr 50000399987053a2

This message loops again and again ... with no particular slot / phy number (among all drives) ... hmmm ... as if the ses driver were scanning the drives over and over again?
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
This message loops again and again ... with no particular slot / phy number (among all drives) ... hmmm ... as if the ses driver were scanning the drives over and over again?
OK, so maybe we're getting somewhere.

That looks like a controller reset... and you say it's happening repeatedly? How often? (It can't be every 30 seconds; you'd have lost your pool by now.)

What version of firmware are you using on your HBA?

How is the airflow over the HBA? Is it getting hot? (Maybe this explains it if it's only a problem under heavy load like a scrub, only reaching really high temperatures when that's running.)
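A quick way to check both from the shell (assuming the card runs on the mpr driver, as the SAS3 models do):

mprutil show adapter # shows the Firmware Revision and the chip Temperature
grep -c ses3 /var/log/messages # rough count of how often that loop recurs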
 

Ulysse_31

Dabbler
Joined
Aug 22, 2023
Messages
49
So you meant to say a RAIDZ2 pool with 3 VDEVs.

And like I said, those 3 VDEVs mean you get the IOPS of 3x 1 VDEV, which is 3 disks (about 600 IOPS, depending on your disks).


OK, so that should not be the problem here.

Back to the list of what can be wrong... are you sure you don't see any CAM status messages in dmesg?
For the 600 IOPS stuff => theoretically, yes, I do see the idea ... but this pool was NOT created from the start with 3 VDEVs ... at the start it was 1 VDEV, then, at 90% of use, went to 2 VDEVs, and again, at 90% of usage, went to 3 VDEVs, so the data has been "mainly" (roughly ... 90%? ^^') spread across 1 VDEV, then across the newly added VDEV ... then across the last one ... so scrubbing the entire data set should lead to a "sequential" behavior of the scrubbing task ... at least ... that's how I see it ^^' ... but tell me if you see it differently ^^
 

Ulysse_31

Dabbler
Joined
Aug 22, 2023
Messages
49
OK, so maybe we're getting somewhere.

That looks like a controller reset... and you say it's happening repeatedly? How often? (It can't be every 30 seconds; you'd have lost your pool by now.)

What version of firmware are you using on your HBA?

How is the airflow over the HBA? Is it getting hot? (Maybe this explains it if it's only a problem under heavy load like a scrub, only reaching really high temperatures when that's running.)
Here is the output of the "mprutil show adapter" command:

mpr0 Adapter:
Board Name: SAS9300-8e
Board Assembly: 03-25656-02A
Chip Name: LSISAS3008
Chip Revision: ALL
BIOS Revision: 4.00.00.00
Firmware Revision: 3.00.08.00
Integrated RAID: no
SATA NCQ: ENABLED
PCIe Width/Speed: x8 (8.0 GB/sec)
IOC Speed: Full
Temperature: 65 C

PhyNum CtlrHandle DevHandle Disabled Speed Min Max Device
0 0002 0017 N 12 3.0 12 SAS Initiator
1 0002 0017 N 12 3.0 12 SAS Initiator
2 0002 0017 N 12 3.0 12 SAS Initiator
3 0002 0017 N 12 3.0 12 SAS Initiator
4 0001 0009 N 12 3.0 12 SAS Initiator
5 0001 0009 N 12 3.0 12 SAS Initiator
6 0001 0009 N 12 3.0 12 SAS Initiator
7 0001 0009 N 12 3.0 12 SAS Initiator
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
at the start it was 1 VDEV, then, at 90% of use, went to 2 VDEVs, and again, at 90% of usage, went to 3 VDEVs, so the data has been "mainly" (roughly ... 90%? ^^') spread across 1 VDEV, then across the newly added VDEV ... then across the last one ... so scrubbing the entire data set should lead to a "sequential" behavior of the scrubbing task ... at least ... that's how I see it ^^' ... but tell me if you see it differently ^^
That might represent a skewed layout of the data across the VDEVs, but actually this will just mean the scrub process benefits little from the additional VDEVs as the first VDEV is still a bottleneck for most of the used blocks.

I noticed about a 10% improvement in my scrub times when I did a rebalance of the data on my largest pool between 3 RAIDZ2 VDEVs. I still have about 1/3 of the data not balanced and I also don't have evenly sized VDEVs, so you might expect better results by doing the same.
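Rebalancing here just means rewriting the data in place so the new copies spread across all current VDEVs; a naive per-file sketch ("somefile" is hypothetical, you need enough free space, and snapshots will keep holding the old blocks):

cp -a somefile somefile.tmp && mv somefile.tmp somefile # rewrite so the blocks reallocate across all VDEVs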


You can check it with zpool list -v

There's no telling what order the scrubbing code will do things, but it will need to get to all the blocks that are used at some point in the scrub.
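What to look at, roughly ("tank" again being an example name):

zpool list -v tank # compare the CAP column of each raidz2 VDEV; a wide spread means unbalanced data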

Here is the output of the "mprutil show adapter" command:
Can we get the output from sas3flash -list ?

I'm going in the direction of this:
 

Ulysse_31

Dabbler
Joined
Aug 22, 2023
Messages
49
That might represent a skewed layout of the data across the VDEVs, but actually this will just mean the scrub process benefits little from the additional VDEVs as the first VDEV is still a bottleneck for most of the used blocks.
Correct yes ^^
I noticed about a 10% improvement in my scrub times when I did a rebalance of the data on my largest pool between 3 RAIDZ2 VDEVs. I still have about 1/3 of the data not balanced and I also don't have evenly sized VDEVs, so you might expect better results by doing the same.


You can check it with zpool list -v

There's no telling what order the scrubbing code will do things, but it will need to get to all the blocks that are used at some point in the scrub.
Hmmm ... I do see the idea ... but ... right now ... that host is again at 90% (but "retired" / no new data going to it ^^') ... rebalancing 111 TB ... well ... I'll just say ... let's not tempt the devil ^^' ... and again: I do not have any issues with this Solaris host's performance ^^
Can we get the output from sas3flash -list ?

I'm going in the direction of this:
Here you have it:
root@<host>[/var/log]# sas3flash -list
Avago Technologies SAS3 Flash Utility
Version 16.00.00.00 (2017.05.02)
Copyright 2008-2017 Avago Technologies. All rights reserved.

Adapter Selected is a Avago SAS: SAS3008(C0)

Controller Number : 0
Controller : SAS3008(C0)
PCI Address : 00:82:00:00
SAS Address : 500605b-0-09a3-0710
NVDATA Version (Default) : 03.05.00.06
NVDATA Version (Persistent) : 03.05.00.06
Firmware Product ID : 0x2221 (IT)
Firmware Version : 03.00.08.00
NVDATA Vendor : LSI
NVDATA Product ID : SAS9300-8e
BIOS Version : N/A
UEFI BSD Version : 04.00.00.00
FCODE Version : N/A
Board Name : SAS9300-8e
Board Assembly : 03-25656-02A
Board Tracer Number : SV44042983

Finished Processing Commands Successfully.
Exiting SAS3Flash.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
OK, that's an ancient firmware... maybe consider updating it to 16.00.12.00 as recommended in the linked post.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
again: I do not have any issues with this Solaris host
So are you saying we're not even talking about TrueNAS here?

You're posting in the TrueNAS CORE forum.
 

Ulysse_31

Dabbler
Joined
Aug 22, 2023
Messages
49
I'm going in the direction of this:

Hmmm ... I've been reading this ... this is for flashing the card when you do not see the drives ... I do see the drives ... I have no issues accessing them (I do not have CAM timeouts, drive errors ... etc.) ... on the other hand ... I see lots of posts from people flashing their cards to newer firmware ... and then not seeing their drives ... I'm not really comfortable flashing a card that seems to work ... flashing my card may lead to losing access to the drives ...
I really would like to use this as a last resort ^^' ... or at least be sure that my problem is really coming from there ... I need to dig further in this direction ...
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
That article talks about the difference between the public firmware (16.00.11.00) and the special one (16.00.12.00).

The differences between your firmware (03.00.08.00) and even the public one are in no way limited to (and are likely to be immense compared to) the ones listed in that post.

I guess consulting the LSI website for the firmware release notes (though you will need to collect quite a few of them) will tell you what the changes are.

I would consider the issue "found" at this point and not really look at anything other than firmware until you update it to one from this decade.

If you elect to flash the card, you must certainly do it with your pools offline (many people opt for the EFI version of the updater to deliver that result).
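If you do go ahead, the flash itself is usually just a couple of commands from the updater environment (the firmware filename below is an example; use the files from the 9300-8e IT package you actually download):

sas3flash -o -f SAS9300_8E_IT.bin -b mptsas3.rom # flash the firmware (and optionally the boot BIOS image)
sas3flash -list # confirm the new Firmware Version afterwards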
 