Can't figure out why scrub is so slow / replication failing during slow scrub

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
If replication is over the network, maybe that's involved... which NIC do you have in there?

The user manual indicates these options were available:
Broadcom 5720 Base-T
Intel I350 Base-T
Broadcom 57800 SFP+
Broadcom 57800 Base-T
Intel X540 Base-T
 

Ulysse_31

Dabbler
Joined
Aug 22, 2023
Messages
49
Were you looking?
Like I said: we never had to ... apart from the fact that when we were working on the server while a scrub was running, we spotted speeds around 200 MB/s ... I just see that doing the same on TrueNAS, replications fail with the message above ... and I must pause the scrub in order to get my replications working ^^'
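For reference, pausing and resuming is just this, with "tank" standing in for the real pool name:

zpool scrub -p tank # pause the running scrub so replication can proceed
zpool scrub tank # resume the scrub where it left off, once replication is done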

I think you're confused. Or are we not talking about the same pool?

Not sure if we are talking about the same pool => I am talking about the one under Solaris; the configuration is "older", so SAS is 6 Gb/s ... the pool started with a similar config (1x RAIDZ2) and was expanded gradually as its usage increased (each time it hit 90%, we added a new drive bay)
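If it helps, the growth is visible in the pool's own history; a quick sketch, "tank" again being a placeholder for the real pool name:

zpool history tank | grep -w add # lists each "zpool add" with its timestamp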

Don't know. No evidence given so far provides a valid reason other than your pool isn't coping with the workload (which is demanding more IOPS than it has) as far as I can see.
What information would you need? ^^ Just ask
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
Not sure if we are talking about the same pool => I am talking about the one under Solaris; the configuration is "older", so SAS is 6 Gb/s ... the pool started with a similar config (1x RAIDZ2) and was expanded gradually as its usage increased (each time it hit 90%, we added a new drive bay)
So it's not a RAID 0 pool, it's 3 RAIDZ2 VDEVs, each subject to the IOPS limitations mentioned earlier, meaning you get 3x the IOPS of one RAIDZ2 VDEV (which is roughly the IOPS of one disk).
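You can confirm the topology yourself; a quick sketch, with "tank" as a stand-in for your pool name:

zpool status tank # should list three raidz2 vdevs (raidz2-0, raidz2-1, raidz2-2) under the pool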
 

Ulysse_31

Dabbler
Joined
Aug 22, 2023
Messages
49
If replication is over the network, maybe that's involved... which NIC do you have in there?

The user manual indicates these options were available:
Broadcom 5720 Base-T
Intel I350 Base-T
Broadcom 57800 SFP+
Broadcom 57800 Base-T
Intel X540 Base-T
The "culprit" truenas server is using intel X520 SFP+ 10Gbit network card ... I really would have liked if the issue would simply be a network interface issue ^^"
 

Ulysse_31

Dabbler
Joined
Aug 22, 2023
Messages
49
The "culprit" truenas server is using intel X520 SFP+ 10Gbit network card ... I really would have liked if the issue would simply be a network interface issue ^^"
Intel X520 SFP+ with dual SFP+ in an LACP lagg, to be more precise ...
 

Ulysse_31

Dabbler
Joined
Aug 22, 2023
Messages
49
Not being able to edit and correct errors in posts is a bit annoying ... but hey ... => dual 10 Gbit SFP+ in an LACP lagg
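For completeness, the lagg itself checks out from the shell; "lagg0" is simply what it's called here:

ifconfig lagg0 # should report laggproto lacp and the two member ports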
 

Ulysse_31

Dabbler
Joined
Aug 22, 2023
Messages
49
So it's not a RAID 0 pool, it's 3 RAIDZ2 VDEVs, each subject to the IOPS limitations mentioned earlier, meaning you get 3x the IOPS of one RAIDZ2 VDEV (which is roughly the IOPS of one disk).
Well ... sorry for the abuse of language ... but I'm used to calling that a striped RAIDZ2 ^^' ... and also an "aggregate of RAIDZ2" ... or even "a RAID 0 of RAIDZ2" (since a RAID 0 is an aggregate ^^') ... anyway ^^'
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
Well ... sorry for the abuse of language ... but I'm used to calling that a striped RAIDZ2 ^^' ... and also an "aggregate of RAIDZ2" ... or even "a RAID 0 of RAIDZ2" (since a RAID 0 is an aggregate ^^') ... anyway ^^'
So you meant to say a RAIDZ2 pool with 3 VDEVs.

And like I said, those 3 VDEVs mean you get the IOPS of 3x 1 VDEV, which is 3 disks (about 600 IOPS, depending on your disks).
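You can watch that ceiling in action during a scrub; a quick sketch, with "tank" as a stand-in for your pool name:

zpool iostat -v tank 5 # per-VDEV read ops/s, refreshed every 5 seconds

If each RAIDZ2 VDEV tops out at roughly the ops/s of a single disk, that's the limit I mean.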

intel X520 SFP+ with dual SFP+ in LACP lagg to be more precise
OK, so that should not be the problem here.

Back to the list of what can be wrong... are you sure you don't see any CAM status messages in dmesg?
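Something like this should turn them up if they're there:

dmesg | grep -i cam
grep -i "cam status" /var/log/messages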
 

Ulysse_31

Dabbler
Joined
Aug 22, 2023
Messages
49
Hmmm ... looking into dmesg (/var/log/messages), the logs for the last 10 days (August 12 to now) do not have any mention of "CAM" in them.
On the other hand, I found a recurring event (around every 30 secs), coming from what seems to be the FreeBSD ses (enclosure) driver:

Aug 12 04:44:59 <host> ses3: da1,pass3,da18,pass22 in 'Drive Slot 0', SAS Slot: 2 phys at slot 0
Aug 12 04:44:59 <host> ses3: phy 0: SAS device type 1 phy 0 Target ( SSP )
Aug 12 04:44:59 <host> ses3: phy 0: parent 5204747299636a7f addr 50000399986b15e2
Aug 12 04:44:59 <host> ses3: phy 1: SAS device type 1 phy 1 Target ( SSP )
Aug 12 04:44:59 <host> ses3: phy 1: parent 5204747299636aff addr 50000399986b15e3
Aug 12 04:44:59 <host> ses3: da2,pass4,da16,pass20 in 'Drive Slot 1', SAS Slot: 2 phys at slot 1
Aug 12 04:44:59 <host> ses3: phy 0: SAS device type 1 phy 0 Target ( SSP )
Aug 12 04:44:59 <host> ses3: phy 0: parent 5204747299636a7f addr 5000039a280a59e2
Aug 12 04:44:59 <host> ses3: phy 1: SAS device type 1 phy 1 Target ( SSP )
Aug 12 04:44:59 <host> ses3: phy 1: parent 5204747299636aff addr 5000039a280a59e3
Aug 12 04:44:59 <host> ses3: da3,pass5,da17,pass21 in 'Drive Slot 2', SAS Slot: 2 phys at slot 2
Aug 12 04:44:59 <host> ses3: phy 0: SAS device type 1 phy 0 Target ( SSP )
Aug 12 04:44:59 <host> ses3: phy 0: parent 5204747299636a7f addr 50000399987053a2

This message loops again and again ... with no particular slot / phy number (among all drives) ... hmmm ... as if the ses driver were scanning the drives over and over again?
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
This message loops again and again ... with no particular slot / phy number (among all drives) ... hmmm ... as if the ses driver were scanning the drives over and over again?
OK, so maybe we're getting somewhere.

That looks like a controller reset... and you say it's happening repeatedly? How often? (It can't be every 30 seconds; you'd have lost your pool by now.)

What version of firmware are you using on your HBA?

How is the airflow over the HBA? Is it getting hot? (Maybe this explains it if it's only a problem under heavy load like a scrub, only reaching really high temperatures when that's running.)
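A quick way to check both from the shell (assuming the card runs on the mpr driver, as the SAS3 models do):

mprutil show adapter # shows the Firmware Revision and the chip Temperature
grep -c ses3 /var/log/messages # rough count of how often that loop recurs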
 

Ulysse_31

Dabbler
Joined
Aug 22, 2023
Messages
49
So you meant to say a RAIDZ2 pool with 3 VDEVs.

And like I said, those 3 VDEVs mean you get the IOPS of 3x 1 VDEV, which is 3 disks (about 600 IOPS, depending on your disks).


OK, so that should not be the problem here.

Back to the list of what can be wrong... are you sure you don't see any CAM status messages in dmesg?
For the 600 IOPS stuff => theoretically, yes, I do see the idea ... but this pool was NOT created from the start with 3 VDEVs ... at the start it was 1 VDEV, then, at 90% of use, went to 2 VDEVs, and again, at 90% of usage, went to 3 VDEVs, so the data has been "mainly" (roughly ... 90%? ^^') spread across 1 VDEV, then across the newly added VDEV ... then across the last one ... so scrubbing the entire data set should lead to a "sequential" behavior of the scrubbing task ... at least ... that's how I see it ^^' ... but tell me if you see it differently ^^
 

Ulysse_31

Dabbler
Joined
Aug 22, 2023
Messages
49
OK, so maybe we're getting somewhere.

That looks like a controller reset... and you say it's happening repeatedly? How often? (It can't be every 30 seconds; you'd have lost your pool by now.)

What version of firmware are you using on your HBA?

How is the airflow over the HBA? Is it getting hot? (Maybe this explains it if it's only a problem under heavy load like a scrub, only reaching really high temperatures when that's running.)
Here is the output of the "mprutil show adapter" command:

mpr0 Adapter:
Board Name: SAS9300-8e
Board Assembly: 03-25656-02A
Chip Name: LSISAS3008
Chip Revision: ALL
BIOS Revision: 4.00.00.00
Firmware Revision: 3.00.08.00
Integrated RAID: no
SATA NCQ: ENABLED
PCIe Width/Speed: x8 (8.0 GB/sec)
IOC Speed: Full
Temperature: 65 C

PhyNum CtlrHandle DevHandle Disabled Speed Min Max Device
0 0002 0017 N 12 3.0 12 SAS Initiator
1 0002 0017 N 12 3.0 12 SAS Initiator
2 0002 0017 N 12 3.0 12 SAS Initiator
3 0002 0017 N 12 3.0 12 SAS Initiator
4 0001 0009 N 12 3.0 12 SAS Initiator
5 0001 0009 N 12 3.0 12 SAS Initiator
6 0001 0009 N 12 3.0 12 SAS Initiator
7 0001 0009 N 12 3.0 12 SAS Initiator
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
at the start it was 1 VDEV, then, at 90% of use, went to 2 VDEVs, and again, at 90% of usage, went to 3 VDEVs, so the data has been "mainly" (roughly ... 90%? ^^') spread across 1 VDEV, then across the newly added VDEV ... then across the last one ... so scrubbing the entire data set should lead to a "sequential" behavior of the scrubbing task ... at least ... that's how I see it ^^' ... but tell me if you see it differently ^^
That might represent a skewed layout of the data across the VDEVs, but actually this will just mean the scrub process benefits little from the additional VDEVs as the first VDEV is still a bottleneck for most of the used blocks.

I noticed about a 10% improvement in my scrub times when I did a rebalance of the data on my largest pool between 3 RAIDZ2 VDEVs. I still have about 1/3 of the data not balanced and I also don't have evenly sized VDEVs, so you might expect better results by doing the same.
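Rebalancing here just means rewriting the data in place so the new copies spread across all current VDEVs; a naive per-file sketch ("somefile" is hypothetical, you need enough free space, and snapshots will keep holding the old blocks):

cp -a somefile somefile.tmp && mv somefile.tmp somefile # rewrite so the blocks reallocate across all VDEVs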


You can check it with zpool list -v

There's no telling what order the scrubbing code will do things, but it will need to get to all the blocks that are used at some point in the scrub.
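What to look at, roughly ("tank" again being an example name):

zpool list -v tank # compare the CAP column of each raidz2 VDEV; a wide spread means unbalanced data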

Here is the output of the "mprutil show adapter" command:
Can we get the output from sas3flash -list ?

I'm going in the direction of this:
 

Ulysse_31

Dabbler
Joined
Aug 22, 2023
Messages
49
That might represent a skewed layout of the data across the VDEVs, but actually this will just mean the scrub process benefits little from the additional VDEVs as the first VDEV is still a bottleneck for most of the used blocks.
Correct yes ^^
I noticed about a 10% improvement in my scrub times when I did a rebalance of the data on my largest pool between 3 RAIDZ2 VDEVs. I still have about 1/3 of the data not balanced and I also don't have evenly sized VDEVs, so you might expect better results by doing the same.


You can check it with zpool list -v

There's no telling what order the scrubbing code will do things, but it will need to get to all the blocks that are used at some point in the scrub.
Hmmm ... I do see the idea ... but ... right now ... that host is again at 90% (but "retired" / no new data going to it ^^') ... rebalancing 111 TB ... well ... I'll just say ... let's not tempt the devil ^^' ... and again: I do not have any issues with this Solaris host's performance ^^
Can we get the output from sas3flash -list ?

I'm going in the direction of this:
Here you have it:
root@<host>[/var/log]# sas3flash -list
Avago Technologies SAS3 Flash Utility
Version 16.00.00.00 (2017.05.02)
Copyright 2008-2017 Avago Technologies. All rights reserved.

Adapter Selected is a Avago SAS: SAS3008(C0)

Controller Number : 0
Controller : SAS3008(C0)
PCI Address : 00:82:00:00
SAS Address : 500605b-0-09a3-0710
NVDATA Version (Default) : 03.05.00.06
NVDATA Version (Persistent) : 03.05.00.06
Firmware Product ID : 0x2221 (IT)
Firmware Version : 03.00.08.00
NVDATA Vendor : LSI
NVDATA Product ID : SAS9300-8e
BIOS Version : N/A
UEFI BSD Version : 04.00.00.00
FCODE Version : N/A
Board Name : SAS9300-8e
Board Assembly : 03-25656-02A
Board Tracer Number : SV44042983

Finished Processing Commands Successfully.
Exiting SAS3Flash.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
OK, that's an ancient firmware... maybe consider updating it to 16.00.12.00 as recommended in the linked post.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
again: I do not have any issues with this Solaris host
So are you saying we're not even talking about TrueNAS here?

You're posting in the TrueNAS CORE forum.
 

Ulysse_31

Dabbler
Joined
Aug 22, 2023
Messages
49
I'm going in the direction of this:

Hmmm ... I've been reading this ... this is for flashing the card when you do not see the drives ... I do see the drives ... I have no issues accessing them (I do not have CAM timeouts, drive errors ... etc.) ... on the other hand ... I see lots of posts from people flashing their cards to newer firmware ... and then not seeing their drives ... I'm not really comfortable flashing a card that seems to work ... flashing my card may lead to losing access to the drives ...
I really would like to use this as a last resort ^^' ... or at least be sure that my problem is really coming from there ... I need to dig further in this direction ...
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
That article talks about the difference between the public firmware (16.00.11.00) and the special one (16.00.12.00).

The differences between your firmware (03.00.08.00) and even the public one are in no way limited to (and are likely to be immense compared to) the ones listed in that post.

I guess consulting the LSI website for the firmware release notes (though you will need to collect quite a few of them) will tell you what the changes are.

I would consider the issue "found" at this point and not really look at anything other than firmware until you update it to one from this decade.

If you elect to flash the card, you must certainly do it with your pools offline (many people opt for the EFI version of the updater to deliver that result).
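If you do go ahead, the flash itself is usually just a couple of commands from the updater environment (the firmware filename below is an example; use the files from the 9300-8e IT package you actually download):

sas3flash -o -f SAS9300_8E_IT.bin -b mptsas3.rom # flash the firmware (and optionally the boot BIOS image)
sas3flash -list # confirm the new Firmware Version afterwards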
 