
solnet-array-test Discussion Thread

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
As highlighted in earlier posts, you can't view and download this script in an FTP app.

Only half true; it cannot be *seen* in an FTP app, because the directory permissions are -wx (no read bit, so no listing). Your FTP app ought to be able to download it just fine (because of the x).
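
A quick illustration of the permission semantics (hypothetical directory and file names, not the actual layout on ftp.sol.net):

Code:
mkdir /tmp/dropbox
echo "test" > /tmp/dropbox/known-name.txt
chmod 0333 /tmp/dropbox           # -wx: write and execute, no read bit
ls /tmp/dropbox                   # fails: no read bit means no listing
cat /tmp/dropbox/known-name.txt   # works: the x bit allows traversal by exact name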

wget and fetch do not work on windows.

Also only half true:

Code:
C:\Users\jgreco\Downloads>wget ftp://ftp.sol.net/incoming/solnet-array-test-v3.sh
--2023-11-19 06:38:44--  ftp://ftp.sol.net/incoming/solnet-array-test-v3.sh
           => ‘solnet-array-test-v3.sh’
Resolving ftp.sol.net (ftp.sol.net)... 206.55.64.92
Connecting to ftp.sol.net (ftp.sol.net)|206.55.64.92|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done.    ==> PWD ... done.
==> TYPE I ... done.  ==> CWD (1) /incoming ... done.
==> SIZE solnet-array-test-v3.sh ... 14846
==> PASV ... done.    ==> RETR solnet-array-test-v3.sh ... done.
Length: 14846 (14K) (unauthoritative)

100%[==============================================================================>] 14,846      --.-K/s   in 0s

2023-11-19 06:38:45 (37.5 MB/s) - ‘solnet-array-test-v3.sh’ saved [14846]


wget works just fine; go install Cygwin. I believe I also saw a version of fetch for Cygwin at one point. Windows also eventually shipped an FTP client of its own, over in \windows\system32\ftp.exe.
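
For the curious, a rough sketch of the same download with that built-in client (session abbreviated; note that ftp.exe speaks only active-mode FTP, which some firewalls block):

Code:
C:\Users\jgreco\Downloads>ftp ftp.sol.net
User (ftp.sol.net:(none)): anonymous
ftp> binary
ftp> cd incoming
ftp> get solnet-array-test-v3.sh
ftp> bye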

You still get bonus points for the PowerShell thing. :smile:
 

logan893

Dabbler
Joined
Dec 31, 2015
Messages
44
@jgreco Great script, thanks! I've been using it on several new systems, for quick benchmarking and for longer-duration stress testing of new and old drives alike.

Regarding the seek-heavy test, the suggested time estimate is for a single pass of "dd", while the test actually runs 6 in parallel for each drive. Scaling the estimate up by at least a factor of 6 would be prudent. Something simple like making the first argument a multiplication factor works:

Code:
approximatepasstime() {
    # multiplication factor as first argument, followed by one or more disks
    disksize=`getdisksize "${2}"`
    if [ "${disksize}" -gt 0 ]; then
        echo "The disk ${1} appears to be ${disksize} MB.        "
        speed=`samplediskspeed "${2}"`
        echo "${disksize} ${speed} ${1}" | awk '{if ($2 != 0) {speed=$2} else {speed=1}; printf "Disk is reading at about %0.0f MB/sec        \nThis suggests that this pass may take around %0.0f minutes\n", speed, $1 / speed / 60 * $3}'
    else
        echo "Unable to determine disk ${2} size from dmesg file (not fatal but odd!)"
    fi
}

# parallel read
approximatepasstime 1 ${disklist}
# seek heavy read
approximatepasstime 6 ${disklist}


Additionally, the estimates disregard the somewhat unpredictable way read speeds drop over the span of a drive. A quick read sample near the end of the drive could be used to interpolate the average over a full pass; see the sketch below. The estimate also only looks at the first drive, while there may be larger and/or slower drives in the array. I understand if you don't feel the need to dump a lot of code in here, though, and many (most?) arrays will be quite homogeneous, so it's not too difficult to extrapolate yourself from the data provided.
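
A hedged sketch of such an end-of-disk sample (the 1 GB sample size is an arbitrary choice; getdisksize is the script's existing helper, and da0 is just an example device):

Code:
disk=da0
disksize=`getdisksize "${disk}"`     # size in MB, via the script's helper
skip=$((disksize - 1024))            # start reading ~1 GB before the end
end_speed=`dd if=/dev/${disk} of=/dev/null bs=1m skip=${skip} count=1024 2>&1 | awk -F'[()]' '/bytes\/sec/ {print $2}'`
echo "End-of-disk read rate: ${end_speed}"
# Averaging this with the start-of-disk sample gives a rough whole-pass figure.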

(and there's a typo: "thhis")
Code:
This next test attempts to read all devices while forcing seeks.
This is primarily a stress test of your hard disks.  It does thhis
by running several simultaneous dd sessions on each disk.


Performing initial parallel seek-stress array read
Wed Jan  3 17:25:52 CET 2024
The disk da0 appears to be 9537536 MB.
Disk is reading at about 217 MB/sec
This suggests that this pass may take around 733 minutes


                   Serial Parall % of
Disk    Disk Size  MB/sec MB/sec Serial
------- ---------- ------ ------ ------
da0      9537536MB    243    216     89
da1     15259648MB    211    166     79
da2     15259648MB    218    168     77
da3      9537536MB    244    217     89


Awaiting completion: initial parallel seek-stress array read



If anyone is interested in a progress snapshot from at least one instance of "dd" per drive: the first dd started for each drive has its stderr captured, and a non-interrupting kill signal can be sent to all "dd" processes to trigger the output below.

Use kill -INFO on TrueNAS CORE (BSD) or kill -USR1 on TrueNAS SCALE (Linux).

Code:
root@truenas[~]# kill -INFO $(pgrep ^dd$)
root@truenas[~]# cat /tmp/sat.da0.err /tmp/sat.da1.err /tmp/sat.da2.err /tmp/sat.da3.err
5598014+0 records in
5598014+0 records out
5869943128064 bytes transferred in 143745.645171 secs (40835624 bytes/sec)
4842212+0 records in
4842212+0 records out
5077427290112 bytes transferred in 143745.678217 secs (35322295 bytes/sec)
4761574+0 records in
4761574+0 records out
4992872218624 bytes transferred in 143745.704238 secs (34734062 bytes/sec)
5496479+0 records in
5496479+0 records out
5763475963904 bytes transferred in 143745.888619 secs (40094893 bytes/sec)
root@truenas[~]#
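
On SCALE, the equivalent would presumably be (untested sketch; same capture files, whatever the Linux device names turn out to be):

Code:
root@truenas[~]# kill -USR1 $(pgrep '^dd$')
root@truenas[~]# cat /tmp/sat.*.err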
 

bent98

Contributor
Joined
Jul 21, 2017
Messages
171
I am new to this script and a noob when it comes to Unix. I ran the script and selected 1 pass rather than burn-in, to test it out. It should be done by now, as it estimated 19 hours for all disks. I pulled the cover off my server and all hard drives seem to be idle (meaning the heads are not moving). How do I know when the test is over? As per my shell prompt in the screenshot, it's still showing "Awaiting completion".


[Screenshot: shell still at "Awaiting completion"]
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
Possibly your shell timed out. Don't use the web UI shell. Use SSH and screen or tmux to keep your process running.
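
A minimal sketch (the session name is an arbitrary choice):

Code:
tmux new -s arraytest            # start a named session over SSH
./solnet-array-test-v3.sh        # run the test inside it
# detach with Ctrl-b d; the test keeps running if SSH drops
tmux attach -t arraytest         # reattach later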
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
Log in again via SSH and reconnect to the tmux/screen session you left. The results of your last run are gone - unless that script writes a result log file, which I don't know.
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222

nasenbaer

Cadet
Joined
Feb 21, 2024
Messages
9
Thanks for the awesome script. I read the whole thread and just want to make sure: is it normal that the HDDs get a lot slower after some time? I guess it's because different parts of the disk are being read, which are slower to get to? Or do I have an overheating problem with my HBA (the card is around body temperature, meaning not too hot to touch)?
All four HDDs show the same graph/speed, down from 280 MB/s to 170 MB/s. Disk temperature never moved from around 35 °C.

[Screenshot: read speed of all four disks falling from ~280 MB/s to ~170 MB/s]
 

dxun

Explorer
Joined
Jan 24, 2016
Messages
52
I read the whole thread and just want to make sure: is it normal that the HDDs get a lot slower after some time?

It certainly appears so, and it is expected: platters spin at constant RPM while outer tracks hold more sectors per revolution, so sequential throughput naturally falls toward the inner (later) part of the disk. Here's a Seagate Exos X22 20 TB drive that I've been stress testing for a week now.

[Attachment: read-speed graph for the Exos X22]


It's been doing this seek-stress read operation for a full week now.

[Attachment: screenshot of the seek-stress read still in progress]
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
So how do I find out the results?
Use redirection:
tmux './solnet-array-test-v3.sh | tee test_result.txt'
(quoting the pipeline so that tee runs inside the tmux session)
You can then follow the progress in your SSH session, or detach and let the test run.
If you want to come back:
tmux attach
Or just read the log-file-in-progress:
more test_result.txt
Expect the test to take much more time than the estimate, though.

(You can even launch the session from the web shell…)
 