Strange disk access behavior

Status
Not open for further replies.

Chuck Munro

Cadet
Joined
Jan 3, 2016
Messages
8
Hello all,

I have been using FreeNAS for almost a year with no issues (I love it!), but after the latest 9.3-Stable-20151212 update some strange disk access behavior has surfaced.

I have two identical servers ... same motherboards, ECC RAM, LSI SATA controllers, and identical disk pools. Both servers have eight 4TB WD disks in RAID-Z3 (one equipped with Blacks, the other with Red Pros, which are essentially Blacks with different firmware). One server is simply an rsync client of the other, acting as a redundant file store.

The issue ... on the primary server, about once per minute there is a frantic flurry of disk activity on all WD drives which lasts for about 10 seconds. This clearly shows as big spikes in the disk activity monitor for each drive. It always involves a lot of head movement on the drives, which gets noisy. The drives are not spun down; I simply leave them spinning 24x7.

The mystery ... the redundant server does not exhibit this behavior, the primary does. On the problematic server I moved the system folders to a separate SSD pool to remove syslog activity from the main array. There are no jails configured, and in the end I switched syslog off altogether. Regular snapshots are done every 30 minutes on both servers, so that shouldn't be a problem. I double-checked all configurations and the two machines are set up identically. The disk pools are only about 50% full.

Running top doesn't really show anything useful. 'iostat' and 'zpool iostat' both show that the activity is a burst of disk writes, about 3.5 MBytes/sec or so, with almost no reads. The rsync server on the primary uses a lot of CPU every 15 minutes when the redundant server pulls files. I'm also mystified why there is always one rsync process using 100% of one of the CPU cores (why is that??), but rsync activity doesn't seem at all related to the timing of the disk activity bursts.
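
One way to pin down the timing of these write bursts is to log per-second write throughput and pick the bursts out programmatically. The following is only a minimal Python sketch, not something taken from this thread: it assumes the (seconds, MB/s written) samples have already been collected somehow, for example from 1-second 'zpool iostat' output, and the function names, the 1.0 MB/s threshold, and the sample data are all hypothetical.

```python
def find_bursts(samples, threshold_mb_s=1.0):
    """Return (start, end) times for runs of samples above the threshold.

    samples: list of (time_in_seconds, write_mb_per_s) tuples, sorted by time.
    """
    bursts = []
    start = None
    last = None
    for t, rate in samples:
        if rate > threshold_mb_s:
            if start is None:
                start = t          # first busy sample of a new burst
            last = t               # most recent busy sample
        elif start is not None:
            bursts.append((start, last))
            start = None
    if start is not None:          # a burst still in progress at the end
        bursts.append((start, last))
    return bursts


def burst_intervals(bursts):
    """Seconds between the starts of consecutive bursts."""
    return [b[0] - a[0] for a, b in zip(bursts, bursts[1:])]


# Hypothetical data: idle except for 10-second bursts of 3.5 MB/s every 60 s.
samples = [(t, 3.5 if t % 60 < 10 else 0.1) for t in range(180)]
bursts = find_bursts(samples)
print(bursts)                   # [(0, 9), (60, 69), (120, 129)]
print(burst_intervals(bursts))  # [60, 60]
```

If the intervals come out very regular, that would point toward something timer-driven on the server itself rather than client traffic, though it would not say what that something is.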

To eliminate the possibility of SAMBA, rsync, and Time Machine accesses being the issue, I disconnected the network, but this once-a-minute behavior didn't stop. There are no system alerts. One thing ... the problematic server seems to take a lot longer to scrub than the other, but that may be caused by the difference in WD drive firmware. Performance doesn't seem to be an issue.

I have read through a very long list of Forum postings that Google found, but none of them seem to relate to this behavior.

Does anyone out there have any suggestions? In theory this shouldn't be a show-stopper, but the repeated sound of high disk activity gets a bit annoying after a while. I'm a Linux person, so I'm still learning the guts of FreeBSD.

Many thanks ... sorry for the long-winded posting :(
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
Well, I don't have the answer, but this is definitely an interesting question. I'll follow the thread to see if anyone has an answer :)

However, I have an idea: check whether any changes in the process list in top correlate with the bursts of drive activity.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
rsync isn't multithreaded. If there's work to be done, it can and will use up to 100% of one CPU core.

Have you tried running top with a short interval? Type "s" and then "1" or maybe even "0" to see if you can catch something causing activity. Have you looked at the system log in /var/log/messages to see if there's some correlation? Try "tail -f /var/log/messages" from the VGA console and see if anything kicks out around the same time. Just some easy things to try first.
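
If watching the log by eye is too fiddly, the correlation can also be checked after the fact. This is a rough Python sketch under stated assumptions: the log lines use the traditional 15-character 'Mon DD HH:MM:SS' syslog stamp (which carries no year, so one has to be supplied), month names are in English, and the burst start times have been noted separately. The function names, the 5-second window, the "exampled" daemon, and the sample lines are all hypothetical.

```python
from datetime import datetime, timedelta


def parse_syslog_time(line, year):
    """Parse the leading 'Mon DD HH:MM:SS' stamp of a traditional syslog line."""
    # The stamp has no year, so the caller must provide one.
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def lines_near_bursts(log_lines, burst_starts, year, window_s=5):
    """Return log lines stamped within window_s seconds of any burst start."""
    window = timedelta(seconds=window_s)
    hits = []
    for line in log_lines:
        stamp = parse_syslog_time(line, year)
        if any(abs(stamp - start) <= window for start in burst_starts):
            hits.append(line)
    return hits


# Hypothetical log excerpt and burst start times, for illustration only.
log_lines = [
    "Jan  3 10:00:02 nas exampled[123]: periodic task started",
    "Jan  3 10:00:30 nas exampled[123]: idle",
    "Jan  3 10:01:01 nas exampled[123]: periodic task started",
]
burst_starts = [
    datetime(2016, 1, 3, 10, 0, 0),
    datetime(2016, 1, 3, 10, 1, 0),
]

for line in lines_near_bursts(log_lines, burst_starts, year=2016):
    print(line)
```

With the made-up data above, only the two "periodic task started" lines fall inside the window. An empty result on real logs would simply mean that whatever causes the bursts isn't writing to that log.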
 

Chuck Munro

Cadet
Joined
Jan 3, 2016
Messages
8
Checking the messages file is a good idea. I'll tail the file and see what shows up when the disks start to rattle. Running top with a short interval at the same time may prove useful. I'll run multiple terminal sessions via SSH to see if anything stands out and post any findings. It will be a while before I can do this because I have a lot of other things to take care of first. Thanks!
 