Why iSCSI often requires more resources for the same result

iSCSI is a SAN protocol. NFS, CIFS, etc., are NAS protocols.

For a NAS protocol, the client sends a command to the filer, such as "open this file", or "read ten blocks", or "remove this file." On the filer, the local NAS protocol daemon translates this into UNIX file syscalls, and passes it off to the filesystem.

For a SAN protocol, the client itself is running the filesystem, and is passing requests for certain blocks over the network, to read or write them. On the filer, the SAN protocol daemon translates this to operations within a single file (or zvol), making the changes the client requests.

At first glance, these two things seem to be very similar, but in practice they may not be, and the differences can hurt you.

Consider the case where a NAS protocol requests a file to be created, written, closed, and then deleted. The client will ask the filer to do each of those operations, and the requests are quick and efficient. The process of creating a file is a single operation for the client, and the filer sorts it out and reads the disk and figures out where to allocate space, and handles all the disk updates. Writing, closing, and deleting the file are also straightforward operations, viewed from the client's side.

On the other hand, for SAN, the same steps are more complex. Since the filesystem code is located on the client, a request to create a file means that the filesystem code has to access the root directory, traverse the directory structure, access the free block list, and then write an update to the target directory, and also write file metadata, each of which is an individual read/write operation that may require blocks that are (from the filer's point of view) randomly scattered around. The SAN has no context to understand that this block is directory metadata and that block is file data, or in other words what the relative value of each block is. Writing, closing, and deleting the file also result in a larger number of operations.
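To make the difference concrete, here is a rough Python sketch that counts the requests crossing the network when one file is created. It is a toy model, not a real NFS or iSCSI implementation: the function names, the operation labels, and the assumption of exactly one block per directory are all made up for illustration.

```python
# Toy model: requests sent over the network to create one file.
# Everything here (names, labels, one block per directory) is an
# illustrative assumption, not how any real protocol is implemented.

def nas_create(path):
    """NAS: one high-level request; the filer does the filesystem work."""
    return [("CREATE", path)]

def san_create(path):
    """SAN: the client's filesystem issues each block read/write itself."""
    components = [c for c in path.split("/") if c]
    parent = components[-2] if len(components) > 1 else "/"
    ops = [("READ", "root directory block")]
    # Traverse every directory between the root and the new file.
    for name in components[:-1]:
        ops.append(("READ", f"directory block for {name}"))
    ops.append(("READ", "free block list"))
    ops.append(("WRITE", f"directory block for {parent}"))
    ops.append(("WRITE", "file metadata block"))
    return ops

print(len(nas_create("/a/b/file.txt")))  # 1
print(len(san_create("/a/b/file.txt")))  # 6
```

A real client filesystem caches some of these blocks, so the actual count varies, but the filer still sees a stream of anonymous block requests instead of one meaningful command.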

But those are just immediate differences. There are also more complex side effects.

For example, consider that ZFS has its gorgeous ARC. So, using a NAS protocol, you open a file, write it, and close it. That file is probably stored in ARC. Now you delete that file. It is instantly removed from the ARC, freeing that space for other, more useful things.

That doesn't happen with SAN. Since the SAN abstraction moves the filesystem layer to the client, all ZFS sees are requests to read and write blocks. It has no idea that certain blocks are directories and metadata. It may have no idea that certain blocks have been freed by the client's filesystem. As a result, it is quite possible for useless data to be sitting around in ARC because there's no effective way for ZFS to understand that "these blocks aren't relevant" or "these blocks are most useful." ZFS will make its best guess based on how frequently the data is accessed, but in order for that to work, it needs to be keeping track of a lot more blocks in ARC. It does not have the advantage of its own internal flags as to which blocks might be metadata, the context in which a block is being retrieved (sequential file read, etc.), or which blocks are no longer needed. By giving it a much larger ARC, it can successfully sample access frequency on a much larger number of blocks, which gives it the ability to arrive at a good level of performance despite the lack of specific insight.
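The stale-cache effect can be sketched with a toy cache. This is a deliberately simplified stand-in: a plain LRU written in Python, whereas the real ARC is far more sophisticated. The class, the block names, and the capacity of 8 are all hypothetical.

```python
from collections import OrderedDict

class ToyCache:
    """A plain LRU cache standing in for ARC (a big simplification)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()

    def touch(self, key):
        # Insert or refresh a block, evicting the least recently used.
        self.blocks[key] = True
        self.blocks.move_to_end(key)
        while len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)

    def invalidate(self, key):
        # Only possible when the filer knows the block is dead.
        self.blocks.pop(key, None)

def run(protocol):
    cache = ToyCache(capacity=8)
    temp = [f"temp-{i}" for i in range(3)]
    for block in temp:
        cache.touch(block)          # write a temporary file
    if protocol == "nas":
        for block in temp:
            cache.invalidate(block)  # filer sees the delete request
    else:
        # SAN: the delete is just another write to a client-side
        # metadata block; the filer cannot tell temp-* is now garbage.
        cache.touch("client-fs-metadata")
    for block in ("hot-0", "hot-1"):
        cache.touch(block)          # useful data arrives afterwards
    return list(cache.blocks)

def stale(contents):
    return [b for b in contents if b.startswith("temp-")]

print(len(stale(run("nas"))))  # 0
print(len(stale(run("san"))))  # 3
```

In the SAN run the dead blocks stay cached until ordinary eviction pushes them out, which is why a larger cache is needed to hold the same amount of useful data.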

Further, most NAS file updates are to write entire files, which is easily managed by the block allocation strategies of ZFS ... it'll try to find a nice contiguous set of blocks for the file. However, a SAN virtual disk is stored as a single ZFS object (file or zvol), and updates result in fragmentation. Fragmentation is combatted through the use of caching, but caching is more of a remediation than a true solution. A CoW filesystem will always tend towards heavy fragmentation when updating mid-file blocks.
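A small Python sketch shows why mid-file updates fragment a copy-on-write layout. It assumes a hypothetical virtual disk of 16 blocks, initially contiguous, and a naive allocator that puts every rewritten block at the next free physical address. Real ZFS allocation is much more involved; this only illustrates the tendency.

```python
# Toy copy-on-write model: layout[i] is the physical address of
# logical block i. The allocator and sizes are illustrative assumptions.

def count_extents(layout):
    """Number of physically contiguous runs in logical order."""
    extents = 1
    for prev, cur in zip(layout, layout[1:]):
        if cur != prev + 1:
            extents += 1
    return extents

def cow_overwrite(layout, logical_block, next_free):
    """Rewrite one block: CoW never updates in place, so the new
    copy lands at the next free physical address."""
    layout[logical_block] = next_free
    return next_free + 1

layout = list(range(16))   # freshly written: one contiguous extent
next_free = 16
before = count_extents(layout)

for logical_block in (3, 9, 12):   # three small mid-file updates
    next_free = cow_overwrite(layout, logical_block, next_free)

after = count_extents(layout)
print(before, after)  # 1 7
```

Three single-block updates turn one extent into seven, so a sequential read of the virtual disk now needs seeks that a freshly written whole file would not.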

The end result is that you usually need more resources, especially ARC and L2ARC, in order to have good performance with iSCSI when compared to NAS protocols like CIFS and NFS.
Author
jgreco