Last week’s Storage Networking Industry Association Data Storage Innovation Conference 2016 (SNIA DSI) showcased a vast array of new and proven technologies that push the limits of storage speed and capacity. I gave a talk on “The Coming Era of Open Source Enterprise Storage,” and iXsystems hosted a Birds of a Feather (BoF) meeting on Open Source, co-chaired by iXsystems’ Jordan Hubbard and Brad Meyer, Samba developer Christopher R. Hertel, and me.
Ulrich Fuchs from CERN and Gary Grider from the Los Alamos National Laboratory set the stage with what I consider to be the key takeaway from the event: simplicity scales. A typical run of CERN’s ALICE experiment generates 100 petabytes of data at 450 GB/sec, which must be available to 2,000 clients. Ulrich and his team are evaluating the Open Source Lustre and Ceph distributed file systems as alternatives to the impressive yet costly GPFS from IBM. Gary and his team generate data at up to 2 PB/sec in RAM and transport it through several tiers, including 4–6 TB/sec burst buffers, 1–2 TB/sec parallel file systems, 100–300 GB/sec project file systems, and eventually 10 GB/sec archive stores. Both organizations push every technology to its limits: their position is that Fibre Channel and InfiniBand are not going away, and that scale-out object stores and simple file systems are key strategies for scaling.

To this point, Gary unveiled “MarFS,” a “near-POSIX” file system that complements the Lustre on OpenZFS they are currently running. MarFS adds a simple POSIX layer on top of an erasure-coded object store in order to accommodate a million files arriving in a single directory at once. At that scale, the detailed metadata required by the POSIX standard, ACLs, and extended attributes are all unnecessary complexities that can impact performance. MarFS is BSD licensed, and it turns out that licenses are just as important a secret to scaling as the latest hardware. Fortunately, we have actual nuclear physicists to thank for that conclusion.
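To make the “near-POSIX” idea concrete, here is a minimal toy sketch, not MarFS’s actual design or API: a thin path-based namespace layered over a flat object store, where the only per-file metadata kept is size and modification time, and ACLs and extended attributes are dropped entirely. All class and method names here are illustrative inventions.

```python
# Illustrative toy, NOT the real MarFS implementation: a thin
# "near-POSIX" namespace over a flat object store, keeping only
# minimal metadata (size, mtime) and no ACLs or extended attributes.
import time


class ObjectStore:
    """A flat key/value store standing in for an erasure-coded backend."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        self._objects[key] = data

    def get(self, key):
        return self._objects[key]


class NearPosixFS:
    """Maps POSIX-style paths to object keys; directories are implicit."""

    def __init__(self, store):
        self.store = store
        self.meta = {}  # path -> (size, mtime); nothing else is tracked

    def write(self, path, data):
        self.store.put(path, data)  # the path itself is the object key
        self.meta[path] = (len(data), time.time())

    def read(self, path):
        return self.store.get(path)

    def listdir(self, dirpath):
        # A "directory" is just a shared key prefix, so a million files
        # landing in one directory is simply a million keys, with no
        # directory inode to contend on.
        prefix = dirpath.rstrip("/") + "/"
        return sorted({p[len(prefix):].split("/")[0]
                       for p in self.meta if p.startswith(prefix)})


fs = NearPosixFS(ObjectStore())
fs.write("/project/run1/frame-000001.dat", b"detector data")
fs.write("/project/run1/frame-000002.dat", b"more data")
print(fs.listdir("/project/run1"))
# → ['frame-000001.dat', 'frame-000002.dat']
```

The design choice the sketch tries to capture is that dropping POSIX metadata turns directory operations into cheap prefix scans over a flat key space, which is what lets an object backend absorb arrival rates a traditional directory inode cannot.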
My talk focused on the steady spread of Open Source storage into many industries, culminating in its 98.8% penetration in supercomputing. Among these technologies, OpenZFS is emerging as the de facto scale-up file system and a solid foundation for many scale-out strategies. To my pleasant surprise, both CERN and the Los Alamos National Laboratory reaffirmed my point by scaling out on top of OpenZFS.
Continuing the “simplicity scales” theme, NVMe and NVDIMM storage devices had a strong presence thanks to their inherent speed and the fact that they are not slowed by the protocol overhead of SAS and SATA devices. Like SAS and SATA SSDs, these are flash technologies, but NVMe and NVDIMM represent a steady march closer and closer to the CPU: first to the PCIe bus and then on to the RAM bus. The days of making flash storage appear like spinning disks are clearly numbered. On the network side, Fibre Channel has always leveraged simplicity for speed by forgoing a general-purpose network protocol like TCP/IP, and its speed increases have always kept pace with the PCI bus. As one attendee put it, “Fibre Channel also lets you keep the network admins out of your storage infrastructure.”
The Tuesday night Open Source BoF hosted by iXsystems brought together both users and vendors, and the consensus was that Open Source is the horse that has left the barn and is galloping through the industry. We were fortunate to have several Samba and Microsoft experts who gave a history of Open Source SMB implementations and the trials and tribulations that surrounded each of them. We also had two OpenStack experts who did a better job of explaining what OpenStack is than five years of project marketing: an aggressively vendor-agnostic cloud platform and a direct challenge to VMware.
If you haven’t attended a SNIA event, I highly recommend you experience their unique balance of cutting-edge technologies and vendor neutrality. I hope to see you at my talk on the OpenZFS community at the SNIA SDC in September!