Recently, I met Jim Salter at SouthEast LinuxFest (SELF). Jim is an IT consultant; while he started with FreeBSD, he spends much of his time on Windows and Linux these days. Jim is also the author of Sanoid, a GPLv3 extension to ZFS that does a lot of cool things. I caught up with him on IRC to discuss Sanoid, ZFS, and converged vs. hyperconverged infrastructure. (Of course, I never allow a buzzword to go undefined.)
Below is the slightly edited interview. (As this came from IRC, I’ve fixed capitalization and minor grammatical mistakes but am otherwise preserving the original as much as possible.)
Andrew: At SELF, we went back and forth on converged infrastructure vs. hyperconverged infrastructure. Can you settle what the difference is once and for all?
Jim: OK, so, converged infrastructure is the same old stack — big, expensive SAN, expensive high-bandwidth storage transport network, and a bunch of compute nodes that connect to the SAN over the storage network and to you over the "regular" network. But with converged infrastructure, it's all sold by one vendor, usually in a rack form factor where literally the whole rack is the "converged infrastructure," and it has the crucial bits in place already.
Hyperconverged infrastructure is what I do — the storage and the hypervisor are in the same chassis running on the same CPU(s), no need for a storage transport network. Truly self-contained.
Andrew: So, while enterprises are using the more complex converged infrastructure, hyperconverged is targeted at the small and midsize market?
Jim: Not necessarily. My hyperconverged infrastructure is targeted more at small and midsize, but hyperconverged in general started out being targeted at the data center. Look at SimpliVity for the big player there. The form factor is very similar to Sanoid, but everything's proprietary and the price is $askforasalescall.
The big difference really is the "in a single chassis" part, as far as the technical difference between regular converged and hyperconverged.
If you want a good example of "converged not hyperconverged," look at Cisco UCS. You buy single components that are storage or compute. They're all designed to plug into a special rack format that has the storage transport and management built in. But they're still separate pieces that build into pretty tall silos.
Andrew: So, you're the author of Sanoid, and you've based your snapshot automation on ZFS. What made ZFS the winning technology to base your project around?
Jim: I've been using ZFS since the very first days it made it into the FreeBSD project for general release, so I was already very familiar with it and with copy-on-write systems in general. I would have loved to use btrfs for Sanoid, actually, and I took a really solid flying stab at it, but the reliability just wasn't there. The features that Sanoid provides — rolling hourly/daily/monthly snapshots, instant snapshotting, and rollback — all depend on a copy-on-write filesystem you can trust.
I plan to support btrfs when it's ready, but the reliability just isn't there yet. The feature set is awesome, and I do believe it will get where it needs to be eventually — betting against the GPL is just lunacy — but it's not there yet.
Andrew: How does Sanoid extend ZFS?
Jim: For example, let's say you want to take advantage of replication. First, you need to know whether you have any existing snapshots on the target end. The command arguments you use will differ depending on whether you're creating a new data set or updating an existing one.
Then, you need to know what the newest snapshot you have in common on both the source and the target ends is. Then, you need to build a command that looks like this: zfs send -i pool/dataset@oldestcommonsnapshot pool/dataset@newestcommonsnapshot.
Then, you need a way to get the output of that command from the source to the target, such as piping it through ssh. That also means you need to know whether the target is local or remote. And you'll probably want to specify the cipher ssh uses to reduce encryption overhead. And you'll probably want compression. And you'll probably want a progress bar. And you'll probably want network buffering. By the time you've finished just getting all your research done to issue the final commands, you've easily burned 10 minutes.
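Strung together by hand, the steps Jim describes end up looking something like the sketch below. Dataset, host, and snapshot names are hypothetical placeholders, and the pipeline is echoed rather than executed, since actually running it requires live ZFS pools on both machines.

```shell
# All names below are hypothetical placeholders.
SRC=pool/dataset                              # source (and target) data set
COMMON=autosnap_2016-07-01_00:00:01_daily     # newest snapshot both sides share
NEWEST=autosnap_2016-07-02_00:00:01_daily     # newest snapshot on the source

# Incremental send with cheap compression (lzop), a progress bar (pv),
# a fast ssh cipher, and decompression plus receive on the far end.
CMD="zfs send -i ${SRC}@${COMMON} ${SRC}@${NEWEST} \
| pv | lzop \
| ssh -c aes128-gcm@openssh.com root@target 'lzop -d | zfs receive -F ${SRC}'"

echo "$CMD"    # print, don't run: a real run needs ZFS on both ends
```

And that is before handling the other case, where no common snapshot exists and you need a full (non-incremental) send instead.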
Syncoid automates all that stuff for you, so whether or not you already have some version of the data set on the target, the syntax is the same and it does all the donkey work for you: syncoid pool/dataset root@remotemachine:pool/dataset
or syncoid pool/dataset otherpool/dataset if you're replicating to a different pool or data set on the local box.
That's for replication. Similarly, Sanoid manages all the snapshotting for you. You get to define a policy about how often you want to take snapshots, and it does it for you, names them appropriately, gets rid of the old ones once they're older than what you want to keep, etc.
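That snapshot policy lives in Sanoid's configuration file. The fragment below is an illustrative sketch assuming the ini-style template syntax from the project's documentation; the dataset name and retention numbers are placeholders, not a recommendation.

```ini
# /etc/sanoid/sanoid.conf -- illustrative policy; dataset name is hypothetical
[pool/dataset]
        use_template = production

[template_production]
        hourly = 36
        daily = 30
        monthly = 3
        autosnap = yes
        autoprune = yes
```

With autosnap and autoprune enabled, Sanoid takes the snapshots on schedule and expires the ones that age out of the retention windows.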
Andrew: Last question. Where do people go to get Sanoid? What if they want it preinstalled?
Jim: [The site] http://sanoid.net/ redirects to the GitHub project page right now if you just want the code. If you want to buy vendor-certified hardware, go to http://openoid.net/ and contact us from there. You get the hardware and Sanoid itself set up and ready to go (just add VMs), along with support and active network monitoring.