Saturday, February 25, 2006

SAN woes

Normally I try not to write about the things I am actually doing at work for paying clients. I tend to not want to give away my own or institutional knowledge for which my firm should charge clients. That's why I stick to discussions about collaboration technologies (OneNote, SharePoint, document management) and personal technology (Treo, TabletPC, Firefox, Microsoft Media Center).

But I've been getting frustrated designing for Network Appliance storage as part of my current project. My prior engagement involved, in part, carving up storage for Windows servers on an EMC Symmetrix. And I've talked with clients about offerings from Compellent, FalconStor, Hitachi, and HP over the years. I guess I would call my market the middle tier of storage; mainframes are not used, and only the occasional Unix system is ever connected to these boxes. Mainly they are for Exchange or Notes clusters, Microsoft SQL Server clusters, file clusters and/or NAS, VMware ESX servers, and Windows server boot drives. Ideally the storage controller is used as a mechanism to replicate these systems to another site for DR purposes.

Having worked with some of the major vendors in this middle tier, I need to voice my frustration. I do not profess to be a storage engineer, so I would not mind any corrections, differences of opinion, or even personal attacks by commenters on this post. I am merely expressing my frustration that nothing is perfect in the small slice of the storage world that I play in. Here are my frustrations:

Network Appliance
This is theoretically the new darling of mid-tier storage, but I grow frustrated with their architecture. Their strength is NAS, and in many ways NAS is quite appealing. Consolidating some servers? Want to do it without reconfiguring all your apps? Just have the NAS pretend to be FS1, FS2, NYFS18, whatever. Also, you need far fewer NAS heads than file servers to serve the same amount of data. And, in the NetApp world, snapshots of file systems on NAS are far smaller than snapshots of block storage. And you just have the sense that a purpose-built OS designed for serving files will do so faster and far more reliably for far longer than a Windows server.

But you still need some features that require separate servers when you choose NAS. File share virus protection requires a separate server which must hook into reads and/or writes of every file. Yep, the limiting factor on file serving speed just became Windows! You could argue that if you protect against viruses via desktop scanning, email gateways, and network edge scanning, you're safe enough, but you still have foreign devices that come onto your network. You could also argue that permissions auditing on a NAS system is not nearly as complete as what you would get in Windows (assuming you set that stuff up). And, in the case of my current client, no NAS can run the docsntss.exe service that DOCS Open requires to properly secure documents. It's a legacy document management issue peculiar to law firms, but if you have that issue, you can't use NAS for your documents. If you were also thinking of using something more sophisticated than RoboCopy or XCOPY /D to replicate data out to branch offices, and you didn't want to put a NetApp NAS in every branch office because you were, I don't know, fiscally responsible or something, then you might want to use the new DFS Replication engine (FRSv2) in the R2 release of Windows Server 2003 to replicate file data. Again, that ain't happening with NAS. It's not that I don't still believe in NAS (though you'd have to show me more sophisticated things coming from other vendors, such as running real-time scanning on the NAS head itself), but in my current assignment we've decided against it and in favor of clustered file servers.

However, it is the provision of block storage via iSCSI where I find NetApp falling short. I know that NetApp's strength is that you can make a big-ass RAID group and share all the spindles with a variety of applications. So they're ahead of EMC (Clariion), in my opinion, in that they actually can do this. But having RAID4 or RAID-DP as your only options here wreaks havoc in assigning spindles to write-intensive applications like Exchange and SQL. Maybe I'm gullible or an idiot, but I was given, and am sticking with, a write penalty of 6 for RAID-DP. That means that a group of servers that as a whole are performing 3900 reads/sec and 1100 writes/sec on 500GB of data would require over 58 spindles to handle the I/O load, or over 8TB allocated. So, yeah, you have more spindles to use, but you need more because the controller has to perform so many operations to do one write to a RAID-DP array! (The same 3900 reads / 1100 writes would only need 39 spindles in RAID10.)
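For what it's worth, the spindle arithmetic behind those numbers can be sketched in a few lines. This is a back-of-envelope illustration of my own: the write penalties are the ones discussed above, but the ~180 IOPS per 15K FC spindle is my rule-of-thumb assumption, not a vendor figure.

```python
import math

def backend_iops(reads_per_sec, writes_per_sec, write_penalty):
    """Translate front-end host I/O into back-end disk I/O.
    Each host read costs one disk I/O; each host write costs
    write_penalty disk I/Os (6 assumed for RAID-DP, 2 for RAID10)."""
    return reads_per_sec + writes_per_sec * write_penalty

def spindles_needed(disk_iops, iops_per_spindle=180):
    """Spindles required to absorb the back-end load.
    ~180 IOPS per 15K FC spindle is an assumed rule of thumb."""
    return math.ceil(disk_iops / iops_per_spindle)

raid_dp = backend_iops(3900, 1100, write_penalty=6)  # 10500 back-end IOPS
raid_10 = backend_iops(3900, 1100, write_penalty=2)  # 6100 back-end IOPS
print(raid_dp, spindles_needed(raid_dp))  # lands just over the ~58 cited
print(raid_10, spindles_needed(raid_10))
```

The RAID10 count this produces depends heavily on the per-spindle assumption, which is why the post's figure of 39 and a strict calculation can land in slightly different places.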

And snapshots of iSCSI block storage, because they are just files served off the NAS head, require their full size again for a snapshot. Granted, you have a lot of extra space on the array because you had to use so many spindles, but that just means you are wasting a ton of disk to provide the performance that far fewer RAID10 disks could deliver!

And iSCSI itself, through no fault of NetApp's, is not perfect. Certain legal software vendors recommend against it because its 1Gbit storage path is considered too slow. The only card I know of that boots off it, the QLogic 4010, does not support jumbo frames, which causes consternation among networking types: creating headers for 8K SCSI payloads adds calculation overhead and shrinks the data portion of each packet. (QLogic will fix this with the introduction of the 4050.) And the otherwise excellent VMware ESX Server doesn't support iSCSI until they finally decide to ship ESX 3.0 some time in the next 6 months.
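A rough sketch of why jumbo frames matter for those 8K SCSI payloads. The header sizes here are my simplifying assumptions (plain IPv4 and TCP with no options, and the iSCSI PDU header itself is ignored), so the numbers are illustrative only:

```python
import math

ETH_OVERHEAD = 14 + 4      # Ethernet header + frame check sequence
IP_TCP_OVERHEAD = 20 + 20  # IPv4 + TCP headers, no options assumed

def frames_for_write(payload_bytes, mtu):
    """Ethernet frames needed to carry one SCSI payload over TCP/IP."""
    per_frame_payload = mtu - IP_TCP_OVERHEAD
    return math.ceil(payload_bytes / per_frame_payload)

def header_overhead_pct(payload_bytes, mtu):
    """Fraction of wire bytes spent on headers rather than data."""
    frames = frames_for_write(payload_bytes, mtu)
    header_bytes = frames * (ETH_OVERHEAD + IP_TCP_OVERHEAD)
    return 100 * header_bytes / (header_bytes + payload_bytes)

print(frames_for_write(8192, 1500))  # 6 frames per 8K page at standard MTU
print(frames_for_write(8192, 9000))  # 1 frame with jumbo frames
```

Six header-building operations per 8K write versus one is the "consternation" in a nutshell.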

Finally, a flaw of at least the NetApp 900 family is that it only takes Fibre Channel disks. So if you, in a law firm environment, would like high-capacity storage for infrequently accessed data like disk backups and litigation support images, you need a different unit (a FAS3000 or R200) or you need to pay a fortune for 300GB FC drives. This is true as well of the EMC Symmetrix.

So, for these reasons, the NetApp offering is not perfect. Yes, I may unfairly disregard NAS because of a requirement of nutty old legacy apps. And I may misunderstand the performance of RAID-DP. And I haven't surveyed the iSCSI HBA offerings closely enough, so I may be missing one that allows both boot-from-SAN and jumbo frames. And I may not be aware of a newer NetApp FAS model that supports both FC and ATA disks. But all of these taken together cause me some dissatisfaction with the NetApp Filer offering.

Stay tuned for my discussion of the EMC product line. I find that they, too, provide no single product that meets all my needs.

This post was not sanctioned by my company (the name of which I've declined to disclose on this blog anyway); all beliefs are my own and do not reflect those of anyone else.




With regard to RAID-DP: in isolation, any RAID-6 implementation will do 6 I/Os to disk per write. However, to appreciate RAID-DP one needs an understanding of how Data ONTAP works and of the integration between WAFL and the RAID engine. ONTAP does true write coalescing. It does NOT hold a write in memory in anticipation of an adjacent block arriving, the way a typical array would with RAID-5 or a RAID-6 implementation. Because WAFL can write anywhere, and the logical-to-physical block mapping occurs on the fly, WAFL combines writes together into one large write. That means that random writes get sequentialized. It also means that there's no read/modify/write penalty. One thing people misinterpret is that Write Anywhere File Layout means that NetApp writes *anywhere*. What it really means is that even though WAFL has the ability to write anywhere, WAFL chooses where to write, and chooses carefully and intelligently, always looking for the best possible locations.
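A toy sketch of the coalescing idea the commenter describes (my own illustration, not NetApp code): scattered logical writes are batched in memory and then flushed to consecutive free physical blocks, so the disks see one sequential run rather than many seeks.

```python
class ToyWAFL:
    """Toy model of write-anywhere coalescing. Random logical writes are
    batched, then flushed together to consecutive physical blocks, with
    the logical-to-physical mapping updated on the fly."""

    def __init__(self):
        self.block_map = {}  # logical block -> physical block
        self.next_free = 0   # head of the free region chosen for this flush
        self.pending = []    # writes waiting for the next flush

    def write(self, logical_block, data):
        """Accept a write; nothing hits 'disk' yet."""
        self.pending.append((logical_block, data))

    def flush(self):
        """Lay out all pending writes contiguously and remap them."""
        physical_run = []
        for logical_block, data in self.pending:
            self.block_map[logical_block] = self.next_free
            physical_run.append((self.next_free, data))
            self.next_free += 1
        self.pending = []
        return physical_run  # one sequential run, whatever the logical order

wafl = ToyWAFL()
for lb in (907, 12, 55301, 4):  # scattered logical addresses
    wafl.write(lb, b"x")
run = wafl.flush()
print([phys for phys, _ in run])  # [0, 1, 2, 3] -- sequentialized
```

The real system obviously does far more (parity, NVRAM journaling, free-space selection), but this is the shape of the "random writes get sequentialized" claim.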

Contrast that with a typical array implementing RAID-5 or RAID-6. Writes are cached in memory and held for as long as possible in the hope that adjacent writes will show up so a full stripe can be written. On highly random apps, what are the chances of that happening with either RAID-5 or RAID-6? Practically zero; hence the piles of battery-backed cache implemented within those architectures.

Also, NetApp's published TPC-C results used fewer disks per tpmC (0.7) than RAID 1/0 (1.7) and RAID 5 (2.5)...

You may also want to take a look at the following links to see the platform Microsoft used for its SQL Server 2005 64-bit TPC-C benchmark using RAID-DP technology:

With regard to iSCSI snapshots: the space reserved for snapshots does not have to be the same size as the LUN. Under each volume there's a parameter called fractional_reserve which can be adjusted to reserve space equivalent to the rate of change of the data between snapshots. The range is 0-100%. The default is 100%, and the reason for that is that one could conceivably overwrite 100% of the LUN between snapshots. It's more of a safeguard than anything else, but one that can be adjusted.
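The sizing implication of that knob is simple arithmetic. This is my own illustration, assuming fractional_reserve behaves as the commenter describes: a 0-100% percentage of the LUN size set aside for post-snapshot overwrites.

```python
def snapshot_reserve_gb(lun_size_gb, fractional_reserve_pct):
    """Space set aside for overwrites of snapshotted LUN blocks.
    The default of 100 assumes the entire LUN could be rewritten
    between snapshots -- the worst case, not the typical one."""
    if not 0 <= fractional_reserve_pct <= 100:
        raise ValueError("fractional_reserve is a percentage, 0-100")
    return lun_size_gb * fractional_reserve_pct / 100

# Default: a 500GB LUN reserves another full 500GB.
print(snapshot_reserve_gb(500, 100))  # 500.0
# Measured ~20% change rate between snapshots: reserve only 100GB.
print(snapshot_reserve_gb(500, 20))   # 100.0
```

So the "same size again" cost in the original post is the default behavior, not a hard requirement.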

As far as iSCSI performance goes, judging purely on bandwidth, FC certainly has a bigger pipe. But speed is not a function of bandwidth but rather of size... I/O size, that is. A 5000-IOPS SQL Server DB at an 8K page size requires only about 40MB/s of bandwidth; you can't tell me that 2x 1Gb active/active paths, or iSCSI MCS (Multiple Connections per Session), can't easily handle that.
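The bandwidth arithmetic works out as follows (a sketch of my own; 125 MB/s is the raw line rate of 1Gb Ethernet before protocol overhead):

```python
import math

def required_mb_per_sec(iops, io_size_kb):
    """Throughput a given IOPS load actually demands."""
    return iops * io_size_kb / 1024

def gigabit_links_needed(mb_per_sec, link_mb_per_sec=125):
    """1Gb Ethernet tops out around 125 MB/s raw, before overhead."""
    return math.ceil(mb_per_sec / link_mb_per_sec)

load = required_mb_per_sec(5000, 8)      # 5000 x 8K pages ~= 39 MB/s
print(load, gigabit_links_needed(load))  # comfortably within one 1Gb path
```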
You are correct that the QLA4050 enables jumbo frames. And the QLA4010 and 4050/4052 are not the only cards with INT13h support: the Adaptec 7211 also supports iSCSI booting. You can also boot off the iSCSI software initiator using winboot/i from emboot; Alacritech uses this with their TOE cards to provide iSCSI boot.

The FAS6030 and 6070 systems do support an intermix of FC and SATA drives, the latter in 250/500GB capacities. My understanding is that the recent announcement of another vendor's platform refresh does *not* include SATA support across ALL of their new midrange platforms.

One last comment with regard to LUNs/files, as you called them. LUNs are not files within the context of what a file is/means to a lot of folks. LUNs are virtualized objects with characteristics and attributes that differ from those of a file. Furthermore, writes to these virtual objects follow specific code paths within Data ONTAP unrelated to the NAS protocols implemented within the array.
