No, I don't mean brand, or color, or even connector-type zealotry. But when sizing solutions and working with customers and partners on new builds, I often find that either no thought goes into the SAS cabling, or only a little bit does. I find this distressing. Let me tell you why.
See, either no thought goes into it at all (a common mistake), or the only thought that does is about cable redundancy, sometimes paired with thoughts towards JBOD and/or HBA redundancy (an even more common, if more forgivable, mistake). All of these are, of course, important (how important depends on the use case). I'm debating a blog post on sizing solutions, and on redundancy, so I'll save talk of that for later. This is just a short (by my standards) post to explain something I often see completely overlooked: throughput, and just how much of it you actually have compared to what you think you have.
See, anyone sizing a solution involving JBODs often does give some thought to throughput. Most people understand that each 'mini-SAS' cable (SFF-8086/8087/8088 style connectors) carries 4 separate SAS lanes, and most understand that if your entire solution is SAS-1, you'll get 3 Gbit/s out of each lane, and if it is SAS-2, you'll get 6 Gbit/s out of each lane. My first word of advice is to treat this much like many network administrators treat network connections: pretend you only get 80%. For ease of remembrance, I just generally pretend that at best, a mini-SAS cable can do 2 GByte/s, and I'm rarely disappointed.
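As a back-of-the-envelope sketch of that math (the 80% figure is just the rule of thumb above, not a spec number):

```python
# Rough usable throughput for one 4-lane mini-SAS cable.
# Assumption: only ~80% of the raw line rate is usable, per the
# rule of thumb above -- not a figure from the SAS specification.
LANES = 4
USABLE_FRACTION = 0.80

def cable_gbytes_per_sec(lane_gbit: float) -> float:
    """Approximate usable GB/s for a 4-lane mini-SAS cable."""
    raw_gbit = LANES * lane_gbit               # total raw Gbit/s across the cable
    return raw_gbit * USABLE_FRACTION / 8      # Gbit/s -> GB/s

# SAS-1: 4 lanes * 3 Gbit/s * 0.8 / 8 bits-per-byte = about 1.2 GB/s
sas1 = cable_gbytes_per_sec(3.0)
# SAS-2: 4 lanes * 6 Gbit/s * 0.8 / 8 bits-per-byte = about 2.4 GB/s
sas2 = cable_gbytes_per_sec(6.0)
print(f"SAS-1 cable: ~{sas1:.1f} GB/s, SAS-2 cable: ~{sas2:.1f} GB/s")
```

Rounding the SAS-2 number down to a flat 2 GB/s, as above, leaves you a comfortable margin.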
The thing I often see people forget, however, is that what's coming into your SAN/NAS is not what's going down to the drives. Let's take the easiest use case to understand (and generally the worst one to deal with): mirrors (RAID1/10). If 200 MB/s of data is coming into the SAN, all of it unique, how much is going to the drives if they're in a pool of 2-disk mirror vdevs? Answer: more than 400 MB/s. (Why more than double? Easy: ZFS maintains metadata about each block, and that also has to go down on the disks.) Suddenly that 2 GB/s SAS cable is only actually capable of sending less than 1 GB/s of unique data downstream.
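A sketch of that arithmetic; note the metadata overhead factor here is purely an illustrative guess (real overhead varies with block size, recordsize, and pool layout):

```python
# Downstream bytes hitting the disks for mirrored writes.
# metadata_overhead is a hypothetical illustrative factor, NOT a
# measured ZFS number -- it just shows why the answer is MORE than 2x.
def downstream_mb_s(incoming_mb_s: float, copies: int,
                    metadata_overhead: float = 0.02) -> float:
    """MB/s written to disk for `copies`-way mirrored incoming data."""
    return incoming_mb_s * copies * (1 + metadata_overhead)

# 200 MB/s of unique data into a pool of 2-disk mirror vdevs:
writes = downstream_mb_s(200, copies=2)
print(f"{writes:.0f} MB/s going down the cable")  # > 400 MB/s
```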
Ironically, while ZFS far (far, far, far) prefers mirror pools for IOPS-heavy use cases, mirroring has a significant impact on downstream throughput potential, especially if your build isn't taking this doubling into account in the design. Conversely, raidz1|2|3 vdevs lose much less: the only additional data that has to go down is the parity, which even in a raidz3 vdev balloons the data going down far less than mirrors do. So for raw throughput where the SAS cabling could become the bottleneck, raidz is a clear winner in terms of efficiency.
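To put rough numbers on the mirror-vs-raidz difference, here's a simple full-stripe write-amplification comparison (the vdev widths are hypothetical examples, and this ignores metadata, padding, and partial-stripe effects):

```python
# Bytes written to disk per byte of unique data, for an idealized
# full-stripe write. Widths below are hypothetical examples.
def amplification(data_disks: int, parity_disks: int) -> float:
    """Write amplification factor: (data + parity) / data."""
    return (data_disks + parity_disks) / data_disks

mirror_2way   = amplification(1, 1)   # 2.0  -- every byte written twice
raidz1_5wide  = amplification(4, 1)   # 1.25 -- 4 data disks + 1 parity
raidz3_9wide  = amplification(6, 3)   # 1.5  -- even triple parity beats mirrors
print(mirror_2way, raidz1_5wide, raidz3_9wide)
```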
It isn't all bad news, though. Once you understand this bottleneck, you'll appreciate ZFS' built-in compression even more than you probably already did, because that compression happens before the data goes down to the disks, potentially having quite an impact on how much usable data can get down the paths per second. And while I almost always steer people away from it, if your use case does benefit strongly from deduplication, that also takes effect beforehand, massively reducing writes to disk if the dedupe ratio is high.
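A rough sketch of how compression buys back cable bandwidth; the 1.5x ratio here is a hypothetical example (actual ratios depend entirely on your data):

```python
# Compression happens before data goes down the cable, so a cable's
# limit applies to compressed bytes, not logical client bytes.
def logical_gb_s(cable_gb_s: float, compress_ratio: float,
                 write_amplification: float) -> float:
    """Logical (pre-compression) GB/s a cable can carry, given a
    compression ratio (e.g. 1.5 = 1.5x) and pool write amplification
    (2.0 for 2-way mirrors)."""
    return cable_gb_s * compress_ratio / write_amplification

# Hypothetical: 2 GB/s cable, ~1.5x compression, 2-way mirror pool:
print(logical_gb_s(2.0, 1.5, 2.0))  # -> 1.5 GB/s of unique client data
```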
So in the end, my advice when building solutions utilizing any sort of SAS expansion is to bear in mind not just how much performance you want to get out of the pool (a number you often know), but how much that actually means in terms of data going to the drives, and whether your cabling can even carry it all. I'm seeing more and more boxes with multiple 10 Gbit NICs go out with single-SAS-cable bottlenecks that will very likely make it impossible to fully utilize the incoming network bandwidth in a throughput situation, because even if the back-end disks could support it, the SAS cabling in between simply can't. That's OK if you're hoping for most of that network bandwidth to be ARC-served reads -- but if you're expecting it to come to or from the disks, remember this advice.