How to set up a RAID array with both SATA and SAS disks
Creating separate RAID sets for SATA and SAS disks
By Paul Hickinbotham of Hammer | Published: 00:00, 31 May 2005
Until now, systems administrators have faced an either/or choice with regard to data storage.
Most companies run a mix of email, databases, file and print, and a number of bespoke applications - each with different needs - and the only viable option has been to adopt a generic stance. If your business needs high performance, take the SCSI/fibre channel route; if ultimate speed is not so critical, you may be able to get away with the more cost-effective Serial ATA (SATA) alternative.
More recently it has been possible to mix fibre channel and SATA behind the same RAID application, but this has not offered a truly integrated solution. The need for multiple devices has brought with it increased cost and manageability problems.
What we are now seeing, however, is a new level of connectivity in RAID solutions, with SATA and Serial Attached SCSI (SAS) using the same backplane technology. As SAS enclosures come to market, it will be possible for the first time to incorporate SATA and SAS disks within the same RAID array.
The first step
To implement an integrated solution, the first step is to work out the total capacity requirements of all the IO-intensive applications and all the throughput-intensive (megabytes per second) applications or, put another way, the random versus the sequential applications.
This will give you the basic requirement for how much SATA disk you need for the non-business critical sequential data and how much SAS disk will be required for the random or business critical data.
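The sizing exercise above can be sketched in a few lines of Python. This is only an illustration: the workload names, capacities and the `tier_for` rule are hypothetical, and a real assessment would rest on measured IO profiles rather than a simple label.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    capacity_gb: int
    access: str              # "random" or "sequential"
    business_critical: bool


def tier_for(w: Workload) -> str:
    """Random or business-critical data goes to SAS; everything else to SATA."""
    return "SAS" if w.access == "random" or w.business_critical else "SATA"


def capacity_by_tier(workloads):
    """Sum the required capacity in GB for each disk type."""
    totals = {"SAS": 0, "SATA": 0}
    for w in workloads:
        totals[tier_for(w)] += w.capacity_gb
    return totals


# Hypothetical figures, for illustration only.
workloads = [
    Workload("email", 400, "random", True),
    Workload("database", 600, "random", True),
    Workload("file and print", 1500, "sequential", False),
    Workload("backup archive", 3000, "sequential", False),
]

totals = capacity_by_tier(workloads)
print(totals)
```

The two totals are the starting point for deciding how many SAS and how many SATA disks to place in the enclosure.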
Having established the capacity requirements, you will have a mix of these disks within the same enclosure. For manageability, the SATA and SAS disks can be marked as such and prioritised within the RAID solution. Knowing which disks are which, the next step is to create separate RAID sets within the RAID controller or solution, based on your application-level requirements.
This turns the implementation process on its head. Previously, the initial task was to create multiple RAID sets and then, via differing RAID appliances, go to the application servers and provision that storage out to them. Now, by contrast, you start with the application and then select the best RAID set for it. In other words, you begin with the application and make sure the RAID services it, rather than beginning with the RAID and making sure it can service the application.
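As a rough sketch of this application-first approach, the Python below groups the disks in one enclosure by interface, builds one RAID set per disk type, and then points each application at the set that suits it. The inventory, disk sizes, RAID levels and application mapping are all assumptions made for illustration; the usable-capacity arithmetic is the standard one for RAID 5 (one disk's worth of parity) and RAID 10 (mirrored pairs).

```python
from collections import defaultdict

# Hypothetical enclosure inventory: (slot, interface, size_gb).
enclosure = [
    (0, "SAS", 146), (1, "SAS", 146), (2, "SAS", 146), (3, "SAS", 146),
    (4, "SATA", 500), (5, "SATA", 500), (6, "SATA", 500), (7, "SATA", 500),
]


def group_by_interface(disks):
    """Mark each disk as SAS or SATA by collecting slots per interface."""
    groups = defaultdict(list)
    for slot, interface, size_gb in disks:
        groups[interface].append((slot, size_gb))
    return dict(groups)


def usable_gb(level, sizes):
    """Usable capacity of a RAID set, limited by its smallest member disk."""
    n, smallest = len(sizes), min(sizes)
    if level == "RAID5":
        if n < 3:
            raise ValueError("RAID 5 needs at least three disks")
        return (n - 1) * smallest
    if level == "RAID10":
        if n < 4 or n % 2:
            raise ValueError("RAID 10 needs an even number of disks, four or more")
        return (n // 2) * smallest
    raise ValueError(f"unsupported RAID level: {level}")


disks = group_by_interface(enclosure)

# Assumed choice of levels: RAID 10 for the random set, RAID 5 for the sequential set.
raid_sets = {
    "SAS": usable_gb("RAID10", [size for _, size in disks["SAS"]]),
    "SATA": usable_gb("RAID5", [size for _, size in disks["SATA"]]),
}

# Start from the application, then pick the RAID set that services it.
applications = {"database": "SAS", "email": "SAS", "file and print": "SATA"}
for app, tier in applications.items():
    print(f"{app}: {tier} RAID set, {raid_sets[tier]} GB usable")
```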
How big a change is this?
The key is that, from a management perspective, it significantly simplifies the implementation and use of RAID solutions by removing many of the difficulties of choosing the best disk approach. Although selecting the right disk for each application has always been the theoretical ideal, until now the easier route has been to choose the biggest, fastest disk for all applications. Now, the like-for-like option is as easy to manage as the overkill approach.
Once you have established which applications require which disk, and the solution providing the necessary volume of storage is in place, the next step is to provision out that storage to each of the servers as it is needed.
This can be further improved with virtualisation, which makes allocation and expansion of the storage environment much quicker and simpler to implement, control and manage.
From this point on, there is no significant difference from the earlier approach: as you need more storage, you add it on an as-required basis, adding either SATA or SAS depending on the needs and growth of different applications. The difference is that a single storage solution can meet the requirements for both sequential bandwidth and random performance, and grow easily with the future needs of the business.
Are there any pitfalls?
So far, we have only considered random and sequential data, but there is a third type, namely pseudo-sequential data, which can catch people out. This is data that on the face of it appears to be sequential but which, depending on the particular architecture or the nature of the business, presents itself to the RAID solution as if it were in a random environment.
So, an end-user who understands their business but does not understand data can fall into the trap of provisioning data incorrectly. This creates performance bottlenecks where it is presumed that, due to the nature of the data, SATA disks are adequate, whereas in fact the more expensive enterprise-class SAS drives are required, burdening the end-user with extra administration and cost to rectify the situation.
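One way to picture pseudo-sequential data is several sequential streams reaching the array at the same time. That cause is an assumption chosen for illustration; the article does not name a specific one. The Python sketch below uses a hypothetical `sequential_fraction` measure, the share of requests that start exactly where the previous one ended, to show how two perfectly sequential streams look random once they are interleaved.

```python
def sequential_fraction(requests):
    """Fraction of requests that begin where the previous request ended.

    `requests` is a list of (start_block, length) tuples in arrival order.
    """
    if len(requests) < 2:
        return 1.0
    hits = sum(
        1
        for (prev_start, prev_len), (start, _) in zip(requests, requests[1:])
        if start == prev_start + prev_len
    )
    return hits / (len(requests) - 1)


def stream(start_block, count, length=8):
    """A purely sequential run of `count` requests of `length` blocks each."""
    return [(start_block + i * length, length) for i in range(count)]


# One stream on its own is fully sequential.
single = stream(0, 100)

# Two sequential streams in different regions of the disk, arriving alternately.
a = stream(0, 50)
b = stream(1_000_000, 50)
interleaved = [req for pair in zip(a, b) for req in pair]

print(sequential_fraction(single))       # 1.0
print(sequential_fraction(interleaved))  # 0.0
```

Each source still writes sequentially, yet from the disks' point of view every request requires a seek, which is the pattern that favours SAS over SATA.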
One answer is for the end-user to work with a partner who understands their business and which of its applications are business-critical, and who at the same time has the necessary technical expertise regarding individual data types and the way in which they are laid out on a RAID array.
Who is this relevant to?
So how does a systems administrator know whether or not this solution will be relevant to their business? The short answer is that almost every SME should look at it, for only a few niche sectors use purely sequential or purely random data, or purely read-intensive or purely write-intensive data. (For example, CCTV and video streaming companies will use sequential data only, whereas businesses involved in purely transactional databases will use random data.)
This is another example of two recurring trends in storage development: the move towards convergence, and vendors developing technologies specifically around the needs of the end-user. So, for any business looking for ease of management in a mixed environment, this integrated approach is worthy of serious consideration.
Paul Hickingbotham is solutions marketing manager at Hammer. Contact him via firstname.lastname@example.org.