Choosing the right RAID, and why disk layout is so important to a DB

This is a topic that is close to my heart at the moment. We are using a slightly older SAN here at the moment, and at the time it was set up, it was set up slightly below optimal. So now we are paying the price as our database performance requirements have outgrown the setup we have on the SAN. The core of my database environment is made up of high volume transactional systems, which all require massive disk I/O capabilities. There are still massive problems despite the machines having dual quad core CPU’s and 16GB of RAM each, so I just wanted to give a little lesson on why choosing the correct disk layout is so important, and also how to shift your databases around after the fact.

I won’t go into much detail on RAID, as there are far more educated and detailed guides out there for that (any guide written by someone aged 4+), but the two you need to be aware of for most DB setups are RAID 5 and RAID 10. Raid 5 is 3 or more disks combined using Block-level striping across all disks to maintain distributed parity (essentially fault tolerance) should one disk die. Raid 10 is 4 or more disks, Combines Raid 1 and Raid 0 which is Striping and mirroring with each disk set using Block-level striping and then Mirroring eg. Disk 1 and 2 stripe and then mirror to disk 3 and 4 maintaining distributed parity. Raid 10 can be considered a mirror of Stripes. (Thanks to David Gargan (@Jaoibh) for the re-write on this section)

The benefits of RAID 5 in a database environment are the storage capacity provided by striping (joining) 2 or more drives for additional capacity, while still providing some redundancy in the case of a drive failure with the spare (third) drive. The downside is that you have one spare trying to keep up with 2 or more drives, so in order to keep everything synched the read/write times are a little slower. Therefore your max I/O is affected. You would use this sort of setup for say Datawarehousing, or low transacting servers where the drive capacity is more important than being able to pump through millions of transactions.

The benefits of RAID 10 in a database environment are the speed at which the disks can process read/writes, as each spare only has to mind one active drive, therefore keeping them in synch is much quicker, and will give you great read/write capability, along with keeping a good level of redundancy. This is what you need to go for when setting up a database that will be handling a lot of database Inserts, lookups and updates at the same time.

One last rule of thumb on the drive side is that your drive block size should be 64K or a multiple of (128, 256 etc.). The default block size on NTFS is 32K, which is what windows uses, but SQL stores it’s data in 64K blocks. So on a default format SQL needs to read 2 blocks on the drive to access one block of its data. If the drive is formatted with 64K blocks, it gets a one to one relationship. Two on one relationships, while very desirable in some facets of life, aren’t great for databases.

The other small detail (sarcastic use of the word small) to remember, is that your data file, and your log file, should be on separate partitions. There are a million and one technical reasons out there as to why this is, but the basic reason is that you want to reduce disk contention, and your log file is constantly processing and using up I/O resource, so it’s best to keep that away from the data seeks going on in your database file.

So! Now you’ve set up your brand new server the way you want it, but your databases are still holed up on your 20GB c: drive. You’ll want to move them to your freshly formatted and serviced partitions. Assuming that you can take them offline for a period of time, this is the safest way to do it.

The first step, is to detach the database. The Microsoft guide to detaching is here, and does a far better job than I could manage!

Once your Database is detached, you want to copy the (most likely) two files to their appropriate directories. You will be looking to locate a DatabaseName.mdf and DatabaseName.ldf. Assuming you have set up your Data partition as E: and your Log file partition as F:, you will want to copy DatabaseName.mdf to E: and DatabaseName.ldf to F:.

Once the file copies are complete, you will want to attach your databases back into your SQL instance. Again, Microsoft have this handled, and you can read the How To here.

And there you go! Super fast server running as smooth as you like. This guide is a confusing middle ground between basic and detailed, I know, but it is just to get the general principles across and allow you to do more research into the areas that concern or interest you.

If anyone is interested in having a scripted version of the detach/reattach done for whatever reason, let me know and I’ll stick it up. Or if you have any questions coming from the guide that you’d like answered in more detail just shoot me a message and I’ll be happy to help.

Enjoy!

2 thoughts on “Choosing the right RAID, and why disk layout is so important to a DB”

Leave a Comment

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.