Hard Disk Optimization

PLEASE NOTE: This article has been archived. It first appeared on ProRec.com in May 1998, contributed by then Contributing Editor Jose-Maria Catena. We will not be making any updates to the article. Please visit the home page for our latest content. Thank you!

We love music making. So why the heck are we complicating our existence with hard disk technical and configuration issues? The answer is easy: to work with multi-track audio, we need the best disk performance possible. More disk throughput means more tracks. And the hard disk is usually the constraint to achieving the maximum number of tracks. Any improvement in the hard disk performance translates directly into more audio tracks.

In this article, I’ll try to make hard disk optimization easy for everybody, reviewing all important points about hard disk optimization in general, and Cakewalk disk access optimization in particular, where I’ll give a big new surprise for Cakewalk users…

Understanding physical disk performance numbers

Sustained transfer rate

This is the most important performance measurement for evaluating a disk. It is the quantity of information that the disk can read sequentially per unit of time, usually expressed in MB/s (megabytes per second). Sustained means that the disk can deliver this performance ad infinitum. An odd thing about this measurement is that many manufacturers do not provide this important number. Instead they prefer to give the more persuasive maximum throughput or burst transfer rate, which is the maximum amount of data per second that the disk can deliver, even if it cannot sustain this rate. The max throughput of a disk is quite relevant for most applications, but not for hard disk recording, because hard disk recorders will exhaust a disk’s burst speed in the first second or so of recording. After that, the disk must be able to sustain the transfer rate or the recorder will stall. Therefore it is critical when buying a hard disk for recording that you make sure the seller is quoting the sustained transfer rate.

In all modern hard disks, the sustained transfer rate is higher at the outer (or first) cylinders or tracks, and it decreases as the head goes toward the center of the platters. The reason is that outer tracks are physically longer, and to maintain the maximum linear density, their bit rate is higher than that of the shorter inner tracks. The outer tracks have more sectors than the inner tracks, but a whole track is read in the same time. This means that if we create several partitions, the performance is better in the first one and worse in the last one. This also translates into performance degradation as a partition fills up, because files are allocated beginning with the outer (or first) tracks.
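As a rough illustration of the geometry argument above, the sketch below assumes constant linear bit density and constant rotational speed, so the sustained rate scales with the track radius. The radii and the outer-track rate are hypothetical figures chosen for the example, not measurements of any real drive.

```python
def rate_at_radius(outer_rate_mb_s, outer_radius_mm, radius_mm):
    """Sustained rate at a given radius, assuming constant linear bit
    density and constant rpm (the rate is then proportional to radius)."""
    return outer_rate_mb_s * radius_mm / outer_radius_mm

# Hypothetical drive: data zone from 45 mm (outer) down to 20 mm (inner).
OUTER_RADIUS_MM = 45.0
INNER_RADIUS_MM = 20.0
OUTER_RATE_MB_S = 9.0  # assumed rate on the first tracks

inner_rate = rate_at_radius(OUTER_RATE_MB_S, OUTER_RADIUS_MM, INNER_RADIUS_MM)
print(f"outer: {OUTER_RATE_MB_S:.1f} MB/s, inner: {inner_rate:.1f} MB/s")
```

Under these assumptions the last tracks deliver less than half the rate of the first ones, which is why the first partition is the one to reserve for audio.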

Seek times

The “one stroke seek” is the time the head needs to settle on an adjacent track. The “full stroke seek” is the time required to travel from the first to the last track of the disk. The “average seek” is the time required for a 1/3 full stroke, which is considered the average seek time for a random seek, and is the value most often supplied by manufacturers. Sometimes “average access time” is specified instead, which means the average seek time plus the average sector search time (which depends mainly on rotational speed).
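The relation between average seek and average access time can be sketched as follows. This assumes the average sector search time is half a revolution, which is the usual convention; the 9 ms seek figure is a hypothetical example value.

```python
def average_rotational_latency_ms(rpm):
    """Average sector search time: on average the wanted sector is
    half a revolution away from the head."""
    ms_per_revolution = 60_000.0 / rpm
    return ms_per_revolution / 2.0

def average_access_time_ms(average_seek_ms, rpm):
    """Average access time = average seek + average rotational latency."""
    return average_seek_ms + average_rotational_latency_ms(rpm)

ASSUMED_SEEK_MS = 9.0  # hypothetical average seek time
for rpm in (5400, 7200, 10000):
    latency = average_rotational_latency_ms(rpm)
    access = average_access_time_ms(ASSUMED_SEEK_MS, rpm)
    print(f"{rpm:>5} rpm: latency {latency:.2f} ms, access {access:.2f} ms")
```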

Internal transfer rate

This specifies the peak burst transfer rate from the disk surfaces to the heads. It’s sometimes expressed in MB/s (megabytes per second) and sometimes in Mb/s (megabits per second). This is not the same as the sustained transfer rate, but it’s the parameter that contributes most to the final sustained transfer rate, which is always smaller.

The internal transfer includes not only the user data, but also the sector headers, servo info, ECC data, etc. There are gaps between sectors, and while the heads pass over them, no useful data is transferred. The internal transfer also stops while switching heads and seeking. That’s why the internal transfer rate is not the same as the sustained transfer rate. But manufacturers usually give this number in lieu of the much more meaningful sustained one. It seems they don’t like to give the relevant information.
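When a data sheet quotes only the internal rate, a first step is to convert it to megabytes and then discount the overhead. The sketch below does just that; the payload fraction is an assumed, hypothetical figure (real values depend on the drive’s format and are rarely published), so the result is only an upper-bound estimate.

```python
def megabits_to_megabytes(mbit_per_s):
    """Convert Mb/s (megabits) to MB/s (megabytes)."""
    return mbit_per_s / 8.0

def sustained_upper_bound(internal_mb_s, payload_fraction):
    """Rough ceiling for the sustained rate.

    payload_fraction is the assumed share of the raw stream that is user
    data (the rest is headers, servo info, ECC and gaps). Head switches
    and seeks reduce the real figure further.
    """
    if not 0.0 < payload_fraction <= 1.0:
        raise ValueError("payload_fraction must be in (0, 1]")
    return internal_mb_s * payload_fraction

internal = megabits_to_megabytes(120.0)        # hypothetical 120 Mb/s spec
estimate = sustained_upper_bound(internal, 0.7)  # assumed 70 % payload
print(f"internal: {internal:.1f} MB/s, sustained at most about {estimate:.1f} MB/s")
```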

Channel type

There are two main kinds of disk interface: IDE and SCSI. The SCSI interface is much better, but SCSI disks are also much more expensive than IDE disks. We must not base our selection only on the channel type, but on all the disk specifications, as some of the fastest IDE disks are faster than non-high-end SCSI disks.

Manufacturers are delivering the latest technology disks with SCSI interfaces. The same disks are also delivered with IDE interfaces typically after they make even faster SCSI ones. So SCSI is only faster if we choose the newest and fastest disks available at this time, or if we need some of the other advantages of the SCSI interface.

The major differences between IDE and SCSI are:

– A SCSI channel supports up to 8 devices in the narrow version, or 16 in the wide version. The controller itself counts as a device. The IDE channel supports only 2 devices; most current mainboards include 2 IDE channels for a total of 4 devices.
– In a SCSI channel, there can be concurrent operations for several devices. This way, the system can order a transfer for device 0, and then for device 1; each device requests the bus as it has data available, sharing the total channel bandwidth. In an IDE channel, an operation for a device must complete before a new operation can begin for another device on the same channel, although different IDE channels can work concurrently.
– IDE CD-ROM burners aren’t currently supported by some of the better CD-Audio premastering programs. This should become a non-issue soon because we can expect that most programs will support IDE burners in the near future, but for now, it’s a problem with IDE burners.

There are often also some WRONG assumptions about SCSI vs IDE:

– SCSI requires less CPU usage. This is totally false. This depends on the controller technology, mainly the bus-mastering capability or the ability to use DMA channels, not the channel type.
– SCSI is faster. This is false when stated alone in a general way. This depends on the disk performance, not the channel type.

Channel transfer rate

This is the maximum transfer rate through the communication channel between the disk drive and the host adapter. It must be higher than the maximum sustained transfer speed to avoid becoming the bottleneck.

With IDE (or ATA) disks, the currently fastest modes are: PIO4 (16 MB/s), and DMA2 (33 MB/s UDMA). DMA1 is 16 MB/s.

With SCSI, there are asynchronous modes (very slow) and synchronous ones. For synchronous modes, the speeds are: Standard = 5 MB/s, Fast = 10 MB/s, Wide = 20 MB/s, Ultra = 20 MB/s, UltraWide = 40 MB/s. There is also a recently specified mode rated at 80 MB/s.

Any channel speed above the transfer speed of the disk doesn’t improve the transfer rate for that disk. It only leaves the channel idle for more of the time during the transfer, so it can be used by other drives in the case of SCSI, which is important for RAID systems. It also helps to reduce the PCI bus bandwidth used, leaving the bus free more of the time for other peripherals such as the display. This is why we don’t usually get worthwhile improvements from UDMA/DMA2 versus DMA1, or between UltraSCSI and UWSCSI, for example.
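The point can be made with simple arithmetic. In this sketch the effective rate is the smaller of the disk and channel rates, and the channel is busy only for the fraction of time needed to carry that rate. The 7 MB/s disk is an example figure; 16 and 33 MB/s are the DMA1 and DMA2 rates quoted above.

```python
def effective_rate_mb_s(disk_rate_mb_s, channel_rate_mb_s):
    """The slower of the two limits the transfer."""
    return min(disk_rate_mb_s, channel_rate_mb_s)

def channel_busy_fraction(disk_rate_mb_s, channel_rate_mb_s):
    """Fraction of time the channel is occupied during a long transfer."""
    return effective_rate_mb_s(disk_rate_mb_s, channel_rate_mb_s) / channel_rate_mb_s

DISK_RATE = 7.0  # example sustained rate of the disk, MB/s
for name, channel_rate in (("DMA1", 16.0), ("DMA2/UDMA", 33.0)):
    rate = effective_rate_mb_s(DISK_RATE, channel_rate)
    busy = channel_busy_fraction(DISK_RATE, channel_rate)
    print(f"{name}: effective {rate:.1f} MB/s, channel busy {busy:.0%}")
```

Both modes deliver the same 7 MB/s; the faster channel is simply idle for more of the time.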

Bus mastering

This is a very important optimization for us, so please read carefully. The massive transfer needs of hard disk recorders require a lot of CPU bandwidth if the CPU itself has to move the data between the controller and memory. To free CPU time and make it available for other tasks (like audio mixing and processing), the transfers must be carried out by the peripherals themselves.

There are two ways to achieve this:

– Bus mastering controllers: These request direct accesses to the bus and do the transfer, signaling the completion to the CPU through an interrupt.
– The DMA controller: In our PCs there is a peripheral called the DMA controller, which is a bus master device specialized in doing transfers between memory and other peripherals. The ISA and PCI buses include signals to control DMA transfers.

Both methods are actually valid, but most modern PCI based controllers use bus mastering, the preferred method for the PCI bus. ISA doesn’t support bus mastering properly, so the only solution for ISA cards is DMA.

Bus mastering is not the same as the DMA mode of the IDE disks, but it is usually related because most IDE controllers can only operate in bus master mode when using the IDE devices in DMA mode.

To get the great advantages of bus mastering, we need the following:

– A bus master capable controller. Applicable to both IDE and SCSI.
– Bus mastering drivers for the controller.
– In the case of an IDE channel, DMA mode capable devices. Most modern devices support at least DMA1 (16 MB/s), or better, DMA2 or UDMA (33 MB/s).

Always install bus master drivers if your system supports them. The CPU usage typically goes down from above 50 % to below 5 %. Free power for audio processing (or whatever)!

A note about bus mastering drivers: most manufacturers (Adaptec, Intel, Asus, etc.) provide bus master drivers for their bus master capable controllers. Those drivers automatically run in bus master mode. The drawback is that a device that doesn’t work properly in DMA mode sometimes can’t be used with them. Win95B (OSR2) automatically installs Microsoft’s bus master drivers, which are reliable and give good performance, but they don’t work in bus master mode by default. To enable bus master mode for each device, we must go to Control Panel -> System -> Devices -> Disks (& CD-ROM) -> Configuration, and check the DMA box. This option might not be visible if our controller does not support bus mastering or if we installed specific drivers. Versions of Win95 previous to OSR2 do not install bus master drivers.

Another note: the DMA mode of IDE devices includes error checking in the transfers to detect data corruption on the bus. Data corruption should never occur, but misbehaving devices, poor quality cables or cables that are too long can cause this problem. The Microsoft drivers do not check the CRC error reported by the controller, but there is an upgrade on Microsoft’s web site. In any case, we should make sure that our disk subsystem is not causing CRC errors, because errors translate into retries, which degrade performance a lot. As a general rule, use the shortest possible cables, and use high quality ones. Also choose reliable and proven devices tested in DMA mode: many people do not use the DMA mode, so there are devices on the market that seem to work but fail when used in DMA mode (especially in UDMA or DMA2 because of the higher speed).

Some drivers (ones from Intel, for example) ignore the caching attribute, which makes some benchmarks unreliable: they report higher results than we get in real life. This doesn’t affect my DskBench program, however (see the Software section to download DskBench).

Disk partitioning

1) For best performance, use either a dedicated disk for the audio files or the FIRST partition of your fastest disk. As explained earlier, data at the beginning of the disk is transferred faster. Dedicating a disk or partition to audio files also helps to reduce file fragmentation.

2) Always use the FAT file system for the audio partitions. It’s faster than NTFS and others. You can use FAT32 if you have Windows 95 OSR2. FAT32 allows very large partitions (FAT16 only allows partitions up to 2 GB), so you can create a large audio disk with only one partition. The performance of FAT32 is about the same as that of FAT16. However, FAT32 will try to create smaller clusters than you want. Disks don’t read one bit at a time; they read clusters of bits. For many applications, smaller clusters mean less wasted space on your disk, but for audio, smaller clusters mean more discrete read/write operations. Translation: slowness. Therefore, when formatting a FAT32 device, always use the format command with the /z:32 or /z:64 switch (the largest possible for your disk) to create bigger clusters. This gives slightly better performance and less file fragmentation.
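To see what the cluster size means for an audio file, the sketch below counts the clusters a file occupies and the space wasted in its last cluster. It assumes that the /z value is the number of 512-byte sectors per cluster (so /z:64 would give 32 KB clusters); that reading of the switch is an assumption here, as is the example file (5 minutes of 16-bit stereo at 44.1 kHz).

```python
SECTOR_BYTES = 512  # assumed sector size

def cluster_bytes(sectors_per_cluster):
    """Cluster size in bytes for a given sectors-per-cluster value."""
    return sectors_per_cluster * SECTOR_BYTES

def clusters_for_file(file_bytes, sectors_per_cluster):
    """Number of clusters the file occupies (rounded up)."""
    size = cluster_bytes(sectors_per_cluster)
    return -(-file_bytes // size)  # ceiling division

def slack_bytes(file_bytes, sectors_per_cluster):
    """Unused space in the last cluster of the file."""
    size = cluster_bytes(sectors_per_cluster)
    return clusters_for_file(file_bytes, sectors_per_cluster) * size - file_bytes

# Hypothetical file: 5 minutes of 44.1 kHz, 16-bit, stereo audio.
FILE_BYTES = 300 * 44100 * 2 * 2

for spc in (8, 32, 64):
    print(f"{cluster_bytes(spc) // 1024:>2} KB clusters: "
          f"{clusters_for_file(FILE_BYTES, spc)} clusters, "
          f"{slack_bytes(FILE_BYTES, spc)} bytes of slack")
```

For a file this large the wasted space is negligible at any cluster size, while the number of clusters to track drops by a factor of eight between 4 KB and 32 KB clusters.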

Win95 disk access optimization

There are some things we can fine tune for our needs:

Bus Mastering drivers

Install bus mastering drivers always. The CPU usage typically goes down from above 50 % to below 5 %. Free power for audio processing!

File caching

The minimum and maximum size of memory used to cache file accesses can be adjusted in the SYSTEM.INI file, section [VCACHE]:

MinFileCache = n1

MaxFileCache = n2

Here n1 and n2 are the sizes in KB. By default, these entries do not exist and Windows adjusts the size automatically.

But file caching is not useful with audio files (or any other kind of very large, streaming files). So, if we run short of physical memory, we can set the MaxFileCache setting to limit the maximum quantity of memory dedicated to file caching, resulting in more memory available for other things. MinFileCache can be left undefined or set to 0.

As a general guide, I recommend the following values:

RAM -> MaxFileCache (KB)

16MB -> 1024..2048
32MB -> 2048..4096
64MB -> 4096..8192
64+MB -> 8192
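The table can be turned into a small helper that prints the lines to put in SYSTEM.INI. This is only a sketch: how it treats RAM sizes that fall between the table rows (rounding up to the next row) and its default of using the upper end of each range are assumptions, not part of the recommendation above.

```python
def max_file_cache_range_kb(ram_mb):
    """Recommended MaxFileCache range in KB, following the table above.
    RAM sizes between rows are mapped to the next row up (an assumption)."""
    if ram_mb <= 16:
        return (1024, 2048)
    if ram_mb <= 32:
        return (2048, 4096)
    if ram_mb <= 64:
        return (4096, 8192)
    return (8192, 8192)

def vcache_section(ram_mb, use_upper=True, min_kb=0):
    """Build the [VCACHE] section text for SYSTEM.INI."""
    low, high = max_file_cache_range_kb(ram_mb)
    max_kb = high if use_upper else low
    return f"[VCACHE]\nMinFileCache = {min_kb}\nMaxFileCache = {max_kb}\n"

print(vcache_section(32))
```

Back up SYSTEM.INI before editing it by hand.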

System type and Read Ahead optimizations

These parameters can be adjusted in Control Panel -> System -> Performance -> Files.

The type can be set to “server”, “desktop” or “mobile”, desktop being the default. With “server”, the system gives higher priority to disk I/O, which usually helps a bit with disk intensive applications such as audio.

The read ahead optimization setting is at maximum (64 KB) by default. Its influence varies depending on the application. I recommend leaving it at maximum. Some audio programs recommend the minimum, or set it there automatically, but this usually results in very little improvement for those programs and a significant degradation for others.

Measuring performance

I wrote a small but accurate and reliable program to measure disk performance called DskBench. It measures sustained transfer speeds for both reading and writing, and also multiple file reads using various block sizes (which shows us performance for multi-track audio programs). For all measurements, CPU usage is also shown.
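For readers who want to see the kind of measurement involved, here is a minimal sketch of a sequential read test with a selectable block size. It is not DskBench and does not reproduce its method: it does not bypass the operating system’s file cache and does not report CPU usage, so on a small or recently written file it will show rates far above what the disk can really sustain. The temporary file and its 4 MB size are purely for demonstration.

```python
import os
import tempfile
import time

def sequential_read_mb_s(path, block_bytes=64 * 1024):
    """Read the whole file in fixed-size blocks and return MB/s.
    Note: the OS file cache is NOT bypassed, so results can be inflated."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_bytes)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return total / (1024 * 1024) / max(elapsed, 1e-9)

# Demo on a small temporary file. A meaningful test needs a file much
# larger than RAM so that the cache cannot serve the reads.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(4 * 1024 * 1024))
    demo_path = tmp.name
try:
    for block_kb in (4, 32, 64, 128):
        rate = sequential_read_mb_s(demo_path, block_kb * 1024)
        print(f"{block_kb:>3} KB blocks: {rate:.1f} MB/s")
finally:
    os.remove(demo_path)
```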

I don’t know of any other single program that provides all this information.

Cakewalk optimization

Here are some good ways you can further improve the number of tracks you can get from Cakewalk 6.0. You’ll want to check out my recommendations for version 7.0 as well.

As can be verified from the results of DskBench (and from theory also, of course), the performance grows as the block size that is read from each track is raised. This is because the ratio read_bytes/seeks improves, and seeks penalize performance significantly. Optimum values are 64 KB to 128 KB.
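The effect of the read_bytes/seeks ratio can be sketched with a very simple model: every block read costs one access (seek plus rotational latency) followed by a sequential transfer. The model and its figures (a 7 MB/s disk and a 14 ms access time) are assumptions chosen for illustration; real numbers should come from DskBench, not from this estimate.

```python
def multi_file_throughput_mb_s(block_kb, sustained_mb_s, access_ms):
    """Throughput when each block read is preceded by one access
    (seek + rotational latency), then read sequentially."""
    block_mb = block_kb / 1024.0
    transfer_s = block_mb / sustained_mb_s
    return block_mb / (access_ms / 1000.0 + transfer_s)

def max_tracks(throughput_mb_s, sample_rate=44100, bytes_per_sample=2, channels=1):
    """Whole tracks that fit into the given throughput."""
    bytes_per_track_s = sample_rate * bytes_per_sample * channels
    return int(throughput_mb_s * 1024 * 1024 // bytes_per_track_s)

SUSTAINED_MB_S = 7.0  # example disk
ACCESS_MS = 14.0      # assumed average access time

for block_kb in (4, 16, 32, 64, 128):
    rate = multi_file_throughput_mb_s(block_kb, SUSTAINED_MB_S, ACCESS_MS)
    print(f"{block_kb:>3} KB blocks: {rate:.2f} MB/s, "
          f"about {max_tracks(rate)} mono 16-bit tracks")
```

In this model the throughput climbs steeply with the block size and approaches the sustained rate only when the transfer time per block dominates the access time.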

But with Cakewalk we can’t specify this parameter. It uses a block size indirectly derived from the DMA block size for the soundcard, and the resulting value is too small, which gives poor performance and fewer tracks than the disk could deliver.

I don’t want to give many more details about Cake’s internals, but I’ll explain how to achieve block sizes up to 32 KB with Cakewalk 6.x, which in most cases will be a big improvement. Note that, depending on your current DMA size, the block sizes can be as small as 4 KB right now. This applies to Cakewalk 6.x. I doubt that Cakewalk 7.x will fix this issue completely, but surely there will be a way to get at least 64 KB block sizes. The folks at Cakewalk have informed me that after release 7, they will implement a better file streaming scheme.

Now, the trick. Follow the instructions:

1) Check the boxes for read and write caching in Settings -> Audio -> Advanced. File caching doesn’t help audio, but Cakewalk uses even smaller block sizes when the cache boxes are unchecked, and the BlockFactor provision in Aud.ini doesn’t work because of a bug (leave it at 1). With the cache boxes checked, Cake uses a somewhat cleaner file access scheme, resulting in better performance.
2) Check whether your sound card can “Use Wave Out Position for Timing”. If so, you are lucky! Check it and set the DMA buffer size to 61440 regardless of what the wave profiler says. Now you get 32 KB block sizes! And more tracks, and more reliable playback and recording! I’m now getting 20 solid tracks from a 7 MB/s IDE disk, even with 500 ms low latency enabled! What you can expect is what DskBench says for 32 KB block sizes.
3) If “Use Wave Out Position for Timing” doesn’t work with your soundcard, as indicated by out-of-sync audio and MIDI, then you must leave it unchecked and try the biggest DMA block size possible that meets sync requirements. You can usually set values that are multiples of what Wave Profiler set. Never set values larger than 64000. The bigger the DMA block size, the bigger the disk block size and the better the performance.
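For step 3, the candidate values can be listed mechanically. This sketch enumerates the multiples of the Wave Profiler value that stay within the 64000 limit, largest first, so they can be tried in that order until sync holds. The profiler value of 4096 is a hypothetical example; use whatever Wave Profiler set on your system.

```python
MAX_DMA_BUFFER = 64000  # upper limit given in step 3

def candidate_dma_sizes(profiler_size, limit=MAX_DMA_BUFFER):
    """Multiples of the Wave Profiler value, largest first, not above the limit."""
    if profiler_size <= 0:
        raise ValueError("profiler_size must be positive")
    return sorted(range(profiler_size, limit + 1, profiler_size), reverse=True)

PROFILER_SIZE = 4096  # hypothetical value set by Wave Profiler
sizes = candidate_dma_sizes(PROFILER_SIZE)
print("try in this order:", sizes[:5], "...")
```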

NOTE: Cakewalk file streaming is not very good, but it doesn’t mean that this program is bad: it’s in fact one of the better audio + MIDI sequencers out there, and the best of all in my opinion (as long as I forget the stereo-only audio support, which is supposed to be corrected in version 7).

SOME EQUIPMENT DATA

In this section I list some of the best-performing disks and controllers available at the moment. I have no connection with the marketing of these products; this is only an independent and unofficial source.

IDE disks

– Seagate Medalist Pro 7200 rpm. New model.
– IBM DeskStar (5400 rpm). The 9 GB version sustains more than 9 MB/s!
– Fujitsu MT30XX (7-8 MB/s for the 4.3 GB model), Seagate Medalist Pro 5400 rpm (some models only, 7-8 MB/s).

SCSI disks

– IBM 9XL. 9 GB, UltraWide, 10,000 rpm. The leader now.
– Seagate Cheetah. 9 GB, UltraWide, 10,000 rpm. Very near in performance to the IBM.

I don’t recommend SCSI drives of 7200 rpm or below, because their performance benefit over the latest IDE disks is too small to justify the big price difference.

SCSI controllers (UW, bus mastering)

– Diamond Fireport 40. Great for the money! Even faster than the 2940UW.
– Adaptec 2940 UW. Well known performer.

Credits

– Pete Leoni, aka pdemotech, for having the stomach to experiment with the cache options in Cakewalk. This, combined with my previous analysis of read block sizes, was the base for the final Cakewalk optimization conclusions covered here.
– Marc Miller, for giving information about the CRC issue in the IDE DMA mode.