Storage Pooling Round Robin (aka I/O and Space balancing)

Behavior of the new Storage Pooling engine

The new Storage Pooling engine in FlexRAID defaults to a file management algorithm that favors energy efficiency.
The algorithm is designed to keep files together within their parent folders so that data access activates as few disks as possible.

For most users, this is the preferred mode to keep. In this mode, if you were to take a disk out of the array and access it on its own, you would find most folders complete rather than having their files split across other disks.
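The effect of keeping files with their parent folders can be illustrated with a minimal sketch. This is not FlexRAID's actual implementation: the function names `place_by_folder` and `disks_touched`, and the rule that each new folder is assigned to the next disk in turn, are assumptions made purely for illustration.

```python
import posixpath


def place_by_folder(paths, disk_count):
    """Map each file path to a disk index, keeping a folder's files together.

    Hypothetical policy: the first time a folder is seen it is assigned to
    the next disk in turn; every later file in that folder follows it.
    """
    folder_to_disk = {}
    placement = {}
    for path in paths:
        folder = posixpath.dirname(path)
        if folder not in folder_to_disk:
            folder_to_disk[folder] = len(folder_to_disk) % disk_count
        placement[path] = folder_to_disk[folder]
    return placement


def disks_touched(placement, folder):
    """Return the set of disks that must spin up to read one folder."""
    return {
        disk
        for path, disk in placement.items()
        if posixpath.dirname(path) == folder
    }


if __name__ == "__main__":
    files = ["/music/a.flac", "/music/b.flac", "/photos/1.jpg", "/photos/2.jpg"]
    placement = place_by_folder(files, disk_count=3)
    print(disks_touched(placement, "/music"))
```

Under this assumed policy, reading any single folder touches exactly one disk, which is the property that lets the other disks stay idle.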

Round Robin (aka I/O and Space balancing)

The optimization discussed here is rather crude, but it generally works as intended. There will be hits and misses, but overall the improvement will materialize under the appropriate workload.

Round Robin

For users who want better parallel file access performance, the new Storage Pooling engine offers an alternate algorithm that randomly places files onto different disks. With this algorithm, a typical folder will have its files spread across multiple disks, so reading that folder involves reading from multiple disks. If the access is done in parallel using different threads, you could achieve performance similar to that of RAID 0, or even better in the ideal theoretical case.
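As a rough illustration of random placement, the sketch below assigns each file to a randomly chosen disk and reports how many disks a single folder ends up spanning. The names `place_randomly` and `disks_touched`, the uniform random choice, and the `seed` parameter are assumptions for this example; the engine's real selection logic is not described here.

```python
import posixpath
import random


def place_randomly(paths, disk_count, seed=None):
    """Map each file path to a disk index chosen uniformly at random.

    The seed parameter exists only to make this illustration repeatable.
    """
    rng = random.Random(seed)
    return {path: rng.randrange(disk_count) for path in paths}


def disks_touched(placement, folder):
    """Return the set of disks that hold at least one file of the folder."""
    return {
        disk
        for path, disk in placement.items()
        if posixpath.dirname(path) == folder
    }


if __name__ == "__main__":
    files = [f"/videos/clip{i}.mkv" for i in range(200)]
    placement = place_randomly(files, disk_count=4, seed=1)
    print(sorted(disks_touched(placement, "/videos")))
```

With many files in a folder, the folder will almost certainly span every disk, so a multi-threaded reader can pull from all of them at once. A single-threaded reader gains nothing from this layout.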

Although the key design goal of this algorithm is I/O balancing, a side effect is that it will somewhat balance out the free space of the disks that are part of the array. Please note that space balancing is not a goal here and that no attempt is made to even out free space across the disks. So, you might find your disks unevenly filled at times, especially when starting with existing data.

In Transparent RAID (tRAID), this feature benefits only parallel read operations. Write operations will see no benefit as the parity disks are the bottleneck in tRAID.
In RAID over File System (RAID-F), both parallel read and write operations benefit from this mode.
In RAID-F with this mode enabled, you should configure your copy client to start multiple worker threads to take advantage of the increased parallel I/O performance.
In tRAID, you should configure your copy client to queue the files such that they are copied sequentially whenever possible. Attempting a parallel copy process might have the opposite effect and degrade the overall write performance.
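The two copy strategies above can be sketched with a small copy helper. This is only an illustration using Python's standard library, not a FlexRAID tool: the function name `copy_files` and its `workers` parameter are hypothetical, and the right worker count for a given RAID-F pool is something you would need to measure.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def copy_files(sources, dest_dir, workers=1):
    """Copy files into dest_dir using a pool of worker threads.

    workers=1 copies the files one after another in the order given
    (the approach suggested for tRAID); workers>1 copies several files
    at once (the approach suggested for RAID-F with this mode enabled).
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    def copy_one(src):
        # copy2 preserves timestamps and returns the destination path
        return shutil.copy2(src, dest / Path(src).name)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the input list
        return [Path(p) for p in pool.map(copy_one, sources)]
```

For example, `copy_files(files, "/mnt/pool/incoming", workers=4)` would start four parallel copies, while leaving `workers` at 1 queues the files sequentially. The destination path shown is a placeholder.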

As the benefit of this mode is only achieved when parallel access is involved, it should be enabled only in deployments with heavy user load. Even then, you should carefully evaluate whether the increased parallel performance is worth having files placed randomly across the disks and losing energy efficiency.
