
From BFS to ZFS: past, present, and future of file systems

From the mainframe to the Mac mini and from BFS to ZFS, Ars explores the past …

IBM and Microsoft duke it out

OS/2 and HPFS 

Most kids won't remember it today, but IBM once briefly toyed with the idea of competing directly with Microsoft for the prize of personal computer operating system dominance. Even more unusual was the fact that this competition was originally a partnership.

IBM's OS/2 Warp (Version 3.0)

Even IBM, with its 10,000 layers of management and more bureaucracy than the Soviet Union, realized that DOS was badly in need of a replacement. IBM decided that it was going to design a successor—brilliantly named OS/2—which it would then fully own, but which Microsoft would do all the work of actually writing. Steve Ballmer, back before he was known for jumping up and down and throwing chairs, once described how IBM was viewed by the computing industry back then. "They were the bear, and you could either ride the bear, or you could be under the bear!" So Microsoft went along with this crazy plan.

OS/2 was to be a multitasking operating system, with a fancy GUI that was to be bolted on later. It took forever to arrive, had difficulty running DOS applications, and required more RAM than most computer users could afford in their lifetimes, so it went over about as well as New Coke. For version 1.2, which was released in 1989, IBM wanted a new file system to replace the awful FAT. Thus was born HPFS, for High Performance File System, written by a small team led by Microsoft employee Gordon Letwin.

HPFS used B-Trees, supported 255-character file names, and used extents. The root directory was stored in the middle of the disk rather than at the beginning, for faster average access times. It did not support journaling, but it did support forks and had extensive metadata abilities. These new metadata were called Extended Attributes and could be stored even on FAT partitions, where they were saved in a file called "EA DATA. SF" (spaces included). Extended attributes were also supported in HFS+, but were not exposed in an Apple OS until Mac OS X 10.4.
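For anyone who hasn't met the term, an extent describes a run of contiguous blocks as a single (start, length) pair instead of listing every block one by one. The sketch below only illustrates that idea; the function name and the tuple layout are made up for the example and say nothing about how HPFS actually lays out its structures on disk.

```python
def blocks_to_extents(blocks):
    """Collapse an ordered list of block numbers into (start, length) extents.

    Illustrative only: a hypothetical in-memory model, not the HPFS format.
    """
    extents = []
    for block in blocks:
        if extents and extents[-1][0] + extents[-1][1] == block:
            # This block directly follows the previous run, so extend it.
            start, length = extents[-1]
            extents[-1] = (start, length + 1)
        else:
            # A gap on disk: start a new run of length 1.
            extents.append((block, 1))
    return extents
```

A file occupying blocks 100 through 103 and then 500 and 501 needs six entries in a block-by-block list but only two extents, (100, 4) and (500, 2).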

Microsoft and IBM then went through a rather messy divorce right around the time Windows 3 was ready to be released. (IBM wanted to own that, too, and Microsoft really really didn't want that.) Microsoft refocused its efforts on Windows after version 3 became a smash success, while IBM kept the code it had and added a bunch of extra user interface code they had lying around from various dalliances with Apple and NeXT. Thus was born OS/2 2.0 and the object-oriented Workplace Shell, which had a brief day in the sun (and was even advertised, bizarrely, at the Fiesta Bowl) before Windows 95 arrived and crushed it into the ground. IBM later ported JFS to OS/2, much to the delight of the three people who still used it.

NTFS

Windows NT 3.1 install CD. Note supported platforms!

Microsoft also knew that DOS needed a replacement, but was soured on its experience with IBM. In Bill Gates' second spectacular application of the Theory of Laziness, he hired Dave Cutler, the architect of DEC's rock-solid VMS operating system, just as DEC was going into a downward spiral from which it would never recover.

Dave Cutler took his team with him, and despite lawsuits from the dying DEC, built a clean-room implementation of a brand-new operating system. Along with a brand-new OS came a brand-new file system, which initially didn't have a name, but was later dubbed NTFS when the OS itself was named Windows NT. NT was a marketing name that stood for New Technology, but it was still an amusing coincidence that WNT was VMS with each letter replaced by the next one.

NTFS was an all-out, balls-to-the-wall implementation of all the best ideas in file systems that Cutler's team could think of. It was a 64-bit file system with a maximum file and volume size of 2^64 bytes (16 exabytes) that stored all file names in Unicode so that any language could be supported. Even the file date attributes were stretched to ridiculous limits: Renaissance time-travelers can happily set their file dates as early as 1601 AD, and dates as late as 60056 AD are supported as well, although if humanity is still using NTFS by that time, it will indicate something is seriously wrong with our civilization. It was first unveiled to the public with the very first release of Windows NT (called version 3.1 for perverse marketing reasons) that came out in 1993.
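Those odd-looking limits fall out of simple arithmetic. The sketch below assumes something the text above doesn't spell out: that NTFS timestamps follow the Windows FILETIME convention, an unsigned 64-bit count of 100-nanosecond ticks since January 1, 1601. The function names are invented for the example, and the year figure is approximate because it uses the average length of a Gregorian year.

```python
from datetime import datetime

# Assumption: timestamps are modeled as FILETIME values, i.e. an unsigned
# 64-bit count of 100-nanosecond ticks since 1601-01-01.
EPOCH = datetime(1601, 1, 1)
TICKS_PER_SECOND = 10_000_000
SECONDS_PER_YEAR = 31_556_952  # average Gregorian year (365.2425 days)


def to_ticks(moment):
    """Convert a naive datetime to a tick count since the 1601 epoch."""
    delta = moment - EPOCH
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * TICKS_PER_SECOND + delta.microseconds * 10


def last_year(bits=64):
    """Approximate calendar year in which a counter of this width runs out."""
    span_years = (2 ** bits // TICKS_PER_SECOND) // SECONDS_PER_YEAR
    return EPOCH.year + span_years


def exabytes(bits=64):
    """Addressable bytes for this width, in binary exabytes (2^60 bytes)."""
    return 2 ** bits // 2 ** 60
```

Under those assumptions, `last_year()` gives 60056 and `exabytes()` gives 16, matching the figures quoted above, while `to_ticks(datetime(1970, 1, 1))` shows how far the Unix epoch sits from the 1601 starting point.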

NTFS used B+Trees (an enhanced and faster version of B-Trees also supported in HFS+), supported journaling from day one, had built-in transparent compression abilities, and had extremely fine-grained security settings by using Access Control Lists (ACLs were added with NT 3.5, released in 1994). It was designed so that Microsoft could add extra metadata support until the cows came home. Indeed, NTFS's support for metadata was so extensive that sometimes it took Microsoft's operating system team a while to catch up with features that were already there.

For example, super-fast indexed searching of both files and metadata had been available in NTFS since the release of Windows 2000, but it took until 2005 before Microsoft released a graphical interface that supported this, and it didn't become a part of the operating system itself until Windows Vista in 2006. Hey, sometimes things get forgotten. Anyone up for redesigning the Add New Font dialog box?

Additional features were added to NTFS in later versions of Windows NT. Version 3.0, released with Windows 2000, added the aforementioned indexed metadata searching, along with encryption and disk quotas so that students could no longer fill up the file server with pirated MP3s.

NTFS stores all of its metadata information in separate files, hidden from the regular operations of the OS by having filenames that start with the $ character. By storing everything as a file, NTFS allows file system structures to grow dynamically. The file system also resists fragmentation by attempting to store files where there is enough contiguous space, not just in the first available space on the drive.
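The difference between "first available space" and "enough contiguous space" is easy to see in a toy model. This is a sketch under stated assumptions, not NTFS's actual allocator: free space is a list of hypothetical (start, length) runs in disk order, and the contiguous strategy picks the smallest run that fits (a best-fit choice made up for this example).

```python
def allocate_first_available(free_runs, size):
    """Grab free clusters in disk order, splitting the file when a run is short."""
    pieces = []
    remaining = size
    for start, length in free_runs:
        if remaining == 0:
            break
        take = min(length, remaining)
        pieces.append((start, take))
        remaining -= take
    if remaining:
        raise ValueError("not enough free space")
    return pieces


def allocate_contiguous(free_runs, size):
    """Prefer a single run big enough for the whole file."""
    fits = [run for run in free_runs if run[1] >= size]
    if fits:
        # Assumption: choose the smallest run that fits (best-fit).
        start, _ = min(fits, key=lambda run: run[1])
        return [(start, size)]
    # No single run is large enough, so fall back to splitting the file.
    return allocate_first_available(free_runs, size)
```

With free runs of `[(10, 3), (50, 2), (200, 8), (400, 20)]` and a six-cluster file, the first strategy scatters the file across three fragments, while the second places it in one piece starting at cluster 200.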

To ease the transition from FAT-based operating systems to NTFS ones, Microsoft provided a handy tool for users of Windows 98 and earlier that would safely convert their FAT16 and FAT32 partitions to NTFS. It wouldn't go the other way around, but honestly, would you want to?

The only thing anyone could really find to complain about with NTFS was that its design was a proprietary secret owned by Microsoft. Despite this challenge, open-source coders were able to reverse-engineer support for reading (and, much later, writing) NTFS partitions from other operating systems. The NTFS-3G project provides read and write access to NTFS partitions on Linux and a number of other operating systems.
