File space allocation

FAT

A clever take on the linked-list approach is to move the pointers out of the blocks themselves and into a 'file allocation table', or FAT. This is an array with one entry per disk block. The directory entry gives the number of the file's first block; the FAT entry for that block gives the number of the next block, whose FAT entry gives the number of the block after that, and so on. Special codes in FAT entries mark unused (free) blocks and bad blocks.
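
To make the idea concrete, here is a minimal sketch in C of a FAT as a plain in-memory array. The special codes (FAT_FREE, FAT_BAD, FAT_EOF) and the block numbers are made up for illustration; this is not MS-DOS's real on-disk format.

    /* Minimal sketch: the FAT is an array with one entry per disk block.
     * Special (made-up) values mark free blocks, bad blocks and the end
     * of a file's chain; any other value is the number of the next
     * block of that file. */
    #include <stdio.h>

    #define NBLOCKS   16
    #define FAT_FREE  (-1)      /* block is unused                  */
    #define FAT_BAD   (-2)      /* block is unusable                */
    #define FAT_EOF   (-3)      /* last block in a file's chain     */

    static int fat[NBLOCKS];    /* one entry per physical disk block */

    /* Print every physical block of a file, in order, given the number
     * of its first block (which comes from the directory entry). */
    static void list_blocks(int first)
    {
        for (int b = first; b != FAT_EOF; b = fat[b])
            printf("%d ", b);
        printf("\n");
    }

    int main(void)
    {
        for (int b = 0; b < NBLOCKS; b++) fat[b] = FAT_FREE;
        fat[6] = 8; fat[8] = FAT_EOF;   /* a hypothetical two-block file */
        list_blocks(6);                 /* prints: 6 8 */
        return 0;
    }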

For example, imagine that a disk has a FAT which looks like the one to the right

...and one of the directories on the disk---the root directory, in fact---looks like this:

AUTOEXEC.BAT    <other info>    6
MYFILE.TXT      <other info>    4
SECRET.DOC      <other info>    17

Let's say we are interested in block #3 of the file "MYFILE.TXT". The directory entry lists the file as starting at physical disk block #4. Following the chain of pointers through the FAT, we find that block #3 of the file (its fourth block, since a file's blocks are numbered from zero) is physical disk block #10. Now we can read or write that block from/to the disk.
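
A sketch of that lookup in C: given the first block from the directory entry, follow the chain through the FAT once per logical block. The FAT contents below are invented (the figure isn't reproduced here), but they are chosen to match the walkthrough: MYFILE.TXT starts at block #4 and its block #3 comes out as block #10.

    /* Find the physical block holding logical block 'n' of a file,
     * given the file's first block from the directory entry. */
    #include <stdio.h>

    #define FAT_EOF (-3)

    static int logical_to_physical(const int *fat, int first, int n)
    {
        int b = first;
        while (n-- > 0 && b != FAT_EOF)
            b = fat[b];                 /* follow the chain n times */
        return b;                       /* FAT_EOF if the file is too short */
    }

    int main(void)
    {
        int fat[16] = {0};
        /* hypothetical chain for MYFILE.TXT: 4 -> 7 -> 2 -> 10 */
        fat[4] = 7; fat[7] = 2; fat[2] = 10; fat[10] = FAT_EOF;

        printf("block #3 of MYFILE.TXT is physical block %d\n",
               logical_to_physical(fat, 4, 3));    /* prints 10 */
        return 0;
    }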

This system too has low overhead, and keeps many of the advantages of the linked-list method while still allowing random access (provided the FAT is kept in memory). Another advantage over the linked-list approach is that allocation does not need a free chain: the allocator just looks for FAT entries marked 'unallocated', so fragmentation of the free space is less of an issue.
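
A rough sketch of that allocation policy, reusing the made-up FAT codes from above: to grow a file, scan the FAT for any entry marked free, claim it, and link it onto the end of the file's chain. The function name append_block is hypothetical, not a real MS-DOS routine.

    #include <stdio.h>

    #define NBLOCKS  16
    #define FAT_FREE (-1)
    #define FAT_EOF  (-3)

    static int fat[NBLOCKS];

    /* Append one newly allocated block to the file whose last block is
     * 'last' (pass -1 to start a new file).  Returns the new block
     * number, or -1 if no free block is left. */
    static int append_block(int last)
    {
        for (int b = 0; b < NBLOCKS; b++) {
            if (fat[b] == FAT_FREE) {
                fat[b] = FAT_EOF;       /* new block ends the chain        */
                if (last >= 0)
                    fat[last] = b;      /* old last block now points to it */
                return b;
            }
        }
        return -1;                      /* disk full */
    }

    int main(void)
    {
        for (int b = 0; b < NBLOCKS; b++) fat[b] = FAT_FREE;
        int first  = append_block(-1);     /* start a new file */
        int second = append_block(first);  /* extend it by one block */
        printf("file occupies blocks %d and %d\n", first, second);
        return 0;
    }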

The down-side is that the FAT puts all of the filesystem's eggs into one basket. Losing the FAT makes the rest of the data on the disk useless (so two copies are typically kept, since it's quite small anyway). Also, random access to files requires random access to the FAT: if the disk is big, the FAT is big, and is either expensive to keep in memory or must be paged into memory.

e.g. If disk blocks are 1024 bytes and the disk is 1 GB, there are 1M disk blocks. If disk addresses are 4 bytes, the FAT is 4 MB.
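
The same arithmetic, spelled out in code (the sizes are just those from the example):

    #include <stdio.h>

    int main(void)
    {
        long long disk_bytes  = 1LL << 30;   /* 1 GB disk       */
        long long block_bytes = 1024;        /* 1 KB blocks     */
        long long entry_bytes = 4;           /* one FAT entry   */

        long long nblocks  = disk_bytes / block_bytes;   /* 1M blocks */
        long long fat_size = nblocks * entry_bytes;      /* 4 MB      */

        printf("%lld blocks, FAT is %lld bytes (%lld MB)\n",
               nblocks, fat_size, fat_size >> 20);
        return 0;
    }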

MS-DOS uses this technique for both floppies and hard disks. The trouble is that the FAT gets really big on hard disks, especially multi-gigabyte drives. One solution is to thin the FAT by making the blocks bigger. This is analogous to making the page size larger in a paging system in order to reduce the size of the page table.

With large disks, the options are:

  1. Keep the FAT in memory, despite its size (expensive in memory)
  2. Page in parts of the FAT as they are needed (reduces performance)
  3. Make the disk blocks bigger (inefficient use of disk space)

MS-DOS allows disk blocks of up to 32K on large disks. This is tremendously wasteful of disk space: on average, about 16K is wasted per file due to internal fragmentation (the unused part of each file's last block).
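
A back-of-the-envelope check of that waste, assuming a hypothetical disk holding 10,000 files; the half-a-block-per-file figure is the usual expected loss from internal fragmentation:

    #include <stdio.h>

    int main(void)
    {
        long block_size = 32 * 1024;        /* 32K blocks           */
        long avg_waste  = block_size / 2;   /* ~16K per file        */
        long nfiles     = 10000;            /* hypothetical disk    */

        printf("expected waste: about %ld MB across %ld files\n",
               (avg_waste * nfiles) >> 20, nfiles);
        return 0;
    }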


last updated 3 April 1998