FAT: A primer

This is a tl;dr version of how the FAT file system works. If you want a long version, here’s a link.

FAT stands for File Allocation Table, and we’ll soon see how it got this name. The important question we will answer here is, “Why is there a discrepancy between the actual file size and the size on disk?” This all boils down to something called the Cluster Size, a.k.a. Allocation Unit Size (AUS). The minimum space a file can take up on disk is the size of the AUS.

Why is this so?

The hard drive is split into chunks, each the size of the AUS, called Allocation Units (AUs). Below we can see 4 AUs of 4KB each, making up 16KB of disk space.

Now if we store a 1KB file, file A, the filesystem simply puts it in an available AU (say the first one) and records that file A is stored at index 1. We also introduce a table, unsurprisingly called the file allocation table, to keep track of this information.
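Here is a minimal sketch of that bookkeeping, assuming the simplified table from this post (file name to AU indices) rather than the real on-disk FAT layout. The names `store`, `AU_SIZE` and `NUM_AUS` are made up for illustration.

```python
AU_SIZE = 4 * 1024   # 4KB allocation unit, as in the example
NUM_AUS = 4          # 4 AUs make up our 16KB drive


def store(table: dict, name: str, size: int) -> list:
    """Put a file in the first free AUs and record their 1-based indices."""
    needed = -(-size // AU_SIZE)  # round up to whole AUs
    used = {i for indices in table.values() for i in indices}
    free = [i for i in range(1, NUM_AUS + 1) if i not in used]
    if len(free) < needed:
        raise OSError("not enough free allocation units")
    table[name] = free[:needed]
    return table[name]


table = {}                    # our toy file allocation table
store(table, "A", 1 * 1024)   # a 1KB file
print(table)                  # {'A': [1]}
```

A dictionary keeps the sketch short; it only mirrors the "file, location" idea and says nothing about how the table is laid out on a real drive.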

If we were to store another file of 6KB, it would span two AUs and look like this.

You can see that any space within an AU that is not taken up by the file is just wasted: the 1KB file wastes 3KB of its AU, and the 6KB file wastes 2KB of its second AU. This is why a file usually takes up more space on disk than its actual size.
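The rounding behind "size on disk" can be sketched in a few lines, assuming the 4KB AUS from the example. The function name `size_on_disk` and the file name "B" are made up for illustration.

```python
AU_SIZE = 4 * 1024  # 4KB allocation unit, as in the example above


def size_on_disk(file_size: int, au_size: int = AU_SIZE) -> int:
    """Round file_size (in bytes) up to a whole number of allocation units."""
    units = -(-file_size // au_size)  # ceiling division using integers only
    return units * au_size


for name, size in [("A", 1 * 1024), ("B", 6 * 1024)]:
    on_disk = size_on_disk(size)
    print(name, size, on_disk, on_disk - size)
# A 1024 4096 3072
# B 6144 8192 2048
```

The last column is the wasted space: 3KB for the 1KB file and 2KB for the 6KB file.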

Why would they do that?

Let’s say we have our 16KB hard drive, that’s 16,384B.

If a file could start at any byte, the Location part of the table would have to store an address in the form of a number that could potentially go up to 16,384.

This number will take log₂(16384) = 14 bits just to store.

This means that every file will have 14 bits of overhead just to store the address, and this gets bigger as our drive gets bigger.

Now if we cut it up into 4KB blocks, we only have 4 blocks to keep track of, meaning the address will only require log₂(4) = 2 bits to store.
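The two numbers can be checked with a short sketch. The helper name `address_bits` is made up for illustration, and it assumes addresses start at 0.

```python
def address_bits(num_locations: int) -> int:
    """Bits needed to give each of num_locations slots a distinct address."""
    # The largest address is num_locations - 1; bit_length() is its width.
    return (num_locations - 1).bit_length()


DRIVE_SIZE = 16 * 1024  # 16KB drive = 16,384 bytes
AU_SIZE = 4 * 1024      # 4KB allocation unit

print(address_bits(DRIVE_SIZE))             # 14 bits to address every byte
print(address_bits(DRIVE_SIZE // AU_SIZE))  # 2 bits to address every AU
```

For power-of-two counts like these, the result matches log₂ of the count.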

So now you have a basic idea of how the FAT filesystem works. If you want to know more about different aspects like how folders work, you can head over to this link to learn more. In the meantime, how about we go back to our main post?

Chai Jia Xun