The computer never stores a file in one big chunk. It stores them in 'clusters,' and that's why things are never what they seem on a disk drive.
  technofile
Al Fasoldt's reviews and commentaries, continuously available online since 1983

How your computer really stores things


By Al Fasoldt
Copyright © 1993, The Syracuse Newspapers

    I have this crazy idea for a sitcom. There's this office, see, and the filing clerk goes bananas one day and mixes up all the files and folders, putting the ones that start with "A" after the ones that start with "G," and the ones that start with "Z" in front of the ones that start with "S." When the boss tells the clerk to toss out old files, this ne'er-do-well leaves them right in the folders. And the clerk makes the filing cabinets spin round and round while trying to grab the right file, see, and ... well, I guess it wouldn't sell on prime-time TV.
    But it sells in the billions in the computer industry. It's the crazy way your disk drive works when it stores files. Knowing a little about this oddball process can work wonders when you want to retrieve a deleted file or when you are trying to add some oomph to a fading hard drive.
    Like the clerk who went berserk in my TV comedy, your disk drive sometimes scatters files all over the disk. They're often completely out of order, in two ways.
    First, files aren't necessarily stored in the same order that you put them on the disk. Today's letter to Aunt Emmy might end up ahead of last week's note to the plumber.
    Second, parts of files are themselves scattered aimlessly about. The part of your letter that says "Hi, Aunt Emmy! Thanks for the birthday card!" might be located right next to the part of that other note that says "Fix that sink before I call the Better Business Bureau!"
    Aunt Em would surely be surprised if she found that item in the letter you sent to her -- and you would be, too. But, fortunately, the computer's operating system usually keeps track of all those parts of your files when it's time to do something with them.
    In fact, that's just what we need to talk about. Why are the parts of files scattered all over? And why does it matter? It helps to know that the computer never stores a file in one big chunk. It always stores them in what are called clusters. These clusters are like buckets, all the same size.
    Let's say they're all 1-gallon buckets. If you need to store a little less than a gallon of water, it will fit nicely in that bucket. If you need to store a little more than a gallon of water, what do you do? If you're a computer, you fill the first bucket up and pour the rest, just a few drops, into the second bucket. Then you put those buckets away.
    Even though the second bucket is mostly empty, the computer won't put anything else in it. When it has to store something else, it looks for another bucket.
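    The bucket arithmetic is easy to try for yourself. Here is a minimal sketch in Python. The 2,048-byte cluster size is only an assumption for illustration (real disks use different sizes), and the function names are made up for this example.

```python
import math

# Assumed cluster ("bucket") size in bytes. Real disks vary.
CLUSTER_SIZE = 2048


def clusters_needed(file_size, cluster_size=CLUSTER_SIZE):
    """Return how many whole clusters a file of file_size bytes occupies."""
    return math.ceil(file_size / cluster_size)


def wasted_bytes(file_size, cluster_size=CLUSTER_SIZE):
    """Return the unused space left in the file's last, partly filled cluster."""
    return clusters_needed(file_size, cluster_size) * cluster_size - file_size


# A file just under one cluster fits in a single bucket.
print(clusters_needed(2000), wasted_bytes(2000))
# A file just over one cluster needs a second bucket that stays mostly empty.
print(clusters_needed(2100), wasted_bytes(2100))
```

    In this sketch, the second file is only 100 bytes bigger than the first, yet it ties up a whole extra cluster that nothing else can use.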
    Here's where the story starts getting crazy. Suppose that letter to Aunt Emmy is so long that it needs four buckets -- four clusters, in other words. Does the computer look for four empty clusters in a row to store your friendly phrases? No way. It tells the disk drive to grab the first four empty clusters that it comes to. On a brand new disk that has nothing else on it, those four clusters are probably going to be next to each other. But on a disk that's been around, so to speak, there will be a lot of empty clusters here and there, and the disk drive will take the space wherever it finds it.
    So that letter to Aunt Em might end up in four separate places on the disk. Can this cause a problem? You bet.
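    To see how a file gets chopped up, here is a toy model in Python. It assumes the disk is just a list of slots, where None marks an empty cluster and a name marks a cluster in use. The allocate function and the file names are hypothetical, and a real operating system keeps much more bookkeeping than this.

```python
def allocate(disk, name, count):
    """Claim the first `count` empty clusters for file `name`.

    Returns the list of cluster numbers used, in order.
    """
    free = [i for i, owner in enumerate(disk) if owner is None][:count]
    if len(free) < count:
        raise OSError("not enough free clusters")
    for i in free:
        disk[i] = name
    return free


# A disk that has "been around": empty clusters are scattered here and there.
disk = ["plumber", None, "taxes", None, None, "recipes", None, None]

# The letter needs four clusters and takes the first four it comes to.
print(allocate(disk, "emmy", 4))
print(disk)
```

    On a list with no names in it yet, the same call would hand back four clusters in a row, which matches what happens on a brand new disk.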
    If nothing ever went wrong, this willy-nilly file-chopping method wouldn't matter at all. The computer is able to put everything back together again, and you don't notice that anything is odd.
    That is, if nothing ever goes wrong. Of course, things always go wrong. Or, as one version of Murphy's Law states, things only go wrong when you least expect it.
    The first problem is an overall slowdown in the way your disk drive works. You probably wouldn't notice the slow, cumulative effect of this until it gets really bad -- like the way you don't realize the shock absorbers on your car are going bad until they're totally shot.
    When hundreds or maybe even thousands of files are scattered in pieces all over a disk, the disk drive has to work overtime putting them back together again when it reads files, and finding new nooks and crannies when it saves them. In my own tests, I've seen drives that slowed down to one-half their normal efficiency just from this fragmentation of files.
    The second problem is a lot more serious. Suppose your letter to Aunt Emmy is broken into a half-dozen pieces, and you accidentally erase it. Guess what happens to those pieces when the disk drive saves a few more files?
    Yes, you're right. The space they occupy is put back into the general fund, so to speak, and new pieces of files get stuffed into the places they'd been. Some of the parts of that letter might be around a while, hidden away somewhere on the disk, but any chance you'll get the entire letter back with a regular "undelete" program is either slim or none.
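    Here is a rough sketch in Python of why an erased letter fades away piece by piece. It assumes a toy disk made of two lists, one holding what is written in each cluster and one recording whether that cluster is free. The function names and the sample text are invented for this example.

```python
def delete(free, clusters):
    """Erase a file by marking its clusters free. The contents stay put."""
    for i in clusters:
        free[i] = True


def save(data, free, pieces):
    """Write each piece into the first free cluster found, overwriting it."""
    slots = [i for i, is_free in enumerate(free) if is_free][:len(pieces)]
    if len(slots) < len(pieces):
        raise OSError("disk full")
    for i, piece in zip(slots, pieces):
        data[i] = piece
        free[i] = False
    return slots


data = ["Hi, Aunt Emmy!", "Fix that sink", "Thanks for the card!", "Love, Mary"]
free = [False, False, False, False]
letter = [0, 2, 3]  # the clusters that hold the letter

delete(free, letter)
print([data[i] for i in letter])  # right after erasing, the text is still there

save(data, free, ["Dear IRS,", "enclosed is"])
print([data[i] for i in letter])  # two of the three pieces have been replaced
```

    In this model, only the last piece of the letter survives the new file, which is the "slim or none" situation described above.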
    Unless you change your habits, that is. There's a simple, painless way to do this. It's a two-step process.
    You start by "defragging" your disk. Yes, I know that sounds like something GIs did in Vietnam to officers they didn't like, but it's the term computer mavens use for defragmenting their hard-disk drives.
    If you have MS-DOS 6.0 software (and if you don't, you probably should), all you need to do is type "DEFRAG" at the command line. Otherwise, you can use a commercial program like the one that comes with the Norton Utilities or a shareware program like the excellent one called ORG.
    A defragger puts the pieces of your files back together again, letting your hard drive run at top speed. The added benefit is that files you delete are much easier to recover, since the individual pieces are always right next to each other. On the disk, that letter to Aunt Em will always start with "Hi!" and end with "Love, Mary," and all your chit-chat will be safely tucked in between.
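    What a defragger does can be sketched in a few lines of Python. This is a simplified illustration, not how DEFRAG or any commercial program actually works: it assumes each cluster is either None or a (file name, part number) pair, and it builds a fresh layout instead of shuffling clusters in place the way a real tool must.

```python
def defrag(disk):
    """Return a new layout with each file's parts together and in order.

    Files keep the order in which they first appear on the disk, and all
    the free clusters end up in one block at the end.
    """
    files = {}
    for cluster in disk:
        if cluster is not None:
            name, part = cluster
            files.setdefault(name, []).append(part)

    packed = []
    for name, parts in files.items():
        packed.extend((name, part) for part in sorted(parts))
    return packed + [None] * (len(disk) - len(packed))


disk = [("plumber", 0), ("emmy", 0), ("plumber", 1), None, ("emmy", 1), ("emmy", 2)]
print(defrag(disk))
```

    After the call, the two parts of the plumber note sit side by side, followed by the three parts of the letter, with the empty cluster left over at the end.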
    You add the second step if you really want to be able to get deleted files back safely, no matter what Mr. Murphy is up to. It requires a deletion-tracking program like the one built into DOS 6.0. (Now you have another reason to switch to 6.0.) When you add one of these programs to your computer's startup procedure -- or, to be more accurate, when you let DOS 6.0 do it for you, which takes about 10 nanoseconds -- files that you delete aren't really erased for many days. They're given what you might call a kill date, more or less. (I have to be a little wishy-washy here because the deletion tracker can be set up to work in many different ways.) Any time you need to rescue a file that hasn't reached its kill date, you just type "UNDELETE," and you see a list of all those files on death row. You can do it more easily from Windows, but it works the same way.
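    The kill-date idea can be sketched in Python, too. This is only an illustration of the general scheme, not a description of how the DOS 6.0 tracker works inside. The DeletionTracker class, its method names, and the seven-day holding period are all assumptions made up for the example.

```python
from datetime import date, timedelta


class DeletionTracker:
    """Hold deleted files until an assumed kill date, then drop them."""

    def __init__(self, keep_days=7):
        self.keep = timedelta(days=keep_days)
        self.held = {}  # file name -> (contents, kill date)

    def delete(self, name, contents, today):
        """Set the file aside instead of erasing it right away."""
        self.held[name] = (contents, today + self.keep)

    def purge(self, today):
        """Really erase every file whose kill date has arrived."""
        expired = [n for n, (_, kill) in self.held.items() if today >= kill]
        for name in expired:
            del self.held[name]

    def listing(self, today):
        """Return the names of files that can still be rescued."""
        self.purge(today)
        return sorted(self.held)

    def undelete(self, name, today):
        """Hand back the contents of a file that hasn't reached its kill date."""
        self.purge(today)
        contents, _ = self.held.pop(name)
        return contents


tracker = DeletionTracker(keep_days=7)
tracker.delete("EMMY.TXT", "Hi! ... Love, Mary", date(1993, 6, 1))
print(tracker.listing(date(1993, 6, 5)))           # still on the list
print(tracker.undelete("EMMY.TXT", date(1993, 6, 5)))
```

    Asking for a file after its kill date raises a KeyError in this sketch, which stands in for the tracker telling you the file is gone for good.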
    I don't want to give the impression that this is the only way of dealing with the problem of fragmented files and rescuing something you have erased. There are a few other neat tricks you can use on any computer -- PC, Macintosh, or any other kind -- and we'll take a look at them another time.
    1997 update: Defrag is now part of Windows 95 and later versions of Windows, and can be run from within Windows.