Windows tips: The way Windows handles long file names is not quite what you might have expected. Here's an exclusive look at the way Windows really works with long names.
  technofile
Al Fasoldt's reviews and commentaries, continuously available online since 1983

How to make sense out of Windows' long file names


By Al Fasoldt
Copyright © 1997, Al Fasoldt

   What a treat! Windows 95 and its successors let you name files and folders just about anything you want. Want to name that note to Aunt Mary "That Note to Aunt Mary"? Just do it.
   Or go wild. How about "My Note to Aunt Mary After She Cut Off Bertrand's Inheritance, Along With a Message to the Lawyers About Our Own $4 Million Lawsuit"? No problem.
   What a difference between the old Windows and the new! Under the old Windows, names of files and folders were limited to eight characters for the main part of the name and three for the tag-end part of the name. (It's called the filename extension.) That letter to Aunt Mary would be LTANMRY.DOC or something like that.
   Good riddance, right?
   Don't be so hasty. The latest versions of Windows can make your adventures into file and folder names confusing if you're not careful. And they can trip you up if you don't understand what happens when you share stuff you create on your new PC with someone who has an old version of Windows -- or if you create a file in a modern program and then open it in an old one.
   Let's look at the old way of handling filenames first. It will help you understand what can go wrong.
   Until Windows 95 was introduced, nearly all PCs used a disk operating system generically called DOS. The main version was created by Microsoft, which once called itself Micro-Soft. And so most PCs had an operating system called MS-DOS. (Don't you like it when someone actually explains where these terms come from? I do, and that's why I pass those tidbits on to you.)
   MS-DOS takes charge of a lot of things that happen on a computer. One function it controls is file and folder creation, as well as file and folder deletion and that sort of thing. Under MS-DOS, the names of files and folders are stored in a special place on each disk. This location is checked each time the computer stores or deletes a file or folder. It's also checked at other times. That's how important this location is.
   The place where this information is stored belongs to a scheme called the file allocation table, or FAT. (Strictly speaking, the table itself records where each file's data sits on the disk, and the names live in directory entries alongside it, but the whole arrangement goes by the name FAT.) Under DOS, the FAT is actually pretty thin; each of its internal "slots" has room for only a certain number of characters in a name. That's where the "8 dot 3" (or "8.3") rule comes in: File and folder names cannot be more than eight characters long, and, if you or your programs decide to do it, these names can include a period followed by up to three characters at the end of the name.
   By convention, most users of this old system differentiate between files and folders by putting filename extensions on files but not on folders. Files would be named "MYFILE.TXT" or "LT120497.DOC" and so on, while folders would be named TEXTS or DOODLES, that sort of thing. This convention continues today, even when files and folders can have much longer names. You don't need to follow it -- your folders can have as many periods as you want, followed by other characters, and no one will complain. But keeping these extensions off the names of folders does help give them a sort of brand identity when you see a list on your screen.
   Fade to today. We all know that under the latest versions of Windows, there's no need to limit the names of files and folders to the "8 dot 3" rule. That's because the new versions of Windows bypass the limitations of the FAT to store file and folder names in a much larger space. That space will hold 255 characters for each file and folder.
   At the same time, the new versions of Windows continue to use the old method, right in that ancient FAT space, to store the old "8 dot 3" file and folder names, too. Why does Windows go to all this trouble? To make sure files and folders named under the newer versions of Windows can be seen and worked with by DOS and by the old versions of Windows. (Versions prior to Windows 95 weren't really in charge of file naming; they all used DOS to handle everything relating to files and folders.)
   This is getting complicated. I'll restate this as simply as I can. When you create a file or folder or rename one that already exists, current versions of Windows check to see if the name follows the old "8 dot 3" convention. If it does, the name is stored in the old-style FAT, and that's that. If it doesn't, a so-called long filename -- an LFN in the jargon of power users -- is created in a separate storage area on the disk. And an "8 dot 3" filename is also created.
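   If you like to see such things spelled out, here is a minimal sketch in Python of the test described above. It is my own illustration, not Microsoft's code: the function name fits_8_dot_3 is hypothetical, the list of permitted punctuation is a simplified assumption, and it ignores upper and lower case, which the real thing may not.

```python
import re

# Assumption: a simplified 8.3 pattern. One to eight characters, then an
# optional period followed by one to three characters. The punctuation
# allowed here is a trimmed-down guess at what DOS accepts.
_ALLOWED = r"[A-Z0-9_~!#$%&'()@^{}\-]"
_EIGHT_DOT_THREE = re.compile(
    r"^" + _ALLOWED + r"{1,8}(\." + _ALLOWED + r"{1,3})?$"
)

def fits_8_dot_3(name):
    """Return True if name already fits the old 8.3 pattern (case ignored)."""
    return bool(_EIGHT_DOT_THREE.match(name.upper()))

print(fits_8_dot_3("LT120497.DOC"))               # True
print(fits_8_dot_3("That Note to Aunt Mary.doc")) # False
```

   A name that passes needs no long-name companion; one that fails gets both a long name and a made-up short one.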
   Here's where things get interesting.
   How does Windows know what kind of "8 dot 3" name it should give to an LFN name? Well, you're smart, I'm smart and your Uncle Clyde may be smart, too, but Windows is very dumb. So instead of trying something clever or creative, Windows just grabs the first characters of the long filename and lops off the rest of the name to make the short filename.
   There's one big catch (and a couple of small ones, which we'll get to later). Windows can't literally take the first eight characters of each long filename to create the corresponding short ones, because some short names might be duplicated. Take these two Microsoft Word long filenames as examples:
   Discussion from the Working Group on Fixing Filenames
   Discussion on the Future of Urban Transportation
   If Windows did an eight-character chop job on each name, the short filename for the first LFN file would be "DISCUSSI" and the short filename for the second LFN file would be ... you guessed it ... the same thing. For both technical and purely logical reasons, two files that are in the same place can't have the same name, so Windows does something else when it creates the short versions of names.
   Windows takes the first six -- not eight, mind you -- characters from the long filename, sticks a tilde in the position of character seven, and puts the number 1 in the eighth position. If it finds another file or folder with the same name, tilde and all, it increments the number and checks to see if it's unique. If not, it increments it again, and so on. (Windows probably won't run out of numbers. Once the number needs a second digit, Windows keeps one fewer character from the long name to make room, so it can count up to 999999 -- that's a nudge shy of 1 million, for all you trivia fans -- before it runs into trouble. And a million files won't fit into a single folder anyway, for reasons that you probably don't want to know.)
   (Oh, there I go again, setting myself up. I'll tell you the reason. Skip this section if you want to cut to the chase. Disks can only hold so much stuff, as all of us find out sooner or later. What most of us don't realize is that the record keeping for all this stuff -- that file allocation table I mentioned earlier -- takes up space of its own. So even if your disk can hold 9 billion bytes, give or take a few hundred million -- a 9-gigabyte disk, in other words -- you can't have 9 billion files on that disk. Or even 9 million. Add to that the old problem of the limited number of storage clusters, the places where files are kept, and you get an even higher wall that can't be scaled by mega-numbers. This entire mini-topic would make a great maxi-topic at another time.)
   So here's what Windows does to those two long filenames. The first would have a short filename of "DISCUS~1.DOC" and the second would have the name "DISCUS~2.DOC". (The filename extension, DOC, is taken from the identical extension on the LFN version. Windows normally hides the extension, so don't worry if the idea of having "DOC" on the end of the name seems puzzling.)
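   For the curious, the chop-and-number routine can be sketched in a few lines of Python. Treat it as a rough model under stated assumptions, not as what Windows actually runs: the names short_name and taken are mine, the cleanup rules (drop spaces and periods, turn other oddball characters into underscores) are simplified, and the sketch always adds a tilde, so it is meant only for names that don't already fit "8 dot 3."

```python
import re

# Assumption: characters this sketch treats as legal in a short name,
# simplified from the DOS rules.
_NOT_ALLOWED = re.compile(r"[^A-Z0-9_~!#$%&'()@^{}-]")

def _clean(part):
    """Uppercase, drop spaces and periods, replace anything else odd with "_"."""
    part = part.upper().replace(" ", "").replace(".", "")
    return _NOT_ALLOWED.sub("_", part)

def short_name(long_name, taken):
    """Return a tilde-style short name for long_name that is not in `taken`."""
    stem, dot, ext = long_name.rpartition(".")
    if not dot:                      # no period, so no extension
        stem, ext = long_name, ""
    stem, ext = _clean(stem), _clean(ext)[:3]
    n = 1
    while True:
        tail = "~" + str(n)
        candidate = stem[:8 - len(tail)] + tail   # six characters kept while n < 10
        if ext:
            candidate += "." + ext
        if candidate not in taken:   # unique in this folder, so use it
            taken.add(candidate)
            return candidate
        n += 1

taken = set()   # stands in for the short names already in the folder
first = short_name("Discussion from the Working Group on Fixing Filenames.doc", taken)
second = short_name("Discussion on the Future of Urban Transportation.doc", taken)
print(first, second)   # DISCUS~1.DOC DISCUS~2.DOC
```

   Notice that nothing in the loop looks at what the name means. It only counts.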
   This is only mildly interesting from a technical standpoint. But it's something you need to know from another angle. Suppose you're not able to see the long filenames -- suppose, in other words, you are viewing the list of files using a program that doesn't know anything about long filenames? (One such program is Microsoft Word 6.0, still in use by millions of people around the world.) How would you make sense out of the names?
   Sorry. That was a trick question. You couldn't make sense out of the short filenames. Both files have the same short names except for one uninformative character.
   So the smart thing to do when you create long filenames is to give Windows a chance to make sensible short names out of them. This sounds like a lot of extra work, but it's worth doing if you work in a mixed old-and-new Windows environment, or if some of your programs are not aware of long filenames.
   The trick is simple: Create your own short filename at the front of each long filename. Do this for all files that could end up being viewed in an older program or on an older PC. (Don't do it for all files; it's not worth all that trouble.) Here are our examples with this method applied:
   FILFIX Discussion from the Working Group on Fixing Filenames
   URBTRN Discussion on the Future of Urban Transportation

   These are the actual long filenames you could give to those two files, even though they look like two sets of names -- short names followed by long names. (OK, I'll admit it. That's just what they are. But as far as Windows knows, those double names are just single ones. Because we're smarter than Windows, we can see through the ruse, but you can bet old Mr. Windows doesn't have a clue.) You don't have to put the short name portion in capital letters, but I recommend it to make it stand out.
   When Windows creates short filenames from the two long names, here's what you get:
   Long name: FILFIX Discussion from the Working Group on Fixing Filenames
   Short name: FILFIX~1.DOC

   Long name: URBTRN Discussion on the Future of Urban Transportation
   Short name: URBTRN~1.DOC

   This means someone viewing these files in an older version of any program would be able to make at least some sense out of the names, and anyone viewing them in a modern program will see both the short name at the beginning and the full name.
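   If you name a lot of files this way, a tiny helper can keep you honest about the length of the tag. Here is one possible sketch in Python; the function name tag_long_name and the rule that a tag is one to six plain letters or digits are my own assumptions, chosen so the tag survives the six-character chop intact.

```python
def tag_long_name(tag, long_name):
    """Put a short, DOS-friendly tag in front of a long filename."""
    # Assumption: keep the tag to six plain letters or digits so that it
    # fills (at most) the six characters that remain after the chop.
    if not (tag.isascii() and tag.isalnum() and 1 <= len(tag) <= 6):
        raise ValueError("tag must be 1 to 6 plain letters or digits")
    return tag.upper() + " " + long_name

print(tag_long_name("filfix", "Discussion from the Working Group on Fixing Filenames"))
# FILFIX Discussion from the Working Group on Fixing Filenames
```

   A tag of exactly six characters gives the cleanest result; with a shorter one, the leftover positions get filled from whatever comes next in the long name.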
   You can get fancier, of course. You can use punctuation to set off the short part of the name from the long part. A comma works fine, and looks nice, too:
   FILFIX, Discussion from the Working Group on Fixing Filenames

   Two more cautions and we're through.
   First, you'll find out right away that some characters can't be used in filenames, short or long. Windows won't let you. These include the backslash, the asterisk and the question mark, along with the forward slash, colon, double quotation mark, angle brackets and vertical bar. All of them have special roles in the treatment of filenames and what are called pathnames (files plus their locations) under both DOS and Windows.
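   You can check a name before Windows scolds you. The Python sketch below is a small illustration with a hypothetical function name, bad_characters. Its list is the standard set of nine characters Windows reserves in filenames: the backslash, asterisk and question mark, plus the forward slash, colon, double quotation mark, angle brackets and vertical bar.

```python
# The characters Windows reserves in file and folder names.
FORBIDDEN = set('\\/:*?"<>|')

def bad_characters(name):
    """Return the forbidden characters found in name, in order of first appearance."""
    found = []
    for ch in name:
        if ch in FORBIDDEN and ch not in found:
            found.append(ch)
    return found

print(bad_characters("That Note to Aunt Mary"))   # []
print(bad_characters("Who? What*"))               # ['?', '*']
```

   An empty list means the name is safe on that score, though it says nothing about length.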
   Second, although file and folder names can be up to 255 characters long, try to restrain yourself. Windows needs to keep track of files that are deep within nested folders, but may not be able to track these files if their full paths (the nested folder names plus the filename) run past about 260 characters. You should be able to use a descriptive name without using hundreds of characters. Here is an example of a needlessly long series of names in a pathname. It's followed by one that makes more sense.
   C:\Program Files\Mother Smith's Software\Mother Smith's Internet Utility Suite\Mother Smith's Internet Timer, Shareware Edition\Ntimer.exe
   C:\Program Files\Mother Smith\Net Timer\Ntimer.exe
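   To see how quickly nested folder names eat into the allowance, here is a back-of-the-envelope sketch in Python. It assumes the traditional Windows limit of 260 characters for a full path, one of which is reserved for an invisible end-of-text marker; the names MAX_PATH and path_budget are mine, for illustration only.

```python
MAX_PATH = 260   # assumption: the traditional Windows limit on a full path

def path_budget(path):
    """Return how many more characters path could grow before hitting the limit."""
    # One position is reserved for the terminator, hence the "- 1".
    return (MAX_PATH - 1) - len(path)

long_path = (r"C:\Program Files\Mother Smith's Software"
             r"\Mother Smith's Internet Utility Suite"
             r"\Mother Smith's Internet Timer, Shareware Edition\Ntimer.exe")

print(len(long_path), "characters used,", path_budget(long_path), "left")
```

   Every folder you nest inside that one starts with less room than the last, which is the best argument I know for short folder names.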

   If you want to be even more sensible about long filenames, keep folder names simple (and short, too, if possible) and let yourself go when you name files. While there's nothing wrong with a folder named Mother Smith's Internet Timer, Shareware Edition, there's nothing right with it either; it just takes up screen real estate. (A real puzzle is Microsoft's decision to name the primary folder for programs the Program Files folder, when Programs would have conveyed precisely the same thing -- after all, programs are files, right? -- and would have been eight characters long. Its short filename would have been the same as its long filename.)
   What's the advantage of making folder names short whenever possible? Despite the benefit of being able to use long filenames at last, you'll find many times when the programs you are using don't understand and can't deal with long filenames -- installation programs for the latest Windows programs sometimes fit this category, for example. Instead of having to deal with a truncated short filename when you run those programs, you'll probably agree that typing a common filename -- Programs instead of Progra~1, or Tools instead of Progra~2 (a short version of Program File Tools, maybe) -- is much easier on the mind and fingers.