Why do the “<>” characters seem to corrupt Windows folders?

The error isn’t caused by angle brackets specifically, or by having two of them. It occurs when 1) a file name contains wildcard characters, and 2) the wildcard would match a previously seen file, which makes Windows think that the folder search isn’t advancing as it should.

First, as far as I understand, listing a directory on Windows is done by wildcard expansion (the opposite of how it would be done on Linux). To expand a wildcard pattern, you start by calling FindFirstFile() with the initial pattern, then repeat FindNextFile() while NTFS finds matching files one-by-one. To list the entire directory, you do the same with * as the pattern.

Second, both < and > (as well as ") are actually treated as wildcards in the deeper parts of Windows file-handling code – they behave like the historical MS-DOS wildcard variants of * and ?. (For example, < aka DOS_STAR matches all characters up until the file extension.) The publicly available .NET source code contains a description of the algorithm, which is identical to the one found in leaked Windows NT kernel source.
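To make those wildcard rules a bit more concrete, here is a minimal Python sketch of how such a matcher could work. It is my own simplification based on the documented descriptions of DOS_STAR (<), DOS_QM (>) and DOS_DOT ("), not the kernel's actual code: the function name dos_match is hypothetical, and unlike Windows it compares names case-sensitively.

```python
from functools import lru_cache

# Wildcard characters as defined in the Windows headers.
DOS_STAR, DOS_QM, DOS_DOT = "<", ">", '"'


def dos_match(pattern: str, name: str) -> bool:
    """Simplified, case-sensitive sketch of DOS-style wildcard matching."""
    last_dot = name.rfind(".")

    @lru_cache(maxsize=None)
    def go(p: int, n: int) -> bool:
        if p == len(pattern):
            return n == len(name)
        c = pattern[p]
        at_end = n == len(name)
        if c == "*":
            # Zero or more of any character.
            return any(go(p + 1, k) for k in range(n, len(name) + 1))
        if c == DOS_STAR:
            # Zero or more characters, but never consume the final dot.
            k = n
            while True:
                if go(p + 1, k):
                    return True
                if k == len(name) or k == last_dot:
                    return False
                k += 1
        if c == "?":
            # Exactly one character.
            return not at_end and go(p + 1, n + 1)
        if c == DOS_QM:
            # One character, or nothing when at a dot or at the end of the name.
            if not at_end and name[n] != ".":
                return go(p + 1, n + 1)
            return go(p + 1, n)
        if c == DOS_DOT:
            # A dot, or nothing at the end of the name.
            if at_end:
                return go(p + 1, n)
            return name[n] == "." and go(p + 1, n + 1)
        # Any other character must match literally.
        return not at_end and name[n] == c and go(p + 1, n + 1)

    return go(0, 0)
```

With this sketch, dos_match("foo*", "foo(") and dos_match(">", "<") both return True, which is the kind of accidental match the rest of this answer relies on.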

So it’s not just the angle brackets, but also " ? * that could be used to trigger this error – as long as they’re used in combination with another file name that would be sorted before the wildcard, if the sorting is done by Unicode value (which is the order enforced by NTFS).

For example, you would also get the “folder corrupted” error if you had items named foo( and foo*. There is nothing special about the ( here, except that it goes before * in Unicode. A name such as foo+, whose last character sorts after *, would not trigger the error. (You can open “Character Map” via charmap.exe if you want to see the Unicode positions of these characters.)
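If you would rather not dig through Character Map, a few lines of Python show the same ordering. This is only an illustration: the helper name last_char_positions is made up, and a plain code point sort stands in for the NTFS ordering, which should be equivalent for these ASCII names.

```python
def last_char_positions(names):
    """Return (name, code point of last character) pairs in code point order."""
    return [(name, f"U+{ord(name[-1]):04X}") for name in sorted(names)]


for name, position in last_char_positions(["foo+", "foo*", "foo("]):
    print(name, position)
# foo( U+0028
# foo* U+002A
# foo+ U+002B
```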

Similarly, a directory containing [foo<, foo=] or [foo?, fooo] would not trigger this situation, but a directory containing [foo=, foo>] or [foo+, foo?] would.
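The rule behind these examples can be written down as a small check. The sketch below is an approximation under stated assumptions: triggers_error and as_glob are hypothetical names, Python's fnmatch stands in for the Windows matcher, < is treated like * and > like ? (which should only be reasonable for names without dots), and the " wildcard is left out entirely.

```python
from fnmatch import fnmatchcase

WILDCARDS = set("*?<>")


def as_glob(name: str) -> str:
    """Rough translation: treat < like * and > like ? (dot-free names only)."""
    return name.replace("<", "*").replace(">", "?")


def triggers_error(names) -> bool:
    """True if some wildcard-containing name matches a name sorted before it."""
    ordered = sorted(names)  # code point order, standing in for NTFS order
    for i, name in enumerate(ordered):
        if WILDCARDS & set(name):
            pattern = as_glob(name)
            if any(fnmatchcase(earlier, pattern) for earlier in ordered[:i]):
                return True
    return False
```

Under these assumptions, triggers_error(["foo=", "foo>"]) and triggers_error(["foo+", "foo?"]) return True, while triggers_error(["foo<", "foo="]) and triggers_error(["foo?", "fooo"]) return False.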

So if I understand everything correctly, what seems to happen is:

  1. The directory has items [foo(, foo*], with NTFS enforcing this exact order.
  2. Kernel asks NTFS “Get first item, starting at *”.
  3. NTFS finds and returns foo(.
  4. Kernel asks NTFS “Get next item, continuing at foo(”.
  5. NTFS finds foo( (exact match) and returns the next item foo*.
  6. Kernel asks NTFS “Get next item, continuing at foo*”.
  7. NTFS finds foo* – which is recognized as a wildcard and matches foo( first, therefore the next item is foo* again – so an error is raised.
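The steps above can be sketched as a toy simulation. To be clear, this models my understanding of the sequence, not the real kernel or NTFS code: find_from and list_directory are hypothetical helpers, fnmatch stands in for the Windows matcher (so only * and ? are covered, and names containing [ are out of scope), and the directory is just a sorted Python list.

```python
from fnmatch import fnmatchcase


def find_from(ordered, pattern):
    """Toy stand-in for NTFS: index of the first entry matching the pattern."""
    for i, name in enumerate(ordered):
        if fnmatchcase(name, pattern):
            return i
    return None


def list_directory(entries):
    """List entries by repeatedly resuming at the last returned name."""
    ordered = sorted(entries)        # step 1: NTFS enforces this order
    first = find_from(ordered, "*")  # step 2: get first item, starting at *
    if first is None:
        return []
    listed = [ordered[first]]
    while True:
        # Steps 4 and 6: "get next item, continuing at <last name>".
        resume = find_from(ordered, listed[-1])
        if resume is None:
            raise LookupError(f"cannot locate {listed[-1]!r} again")
        nxt = resume + 1
        if nxt >= len(ordered):
            return listed
        if ordered[nxt] in listed:
            # Step 7: the search did not advance, so report an error.
            raise RuntimeError(f"enumeration stuck at {ordered[nxt]!r}")
        listed.append(ordered[nxt])
```

In this model, list_directory(["foo+", "foo*"]) returns both names, while list_directory(["foo(", "foo*"]) raises the RuntimeError, because resuming at foo* lands on foo( first.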

As > (aka DOS_QM) is handled similarly to the ? wildcard, a folder named “>” causes the same problem: it matches the single-character name of the previous “<” item before it matches itself.
