Multiple Files

Storing different aspects of the network in different files



  • Multiple files are simple to deal with. If you really need a single file, just concatenate them, as in a .tar archive or a MIME message, and have the program split them apart when it loads the file.



  • Multiple files are evil, probably more so than XML structures: you end up detaching information about nodes from the nodes themselves, and that is bound to breed confusion and data loss.
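
The single-file suggestion above (concatenate the parts, split them again on load) can be sketched roughly as follows. This is only an illustration using Python's standard tarfile module; the function names bundle and unbundle and the example file names are hypothetical, not part of any proposed format.

```python
import io
import tarfile


def bundle(parts: dict[str, bytes]) -> bytes:
    """Pack named byte strings into one in-memory tar archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in parts.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)  # tar needs the size before the payload
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def unbundle(blob: bytes) -> dict[str, bytes]:
    """Split a tar archive back into its named parts when loading."""
    parts = {}
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r") as tar:
        for member in tar.getmembers():
            if member.isfile():
                parts[member.name] = tar.extractfile(member).read()
    return parts


# Hypothetical example: node and edge data kept as separate parts.
network = {"nodes.txt": b"a\nb\n", "edges.txt": b"a b\n"}
archive = bundle(network)
restored = unbundle(archive)
```

Keeping the parts in one archive does not by itself answer the objection about node information drifting away from the nodes; it only keeps the pieces travelling together.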

Breeding habits

  • The problem with multiple files is that they tend to multiply like rabbits. Having built a system that used data in that fashion, I can tell you that once you pass about 500 files, things start to get interesting. For example, Unix systems cap the size of the argument list that can be passed to a command, so at some point you start running into shell limitations.
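
One common way around the argument-list limit is to have the program walk the directory itself instead of receiving every file name on the command line. A minimal sketch, assuming the data files share a suffix; the function name iter_data_files and the ".dat" suffix are made up for illustration.

```python
import os
from typing import Iterator


def iter_data_files(directory: str, suffix: str = ".dat") -> Iterator[str]:
    """Yield paths of matching files one at a time.

    The names never pass through a shell wildcard expansion, so the
    limit on command-line argument size does not come into play.
    """
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file() and entry.name.endswith(suffix):
                yield entry.path
```

The program is then invoked with the directory as its only argument, for example as `for path in iter_data_files("results"): ...`, however many files the directory holds.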

Naming issues

  • The other evil thing about multiple files is that the filenames carry important semantics: the name of the file needs to identify the dataset that the file belongs to, what portion of the dataset it contains, and (if the file contains the results of some metric) what tool or routine created it.
  • You absolutely must have a consistent naming scheme. Moreover, long filenames (a necessity here) are not handled consistently across platforms, and even things like burning a CD may screw up the naming scheme (the ISO 9660 filesystem has inconsistent support for long filenames, especially those longer than 32 characters).
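
To make the naming point concrete, here is one possible scheme, sketched under assumptions: the three fields (dataset, portion, tool) come from the bullet above, but the "__" separator, the ".dat" extension, the allowed characters, and the 32-character budget are all invented for this example.

```python
import re
from typing import NamedTuple

# Assumed rule: letters, digits and hyphens only, so "__" stays unambiguous.
_FIELD = re.compile(r"[A-Za-z0-9-]+")
# Assumed budget, taken from the 32-character figure mentioned above.
MAX_NAME_LENGTH = 32


class DataFileName(NamedTuple):
    dataset: str  # which dataset the file belongs to
    portion: str  # which part of the dataset it holds
    tool: str     # which tool or routine produced it


def build_name(parts: DataFileName, ext: str = "dat") -> str:
    """Compose a filename, rejecting anything outside the scheme."""
    for field in parts:
        if not _FIELD.fullmatch(field):
            raise ValueError(f"invalid field: {field!r}")
    name = "__".join(parts) + "." + ext
    if len(name) > MAX_NAME_LENGTH:
        raise ValueError(f"name too long ({len(name)} characters): {name}")
    return name


def parse_name(filename: str) -> DataFileName:
    """Recover the three fields from a filename built by build_name."""
    stem, _, _ext = filename.rpartition(".")
    fields = stem.split("__")
    if len(fields) != 3:
        raise ValueError(f"not in dataset__portion__tool form: {filename}")
    return DataFileName(*fields)
```

Routing every file creation through something like build_name is what keeps the scheme consistent; the length check makes an overlong name fail at write time rather than when the data is copied to a stricter filesystem.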

Maksim's Maxim