first previous next last contents

Hash_tar

NAME

hash_tar -- Adds a hash table index to a tar file

SYNOPSIS

hash_tar [OPTIONS] tarfile > tarfile.hash

hash_tar -A [OPTIONS] tarfile >> tarfile

DESCRIPTION

hash_tar adds an index to a tar file so that random access may be performed on it. It is a successor to the index_tar program.

The index is a hash table which may be appended, prepended or stored in a separate file. Then the hash_list and hash_extract programs may be used to query the contents and to extract contents from the indexed tar archive. Note that it's not possible to add to such tar archives without also having to rebuild the index.

Various io_lib based tools also support transparent reading out of tar files when indexed using this tool, so this provides a quick and easy way to remove the clutter of thousands of small trace files on disk.

In separate file mode the hash index is stored in its own file. It's the most flexible method as it means that the tar file can be modified and appended to with ease provided that the hash index is recomputed. In order for this to work the hash index file also needs to store the filename of its associated tar file (see the -a option).

In append mode the hash index is assumed to be appended on the end of the tar file itself. As tar files normally end in a blank block this does not damage the tar and tar tvf will still work correctly. However appending to the tar file will cause problems.

In prepend mode the hash index comes first and the tar follows. This breaks normal tar commands, but is the the fastest way to retrieve data (it avoids a read and a seek call compared to append mode).

For space saving reasons it's possible to add a header and a footer to each entry too. In this case a named entry from the tar file is prepended or appended at extraction time.

OPTIONS

-a archive_filename
Use this if reading from stdin and you wish to create a hash index that is to be stored as a separate file.
-A
Append mode. No archive name will be stored in the index and so the extraction tools assume the index is appended to the same file as the archive itself.
-b
Store the "base name" of the tar file names. That is if the tar holds file a/b/c then the item held in the index will be c.
-d
Index directory names too. (Most likely a useless feature!)
-f name
Set tar entry 'name' to be a file footer
-h name
Set tar entry 'name' to be a file header
-O
Prepend mode. It is assumed that all offsets within the archive file start from the end of the index (ie the index is the first bit in the file).
-v
Verbose mode.

EXAMPLES

The most common usage is just to append an index to an existing tar file. Then extract a file from it.

hash_tar -A file.tar >> file.tar
hash_extract file.tar xyzzy/plugh > plugh

For absolute maximum speed maybe you wish to prepend the hash index. This speeds up the "magic number" detection and avoids unnecessary seeks.

hash_tar -O file.tar > file.tar.hash
cat file.tar.hash file.tar > hashedfile.tar

Finally, if we have a tar file of Experiment Files maybe we wish to add a footer indicating a date and comment to each experiment file so that upon extraction we get a concatenation of the original experiment file and the footer.

(echo "CC   Comment";date "+DT   %Y-%m-%d") > exp_foot
tar rf file.tar exp_foot
hash_tar -f exp_foot -A file.tar >> file.tar
# Now test:
hash_extract file.tar xyzzy.exp > xyzzy.exp
tail -2 xyzzy.exp

SEE ALSO

See section ExperimentFile(4).

See section hash_list(1).

See section hash_extract(1). Read(4)


first previous next last contents
Last generated on 25 November 2011.