Gunzip
Documentation: #include <cryptopp/gzip.h>
Gzip is a lossless compression format standardized in RFC 1952, GZIP file format specification. Gzip is a file format that carries additional metadata (such as the original filename, the file modification time and comments); the underlying compression uses Deflate from RFC 1951. Crypto++ provides GZIP compression through the Gzip class, and decompression through the Gunzip class.
The Gunzip decompressor takes a pointer to a BufferedTransformation. Because a pointer is taken, the Gunzip object owns the attached transformation and will destroy it. See ownership for more details.
The Crypto++ implementation does not allow you to set or retrieve the original filename, the modified filetime or comments in the archive. A patch provided below allows programs to do so, but the library must be recompiled after applying it.
The Gunzip class inherits from the Inflator class (which provides the RFC 1951 implementation).
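Independently of the library, the metadata fields mentioned above are stored uncompressed at the start of the file. The following is a minimal sketch, using only the standard library, of reading them according to the RFC 1952 header layout. GzipHeaderInfo and ParseGzipHeader are hypothetical names for illustration; they are not part of Crypto++ or of the patch.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Optional metadata carried in an RFC 1952 member header.
struct GzipHeaderInfo {
    std::uint32_t mtime = 0;  // MTIME: seconds since the Unix epoch, 0 if unset
    std::string filename;     // FNAME: original filename, empty if absent
    std::string comment;      // FCOMMENT: comment, empty if absent
};

// Hypothetical helper: parse the header at the start of 'buf'.
GzipHeaderInfo ParseGzipHeader(const std::string& buf)
{
    const unsigned char* p = reinterpret_cast<const unsigned char*>(buf.data());
    const std::size_t n = buf.size();

    // Fixed 10-byte part: ID1 ID2 CM FLG MTIME(4) XFL OS
    if (n < 10 || p[0] != 0x1f || p[1] != 0x8b || p[2] != 8)
        throw std::runtime_error("not a gzip (deflate) stream");

    const unsigned char flg = p[3];
    GzipHeaderInfo info;

    // MTIME is stored little-endian.
    info.mtime = static_cast<std::uint32_t>(p[4])
               | (static_cast<std::uint32_t>(p[5]) << 8)
               | (static_cast<std::uint32_t>(p[6]) << 16)
               | (static_cast<std::uint32_t>(p[7]) << 24);

    std::size_t pos = 10;

    // FEXTRA (bit 2): 2-byte little-endian length, then that many bytes.
    if (flg & 0x04) {
        if (pos + 2 > n) throw std::runtime_error("truncated header");
        const std::size_t xlen = p[pos] | (static_cast<std::size_t>(p[pos + 1]) << 8);
        pos += 2 + xlen;
        if (pos > n) throw std::runtime_error("truncated header");
    }

    // FNAME and FCOMMENT are zero-terminated strings, in that order.
    auto readZeroTerminated = [&](std::string& out) {
        while (pos < n && p[pos] != 0)
            out.push_back(static_cast<char>(p[pos++]));
        if (pos >= n) throw std::runtime_error("truncated header");
        ++pos;  // skip the terminating zero
    };

    if (flg & 0x08) readZeroTerminated(info.filename);  // FNAME (bit 3)
    if (flg & 0x10) readZeroTerminated(info.comment);   // FCOMMENT (bit 4)

    return info;
}
```

The sketch stops after the comment field; the optional header CRC (FHCRC, bit 1) and the compressed data that follow are not examined.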
Construction
Gunzip (BufferedTransformation *attachment=NULL,
bool repeat=false,
int autoSignalPropagation=-1)
attachment is a pointer to a BufferedTransformation, such as another filter or sink. If attachment is NULL, then the Gunzip object will internally accumulate the output byte stream.
repeat signals whether the object will decompress multiple compressed streams in series. The default value is false.
autoSignalPropagation indicates whether MessageEnd should be called and propagated to attached transformations. Set to 0 to disable calls to MessageEnd. The default value is -1, which propagates all MessageEnd calls to all attached transformations.
Sample Programs
The following is a small collection of sample programs to demonstrate using the Gunzip decompressor.
In-memory String
string decompressed, data = ...;
Gunzip unzipper(new StringSink(decompressed));
unzipper.Put((byte*) data.data(), data.size());
unzipper.MessageEnd();
On-disk File
string filename("test.txt.gz");
FileSource fs(filename.c_str(), true);
Gunzip unzipper;
fs.TransferTo(unzipper);
String using Pipeline
string decompressed, data = ...;
StringSource ss(data, true,
new Gunzip(
new StringSink(decompressed)
));
File using Pipeline
string filename("test.txt.gz");
string decompressed;
FileSource fs(filename.c_str(), true,
new Gunzip(
new StringSink(decompressed)
));
String using Put/Get
string data = ...;
Gunzip unzipper;
unzipper.Put((byte*)data.data(), data.size());
unzipper.MessageEnd();
word64 avail = unzipper.MaxRetrievable();
if(avail)
{
string decompressed;
decompressed.resize(avail);
unzipper.Get((byte*)&decompressed[0], decompressed.size());
}
Array using Put/Get
string data = ...;
Gunzip unzipper;
unzipper.Put((byte*)data.data(), data.size());
unzipper.MessageEnd();
word64 avail = unzipper.MaxRetrievable();
if(avail)
{
vector<byte> decompressed;
decompressed.resize(avail);
unzipper.Get(&decompressed[0], decompressed.size());
}
Patch
The patch below adds the ability to read and write the original filename, the modified filetime and comments for an archive. The sample program below shows how it could be used.
try {
string filename("test.txt.gz"), s1, s2;
string data = "abcdefghijklmnopqrstuvwxyz";
// Create a compressor, save stream to memory via 's1'
Gzip zipper(new StringSink(s1));
// Add some Gzip specific fields
zipper.SetFilename(filename);
zipper.SetFiletime((word32)time(0));
zipper.SetComment("This is a test of filenames, filetimes and comments");
// Write the data to the stream
zipper.Put((byte*) data.c_str(), data.size());
zipper.MessageEnd();
// Save the compressed data to a file
FileSink fs(filename.c_str(), true);
fs.Put((byte*) s1.data(), s1.size());
fs.MessageEnd();
// Create a decompressor, save stream to memory via 's2'
Gunzip unzipper(new StringSink(s2));
// Add the compressed data to it
unzipper.Put((byte*) s1.data(), s1.size());
unzipper.MessageEnd();
// Print the Gzip specific data
cout << "Filename: " << unzipper.GetFilename() << endl;
cout << "Filetime: " << unzipper.GetFiletime() << endl;
cout << "Comment: " << unzipper.GetComment() << endl;
// Print the uncompressed stream
cout << "Data: " << s2 << endl;
}
catch(CryptoPP::Exception& ex)
{
cerr << ex.what() << endl;
}
A typical run of the program is shown below.
$ ./cryptopp-test.exe
Filename: test.txt.gz
Filetime: 1420337339
Comment: This is a test of filenames, filetimes and comments
Data: abcdefghijklmnopqrstuvwxyz
To unpack the archive using the original filename, you would use gunzip -N. It can be tested by renaming test.txt.gz to something else, like test.gz.
[Screenshot: the archive as shown in The Archive Browser]
Note: The Archive Browser on OS X displays the implicit filename (the archive name without the gz extension), and not the original filename embedded in the header. Also see Issue 802: The Archive Browser does not honor original filename field in a GZIP header.
Downloads
gzip.diff.zip - patch that adds the ability to set and retrieve the original filename, the modified filetime and comments on a GZIP archive. The ZIP includes the diff of changes to gzip.h and gunzip.h, and the modified gzip.h and gunzip.h files themselves.
