Re: Fujitsu NetCOBOL for .NET
- From: mwojcik@xxxxxxxxxxx (Michael Wojcik)
- Date: 22 Jan 2006 19:23:24 GMT
In article <1137783996.596533.80170@xxxxxxxxxxxxxxxxxxxxxxxxxxxx>, "Richard" <riplin@xxxxxxxxxxxx> writes:
>
> > text file of 69 bytes. I zipped it. The result was 176 bytes.
>
> 176 - 112 -> 64. The 69 bytes were compressed to 64, probably using
> RLE.
With a 69-byte "text" file, I doubt you'd get much compression, if
any, from RLE; plain ASCII text (which I assume is what we're dealing
with here) rarely has much in the way of runs (of whatever symbol
length). That looks like the result of Deflate (probably straight
LZ77) to me.
I just did a bit of testing of my own, and Info-Zip and WinZip both
compressed a 69-byte text file to 64 bytes using Deflate (defF, defN,
or defX, depending on the "fast", "normal", or "max" options).
Not that it matters...
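
For anyone who wants to repeat the size check above, here is a minimal Python sketch using the standard zlib and zipfile modules. The 69-byte sample text and the member name "foo.txt" are stand-ins, not the file from this thread, so the compressed size it reports need not match the 64 bytes quoted above.

```python
import io
import zipfile
import zlib

# Hypothetical 69-byte ASCII sample; not the file from the thread.
TEXT = b"The quick brown fox jumps over the lazy dog, and then it naps a bit.\n"

def raw_deflate(data: bytes, level: int = 9) -> bytes:
    """Compress to a raw Deflate stream (negative wbits: no zlib header or checksum)."""
    comp = zlib.compressobj(level, zlib.DEFLATED, -15)
    return comp.compress(data) + comp.flush()

def zip_sizes(name: str, data: bytes) -> tuple[int, int]:
    """Return (archive size, compressed size of the member) for a one-file ZIP."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(name, data)
    with zipfile.ZipFile(buf) as zf:
        member = zf.getinfo(name)
    return len(buf.getvalue()), member.compress_size

archive, compressed = zip_sizes("foo.txt", TEXT)
print("input bytes:      ", len(TEXT))
print("raw Deflate bytes:", len(raw_deflate(TEXT)))
print("ZIP archive bytes:", archive)
print("ZIP overhead:     ", archive - compressed)
```

The last figure is the per-archive bookkeeping (local header, central directory entry, end-of-directory record, and the file name stored twice), which is the part subtracted in the "176 - 112" arithmetic quoted above.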
> With typical Cobol record data there are large runs of spaces and zero
> characters because the fields must allow for the largest expected data
> item. This type of data can be compressed much smaller than 'text'
> that has little white space.
Right, and here RLE often works very well, which is why we include a
handful of RLE compressors (for 8-bit and 16-bit symbols, and for
smaller or larger inputs) with MF COBOL.
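
To make the point about padded records concrete, here is a minimal byte-oriented RLE sketch in Python. It is a toy (count, value) scheme written for this illustration, not the format used by the MF COBOL compressors, and the record layout is hypothetical.

```python
def rle_encode(data: bytes) -> bytes:
    """Toy RLE: every run becomes a (count, value) byte pair, count 1..255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and run < 255 and data[i + run] == data[i]:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)

def rle_decode(encoded: bytes) -> bytes:
    """Inverse of rle_encode: expand each (count, value) pair."""
    out = bytearray()
    for count, value in zip(encoded[0::2], encoded[1::2]):
        out += bytes([value]) * count
    return bytes(out)

# Hypothetical fixed-width record: a 30-byte name padded with spaces,
# a 9-digit amount padded with leading zeros, a 40-byte address.
record = b"SMITH".ljust(30) + b"42".rjust(9, b"0") + b"12 HIGH ST".ljust(40)

# Ordinary prose with no repeated adjacent characters, for comparison.
prose = b"The quick brown fox jumps over the lazy dog."

for label, data in (("record", record), ("prose", prose)):
    packed = rle_encode(data)
    assert rle_decode(packed) == data
    print(label, len(data), "->", len(packed))
```

The padded record shrinks because each run of spaces or zeros collapses to one pair, while the prose grows, since this naive scheme spends two bytes on every unrepeated character. Practical RLE formats avoid that penalty with an escape or literal-run marker.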
> Zip chooses the 'best' compression to use for a particular type of
> data. For small text files it may choose RLE, for large English text it
> may choose to use a dictionary-based mechanism that has a large table but
> very small resulting data.
Actually, Deflate doesn't require much table space. The table (or tables,
since a Deflate compressor can start a new block, with new tables, at
any point as it's processing the data stream) is a Huffman table that
maps output codes to literal bytes, match lengths, and offsets into
previous input data. That is, the compressed symbols from the LZ77
stage refer back to earlier data: "this next piece is the same as the
16-byte sequence we saw 124 bytes ago". So most of the "dictionary" is
actually the user data itself. And the table itself is sent in compact
form, as a list of code lengths that is in turn Huffman-coded.
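
The back-reference idea can be sketched in a few lines of Python. This is a toy decoder over an already-parsed token list, assuming a made-up representation (an int for a literal byte, a (distance, length) tuple for a match); it leaves out the Huffman layer and the bit-level format that real Deflate uses.

```python
from typing import Iterable, Union

# Hypothetical token form: a literal byte value, or (distance, length).
Token = Union[int, tuple[int, int]]

def lz77_decode(tokens: Iterable[Token]) -> bytes:
    """Rebuild the data by copying matches out of the output produced so far."""
    out = bytearray()
    for tok in tokens:
        if isinstance(tok, int):
            out.append(tok)  # literal byte, emitted as-is
            continue
        distance, length = tok
        if not 0 < distance <= len(out):
            raise ValueError("match distance reaches before the start of the data")
        start = len(out) - distance
        # Copy one byte at a time so a match may overlap the bytes it is
        # producing (length greater than distance), as LZ77 allows.
        for k in range(length):
            out.append(out[start + k])
    return bytes(out)

# Five literals, then "copy 10 bytes starting 5 bytes back".
tokens = list(b"blah ") + [(5, 10)]
print(lz77_decode(tokens))
```

Note that the decoder keeps no separate dictionary: the only state it needs is the data it has already written, which is why the memory cost is just the sliding window.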
--
Michael Wojcik michael.wojcik@xxxxxxxxxxxxxx
Sure we're tossing out fluff, but tell me, where does anyone deal in words
with substance? -- Haruki Murakami (trans Alfred Birnbaum)