Re: Reverse-engineering an LZSS compression routine (on a Hitachi H8)

From: CBFalconer (cbfalconer_at_yahoo.com)
Date: 09/04/04


Date: Sat, 04 Sep 2004 04:42:01 GMT

Philip Pemberton wrote:
>
> I'm currently trying to reverse-engineer the LZSS decompression
> engine used in the Cybiko PDA to compress firmware updates (and
> small ASM and C programs) that are sent over a serial link. I've
> disassembled the bootloader and found the decompress() routine
> (in Hitachi H8 assembler), but I haven't managed to work out what
> I need to do to write a C program to decompress the images.
>
> From the disassembly, I guessed that the code used is a variant
> of Haruhiko Okumura's LZSS.C. I've also found the value of
> THRESHOLD, but I don't know what the values of F and N (related
> to ring buffer sizing) are.
>
> All I know about the output file is that it should be 1288 bytes
> in size. Using { THRESHOLD=2, N=1024, F=18 } produces a file of
> that size, but my disassembler reports that the decompressed
> data is not valid code.

I think you are attacking it in the wrong manner. Decompression
is much easier than compression: you don't have to worry about
forming trees of phrases and revising them, etc. I suggest you
start by getting "The Data Compression Book", by Mark Nelson and
Jean-Loup Gailly, M&T Books, ISBN 1-55851-434-1 (paperback; I
don't know about the hardback), and read up on the data format.
Then the first few bytes of the compressed code should give you
clues about where to go next. Your disassembly probably doesn't
matter; the data does.

This assumes the compressed data is not deliberately obscured by
such things as encoding it with a pseudo-random generator. If it
is, the disassembly comes back into play.

You are lucky it is not LZ78 or LZW compression, which require
intimate knowledge of the compressor to decompress.
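
For reference, here is a minimal sketch of the data format as the
decoder in Okumura's LZSS.C reads it: one flag byte, consumed from
the least significant bit, followed by eight items, each either a
literal byte (flag bit 1) or a two-byte (position, length) pair
(flag bit 0). The values N=4096, F=18, THRESHOLD=2, the 12/4 bit
split of the pair and the space-filled ring buffer are Okumura's
defaults and are assumptions here; nothing confirms that the
Cybiko uses them. The function name lzss_decode is made up for
this example.

```c
#include <stddef.h>
#include <string.h>

/* Assumed parameters: Okumura's LZSS.C defaults, not confirmed for the Cybiko. */
#define N          4096   /* ring buffer size, gives a 12-bit position */
#define F          18     /* longest match */
#define THRESHOLD  2      /* matches must be longer than this to be coded */

/* Decode src[0..srclen) into dst. Returns the number of bytes written;
   stops early when the input runs out or dst is full. */
size_t lzss_decode(const unsigned char *src, size_t srclen,
                   unsigned char *dst, size_t dstcap)
{
    unsigned char ring[N];
    size_t in = 0, out = 0;
    unsigned r = N - F;      /* first write position in the ring */
    unsigned flags = 0;      /* bits 8..15 count the flag bits still unused */

    memset(ring, ' ', sizeof ring);   /* assumed pre-fill, as in LZSS.C */

    for (;;) {
        flags >>= 1;
        if ((flags & 0x100) == 0) {          /* all eight flag bits used up */
            if (in >= srclen) break;
            flags = src[in++] | 0xFF00;
        }
        if (flags & 1) {                     /* literal byte */
            if (in >= srclen || out >= dstcap) break;
            unsigned char c = src[in++];
            dst[out++] = c;
            ring[r] = c;
            r = (r + 1) & (N - 1);
        } else {                             /* (position, length) pair */
            if (in + 1 >= srclen) break;
            unsigned i = src[in++];
            unsigned j = src[in++];
            i |= (j & 0xF0) << 4;            /* 12-bit ring position */
            j = (j & 0x0F) + THRESHOLD;      /* j + 1 bytes are copied */
            for (unsigned k = 0; k <= j; k++) {
                if (out >= dstcap) return out;
                unsigned char c = ring[(i + k) & (N - 1)];
                dst[out++] = c;
                ring[r] = c;
                r = (r + 1) & (N - 1);
            }
        }
    }
    return out;
}
```

Note that in this layout the pair is always 16 bits, so N and F
are not independent: a 12-bit position leaves 4 bits of length,
which is where F = 16 + THRESHOLD = 18 comes from. If the real
format uses a different N, the split (and therefore F) would
presumably differ too, and the first few flag bytes of a real
image are the place to check that.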

answered in c.a.e

-- 
Chuck F (cbfalconer@yahoo.com) (cbfalconer@worldnet.att.net)
   Available for consulting/temporary embedded and systems.
   <http://cbfalconer.home.att.net>  USE worldnet address!

