Re: disassembler prefix-byte check -- wanting comments good and bad

From: Bx.C (invalid-email-address_at_invalid.shiragajin)
Date: 03/26/04

  • Next message: Frank Kotler: "Re: NASM fails to FAR Jump to specified Address"
    Date: Fri, 26 Mar 2004 19:35:43 +0000 (UTC)
    
    

    > It doesn't make much sense to write something this complex in assembly.
    > Prefix decoding is the easy part; decoding ModR/M and SIB bytes is a
    > nuisance, and decoding opcodes much more so.

    can you supply a reason for this statement?

    as i see it, the entire instruction set follows a pattern... exceptions to
    rules can be tested before the normal rules in the pattern...

    for instance...

    for opcode bytes 00-3F...

    cmp for 0F first, jz to second opcode byte test....
    cmp (byte and E7) with 07, jz to handle POP 2-bit seg (es/ss/ds; the cs slot is 0F, caught above)
    cmp (byte and E7) with 27, jz to handle DAA/DAS/AAA/AAS
    cmp (byte and E7) with 06, jz to handle PUSH 2-bit seg (es/cs/ss/ds)
    ;;; no reason to cmp (byte and E7) with 26.. already done in prefix test
    cmp (byte and C0) with 00, jz to handle GROUP1 set (ADD/OR/ADC/SBB/AND/SUB/XOR/CMP)
    ;;;;; now a quarter of the first-level opcode table has been handled
    .
    .
    .
    .
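
    a rough sketch of the same test order, written in C rather than assembly
    only so the masks can be checked in isolation... the names (op_class,
    classify_low_quarter, alu_mnemonic) are made up for illustration, and it
    assumes the segment-override bytes 26/2E/36/3E were already consumed by the
    prefix pass...

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical class names, for illustration only. */
enum op_class {
    OP_ESCAPE_0F,   /* 0F: a second opcode byte follows (386 and later) */
    OP_POP_SEG,     /* 07/17/1F: POP ES/SS/DS */
    OP_BCD_ADJUST,  /* 27/2F/37/3F: DAA/DAS/AAA/AAS */
    OP_PUSH_SEG,    /* 06/0E/16/1E: PUSH ES/CS/SS/DS */
    OP_ALU,         /* ADD/OR/ADC/SBB/AND/SUB/XOR/CMP forms */
    OP_OTHER        /* 40-FF: not covered by this sketch */
};

/* Exceptions first, then the general rule, in the order listed above. */
enum op_class classify_low_quarter(uint8_t op)
{
    if (op == 0x0F)          return OP_ESCAPE_0F;
    if ((op & 0xE7) == 0x07) return OP_POP_SEG;
    if ((op & 0xE7) == 0x27) return OP_BCD_ADJUST;
    if ((op & 0xE7) == 0x06) return OP_PUSH_SEG;
    /* Assumption: 26/2E/36/3E never reach this point (prefix pass ate them);
       if one did, it would be misreported as OP_ALU. */
    if ((op & 0xC0) == 0x00) return OP_ALU;
    return OP_OTHER;
}

/* For an OP_ALU byte, bits 3-5 select the operation. */
const char *alu_mnemonic(uint8_t op)
{
    static const char *const names[8] = {
        "ADD", "OR", "ADC", "SBB", "AND", "SUB", "XOR", "CMP"
    };
    assert((op & 0xC0) == 0x00);   /* only meaningful for 00-3F */
    return names[(op >> 3) & 7];
}
```

    four compares and one mask test cover all 64 bytes of that quarter; the low
    three bits of an OP_ALU byte then pick the operand form.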

    complex? i don't see it that way...
    rather, i see it as a hierarchy of sorts... the task being to handle all
    instructions in roughly equal time, letting the less common instructions
    suffer the extra tests where possible... for instance... in my prefix check
    code... the lock prefix gets tested last, even though it's one of the
    easiest to check... in fact.. i could've done it in this order...

    --------------
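     ; CL is assumed to hold a copy of the byte in AL
     cmp al,0F0h        ; F0 = LOCK
     jz prefix_lck

     and al,0FEh        ; drop bit 0
     cmp al,0F2h        ; F2/F3 = REPNE/REP
     jz prefix_rep

     and al,0FCh        ; drop bits 0-1
     cmp al,064h        ; 64-67 = FS/GS/operand size/address size
     jz prefix_386

     mov al,cl          ; restore the original byte
     and al,0E7h        ; drop bits 3-4 (the segment number)
     cmp al,026h        ; 26/2E/36/3E = ES/CS/SS/DS
     jz prefix_seg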
    --------------

    this way, the only group where i would've had to restore AL before checking
    is the four normal segment registers ES/CS/SS/DS (the two ANDs have already
    cleared AL's low bits by then)... of course, i would've had to place a
    mov al,cl at the beginning of each prefix_* section that needs the original
    byte, or handled the data in CL without moving it again...
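
    the same four tests can be sketched in C to check the masks... this is only
    an illustration, not the code from my disassembler: the enum names and the
    count_prefixes helper are hypothetical, and since each test masks a fresh
    copy of the byte there is nothing to restore...

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names mirroring the prefix_* labels in the listing above. */
enum prefix_class { PFX_NONE, PFX_LOCK, PFX_REP, PFX_386, PFX_SEG };

/* Same order as the assembly: LOCK, REP pair, 386 group, segment overrides. */
enum prefix_class classify_prefix(uint8_t b)
{
    if (b == 0xF0)          return PFX_LOCK;  /* F0 */
    if ((b & 0xFE) == 0xF2) return PFX_REP;   /* F2, F3 */
    if ((b & 0xFC) == 0x64) return PFX_386;   /* 64 FS, 65 GS, 66, 67 */
    if ((b & 0xE7) == 0x26) return PFX_SEG;   /* 26 ES, 2E CS, 36 SS, 3E DS */
    return PFX_NONE;
}

/* Count how many leading bytes of an instruction are prefixes. */
size_t count_prefixes(const uint8_t *code, size_t len)
{
    assert(code != NULL || len == 0);
    size_t n = 0;
    while (n < len && classify_prefix(code[n]) != PFX_NONE)
        n++;
    return n;
}
```

    for the bytes F3 66 2E A5 (rep, operand size, cs:, movs) count_prefixes
    stops at A5, so the opcode decoder starts on the fourth byte.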

    i chose not to start w/ the LOCK prefix, though, because it is the least
    used, and thus i could afford to check it last, as that would cause no
    noticeable performance loss in the long run...

    in any case, the task is NOT complex... and even decoding MOD-REG-R/M and
    SCL-IDX-BAS isn't very much ASM code at all... it's all done in the same
    manner... test for exceptions (like (ModR/M and 0C7h)=06h in 16-bit
    addressing, which means a bare disp16 instead of [BP])... if that falls
    through, then go with the rules... AND-mask the byte, and follow the
    groupings...
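
    to show how little there is to it, here is a minimal sketch of the 16-bit
    case in C (again only so it can be checked on its own)... the struct and
    function names are assumptions for illustration... the exception is tested
    first, then the plain mod rules apply...

```c
#include <assert.h>
#include <stdint.h>

/* The 2-3-3 bit split; a SIB byte (scale/index/base) splits the same way. */
struct modrm { uint8_t mod, reg, rm; };

struct modrm split_modrm(uint8_t b)
{
    struct modrm m = { (uint8_t)(b >> 6), (uint8_t)((b >> 3) & 7), (uint8_t)(b & 7) };
    return m;
}

/* Displacement bytes that follow a ModR/M byte in 16-bit addressing. */
unsigned disp_size16(uint8_t b)
{
    if ((b & 0xC7) == 0x06)   /* exception: mod=00, r/m=110 is [disp16], not [BP] */
        return 2;
    switch (b >> 6) {
    case 1:  return 1;        /* mod=01: disp8 */
    case 2:  return 2;        /* mod=10: disp16 */
    default: return 0;        /* mod=00: no displacement; mod=11: register */
    }
}

/* Base/index pair selected by r/m in 16-bit addressing (mod != 11). */
const char *rm16_name(uint8_t rm)
{
    static const char *const names[8] = {
        "BX+SI", "BX+DI", "BP+SI", "BP+DI", "SI", "DI", "BP", "BX"
    };
    assert(rm < 8);
    return names[rm];
}
```

    32-bit addressing follows the same pattern with different exceptions:
    r/m=100 means a SIB byte follows, and mod=00 with r/m=101 means a bare
    disp32.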

    i see it as a very simple task...

