Assembler Benchmarks

From: Randall Hyde (randyhyde_at_earthlink.net)
Date: 01/28/04


Date: Wed, 28 Jan 2004 03:51:04 GMT

Donkey (Edgar) over at the MASMForum is working on
putting together a set of benchmark files to test the speeds
and capabilities of various assemblers. I've just written a
quick little HLA program (appended later) that generates
a file with two sections - the first has 50,000 equates and
the second section has 50,000 dword data declarations.

The identifier equates are randomly generated as follows:

1. Random length from 2 to 16 characters.
2. The identifiers must begin with alpha or "_" and the
    remaining characters are alphanumeric or "_".
3. The identifier always contains at least one "_", inserted
    in a random location to avoid conflicts with reserved words.
4. The identifiers are all unique (in the equates section, of course).
5. The equates randomly choose a hex or decimal value (to test
    both numeric input types).
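
The rules above can be sketched in a few lines of Python (a rough
illustration only; the real generator is the HLA program appended to this
post). The length range of 2 to 16 follows the rand.urange( 2,16 ) call
in that program, and the names random_id and unique_ids are made up for
this sketch:

```python
import random
import string

# Same character set as the generator's idChars table: "_" + a-z + 0-9.
ID_CHARS = "_" + string.ascii_lowercase + string.digits


def random_id(rng):
    """Return one identifier following rules 1-3 (the caller handles rule 4)."""
    length = rng.randint(2, 16)              # rule 1: inclusive range
    underscore_at = rng.randrange(length)    # rule 3: where the forced "_" goes
    chars = [rng.choice(ID_CHARS) for _ in range(length)]
    chars[underscore_at] = "_"
    if underscore_at != 0:
        # rule 2: the first character is a letter or "_", never a digit
        chars[0] = rng.choice(ID_CHARS[:27])
    return "".join(chars)


def unique_ids(count, seed=0):
    """Generate `count` distinct identifiers, retrying on duplicates (rule 4)."""
    rng = random.Random(seed)
    seen = set()
    names = []
    while len(names) < count:
        name = random_id(rng)
        if name not in seen:
            seen.add(name)
            names.append(name)
    return names


if __name__ == "__main__":
    for name in unique_ids(5):
        print(name)
```

The set plays the role of the hash table in the HLA program: a name is
accepted only if it has not been seen before.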

The data declarations each reserve four bytes of storage
initialized with a value of the form:

equate[i] - (equate[i]'s value) + equate[n-i-1] - (equate[n-i-1]'s value) +
randomEquate

where equate[i] is the ith equate in the list of equates, equate[n-i-1]
is the ith equate from the end of the list, and randomEquate is a randomly
selected equate from the list. The reason for subtracting the values is to
avoid arithmetic overflow (in case the particular assembler complains).
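
As a worked example, here is the first dword line of the sample output
in this post evaluated in Python. The three equate values are copied from
the sample listing; evaluating the terms strictly left to right is an
assumption about the assembler, though the final value does not depend
on it:

```python
# Equate values copied from the sample listing in this post.
equates = {
    "_fk_73n": 1443525641,
    "wvkontt4x_wf2": 428793474,
    "fc1_p8zpf_3": 0x5B0260BC,
}

# dword _fk_73n-0560A7409h+wvkontt4x_wf2-0198EDE82h+fc1_p8zpf_3
terms = [
    +equates["_fk_73n"],
    -0x560A7409,
    +equates["wvkontt4x_wf2"],
    -0x198EDE82,
    +equates["fc1_p8zpf_3"],
]

running = 0   # running total after each term
peak = 0      # largest magnitude the running total reaches
for term in terms:
    running += term
    peak = max(peak, abs(running))

print(hex(running))       # 0x5b0260bc, the value of fc1_p8zpf_3
print(peak < 2 ** 32)     # True: no intermediate result needs more than 32 bits
```

Each equate is cancelled by its own value right away, so the running
total drops back to zero twice and the declaration ends up holding just
the value of the random equate.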

Here's some sample MASM/TASM-compatible output
(32 entries rather than 50,000!):

.386
.model flat, syscall
option noscoped
_fk_73n equ 1443525641
_14udcdksnb00 equ 0710C1E0Bh
w_ equ 2864601021
u_t393h equ 744872292
cxfra_p10bp03vs equ 0C5019000h
up6t5c3ulxtc6_ equ 04A10C20Ch
j_g4hrh8 equ 01C379435h
ahi_f_e equ 2640028760
ld7iy_3sq equ 2567429123
kq2yxn_ equ 1135678892
t8_g equ 091324C28h
_qo7ruhltxsa9 equ 0614FAB48h
pi_70kbvf3_ equ 0C895C68Ah
e__q1iqdisefv5ks equ 3969935745
s4m_ equ 403988474
tiwk3iep_urbt equ 03CE1CEC0h
r7jly_01sf_n6 equ 06C8C2A68h
_i equ 0982610FBh
fc1_p8zpf_3 equ 05B0260BCh
fw5cs0j3hc8_1 equ 3734479896
isulfwi_fyfddfc equ 09ABE0243h
wn3__o equ 486729813
lls6x7_sz equ 0C97036CBh
o_tx4rzckkbcls6 equ 04683B77Ah
n_dxahu03z0 equ 3252646726
eix25euy_hwr equ 08CFE9053h
uh3_rs0uh equ 060033CC6h
ddzrwbqzk_wbqdp equ 3170338836
_b_b9j equ 055BF196Fh
p_26_1ma0 equ 03910B276h
bg8pg6_tuzp5 equ 0762A640Ah
wvkontt4x_wf2 equ 428793474

 .data
 dword _fk_73n-0560A7409h+wvkontt4x_wf2-0198EDE82h+fc1_p8zpf_3
 dword _14udcdksnb00-1896619531+bg8pg6_tuzp5-1982489610+p_26_1ma0
 dword w_-0AABE57BDh+p_26_1ma0-957395574+bg8pg6_tuzp5
 dword u_t393h-02C65D964h+_b_b9j-1438587247+w_
 dword cxfra_p10bp03vs-3305213952+ddzrwbqzk_wbqdp-0BCF78814h+o_tx4rzckkbcls6
 dword up6t5c3ulxtc6_-1242612236+uh3_rs0uh-1610824902+kq2yxn_
 dword j_g4hrh8-473404469+eix25euy_hwr-2365493331+tiwk3iep_urbt
 dword ahi_f_e-09D5BA458h+n_dxahu03z0-0C1DF7346h+uh3_rs0uh
 dword ld7iy_3sq-09907DC03h+o_tx4rzckkbcls6-1183037306+r7jly_01sf_n6
 dword kq2yxn_-043B115ACh+lls6x7_sz-3379574475+r7jly_01sf_n6
 dword t8_g-2435992616+wn3__o-01D02E855h+tiwk3iep_urbt
 dword _qo7ruhltxsa9-1632611144+isulfwi_fyfddfc-2596143683+ddzrwbqzk_wbqdp
 dword pi_70kbvf3_-3365258890+fw5cs0j3hc8_1-0DE97A418h+u_t393h
 dword e__q1iqdisefv5ks-0ECA06981h+fc1_p8zpf_3-1526882492+r7jly_01sf_n6
 dword s4m_-018145FFAh+_i-2552631547+pi_70kbvf3_
 dword tiwk3iep_urbt-1021431488+r7jly_01sf_n6-1821125224+u_t393h
 dword r7jly_01sf_n6-1821125224+tiwk3iep_urbt-1021431488+ddzrwbqzk_wbqdp
 dword _i-2552631547+s4m_-018145FFAh+e__q1iqdisefv5ks
 dword fc1_p8zpf_3-1526882492+e__q1iqdisefv5ks-0ECA06981h+_qo7ruhltxsa9
 dword fw5cs0j3hc8_1-0DE97A418h+pi_70kbvf3_-3365258890+j_g4hrh8
 dword isulfwi_fyfddfc-2596143683+_qo7ruhltxsa9-1632611144+_i
 dword wn3__o-01D02E855h+t8_g-2435992616+_b_b9j
 dword lls6x7_sz-3379574475+kq2yxn_-043B115ACh+_i
 dword o_tx4rzckkbcls6-1183037306+ld7iy_3sq-09907DC03h+ddzrwbqzk_wbqdp
 dword n_dxahu03z0-0C1DF7346h+ahi_f_e-09D5BA458h+cxfra_p10bp03vs
 dword eix25euy_hwr-2365493331+j_g4hrh8-473404469+o_tx4rzckkbcls6
 dword uh3_rs0uh-1610824902+up6t5c3ulxtc6_-1242612236+fw5cs0j3hc8_1
 dword ddzrwbqzk_wbqdp-0BCF78814h+cxfra_p10bp03vs-3305213952+_b_b9j
 dword _b_b9j-1438587247+u_t393h-02C65D964h+eix25euy_hwr
 dword p_26_1ma0-957395574+w_-0AABE57BDh+tiwk3iep_urbt
 dword bg8pg6_tuzp5-1982489610+_14udcdksnb00-1896619531+e__q1iqdisefv5ks
 dword wvkontt4x_wf2-0198EDE82h+_fk_73n-0560A7409h+isulfwi_fyfddfc
end

So far I've produced files that MASM, TASM, and HLA can assemble.
Here are the results (with 50,000 identifiers, on a 2GHz machine):

MASM 0.15 seconds
TASM32 v5.3 with 200,000 hash table entries: 1.1 seconds
HLA v1.x 7.6 seconds
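
This post does not say how the times above were measured. One possible
way to take comparable measurements is to time the assembler process from
a small script, as in this Python sketch. The helper names are made up,
and the MASM command line in the comment is an assumption, not the command
actually used:

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run argv once and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(
        argv,
        check=True,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return time.perf_counter() - start


def best_of(argv, runs=3):
    """Return the fastest of several runs to reduce disk-cache noise."""
    return min(time_command(argv) for _ in range(runs))


if __name__ == "__main__":
    # Placeholder command so the sketch runs anywhere. Replace it with the
    # assembler under test, for example ["ml", "/c", "masmBM.asm"]
    # (assumed command line).
    print(best_of([sys.executable, "-c", "pass"]))
```

Wall-clock timing includes process start-up and file I/O, so it is only
meaningful when every assembler is measured the same way on the same
machine.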

Clearly this benchmark exercises MASM's superior memory management
routines (and skips the things that MASM does slowly; in general, MASM
is slower than TASM). Still, it's pretty amazing how well MASM
handles outrageously large source files.

I've yet to try it with FASM or NASM. I suspect I'll run into some
memory problems (this has been a problem in the past, but as Tomasz
is working on speeding up FASM for large projects, it may be time to
try again). I'm not even going to try it with RosAsm: it crashed
on a much simpler equates file in the past, so I'm assuming that any
work put into generating a RosAsm version would be a waste of time.
I also plan to try Gas before too much longer.

HLA v2.0 benchmarks will have to wait until HLA v2.0 can handle
static declarations.

Keep in mind that this benchmark does not exercise the complete facilities
of an assembler. It only tests the symbol table handling routines, which
are usually among the slowest portions of a typical assembler when working
with large projects. No doubt other benchmark files will appear in the
near future.

The nice thing about this benchmark program is that it's fairly easy to add
support for most any (normal) assembler. Generally, it only takes a single
procedure call to emit a file for a new assembler.
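
To show why one call per assembler is enough, here is a reduced Python
sketch of the same idea: all assembler-specific text is passed in as
data, and the generator itself never changes. The Syntax class and its
field names are made up for this sketch and carry only a subset of the
parameters the HLA genBench procedure takes (the decimal prefix and
suffix, which are empty for both MASM and HLA, are left out):

```python
from dataclasses import dataclass


@dataclass
class Syntax:
    """Per-assembler text fragments (a reduced set of genBench's parameters)."""
    hex_prefix: str
    hex_suffix: str
    file_start: str
    eq_mid: str
    eq_end: str
    dcl_section: str
    id_prefix: str
    dcl_start: str
    dcl_end: str
    file_end: str


MASM = Syntax(
    hex_prefix="0", hex_suffix="h",
    file_start=".386\n.model flat, syscall\noption noscoped\n",
    eq_mid=" equ ", eq_end="\n",
    dcl_section="\n .data\n", id_prefix="",
    dcl_start=" dword ", dcl_end="\n",
    file_end="end\n",
)

HLA = Syntax(
    hex_prefix="$", hex_suffix="",
    file_start="unit benchmark;\nnamespace h;\nconst\n",
    eq_mid=" := ", eq_end=";\n",
    dcl_section="end h;\nstatic\n", id_prefix="h.",
    dcl_start="dword ", dcl_end=";\n",
    file_end="end benchmark;\n",
)


def gen_bench(syn, names, values, use_hex, random_index):
    """Return the benchmark file text for one assembler syntax."""
    def num(value, as_hex):
        if as_hex:
            return f"{syn.hex_prefix}{value:08X}{syn.hex_suffix}"
        return str(value)

    out = [syn.file_start]
    for name, value, as_hex in zip(names, values, use_hex):
        out.append(f"{name}{syn.eq_mid}{num(value, as_hex)}{syn.eq_end}")
    out.append(syn.dcl_section)

    n = len(names)
    for i in range(n):
        j = n - 1 - i          # ith equate from the end of the list
        r = random_index[i]    # randomly chosen equate
        # The declarations use the opposite radix from the equates.
        out.append(
            f"{syn.dcl_start}"
            f"{syn.id_prefix}{names[i]}-{num(values[i], not use_hex[i])}"
            f"+{syn.id_prefix}{names[j]}-{num(values[j], not use_hex[j])}"
            f"+{syn.id_prefix}{names[r]}{syn.dcl_end}"
        )
    out.append(syn.file_end)
    return "".join(out)


if __name__ == "__main__":
    names = ["_fk_73n", "wvkontt4x_wf2"]
    values = [1443525641, 428793474]
    print(gen_bench(MASM, names, values, [False, False], [0, 1]))
```

Supporting another assembler then means writing one more Syntax value.
For NASM that would presumably use a "0x" hex prefix and "dd" for the
data lines, but that is untested here.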

Cheers,
Randy Hyde

// Assembler Benchmark Generator

program asmBenchGen;
#include( "stdlib.hhf" )

const
    numNames := 50000;
    hlaFN := "hlaBM.hla";
    masmFN := "masmBM.asm";
    fasmFN := "fasmBM.asm";
    nasmFN := "nasmBM.asm";
    rosAsmFN := "rosAsmBM.asm";
    gasFN := "gasBM.asm";

static
    tempName :str.strvar( 256 );
    NamesInUse :table;
    names :string[numNames];
    equateVals :dword[numNames];
    randomIndex :dword[numNames];
    outputHex :boolean[numNames];

readonly
    idChars :byte; @nostorage;
                byte "_abcdefghijklmnopqrstuvwxyz0123456789";

    procedure randomID( s:string );
    const
        strEBX :text := "(type str.strRec [ebx])";
    begin randomID;

        push( eax );
        push( ebx );
        push( ecx );
        push( edx );

        // Get pointer to string data:

        mov( s, ebx );

        // Create a random length for this identifier:

        rand.urange( 2,16 );
        mov( eax, strEBX.length );

        mov( eax, ecx ); // Save length for use in generating chars

        // All ids must have at least one underscore in them to
        // avoid conflicts with reserved words. We want to randomly
        // select the position of that underscore:

        rand.urange( 0, dec( eax ) );
        mov( eax, edx );

        // Okay, generate the string:

        while( ecx > 0 ) do

            dec( ecx );
            if( ecx = edx ) then

                mov( '_', strEBX.strData[ecx] );

            else

                rand.urange( 0, 36 );
                mov( idChars[eax], al );
                mov( al, strEBX.strData[ecx] );

            endif;

        endwhile;

        // The first character in the string must be alphabetic or "_"

        if( edx = 0 ) then

            mov( '_', strEBX.strData );

        else

            rand.urange( 0,26 );
            mov( idChars[eax], al );
            mov( al, strEBX.strData );

        endif;
        pop( edx );
        pop( ecx );
        pop( ebx );
        pop( eax );

    end randomID;

    // genBench - generates a benchmark file for a specific assembler.

    procedure genBench
    (
        filename :string;

        hexPrefix :string;
        hexSuffix :string;
        decPrefix :string;
        decSuffix :string;

        startFileTxt :string;

        startEquates :string;
        startEQLine :string;
        midEQLine :string;
        endEQLine :string;
        endEquates :string;

        startDcls :string;
        idPrefix :string;
        idSuffix :string;
        startDclLine :string;
        endDclLine :string;
        endDcls :string;

        endFileTxt :string
    );
    var
        hFile :dword;

    begin genBench;

        push( eax );
        push( ebx );
        push( ecx );
        push( edx );

        fileio.openNew( filename );
        mov( eax, hFile );
        fileio.put( eax, startFileTxt, startEquates );
        for( mov( 0, ecx ); ecx < numNames; inc( ecx )) do

            fileio.put
            (
                hFile,
                startEQLine,
                names[ecx*4],
                midEQLine
            );
            if( outputHex[ ecx ] ) then

                fileio.put
                (
                    hFile,
                    hexPrefix,
                    (type dword equateVals[ecx*4]),
                    hexSuffix
                );

            else

                fileio.put
                (
                    hFile,
                    decPrefix,
                    (type uns32 equateVals[ecx*4]),
                    decSuffix
                );

            endif;
            fileio.puts( hFile, endEQLine );

        endfor;
        fileio.put
        (
            hFile,
            endEquates,
            startDcls
        );
        for( mov( 0, ecx ); ecx < numNames; inc( ecx )) do

            // Emit a reference to the nth equate:

            fileio.put
            (
                hFile,
                startDclLine,
                idPrefix,
                names[ecx*4],
                idSuffix,
                "-"
            );
            if( outputHex[ecx] ) then

                // Note: this is backwards compared to
                // the equates!

                fileio.put
                (
                    hFile,
                    decPrefix,
                    (type uns32 equateVals[ecx*4]),
                    decSuffix
                );

            else

                fileio.put
                (
                    hFile,
                    hexPrefix,
                    (type dword equateVals[ecx*4]),
                    hexSuffix
                );

            endif;

            // Emit a reference to the (numNames-1-n)th equate:

            mov( numNames-1, edx );
            sub( ecx, edx );
            fileio.put
            (
                hFile,
                "+",
                idPrefix,
                names[edx*4],
                idSuffix,
                "-"
            );
            if( outputHex[edx] ) then

                // Note: this is backwards compared to
                // the equates!

                fileio.put
                (
                    hFile,
                    decPrefix,
                    (type uns32 equateVals[edx*4]),
                    decSuffix
                );

            else

                fileio.put
                (
                    hFile,
                    hexPrefix,
                    (type dword equateVals[edx*4]),
                    hexSuffix
                );

            endif;

            // Emit a reference to a random equate:

            mov( randomIndex[ecx*4], edx );
            fileio.put
            (
                hFile,
                "+",
                idPrefix,
                names[edx*4],
                idSuffix,
                endDclLine
            );

        endfor;
        fileio.put
        (
            hFile,
            endDcls,
            endFileTxt
        );
        fileio.close( hFile );
        pop( edx );
        pop( ecx );
        pop( ebx );
        pop( eax );

    end genBench;

begin asmBenchGen;

    // Create a hash table so we can avoid duplicate names:

    NamesInUse.create( 16384 );

    mov( 0, ecx );
    while( ecx < numNames ) do

        randomID( tempName );

        // See if this is a duplicate name:

        NamesInUse.lookup( tempName );
        if( eax = NULL ) then

            // Okay, this is a unique name. Let's use it:

            str.a_cpy( tempName );
            mov( eax, names[ ecx*4 ] );

            // Enter it into the table so we don't reuse it in the future:

            NamesInUse.getNode( eax );

            // Generate a random value for this symbol:

            rand.uniform();
            mov( eax, equateVals[ ecx*4 ] );

            // Select a random index into the array:

            rand.urange( 0, numNames-1 );
            mov( eax, randomIndex[ecx*4] );

            // Determine whether we will output the data in hex or decimal
            // format:

            and( 1, al );
            mov( al, outputHex[ecx] );

            inc( ecx );

        endif;

    endwhile;

    // No underscore output in the middle of hex numbers:

    conv.setUnderscores( false );

    // Okay, we've generated all the names, now generate the output files:

    // HLA:

    genBench
    (
        hlaFN,

        "$", // Hex prefix
        "", // Hex suffix
        "", // Decimal prefix
        "", // Decimal suffix

        "unit benchmark;" // Text at start of file
        nl,

        "namespace h;" // Start of equates
        nl
        "const" nl,

        "", // Text at start of line

        " := ", // Text between label and value

        ";" nl, // Text at end of line

        "end h;" nl, // Text at end of equates

        "static" nl, // Text at start of dcls
        "h.", // ID Prefix
        "", // ID Suffix
        "dword ", // Start of dcl line
        ";" nl, // End of dcl line
        "", // End of declarations

        "end benchmark;" // Text at end of file
        nl
    );

    // MASM:

    genBench
    (
        masmFN,

        "0", // Hex prefix
        "h", // Hex suffix
        "", // Decimal prefix
        "", // Decimal suffix

        ".386" nl
        ".model flat, syscall" nl
        "option noscoped" nl,

        "", // Start of equates

        "", // Text at start of line

        " equ ", // Text between label and value

        "" nl, // Text at end of line

        "" nl, // Text at end of equates

        " .data" nl, // Text at start of dcls
        "", // ID Prefix
        "", // ID Suffix
        " dword ", // Start of dcl line
        "" nl, // End of dcl line
        "", // End of declarations

        "end " // Text at end of file
        nl
    );

end asmBenchGen;


