The DLX architecture is a RISC, delayed-branch design. Its instruction set is almost identical to that of MIPS. One can get an overview of it by looking at the sample program, in its C source form and its compiled assembly language form.

If you are not used to reading assembly language compiled from C source files (you can generate such files on Unix systems with the -S option of cc or gcc), you will find that your reading is impeded by a considerable amount of "overhead." For example, the first few lines of our sample file

.global _exit
.global _open
.global _close
.global _read
.global _write
.global _printf
are not code but instead are "pseudoinstructions" or "directives" (you may have learned these terms in your assembly language course). Here they declare the listed symbols as global, which tells the assembler not to worry that, for example, the function printf() (the function itself, not our call to it) does not appear in this file; it will be supplied later by another file that is linked with this one.

Similarly, the first few lines in main(),

_main:
        ;; Initialize Stack Pointer
        add r14,r0,r0
        lhi r14, ((memSize-4)>>16)&0xffff
        addui r14, r14, ((memSize-4)&0xffff)
        ;; Save the old frame pointer
        sw -4(r14),r30
        ;; Save the return address
        sw -8(r14),r31
        ;; Establish new frame pointer
        add r30,r0,r14
        ;; Adjust Stack Pointer
        add r14,r14,#-40
        ;; Save Registers
        sw 0(r14),r3
        sw 4(r14),r4
        sw 8(r14),r5
are devoted to setting up the stack, and do not correspond to any of our C source code in main().

The point is that in reading such files you can just skip over this "overhead" and get right to the heart of the matter. You can do this by correlating lines in your source code with lines in the assembly language file.

For instance, the first executable line in main() in our C source file is

N = 100;
Scanning through the .s file, starting at the line
_main:
we find a reference to _N (note that C compilers usually add an underscore to variable names, except for local variables, which are simply slots in the stack without names), and we see the 100 too:
        lhi r3,(_N>>16)&0xffff
        addui r3,r3,(_N&0xffff)
        addi r5,r0,#100
        sw 0(r3),r5
So, let's start our introduction with this.

The instructions

        lhi r3,(_N>>16)&0xffff
        addui r3,r3,(_N&0xffff)
put the address of N in the register r3; the first instruction ("load high") puts the upper 16 bits of that address in r3 and the second puts the lower 16 bits there.
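
The way the two halves recombine can be checked with a small C sketch. It is only an illustration: the function names lhi_value, addui_value and load_address are made up here, and the sketch assumes that addui zero-extends its 16-bit immediate (which is what lets the compiler use the raw lower half of the address).

```c
#include <assert.h>
#include <stdint.h>

/* Model of "lhi rd,imm": put the 16-bit immediate in the upper half
   of the register and clear the lower half. */
uint32_t lhi_value(uint32_t imm16)
{
    return (imm16 & 0xffffu) << 16;
}

/* Model of "addui rd,rs,imm": add a 16-bit immediate to rs.
   Assumption: the immediate is zero-extended, not sign-extended. */
uint32_t addui_value(uint32_t rs, uint32_t imm16)
{
    return rs + (imm16 & 0xffffu);
}

/* Rebuild a full 32-bit address the way the lhi/addui pair does. */
uint32_t load_address(uint32_t addr)
{
    uint32_t r3 = lhi_value((addr >> 16) & 0xffffu);  /* upper 16 bits */
    return addui_value(r3, addr & 0xffffu);           /* lower 16 bits */
}
```

For any 32-bit value addr, load_address(addr) gives back addr itself, which is exactly what we want r3 to hold.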

The instruction addi r5,r0,#100 adds the contents of register r0 and the number 100, and then places the sum in register r5. Since r0 has the value 0 hardwired into it (it never changes), this instruction has the effect of putting 100 into r5.

The instruction sw 0(r3),r5 does a "store word" operation, i.e. a write to memory. This one says to store the contents of register r5 (in which we had put the value 100) to the memory word whose address is the sum of 0 and the contents of r3. Since we had earlier put the address of N in r3, this instruction puts 100 in N, thus completing the operation

N = 100;
in our C source file.
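
To see the whole sequence at work, here is a minimal C sketch of a register file and a small memory. It is a toy model, not a DLX simulator: the arrays reg and mem, the helper functions addi and sw, the function store_100, and the address passed in for N are all assumptions made for the illustration.

```c
#include <assert.h>
#include <stdint.h>

enum { MEM_WORDS = 16 };

uint32_t reg[32];          /* reg[0] is kept at zero by addi() below */
uint32_t mem[MEM_WORDS];   /* indexed by byte address / 4 */

/* addi rd,rs,#imm : rd = rs + imm (a write to r0 is discarded) */
void addi(int rd, int rs, int32_t imm)
{
    if (rd != 0)
        reg[rd] = reg[rs] + (uint32_t)imm;
}

/* sw off(rb),rs : store reg[rs] at byte address reg[rb] + off */
void sw(int32_t off, int rb, int rs)
{
    mem[(reg[rb] + (uint32_t)off) / 4] = reg[rs];
}

/* Carry out N = 100, given a hypothetical byte address for N. */
uint32_t store_100(uint32_t addr_of_N)
{
    reg[3] = addr_of_N;   /* what the lhi/addui pair leaves in r3 */
    addi(5, 0, 100);      /* addi r5,r0,#100 */
    sw(0, 3, 5);          /* sw 0(r3),r5 */
    return mem[addr_of_N / 4];
}
```

Calling store_100(8) leaves 100 in r5 and in the memory word at byte address 8, while r0 stays 0.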

Next we have the code

        addi r5,r0,#2
        sw -12(r30),r5
which corresponds to the I = 2 portion of the statement

   for (I = 2; I <= N; I++) Composite[I] = 0;       
Previously r30 had been set to point to the top of the "frame," i.e. main()'s portion of the stack. The local variable I has been allocated to the word 12 bytes from the top.

The

I <= N 
portion of the above C statement is handled by the next few instructions:
L6:
        lhi r3,(_N>>16)&0xffff
        addui r3,r3,(_N&0xffff)
        lw r4,-12(r30)
        lw r3,0(r3)
                ;cmpsi  r4,r3
        sgt     r1,r4,r3
        bnez    r1,L7
        nop
Here bnez ("branch if not equal to 0"), like most conditional branch instructions, is preceded by some kind of comparison instruction, in this case sgt ("set if greater than"). The latter will set r1 to 1 if r4, containing I, is greater than r3, containing N, and set r1 to 0 otherwise. The bnez instruction will then do a jump to L7 if r1 is 1, i.e. if I is greater than N.

The nop ("no-op") instruction does nothing. It is needed to fill the "delay slot" after the branch. Ignore this if you have not studied delayed branches before.
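
Note that the compiler tests the opposite of I <= N: it branches away when I > N and otherwise falls through. The C sketch below mirrors that control flow with goto. It assumes that L7 is the first instruction after the loop (the listing above does not show it), and the function name count_iterations and the counter body_runs are made up for the illustration; the delay slot is not modeled.

```c
#include <assert.h>

/* Count loop-body executions for: for (I = 2; I <= N; I++) ... */
int count_iterations(int N)
{
    int I, r1, body_runs = 0;

    I = 2;                       /* addi r5,r0,#2 ; sw -12(r30),r5 */
L6:
    r1 = (I > N) ? 1 : 0;        /* sgt r1,r4,r3 */
    if (r1 != 0)                 /* bnez r1,L7 */
        goto L7;
    body_runs++;                 /* stands in for Composite[I] = 0 */
    I++;
    goto L6;
L7:
    return body_runs;
}
```

With N = 100 the body runs for I = 2 through 100, and with N = 1 the very first test already branches to L7.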

Function calls are done using instructions such as jal ("jump and link"), in this case our call to DoSieve():

        sub r14,r14,#8
        lw r5,-12(r30)
        sw 0(r14),r5
        jal _DoSieve
        nop
        add r14,r14,#8
The register r14 is serving as the stack pointer. We make room on the stack for two words (8 bytes), store the value of I there, and then jump to the function. (The return address is saved by jal in r31.) After the function returns, the final add releases the 8 bytes again.
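
The stack discipline of this call sequence can be sketched in C as follows. Everything here is an assumption made for the illustration: the array stack_mem, the variable r14 standing for the register, and the stand-in function callee_reads_arg, which simply reads the word at 0(r14) and is not the real DoSieve().

```c
#include <assert.h>
#include <stdint.h>

enum { STACK_WORDS = 64 };

uint32_t stack_mem[STACK_WORDS];
uint32_t r14 = STACK_WORDS * 4;   /* stack pointer, as a byte offset */

/* Hypothetical callee: fetches its argument from 0(r14). */
uint32_t callee_reads_arg(void)
{
    return stack_mem[r14 / 4];
}

/* Mirror of the call sequence; returns what the callee saw. */
uint32_t call_with_arg(uint32_t I)
{
    uint32_t seen;

    r14 -= 8;                    /* sub r14,r14,#8 */
    stack_mem[r14 / 4] = I;      /* lw r5,-12(r30) ; sw 0(r14),r5 */
    seen = callee_reads_arg();   /* jal _DoSieve ; nop */
    r14 += 8;                    /* add r14,r14,#8 */
    return seen;
}
```

The point to notice is that r14 has the same value before and after the call, so the caller's view of the stack is unchanged.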