(Assembly code execution and analysis)
Consider the following assembly programme, where M1, M2, M3 and M4 denote
memory addresses, and r1, r2 and r3 denote registers:
I1: LOAD r1, M1
I2: LOAD r2, M2
I3: MUL r1, r1, r2
I4: LOAD r2, M3
I5: LOAD r3, M4
I6: MUL r2, r2, r3
I7: ADD r1, r1, r2
Answer the following questions:
(a) Show the execution of the programme on the architecture described in the
appendix. Assume that the fetched and decoded instructions are stored in
an instruction window IW with unlimited capacity (and so you can store
any number of instructions in the IW). Explain where and why delay
slots appear. Assume that the processor supports out-of-order execution to
speed up the completion of the programme. Assume that there is only one
bus, and that the fetching of instructions uses this bus. So the fetching of
an instruction can conflict with a stage where another instruction accesses
memory. [15 marks]
(b) Show all the dependencies (both true and false) in the code. [10 marks]
(c) Apply register renaming to remove the false dependencies. [10 marks]
Subtotal: [35 marks]
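
The dependency classification in part (b) and the renaming in part (c) can be cross-checked mechanically. The following is a minimal Python sketch, not a model answer. It assumes that the first operand of every instruction is the destination register (so MUL r1, r1, r2 computes r1 := r1 * r2), that the LOADs introduce no memory dependencies because the programme contains no stores, and that every write receives a fresh register named p1, p2, and so on. The names PROGRAM, find_dependencies, rename_registers and the p-registers are hypothetical and are not taken from the appendix.

```python
# Each entry: (label, opcode, destination register, source registers).
# Assumption: the first operand is the destination; memory operands are ignored.
PROGRAM = [
    ("I1", "LOAD", "r1", []),
    ("I2", "LOAD", "r2", []),
    ("I3", "MUL", "r1", ["r1", "r2"]),
    ("I4", "LOAD", "r2", []),
    ("I5", "LOAD", "r3", []),
    ("I6", "MUL", "r2", ["r2", "r3"]),
    ("I7", "ADD", "r1", ["r1", "r2"]),
]


def find_dependencies(program):
    """Return (kind, earlier, later, register) tuples.

    Only the nearest dependency of each kind is reported:
    RAW (true), WAR (anti) and WAW (output).
    """
    deps = []
    last_writer = {}  # register -> label of the most recent writer
    readers = {}      # register -> labels that read it since that write
    for label, _op, dest, srcs in program:
        unique_srcs = list(dict.fromkeys(srcs))
        # True dependency: a source was produced by an earlier instruction.
        for reg in unique_srcs:
            if reg in last_writer:
                deps.append(("RAW", last_writer[reg], label, reg))
        # Anti-dependency: an earlier instruction still reads the old value.
        for reader in readers.get(dest, []):
            deps.append(("WAR", reader, label, dest))
        # Output dependency: an earlier instruction wrote the same register.
        if dest in last_writer:
            deps.append(("WAW", last_writer[dest], label, dest))
        # Record this instruction's reads, then its write.
        for reg in unique_srcs:
            readers.setdefault(reg, []).append(label)
        last_writer[dest] = label
        readers[dest] = []
    return deps


def rename_registers(program):
    """Give every write a fresh register and redirect later reads to it."""
    mapping = {}  # architectural register -> current renamed register
    renamed = []
    for index, (label, op, dest, srcs) in enumerate(program, start=1):
        new_srcs = [mapping.get(reg, reg) for reg in srcs]
        new_dest = f"p{index}"
        mapping[dest] = new_dest
        renamed.append((label, op, new_dest, new_srcs))
    return renamed


if __name__ == "__main__":
    for dep in find_dependencies(PROGRAM):
        print(dep)
    for label, op, dest, srcs in rename_registers(PROGRAM):
        print(label, op, dest, srcs)
```

Running find_dependencies on the output of rename_registers is a quick way to confirm that only true (RAW) dependencies remain once every write has its own register. Whether a WAW dependency that coincides with a RAW dependency (for example I1 to I3 on r1) should be listed separately is a matter of convention; the sketch lists it.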