Question 1 |
A processor X_1 operating at 2 GHz has a standard 5-stage RISC instruction pipeline with a base CPI (cycles per instruction) of one in the absence of pipeline hazards. For a given program P that has 30% branch instructions, control hazards incur a 2-cycle stall for every branch. A new version of the processor, X_2, operating at the same clock frequency has an additional branch predictor unit (BPU) that completely eliminates stalls for correctly predicted branches. Wrong predictions bring neither savings nor additional stalls. There are no structural or data hazards for X_1 and X_2. If the BPU has a prediction accuracy of 80%, the speedup (rounded off to two decimal places) obtained by X_2 over X_1 in executing P is
0.72 | |
1.42 | |
0.64 | |
1.64 |
Question 1 Explanation:
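A minimal sketch of the CPI arithmetic, assuming the 2-cycle stall applies to every branch on X_1 and only to mispredicted branches on X_2. The variable names are illustrative and not part of the question. Both processors run at the same clock, so the speedup reduces to a CPI ratio of 1.6 / 1.12, about 1.4286 (1.42 among the listed options if truncated, 1.43 if rounded).

```python
# Question 1: CPI with and without the branch predictor (illustrative names).
branch_fraction = 0.30   # 30% of instructions in P are branches
branch_penalty = 2       # stall cycles charged to a branch that is not correctly predicted
accuracy = 0.80          # BPU prediction accuracy

# X_1 stalls on every branch.
cpi_x1 = 1 + branch_fraction * branch_penalty

# X_2 stalls only on the mispredicted share of branches.
cpi_x2 = 1 + branch_fraction * (1 - accuracy) * branch_penalty

# Same clock frequency, so execution time scales with CPI alone.
speedup = cpi_x1 / cpi_x2
print(f"CPI X1 = {cpi_x1:.2f}, CPI X2 = {cpi_x2:.2f}, speedup = {speedup:.4f}")
```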
Question 2 |
Consider a pipelined processor with 5 stages: Instruction Fetch (IF), Instruction Decode (ID), Execute (EX), Memory Access (MEM), and Write Back (WB). Each stage of the pipeline, except the EX stage, takes one cycle. Assume that the ID stage merely decodes the instruction and the register read is performed in the EX stage. The EX stage takes one cycle for an ADD instruction and two cycles for a MUL instruction. Ignore pipeline register latencies.
Consider the following sequence of 8 instructions:
ADD, MUL, ADD, MUL, ADD, MUL, ADD, MUL
Assume that every MUL instruction is data-dependent on the ADD instruction just before it and every ADD instruction (except the first ADD) is data-dependent on the MUL instruction just before it. The speedup is defined as follows.
\textit{Speedup} = \dfrac{\text{Execution time without operand forwarding}}{\text{Execution time with operand forwarding}}
The Speedup achieved in executing the given instruction sequence on the pipelined processor (rounded to 2 decimal places) is _____________
2.58 | |
6.37 | |
1.45 | |
1.87 |
Question 2 Explanation:
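One way to check the cycle counts is a small timing model of the EX stage. The sketch below rests on assumptions that are not spelled out in the question: one instruction is fetched per cycle, the register file cannot be written and read in the same cycle (so without forwarding a dependent EX starts in the cycle after the producer's WB), and with forwarding a dependent EX starts right after the producer's EX finishes. The function name `total_cycles` is hypothetical. Under these assumptions the sequence takes 30 cycles without forwarding and 16 with it, a ratio of 1.875.

```python
def total_cycles(ops, forwarding):
    """Return the cycle in which the last instruction completes WB.

    Every instruction is assumed to depend on the one just before it.
    """
    ex_latency = {"ADD": 1, "MUL": 2}
    prev_ex_end = 0  # last EX cycle of the previous instruction
    prev_wb = 0      # WB cycle of the previous instruction
    for i, op in enumerate(ops):
        # IF in cycle i+1 and ID in cycle i+2 at the earliest, so EX >= i+3.
        ex_start = i + 3
        # The EX stage is busy until the previous instruction leaves it.
        ex_start = max(ex_start, prev_ex_end + 1)
        # Without forwarding, wait until the producer has written back.
        if i > 0 and not forwarding:
            ex_start = max(ex_start, prev_wb + 1)
        prev_ex_end = ex_start + ex_latency[op] - 1
        prev_wb = prev_ex_end + 2  # MEM, then WB
    return prev_wb


ops = ["ADD", "MUL"] * 4
without_fwd = total_cycles(ops, forwarding=False)
with_fwd = total_cycles(ops, forwarding=True)
speedup = without_fwd / with_fwd
print(without_fwd, with_fwd, speedup)
```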
Question 3 |
A five-stage pipeline has stage delays of 150, 120, 150, 160 and 140 nanoseconds. The registers that are used between the pipeline stages have a delay of 5 nanoseconds each.
The total time to execute 100 independent instructions on this pipeline, assuming there are no pipeline stalls, is _______ nanoseconds.
17080 | |
16335 | |
17160 | |
16640 |
Question 3 Explanation:
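The calculation can be sketched as follows, assuming the clock period is set by the slowest stage plus one register delay and that k stages need k + n - 1 cycles for n instructions. The variable names are illustrative. This gives 104 cycles of 165 ns each, i.e. 17160 ns.

```python
# Question 3: pipeline execution time with inter-stage register delay.
stage_delays_ns = [150, 120, 150, 160, 140]
register_delay_ns = 5
n_instructions = 100

# The slowest stage plus the register latency fixes the clock period.
cycle_ns = max(stage_delays_ns) + register_delay_ns

# k cycles to fill the pipeline, then one instruction completes per cycle.
n_cycles = len(stage_delays_ns) + n_instructions - 1

total_ns = n_cycles * cycle_ns
print(cycle_ns, n_cycles, total_ns)
```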
Question 4 |
One instruction tries to write an operand before it is written by the previous instruction. This may lead to a dependency called
True dependency | |
Anti-dependency | |
Output dependency | |
Control Hazard |
Question 4 Explanation:
Question 5 |
Consider a 5-segment pipeline with a clock cycle time of 20 ns for each sub-operation. Find the approximate speed-up ratio between the pipelined and non-pipelined systems when executing 100 instructions, if on average a bubble due to a data hazard has to be introduced into the pipeline every five cycles.
5 | |
4.03 | |
4.81 | |
4.17 |
Question 5 Explanation:
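The wording of the bubble rate is open to interpretation. The sketch below takes one common reading as an assumption: 100 instructions incur 100 / 5 = 20 bubble cycles on top of the ideal k + n - 1 cycles, and the non-pipelined system spends all 5 sub-operations on each instruction. With that reading the ratio is 10000 / 2480, about 4.03. Variable names are illustrative.

```python
# Question 5: speed-up with periodic bubbles (one possible interpretation).
k = 5            # pipeline segments
cycle_ns = 20    # clock cycle time per sub-operation
n = 100          # instructions

# Non-pipelined: each instruction passes through all k sub-operations in turn.
non_pipelined_ns = n * k * cycle_ns

# Assumption: one bubble per five instructions, i.e. n // 5 extra cycles.
bubbles = n // 5
pipelined_ns = (k + n - 1 + bubbles) * cycle_ns

speedup = non_pipelined_ns / pipelined_ns
print(non_pipelined_ns, pipelined_ns, round(speedup, 2))
```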
Question 6 |
Consider a non-pipelined processor operating at 2.5 GHz. It takes 5 clock cycles to complete an instruction. You are going to make a 5-stage pipeline out of this processor. Overheads associated with pipelining force you to operate the pipelined processor at 2 GHz. In a given program, assume that 30% of the instructions are memory instructions, 60% are ALU instructions, and the rest are branch instructions. 5% of the memory instructions cause stalls of 50 clock cycles each due to cache misses, and 50% of the branch instructions cause stalls of 2 cycles each. Assume that there are no stalls associated with the execution of ALU instructions. For this program, the speedup achieved by the pipelined processor over the non-pipelined processor (rounded off to 2 decimal places) is ________.
1.25 | |
2.16 | |
1.85 | |
2.82 |
Question 6 Explanation:
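A short sketch of the average time per instruction on each processor, assuming the stalls apply only to the pipelined design and that its base CPI is 1. Variable names are illustrative. The non-pipelined processor needs 5 / 2.5 GHz = 2 ns per instruction; the pipelined one has an effective CPI of 1.85 at 2 GHz, i.e. 0.925 ns, for a speedup of about 2.16.

```python
# Question 6: pipelined vs non-pipelined average instruction time.
non_pipelined_ghz = 2.5
non_pipelined_cpi = 5
pipelined_ghz = 2.0

mem_fraction, mem_miss_rate, miss_penalty = 0.30, 0.05, 50
branch_fraction, branch_stall_rate, branch_penalty = 0.10, 0.50, 2

# Time per instruction in ns = cycles / (cycles per ns).
non_pipelined_ns = non_pipelined_cpi / non_pipelined_ghz

# Base CPI of 1 plus average stall cycles per instruction.
pipelined_cpi = (1
                 + mem_fraction * mem_miss_rate * miss_penalty
                 + branch_fraction * branch_stall_rate * branch_penalty)
pipelined_ns = pipelined_cpi / pipelined_ghz

speedup = non_pipelined_ns / pipelined_ns
print(round(pipelined_cpi, 2), round(pipelined_ns, 3), round(speedup, 2))
```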
Question 7 |
A particular parallel program computation requires 100 sec when executed on a single processor. If 40% of this computation is inherently sequential (i.e., it will not benefit from additional processors), then the theoretically best possible elapsed times of this program running with 2 and 4 processors, respectively, are:
20 sec and 10 sec | |
30 sec and 15 sec | |
50 sec and 25 sec | |
70 sec and 55 sec |
Question 7 Explanation:
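This is an Amdahl's-law calculation: the sequential 40 sec is fixed and only the remaining 60 sec is divided among the processors. A minimal sketch, with the helper name `elapsed` chosen here for illustration, gives 70 sec on 2 processors and 55 sec on 4.

```python
# Question 7: best-case elapsed time when part of the work is sequential.
total_sec = 100
sequential_fraction = 0.40


def elapsed(processors):
    """Sequential part runs as-is; the parallel part is split perfectly."""
    sequential = total_sec * sequential_fraction
    parallel = total_sec * (1 - sequential_fraction)
    return sequential + parallel / processors


t2 = elapsed(2)
t4 = elapsed(4)
print(round(t2, 2), round(t4, 2))
```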
Question 8 |
The instruction pipeline of a RISC processor has the following stages: Instruction Fetch (IF), Instruction Decode (ID), Operand Fetch (OF), Perform Operation (PO) and Writeback (WB). The IF, ID, OF and WB stages take 1 clock cycle each for every instruction. Consider a sequence of 100 instructions. In the PO stage, 40 instructions take 3 clock cycles each, 35 instructions take 2 clock cycles each, and the remaining 25 instructions take 1 clock cycle each. Assume that there are no data hazards and no control hazards.
The number of clock cycles required for completion of execution of the sequence of instructions is ______.
200 | |
315 | |
219 | |
375 |
Question 8 Explanation:
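The count can be sketched under the assumption that every extra cycle an instruction spends in PO beyond the first stalls the pipeline by one cycle, on top of the ideal k + n - 1 cycles. Variable names are illustrative. That gives 104 ideal cycles plus 40 × 2 + 35 × 1 = 115 stall cycles, for 219 in total.

```python
# Question 8: cycles for 100 instructions with a multi-cycle PO stage.
stages = 5
# Maps PO latency in cycles to the number of instructions with that latency.
po_latency_counts = {3: 40, 2: 35, 1: 25}

n = sum(po_latency_counts.values())

# Ideal pipeline: fill time plus one completion per cycle.
ideal_cycles = stages + n - 1

# Each PO cycle beyond the first holds up the instructions behind it.
stall_cycles = sum((latency - 1) * count
                   for latency, count in po_latency_counts.items())

total_cycles = ideal_cycles + stall_cycles
print(ideal_cycles, stall_cycles, total_cycles)
```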
Question 9 |
Instruction execution in a processor is divided into 5 stages: Instruction Fetch (IF), Instruction Decode (ID), Operand Fetch (OF), Execute (EX), and Write Back (WB). These stages take 5, 4, 20, 10 and 3 nanoseconds (ns) respectively. A pipelined implementation of the processor requires buffering between each pair of consecutive stages with a delay of 2 ns.
Two pipelined implementations of the processor are contemplated:
(i) a naive pipeline implementation (NP) with 5 stages and
(ii) an efficient pipeline (EP) where the OF stage is divided into stages OF1 and OF2 with execution times of 12 ns and 8 ns respectively.
The speedup (correct to two decimal places) achieved by EP over NP in executing 20 independent instructions with no hazards is ________________.
2.5 | |
1.508 | |
1.212 | |
2.016 |
Question 9 Explanation:
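Both designs can be compared with the same formula, assuming the clock period is the slowest stage plus the 2 ns buffer delay and that k stages need k + n - 1 cycles for n instructions. The helper name `pipeline_time_ns` is hypothetical. NP takes 24 × 22 = 528 ns and EP takes 25 × 14 = 350 ns, a ratio of about 1.508.

```python
# Question 9: naive pipeline (NP) vs efficient pipeline (EP).
BUFFER_NS = 2
N_INSTRUCTIONS = 20


def pipeline_time_ns(stage_delays_ns, n=N_INSTRUCTIONS, buffer_ns=BUFFER_NS):
    """Total time for n independent instructions with no hazards."""
    cycle_ns = max(stage_delays_ns) + buffer_ns
    return (len(stage_delays_ns) + n - 1) * cycle_ns


np_stages = [5, 4, 20, 10, 3]      # IF, ID, OF, EX, WB
ep_stages = [5, 4, 12, 8, 10, 3]   # IF, ID, OF1, OF2, EX, WB

np_time = pipeline_time_ns(np_stages)
ep_time = pipeline_time_ns(ep_stages)
speedup = np_time / ep_time
print(np_time, ep_time, speedup)
```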
Question 10 |
Register renaming is done in pipelined processors:
as an alternative to register allocation at compile time | |
for efficient access to function parameters and local variables | |
to handle certain kinds of hazards | |
as part of address translation |
Question 10 Explanation:
In question 13, the solution can't be found. Please look into this.
Thank you, Ankit.
We have updated it.
Q 12, it should be t1 = 3t2/4 = 2t3