

# UNIVERSITY OF MORATUWA

# FACULTY OF ENGINEERING

## **DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING**

B.Sc. Engineering 2013 Intake Semester 2 Examination

## **CS2052 COMPUTER ARCHITECTURE**

Time allowed: 2 Hours

March 2015

### **ADDITIONAL MATERIAL:** None

## **INSTRUCTIONS TO CANDIDATES:**

- 1. This paper consists of **4** questions in **5** pages.
- 2. Answer ALL questions.
- 3. Start answering each of the main questions on a new page.
- 4. The maximum attainable mark for each question is given in brackets.
- 5. This examination accounts for 60% of the module assessment.
- 6. This is a closed book examination.

**NB:** It is an offence to be in possession of unauthorised material during the examination.

- 7. Only calculators approved by the Faculty of Engineering are permitted.
- 8. Assume reasonable values for any data not given in or with the examination paper. Clearly state such assumptions made on the script.
- 9. In case of any doubt as to the interpretation of the wording of a question, make suitable assumptions and clearly state them on the script.
- 10. This paper should be answered only in English.

## **Question 1**

## [25 marks]

Figure 1 shows the block diagram of a simple microprocessor (a.k.a. nanoprocessor). Answer the following questions based on Figure 1.



Figure 1 – High-level diagram of the microprocessor.

- (i) This design separates the data access from instruction access. List an advantage and a disadvantage of such a design. [2 marks]
- (ii) Calculate the size of the Program Counter (i.e., the number of bits *n* in Figure 1). [2 marks]
- (iii) How many bits are required to select a register from the Register Bank (i.e., number of bits *k* in Figure 1)? [2 marks]

One of the instructions supported by the microprocessor can be defined as follows:

| Instruction         | ADD R1, R2 |   |   |   |
|---------------------|------------|---|---|---|
| Description         | Add registers $R_1$ and $R_2$ and store the result in $R_1$, i.e., $R_1 \leftarrow R_1 + R_2$, where $R_1, R_2 \in [0, 31]$. |   |   |   |
| *m*-bit instruction | 110        | *k* bits to indicate register $R_1$ | *k* bits to indicate register $R_2$ | x |

- (iv) What is the word length of an instruction? [3 marks]
- (v) Write the machine code for the following instruction: `ADD 0x3, 0x12`. [3 marks]
- (vi) Briefly explain how the above instruction can be implemented on the microprocessor. Clearly identify what components to activate and how to activate them. [7 marks]
- (vii) If the Program ROM can store 128 such instructions, what is the total capacity of the Program ROM in bytes? [2 marks]
- (viii) Briefly discuss two techniques that can be used to reduce the power consumption of this microprocessor. [4 marks]

## **Question 2**

## [25 marks]

- (i) Show the schematic diagram of a 2-to-4 decoder built using a demultiplexer. [3 marks]
- (ii) A 3-bit counter has the following state transitions:

 $001 \rightarrow 010 \rightarrow 100 \rightarrow 000 \rightarrow 001 \rightarrow 010 \rightarrow 100 \rightarrow 000 \rightarrow \dots$ 

Show the schematic diagram of this counter built using D Flip Flops. Your answer should include the truth table, Karnaugh Maps, and schematic diagram. [11 marks]

- (iii) Using 2's complement, calculate $91 - 61$. Note that registers are 8-bit. [3 marks]
- (iv) Represent 5.718 using Single Precision, IEEE Floating Point standard. [4 marks]

Hint: Single Precision standard has a 23-bit mantissa, 8-bit exponent, and 1-bit sign. The exponent is calculated as E = e + 127.
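The field layout described in the hint can be inspected with a short Python sketch. This is only an illustration under stated assumptions: the helper name `ieee754_fields` and the sample value 0.15625 are hypothetical choices, not part of the paper.

```python
import struct


def ieee754_fields(value: float) -> tuple[int, int, int]:
    """Return (sign, biased exponent E, mantissa) of value's single-precision encoding.

    Hypothetical helper for illustration; it relies on Python's struct module
    to produce the 32-bit IEEE 754 bit pattern.
    """
    # Pack as a big-endian 32-bit float, then reinterpret the bytes as an unsigned int.
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    sign = bits >> 31                      # 1-bit sign
    biased_exponent = (bits >> 23) & 0xFF  # 8-bit exponent field, E = e + 127
    mantissa = bits & 0x7FFFFF             # 23-bit mantissa (hidden leading 1 not stored)
    return sign, biased_exponent, mantissa


# Sample value chosen for illustration: 0.15625 = 1.01 (binary) x 2^-3, so e = -3.
sign, E, mantissa = ieee754_fields(0.15625)
print(f"{sign:01b} {E:08b} {mantissa:023b}")
```

For the sample value, the true exponent is $e = -3$, so the stored exponent field is $E = -3 + 127 = 124$.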

- (v) For a given application, 30% of the instructions require memory access. The miss rate is 3%. An instruction can be executed in 1 clock cycle. The L1 cache access time is approximately 3 clock cycles, while the L1 miss penalty is 72 clock cycles. Calculate the average memory access time.

## **Question 3**

## [25 marks]

- (i) Given an integer x, write an assembly program to calculate its cube $x^3$. For example, if x = 3, $x^3 = 27$. Provide comments for your code. [14 marks]

Assume the microprocessor has twenty 8-bit general-purpose registers labelled R1, R2, ..., R20, and that $x \in [1, 6]$. You may assume x is already stored in one of the general-purpose registers. Use only the instruction set given in the Appendix (see page 5).

- (ii) What is Indexed addressing? Explain using a suitable example. [2 marks]
- (iii) Discuss the pros and cons (i.e., advantages and disadvantages) of Polling, Interrupt-Driven, and Direct Memory Access (DMA) based Input/Output techniques. [6 marks]
- (iv) Universal Serial Bus (USB) has become the preferred standard for connecting external devices such as keyboards, mice, printers, and smartphones to desktops and laptops. What are the technical reasons that make USB preferable to some of the other interface/bus protocols? [3 marks]

## **Question 4**

Suppose that, just after graduation, you were hired by Apple Inc. in the USA. Your team is responsible for designing the next processor for the 6<sup>th</sup> generation iPad, to be released in 2018. The iPad 6 is to be the thinnest tablet ever, with high-quality graphics, fast App performance, and extended battery life. In preparation for the first project meeting, each team member is asked to think about how to implement various components of the processor.

- (i) Briefly explain what type of a design you would recommend for each of the following items. Provide at least 2 justifications for each of your selections.
  - a. Number of processing elements (e.g., single core or multi core). [3 marks]
  - b. A CISC-based design or a RISC-based design. [3 marks]
  - c. Number of pipeline stages. [3 marks]
  - d. Memory hierarchy (e.g., number of levels, cache size, and cache associativity). [4 marks]
- (ii) After several performance tests it has been identified that the I/O sub-system on the 4<sup>th</sup> generation iPad was the primary performance bottleneck. The reason was that the waiting time for I/O was 60% of the total time. Therefore, one of your team members suggested that by speeding up the I/O system alone, it would be possible to gain an overall speedup of 2 times.
  - a. How much speedup in the I/O sub-system will be required to achieve the desired overall speedup? [3 marks]

You may use the Amdahl's law given below for the calculation:

$$Speedup_{Overall} = \frac{1}{(1 - Fraction_{Enhanced}) + \frac{Fraction_{Enhanced}}{Speedup_{Enhanced}}}$$
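As a minimal sketch, the formula above can be written as a small Python function. The function name `amdahl_speedup` and the sample inputs (an enhanced fraction of 0.4 sped up 2 times) are hypothetical and chosen only for illustration.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup according to Amdahl's law.

    fraction_enhanced: share of the original execution time that benefits (0 to 1).
    speedup_enhanced:  factor by which that share is sped up (greater than 0).
    """
    unaffected = 1.0 - fraction_enhanced               # time share left unchanged
    improved = fraction_enhanced / speedup_enhanced    # time share after the enhancement
    return 1.0 / (unaffected + improved)


# Illustrative inputs only: 40% of the time is enhanced and runs 2 times faster.
print(amdahl_speedup(0.4, 2.0))
```

With these sample inputs the denominator is $0.6 + 0.2 = 0.8$, so the overall speedup is about 1.25.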

  - b. Do you agree with your team member's idea of gaining such a speedup by improving only the I/O sub-system? Briefly explain. [3 marks]
- (iii) The majority of Apple devices are built and assembled in China. What particular challenges need to be overcome if Apple is interested in building and assembling the iPad 6 in Sri Lanka? Briefly explain 3 challenges with examples. [6 marks]

## [25 marks]