No. of Questions: 18

No. of Pages: 3

## BITS PILANI - DUBAI

International Academic City, Dubai

Second Semester 2009 – 2010

Advanced Computer Organization CS C 342 (III year CS)

Comprehensive Examination (Closed Book)

**Duration: 3 hours**

23.05.10

Weightage: 40%

MAX: 120 Marks

#### Note:

1. Answer Part-A and Part-B in separate booklets.
2. Answer all questions sequentially in each section.

#### PART - A

1. Summarize how hardware and software affect performance in a tabular form. (6 M)

2. a. Consider a non-pipelined machine with 6 execution stages of lengths 50 ns, 50 ns, 60 ns, 60 ns, 50 ns, and 50 ns.
      - (i) Find the instruction latency on this machine. (3 M)
      - (ii) How much time does it take to execute 100 instructions? (3 M)

   b. Suppose we introduce pipelining on this machine. Assume that when introducing pipelining, the clock skew adds 5 ns of overhead to each execution stage.
      - (i) What is the instruction latency on the pipelined machine? (3 M)
      - (ii) How much time does it take to execute 100 instructions? (3 M)

   c. What is the speedup obtained from pipelining? (2 M)

3. A system with L1 and L2 caches has a CPI of 1.2 with no cache miss. There are 1.1 memory accesses on average per instruction. What is the effective CPI with cache misses factored in? What are the effective hit rate and miss penalty overall if L1 and L2 caches are modeled as a single cache? (6 M)

   | Level | Local hit rate | Miss penalty |
   |-------|----------------|--------------|
   | L1 | 95% | 8 cycles |
   | L2 | 80% | 60 cycles |

4. Show the format of cache addressing with the label and length of each field for a byte-addressable memory with 32-bit addresses. Cache line = 16 B. Cache size = 4096 lines (64 KB). (6 M)

5. An address translation process converts a 32-bit virtual address to a 32-bit physical address. Memory is byte-addressable with 4 KB pages. A 16-entry, direct-mapped TLB is used. Specify the components of the virtual and physical addresses and the width of the various TLB fields.
(6 M)

6. What is the average time to read or write a 512-byte sector for a typical disk rotating at 15,000 RPM? The advertised average seek time is 8 ms and the transfer rate is 25 MB/sec. Assume that there is no overhead. (4 M)

7. Illustrate with a diagram the asynchronous handshaking protocol to read a word from memory. (6 M)

8. An industrial control application spent 90% of its time on CPU operations when it was originally developed in the early 1980s. Since then, the CPU component has been upgraded every 5 years, but the I/O components have remained the same. Assuming that CPU performance improved tenfold with each upgrade, derive the fraction of time spent on I/O over the life of the system. (8 M)

9. Draw the block diagram to represent the classic organization of a shared memory multiprocessor. (4 M)

#### PART - B

10. Write the decimal representation of the following MIPS instruction and also label the individual fields of the decimal representation. (2 + 2 = 4 M)

    add \$t0, \$s1, \$s2

11. Which instruction is used to multiply two operands in MIPS? Using this instruction, write a single MIPS assembly-level statement to multiply \$s0 by 16 and store the product in \$t2. (2 + 2 = 4 M)

12. Given that f, g, h, i and j are variables corresponding to the five registers \$s0 through \$s4, what is the compiled MIPS code for the following C if statement? (6 M)

    if (i == j) f = g + h; else f = g - h;

13. Complete the following table to show the combination of operands and results that indicate overflow conditions for addition and subtraction. (6 M)

    | Operation | Operand A | Operand B | Result indicating overflow |
    |-----------|-----------|-----------|----------------------------|
    | A + B | | | |
    | A + B | | | |
    | A - B | | | |
    | A - B | | | |

14. Using 4-bit numbers, multiply $2_{\text{ten}} \times 5_{\text{ten}}$. Show the following details of all the steps in the form of a table.

    | Iteration No. | Step | Multiplier | Multiplicand | Product |
    |---------------|------|------------|--------------|---------|
    | | | | | |
    | | | | | |

15. What decimal number is represented by the following single precision floating point number? (6 M)

16. Computer C's performance is 4 times better than the performance of Computer B, which runs a given application in 28 secs. How long will Computer C take to run that application?

17. Consider an implementation of the MIPS ISA with a 500 MHz clock in which each ALU instruction takes 3 clock cycles, each branch/jump instruction takes 2 clock cycles, each sw instruction takes 4 clock cycles, and each lw instruction takes 5 clock cycles. The program executes 200 million ALU instructions, 55 million branch/jump instructions, 25 million sw instructions and 20 million lw instructions. Find the CPU time and also the CPI.

18. Complete the following table, showing the effect of the seven control signals in a MIPS datapath. (10 M)

    | Signal Name | Effect when deasserted | Effect when asserted |
    |-------------|------------------------|----------------------|
    | RegDst | | |
    | RegWrite | | |
    | ALUSrc | | |
    | PCSrc | | |
    | MemRead | | |
    | MemWrite | | |
    | MemtoReg | | |

\*\*\*\*\*\*\*\*\*\*

## BITS PILANI - DUBAI

International Academic City, Dubai

Second Semester 2009 – 2010

Advanced Computer Organization CS C 342 (III year CS)

Test - II (Open Book)

Duration: 50 minutes

Weightage: 20%

MAX: 60 Marks

18.04.10

Note: Only Prescribed Text Book and Handwritten class notes are allowed.

#### PART - A

1. A set associative mapping cache has a set size of 4. The cache capacity is 2K words and that of main storage is 128K × 32. Derive all pertinent information required to design the cache memory. Determine the average memory access time for a cache hit ratio of 0.85, cache access time of 100 nsec and main storage access time of 500 nsec. Assume there are eight words in a page. (10 marks)

2. 
A hierarchical cache-main memory subsystem has the following specifications:

   - i. Cache access time of 50 nsec
   - ii. Main storage access time of 500 nsec
   - iii. 80% of memory requests are for read
   - iv. Hit ratio of 0.9 for read access, and the write-through scheme is employed.

   Estimate:

   - a. Average access time of the system considering only memory read cycles.
   - b. Average access time of the system for both read and write requests.
   - c. The hit ratio taking into consideration the write cycle.

   (10 marks)

3. Assume a cache consisting of eight one-word blocks that is direct mapped. Find the number of misses for this cache organization given the following sequence of block addresses: 2, 4, 0, 3, 2, 8, 3, 4, 7 (10 marks)

4. Complete the following control signal table based on each type of instruction: (12 marks)

   | Instruction | RegDst | RegWrite | ALUSrc | 4-bit ALU control input | MemWrite | MemRead | MemtoReg |
   |-------------|--------|----------|--------|-------------------------|----------|---------|----------|
   | And | | | | | | | |
   | Or | | | | | | | |
   | Slt | | | | | | | |
   | Lw | | | | | | | |
   | Sw | | | | | | | |
   | Beq | | | | | | | |

5. Assume the operational times for the various functional units of an instruction execution are as given below:

   | Instruction fetch | Reg Read | ALU operation | Data Access | RegWrite |
   |-------------------|----------|---------------|-------------|----------|
   | 400 ps | 200 ps | 400 ps | 400 ps | 200 ps |

   - a. What is the total time needed to execute each of the instructions lw, sw, R-type and beq? (2 x 4 = 8 marks)
   - b. If there are three consecutive **lw** instructions, what is the time between the first and the fourth instructions in a non-pipelined execution and in a pipelined execution? (4 marks)

6. 
Using a drawing, show the forwarding paths needed to execute the following instructions in a pipelined architecture: (6 marks)

   - add \$3, \$4, \$16
   - sub \$5, \$3, \$2
   - lw \$7, 100(\$5)
   - add \$8, \$7, \$2

## BITS PILANI - DUBAI

International Academic City, Dubai

Second Semester 2009 – 2010

Advanced Computer Organization CS C 342 (III year CS)

Test – 1 (Closed Book)

Duration: 50 minutes

Weightage:

7.03.10

MAX: 75 Marks

No. of pages:

#### PART - A

1. Specify two parameters which are affected by an algorithm in terms of program performance. (5M)

2. Following is the detail of two pseudo instructions and what each accomplishes. For each pseudo instruction, write either a single instruction or the minimal sequence of actual MIPS instructions to accomplish the same task. (10M)

   - a. move \$t1, \$t2 (\$t1 = \$t2)
   - b. clear \$t0 (\$t0 = 0)

3. Write the MIPS assembly code for the following C statements. Assume a is associated with \$s1, b with \$s2 and c with \$s3. (10M)

   if (a > b) c = a; else c = b;

4. List only the names of the different addressing modes of MIPS. (6M)

5. List all unconditional jump instructions in MIPS with an example for each. (6M)

#### PART - B

6. What decimal number is represented by this single precision float? (7M)

   1 10000001 101000000000000000000000000

7. Suppose registers R1, R2 and R3 have the following signed binary numbers: (9M)

   R1 = 0000 0000 0000 0000 0000 0000 0000 …

   slt r6, r2, r1

8. Multiply the 4-bit numbers 7 * 2. Show all the steps in detail. (12M)

9. Add $2.85_{\text{ten}} \times 10^3$ to $9.84_{\text{ten}} \times 10^4$, assuming that you have only three significand digits with guard and round digits. (10M)

\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*

No. of Questions: 5

## BITS, Pilani - Dubai

International Academic City, Dubai

II - Semester 2009-2010

- Course Number: CS C342
- Course Name: Advanced Computer Organisation
- Nature of Component: Quiz II – Closed Book (Version - B)
- Weightage: 7 %
- Max. Marks: 21 Marks
- Duration: 20 minutes
- Date of Examination: 10.05.2010

Name: ________ ID No: ________ Sec No: ________

1. Consider a virtual memory system with the following: 40-bit virtual address, 16KB pages, 36-bit physical address. Answer the following with respect to this virtual memory system. (5 × 1 = 5 M)

   - a) … to the page offset in the virtual address? ____ bits
   - b) … Ans: ____
   - c) What is the total number of page table entries? Ans: ____
   - d) What is the size of a page table entry? ____ bits
   - e) What is the total size of the page table? ____ bits Ans: ____

2. Consider the following page reference string:

   1, 2, 3, 4, 2, 1, 5, 6, 5, 2, 1, 2, 3, 6

   How many page faults would occur for the LRU page replacement algorithm with one, two and three page frames of main memory?

   | No. of frames | No. of page faults |
   |---------------|--------------------|
   | 1 | |
   | 2 | |
   | 3 | |

3. What fraction of a 1 GHz CPU's time is spent polling the following devices if each polling action takes 800 clock cycles? (6 marks)

   - a. Keyboard must be interrogated at least 10 times per second.
   - b. Floppy sends data 4 bytes at a time at a rate of 50 KB/s.
   - c. Hard drive sends data 4 bytes at a time at a rate of 3 MB/s.

4. A benchmark executes in 200 seconds of elapsed time, of which I/O takes 10%.
Suppose CPU time improves by 25% for the first year, 35% for the second year, 50% for the third year and 60% for the remaining years, while I/O time improves by 10% for the first year, 15% for the second year and then does not improve. How much faster will the program run at the end of five years? (5 marks)

   - a. Find the improvement in CPU performance over 5 years.
   - b. Find the improvement in elapsed time.
   - c. Find the improvement in I/O time.
   - d. Find the improvement in %I/O time.

5. A memory backplane bus is capable of sustaining a transfer rate of 1500 MB/sec and the maximum I/O rate of the bus is 11,719 I/Os per second. Find the I/O transfer. (2 marks)

No. of Questions: 11

## BITS, Pilani - Dubai

International Academic City, Dubai

II - Semester 2009-2010

- Course Number: CS C342
- Course Name: Advanced Computer Organisation
- Nature of Component: Quiz 1 - Closed Book (Version - B)
- Weightage: 8 %
- Max. Marks: 24 Marks
- Duration: 20 minutes
- Date of Examination: 29.03.2010

Section No. ________ ID No. ________

1. Give an example of a memory reference instruction. (2M)

2. Which functional unit supplies the instruction address to the instruction memory? (2M)

3. Name the methodology that allows the user to read the contents of a register and write that register in the same clock cycle. (2M)

4. In a sequential element the output depends on ________ and ________. (2M)

5. Write the formula for Arithmetic Mean. (2M)

6. Define response time and throughput. (2 M)

7. … CPI of a 1.4 GHz machine that executes 12.5 million instructions in … seconds? Express the answer in cycles/instruction. (2 + 2 M)

8. What is the relationship between clock cycle rate and clock cycle time?
9. When instructions are of different types, write the formula for calculating the CPU clock cycles.

10. Assume the following measurements of execution time were taken. Which of the following statements is true?

    | Program | Computer A | Computer B |
    |---------|------------|------------|
    | 1 | 16 sec | 32 sec |
    | 2 | 40 sec | 16 sec |

    - a. A is faster than B for program 1.
    - b. A is faster than B for program 2.

11. What is the CPI of a program execution that consists of the following instruction types and their corresponding CPI? (2 M)

    | Instruction Class | CPI | Percentage |
    |-------------------|-----|------------|
    | A | 1.4 | 25% |
    | B | 2.4 | 70% |
    | C | 2 | 5% |
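
The last question asks for an overall CPI from a per-class table. One common way to approach it is a weighted sum of each class CPI by its share of the instruction mix. The sketch below is a minimal illustration, not an official solution: the function name `weighted_cpi` is hypothetical, and it assumes the "Percentage" column gives each class's fraction of the executed instruction count.

```python
def weighted_cpi(classes):
    """Return the overall CPI as the sum of (class CPI x instruction fraction).

    `classes` is a list of (cpi, fraction) pairs; the fractions are assumed
    to describe the dynamic instruction mix and must sum to 1.
    """
    total_fraction = sum(fraction for _, fraction in classes)
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError("instruction fractions must sum to 1")
    return sum(cpi * fraction for cpi, fraction in classes)


# (CPI, fraction of instructions) for classes A, B and C from the table above
mix = [(1.4, 0.25), (2.4, 0.70), (2.0, 0.05)]
overall_cpi = weighted_cpi(mix)
print(round(overall_cpi, 2))  # prints 2.13
```

The same function can be reused for any instruction mix, as long as the fractions are expressed as decimals rather than percentages.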