Computer Systems Architecture
Computer architecture is a vibrant and ever-changing field. Through this assignment we aim to develop an understanding of the principles underlying the structure of computer hardware by illustrating the various engineering, scientific and economic tradeoffs necessary in the design and implementation of computer systems. The assignment also emphasizes the ability to work within a team and to respond to challenging requirements through cooperative activity.
We have selected the memory architecture of the x86 computer system for our assignment. x86 is a family of instruction set architectures based on the Intel 8086. It was created as a 16-bit design, became popular in its 32-bit form, and was later extended to 64 bits.
We mainly deal with the fundamental memory architecture, which includes:
- Cache Memory
This project holds an important place in our academic career. Our determination, knowledge and perseverance have helped us put our best effort into it, and our goal was to deliver an outstanding piece of work. The project has also taught us to work with, and adjust to, the people around us. We would like to thank our project coordinator, Ms. Vidhu Bhasin. She guided us throughout the development of the project and willingly shared her ideas and time with us, which helped us bring the project to completion. Her encouragement gave us a great opportunity to gain valuable experience and knowledge in dealing with a logic development project of this kind.
We are thankful to our group members and peers. While doing this project we learned many things that will help us in the future. We also faced many challenges, which we overcame through continuous effort and guidance, and we completed the project with gratitude.
Registers are temporary storage locations used to hold data, instructions, or the results of calculations. They are small memory areas located on the CPU itself and give extremely fast access to the values within them, because the CPU does not have to access a location outside of itself.
Uses of registers:
- They act as the working space of the CPU (temporary storage).
- The CPU normally does not perform arithmetic directly on memory, so registers are needed.
E.g. if you want to add 1 to a memory location, the processor will normally do this by loading the initial value from memory into a register, adding 1 to the register, and then saving the value back to memory.
- They also act as a scratchpad for the currently executing program.
They are used to hold data that must be accessed by the CPU very quickly.
- They store information about the status of the CPU and the currently executing program.
- Hold data being processed.
- Hold an instruction to be executed.
- Hold a memory address.
- Hold status codes.
Registers are also used for:
Storing values from other locations.
- The register can be loaded with values from other registers or from memory locations.
- This operation overwrites the previous value stored in the register.
How Registers Work: The Fetch-Execute Cycle
This is the sequence of steps that takes place when the CPU (Central Processing Unit) fetches an instruction from memory and executes it. It involves several registers inside the CPU, in particular the program counter. Here is a summary of the registers needed:
- The program counter is the register that holds the memory address of the current instruction being executed. When the next instruction is to be fetched, this register is incremented by the appropriate number of bytes.
- Some CPUs contain a memory address register, which holds the address of the byte being loaded. This is not necessarily an instruction byte, as several instructions have one or more operand bytes that follow the instruction. Other CPUs do not have this register; they simply increment the program counter and use it to fetch the next byte(s) from memory.
- CPUs contain general registers. The 6502 processor (used in the BBC Micro computer), for example, contains three general registers: the accumulator (A) and two index registers (X and Y).
- The CPU has a status register (also called the condition codes register) which indicates various things about the last calculation carried out. For instance, there is a zero flag (set if the last calculation produced zero), a carry flag (set if the last calculation produced a carry out, i.e. an unsigned overflow), and so on.
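The cycle described above can be sketched as a toy simulator. This is only an illustration under stated assumptions: the opcodes (LOAD, ADD_IMM, STORE, HALT), the two-byte instruction format and the memory layout are invented for the example and are not real 6502 or x86 encodings. The program carries out the "add 1 to a memory location" example using a program counter, an accumulator and two status flags.

```python
# Toy fetch-execute loop. The opcodes and the two-byte instruction format are
# invented for illustration; they are not real 6502 or x86 encodings.
HALT, LOAD, ADD_IMM, STORE = 0, 1, 2, 3

def run(memory):
    pc = 0          # program counter: address of the next instruction byte
    acc = 0         # accumulator: the only general register of this toy CPU
    zero = False    # zero flag of the status register
    carry = False   # carry flag of the status register
    while True:
        opcode = memory[pc]              # fetch the instruction byte
        if opcode == HALT:
            break
        operand = memory[pc + 1]         # fetch the operand byte that follows it
        pc += 2                          # advance past opcode and operand
        if opcode == LOAD:               # acc <- memory[operand]
            acc = memory[operand]
            zero = (acc == 0)
        elif opcode == ADD_IMM:          # acc <- acc + operand, 8-bit wraparound
            total = acc + operand
            carry = total > 0xFF
            acc = total & 0xFF
            zero = (acc == 0)
        elif opcode == STORE:            # memory[operand] <- acc
            memory[operand] = acc
        else:
            raise ValueError(f"unknown opcode {opcode}")
    return acc, zero, carry

# Program: load memory[10], add 1, store the result back to memory[10], halt.
program = [LOAD, 10, ADD_IMM, 1, STORE, 10, HALT, 0, 0, 0, 41]
result = run(program)
```

The value at address 10 passes through the accumulator on its way to and from memory, which is the load, modify, store pattern described in the list of register uses.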
A processor includes both:
- User Visible Registers (UVR): registers that are visible to programmers; they may be general purpose or have a special use.
- User Invisible Registers (UIR): registers that are used solely by the CPU and by special operating system functions; they can also be used to control the operation of the CPU.
General purpose registers:
- Can be used for any operation; hold address as well as data.
- Hold intermediate results or data values, for example, loop counters.
- Early computers had only one, the accumulator.
- Typically several dozen in current CPUs
- Numbers limited by cost versus ability to make use of greater numbers.
USER VISIBLE REGISTERS:
If registers are specialised, their use is implicit and need not be mentioned in the instruction. This makes the instruction shorter, so less memory is required.
- Data Registers: some instructions expect certain registers to hold data. For example, in MUL CX the other operand is assumed to be in AX, and the 32-bit result is placed in DX:AX.
- Address Registers: stack pointer, segment registers, and index registers.
USER INVISIBLE REGISTERS:
Every processor contains some special purpose or control registers. These are generally contained within the control unit. Several important ones are:
- Program Counter (PC)
- Instruction Register (IR)
- Memory Address Register (MAR)
- Memory Data Register (MDR)
- Status/Flag Registers
The original Intel 8086 and 8088 have fourteen 16-bit registers. Four of them (AX, BX, CX, DX) are general registers, although each may have an additional purpose; for example, only CX can be used as a counter with the LOOP instruction. The FLAGS register contains flags such as the carry flag, overflow flag and zero flag.
With the advent of the 32-bit 80386 processor, the 16-bit general-purpose registers, base registers, index registers, instruction pointer, and FLAGS register, but not the segment registers, were expanded to 32 bits. This is represented by prefixing an "E" (for Extended) to the register names in x86 assembly language.
Starting with the AMD Opteron processor, the x86 architecture was extended again, widening the 32-bit registers to 64 bits in a way similar to the earlier 16-bit to 32-bit extension (giving, for example, RFLAGS and RIP). Eight additional 64-bit general registers (R8, R9, ..., R15) were also introduced in the creation of x86-64.
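The way the wider registers contain the narrower ones can be illustrated with bit masks. The sketch below is a plain Python model, not assembly: the function name sub_registers is hypothetical, and the 8-bit names AH and AL (the high and low bytes of AX) are standard x86 names that are not discussed in the text above.

```python
def sub_registers(rax):
    """Split a 64-bit RAX value into the narrower registers that alias it."""
    rax &= 0xFFFFFFFFFFFFFFFF            # keep 64 bits
    return {
        "RAX": rax,                      # full 64-bit register (x86-64)
        "EAX": rax & 0xFFFFFFFF,         # low 32 bits (80386 and later)
        "AX": rax & 0xFFFF,              # low 16 bits (original 8086)
        "AH": (rax >> 8) & 0xFF,         # high byte of AX
        "AL": rax & 0xFF,                # low byte of AX
    }

views = sub_registers(0x1122334455667788)
for name, value in views.items():
    print(f"{name:>3} = {value:#x}")
```

Each extension kept the older register as the low-order part of the new one, which is why 16-bit and 32-bit code could keep using the same names.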
Figure: Example of microprocessor register organizations
What is Cache Memory?
Answer: Cache (pronounced "cash") memory is extremely fast memory that is built into a computer's central processing unit (CPU) or located next to it on a separate chip.
Cache Memory Principles
- Small size of fast memory
- Positioned between the processor and main memory
- Situated either on the processor chip or on a separate module
Reason for Cache Memory:
There are various reasons for using a cache in a computer; some of them are given below.
1. RAM is very slow compared with the CPU and is also far from it (connected through a bus). A small, very fast memory close to the CPU is therefore needed so that the CPU does not sit idle while it waits for data from main memory.
2. Cache memory communicates directly with the processor. It bridges the speed mismatch between processor and memory, including when the user switches from one application to another, by keeping the most recently used instructions and data of the running applications close at hand.
3. The same idea is used in software. For example, a web browser stores recently visited web pages in a cache directory, so that we can return promptly to a page without requesting it from the original server. When we press the Reload button, the browser compares the cached page with the current page on the network and updates our local version if required.
How Does Cache Work?
When the data at a given memory address needs to be read, the processor first attempts to obtain it from the cache memory. If the cache does not contain that data, the processor stalls while the data is loaded from main memory into the cache. While the required data is being loaded, it has to replace something that is already in the cache. When this happens, the cache checks whether the entry about to be replaced has been modified. If it has, the cache first saves the changes to main memory and then loads the new data. The cache thus acts like a hot list of the instructions and data needed by the CPU. When the cache is full and the CPU requests something new, the system overwrites the cache entry that has not been used for the longest span of time. In this way the high-priority information that is used continuously stays in the cache, while less frequently used information is removed after a while.
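The behaviour described in this paragraph (load on a miss, replace the entry unused for the longest time, and save modified entries back to main memory first) can be sketched in a few lines. This is a minimal software model under stated assumptions, not how hardware implements it: the class name WriteBackLRUCache, the one-value-per-entry granularity and the dictionary standing in for main memory are all invented for the illustration.

```python
from collections import OrderedDict

class WriteBackLRUCache:
    """Tiny cache model: LRU replacement with write-back of modified entries."""

    def __init__(self, capacity, main_memory):
        self.capacity = capacity
        self.memory = main_memory        # dict standing in for main memory
        self.lines = OrderedDict()       # address -> (value, modified flag)
        self.write_backs = 0

    def _load(self, address):
        if address in self.lines:
            self.lines.move_to_end(address)      # hit: mark as most recently used
            return
        if len(self.lines) >= self.capacity:
            # Miss with a full cache: evict the least recently used entry.
            victim, (value, modified) = self.lines.popitem(last=False)
            if modified:                         # save changes before replacing
                self.memory[victim] = value
                self.write_backs += 1
        self.lines[address] = (self.memory.get(address, 0), False)

    def read(self, address):
        self._load(address)
        return self.lines[address][0]

    def write(self, address, value):
        self._load(address)
        self.lines[address] = (value, True)      # mark the entry as modified

memory = {0: 10, 1: 20, 2: 30}
cache = WriteBackLRUCache(capacity=2, main_memory=memory)
cache.write(0, 99)   # address 0 is now cached and modified
cache.read(1)
cache.read(2)        # cache is full, so address 0 is evicted and written back
```

Main memory only sees the new value of address 0 at the moment the entry is evicted, which is the "save the changes, then load the new data" step in the text.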
Cache Operation Overview
- Processor requests the contents of some memory location
- The cache is checked for the requested data
- If found, the requested word is delivered to the processor
- If not found, a block of main memory is first read into the cache, then the requested word is delivered to the processor.
When a block of data is fetched into the cache to satisfy a single memory reference, it is likely that there will be future references to that same memory location or to other words in the block. This is the principle of locality of reference. Every block has a tag attached so that it can be identified.
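The effect of fetching whole blocks can be seen in a small hit/miss count. The sketch below assumes the simplest placement rule, in which each memory block can go into exactly one cache line, and the sizes BLOCK_SIZE and NUM_LINES are arbitrary values chosen for the example. The tag stored with each line records which memory block currently occupies it.

```python
BLOCK_SIZE = 4   # words per block (assumed value)
NUM_LINES = 8    # number of cache lines (assumed value)

def simulate(addresses):
    """Count hits and misses for a sequence of word addresses."""
    tags = [None] * NUM_LINES            # tag of the block held in each line
    hits = misses = 0
    for address in addresses:
        block = address // BLOCK_SIZE    # which memory block holds this word
        line = block % NUM_LINES         # the one line that block may occupy
        tag = block // NUM_LINES         # identifies the block within that line
        if tags[line] == tag:
            hits += 1                    # requested word is delivered from cache
        else:
            misses += 1                  # whole block is read into the cache
            tags[line] = tag
    return hits, misses

sequential = simulate(range(16))         # walk through 16 consecutive words
print(sequential)
```

With sequential accesses, only the first word of each block misses; the following words of the same block are already in the cache, which is locality of reference at work.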
Levels of Cache
Level 1 Cache (L1): L1 cache is the fastest level of cache. It is implemented using static RAM (SRAM) and was traditionally about 16 KB in size. A typical SRAM cell uses six transistors per bit. As long as power is supplied to the circuit, it holds its data without needing to be refreshed.
Level 2 Cache (L2): L2 cache, also known as secondary cache, uses similar control logic to L1 cache and is also implemented in SRAM. L2 caches were usually offered in two sizes, 256 KB and 512 KB. In older systems it was located on the motherboard, either soldered on or fitted in a Card Edge Low Profile (CELP) socket. The bus interface of the processor uses a special transfer protocol called burst mode.
Level 3 Cache (L3): L3 cache is a larger, higher-capacity cache added for extra performance. Often only high-end workstations and servers need an L3 cache. The idea behind cache levels is therefore one of performance optimization for the processor.
Cache Memory Organization:
Several caches are found in a modern microprocessor. They differ in size and function, and their internal organization is also characteristically different.
Instruction Cache: it is used to store instructions. It helps to reduce the cost of going to memory to fetch instructions.
Data Cache: it is a very fast buffer that stores application data. Data must be loaded from memory into the data cache before the processor can operate on it. When an element is needed, it is loaded from the cache line into a register, and the instruction that uses the value can then operate on it.
Translation Lookaside Buffer (TLB): translating a virtual page address to a valid physical address is relatively costly. The TLB stores these translated addresses so that they can be reused.
An algorithm is needed to map main memory blocks into cache lines, and a method is needed to determine which main memory block currently occupies a cache line.
Direct Mapped Cache: this organization is relatively easy to implement and to scale with the processor clock. It has a built-in replacement policy, because the cache line to be replaced is determined by the memory address (virtual or physical).
Fully Associative Cache: the fully associative cache design solves the potential problem of thrashing with a direct mapped cache, because a block may be placed in any cache line.
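Thrashing is easiest to see with two addresses that compete for the same line. The following sketch is a simplified model: NUM_LINES is an arbitrary value, each line holds a single word, and the fully associative version uses least-recently-used replacement, which is an assumption since the text does not fix a policy for it.

```python
from collections import OrderedDict

NUM_LINES = 4    # assumed cache size; one word per line for simplicity

def direct_mapped_misses(addresses):
    lines = [None] * NUM_LINES
    misses = 0
    for address in addresses:
        index = address % NUM_LINES      # the address alone selects the line
        if lines[index] != address:
            misses += 1
            lines[index] = address       # built-in replacement: overwrite the line
    return misses

def fully_associative_misses(addresses):
    lines = OrderedDict()                # any address may use any line
    misses = 0
    for address in addresses:
        if address in lines:
            lines.move_to_end(address)
        else:
            misses += 1
            if len(lines) >= NUM_LINES:
                lines.popitem(last=False)    # assumed LRU replacement
            lines[address] = True
    return misses

pattern = [0, 4, 0, 4, 0, 4, 0, 4]       # 0 and 4 map to the same line
print(direct_mapped_misses(pattern), fully_associative_misses(pattern))
```

In the direct mapped model the two addresses keep evicting each other even though three lines sit empty, while the fully associative model misses only on the first access to each address.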
Strategies for Cache Memories:
Caches use various replacement techniques. Each replacement strategy is a compromise between hit rate and latency. Some of them are:
- Belady's Algorithm: remove the information that will not be needed for the longest period of time in the future. Since this requires knowledge of future accesses, it serves mainly as a theoretical optimum.
- Least Recently Used (LRU): remove the least recently used items first.
- Most Recently Used (MRU): in contrast to LRU, remove the most recently used items first.
- Pseudo-LRU: an approximation of LRU that needs only one bit per cache item to work.
- Segmented LRU: the cache is divided into two parts, a probationary segment and a protected segment, and LRU is used for both.
- 2-Way Set Associative: used for high-speed CPU caches where even Pseudo-LRU is too slow.
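Two of these strategies can be compared on a short reference string. The sketch below is illustrative only: the function names, the reference string and the capacity of three items are all invented for the example, and Belady's algorithm is given the full list of future accesses, which a real cache never has.

```python
def belady_misses(refs, capacity):
    """Evict the item whose next use lies furthest in the future."""
    cache = set()
    misses = 0
    for i, item in enumerate(refs):
        if item in cache:
            continue
        misses += 1
        if len(cache) >= capacity:
            future = refs[i + 1:]
            def next_use(x):
                # Items never used again count as infinitely far away.
                return future.index(x) if x in future else len(future)
            cache.remove(max(cache, key=next_use))
        cache.add(item)
    return misses

def lru_misses(refs, capacity):
    """Evict the item that was used least recently."""
    cache = []                           # least recently used item first
    misses = 0
    for item in refs:
        if item in cache:
            cache.remove(item)           # will be re-appended as most recent
        else:
            misses += 1
            if len(cache) >= capacity:
                cache.pop(0)
        cache.append(item)
    return misses

refs = [1, 2, 3, 1, 2, 4, 1, 2, 3, 4]
print(belady_misses(refs, 3), lru_misses(refs, 3))
```

Belady's algorithm never incurs more misses than LRU on the same input, so it is useful as a yardstick for how close a practical policy comes to the best possible hit rate.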
RAM stands for Random Access Memory. Robert Heath Dennard of the IBM T. J. Watson Research Center invented dynamic RAM (DRAM). RAM is a volatile semiconductor memory in which any cell can be accessed directly once the row and column that intersect at that cell are known. It temporarily stores data so that the computer can keep running quickly.
The amount of RAM affects:
- The time taken to load a program, which also depends on the CPU and on the read/write speed of the hard drive.
- The speed at which applications run once started; in general, the more RAM you have, the faster your applications will run, provided they were short of memory to begin with.
RAM is the working memory of the computer. When you run a program, it is stored in RAM, and when the program finishes, it is unloaded from RAM. RAM is volatile, which means that it is cleared when the computer is shut down.
Reasons for RAM:
RAM is called random access because any storage location can be accessed directly. It is organized and controlled in a way that enables data to be stored to and retrieved from specific locations directly. It is an arrangement of cells, each of which can hold a 0 or a 1. Each cell has a unique address, found by counting across the columns and down the rows. To find the contents of a cell, the RAM controller sends the column and row address down a very thin electrical line etched into the chip.
How Does RAM Work?
RAM stores data in memory cells that are arranged in a grid, much like the cells in a spreadsheet. Data in binary form (1s and 0s) can be accessed at random and transferred to the processor for processing by the system software.
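The row and column addressing of the grid amounts to a division with remainder. In this sketch the grid width COLUMNS is an assumed value and the function names are hypothetical; real memory chips split the address bits in hardware rather than computing a division.

```python
COLUMNS = 1024   # assumed number of cells per row

def to_row_column(address):
    """Translate a linear cell address into (row, column) grid coordinates."""
    return divmod(address, COLUMNS)

def to_address(row, column):
    """Translate grid coordinates back into the linear cell address."""
    return row * COLUMNS + column

row, column = to_row_column(5000)
print(row, column, to_address(row, column))
```

Because any address maps to its row and column in one step, the time to reach a cell does not depend on where it sits in the grid, which is what "random access" refers to.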
It consists of millions of capacitors and transistors that are paired to make memory cells. Each capacitor represents one bit of data, and the transistor is used to read or change the state of the capacitor. To store a 1 the capacitor is filled with electrons (charged), and to store a 0 it is emptied of electrons (discharged).
When the processor receives an instruction to perform, the instruction contains the address of the RAM location from which data has to be read. That address is sent to the RAM controller, which organizes the request and sends it down to the appropriate address, where the transistors along the data lines open up the cells so that each capacitor value can be read. The data read is sent along the data lines to a data buffer near the processor known as the level-1 cache, and another copy may be held in the level-2 cache.
Types of RAM:
RAM is often divided into two main categories:
- Main RAM: it stores every kind of data and makes it quickly accessible to the microprocessor.
- Video RAM: it stores the data intended for the display screen, enabling images to be displayed faster.
By technology, RAM is further divided into Static RAM (SRAM) and Dynamic RAM (DRAM).
a) Static RAM (SRAM):
It is primarily used for L1 and L2 cache. It is more expensive and requires roughly four times as much space per bit as DRAM, but unlike DRAM it does not need to be refreshed, and thus SRAM has faster access.
b) Dynamic RAM (DRAM):
Each memory cell consists of a paired transistor and capacitor and requires constant refreshing. Because reading a DRAM cell discharges its contents, the data must be written back after each read. In addition, the charge in a cell leaks away over time, so every cell must be refreshed periodically, typically every few tens of milliseconds. It is the less expensive type of RAM.
Video RAM:
This type of RAM is used to store the pixel values of the graphical display, and the display controller reads continuously from this memory to refresh the display.
The memory hierarchy is arranged according to the speed at which the CPU can fetch data from each level. Access to the CPU registers is very fast, but they store only a small amount of data, on the order of 100 bytes. We therefore use cache memory, which typically starts at around 10 kilobytes; its capacity differs from level to level and can go up to 2 megabytes or more.
Even 2 megabytes is small nowadays, so we use RAM (Random Access Memory), whose size is measured in gigabytes.
The CPU searches for data from the highest (fastest) level of the hierarchy towards the lowest. It keeps the most frequently used data in registers. If the data is not in the registers, it goes to the cache memory.
The benefits of using a memory hierarchy include reduced access time, proper management of memory, and better utilization of resources.
Access time: when the necessary data is stored in registers, the CPU does not need to search any other location. If the data is not available in a register, the CPU starts searching in cache memory. This reduces the time the CPU takes to fetch data.
Management of memory: the memory hierarchy helps in the proper management of resources by assigning them priorities.
Reasons for Memory hierarchy:
The memory hierarchy would not exist if the prices of the different kinds of memory did not vary; it is formed because of the cost of memory. Registers are integrated into the CPU and sold with it. The CPU processes data so quickly that main memory cannot keep up with its requests, so we need a memory that can supply data to the CPU at close to its processing speed. These are the reasons why the memory hierarchy was designed.
Working of the Memory Hierarchy:
When the user requests a file or data, the CPU first looks for it in the registers. If it finds the required data there, it uses it; if not, it searches the cache memory. Only after looking in the cache does the CPU go to main memory and search there. The CPU therefore always requests the required data first from the memory it can access fastest.
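The benefit of searching the fastest level first can be estimated with the usual average access time calculation. The sketch below is a rough model: the function name is hypothetical, and the access times and hit rates in the example are made-up numbers chosen only to show the arithmetic, not measurements of any real system.

```python
def average_access_time(levels, memory_time):
    """Average access time for a hierarchy searched from fastest to slowest.

    levels: list of (access_time, hit_rate) pairs, one per cache level.
    memory_time: access time of main memory, which always holds the data.
    """
    total = 0.0
    reach_probability = 1.0              # chance that a request gets this far
    for access_time, hit_rate in levels:
        total += reach_probability * access_time
        reach_probability *= (1.0 - hit_rate)    # only misses go further down
    return total + reach_probability * memory_time

# Assumed figures: L1 takes 1 cycle and hits 90% of the time,
# L2 takes 10 cycles and hits 80% of the time, main memory takes 100 cycles.
with_caches = average_access_time([(1, 0.9), (10, 0.8)], memory_time=100)
without_caches = average_access_time([], memory_time=100)
print(with_caches, without_caches)
```

Under these assumed figures most requests are satisfied by the fast levels, so the average time stays close to the cache access time even though main memory is much slower.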
Frequently Asked Questions (FAQ):
- What is memory? Why is it required in a computer?
- What is the purpose of using cache memory in a computer?
- How does cache differ from RAM?
- Why is RAM necessary for a computer?
- How do RAM and cache work?
- Why does cache require replacement strategies?
- How does the CPU use RAM and cache?
- Why is a memory hierarchy necessary for a computer?
- What are registers, and how are they organized?
- What are the types of RAM, cache and registers?
Finally, we have reached the end of this venture, where a blend of hard work and deep research made the project possible. It would not have been possible without proper guidance from our faculty. The project work undertaken by our group was educational and stimulating. Working through it has been rejuvenating, and we are now able to explain the terminology of memory architecture. It was great working with fellow members on such an interesting project.
1. http://www.phiral.net/registers.htm [Accessed on 12-03-10 for introduction to registers]
2. http://www.wisegeek.com/what-is-cache-memory.htm [Accessed on 15-03-10 for introduction to Cache]
3. http://ask.yahoo.com/19990329.html [Accessed on 15-03-10 for use of cache]
6. http://fourier.eng.hmc.edu/e85/lectures/figures/Cache_diagram.gif [Accessed on 16-03-10 for working of cache]
9. http://www.compwisdom.com/topics/Cache-memory [Accessed on 18-03-10 for cache memory organization]
10. http://wiki.answers.com/Q/What_is_RAM_for&src=rss [Accessed on 14-03-10 for RAM introduction]
11. http://www.freepatentsonline.com/y2007/0204189.html [Accessed on 16-03-10 for use of RAM]
12. http://whatis.techtarget.com/definition/0,,sid9_gci523855,00.html [Accessed on 18-03-10 for RAM types]
13. http://www.answers.com/topic/memory-hierarchy [Accessed on 15-03-10 for Memory hierarchy]
14. http://en.wikipedia.org/wiki/Processor_register [Accessed on 25-03-10 for categories of registers]
15. http://richardbowles.tripod.com/durham/comparch/fetchex.htm [Accessed on 28-03-10 for FEC]