for Robot Artificial Intelligence

15. CPU and Cache Memory

Cache Memory

  • In the early decades of computing, main memory (RAM) was extremely slow and incredibly expensive, and the CPUs were also slow.

  • Starting in the 1980s, the gap began to widen very quickly: microprocessor clock speeds took off, but memory access times improved far less dramatically.

  • As this gap grew, it became increasingly clear that a new type of fast memory was needed to bridge it.

  • Almost all modern CPUs make use of cache memory, which significantly improves the CPU’s performance. Cache memory (caching) was invented to bridge the performance gap between the CPU and main memory.

  • The primary goal of the cache memory system is to ensure that the next set of data and instructions the CPU will need is already loaded into cache memory by the time the CPU goes looking for it (a cache hit).

  • During program execution the CPU first searches the L1 cache for the next instruction and data to be executed. If it is not found there, it looks in the L2 cache, then continues the search in the L3 cache and finally in main memory, until the required data and instructions are fetched for execution (see the sketch after this list).

  • Therefore, from a CPU performance point of view, a higher hit rate in the L1 cache is better.

  • The CPU is amazingly small considering the immense amount of circuitry it contains. The circuits of a CPU are made of logic gates. The gates, in turn, are made of tiny components called transistors, and a modern CPU has millions and millions of these transistors in its circuitry.

  • The CPU is composed of the following components: the Memory Unit (L1 cache & registers), buses, the ALU (Arithmetic Logic Unit) and the CU (Control Unit).

  • A special heatsink with a cooling fan is installed on top of the processor chip in order to protect the CPU from excess heat.

  • Digital circuits depend heavily on transistors. Transistors are the reason that digital systems can perform logical functions using logic gates.
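
A minimal sketch of the L1 -> L2 -> L3 -> RAM lookup order described above, assuming a toy model in which every level is just a map from address to value (the level contents and addresses are invented for illustration; this is not how real cache hardware is programmed):

#include <iostream>
#include <string>
#include <unordered_map>
#include <vector>
using namespace std;

// Toy model: each cache level (and main memory) is a map from address to value.
struct Level {
    string name;
    unordered_map<int, int> data;
};

// Search L1 -> L2 -> L3 -> RAM, in the order described above, and report
// at which level the data was found (the cache hit).
int lookup(vector<Level>& hierarchy, int address)
{
    for (auto& level : hierarchy) {
        auto it = level.data.find(address);
        if (it != level.data.end()) {
            cout << "hit in " << level.name << "\n";
            return it->second;
        }
        cout << "miss in " << level.name << "\n";
    }
    return -1;   // not mapped anywhere in this toy model
}

int main()
{
    vector<Level> hierarchy = {
        {"L1",  {{0x10, 42}}},                        // only one address cached in L1
        {"L2",  {{0x20, 7}}},
        {"L3",  {}},
        {"RAM", {{0x10, 42}, {0x20, 7}, {0x30, 99}}}  // main memory holds everything
    };
    lookup(hierarchy, 0x10);   // found immediately in L1
    lookup(hierarchy, 0x30);   // misses L1, L2 and L3, found in RAM
}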

CPU internal components

  • The CPU is primarily responsible for doing all the calculations, making decisions and controlling the other parts of the computer system.

  • The CPU is a single microprocessor chip which performs all these functions through:

    • ALU - Arithmetic Logic Unit
    • CU - Control Unit
    • Memory Unit

CU

  • The Control Unit (CU) is the component of a computer’s central processing unit (CPU) that directs the operation of the processor.

  • The control unit tells the computer’s memory, arithmetic & logic unit, and input and output devices how to respond to the program’s instructions during program execution.

  • The control unit directs the operation of the other units by providing timing (clock) and control signals.

  • The control unit is typically an internal part of the CPU, with its overall role and operation unchanged since its introduction.

  • Most computer resources are managed by the control unit. It also directs the flow of data between the CPU and the other devices.

  • The control unit coordinates and controls all the components of a computer system. The control unit within the CPU initiates program execution by fetching the program instructions from main memory (RAM) into the CPU’s internal registers.

  • The control unit (CU) controls, coordinates and synchronizes the activities and operations performed by the components of the computer system.

  • The CU acts as the nerve centre (control center) of the entire computer system.

  • The CU controls system operation by routing the selected data items to the selected processing hardware at the right time.

  • The decode section within the control unit decodes the binary program instructions according to the CPU’s instruction set architecture.

  • Instruction pipelining is a technique used to optimize the CPU’s processing power. The basic idea is to split instruction processing into a series of small independent stages (Fetch, Decode, Execute, Store), each designed to perform a certain part of the instruction cycle.

  • In simpler CPUs the instruction cycle is executed sequentially, with each instruction fully processed before the next one is started. In most modern CPUs the instruction cycles are instead executed concurrently, and often in parallel, through an instruction pipeline: processing of the next instruction starts even before the previous instruction has finished, which results in improved CPU performance (see the sketch below).
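
A minimal sketch of the four-stage pipeline idea, assuming a toy model in which every instruction spends exactly one clock cycle in each of the Fetch, Decode, Execute and Store stages (the instruction count is invented for illustration):

#include <cstdio>

int main()
{
    const char* stages[] = {"Fetch", "Decode", "Execute", "Store"};
    const int numStages = 4;
    const int numInstructions = 5;

    // In a pipeline, instruction i occupies stage (cycle - i) during a given cycle,
    // so the next instruction starts before the previous one has finished.
    for (int cycle = 0; cycle < numInstructions + numStages - 1; cycle++) {
        printf("cycle %d:", cycle + 1);
        for (int i = 0; i < numInstructions; i++) {
            int s = cycle - i;                     // stage occupied by instruction i
            if (s >= 0 && s < numStages)
                printf("  I%d=%s", i + 1, stages[s]);
        }
        printf("\n");
    }
    return 0;
}

With 5 instructions and 4 stages, the pipelined schedule above finishes in 8 cycles, whereas a purely sequential CPU would need 5 x 4 = 20 cycles.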

ALU

  • The ALU is a complex digital circuit and one of many components within a computer’s CPU.

  • The ALU performs both bitwise and mathematical operations on binary numbers and is the last component to perform calculations in the processor. The ALU takes two operands and an op-code that tells it which operation to perform on the input data. After the information has been processed by the ALU, it is sent to main memory (RAM).

  • The ALU (Arithmetic Logic Unit) is a digital circuit built into the CPU which is responsible for performing the arithmetic and logical operations.

  • The ALU is a fundamental building block of the CPU and an integral part of almost all microprocessors.

  • Modern processors are multi-core and can contain a number of very powerful and complex ALUs built into a single processor chip.

  • The ALU drives the processing power of every modern CPU and GPU (Graphics Processing Unit). Each of these CPUs and GPUs can contain one or more very powerful and complex ALUs performing millions of arithmetic and logical operations per second.

  • In some computer processors, the ALU is divided into an AU and an LU: the AU performs the arithmetic operations and the LU performs the logical operations.

  • Most of the operations of a CPU are performed by one or more ALUs, which load data from input registers. A register is a small amount of high-speed memory storage available as part of the CPU.

  • The control unit tells the ALU which operation to perform on that data, and the ALU stores the result in an output register. The control unit moves the data between these registers, the ALU and main memory (RAM); a small sketch of this operand/op-code model follows.
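
A minimal sketch of the operand/op-code model described above, assuming a toy op-code set and register names invented purely for illustration (a real instruction set defines many more operations):

#include <cstdint>
#include <iostream>

// Toy op-codes (an assumption for illustration, not a real ISA).
enum class OpCode { ADD, SUB, AND, OR, XOR };

// The "ALU": takes two operands and an op-code and returns the result.
uint32_t alu(uint32_t a, uint32_t b, OpCode op)
{
    switch (op) {
        case OpCode::ADD: return a + b;   // arithmetic operations
        case OpCode::SUB: return a - b;
        case OpCode::AND: return a & b;   // bitwise/logical operations
        case OpCode::OR:  return a | b;
        case OpCode::XOR: return a ^ b;
    }
    return 0;
}

int main()
{
    uint32_t inputReg1 = 12, inputReg2 = 10;   // values loaded into input registers
    // The "control unit" selects the operation; the ALU produces the result,
    // which is then stored in an output register.
    uint32_t outputReg = alu(inputReg1, inputReg2, OpCode::ADD);
    std::cout << outputReg << "\n";            // 22
}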

14. Cache Memory

Cache Memory

  • In computer architecture, a high-speed memory (static RAM, SRAM) is placed between the CPU and main memory (dynamic RAM, DRAM). This memory is primarily used to store the data and instructions most frequently used by the CPU.

  • This intermediate memory is known as cache memory, and it plays an important role by significantly improving CPU performance.

  • The operating system loads the program, which consists of machine code instructions. During program execution, the CPU fetches the data and instructions from main memory (RAM).

  • The data and instructions are retrieved from RAM when the CPU uses them for the first time. A copy of this data or these instructions is then stored in the cache memory, which sits between the CPU and main memory (RAM).

  • Cache memory consists of SRAM (static RAM), which is significantly faster than the DRAM used for main memory. The CPU stores the most frequently needed data in the cache memory.

  • The next time the CPU needs that data or those instructions, it first looks into the cache memory. If the required data is found in the cache, it is fetched from the cache memory instead of main memory, which significantly improves CPU speed.

  • The CPU does not directly access main memory or disk memory.

  • The CPU first searches for data in the cache memory. The data is therefore copied from disk memory to RAM (main memory) and from RAM to the cache memory, where the CPU can access it.

  • In a computer system, cache memory is a high-speed intermediate memory placed between the CPU and main memory (RAM).

  • Cache memory is used to speed up the execution of computer programs by giving the CPU quick access to frequently used data.

  • These frequently used data/values belong to the program/process currently being executed by the CPU (a minimal caching sketch follows).
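
A minimal sketch of the hit/miss behaviour described above, assuming a toy "cache" built from a hash map placed in front of a slow lookup (the slowLoad function and the addresses are invented for illustration):

#include <iostream>
#include <unordered_map>
using namespace std;

// Stand-in for a slow main-memory access (invented for illustration).
int slowLoad(int address)
{
    cout << "  (slow load of address " << address << ")\n";
    return address * 2;   // pretend this is the stored value
}

unordered_map<int, int> cache;   // the "cache": address -> value

int read(int address)
{
    auto it = cache.find(address);
    if (it != cache.end()) {            // cache hit: no slow access needed
        cout << "hit  " << address << "\n";
        return it->second;
    }
    cout << "miss " << address << "\n"; // cache miss: load it and keep a copy
    int value = slowLoad(address);
    cache[address] = value;
    return value;
}

int main()
{
    read(100);   // miss: fetched from "main memory" and copied into the cache
    read(100);   // hit: served from the cache
    read(200);   // miss
    read(100);   // hit
}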

Why does a computer system need cache memory?

  • Main memory (RAM) cannot keep up with the CPU’s high processing speed, and therefore intermediate high-speed memory layers are placed between main memory (RAM) and the CPU to improve system performance.

  • These high-speed intermediate memory components are known as cache memory.

  • Cache memory is organized into different levels depending on the system architecture, and is usually provided in three levels: L1, L2 and L3.

  • Cache memory is a small-sized type of volatile computer memory that provides high-speed data access to the CPU and stores frequently used programs, applications and data. It stores and retains data only while the computer system is powered on.

  • There are three types/levels of cache memory (L1, L2 and L3), distinguished by their access speed and their location relative to the CPU.

  • Cache memory is the fastest memory in a computer. The L1 cache is the closest to the CPU and is embedded directly inside the processor chip.

  • The L2 cache is generally placed close to the processor (on the processor package or a separate chip), and the L3 cache is placed on the motherboard.

Types of cache memory

  • A computer system can have different levels and sizes of cache memory depending on the CPU architecture. The most common levels are the L1, L2 and L3 caches, where L1 is closest to the CPU (built into the CPU) and hence its access time is much faster than that of the L2 cache. The L3 cache is closest to main memory (RAM) and its access time is relatively slower compared to the L1 & L2 caches.

  • Level 1 (L1) cache:
    • It is also called the primary or internal cache. It is built directly into the processor chip and has a small capacity, from 8 KB to 128 KB.
  • Level 2 (L2) cache:
    • It is slower than the L1 cache, but its storage capacity is larger, i.e. from 64 KB to 16 MB. Current processors contain an advanced transfer cache on the processor chip, which is a type of L2 cache. The common size of this cache is from 512 KB to 8 MB.
  • Level 3 (L3) cache:
    • This cache is separate from the processor chip, on the motherboard. It exists on computers that use an L2 advanced transfer cache. It is slower than the L1 and L2 caches but faster than DRAM. A personal computer often has up to 8 MB of L3 cache.
  • Cache memory sizes are restricted due to cost and size considerations; cache memory uses SRAM, which is expensive.

  • The two main types of cache memory are:
    • Memory cache (L1, L2, L3)
    • Disk cache
  • Memory cache is a portion of high-speed static RAM (SRAM) and is effective because most programs access the same data or instructions repeatedly. By keeping as much of this information as possible in SRAM, the computer avoids accessing the slower main memory (DRAM), which makes the computer perform faster and more efficiently.

  • A disk cache is used by the operating system to store frequently accessed data and instructions. As the disk cache is a part of RAM, it is faster to access and retrieve information from the cache than to fetch file contents from the hard disk, so disk caching significantly improves system performance (a small read-through sketch follows).
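
A minimal sketch of a read-through disk cache, assuming whole file contents are cached in a map held in RAM and keyed by file name ("config.txt" is a placeholder name; error handling is omitted):

#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <unordered_map>
using namespace std;

unordered_map<string, string> diskCache;   // file name -> cached contents (kept in RAM)

// Returns the file contents, touching the disk only on the first access.
string readFile(const string& path)
{
    auto it = diskCache.find(path);
    if (it != diskCache.end())
        return it->second;              // served from the in-RAM disk cache

    ifstream in(path);                  // slow path: actually read from the disk
    stringstream buffer;
    buffer << in.rdbuf();
    diskCache[path] = buffer.str();     // keep a copy for later accesses
    return diskCache[path];
}

int main()
{
    readFile("config.txt");   // first read goes to the disk
    readFile("config.txt");   // second read is served from the cache
}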

13. Virtual Memory

Virtual Memory

  • The performance of a computer system primarily depends upon three factors:
  1. Size of the main memory (RAM)
    • 32-bit OS -> up to 4 GB
    • 64-bit OS -> more than 4 GB
  2. The CPU processing speed
  3. Type of operating system (32-bit/64-bit)
  • Out of these three factors, the maximum permissible RAM size depends upon the type of operating system (32-bit/64-bit), which defines the maximum number of addressable memory locations (see the short calculation after this list).

  • A platform is a runtime environment, provided by the CPU and a compatible operating system, on which various processes can run.

  • The choice of platform also affects the maximum addressable RAM that can be supported by the computer system.

  • Today, the most common RAM sizes generally range between 4 GB and 16 GB, depending on the operating system installed on laptops and desktop computers.

  • However, software has grown significantly in size, and it is very common to find software, especially graphics design, video editing and video game applications, whose required RAM exceeds 10 GB.

  • So how can we run programs that exceed the RAM available in the computer system?
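
A quick check of the 4 GB limit mentioned in the list above, assuming a byte-addressable memory where an n-bit address can select one of 2^n bytes:

#include <cstdint>
#include <iostream>

int main()
{
    // A 32-bit address can select 2^32 distinct byte locations.
    uint64_t addressable32 = 1ULL << 32;
    std::cout << addressable32 << " bytes = "
              << addressable32 / (1024ULL * 1024 * 1024) << " GB\n";   // 4294967296 bytes = 4 GB

    // A 64-bit address space is 2^64 bytes; operating systems and CPUs expose
    // only a part of it, but it is vastly larger than 4 GB.
    std::cout << "a 64-bit address space covers 2^64 bytes\n";
}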

Review of memory characteristics

Memory

  • The memory of a computer is a large indexed set of memory cells.
    • By “large” we could mean anywhere from 256 cells to 32 million cells.
    • By “indexed” we mean that each individual memory cell has its very own index number (aka identification number). This index number is usually called an address and can be thought of as “stamped” on the “edge” of the memory cell.
    • Memory supports these operations (a small sketch follows):
      • selecting
      • reading
      • writing
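
A minimal sketch of this model, assuming memory is simply an array of byte-sized cells indexed by their address (the size and the addresses are invented for illustration):

#include <cstdint>
#include <iostream>

const int MEMORY_SIZE = 256;       // "large": 256 cells in this toy model
uint8_t memory[MEMORY_SIZE];       // each cell holds one small binary number

// "Selecting" a cell simply means indexing it by its address.
uint8_t readCell(int address)             { return memory[address]; }
void    writeCell(int address, uint8_t v) { memory[address] = v; }

int main()
{
    writeCell(7, 42);                            // write: record a number in cell 7
    std::cout << (int)readCell(7) << "\n";       // read: prints 42
}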

Memory Cell

  • A memory cell is the smallest part of a computer memory that can be changed in a single operation.
  • A memory cell records a number written in binary; this number is called the content of the memory cell. For example, a cell might have the content 11010011 = D3.
  • A RAM (random access memory) cell can have different numbers recorded in it at different times.
  • It is like a cassette tape that can be recorded with different songs. However, when a new number is recorded in a RAM memory cell, the old number is FOREVER LOST.
  • A ROM (read only memory) cell has a number permanently recorded in it. It is like a message chiseled in stone: it cannot be lost, even if the power is turned off.
  • All the numbers recorded in a memory cell always have the same number of bits (see the quick check below).
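
A quick check of the example content above, as a small sketch using a binary literal (C++14) to confirm that 11010011 equals D3 and that overwriting a RAM cell loses the old value:

#include <iostream>

int main()
{
    // The cell content 11010011 in binary equals D3 in hexadecimal (211 in decimal).
    unsigned char cell = 0b11010011;
    std::cout << std::hex << (int)cell << "\n";   // prints d3

    // Recording a new number in a RAM cell: the old number is forever lost.
    cell = 0b00000001;
    std::cout << (int)cell << "\n";               // prints 1
}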

Back to Virtual Memory

  • The computer’s main memory (RAM) is organized into a number of addressable units, each 1 byte (8 bits) in size, and each of these memory locations has a unique memory address.

  • For a computer system, RAM will always be a limited resource due to the ever-increasing size of software and the number of programs running simultaneously on the system.

  • The operating system can handle a number of programs running simultaneously on the system; however, sometimes the program/process size exceeds the main system memory (RAM).

  • A computer system can crash if the available main memory (RAM) falls short of the memory required to run the processes being executed by the CPU.

  • Virtual memory solves this problem by letting each program behave as if it had a large amount of RAM that is not shared with any other program.

  • The operating system, such as Microsoft Windows or Apple’s macOS, creates a set of virtual addresses for each program. The operating system translates virtual addresses into physical addresses, dynamically fitting programs into RAM as it becomes available.

  • Virtual memory is an important mechanism provided by the operating system which makes use of some portion of secondary memory, such as the hard disk, which is also referred to as disk memory.

  • Virtual memory is a logical extension of the computer system’s main memory (RAM), which is also referred to as primary memory or physical memory.

  • In a situation where the combined size of the various processes (programs) being executed by the CPU exceeds the size of the available physical memory (RAM), virtual memory creates the illusion of unlimited main memory by using some part of the hard disk as virtual memory.

How does the operating system implement virtual memory?

  • The operating system’s implementation of virtual memory consists of a number of steps involving both the hardware and the software side.
    • The hardware side deals with the physical RAM, and the software side deals with process management.

  • Virtual memory is an important mechanism/feature of the operating system which allows the user to run, simultaneously, multiple programs whose combined size exceeds the RAM available to the computer system.

  • The operating system splits the RAM into equal-size partitions called page frames/pages. The typical page frame size for Intel processors is 4 KB, though it can be larger for some recent processors.
    • each page frame size = 4 KB
  • A process is an instance of a program. The operating system initiates program execution by allocating the required memory from main memory (RAM).

  • Similarly, the operating system also splits the process into equal-size partitions called blocks, in such a manner that the process block size equals the size of a RAM page frame.
    • RAM page frame size = process block size

  • The operating system also maintains a “page frame table” for each process being executed by the CPU. The page frame table is also stored in RAM.

  • The page frame table contains the details of the mapping from each process block to the corresponding page frame in RAM.

  • The operating system allocates RAM to the different processes being executed simultaneously by the CPU. The operating system need not store the process blocks in contiguous memory locations.

  • Therefore, each time, the CPU has to first look into the page frame table to find the memory location of a block for a given process, converting the virtual address into a physical RAM address.

  • A number of processes can share the RAM without conflicting with each other thanks to the per-process page frame tables.

  • The operating system creates a page frame table for each process being executed by the CPU. The page frame table is used by the MMU (Memory Management Unit) of the CPU to generate the physical RAM addresses for each process block (a sketch of this translation and of demand paging follows this list).

  • The processor (CPU) executes only one process at a time, and during this period the associated page frame table is the active page frame table. The CPU accesses memory addresses only through the page frame table that is currently active.

  • The operating system does not load all the blocks associated with a process, but only those blocks currently being used.

  • The OS allocates some portion of secondary storage (hard disk) as swap space.

  • This swap space holds all the blocks/pages belonging to a process.

  • The pages from the swap area are loaded by the OS on demand, as required during process execution by the CPU.

  • The present bit in the process page table indicates whether that block is present in RAM or not.

  • If the present bit = 1, the block is present in RAM; if the present bit = 0, the block is not present in RAM.

  • The CPU uses a page fault interrupt to trigger loading of the blocks.

  • In demand paging, only a part of the program (blocks/pages) is loaded into RAM. If the program accesses a page which is not present in RAM, the processor issues a page fault interrupt, which is a trigger for the operating system to load the page from swap space into RAM and update the present bit in the process page table from 0 to 1.

  • If there is no free page frame available in RAM for new pages to be loaded, the OS has to remove some pages from RAM.

  • The OS takes this decision based on a page replacement algorithm:
    • First In, First Out (FIFO)
    • Least Recently Used (LRU)
    • Least Frequently Used (LFU)
  • SWAP OUT: the OS removes pages from RAM which are not in use by the process and copies them back to the swap space on secondary storage (hard disk).

  • SWAP IN: the OS loads the pages required by the program from the swap space on secondary storage into RAM. The present bit in the process page table is updated from 0 to 1.

  • Present bit: indicates whether the process page/block is present in RAM.

  • Dirty bit: pages already present in RAM that have been modified by the program need to be swapped out and copied back to the swap space, so that the swap area on secondary storage holds the updated copy. The process page table also needs to be modified to track this change.

  • In other words, a process page/block modified by the program inside RAM must be written back to the swap area; the operating system keeps track of such pages with the help of the dirty bit status inside the process page table.

    • If the dirty bit = 1, the page needs to be written back to the swap area on secondary storage.
    • If the dirty bit = 0, the page has not changed and a swap out is not required.

  • Protection bit: the process page table has one additional column, the protection bit, in addition to the P (present) and D (dirty) columns. The protection bit column is used by the operating system to assign additional attributes to the process page/block, such as executable code, read only, or others.
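
A minimal sketch of the page table mechanism described in this list, assuming a 4 KB page size, a tiny per-process page table with present and dirty bits, and a simplified page fault handler (the table size, addresses and the naive frame allocation are invented for illustration; there is no replacement algorithm here):

#include <cstdint>
#include <iostream>
#include <vector>
using namespace std;

const uint32_t PAGE_SIZE = 4096;    // 4 KB page frames, as in the text

struct PageTableEntry {
    bool     present = false;       // present bit: is the block loaded into RAM?
    bool     dirty   = false;       // dirty bit: modified since it was loaded?
    uint32_t frame   = 0;           // RAM page frame number (valid only if present)
};

vector<PageTableEntry> pageTable(16);   // one entry per process block (toy size)
uint32_t nextFreeFrame = 0;             // naive frame allocation, no page replacement

// Simplified page fault handling: "swap in" the block and set its present bit.
void pageFault(uint32_t page)
{
    cout << "page fault on page " << page << ": swap in from the swap space\n";
    pageTable[page].frame   = nextFreeFrame++;
    pageTable[page].present = true;          // present bit goes from 0 to 1
}

// Translate a virtual address into a physical RAM address via the page table.
uint32_t translate(uint32_t virtualAddress, bool isWrite)
{
    uint32_t page   = virtualAddress / PAGE_SIZE;
    uint32_t offset = virtualAddress % PAGE_SIZE;

    if (!pageTable[page].present)            // present bit = 0: block not in RAM
        pageFault(page);

    if (isWrite)
        pageTable[page].dirty = true;        // must be swapped out before eviction

    return pageTable[page].frame * PAGE_SIZE + offset;
}

int main()
{
    cout << translate(0x1234, false) << "\n";   // first access to page 1: page fault
    cout << translate(0x1238, true)  << "\n";   // same page: no fault, marks it dirty
    cout << translate(0x5000, false) << "\n";   // page 5: another page fault
}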

10. GOLD trick (Mars Trickery)

#include <iostream>
using namespace std;

int A[8];   // global array: all eight elements start at 0

// Adds x to every element in the 1-based range [front_, backend_].
void add(int front_, int backend_, int x)
{
    for (; front_ <= backend_; front_++)
    {
        A[front_ - 1] += x;   // convert the 1-based position to a 0-based index
    }
}

void print()
{
    int n = sizeof(A) / sizeof(A[0]);
    for (int i = 0; i < n; i++)   // i < n, not i <= n, to avoid reading past the end
    {
        cout << A[i] << " ";
    }
    cout << "\n";
}

int main()
{
    add(3, 6, 5);
    print();      // 0 0 5 5 5 5 0 0

    add(1, 4, 10);
    print();      // 10 10 15 15 5 5 0 0

    add(5, 8, 10);
    print();      // 10 10 15 15 15 15 10 10

    return 0;
}
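
The add() above walks over every element of the range. If the "trick" in the title refers to the usual range-update optimization (an assumption on my part), each update can instead be recorded in O(1) with a difference array and the final values produced by one prefix-sum pass; a minimal sketch:

#include <iostream>
using namespace std;

int main()
{
    const int n = 8;
    int diff[n + 1] = {0};   // difference array: one extra slot for the range end

    // Record each update in O(1): +x at the start, -x just past the end (1-based range).
    auto add = [&](int front_, int backend_, int x) {
        diff[front_ - 1] += x;
        diff[backend_]   -= x;
    };

    add(3, 6, 5);
    add(1, 4, 10);
    add(5, 8, 10);

    // One prefix-sum pass turns the recorded updates into the final array.
    int A[n], running = 0;
    for (int i = 0; i < n; i++) {
        running += diff[i];
        A[i] = running;
        cout << A[i] << " ";
    }
    cout << "\n";   // 10 10 15 15 15 15 10 10, the same as the final print() above
}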

9. Binary Search

#include <iostream>
using namespace std;

// Searches the sorted array M of size n for x.
// Returns the 1-based position of x, or -1 if x is not present.
int BinarySearch(int M[], int x, int n)
{
    int left = 0;          // search the full index range [0, n - 1]
    int right = n - 1;

    while (left <= right)
    {
        int mid = left + (right - left) / 2;
        if (M[mid] == x)
            return mid + 1;       // report the 1-based position
        else if (x < M[mid])
            right = mid - 1;
        else
            left = mid + 1;
    }
    return -1;                    // x is not in the array
}

int main()
{
    int A[] = {2, 5, 5, 6, 9, 19, 26, 35, 69};

    int k = sizeof(A) / sizeof(A[0]);

    cout << BinarySearch(A, 9, k) << "\n";   // prints 5: the value 9 is the 5th element
}
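
For comparison, the same lookup can be done with the standard library; a minimal sketch using std::lower_bound, which also returns the first matching position when the array contains duplicates:

#include <algorithm>
#include <iostream>
using namespace std;

int main()
{
    int A[] = {2, 5, 5, 6, 9, 19, 26, 35, 69};
    int k = sizeof(A) / sizeof(A[0]);

    // lower_bound returns a pointer to the first element that is not less than 9.
    int* it = lower_bound(A, A + k, 9);
    if (it != A + k && *it == 9)
        cout << (it - A) + 1 << "\n";   // 1-based position, matching BinarySearch: prints 5
    else
        cout << -1 << "\n";
}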
