sathya final review

Post on 19-Jan-2017

  • ARRAY PROCESSOR FEATURING AN EFFECTIVE FIFO BASED DATA STREAM MANAGEMENT

    PROJECT INTERNAL GUIDE: Mrs. I. VATSALAPRIYA, M.E.

    PROJECT MEMBERS: S. SATHIYA SAINATHAN, P. SRIBALAMURUGAN

  • SYNOPSIS

    ABSTRACT

    NEED FOR PARALLEL COMPUTING

    INTRODUCTION TO PARALLEL PROCESSOR AND ITS FEATURES

    ARRAY PROCESSOR

    SYSTOLIC ARRAY PROCESSOR

    BASE PAPER ARCHITECTURE FOR MATRIX CALCULATION

    PROJECT THEME: IMAGE ROTATION AND IMAGE TRANSPOSE

    COMPARISON BETWEEN MATLAB AND ARRAY PROCESSOR

    PROPOSED ARCHITECTURE

    OUTPUT AND OTHER APPLICATIONS

    CONCLUSION

  • ABSTRACT

    In array processors, data I/O management is the key to realizing the high-speed matrix operations that image processing often requires.

    In this project, we propose an array processor that uses an effective data I/O mechanism built on external FIFOs.

    The FIFOs serve as buffers that store the initial matrix data and partially processed results. Matrix operations, including the algorithm that solves the Algebraic Path Problem (APP), can therefore be performed without any data I/O.

    In addition, we can eliminate register files from the processing elements (PEs) if we build the PE array so that the external FIFOs are controlled systematically and data is transferred from the FIFOs to the PE array and vice versa.

    This simplifies the structure of each PE and makes it possible to realize a large array processor with limited hardware resources.

    The FIFOs themselves can easily be realized using conventional discrete FIFO or memory chips.
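    The FIFO buffering idea in the abstract can be illustrated with a small software model. This is only a sketch under assumptions that are not in the slides: one FIFO per matrix row, a hypothetical `run_passes` helper, and a PE modeled as a plain function `pe_op` that holds a single value while it works. The point it shows is that partial results go back into the FIFOs between passes, so the PE needs no register file of its own.

```python
from collections import deque


def run_passes(matrix, pe_op, passes):
    """Stream every value FIFO -> PE -> FIFO, `passes` times (illustrative model)."""
    # Assumption: one external FIFO per matrix row, preloaded with the initial data.
    fifos = [deque(row) for row in matrix]
    for _ in range(passes):
        for fifo in fifos:
            # Pop each value once, process it, and push the partial result back.
            for _ in range(len(fifo)):
                value = fifo.popleft()      # FIFO -> PE
                fifo.append(pe_op(value))   # PE -> FIFO
    return [list(fifo) for fifo in fifos]


if __name__ == "__main__":
    # Three passes of a doubling PE over the 2x2 example matrix.
    print(run_passes([[1, 2], [3, 4]], lambda v: v * 2, 3))
```

    Because each pass pops and appends in the same order, the element order inside every FIFO is preserved from pass to pass.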

  • Need for Parallel Computing

    Development in every future field depends on digital computing.

    Controlling applications by means of digital circuits is simple and cost effective.

    The growing number of complex computational steps in digital processing results in performance degradation.

    To address this problem, we propose a highly efficient architectural design for parallel computing.

  • Parallel vs. Serial Computing

    Serial Computing

    Traditionally, software has been written for serial computation.

    It is run on a single computer having a single Central Processing Unit (CPU).

    A problem is broken into a discrete series of instructions.

    Instructions are executed one after another.

    Only one instruction may execute at any moment in time.

    Parallel Computing

    Parallel computing is the simultaneous use of multiple compute resources to solve a computational problem.

    It is run using multiple CPUs.

    A problem is broken into discrete parts that can be solved concurrently.

    Each part is further broken down into a series of instructions.

    Instructions from each part execute simultaneously on different CPUs.

  • Features of Parallel Computing

    Multiple data items are processed simultaneously.

    It reduces the computation time.

    The cost of the extended architecture is traded off against accuracy and speed of execution.

    Complexity is reduced.

    It offers many further advantages.

  • Array processor

    A multiprocessor composed of a set of identical central processing units.

    A processor capable of performing simultaneous computations on the elements of an array of data in some number of dimensions.

    The CPUs act synchronously (in parallel) under the control of a common unit.

    Designed exclusively for matrix calculation.

  • Systolic Array Processor

    It is the existing processor.

    A systolic array is a pipelined network of processing units called cells.

    It operates in parallel.

    Cells compute and store data independently of each other.

    Each cell consists of a data processing unit (DPU).

    The DPUs are connected to each other in a mesh-like arrangement.
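    To make the cell-and-mesh description concrete, here is a minimal software simulation of a systolic array. Matrix multiplication is chosen here only as a familiar illustration; the slides do not say which operation the systolic array performs. The function name `systolic_matmul` and the output-stationary data flow (values of `a` move right, values of `b` move down, each cell keeps its own running sum) are assumptions of this sketch.

```python
def systolic_matmul(a, b):
    """Multiply two n x n matrices on a simulated n x n mesh of cells."""
    n = len(a)
    acc = [[0] * n for _ in range(n)]    # running sum held inside each cell
    a_reg = [[0] * n for _ in range(n)]  # value each cell forwards to the right
    b_reg = [[0] * n for _ in range(n)]  # value each cell forwards downwards
    for t in range(3 * n - 2):           # enough ticks for the last cell to finish
        new_a = [[0] * n for _ in range(n)]
        new_b = [[0] * n for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if j == 0:
                    k = t - i            # row i of `a` enters delayed by i ticks
                    a_in = a[i][k] if 0 <= k < n else 0
                else:
                    a_in = a_reg[i][j - 1]
                if i == 0:
                    k = t - j            # column j of `b` enters delayed by j ticks
                    b_in = b[k][j] if 0 <= k < n else 0
                else:
                    b_in = b_reg[i - 1][j]
                acc[i][j] += a_in * b_in
                new_a[i][j] = a_in
                new_b[i][j] = b_in
        a_reg, b_reg = new_a, new_b      # all cells update on the same tick
    return acc


if __name__ == "__main__":
    print(systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

    Each cell talks only to its left and upper neighbors, which is the mesh-like connection the slide describes; the two forwarding registers per cell also hint at why register count is listed as a drawback.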

  • Block diagram of systolic array

  • Drawbacks of systolic array processor

    Expensive.

    Highly specialized for particular applications.

    Difficult to build.

    Limited memory.

    A larger number of registers is required.

  • Features of array processor

    High-speed matrix operation.

    Register files can be eliminated from the processing units; this is achieved here by means of FIFOs.

    Control and scalar-type instructions are executed in the control unit.

    Vector instructions are performed in the processing elements.

  • Base paper architecture

    A 2D array processor architecture is proposed that eliminates the ALU and external RAM, since all the calculations can be performed by rotating and shifting the matrix data.

    Consists of individual processing elements.

    Supports a simple instruction set.

    Avoids the Algebraic Path Problem.

    2D toroidal structure of our Proposed array Processor
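    The "rotating and shifting" of matrix data on a toroidal grid can be sketched as follows. The helper names `shift_right` and `shift_down` are hypothetical; the slides name the 2D toroidal structure but do not list the actual instructions, so this only illustrates the wrap-around behavior such a topology implies.

```python
def shift_right(grid):
    """Every PE hands its value to its right neighbor; the last column wraps to the first."""
    return [row[-1:] + row[:-1] for row in grid]


def shift_down(grid):
    """Every PE hands its value to the PE below; the last row wraps to the top."""
    return [list(grid[-1])] + [list(row) for row in grid[:-1]]


if __name__ == "__main__":
    grid = [[1, 2, 3],
            [4, 5, 6],
            [7, 8, 9]]
    print(shift_right(grid))
    print(shift_down(grid))
```

    On an n x n torus, n shifts in the same direction bring every value back to its starting PE, so no data ever leaves the array.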

  • Our Objective

    The aim of the project is to rotate and transpose an image, held as a matrix of image coefficients.

    To demonstrate image rotation and transpose in both Matlab and the array processor.

    To show the difference in time and registers required by the two methods.

  • Image rotation in Matlab

    This is considered the normal method of image rotation.

    More clock cycles.

    More memory required.

    More internal registers to store data.

    Time-consuming process.

  • Image rotation in Matlab

    Time taken for rotation = 0.128728 seconds.

  • Example

    Taking the 2x2 matrix below, let us calculate how much time and memory it is going to consume in both systems.

    Aim: to achieve 90° rotation and transpose using Matlab and the array processor.

    location   [0]   [1]
    [0]         1     2
    [1]         3     4

  • Matlab algorithm for image rotation

    Required variables:

    Temporary variables: s, t

    Matrix locations: a[0][0], a[0][1], a[1][0], a[1][1]

    Total required variables: 6

    Procedure for rotation:

    s = a[0][0], t = a[0][1];    (1st clock cycle)
    a[0][0] = a[1][0];           (2nd clock cycle)
    a[1][0] = a[1][1];           (3rd clock cycle)
    a[0][1] = s;                 (4th clock cycle)
    a[1][1] = t;                 (5th clock cycle)
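    The five-step procedure above can be checked with a direct transcription. Python is used here rather than Matlab purely for illustration, and the function name `rotate_2x2_sequential` and the `steps` counter are additions of this sketch; one counted step stands for one "clock cycle" in the slide's accounting.

```python
def rotate_2x2_sequential(a):
    """Rotate a 2x2 matrix 90 degrees clockwise in place, one assignment step at a time."""
    steps = 0
    s, t = a[0][0], a[0][1]   # step 1: save the top row in temporaries
    steps += 1
    a[0][0] = a[1][0]         # step 2
    steps += 1
    a[1][0] = a[1][1]         # step 3
    steps += 1
    a[0][1] = s               # step 4
    steps += 1
    a[1][1] = t               # step 5
    steps += 1
    return a, steps


if __name__ == "__main__":
    print(rotate_2x2_sequential([[1, 2], [3, 4]]))  # ([[3, 1], [4, 2]], 5)
```

    The two temporaries plus the four matrix locations account for the six variables listed on the slide.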

  • Drawbacks in Matlab rotation

    More variables are required.

    Rotating the 2x2 example takes 5 clock cycles.

    It takes 0.128 s to rotate an image.

    More memory (registers) is required.

    As a design consideration, more gates are also needed.

  • Array processor algorithm for image rotationFor that same example, Algorithm for rotation in Array processor is:

    A[0][0]
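    The array-processor listing is cut off in this transcript, so the following is not a reconstruction of it. It is only a sketch of the underlying idea, under the assumption that all four PEs latch their new value on the same clock edge. Python's tuple assignment evaluates the whole right-hand side before writing anything, which models that behavior and shows why no temporaries are needed. The function name `rotate_2x2_parallel` is hypothetical.

```python
def rotate_2x2_parallel(a):
    """Rotate a 2x2 matrix 90 degrees clockwise in one simultaneous update."""
    # All four old values are read first, then all four locations are written,
    # mimicking four PEs that exchange data in the same clock cycle.
    a[0][0], a[0][1], a[1][0], a[1][1] = a[1][0], a[0][0], a[1][1], a[0][1]
    return a


if __name__ == "__main__":
    print(rotate_2x2_parallel([[1, 2], [3, 4]]))  # [[3, 1], [4, 2]]
```

    Written as four separate sequential assignments without `s` and `t`, the same update would overwrite `a[0][0]` before it was read; the simultaneous form avoids that hazard.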

  • Advantages of array processor rotation

    It takes only one clock cycle.

    It takes 150 ns to rotate the image.

    No temporary variables are needed.

    Less memory (fewer registers).

    Fewer gates are required.

    The design is also simple.

  • Matlab algorithm for image transpose

    Required variables:

    Temporary variables: s, t

    Matrix locations: a[0][0], a[0][1], a[1][0], a[1][1]

    Total required variables: 6

    Procedure for transpose:

    s = a[1][0], t = a[1][1];    (1st clock cycle)
    a[0][0] = a[0][0];           (2nd clock cycle)
    a[1][0] = a[0][1];           (3rd clock cycle)
    a[0][1] = s;                 (4th clock cycle)
    a[1][1] = t;                 (5th clock cycle)

  • Matlab algorithm for image transpose

    Time taken for transpose = 0.082730 seconds.

  • Drawbacks in Matlab transpose

    Transposing the 2x2 example takes 5 clock cycles.

    It takes 0.0827 s to transpose an image.

    More memory (registers) is required.

    As a design consideration, more gates are also needed.

  • Array processor algorithm for image transposeFor that same example, Algorithm for transpose in Array processor is:

    A[0][0]

  • Advantages of array processor transpose

    It takes only one clock cycle.

    It takes 100 ns to transpose the image.

    No temporary variables are needed.

    Less memory (fewer registers).

    Fewer gates are required.

    The design is also simple.

  • Proposed architecture for image rotation

    Internally, the PEs and FIFOs are nothing but registers. They should have no behavior of their own, since they simply obey the coded program of the proposed system.

  • Proposed architecture for image transpose

    Internally, the PEs and FIFOs are nothing but registers. They should have no behavior of their own, since they simply obey the coded program of the proposed system.

  • Operation of proposed system

    The Rotate and Transpose commands are activated.

    Rotation and transpose are each done synchronously in a single clock cycle.

    All the processing elements are capable of reading as well as writing data.

    Read and write operations are performed synchronously (in parallel).

    The buses and FIFOs between the PEs play a major role in reducing the number of registers.
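    The command-driven, synchronous operation described above can be modeled in a few lines. Everything named here is an assumption of the sketch, not the project's actual design: the `PEArray` class, the `"ROTATE"` and `"TRANSPOSE"` command strings, and the choice of clockwise rotation. Each PE is modeled as one register, and one call to `clock` stands for one clock cycle in which every PE reads an old value and then all PEs write together.

```python
class PEArray:
    """Square grid of single-register PEs updated in lockstep (illustrative model)."""

    def __init__(self, values):
        self.n = len(values)
        self.regs = [list(row) for row in values]

    def _source(self, command, i, j):
        # Which PE's pre-clock value does PE (i, j) latch for this command?
        if command == "ROTATE":        # 90 degrees clockwise (assumed direction)
            return self.n - 1 - j, i
        if command == "TRANSPOSE":
            return j, i
        raise ValueError(f"unknown command: {command}")

    def clock(self, command):
        old = self.regs                # read phase: all PEs see the old values
        new = []
        for i in range(self.n):
            row = []
            for j in range(self.n):
                si, sj = self._source(command, i, j)
                row.append(old[si][sj])
            new.append(row)
        self.regs = new                # write phase: all PEs update together
        return self.regs


if __name__ == "__main__":
    pes = PEArray([[1, 2], [3, 4]])
    print(pes.clock("ROTATE"))     # [[3, 1], [4, 2]]
    print(pes.clock("TRANSPOSE"))  # [[3, 4], [1, 2]]
```

    Separating the read phase from the write phase is what lets the model do without temporary variables, at the cost of needing a data path from every source PE to its destination.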

  • Output of the rotated image coefficients

    Time taken for rotation = 150 ns.

  • Output of the transposed image coefficients

    Time taken for transposition = 100 ns.

  • Comparison between matlab and array processor operations
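    The comparison figure is not preserved in this transcript, but the timings quoted on the earlier slides allow a quick ratio calculation. The variable names are chosen for this sketch. Note that the Matlab figures are software run times while the array-processor figures are hardware clock times, so the ratios are indicative only.

```python
# Timings as quoted on the slides.
matlab_rotation_s = 0.128728    # Matlab rotation, seconds
matlab_transpose_s = 0.082730   # Matlab transpose, seconds
array_rotation_s = 150e-9       # array processor rotation, 150 ns
array_transpose_s = 100e-9      # array processor transpose, 100 ns

rotation_ratio = matlab_rotation_s / array_rotation_s
transpose_ratio = matlab_transpose_s / array_transpose_s

if __name__ == "__main__":
    print(f"rotation:  about {rotation_ratio:,.0f} times faster")
    print(f"transpose: about {transpose_ratio:,.0f} times faster")
```

    With these inputs the rotation ratio is roughly 858,000 and the transpose ratio roughly 827,000.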

  • OTHER APPLICATIONS OF ARRAY PROCESSOR

    Source: https://computing.llnl.gov/tutorials/parallel_comp/

  • CONCLUSION

    Image processing on the array processor is thus shown to be more efficient than the Matlab implementation it was compared with.

    In future, the number of registers used can be reduced by using more buses in the PEs.

    Reducing register usage can in turn reduce the processing time.

    Through this project we have learned one aspect of chip design.