As science advances and increasingly complex physical phenomena are studied, long simulation times have become a more pressing problem than ever. Weather forecasting with meteorological models and the simulation of air flow around aircraft are typical examples. In such cases, fine grids are needed for the numerical simulation to be accurate. The available computing resources, however, may be limited, which forces the use of coarser grids and in turn reduces accuracy. One way to overcome this problem is parallel processing on recent processors, which are constantly being improved. In 2006, a new generation of graphics cards was introduced to the market that could process non-graphical data in addition to graphical data. This generation established a new method of parallel processing, known as parallel computing on graphics processors. In the present thesis, this technology is used to accelerate the simulation of incompressible flows. The governing flow equations are discretized on unstructured grids using a coupled method. To benchmark the new parallel flow solver, three test cases are considered: the square lid-driven cavity, the skewed lid-driven cavity, and a backward-facing step flow. The first two problems are solved on both rectangular and triangular grids, whereas the last is solved only on a rectangular grid. In addition, a simpler heat conduction problem is solved to validate the developed parallel codes and to investigate the performance gain obtained when the GPU is used. The results show that even a rather low-end graphics processor such as a GeForce 9800 can be as much as 48 ,56 times faster than a typical CPU.
The overall results indicate that the graphics processor can be a low-cost yet practical and effective means of reducing the time of flow simulations. The results also show that, owing to the specific architecture of these processors, the highest parallel efficiency in every case occurs at certain grid sizes. For the specific hardware used, the parallel efficiency and the speed-up of the GPU code are maximized when the grid size is a multiple of 16 in all dimensions. Even a small change in the grid size can have a significant negative effect on the parallel efficiency. The main reason is the number of threads in each GPU block: to obtain the maximum performance of a GPU, all threads in a block must be occupied, and if some are left idle, the resulting penalty can be high and lowers the overall parallel efficiency. This study shows that parallel processing on graphics processing units is a low-cost way to speed up flow simulation and appears to be a promising approach for the next generation of CFD codes.

Keywords: Incompressible flow, Parallel processing, GPU, Unstructured rectangular and triangular grids.
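
The sensitivity to grid size described above can be illustrated with a small arithmetic sketch. It is not taken from the thesis code: it assumes one thread per grid node and square thread blocks of 16 by 16 threads (suggested by the multiple-of-16 observation, but not stated explicitly), and the function name `thread_utilization` is hypothetical. It only counts the fraction of launched threads that have a grid node to work on; it does not model the measured parallel efficiency.

```python
import math


def thread_utilization(nx, ny, block=16):
    """Fraction of launched threads that map to a grid node.

    Assumptions (not from the thesis): one thread per node and
    square thread blocks of `block` x `block` threads.
    """
    # Whole blocks must be launched, so round up in each direction.
    blocks_x = math.ceil(nx / block)
    blocks_y = math.ceil(ny / block)
    launched = blocks_x * blocks_y * block * block
    return (nx * ny) / launched


if __name__ == "__main__":
    # Compare a grid that is a multiple of 16 with one that is one node larger.
    for n in (128, 129):
        print(n, thread_utilization(n, n))
```

Under these assumptions, a 128 by 128 grid fills every launched thread, whereas a 129 by 129 grid needs a ninth block in each direction, so roughly a fifth of the launched threads have no node to process.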