Implementation of a Parallel Incompressible Flow Solver on Graphics Processing Units

STUDENT

DEGREE

YEAR

With the advancement of science and the growing need to study complex physical phenomena, long simulation times have become a more pressing problem than ever. Weather forecasting from meteorological models and the simulation of air flow around aircraft are examples. Such cases call for fine grids to obtain accurate numerical results, but the available computing resources may be limited, forcing the use of coarser grids that reduce accuracy. One way to overcome this problem is parallel processing on modern processors, which are constantly being improved. In 2006 a new generation of graphics cards was introduced to the market that could process non-graphical data in addition to graphical data. This generation established a new approach to parallel processing known as parallel computing on graphics processors. In the present thesis, this technology is used to accelerate the simulation of incompressible flows. The governing flow equations are discretized on unstructured grids using a coupled method. To benchmark the new parallel flow solver, three test cases are considered: the square lid-driven cavity, the skewed lid-driven cavity, and backward-facing step flow. The first two problems are solved on both rectangular and triangular grids, whereas the last is solved only on a rectangular grid. In addition, a simpler heat conduction problem is solved to validate the developed parallel code and to investigate the performance gain when the GPU is used. The results show that even a rather low-end graphics processor such as a GeForce 9800 can run the square and skewed lid-driven cavity cases up to 48 and 56 times faster, respectively, than a typical CPU.
Overall, the results indicate that graphics processors can be a low-cost yet practical and effective hardware option for reducing flow simulation time. The results also show that, owing to the specific architecture of these processors, the highest parallel efficiency in each case occurs for particular grid sizes. For the specific hardware used, the parallel efficiency and speedup of the GPU code are maximized when the grid size is a multiple of 16 in all dimensions. Even a small change in the grid size can have a significant negative effect on parallel efficiency. The main reason is the number of threads in each GPU block: to obtain maximum performance, all threads in a block must be occupied, and if some are left idle there can be a high penalty that lowers the overall parallel efficiency. This study shows that parallel processing on graphics processing units is a low-cost way to speed up flow simulation and appears to be a promising approach for next-generation CFD codes.

Keywords: Incompressible flow, Parallel processing, GPU, Unstructured rectangular and triangular grids
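
The sensitivity to grid size noted in the abstract can be illustrated with a rough thread-occupancy count. The sketch below is not taken from the thesis: it assumes a simple two-dimensional layout with one thread per grid node and 16 x 16 thread blocks, and the function name `thread_utilization` is hypothetical. The actual kernel configuration of the solver is not stated in the abstract, and idle threads are only one part of the penalty it describes.

```python
import math


def thread_utilization(nx, ny, block=16):
    """Fraction of launched threads that map to a real grid node.

    Assumption (not from the thesis): one thread per node, launched in
    square blocks of `block` x `block` threads, with enough blocks to
    cover the whole nx x ny grid.
    """
    blocks_x = math.ceil(nx / block)  # blocks needed along x
    blocks_y = math.ceil(ny / block)  # blocks needed along y
    launched = blocks_x * blocks_y * block * block
    return (nx * ny) / launched


if __name__ == "__main__":
    # Compare a multiple of 16, one node more, and the next multiple of 16.
    for n in (128, 129, 144):
        print(n, thread_utilization(n, n))
```

Under these assumptions a 128 x 128 grid fills every launched thread, while a 129 x 129 grid needs a ninth block in each direction and leaves roughly a fifth of the launched threads idle, which is consistent with the observation that a small change in grid size can noticeably lower efficiency.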

With the advancement of science and the growing need to study complex physical phenomena, the problem of long run times has become more prominent than before across various fields. Weather forecasting from meteorological models and the simulation of the motion of passenger aircraft are examples of such phenomena. One way to overcome this problem is parallel processing technology, which has emerged with the advancement of modern computers. In 2006 a new generation of graphics cards came to market that could process non-graphical data in addition to graphical data. This new generation founded a new method of parallel processing called parallel computing on graphics processors. In the present thesis, this technology is used to speed up the simulation of incompressible flow. The equations of this flow are discretized for an unstructured grid using a coupled method. The problems considered are the lid-driven cavity at 90 degrees and 30 degrees and the flow behind a step; the first two are analyzed on both square and triangular grids. First, however, a heat conduction problem is also examined to verify the code and to compare the performance of the CPU with that of the graphics processor. The results show that graphics processors can simulate, for example, the flow in the 90-degree and 30-degree lid-driven cavities up to 48 and 56 times faster, respectively. Overall, the results indicate that graphics processors can serve as an inexpensive yet highly practical and effective resource for tackling the long run times of computational fluid dynamics problems. The results also showed that, given the specific architecture of these processors, the highest efficiency is obtained for particular combinations of grid and number of equations. Keywords: 1- Incompressible flow 2- Parallel processing 3- Graphics processor 4- Unstructured square and triangular grids