Parallelization of the Preconditioned Conjugate Gradient Linear Solver in OpenFOAM

STUDENT

DEGREE

YEAR

OpenFOAM is an open-source CFD tool for education, research, and industry. It provides a wide variety of applications and solvers for incompressible flow, compressible flow, combustion, heat transfer, and electromagnetic simulations. OpenFOAM is written in C++, so existing applications can be extended through object-oriented programming, or an entirely new solver can be created. A common way to accelerate computation and reduce simulation time is to employ more powerful computers, or even supercomputers. However, with the increasing availability of multicore processors and newer GPU technologies, efficient and fast solvers are critically needed to make better use of existing hardware. This requires parallel programming techniques, especially for computationally intensive applications. In this thesis, we aim to improve the speed-up of the PCG (preconditioned conjugate gradient) linear solver used in the icoFoam application of OpenFOAM. OpenFOAM uses MPI as its default parallel programming model. Based on an analysis of the current code structure of the PCG solver in OpenFOAM, a two-level parallel structure is proposed and implemented that combines the distributed-memory model (MPI) with the shared-memory model (OpenMP). The method consists of coarse-grained parallelization across sub-domains using MPI and fine-grained, loop-level parallelization inside the linear solver functions using OpenMP constructs. After domain decomposition of the entire computational domain, each sub-domain, with its own geometric information and initial values, is assigned to a distinct MPI process. The sub-domains exchange information through MPI calls. Within each sub-domain, however, some of the computations are accelerated by running them on the available processor cores using OpenMP.
The key idea is to use OpenMP thread parallelism to employ the processor cores available on each machine that contributes to the parallel run of the application. In theory, this should reduce unnecessary MPI communication overhead and result in higher code performance. Our results show that, with the present hybrid parallel programming technique, the speed-up of the PCG solver is improved compared to that of the native MPI implementation. Based on the performance results, we conclude that the hybrid parallel model is an appropriate technique for further improving the efficiency of the parallel PCG solver and for using the available hardware more efficiently. Suggestions for future work, drawn from the discussions in this study, are also presented.

Key words: Hybrid parallel programming, OpenMP, MPI, Conjugate gradient, OpenFOAM, Eclipse

OpenFOAM is an open-source CFD tool for education, research, and industry that includes a wide range of applications and solvers for simulations such as incompressible flow, compressible flow, heat transfer, and electromagnetics. OpenFOAM uses C++ as its base language, and existing applications can be improved, or even a new one created, through object-oriented programming. A common way to speed up computations and reduce the run time of simulations is to use parallel processing on powerful computers and even supercomputers. At the same time, with the increasing availability of multicore processors and newer GPU technologies, there is a strong need for efficient, fast solvers that make better use of existing hardware resources. Doing so requires parallel programming techniques, especially for computationally heavy applications. The present research aims to increase the solution speed of the preconditioned conjugate gradient linear solver, known as PCG, used in the icoFoam application of OpenFOAM. OpenFOAM uses MPI as its default parallel programming model for accelerating computations. After examining the current code structure of the PCG solver in OpenFOAM, a two-level parallelization scheme is proposed and implemented that draws on both the distributed-memory and shared-memory parallel programming models, through MPI and OpenMP respectively. The scheme consists of coarse-grained parallelization among the sub-domains using MPI and fine-grained, loop-level parallelization in the linear solver functions using OpenMP constructs. After decomposition of the whole computational mesh, each sub-domain, with its own geometric information and initial values, is assigned to a separate MPI process. The sub-domains exchange their information through MPI calls. At the same time, within each sub-domain, some parts of the computation are accelerated by running them on the available processor cores using OpenMP.
The key point here is the use of OpenMP thread parallelism, which distributes the parallel execution of the program among the processor cores available on each individual machine. In theory, this should reduce unnecessary MPI communication overhead and lead to higher code performance. Our results showed that, with the proposed hybrid parallel programming technique, the speed-up of the PCG solver improved relative to the original MPI implementation. Based on the reported results, we conclude that the present hybrid model is a suitable technique for improving efficiency, both in the parallel execution of the PCG solver and in exploiting the multiple hardware resources available. Suggestions for future work are also presented, based on the discussions in this research.

Keywords: 1) Hybrid parallel processing 2) MPI, OpenMP 3) Conjugate gradient 4) OpenFOAM 5) Eclipse
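
The loop-level half of the two-level structure described in the abstract can be illustrated with a minimal C++ sketch. It is not taken from the OpenFOAM source or from the thesis implementation: the names `localDot`, `amul`, and `pcgSolve`, the tridiagonal test matrix, and the diagonal preconditioner are all assumptions made for illustration. Each kernel loop carries an OpenMP directive. In a hybrid run, the partial dot products would additionally be combined across MPI processes (for example with `MPI_Allreduce`) and boundary values would be exchanged between neighbouring sub-domains before the matrix-vector product; those steps are only marked in comments so that the sketch compiles without an MPI library.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// Dot product over the cells of one sub-domain, threaded with an OpenMP
// reduction. In a hybrid run this partial sum would next be combined across
// MPI processes (e.g. MPI_Allreduce); that step is omitted here.
double localDot(const Vec& a, const Vec& b)
{
    assert(a.size() == b.size());
    const long n = static_cast<long>(a.size());
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < n; ++i)
    {
        sum += a[i]*b[i];
    }
    return sum;
}

// y = A x for a hypothetical test matrix: tridiagonal, 2 on the diagonal and
// -1 off the diagonal. In a hybrid run, values from neighbouring sub-domains
// would be exchanged via MPI before this loop (assumption, not shown).
void amul(const Vec& x, Vec& y)
{
    assert(x.size() == y.size());
    const long n = static_cast<long>(x.size());
    #pragma omp parallel for
    for (long i = 0; i < n; ++i)
    {
        double v = 2.0*x[i];
        if (i > 0)     v -= x[i - 1];
        if (i + 1 < n) v -= x[i + 1];
        y[i] = v;
    }
}

// Conjugate gradient with a diagonal (Jacobi) preconditioner, M = diag(A) = 2I.
// Solves A x = b starting from the initial guess in x; returns the number of
// iterations performed.
int pcgSolve(const Vec& b, Vec& x, double tol, int maxIter)
{
    assert(b.size() == x.size());
    const long n = static_cast<long>(b.size());
    const double invDiag = 0.5;  // 1 / A_ii for the test matrix above
    Vec r(n), z(n), p(n), Ap(n);

    amul(x, Ap);
    #pragma omp parallel for
    for (long i = 0; i < n; ++i)
    {
        r[i] = b[i] - Ap[i];     // initial residual
        z[i] = invDiag*r[i];     // preconditioned residual
        p[i] = z[i];             // initial search direction
    }
    double rz = localDot(r, z);

    int iter = 0;
    while (iter < maxIter && std::sqrt(localDot(r, r)) > tol)
    {
        amul(p, Ap);
        const double alpha = rz/localDot(p, Ap);

        #pragma omp parallel for
        for (long i = 0; i < n; ++i)
        {
            x[i] += alpha*p[i];
            r[i] -= alpha*Ap[i];
            z[i] = invDiag*r[i];
        }

        const double rzNew = localDot(r, z);
        const double beta = rzNew/rz;

        #pragma omp parallel for
        for (long i = 0; i < n; ++i)
        {
            p[i] = z[i] + beta*p[i];
        }
        rz = rzNew;
        ++iter;
    }
    return iter;
}
```

Compiled with an OpenMP flag such as `-fopenmp` (GCC or Clang), the marked loops run on multiple threads. Without the flag the directives are ignored and the same code runs serially, which makes it easy to compare the threaded and serial results.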