Optimizing the Production Process Using a Data Mining Approach Based on Principal Components (Case Study: Sepahan Oil Company)

Nowadays, the volume of data is growing at a considerable rate. This creates a pressing demand for techniques that can analyze such data and intelligently discover the knowledge hidden in it. Data mining provides tools for analyzing large databases and for discovering trends, patterns, and knowledge from them. Sepahan Oil Company, with a production capacity of more than 450,000 tons of various kinds of oil, is one of the key industries in Iran. Given its rich database, the company is a well-suited target for applying data mining techniques and discovering the knowledge embedded in its data. Base oil is the company's main product, and the efficiency and quality parameters of base oil are among the most important factors affecting final product quality. Mismatch between these parameters and the target values specified by product specialists is one of the basic challenges on this production line. Accordingly, this thesis is the first to apply data mining techniques to the databases available at Sepahan Oil Company and to predict the output parameters of the base oil production line. Data analysis typically involves a large number of variables whose mutual dependence makes the analysis difficult; this study aims to reduce such problems by lowering the dimensionality of the variables and removing their dependence using principal component analysis. To this end, after data selection and data preparation, intelligent knowledge discovery techniques, namely neural networks, regression trees, and support vector machines, together with conventional regression, are first described and then used to predict the output parameters of base oil production.
Subsequently, the principal components obtained from principal component analysis are used as inputs to these methods, in order to evaluate their effect on the data mining techniques and to compare the outcomes with the results of applying the same techniques to the original data. The results indicate that principal component analysis performs well in combination with data mining techniques and improves their performance. Comparing the methods, the neural network based on principal components is identified as the best method for predicting the output parameters of the process. Finally, the way the input factors affect the target parameters is determined using regression trees.
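The idea of feeding principal components, rather than the raw correlated inputs, into a predictive model can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic (not Sepahan Oil Company data), the names `pca_fit` and `pca_transform` are hypothetical helpers, and ordinary least squares stands in for the neural network, regression tree, and support vector machine models used in the thesis.

```python
import numpy as np

def pca_fit(X, n_components):
    """Standardize X (rows = samples) and return its top principal directions."""
    mean = X.mean(axis=0)
    scale = X.std(axis=0, ddof=1)
    Z = (X - mean) / scale
    # Rows of Vt are the principal directions of the standardized data.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = s**2 / np.sum(s**2)  # fraction of variance per component
    return mean, scale, Vt[:n_components], explained[:n_components]

def pca_transform(X, mean, scale, components):
    """Project X onto the principal directions to get component scores."""
    return ((X - mean) / scale) @ components.T

# Synthetic stand-in for process data: six correlated inputs driven by
# two hidden factors (assumed values, chosen only for illustration).
rng = np.random.default_rng(0)
n = 200
latent = rng.normal(size=(n, 2))
mixing = np.array([[1.0, 0.8, 0.5, 0.0, -0.6, 0.3],
                   [0.0, 0.4, -0.7, 1.0, 0.5, -0.9]])
X = latent @ mixing + 0.05 * rng.normal(size=(n, 6))
y = 3.0 * latent[:, 0] - 2.0 * latent[:, 1] + 0.1 * rng.normal(size=n)

# Reduce six dependent inputs to two uncorrelated component scores.
mean, scale, components, explained = pca_fit(X, n_components=2)
scores = pca_transform(X, mean, scale, components)
corr = np.corrcoef(scores, rowvar=False)  # off-diagonal entries are near zero

# Fit a linear model on the scores (a stand-in for the thesis's models).
A = np.column_stack([np.ones(n), scores])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
pred = A @ coef
r2 = 1.0 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)

print("explained variance ratio:", explained)
print("score correlation:", corr[0, 1])
print("R^2 on component scores:", r2)
```

In this sketch the component scores are uncorrelated by construction, which is the property the abstract relies on when it speaks of eliminating the dependence among input variables before modelling.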

Today, the volume of data in organizations is growing at an unprecedented rate. This steady growth calls for techniques that analyze the data intelligently and uncover the knowledge hidden in it. Data mining provides tools for analyzing large databases and discovering trends, patterns, and knowledge from these data sources. Data analysis generally involves many variable attributes, and the dependence among these attributes causes difficulties in the analysis; the aim is therefore to reduce these difficulties by reducing the dimensionality of the variables and removing the dependence among them, which can be done through principal component analysis. Sepahan Oil Company, with a production capacity of more than 450 thousand tons of various oils, is one of the country's key industries, and its very rich data bank makes it a suitable setting for applying data mining techniques and discovering the knowledge latent in the data. On the base oil production line, whose output is the company's main product, efficiency and the quality parameters of the output product are among the factors that determine final quality. In this thesis, data mining techniques are used for the first time to process the databases available at Sepahan Oil Company. To this end, intelligent knowledge discovery techniques such as neural networks, regression trees, and support vector machines, as well as the traditional regression method, are first described, and these methods are then applied to predict the output parameters of the base oil production process. In addition, the effect of principal component analysis on these methods is examined. The computational results show that this method performs well in combination with data mining techniques and improves their efficiency. Comparing the results of the different methods, the neural network based on principal components is selected as the best method for predicting the output parameters of the production process, and finally the way the input parameters affect the target parameters is determined.