A New Method for Pedestrian Detection in Images Using Histograms of Oriented Gradients

STUDENT

DEGREE

YEAR

Designing a system that can detect humans in images is one of the most important subcategories of computer vision. Human detection has many useful applications, such as surveillance systems in buildings, driver assistance systems, robotics, virtual reality, and automatic analysis of digital media content. Because humans are non-rigid, articulated objects, they can appear in images in many different shapes. Occlusion and variation in illumination, appearance, and scale are further difficulties that make human detection a highly complex problem. Many methods have been introduced for human detection so far, but none of them detects humans correctly in all input images. Most human detection methods have two steps. In the first, the information in each image is represented by a descriptor; in the second, a classification method is used to detect humans in the input images. These descriptors can be divided into two subcategories: global and local descriptors. Global descriptors describe the image as a whole; in other words, global methods encode the whole image into a single vector. In contrast, local descriptors do not use the whole image information and instead try to find significant key-points. Histograms of Oriented Gradients (HoG) is an example of a global descriptor; it was introduced by Dalal and Triggs in 2005 and has been widely used in human detection systems since. The aim of this thesis is to propose a new method for human detection. To represent the image information, the bag-of-features model is used. The features are image patches that are extracted densely and then described by the Histograms of Oriented Gradients descriptor. To form our codebook of visual words, the extracted patches are clustered using a clustering method such as the K-means algorithm. The center of each cluster is taken as a visual word.
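As a rough illustration of the codebook step described above, the following sketch clusters patch descriptors with a plain Lloyd-style K-means written in pure Python. The function names (`build_codebook`, `nearest_word`), the fixed iteration count, the random initialization, and the toy two-dimensional descriptors are assumptions made for illustration only; the thesis does not specify its implementation, and real HoG patch descriptors would have many more dimensions.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two descriptor vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(descriptor, codebook):
    """Index of the visual word (cluster center) closest to the descriptor."""
    return min(range(len(codebook)),
               key=lambda k: squared_distance(descriptor, codebook[k]))


def build_codebook(descriptors, num_words, iterations=20, seed=0):
    """Cluster descriptors with K-means; the cluster centers are the visual words."""
    rng = random.Random(seed)
    # Assumption: initialize the centers from randomly chosen descriptors.
    codebook = [list(d) for d in rng.sample(descriptors, num_words)]
    for _ in range(iterations):
        # Assignment step: group descriptors by their nearest center.
        clusters = [[] for _ in range(num_words)]
        for d in descriptors:
            clusters[nearest_word(d, codebook)].append(d)
        # Update step: move each center to the mean of its members.
        for k, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous center
                codebook[k] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


# Toy stand-ins for patch descriptors: two well-separated groups.
patches = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
           [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
words = build_codebook(patches, num_words=2)
print([nearest_word(p, words) for p in patches])
```

In practice the number of visual words is a tuning parameter, which is why the abstract reports choosing it experimentally.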
Training images are described with these visual words, so each image is represented by a vector whose length equals the number of visual words. Finding the right number of visual words is not an easy task. Automatic methods could be used; in this thesis we determined the number of clusters experimentally by applying the trained detector with different numbers of clusters to the test images. To highlight the most important features, a weighting method can be applied to the descriptor vectors. Here we used Term Frequency-Inverse Document Frequency (TF-IDF) weighting, which has been used in data mining and text clustering. In the proposed approach, a Support Vector Machine (SVM) is used as the binary classifier. We applied the proposed method to the MIT and INRIA datasets and compared the performance of our algorithm with a similar method in the literature. The results of our experiments show that our method performs at least as well as other available methods.

Keywords: Computer Vision, Human Detection, Histograms of Oriented Gradients (HoG), Bag of Features, TF-IDF Weighting
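The weighting step in the abstract can be sketched as follows. This is a minimal example under stated assumptions: it uses one common form of TF-IDF (term frequency as the fraction of an image's patches assigned to a word, inverse document frequency as the logarithm of the number of images divided by the number of images containing the word), which may differ from the exact variant used in the thesis. The names `word_histogram` and `tf_idf_weight` and the toy counts are hypothetical.

```python
import math


def word_histogram(word_indices, num_words):
    """Count how often each visual word occurs among an image's patches."""
    counts = [0] * num_words
    for k in word_indices:
        counts[k] += 1
    return counts


def tf_idf_weight(histograms):
    """Reweight per-image visual-word counts with TF-IDF (assumed variant)."""
    num_images = len(histograms)
    num_words = len(histograms[0])
    # Document frequency: number of images in which each word occurs.
    doc_freq = [sum(1 for h in histograms if h[k] > 0) for k in range(num_words)]
    # Words that occur in every image get idf = log(1) = 0.
    idf = [math.log(num_images / df) if df > 0 else 0.0 for df in doc_freq]
    weighted = []
    for h in histograms:
        total = sum(h)
        weighted.append([(c / total) * idf[k] if total > 0 else 0.0
                         for k, c in enumerate(h)])
    return weighted


# Toy example: three images, three visual words.
histograms = [
    word_histogram([0, 0, 1], 3),        # counts [2, 1, 0]
    word_histogram([0, 0, 2, 2], 3),     # counts [2, 0, 2]
    word_histogram([0, 0, 0, 0], 3),     # counts [4, 0, 0]
]
for row in tf_idf_weight(histograms):
    print(row)
```

Word 0 occurs in every toy image, so its weight drops to zero, while the rarer words 1 and 2 keep positive weights. The weighted vectors would then be passed to the binary classifier.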

Designing a system that can detect humans in images is one of the important applications of artificial intelligence and machine vision. Human detection has many applications, including security applications in surveillance systems for buildings and office centers that monitor the entry and exit of people; driver assistance systems and intelligent autonomous vehicles that detect human obstacles in their path; robotics; virtual reality; human-computer interaction; and automatic image analysis. Many solutions have been proposed so far for detecting humans in images, but none of them has been able to solve the problem completely. The proposed methods generally consist of two stages: 1) representing and describing the image information, and 2) classification. One of these methods is the histogram of oriented gradients, which has performed well compared with other methods. In this method, images are described by vectors that can be very large, so the volume of information needed to train the classifier becomes very large. To improve this method and reduce the size of the descriptor vectors, this thesis uses the bag-of-features model. The features used are image patches that are extracted densely from the image and described by histograms of oriented gradients. The densely extracted features are used to form the visual words and build the bag of features. After the visual words of the images are obtained, TF-IDF weighting is applied. The advantage of weighting is that the visual words that play a more important role in describing the target object receive larger weights. To evaluate the performance of the proposed method, the MIT and INRIA datasets were used. The experimental results indicate that the proposed method improves on the performance of the method of Dalal and Triggs.

Keywords: 1- Human detection 2- Histograms of oriented gradients 3- Bag-of-features model 4- TF-IDF weighting