کدینگ و ذخیره سازی اطلاعات

STUDENT

DEGREE

YEAR

The explosive demand for digital data storage with larger storage capacity, higher reliability and fault tolerance, poses tremendous challenge on the storage industry. Distributed storage systems (DSS) are used to store massive amounts of data over a set of distributed storage nodes. Applications of DSS can be found in social networking, websites such as Facebook, cloud storage systems and video streaming website. Over time, storage nodes will leave the system, called node failures. A principal goal in DSS is the efficient repair of a failed node. tandard techniques for such a distributed way of storage include multiple backups (typically triple replication) or using erasure codes such as Reed-Solomon codes. But this techniques is not efficient.\\\\ \\emph{Locally repairable codes} (LRCs) are important, due to their applications in DSS. Let $\\mathcal{C}$ denote an $[n,k,d]$ linear code having block length $n$, dimension $k$ and minimum distance $d$. The $i$-th coordinate, $1\\leq i\\leq n$, of $\\mathcal{C}$ is said to have locality $r$, if the value at this coordinate can be repaired by accessing at most $r$ ($r \\ll k$) other coordinates and performing a linear computation. Further, if the $i$-th coordinate, $1\\leq i\\leq n$, can be repaired by accessing $r_i$ other coordinates, then $\\mathcal{C}$ is called an LRC with locality $r$.\\\\ In DSS, it is possible that multiple failed nodes occur at a time, and DSS must be able to repair as many failed nodes as possible. Basically, in case of multiple failed nodes, the repairing performance can heavily depend on the repairing strategy in use, say, repairing the failed nodes simultaneously or one by one. These two strategies are called parallel approach and sequential approach, respectively. In the parallel mode, the $t$ failed nodes can be repaired simultaneously, whereas in the sequential mode, the $u$ failed nodes are repaired one by one and the previously fixed failed nodes can be used in the next rounds of repairing. otentially, for a given LRC, the sequential mode can fix more failed nodes than the parallel mode ( In the other words $u\\geq t$); hence, for given parameters $r$ and $t=u$, the codes with sequential mode can potentially achieve higher rate and larger minimum distance than the parallel mode. In this thesis, we first consider LRC aiming at joint sequential-parallel repairing multiple failed nodes, and study the $(n,k,r,t,u)$-ELRCs (Exact locally repairable codes) which are $[n,k]$ linear codes with the property that any set of failed nodes of size at most $t$ can be simultaneously repaired in parallel mode, and each element of a set $E$ of failed nodes of size at most $u$ can be sequentially repaired by $r$ ($r k$) other coordinates. We present a method by which with a given parity-check matrix of an $(n,k,r,t,u)$-ELRC with minimum Hamming distance $d$, a new ELRC with minimum Hamming distance $2d$ and availability $t+1$ is constructed that can repair each set of failed nodes $E$ of size at most $2u + 1$ in sequential mode and this repair is done in at most $u-t+2$ steps. We construct a big family of LRCs by making use of orthogonal Latin rectangles, permutation cubes, back circulant Latin squares and some other combinatorial designs. Also, LRCs based on $t$-dimensional permutation $m$-cubes with block length $m^t$ are presented which have locality $r=2m-3$ and availability $t \\geq 3 $ in parallel mode, and also can repair any set of failed nodes of size up to $2^t-1$ in sequential mode in at most $t-1$ steps. Further, we show these codes are \\emph{overall local repair} with repair tolerance $2t$ in parallel mode. Further, we introduce some new clases of LRC, such as: Locally~repairable~code with arbitrary live node: The codes in this Sequential probabilistic repair: The codes in this Finally, we calculate the repair time (in the worst-case and average case) for some of the previously constructed codes in sequential mode.

امروزه میزان تولید اطلاعات دیجیتال بشدت در حال افزایش است و بنابراین مسئله ذخیره اطلاعات، یک موضوع جدی به شمار می‌رود همچنین نیاز به دسترسی سریع به اطلاعات ذخیره‌شده نیز بسیار مهم و جدی می‌باشد، و این موضوع باعث شده است تا سیستم‌های ابری و سیستم‌های ذخیره ‌توزیع‌شده مورد توجه واقع شوند. استفاده از سخت‌افزارهای بیشتر (درایوهای ذخیره اطلاعات) در سیستم‌های ذخیره اطلاعات، برای بازیابی بدون خطای اطلاعات و بالا بردن قابلیت اعتماد (به روش تکرار اطلاعات)، هزینه بالایی دارد؛ به همین سبب، کدگذاری برای ذخیره اطلاعات روی سیستم‌ها بسیار با اهمیت است. برای ذخیره اطلاعات به‌کمک کدگذاری، می‌توان قابلیت اطمینان را جهت بازیابی اطلاعات افزایش داده و هزینه‌ها را کاهش داد. در یک سیستم ذخیره اطلاعات با تعداد زیادی درایو (رأس)، هر لحظه امکان دارد که تعدادی از رئوس دچار مشکل شوند و برای جلوگیری از قطع‌شدن کل سیستم، می‌بایست هرچه سریعتر اطلاعات این رئوس معیوب، به کمک رئوس سالم دیگر، ترمیم شده و در رئوس جدیدی جایگزین شوند. اخیراً کدهای ترمیم‌پذیر‌موضعی، با این هدف معرفی شده‌اند و به‌طور کاربردی در سیستم‌های فیسبوک به‌کار گرفته شده‌اند. در این نوع از کدها، هر مؤلفه از کد به‌عنوان یک رأس از سیستم ذخیره‌سازی درنظر گرفته می‌شود و برخلاف کدهای سنتی (مانند کدهای رید-سالامون) برای بازیابی اطلاعات یک رأس، نیاز به تمام رئوس دیگر نیست و به‌طور موضعی تنها با تعداد کمی از رئوس دیگر، می‌توان یک رأس را ترمیم کرد. در ترمیم یک رأس معیوب، دو پارامتر نقش اساسی دارند، یکی تعداد رئوس سالمی است که در فرآیند ترمیم رأس معیوب به‌کار گرفته می‌شوند و آن را درجه ترمیم می‌نامند و دیگری میزان اطلاعات مورد نیاز است که باید از رئوس سالم مورد نظر بارگذاری شود تا اطلاعات رأس معیوب ترمیم و بازیابی شود و به آن پهنای باند ترمیم گفته می‌شود. ترمیم یک رأس معیوب زمانی کارآمد است که یا درجه ترمیم و یا پهنای باند ترمیم کم باشد. زمانی که تعداد رئوس معیوب در یک سیستم بیش از یک رأس باشد دو رویکرد کلی جهت ترمیم آن‌ها ارائه می‌شود که یکی از آن‌ها ترمیم موازی و دیگری ترمیم متوالی است. در رویکرد ترمیم موازی تمام رئوس معیوب به‌طور هم‌زمان در یک مرحله ترمیم می‌شوند. ولی در رویکرد ترمیم متوالی، رئوس معیوب در چند مرحله ترمیم می‌شوند و در هر مرحله از ترمیم، رئوس ترمیم شده در مراحل قبلی می‌توانند در فرایند ترمیم رئوس شرکت کنند. در شرایط یکسان برای یک کد، فرآیند ترمیم متوالی نسبت به ترمیم موازی دارای توانایی ترمیم تعداد رئوس بیشتری است، ولی زمان ترمیم نیز افزایش می‌یابد. در این رساله پس از معرفی کامل این دو رویکرد، با هدف کاهش زمان ترمیم، حالتی برای کدگذاری سیستم‌های ذخیره اطلاعات معرفی می‌شود که ترکیبی از هر دو رویکرد است و حالت ترمیم متوالی-موازی باهم ‌نامیده می‌شود. در این حالت، علاوه بر توانایی ترمیم تعداد رئوس بیشتر نسبت به حالت موازی، زمان ترمیم نسبت به رویکرد متوالی کاهش می‌یابد. همچنین با استفاده از ساختارهای ترکیبیاتی (مانند مربع لاتین، مکعب‌ لاتین، مربع‌های لاتین متعامد، مکعب‌های لاتین متعامد، طرح یودن و مربع لاتین چرخشی)، چندین ساختار از این نوع کدها ارائه می‌شود و زمان ترمیم این کدها نیز بررسی می‌شود. در ادامه رساله زمان ترمیم کدهای موضعی ساخته‌شده توسط پژوهشگران دیگر در حالت متوالی، مورد بررسی قرار می‌گیرد. زمان ترمیم مجموعه رئوس معیوب، در بدترین-حالت ممکن و متوسط‌زمان‌ترمیم مجموعه رئوس معیوب در تمام حالت‌ها، مواردی هستند که در مورد زمان ترمیم رئوس معیوب در حالت متوالی مورد بررسی قرار می‌گیرند. در بخش دیگری از این رساله، کلاس‌های دیگری از کدهای ترمیم‌پذیر موضعی معرفی می‌شوند. یکی از مهمترین کلاس‌های معرفی شده در این بخش، تعمیمی از کدهای ترمیم‌پذیر در حالت متوالی است. در این نوع تعمیم از کدها، به بررسی احتمالاتی توانایی ترمیم تعداد رئوس معیوب پرداخته می‌شود. به عبارت دیگر، این امکان وجود دارد که اگر یک کد ترمیم‌پذیر موضعی دارای توانایی ترمیم رأس در حالت متوالی باشد، با احتمال بسیار زیاد (بالاتر از 98 درصد) می‌تواند رئوس بیشتری را نیز در حالت متوالی ترمیم نماید.

SUPERVISOR

STUDENT

FACULTY - DEPARTMENT

DEGREE

YEAR

TITLE