In many fields of science and engineering we encounter data of high dimensionality. Using such high-dimensional data directly is undesirable owing to the many problems it causes. Because of the constraints that real-world problems impose, these data in fact reside on subspaces of much lower dimensionality, and learning these subspaces is of special importance in machine learning and pattern recognition: in many cases, due to the curse of dimensionality, using high-dimensional data directly would severely degrade the performance of learning algorithms. Subspace learning methods seek a subspace that reduces the dimensionality of the data while maximizing a certain criterion. Many subspace learning methods have been proposed, one of the best-known and most widely used of which is canonical correlation analysis (CCA). In CCA the goal is to find two subspaces for two sets of data such that, in those subspaces, the two sets are maximally correlated (the classical objective is recalled below). CCA is one of the most important tools for the analysis of multi-view data and has received much attention recently.

One of the most important advances in subspace learning has been the introduction of probabilistic subspace learning methods, which interpret subspace learning as inference in a latent-variable probabilistic model. A probabilistic model offers many advantages, the most important of which is extensibility: the model can be extended in a variety of ways. Many subspace learning methods, however, are applicable only to vector data, so structured data (such as matrices) must first be vectorized; vectorization discards locality information and makes the covariance matrices huge. To avoid these problems, two-dimensional methods have been proposed, which reduce the dimensionality of matrix data without vectorization.

In this thesis, a two-dimensional probabilistic interpretation of canonical correlation analysis is proposed that enjoys the benefits of probabilistic methods while avoiding the vectorization of matrix data. The proposed probabilistic model is based on matrix-variate distributions. Two approaches for solving this model are presented: one simplifies the model and solves it with the expectation-maximization (EM) algorithm, and the other solves the model using the variational method. The performance of the proposed method is assessed on synthetic and real data, and its superiority over rival methods in experiments on real data is demonstrated.

Keywords: Subspace learning, Dimensionality reduction, Probabilistic model, Matrix-variate distribution.
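For reference, the classical (vector-variate) CCA objective that the proposed model builds on can be stated as follows; this is a recap of the standard formulation, not part of the thesis's own derivation. Given zero-mean views $x$ and $y$ with covariances $\Sigma_{xx}$, $\Sigma_{yy}$ and cross-covariance $\Sigma_{xy}$, CCA finds projection directions $w_x$ and $w_y$ that maximize the correlation of the projected data:

$$
\max_{w_x,\, w_y}\;
\rho(w_x, w_y)
\;=\;
\frac{w_x^{\top} \Sigma_{xy}\, w_y}
     {\sqrt{w_x^{\top} \Sigma_{xx}\, w_x}\;\sqrt{w_y^{\top} \Sigma_{yy}\, w_y}} .
$$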