Sequence labeling is an important problem in pattern recognition: it assigns a label to each element of a sequence, such as a sequence of characters, images, or speech. One approach to sequence labeling is to model the input-output structure as a graphical model, on which further analysis can then be performed. Graphical models are powerful tools for modeling probability distributions over a large number of variables. In some problems, such as handwriting recognition, the relation between input and output features can be nonlinear and highly complicated. Initially, generative models such as the Hidden Markov Model (HMM) were commonly used for sequence labeling. Later, the Conditional Random Field (CRF), a discriminative probabilistic model, was introduced and quickly became popular because it resolves some of the issues of the earlier generative models. Experiments in recent years show that combining a CRF with other models can improve performance. This thesis investigates the combination of the Conditional Random Field model with the concept of a mixture of experts. A mixture of experts model increases learning accuracy by partitioning the input space and dedicating a focused expert network to each partition. It has been shown that incorporating a mixture of experts into a model can improve its performance. In this research, a number of expert networks, each a type of neural network, are placed between the input and output layers of a CRF model; they extract higher-level features from the observation sequences, and these features are used to train the model. A clustering algorithm is used to assign input sequences to experts. Because the observation sequences differ in length, clustering is first performed on the individual elements of each sequence.
After this, there are two options for assigning experts to the input data. In the first, the cluster of a sequence is determined by voting among the clusters of its elements, and all elements of that sequence are used to train the corresponding expert and the model parameters. In the second, each element is handled according to its own cluster, so training can involve all experts assigned to the clusters of the sequence's elements. These two options yield two models. Experimental results on handwriting recognition demonstrate that the proposed models considerably improve recognition accuracy over previous models. The comparison is made against neural networks, the conditional random field, and the conditional neural field. The results indicate that the first and second proposed models improve recognition accuracy by up to 7% and 7.5%, respectively.

Keywords: Sequence Labeling, Log-Linear Model, Discriminative Model, Conditional Random Field, Mixture of Experts
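
As an illustration only, the two ways of assigning experts described in the abstract might be sketched as follows. The sketch assumes that every element of a sequence already carries an integer cluster id from some clustering algorithm (the abstract does not name one). The function names and the tie-breaking rule are hypothetical and are not taken from the thesis.

```python
from collections import Counter
from typing import Dict, List, Sequence


def sequence_expert_by_vote(element_clusters: Sequence[int]) -> int:
    """First option: one expert per sequence, chosen by majority vote
    over the cluster ids of the sequence's elements."""
    if not element_clusters:
        raise ValueError("cannot vote on an empty sequence")
    counts = Counter(element_clusters)
    # Tie-break is an assumption of this sketch: prefer the smallest cluster id.
    return max(counts, key=lambda c: (counts[c], -c))


def element_positions_by_expert(element_clusters: Sequence[int]) -> Dict[int, List[int]]:
    """Second option: each element is routed to the expert of its own cluster.
    Returns a map from cluster id to the positions of the elements it receives."""
    routing: Dict[int, List[int]] = {}
    for position, cluster in enumerate(element_clusters):
        routing.setdefault(cluster, []).append(position)
    return routing


# Hypothetical cluster ids for a five-element observation sequence.
clusters = [2, 0, 2, 1, 2]
print(sequence_expert_by_vote(clusters))      # 2
print(element_positions_by_expert(clusters))  # {2: [0, 2, 4], 0: [1], 1: [3]}
```

In the first scheme the whole sequence would be used to train expert 2. In the second, experts 2, 0, and 1 each receive only the elements that fall in their own cluster.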