This study presents a number of data-driven models of the wind-wave process in the Caspian Sea. The task addressed by these models is to forecast significant wave heights several hours ahead using buoy measurements. The models are based on Artificial Neural Networks (ANN) and Instance Based Learning (IBL). To capture the wind-wave relationship at the measurement sites, they use the available past data describing the phenomenon. Three feed-forward ANN models have been built for forecast horizons of 1, 3 and 6 hours, each with different inputs. The relevant inputs are selected by analyzing the Average Mutual Information (AMI) and consist of a priori knowledge of wind and significant wave height. The other six models are based on IBL methods for the same forecast horizons: weighted k-Nearest Neighbors (k-NN) and Locally Weighted Regression (LWR) with a Gaussian kernel. In the IBL-based models, the forecast is made directly by combining instances from the training data that are close (in the input space) to the new incoming input vector. These methods are applied to two data sets from the Caspian Sea. Experiments show that the ANN models yield slightly better agreement with the measured data than the IBL models. The ANN models can also predict extreme wave conditions better than the other existing methods. The second part of this study presents non-linear data assimilation for a dynamical surrogate wind-wave model in a reduced space. This surrogate provides a fast emulation of a wind-wave model, so that the system state can be evaluated in a short time. The system state consists of wave height and wave direction in the reduced space, and is driven by the reduced-space wind field. The projection from the full space to the reduced space is done by principal component analysis. It is computationally efficient to couple this surrogate with an Ensemble Kalman filter (EnKF).
Ensemble methods require the evaluation of the dynamics for a large number of ensemble members. The procedure is demonstrated through a 6-month hindcast study of wind waves over the Caspian Sea using a third-generation wave model and ECMWF analysis wind fields. A dynamic Artificial Neural Network is also used in this work as a surrogate model of the wind-wave process. The trained network is embedded into a stochastic environment, and the EnKF is used to estimate the system states. Experiments show that the proposed data assimilation (DA) technique corrects the wind-wave predictions with a modest execution time.
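
To illustrate the EnKF update on a reduced-space state, a minimal sketch of one analysis step is given below. It is not the implementation used in this study. It assumes the perturbed-observation (stochastic) variant of the EnKF, a linear observation operator `H`, and uncorrelated observation errors with a common variance `obs_var`. The function name `enkf_analysis` and all argument names are hypothetical.

```python
import numpy as np

def enkf_analysis(ensemble, obs, H, obs_var, rng):
    """One stochastic EnKF analysis step (illustrative sketch only).

    ensemble : (n_state, n_members) forecast states, e.g. reduced-space
               wave height and direction coefficients (assumed layout)
    obs      : (n_obs,) observation vector
    H        : (n_obs, n_state) linear observation operator (assumption)
    obs_var  : scalar observation error variance (assumption)
    rng      : numpy.random.Generator used to perturb the observations
    """
    n_obs, n_members = H.shape[0], ensemble.shape[1]

    # Ensemble anomalies about the forecast mean, in state and observation space
    A = ensemble - ensemble.mean(axis=1, keepdims=True)
    HA = H @ A

    # Sample covariances: P H^T and the innovation covariance H P H^T + R
    P_HT = A @ HA.T / (n_members - 1)
    S = HA @ HA.T / (n_members - 1) + obs_var * np.eye(n_obs)

    # Kalman gain K = P H^T S^{-1}; S is symmetric, so solve instead of inverting
    K = np.linalg.solve(S, P_HT.T).T

    # Each member assimilates its own perturbed copy of the observations
    D = obs[:, None] + rng.normal(0.0, np.sqrt(obs_var), size=(n_obs, n_members))
    return ensemble + K @ (D - H @ ensemble)


# Synthetic usage: a two-component state whose first component is observed
rng = np.random.default_rng(0)
truth = np.array([1.5, 0.3])
bias = np.array([[1.0], [0.0]])  # forecast error in the observed component
forecast = truth[:, None] + bias + rng.normal(0.0, 0.5, size=(2, 200))
H = np.array([[1.0, 0.0]])
analysis = enkf_analysis(forecast, H @ truth, H, obs_var=0.01, rng=rng)
```

In this synthetic setting the analysis mean of the observed component moves from the biased forecast toward the observation, and the ensemble spread shrinks. In the setting described above, each forecast member would instead be produced by propagating the surrogate model forward in the reduced space.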