Optimization is a central challenge across the sciences. As real-world problems grow more complex, exact methods and simple heuristics often fail to find the global optimum because they become trapped in local optima. Metaheuristic methods such as evolutionary algorithms can find satisfactory solutions to such problems. The estimation of distribution algorithm (EDA) is a type of evolutionary algorithm that explores the space of potential solutions by building and sampling explicit probabilistic models of promising candidate solutions. A flexible probability model that can be efficiently learned and sampled has a strong influence on the optimization process. Given the capabilities of generative neural networks in many areas of machine learning, particularly in learning the distribution of data and generating similar data, these networks can serve as the probabilistic model in an EDA.

We investigate the suitability of an invertible generative neural network as a model that captures the dependencies between variables and objectives in an estimation of distribution algorithm, enriching the model with objective information. Because an EDA must train the neural network in every generation, the total time needed to solve the problem increases. We propose an active-learning approach to selecting samples in each generation, so that a trained network can be reused to produce new samples over several generations. Another shortcoming of evolutionary algorithms is the large number of function evaluations required to reach the desired solution. Surrogate models address this by predicting the value of the objective function, but training them introduces a cost of its own. We therefore propose an active-learning strategy for selecting informative data to train the surrogate model, in order to reduce this training cost.
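To make the EDA loop described above concrete, the following is a minimal illustrative sketch (not the invertible-network method studied in this work): it fits a simple Gaussian model to the best candidates in each generation, samples the next population from that model, and minimizes the standard sphere benchmark. All function and parameter names are illustrative choices.

```python
# Minimal continuous EDA sketch: fit a Gaussian to elite candidates,
# then sample the next population from the fitted model.
import numpy as np

def sphere(x):
    """Sphere benchmark: global minimum 0 at the origin."""
    return float(np.sum(x ** 2))

def gaussian_eda(dim=5, pop_size=100, elite_frac=0.3, generations=50, seed=0):
    rng = np.random.default_rng(seed)
    mean = np.zeros(dim)               # initial search distribution
    std = np.full(dim, 5.0)
    n_elite = int(pop_size * elite_frac)
    for _ in range(generations):
        # Sample candidate solutions from the current probabilistic model.
        pop = rng.normal(mean, std, size=(pop_size, dim))
        fitness = np.array([sphere(ind) for ind in pop])
        # Select promising candidates and refit the model to them.
        elite = pop[np.argsort(fitness)[:n_elite]]
        mean = elite.mean(axis=0)
        std = elite.std(axis=0) + 1e-12  # avoid a degenerate distribution
    return mean, sphere(mean)

best, value = gaussian_eda()
```

In the approach summarized here, the hand-coded Gaussian model is replaced by an invertible generative neural network that also conditions on objective information, and active learning decides which samples are used for training.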
Experimental results on benchmark functions demonstrate that adding objective-function information to the probabilistic model of the estimation of distribution algorithm, together with the use of active learning, accelerates the optimization process compared with previous related methods.