Computational cost is a critical issue for large-scale water resource optimization problems, which often involve time-consuming simulation models. Less accurate but faster approximation models ("meta-models") can be used to improve computational efficiency. We propose a novel trust-region-based meta-model framework, in which hierarchically trained meta-models are embedded into a genetic algorithm (GA) to replace the time-consuming numerical models. Numerical solutions produced in the early generations of the GA, along with solutions dynamically sampled from later generations, are used to retrain the meta-models and correct the GA’s convergence path. A bootstrap sampling technique clusters the collected numerical solutions into hierarchical training regions, and a separate meta-model is trained on each region. The hierarchically trained meta-models are then used to approximate the numerical models, and a trust region testing strategy selects the most appropriate meta-model for each prediction. This allows local regions (particularly those near the optimal solution) to be approximated by smaller, smoother meta-models with higher accuracy, which can speed up the GA’s convergence once the population moves into those regions. The technique was tested with artificial neural networks (ANNs) and support vector machines (SVMs) on a field-scale groundwater remediation case in a distributed network computing environment. Our preliminary results show that the adaptive meta-model GA (AMGA) with the trust-region-based training technique converges with higher accuracy for the same computational effort.

Introduction

Groundwater management models often couple complex chemical simulation models with optimization techniques to achieve management goals. These models usually require the solution of computationally intensive partial differential equations (PDEs).
These complex nonlinear models can be difficult to solve with traditional optimization approaches, which may not find a global optimum. Genetic algorithms are well suited to such problems but can require substantial computational resources when each objective function (fitness) evaluation involves solving a time-consuming PDE model, because the GA needs to evaluate the fitness function thousands of times before it converges to the global optimum. This is a significant limitation when applying GAs to more complex cases, even with the help of parallel computation.

To alleviate the computational burden, many less accurate but much more computationally efficient approximation methods have been explored. Cooper et al. (1998) used curve-fitting methods to approximate the response surface. Rogers et al. (1995) and Aly and Peralta (1999) combined artificial neural networks (ANNs) with GAs to optimize groundwater problems. A static approximated response surface, although accurate at the beginning, may become less and less representative as the GA population converges to local regions later in the run. As a result, the GA may prematurely converge to regions containing local minima.

One approach that addresses this difficulty is the use of adaptive response surface methods. Brooker (1998) proposed a framework for generating and managing a sequence of surrogates to the objective function. Jin et al. (2002) used evolutionary algorithms (EAs) with ANNs that adapt to sampling model errors to accelerate aerodynamic design problems. In the water resources field, Yan and Minsker (2004) proposed a dynamic neural network framework for optimizing groundwater management problems, which used sampling techniques to update the ANNs and maintain good prediction accuracy. This paper extends the work of Yan and Minsker (2004) by proposing a novel trust-region-based meta-model framework.
Instead of creating a single global approximation model, the method hierarchically trains a global model and a series of local models to replace the numerical models. The meta-models trained on local regions are trusted only within those regions, and trust region testing is used to select the appropriate meta-model for each prediction. A bootstrap sampling algorithm clusters the sampled PDE solutions into hierarchical regions for training these models. This paper presents an overview of the algorithm. The approach is tested on a field-scale groundwater remediation case using both artificial neural networks (ANNs) and support vector machines (SVMs) as the approximation models.

Trust Region Based Adaptive Meta-model GA Framework

Figure 1 shows the flowchart for the trust region based adaptive meta-model GA (TRAMGA), which starts like any simple genetic algorithm. First, an initial population of trial designs is generated randomly. Second, the designs are evaluated to determine their fitness with respect to the objectives and constraints of the optimization problem. In this study, the fitness evaluations were distributed to multiple host computers on a standard office network and to cluster systems so that they could run simultaneously (TRAMGA does not require such a setup, however). After the fitness of each design is evaluated, TRAMGA uses GA operators to create new trial designs for the next iteration (“generation”). This process continues until the population converges to the optimal solution.

In the first few generations, TRAMGA is still preparing training sets for the meta-models, and the designs are evaluated by the simulation models. Once enough simulation model evaluations have been completed to train the meta-models, most simulation-based fitness evaluations are bypassed: designs are instead evaluated by meta-models selected through a trust region test, coupled with a memory cache of previously evaluated simulations.
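As a rough illustration of the evaluation path just described, the sketch below first looks up a design in the cache, falls back to the simulation model during the warm-up phase, and otherwise applies a trust region test to pick a meta-model. The box-shaped trust regions, the smallest-first ordering of the local models, and all names (`LocalModel`, `evaluate`, `simulate`, `demo`) are assumptions made for illustration only; the text above does not specify these details.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Design = Tuple[float, ...]


@dataclass
class LocalModel:
    """A meta-model plus the region in which it is trusted (assumed to be a box)."""
    lower: Sequence[float]
    upper: Sequence[float]
    predict: Callable[[Design], float]

    def trusts(self, x: Design) -> bool:
        # Trust region test: the design must lie inside the box on every axis.
        return all(lo <= xi <= hi for lo, xi, hi in zip(self.lower, x, self.upper))


def evaluate(
    x: Design,
    cache: Dict[Design, float],
    local_models: List[LocalModel],
    global_model: Optional[Callable[[Design], float]],
    simulate: Callable[[Design], float],
) -> Tuple[float, str]:
    """Return (fitness, source) for one design."""
    if x in cache:                      # previously simulated design
        return cache[x], "cache"
    if global_model is None:            # warm-up: no meta-model trained yet
        cache[x] = simulate(x)
        return cache[x], "simulation"
    for model in local_models:          # assumed ordered from smallest region up
        if model.trusts(x):
            return model.predict(x), "local"
    return global_model(x), "global"    # no local region trusts this design


def demo():
    """Toy run with a cheap stand-in for the expensive simulation model."""
    def simulate(x):
        return sum(v * v for v in x)

    def local_predict(x):
        return simulate(x) + 0.01       # hypothetical accurate local meta-model

    def global_predict(x):
        return simulate(x) + 0.5        # hypothetical coarser global meta-model

    cache: Dict[Design, float] = {}
    warmup = evaluate((0.5, 0.5), cache, [], None, simulate)
    local = LocalModel((-1.0, -1.0), (1.0, 1.0), local_predict)
    designs = [(0.5, 0.5), (0.2, 0.1), (3.0, 0.0)]
    results = [evaluate(x, cache, [local], global_predict, simulate) for x in designs]
    return warmup, results
```

In this toy run, the first design is simulated during warm-up and later served from the cache, the second falls inside the local trust region, and the third lies outside it and is handled by the global model. Only true simulation results are written to the cache, so meta-model predictions never masquerade as simulated fitness values.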
At each generation, only a few designs are selected (“sampled”) according to a sampling policy for evaluation by the simulation models, and the results are used to update the cache and build the retraining sets. From time to time, the meta-models may be retrained, and if sufficient new data are available, new hierarchical meta-models are created by sampling the training points into trust regions, which in turn improves approximation accuracy. These components are described in more detail in the following sections.

Figure 1. TRAMGA optimization framework. [Flowchart; recoverable box labels: Random initial population of trial designs; Objective function and constraints; Satisfactory design(s)?; Stop; New generation: create new trial designs; Find in cache?]

Trust Region Based Meta-Model Methodology

Artificial Neural Network (ANN). ANNs have been successfully applied in numerous fields, including water resources (Rogers et al., 1995; Aly and Peralta, 1999). Their widespread use across disciplines can be attributed to their ability to approximate almost any linear or nonlinear relationship. An ANN consists of many neuron-like units, each of which accepts inputs and produces an output according to its activation function. Each interconnection between units has an associated weight, and together these weights form a large parameter set. By gradually tuning the weights, a supervised learning algorithm learns the mapping between the inputs and outputs sampled in a training set. The trained ANN is then used to make predictions.
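The supervised weight tuning described for ANNs above can be sketched with a very small network. The example below is only a minimal illustration under stated assumptions: one hidden layer of tanh units, a linear output unit, per-sample gradient descent on squared error, and a toy target function standing in for simulation output. The network size, learning rate, and all function names are hypothetical and are not taken from the study.

```python
import math
import random


def init_network(n_in, n_hidden, rng):
    """Small random weights for one tanh hidden layer and a linear output unit."""
    return {
        "w1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def forward(net, x):
    """Return (hidden activations, network output) for one input vector."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(net["w1"], net["b1"])
    ]
    output = sum(w * h for w, h in zip(net["w2"], hidden)) + net["b2"]
    return hidden, output


def mean_squared_error(net, data):
    return sum((forward(net, x)[1] - y) ** 2 for x, y in data) / len(data)


def train(net, data, epochs=200, lr=0.05):
    """Gradually tune the weights by gradient descent on 0.5 * error**2."""
    for _ in range(epochs):
        for x, y in data:
            hidden, out = forward(net, x)
            err = out - y                                    # dLoss/dOutput
            for j, h in enumerate(hidden):
                # Back-propagate through tanh using the not-yet-updated w2[j].
                grad_pre = err * net["w2"][j] * (1.0 - h * h)
                net["w2"][j] -= lr * err * h
                net["b1"][j] -= lr * grad_pre
                for i, xi in enumerate(x):
                    net["w1"][j][i] -= lr * grad_pre * xi
            net["b2"] -= lr * err
    return net


def demo():
    """Fit a toy two-input response surface and report the error before and after."""
    rng = random.Random(0)
    data = [
        ((a / 2.0, b / 2.0), math.sin(a / 2.0) + 0.25 * b)   # toy stand-in target
        for a in range(-2, 3)
        for b in range(-2, 3)
    ]
    net = init_network(2, 6, rng)
    before = mean_squared_error(net, data)
    train(net, data)
    after = mean_squared_error(net, data)
    return before, after
```

Calling `demo()` returns the mean squared error on the training set before and after training; the second value should be clearly smaller, reflecting the learned input-output mapping. In a meta-model setting, the training pairs would instead come from cached simulation results.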