[Invited talks] [Contributed talks] [Minisymposia talks] [Student papers' talks]
Regularization based on total variation, and related higher order variants based on curvature of level sets, have been used very successfully in image processing. However, their non-smooth nature poses major challenge for numerical methods designed for their iterative numerical minimization. Primal based methods have to be regularized themselves, and even then are often only linearly convergent. Consequently, there has been interest in developing duality-based iterative methods for their solution. The advantage of these dual methods is that the objective function is smooth (often quadratic), but accompanied by many constraints, which may or may not be active at the solution. In this talk, I'll review some of these developments, starting with the PhD thesis of J. Carter, then reviewing the primal-dual method of Chan-Golub-Mulet (related to work by Conn and Overton), to a recent major advance by A. Chambolle for a all-dual formulation. I'll finish by presenting some recent work of mine (in collaboration with J.F. Aujol, S. Esedoglu and F. Park) on extending Chambolle's approach to higher order variants of TV and their applications to computing decomposition of images into geometric, texture and noise components.
Increasingly, financial institutions are concerned with portfolios of derivative instruments. Basic operations include designing effective hedging strategies and constructing investment rebalancing schemes to minimize value-at-risk (VaR) or conditional VaR. Unfortunately, for a portfolio of financial derivatives these problems are typically either ill-posed or ill-conditioned. We illustrate why this is the case, the negative computing implications of this ill-conditioning, and propose problem modifications to yield better-conditioned problems. In addition, we illustrate how some very expensive CVaR computations can, with a problem reformulation, be solved by efficient numerical procedures.
This work is joint work with colleagues Yuying Li, Katharyn Boyle, and Siddharth Alexander.
Derivative free optimization methods have been extensively developed in the past decade. To be able to employ the well developed convergence theory of derivative based methods, the models used (whether in a trust region or a line search approach) have to satisfy Taylor-like error bounds. We will present a unified framework to study these error bounds, initially for the case of polynomial interpolation. These bounds depend on the geometry of the interpolation set. We will analyze this geometry and present generalisations which include a viable way of measuring the quality of the geometry. We will also present extensions to least squares and minimum norm methods. Practical techniques of ensuring that the geometric requirement is satisfied and of improving the geometry of a sample set, as well as connections with other results, will also be included.
Joint work with Katya Scheinberg and Luis Vicente.
We consider application of recently developed spectral element multigrid (SEMG) methods to the study of unsteady incompressible flows. Our simulation approach is based on a semi-implicit formulation, in which the space-discretized Navier-Stokes equations are split into independent convective, viscous, and pressure substeps. Our multigrid solver is developed for the consistent Poisson problem governing the pressure arising from the lP_N - lP_{N-2} spectral element discretization of the Stokes problem. The pressure is associated with the fastest timescale and is therefore the leading contributor to stiffness and computational complexity in the Navier-Stokes time advancement.
The spectral element method (SEM) is a high-order weighted residual discretization similar to p-type finite element methods (FEMs). The SEM, however, is generally restricted to hexahedral elements (quadrilateral in lR^2) with nodal bases located on tensor products of Gauss-Legendre (GL) or Gauss-Lobatto-Legendre (GLL) quadrature points. The GL and GLL bases provide stability and allow efficient evaluation of quadratures in the weighted residual formulation, while the tensor product forms allow for efficient local evaluation of the operators. For a discretization involving E elements of order N (i.e., ~ E N^3 gridpoints in lR^3), the SEM operator evaluation and storage complexities are respectively only O(E N^4) and O(E N^3), which is in marked contrast to the O(E N^6) costs incurred by the standard p-type FEM. Typical values of N for the SEM are in the range N = 4 to 15.
Following the early work of Rønquist and Patera, and of Maday and Munoz, our SEMG method is based on intra-element restriction and prolongation, with coarse-grid spaces derived from on a reduction of N within a fixed mesh (i.e., fixed E). The earlier SEMG approaches were based on pointwise-Jacobi smoothing. This was shown numerically and theoretically to yield order-independent convergence rates in lR^1 with a multigrid convergence factor of rho = .75 for the Poisson equation. In lR^2, however, the performance deteriorates to rho = O(N^{1/2}) because the tensor-product GL and GLL bases yield high-aspect ratio cells within each element that are not amenable to pointwise smoothing.
Here, we develop an approach that extends to two and three dimensions. We avoid the cell aspect-ratio difficulty by using Schwarz-based smoothing in which one solves directly, using fast tensor-product solvers, local Poisson problems over subdomains that are taken to be extensions of each element. The coarsest level problem is derived from a linear finite element discretization based on a triangulation of the quadrilateral (or hexahedral, in 3D) spectral element mesh. We demonstrate that effective smoothing requires the local Schwarz solutions to be weighted by the inverse of the diagonal counting matrix that represents the number of subdomains sharing a given vertex. We compare this hybrid Schwarz-multigrid approach with other multi-level solvers for the Poisson equation, the Helmholtz equation, and for the Schur complement (consistent Poisson) matrix that governs the pressure in our unsteady Navier-Stokes computations. We demonstrate for several incompressible flow simulations that SEMG provides a two-fold reduction over our earlier two-level additive-Schwarz based approach.
Joint work with James W. Lottes.
Over the past decade, Algebraic Multigrid (AMG) has become a prominent tool for the rapid solution of unstructured-grid problems. For nearly 20 years after its inception, AMG was largely ignored, a victim of the overhead and setup costs necessary to create the components of the algorithm. So long as geometrically structured grids were the norm, this condition persisted. In recent years, however, as unstructured grids became more common and as the problem sizes grew to demand large-scale massively parallel computers, AMG came into its own. In this talk we describe AMG at a broad-brush level, indicating its relation to geometric multigrid. We discuss some pertinent facts of the history of AMG, mention the underlying philosophy of the method, and describe briefly the major families of AMG algorithms and their underpinnings. We then concentrate on the more recent advances in AMG, and in particular how an intense period of research primarily into perfecting the prolongation operators has led to a contemporary focus on coarse-grid selection, compatible relaxation, and adaptive AMG algorithms. We describe some of the recent advances in AMG theory. We conclude the presentation with some general observations and remarks designed to set the stage for the AMG presentations in minisymposia (M2 and M3) that follow.
The PageRank of a web page is a major ingredient in how the search engine Google determines the ranking of web pages. Computing the PageRank amounts to computing the stationary distribution of a stochastic matrix whose size is now in the billions.
We discuss the effect on the PageRank when links are added and removed, and when the personalization vector or damping factor are changed. We also discuss different approaches for computing PageRank.
In this talk we show how compactness can be used to design and analyze Newton-Krylov and multilevel algorithms for parameter-dependent compact fixed point problems. Such problems arise not only as integral equations, but also from indirect integrations such as source iteration in radiative transfer calculations and time-stepping approaches to dynamic analysis of parabolic partial differential equations.
The methods we propose are easy to implement and require no detailed knowledge of the internal structure of the fixed point map. We will illustrate the ideas with an example from computational chemistry.
An iterative solution method, in the form of a preconditioner for a Krylov subspace method, is presented for the Helmholtz equation. The preconditioner is based on a Helmholtz type differential operator with a complex term. A multigrid iteration is used for approximately inverting the preconditioner. The choice of multigrid components for the corresponding preconditioning matrix with a complex diagonal is made with the help of Fourier analysis. Multigrid analysis results are verified by numerical experiments. High wavenumber Helmholtz problems in heterogeneous media are solved indicating the performance of the preconditioner.
The Navier-Stokes equations describing the flow of an incompressible viscous fluid lead via discretization and lineaization to non-symmetric matrices with eigenvalues in both halves of the complex plane. Related subproblems are the Stokes equations which are symmetric but indefinite and the convection-diffusion equations which are non-symmetric but have eigenvalues with only positive real parts. The discrete Poisson operator represented by a symmetric and positive definite matrix is a component of all of these problems.
We will use this sequence of problems to describe how fast preconditioned iterative solvers can be built for the more and more challenging problems described by these various models of incompressible fluid flow. In particular we will show how good preconditioners respect the natural norms associated with the originating PDE problems and thus lead to iterative convergence in the norms relevant for the underlying problem.
This work is joint with Howard Elman and David Silvester.
Vanilla option prices satisfy an Integro-Differential equation, when the underlying asset follows a Levy process. We propose two different approaches to compute the numerical solution in the finite variation case.In the first approach, a graded mesh is used in combination with Simpson's rule. In the second approach, a connection with Volterra integral equations with weakly singular kernels is established, opening a possibilty for an application of known solution methods from the theory of Volterra Integro-Differential equations. Both approaches produce a second order convergence rate.
Unlike x-ray and CT imaging, the radio-frequency radiation used in Magnetic Resonance Imaging is low-energy and not thought to pose any long-term health risk. However, it does carry enough energy to heat the patient. To prevent patient discomfort the energy is limited by government regulation. With advances in other aspects of MR Imagers, these limits constrain the types and rapidity of clinical imaging, including the majority of body imaging in high-end 3 Tesla imagers. A new class of energy-minimizing (rf,gradient) pulse pairs is needed to remove this limitation. This has previously been formulated as an optimal control problem, but the simultaneous optimization of both controls has not been practical with current tools. So, we are developing a dedicated iterative optimizer.
In this talk, we will explain the structure of the control problem as a continuously-parametrized family of ODEs with a common set of piecewise-constant controls. We have developed an exact method of integration for such systems using symbolic calculation in Lie Groups, and a computational technique for obtaining gradients in O(N) time, with measured computation time much less than double the time for simple function evaluation. The overall optimization framework (1) approximates the continuous family of terminal constraints using a Remez-like algorithm; (2) approximates the penalized constraints with a low-dimensional quadratic surrogate; (3) defines an unconstrained problem incorporating both penalties and barriers for the different types of path constraints; and (4) solves the resulting problem using an interleaved Interior-Point/Remez algorithm.
In order to be able to simulate complex devices, we need in some cases a full three-dimensional code for the resolution of the Vlasov-Maxwell equations. A three-dimensional code has been written for this purpose and has already been used for many applications. However this code is sequential and on several occasions, it appears worthwhile to use multiple processors. This makes it indispensable to develop a version of the code that will run fast and reliably on parallel machines. In order to develop a parallel version that scales well when using many processors we choose to use a domain decomposition approach moreover this option allows us to reuse a large part of the sequential code for the resolution on each subdomain.
Let us now recall the different kind of Maxwell solvers and their features with respect to parallelization and point out the specificities of ours. The most used and fastest Maxwell solvers are based on Yee's scheme on uniform grids. These are completely explicit and straightforward to parallelize, at least when the continuity equation ¶t r+ div J=0 is numerically verified. However in complicated geometries or for some other reason it might be preferable to use unstructured grids. Several kind of methods have been developed in this case : Delaunay-Voronoi Finite Volume methods have the same nice properties concerning parallelization as those on structured grids. Their drawback is that they need Delaunay-Voronoi meshes which are difficult to generate in 3 dimensions. Other types of methods include traditional vertex centered or cell centered finite volume methods which are explicit and thus can be easily parallelized in the same conditions as finite difference methods. And finite element methods based either on edge elements or as in our case on Taylor-Hood elements. Concerning edge element solvers, the mass matrix needs to be inverted, so the parallelization problems are alike those arising for elliptic problems. In our case, i.e. Taylor-Hood elements, the mass matrix is diagonal. But the formulation is based on a constrained formulation of the Maxwell equations, which makes it impossible to have a completely explicit formula for the time advance and thus makes the parallelization challenging.
In our talk, we first shall recall the constrained wave equations formulation of Maxwell's equations which is used in the code. Then we shall introduce a variational formulation adapted to the given domain decomposition, the continuity at the interfaces being imposed by duality using Lagrange multipliers. After that we shall describe the discretization of this variational formulation and derive a linear system suitable for multiprocessor resolution. The preconditioned Uzawa algorithm used for the resolution of this system shall be described. And finally we shall present numerical results demonstrating the efficiency of the method.
Space filling curves have become a valuable tool in many applications, where data locality is an issue. A well known example in scientific computing is the parallelization of data during the discretization of partial differential equations. In this context, space filling curves have quasi-optimal locality properties. We would like to present several approaches that try to benefit from these locality properties in the context of cache efficient algorithms.
For the multiplication of two matrices we present an approach, where the matrix elements are mapped to the linear memory space via an index function that resembles iterations of the Peano space filling curve. The respective multiplication algorithm can then be designed such that temporal and spatial locality in the data access is optimized. With respect to temporal locality, the algorithm guarantees that during any k3 operations only O(k2) elements are accessed. Every element will be reused approximately k times. For spatial locality, the algorithm totally avoids jumps in memory space: matrix elements will either be directly reused, or the next accessed element will be one of its direct neighbours in memory. This is not only advantageous for cache reuse, but may also be an important property for hardware implementations.
The presented approach can also be generalized to store sparse matrices. Then, an adaptively refined iteration of the Peano curve is used for the index mapping. Locality properties are retained to a large extent.
We will also discuss a related approach to deal with sparse matrices, and especially with the corresponding systems of equations that result from discretizations of partial differential equations. In that case, optimal locality can be retained, if we allow a collection of stacks that hold intermediate results. This can even be extended to certain kinds of adaptive or multilevel discretizations.
We come across the type of interface problems where two different ordinary differential equations are defined on two different intervals with a common interface point and the solution satisfies certain matching condition at the interface. The analytical solution for this kind of problem have been studied. But there is no systematic study to find numerical solution of this type of problems. Here we like to present analysis and results to show that exising methods for IVPs and BVPs can be extended to find solution to these problems.
In this paper, a computational method using parametric cubic spline for general linear second order two point boundary value problems is analysed. The method is shown to have second and fourth order convergence for specific choice of parameters present. Some test examples are given to support the theoretical estimates.
When a thermo-elastic body is suddenly heated or cooled from out side, heat flow arises resulting in change in the temperature distribution in the body and development of stress field. In many engineering problems, thermoelastic effects due to a temperature distribution in the structure have to be considered. This paper aims to study the developments of these fields in a homogeneous, isotropic elastic plate of thickness ?H?, initially has temperature T0 and is in stress free state .The variation in temperature field and stress field occurs owing to the action of external loadings. Temperature and stress distribution have been determined using appropriate initial and boundary conditions. The numerical computations are carried out for a plate of stainless steel to display the variation of temperature and stress fields. It is concluded that the temperature distribution is symmetrically distributed and variation in stress component (tau)xy is more in left side and less in right side whereas stress component
As numerical techniques widely are applied to multi-physical, multi-scale and multi-dimensional engineering applications, the ability to compute solutions and furthermore the ability to quantify the accuracy of underlying mathematical models seem to be significant importance for robust and reliable numerical simulations. Over last two decades, a number of endeavors have focused on the challenge for development and improvement of notable error estimation and adaptive refinement methods. Significant progress has been dedicated in recent years by the use of the advanced a-posteriori finite element error estimation technique termed the bound method. This novel technique provides fast yet robust, reliable and efficient bounds to the `output' of underlying partial differential equations (PDEs) as well as naturally yields a local error indicator, which estimates numerical error for given mathematical models, leading to construct economical and optimized meshes for computations. The bound method in this work is applied to the multi-physical, micro-scale and multi-dimensional PDEs, in particular to facilitate engineering applications governed by the incompressible Navier-Stokes and Energy equations. Furthermore, the bound method devised by advanced numerical strategies, such as an adaptive refinement, the direct equilibration and a parallel computing techniques, will address inter-disciplinary and emerging applications: heat and mass transport phenomena in an array of electronic chip devices and heat exchangers; as well as fluidic motion flow and species transport phenomena of the electro-osmotic flow in microchannels for integrated microfluidic systems such as biochip and Lap-on-chip devices.
Numerical-analytical study of the three-dimensional nonlinear stochastic partial differential equations, analogous to that proposed by V. N. Nikolaevskii [Recent Advances in Engineering Science (Springer - Verlag, Berlin. 1989)] to describe longitudinal seismic waves, is presented. The equation has a threshold of short-wave instability and symmetry, providing long-wave dynamics. The results of computation are in a sharp contradiction with Tribelsky M.I and K. Tsuboi's work (Phys. Rev. Lett. 76 1631 (1996)), in which the influence of the thermal fluctuations was not taken into account and a principally erroneous scheme of numerical modeling of chaos on the lattice was used. The proposed new mechanism for quantum chaos generates nonlinear dynamical systems. The hypothesis is that physical turbulence could be identified with quantum chaos of considered type.
Adequate quantification of the effects of nuclear system parameter variations, as well as of material composition perturbations, on the lowest mode solution (the three-dimensional flux shape) of the neutronics eigenvalue equation is of importance in various areas in nuclear reactor technology. Examples are nuclear uncertainty analysis, sensitivity analysis for reactor transients and design optimization (i.e. fuel lattice design optimization, control rod movement planning, nuclear shielding design or in-core fuel management optimization).
A numerical development, utilizing physically meaningful, iteratively obtained preconditioners, is presented which is aimed essentially at enabling a fast higher order accurate treatment of the flux shape change caused by localised material composition changes (such as material choice variations, burnable poison density variations at lattice level, or complete fuel assembly permutations at core level). This fast method is particularly useful when the effects of large numbers of variations are to be assessed.
According to previously pursued studies in the numerical reactor physics community, acquisition of higher-order accurate results for the perturbed flux shape either entailed the use of preconditioned iterative methods, or required the availability of an extensive set of higher reference eigensolutions of the system equations. Both of these methodologies excluded the option to obtain a higher-order accurate estimate of a perturbed flux defined in only one or a limited number of localised spatial subregions in the system, without the need to compute the perturbed fluxes in all other positions. For example, the exact change in power density in one specific fuel element in a reactor, due to a material perturbation imposed elsewhere in the system (for example, due to control rod movements or burnable poison concentration variations) could follow only from an iterative procedure defined for the entire flux distribution.
The methodology presented here enables the setup of what could be interpreted as a polynomial form. This form features the property that the effect on a specific, only locally defined flux, caused by system variations elsewhere that are localised in space as well, can be expanded polynomially up to higher order accuracy, with the specific imposed variations themselves as functional arguments. For specific cases of interest, the application of this method leads to substantial savings in computational effort, while obtaining similarly accurate results. Numerical investigations, showing the accuracy and computational efficiency of the method, are reported, and possible application areas are discussed.
Krylov space methods for solving linear systems of equations come in many flavors and use various types of recursions to generate iterates xn (approximate solutions of Ax = b), corresponding residuals rn : = b - A xn, and direction vectors (or search directions) vn : = (xn+1 - xn) / wn. Starting from a general definition for a Krylov space solver we give necessary and sufficient conditions for the existence of the the various types of recursions, and we recall the relations that exist between the matrix representations of these recursions. Much of this is more or less well known, but there are also some new, perhaps even surprising aspects.
In this study, we first present a review of standard interpolating algorithms (based on splines and Hermite approximations) and then we introduce a new interpolation procedure proposed recently by Enright (2000) as an alternative to traditional interpolation methods. This new interpolation, called Differential Equation Interpolation (DEI), is based on associating a set of collocation points with each mesh element and requiring that the local approximation interpolates the meshpoint data and almost satisfies the PDE at the collocation points. We show, both theoretically and through numerical experiments, that the accuracy of this interpolation is higher than the accuracy of traditional interpolation and generally consistent with the meshpoint accuracy of the PDE solution. For most linear PDEs, the DEI is obtained by exact collocation and the associated linear system is solved analytically using Maple symbolic computation. For nonlinear PDEs, the DEI is based on almost collocation and the resulting nonlinear system is solved using an iterative Newton method. Examples are given for off mesh interpolation of numerical solutions of evolution PDEs such as the Korteweg-de Vries equation and regularized long wave equation and also for the integration of several invariants of motion related to the conservation of mass, momentum and energy.
We consider the problem of assigning parking slots to buses of different types as soon as their arrivals such that each bus can leave the depot next morning without maneuvers (i.e, rearrangement of buses within lanes), this problem is recognized in literature by "Dispatching Problem". We introduce a class of robust solutions for this problem where each depot lane is partitioned at most two blocks, lanes with one-block (containing one bus type) and lanes with two-block (containing two bus types gathered together, one block for each type). Each block have to contains only the buses of the right type, and the problem may not have a solution. So, the assignment an arrival bus to block of wrong type yield incompatible arrival, likewise, the assignment a bus that parked in block of a given type to departure that required a different bus type(each departure require a bus type) yield incompatible departure. Defining a mismatches by the assignment of a bus to a departure of wrong type. First, we formulate a model in witch the depot lanes arefilled according to specific patterns one-block patterns and two-block patterns, that minimize the number of incompatible arrivals and incompatibledepartures. A computational results are presented. Second, we extend the previous model to a model that minimize the number of mismatches. We propose Bender's decomposition as resolution approche, and a computational study is performed.
In this paper we give an iteration method for solving non-linear integro-differential equations.
[2] presents a preconditioner based on solving approximate Schur complement systems with overlapping restricted additive Schwarz methods (RAS) [1]. The construction of SchurRAS is as simple as in the standard RAS. The communication requirements for each application of SchurRAS are similar with those of the standard RAS approach. For a tested MHD problem, the number of iterations required by SchurRAS exhibits moderate growth with the increase in the number of processors. In the tests, the residual norm reduction by 10-6 was achieved by flexible FGMRES(60). In preconditioning, ILUT with lfil = 70 and the dropping tolerance 10-3 was used as a choice of an incomplete LU factorization. The tests were performed on 1.7Ghz IBM p690 at MSI. The SchurRAS takes about 234 steps to converge on 16 processors. However, iteration counts required by the RAS is 3431. Because the computational costs of SchurRAS are similar to those of RAS, the reduction in the iteration number also leads to the reduction in the execution time: RAS takes 654.39 seconds to converge on 16 processors, SchurRAS takes only 47.88 seconds.
In [2], the overlapping level is 1. This talk will investigate how the overlapping level affects the performance of RAS and SchurRAS respectively.
References
[1] X.-C. Cai and M. Sarkis. A restricted additive Schwarz
preconditioner for general sparse linear systems. SIAM Journal on
Scientific Computing, 21(2):792-797, 1999.
[2] Z. Li and Y. Saad. SchurRAS: A restricted version of the overlapping
Schur complement preconditioner. Technical Report 2004/77, Minnesota
Supercomputing Institute, 2004.
The behavior of road using is one of the complexities of living in today's world. As the trend of increasing travel demand, planning, design, prediction, control and management of the transportation system become more and more important. Traffic flow theory provides analytical techniques and the description of the fundamental traffic flow characteristics, such as flow, density and speed. This new science has addressed questions related to understand traffic processes and to optimize these processes through proper design and control. To make traffic management strategies (such as ramp metering, congestion tolls, entrance/exit control and so on) become more efficient, dynamic traffic flow model is necessary. There are four main methodologies of modeling dynamic traffic flow: car-following model, kinetic model, Boltzmann-like model and cellular automation. The kinetic model is the simplest of the methodologies and has advantages of having analyzable equations and rapidly computational properties. In 1955, Lighthill, Whitham and Richards firstly proposed a macroscopic kinetic traffic flow model (LWR model), which is a continuity equation. The model is derived from the conservation of vehicle numbers. With different boundary conditions, the LWR model can be considered to describe traffic flow under different control strategies. Periodical boundary conditions are suggested for describing signalized intersections or ramp metering interchanges. Furthermore, the LWR model is employed to predict the formation or dispersion of platoon with different initial conditions. Although the LWR model is simple, it is still a nonlinear model. Also, the formation of shock wave is a problem. In the viewpoint of numerical approximation, to handle the nonlinearity of the model and the discontinuity of the solutions are the difficult parts in simulation. Spectral methods are considered good approximations of nonlinear conservation laws. However, using spectral methods for nonlinear conservation laws will result in the formation of the Gibbs phenomenon once spontaneous shock discontinuities appear in solution. In this work, an enhanced spectral viscosity method is applied to stabilize the spectral method by adding a spectrally small amount of high-frequencies diffusion carried out in the dual space and a post-processing method, which removes the spurious oscillations at the discontinuities. An enhanced shock location detector is developed to ensure the post-processing method successful. In addition, the shock location detector can be used to detect the location of traffic incident (such as location of traffic jam or accidents), where discontinuity of density occurs.
Numerical solution of a set of semiconductor partial differential equations (PDEs) has been of great interest in these years. Nowadays, the dimension of semiconductor devices, such as MOSFETs is in the regime of nanometer. Quantum mechanical confinement effect becomes an important factor in numerical simulation of semiconductor devices. Among approaches, density-gradient (DG) approach has successfully been proposed to account for the effect of quantum mechanical confinement. A set of DG model involves the Poisson equation, the electron current continuity equation, and the electron quantum corrected density equation to be solved for a specified device domain. With the continuous decrease of device dimensions, one- and two-dimensional simulations are not accurate enough to describe and explore the physical transportation phenomena for electrons and holes in a semiconductor device.
In this work, we iteratively solve the 3D DG model for a given nanoscale semiconductor device, where several physical quantities and convergence properties are discussed. The solution procedure is described briefly as follows. (1) the stop criteria, mesh, output variables and simulated devices are chosen, (2) solving the Poisson equation with density-gradient correction modified potential iteratively, (3) after the Poisson equation is convergent, continuity equations are solved, (4) then, we'll check the whole system converges or not, (5) and if the whole system converges, then stop computing, otherwise, the Poisson equation and continuity equations should be solved again until the whole system equations converge. All decoupled PDEs to be solved are approximated with the finite element method over 3D domain. This scheme makes sure the solution will be self-consistent. By calculating several essential characteristics of devices, such as the threshold voltage (VTH) and the drain current (ID), the device physical quantities are compared with the results of 1D and 2D simulations by varying channel lengths. Also, the convergence behavior in the inner and outer iteration loops is discussed among 1D, 2D, and 3D simulations.
The Lanczos method is an iterative method for solving the linear system $Ax = b$, where $A$ is large, sparse, symmetric, and positive definite. This method breaks down if $A$ is indefinite and the $LDL^T$ factorization of the tridiagonal matrix $T$ arising from the underlying Lanczos process does not exist. In this talk, we present a pivoting strategy for a \textsl{block} $LDL^T$ factorization of $T$. We demonstrate normwise backwards stability and how this factorization can be applied to the Lanczos method.
Microstructures in Shape Memory Alloys (SMA) hold the key to the unique properties of these materials. However, the numerical simulation of the SMA microstructures is quite involved due to non-convexity of the associated free energy functional, coupling between mechanical and thermal fields, and strong nonlinearities.
In this presentation, we analyse numerically phase combinations in a SMA patch by reformulating the original problem as an energy minimization problem. Our main emphasis is given to square-to-rectangular phase transformations, induced in a low-temperature SMA patch by applying either thermal or mechanical loadings.
The free energy density of the SMA patch is constructed on the basis of the modified Ginzburg-Landau theory for a specific case of square-to-rectangular transformations. Due to non-convexity of the Ginzburg-Landau free energy functional, we have to deal with many local minima.
First, we apply the multiple population genetic algorithm to search for an estimation of the global minimizer. Then, by using this estimation as an initial guess, we apply the sequential quadratic programming based on the quasi-Newton iteration method to refine the global minimum with a higher accuracy. By this hybrid optimization method, we are able to model laminar microstructures in a SMA patch with and without external loadings, and several specific examples will be discussed in this talk.
Many problems from engineering and the sciences require the solution of sequences of linear systems where the matrix and right-hand side change from one system to the next, and the linear systems are not available simultaneously. We review a class of Krylov subspace methods for sequences of linear systems, which can significantly reduce the cost of solving the next system in the sequence by "recycling" subspace information from previous systems. These methods have been successfully applied to sequences of linear systems arising from several different application areas. We analyze a particular method, GCRO-DR, that recycles approximate invariant subspaces, and establish residual bounds that suggest a convergence rate similar to one obtained by removing select eigenvector components from the initial residual. We review implications of this analysis, which suggests problem classes where we expect this technique to be particularly effective. From this analysis and related numerical experiments we also demonstrate that recycling the invariant subspace corresponding to the eigenvalues of smallest absolute magnitude is often not the best choice, especially for nonsymmetric problems, and that GCRO-DR will, in practice, select better subspaces. These results suggest possibilities for improvement in the subspace selection process.
A large technogeneous or space catastrophe, as a rule, is known to accompanied by the initiation of large forest fires. In this paper attention is given to questions of description of the initial stage in the development of a mass forest fires initiated by high altitude radiant energy source (for example Tunguska celestial body fall and etc.). Mathematical model of forest fire was based on an analysis of known experimental data and using concept and methods from reactive media mechanics. Within the framework of this model, the forest and combustion products are considered as a homogeneous two temperatures reacting medium. The temperature of condensed (solid) phase includes a dry organic substance, moisture (water in the liquid-drop state), condensed pyrolysis and combustion products and mineral part of forest fuels. In the gaseous phase we separate out only the components necessary to describe reactions of combustion (oxygen, combustible products of pyrolysis of forest fuels and the rest inert components). It is considered that 1) the flow has a developed turbulent nature, molecular transfer being neglected, 2) gaseous phase density doesn't depend on the pressure because of the low velocities of the flow in comparison with the velocity of the sound, 3) forest canopy is supposed to be non-deformed porous medium. The research is done by means of mathematical modeling of physical processes. It is based on numerical solution of Reynolds equations for chemical components and equations of energy conservation for gaseous and condensed (for canopy) phases. To describe the transfer of energy by radiation the diffusion approximation is used. The boundary-value problem is solved numerically using the method of splitting according to physical processes. In the first stage, the hydrodynamic pattern of flow and distribution of scalar functions are calculated. The system of ordinary differential equations of chemical kinetics obtained as a result of splitting is then integrated. The chemical time step is selected automatically. A discrete analog for equations is obtained by means of the control volume method using the SIMPLE algorithm. Calculation method and program have been check. As a result of mathematical modeling the distributions of temperatures, mass concentrations of components of gaseous phase, volume fractions of components of solid phase, as well as fields of velocity at different instants of time are obtained. It allows to investigate dynamics of forest fire initiation under influence of various external conditions: a) meteorology conditions (air temperature, wind velocity etc.), b) type (various kinds of forest combustible materials) and their state (load, moisture etc.). A great deal of final and intermediate gaseous and dispersed combustion products of forest fuels is known to be exhausted into the atmosphere during forest fires. The knowledge of these kinds of ejections enables a full estimate of the damage from forest fires to be made. The numerical results obtained by the proposed method show satisfactory agreement with experimental measured data.
The overset grid system nicknamed "Ying-Yang" grid is singularity free and has quasi-uniform grid spacing. It is composed of two identical latitude/longitude orthogonal grid panels that are combined to cover the sphere with partial overlap on their boundaries. The system of Shallow-Water equation (SWEs) is a hyperbolic system at the core of many models of the atmosphere. In this paper SWEs are solved on the Ying-Yang grid by using a semi-implicit and semi-Lagrangian discretization on a staggered mesh. The scalar elliptic equation is solved trough Schwarz-type domain decomposition methods, known as optimized Schwarz methods, which gives better performance than the classical Schwarz methods by using specific Dirichlet- or Robin- type interface conditions.
This work is an application of numerical analysis to the solution of open problems in computational electromagnetics. More specifically, modeling of integrated antennas by the Method of Moment (MoM) is considered. We propose using an efficient wavelet approximate method to solve the electrical field integral equation (EFIE). The method is applied on irregular grids (triangular sub-domains) and is applicable to arbitrary antenna shapes, which cannot be accomplished by using FFT, for example.
Instead of searching new discretizations, we propose the discrete wavelet transform (DWT) to be applied to the discretized EFIE via the MoM, and then to solve the resulting matrix equation in the wavelet space, followed by the inverse DWT of the solution vector. This approach allows the available software for discretizing EFIE to be used with greater success, especially when solving matrix equations of very large size. The proposed procedure involves O(N^2) operations and leads to a matrix problem, which can be sparsified. The latter opportunity is based on the fact that the MoM matrix is fully populated but has many small off-main-diagonal elements, which can be neglected without losing valuable information. To that goal, suitable threshold techniques are used to create sparse matrix while not compromising the accuracy. This is possible because the wavelet expansions create matrices with condition numbers orders of magnitude better that the conventional MoM. We solve the matrix problem using iterative methods (e.g., by the BiCG method) with suitable preconditioners in O(N^2) operations.
Several numerical examples are considered and it is shown that the CPU time is reduced by a factor of 10 for problems of N > 1000.
The Sylvester equation
A X + X B = CDT, (1)
where A and B are large square matrices and C and D are rectangular matrices plays a significant role in linear control theory. The most popular methods construct low rank approximate solutions to (1) of the form
Xj = Vj Yj WjT, (2)
where Yj solves a small order Sylvester equation and where Vj and Wj are rectangular matrices whose columns form orthonormal bases of the block Krylov spaces
Kj (A, C ) = range[ C, A C, ..., Aj-1 C ],
Kj (BT, D ) = range[ D, BTD, ..., (BT)j-1 D ].
We propose an algorithm to compute low rank approximations of the type (2), where the matrices Vj and Wj are constructed with a recent version of the block Arnoldi algorithm. A version which takes into account the near breakdowns, i.e., the near singularities in the Krylov matrices
[ C, A C, ..., Aj-1 C ] and [ D, BTD, ..., (BT)j-1 D ].
This leads to matrices Vj and Wj whose columns are selected from the spaces Kj (A, C) and Kj (BT, D) in a way appropriate for convergence.
We show how to detect and use the near breakdowns and discuss the theory and numerical aspects of the proposed algorithm.
The aim of this paper is to introduce and to study symplectic Householder transformations, based on an algebraic and geometric approaches. The construction is so that the parallel between orthogonal Householder and symplectic Householder becomes easy and natural. A block form is given. A structure-preserving algorithm for computing SR factorization via symplectic Householder transformations is obtained. Numerical tests are given. Keywords: Skew-symmetric inner product, symplectic geometry, symplectic transvection, symplectic Householder transformation, symplectic SR factorization, structure-preserving methods.
AMS subject classifications. 65F15, 65F50
Consider the solution of A u = f using additive Schwarz (right) preconditioning, i.e., A M-1 w = f, with Mu = w. The preconditioner is M-1 = åi=1p RiT Ai-1 Ri , where Ai = Ri A RiT. We also consider a restricted additive Schwarz preconditioner [1]. When the local problems Ai zi = vi are hard to solve, one considers approximations to their solution, and uses some inexact local solver [A\tilde]i, for example, one may use a (local or inner) Krylov method with some (fixed) inner stopping criterion. We apply the recently developed theory of inexact Krylov subspace methods [2], or in this case of inexact preconditioning, by which we allow the inexactness of the local solver to change from step to step. We provide a computable criterion to allow the local (or inner) stopping criterion to be relaxed as the (outer) iteration progresses while maintaining overall convergence to the desired (outer) tolerance. Numerical examples illustrate the potential of this dynamically chosen local inexact solvers.
References:
[1] X.-C. Cai and M. Sarkis. A restricted additive Schwarz
preconditioner for general sparse linear systems. SIAM J. Sci. Comput.,
21:792-797, 1999.
[2] V. Simoncini and D. B. Szyld. Theory of inexact Krylov subspace
methods and applications to scientific computing. SIAM J. Sci. Comput.,
25:454-477, 2003.
There are many examples where non-orthogonality of a basis for Krylov subspace methods arises naturally. These methods usually require less storage or computational effort per iteration than methods using an orthonormal basis (öptimal" methods), but the convergence may be "delayed". Truncated Krylov subspace methods and other examples of "non-optimal" methods have been shown to converge in many situations, often with small delay, but not in others. We explore the question of what is the effect of having a non-optimal basis. We prove certain identities for the relative residual gap, i.e., the relative difference between the residuals of the optimal and non-optimal methods. These identities and related bounds provide insight into when the delay is small and convergence is achieved. Further understanding is gained by using a general theory of superlinear convergence recently developed. Our analysis confirms the observed fact that the orthogonality of the basis is not important, only the need to maintain linear independence is. Numerical examples illustrate our theoretical results.
We analyze an iterative method for minimizing general nonlinear function with equality and inequality constraints by interior point method. Applying the barrier penalty technique, we solve a sequence of unconstrained minimization problems with primal-dual merit function. We develop an iterative method to compute the Newton steps. Descent direction and direction of negative curvature are found by using alternatively Conjugate Gradient and Lanczos methods. Various computational enhancements to improve these directions are provided. Our strategy is well adapted to deal with indefiniteness and can be applied to large scale nonlinear optimization problems. Promising preliminary numerical results are presented.
Objectives: The heart rate variability has been shown to have fractal behaviour during the time. Fractals are self similar shapes with fractional dimension and we can use them to find order in the biologic systems which seem to have non predictive and irregular behaviour. The exact relationship between fractal behaviour of heart rate variability and the condition of the cardiovascular system and its changes with age and diseases have not well studied.
We used an artificial neural network for screening the patient with arrhythmia as a pathologic state based on fractal dimension of heart rate variability.
Methods: We used 200 digital rhythmograms which included 100 healthy subjects and 100 patients with arrhythmia. We calculated the fractal dimension of heart rate variability using Nlyser software (version 3.5). We designed an artificial neural network by Neurosollutions software (version 4) and we used the age, gender and fractal dimension of heart rate variability of 150 rhythmograms (75 healthy rhythmograms and 75 ones with arrhythmia) as input data and the condition of being healthy or having arrhythmia as target for training the artificial neural network. We used the data of 20 rhythmograms (10 healthy rhythmograms and 10 ones with arrhythmia) as validation set. Finally after training the network it was tested using the data of 30 rhythmograms.
Results: The designed artificial neural network could screen the healthy persons from arrhythmia patients based on the fractal dimension of heart rate variability with 0.2 mean squared errors. The network was 100% successful in finding healthy persons and it was 90.14% successful in finding arrhythmia patients. It has fault for discriminating the arrhythmia patients in 98.6% of cases.
Conclusions: Fractal dimension which is used as a factor showing fractal characteristics of heart rate variability can be used as a diagnostic tool to screen the patients with arrhythmia by artificial neural networks. The results will depend on the size of our training set for artificial neural network. If we develop the training set the network will be more successful in screening the patients.
Latent Semantic Indexing (LSI) is an information retrieval (IR) method that unites IR with numerical linear algebra by representing a dataset as a term-document matrix. Because of the tremendous size of modern databases, such a term-by-document matrix can potentially be very large, with millions of entries. The partial singular value decomposition (PSVD) is a matrix factorization that captures the salient features of a matrix, while using much less storage. We look at two challenges posed by this PSVD data compression process in LSI. Traditional methods of computing the PSVD are very expensive; most of the processing time in LSI is spent in calculating the PSVD of the term-by-document matrix [1]. Given the potentially huge size of this matrix, the first challenge is calculating the PSVD efficiently, in terms of computational and memory requirements. The second challenge is efficiently updating the PSVD when the matrix is altered slightly. In a rapidly expanding environment, such as the Internet, the term-by-document matrix is altered often as new documents and terms are added. Updating the PSVD of this matrix is much more efficient than recalculating it after each change. We investigate the use of SVD updating methods proposed by Zha and Simon [2] to meet both of these challenges. These updating methods for adding documents or terms to the term-by-document matrix use the previously calculated PSVD of the term-by-document matrix to compute the PSVD of the updated matrix. Results will be presented illustrating that updating in this manner provides a tremendous saving in computation time, with little or no significant reduction in accuracy. Having shown that the updating methods are both fast and accurate, an algorithm for iteratively computing the PSVD of an extremely large matrix using the document updating method will then be presented. This iterative method, suggested by Zha and Zhang [3], provides a means of calculating the PSVD for matrices so large that the computation would be infeasible using traditional methods. Again, results will be given showing that this method provides savings in computational time and memory resources without compromising the accuracy of the results.
[1] M.W. Berry, S.T. Dumais, and G.W. O'Brien. Using Linear Algebra for
Intelligent Information Retrieval. SIAM Rev., 37(4):573-595, 1995.
[2] H. Zha and H.D. Simon. Timely Communication on Updating Problems in
Latent Semantic Indexing. SIAM J. Sci. Comput., 21(2):782-791, 1999.
[3] H. Zha and Z. Zhang. Matrices with Low-rank-plus-shift Structure:
Partial SVD and Latent Semantic Indexing. SIAM J. Matrix Anal. Appl.,
21(2):522-536, 1999.
In this paper we propose a new trust region method for systems of nonlinear equarions. We consider this problem only in a subspace. Global convergece is got using GMRES strategy. Numerical experiences show our algorithm is efficient.
The biharmonic Dirichlet problem appears in many applications. Therefore, there is a lot of interest in developing high-performance methods for its solution. Two important components of performance are accuracy and efficiency. In this research, we combine a sixth order discretization method and a Fast Fourier Transform (FFT)-based solver for the solution of the biharmonic Dirichlet problem.
The discretization of the problem uses the collocation methodology and quartic splines. Quartic Spline Collocation (Quartic SC) methods that produce sixth order approximations have been recently developed for the solution of fourth order Boundary Value Problems (BVPs). In this paper, we apply an adjustment to the quartic spline basis functions so that the Quartic SC matrix has more favorable properties. We study the properties of two Bi-Quartic SC matrices, more specifically, one that corresponds to Dirichlet and Neumann conditions (the standard harmonic problem) and another that corresponds to Dirichlet and second derivative conditions. The latter is solvable by FFTs and used as preconditioner for the first which is solved by the preconditioned GMRES method. Numerical experiments verify the effectiveness of the solver and preconditioner.
Structure preserving methods for eigenvalue problems have received quite some attention recently. We present several alternative numerical methods for the solution of an eigenvalue problem with Hamiltonian structure. These approaches are based on Jacobi-Davidson type techniques.
Lifetimes of metastable states may be determined by computing complex eigenvalues of a complex-symmetric matrix. One way to do this is to use the complex-symmetric version of Cullum and Willoughby's non-orthogonalised Lanczos procedure [1]. It is also possible use shift and invert complex-symmetric Lanczos to compute eigenvalues window by window [2]. Both of these approaches require doing complex matrix-vector products. In this talk I shall outline a new method for obtaining complex eigenvalues from real matrix-vector products. The complex-symmetric eigenvalue problem is written as a quadratic real eigenvalue problem which is linearized [3]. We attempted to solve the linearized (asymmetric) problem using the symmetric indefinite Lanczos method [4] but discovered that the symmetric indefinite Lanczos method works poorly for the purpose of computing many eigenpairs. Instead, we use the standard two-sided Lanczos algorithm (exploiting the symmetry of the eigenvalue problem to reduce the number of matrix-vector products by a factor of two) to compute approximate eigenvectors. It is not possible to get very accurate eigenvectors. We use groups of approximate eigenvectors as a basis and compute extremely accurate complex eigenvalues.
[1] Jane Cullum, Wolfgang Kerner, Ralph Willoughby A generalized
nonsymmetric Lanczos procedure, Comput. Phys. Comm., 53 (1989)
[2] Bill Poirier and T. Carrington
A preconditioned inexact spectral transform
method for calculating resonance energies and widths, as applied to HCO,
J. Chem. Phys. 116 1215-1227 (2002).
[3] A. Neumaier and V.A.  Mandelshtam, Further generalization and
numerical implementation of pseudo-time Schroedinger equations for
quantum scattering calculations, J. Theor. Comp. Chem. 1, 1 (2002).
[4] Templates for the Solution of Algebraic Eigenvalue Problems
Zhaojun Bai, James Demmel, Jack Dongarra, Axel Ruhe, and Henk van der Vorst
The computation of optical modes inside axisymmetric cavity resonators with a general spatial permittivity profile is a formidable computational task. In order to avoid spurious modes the vector Helmholtz equations are discretised by a mixed finite element approach. We formulate the method for first and second order Nedelec edge and Lagrange nodal elements. We discuss how to accurately compute the element matrices and to solve the resulting large sparse complex symmetric eigenvalue problems. We validate our approach by three numerical examples that contain varying material parameters and absorbing boundary conditions (ABC).
Iterative methods that find many eigenvalues of large, sparse, symmetric matrices have to deal with the converged eigenpairs. Typically some form of deflation is used to keep them from reappearing, and to speed up convergence. One of the more stable forms of deflation is known as locking. This usually means that converged eigenvectors Q are removed from the search space V, and all current and subsequent vectors in V are orthogonalized against Q. The alternative, sometimes called soft-locking, is to keep Q in the basis, making sure the orthogonality of V is maintained. Converged eigenvectors must be flagged as converged so that they are not targeted for further improvement by the algorithm. In soft-locking, the converged eigenvectors participate in the Rayleigh-Ritz projection, hence they improve at every step. However, the basis becomes large, and restarting and Rayleigh Ritz projection become too expensive.
We show that, in some cases, locking is susceptible to numerical instability, especially when the eigenvectors are not computed close to full accuracy. In particular, if k eigenvectors Q have converged and are locked with residual tolerance e, any subspace method (like Lanczos or Jacobi-Davidson) may be unable to converge to the k+1 eigenvector to full accuracy e. The reason is that the k+1 eigenvector of A is not the same as the lowest eigenvector of (I-QQT)A(I-QQT). The same reason may cause the correction equation of Jacobi-Davidson type methods to converge without having produced the appropriate correction. We explore possible solutions to these problems.
Numerical simulation of physical processes is often constrained by our ability to solve the complex linear systems at the core of the computation. Classical geometric and algebraic multigrid methods rely on (implicit) assumptions about the character of these matrices in order to develop appropriately complementary coarse-grid correction processes for a given relaxation scheme. The aim of the adaptive multigrid framework is to reduce the restrictions imposed by such assumptions, thus allowing for efficient black-box multigrid solution of a wider class of problems.
There are, however, many challenges in altogether removing the reliance on assumptions about the errors left after relaxation. In this talk, we discuss work to date on an adaptive AMG algorithm that chooses the components of the coarse-grid correction cycle based on automated analysis of the performance of relaxation. Fundamental measures of the need for and quality of a coarse-grid correction will be discussed, along with related techniques for choosing coarse grids and interpolation operators. We will also discuss the (lack of) computability of these ideal measures, and suggest cost-efficient alternatives.
Considerable efforts in recent multigrid research have concentrated on algebraic multigrid schemes. A vital aspect of this work is uncovering algebraically smooth error modes in order to construct effective multigrid components. Many existing algebraic multigrid algorithms rely on assumptions regarding the nature of algebraic smoothness. For example, a common assumption is that smooth error is essentially constant along `strong connections'. Performance can degrade as smooth error for a problem differs from this assumption. Through tests on the homogeneous problem (Ax = 0) adaptive multigrid methods expose algebraically smooth error.
The method presented in this talk uses relaxation and subcycling on complementary grids as an evaluative tool in correcting multigrid cycling. Each complementary grid is constructed with the intent of dampening a subset of the basis of algebraically smooth error. The particular implementation of this framework manages smooth error in a manner analogous to spectral AMGe. Numerical results will be included.
Standard algebraic multigrid methods assume explicit knowledge of so-called algebraically-smooth or near-kernel components, which loosely speaking are errors that give relatively small residuals. Tyically, these methods automatically generate a sequence of coarse problems under the assumption that the near-kernel is locally constant. The difficulty in applying algebraic multigrid to lattice QCD is that the near-kernel components can be far from constant, often exhibiting little or no apparent smoothness. In fact, the local character of these components appears to be random, depending on the randomness of the so-called "gauge" group. Hence, no apriori knowledge of the local character of the near-kernel is readily available.
This talk develops an adaptive algebraic multigrid (AMG) preconditioner suitable for the linear systems arising in lattice QCD. The method is a recently developed extension of smoothed aggregation, aSA [1], in which good convergence properties are achieved in situations where explicit knowledge of the near-kernel components may not be available. This extension is accomplished using the method itself to determine near-kernel components automatically, by applying it carefully to the homogeneous matrix equation, Ax=0. The coarsening process is modified to use and improve the computed components. Preliminary results with model 2D QCD problems suggest that this approach may yield optimal multigrid-like performance that is uniform in matrix dimension and gauge-group randomness.
[1] Brezina, M., Falgout, R., MacLachlan, S., Manteuffel, T., S. McCormick, S., Ruge, J.: Adaptive Smoothed Aggregation (aSA). SIAM J. Sci. Comp., 25 (2004), pp. 1806-1920.
In this talk we consider two related approaches for solving linear systems that arise from a higher-order finite element discretization of elliptic partial differential equations. The first approach explores direct application of an algebraic-based multigrid method (AMG) to iteratively solve the linear systems that result from higher-order discretizations. While the choice of basis used on the discretization has a significant impact on the performance of the solver, results indicate that AMG is capable of solving operators from both Poisson's equation and a first-order system least-squares (FOSLS) formulation of Stoke's equation in a scalable manner, nearly independent of basis order, p. The second approach incorporates preconditioning based on a bilinear finite element mesh overlaying the entire set of degrees of freedom in the higher-order scheme. AMG is applied to the operator based on the bilinear finite element discretization and is used as a preconditioner in a conjugate gradient (CG) iteration to solve the algebraic system derived from the high-order discretization. This approach is also nearly independent of p. We present several numerical examples that support each method and discuss the computational advantages of the preconditioning implementation.
Algebraic multigrid (AMG) is a very efficient iterative solver and preconditioner for large unstructured linear systems. Traditional coarsening schemes for AMG can, however, lead to computational complexity growth as problem size increases, resulting in increased memory use and execution time, and diminished scalability. We propose new parallel AMG coarsening schemes, that are based on solely enforcing a maximum independent set property, resulting in sparser coarse grids. The new coarsening techniques remedy memory and execution time complexity growth for various large three-dimensional (3D) problems. If used within AMG as a preconditioner for Krylov subspace methods, the resulting iterative methods tend to converge fast. For some difficult problems, however, these methods don't converge well enough, and improved interpolation is necessary in order to obtain scalable methods.
*This is joint work with Ulrike Yang. This work was performed under the auspices of the U.S. Department of Energy by the University of California Lawrence Livermore National Laboratory under contract number W-7405-Eng-48 and subcontract number B545391.
Algebraic multigrid (AMG) is a very efficient, scalable algorithm for solving large linear systems on unstructured grids. When solving linear systems derived from systems of partial differential equations (PDEs) often a different approach is required than for those derived from a scalar PDE. There are mainly two approaches, the function approach (also known as the "unknown" approach), and the nodal or "point" approach. The function approach defines coarsening and interpolation separately for each function. The nodal approach uses AMG in a block manner, where all variables that correspond to the same grid node are coarsened, interpolated and relaxed together. While the function approach is much easier to implement and often more efficient, there are problems for which this approach is not sufficient and the more expensive nodal approach is needed.
In this paper we present several parallel implementations of both approaches using various coarsening schemes and interpolation operators. Advantages and disadvantages of both approaches are discussed, and numerical results are presented.
*This work was performed under the auspices of the U.S. Department of Energy by University of California Lawrence Livermore National Laboratory under contract number W-7405-Eng-48.
In this talk, we present algorithmic and parallel performance improvements to Sandia's algebraic multigrid solver for the eddy current equations. The motivation for these improvements is a 3600 processor, multiphysics Zpinch simulation on Sandia's parallel machine, Janus. We begin with a very brief overview of our AMG solver for the eddy current equations. We then discuss the various challenges that large-scale Zpinch simulation present to our particular AMG solver, such as operator load-balancing and memory constraints, as well as our solutions to these issues. We finish with numerical results that show vastly improved scalability in the AMG solver.
In this paper we introduce a new framework for image registration. Our formulation is based on consistent discretization of the optimization problem coupled with a multigrid solution of the linear system which evolve in a Gauss-Newton iteration. We show that our discretization is h-elliptic independent of parameter choice and therefore a simple multigrid implementation can be used. To overcome potential large nonlinearities and to further speed up computation, we use a multilevel continuation technique. We demonstrate the efficiency of our method on a realistic highly nonlinear registration problem.
I will present some recent work looking at iterative methods for a fairly general family of partitioned systems of nonlinear equations that seems to occur in numerous applications (e.g., discretisation of elliptic partial differential equations). In such systems, various analytic problem formulations exist based on a natural partitioning of the unknowns or introduction of auxiliary unknowns. The existence of distinct analytic formulations prompts the question of which analytic system should be solved. While the problems are mathematically equivalent, once fed into appropriate variants of Newton's method, distinct discrete dynamical systems result. These distinct formulations suggest different preconditioning strategies for inner iterations of inexact Newton methods in turn. I will present some examples of such systems and some suggestions of how to choose between alternative equivalent formulations of nonlinear systems of equations that arise in practical applications.
This talk introduces a variation on standard preconditioned Krylov methods for solving linear systems, where we extend our search space using several preconditioners instead of just one. Our multi-preconditioned conjugate gradient (MPCG) solver can prove useful for problems where it is not clear which preconditioning approach is the most effective. At the very least, MPCG can robustly identify which to select for a faster PCG iteration, but in some cases can automatically exploit synergy between the preconditioners for asymptotically faster convergence. Numerical experiments for bending problems and domain decomposition highlight the capabilities of MPCG. Unfortunately, similar to flexible methods, the recurrence relation is not short in general. However, in at least one case (where the two preconditioners sum to the original matrix, as in 2D ADI) I will prove the recurrence is short. I will also explore the possibilities for truncation in other cases.
We consider the application of preconditioned Krylov subspace methods to continuation problems, in which the object is to track a solution curve as a continuation parameter varies. Continuation methods are typically of predictor-corrector form, in which, at each step along the curve, Newton-like corrector iterations are used to return to the curve from an initial "predicted" point. The linear system that characterizes corrector steps is underdetermined, and we discuss the adaptation of preconditioned Krylov subspace methods to solving this system. The focus is on a particular approach that preserves problem conditioning, allows the use of preconditioners constructed for the fixed-parameter case, and has certain other advantages. The effectiveness of the approach is shown in numerical experiments on large-scale problems.
Saddle-point systems arise in any problem with constraints: the incompressibility condition is a constraint on the momentum equations for incompressible flows, coupling in multiphysics problems is solving one problem subject to the other being satisfied (and conversly by duality) and throughout Optimization constrained problems are ubiqitous. In this talk we will describe preconditioners for linear saddle-point systems which preserve the constraints. Such `Constraint Preconditioners' have attractive theoretical properties which we will describe.
The problem of finding good preconditioners for the numerical solution of a certain important class of indefinite systems is considered. These systems are of a saddle point structure $$ \left[\begin{array}{cc} A & B^T \\ B & 0 \\ \end{array}\right] \left[\begin{array}{c} x \\ y \\ \end{array}\right] = \left[\begin{array}{c} c \\ d \\ \end{array}\right], $$ where $A$ is a $n$ by $n$ real, symmetric matrix and $B$ is a $m$ by $n$ real matrix $(m\leq n)$. The idea of using ``constraint preconditioners'' that have a specific 2 by 2 block structure was introduced by Keller, Gould and Wathen (SIMAX vol 21, 2000). They showed that the preconditioned system have some favourable spectral properties. However, even with simple choices for these constraint preconditioners, the factoring of the preconditioner, $K$, may become infeasible for large problems. Suppose that we use a constraint preconditioner of the form $$K = P B P^T,$$ where solutions with each of the matrices $P$, $B$ and $P^T$ are easily obtained. In particular, rather than obtaining $P$ and $B$ from a given $K$, $K$ is derived implicitly from specially chosen $P$ and $B$. We shall investigate how these ``approximate factorization constraint preconditioners'' can be successfully applied to saddle-point problems of the form considered by Keller, Gould and Wathen.
The mixed finite element formulation of the time-harmonic Maxwell's equation motivates the preconditioners considered in this talk. The saddle point linear system that represents the discretization has a (1, 1) block with high nullity. A discrete divergence-free property and spectral equivalence results that we derive are exploited, and a special block diagonal preconditioner is proposed for the problem. We also present a class of algebraic preconditioners that do not rely on computation of Schur complements or their approximations and may work well for a large class of saddle point systems.
We develop an inversion methodology for 3D electromagnetic data when the forward model consists of Maxwell's equations in which the permeability is constant but electrical conductivity can be highly discontinuous. The goal of the inversion is to recover the conductivity given measurements of the electric and/or magnetic fields. A standard Tikhonov regularization is incorporated and we use an inexact, all-at-once methodology, solving the forward problem and the inverse problem simultaneously in one iterative process. This approach allows development of highly efficient algorithms. Here we present the basic methodology for the all-at-once approach. We then briefly review the time domain forward modelling equations, develop the linearized Gauss-Newton system of equations to be solved at each iteration, and show how these equations can be solved using a preconditioned conjugate gradient solution. We invert a synthetic data set generated from a surface loop transmitter and receivers in four boreholes.
The exponential growth rate h of the number of configurations on a multi-dimensional grid arises in many fields. It is called topological entropy in mathematics, entropy in physics and multi-dimensional capacity in engineering. In the 1-dimensional case the entropy equal to the spectral radius of a certain digraph. There are very few 2-dimensional models where the value of h is known in a closed form. In this talk we first survey the theory of the computation of h by using lower and upper bounds that converge to h. As a demonstration of these techniques, we compute the topological entropy of the monomer-dimer covers of the 2-dimensional grid to 8 decimal digits and of the 3-dimensional grid with an error smaller than 1.35
Second we discuss a more delicate problem to compute the exponential growth rate h(p), with prescribed densities of configurations. For that purpose we introduce the notion of pressure P, which is a convex function in d-dimensional space. We can compute its values using lower and upper values which converge to P. The values of partial derivatives of P give the value of h(p), which can be estimated from below and above.
As an application we compute the h(p) for the monomer-dimer configurations in 2-dimensional grid. First we let p be the density of the dimers. Then our lower and upper bounds confirm the heuristic computations of Baxter 1968. Second we show that we can compute the h(x,y), where x and y are the densities of the dimers in the direction X and Y in the plane.
The exponential growth h(p) of the monomer dimer configurations in the d-dimensional lattice, with prescribed density p of dimers, is equivalent to the exponential growth of the number of partial matching in the bipartite graphs corresponding to d-dimensional lattice tori with even length sides. The number of this partial matching is equal to a sum of subpermanents of certain orders in the corresponding integer value matrix.
In the work of Friedland and Peled it was shown that h(p) can be bounded below by a(p), using the sharp lower bound on the subpermanents of doubly stochastic matrices. As in the work of Schrijver on permanents of 0-1 matrices, corresponding to regular bipartite graphs, it is plausible that one can improve the lower bound a(p).
In this talk we present a conjectured lower bound b(p) of h(p), which indeed is better than a(p). We give a probabilistic motivation for the form of b(p).
We give numerical examples which test the conjectured function b(p). This is done by using the notion of pressure introduced by Friedland and Peled. Our case is relatively simple case of this theory, since it is reduced to the one dimensional subshift of finite type. Our computation are based on calculations of spectral radii of certain transfer matrices and their derivatives.
We demonstrate that a small modification of the multiplicative, additive and restricted additive Schwarz preconditioner at the algebraic level, motivated by optimized Schwarz methods defined at the continuous level, leads to a significant reduction in the iteration count of the iterative solver. In the context of optimized Schwarz methods this approach permits the use of low order subdomain preconditioners for high order spectral element discretizations, while retaining spectral accuracy in the solution. Numerical experiments using finite difference and spectral element discretizations of the modified Helmholtz problem illustrate the effectiveness of the new approach.
During the years Domain Decomposition (DD) methods have been proven very efficient and effective numerical techniques for the analysis of Partial Differential Equations. Among other advantages, it suffices to stress their parallelization ability, scalability, ability to systematically couple different numerical methods into hybrid schemes, efficient exploitation of geometrical redundancies and symmetries, and relaxing meshing and adaptive meshing strategies.
In the hart of every DD algorithm is the transmission condition (TC). This is the boundary condition imposed on the interfaces of subdomains to facilitate the information passing between domains. As it is natural, the choice of TC can lead to a successful or a divergent DD algorithm. Appropriately designed TCs have been used to optimize and accelerate convergence of the DD iteration. Such accelerations have been reported for elliptic and Helmholtz problems (M. Gander F. Magoules and F. and Nataf, SIAM SISC., 24, 38-50, 2002), by adopting Robin-type TCs with complex Robin constants.
Recently the authors (S.-C. Lee, M. N. Vouvakis, and J.-F. Lee, J. of Comput. Phy., 203, 1-21, 2005) have proven in the continuous level, that the situation for EM problems, that involve the vector curl-curl operator, is more complicated. With this insight developed in previous work and the powerful Fourier analysis machinery, new higher-order transmission conditions have been developed. Such TCs grand superior convergence properties since they damp both propagative and evanescent error fields.
We intend to present a variant of the Despres-Schwarz Algorithm for time-harmonic electromagnetic problems. In this talk new results for a non-conforming DD method based on higher-order TCs will be given. Convergence curves and computational times will be presented in order to demonstrate the power of the method. Results on real-world challenging radiation and scattering problems such as very large antenna arrays, antenna arrays mounted on large platforms, and antenna arrays in the presence of hybrid radomes and EBGs will be presented.
Discontinuous Galerkin methods for the Stokes problem with equal and mixed orders of polynomial approximation for the velocity and pressure have been recently developed (see e.g., [P. Hansbo and M. Larson, Comput. Methods Appl. Mech. Engrg., 191 (2002), pp. 1895-1905] and [D. Schotzau, C. Schwab, and A. Toselli, Math. Models Methods Appl. Sci., 13 (2003), pp. 1413-1436]). However, corresponding efficient numerical solution procedures for the unsteady Stokes problem have not yet been proposed. Our goal is to propose and test such a scheme as the foundation of an unsteady incompressible Navier-Stokes solver.
We have adopted the approximate algebraic splitting scheme (which has been successfully used in the context of the continuous Galerkin method) for the nodal high-order discontinuous Galerkin discretization of the unsteady Stokes problem. Among the available discontinuous Galerkin methods for treating the second order Laplacian, we use the interior penalty method introduced in [D. Arnold, SIAM J. Numer. Anal., 19 (1982), pp. 742-762]. The high-order nodal bases, defined over a standard tetrahedron, consist of Lagrange polynomials calculated based on the nodal set reported in [J. Hesthaven and C. Teng, SIAM J. Sci. Comput., 21 (2000), pp. 2352-2380].
In the proposed splitting scheme, a d-dimensional positive definite Hemholtz problem is solved for the velocity vector and an approximate consistent Poisson equation is solved for the pressure at each time step. For a typical case of a small viscosity and a small time step size, the Helmholtz problem is effectively preconditioned with a block diagonal preconditioner. The pressure system, on the other hand, requires a more sophisticated preconditioner. For this system, we devise a two-level additive Schwarz preconditioner with a local fine part and a global coarse part. The local part corresponds to the summation of the restrictions of the discretized consistent Poisson operator on each element of the triangulation. The global coarse part corresponds to the discretization of the Poisson operator (which is spectrally equivalent to the consistent Poisson operator, but has higher sparsity) on the same mesh but with lower approximation order k=1. We will present data showing the good performance of this preconditioner with respect to the different approximation orders.
We present a method for automatically selecting optimal implementations of sparse matrix-vector operations. Our software `AcCELS' (Accelerated Compress-storage Elements for Linear Solvers) involves a setup phase that probes machine characteristics, and a run-time phase where stored characteristics are combined with a measure of the actual sparse matrix to find the optimal kernel implementation. We present a performance model that is shown to be accurate over a large range of matrices.
A simple checkpoint-free fault-tolerant scheme for parallel iterative methods is given. Assuming that when one processor fails, all its data is lost and the system is recovered with a new processor, this scheme computes a new approximate solution from the data of the non-failed system. The iterative method is then restarted from this new vector. The main advantage of this technique over standard checkpoint is that there is no extra computation added in the iterative solver. In particular, if no failure occurs, the fault-tolerant application is the same as the original application. The main drawback is that the convergence after failure of the method is no longer the same as the original method. In this paper, we present this recovery technique as well as some implementations of checkpoints in iterative methods. Finally, experiments are presented to compare the two techniques. The fault tolerant MPI library is the FT-MPI library. Iterative linear solvers are considered. (Iterative eigensolvers are considered in [*]).
[*] George Bosilca, Zizhong Chen, Jack Dongarra, and Julien Langou. Recovery patterns for iterative methods in a parallel unstable environment. Technical Report UT-CS-04-538, Department of Computer Science, the University of Tennessee, December 2004.
Chebyshev polynomials have been recently proposed for designing public-key cryptosystems, which is shown insecure due to the inherent periodicity trait of trigonometric function. An attack can easily recover the corresponding plaintext from a given ciphertext. In the paper, semi-group property based on extended Chebyshev polynomials is used, and two schemes based on such polynomials can avoid this attack and improve security. The first algorithm works on real number and is quite efficient. The second is defined over finite fields and is as secure as RSA. Moreover key agreement, digital signature and entity authentication schemes based on the algorithms are given respectively. In the end, a single consolidate formula used in public-key cryptosystems is concluded.
Keywords: Extended Chebyshev polynomials, Security, Public-key encryptosystem, Entity authentication, Key exchange, Digital signature
The problem of finding good preconditioners for the numerical solution of a certain important class of indefinite linear systems is considered. These systems are of a saddle point structure
- - - - - - | A B^T || x | | c | | || |=| | | B -C || y | | d | - - - - - -where A \in R^{n x n}, C \in R^{m x m} are symmetric and B \in R^{m x n}.
In Constraint preconditioning for indefinite linear systems, SIAM J. Matrix Anal. Appl., 21 (2000), Keller, Gould and Wathen introduced the idea of using constraint preconditioners that have a specific 2 by 2 block structure for the case of C being zero. We shall extend this idea by allowing the (2,2) block to be non-zero. Results concerning the spectrum and form of the eigenvectors are presented, as are numerical results to validate our conclusions.
Key words. preconditioning, indefinite linear systems, Krylov subspace methods
In the present paper a dynamic separation problem is modeled and solved using Mixed Integer Nonlinear Programming (MINLP) techniques. The objective is to maximize the profit for continuous cyclic operation, and at the same time, to find the optimal configuration for the separation column system. The dynamics of the chromatographic separation process is modeled as a boundary value problem which is solved, within the optimization, using an iterative finite difference method. The separation of a fructose-glucose mixture is solved using the Extended Cutting Plane (ECP) method. It is shown that the production planning can be done efficiently for different purity requirements, such that all the output of a system can be utilized. Using a process design that is optimized it is thus possible to use existing complex systems, or to design new systems, more effciently and also to reduce the energy costs or the costs in general.
Multigrid methods are highly efficient solution techniques for large sparse multilevel Toeplitz systems which are positive semidefinite and ill-conditioned. In this paper, we develop multigrid methods which are especially designed for anisotropic two-level Toeplitz (BTTB) matrices. First, a method is described for systems with anisotropy along coordinate axes as a suitable combination of semicoarsening and full coarsening steps. Although the basic idea is known from the solution of PDEs, we present it here in a more formal way using generating functions and their level curves. This enables us not only to give a formal proof of convergence, but also to carry over the results to systems with anisotropy in other directions. After introducing new coordinates to describe these more complicated systems in terms of generating functions, we can solve them with the same efficiency. We hope that this will be useful in future applications such as the design of new multilevel preconditioners and solvers for the 2D-Helmholtz equation.
In this article we consider the problem of assigning parking slots to buses of different types so that the required buses can be dispatched easily in the morning. More precisely, if a bus of a certain type is needed at a given time, the buses that precede it in the lane must have departed already. Thus care must be taken to ensure that the buses arriving in the evening are parked in an order compatible with the types required for the morning departures. Maneuvers (i.e., rearrangements of buses within lanes) might be necessary to achieve this goal. Since the transit authorities need robust solutions to this problem (known as the dispatching problem in the literature), we formulate a model in which the depot lanes are filled according to specific patterns, called one-block or two-block patterns. We present two versions of this model, study their properties and show that some real-life instances can be solved within reasonable times by a commercial MIP solver.
Modern algorithms in numerical simulation need to combine efficient mathematical methods with concepts from computer science for nowadays computer architectures. In high-performance computing, memory access is a crucial factor and more important than pure cpu power. This memory boundedness can be reduced by utilizing modern hierarchical memory structures. However, many classical numerical methods lack the necessary data locality. In this paper we focus on the numerical solution of PDEs. First, we use a space-filling curve, the Peano curve, for the discretization and linearization of the domain. Then, we employ a system of stacks for processing the grid points linearly in a cache efficient way. In addition, this technique requires less memory for the organizational overhead, than for example a hierarchical approach using tree-structures.
This paper deals with the solution of a nonlinear ill-conditioned
inverse problem arising in digital image registration. In the first part
of the paper, we define the problem as the minimization of a regularized
non-linear least-squares functional, which measures the image
difference and smoothness of the transformation. We study inexact
Newton methods for solving this problem, i.e. we linearize the functional
around a current approximation and replace the Hessian by a
suitable operator in order to obtain well-posed subproblems in each
step of the iteration.
These anisotropic subproblems are solved using a multigrid solver. Due
to the anisotropy in the coefficients of the underlying equations, standard
multigrid solvers suffer from poor convergence rates. We discuss
modifications to the multigrid components, specifically to the smoothing
procedure, the interpolation and the coarse grid correction. Numerical
results that demonstrate the improvements obtained with these
new components are given.
Uninterruptible Power Supplies (UPS) systems, which supply emergency power in case of utility power failures, are used to interface critical loads such as computers and communication equipment to a utility power grid. Many discrete time controllers have been designed to control a single-phase inverter for use in UPS. In order to apply advanced control techniques in low coat micro controllers, a proper choice of the controller gains can be made off-line, which simplifies the control algorithm. Evolutionary algorithms are optimisation and search procedures inspired by genetics and the process of natural selection. This form of search evolves throughout generations improving the features of potential solutions by means of biologically inspired operations. On the ground of the structures undergoing optimisation the reproduction strategies, the genetic operators' adopted, evolutionary algorithms can be grouped in: evolutionary programming, evolution strategies, classifier systems, genetic algorithms and genetic programming. The genetic algorithms behave much like biological genetics. The genetic algorithms are an attractive class of computational models that mimic natural evaluation to solve problems in a wide variety of domains. A genetic algorithm comprises a set of individual elements (the population size) and a set of biologically inspired operators defined over the population itself etc. a genetic algorithms manipulate a population of potential solutions to an optimisation (or search) problem and use probabilistic transition rules. According to evolutionary theories, only the most suited elements in a population are likely to survive and generate offspring thus transmitting their biological heredity to new generations. In this paper, an adaptive LQR with repetitive controller (LQR-RP) for single-phase UPS application is designed. In the proposed controller, a RLS estimator identifies the plant parameters, which are used to compute the LQR gains periodically. The quadratic cost function parameters are chosen in order to reduce the energy of the control signal. The repetitive control action improves the response of the controller mainly in the presence of cyclic loads. It can be shown that using the repetitive controller with forgetting exponential coefficients, the control results can be improved significantly, which leads to an output voltage with low Total Harmonic Distortion (THD). There is not any efficient analytic mechanism to obtain the optimal values of the parameters of theses forgetting coefficients. So, we have to use intelligent optimization methods. Genetic Algorithms (GAs) is one the most efficient methods for this application. In this paper, using GAs the optimal tuning of repetitive controller's parameters is performed. Simulation results on the closed-loop system with obtained parameters conform the ability of the proposed method to achieve an output signal to low and admissible THD.
In this paper, we derive recursive and closed formulae for a family of root-finding algorithms that, contrary to typical algorithms, have higher order of convergence for multiple roots than for simple roots.
Key words: root-finding, order of convergence, symmetric functions, generating functions
Two-dimensional phase unwrapping is the problem of deducing unambiguous "phase" from values known only modulo 2 \pi. Many authors agree that the objective of phase unwrapping should be to find a weighted minimum of the number of places where adjacent (discrete) phase values differ by more than \pi. This NP-hard problem is of considerable practical interest, largely due to its importance in interpreting data acquired with synthetic aperture radar (SAR) inter-ferometry. Consequently, many heuristic algorithms have been proposed. Here we present an efficient multi-level graph algorithm for the approximate solution of an equivalent problem -- minimal residue cut in the dual graph.
The significant gap between peak and realized performance of parallel systems motivates the need for performance analysis. In order to predict the performance of a class of parallel multilevel ILU preconditioners (PBILUM), we build two performance prediction models for both the preconditioner construction phase and the solution phase. These models combine theoretical features of the preconditioners with estimates on computational cost, communications overhead, etc. Experimental simulations show that our model prediction based on certain reasonable assumptions is close to the simulation results. The models may be used to predict the performance of this class of parallel preconditioners.
A certain regularization technique for contact problems leads to a family of problems that can be solved efficiently using infinite-dimensional semismooth Newton methods, or in this case equivalently, primal-dual active set strategies. We present two procedures that use a sequence of regularized problems to obtain the solution of the original contact problem: first-order augmented Lagrangian, and path-following methods. The first strategy is based on a multiplier-update, while pathfollowing with respect to the regularization parameter uses theoretical results about the path value function to increase the regularization parameter appropriately. Comprehensive numerical tests investigate the performance of the proposed strategies for both a 2D as well as a 3D contact problem.