Electron Paramagnetic Resonance, Optimization and Automatic Differentiation of Algorithms
by
Edgar Soulié
CEA Saclay (FRANCE)
Coauthors: Christèle Faure (INRIA-Sophia Antipolis, FRANCE), Théo Berclaz (Université de Genève, SWITZERLAND)
Electron paramagnetic resonance (EPR) spectroscopy is a method for investigating the behavior, in an applied magnetic field, of samples containing unpaired electrons (free radicals, or compounds comprising an ion whose outer electronic shell is incomplete). It consists in recording the microwave energy absorbed by the sample as a function of the applied magnetic field. Two kinds of information may usually be obtained by this technique. The first is a tensor characterizing the so-called Zeeman interaction, that is, the interaction of the applied magnetic field with the unpaired electron(s). The second consists of tensors characterizing the interaction of the unpaired electron(s) with each type of nucleus possessing an intrinsic magnetic moment, or spin, provided that this interaction is reflected by specific lines in the EPR spectrum.
The interpretation of the results obtained from EPR spectroscopy relies on a theory derived from quantum mechanics. This theory describes the variation of the resonance field as a function of the orientation of the microscopic resonating species in the applied magnetic field and of adjustable parameters. The values of the magnetic field that fulfill the "resonance condition" may be determined by a numerical iterative procedure or by approximate formulae. They may also be obtained from analytic formulae published by Iwasaki, which are valid under assumptions that are generally satisfied.
The quantitative analysis of the variation of the resonance field with orientation leads to accurate values of the adjustable parameters in the case of a "single crystal", that is, a regular arrangement of atoms or molecules in the three directions of space. However, many interesting materials that lend themselves to EPR investigations are disordered (powders, frozen solutions, etc.), and their EPR spectra do not vary with the orientation of the sample in the magnetic field. For such materials, the spectrum of a sample is the sum of the spectra of the microscopic resonating species.
Provided that the theory sketched above for a single crystal can be extended to encompass the intensities, shapes and widths of the resonance lines, a spectrum of the sample may be predicted for a given set of spectroscopic parameters.
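To make the idea of a predicted spectrum for a disordered sample concrete, here is a minimal Python sketch. It is not the authors' simulator. It assumes an axially symmetric Zeeman interaction only (no hyperfine interaction), a Lorentzian absorption line of fixed width, and no orientation dependence of the transition probability. The names `resonance_field`, `lorentzian` and `powder_spectrum`, as well as the parameters `g_par`, `g_perp` and `width`, are hypothetical and chosen for illustration.

```python
import math

H = 6.62607015e-34        # Planck constant, J s
MU_B = 9.2740100783e-24   # Bohr magneton, J/T


def resonance_field(theta, g_par, g_perp, freq_hz):
    """Field (tesla) satisfying h*nu = g(theta)*mu_B*B for an axial g tensor."""
    g = math.sqrt((g_par * math.cos(theta)) ** 2 + (g_perp * math.sin(theta)) ** 2)
    return H * freq_hz / (g * MU_B)


def lorentzian(b, b0, width):
    """Unit-area Lorentzian absorption line centered at b0 (assumed line shape)."""
    return (width / math.pi) / ((b - b0) ** 2 + width ** 2)


def powder_spectrum(fields, g_par, g_perp, freq_hz, width, n_theta=200):
    """Sum single-orientation spectra over orientations, weighted by sin(theta)."""
    spectrum = [0.0] * len(fields)
    d_theta = (math.pi / 2) / n_theta
    for k in range(n_theta):
        theta = (k + 0.5) * d_theta            # midpoint rule on [0, pi/2]
        weight = math.sin(theta) * d_theta     # solid-angle weight
        b0 = resonance_field(theta, g_par, g_perp, freq_hz)
        for i, b in enumerate(fields):
            spectrum[i] += weight * lorentzian(b, b0, width)
    return spectrum
```

For example, `powder_spectrum([0.32 + 1e-4 * i for i in range(401)], 2.2, 2.0, 9.5e9, 5e-4)` returns the absorption intensity on a 401-point field grid for an assumed microwave frequency of 9.5 GHz. A realistic simulator would add the hyperfine terms and line intensities mentioned above.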
As in many fields of science and technology, the inverse problem consists in determining the values of the spectroscopic parameters for which the calculated spectrum best matches the observed spectrum. Minimizing the sum of squares of the differences between the intensities of the observed and calculated spectra provides the means of fitting the spectroscopic parameters. We have chosen to use a software package named BSOLVE to perform the non-linear least-squares optimization. It is written in Fortran 77 and implements the Levenberg-Marquardt algorithm. In this algorithm, at each iteration, the determination of an appropriate displacement in the parameter space R^n relies on the derivatives of the calculated spectrum at each sampled point with respect to the adjustable parameters. Since the calculation of the spectrum is complicated, deriving analytical formulae for its derivatives by hand would be lengthy and error-prone. In the original version of this optimizer, the Jacobian matrix of the spectrum with respect to the adjustable parameters is approximated by divided differences. We observed that applying this optimizer to our data sets often results in an abnormal end of the optimization, indicated by the diagnostic "improvement of the function impossible" supplied by the optimizer. In such a case, a doubt remains about the significance of the final values of the spectroscopic parameters. We hypothesize that these abnormal stops of the optimizer are caused by the approximation of the Jacobian matrix by divided differences. Automatic differentiation of algorithms (AD) was undertaken in order to avoid this approximation and to obtain a better displacement of the current point in R^n at each iteration. For that purpose, the simulator is differentiated using the automatic differentiator ODYSSÉE in direct (forward) mode.
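The role of the Jacobian matrix in a Levenberg-Marquardt iteration can be illustrated with a minimal Python sketch. This is not BSOLVE. The function `model` is a hypothetical two-parameter stand-in for the spectrum simulator, and the step `h`, the damping schedule and all function names are assumptions made for illustration. The Jacobian is approximated by forward divided differences, as in the original optimizer.

```python
import math


def model(params, xs):
    """Hypothetical stand-in for the simulator: a * exp(-b * x)."""
    a, b = params
    return [a * math.exp(-b * x) for x in xs]


def residuals(params, xs, ys):
    return [c - o for c, o in zip(model(params, xs), ys)]


def jacobian_divided_differences(params, xs, h=1e-6):
    """jac[i][j] approximates d model_i / d params_j (step h is an assumption)."""
    base = model(params, xs)
    cols = []
    for j in range(len(params)):
        shifted = list(params)
        shifted[j] += h
        cols.append([(p - b) / h for p, b in zip(model(shifted, xs), base)])
    return [[cols[j][i] for j in range(len(params))] for i in range(len(xs))]


def solve(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for k in range(n):
        piv = max(range(k, n), key=lambda r: abs(m[r][k]))
        m[k], m[piv] = m[piv], m[k]
        for r in range(k + 1, n):
            f = m[r][k] / m[k][k]
            for c in range(k, n + 1):
                m[r][c] -= f * m[k][c]
    x = [0.0] * n
    for k in reversed(range(n)):
        s = sum(m[k][c] * x[c] for c in range(k + 1, n))
        x[k] = (m[k][n] - s) / m[k][k]
    return x


def levenberg_marquardt(params, xs, ys, iterations=200, lam=1e-3):
    """Minimize the sum of squared residuals; returns (params, cost)."""
    params = list(params)
    n = len(params)
    cost = sum(r * r for r in residuals(params, xs, ys))
    for _ in range(iterations):
        r = residuals(params, xs, ys)
        jac = jacobian_divided_differences(params, xs)
        jtj = [[sum(row[p] * row[q] for row in jac) for q in range(n)] for p in range(n)]
        jtr = [sum(row[p] * ri for row, ri in zip(jac, r)) for p in range(n)]
        # Marquardt damping: inflate the diagonal of J^T J by a factor (1 + lam).
        damped = [[jtj[p][q] * (1.0 + lam if p == q else 1.0) for q in range(n)]
                  for p in range(n)]
        step = solve(damped, [-g for g in jtr])
        trial = [p + s for p, s in zip(params, step)]
        trial_cost = sum(t * t for t in residuals(trial, xs, ys))
        if trial_cost < cost:      # accept the step and trust the model more
            params, cost, lam = trial, trial_cost, lam / 10
        else:                      # reject the step and damp more strongly
            lam *= 10
    return params, cost
```

With synthetic data `ys = model([2.0, 0.5], xs)` and a starting point such as `[1.0, 1.0]`, the loop should recover parameters close to the generating values. Every displacement is computed from `jac`, so any inaccuracy in the divided differences feeds directly into the direction of the step.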
ODYSSÉE provides us with a new program unit that computes both the spectrum and one directional derivative of the spectrum, namely its derivative with respect to one adjustable parameter. We chose to modify the optimizer BSOLVE as little as possible: the modified version of BSOLVE calls either this new program unit or the original simulator to compute the Jacobian matrix line by line.
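The idea of a unit that returns the spectrum together with one directional derivative can be sketched in Python with dual numbers. This is a different implementation technique from ODYSSÉE, which transforms Fortran source code, but it is also forward-mode automatic differentiation. The class `Dual`, the single-line Lorentzian `spectrum_point` and the layout of the Jacobian (one line per parameter) are assumptions made for illustration.

```python
class Dual:
    """A value together with one directional derivative (forward mode)."""

    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, o):
        o = Dual.lift(o)
        return Dual(self.val + o.val, self.dot + o.dot)

    __radd__ = __add__

    def __sub__(self, o):
        o = Dual.lift(o)
        return Dual(self.val - o.val, self.dot - o.dot)

    def __rsub__(self, o):
        return Dual.lift(o) - self

    def __mul__(self, o):
        o = Dual.lift(o)
        return Dual(self.val * o.val, self.dot * o.val + self.val * o.dot)

    __rmul__ = __mul__

    def __truediv__(self, o):
        o = Dual.lift(o)
        return Dual(self.val / o.val,
                    (self.dot * o.val - self.val * o.dot) / (o.val * o.val))


def spectrum_point(params, x):
    """Hypothetical simulator: one Lorentzian line (amplitude, center, width)."""
    amp, center, width = params
    d = x - center
    return amp * width * width / (d * d + width * width)


def simulate_with_derivative(params, xs, j):
    """Return the spectrum and its derivative with respect to parameter j."""
    seeded = [Dual(p, 1.0 if k == j else 0.0) for k, p in enumerate(params)]
    out = [spectrum_point(seeded, x) for x in xs]
    return [o.val for o in out], [o.dot for o in out]


def jacobian_by_parameter(params, xs):
    """Build the Jacobian one parameter at a time: jac[j][i] = d spectrum_i / d p_j."""
    spectrum, jac = None, []
    for j in range(len(params)):
        spectrum, line = simulate_with_derivative(params, xs, j)
        jac.append(line)
    return spectrum, jac
```

Each call to `simulate_with_derivative` recomputes the whole spectrum, so the full Jacobian costs one differentiated simulation per adjustable parameter. The derivatives it returns are exact up to rounding, with no step size to choose.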
To validate our approach, we compare optimization processes using the same optimizer on two cases: a simple test with two hyperfine interactions, and a complex one with five hyperfine interactions. We compare the behavior of the optimizer when the Jacobian matrix is computed by automatic differentiation and when it is computed by divided differences. We knew beforehand that the simulators were numerically unstable, so we have also compared the results obtained in single and in double precision. Since a full Jacobian matrix must be computed for the Levenberg-Marquardt algorithm, we know beforehand that no gain in execution time can be expected on a single optimization step. But we hope that the exactness of the Jacobian matrix computed by automatic differentiation changes the behavior of the optimizer and decreases the number of optimization steps. The results of the tests show that using AD allows the optimizer to reach a solution in single precision, whereas no solution is computed using divided differences. The final value of the objective function is smaller with AD than with divided differences. The behavior of the optimizer throughout the optimization process is also better: fewer parameters need to be adjusted from one step to the next, and fewer intermediate calls to the simulator are made. All of this indicates that the optimizer is better guided, from one step to the next, by the Jacobian matrix computed by automatic differentiation than by the one computed by divided differences. We also investigate the influence of the method on the optimization time: we know for sure that computing the Jacobian matrix is cheaper with divided differences than with automatic differentiation. The tests show that the number of optimization steps is smaller with AD than with divided differences. For the optimization process as a whole to run faster, the reduction in the number of steps must outweigh the higher cost of each Jacobian evaluation.
On the simple test, using AD increases the execution time, whereas on the complex one it reduces it; we are investigating this further by running the optimizer on other tests.
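The sensitivity of divided differences to working precision, which is relevant to the single-precision results reported above, can be illustrated with a small Python sketch. It is not taken from the authors' tests: the function differentiated (`math.exp`), the evaluation point and the step sizes are assumptions, and single precision is emulated by rounding each intermediate result through `struct`.

```python
import math
import struct


def f32(x):
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def forward_difference(func, x, h, rnd=lambda v: v):
    """Forward divided difference, rounding every intermediate result with rnd."""
    x, h = rnd(x), rnd(h)
    f0 = rnd(func(x))
    f1 = rnd(func(rnd(x + h)))
    return rnd(rnd(f1 - f0) / h)


def derivative_errors(h, x=1.0):
    """Absolute errors of the divided difference of exp at x, in double and single."""
    exact = math.exp(x)  # the derivative of exp is exp itself
    err_double = abs(forward_difference(math.exp, x, h) - exact)
    err_single = abs(forward_difference(math.exp, x, h, rnd=f32) - exact)
    return err_double, err_single


if __name__ == "__main__":
    for h in (1e-2, 1e-4, 1e-6):
        d, s = derivative_errors(h)
        print(f"h={h:g}  double error={d:.3e}  single error={s:.3e}")
```

With a small step, the subtraction cancels most of the significant digits available in single precision, so the estimate can only take coarsely spaced values. A derivative propagated by automatic differentiation involves no such subtraction, which is one plausible reason why it behaves better in single precision.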
The hypothesis that the abnormal stops of the optimizer come from the lack of accuracy of the Jacobian matrix computed by divided differences is confirmed: when the optimizer is provided with exact values of this Jacobian matrix, a solution is reached in all our tests. As for the gain in execution time, further tests remain to be performed.
Date received: December 22, 1999
Copyright © 1999 by the author(s). The author(s) of this document and the organizers of the conference have granted their consent to include this abstract in Atlas Conferences Inc. Document # cads-20.