
EasyDD: A Program for Batch Processing and Visualization of Powder Diffraction Data

Taha Sochi

2009

ScienceWare, PO Box 293, London, N1 5UY, UK. Email: [email protected].

Contents
1 Abstract
2 Introduction
3 EasyDD
3.1 Main Window
3.2 Tab Widget
3.3 2D Plotter
3.4 Form
3.5 3D Plotter
4 Modules of EasyDD
4.1 Curve-Fitting
4.2 Back Projection
5 Acknowledgement
6 References
7 Nomenclature

List of Figures
1 EasyDD main window
2 EasyDD main window with tab tooltip, tab context menu and list widget after reading data and performing batch fitting
3 Plotter
4 Spreadsheet form
5 The 3D graph of intensity as a function of the voxel position in a tab

List of Tables
1 Some of the common shape functions used in EasyDD to describe individual peak profiles
2 Statistical indicators for powder diffraction pattern refinement

1 Abstract

In this article we report the release of a new program for batch processing and visualization of powder diffraction data. The program, which is free of charge for non-commercial use and can be obtained with its detailed documentation from our website www.scienceware.net, is currently in use by a number of researchers at University College London, the University of Manchester, Utrecht University in the Netherlands, the European Synchrotron Radiation Facility (ESRF), and Diamond Light Source. The software is designed for the treatment of large volumes of powder diffraction data, especially those obtained from the new generation of synchrotron detectors. The program has great potential for future development into a workbench for powder diffraction work.

2 Introduction

The principal objective of the EasyDD project is to develop a computer code for batch refining and bulk analysis of large volumes of diffraction data, mainly those obtained from synchrotron radiation sources. Such a facility greatly assists studies on various materials systems and enables far larger detailed data sets to be rapidly interrogated and refined. Modern detectors coupled with the high intensity X-rays available at synchrotrons have led to the situation where data sets can be collected in ever shorter time scales and in ever larger numbers. Such large volume data sets pose a data processing problem which is set to grow with current and future instrument development. So far, EasyDD has achieved a number of its original objectives. In a recent assessment by the author of the code (i.e. the author of this article) it was concluded that the time has come for the program to be released for use by the wider scientific community, in particular powder diffractionists and synchrotron users.

3 EasyDD

EasyDD is a high-throughput software package to manage, process, analyse and visualise powder diffraction X-ray data, especially those obtained from synchrotron facilities such as Diamond [2009] and ESRF [2009]. The name EasyDD comes from the original name EasyEDD, which was adopted for historical reasons as the software was developed initially for the users of station 16.4 of the Daresbury Synchrotron Radiation Source (SRS) [2008], which was closed in August 2008. As the program has since become more general and can be used for processing Angle Dispersive Diffraction (ADD) data as well as Energy Dispersive Diffraction (EDD) data, the name EasyDD was adopted to reflect this extension. In fact, nothing restricts the program to synchrotron and powder diffraction applications: it is sufficiently general to process any data having a supported format. One of these formats is a generic x-y format which can be used for any data type.

The program is written in the C++ programming language and uses a hybrid approach of procedural and object-oriented programming methodologies. Its main purpose is to process large quantities of data files with ease and comfort using limited time and computing resources. EasyDD combines Graphic User Interface (GUI) technology (e.g. wizards, dialogs, tooltips, color coding, context menus, and so forth) with standard scientific computing techniques. The ultimate objective for EasyDD is to become a workbench for batch processing and analysis of scientific data, especially from synchrotron and powder diffraction applications.

Currently, five input data file formats are supported, and the code can be easily extended to support other data formats. The currently supported formats are:

1. Generic x-y format: this is a simple x-y format where the first column of the data file contains the x values (e.g. energy, channel number or scattering angle) while the second column contains the y values (usually count rate). The rest of each line (which can contain other data such as count error) is ignored. This format is very general and can be used for reading, mapping and analysing a variety of data files.

2. Diamond MCA format: this format is used by some detectors at the Diamond Light Source of the Rutherford Appleton Laboratory (RAL). The file contains the y values (count rate) only, as a function of an implicit channel number, with possible redundant header and footer lines.

3. ESRF MCA format: this is one of the formats in use by the ESRF detectors. The files have only headers to be ignored, and the data entries are written 16 per text line, with each line ending in a backslash (\). Again, the data in the ESRF MCA files are intensity versus implicit channel number.

4. ERD detector format: this is the format of the 2D Energy Resolving Detector (ERD). Each row in the ERD files corresponds to an event taking place (a photon being detected by a specific pixel). The main data in each row are the voltage value corresponding to the energy of the event and the number of the pixel in which the event took place. The pixels are arranged in a 2D array, and hence a map file that depicts the physical layout of the detector channels is required. The map file contains the number of rows, the number of columns and the voltage bin size. This is followed by the channel index map, which mimics the physical layout of the 2D detector. All the lines with the same pixel number are grouped together and the energies for each pixel are processed as spectra. The x-values of these spectra are the bin numbers while the y-values are the count rates (number of events in a specific bin). Because the recorded energy is not quantised, a binning process is applied before creating a spectrum. The bin size is determined by the user and is input through the map file. The routine that implements this operation is general with regard to the detector dimensions, so it can be used, for example, for a hypothetical 100 × 110 detector.

5. SRS 16.4 format: this is the format of the data files obtained from station 16.4 of the Daresbury Synchrotron Radiation Source in EDD mode. These files are classified as scalars and vectors, where the scalar files contain data about the diffraction measurements while the vector files contain the actual measurements. Each vector file contains a number of EDD spectra (about 20) stored as intensity versus implicit channel number. The number of channels depends on the type of Multi-Channel Analyser (MCA) detector in use (normally 4000). Each set of measurements consists of a single scalar file and three vector files, one from each of the three TEDDI detectors.

One of the main functionalities of EasyDD in its current state is to read and map data files. All that is required from the user is to deposit the data files of a particular format in a directory and invoke the relevant read data function. In the case of SRS files, where the data files have a highly structured format, the files are read and automatically recognised (e.g. SRS, scalars or vectors); non-SRS data files in the source directory therefore have no effect, as they are recognised and ignored. For the other four formats, where the structure is relatively generic and cannot be automatically recognised by the program, the program relies on the user to identify the files. The program therefore assumes that all the files in the source directory are valid data files of the selected format.

On reading the data files, the data are stored in memory and mapped on a 2D color-coded tab, as seen in Figure (2). Multiple tabs from different data sources can be created at the same time. The tabs can also be removed collectively or individually in any order. In the following sections we outline the main components of EasyDD.
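As an illustration of two of the formats described above, the following is a minimal sketch in Python (not the C++ code of EasyDD itself) of reading generic x-y lines and of binning ERD events into per-pixel spectra. The function names `parse_xy` and `bin_erd_events`, the representation of an event as a `(voltage, pixel)` pair, and the decision to skip lines that do not start with two numbers are all assumptions made for this example.

```python
from collections import defaultdict


def parse_xy(lines):
    """Read generic x-y data: column 1 is x, column 2 is y, the rest is ignored.

    Lines that do not start with two numbers are skipped (an assumption made
    here so that header or comment lines do not break the sketch).
    """
    xs, ys = [], []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue
        try:
            x, y = float(fields[0]), float(fields[1])
        except ValueError:
            continue
        xs.append(x)
        ys.append(y)
    return xs, ys


def bin_erd_events(events, bin_size, n_bins):
    """Group (voltage, pixel) events by pixel and histogram the voltages.

    Returns a dict mapping pixel number to a list of counts, where the list
    index is the bin number (x) and the stored value is the event count (y).
    """
    spectra = defaultdict(lambda: [0] * n_bins)
    for voltage, pixel in events:
        b = int(voltage // bin_size)
        if 0 <= b < n_bins:  # events outside the binned range are dropped
            spectra[pixel][b] += 1
    return dict(spectra)


if __name__ == "__main__":
    print(parse_xy(["1.0 10 0.5", "2.0 20"]))
    print(bin_erd_events([(0.12, 3), (0.18, 3), (0.31, 3), (0.05, 7)], 0.1, 4))
```

In this sketch the bin size plays the role of the voltage bin size that, according to the description above, the user supplies through the map file.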

3.1 Main Window

This is a standard GUI window with menus, toolbars, a status bar, context menus and so on. Figure (1) displays a screenshot of the main window. The basic functionality of the main window is to serve as a platform for accessing and managing the other components with their specic functionalities.
Figure 1: EasyDD main window. Labelled elements: menus, toolbars, tab widget, tab widget width and height spin boxes, grid view button, multi/single color coding view button, context menu and status bar.

The main window has several menus, which are the main access tool to the program utilities. Some of these menus also have submenus. The main window menus contain all the principal items of the program with their icons and shortcuts. The data in any tab can also be exported in a generic text format for use in other applications, such as Excel, or for archiving, especially after introducing corrections and modifications. Intensities, refined parameters and statistical residuals after curve-fitting can also be exported separately to generic text files for further processing or for visualisation by other applications such as MATLAB. The written data are structured in a 2D matrix with the same dimensions as the tab.


The spectrum of each voxel, as represented by a cell in a tab, can be plotted using a 2D plotter. The sum of all patterns in a tab can also be plotted by the 2D plotter to compare with the individual patterns and to characterise the overall behaviour. Rows, columns and cells in each tab can be managed and manipulated by 12 functions which include delete, copy, exchange, rotate clockwise and anticlockwise, and so on. Moreover, the tab can be transposed by reflection across the y = x line.
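Three of the tab operations mentioned above (rotation in either direction and transposition) can be sketched as follows for a tab stored as a list of rows. This is an illustrative Python sketch, not the EasyDD implementation; the function names are hypothetical, and the reflection across the y = x line is assumed here to correspond to the ordinary matrix transpose.

```python
def rotate_clockwise(tab):
    """Rotate a 2D grid (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*tab[::-1])]


def rotate_anticlockwise(tab):
    """Rotate a 2D grid (list of rows) by 90 degrees anticlockwise."""
    return [list(row) for row in zip(*tab)][::-1]


def transpose(tab):
    """Swap rows and columns of a 2D grid."""
    return [list(row) for row in zip(*tab)]


if __name__ == "__main__":
    tab = [[1, 2, 3],
           [4, 5, 6]]
    print(rotate_clockwise(tab))
    print(rotate_anticlockwise(tab))
    print(transpose(tab))
```

Each function returns a new grid rather than modifying its argument, so a rotation followed by the opposite rotation recovers the original tab.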

3.2 Tab Widget

This is a widget (Figures 1 and 2) that can accommodate a number of 2D color-coded scalable tabs for voxel mapping, with graphic and text tooltips to show all essential file and voxel properties. On double-clicking a cell in a tab, the plotter is launched, if it is not running already, and the pattern of the selected cell is plotted. The plotter is then updated dynamically as the mouse cursor hovers over the tab, so that it always displays the pattern of the cell under the cursor. The tabs also have a context menu for managing the rows, columns and cells in the tab.


Figure 2: EasyDD main window with tab tooltip, tab context menu and list widget after reading data and performing batch fitting. Labelled elements: tab, list widget, color bar, tab context menu and tab tooltip.

3.3 2D Plotter

This is a 2D plotter to obtain a graph of intensity for any voxel in the current tab by clicking on its cell. It is also used to create basis functions and forms for curve-fitting. A screenshot of the plotter is shown in Figure (3). The 2D plotter capabilities include:

- Creating and drawing fitting basis functions (polynomials of order up to 6 that pass through a number of selected points, and Gauss, Lorentz and pseudo-Voigt profiles, as presented in Table 1) by mouse click or by press-and-drag actions. The fitting basis functions can also be modified and removed.

- Non-linear least squares curve-fitting by the Levenberg-Marquardt algorithm.

- Saving the plotter image in a number of different formats (e.g. png, bmp, jpg, jpeg, xpm and pdf).

Figure 3: The 2D plotter.

Table 1: Some of the common shape functions used in EasyDD to describe individual peak profiles. $A$ is the area under the peak, $F$ is the FWHM, $X$ is the position of the peak and $m$ is a dimensionless mixing factor ($0 \le m \le 1$).

Function | Equation
Gaussian | $G = \frac{2A}{F}\sqrt{\frac{\ln(2)}{\pi}}\,\exp\left[-\frac{4\ln(2)\,(x-X)^2}{F^2}\right]$
Lorentzian | $L = \frac{2A/(\pi F)}{1 + 4(x-X)^2/F^2}$
Pseudo-Voigt | $V = mL + (1-m)G$
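The shape functions of Table 1 translate directly into code. The following Python sketch is given for illustration only (EasyDD itself is written in C++); the function names are hypothetical, while the parameters A, F, X and m have the meanings given in the table caption.

```python
import math


def gaussian(x, A, F, X):
    """Gaussian profile with area A, FWHM F and peak position X."""
    norm = (2.0 * A / F) * math.sqrt(math.log(2.0) / math.pi)
    return norm * math.exp(-4.0 * math.log(2.0) * (x - X) ** 2 / F ** 2)


def lorentzian(x, A, F, X):
    """Lorentzian profile with area A, FWHM F and peak position X."""
    return (2.0 * A / (math.pi * F)) / (1.0 + 4.0 * (x - X) ** 2 / F ** 2)


def pseudo_voigt(x, A, F, X, m):
    """Linear mix of Lorentzian (weight m) and Gaussian (weight 1 - m)."""
    if not 0.0 <= m <= 1.0:
        raise ValueError("mixing factor m must lie in [0, 1]")
    return m * lorentzian(x, A, F, X) + (1.0 - m) * gaussian(x, A, F, X)


if __name__ == "__main__":
    A, F, X = 3.0, 2.0, 5.0
    # At half a FWHM from the peak position both profiles fall to half height.
    print(gaussian(X + F / 2, A, F, X) / gaussian(X, A, F, X))
    print(lorentzian(X + F / 2, A, F, X) / lorentzian(X, A, F, X))
```

With this parameterisation the value at x = X ± F/2 is half the peak height for both profiles, which is a convenient check that F really is the full width at half maximum.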

3.4 Form

The spreadsheet form is mainly used for batch curve-fitting. The idea is that a form is conveniently prepared within the plotter and saved to the computer hard disc or any other suitable storage medium. It is then imported to the main window for batch fitting a number of cells or tabs in the tab widget or a number of data sets stored in a directory. The form has two modes of launch, one from the plotter and the other from the main window. In the first mode the form interacts with the plotter: a form started from the plotter automatically lists the existing basis functions in the plotter, while the plotter is updated when the parameters of the basis functions are modified in the form. In the second mode the form serves as a fitting model for batch curve-fitting. A screenshot of the spreadsheet form is shown in Figure (4). The form consists of a number of menus, toolbars and a status bar, similar to the ones in the main window. The form also has a number of columns that contain the data required for curve-fitting, such as the data range; initial fitting parameters; upper and lower limits, boolean flags and values for applying restrictions on the refined parameters when they exceed acceptable limits; and counters and boolean flags for controlling the number of iteration cycles and the parameters to be refined in the least squares fitting process.
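To make the role of these columns concrete, the sketch below models one parameter entry of a form as a small Python data structure. It is purely illustrative: the class name `FormParameter`, its field names, and the policy of replacing an out-of-range value by a stored reset value are assumptions for this example and are not taken from the EasyDD source.

```python
from dataclasses import dataclass


@dataclass
class FormParameter:
    """One hypothetical parameter entry of a fitting form."""
    name: str
    initial: float             # initial fitting value
    lower: float               # lower acceptable limit
    upper: float               # upper acceptable limit
    refine: bool = True        # flag: refine this parameter or keep it fixed
    restrict: bool = False     # flag: apply the restriction when out of range
    reset_value: float = 0.0   # value used when the restriction is applied

    def apply_restriction(self, value):
        """Return value unchanged, or reset_value if restricted and out of range."""
        if self.restrict and not (self.lower <= value <= self.upper):
            return self.reset_value
        return value


if __name__ == "__main__":
    fwhm = FormParameter("F", initial=2.0, lower=0.5, upper=5.0,
                         restrict=True, reset_value=2.0)
    print(fwhm.apply_restriction(3.1))   # inside the limits
    print(fwhm.apply_restriction(12.0))  # outside the limits
```

A form would then be a list of such entries, one per refinable quantity of each basis function, together with the data range and the iteration counters.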

Figure 4: The spreadsheet form.

3.5 3D Plotter

The 3D plotter is used to create a 3D graph of the current tab where the x and y axes stand for the numbers of rows and columns respectively, while the z-axis represents the intensity (count rate) of the voxel, as shown in Figure (5). To keep the proportionality of the plot dimensions, the intensity is automatically scaled by a scale factor which is displayed beside the tab number. The 3D plot can be adjusted by re-scaling the z-axis by a factor in the range 0.01 to 100. The plotter responds to the adjustment dynamically. On performing batch refinement in one of its various modes, the residuals and the refined parameters can also be visualised by the 3D plotter.

Figure 5: The 3D graph of intensity as a function of the voxel position in a tab.

4 Modules of EasyDD

Currently, there are two main modules implemented in EasyDD: curve-fitting and back projection.

4.1 Curve-Fitting

One of the main modules of EasyDD is curve-fitting by least squares using the Levenberg-Marquardt algorithm. Curve-fitting can be performed on a single pattern from the plotter, or as a single batch process over multiple patterns, or as a multiple batch process over multiple forms and data sets from the main window. Curve-fitting can be done on a single peak or on multiple peaks using any number of basis functions, with or without polynomial background modeling. The range of the data to be fitted can also be selected graphically. Some relevant statistical indicators for the fitting process, as outlined in Table (2), are also computed in the curve-fitting routine.

It should be remarked that curve-fitting is a general module that can be used for general spectral analysis in any application. Hence, EasyDD can be used for curve-fitting any data type from disciplines other than powder diffraction and synchrotron radiation, such as astronomical data or other spectral data. As indicated already, there are three main curve-fitting modes:

1. Single Curve-Fitting: this is carried out within the plotter on a single pattern, i.e. on the pattern belonging to the cell that has been plotted on double-clicking the cell. Single curve-fitting applies to the x-range in the current view of the plotter.

2. Batch Curve-Fitting: this operation mode is carried out from the main window. Diffraction pattern data and a spreadsheet form that contains the basis fitting functions should be loaded before running this operation. Batch curve-fitting applies to the x-range of the loaded form. Curve-fitting in this mode can be performed on a single pattern, or on a number of arbitrarily selected patterns from one tab or a number of tabs. It can also be performed on all cells in a tab or a number of tabs. Batch curve-fitting can also be applied in an overlapping mode by applying different fitting models to different parts of a tab. The fitting results can then be written to a text file in an overlapping mode. On running this mode of batch curve-fitting, a list widget is created for each refined tab. These widgets contain items which include intensity, statistical indicators and refinement parameters. On selecting any one of these items, the current tab color coding and color bar change to display the variation of the selected item. The 3D plotter will also display the selected item.

3. Multiple Batch Curve-Fitting: this operation mode is carried out from the main window. It is an automated multiple batch curve-fitting process that is particularly useful for large-scale curve-fitting operations. The idea of this routine is to deposit a number of prepared forms and the folders containing the data sets to be refined in a directory. The program then carries out an automated batch curve-fitting operation in which each data set is fitted to each individual form and the fitting data are saved in a structured directory tree. The residuals and the refined parameters files, alongside the intensity and other results files for each fitting, are saved in a directory bearing the name of the fitting form inside the directory of the particular data set. A report that contains informative messages about the outcome of the fitting processes is written to a file. Using this operation in place of repeated manual application of the batch curve-fitting process reduces the scope for operator error and can save a large amount of time.
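To illustrate the kind of computation performed in each fit, the following is a minimal, self-contained Python sketch of Levenberg-Marquardt least squares applied to a single Gaussian peak. It is not the EasyDD implementation: the function names, the forward-difference Jacobian, the multiply-or-divide-by-ten damping schedule and the unweighted residuals are all assumptions chosen to keep the example short.

```python
import math


def gaussian_peak(x, A, F, X):
    """Gaussian with area A, FWHM F and position X."""
    norm = (2.0 * A / F) * math.sqrt(math.log(2.0) / math.pi)
    return norm * math.exp(-4.0 * math.log(2.0) * (x - X) ** 2 / F ** 2)


def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = aug[r][n] - sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = s / aug[r][r]
    return x


def levenberg_marquardt(model, xs, ys, p0, iterations=100, lam=1e-3):
    """Minimise the sum of squared residuals; returns (parameters, cost)."""
    p = list(p0)
    n = len(p)

    def sse(q):
        return sum((y - model(x, *q)) ** 2 for x, y in zip(xs, ys))

    cost = sse(p)
    for _ in range(iterations):
        J, r = [], []
        for x, y in zip(xs, ys):
            f0 = model(x, *p)
            r.append(y - f0)
            row = []
            for j in range(n):  # forward-difference derivative
                h = 1e-6 * max(1.0, abs(p[j]))
                q = p[:]
                q[j] += h
                row.append((model(x, *q) - f0) / h)
            J.append(row)
        JTJ = [[sum(Ji[a] * Ji[b] for Ji in J) for b in range(n)] for a in range(n)]
        JTr = [sum(Ji[a] * ri for Ji, ri in zip(J, r)) for a in range(n)]
        # Damped normal equations: (J^T J + lam * diag(J^T J)) step = J^T r
        M = [[JTJ[a][b] + (lam * JTJ[a][a] if a == b else 0.0) for b in range(n)]
             for a in range(n)]
        step = solve(M, JTr)
        trial = [pj + sj for pj, sj in zip(p, step)]
        trial_cost = sse(trial)
        if trial_cost < cost:   # accept the step and reduce the damping
            p, cost = trial, trial_cost
            lam /= 10.0
        else:                   # reject the step and increase the damping
            lam *= 10.0
        if max(abs(s) for s in step) < 1e-12:
            break
    return p, cost


if __name__ == "__main__":
    xs = [i * 0.1 for i in range(101)]
    ys = [gaussian_peak(x, 10.0, 2.0, 5.0) for x in xs]
    print(levenberg_marquardt(gaussian_peak, xs, ys, [9.0, 2.5, 4.7]))
```

A batch fit, in this picture, amounts to calling such a routine once per cell with the initial parameters taken from the loaded form; statistical weights, parameter limits and polynomial backgrounds are omitted here.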

Table 2: Statistical indicators for powder diffraction pattern refinement. $y^o$ and $y^c$ are the observed and calculated count rate respectively, $w$ is the statistical weight, $O$, $P$ and $C$ are the numbers of observations, adjusted parameters in the calculated model, and applied constraints respectively, $I^o$ and $I^c$ are the observed and calculated integrated intensity respectively, and $i$ and $k$ are counting indices.

Statistical indicator | Definition
Profile residual | $R_p = \frac{\sum_i |y_i^o - y_i^c|}{\sum_i y_i^o}$
Weighted profile residual | $R_{wp} = \left[\frac{\sum_i w_i (y_i^o - y_i^c)^2}{\sum_i w_i (y_i^o)^2}\right]^{1/2}$
Expected residual | $R_{exp} = \left[\frac{O - P + C}{\sum_i w_i (y_i^o)^2}\right]^{1/2}$
Bragg residual | $R_B = \frac{\sum_k |I_k^o - I_k^c|}{\sum_k |I_k^o|}$
Structure factor residual | $R_S = \frac{\sum_k \left|\sqrt{I_k^o} - \sqrt{I_k^c}\right|}{\sum_k \left|\sqrt{I_k^o}\right|}$
Goodness-of-fit index | $\chi^2 = \left(\frac{R_{wp}}{R_{exp}}\right)^2 = \frac{\sum_i w_i (y_i^o - y_i^c)^2}{O - P + C}$
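The profile-based indicators of Table 2 can be computed in a few lines. The Python sketch below is illustrative only; the function name `refinement_indicators` is hypothetical, and the choice of weights w = 1/y^o in the usage example is a common convention for counting data that is assumed here rather than taken from EasyDD.

```python
import math


def refinement_indicators(y_obs, y_calc, weights, n_params, n_constraints=0):
    """Return Rp, Rwp, Rexp and chi-squared for one fitted pattern."""
    dof = len(y_obs) - n_params + n_constraints  # O - P + C
    sum_abs = sum(abs(o - c) for o, c in zip(y_obs, y_calc))
    wss = sum(w * (o - c) ** 2 for w, o, c in zip(weights, y_obs, y_calc))
    wyo2 = sum(w * o * o for w, o in zip(weights, y_obs))
    return {
        "Rp": sum_abs / sum(y_obs),
        "Rwp": math.sqrt(wss / wyo2),
        "Rexp": math.sqrt(dof / wyo2),
        "chi2": wss / dof,
    }


if __name__ == "__main__":
    y_obs = [100.0, 200.0, 150.0, 120.0]
    y_calc = [98.0, 205.0, 149.0, 118.0]
    weights = [1.0 / y for y in y_obs]  # assumed weighting scheme
    print(refinement_indicators(y_obs, y_calc, weights, n_params=2))
```

The two expressions for the goodness-of-fit index in Table 2 agree by construction, which gives a simple consistency check on any implementation.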

4.2 Back Projection

The purpose of this module is to reconstruct images from sinograms consisting of a set of rotation versus translation measurements, with optional filtering and application of the Fast Fourier Transform. This standard technique for tomographic reconstruction of 2D images from a set of 1D projections and the angles at which these projections were taken relies on the application of the inverse Radon transform and the Fourier Slice Theorem. The operation as implemented in EasyDD is flexible: back projection can be applied to the intensity, to selected channels or to all channels in a tab, with dynamic display of the reconstructed images and the possibility of toggling between the sinogram and the reconstructed image.
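The core of the technique can be sketched as follows. This Python example implements plain, unfiltered back projection with nearest-neighbour sampling, which is the simplest member of this family of methods; it omits the filtering and FFT steps mentioned above and is not the EasyDD code. The function name, the argument layout (`sinogram[a][t]` for angle index a and translation index t) and the centring convention are assumptions made for the example.

```python
import math


def back_project(sinogram, angles_deg, size):
    """Unfiltered back projection of a sinogram onto a size x size image.

    sinogram[a][t] is the projection value at angle angles_deg[a] and
    translation index t. Both the image and the detector are assumed to be
    centred on the rotation axis.
    """
    n_t = len(sinogram[0])
    c_img = (size - 1) / 2.0
    c_det = (n_t - 1) / 2.0
    image = [[0.0] * size for _ in range(size)]
    for projection, angle in zip(sinogram, angles_deg):
        theta = math.radians(angle)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        for row in range(size):
            y = row - c_img
            for col in range(size):
                x = col - c_img
                # Translation coordinate of this pixel at the current angle.
                t = x * cos_t + y * sin_t + c_det
                k = int(round(t))
                if 0 <= k < n_t:
                    image[row][col] += projection[k]
    scale = 1.0 / len(angles_deg)
    return [[value * scale for value in row] for row in image]


if __name__ == "__main__":
    # A point object on the rotation axis projects to the central
    # translation position at every angle.
    angles = [0.0, 45.0, 90.0, 135.0]
    sino = [[0.0, 0.0, 1.0, 0.0, 0.0] for _ in angles]
    for row in back_project(sino, angles, 5):
        print(row)
```

Without filtering, the reconstruction of a point is blurred along the projection directions; applying a ramp filter to each projection before the summation is the usual remedy.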

5 Acknowledgement

The author would like to acknowledge valuable contributions from Prof. Paul Barnes and Dr Simon Jacques. Dr Jacques should also be credited as the originator of the main ideas of EasyDD.

6 References

Diamond Light Source website: www.diamond.ac.uk/default.htm.
European Synchrotron Radiation Facility (ESRF) website: www.esrf.eu/.
Synchrotron Radiation Source (SRS) website: www.srs.ac.uk/srs/.

7 Nomenclature

χ²      goodness-of-fit index
A       area under peak
C       number of constraints
F       Full Width at Half Maximum
Ic      calculated integrated intensity
Io      observed integrated intensity
m       mixing factor in pseudo-Voigt function
O       number of observations
P       number of parameters
RB      Bragg residual
Rexp    expected residual
Rp      profile residual
RS      structure factor residual
Rwp     weighted profile residual
w       statistical weight
X       position of peak
yc      calculated count rate
yo      observed count rate

2D      two-dimensional
3D      three-dimensional
ADD     Angle Dispersive Diffraction
EDD     Energy Dispersive Diffraction
ERD     Energy Resolving Detector
ESRF    European Synchrotron Radiation Facility (Grenoble - France)
FWHM    Full Width at Half Maximum
GUI     Graphic User Interface
MCA     Multi-Channel Analyser
RAL     Rutherford Appleton Laboratory (Didcot - UK)
SRS     Synchrotron Radiation Source (Daresbury - UK)
TEDDI   Tomographic Energy Dispersive Diffraction Imaging
