DatView: a graphical user interface for visualizing and querying large data sets in serial femtosecond crystallography (2019)
DatView is a new graphical user interface (GUI) for plotting parameters to explore correlations, identify outliers and export subsets of data. It was designed to simplify and expedite analysis of very large unmerged serial femtosecond crystallography (SFX) data sets composed of indexing results from hundreds of thousands of microcrystal diffraction patterns. However, DatView works with any tabulated data, offering its functionality to many applications outside serial crystallography. In DatView‘s user-friendly GUI, selections are drawn onto plots and synchronized across all other plots, so correlations between multiple parameters in large multi-parameter data sets can be rapidly identified. It also includes an item viewer for displaying images in the current selection alongside the associated metadata. For serial crystallography data processed by indexamajig from CrystFEL [White, Kirian, Martin, Aquila, Nass, Barty & Chapman (2012). J. Appl. Cryst. 45, 335–341], DatView generates a table of parameters and metadata from stream files and, optionally, the associated HDF5 files. By combining the functionality of several commonly needed tools for SFX in a single GUI that operates on tabulated data, the time needed to load and calculate statistics from large data sets is reduced. This paper describes how DatView facilitates (i) efficient feedback during data collection by examining trends in time, sample position or any parameter, (ii) determination of optimal indexing and integration parameters via the comparison mode, (iii) identification of systematic errors in unmerged SFX data sets, and (iv) sorting and highly flexible data filtering (plot selections, Boolean filters and more), including direct export of subset CrystFEL stream files for further processing.