Openseize: a novel open-source software to analyze large-scale digital signals
Electroencephalography (EEG) is an indispensable tool used by clinicians to diagnose neurological diseases and by researchers to study and discover brain circuit mechanisms that support sensory, mnemonic, and cognitive processing. A new software package, Openseize, created by Dr. Matthew Caudill, an investigator at the Jan and Dan Duncan Neurological Research Institute at Texas Children’s Hospital and assistant professor at Baylor College of Medicine, can now analyze massive amounts of one-dimensional digital signals, including huge EEG datasets. The study was published in the Journal of Open Source Software.
“Typically, EEG signals are recorded using devices that have a channel count between 1-10 electrodes,” Dr. Caudill said. “However, recent advances in thin-film electronics have resulted in new types of recording devices that can simultaneously record from thousands of channels, each of which records the activity of many neurons from a particular brain region for long periods of time (e.g. months). Currently, these new high-throughput EEG recording devices are being used to record very high-quality neural activity from the brains of rodent models. This has led to the incredible challenge of how to process such massive amounts of data in an efficient and reliable manner.”
Mechanistically, EEGs are moving time series that capture alterations in the brain’s electromagnetic field arising from synchronous synaptic potential changes across neuronal populations. Linear digital signal processing (DSP) tools are routinely applied to EEG recordings to reduce noise, resample the data, remove artifacts, expose the data’s spatiotemporal frequency content, and much more.
“We typically run this data through mathematical analyses to mine and extract useful biological information from such datasets,” Dr. Caudill said. “The problem is even the most high-end computers do not have large enough memory to perform computations on the massive datasets that are being generated by the new recording devices. To tackle this issue, I designed Openseize software to break down large datasets into smaller fragments that are amenable to typical computations and then assemble these analyzed chunks to build up to the full solution. It is analogous to how we bake a complex multi-tier cake – first, we gather the ingredients needed to bake individual components from the pantry, bake each tier one by one and finally, assemble the cake.”
In a similar fashion, Openseize first provides a ‘data producer’: a set of instructions for extracting chunks of data from the file on disk, each sized to fit easily in an average computer’s memory. Using this data producer, users can then build custom processing pipelines that operate on each chunk of the data to extract important biomarkers such as frequency densities, spike rates, and much more. This sequence of steps is repeated chunk by chunk until the entire dataset has been analyzed, and the analyzed segments are then assembled into the final product.
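For readers who want a concrete picture, the chunk-process-assemble idea can be sketched in a few lines of plain Python. This is only an illustration of the general technique, not Openseize's actual API: the function names (`write_signal`, `producer`, `moving_average`, `mean_power`), the raw float64 file layout, and the choice of a moving-average filter are all assumptions made for this example.

```python
import array
import math
import os
import tempfile
from collections import deque


def write_signal(path, samples):
    """Hypothetical helper: store samples on disk as raw float64 values."""
    with open(path, "wb") as fp:
        array.array("d", samples).tofile(fp)


def producer(path, chunksize):
    """Yield successive chunks of at most `chunksize` samples from disk.

    Only one chunk is held in memory at a time.
    """
    itemsize = array.array("d").itemsize
    with open(path, "rb") as fp:
        while True:
            raw = fp.read(chunksize * itemsize)
            if not raw:
                break
            chunk = array.array("d")
            chunk.frombytes(raw)
            yield chunk


def moving_average(chunks, width):
    """Causal moving-average filter (a simple linear DSP step).

    The most recent `width` samples are carried across chunk boundaries,
    so the output matches filtering the whole signal in one pass.
    """
    window = deque(maxlen=width)
    for chunk in chunks:
        out = array.array("d")
        for x in chunk:
            window.append(x)
            out.append(sum(window) / len(window))
        yield out


def mean_power(chunks):
    """Assemble per-chunk partial sums into one summary value."""
    total, count = 0.0, 0
    for chunk in chunks:
        total += sum(x * x for x in chunk)
        count += len(chunk)
    return total / count


if __name__ == "__main__":
    # Synthetic 8 Hz sine sampled at 250 Hz, standing in for a recording.
    samples = [math.sin(2 * math.pi * 8 * n / 250) for n in range(5000)]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "signal.bin")
        write_signal(path, samples)
        smoothed = moving_average(producer(path, chunksize=512), width=5)
        print(mean_power(smoothed))
```

Because every stage is a generator, no stage ever holds more than one chunk in memory, which is the property that lets this style of pipeline scale to recordings far larger than the computer's RAM. The filter also shows why chunk boundaries need care: without the carried-over samples, the output near each boundary would differ from the result of processing the signal in one piece.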
“Currently, this software is being used at the Duncan NRI to detect and study several biological events from EEG and electromyography (EMG) data in animals including seizure events, understanding brain activity patterns, and long-term motor coordination in different models of neurological diseases,” Caudill added. “This open-source software has much wider applicability beyond analyzing electrical signals in the brain and muscles. In theory, it can be used to analyze any one-dimensional digital signals such as audio signals, which means there are many more ways one could explore using this software to analyze biomedical and other data in the future.”
This work was generously supported by the Ting Tsung and Wei Fong Chao Foundation.