Feature Article

Ocean Data Manager Handles Large, Diverse Data Sets
Software Tool for Oceanographic Data Management Allows Processing of Complex and Heterogeneous Data Files


AUTHORS:
Francisco J. Gutiérrez
Technologist
Department of Ecology and Coastal Management
Instituto de Ciencias Marinas de Andalucía
Consejo Superior de Investigaciones Científicas
Puerto Real, Spain
David Roque
Upper Technician
Department of Physiology, Nutrition and Mollusc Culture
Instituto de Investigaciones Marinas
Consejo Superior de Investigaciones Científicas
Vigo, Spain
Dr. Gabriel Navarro
Research Assistant
Department of Ecology and Coastal Management
Instituto de Ciencias Marinas de Andalucía
Consejo Superior de Investigaciones Científicas
Puerto Real, Spain
Over the past three years, the Institute of Marine Science of Andalucía (ICMAN), part of the Spanish National Research Council (CSIC), has acquired diverse oceanographic equipment for an intensive 30-month continuous sampling of biogeochemical and physical variables in the Guadalquivir estuary, a marshy area at the mouth of the Guadalquivir River in southwest Spain. Because the sampling relied on many different types of instruments, it generated a complex and inhomogeneous set of data files to be processed and analyzed.

Ocean Data Manager (ODM) is a free software tool developed in MathWorks Inc.’s MATLAB software (versions 6.5 and above) that homogenizes all those data files. It provides an easy-to-use graphical tool that not only imports data but also filters, concatenates, visualizes and exports them to more convenient file formats, for instance, the format used by Ocean Data View (ODV) software. In this way, ODM provides a common framework for data file manipulation, aiding researchers and technicians who have to handle data files from such a varied set of instruments. ODM is also very useful during oceanographic cruises for rapid visualization of campaign data.

ODM Software Overview
ODM version 1.0 is available for free download at www.icman.csic.es, along with reference material. ODM software consists of five subprograms or tools that can be accessed from the main interface window: import, filter, concatenate, plot and export. The program also includes functionality for implementing new user-defined graphical interfaces.

To simplify the addition of new functionality, ODM implements a modular software architecture. Users can thus add support for their own instrument data files, or add new tasks that can be executed directly from the main interface.


Import Tool
The import tool subprogram provides the basis for data format normalization. It relies on an internal MATLAB data structure that is used by the rest of the subprograms. The fields of this structure are divided into three groups. The first group includes the fields that put data into context: the project name; the main researcher; vessel, mooring or station name; geographical coordinates; initial and final date; and other similar categories. The second group consists of fields that are more instrument-specific: sample and average interval; instrument brand, model and serial number; instrument type; column names and units; and, for acoustic current profilers, the number of cells, cell size or blanking distance. The third group includes auxiliary variables that keep the rest of the subprograms consistent.
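The three field groups can be pictured with a short sketch. ODM itself stores them in a MATLAB structure. The Python version below is only an illustration, and every class name, field name and sample value in it is an assumption, not ODM's actual layout.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ContextFields:
    """Group 1: fields that put the data into context (names are hypothetical)."""
    project: str
    main_researcher: str
    platform_name: str  # vessel, mooring or station name
    latitude: float
    longitude: float
    start_date: str
    end_date: str


@dataclass
class InstrumentFields:
    """Group 2: instrument-specific fields (names are hypothetical)."""
    instrument_type: str
    brand: str
    model: str
    serial_number: str
    sample_interval_s: float
    average_interval_s: float
    column_names: List[str]
    column_units: List[str]
    # Only meaningful for acoustic current profilers.
    n_cells: Optional[int] = None
    cell_size_m: Optional[float] = None
    blanking_distance_m: Optional[float] = None


@dataclass
class OdmRecord:
    """One imported file: the two groups above plus auxiliary variables."""
    context: ContextFields
    instrument: InstrumentFields
    auxiliary: Dict[str, object] = field(default_factory=dict)  # group 3
    data: List[List[float]] = field(default_factory=list)       # filled at import


# Placeholder values, made up for the example.
record = OdmRecord(
    context=ContextFields("DemoProject", "A. Researcher", "Mooring-1",
                          0.0, 0.0, "2000-01-01", "2000-01-31"),
    instrument=InstrumentFields("CTD", "ExampleBrand", "X1", "0001",
                                60.0, 10.0,
                                ["temperature", "salinity"], ["degC", "PSU"]),
)
```

Keeping the profiler-only fields optional lets one record type describe a CTD and a current profiler alike, which is the kind of uniformity the rest of the subprograms rely on.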

The import tool follows five steps. In the first step, the software asks the user to select the instrument type, brand and model that produced the data being imported. Behind this simple selection, a set of variables is initialized, allowing the application to use the correct drivers to decode the original data file. In the second step, the user selects the data file to be imported. ODM quickly inspects the data file to check that it came from the right brand and model, then automatically reads some metadata fields from the data file. All those fields are shown to the user for verification. In the third step, the application asks the user to fill in the rest of the metadata fields that put the data into context. The fourth step shows the user a summary of the information gathered, and the fifth step imports the data.
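The first two steps amount to picking a driver and checking the file against it. The sketch below shows one way this could work, in Python and not in ODM's MATLAB. The registry, the `EXAMPLEBRAND-X1` signature and the `SN=` header layout are all invented for illustration and do not describe any real instrument format or ODM's actual drivers.

```python
from typing import Callable, Dict, NamedTuple, Tuple


class Driver(NamedTuple):
    signature: str                        # text expected at the start of the file (hypothetical)
    read_metadata: Callable[[str], dict]  # pulls metadata out of the header line


def read_serial(header: str) -> dict:
    # Hypothetical header layout: "<signature> SN=<serial>"
    return {"serial_number": header.split("SN=")[1].strip()}


# Step 1: the (instrument type, brand, model) selection picks a driver.
DRIVERS: Dict[Tuple[str, str, str], Driver] = {
    ("CTD", "ExampleBrand", "X1"): Driver("EXAMPLEBRAND-X1", read_serial),
}


def inspect_file(selection: Tuple[str, str, str], first_line: str) -> dict:
    """Step 2: check the file matches the selection, then read its metadata."""
    driver = DRIVERS[selection]
    if not first_line.startswith(driver.signature):
        raise ValueError("file does not match the selected brand and model")
    return driver.read_metadata(first_line)


metadata = inspect_file(("CTD", "ExampleBrand", "X1"), "EXAMPLEBRAND-X1 SN=1234")
```

Under this design, supporting a new instrument means adding one registry entry, which fits the modular architecture described earlier.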

(Top) ODM plot screen capture showing a 13-station temperature-salinity diagram. (Bottom) Graphical representation of an ASCII data file generated by ODV export.

After importing, the data are stored as a matrix in the metadata structure in a new field named data, and a new MATLAB data file containing that structure is created. Because all the data in the original input file are imported at this stage, including any bad data points, these can be removed afterward using the filter tool.


Filter Tool
This subprogram allows the user to remove easy-to-identify bad data. For example, in a temperature-salinity mooring time series, the first and the last data points, having meaningless salinity values, can be easily removed with this tool.

Overall, this tool plots the variables that the user selects and removes the points inside a region delimited by thresholds defined by the user. The filter allows zooming in and out of different regions, easing the selection of the bad points. This process can be repeated in a loop to plot different pairs of variables to continue removing bad data points.

It is important to mention that no data are removed from the ODM metadata structure. Instead, a Boolean vector identifying good and bad data rows is added to the structure. Once the user is satisfied with the result, the tool creates a new MATLAB file.
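The non-destructive filtering described here can be sketched as follows. This is a minimal Python illustration, not ODM's MATLAB code. The function name, argument order and sample values are assumptions. Rows falling inside a user-defined rectangle are flagged as bad in a Boolean vector, while the data matrix itself is left untouched.

```python
def flag_region(rows, good, x_col, y_col, x_range, y_range):
    """Return a new good/bad vector with rows inside the rectangle marked bad.

    rows    : list of data rows (the data matrix, never modified)
    good    : current Boolean vector, True for good rows
    x_col,
    y_col   : indices of the two plotted variables
    x_range,
    y_range : (low, high) thresholds delimiting the region to reject
    """
    x_lo, x_hi = x_range
    y_lo, y_hi = y_range
    updated = []
    for row, is_good in zip(rows, good):
        inside = x_lo <= row[x_col] <= x_hi and y_lo <= row[y_col] <= y_hi
        # A row stays good only if it was good before and lies outside the region.
        updated.append(is_good and not inside)
    return updated


# Made-up mooring series: columns are temperature (degC) and salinity.
data = [
    [18.2, 0.1],   # instrument still out of the water
    [18.4, 35.9],
    [18.5, 36.0],
    [18.3, 0.0],   # instrument already recovered
]
good = [True] * len(data)

# Reject every point whose salinity lies between -1 and 1, at any temperature.
good = flag_region(data, good, 0, 1, (0.0, 40.0), (-1.0, 1.0))
```

Because each pass only updates the flags of previously good rows, the call can be repeated for other pairs of variables, matching the plot-and-remove loop described above.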


File Concatenating Tools
File concatenation is useful for extending a time series, for example when a moored instrument is recovered and its data file is downloaded for analysis. ODM has two tools for concatenating files.

The first tool is the “mooring extend” tool. It only merges files that belong to the same project, have the same mooring location name and were obtained with the same instrument type, even if the data were recorded with different brands and/or models. After finishing with the mooring extend tool, ODM generates an output file. To link up files from different times and locations that still belong to the same project and come from the same instrument type, ODM has a second tool, the “generate campaign” tool.

Internally, both tools generate an array of ODM metadata structures. The two are distinguished by a state variable, named extended_type, which is set to “mooring” or “campaign” depending on the type of concatenation made. The ODM export tool reads this variable and behaves accordingly.
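The matching rules of the two tools can be sketched as below, in Python and not in ODM's MATLAB. Only the name extended_type comes from ODM. The dictionary keys, the function names and the choice to store extended_type next to the list of records are assumptions made for the example.

```python
def all_match(files, keys):
    """True if every file agrees with the first one on all the given keys."""
    first = files[0]
    return all(f[key] == first[key] for f in files for key in keys)


def mooring_extend(files):
    # Same project, same mooring location and same instrument type are required;
    # brand and model are allowed to differ.
    if not all_match(files, ("project", "location", "instrument_type")):
        raise ValueError("files do not belong to the same mooring")
    return {"extended_type": "mooring", "records": list(files)}


def generate_campaign(files):
    # Same project and instrument type are required; locations and times may differ.
    if not all_match(files, ("project", "instrument_type")):
        raise ValueError("files do not belong to the same campaign")
    return {"extended_type": "campaign", "records": list(files)}


# Made-up records: two CTD stations of one project.
station_a = {"project": "Demo", "location": "A", "instrument_type": "CTD", "brand": "B1"}
station_b = {"project": "Demo", "location": "B", "instrument_type": "CTD", "brand": "B2"}

campaign = generate_campaign([station_a, station_b])
```

A downstream exporter can then branch on `campaign["extended_type"]` to decide which output layout to write.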


Plot Tool
The plot tool provides direct graphical outputs in some common picture formats (JPEG, TIFF, PNG, etc.) as well as MATLAB figures.

At the first step, the tool asks the user for a data file. This file may be one of those generated after importing, filtering or concatenating. Then it shows the metadata contained in it.

The next step is choosing the variables to be plotted. The user can tailor the graphic to a variety of options: dual and single Y-axis, main Y-axis reversal, changing axis limits and, in the case of multiple stations, selecting and deselecting stations.

Finally, the user can select the graphical output file format that best suits the application. It is also possible to edit the final figure to match it to the user’s requirements and export it to other standard graphical formats supported by MATLAB.

ODV Export Tool
The ODV export tool transforms the information contained in a MATLAB data file generated after importing, filtering or concatenating, and it outputs a tab-separated ASCII file. This file has an internal structure that allows it to be plotted by more sophisticated third-party programs.

In the case of files created by the import, filter and mooring extend tools, the exported ASCII file has the following structure: a header with the most important metadata fields, an additional line for the column descriptors, a first data line containing information such as the name of the location, geographic coordinates, bottom depth, etc., and the rest of the file contains the actual data.

In the case of files created by the generate campaign tool, immediately after the header the following substructure is repeated for every station: a first line containing information relative to each station and then the actual data for that station.
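The layout described above can be illustrated with a small Python sketch. It reproduces only the ordering of lines given in the text and does not claim to match the exact file format that ODV expects. The function name, the `key: value` header style and the station fields are assumptions, as is writing the column descriptor line once after the header in the multi-station case.

```python
def export_lines(header, columns, stations):
    """Build the lines of a tab-separated ASCII export.

    header   : dict of metadata fields written at the top of the file
    columns  : list of column descriptors
    stations : list of dicts, one per station, each with its own data rows
    """
    lines = [f"{key}: {value}" for key, value in header.items()]
    lines.append("\t".join(columns))
    for station in stations:
        # First line of each block: information about the station itself.
        info = [station["name"], station["lat"], station["lon"], station["bottom_depth"]]
        lines.append("\t".join(str(item) for item in info))
        # Then the actual data for that station.
        for row in station["rows"]:
            lines.append("\t".join(str(value) for value in row))
    return lines


# Made-up example with two stations.
header = {"project": "Demo", "instrument_type": "CTD"}
columns = ["depth_m", "temperature_C"]
stations = [
    {"name": "S1", "lat": 0.0, "lon": 0.0, "bottom_depth": 10.0,
     "rows": [[1.0, 18.2], [2.0, 18.1]]},
    {"name": "S2", "lat": 0.0, "lon": 0.5, "bottom_depth": 12.0,
     "rows": [[1.0, 18.4]]},
]
lines = export_lines(header, columns, stations)
```

Passing a single station gives the first layout (header, column descriptors, one information line, data), while passing several stations repeats the information-plus-data block, as in the generate campaign case.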


Other User Tools
This is a very simple subprogram that provides users with the capability of adding their own tools and creating an open-code generic graphical user interface that can be freely modified to implement new user functionality.

ODM Applications
Earlier versions of ODM have been used in several projects undertaken at ICMAN-CSIC since 2009. In 2010, further development of the software became necessary, mainly to meet the need for fast report generation. It was at that point that the first proper version of ODM was introduced.

In the Guadalquivir project, for instance, where oceanographic surveys took place every month throughout the 30-month duration of the project, conductivity, temperature and depth (CTD) casts were handled using the ODM software. CTD casts were imported and filtered, and all the measured variables (conductivity, temperature, dissolved oxygen, turbidity and chlorophyll concentration) were compiled for a monthly data report.

Time-series data from moored instrumentation were also generated using ODM for this project. For example, a Nortek (Rud, Norway) AWAC-AST current profiler had been deployed at the mouth of the Guadalquivir River for two years.

During mooring maintenance, researchers used ODM to import the data (currents and waves), filter them (mainly by removing bad data at the beginning and the end of the time-series portions), merge them with earlier time-series portions and plot them together for the data report.

Data from other projects have also been reprocessed using ODM, such as the CTD casts generated during two oceanographic cruises (April and May 2008) for the European Integrated Project FP6-SESAME. ODM allowed importing, filtering and concatenating the CTD casts from the 33 stations of each cruise. After that, all the data were put together in a single file using ODM’s ODV export tool and submitted for integration in the European SESAME database, which contained CTD data from other groups’ cruises covering the entire Mediterranean Sea.


Conclusions
ODM is a free software package, developed with MATLAB 6.5, that eases the homogenization of data acquired by oceanographic instrumentation such as CTDs, tide gauges, and current profilers.

It provides an easy-to-use graphical tool for importing, filtering, concatenating, visualizing and exporting data, and it has been used extensively in projects where fast report generation was necessary.

Although no updates are planned for this software, the application includes an open-code subprogram (in “other user tools”), allowing users to easily upgrade the functionality of the software by adding their own tools.


Acknowledgments
ODM has been developed in the context of the P09-RNM-4853 project. The authors would like to offer particular recognition to ICMAN-CSIC technicians Raúl García, Joaquín Pampín and Antonio Moreno.



Francisco J. Gutiérrez received an M.Sc. degree in physics in 1996. Thereafter he developed his career in various applied physics and technology fields: fluid dynamics, condensed matter, nuclear power generation, digital signal processing and embedded electronics. He works as a technologist at the Department of Ecology and Coastal Management, ICMAN-CSIC.

David Roque received an M.Sc. degree in marine science in 2005, later working at the Department of Applied Physics at the University of Cadiz and at ICMAN-CSIC. He is currently responsible for the oceanographic data of the Department of Physiology, Nutrition and Mollusc Culture, at Instituto de Investigaciones Marinas de Vigo-CSIC.

Dr. Gabriel Navarro received a Ph.D. in marine science in 2004. In June 2005, he was given a permanent position as a research assistant at the Department of Ecology and Coastal Management, ICMAN-CSIC, where he provides expertise in remote sensing.





Sea Technology is read worldwide in more than 110 countries by management, engineers, scientists and technical personnel working in industry, government and educational research institutions. Readers are involved with oceanographic research, fisheries management, offshore oil and gas exploration and production, undersea defense including antisubmarine warfare, ocean mining and commercial diving.