Section 1

Introduction

 

 


1.1 Introduction

The Interactive Image SpreadSheet (IISS) program provides image processing, visualization, and algorithm development capabilities within a generalized spreadsheet framework. The extension of the IISS to Web-based technologies, client-server environments, high performance networks, and petabyte digital libraries is referred to as the Distributed Image SpreadSheet (DISS). All of the material in this IISS User's Manual also applies to the DISS functionality.

The IISS and DISS software tools extend the traditional spreadsheet paradigm to develop highly interactive visualization and analysis environments using intuitive collaborative interfaces. The IISS/DISS supports the organization of multiple layers of information within a cell. These layers are a sequential organization of data referred to as IISS frames or equivalently levels, pages or depth dimension inside of IISS cells. Thus the IISS is explicitly a 3-D spreadsheet or three-dimensional tabular layout of information.

An image spreadsheet can contain any number of cells and frames, and each frame supports the storage and display of arbitrary-sized datasets as the composition of frame contents. The IISS/DISS enables any subset of cells to be grouped together for synchronized viewing, interactive manipulation and computation. The IISS/DISS uses the powerful spreadsheet formula mechanism together with data-linking for efficient algorithm development, for rapid data browsing, and quality checking. In the remainder of this document when we refer to the IISS we are also implicitly referring to the DISS unless otherwise stated.

The term image spreadsheet is used in a generic sense. The IISS is not, and never was, restricted to the display of 2D images. The term image spreadsheet was deliberately chosen to differentiate it from all previous spreadsheets that manipulated only scalars. It refers to the depiction of the multidimensional data object contained in a frame as a 2D rendering within the display associated with that frame.

As such, the original scientific dataset contained in a frame may be a 3D spatial, 4D time-varying, or 5D time-varying multidimensional dataset that, when rendered to a two-dimensional display, is also referred to as an image. The IISS handles complex multidimensional datasets and provides a variety of visualization techniques for creating 2D displays. Note that nearly all displays are two-dimensional in nature, including various (auto)stereoscopic displays, with the exception of a few true volumetric displays that are being developed in research laboratories. We consider the terms image spreadsheet, visual spreadsheet, information spreadsheet, visualization spreadsheet or combinations thereof to be essentially equivalent.

The IISS explicitly incorporates image processing functionality, interactive high-quality rendering of surfaces, interactive rendering of volumes, and enables a systematic and compact organization of a large number of datasets, such as satellite multispectral image sequences, along rows, columns and levels. The IISS has special support for geophysical datasets that are often stored in specialized formats and often need to be remapped or renavigated to different projection coordinates. The IISS has been used for information visualization of multidimensional datasets.

The IISS visual interface consists of a two-dimensional matrix of cells, each with its own set of data (such as image, surface, volume, etc.) organized in a sequence of frames. Interactive manipulation and display of data accessible within each cell are described in Section 4 and include: (i) animation of frames, (ii) smooth roaming (scrolling/panning) and zooming within a frame, (iii) probing of original and displayed data values composing the frame dataset, and (iv) rendering of surface data. The grouping of cells into subsets provides one of the most powerful browsing capabilities of the IISS by synchronizing the display of data across multiple related cells. Establishing group dynamics among a set of cells, as well as establishing and breaking connections between frames and operational forms are described in Section 5.

A variety of display features are accessible from within the IISS. Any cell can be placed into a full-screen mode to use the full monitor display area for examining data in detail while minimizing the number of user interface panels. Each frame can have its own colormap, or can share colormap space with other frames once all of the available colormaps are in use. Auxiliary information such as text, overlays and colormaps can be shown along with each frame. Multiple views of the same frame data in several cells are supported in a memory-efficient manner without keeping multiple copies.

The data for each frame in the IISS is specified by a formula. The analysis capabilities of the IISS using formula operators to establish and represent relationships between frames are described in Section 6. Formulas can be constructed using typical spreadsheet syntax with image processing extensions. Formula evaluations are done in floating point when necessary. The evaluation of formulas can be automatic, view-driven (also known as demand-driven) or manually controlled. When formulas are evaluated across a group of cells, the progress of evaluation can be tracked by displaying each frame as it is evaluated, only the final resulting frame, or a combination of the two.
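The difference between automatic, view-driven and manually controlled evaluation can be illustrated with a minimal sketch. This is not IISS code: the class `LazyFrame`, its methods, and the use of a Python callable in place of a formula are all hypothetical names chosen for illustration.

```python
from typing import Callable, Optional


class LazyFrame:
    """Hypothetical frame whose data is produced by a formula callable."""

    def __init__(self, formula: Callable[[], float], automatic: bool = False):
        self.formula = formula
        self.evaluations = 0                  # how many times the formula ran
        self._value: Optional[float] = None
        self._valid = False
        if automatic:
            self.evaluate()                   # automatic: evaluate right away

    def evaluate(self) -> float:
        """Manual control: force the formula to be evaluated now."""
        self._value = self.formula()
        self._valid = True
        self.evaluations += 1
        return self._value

    def view(self) -> Optional[float]:
        """View-driven: evaluate only when the frame is actually displayed."""
        if not self._valid:
            self.evaluate()
        return self._value


eager = LazyFrame(lambda: 2.0 * 3.0, automatic=True)   # evaluated on creation
lazy = LazyFrame(lambda: 2.0 * 3.0)                    # deferred until viewed
```

In this sketch a view-driven frame costs nothing until it is displayed, which is one plausible reason to prefer that mode for large spreadsheets where only a few cells are visible at a time.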

An example of a 3x3 image spreadsheet for the visualization of multidimensional satellite data is shown below. The spreadsheet organizes monthly and daily global ozone measurements in three different cartographic projections, including polar stereographic and cylindrical equidistant, to show the amount of ozone in the upper atmosphere, with intense purple signifying very low ozone levels. Each frame can render 2D or 3D representations of the data using a variety of display visualization methods.


1.2 Hardware and Software Requirements

The current release of the IISS being distributed is Version 3.0 from August 2000 (the previous official release was Version Beta.0).

The IISS software has been tested and used across the full spectrum of single and multiprocessor Silicon Graphics Inc. (SGI) workstations with a range of graphics capabilities including: Onyx3, Onyx2, Octane, Octane2, O2, and earlier generation hardware such as Power Onyx, Onyx with VTX or RE2 graphics, Indigo2, Indy, Indy Presenter, 4D Powerseries with VGX or VGXT graphics, Crimson, Indigo, and Personal Iris. The processor types that have been used include MIPS R12000, R10000, R8000, R5000, R4400, R4000, and R3000. The IISS software can be ported to non-SGI workstations and personal computers that support SGI's Graphics Library (GL) or OpenGL libraries. The IISS supports both 8-bit and 24-bit color display monitors. However, certain operations may not be functional on 8-bit monitors that do not support the automatic dithered display of 24-bit data.

The IISS is a UNIX-based application that requires IRIX 5.x (32-bit version), IRIX 6.x (32-bit version) or later versions of the operating system on the SGI platforms. Under IRIX 6.x the IISS is still a 32-bit application due to the dependency on IRIS GL and Igloo graphics libraries which are supported only under 32-bit mode.

The IISS can run on a machine with as little as 16 MB of RAM provided that there is enough physical or virtual memory to hold all of the displayed frame (image) data in process memory. To work with large spreadsheets, the size of the swap space may have to be increased on systems with limited physical memory. To handle a useful number of datasets, display rendered surfaces with texture-mapping, and remap geophysical data to different projection coordinates, a workstation with 256 MB or more of RAM is highly recommended.

Disk storage requirements are determined by the size and number of datasets to be manipulated. The IISS itself requires about 32 MB for the executable (when statically linked) and several example spreadsheets. The IISS distribution is usually statically linked due to dependencies on compression and other libraries that may not be available on all systems as shared libraries. The IISS also uses temporary disk space to store intermediate results (usually in floating point format) during image calculations. A multigigabyte to terabyte disk array that implements disk striping is recommended for fast parallel input/output (I/O) performance. The IISS explicitly supports direct I/O reads from external disk directly into program memory (without going through an intermediate cache) to speed up access to disk files. Direct I/O is supported in IRIX 5.x and later versions of the SGI operating system. The SGI Bulk Data Service (BDS) software product supports direct I/O over local or wide area networks (LAN or WAN) using NFS.

The IISS requires access to Khoros 1.0 Release 5 executables in order to execute many of the image processing operators supported by the IISS formula language.

Certain additional libraries are required to compile the IISS, including the Univ. of Illinois NCSA Hierarchical Data Format (HDF) library Version 3.3 Release 2 or later for accessing HDF disk files, other data ingest libraries (GrADS, SGI, SUN, JPEG, GIF, TIFF, pnm, MPEG, SLCCA, VSLCCA), gzip for handling compression, and ftp and http for network access. Also needed are the USGS proj library, which supports remapping data represented in latitude and longitude coordinates to different cartographic projections; the Forms library from Utrecht Univ. for the IRIS GL-based graphical user interface; UNIX tools (lex, yacc) for formula evaluation; Flyby for surface rendering; and Vis5D for volume rendering.


1.3 Spreadsheet Organization

The IISS is a matrix of cells, where each cell contains an array of frames called a frame-stack, and each frame contains image data as illustrated in Fig. 1.1. The IISS, like numerical spreadsheets, is composed of an unlimited number of cells arranged in an arbitrary number of rows and columns. Cells are identified by their row and column location; rows are labeled with alphabetic characters, and columns are labeled with numbers.
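The row-letter and column-number labeling can be turned into matrix indices with a few lines of code. The sketch below is illustrative only: the function name `parse_cell_label`, the written form of a label (row letters followed by a column number starting at 1, as in "B3"), and the treatment of rows past Z as "AA", "AB" and so on are assumptions, not documented IISS behavior.

```python
import re
from typing import Tuple


def parse_cell_label(label: str) -> Tuple[int, int]:
    """Convert a label such as 'B3' into zero-based (row, column) indices.

    Assumed format: row letters followed by a 1-based column number.
    """
    match = re.fullmatch(r"([A-Za-z]+)([0-9]+)", label.strip())
    if match is None:
        raise ValueError(f"not a cell label: {label!r}")
    letters = match.group(1).upper()
    column = int(match.group(2))
    if column < 1:
        raise ValueError("column numbers are assumed to start at 1")
    row = 0
    for ch in letters:                       # 'A' -> 1, 'Z' -> 26, 'AA' -> 27
        row = row * 26 + (ord(ch) - ord("A") + 1)
    return row - 1, column - 1


first = parse_cell_label("A1")    # top-left cell
other = parse_cell_label("B3")    # second row, third column
```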

All the cells in a given row have the same height (number of pixels in the y-dimension) and all the cells in a given column have the same width (number of pixels in the x-dimension). Resizing a cell affects all the other cells in the same row and column to maintain a grid-like interface typical of numerical spreadsheets. Image spreadsheets requiring more pixels than are available on the display monitor may be scrolled through in a manner analogous to traditional numerical spreadsheets. For example, Fig. 1.2 represents an 8 x 8 matrix of image cells. When the full 8 x 8 matrix of cells is visible on the screen, each cell occupies 160 x 128 pixels on a 1280 x 1024 pixel monitor. For larger cell sizes, only a portion of the IISS is visible on the screen and the remainder may be brought into view by scrolling along the columns or rows using the cell matrix glyph described in Section 7. For a 4 x 4 subset of neighboring cells, as shown by the grayed region in Fig. 1.2, each cell occupies 320 x 256 pixels using the full screen.
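The cell sizes quoted above follow from dividing the monitor resolution by the number of visible columns and rows. A minimal sketch of that arithmetic is given here; the function name `cell_size` is hypothetical, and the calculation assumes equally sized cells and ignores any screen space taken by borders or user interface panels.

```python
from typing import Tuple


def cell_size(screen_w: int, screen_h: int,
              visible_cols: int, visible_rows: int) -> Tuple[int, int]:
    """Pixels (width, height) per cell when the visible cells fill the screen."""
    if visible_cols < 1 or visible_rows < 1:
        raise ValueError("at least one row and one column must be visible")
    return screen_w // visible_cols, screen_h // visible_rows


full_matrix = cell_size(1280, 1024, 8, 8)   # whole 8 x 8 matrix on screen
subset = cell_size(1280, 1024, 4, 4)        # 4 x 4 subset on screen
```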

The matrix configuration of the IISS is adaptive and application dependent. In a typical configuration, each row of the spreadsheet is dedicated to an instrument channel or combination thereof while each column is reserved for a particular enhancement of the raw data as shown in Configuration 1 of Fig. 1.2. A multispectral instrument which makes observations as a function of time can also be easily accommodated in the IISS as shown in Configuration 2 of Fig. 1.2. The spreadsheet rows represent different spectral bands and the spreadsheet columns represent the different temporal epochs. If time averages or composites are useful, they can be computed using formulas and included as an additional column of cells.

Cells figure

FIG. 1.1 A diagrammatic representation of the organization of the IISS. The illustration shows the terminology for the composition of the sheet. The image data is organized in the sheet in terms of frames that are grouped together as frame-stacks within cells.

In addition to controlling the spatial layout of the image data, the user also has control over the configuration of the frame-stack for each cell. The frame-stack can be used to load all of the channels of a multispectral image into a single cell or to compactly hold a temporally contiguous image sequence. An effective visualization technique has been to view all or part of the matrix of cells as a synchronized animation. The frame-stack can be advanced (forward or backward) one frame at a time or animated as a video-loop. The time sequence in a single cell may also be viewed as a full screen animation. The Graphical User Interface (GUI) which controls video animation of cells, zoom and roam, and the overlay of annotation, legends and geopolitical boundaries is discussed in Section 4.
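Stepping through a frame-stack and looping it as a video-loop amount to simple index arithmetic. The sketch below is a hedged illustration, not the IISS implementation: `step_frame` and its parameters are hypothetical, frame levels are assumed to be numbered from zero, and the behavior at the ends of the stack when looping is off (stopping at the first or last frame) is an assumption.

```python
def step_frame(current: int, n_frames: int,
               direction: int = 1, loop: bool = True) -> int:
    """Return the frame level shown after one step forward (+1) or backward (-1)."""
    if n_frames < 1:
        raise ValueError("the frame-stack is empty")
    nxt = current + direction
    if loop:
        return nxt % n_frames                 # video-loop: wrap around the stack
    return max(0, min(n_frames - 1, nxt))     # otherwise stop at either end


# Synchronized animation: every cell in the group takes the same step.
levels = [3, 3, 3]                            # current level in three cells
levels = [step_frame(level, 4) for level in levels]
```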

Two additional example IISS configurations are given in Fig. 1.3. Figure 1.3(A) schematically represents a multichannel dataset with the original data as well as the results of various processing steps. Because raw and processed datasets are available side-by-side, artifacts in the final product can be attributed either to the original data or to any of the processing steps. Many time-ordered datasets can be effectively viewed by using a 3-D spreadsheet where the third dimension contains images in a time sequence (for example, by day of month or by year). An example of such a spreadsheet organization is schematically represented in Fig. 1.3(B) where the third dimension represents data from different years.

8x8 cell matrix figure

FIG. 1.2. A multidimensional IISS consisting of an 8 x 8 matrix of cells each containing a stack of images called frames. When just the 4 x 4 array of cells (gray area) is displayed on the monitor each cell occupies more screen space. Configuration 1 shows multiple channels and products at various processing stages. Configuration 2 shows channels over time and a time composite.

configurations figure

FIG. 1.3. Examples of possible IISS configurations. (A) Shows a specific IISS illustration of how multichannel data may be displayed and analyzed; the ////// indicates empty cells. Note that the original data as well as the results of various processing steps are readily available. This enables the detection of artifacts in the final product which may be attributed either to the original data or processing steps. (B) Shows a 3D spreadsheet where both the second dimension (rows) and third dimension (shown here as pages) represent temporal observations on a monthly and annual scale respectively.


1.4 Data Structure Terminology

A hierarchical tree-based data structure is used to internally represent the IISS and facilitate the interactive manipulation of the data residing in cells of the spreadsheet. The hierarchical scheme closely corresponds to items being manipulated by the user in terms of Sheet, Cell, Frame-stack and Frame as shown in Fig. 1.1. The chart in Fig. 1.4 shows the overall organization of the IISS data structure. Note that some leaves in the tree represent individual members of the data structure while others refer to categories of items coalesced to simplify the diagram.

hierarchy data structure figure

FIG. 1.4. IISS hierarchical object-oriented data structure showing the organization of the image spreadsheet. Members of the data structure at the same level in the hierarchy have the same parent data field. Not all of the leaves in the hierarchy are individual members; some are themselves structures. For example, the Geometric Transformation field would include as members the scaling, translation and rotation parameters.

The top level is the Sheet along with the associated global display parameters and the two-dimensional matrix of cells. Each cell has an associated display window having its own screen dimensions subject to the constraint that all cells in a row share the same height and all cells in a column share the same width. In many applications all cell windows will be of the same size. Information at the Sheet level includes the screen size of the sheet, the size of the cell matrix, and hardware features and limitations such as double buffering of the screen display, the size of the colormap (8 bits versus 12 bits, for example), and the capability to zoom an image by fractional amounts. Each cell contains zero or more frames, which hold the actual image data, and all of the frames in a cell are organized in a sequence referred to as a frame-stack.

The row-column constraint on the dimensions of a cell enforces a strict matrix or grid organization. While it might be more convenient to have cells of unconstrained sizes to accommodate viewing images of different sizes, this would complicate the regular organization of a spreadsheet that makes it easy to interpret. In fact, the row-column spreadsheet interface provides a convenient mechanism for organizing large amounts of image data in a way that maximizes the use of display screen space and reduces the clutter generated in other image processing environments. The lack of flexibility in changing both dimensions of a single cell is compensated by the ability to roam and zoom within any cell or to view a single cell at full screen resolution.

Cells are arranged in a matrix format and are accessed via matrix addressing since the two-dimensional arrangement of cells will typically remain stable during a user's session or will change infrequently. There are no a priori limitations (other than memory constraints) on the size of the matrix of cells that can be created and manipulated; the spreadsheet's size (the number of rows and columns of cells) can be changed by the user.

The position of the frame within a frame-stack is termed its level (or equivalently depth or index) within the cell. Each cell can display only one frame at a time. The frame that is being displayed is referred to as the current frame. A frame contains data as well as display (or viewing) parameters that determine the way in which the data will be presented on the screen.
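The Sheet, Cell, Frame-stack and Frame hierarchy described in this section can be sketched with a few small classes. This is an illustrative model only, not the internal IISS data structure: every class, field and method name below is hypothetical, and the formula text stored in the example frame is a placeholder rather than real IISS formula syntax.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Frame:
    formula: str                                    # text defining the data
    display: Dict[str, float] = field(default_factory=dict)  # viewing parameters


@dataclass
class Cell:
    frame_stack: List[Frame] = field(default_factory=list)
    current_level: int = 0                          # level of the displayed frame

    def current_frame(self) -> Optional[Frame]:
        """The one frame this cell displays, or None if the cell is empty."""
        if not self.frame_stack:
            return None
        return self.frame_stack[self.current_level]


@dataclass
class Sheet:
    row_heights: List[int]                          # one shared height per row
    col_widths: List[int]                           # one shared width per column
    cells: List[List[Cell]] = field(init=False)

    def __post_init__(self) -> None:
        self.cells = [[Cell() for _ in self.col_widths]
                      for _ in self.row_heights]

    def window_size(self, row: int, col: int) -> Tuple[int, int]:
        """(width, height) of a cell window under the row-column constraint."""
        return self.col_widths[col], self.row_heights[row]


sheet = Sheet(row_heights=[128, 256], col_widths=[160, 160, 320])
sheet.cells[0][1].frame_stack.append(Frame("placeholder formula text"))
```

Storing heights per row and widths per column, rather than a size per cell, makes the row-column constraint hold by construction in this sketch.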

The dataset within a frame is defined through the use of formulas, which are textual expressions defining one or more sources of data and the sequence of operations to be performed on that data prior to display. A frame's formula may reference files on disk, or other frames in memory, as sources of data. If frame j's formula contains a reference to frame k, j is said to depend on k. In dataflow terminology, j is downstream of k, or equivalently k is upstream of j.
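The upstream and downstream relationship implies an evaluation order: a frame can only be computed after every frame it depends on. The sketch below shows one common way to derive such an order, a depth-first topological sort. It is not taken from the IISS source; the function name `evaluation_order`, the frame names, and the decision to reject circular references with an error are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def evaluation_order(depends_on: Dict[str, Set[str]]) -> List[str]:
    """Order frames so every upstream frame precedes its downstream frames."""
    order: List[str] = []
    state: Dict[str, int] = {}            # 1 = being visited, 2 = finished

    def visit(frame: str) -> None:
        if state.get(frame) == 2:
            return
        if state.get(frame) == 1:
            raise ValueError(f"circular reference involving {frame}")
        state[frame] = 1
        for upstream in sorted(depends_on.get(frame, ())):
            visit(upstream)               # evaluate sources first
        state[frame] = 2
        order.append(frame)

    for frame in sorted(depends_on):
        visit(frame)
    return order


# Frame j references frame k, and k references a frame named "raw".
deps = {"j": {"k"}, "k": {"raw"}, "raw": set()}
order = evaluation_order(deps)
```

Here `order` lists "raw" first, then "k", then "j", so each frame appears after everything upstream of it.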

The display parameters for a frame can be controlled interactively, and the effects can be propagated throughout the frame-stack as well as to all of the cells that have been grouped together. The propagation of effects across cells is a consequence of using the grouping mechanism to operate on arbitrary subsets of cells in an efficient manner. Parameter settings can be synchronized within frame-stacks and cell groups.
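Propagating a display parameter through frame-stacks and across a cell group can be pictured as a nested loop over the grouped cells and their frames. The sketch below is a simplified assumption of how this might look: the function `propagate`, the use of cell labels as dictionary keys, the per-frame parameter dictionaries, and the parameter name "zoom" are all hypothetical.

```python
from typing import Any, Dict, Iterable, List


def propagate(cells: Dict[str, List[Dict[str, Any]]],
              group: Iterable[str], name: str, value: Any) -> None:
    """Set one display parameter on every frame of every cell in the group."""
    for label in group:
        for frame_params in cells[label]:     # walk the cell's frame-stack
            frame_params[name] = value


# Three cells, each holding a list of per-frame display parameter dictionaries.
cells = {"A1": [{}, {}], "A2": [{}], "B1": [{}]}
propagate(cells, ["A1", "A2"], "zoom", 2.0)   # B1 is not in the group
```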