Programming Projects and Assignments (WILL BE FINALIZED IN CLASS)

Algorithms are to be directly implemented in the C++ (or C) language and applied to test images. You are expected to implement each programming assignment as a plug-in operator as part of the qtimage software environment provided in class. The qtimage environment automatically provides a mechanism for reading and writing several image formats, displaying many images simultaneously, probing pixel value information, zooming and panning an image, obtaining image information (size, bpp, padding, format), and a QImage class for manipulating image data in computer memory. You should apply each algorithm to several grayscale and color images of your choice. The input images, output images, histograms and other processing steps are to be saved and organized for access via Web pages. The results of your implementation can be checked for accuracy against the results produced using other image processing packages such as VisiQuest/Khoros, Matlab or the ImageVision Library.

NOTE: Assignments are to be set up as an HTML document (Web page) with a description of the algorithm, source code, and input and output images. Please print out your document and turn it in for grading unless a different submission mechanism is described in class.


MIDTERM: Nov 3. Covers Chapters 1 through 3 inclusive, all class and lab materials, and parts of Chapter 4.


Images from the Gonzalez and Woods, 2/E DIP textbook can be directly downloaded.
  1. Bit Counting. Determine the number of bits that have a value of one in the binary representation of each pixel's intensity value in an image. You can assume that for grayscale images the gray level intensities are represented using 8 bits per pixel (unsigned char) and for color RGB images the intensities are represented using three channels with 8 bits per channel, or 24 bits per pixel in total.

    Your program should work for both grayscale and color images. For an input gray level image the output can be another gray level image with pixel intensities ranging from 0 to 8. You can prepare the output image for display by scaling (e.g. multiplication by 16) or by adding an offset (e.g. addition of 200). Since the dynamic range is small, your output gray level images may not be very informative. An alternative is to output an (8-bit) color image using a color look-up table (LUT) of size 9, whereby each value between 0 and 8 is mapped to a distinctive color.

    For an input color image the output bit count image data can be either gray level pixels with a range from 0 to 24 (suitably scaled or offset), an 8-bit color image using a color LUT of size 25, or a 24-bit color image using RGB pixels with each component in the range 0 to 8, suitably scaled and offset for display.

    To reiterate, for display purposes you may need to manipulate your data. You can scale the output image data for display via a scale and/or offset factor, for example multiplying by 10. Alternatively, you can use a color look-up table (LUT) to map the bit count values to a color image in which each pixel's color corresponds to a different bit count; this keeps the original bit count values and does not change the count data. Your program should read and process byte data for images stored in standard image formats such as raw, PGM, BMP, GIF, JPEG, SGI, TIFF, etc.
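
    As a starting point, the bit counting and display scaling described above might be sketched as follows. This is only an illustration under stated assumptions: the function names (bitCount8, makeBitCountTable, bitCountGray, bitCountRgb) are hypothetical and not part of qtimg, pixels are assumed to be held in flat byte vectors, and RGB data is assumed to be interleaved.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Count the one bits in a byte by repeatedly clearing the lowest set bit.
inline int bitCount8(std::uint8_t v) {
    int n = 0;
    while (v != 0) {
        v = static_cast<std::uint8_t>(v & (v - 1));
        ++n;
    }
    return n;
}

// Precompute the count for all 256 byte values so each pixel costs one lookup.
inline std::array<std::uint8_t, 256> makeBitCountTable() {
    std::array<std::uint8_t, 256> table{};
    for (int i = 0; i < 256; ++i) {
        table[i] = static_cast<std::uint8_t>(bitCount8(static_cast<std::uint8_t>(i)));
    }
    return table;
}

// Grayscale: one count in 0..8 per pixel, multiplied by `scale` for display.
// scale = 1 keeps the raw counts; any scale up to 31 still fits in 8 bits.
inline std::vector<std::uint8_t> bitCountGray(const std::vector<std::uint8_t>& gray,
                                              int scale = 1) {
    assert(scale >= 1 && scale <= 31);
    const std::array<std::uint8_t, 256> table = makeBitCountTable();
    std::vector<std::uint8_t> out(gray.size());
    for (std::size_t i = 0; i < gray.size(); ++i) {
        out[i] = static_cast<std::uint8_t>(table[gray[i]] * scale);
    }
    return out;
}

// Interleaved RGB (assumed layout): sum the three channel counts, giving 0..24
// per pixel. Any scale up to 10 keeps the result within 8 bits.
inline std::vector<std::uint8_t> bitCountRgb(const std::vector<std::uint8_t>& rgb,
                                             int scale = 1) {
    assert(rgb.size() % 3 == 0 && scale >= 1 && scale <= 10);
    const std::array<std::uint8_t, 256> table = makeBitCountTable();
    std::vector<std::uint8_t> out(rgb.size() / 3);
    for (std::size_t p = 0; p < out.size(); ++p) {
        const int count = table[rgb[3 * p]] + table[rgb[3 * p + 1]] + table[rgb[3 * p + 2]];
        out[p] = static_cast<std::uint8_t>(count * scale);
    }
    return out;
}
```

    With scale left at 1 the output holds the raw counts, which is the form to feed into a color LUT; a larger scale is only for direct grayscale viewing.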

    BONUS: Extend your program and bitwise manipulations to work for 16-bit imagery.

    If you implement your program as a plug-in (to qtimg) the following specification does NOT apply. If your program runs on the command line and accepts command line arguments, then your program should process the following arguments:

    bit_count [x_dim] [y_dim] [8|24] input_image output_image

    The following C program, pgm_read+write.c, may be useful to directly read a pgm format grayscale image (in binary or ASCII representation) into a 2-D array for further processing.

    Notice that you have a lot of choices and options even for such a simple program. This will usually be the case for the other programming assignments in the course. Your written description should be complete and detail the choices you made in your software implementation.

    Image display software: If you work with raw data, then you can use the IISS to look at your input and output image data (on SGI Irix). If you read and write other file formats you can use qtimg (software for this course), gimp, ImageMagick, VisiQuest/Khoros, imageview, imageworks, xv, Photoshop, or any number of other image processing packages to view and check your results.

  2. Connected Components. Implement the scan-line based connected component labeling algorithm using the union-find data structure. Determine the number of connected components in an image using 4-connectivity AND 8-connectivity. Let the user choose the neighborhood connectivity.

    Your program needs to output several different pieces of information. First, output in (ASCII) text format the total number of components found along with the size of each component (i.e. its total number of connected pixels). Second, output an image in which each component is assigned a unique numeric label from 0 through N; that is, each pixel is labeled with a component number, or zero for the background. A color LUT can then be conveniently used to visually check that all the components have been correctly identified.

    You may assume that you will need to represent only 256 distinct components in the image, so you can use just one byte to represent the component label (0 to 255) for each pixel in the output image. If the image contains more than 256 components, your program must still correctly identify ALL the components, but only 256 unique labels need to be assigned to objects in the image for display and viewing; that is, reuse labels when there are more than 256 components.
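
    One possible shape for the two-pass scan-line labeling with union-find is sketched below. It is a sketch, not the version presented in class: the names UnionFind, Labeling, labelComponents and displayLabel are hypothetical, the input is assumed to be a row-major binary image with nonzero meaning foreground, and the rule for reusing byte labels (wrapping into 1..255) is just one reasonable choice.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Disjoint sets over provisional labels.
struct UnionFind {
    std::vector<int> parent;
    int makeSet() {
        parent.push_back(static_cast<int>(parent.size()));
        return parent.back();
    }
    int find(int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];  // path halving
            x = parent[x];
        }
        return x;
    }
    void unite(int a, int b) {
        a = find(a);
        b = find(b);
        if (a != b) parent[std::max(a, b)] = std::min(a, b);
    }
};

struct Labeling {
    std::vector<int> labels;  // 0 = background, 1..N = component number
    std::vector<int> sizes;   // sizes[0] = background pixels, sizes[k] = size of component k
};

inline Labeling labelComponents(const std::vector<std::uint8_t>& bin, int width, int height,
                                bool eightConnected) {
    assert(bin.size() == static_cast<std::size_t>(width) * static_cast<std::size_t>(height));
    UnionFind uf;
    uf.makeSet();  // provisional label 0 is reserved for the background
    std::vector<int> lab(bin.size(), 0);
    // Already-scanned neighbours: W and N, plus NW and NE for 8-connectivity.
    const int dx[4] = {-1, 0, -1, 1};
    const int dy[4] = {0, -1, -1, -1};
    const int neighbours = eightConnected ? 4 : 2;

    // Pass 1: assign provisional labels and record equivalences.
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            if (bin[y * width + x] == 0) continue;
            int cur = 0;
            for (int k = 0; k < neighbours; ++k) {
                const int nx = x + dx[k], ny = y + dy[k];
                if (nx < 0 || nx >= width || ny < 0) continue;
                const int nl = lab[ny * width + nx];
                if (nl == 0) continue;
                if (cur == 0) cur = nl; else uf.unite(cur, nl);
            }
            if (cur == 0) cur = uf.makeSet();
            lab[y * width + x] = cur;
        }
    }

    // Pass 2: map each equivalence class to a consecutive final label.
    Labeling result;
    result.labels.assign(bin.size(), 0);
    result.sizes.push_back(0);
    std::vector<int> finalLabel(uf.parent.size(), 0);
    for (std::size_t i = 0; i < lab.size(); ++i) {
        if (lab[i] == 0) { ++result.sizes[0]; continue; }
        const int root = uf.find(lab[i]);
        if (finalLabel[root] == 0) {
            finalLabel[root] = static_cast<int>(result.sizes.size());
            result.sizes.push_back(0);
        }
        result.labels[i] = finalLabel[root];
        ++result.sizes[finalLabel[root]];
    }
    return result;
}

// Byte label for display: 0 stays background, components wrap into 1..255.
inline std::uint8_t displayLabel(int label) {
    return label == 0 ? 0 : static_cast<std::uint8_t>(1 + (label - 1) % 255);
}
```

    The number of components is sizes.size() - 1, and the per-component sizes can be written directly to the ASCII report.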

    The connected component labeling algorithm described in class assumes a binary image (i.e. a zero-one valued or segmented image) as input. The general image segmentation task is very challenging and highly application dependent. Image segmentation is a topic that will be covered later in the course. For this assignment, you can generate a binary image from a gray level or RGB input image in several simple ways.

    In the first approach, you can modify your bit counting program to generate a binary image for input to the connected components program by selecting a specific bit plane to output instead of counting all the bits (e.g. bit plane 7 would be all pixels with gray value greater than or equal to 128). In the second approach, for either grayscale or 24-bit RGB images, you can select a range of intensity values to use as a simple threshold for selecting components from the image. For example, 22 to 100 would mean that all pixel values between 22 and 100 are treated as one and all other pixel values as zero when identifying objects in the image. For RGB images three ranges will be needed, one for each channel. Of course, multiple sets of ranges can be used in the thresholding procedure.
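
    Both ways of producing a binary image can be sketched in a few lines. The names below (Range, bitPlane, rangeThreshold, rangeThresholdRgb) are hypothetical, the range is assumed to be inclusive at both ends, and for RGB a pixel is assumed to count as foreground only when all three channels fall inside their ranges; other conventions are equally defensible.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Range {
    std::uint8_t lo;
    std::uint8_t hi;
};

// First approach: 1 where bit `plane` is set (0 = least significant, 7 = most).
inline std::vector<std::uint8_t> bitPlane(const std::vector<std::uint8_t>& gray, int plane) {
    assert(plane >= 0 && plane <= 7);
    std::vector<std::uint8_t> out(gray.size());
    for (std::size_t i = 0; i < gray.size(); ++i) {
        out[i] = static_cast<std::uint8_t>((gray[i] >> plane) & 1);
    }
    return out;
}

// Second approach: 1 where lo <= value <= hi (inclusive bounds are an assumption).
inline std::vector<std::uint8_t> rangeThreshold(const std::vector<std::uint8_t>& gray, Range r) {
    std::vector<std::uint8_t> out(gray.size());
    for (std::size_t i = 0; i < gray.size(); ++i) {
        out[i] = (gray[i] >= r.lo && gray[i] <= r.hi) ? 1 : 0;
    }
    return out;
}

// Interleaved RGB: 1 only where every channel lies inside its own range.
inline std::vector<std::uint8_t> rangeThresholdRgb(const std::vector<std::uint8_t>& rgb,
                                                   Range r, Range g, Range b) {
    assert(rgb.size() % 3 == 0);
    std::vector<std::uint8_t> out(rgb.size() / 3);
    for (std::size_t p = 0; p < out.size(); ++p) {
        const std::uint8_t rv = rgb[3 * p], gv = rgb[3 * p + 1], bv = rgb[3 * p + 2];
        const bool inside = rv >= r.lo && rv <= r.hi && gv >= g.lo && gv <= g.hi &&
                            bv >= b.lo && bv <= b.hi;
        out[p] = inside ? 1 : 0;
    }
    return out;
}
```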

    Your connected components labeling program should work for binary, 8-bit gray scale images and 24-bit RGB color images.

    If you are NOT using the qtimg program and plug-ins, then use the following command line options. For command line processing your program should accept the following arguments:

    connected_comp [x_dim] [y_dim] [1|8|24] [component_range] input_image output_image


  3. Radiometric Enhancement via Histogram Processing. Write a qtimg plug-in (program) that implements the histogram processing algorithms described in class: (a) histogram equalization and (b) histogram specification for both grayscale (8-bit) and color (24-bit) images. For histogram specification your program will need a target image and/or target histogram in addition to the input image. The specified or target gray level distribution is the desired shape of the histogram that we want the transformed input image to have after modification. Note that in histogram equalization the implied target distribution is a uniform distribution of gray levels in the transformed (output) image.

    Histogram processing is a global transformation of the first order statistics (i.e. gray level or single channel probabilities) for the entire image that attempts to improve visual quality by increasing contrast and adjusting brightness levels. Due to the discrete nature of image histograms, the transformations are only approximations of the exact continuous functions. The advantage of discrete histogram transformations (such as equalization or specification) is that they can be implemented very efficiently as simple table lookups to achieve a wide range of effects.
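
    To make the table-lookup idea concrete, here is a minimal sketch of histogram equalization for one 8-bit channel, using the usual mapping of each gray level to 255 times the cumulative distribution at that level. The names Histogram, Lut, histogram, equalizationLut and applyLut are hypothetical, and rounding to the nearest integer is an assumption.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

using Histogram = std::array<std::size_t, 256>;
using Lut = std::array<std::uint8_t, 256>;

// Count how many pixels take each gray value.
inline Histogram histogram(const std::vector<std::uint8_t>& gray) {
    Histogram h{};
    for (std::uint8_t v : gray) ++h[v];
    return h;
}

// Equalization table: level g maps to 255 * (cumulative count up to g) / total.
inline Lut equalizationLut(const Histogram& hist) {
    std::size_t total = 0;
    for (std::size_t c : hist) total += c;
    assert(total > 0);  // an empty image has no distribution to equalize
    Lut lut{};
    std::size_t cum = 0;
    for (int g = 0; g < 256; ++g) {
        cum += hist[g];
        const double s = 255.0 * static_cast<double>(cum) / static_cast<double>(total);
        lut[g] = static_cast<std::uint8_t>(std::lround(s));
    }
    return lut;
}

// The whole transformation is then one lookup per pixel.
inline std::vector<std::uint8_t> applyLut(const std::vector<std::uint8_t>& gray, const Lut& lut) {
    std::vector<std::uint8_t> out(gray.size());
    for (std::size_t i = 0; i < gray.size(); ++i) out[i] = lut[gray[i]];
    return out;
}
```

    For a color image the same three functions could be run once per channel, which is one of several possible choices and should be stated in your write-up.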

    In the command line syntax below, the input_image and output_image arguments refer to the input and output image filenames as in previous assignments. The input_image_histograms and output_image_histograms arguments are the names of two output files for saving the input and output histograms in plain ASCII text form, which is useful for testing, graphical plotting, and display. Each line of the output text files should contain two numbers: grayvalue count (i.e. there should be 256 lines). The target_histogram can be specified in a similar manner as a table (input_grayvalue output_grayvalue), or as a piece-wise function, or via another image from which the histogram is computed. The latter method is useful, for example, when several images are mosaicked together and we need to balance the contrast and brightness across adjacent overlapping images so that the overlap regions and seams at the boundaries are neither obvious nor distracting. For color images three histograms for both input and output are involved.
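
    For histogram specification, one common construction matches cumulative distributions: each input level g is mapped to the smallest target level z whose cumulative fraction is at least that of g. The sketch below assumes both histograms are given as 256 counts (for example, the target computed from another image); the name specificationLut is hypothetical, and the fractions are compared by cross multiplication, which assumes each image has fewer than 2^32 pixels.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Histogram = std::array<std::uint64_t, 256>;

// Build the lookup table that reshapes `input` toward `target`.
inline std::array<std::uint8_t, 256> specificationLut(const Histogram& input,
                                                      const Histogram& target) {
    std::uint64_t totalIn = 0, totalTarget = 0;
    for (int g = 0; g < 256; ++g) {
        totalIn += input[g];
        totalTarget += target[g];
    }
    assert(totalIn > 0 && totalTarget > 0);

    std::array<std::uint8_t, 256> lut{};
    std::uint64_t cumIn = 0;
    std::uint64_t cumTarget = target[0];
    int z = 0;
    for (int g = 0; g < 256; ++g) {
        cumIn += input[g];
        // Advance z while targetCdf(z) < inputCdf(g). Both CDFs are
        // non-decreasing, so z never has to move backwards.
        while (z < 255 && cumTarget * totalIn < cumIn * totalTarget) {
            ++z;
            cumTarget += target[z];
        }
        lut[g] = static_cast<std::uint8_t>(z);
    }
    return lut;
}
```

    The resulting table is applied per pixel exactly like an equalization table; with a uniform target histogram the two methods give closely related mappings.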

    histo_equalize [x_dim] [y_dim] [8|24] input_image output_image input_image_histograms output_image_histograms

    histo_spec [x_dim] [y_dim] [8|24] [target_histogram | target_image] input_image output_image input_image_histograms output_image_histograms

    Test your histogram equalization algorithm using the four images of pollen shown in Fig 3.15(a)1. Test your histogram specification algorithm using the Phobos moon image in Fig. 3.20 and the desired histogram in Fig. 3.22(a); you can manually approximate the histogram curve using the graph in 3.22(a). You can also try to see how well histogram specification works using the five small squares image in Fig 3.23(a). Try to detect the little squares just using histogram specification without using local histogram enhancement. Note that this figure does not appear to be available on the book image link. We will try to get a replacement image.


  4. Chapter 3 Problems 3.3, 3.4, 3.6, 3.10, 3.18, 3.22 from textbook. Bonus: 3.26


  5. Convolution filtering for edge detection. Implement a plug-in for spatial filtering using a convolution operation with any 2x2 or 3x3 kernel. Your program should work correctly for both positive and negative filter weights, integer or floating point, and should compute the correct normalization if necessary to scale the output image appropriately. Some standard filters to test your program with include Roberts, Prewitt, Sobel, Laplacian, Gaussian and unsharp masking. Your program should work for grayscale and color images. Support several different methods for handling the edge or boundary effects from convolution. Estimate the local image gradient using one set of filters (e.g. Prewitt or Sobel) to compute the magnitude and direction. The local image gradient is often used to locate edges in an image.
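
    A bare-bones 3x3 convolution for a single 8-bit channel might look like the sketch below. The names Border and convolve3x3 are hypothetical, only two boundary methods (zero padding and edge replication) are shown, and the normalization rule (divide by the sum of the weights when that sum is nonzero) is one common convention rather than the required one. The output is kept in floating point so that negative responses survive until you decide how to scale them for display. 2x2 kernels such as Roberts are not covered here.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class Border { Zero, Replicate };

// True convolution (the kernel is flipped): out(x, y) = sum k(i, j) * f(x - i, y - j),
// where kernel[(j + 1) * 3 + (i + 1)] holds k(i, j) for i, j in {-1, 0, 1}.
inline std::vector<float> convolve3x3(const std::vector<std::uint8_t>& img, int width, int height,
                                      const std::array<float, 9>& kernel, Border border) {
    assert(img.size() == static_cast<std::size_t>(width) * static_cast<std::size_t>(height));
    // Smoothing kernels are divided by the sum of their weights; zero-sum
    // kernels such as Sobel or the Laplacian are left unscaled.
    float sum = 0.0f;
    for (float w : kernel) sum += w;
    const float norm = std::fabs(sum) > 1e-6f ? sum : 1.0f;

    std::vector<float> out(img.size());
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            float acc = 0.0f;
            for (int j = -1; j <= 1; ++j) {
                for (int i = -1; i <= 1; ++i) {
                    int sx = x - i, sy = y - j;
                    if (sx < 0 || sx >= width || sy < 0 || sy >= height) {
                        if (border == Border::Zero) continue;  // outside pixels count as 0
                        sx = std::min(std::max(sx, 0), width - 1);   // replicate nearest edge
                        sy = std::min(std::max(sy, 0), height - 1);
                    }
                    acc += kernel[(j + 1) * 3 + (i + 1)] * static_cast<float>(img[sy * width + sx]);
                }
            }
            out[y * width + x] = acc / norm;
        }
    }
    return out;
}
```

    For color images the function could be called once per channel after splitting the interleaved data.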

    The magnitude of the edge gradients for RGB images can be displayed as a single RGB image. Displaying the edge direction is more complex in order to handle the following requirements or constraints: (a) angle wraparound, 0 deg and 360 deg are the same, (b) restrict the range to say -Pi/2 to +Pi/2, (c) quantize the range, (d) use three grayscale or color edge direction images, (e) use a single RGB edge direction image that combines orientation information from each channel.
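
    The magnitude and direction computation, together with constraints (a) through (c), can be illustrated for one channel as follows. This is a sketch with hypothetical names (Gradient, sobelAt, foldToHalfTurn, quantizeOrientation). It assumes the Sobel masks with x increasing to the right and y increasing downward, handles interior pixels only, and treats directions that differ by 180 degrees as the same orientation when folding.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

constexpr float kPi = 3.14159265f;

struct Gradient {
    float magnitude;
    float direction;  // radians, as returned by atan2, in [-pi, pi]
};

// Sobel gradient at an interior pixel of a row-major 8-bit image.
inline Gradient sobelAt(const std::vector<std::uint8_t>& img, int width, int height, int x, int y) {
    assert(x >= 1 && y >= 1 && x < width - 1 && y < height - 1);
    auto p = [&](int dx, int dy) { return static_cast<float>(img[(y + dy) * width + (x + dx)]); };
    const float gx = (p(1, -1) + 2.0f * p(1, 0) + p(1, 1)) - (p(-1, -1) + 2.0f * p(-1, 0) + p(-1, 1));
    const float gy = (p(-1, 1) + 2.0f * p(0, 1) + p(1, 1)) - (p(-1, -1) + 2.0f * p(0, -1) + p(1, -1));
    return Gradient{std::hypot(gx, gy), std::atan2(gy, gx)};
}

// Fold an angle from [-pi, pi] into [-pi/2, pi/2), so that theta and
// theta + pi (the same line orientation) map to the same value.
inline float foldToHalfTurn(float theta) {
    if (theta >= kPi / 2) theta -= kPi;
    else if (theta < -kPi / 2) theta += kPi;
    return theta;
}

// Quantize a folded orientation into `bins` equal sectors numbered 0..bins-1.
inline int quantizeOrientation(float folded, int bins) {
    assert(bins > 0);
    const int bin = static_cast<int>(std::floor((folded + kPi / 2) / kPi * static_cast<float>(bins)));
    return std::min(std::max(bin, 0), bins - 1);
}
```

    The bin index can then be mapped through a color LUT, in the same way as the bit counts and component labels in the earlier assignments.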

    convolve [3x3_filter_coefficients | edge_detect] input_image output_image

  6. Spatial Frequency Filtering Using Fourier Transforms. Compute the 2-D Fourier transform of an input image and display the magnitude and phase of the transform coefficients. The output will usually need to be scaled (logarithmically) to improve the visual dynamic range. Explain your scaling transformation. Explain your method for computing the Fourier transform. What is the computational complexity of your implementation? Do your images need to be zero-padded? A sample program for computing the discrete Fourier transform is provided as a guideline: dft_fft.c
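
    To illustrate the questions above, here is a deliberately naive sketch that is independent of dft_fft.c. It computes the 2-D DFT by separability (rows, then columns) using a direct 1-D DFT, which costs on the order of N^3 operations for an N x N image and needs no zero-padding, whereas a radix-2 FFT would cost on the order of N^2 log N but requires power-of-two dimensions. The log scaling shown is s = c log(1 + |F|) with c chosen so that the largest magnitude maps to 255. The names dft1d, dft2d and logMagnitude are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <cstdint>
#include <vector>

using Complex = std::complex<double>;

// Direct 1-D DFT: X[k] = sum over t of x[t] * exp(-2*pi*i*k*t / n).
inline std::vector<Complex> dft1d(const std::vector<Complex>& in) {
    const double kTwoPi = 6.283185307179586;
    const std::size_t n = in.size();
    std::vector<Complex> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        Complex sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            const double angle = -kTwoPi * static_cast<double>(k * t) / static_cast<double>(n);
            sum += in[t] * Complex(std::cos(angle), std::sin(angle));
        }
        out[k] = sum;
    }
    return out;
}

// 2-D DFT of a row-major image: transform every row, then every column.
inline std::vector<Complex> dft2d(const std::vector<double>& img, int width, int height) {
    assert(img.size() == static_cast<std::size_t>(width) * static_cast<std::size_t>(height));
    std::vector<Complex> F(img.size());
    for (std::size_t i = 0; i < img.size(); ++i) F[i] = Complex(img[i], 0.0);
    for (int y = 0; y < height; ++y) {
        std::vector<Complex> row(F.begin() + y * width, F.begin() + (y + 1) * width);
        row = dft1d(row);
        std::copy(row.begin(), row.end(), F.begin() + y * width);
    }
    for (int x = 0; x < width; ++x) {
        std::vector<Complex> col(height);
        for (int y = 0; y < height; ++y) col[y] = F[y * width + x];
        col = dft1d(col);
        for (int y = 0; y < height; ++y) F[y * width + x] = col[y];
    }
    return F;
}

// Log-scale the magnitudes into 0..255 for display.
inline std::vector<std::uint8_t> logMagnitude(const std::vector<Complex>& F) {
    double maxMag = 0.0;
    for (const Complex& c : F) maxMag = std::max(maxMag, std::abs(c));
    std::vector<std::uint8_t> out(F.size(), 0);
    if (maxMag == 0.0) return out;  // all-zero spectrum stays black
    const double scale = 255.0 / std::log1p(maxMag);
    for (std::size_t i = 0; i < F.size(); ++i) {
        out[i] = static_cast<std::uint8_t>(std::lround(scale * std::log1p(std::abs(F[i]))));
    }
    return out;
}
```

    The phase image can be obtained analogously from std::arg of each coefficient, mapped from [-pi, pi] to the display range.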

    Using the DFT your program should support several types of filtering operations in the spatial frequency domain including ideal low pass, Butterworth, high pass, band pass, etc.
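
    The filters themselves are just arrays of weights that multiply the spectrum element by element. A sketch of the ideal and Butterworth transfer functions is given below, using the Butterworth low-pass form H = 1 / (1 + (D/D0)^(2n)) and taking each high-pass filter as one minus the corresponding low-pass filter. The names FilterType, butterworthLowPass, idealLowPass and makeFilter are hypothetical, and the filter is assumed to be applied to a spectrum that has been shifted so that zero frequency sits at (width/2, height/2).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

enum class FilterType { IdealLowPass, ButterworthLowPass, IdealHighPass, ButterworthHighPass };

// Butterworth low-pass response at distance d from the centre, cutoff d0, order n.
inline double butterworthLowPass(double d, double d0, int order) {
    assert(d0 > 0.0 && order >= 1);
    return 1.0 / (1.0 + std::pow(d / d0, 2.0 * order));
}

// Ideal low-pass response: pass everything inside the cutoff radius.
inline double idealLowPass(double d, double d0) {
    return d <= d0 ? 1.0 : 0.0;
}

// Build H(u, v) for a centred spectrum of the given size (row-major).
inline std::vector<double> makeFilter(int width, int height, FilterType type, double d0,
                                      int order = 2) {
    std::vector<double> H(static_cast<std::size_t>(width) * static_cast<std::size_t>(height));
    for (int v = 0; v < height; ++v) {
        for (int u = 0; u < width; ++u) {
            const double d = std::hypot(static_cast<double>(u - width / 2),
                                        static_cast<double>(v - height / 2));
            const bool butter = type == FilterType::ButterworthLowPass ||
                                type == FilterType::ButterworthHighPass;
            const double low = butter ? butterworthLowPass(d, d0, order) : idealLowPass(d, d0);
            const bool high = type == FilterType::IdealHighPass ||
                              type == FilterType::ButterworthHighPass;
            H[static_cast<std::size_t>(v) * width + u] = high ? 1.0 - low : low;
        }
    }
    return H;
}
```

    After multiplying the centred spectrum by H, the inverse transform gives the filtered image; a band-pass filter could be formed in the same spirit from the product of a high-pass and a low-pass response.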

    fourier_transform [log] input_image output_magnitude output_phase
    freq_filter [low|high|butter] input_image output_image

  7. Chapter 4 Problems from textbook: 4.1, 4.5, 4.7, 4.9, 4.10, 4.14, 4.21, 4.22

  8. Image Segmentation Using Thresholding: Implement a region based image segmentation program using Ohta's histogram based segmentation technique discussed in class. Evaluate how well your segmentation method works on complex images. Suggest methods by which you might improve the performance of your segmentation algorithm.

Programming Project Topics:

  1. Implement the Hough transform for detecting lines and circles (i.e. linear and circular segments) in an image.

  2. Compute a set of local image texture features such as energy, entropy, homogeneity, contrast, correlation, cluster using several different co-occurrence matrices for image segmentation.

  3. Implement binary morphological operations including dilation, erosion, opening, closing, etc that are then used for higher level processing such as boundary extraction, region filling, thinning, skeletons, etc.

  4. Compute and display the wavelet transform of an image and show its application for edge detection.

  5. Shape detection using moments invariant to translation, rotation and scale change and other shape measures.

  6. Computational photography/plenoptic camera - digital cameras are ubiquitous and can be used more flexibly in complex lighting and group photo situations by taking multiple images rapidly with different exposure, depth of field, dynamic range, zoom and other camera settings then digitally combining them to produce the image desired by the user.

  7. Panoramas and mosaics - develop an interactive image registration program for aligning two or more images together with sub-pixel precision given manual tie points.

  8. Realtime image processing algorithms using programmable graphics processing units (GPUs) via stream computations that are now readily available on modern graphics cards. Examples include video stabilization using frame to frame registration, feature tracking in video, optical flow motion estimation.

  9. Image interpolation, restoration, super-resolution, and in-painting for improving image quality using non-linear filters, deblurring using deconvolution, noise reduction, removing artifacts, blemishes, unwanted objects, etc.

  10. Visual masking for display, compression, watermarking, encryption/steganography. Transcoding images or video from one format to another such as JPEG2000 to TIFF. Using multi-page TIFFs to store image pyramids for multiresolution analysis.

  11. Image classification using machine learning and pattern recognition methods with applications to face recognition, license plate recognition, texture identification, object recognition.

Extra/optional assignments:

  1. Problems 2.2, 2.5, 4.3, 4.4, 4.14, 7.3 and 7.4 from textbook.

  2. Fast Median filtering and co-occurrence matrices. Implement spatial processing using an m x m median filter with optimization. Implement a fast technique for co-occurrence matrix computation for texture measures; some of the ideas from fast median filtering will be helpful in developing the algorithm.
    median [m | 3] input_image output_image

  3. Problems 3.16, 7.17, 7.18 from textbook.
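
For the fast median filtering in optional assignment 2, one well-known optimization is to keep a 256-bin histogram of the current window and, when the window slides one pixel to the right, remove only the column that leaves and add only the column that enters. The sketch below borrows that sliding-histogram idea (in the spirit of Huang's method) but finds the median by rescanning the histogram at each pixel, so it is simpler and somewhat slower than a full implementation that also tracks the running median. The name medianFilter is hypothetical, m is assumed odd, and the border is handled by replicating edge pixels.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

inline std::vector<std::uint8_t> medianFilter(const std::vector<std::uint8_t>& img, int width,
                                              int height, int m = 3) {
    assert(m >= 1 && m % 2 == 1);
    assert(img.size() == static_cast<std::size_t>(width) * static_cast<std::size_t>(height));
    const int r = m / 2;
    const int rank = (m * m) / 2;  // zero-based position of the median among m*m samples

    // Pixel fetch with coordinates clamped to the image (edge replication).
    auto at = [&](int x, int y) -> std::uint8_t {
        x = std::min(std::max(x, 0), width - 1);
        y = std::min(std::max(y, 0), height - 1);
        return img[static_cast<std::size_t>(y) * width + x];
    };

    std::vector<std::uint8_t> out(img.size());
    for (int y = 0; y < height; ++y) {
        // Histogram of the full window centred on the first pixel of the row.
        std::array<int, 256> hist{};
        for (int dy = -r; dy <= r; ++dy)
            for (int dx = -r; dx <= r; ++dx) ++hist[at(dx, y + dy)];

        for (int x = 0; x < width; ++x) {
            if (x > 0) {
                // Slide right: drop the column that left, add the one that entered.
                for (int dy = -r; dy <= r; ++dy) {
                    --hist[at(x - r - 1, y + dy)];
                    ++hist[at(x + r, y + dy)];
                }
            }
            // Walk the histogram until the cumulative count passes the median rank.
            int cum = 0, g = 0;
            while (cum + hist[g] <= rank) {
                cum += hist[g];
                ++g;
            }
            out[static_cast<std::size_t>(y) * width + x] = static_cast<std::uint8_t>(g);
        }
    }
    return out;
}
```

The same update pattern (subtract what leaves the window, add what enters) carries over to accumulating co-occurrence counts over a sliding window.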


Example Program information


Submitting Assignments