The literature survey

What is an image, and what types exist?

Nowadays "image" is a familiar word, but many different things fall under the label. These include maps, diagrams, pictures, statues, projections, poems, patterns, optical illusions, dreams, hallucinations, spectacles, memories, and even ideas. Calling all of these things images does not necessarily mean that they have something in common. A digital image, however, is simply an array of pixels [7]. Images are an important source of information in the World Wide Web era. Storage and communication technologies have advanced enormously, and stored information is no longer limited to text: images and videos now play a major role and account for a large share of it.

This stored information is useless unless it can be searched and indexed effectively. Large databases contain both textual and image information. Extracting information from these databases and searching them in minimal time and with maximum recall is one of the major issues today [1]. Images can be classified by how each pixel is represented. A binary image is the simplest type: it contains only black and white, and each pixel is represented by a single bit. A grayscale image contains a range of shades, or intensities, varying from black at the weakest intensity (total absence) to white at the strongest (total presence). In a grayscale image each pixel's value is a single sample. Grayscale images have many shades of gray between white and black and contain no chromatic variation [3].
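The difference between the two image types can be sketched in a few lines. This is only an illustration: the function name `to_binary` and the threshold of 128 are assumptions made for the example, not something prescribed by the cited sources.

```python
def to_binary(gray, threshold=128):
    """Map a grayscale image (rows of 0..255 samples) to a binary image.

    Each output pixel needs only one bit: 1 for white, 0 for black.
    The threshold value is an assumption chosen for this sketch.
    """
    return [[1 if pixel >= threshold else 0 for pixel in row] for row in gray]


# A tiny 2x3 grayscale image: one intensity sample per pixel.
gray = [
    [0, 64, 128],
    [192, 255, 32],
]
binary = to_binary(gray)
```

The grayscale image stores one sample per pixel, while the binary image keeps a single bit per pixel.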

Image global and local features

Image models

Image models can be categorized as pixel-based or region-based. Pixel-based models are further divided into three types: syntactic models, one-dimensional time series models, and random field models. Random field models in turn incorporate eight global and local properties of an image.

What is retrieval? How is it measured?

An image retrieval system is a computer system for browsing, searching, and retrieving images from a large database of digital images [3]. The need for a desired image or piece of information is captured in the form of a query and presented to the retrieval system. Image retrieval is a rapidly growing field of prime importance in multimedia technology. A typical image retrieval process extracts and processes meaningful information about images from a distributed image collection in order to characterize the images in that collection. These features are then manipulated quickly and intelligently and matched against the processed query features to measure similarity. The user refines the results through feedback and further specification [5]. Nowadays the focus is on bridging the semantic gap between the low-level features a machine can extract and the meaning a user has in mind.

Current IR systems mostly use low-level features (Fig. A), which a machine can extract automatically. Users, however, prefer retrieval at the semantic level, which is hard to achieve because image collections are heterogeneous and growing fast [4].
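Recall, mentioned earlier as a goal for database search, is usually reported together with precision when retrieval quality is measured. The sketch below shows the standard definitions; the function name and the example image identifiers are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant images retrieved / all images retrieved
    recall    = relevant images retrieved / all relevant images
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: four images returned, three images actually relevant.
p, r = precision_recall(["a", "b", "c", "d"], ["a", "c", "e"])
```

Here two of the four returned images are relevant, so precision is one half, and two of the three relevant images were found, so recall is two thirds.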

Information Retrieval Systems & Image Retrieval Systems.

Easy web hosting and low-cost storage have turned the ordinary person from a passive consumer of photography into an active producer. Today image data is no longer concentrated in one location; it is growing rapidly, with extremely varied visual and semantic content. All of these factors have created numerous possibilities and considerations for designers of real-world image search systems [8].

The first method of image retrieval relies on textual annotation, which is further divided into manual annotation and automatic annotation of images.

Content-based image retrieval (CBIR) is the second method. It avoids textual descriptions and instead retrieves images based on visual similarity to a user-supplied query image or to user-specified image features.

Content-based image retrieval systems

Content-based image retrieval uses the visual contents of an image, such as color, shape, texture, and spatial layout, to represent and index the image. In a typical content-based image retrieval system, the retrieval process runs as follows. First, the visual contents of the images in the database are extracted and described by multi-dimensional feature vectors, which together form a feature database. To query the system, users provide example images or sketched figures; queries can take several forms. The system converts the query into its internal feature-vector representation and computes the similarity or distance between the query vector and the vectors in the feature database. Retrieval is then performed with the help of an indexing scheme, which provides an efficient way to search the image database. Users' relevance feedback is also incorporated in image retrieval systems to modify the retrieval process and produce more meaningful results [11]. If we break the image retrieval process into phases, feature extraction is the first step in most IR systems. The extracted features can be local, such as shape and texture, or global, such as color and the color histogram.
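The retrieval process described above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions: the feature vector is just the mean of each RGB channel (a stand-in for real descriptors), the feature database is scanned linearly instead of using an index, and all names and sample images are hypothetical.

```python
import math


def extract_features(pixels):
    """Toy feature vector: the mean of each RGB channel.

    A placeholder (assumption) for real color, shape, or texture descriptors.
    """
    n = len(pixels)
    return [sum(p[c] for p in pixels) / n for c in range(3)]


def euclidean(u, v):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def build_feature_database(images):
    """Extract one feature vector per database image."""
    return {name: extract_features(pixels) for name, pixels in images.items()}


def retrieve(query_pixels, feature_db, top_k=2):
    """Rank database images by distance to the query's feature vector."""
    q = extract_features(query_pixels)
    ranked = sorted(feature_db, key=lambda name: euclidean(q, feature_db[name]))
    return ranked[:top_k]


# Hypothetical two-pixel "images" given as lists of (R, G, B) tuples.
images = {
    "sunset": [(250, 120, 30), (240, 100, 20)],
    "forest": [(20, 160, 40), (30, 150, 50)],
    "ocean": [(10, 60, 200), (20, 80, 220)],
}
db = build_feature_database(images)
results = retrieve([(245, 110, 25), (235, 115, 35)], db)
```

A real system would replace the linear scan with an indexing structure and would feed the user's relevance feedback back into the ranking step.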

Color-based feature extraction: image content descriptors

Color Space

A pixel is a single point of an image in a 3D color space. Several color spaces are commonly used for image retrieval: RGB, CIE L*a*b*, CIE L*u*v*, HSV (or HSL, HSB), and the opponent color space. There is no agreement on which color space is best, but a desirable property of a color space is uniformity. Uniformity means that two color pairs separated by the same distance in the color space are perceived by viewers as equally different.

RGB space is composed of three basic color components and is a widely used color space for image display. These components are called "additive primaries" since a color in RGB space is produced by adding them together. In contrast, CMY space is a color space primarily used for printing. The three color components are cyan, magenta, and yellow. These three components are called "subtractive primaries" since a color in CMY space is produced through light absorption. Both RGB and CMY space are device-dependent and perceptually non-uniform. The CIE L*a*b* and CIE L*u*v* spaces are device independent and considered to be perceptually uniform. They consist of a luminance or lightness component (L) and two chromatic components a and b or u and v. CIE L*a*b* is designed to deal with subtractive colorant mixtures, while CIE L*u*v* is designed to deal with additive colorant mixtures.

The HSV (or HSL, or HSB) space is widely used in computer graphics and is a more intuitive way of describing color. The three color components are hue, saturation, and value (brightness). The hue is invariant to changes in illumination and camera direction and is hence more suited to object retrieval. RGB coordinates can be easily translated to HSV (or HSL, or HSB) coordinates. The opponent color space uses the opponent color axes (R-G, 2B-R-G, R+G+B). This representation has the advantage of isolating the brightness information on the third axis. With this solution, the first two chromaticity axes, which are invariant to changes in illumination intensity and shadows, can be down-sampled since humans are more sensitive to brightness than they are to chromatic information [11].
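Both conversions mentioned above are short enough to sketch. The HSV conversion uses `colorsys.rgb_to_hsv` from the Python standard library, which expects components in the range 0 to 1; the opponent axes follow the formula given in the text. The wrapper names and sample colors are assumptions made for the example.

```python
import colorsys


def rgb_to_hsv_255(r, g, b):
    """Convert 8-bit RGB to HSV, with each HSV component in [0, 1]."""
    return colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)


def rgb_to_opponent(r, g, b):
    """Opponent color axes as given in the text: (R-G, 2B-R-G, R+G+B)."""
    return (r - g, 2 * b - r - g, r + g + b)


# The same red surface under strong and weak illumination.
bright_red = rgb_to_hsv_255(255, 0, 0)
dark_red = rgb_to_hsv_255(128, 0, 0)

opponent = rgb_to_opponent(200, 100, 50)
```

The two reds share the same hue and differ only in value, which illustrates why hue is attractive for object retrieval. In the opponent result, the third component carries the brightness.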

Color is possibly the most distinctive and dominant visual feature in content-based image retrieval, and the color histogram is the most widely used descriptor in CBIR. A color histogram captures the distribution of colors in an image. Its advantage is that it is easy to compute. Its drawback is that the resulting large feature vectors are hard to index and costly to search and retrieve. It also lacks spatial information: for example, the histogram of an image with a red blob on a green background is identical to that of an image containing the same numbers of red and green pixels distributed at random. Several recently proposed color descriptors try to integrate spatial information with the color histogram [10].
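The red-blob example can be checked directly. In this sketch the two 4x4 images and the use of color names instead of pixel values are assumptions made to keep the example readable.

```python
from collections import Counter


def color_histogram(image):
    """Count how many pixels of each color an image contains."""
    return Counter(pixel for row in image for pixel in row)


R, G = "red", "green"

# A compact red blob on a green background.
blob = [
    [G, G, G, G],
    [G, R, R, G],
    [G, R, R, G],
    [G, G, G, G],
]

# The same number of red pixels, scattered over the image.
scattered = [
    [R, G, G, G],
    [G, G, R, G],
    [G, G, G, R],
    [G, R, G, G],
]

same_histogram = color_histogram(blob) == color_histogram(scattered)
```

The two histograms are equal even though the images look quite different, because the histogram discards every pixel position.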

Color Moments

Color moments are particularly useful when an image contains objects. They have been used successfully in many systems, such as QBIC. Three moments are used: the first-order moment (mean), the second-order moment (variance), and the third-order moment (skewness). Color moments have been shown to be efficient and effective in representing the color distributions of images.

Color moments usually perform better when they are defined in both the L*u*v* and L*a*b* color spaces. They give a compact representation of an image at the cost of discriminative power. The advantage is that only 9 numbers (three moments for each of the three color components) are needed to represent the color content of an image.
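The nine-number descriptor can be sketched as follows. The skewness here is the signed cube root of the third central moment, which is one common convention and an assumption of this example; the function name and sample pixels are hypothetical, and the channels are treated as generic components rather than a specific color space.

```python
def color_moments(pixels):
    """Return 9 numbers: (mean, variance, skewness) for each of 3 channels.

    Skewness is taken as the signed cube root of the third central moment
    (an assumption; other normalizations exist).
    """
    n = len(pixels)
    features = []
    for c in range(3):
        values = [p[c] for p in pixels]
        mean = sum(values) / n
        variance = sum((v - mean) ** 2 for v in values) / n
        third = sum((v - mean) ** 3 for v in values) / n
        sign = 1.0 if third >= 0 else -1.0
        skewness = sign * abs(third) ** (1.0 / 3.0)
        features.extend([mean, variance, skewness])
    return features


# Three hypothetical pixels with three color components each.
moments = color_moments([(0, 10, 5), (2, 10, 5), (4, 10, 8)])
```

However many pixels the image contains, the result always has nine entries, which is what makes the descriptor compact.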

Color Histogram

The color histogram serves as an effective representation of the color content of an image if the color pattern is unique compared with the rest of the data set. The color histogram is easy to compute and effective in characterizing both the global and local distribution of colors in an image. In addition, it is robust to translation and rotation about the view axis and changes only slowly with the scale, occlusion and viewing angle. Since any pixel in the image can be described by three components in a certain color space (for instance, red, green, and blue components in RGB space, or hue, saturation, and value in HSV space), a histogram, i.e., the distribution of the number of pixels for each quantized bin, can be defined for each component. Clearly, the more bins a color histogram contains, the more discrimination power it has. However, a histogram with a large number of bins will not only increase the computational cost, but will also be inappropriate for building efficient indexes for image databases. Furthermore, a very fine bin quantization does not necessarily improve the retrieval performance in many applications. One way to reduce the number of bins is to use the opponent color space which enables the brightness of the histogram to be down sampled. Another way is to use clustering methods to determine the K best colors in a given space for a given set of images. Each of these best colors will be taken as a histogram bin. Since that clustering process takes the color distribution of images over the entire database into consideration, the likelihood of histogram bins in which no or very few pixels fall will be minimized. Another option is to use the bins that have the largest pixel numbers since a small number of histogram bins capture the majority of pixels of an image. Such a reduction does not degrade the performance of histogram matching, but may even enhance it since small histogram bins are likely to be noisy.

When an image database contains a large number of images, histogram comparison will saturate the discrimination. To solve this problem, the joint histogram technique is introduced . In addition, color histogram does not take the spatial information of pixels into consideration, thus very different images can have similar color distributions. This problem becomes especially acute for large scale databases. To increase discrimination power, several improvements have been proposed to incorporate spatial information. A simple approach is to divide an image into sub-areas and calculate a histogram for each of those sub-areas. As introduced above, the division can be as simple as a rectangular partition, or as complex as a region or even object segmentation. Increasing the number of sub-areas increases the information about location, but also increases the memory and computational time.
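Bin quantization and the rectangular partition into sub-areas can both be sketched briefly. The uniform quantization into equal-width bins per channel, the 2x2 grid, and all names are assumptions made for the example; clustering-based bin selection is not shown.

```python
def quantized_histogram(pixels, bins_per_channel=2):
    """Histogram over uniformly quantized RGB bins (bins_per_channel ** 3 bins)."""
    bins = bins_per_channel
    hist = [0] * (bins ** 3)
    for r, g, b in pixels:
        # Map each 0..255 component to a bin index 0..bins-1.
        ri, gi, bi = r * bins // 256, g * bins // 256, b * bins // 256
        hist[(ri * bins + gi) * bins + bi] += 1
    return hist


def grid_histograms(image, rows=2, cols=2, bins_per_channel=2):
    """One histogram per rectangular sub-area, in row-major order."""
    h, w = len(image), len(image[0])
    result = []
    for i in range(rows):
        for j in range(cols):
            block = [
                image[y][x]
                for y in range(i * h // rows, (i + 1) * h // rows)
                for x in range(j * w // cols, (j + 1) * w // cols)
            ]
            result.append(quantized_histogram(block, bins_per_channel))
    return result


# A hypothetical 2x2 image of (R, G, B) pixels.
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 0, 0)],
]
global_hist = quantized_histogram([p for row in image for p in row])
local_hists = grid_histograms(image)
```

The global histogram has 8 bins, while the grid version stores four such histograms, which illustrates the memory cost of adding location information.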

Color Coherence Vector

A different way of incorporating spatial information into the color histogram is the color coherence vector (CCV). The pixels in each histogram bin are partitioned into two types: coherent, if the pixel belongs to a large uniformly-colored region, or incoherent, if it does not. Let $\alpha_i$ denote the number of coherent pixels in the $i$-th color bin and $\beta_i$ denote the number of incoherent pixels in that bin. Then the CCV of the image is defined as the vector $\langle(\alpha_1, \beta_1), (\alpha_2, \beta_2), \ldots, (\alpha_N, \beta_N)\rangle$. Note that $\langle\alpha_1+\beta_1, \alpha_2+\beta_2, \ldots, \alpha_N+\beta_N\rangle$ is the color histogram of the image.

Due to its additional spatial information, it has been shown that CCV provides better retrieval results than the color histogram, especially for those images which have either mostly uniform color or mostly texture regions. In addition, for both the color histogram and color coherence vector representation, the HSV color space provides better results than CIE L*u*v* and CIE L*a*b* space.
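A color coherence vector can be computed by labelling connected regions of equal color and comparing their sizes with a threshold. The sketch below is simplified: it assumes the image is already quantized to color indices, uses 4-connectivity, and takes the size threshold `tau` as a parameter. These choices and all names are assumptions of the example, not details taken from the cited work.

```python
def color_coherence_vector(image, num_colors, tau):
    """Return [(coherent, incoherent), ...] pixel counts, one pair per color bin.

    image: 2D list of quantized color indices in range(num_colors).
    A pixel is coherent if its 4-connected same-color region has >= tau pixels.
    """
    h, w = len(image), len(image[0])
    seen = [[False] * w for _ in range(h)]
    coherent = [0] * num_colors
    incoherent = [0] * num_colors
    for y in range(h):
        for x in range(w):
            if seen[y][x]:
                continue
            color = image[y][x]
            # Flood fill to measure the size of this connected region.
            stack = [(y, x)]
            seen[y][x] = True
            size = 0
            while stack:
                cy, cx = stack.pop()
                size += 1
                for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not seen[ny][nx] and image[ny][nx] == color):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            if size >= tau:
                coherent[color] += size
            else:
                incoherent[color] += size
    return list(zip(coherent, incoherent))


image = [
    [0, 0, 1],
    [0, 0, 1],
    [1, 0, 0],
]
ccv = color_coherence_vector(image, num_colors=2, tau=3)
histogram = [a + b for a, b in ccv]
```

Summing each pair recovers the plain color histogram, which matches the relationship stated above.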

Color Correlogram

The color correlogram was proposed to characterize not only the color distributions of pixels, but also the spatial correlation of pairs of colors. The first and the second dimension of the three-dimensional histogram are the colors of any pixel pair and the third dimension is their spatial distance. A color correlogram is a table indexed by color pairs, where the k-th entry for (i, j) specifies the probability of finding a pixel of color j at a distance k from a pixel of color i in the image.

Color plays a significant role in image retrieval. Different color representation schemes include red-green-blue (RGB), the chromaticity and luminance system of the CIE (International Commission on Illumination), hue-saturation-intensity (HSI), and others. The RGB scheme is most commonly used in display devices, so digital images typically employ this format. The HSI scheme more accurately reflects the human perception of color. All perceivable colors can be reproduced by a proper combination of red, green and blue components. A 24-bit per pixel RGB color image may have 2^24, or approximately 16.7 million, distinct colors. To reduce the number of colors for efficient image processing, colors are quantized with a suitable algorithm [12].
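The color correlogram table defined above can be sketched by brute force. The example assumes an image already quantized to color indices and measures distance with the chessboard (L-infinity) norm; the choice of norm, the handling of image borders, and all names are assumptions of the sketch.

```python
def color_correlogram(image, num_colors, distances):
    """Return {(i, j, k): probability} for quantized color indices i, j.

    Entry (i, j, k) estimates the probability that a pixel at chessboard
    distance exactly k from a pixel of color i has color j.
    Neighbors falling outside the image are ignored.
    """
    h, w = len(image), len(image[0])
    table = {}
    for k in distances:
        counts = [[0] * num_colors for _ in range(num_colors)]
        totals = [0] * num_colors
        for y in range(h):
            for x in range(w):
                ci = image[y][x]
                for dy in range(-k, k + 1):
                    for dx in range(-k, k + 1):
                        if max(abs(dy), abs(dx)) != k:
                            continue  # keep only pixels at distance exactly k
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w:
                            counts[ci][image[ny][nx]] += 1
                            totals[ci] += 1
        for i in range(num_colors):
            for j in range(num_colors):
                table[(i, j, k)] = counts[i][j] / totals[i] if totals[i] else 0.0
    return table


# A 2x2 checkerboard of two colors.
checkerboard = [
    [0, 1],
    [1, 0],
]
correlogram = color_correlogram(checkerboard, num_colors=2, distances=[1])
```

On the checkerboard, a neighbor of a color-0 pixel is more likely to have color 1 than color 0, which a plain histogram cannot express. This brute-force version is only meant for small images.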


Texture is a visual pattern that has properties of homogeneity that do not result from a single intensity or color. It is a visual pattern in which a large number of visible elements are densely and evenly arranged. A texture element is a uniform-intensity region of simple shape that is repeated. Texture can be analyzed at the pixel-window level or at the texture-element level. The former approach is called statistical analysis and the latter structural analysis. Generally, structural analysis is used when texture elements can be clearly identified, whereas statistical analysis is used for fine (micro) textures. Statistical measures characterize the variation of intensity in a texture window. Example measures include contrast (high-contrast zebra skin versus low-contrast elephant skin), coarseness (dense pebbles versus coarse stones), and directionality (directed fabric versus undirected lawn). Fourier spectra are also used to characterize textures: the Fourier transform of a texture window yields a signature, and windows with the same or similar signatures can be combined to form texture regions. Structural texture analysis extracts texture elements in the image, determines their shapes, and estimates their placement rules. Placement rules describe how the texture elements are placed relative to each other in the image and include measures such as the number of immediate neighbors (connectivity), the number of elements in unit space (density), and whether they are laid out homogeneously (regularity). By analyzing the deformations in the shapes of the texture elements and their placement rules, more information can be obtained about the scene or the objects in it. For instance, a density increase and size decrease along a certain direction might indicate an increase in distance in that direction [12].

How does a large-scale image search engine work?

Some retrieval systems combine keyword-based retrieval with perception-based retrieval and work in iterations. The proposed perception-based retrieval system improves results by returning more relevant images. It uses relevance feedback and sampling to refine the results over successive iterations, and thus uses active learning to capture subjective and complex query concepts. Its two main components are a multi-resolution image-feature extractor and a high-dimensional indexer. Both components help make query-concept learning and image retrieval efficient [12].

Present image retrieval systems

Block truncation coding (BTC) was first developed for grayscale image coding and was later extended to color images. The method separates the color image into its three components R, G, and B. A threshold is set by taking the mean of the interband average image (IBAI), which is the average of all three components. Each pixel is compared with the threshold to create a bitmap: if a pixel in the interband average image is greater than or equal to the threshold, the corresponding position of the bitmap has a value of 1; otherwise it has a value of 0. Two mean colors are also calculated, one for the pixels greater than or equal to the threshold and one for the pixels smaller than the threshold. The equations for computing these means are given in [13]. The two means, the Upper Mean and the Lower Mean, together form the signature (feature vector) of an image. These features are stored in a feature vector table, and the process is repeated for every image in the database. When a query image is given to the CBIR system, its feature vector is computed and compared with the table entries to find the best match. Image retrieval applications based on block truncation coding use the direct Euclidean distance as the similarity measure.

The disadvantage of this technique is that the most prominent component dominates the threshold value and reduces the effect of the other components. A more general approach treats the R, G, and B components of a color image independently: an individual threshold is created for each component, and BTC is then applied to each of the R, G, and B planes separately. The thresholds for the R, G, and B planes are denoted TR, TG, and TB and are computed per plane as described in [13].
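The per-plane variant can be sketched as follows. The sketch assumes that each threshold (TR, TG, TB) is the mean of its own plane, by analogy with the IBAI threshold described above; the exact equations are in [13] and are not reproduced here. The function names, the fallback for a perfectly flat plane, and the sample pixels are assumptions of the example.

```python
import math


def btc_features(pixels):
    """Return [upper_R, lower_R, upper_G, lower_G, upper_B, lower_B].

    pixels: list of (R, G, B) tuples.
    Assumption: the threshold of each plane is the mean of that plane.
    """
    n = len(pixels)
    features = []
    for c in range(3):
        plane = [p[c] for p in pixels]
        threshold = sum(plane) / n
        upper = [v for v in plane if v >= threshold]  # bitmap value 1
        lower = [v for v in plane if v < threshold]   # bitmap value 0
        upper_mean = sum(upper) / len(upper)  # never empty: max >= mean
        # A flat plane has no pixel below its mean; reuse the upper mean (assumption).
        lower_mean = sum(lower) / len(lower) if lower else upper_mean
        features.extend([upper_mean, lower_mean])
    return features


def euclidean(u, v):
    """Direct Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# A hypothetical four-pixel image.
pixels = [(10, 200, 0), (30, 100, 0), (50, 100, 0), (70, 200, 0)]
signature = btc_features(pixels)
```

Comparing a query signature with every stored signature by `euclidean` and keeping the smallest distances gives the matching step described above.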

What are indexing techniques?

Latent semantic indexing (LSI) is based on matrix computations over images. Most images are stored as raster images, so under LSI an image can be viewed as a vector of pixels, in which each vector element plays the role of a keyword. When a person sees an image, he or she does not think about pixels but automatically extracts the features that define the meaning of the image, namely the objects in it.

LSI focuses on the semantics of documents. The search relies on matrix computation, in particular the singular value decomposition (SVD), so that two semantically similar documents are located close together in the multidimensional space [6].
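The decomposition mentioned above can be written out in standard linear-algebra notation. This is a sketch of the usual LSI formulation; the symbols $A$, $U$, $\Sigma$, $V$, $k$, $q$, and $\hat{d}_j$ are assumptions introduced here for illustration and are not taken from [6].

```latex
% A is an m x n matrix whose j-th column is document j
% (here: image j flattened into a vector of m pixels).
A = U \Sigma V^{T}

% Keep only the k largest singular values (rank-k approximation).
A_k = U_k \Sigma_k V_k^{T}, \qquad k \ll \min(m, n)

% Project a query vector q into the same k-dimensional space.
\hat{q} = \Sigma_k^{-1} U_k^{T} q

% Compare the query with document j, whose coordinates \hat{d}_j
% are the j-th row of V_k, using the cosine of the angle between them.
\mathrm{sim}(\hat{q}, \hat{d}_j) =
  \frac{\hat{q}^{T} \hat{d}_j}{\lVert \hat{q} \rVert \, \lVert \hat{d}_j \rVert}
```

Documents whose reduced coordinates point in nearly the same direction as the query get a similarity close to 1, which is the sense in which similar documents are closely located.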


Rhetorical Structure Theory (RST) is a linguistic theory that explains the relationships between parts of a text. It was originally developed as part of studies of computer-based text generation. RST offers an explanation of the coherence of texts and describes the structure of a text in terms of building blocks of various sorts. These blocks are on two levels: coherence relations and schemas [15].

Scope of RST?

RST is not limited to text structure. Its scope has been extended to multimedia objects, including audio, video, text, graphics, and images. The proposed model for multimedia objects with respect to RST is (N, S(t,n), MFi), where N is the nucleus and S(t,n) is the satellite or set of satellites, which provides a solution for large objects. A satellite contains the text that supports the nucleus: t is the text part (any word or sentence related to the multimedia object), and n is the number of satellites affiliated with the nucleus. MFi is the multimedia framework, which provides two basic components, Mi = (D, F): D is the raw data of the object and F is the set of features associated with the multimedia object [16].
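The tuple (N, S(t,n), MFi) can be pictured as a small data structure. The sketch below is one possible reading of the model, not the representation used in the cited paper: the class names, field names, and sample values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class MultimediaFramework:
    """Mi = (D, F): raw data of the object plus its associated features."""
    data: Any                                           # D
    features: List[str] = field(default_factory=list)   # F


@dataclass
class Satellite:
    """A piece of text t that supports the nucleus."""
    text: str


@dataclass
class RSTMultimediaObject:
    """(N, S(t, n), MFi) as read from the description above."""
    nucleus: str                   # N
    satellites: List[Satellite]    # S(t, n)
    framework: MultimediaFramework  # MFi

    @property
    def n(self) -> int:
        """Number of satellites affiliated with the nucleus."""
        return len(self.satellites)


# Hypothetical image object with two supporting text fragments.
obj = RSTMultimediaObject(
    nucleus="sunset photo",
    satellites=[Satellite("Taken at the beach"), Satellite("Warm orange tones")],
    framework=MultimediaFramework(data=b"\x00\x01", features=["color histogram"]),
)
```

In this reading, n is derived from the satellite list rather than stored separately, so the two can never disagree.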


  1. Image Retrieval Using Augmented Block Truncation Coding Techniques
  4. J. P. Eakins. Automatic image content retrieval - are we getting anywhere? In Proc. of Third International Conference on Electronic Library and Visual Information Research, pages 123-135, May 1996.
  5. Methods for Image Retrieval by Zoran Pečenović, Minh Do, Serge Ayer, Martin Vetterli, Laboratory for Audio-Visual Communications, Swiss Federal Institute of Technology (EPFL), CH-1015 Lausanne, Switzerland
  6. Latent Semantic Indexing for Image Retrieval Systems by Pavel Praks, Jiří Dvorský, Václav Snášel
  7. Image Models by Narendra Ahuja and B. J. Schachter, Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801; General Electric Co., Daytona Beach, Florida 32014
  8. Image Retrieval: Ideas, Influences, and Trends of the New Age by Ritendra Datta, Dhiraj Joshi, Jia Li, and James Z. Wang
  9. Content-Based Image Retrieval - Approaches and Trends of the New Age
  10. Fundamentals of Content-Based Image Retrieval by Dr. Fuhui Long, Dr. Hongjiang Zhang and Prof. David Dagan Feng
  11. Techniques and Systems for Image and Video Retrieval by Y. Alp Aslandogan, Clement
  12. Wei-Cheng Lai, Edward Chang, and Kwang-Ting (Tim) Cheng, Morpho Software Inc.
  13. Image Retrieval Using Augmented Block Truncation Coding Techniques by H. B. Kekre, Sudeep D. Thepade
  14. An Introduction to Rhetorical Structure Theory by Bill Mann, August 1999
  15. A Paradigm for Rhetorical Structure Theory Based Relations to Accommodate Multimedia Objects by Dr. Muhammad Shoaib and S. Khaldoon
