METHOD AND SYSTEM FOR DETECTING AND CLASSIFYING OBJECTS IN AN IMAGE
A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
The present invention relates to image detection, image recognition and other types of computer vision. More particularly, the present invention provides a system with improved ability to classify objects in an image.
1. Image Recognition and Computer Vision in General
Image detection, image recognition and computer vision in general refer to the abilities of electronic systems to detect the presence of predefined objects within an image and, if appropriate, to then take actions based upon that detection. Present day applications of these systems include fingerprint, hand or retina identification, identification of faces in crowd photos or of types and numbers of enemy aircraft in a satellite photo, and similar applications. Some of these systems are "image detection" systems, while others can be "image recognition" systems. An image detection system generally does not need to recognize a specific match, but simply the fact that there is some sort of match; for example, an infrared fingerprint scanner used to open a door lock might simply determine whether a scanned fingerprint is present in a fingerprint directory, without determining exactly which fingerprint in the directory was matched. An image recognition system might go further, e.g., matching a fingerprint with a fingerprint of a specific criminal. These examples are illustrative only, and there are many applications of image detection, image recognition and computer vision.
The aforementioned systems usually require one or more "target" images, which can be acquired from a camera, infrared scanner, stored image, satellite or other source. In addition, these systems also generally require a directory (or database) of known images that a computer is to search for within a target image. The computer can use various methods to perform this searching, including intensity or feature based methods, among others. The computer can first preprocess the target image, to identify "candidate" portions which might contain a match with a directory image. If desired, the computer can then focus on the candidate portions and limit further processing to such portions only.
The processing required of the computer can become quite complex if the directory is large; for example, a typical directory might have several thousand images. Since a target image can feature distortions such as light variation, shading effects, perspective orientation and other circumstances that render detection and recognition difficult, the processes and time required of the computer can be enormous and perhaps too demanding for real-time processing. A common design goal, therefore, is to produce systems which perform very efficiently and are less computationally demanding.
2. Use of Vectors in Image Recognition and Computer Vision
Systems typically use vectors to describe directory images; the vectors can represent individual features, or the vectors can describe other aspects of directory images. Each directory image is represented by a different combination of the vectors, and these systems generally use mathematical shortcuts whereby they can process the target image all at once, for all directory images (instead of searching the target image again and again, once for each directory image).
Some systems which use vectors in this manner use "eigenvectors." These vectors are derived from preprocessing of all the directory images, and they share the property that each eigenvector is independent from every other eigenvector. In this context, each eigenvector is simply a set of numbers, perhaps as many as four hundred numbers or more, one for each pixel of the largest directory image. For example, if the directory had images that were each twenty pixels by twenty pixels, each directory image can be said to be represented by a vector having four hundred numbers. Eigenvectors would be derived in this instance by a computer subroutine which determines commonalities and differences between different images in the directory, with an eigenvector for each different dimension of image commonalities and differences; the eigenvectors would in this instance also have four hundred numbers and could also be thought of as a sort of building block image. A value related to each eigenvector is an eigenvalue, that is, a single value indicating how strong or significant the associated eigenvector is across all images of the directory. In a different vector system (where, for example, each vector could represent a feature of an image, e.g., a nose, an eye, etc.), an analogous value might be how important the associated feature is in describing the images in the directory.
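By way of illustration only, the derivation of eigenvectors and eigenvalues from a directory of twenty pixel by twenty pixel images can be sketched as follows. The sketch assumes the NumPy library, a randomly generated stand-in directory of fifty images, and a hypothetical helper named directory_eigenvectors, none of which appears in this disclosure. It uses principal component analysis by way of a singular value decomposition, which is one common way of obtaining such vectors and is not necessarily the subroutine referred to above.

```python
import numpy as np


def directory_eigenvectors(images):
    """Derive eigenvectors and eigenvalues from a stack of directory images.

    images: array of shape (n_images, height, width).
    Returns (mean, eigenvectors, eigenvalues), with one eigenvector per row,
    ordered from the most significant to the least significant.
    """
    n_images = images.shape[0]
    # Flatten each 20x20 image into a row vector of 400 numbers.
    flat = images.reshape(n_images, -1).astype(float)
    # Subtract the average image so that the vectors describe differences.
    mean = flat.mean(axis=0)
    centered = flat - mean
    # The rows of vt are orthonormal eigenvectors of the covariance matrix
    # of the directory; singular values come back in descending order.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    # Each eigenvalue is the variance captured by its eigenvector.
    eigenvalues = singular_values ** 2 / (n_images - 1)
    return mean, vt, eigenvalues


# Hypothetical stand-in for a real directory: fifty random 20x20 images.
rng = np.random.default_rng(0)
directory = rng.random((50, 20, 20))

mean, eigenvectors, eigenvalues = directory_eigenvectors(directory)

# Each eigenvector has four hundred numbers and can be viewed as a
# 20x20 "building block" image.
building_block = eigenvectors[0].reshape(20, 20)
```

In this sketch, eigenvalues[0] is the largest value, so eigenvectors[0] plays the role of the strongest or most significant vector for the stand-in directory.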
A system might then use vectors (such as eigenvectors or other vectors) to perform image recognition and detection as appropriate. For example, the vectors might first be applied to a target image to select candidate portions, that is, objects in the image which are believed to be sufficiently "close" to the directory images so that a match is likely. To perform this processing, typically the strongest or most significant vector is multiplied against different groups of pixels in the target image; if the result is sufficiently large, a match is likely. Image portions screened in this manner can then have the second strongest vector applied, and so on, until it is very likely that screened portions are also found in the directory. If it is desired to perform image recognition (as opposed to just detection), a further pre-processing step exists where the contribution of each vector to a particular directory image is ascertained and stored, as a vector signature for the image. When processing a target image, the target is processed using the vectors and then it is ascertained whether results of processing match specific vector signatures. Many image recognition systems will use both of these processes; that is, a system might first process a target image to screen candidate portions of the target image, and then perform image recognition only for those candidate portions.
3. Shortcomings of Some Vector Based Systems
One problem with some vector based systems is their tendency to detect false matches. That is to say, some systems will detect matches in a target image where the human eye, by contrast, can readily observe that there is no match. One reason this result occurs is that basis vectors are typically selected only for detecting a match, but usually not to reject false matches or for affinity to screen out nonmatches. A related problem relates to vector multiplication, mentioned above, and the application of thresholds to the results to detect matches; strong or significant vectors multiplied against portions of the target can produce high values even though there truly is no "match," and a suitable threshold is needed to distinguish these results. However, the target image may be produced under different conditions,