Review And Analysis Of Computer Vision Algorithms

Computer vision as a scientific discipline covers the theories and technologies for creating artificial systems that extract information from images. Although the discipline is quite young, its results have penetrated almost all areas of life. Computer vision is closely related to other practical fields such as image processing, whose input is two-dimensional images obtained from a camera or created artificially. Such image transformations are aimed at noise suppression, filtering, color correction and image analysis, which makes it possible to obtain specific information directly from the processed image. This information may include the results of searching for objects, keypoints, segments, and so on. The main task areas are the following:

1) Detection of objects in the image. Although a person can easily pick out specific objects in an image, this task has not yet been completely solved for artificial systems. Most solutions are special-purpose: they rely on specific properties of the sought object and are therefore unsuitable for objects that lack those properties. There are several universal object detection algorithms (neural networks, Viola-Jones, etc.), but they are slow and produce significant detection errors even when objects deviate only slightly from the samples used in training. Nevertheless, their simplified versions are widely used for small images. An important task, therefore, is to speed up the calculations while maintaining detection accuracy;
2) Recognition of objects in the image. This task continues the previous one, whose result is an array of areas where objects may be found. Its purpose is to determine whether these areas contain a specific class of objects, which has more specific features and can therefore be classified more reliably;
3) Identification of objects, the result of which can be a conclusion that the recognized object corresponds to a specific (unique) instance (for example, a fingerprint, the face of a specific person, a car number plate, etc.). Character recognition systems deserve separate mention within this group; their identification accuracy depends on the quality of the material being recognized;
4) Search for images in a database by content, based on the recognition of a specific class of objects. Here the performance of the artificial system plays a significant role in speeding up the search, so the possibility of parallelizing the algorithms of this group of tasks is considered very important;
5) Reconstruction of a three-dimensional scene from a set of input images (a video stream), which makes it possible to determine the positions of objects and a source in three-dimensional space. It is used for robot movement, creating panoramic images, etc.;
6) Tracking of moving objects in a video stream, which determines the position of an object in space from the change in its position across two-dimensional images while preserving the object's characteristic features. This task is very resource-intensive and must be performed in real time, so the main emphasis when creating algorithms for this area is on their performance;
7) Image processing. This task area deals with converting the pixels of two-dimensional images and is of primary importance for the other computer vision tasks. Almost all of these transformations are filter transformations, i.e. a set of operations is performed on each pixel of the image, depending on the pixels in its close proximity, using special matrices.
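The filter transformation described in item 7 can be illustrated with a minimal sketch. It assumes a grayscale image stored as a list of rows and a square filter matrix; the function name `apply_filter`, the box-blur matrix and the clamped border handling are illustrative choices, not taken from the original text. The matrix is applied without flipping, which makes no difference for symmetric matrices such as the one used here.

```python
def apply_filter(image, kernel):
    """Apply a square filter matrix to a grayscale image (list of rows).

    Border pixels are handled by clamping neighbour coordinates to the
    image edge, which is only one of several possible conventions.
    """
    height, width = len(image), len(image[0])
    radius = len(kernel) // 2
    output = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for ky, kernel_row in enumerate(kernel):
                for kx, weight in enumerate(kernel_row):
                    # Neighbour coordinates, clamped to stay inside the image.
                    ny = min(max(y + ky - radius, 0), height - 1)
                    nx = min(max(x + kx - radius, 0), width - 1)
                    acc += weight * image[ny][nx]
            output[y][x] = acc
    return output


# 3x3 box blur: every pixel in the window contributes equally.
BOX_BLUR = [[1 / 9] * 3 for _ in range(3)]

image = [
    [0, 0, 0],
    [0, 9, 0],
    [0, 0, 0],
]
blurred = apply_filter(image, BOX_BLUR)
```

Each output pixel depends only on the input image and the matrix, never on other output pixels, which is why this kind of transformation lends itself to data parallelism.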

OBJECTIVE STATEMENT
Obviously, not all computer vision algorithms can be parallelized on GPUs. Any artificial computer vision system, regardless of its area of application, includes the following stages: 1) Image acquisition; 2) Preliminary processing; 3) Highlighting characteristic features; 4) Detection or segmentation; 5) High-level processing.
Almost all of these stages can be implemented with parallel computer vision algorithms executed on modern parallel computing devices. Some of them can use data parallelism, which is well suited to the general-purpose GPUs discussed above. Consider the main groups of computer vision algorithms that use data parallelism (this classification is a generalization of groups from the literature):
1) Image transformation algorithms: the input and output data are two-dimensional images, and the coordinates of an output image element differ from the coordinates of the corresponding input element. These algorithms include affine transformations, coordinate system transformations, etc.;
2) Filtering algorithms: the input and output data are two-dimensional images. Each pixel of the output image is the result of an operation on a group of pixels in the input image that fall into a window of a certain size (filter [...]

Creation of the integral of the image. An image integral is a two-dimensional array of sums of color intensities that has the same dimensions as the image itself, and the value in each cell is calculated using the following formula:

$$II(x, y) = \sum_{i=0}^{x} \sum_{j=0}^{y} I(i, j),$$

where $I(i, j)$ is the color intensity of the pixel in column $i$ and row $j$ of the source image.

[...] 1986). These algorithms are data parallelized and implemented on GPUs. A combined algorithm can also be used depending on the amount of input data.
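The image integral described above can be sketched as follows. This is a sequential illustration under the assumption that each cell holds the sum of all pixels above and to the left of it, inclusive. The names `integral_image` and `rect_sum` are hypothetical. The second function shows why the table is useful: the sum over any rectangle is obtained from at most four table lookups.

```python
def integral_image(image):
    """Return the image integral of a grayscale image (list of rows).

    ii[y][x] holds the sum of all pixels image[j][i] with j <= y and i <= x,
    so the result has the same dimensions as the image.
    """
    height, width = len(image), len(image[0])
    ii = [[0] * width for _ in range(height)]
    for y in range(height):
        row_sum = 0  # running sum of the current row up to column x
        for x in range(width):
            row_sum += image[y][x]
            above = ii[y - 1][x] if y > 0 else 0
            ii[y][x] = above + row_sum
    return ii


def rect_sum(ii, x0, y0, x1, y1):
    """Sum of pixels in the inclusive rectangle [x0..x1] x [y0..y1]."""
    total = ii[y1][x1]
    if x0 > 0:
        total -= ii[y1][x0 - 1]
    if y0 > 0:
        total -= ii[y0 - 1][x1]
    if x0 > 0 and y0 > 0:
        total += ii[y0 - 1][x0 - 1]
    return total


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
ii = integral_image(image)
print(ii)                         # [[1, 3, 6], [5, 12, 21], [12, 27, 45]]
print(rect_sum(ii, 1, 1, 2, 2))   # 5 + 6 + 8 + 9 = 28
```

The inner loop is a running (prefix) sum along each row, and the accumulation from the row above is a prefix sum along each column. Parallel implementations commonly build the table from these two passes.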
In some cases, it is necessary to compute integrals rotated by a certain angle relative to the original image, which can be reduced to the problem of finding the integral of the image through the first implementation group for SIMD.

If the result of a strong classifier is less than a certain threshold value, the window is rejected; a window that passes all strong classifiers is considered to contain the object. As weak classifiers, Viola-Jones uses Haar features: weighted functions of the brightness of rectangular regions inside a sliding window. An obvious way to parallelize this algorithm is to transfer the computations of each strong classifier to a separate process. However, when a parallel algorithm is implemented on a graphics processor, problems of sparse hypotheses and idle processors arise, associated with memory addressing patterns and the simultaneous execution of thread bundles. These problems are solved by scaling the image integral and by a preliminary pass of several weak classifiers.
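A minimal sketch of how a Haar feature and a chain of strong classifiers might be evaluated for one window is given below. It assumes the common convention in which a window is rejected as soon as the weighted vote of a strong classifier falls below that classifier's threshold. All names, the rectangle layout `EDGE` and the threshold values are hypothetical and untrained; they serve only to show the data flow.

```python
def integral_image(image):
    """Image integral with the same dimensions as the image."""
    height, width = len(image), len(image[0])
    ii = [[0] * width for _ in range(height)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += image[y][x]
            ii[y][x] = row_sum + (ii[y - 1][x] if y > 0 else 0)
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the w-by-h rectangle whose top-left pixel is (x, y)."""
    def at(cx, cy):
        return ii[cy][cx] if cx >= 0 and cy >= 0 else 0
    x1, y1 = x + w - 1, y + h - 1
    return at(x1, y1) - at(x - 1, y1) - at(x1, y - 1) + at(x - 1, y - 1)


def haar_feature(ii, win_x, win_y, rects):
    """Weighted sum of rectangle brightness inside the window at (win_x, win_y).

    rects is a list of (dx, dy, w, h, weight) tuples relative to the window.
    """
    return sum(weight * rect_sum(ii, win_x + dx, win_y + dy, w, h)
               for dx, dy, w, h, weight in rects)


def weak_classifier(ii, win_x, win_y, rects, threshold, polarity):
    """Vote 1 if the feature lies on the 'object' side of the threshold."""
    value = haar_feature(ii, win_x, win_y, rects)
    return 1 if polarity * value < polarity * threshold else 0


def strong_classifier(ii, win_x, win_y, weak, stage_threshold):
    """Weighted vote of weak classifiers; True means the window passes."""
    score = sum(alpha * weak_classifier(ii, win_x, win_y, *params)
                for alpha, params in weak)
    return score >= stage_threshold


def cascade(ii, win_x, win_y, stages):
    """Stop at the first strong classifier that rejects the window."""
    return all(strong_classifier(ii, win_x, win_y, weak, threshold)
               for weak, threshold in stages)


# Hypothetical two-rectangle edge feature: bright left half, dark right half.
EDGE = [(0, 0, 2, 4, +1), (2, 0, 2, 4, -1)]
# One stage with one weak classifier: (alpha, (rects, threshold, polarity)).
STAGES = [([(1.0, (EDGE, 40, -1))], 0.5)]

edge_ii = integral_image([[10, 10, 0, 0] for _ in range(4)])
flat_ii = integral_image([[5, 5, 5, 5] for _ in range(4)])

print(haar_feature(edge_ii, 0, 0, EDGE))  # 80 - 0 = 80
print(cascade(edge_ii, 0, 0, STAGES))     # True: the window passes
print(cascade(flat_ii, 0, 0, STAGES))     # False: the window is rejected
```

Because `all` stops at the first failing stage, windows are rejected after different amounts of work. On hardware that executes many windows in lockstep this uneven workload is one source of idle processors.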
Grouping of results. The purpose of this stage is to group nearby candidate areas in which an object may be located and then average them to produce the result of the method. All areas are compared with one another using a special union function. At the same time, a tree is built whose root nodes are the required search results. The complexity of this algorithm is such that a large number of areas found at the second stage of the method can negatively affect overall performance. To parallelize this algorithm, the calculation of the union function can be transferred to the graphics processor.
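The grouping stage can be sketched with a disjoint-set forest, in which each tree collects mutually similar areas and the roots identify the final results. This is an illustration under stated assumptions, not the exact procedure of the method: the union predicate `similar`, its tolerance `eps` and the rectangle format `(x, y, w, h)` are hypothetical.

```python
def similar(a, b, eps=0.2):
    """Hypothetical union predicate for rectangles given as (x, y, w, h).

    Two rectangles are merged when all four edges differ by no more than
    a tolerance proportional to their smaller side lengths.
    """
    delta = eps * 0.5 * (min(a[2], b[2]) + min(a[3], b[3]))
    return (abs(a[0] - b[0]) <= delta and
            abs(a[1] - b[1]) <= delta and
            abs(a[0] + a[2] - b[0] - b[2]) <= delta and
            abs(a[1] + a[3] - b[1] - b[3]) <= delta)


def group_rectangles(rects, eps=0.2):
    """Merge similar rectangles and return one averaged rectangle per group."""
    parent = list(range(len(rects)))

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Compare every pair of areas; in this sketch that is a quadratic
    # number of calls to the union predicate.
    for i in range(len(rects)):
        for j in range(i + 1, len(rects)):
            if similar(rects[i], rects[j], eps):
                parent[find(i)] = find(j)

    # Collect the members of each tree under its root.
    clusters = {}
    for i, rect in enumerate(rects):
        clusters.setdefault(find(i), []).append(rect)

    # Average each cluster into a single rectangle.
    return [tuple(sum(r[k] for r in members) / len(members) for k in range(4))
            for members in clusters.values()]


candidates = [(10, 10, 20, 20), (12, 10, 20, 20), (100, 100, 20, 20)]
grouped = sorted(group_rectangles(candidates))
print(grouped)  # [(11.0, 10.0, 20.0, 20.0), (100.0, 100.0, 20.0, 20.0)]
```

The calls to `similar` are independent of one another, so the pairwise comparison is the natural part to move to a graphics processor, while the tree updates remain sequential in this sketch.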
The increased interest in heterogeneous programming and the wide availability of massively parallel devices in the form of GPUs are leading to the creation of new parallel computing models that take into account the peculiarities of various parallel computing devices. The analysis showed that no abstract model of parallel computing known to date can fully describe the operation of the CPU-GPU system for the various types of graphics processors. Nevertheless, a sufficient number of parallel algorithms have been created that can be implemented on this system. Therefore, the development of an abstract model of parallel computing on graphics processors, together with the corresponding abstract model of the CPU-GPU system, can be considered an urgent research task. Such a model will share properties with the abstract parallel machines presented in the review, which in turn will allow algorithms designed for those models to be transferred to the new system without significant changes. It should be noted that in the CPU-GPU system the memory hierarchy and the bandwidth of the corresponding data transmission channels play a significant role. In most cases, therefore, it is advisable to cache data in the various types of memory in order to improve the performance of algorithms. Consequently, the study of different types of caching in the CPU-GPU system is an urgent task aimed at optimizing programs that use parallel algorithms on GPUs.

CONCLUSIONS
The theoretical results of the study are to be implemented and experimentally tested on various groups of computer vision algorithms, which in most cases use two-dimensional images as input and output data, converted into a one-dimensional N-element array. The results of these experiments are intended to confirm the consistency of the model and the optimization methods. The Viola-Jones method deserves separate mention as a resource-intensive general method for recognizing objects in an image using rectangular Haar features. This method combines several different computer vision algorithms and is a working example of groups 3 and 4 of the algorithms discussed above. In addition, this method is currently implemented on a CPU and is used in a complex for automatic functional testing of interactive graphics applications. Therefore, transferring the algorithms of this method to the graphics processor can increase the performance of the recognition module and, accordingly, of the entire complex.