Computer vision uses a camera and a computer in place of the human eye to identify, track, and measure objects. The main function of image processing is to turn low-quality images (poor contrast, blur, distortion, and so on) into images better suited to observation by the human eye or to detection by an instrument.
Robot vision is an emerging and rapidly developing discipline. Since the 1980s, robot vision research has moved from the laboratory to practical application. Great progress has been made, from processing simple binary images to processing high-resolution multi-gray-level images, and from general two-dimensional information processing to three-dimensional vision mechanisms, models, and algorithms. Advances in the computer industry and in disciplines such as artificial intelligence, parallel processing, and neural networks have promoted the practical use of robot vision systems and the study of many complex visual processes. Robot vision systems are now widely used in visual inspection, robot visual guidance, and automated assembly.
In modern production, visual inspection is often indispensable. Examples include checking the appearance of car parts, drug packaging, the print quality of IC markings, and the soldering of circuit boards, where large numbers of inspectors examine products with the naked eye or with a microscope. Heavy reliance on manual inspection not only reduces factory efficiency but also introduces unreliability, directly affecting product quality and cost. In addition, many inspection processes require not only a check of appearance but also accurate measurement data, such as the width of a part, the diameter of a circular hole, or the coordinates of a reference point. Such work is difficult to perform by hand.
Machine vision technology, which has developed rapidly in recent years, addresses this problem. A machine vision system typically uses a CCD camera to capture an image of the inspected object and convert it to a digital signal, then uses computer hardware and software to process that signal and obtain the desired image features, and from them performs pattern recognition, coordinate calculation, gray-level distribution analysis, and similar functions. Based on the results it displays the image, outputs data, and issues instructions, working with actuators to complete position adjustment, quality screening, and data statistics. Compared with manual inspection, the biggest advantages of machine vision are that it is precise, fast, reliable, and digital.
Technical introduction
Overview of machine vision systems
A machine vision system uses a computer to realize human visual functions, that is, to recognize the objective three-dimensional world with a computer. According to current understanding, the sensing part of the human visual system is the retina, which acts as a sampling system for the three-dimensional world. The visible portion of a three-dimensional object is projected onto the retina, and people form a three-dimensional understanding of the object from the two-dimensional image projected there. This three-dimensional understanding covers the object's shape, size, distance from the observer, texture, and motion characteristics (direction and speed).
The input device of a machine vision system can be a camera, a drum scanner, or a similar device. Each takes a three-dimensional scene as its input source, so what enters the computer is a two-dimensional projection of the three-dimensional objective world. If the mapping from the three-dimensional objective world to its two-dimensional projection is regarded as a forward transformation, then the task of a machine vision system is the inverse transformation: to reconstruct the three-dimensional objective world from the two-dimensional projected image.
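The forward transformation can be illustrated with a small sketch. It assumes an ideal pinhole camera, a model the text does not name, and the function name `project` and the sample points are hypothetical. Two points on the same viewing ray land on the same image coordinates, which shows why the inverse transformation cannot recover depth from a single projection without further information.

```python
def project(point, focal_length=1.0):
    """Project a 3-D point (x, y, z) onto the image plane of an ideal pinhole camera."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    # Perspective division: depth z is folded into the 2-D coordinates and lost.
    return (focal_length * x / z, focal_length * y / z)


near = (1.0, 2.0, 4.0)
far = (2.0, 4.0, 8.0)  # same viewing ray, twice as far from the camera

print(project(near))  # (0.25, 0.5)
print(project(far))   # (0.25, 0.5)
```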
A machine vision system consists mainly of three parts: image acquisition, image processing and analysis, and output or display.
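The three parts can be sketched as three small functions chained together. This is only an illustration under stated assumptions: the synthetic image stands in for a camera, and the function names, the threshold of 128, and the nominal width and tolerance are all hypothetical.

```python
def acquire_image():
    """Acquisition stand-in: returns a tiny synthetic grayscale image (list of rows)."""
    dark, bright = 20, 220
    row = [dark] * 3 + [bright] * 6 + [dark] * 3
    return [row[:] for _ in range(5)]


def measure_width(image, threshold=128):
    """Processing and analysis: count bright pixels on the middle row."""
    middle = image[len(image) // 2]
    return sum(1 for value in middle if value >= threshold)


def report(width_px, nominal=6, tolerance=1):
    """Output: turn the measurement into an accept/reject signal."""
    return "accept" if abs(width_px - nominal) <= tolerance else "reject"


width = measure_width(acquire_image())
print(width, report(width))  # 6 accept
```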
Nearly 80% of industrial vision systems are used mainly for inspection, including checking product quality and monitoring product data during production. Product classification and sorting are also integrated into the inspection function. A single-camera vision system for a production line is described below to illustrate the composition and function of such a system.
The vision system inspects products on the production line, determines whether each product meets quality requirements, and sends a corresponding signal to the host computer. The image acquisition equipment includes light sources and cameras; the image processing equipment includes the corresponding software and hardware; the output equipment consists of the systems connected to the manufacturing process, including process controllers and alarm devices. Data is transferred to the computer for analysis and product control. If an unqualified product is found, the alarm is triggered and the product is removed from the production line. The results of machine vision are a source of quality information for the CAQ system and can also be integrated with other CIMS systems.
Image acquisition
Image acquisition converts the visual image and intrinsic features of the measured object into a series of data that the computer can process. It consists mainly of three parts:
* Lighting
* Image focus formation
* Image determination and formation of the camera output signal
1, lighting
Lighting is an important factor affecting the input of a machine vision system, because it directly affects the quality of the input data and at least 30% of the application's effectiveness. Since no universal machine vision lighting device exists, a suitable lighting device must be selected for each specific application to achieve the best results.
In the past, many industrial machine vision systems used visible light as the light source, mainly because visible light is easy to obtain, inexpensive, and convenient to work with. Common visible light sources are incandescent lamps, fluorescent lamps, mercury lamps, and sodium lamps. However, the biggest disadvantage of these light sources is that their output cannot remain stable. Taking the fluorescent lamp as an example, its light energy drops by 15% in the first 100 hours of use and continues to decrease as usage time increases. Keeping the light energy stable within a certain range is therefore a problem that must be solved in practice.
In addition, ambient light changes the total light energy that these sources deliver to the object, which introduces noise into the output image data. The usual remedy is to add a protective screen to reduce the influence of ambient light.
Because of these problems, invisible light is often used as the light source in today's industrial applications for certain demanding inspection tasks. However, invisible light makes the system harder to operate and is more expensive, so most practical applications still use visible light as the light source.
Lighting systems can be divided into back lighting, front lighting, structured lighting, and strobe lighting. In back lighting, the measured object is placed between the light source and the camera; its advantage is that a high-contrast image can be obtained. In front lighting, the light source and the camera are on the same side of the measured object, which is easy to install. In structured lighting, a light pattern is projected onto the measured object, and the three-dimensional information of the object is recovered from the distortion of that pattern. In strobe lighting, high-frequency light pulses illuminate the object, and the camera exposure must be synchronized with the light source.
2, image focus formation
The image of the measured object is focused onto the sensing element through a lens, just as in an ordinary photographic camera. The difference is that a photographic camera records the image on film, whereas a machine vision system uses a sensor to capture the image. The sensor converts the visual image into an electrical signal, which is convenient for computer processing.
The camera in a machine vision system should be selected according to the actual application, and the camera's lens parameters are an important criterion. Lens parameters fall into four categories: magnification, focal length, depth of field, and lens mounting.
3, image determination and formation of the camera output signal
The camera of a machine vision system is in effect a photoelectric conversion device: it converts the image formed by the lens on the sensor into an electrical signal that the computer can process. The camera can be a tube camera or a solid-state sensing unit.
The tube camera appeared earlier and was already used in commercial television in the 1930s. It uses a vacuum tube containing a photosensitive element to convert the received image into an analog voltage signal. A camera with RS-170 output can be connected directly to a commercial television display. The solid-state camera was developed after Bell Telephone Laboratories in the United States invented the charge-coupled device (CCD) in the late 1960s. It consists of a linear or rectangular array of photodiodes, one per pixel, and converts the optical image into an electrical signal by outputting the voltage pulse of each diode in a fixed order. The output voltage pulse sequence can be fed directly to an RS-170 display or written into the computer's memory for numerical processing. The CCD is now the most common machine vision sensor.
Image processing technology
In a machine vision system, the processing of visual information relies mainly on image processing methods, including image enhancement, data coding, smoothing, edge sharpening, segmentation, feature extraction, and image recognition. After this processing, the quality of the output image is improved, which both improves the visual effect of the image and makes it easier for the computer to analyze, process, and recognize the image.
1, image enhancement
Image enhancement is used to adjust the contrast of the image, highlight important details, and improve visual quality. It usually relies on gray-level histogram modification techniques.
The gray-level histogram of an image is a statistical chart showing the distribution of gray levels in the image, and it is closely related to contrast.
A two-dimensional digital image is typically represented in the computer as a matrix. Each element of the matrix is the gray value of the image at the corresponding coordinate position, a discrete integer that generally takes the values 0, 1, ..., 255. This is mainly because the range of values one byte can represent is 0 to 255. In addition, the human eye can distinguish only about 32 gray levels, so one byte is enough to represent the gray level.
However, the histogram only counts the probability with which pixels of a given gray level occur; it does not reflect the two-dimensional coordinates of those pixels in the image. Different images can therefore have the same histogram. The shape of the gray-level histogram indicates the sharpness of the image and its black-and-white contrast.
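A minimal sketch of this point, assuming an 8-bit image stored as a list of rows (the function name `gray_histogram` and both sample images are hypothetical). It returns pixel counts per gray level; dividing each count by the number of pixels gives the probabilities mentioned above.

```python
def gray_histogram(image, levels=256):
    """Count how many pixels take each gray level 0..levels-1."""
    hist = [0] * levels
    for row in image:
        for value in row:
            hist[value] += 1
    return hist


# Two different images: the same gray values in different positions.
a = [[0, 0], [255, 255]]
b = [[0, 255], [255, 0]]

print(a == b)                                   # False
print(gray_histogram(a) == gray_histogram(b))   # True
```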
If the histogram of an image is unsatisfactory, it can be modified by histogram equalization: a mapping transform is applied to the pixel gray levels of an image with a known gray-level probability distribution, so that it becomes a new image with a uniform gray-level probability distribution, which makes the image clearer.
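One common form of this mapping uses the cumulative distribution of gray levels. The sketch below is an illustration under assumptions rather than a definitive implementation: it assumes a non-empty 8-bit image stored as a list of rows, and the function name `equalize` and the sample image are hypothetical.

```python
def equalize(image, levels=256):
    """Histogram equalization via the cumulative gray-level distribution."""
    hist = [0] * levels
    for row in image:
        for value in row:
            hist[value] += 1
    total = sum(hist)

    # Cumulative count of pixels at or below each gray level.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Only one gray level is present, so there is nothing to stretch.
        return [row[:] for row in image]

    # Map each level so the occupied range spreads over 0..levels-1.
    mapping = [
        max(0, round((c - cdf_min) * (levels - 1) / (total - cdf_min)))
        for c in cdf
    ]
    return [[mapping[value] for value in row] for row in image]


low_contrast = [[100, 101], [102, 103]]
print(equalize(low_contrast))  # [[0, 85], [170, 255]]
```

In the example, four nearly identical gray levels are spread across the full 0 to 255 range, which is the contrast gain the text describes.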
2, the smoothing of the image
Image smoothing is the denoising of an image. Its main purpose is to remove the distortion introduced by the imaging equipment and the environment during actual imaging, and to extract the useful information. Real images are inevitably subject to external and internal interference while they are formed, transmitted, received, and processed. Examples include nonuniform sensitivity of the sensing elements during photoelectric conversion, quantization noise during digitization, transmission errors, and human factors, all of which degrade the image. Removing noise and restoring the original image is therefore an important part of image processing.
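One simple way to remove isolated noise pixels is a median filter, which replaces each pixel with the median of its neighborhood. The sketch below is a minimal illustration under assumptions: a 3x3 window, an image stored as a list of rows, and border pixels left unchanged. The function name `median_filter_3x3` and the sample data are hypothetical.

```python
def median_filter_3x3(image):
    """Replace each interior pixel with the median of its 3x3 neighborhood."""
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]  # border pixels are copied unchanged
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            window = [
                image[y + dy][x + dx]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            out[y][x] = sorted(window)[4]  # middle of the 9 sorted values
    return out


# A flat region with one impulse ("salt") noise pixel.
noisy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],
    [10, 10, 10, 10],
]
print(median_filter_3x3(noisy)[1])  # [10, 10, 10, 10]
```

A 3x3 mean filter applied at the same position would give (8 * 10 + 255) / 9, about 37, spreading the impulse into its neighbors instead of removing it.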
Linear filters, with their well-developed theoretical basis, simple mathematical treatment, and ease of implementation with the FFT and in hardware, have long played an important role in image filtering, with Wiener filter theory and Kalman filter theory as representative examples. However, linear filters have disadvantages such as high computational complexity, which makes real-time processing inconvenient. Although they smooth Gaussian noise well, they suppress impulse interference and other forms of noise poorly, and they blur signal edges. For this reason, in 1971 Tukey proposed a nonlinear filter, the median filter, which takes the median of the gray levels in a local area as the output gray level. Combined with statistical theory and iterative methods, it can recover an image from noise well while protecting the contour boundaries of the image from blurring. In recent years, nonlinear filtering theory has been widely used in machine vision, medical imaging, speech processing, and other fields, which in turn has driven deeper research into the theory.
3, data encoding and transmission of images
The amount of data in a digital image is quite large. An image of 512 × 512 pixels at one byte per pixel occupies 256 KB. If 25 frames are transmitted per second, the required channel rate is 52.4 Mbit/s. A high channel rate means high investment, which also makes widespread adoption difficult. Compressing the image data before transmission is therefore very important. Data compression is achieved mainly by coding and by transform compression of the image data.
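The figures above can be checked with a few lines of arithmetic. The sketch assumes 8 bits per pixel and no protocol overhead; the variable names are illustrative.

```python
width, height = 512, 512
bits_per_pixel = 8        # assumption: one byte per gray-level pixel
frames_per_second = 25

bytes_per_frame = width * height * bits_per_pixel // 8
bits_per_second = bytes_per_frame * 8 * frames_per_second

print(bytes_per_frame // 1024, "KB per frame")  # 256 KB per frame
print(bits_per_second / 1e6, "Mbit/s")          # 52.4288 Mbit/s
```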
Image data coding generally uses predictive coding, in which the spatial and sequential variation of the image data is expressed by a prediction formula. If the values of the pixels preceding a given pixel are known, the value of that pixel can be predicted with the formula. With predictive coding, only the starting value and the prediction errors of the image data generally need to be transmitted, and 8 bits/pixel can be compressed to 2 bits/pixel. Transform compression divides the entire image into small data blocks (typically 8 × 8 or 16 × 16), then classifies, transforms, and quantizes these blocks, forming an adaptive transform compression system. This method can greatly reduce the amount of image data that must be transmitted; the data is transformed back at the receiving end.
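The idea of transmitting a starting value plus prediction errors can be sketched as follows. This is a deliberately simple illustration, assuming the predictor is just the previous pixel in the row; the text does not specify a prediction formula, and the function names are hypothetical. Reaching a rate such as 2 bits/pixel would additionally require quantizing or entropy-coding the errors, which is not shown.

```python
def dpcm_encode(pixels):
    """Return the first pixel followed by the previous-pixel prediction errors."""
    if not pixels:
        return []
    code = [pixels[0]]
    for previous, current in zip(pixels, pixels[1:]):
        code.append(current - previous)  # error of the prediction "same as previous"
    return code


def dpcm_decode(code):
    """Rebuild the pixel row by adding each error to the previous decoded pixel."""
    pixels = []
    for index, value in enumerate(code):
        pixels.append(value if index == 0 else pixels[-1] + value)
    return pixels


row = [100, 101, 103, 103, 102, 104]
code = dpcm_encode(row)
print(code)                      # [100, 1, 2, 0, -1, 2]
print(dpcm_decode(code) == row)  # True
```

Because neighboring pixels are usually similar, the errors cluster near zero and need fewer bits than the raw 0 to 255 values.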
4, edge sharpening
Edge sharpening mainly enhances the contour edges and details in the image, so that complete object boundaries can be formed and used to separate an object from the image or to detect regions that belong to the same object surface. It is a basic problem in early vision theory and algorithms, and it is also one of the important factors in the success or failure of mid-level and later-stage vision.
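The text names no specific sharpening operator. As one common choice, the sketch below subtracts a 4-neighbor Laplacian from each pixel, which is equivalent to the kernel with 5 at the center and -1 at the four neighbors. It assumes an 8-bit image stored as a list of rows, leaves border pixels unchanged, and the function name `sharpen` is hypothetical.

```python
def sharpen(image):
    """Sharpen interior pixels: 5 * center minus the four direct neighbors."""
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]  # border pixels are copied unchanged
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            value = (
                5 * image[y][x]
                - image[y - 1][x] - image[y + 1][x]
                - image[y][x - 1] - image[y][x + 1]
            )
            out[y][x] = max(0, min(255, value))  # clamp to the 8-bit range
    return out


# A vertical step edge from gray level 50 to gray level 100.
step = [[50, 50, 100, 100] for _ in range(3)]
print(sharpen(step)[1])  # [50, 0, 150, 100]
```

The step of 50 gray levels across the edge becomes a step of 150 between the two interior pixels, while flat regions are left as they were.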
5, segmentation of the image
Image segmentation divides an image into several parts, each corresponding to a certain object surface, such that the gray level or texture within each part satisfies a uniformity measure. In essence, it classifies pixels. The classification is based on the gray value, color, spectral characteristics, spatial characteristics, or texture characteristics of the pixels. Image segmentation is one of the basic methods of image processing and is applied in areas such as chromosome classification, scene understanding systems, and machine vision.
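Classifying pixels by gray value can be sketched with a simple two-class threshold. The threshold here is chosen by iterating between the two class means, which is one common heuristic and an assumption of this example, not a method taken from the text. The function names and sample image are hypothetical.

```python
def iterative_threshold(pixels, tolerance=0.5):
    """Pick a threshold midway between the mean gray levels of the two classes."""
    threshold = sum(pixels) / len(pixels)
    while True:
        low = [p for p in pixels if p <= threshold]
        high = [p for p in pixels if p > threshold]
        if not low or not high:
            return threshold  # all pixels fall in one class
        updated = (sum(low) / len(low) + sum(high) / len(high)) / 2
        if abs(updated - threshold) < tolerance:
            return updated
        threshold = updated


def segment(image):
    """Label each pixel 1 (above threshold) or 0 (at or below threshold)."""
    flat = [p for row in image for p in row]
    threshold = iterative_threshold(flat)
    return [[1 if p > threshold else 0 for p in row] for row in image]


image = [
    [12, 15, 200, 210],
    [10, 14, 205, 198],
]
print(segment(image))  # [[0, 0, 1, 1], [0, 0, 1, 1]]
```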
There are two main methods of image segmentation. The first is gray-level threshold segmentation in the measurement space, which determines the clustering of image pixels from the gray-level histogram. However, it uses only the gray-level characteristics of the image and ignores other useful information, so the segmentation result is very sensitive to noise. The second is region-growing segmentation in the spatial domain, which connects pixels that have similar properties in some sense (gray level, texture, gradient, and so on) to form regions. It segments well, but it is complex and slow. Other methods include edge tracking, which focuses on preserving edge properties, tracks edges to form closed contours, and thereby separates the target. The pyramid image data structure method and the label relaxation iteration method likewise use the spatial distribution of pixels to merge neighboring pixels sensibly. Knowledge-based segmentation uses prior information and statistical characteristics of the scene: it first performs an initial segmentation and extracts region features, then uses domain knowledge to derive an interpretation of each region, and finally merges regions according to that interpretation.
6, image recognition
Image recognition can be regarded as a labeling process: a recognition algorithm identifies the individual objects that have been segmented from the scene and assigns each of them a specific label. This is a task that every machine vision system must complete.
Image recognition problems can be divided into three types. In the first type, the pixels of the image express specific information about an object. For example, a pixel in a remote sensing image represents the reflection characteristics of a ground location in a certain spectral band, and the type of ground object can be determined from it. In the second type, the object to be recognized is a tangible whole, and the two-dimensional image information is sufficient to recognize it, as in character recognition or the recognition of some three-dimensional objects that present a stable visible surface. Unlike the first type, however, such problems are not easily expressed as a feature vector. During recognition, the object must first be correctly segmented from the image background, and then the attribute graph of the object in the image is matched against the attribute graphs of hypothesized models in a model library. In the third type, a three-dimensional representation of the measured object is derived from the input two-dimensional image, primal sketch, 2.5-dimensional sketch, and so on. How to extract the implicit three-dimensional information is a current research hotspot.
The methods currently used for image recognition fall mainly into decision-theoretic methods and structural methods. Decision-theoretic methods are based on decision functions, which are used to classify pattern vectors, and they rely on quantitative descriptions such as statistical texture. The core of the structural method is to decompose objects into patterns or pattern primitives. Different object structures have different primitive strings, so the boundary of an unknown object is encoded with the given pattern primitives to obtain a string, and the object's class is then determined from that string. This method relies on symbols that describe the relationships within the measured object.
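A minimal sketch of the decision-theoretic approach, assuming the simplest decision function: each class is represented by a mean pattern vector, and an unknown vector is assigned to the class whose mean is nearest. The classifier choice, the class names, and the two-component feature vectors are all hypothetical and chosen only for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two pattern vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(vector, class_means):
    """Assign the vector to the class whose mean vector is closest."""
    return min(
        class_means,
        key=lambda label: squared_distance(vector, class_means[label]),
    )


# Hypothetical pattern vectors, for example (length, roundness) per part type.
class_means = {
    "washer": (2.0, 0.9),
    "bolt": (6.0, 0.3),
}

print(classify((5.5, 0.4), class_means))  # bolt
print(classify((2.2, 0.8), class_means))  # washer
```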