Camera calibration is a fundamental problem in computer vision.
The brightness of each point in an image captured by a camera reflects the intensity of light reflected from a point on the surface of an object in space, and the position of that image point is related to the position of the corresponding surface point. The relationship between these positions is determined by the camera's imaging geometry, and the parameters of this geometric model are called the camera parameters. These parameters must be determined by experiment and computation; the process of determining them is called camera calibration.

The camera model is a simplification of the optical imaging geometry. The simplest model is the linear model, also known as the pinhole model. When high accuracy is required, and especially when the camera uses a wide-angle lens, the linear model cannot accurately describe the camera's imaging geometry, and a nonlinear model must be used instead.

Camera calibration is also related to the task of the computer vision system. Stereo vision generally requires two or more cameras, so the geometric relationship between the individual cameras must be known as well; there is also hand-eye calibration, among other variants. Conventional calibration methods require placing an object of precisely known geometry in front of the camera, referred to as a calibration target or calibration block. In some vision systems, such as robot vision systems and active vision systems, the camera's position must be changed or its optics adjusted (for example, the aperture or focal length), and the camera must be recalibrated after every such adjustment. In these cases it is often impractical to place a calibration block in the camera's working environment, and camera self-calibration refers to calibration methods that do not require a calibration target.
The discussion below concerns only the calibration of the linear camera model.
The calibration of a linear camera involves three coordinate systems: the image coordinate system, the camera coordinate system, and the world coordinate system.
(u, v) denotes image coordinates in pixels, and (x, y) denotes image coordinates in millimeters. In the x-y coordinate system, the origin O1 is defined as the intersection of the camera's optical axis with the image plane; it generally lies at the center of the image, although manufacturing tolerances may shift it slightly. Let the coordinates of O1 in the u-v system be (u0, v0), and let each pixel have physical dimensions dx and dy along the x-axis and y-axis, respectively. Then the coordinates of any pixel in the image under the two coordinate systems are related as follows:
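The formula itself was not preserved in this copy; in the standard pinhole notation matching the definitions above, the relation reads:

```latex
u = \frac{x}{d_x} + u_0, \qquad v = \frac{y}{d_y} + v_0,
```

or, in homogeneous matrix form,

```latex
\begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
=
\begin{bmatrix}
1/d_x & 0 & u_0 \\
0 & 1/d_y & v_0 \\
0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}.
```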
The camera imaging relationship can be represented by the figure below. Here O is the optical center of the camera; the Xc axis and Yc axis are parallel to the x and y axes of the image plane, and the Zc axis is the camera's optical axis, perpendicular to the image plane. The intersection of the optical axis with the image plane is the origin O1 of the image coordinate system, and the rectangular coordinate system formed by the point O together with the Xc, Yc, and Zc axes is called the camera coordinate system. The distance OO1 is the camera's focal length. Since the camera may be placed anywhere in the environment, we also choose a reference coordinate system in the environment to describe the position of the camera, and use it to describe the position of any other object in the environment; this coordinate system is called the world coordinate system and consists of the Xw, Yw, and Zw axes.
In the linear camera model (pinhole model), the imaging position of any point P in space can be approximated by the pinhole projection: the projection p of P on the image plane is the intersection of the image plane with the line connecting the optical center O and P. This relationship is also called central projection or perspective projection.
The relationship between the camera coordinate system and the world coordinate system can be described by a rotation matrix R and a translation vector t, so the following formula holds:
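The formula was not preserved in this copy; in homogeneous coordinates, the rigid transformation described above is conventionally written as:

```latex
\begin{bmatrix} X_c \\ Y_c \\ Z_c \\ 1 \end{bmatrix}
=
\begin{bmatrix}
\mathbf{R} & \mathbf{t} \\
\mathbf{0}^{T} & 1
\end{bmatrix}
\begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix}.
```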
Composing these coordinate transformations, the following relation can be derived:
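The derived relation was not preserved in this copy; combining the pixel-coordinate, perspective, and rigid transformations above gives, in the standard notation (with αx and αy as defined in the next paragraph):

```latex
Z_c \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
=
\begin{bmatrix}
\alpha_x & 0 & u_0 & 0 \\
0 & \alpha_y & v_0 & 0 \\
0 & 0 & 1 & 0
\end{bmatrix}
\begin{bmatrix}
\mathbf{R} & \mathbf{t} \\
\mathbf{0}^{T} & 1
\end{bmatrix}
\begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix}
= M_1 M_2 \tilde{X}_w = M \tilde{X}_w.
```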
The above formula is central to calibration. Here αx = f/dx and αy = f/dy, and M is a 3×4 matrix called the projection matrix. M1 is completely determined by αx, αy, u0, and v0; since these quantities depend only on the camera's internal structure, they are called the intrinsic parameters of the camera. M2 is completely determined by the camera's position and orientation relative to the world coordinate system, so its entries are called the extrinsic parameters. Determining the intrinsic and extrinsic parameters of a camera is what is meant by camera calibration.

Calibration generally requires placing a special calibration reference object in front of the camera; the camera acquires an image of this object, and from it the intrinsic and extrinsic parameters are computed. The position of each feature point on the calibration reference in the world coordinate system must be measured accurately when the object is manufactured. The choice of world coordinate system is arbitrary, although it should of course be chosen with convenience in mind. Once the projections of these known points onto the image have been obtained, the intrinsic and extrinsic parameters of the camera can be computed, as follows.
Eliminating Zc from this equation yields the following formulas.
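The formulas were not preserved in this copy; written in terms of the entries m_ij of M, each known point contributes the standard pair of linear equations:

```latex
x_w m_{11} + y_w m_{12} + z_w m_{13} + m_{14}
- u\,x_w m_{31} - u\,y_w m_{32} - u\,z_w m_{33} - u\,m_{34} = 0,
```

```latex
x_w m_{21} + y_w m_{22} + z_w m_{23} + m_{24}
- v\,x_w m_{31} - v\,y_w m_{32} - v\,z_w m_{33} - v\,m_{34} = 0.
```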
It follows that if there are N known points on the calibration block, with both their spatial coordinates and their image coordinates known, we obtain 2N linear equations in the elements of the M matrix. Written in matrix form, these equations are:
It can be seen from the above that multiplying the M matrix by any nonzero constant does not affect the relationship between (xw, yw, zw) and (u, v), so we may fix m34 = 1, obtaining 2N linear equations in the remaining elements of M. The number of unknown elements is then 11; collecting them into an 11-dimensional vector m, the system can be abbreviated as Km = U, where K is a 2N×11 matrix, m is the unknown 11-dimensional vector, and U is a 2N-dimensional vector. K and U are known, and when 2N > 11 the least-squares solution of this linear system is m = (KᵀK)⁻¹KᵀU.
It can be seen that given more than six known points and their image coordinates, we can solve for the M matrix. In practical calibration work, dozens of known points are usually marked on the calibration block, so that the number of equations far exceeds the number of unknowns and the least-squares solution reduces the effect of measurement error. Once the M matrix has been obtained, all of the camera's intrinsic and extrinsic parameters can be computed through further derivation.
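The least-squares step described above can be sketched in Python with NumPy. This is a minimal illustration, not the text's own code: the function name and the synthetic camera used to check it are my own, and the rows of K follow the two linear equations per point with m34 fixed at 1.

```python
import numpy as np

def calibrate_dlt(world_pts, image_pts):
    """Estimate the 3x4 projection matrix M from N >= 6 point pairs.

    Fixes m34 = 1 and solves the 2N x 11 linear system K m = U
    in the least-squares sense, as described in the text.
    """
    world_pts = np.asarray(world_pts, dtype=float)
    image_pts = np.asarray(image_pts, dtype=float)
    n = len(world_pts)
    K = np.zeros((2 * n, 11))
    U = np.zeros(2 * n)
    for i in range(n):
        xw, yw, zw = world_pts[i]
        u, v = image_pts[i]
        # Row from the u-equation of point i (the m34 term moved to the right side).
        K[2 * i] = [xw, yw, zw, 1, 0, 0, 0, 0, -u * xw, -u * yw, -u * zw]
        U[2 * i] = u
        # Row from the v-equation of point i.
        K[2 * i + 1] = [0, 0, 0, 0, xw, yw, zw, 1, -v * xw, -v * yw, -v * zw]
        U[2 * i + 1] = v
    # m = (K^T K)^{-1} K^T U, computed stably via least squares.
    m, *_ = np.linalg.lstsq(K, U, rcond=None)
    return np.append(m, 1.0).reshape(3, 4)   # restore m34 = 1

# Synthetic check: build a known projection matrix, project some
# non-coplanar world points, and recover M from the correspondences.
rng = np.random.default_rng(0)
A = np.array([[800.0,   0.0, 320.0],    # alpha_x, u0
              [  0.0, 800.0, 240.0],    # alpha_y, v0
              [  0.0,   0.0,   1.0]])
theta = 0.1                             # small rotation about the Z axis
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])
t = np.array([[0.5], [-0.2], [10.0]])   # camera well in front of the points
M_true = A @ np.hstack([R, t])
M_true /= M_true[2, 3]                  # normalize so that m34 = 1

world = rng.uniform(-1.0, 1.0, size=(12, 3))            # 12 known points
proj = (M_true @ np.hstack([world, np.ones((12, 1))]).T).T
image = proj[:, :2] / proj[:, 2:3]                      # perspective divide

M_est = calibrate_dlt(world, image)
print(np.allclose(M_est, M_true, atol=1e-6))
```

With exact, noise-free correspondences the recovered matrix matches the true one to numerical precision; with measured points, the extra equations from using many more than six points average out the measurement error, as the text notes.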
In summary, the M matrix can be obtained from six or more known points and their image coordinates, and all intrinsic and extrinsic parameters can then be recovered from it by the appropriate formulas.
It should be noted that the M matrix fully determines the relationship between spatial point coordinates and image point coordinates. In many applications (such as stereo vision), once the M matrix has been computed there is no need to decompose it into the camera's intrinsic and extrinsic parameters; that is, the M matrix itself serves as the camera's parameters. Since its entries have no direct physical meaning, some of the literature calls them implicit parameters. In other applications (such as motion analysis), the M matrix must be decomposed to recover the intrinsic and extrinsic parameters, and this decomposition introduces additional error.