Three methods of image zooming

xiaoxiao, 2021-03-06

Using a linear interpolation algorithm to scale images under Windows

Anyone who has written an imaging program for Windows knows that the Windows GDI provides an API function, StretchBlt, which corresponds to the StretchDraw method of the TCanvas class in the VCL. It is a simple way to zoom an image, but the problem is that it uses the fastest, simplest, and also worst resampling method: "nearest neighbour". In most cases it is good enough, but not when higher quality is required.

Not long ago I wrote a small program (see my "Personal Information Assistant") to manage the pile of photos I had taken with my digital camera. One of its functions is zooming, and the current version uses StretchDraw; the results are sometimes unsatisfactory, so I wanted to add two better methods: linear interpolation and cubic spline interpolation. After some study I found that the computational cost of the cubic spline is too high to be practical, so I decided to implement only the linear interpolation method.

From the basic theory of digital image processing we know that a geometric transformation of an image is a coordinate transformation from the source image to the target image. The naive idea is to map each pixel coordinate of the source image through the transformation to the corresponding point of the target image. This causes a problem: the target coordinates are generally not integers, and, for an enlargement, some target pixels are never hit by any source pixel at all. These are the well-known shortcomings of so-called "forward mapping". Therefore the "reverse mapping" method is generally used instead: each target pixel is mapped back into the source image. With reverse mapping, however, the mapped source coordinates are not integers either, and this is where a "resampling filter" comes in.
The term "resampling filter" looks very professional, but only because it borrows its name from electronic signal processing (in most cases its function is analogous to a band-pass filter there). What it does is not complex: it decides what colour the image should have at a non-integer coordinate. The three methods mentioned earlier, the nearest neighbour method, the linear interpolation method, and the cubic spline method, are all resampling filters.

The "nearest neighbour" method simply rounds the non-integer coordinate and takes the colour of the closest integer pixel. The "linear interpolation method" estimates the colour by linear interpolation over the closest surrounding points (for a planar image, four points, i.e. two-dimensional linear interpolation). In most cases its accuracy is higher than the nearest neighbour method, and the result is visibly better; most obviously, the jagged edges are much reduced compared with nearest neighbour. Of course it also has a problem of its own: the image comes out comparatively soft. In filter terminology (heh, occupational habit): good stop-band attenuation, but there is pass-band loss, and the rectangular coefficient of the pass-band curve is not high. As for the cubic spline, it is rather more complicated and I will not discuss it here; refer to a professional text on digital image processing, such as the references.

Now let us discuss the algorithm for the coordinate transformation. A simple spatial transformation can be represented by a transformation matrix:

    [x', y', w'] = [u, v, w] * T

where x', y' are the target image coordinates, u, v are the source image coordinates, and w, w' are the so-called homogeneous coordinates, typically set to 1; T is a 3x3 transformation matrix.
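To make the nearest-neighbour filter described above concrete, here is a minimal standalone sketch in plain C++ (not the article's VCL code; a 1-D array of samples stands in for a scanline, and the function name and border clamping are this sketch's own assumptions):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Nearest-neighbour resample filter: for a non-integer source coordinate u,
// round to the closest integer pixel and take its colour unchanged.
unsigned char SampleNearest(const std::vector<unsigned char>& line, double u)
{
    int iu = (int)std::lround(u);               // round to the nearest pixel
    if (iu < 0) iu = 0;                         // clamp at the borders
    if (iu >= (int)line.size()) iu = (int)line.size() - 1;
    return line[iu];
}
```

This is exactly why nearest neighbour produces jagged edges: the output jumps from one source pixel to the next with no blending in between.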
This representation may look overly mathematical, but in this form it is convenient to express many different transformations uniformly, such as translation, rotation, and zooming.

For zooming, it is equivalent to:

                            | su  0  0 |
    [x, y, 1] = [u, v, 1] * |  0 sv  0 |
                            |  0  0  1 |

where su and sv are the zoom ratios in the x-axis and y-axis directions respectively: greater than 1 enlarges, greater than 0 but less than 1 shrinks, and less than 0 mirrors the image. Does the matrix make you dizzy? In fact, expanding the matrix multiplication gives simply:

    x = u * su
    y = v * sv

That simple. ^_^

With the above three pieces of groundwork we can start writing code. The idea is straightforward: traverse every pixel coordinate of the target image with two loops, and obtain the source coordinate through the transformation above (note: for reverse mapping the formulas become u = x / su and v = y / sv). Because the source coordinates are not integer coordinates, a two-dimensional linear interpolation is required:

    P = n*b*Pa + n*(1-b)*Pb + (1-n)*b*Pc + (1-n)*(1-b)*Pd

where n is the distance from v (the y-coordinate of the corresponding point in the source image, generally not an integer) to the nearest row below it, i.e. 1 minus the fractional part of v; b is the analogous quantity for u along the x-axis. Pa to Pd are the colours of the four source pixels surrounding (u, v): top-left, top-right, bottom-left, bottom-right (obtained with TCanvas's Pixels property). P is the interpolated colour at (u, v), which serves as the approximate colour at (x, y).

I will not write out this version of the code, because its performance is extremely low: for every pixel of the target image it performs a string of complicated floating-point operations on the RGB values. So it must be optimized. For a VCL application there is a relatively simple optimization: using TBitmap's ScanLine property avoids the pixel-level overhead of Pixels and improves performance greatly. This is basic knowledge for image processing with the VCL. However, this method cannot always be used; for example, rotating an image requires further techniques.
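The two-dimensional interpolation formula above can be checked with a small standalone function (plain C++ rather than VCL; the function name is illustrative, and a single channel value stands in for a full RGB colour):

```cpp
#include <cassert>
#include <cmath>

// Bilinear interpolation of the colour P at a non-integer source coordinate
// from its four integer neighbours:
//   Pa = top-left, Pb = top-right, Pc = bottom-left, Pd = bottom-right.
// b and n are the weights toward the left column and top row:
//   b = 1 - frac(u),  n = 1 - frac(v).
double Bilinear(double Pa, double Pb, double Pc, double Pd, double b, double n)
{
    return n * b * Pa + n * (1 - b) * Pb
         + (1 - n) * b * Pc + (1 - n) * (1 - b) * Pd;
}
```

For example, with n = 1 (exactly on the top row) and b = 0.25 (three quarters of the way toward the right pixel), the result is simply a 1-D blend of Pa and Pb weighted 25%/75%.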
In any case, the overhead of floating-point operations is much larger than that of integer operations, so this too is worth optimizing away. As can be seen above, the floating-point numbers are introduced by the transformation, since the transformation parameters su, sv are usually floating point; so the optimization starts from them. In general, su and sv can be represented as fractions:

    su = (double)dw / sw;
    sv = (double)dh / sh;

where dw and dh are the width and height of the target image, and sw and sh are the width and height of the source image (since all of these are integers, a type cast is required to obtain a floating-point result).
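Writing su and sv as integer ratios is what makes the later integer-only formulas possible: the reverse mapping u = x / su becomes u = x * sw / dw, where the integer quotient gives the source column and the remainder carries the fractional part. A minimal sketch (helper name is this sketch's own):

```cpp
#include <cassert>

// With su = (double)dw / sw, the reverse mapping u = x / su is exactly
// u = x * sw / dw.  In integer arithmetic the quotient is the source
// column (the floor of the real u) and the remainder encodes the
// fractional part, so no floating point is needed to locate the
// four neighbouring pixels.
void MapBack(int x, int sw, int dw, int& col, int& rem)
{
    col = x * sw / dw;   // integer source column, floor(u)
    rem = x * sw % dw;   // rem / (double)dw == fractional part of u
}
```

For instance, with sw = 3 and dw = 4 the target column x = 2 maps to the real coordinate u = 1.5, which this decomposes as column 1 with remainder 2 (i.e. 2/4 = 0.5).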

Substituting the new su and sv into the previous transformation and interpolation formulas, we can derive new interpolation weights:

    b = 1 - (x * sw % dw) / (double)dw;
    n = 1 - (y * sh % dh) / (double)dh;

Setting:

    B = dw - x * sw % dw;
    N = dh - y * sh % dh;

we have:

    b = B / (double)dw;
    n = N / (double)dh;

Using the integers B and N in place of the floating-point b and n, the interpolation formula transforms into:

    P = (B*N*(Pa - Pb - Pc + Pd) + dw*N*Pb + dh*B*Pc + (dw*dh - dh*B - dw*N)*Pd) / (double)(dw*dh)

Here the final result P is still a floating-point number, from which the rounded pixel value is obtained. To eliminate floating point completely, the rounding can be folded in like this:

    P = (B*N*(Pa - Pb - Pc + Pd) + dw*N*Pb + dh*B*Pc + (dw*dh - dh*B - dw*N)*Pd + dw*dh/2) / (dw*dh)

Now P is rounded directly to an integer value, and all of the calculation is integer arithmetic.
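The integer formula can be verified against the floating-point original by evaluating both on the same inputs (a standalone sketch; function names are illustrative):

```cpp
#include <cassert>
#include <cmath>

// Integer-only bilinear interpolation as derived in the text, with
// B = dw - x*sw % dw and N = dh - y*sh % dh.  The "+ dw*dh/2" term
// rounds to nearest instead of truncating.
int InterpInt(int Pa, int Pb, int Pc, int Pd, int B, int N, int dw, int dh)
{
    return (B * N * (Pa - Pb - Pc + Pd) + dw * N * Pb + dh * B * Pc
            + (dw * dh - dh * B - dw * N) * Pd + dw * dh / 2) / (dw * dh);
}

// Floating-point reference with b = B/dw, n = N/dh.
double InterpFloat(int Pa, int Pb, int Pc, int Pd, int B, int N, int dw, int dh)
{
    double b = (double)B / dw, n = (double)N / dh;
    return n * b * Pa + n * (1 - b) * Pb
         + (1 - n) * b * Pc + (1 - n) * (1 - b) * Pd;
}
```

For Pa..Pd = 10, 20, 30, 40 with B = 3, N = 2, dw = 4, dh = 5 the floating-point value is 24.5 and the integer version returns 25, i.e. the same value rounded to nearest.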

The simple optimized code is as follows:

    int __fastcall TResizeDlg::Stretch_Linear(Graphics::TBitmap * aDest, Graphics::TBitmap * aSrc)
    {
        int sw = aSrc->Width - 1, sh = aSrc->Height - 1;
        int dw = aDest->Width - 1, dh = aDest->Height - 1;
        int B, N, x, y;
        int nPixelSize = GetPixelSize(aDest->PixelFormat);
        BYTE *pLinePrev, *pLineNext;
        BYTE *pDest;
        BYTE *pA, *pB, *pC, *pD;
        for (int i = 0; i <= dh; ++i)
        {
            pDest = (BYTE *)aDest->ScanLine[i];
            y = i * sh / dh;
            N = dh - i * sh % dh;
            pLinePrev = (BYTE *)aSrc->ScanLine[y++];
            pLineNext = (N == dh) ? pLinePrev : (BYTE *)aSrc->ScanLine[y];
            for (int j = 0; j <= dw; ++j)
            {
                x = j * sw / dw * nPixelSize;
                B = dw - j * sw % dw;
                pA = pLinePrev + x;
                pB = pA + nPixelSize;
                pC = pLineNext + x;
                pD = pC + nPixelSize;
                if (B == dw)    // on the last column there is no pixel to the right
                {
                    pB = pA;
                    pD = pC;
                }
                // interpolate each colour channel with the integer-only formula
                for (int k = 0; k < nPixelSize; ++k)
                    *pDest++ = (BYTE)(int)(
                        (B * N * (*pA++ - *pB - *pC + *pD) + dw * N * *pB++
                         + dh * B * *pC++ + (dw * dh - dh * B - dw * N) * *pD++
                         + dw * dh / 2) / (dw * dh));
            }
        }
        return 0;
    }
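Since the VCL version above only compiles under C++ Builder, here is a portable re-implementation of the same idea for a single-channel 8-bit image stored row-major in a std::vector, using the same integer weights B and N and the same edge handling (right/bottom neighbours clamped on the last column/row). The raw-buffer interface and names are this sketch's own; like the original, it assumes both images are at least 2x2, since it divides by dw and dh:

```cpp
#include <cassert>
#include <vector>

// Bilinear stretch of a grayscale image using integer-only arithmetic,
// following the Stretch_Linear structure from the article.
std::vector<unsigned char> StretchLinear(const std::vector<unsigned char>& src,
                                         int srcW, int srcH, int dstW, int dstH)
{
    std::vector<unsigned char> dst(dstW * dstH);
    int sw = srcW - 1, sh = srcH - 1, dw = dstW - 1, dh = dstH - 1;
    for (int i = 0; i <= dh; ++i) {
        int y = i * sh / dh;              // source row (floor of v)
        int N = dh - i * sh % dh;         // integer weight toward that row
        const unsigned char* linePrev = &src[y * srcW];
        const unsigned char* lineNext = (N == dh) ? linePrev
                                                  : &src[(y + 1) * srcW];
        for (int j = 0; j <= dw; ++j) {
            int x = j * sw / dw;          // source column (floor of u)
            int B = dw - j * sw % dw;     // integer weight toward that column
            int pa = linePrev[x];                       // top-left
            int pb = (B == dw) ? pa : linePrev[x + 1];  // top-right (clamped)
            int pc = lineNext[x];                       // bottom-left
            int pd = (B == dw) ? pc : lineNext[x + 1];  // bottom-right (clamped)
            dst[i * dstW + j] = (unsigned char)(
                (B * N * (pa - pb - pc + pd) + dw * N * pb + dh * B * pc
                 + (dw * dh - dh * B - dw * N) * pd + dw * dh / 2) / (dw * dh));
        }
    }
    return dst;
}
```

Stretching the 2x2 gradient {0, 100, 100, 200} to 3x3 produces the expected midpoints: {0, 50, 100, 50, 100, 150, 100, 150, 200}.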

Please credit the original source when reposting: https://www.9cbs.com/read-86810.html
