Example 1-2: the key information is repeated; finding the duplicated information

Product table:

    ID  PNAME      PRICE  NUMBER  PDESCRIPTION
    1   Apple      12     3000    NULL
    2   Banana     16.99  7600    NULL
    3   Olive      25.22  4500    NULL
    4   Coco Nut   40.99  2000    NULL
    4   Orange     15.99  5500    NULL
    5   Pineapple  30     2500    NULL
    6   Olive      25.22  3000    NULL
There are a few problems here as well. In the table, Coco Nut and Orange both have ID 4, and the items with ID 3 and ID 6 both have the name (PNAME) Olive. Our original intention was clearly for each item to correspond to exactly one ID, so both ID and PNAME should be unique in the table. This table has only 7 rows, so we can inspect it directly and spot the problems with the naked eye. But what if the table holds a very large amount of data?

Let's review the statement for finding duplicated data from Example 1. There we used ... GROUP BY ID, PNAME, PRICE, NUMBER, PDESCRIPTION to group the rows, with HAVING COUNT(*) > 1 to keep only the duplicated ones. If we group and filter on the ID column alone, can we find the rows with duplicated IDs? Try it:

    SELECT ID
    FROM Product
    GROUP BY ID
    HAVING COUNT(*) > 1

Result:

    ID
    -----------
    4

This is the ID we are looking for, but the statement is not very practical by itself. Now let's see which items carry this ID:

    SELECT ID, PNAME, PRICE, NUMBER, PDESCRIPTION
    FROM Product
    GROUP BY ID
    HAVING COUNT(*) > 1

This statement fails with an error. The reason is obvious: the four columns after ID are neither listed in GROUP BY nor wrapped in an aggregate function, so they must not appear in the select list. And this statement:

    SELECT ID, PNAME, PRICE, NUMBER, PDESCRIPTION
    FROM Product
    GROUP BY ID, PNAME, PRICE, NUMBER, PDESCRIPTION
    HAVING COUNT(*) > 1

returns an empty result set, because no two rows agree on all five columns:

    ID  PNAME  PRICE  NUMBER  PDESCRIPTION
    --  -----  -----  ------  ------------
    (0 row(s) affected)

Many friends solve this with a subquery:

    SELECT ID, PNAME, PRICE, NUMBER, PDESCRIPTION
    FROM Product
    WHERE ID IN (SELECT ID
                 FROM Product
                 GROUP BY ID
                 HAVING COUNT(*) > 1)

I have even seen an example that used a two-level nested cursor (!?). Isn't there a better way? I prefer the following statement:

    SELECT L.ID, R.PNAME, R.PRICE, R.NUMBER, R.PDESCRIPTION
    FROM Product L
    JOIN Product R ON L.ID = R.ID
    GROUP BY L.ID, R.PNAME, R.PRICE, R.NUMBER, R.PDESCRIPTION
    HAVING COUNT(*) > 1

The result is as follows:

    ID  PNAME     PRICE  NUMBER  PDESCRIPTION
    4   Coco Nut  40.99  2000    NULL
    4   Orange    15.99  5500    NULL
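The queries above can be tried end to end with a small script. This is only a sketch: it uses Python's built-in sqlite3 module and an in-memory SQLite database rather than MS SQL Server or InterBase, and the column types are my own choice. SQLite tolerates non-grouped columns in the select list, so the statement that fails on MS SQL Server is left out.

```python
import sqlite3

# In-memory SQLite database; the column types here are an assumption.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Product (ID INTEGER, PNAME TEXT, PRICE REAL, "
    "NUMBER INTEGER, PDESCRIPTION TEXT)"
)
conn.executemany(
    "INSERT INTO Product VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Apple", 12, 3000, None),
        (2, "Banana", 16.99, 7600, None),
        (3, "Olive", 25.22, 4500, None),
        (4, "Coco Nut", 40.99, 2000, None),
        (4, "Orange", 15.99, 5500, None),
        (5, "Pineapple", 30, 2500, None),
        (6, "Olive", 25.22, 3000, None),
    ],
)

# Grouping on ID alone finds the duplicated ID, but not the rows behind it.
dup_ids = conn.execute(
    "SELECT ID FROM Product GROUP BY ID HAVING COUNT(*) > 1"
).fetchall()

# Grouping on every column finds nothing: no two rows are fully identical.
full_dups = conn.execute(
    "SELECT ID, PNAME, PRICE, NUMBER, PDESCRIPTION FROM Product "
    "GROUP BY ID, PNAME, PRICE, NUMBER, PDESCRIPTION HAVING COUNT(*) > 1"
).fetchall()

# The subquery version.
via_subquery = conn.execute(
    "SELECT ID, PNAME, PRICE, NUMBER, PDESCRIPTION FROM Product "
    "WHERE ID IN (SELECT ID FROM Product GROUP BY ID HAVING COUNT(*) > 1) "
    "ORDER BY PNAME"
).fetchall()

# The self-join version.
via_join = conn.execute(
    "SELECT L.ID, R.PNAME, R.PRICE, R.NUMBER, R.PDESCRIPTION "
    "FROM Product L JOIN Product R ON L.ID = R.ID "
    "GROUP BY L.ID, R.PNAME, R.PRICE, R.NUMBER, R.PDESCRIPTION "
    "HAVING COUNT(*) > 1 ORDER BY R.PNAME"
).fetchall()

print(dup_ids)
print(full_dups)
print(via_subquery)
print(via_join)
```

On this data the subquery and the self-join return the same two rows, so the choice between them is about performance, not about the result.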
With a join query, the speed is usually much better than with the subquery, because there is no need to search the subquery's data again every time IN is evaluated. This matters most when the table holds a lot of data and the returned result set is also very large. On a machine with multiple processors and multiple hard drives, join queries can also make full use of parallel operations to improve efficiency. In the summer of 1999, when engineers from IBM attended the National Database Technical Conference at Lanzhou University, they explained to me how parallel processing is used to optimize the performance of join queries. Subqueries are at something of a disadvantage in this respect. Some powerful database engines will convert subqueries into join queries when appropriate. But isn't it better to hold the truth in our own hands?

Of course, a subquery is not always slower than a join. When there is a chance I will demonstrate some examples where a subquery is faster than a join, and some things that subqueries do are even difficult to implement with a join.

In theory, a join query generates a Cartesian product, and the size of this set is the product of the sizes of the sets being joined. This would bring a huge overhead in space (of course, the real database systems we see do not work so crudely). The situation with subqueries is more complicated. By the result they produce, subqueries are divided into scalar subqueries and vector subqueries (a scalar subquery returns a single value; in MS SQL Server such a subquery can be used directly as a column in the select list of the outer query). By their relationship with the outer query, they are divided into correlated and non-correlated subqueries (the result set of a correlated subquery depends on the current row of the outer query; that of a non-correlated subquery does not).
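To make the classification concrete, here is a minimal sketch of the three kinds of subquery against the Product table. It again assumes Python's sqlite3 module and an in-memory SQLite database, not MS SQL Server; the alias TOTAL and the table aliases P and P2 are names I made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Product (ID INTEGER, PNAME TEXT, PRICE REAL, "
    "NUMBER INTEGER, PDESCRIPTION TEXT)"
)
conn.executemany(
    "INSERT INTO Product VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Apple", 12, 3000, None),
        (2, "Banana", 16.99, 7600, None),
        (3, "Olive", 25.22, 4500, None),
        (4, "Coco Nut", 40.99, 2000, None),
        (4, "Orange", 15.99, 5500, None),
        (5, "Pineapple", 30, 2500, None),
        (6, "Olive", 25.22, 3000, None),
    ],
)

# Scalar subquery: yields one value, so it can sit in the select list.
scalar = conn.execute(
    "SELECT PNAME, (SELECT COUNT(*) FROM Product) AS TOTAL "
    "FROM Product WHERE ID = 1"
).fetchall()

# Correlated subquery: it refers to P.ID, so its result depends on the
# current row of the outer query.
correlated = conn.execute(
    "SELECT P.ID, P.PNAME FROM Product P "
    "WHERE (SELECT COUNT(*) FROM Product P2 WHERE P2.ID = P.ID) > 1 "
    "ORDER BY P.PNAME"
).fetchall()

# Non-correlated subquery returning a set of values: it does not refer
# to the outer query, so one evaluation serves every outer row.
non_correlated = conn.execute(
    "SELECT ID, PNAME FROM Product "
    "WHERE ID IN (SELECT ID FROM Product GROUP BY ID HAVING COUNT(*) > 1) "
    "ORDER BY PNAME"
).fetchall()

print(scalar)
print(correlated)
print(non_correlated)
```

Both the correlated and the non-correlated form find the rows that share ID 4; they differ in how often the inner query logically has to run.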
Correlated subqueries are usually the bigger headache, because the subquery statement has to be executed repeatedly. If the data set that the outer query operates on (not the data set it returns) has n rows, and the data set that the subquery operates on has m rows, then in the worst case the cost is on the order of n times m row operations! Add the huge space overhead of the subquery's data set, and speed suffers greatly. The subquery in the example above is more fortunate: it is a non-correlated vector subquery. Even so, an intermediate result set has to be saved and searched repeatedly, and the upshot is that it is no faster than the join query. This is also why MySQL went without subquery support for a long time. Under normal circumstances, when operating on large data sets, a join query performs better than a subquery, so we should master this method thoroughly.

Taking the final join query in Example 2 as an example, let's analyze the idea behind writing it. As mentioned earlier, in theory a Cartesian product is generated when data sets are joined. Suppose there is a table T as follows:

    WORD
    ----
    A
    B

Then executing "SELECT L.WORD, R.WORD FROM T AS L JOIN T AS R ON L.WORD = R.WORD" first produces:

    L.WORD  R.WORD
    A       A
    A       B
    B       A
    B       B

and then applies "ON L.WORD = R.WORD", which filters it down to:

    L.WORD  R.WORD
    A       A
    B       B
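The two stages can be made visible by running the join once without and once with its condition. A minimal sketch, assuming Python's sqlite3 module and an in-memory SQLite database; CROSS JOIN is used here only to display the conceptual intermediate product, which a real engine would not normally materialize.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (WORD TEXT)")
conn.executemany("INSERT INTO T VALUES (?)", [("A",), ("B",)])

# Stage 1 (conceptual): every row of L paired with every row of R.
cartesian = conn.execute(
    "SELECT L.WORD, R.WORD FROM T AS L CROSS JOIN T AS R "
    "ORDER BY L.WORD, R.WORD"
).fetchall()

# Stage 2: the ON condition keeps only the pairs that match.
filtered = conn.execute(
    "SELECT L.WORD, R.WORD FROM T AS L JOIN T AS R ON L.WORD = R.WORD "
    "ORDER BY L.WORD"
).fetchall()

print(cartesian)  # 2 x 2 = 4 pairs
print(filtered)   # one pair per distinct WORD
```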
Here we put this intermediate Cartesian product to work. If the data in the ID column of the Product table were indeed unique, then after a self-join the IDs in the result would still be unique, just like the WORD column of table T above. Now let's execute this statement:

    SELECT L.ID, R.ID, L.PNAME, R.PNAME
    FROM Product L
    JOIN Product R ON L.ID = R.ID

The results are as follows:

    ID  ID  PNAME      PNAME
    1   1   Apple      Apple
    2   2   Banana     Banana
    3   3   Olive      Olive
    4   4   Orange     Orange
    4   4   Coco Nut   Orange
    4   4   Orange     Coco Nut
    4   4   Coco Nut   Coco Nut
    5   5   Pineapple  Pineapple
    6   6   Olive      Olive

Did you notice? ID 4, which originally appeared twice, now appears 4 times. This is because Coco Nut and Orange share the same ID: the Cartesian product pairs them with each other, and the join condition cannot filter those pairs out. So after grouping, the rows with ID 4 fall into two groups of two rows each, whereas normal data has only one row per group. In this way we can find the rows with duplicated IDs, and we even know how many times each one is repeated! Please see the following SQL statement:

    SELECT L.ID, R.PNAME, R.PRICE, R.NUMBER, R.PDESCRIPTION, COUNT(*) ROW_COUNT
    FROM Product L
    JOIN Product R ON L.ID = R.ID
    GROUP BY L.ID, R.PNAME, R.PRICE, R.NUMBER, R.PDESCRIPTION
    HAVING COUNT(*) > 1

Results:

    ID  PNAME     PRICE  NUMBER  PDESCRIPTION  ROW_COUNT
    4   Coco Nut  40.99  2000    NULL          2
    4   Orange    15.99  5500    NULL          2
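The row counts behind this reasoning can be checked directly. The sketch below assumes Python's sqlite3 module and an in-memory SQLite database. It counts the joined rows per ID before grouping by the item columns: an ID shared by k rows contributes k times k joined rows, and each of its k groups then holds k of them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Product (ID INTEGER, PNAME TEXT, PRICE REAL, "
    "NUMBER INTEGER, PDESCRIPTION TEXT)"
)
conn.executemany(
    "INSERT INTO Product VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Apple", 12, 3000, None),
        (2, "Banana", 16.99, 7600, None),
        (3, "Olive", 25.22, 4500, None),
        (4, "Coco Nut", 40.99, 2000, None),
        (4, "Orange", 15.99, 5500, None),
        (5, "Pineapple", 30, 2500, None),
        (6, "Olive", 25.22, 3000, None),
    ],
)

# The raw self-join: 5 unique IDs give 1 row each, ID 4 gives 2 x 2 = 4.
pairs = conn.execute(
    "SELECT L.ID, R.ID, L.PNAME, R.PNAME "
    "FROM Product L JOIN Product R ON L.ID = R.ID"
).fetchall()

# Joined rows per ID, before splitting into groups by item.
per_id = conn.execute(
    "SELECT L.ID, COUNT(*) FROM Product L JOIN Product R ON L.ID = R.ID "
    "GROUP BY L.ID ORDER BY L.ID"
).fetchall()

# The final statement: duplicated rows together with their repeat count.
counted = conn.execute(
    "SELECT L.ID, R.PNAME, R.PRICE, R.NUMBER, R.PDESCRIPTION, "
    "COUNT(*) AS ROW_COUNT "
    "FROM Product L JOIN Product R ON L.ID = R.ID "
    "GROUP BY L.ID, R.PNAME, R.PRICE, R.NUMBER, R.PDESCRIPTION "
    "HAVING COUNT(*) > 1 ORDER BY R.PNAME"
).fetchall()

print(len(pairs))
print(per_id)
print(counted)
```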
    (2 row(s) affected)

Like the subquery, this construction also brings some interesting side effects, some good and some bad. These will be discussed in later chapters. Similarly, with the statement

    SELECT R.ID, L.PNAME, R.PRICE, R.NUMBER, R.PDESCRIPTION, COUNT(*) ROW_COUNT
    FROM Product L
    JOIN Product R ON L.PNAME = R.PNAME
    GROUP BY R.ID, L.PNAME, R.PRICE, R.NUMBER, R.PDESCRIPTION
    HAVING COUNT(*) > 1

you can find the rows whose PNAME is duplicated, along with the number of repetitions:

    ID  PNAME  PRICE  NUMBER  PDESCRIPTION  ROW_COUNT
    3   Olive  25.22  4500    NULL          2
    6   Olive  25.22  3000    NULL          2
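Since the ID and PNAME versions differ only in the join column, the pattern can be wrapped in a small helper. This is a hypothetical convenience of my own, not part of the original examples: the function name find_duplicates and its whitelist are assumptions, and the sketch runs on Python's sqlite3 module with an in-memory SQLite database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Product (ID INTEGER, PNAME TEXT, PRICE REAL, "
    "NUMBER INTEGER, PDESCRIPTION TEXT)"
)
conn.executemany(
    "INSERT INTO Product VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Apple", 12, 3000, None),
        (2, "Banana", 16.99, 7600, None),
        (3, "Olive", 25.22, 4500, None),
        (4, "Coco Nut", 40.99, 2000, None),
        (4, "Orange", 15.99, 5500, None),
        (5, "Pineapple", 30, 2500, None),
        (6, "Olive", 25.22, 3000, None),
    ],
)


def find_duplicates(conn, column):
    """Return rows of Product whose `column` value occurs more than once.

    Hypothetical helper: the column name is checked against a whitelist
    because identifiers cannot be passed as bound parameters.
    """
    if column not in {"ID", "PNAME"}:
        raise ValueError(f"unsupported column: {column}")
    sql = (
        "SELECT R.ID, R.PNAME, R.PRICE, R.NUMBER, R.PDESCRIPTION, "
        "COUNT(*) AS ROW_COUNT "
        f"FROM Product L JOIN Product R ON L.{column} = R.{column} "
        "GROUP BY R.ID, R.PNAME, R.PRICE, R.NUMBER, R.PDESCRIPTION "
        "HAVING COUNT(*) > 1 ORDER BY R.ID, R.PNAME"
    )
    return conn.execute(sql).fetchall()


by_name = find_duplicates(conn, "PNAME")
by_id = find_duplicates(conn, "ID")
print(by_name)
print(by_id)
```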
    (2 row(s) affected)

From these two examples we can see that a deeper understanding of how relational databases operate, together with skilled use of simple queries and join queries, can effectively improve the performance and maintainability of a program and reduce the complexity of its code. Why not do it?

There is no Money data type in InterBase, so when you create the Product table in InterBase, remember to define the PRICE field as another type; here I use NUMERIC(8, 4). In addition, there is one more problem in InterBase. Execute the following statement:

    DELETE FROM Product
    WHERE ID IN (SELECT ID
                 FROM Product
                 GROUP BY ID, PNAME, PRICE, NUMBER, PDESCRIPTION
                 HAVING COUNT(*) > 1)

In theory it should delete both "Apple" rows, and MS SQL Server 2000 does exactly that. But InterBase deletes only one of them! The data after execution is as follows:

    ID  PNAME   PRICE    NUMBER  PDESCRIPTION
    1   Apple   12.0000  3000    NULL
    2   Banana  16.9900  7600    NULL

Obviously, after deleting each row, InterBase queries the table again and decides afresh which data to delete. For a relational database this is not a good thing: it is neither rigorous nor elegant. For this particular statement, however, it works in our favor: a single DELETE command completes a merge of the data that would otherwise take several operations. There are other places in InterBase that, like this one, do not implement true set operations the way MS SQL Server does. I will point them out in later examples as they come up; please watch out for them in actual work.

The InterBase 6.0.1 I am using is a free, open source database, while MS SQL Server is the apple of Microsoft's eye, and a designer of MS SQL Server 7 won the 1998 Turing Award. I have to admit that this lightweight InterBase is an admirable piece of work. It implements powerful features such as cascading updates, which MS SQL Server did not add until the 2000 version. Of course, it also has its unsatisfactory points. Considering its price/performance ratio, though, we really cannot ask for more. In addition, I suggest taking this opportunity to learn about the use of temporary tables.
I will not go into more detail here.
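For readers who want to experiment with the DELETE discussed above, here is a rough sketch. It runs on SQLite through Python's sqlite3 module, not on InterBase or MS SQL Server, so it shows only the set-based outcome in which both copies disappear. The three sample rows are my assumption about the Example 1 data, reconstructed from the result shown above. The second half is one possible temporary-table way to keep a single copy of each row; the table name Dedup is hypothetical.

```python
import sqlite3

COLS = "ID, PNAME, PRICE, NUMBER, PDESCRIPTION"


def make_table():
    """Build a Product table with a fully duplicated row (assumed data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Product (ID INTEGER, PNAME TEXT, PRICE REAL, "
        "NUMBER INTEGER, PDESCRIPTION TEXT)"
    )
    conn.executemany(
        "INSERT INTO Product VALUES (?, ?, ?, ?, ?)",
        [
            (1, "Apple", 12, 3000, None),
            (1, "Apple", 12, 3000, None),
            (2, "Banana", 16.99, 7600, None),
        ],
    )
    return conn


# Set-based DELETE: the subquery is evaluated against the table as it
# was before any row was removed, so both Apple rows go.
a = make_table()
a.execute(
    "DELETE FROM Product WHERE ID IN "
    f"(SELECT ID FROM Product GROUP BY {COLS} HAVING COUNT(*) > 1)"
)
after_delete = a.execute(f"SELECT {COLS} FROM Product ORDER BY ID").fetchall()

# Temporary-table merge: copy the distinct rows aside, empty the table,
# then copy them back, leaving one copy of each row.
b = make_table()
b.execute(f"CREATE TEMP TABLE Dedup AS SELECT DISTINCT {COLS} FROM Product")
b.execute("DELETE FROM Product")
b.execute(f"INSERT INTO Product ({COLS}) SELECT {COLS} FROM Dedup")
b.execute("DROP TABLE Dedup")
after_merge = b.execute(f"SELECT {COLS} FROM Product ORDER BY ID").fetchall()

print(after_delete)
print(after_merge)
```

The temporary-table route takes several statements where InterBase's behavior needed one, but it does not depend on how a particular engine evaluates DELETE.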