?
The initial idea of writing this book comes from the communication and learning of colleagues when working in the first company. However, the fuse of this book is to read a post on the latest information on the latest information on the latest information on 9CBS. The problem in the post can be done in two ways of subquery and join. Due to the limited conditions, I can't answer them in detail, and the friends who posted can't understand my original intention, let my heart regret. So decided to put the idea of the book into action, and put this book on 9CBS, communicate with everyone, and make progress together. Today, I just see the problem similar to the day, my heart is touching, and I decided to discuss it in detail here.
In actual work, we sometimes need to establish a data sheet to store the data of the changes, and statistics our information we need by these data. It is characterized by a class of problems in the final result of filtering conditions from packet statistics. Such applications are common in website databases, accounting systems, real-time systems, data warehouses and data mining. In fact, this proposition itself has included data mining. Now let's see the example below
Example 4-4-1: Latest offer
Netizen Kikilyq asked:
I have a table: computer_price, the format is as follows:
Goods price dates
--------------------------------
HP Computer 20000 5.21
HP Computer 20050 5.23
NEC 31200 5.3
NEC 32000 5.5
Query results Requirements: To find out the latest price of each computer;
The result of the above table is:
Goods price dates
---------------------------------
HP Computer 20050 5.23
NEC 32000 5.5
Help, get this problem?
According to the problem, we first establish a data sheet. After analysis, the data should be marked by the goods name and date, so these two fields are the main keyword:
SQL Server version is as follows
Create Table [DBO]. [Goods] (
[Goods] [25) Collate Chinese_PRC_CI_AS Not Null,
[Price] [Money] Not null,
[DATE_TIME] [datetime] Not null,
PRIMARY Key (Goods, Date_Time)
) On [primary]
The Interbase version is as follows
CREATE TABLE GOODS
(
Goods char (10) Not null,
Price Numeric (15, 4) Not null,
Date_time TimeStamp Not Null,
PRIMARY Key (Goods, Date_Time)
)
After the establishment, the reader will insert the data itself.
In this problem, the price of the computer in the final report depends on its subsequent quotation, which is the latest line of quotation. Typically belongs to the type mentioned earlier. Initial, I will try to select the latest offer date for each computer, this is relatively simple:
SELECT Goods, Max (Date_Time)
From goods
GROUP BY Goods
Return the result
Good
---------- -------------------------------------------------------------------------------------------------------------------------------- ----------------
HP 2002-05-23 00: 00: 00.000
NEC 2002-05-05 00: 00: 00.000
Obviously, as long as each brand of computers are displayed in the above date, it is what we have to have. So how do you write directly?
Select Goods, Price, Max (Date_Time)
From goods
GROUP BY Goods
I think this statement doesn't have to tried. The slight experience will find that the Price column is not in the statistical function, nor the Group By, the database system cannot perform such a statement. Such columns must be removed from another data set, so I first thought of self-join. But I believe that most friends will first think of subquery. Now let's take a look at how the child is doing, after all, this is more intuitive. The most lazy way is to express the price to an inserted standard query:
SELECT L.GOODS,
(Select R.Price from Goods R where r.goods = l.goods and r.date_time = max (l.date_time)) AS Price,
Max (l.date_time) AS current_date_time
From Goods L
Group by l.goods
But unfortunately, this statement can only be executed in SQL Server, and Interbase's prompt is the Invalid Column Reference (invalid column reference). However, you can write a universal version for a thinking. Now, we say the system, I have to take some lines from the table, each brand of computer, the date is the latest offer date of this brand:
Select L.Goods, L.PRice, L.Date_time
From Goods L
WHERE L.DATE_TIME =
(SELECT)
Max (r.date_time)
From Goods R
Where r.goods = L.GOODS
GROUP BY R.GOODS)
This one
I think I can still write a few different subquery variants, but the same size is different, I don't try it. This version seems to be a little unreliable, because only DATE_TIME in the WHERE condition of the primary query does not seem to have an accurate area to separate each row of data. But rest assured, here has a "strange" mutual reference, the record of the primary query is to meet the date equal to the return value of the child query, while the subquery of the goods name (goods column) depends on the goods name of the primary query (goods column). In this way, the subquery will return its correct date for the current brand, which is the stunts of the associated subqueries, which also caused the reason why it is more efficient in many occasions. My interest in the child is here, I said in the previous article, the connection query is a very good technology, then is this inquiry that may be implemented? The analysis of the subquery in front will help us think here. Now that we have two datasets, a maximum date, a price, join them, can n't you? The two result sets are in the subquery above, the current problem is how we join them, clearly, there is a joint condition is R.Goods = L.Goods, which also determines the unique identity of the final result set. One - the goods column, and the filter criteria of the date is l.date_time = max (r.date_time), plus the column in the Group BY result, so there is:
Select L.Goods, L.PRice, L.Date_time
From Goods L
Join Goods R
On L.Goods = R.GOODS
Group by l.price, l.goods, l.date_time
Having l.date_time = max (r.date_time)
Some friends may not understand why this group By is coming, simply. We did the self-heaters of a Goods table (two datasets were L and R), they generated a Dikar. Some people call it data explosion. However, after the ore blows, it is convenient for us to find gold. Using the Join Conditions, the L Data Concentration is in the R data set, all of its own quotation date. Now we have to group the R data collection group, select the latest quotation date of each brand, with its filter L, the data set should use group by r.date_time
Having l.date_time = max (r.date_time)
Since we have to take out the three columns in the D data concentration, you have to list them in Group By, because there is already L.Goods = R.Goods, R.Goods is not written in Group By. Since I mentioned the reasons I mentioned in the previous article, I personally prefer to use join. Of course, in terms of efficiency, when the data of each group in the table is much more, the efficiency of the connection query is offset by huge Dikar, and the app (if this is a table that records the production line instant work). Statistics Its latest production situation, usually this); there is very little data in the table, but there are many data groups (such as large online bookstores, users may be an astronomical figure, relatively speaking, the book bought Not much), the child query still needs to traverse the entire table, the efficiency is too low, and the Dikar, which is the Dikar, which is successful, and exchanges in space. Advantage At this time, you should use a join query.
The return result set of the above programs is:
Goods Price Date_time
---------- --------------------------------------- -----------------------------------
HP 20050.0000 2002-05-23 00: 00: 00.000
NEC 31000.0000 2002-05-5 00: 00: 00.000
In addition to specifically indicated, the above scripts can be universal in MS SQL Server7.0 or higher and interbase6.0.1 or higher. Mastering the way to design the above scripts can be widely used in the various SQL programming of the query condition. From this, we can also discuss two interesting aspects: join queries and data packets.
Attachment: This connection query script should also be posted behind Kikilyq, but because I stupidly posted two not the post (a wrong, a stupid), 9CBS webpage does not allow me to go again Kikilyq's post is followed. Please kikilyq to read it here. Sorry.
About "SQL Story": This book is in proposing some common SQL programming problems, summarizes some relational database design and SQL language programming mode, through practical problems, helps readers to improve database programming capabilities, this The book will also involve the theory of relationship algebra. I plan to collect some typical SQL programming issues on 9CBS, as an important part of this book. The author guarantees that no one's labor results, all answers and analyzes are never faked. If you have a master pointing, I will definitely point out and give you a thank you. Welcome friends from all users to provide materials, first express gratitude. In addition, I hope that there is an Oracle master to work with me to complete the Oracle section of this book. Share successful joy and labor hardship.
The following is the "SQL Story" outline I have developed, that is, now the example of each article will have a certain position of this book at that time, and the current number will now be changed. Foreword
Chapter 1 Basic Principles
First, before the encoding
Second, the coding principle
Third, SQL script design principle
Chapter II Truth of Relational Database
First, the collection of vector space
Second, the information in the collection
Third, the collection operation
Fourth, the real world
Chapter 3 is not simple and simple query
First, choose information
Second, the conditions filter
Third, order and disorder
Fourth, information statistics and data mining
Chapter 4 Space "Multiplication" - Database & Sql Blog Articles
First, Dikal Consultation and Information Explosion
Second, a higher dimensional space
Third, join inquiry
Four, Join and WHERE
Chapter 5 SQL "Sentence" - subquery
First, from sentence syntax
Second, the cost assessment
Third, live subscription
Chapter VI Collection Writes - Ninert, Delete and Update
I. Design method
Second, delete
Third, insert
Fourth, change
Chapter VII Ultimate Weapon - Cursor
First, why
Second, how to do the cursor
Third, the tag special
Fourth, procedural coding
Chapter 8 Impossible Tasks
I. Funny problem highlights
Second, no limit
Third, unlimited information set
Fourth, explore collection
Chapter 9 is not just SQL - database management
First, reasonable index
Second, the code for lasting - stored procedures, storage functions, and triggers
Third, view usage
Fourth, the reasonable way of data storage
postscript
Attached: Alternative sound