Paging & QueryKey & Fixed-Length Prefetch
A database paging query is generally done in two steps:
(1) Count the total number of records that match the query conditions.
(2) From the records that match the query conditions, fetch those that fall within the requested range (start position OFFSET, records per page SPAN).
First, fetching data by range
The following approach can be used to take a specified range (OFFSET, SPAN) out of a ResultSet.
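// A scrollable, read-only result set lets us jump straight to the start row.
PreparedStatement ps = con.prepareStatement(sql, ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_READ_ONLY);
ps.setMaxRows(offset + span); // never read past the end of the requested range
ResultSet rs = ps.executeQuery();
rs.absolute(offset); // the next call to rs.next() lands on row offset + 1
while (rs.next()) ...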
When the data volume is large, there are many pages, and OFFSET is large, this method is not suitable. In that case you need to use the native SQL paging feature of each database.
Let's borrow from the Hibernate dialect package, whose Dialect classes support a getLimitString method for various databases. Here are the MySQL and Oracle examples. Suppose the query statement is
select * from message where forum_id = ? and created_time > ? order by created_time desc
Then MySQL's limit SQL is
select * from message where forum_id = ? and created_time > ? order by created_time desc
limit ?, ?
The two ? after limit are OFFSET and SPAN, respectively.
Oracle's limit SQL is
select * from ( select row_.*, rownum rownum_ from (
select * from message where forum_id = ? and created_time > ? order by created_time desc
) row_ where rownum <= ? ) where rownum_ > ?
The two trailing ? are OFFSET + SPAN and OFFSET, respectively.
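The two wrappers above can be sketched as plain string helpers. This is only an illustration of the idea, not Hibernate's actual code; the class name LimitSql and its method names are hypothetical.

```java
// Hypothetical helper, loosely modeled on the idea behind getLimitString.
final class LimitSql {
    private LimitSql() {}

    // MySQL: bind OFFSET, then SPAN, to the two trailing placeholders.
    static String mysql(String sql) {
        return sql + " limit ?, ?";
    }

    // Oracle: bind OFFSET + SPAN to the inner placeholder,
    // then OFFSET to the outer one.
    static String oracle(String sql) {
        return "select * from ( select row_.*, rownum rownum_ from ( "
                + sql
                + " ) row_ where rownum <= ? ) where rownum_ > ?";
    }
}
```

Note that the extra placeholders come after the query's own placeholders, so they are bound last.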
Second, cache & QueryKey
A count statement can be generated automatically from the query statement, for example
select count(*) from (
select * from message where forum_id = ? and created_time > ? order by created_time desc
)
Such an auto-generated count statement is wasteful: it uses a subquery, and it also keeps the useless order by. It is better to provide a dedicated count statement:
select count(*) from message where forum_id = ? and created_time > ?
When the user flips through many pages, this count statement is executed again and again. To improve efficiency, I store the count result in a global cache. Not only can the current session's user reuse it, other users can also reuse the result when they browse messages with the same conditions.
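A minimal sketch of such a global count cache follows. The class CountCache, its method, and the runCountQuery callback are hypothetical names; the key here is simply the SQL plus its parameters, and expiry or invalidation of stale counts is left out.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical global cache for count results, shared by all sessions.
final class CountCache {
    private final Map<List<Object>, Long> counts = new ConcurrentHashMap<>();

    // Returns the cached count for (sql, params) and runs the query only on a miss.
    // runCountQuery stands in for the real JDBC call, which is not shown here.
    long count(String sql, List<Object> params, Supplier<Long> runCountQuery) {
        List<Object> key = List.of(sql, List.copyOf(params));
        return counts.computeIfAbsent(key, k -> runCountQuery.get());
    }
}
```

Two users who ask for the same forum with the same time condition produce equal keys, so only the first request reaches the database.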
I use the common QueryKey of the persistence layer as the cache key.
A QueryKey has three parts: SQL, Parameters, Range. For example:
QueryKey: SQL: select count(*) from message where forum_id = ? and created_time > ?
Parameters: [buaawhl, time long value]
Range: (0, 1)
This QueryKey is critical, mainly its two methods hashCode and equals.
We know that when a key is looked up in a hash data structure such as a Map, the hashCode is used first to find the bucket, and then the key is compared with equals against the chain of keys stored behind that hashCode.
For example, Key1 and Key2 have the same hashCode, which is different from Key3's hashCode.
...
[101] -> Key1 -> Key2
...
[666] -> Key3
...
It can be seen that hashCode and equals are both called on every lookup. The equals method matters more, because it is likely to be called several times per lookup.
Optimizing hashCode is relatively simple: build it from the parts of the QueryKey that differ, so that hashCode values are spread out as widely as possible and collisions are rare.
The key is the equals implementation. The principle here is to compare the smaller structures first, which lets the comparison finish sooner.
Parameters and Range in QueryKey are easy to handle. In each equals call, compare the Range first; if it is not equal, return false. If it is equal, compare the parameters; if any parameter value is not equal, return false. In this way we filter out a large number of unequal QueryKeys at very little time cost.
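The comparison order described above could look like the following sketch. The field names and layout are assumptions made for illustration, and precomputing the hash in the constructor is an extra choice of this sketch rather than something the text prescribes.

```java
import java.util.Arrays;

// Hypothetical QueryKey; field names and layout are assumptions.
final class QueryKey {
    private final String sql;
    private final Object[] params;
    private final int offset;
    private final int span;
    private final int hash;

    QueryKey(String sql, Object[] params, int offset, int span) {
        this.sql = sql;
        this.params = params.clone();
        this.offset = offset;
        this.span = span;
        // Mix all three parts so that different keys rarely share a hash code.
        int h = sql.hashCode();
        h = 31 * h + Arrays.hashCode(this.params);
        h = 31 * h + offset;
        h = 31 * h + span;
        this.hash = h;
    }

    @Override
    public int hashCode() {
        return hash;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof QueryKey)) return false;
        QueryKey other = (QueryKey) o;
        // Cheapest first: the range is two int comparisons.
        if (offset != other.offset || span != other.span) return false;
        // Then the parameters, which are usually few and short.
        if (!Arrays.equals(params, other.params)) return false;
        // The SQL text goes last because it is the longest part.
        return sql.equals(other.sql);
    }
}
```

With this ordering, two keys for different pages of the same query are rejected after the range check alone, and the SQL text is never read.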
But when Parameters and Range are both equal, we still cannot avoid comparing the SQL. String's equals method is as follows:
// from JDK src
// This method does not compare hashCode; it compares length and characters directly
public boolean equals(Object anObject) {
    if (this == anObject) {
        return true;
    }
    if (anObject instanceof String) {
        String anotherString = (String) anObject;
        int n = count;
        if (n == anotherString.count) {
            char v1[] = value;
            char v2[] = anotherString.value;
            int i = offset;
            int j = anotherString.offset;
            while (n-- != 0) {
                if (v1[i++] != v2[j++])
                    return false;
            }
            return true;
        }
    }
    return false;
}
We see that when two long Strings have the same content but different references, the comparison is quite time consuming. So String comparison is not afraid of differences; it is afraid of sameness. In most cases, different Strings have different lengths, or differ within the first few characters, and the result is known quickly.
Of course, there is also this case: both SQL Strings are very long, their lengths are equal, most of the leading characters are the same, and a different character appears only near the end. For example,
select * from message where forum_id = ? and created_time > ? order by created_time desc
and
select * from message where forum_id = ? and created_time > ? order by updated_time desc
These two Strings are equal in length and identical for most of the front part, so the difference is found only late in the scan. And if the contents of two Strings are the same, the comparison has to run all the way to the end before it can decide that they are exactly equal.
My first approach is to use static final Strings as the QueryKey's SQL wherever possible. If two SQLs have the same reference, they can be judged equal immediately.
This approach only works for SQL statements that can be defined in advance from the actual requirements. In many cases the SQL has to be spliced together dynamically, and then there is no way to guarantee that all identical SQL shares the same reference.
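The difference between a shared constant and a dynamically spliced string can be seen in a small sketch. The class MessageSql and its members are hypothetical names used only for this illustration.

```java
// Hypothetical holder for SQL constants used as the QueryKey's SQL.
final class MessageSql {
    static final String BY_FORUM =
            "select * from message where forum_id = ? and created_time > ? order by created_time desc";

    private MessageSql() {}

    // Builds the same text at run time; the result is a new object,
    // so equals() has to scan every character to confirm equality.
    static String buildDynamically(String orderColumn) {
        return new StringBuilder(
                "select * from message where forum_id = ? and created_time > ? order by ")
                .append(orderColumn)
                .append(" desc")
                .toString();
    }
}
```

Every use of MessageSql.BY_FORUM yields the same reference, so String.equals returns on its first check. The string returned by buildDynamically("created_time") has equal content but a different reference, which is the slow case described above.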
So I took a second approach: splitting and generalizing. Split one SQL String into an array of SQL constants, and generalize the SQL type so that SQL is no longer limited to the String type and can also be of type String[].
For example,
String[] sql1 = {
    "select * from message where forum_id = ?",
    "and created_time > ?",
    "order by",
    "created_time",
    "desc"
};
and
String[] sql2 = {
    "select * from message where forum_id = ?",
    "and created_time > ?",
    "order by",
    "created_time",
    "desc"
};
and
String[] sql3 = {
    "select * from message where forum_id = ?",
    "and created_time > ?",
    "order by",
    "updated_time",
    "desc"
};
Now comparing sql1 with sql2 and sql3 becomes much more efficient. The sql1 and sql2 arrays are equal in length and still have to be compared element by element, but they are built from String constants, and identical String constants share the same reference. So 5 steps are enough to determine that all elements of sql1 and sql2 are equal, and 4 steps plus one first-character comparison are enough to determine that the fourth elements of sql1 and sql3 are not equal.
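The element-by-element comparison can be sketched as below. SqlParts and sameSql are hypothetical names. The explicit reference check only makes visible what String.equals already does as its first step.

```java
// Hypothetical helper: compares SQL stored as String[] fragments.
final class SqlParts {
    private SqlParts() {}

    static boolean sameSql(String[] a, String[] b) {
        if (a == b) return true;
        if (a.length != b.length) return false;
        for (int i = 0; i < a.length; i++) {
            // Identical constants share one reference, so this check usually
            // settles the element without reading any characters.
            if (a[i] != b[i] && !a[i].equals(b[i])) return false;
        }
        return true;
    }
}
```

For sql1 and sql3, the first three elements pass on the reference check, and the fourth fails as soon as 'c' is compared with 'u'.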
We see that both approach 1 and approach 2 improve the efficiency of SQL comparison. In most cases the SQL comparison may even be faster than the parameter comparison.
Third, prefetch
It is fairly common for many users to access the same page. For example, popular topics in a forum are likely to be read by many people. In that case, if the List of data objects fetched for a range is also put into the cache under its QueryKey, the response speed improves greatly and the load on the database server goes down. Of course, the memory load on your web server goes up as well. :-)
We further consider the following two situations:
1. User-defined number of records per page
In general, users can choose how many records are shown per page. For example, some users like 20 per page and some like 10 per page. Suppose user A opens the first page of a forum and sees records 1 - 20, while user B opens the first page of the same forum and sees records 1 - 10. In this case the cache hit rate is very low. User A and user B cannot share cached data: their Range (SPAN) is always different, so their QueryKeys are never equal.
2. Many records, too few records per page
Suppose a forum has 1,000 messages at 10 per page, for a total of 100 pages. If the user flips through them one page at a time, each time the program issues a query with a SPAN of 10, fetches 10 records, and caches them under the QueryKey. Because the number of records per page is so small, each database query is inefficient, and the cache hit rate is also very low.
To improve the cache hit rate and gain data prefetching at the same time, we can fetch data with one uniform, fixed SPAN. Continuing the example above, we set a unified SPAN of 100 in the program.
When user A requests records 1 - 10, the program determines that this falls within 1 - 100, fetches the 100 records of Range (1, 100), and returns the first 10 to the user. When user A flips to the next page and requests records 11 - 20, the program determines that this still falls within 1 - 100, which is already in the cache, so it returns records 11 - 20 to user A directly.
When user B requests records 1 - 20, the program determines that this falls within 1 - 100, which is already in the cache, so it returns records 1 - 20 to user B directly.
It can be seen that this fixed-length prefetch greatly improves both the efficiency of database queries and the cache hit rate.
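The fixed-length prefetch can be sketched as follows. All names here (PrefetchCache, fetch, loader) are hypothetical, offsets are zero-based, and the cache is keyed only by the block's start position for brevity. In the design described above, the key would be a QueryKey whose Range is the aligned block.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiFunction;

// Hypothetical fixed-length prefetch cache; all names are illustrative.
final class PrefetchCache<T> {
    private final int blockSpan;
    // Loads (offset, span) from the database; stands in for the real query.
    private final BiFunction<Integer, Integer, List<T>> loader;
    private final Map<Integer, List<T>> blocks = new ConcurrentHashMap<>();

    PrefetchCache(int blockSpan, BiFunction<Integer, Integer, List<T>> loader) {
        this.blockSpan = blockSpan;
        this.loader = loader;
    }

    // Returns records [offset, offset + span), using zero-based offsets.
    List<T> fetch(int offset, int span) {
        List<T> result = new ArrayList<>(span);
        int end = offset + span;
        int pos = offset;
        while (pos < end) {
            // Align the request to the block that contains pos.
            int blockStart = (pos / blockSpan) * blockSpan;
            List<T> block = blocks.computeIfAbsent(
                    blockStart, s -> loader.apply(s, blockSpan));
            int from = pos - blockStart;
            int to = Math.min(end - blockStart, block.size());
            if (from < to) result.addAll(block.subList(from, to));
            if (block.size() < blockSpan) break; // short block: no more records
            pos = blockStart + blockSpan;
        }
        return result;
    }
}
```

With blockSpan set to 100, the requests fetch(0, 10), fetch(10, 10) and fetch(0, 20) all fall in the block starting at 0, so the loader is called only for the first of them.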