There is no doubt that ADO.NET provides a powerful in-memory database object model. In particular, the DataSet class is not only functionally equivalent to an in-memory collection of database tables, it also supports constraints and logical relationships between those tables. Moreover, the DataSet object is a disconnected (offline) data container.

At first glance, combining these features of the DataSet class lets you eliminate complex clauses from SQL query commands, such as long and deeply nested INNER JOIN or GROUP BY clauses. A complex query can be broken into two or more independent simple queries, with the results of each saved in a separate DataTable object. By defining the constraints and logical relationships between these in-memory tables, you can then rebuild the necessary referential integrity between the original tables. For example, you can load the Customers table and the Orders table into two different DataTable objects and bind them through a DataRelation object. In this way SQL Server (or any other DBMS) is spared the heavy burden of the INNER JOIN clause and, more importantly, network traffic is greatly reduced.

This way of simplifying SQL queries is effective, but it is not always the best choice, especially when your database is large and frequently updated. This article introduces another technique for simplifying SQL queries, one that makes full use of ADO.NET's in-memory data objects to reduce the burden on both users and the DBMS.

Decomposing SQL query commands

Many books about ADO.NET, such as David Sceppa's excellent "Programming ADO.NET Core Reference", recommend breaking a complex SQL query command into a number of simple subqueries and then saving the result of each subquery in a separate DataTable object inside the same DataSet container.
Let's look at an example. Suppose you need to retrieve customer order information: the orders must have been submitted in a given year, must be grouped by customer, and must contain at least 30 items. At the same time, you also want the last name of the employee who submitted each order and the company name of the customer.
You can implement it with the following SQL query:

    DECLARE @TheYear int
    SET @TheYear = 1997
    SELECT o.customerid, od.orderid, o.orderdate, o.shippeddate,
        SUM(od.quantity * od.unitprice) AS price,
        c.companyname, e.lastname
    FROM Orders AS o
    INNER JOIN Customers AS c ON c.customerid = o.customerid
    INNER JOIN Employees AS e ON e.employeeid = o.employeeid
    INNER JOIN [Order Details] AS od ON o.orderid = od.orderid
    WHERE Year(o.orderdate) = @TheYear AND od.orderid = o.orderid
    GROUP BY o.customerid, c.companyname, od.orderid,
        o.orderdate, o.shippeddate, e.lastname
    HAVING SUM(od.quantity) > 30
    ORDER BY o.customerid

Forget for a moment whether you use ADO or ADO.NET. If you submit the query above directly, you see the result set shown in Figure 1.

Figure 1. The output of the first SQL query, as generated and displayed by SQL Server Query Analyzer.

In this query, one subquery is the core and the other two INNER JOIN clauses play a supporting role. The core subquery retrieves from the database all orders submitted in the specified year that contain at least 30 items. It looks like this:

    SELECT o.customerid, o.orderid, o.orderdate, o.shippeddate,
        SUM(od.quantity * od.unitprice) AS price, o.employeeid
    FROM Orders AS o
    INNER JOIN [Order Details] AS od ON o.orderid = od.orderid
    WHERE Year(o.orderdate) = @TheYear AND od.orderid = o.orderid
    GROUP BY o.customerid, o.orderid, o.orderdate,
        o.shippeddate, o.employeeid
    HAVING SUM(od.quantity) > 30
    ORDER BY o.customerid

In its result set, customers and employees are represented by their IDs, whereas this example requires the customer's company name and the employee's last name. The ORDER BY o.customerid clause is simple but important. Also, because company names and employee names are much longer than the IDs, returning only the IDs avoids repeating those strings and gives a more compact result set.
In summary, the entire SQL query can be broken down into three subqueries: one core subquery that retrieves the order records, and two auxiliary subqueries that build two lookup tables, one mapping employee ID to employee name and one mapping customer ID to company name:

    SELECT employeeid, lastname FROM Employees
    SELECT customerid, companyname FROM Customers

The following ADO.NET code demonstrates how to save the result sets of these three subqueries in a DataSet object.
    Dim conn As SqlConnection = New SqlConnection(connString)
    Dim adapter As SqlDataAdapter = New SqlDataAdapter()
    conn.Open()
    adapter.SelectCommand = New SqlCommand(cmdCore, conn)
    adapter.SelectCommand.Parameters.Add("@TheYear", 1997)
    adapter.SelectCommand.Parameters.Add("@TheQuantity", 30)
    adapter.Fill(ds, "Orders")
    adapter.SelectCommand = New SqlCommand(cmdCust, conn)
    adapter.Fill(ds, "Customers")
    adapter.SelectCommand = New SqlCommand(cmdEmpl, conn)
    adapter.Fill(ds, "Employees")
    conn.Close()

Please note: when you run several SQL queries in a row, you should manage the database connection yourself to avoid unnecessary open/close operations. The Fill method opens and closes the connection automatically unless the connection associated with the adapter's SelectCommand property has already been opened explicitly, as in this example.

To relate the in-memory tables, you can create two relationships: one that links employeeid to lastname, and one that links customerid to companyname. In general, a relationship created between two tables in the same DataSet object is one-to-many. This example, however, requires a many-to-one relationship, which is less common. In fact, you only need to turn the parent table of the relationship (Orders) into the child table, and the child tables (Employees, Customers) into parent tables.

Figure 2. The parent and child tables in the relationship.

The DataRelation object in ADO.NET is flexible enough to build many-to-one relationships. Whenever a DataRelation object is created, ADO.NET silently sets up a unique constraint to prevent duplicate key values in the parent table. Of course, as soon as a duplicate key value appears, ADO.NET throws an exception. Consider the following code:
    Dim relOrder2Employees As DataRelation
    relOrder2Employees = New DataRelation("Orders2Employees", _
        ds.Tables("Orders").Columns("employeeid"), _
        ds.Tables("Employees").Columns("employeeid"))

The DataRelation constructor used here takes three arguments. The first is the name of the relation; the other two are DataColumn objects that represent the columns making up the relationship. The first DataColumn object is the parent column and the second is the child column.
If ADO.NET finds that the values in the parent column are not unique, it raises an ArgumentException. The simplest way to get rid of this exception is to pass a Boolean value as the fourth argument of the constructor:

    relOrder2Employees = New DataRelation("Orders2Employees", _
        ds.Tables("Orders").Columns("employeeid"), _
        ds.Tables("Employees").Columns("employeeid"), _
        False)

When the fourth argument is False, ADO.NET does not create the unique constraint, and it is that constraint that triggers the ArgumentException.

After setting up the data relations, you can use column expressions to add two columns to the Orders table to display the related content. In theory, the following code should do the job:

    Dim orders As DataTable = ds.Tables("Orders")
    orders.Columns.Add("Employee", GetType(String), _
        "Child(Orders2Employees).lastname")
    orders.Columns.Add("Customer", GetType(String), _
        "Child(Orders2Customers).companyname")

Unfortunately, it does not work. Worse, when execution reaches the code containing Child, it throws a "syntax error" message that can easily mislead the programmer. (For more information on column expressions, see last month's column.)

Why does it fail? Because the Child keyword is allowed in a column expression only when a unique constraint exists on the parent column. The documentation does not state this clearly, but it is a fact, and an important one. Interestingly, you can still reach the child rows of any row of the Orders table programmatically and read any column of the Employees or Customers table directly. The following code proves it:

    Dim orders As DataTable = ds.Tables("Orders")
    ' Read the related Employees row through the relation
    Dim employee As DataRow = orders.Rows(0).GetChildRows("Orders2Employees")(0)
    MsgBox(employee("lastname"))

So the so-called "syntax error" does not mean that you cannot establish a many-to-one relationship.
It simply tells you that, unless the constraint has been established beforehand, you cannot use the Child keyword in a column expression. In the initial relationship, the Orders table is the parent table. To obtain the employee name or the customer name from the ID, however, you must switch the roles the tables play: let the Orders table act as the child table, and let the Employees and Customers tables act as parent tables.
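The role swap can be tried out on in-memory tables alone, without a database. The following is a minimal sketch, not the article's own code: the module name, the helper functions BuildSample and BuildRelatedSample, and the sample rows are all hypothetical. It first shows the failure that occurs when Orders is the parent, then builds the relation with Employees as the parent and reads the name through a Parent expression.

```vbnet
Imports System
Imports System.Data

Module RelationDemo
    ' Hypothetical stand-ins for the Orders and Employees result sets.
    Function BuildSample() As DataSet
        Dim ds As New DataSet()

        Dim employees As DataTable = ds.Tables.Add("Employees")
        employees.Columns.Add("employeeid", GetType(Integer))
        employees.Columns.Add("lastname", GetType(String))
        employees.Rows.Add(1, "Davolio")
        employees.Rows.Add(2, "Fuller")

        Dim orders As DataTable = ds.Tables.Add("Orders")
        orders.Columns.Add("orderid", GetType(Integer))
        orders.Columns.Add("employeeid", GetType(Integer))
        orders.Rows.Add(1001, 1)
        orders.Rows.Add(1002, 2)
        orders.Rows.Add(1003, 1) ' employee 1 submitted two orders

        Return ds
    End Function

    ' Employees is the parent, Orders the child; adds the computed column.
    Function BuildRelatedSample() As DataSet
        Dim ds As DataSet = BuildSample()
        ds.Relations.Add(New DataRelation("Orders2Employees", _
            ds.Tables("Employees").Columns("employeeid"), _
            ds.Tables("Orders").Columns("employeeid")))
        ds.Tables("Orders").Columns.Add("Employee", GetType(String), _
            "Parent(Orders2Employees).lastname")
        Return ds
    End Function

    Sub Main()
        ' Orders as parent: employeeid repeats there, so the implicit
        ' unique constraint cannot be created. A separate DataSet is used
        ' so the failed attempt cannot affect the working one.
        Dim broken As DataSet = BuildSample()
        Try
            broken.Relations.Add(New DataRelation("Orders2Employees", _
                broken.Tables("Orders").Columns("employeeid"), _
                broken.Tables("Employees").Columns("employeeid")))
        Catch ex As ArgumentException
            Console.WriteLine("Orders as parent: " & ex.Message)
        End Try

        Dim ds As DataSet = BuildRelatedSample()
        For Each row As DataRow In ds.Tables("Orders").Rows
            Console.WriteLine("{0} -> {1}", row("orderid"), row("Employee"))
        Next
    End Sub
End Module
```

With Employees as the parent, the employeeid values are unique, so the default constraints can be created and no fourth constructor argument is needed.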
To swap the roles, you must exchange the columns in the DataRelation constructor call and use column expressions like these:

    Dim orders As DataTable = ds.Tables("Orders")
    orders.Columns.Add("Employee", GetType(String), _
        "Parent(Orders2Employees).lastname")
    orders.Columns.Add("Customer", GetType(String), _
        "Parent(Orders2Customers).companyname")

To summarize: in this example we broke a complex SQL query down into three simpler subqueries, eliminating two INNER JOIN clauses and reducing the burden on the database server. More importantly, we greatly reduced the network traffic from the server to the client. So is this the best possible solution?

The solution described so far is based on lookup tables, and no filtering is applied to the data when the lookup tables are built. What happens once a lookup table grows too large? Are you willing to download 10,000 records from the server just to get the names of a few hundred employees? Do you really want to download that much redundant data, data that is of no use to you?

On the other hand, a lookup table is often valuable throughout the whole life cycle of the application. In other words, although downloading many records to build a complete lookup table is a poor deal for a single query, it can be a fair one for the application as a whole.

That being so, why not try another technique to reduce the size of the lookup tables? The most obvious idea is to narrow the result set with a WHERE clause. Unfortunately, this approach is either impractical or ineffective, especially when the columns of the lookup table do not include the values you want to filter on. For example, to filter the employee names you must join other tables, such as Orders and Order Details.
I think the best solution is to reuse the result set of the previous SQL query and extract from it the information about each employee or customer. That is, after running the core query described earlier, you send an almost identical query that reruns it as a nested subquery. In this way, the database returns exactly the data you need at minimal query cost. Better still, SQL Server's query optimizer is designed to minimize the cost of such repeated queries.
    SELECT DISTINCT t.customerid, t.companyname FROM
        (SELECT o.customerid, o.orderid, o.orderdate, o.shippeddate,
            SUM(od.quantity * od.unitprice) AS price,
            c.companyname, o.employeeid
        FROM Orders AS o
        INNER JOIN Customers AS c ON c.customerid = o.customerid
        INNER JOIN [Order Details] AS od ON o.orderid = od.orderid
        WHERE Year(o.orderdate) = @TheYear AND od.orderid = o.orderid
        GROUP BY o.customerid, o.orderid, c.companyname,
            o.orderdate, o.shippeddate, o.employeeid
        HAVING SUM(od.quantity) > 30) AS t

All in all, the biggest advantage of data retrieval code built on several simple queries is that it shifts the burden of joining the data from the server to the client. In addition, because the data on the client is held in separate tables, querying it is remarkably flexible. Reading records with a few short, simple SQL queries and saving them in separate DataTable objects is good news for client applications. But what happens if part of the data returned by these subqueries does not satisfy the consistency constraints?

Using transactions

Normally each query command, however complex, executes within a single default transaction. This guarantees that the overall consistency of the data is not compromised by interference from other code during execution. But what happens if you split a logically monolithic query into several subqueries? Suppose your software has to work in a volatile environment in which the records in the database are updated rapidly. One of your subqueries may still be running when a user changes the data. In the previous sample program, this would not do much damage, because the second and third subqueries only build lookup tables.
Even so, if someone deletes an employee record after you have read the order data, the data you retrieve may still violate the consistency constraints. Therefore, the decomposed subqueries and the related processing code must be placed in the same transaction. There is no other choice.

A transaction is a set of operations that strictly obeys the following rules when it executes:

• Atomicity
• Consistency
• Isolation
• Durability

These four properties are commonly known as ACID. For this example, the most important property is isolation. Isolation means that the database guarantees that each running transaction is not interfered with by any other concurrent transaction. When your query runs in a transaction while other users' database operations run simultaneously in other transactions, the result you finally get depends on the isolation level of the transaction. Normally, the database assigns a reasonable isolation level based on the operations in each transaction. If the application requires absolute consistency and cannot tolerate phantom rows, it must use the serializable isolation level. While a serializable transaction runs, it locks all the tables involved to prevent any other user from updating or inserting rows.
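From the ADO.NET side, running the three subqueries under one serializable transaction might look roughly like the sketch below. It is an illustration under stated assumptions rather than the article's code: the module and function names are hypothetical, connString and the three command texts are assumed to be the same values used in the earlier Fill example, only the @TheYear parameter is supplied, and a reachable SQL Server database is required to run it.

```vbnet
Imports System.Data
Imports System.Data.SqlClient

Module SerializableFill
    ' Hypothetical helper: fills three tables inside one serializable
    ' transaction. The command texts are supplied by the caller.
    Function FillInOneTransaction(ByVal connString As String, _
                                  ByVal cmdCore As String, _
                                  ByVal cmdCust As String, _
                                  ByVal cmdEmpl As String) As DataSet
        Dim ds As New DataSet()
        Dim conn As New SqlConnection(connString)
        conn.Open()
        Dim tx As SqlTransaction = _
            conn.BeginTransaction(IsolationLevel.Serializable)
        Try
            Dim adapter As New SqlDataAdapter()

            ' Each command is enlisted in the transaction explicitly.
            adapter.SelectCommand = New SqlCommand(cmdCore, conn, tx)
            adapter.SelectCommand.Parameters.Add("@TheYear", _
                SqlDbType.Int).Value = 1997
            adapter.Fill(ds, "Orders")

            adapter.SelectCommand = New SqlCommand(cmdCust, conn, tx)
            adapter.Fill(ds, "Customers")

            adapter.SelectCommand = New SqlCommand(cmdEmpl, conn, tx)
            adapter.Fill(ds, "Employees")

            tx.Commit()
        Catch
            ' Release the locks and pass the error on to the caller.
            tx.Rollback()
            Throw
        Finally
            conn.Close()
        End Try
        Return ds
    End Function
End Module
```

Because the connection is opened before the first Fill call, the adapter leaves it open between calls, so all three reads share the same connection and transaction.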
Under serializable isolation, the tables are unlocked only when the transaction completes. At this isolation level, dirty reads (reads of uncommitted data) and phantom rows (rows inserted or deleted by other transactions while yours is running) cannot occur, so the overall consistency of the data is preserved. Since records may be modified while your subqueries are running, you should definitely wrap all the subqueries in a serializable transaction:

    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
    BEGIN TRANSACTION
        -- Get Orders
        -- Get Customers
        -- Get Employees
    COMMIT TRANSACTION

To stress the point once more: whenever conditions allow, your transaction should use the serializable isolation level. If you are worried about the adverse effects of locking all the tables, try the in-memory data objects of ADO.NET, which we discuss next.

Splitting the joined data

Let's take another look at the sample query in this article. Its goal is to read from the database all orders that were submitted in a specified year and contain at least a specified number of items. We also need the total amount of each order, the customer's company name, and the name of the employee who submitted the order.

    DECLARE @TheYear int
    DECLARE @TheAmount int
    SET @TheYear = 1997
    SET @TheAmount = 30
    SELECT o.customerid, od.orderid, o.orderdate, o.shippeddate,
        SUM(od.quantity * od.unitprice) AS price,
        c.companyname, e.lastname
    FROM Orders AS o
    INNER JOIN Customers AS c ON c.customerid = o.customerid
    INNER JOIN Employees AS e ON e.employeeid = o.employeeid
    INNER JOIN [Order Details] AS od ON o.orderid = od.orderid
    WHERE Year(o.orderdate) = @TheYear AND od.orderid = o.orderid
    GROUP BY o.customerid, c.companyname, od.orderid,
        o.orderdate, o.shippeddate, e.lastname
    HAVING SUM(od.quantity) > @TheAmount
    ORDER BY o.customerid

This SQL query does return all the required information at once. Because it runs as a single command in a single transaction, the consistency and serializability of the returned data are guaranteed.
However, we set this approach aside earlier and have not used it. Why? It has two problems. First, the result set combines data from three different tables:

• Orders
• Customers
• Employees

This list does not count the Order Details table. Second, the INNER JOIN clauses cause some unnecessary data movement. We cannot solve the second problem, but some ADO.NET code helps to solve the first. There is still room, therefore, to improve the feasibility and effectiveness of the whole solution. The idea is as follows: first, execute the SQL query and save the result set in a single DataTable object; then distribute the data in that DataTable across three different but related DataTable objects.
The final output is no different from that of the separate queries, but this approach saves the overhead of defining and setting up a serializable transaction and also avoids downloading extra records from the database.

When should you adopt this approach? I have found it to be a good choice when the client needs to build complex master/detail views with group-by functions and various filters. Related tables are very effective in this situation, and ADO.NET provides many optimized features for working with them.

Let's look at the specifics. The following code shows the main procedure:

    Function SplitData(ByVal ds As DataSet) As DataSet
        ' Make a full worker copy of the DataSet
        Dim _dataset As DataSet = ds.Copy()
        CreateCustomers(_dataset, ds)
        CreateEmployees(_dataset, ds)
        ' Remove columns from Orders (companyname [2] and lastname [4]).
        ' Removing by name avoids index shifts after the first removal.
        _dataset.Tables("Orders").Columns.Remove("companyname")
        _dataset.Tables("Orders").Columns.Remove("lastname")
        Return _dataset
    End Function

The code makes a full copy of the original DataSet object (ds), and that copy provides the Orders table of the new DataSet object (_dataset). Next, the code dynamically adds the Customers table and the Employees table to the new DataSet object. Finally, it removes from the Orders table of the new DataSet object the columns that are now held by the other two tables. The figure below shows the contents of the Customers table in the new DataSet object. It has only two columns: the customer ID and the company name. Since both tables have a customerid column, they can still be related.

Figure 3. The newly generated Customers table.

Let's briefly go through the code that creates and populates the Customers table and the Employees table. First, you call the Clone method on the original Orders table to create a new DataTable object. Unlike the Copy method, the Clone method copies only the metadata, that is, the structure of the table.
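The difference between Clone and Copy is easy to check on a small in-memory table. The sketch below uses a hypothetical two-row stand-in for the Orders result set (the module name, the BuildOrders helper, and the sample rows are illustrative only) and prints the column and row counts of each result.

```vbnet
Imports System
Imports System.Data

Module CloneVersusCopy
    ' Hypothetical two-row stand-in for the Orders result set.
    Function BuildOrders() As DataTable
        Dim orders As New DataTable("Orders")
        orders.Columns.Add("customerid", GetType(String))
        orders.Columns.Add("companyname", GetType(String))
        orders.Rows.Add("ALFKI", "Alfreds Futterkiste")
        orders.Rows.Add("ANATR", "Ana Trujillo Emparedados y helados")
        Return orders
    End Function

    Sub Main()
        Dim orders As DataTable = BuildOrders()
        Dim structureOnly As DataTable = orders.Clone() ' schema, no rows
        Dim fullCopy As DataTable = orders.Copy()       ' schema and rows

        Console.WriteLine("Clone: {0} columns, {1} rows", _
            structureOnly.Columns.Count, structureOnly.Rows.Count)
        Console.WriteLine("Copy:  {0} columns, {1} rows", _
            fullCopy.Columns.Count, fullCopy.Rows.Count)
    End Sub
End Module
```

The clone reports two columns and no rows, while the copy reports two columns and two rows, which is why Clone is a convenient starting point for a table that will be filled selectively.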
Since the DataTable interface does not let you clone individual columns, calling Clone on the whole table is the simplest way to generate a table with a matching structure. However, a table generated this way contains some extra columns, which must be removed. If you look at the structure of the original DataSet object, you will find that the customerid column and the companyname column are the first and second columns of the result set.

    Dim _customers As DataTable = orig.Tables("Orders").Clone()
    _customers.TableName = "Customers"
    ' Remove unneeded columns: keep the first two, drop the rest
    Dim i As Integer
    For i = 2 To _customers.Columns.Count - 1
        _customers.Columns.RemoveAt(2)
    Next

Once the table structure is in place, you have to load the data. However, the same customer may appear many times in the Orders table, so you must filter the data in the source DataSet object.
Fortunately, the Orders table is already sorted by the customerid column, so you only need to loop through all the rows and pick those that meet the condition.

    Dim row As DataRow
    Dim customerKey As String = ""
    For Each row In _dataset.Tables("Orders").Rows
        ' Already sorted by customerid
        If customerKey <> CStr(row("customerid")) Then
            ' select distinct
            _customers.ImportRow(row)
            customerKey = CStr(row("customerid"))
        End If
    Next
    ' Add to the DataSet
    _dataset.Tables.Add(_customers)

ImportRow is the fastest way to bring a row from another table into a new table. The ImportRow method imports only the columns that match the schema of the target table.

In principle, creating the Employees table works in much the same way as creating the Customers table. Of course, the columns to remove are different. Given the structure of the Orders table, we must keep the third and fourth columns. The following code first removes the first and second columns and then takes care of the remaining columns with a loop.

    Dim _employees As DataTable = orig.Tables("Orders").Clone()
    _employees.TableName = "Employees"
    ' Remove unneeded columns
    _employees.Columns.RemoveAt(0)
    _employees.Columns.RemoveAt(0)
    Dim i As Integer
    For i = 2 To _employees.Columns.Count - 1
        _employees.Columns.RemoveAt(2)
    Next

Finally, you must also remove the duplicate rows from the Employees table. Here, too, sorting simplifies the operation: you can create a view of the Orders table sorted by employeeid and then loop through all its rows.
    Dim employeeKey As Integer = 0
    Dim view As DataView = New DataView(_dataset.Tables("Orders"))
    view.Sort = "employeeid"
    Dim rowView As DataRowView
    For Each rowView In view
        If employeeKey <> Convert.ToInt32(rowView("employeeid")) Then
            ' select distinct
            _employees.ImportRow(rowView.Row)
            employeeKey = Convert.ToInt32(rowView("employeeid"))
        End If
    Next
    ' Add to the DataSet
    _dataset.Tables.Add(_employees)

Summary

This article presented a complex SQL query and discussed three solutions for improving its efficiency. Admittedly, classic ADO offers limited help with problems of this kind, but ADO.NET lets you build a powerful disconnected data object model to improve program performance. This article mentioned several solutions. Which one is the best choice? It is hard to say.