Writing a Portable Data Access Layer
Silvano CorianImicrosoft Corporation
April 2004
Applies TO: Microsoft? Visual Studio? .NET 2003 Microsoft? .NET Framework 1.1 ADO.NET VARIOUS RDBMS
Summary: Find Out How To Write Smart Applications That Work Transparently with DiffERENT DATA SOURCES, from Microsoft Access To SQL Server To Oracle Rdbms. (15 Printed Pages)
Contents
Introductionusing a universal Data Access Approachworking with Base Interfaceswriting a Specialized Data Access Layerusing Data Access Classes from the Other LayersSOME Possible ImprovementsConclusion
Introduction
During the last six years of doing consulting work, I've heard this question about data access and manipulation many times, and it's become a real obsession: "How can I write my application so that it works with database servers x, y, and ? z with few or no changes "Knowing that the data access layer is still the most critical part of a modern application, and usually the # 1 enemy for inexperienced developers, my first reaction has always been: you can not!
People's scared faces and the question, "But what about the Universal Data Access mantra that Microsoft proposed with ADO?" Prompt me to provide a more detailed explanation of the problem, and a suggested solution.
The problem is that everything goes well while your application is a small prototype, or if you have few concurrent users and simple data access logic, even if you choose the easiest way:? The use of RAD tools, like the Data Environment in Microsoft Visual Basic? 6.0, or some "all-in-one" solutions like the ActiveX? Data Control and other third-party components, which usually hide the complexity of the interaction between your application and a specific data source. But when the number of users grow, and concurrency becomes more of an issue, a lot of performance problems can appear due to the underlying use of dynamic record sets, server-side cursors, and unnecessary locking policies. The design and code changes you'll have to make to the SYSTEM TO REACH YOURS 'Goals Will Then Cost You a Lot More Because You Didn't Take This Problem Into Consideration from The Beginning.Using a Universal Data Access Approach
Microsoft launched the Universal Data Access campaign when ADO entered maturity in MDAC (Microsoft Data Access Commponents version 2.1). The idea was to show developers that with a simple object model (Connection, Command and Recordset), they could write an application that could connect with a wide set of different data sources, in both relational and non-relational form. What the documentation-and the majority of the articles and samples at that time-usually forgot to say was that even using the same data access technology, the programmability And The Characteristics of The Various Data Sources..............
The net result was that in applications which needed data from several sources, it was easiest to use the "common denominator" of functionalities provided by all data sources, thereby missing the benefits of using data source-specific options that could provide an optimized way to access and manipulate information inside the various RDBMSs.What always made me skeptical about this approach was that, after a more detailed analysis of the problem with my customers, we usually agreed the portion of the application that interacted with the data source was a very small one, when compared to the rest of the presentation and business logic. By doing a good job with a modular design, it was possible to isolate the RDBMS-specific code in some easily interchangeable modules, and thereby avoid the "one-size fits- All "Approach to Our Data Access. INSTEAD, We Could", Command Batches and Other Features, Depending on The Data Source ...............
ADO.NET brings some important changes into the data access coding arena, like the concept of specialized .NET data providers. Using a specific provider, you can get an optimized way to reach your data sources, bypassing the [very] rich-but sometimes unnecessary-series of software interfaces and services that the OLE DB and ODBC layer interposed between your data access code and the database server. Still, every data source has different characteristics and features, with different SQL dialects, and to write efficient applications you must still use these specific characteristics instead of a "common denominator". From the point of view of portability, managed and unmanaged data access technologies are still very similar.Outside of "Leverage the unique characteristics of the data source," the other rules necessary to write A Good Data Access Layer Are Usually The Same with Every Data Source:
· Use a connection pooling mechanism, where possible.
· Take Care with the limited resources of a database server.
Pay Attention to the Network Round-Trips.
· Promote The Reuse of Execution Plans and Avoid RecompiLations, WHERE Applicable.
· Use an adequate locking model to manage concurrency.
In my personal experience using the modular design approach, the amount of code in a complete application which is dedicated to working with a particular data source is not more than 10% of the total. Obviously, this is more complex than just changing the connection string IN A Configuration File, But I Think You'll Find That Is A Tolerable Compromise In Return for the Performance Benefits.
Working with Base InterfacesOur goal here is to use abstraction, and encapsulate the code specific to a particular data source in a layer of classes that let the rest of the application be independent, or decoupled, from the database server in the backend.
The object-oriented characteristics of the .NET Framework will help us during this process, giving us the opportunity to choose which level of abstraction we want to use. One option is to use the base interfaces that every .NET Data Provider has to implement ( IDbConnection, IDbCommand, IDataReader, etc). Another is to create a set of classes-the data access layer-that manage all the data access logic for the application (using the CRUD paradigm, for example). We will examine these two possibilities, Starting from A Sample Order-Entry Application, Based on The Northwind Database, To INSERT AND RETAS.
Data Provider Base Interfaces Identify The Classic Behaviors That An Application Usually Requires To Interact with a Data Source:
· Define a connection string.
· Open and close A Physical Connection to the data source.
· Define a command and related parameters.
· Execute the Different Kind of Commands you can create.
· Returning a set of data.
· Returning a scalar value.
· Executing an action on data without returning annice.
· Provide a forward-only and read-only access to the return data set.
· Define a set of operations to keep in sync a data set with the content of the data source (a data adapter).
That being said, however, if we encapsulate the various operations needed to retrieve, insert, update and delete information in different data sources (using different data providers) in our data access layer, and only expose members of the base interfaces, we can reach A First From A Data Province;
Using system.data;
Using system.data.common;
Using system.data.sqlclient;
Using system.data.oledb;
Using system.data.oraclient;
Namespace DAL
{
Public Enum DataBaseType PUBLIC ENUM
{
Access,
SQL Server,
Oracle
// Any Other Data Source Type
}
Public Enum ParameterType
{
Integer,
CHAR,
VARCHAR
// define a common parameter type set
}
Public Class DataFactory
{
PRIVATE DATAFAACTORY () {}
Public Static IDBConnection CreateConnection
String Connectionstring,
DatabaseType DBTYPE)
{
IDBCONNECTION CNN;
Switch (DBTYPE)
{
Case DatabaseType.Access:
CNN = New OLEDBCONNECTION
Connectionstring;
Break;
Case DataBaseType.sqlServer:
CNN = New SQLCONNECTION
Connectionstring;
Break;
Case DatabaseType.Oracle:
CNN = New OracleConnection
Connectionstring;
Break;
DEFAULT:
CNN = New SQLCONNECTION
Connectionstring;
Break;
}
RETURN CNN;
}
Public Static IDBCommand CreateCommand
String CommandText, DatabaseType Dbtype,
IDBCONNECTION CNN)
{
Idbcommand cmd;
Switch (DBTYPE)
{
Case DatabaseType.Access:
CMD = New OLEDBCommand
(CommandText,
OLEDBCONNECTION) CNN);
Break;
Case DataBaseType.sqlServer:
CMD = New SQLCOMMAND
(CommandText,
(SQLConnection) CNN);
Break;
Case DatabaseType.Oracle:
CMD = New OracleCommand
(CommandText,
(OracleConnection) CNN);
Break;
DEFAULT:
CMD = New SQLCOMMAND
(CommandText,
(SQLConnection) CNN);
Break;
}
Return CMD;
}
Public Static DBDataAdapter CreateAdapter
(IDBCommand CMD, DatabaseType DBTYPE)
{
DBDataAdapter Da;
Switch (DBTYPE)
{
Case DatabaseType.Access:
Da = New OLEDBDataAdapter
(OLEDBCOMMAND) CMD);
Break;
Case DataBaseType.sqlServer:
Da = New SqlDataAdapter
(SQLCommand) CMD);
Break;
Case DatabaseType.Oracle:
Da = New OracleDataAdapter
(OracleCommand) CMD);
Break;
DEFAULT:
Da = New SqlDataAdapter
(SQLCommand) CMD);
Break;
}
Return DA;
}
}
}
The point of this class is to hide, from the upper levels of the application, the details regarding the creation of instances of a particular type coming from a specific data provider, the application can now interact with a data source using the generic behaviors exposed through The base interface.
Let's Look AtHow To Use this class from the rest of the application:
Using system;
Using system.data;
Using system.data.common;
Using system.configuration;
Namespace DAL
{
Public Class Customersdata
{
Public DataTable getCustomers ()
{
String connectionString =
ConfigurationSettings.Appsettings
["Connectionstring"];
DatabaseType DBTYPE =
(DatabaseType) Enum.Parse
(TypeOf (DatabaseType),
ConfigurationSettings.Appsettings
["DatabaseType"]);
IDBCONNECTION CNN =
DataFactory.createConnection
Connectionstring, DBTYPE;
String cmdstring = "Select Customerid"
", CompanyName, ContactName from Customers";
IDBCOMMAND CMD =
DataFactory.createCommand
CMDString, DBTYPE, CNN);
DBDataAdapter Da =
DataFactory.createAdapter (CMD, DBTYPE);
DataTable DT = New DataTable ("Customers"); DA.FILL (DT);
Return DT;
}
Public Customersds GetCustomerRorders (String Customerid)
{
// TBD
Return NULL;
}
Public Customerslist getCustomersbyCountry
(String Countrycode)
{
// TBD
Return NULL;
}
Public bool insertcustomer ()
{
// TBD
Return False;
}
}
}
In the GetCustomers () method of our CustomerData class we can see how, by reading information from a configuration file, it's possible to use the DataFactory class to create an XxxConnection instance with a particular connection string, and write the rest of the code with no Particular Dependency on the underlying data source.
An Example of a Business Layer Class That Interacts with Our Data Layer Could Look Like THIS:
Using system;
Using system.data;
Using dal;
Namespace BLL
{
Public Class Customers
{
Public DataTable getAllCustomers ()
{
Customersdata CD = New Customersdata ();
DataTable dt = cd.getcustomers ();
Return DT;
}
Public Dataset getCustomerRorders ()
{
// TBD
Return NULL;
}
}
}
So, what's wrong with this approach The problem here is there's just one important detail that ties the code to a particular data source:?! The SQL syntax of the command string In fact, writing your app this way, the only thing you can do to make it portable is to adopt a base SQL syntax that can be interpreted by any of your data sources, thereby losing any chance to benefit from the specific functionality of a particular data source. This could be a small problem if your application has to do only very simple and standard operations over the data, and if you do not want to use advanced functionality (XML support, for example) in a particular data source. Usually, though, this approach will result in poor performance, since you can not use The Optimized Features of Each Data Source.writing a Specialized Data Access Layer
Consequently, the use of base interfaces only is not enough to provide an acceptable level of abstraction from the different data sources. In this situation, a good solution could be to "raise the bar" of this abstraction, creating a set of classes (eg Customer, Order, etc.) to encapsulate the use of a specific data provider, and exchanging information with the other levels of the application through data structures not related to a particular data source; a typed DataSet, an object collection, etc.
This layer of specialized classes can be created inside a particular assembly, one for every supported data source, and can be loaded on demand from the application, following instructions in a configuration file. In this way, if you want to add a brand new data source to your application, the only thing you have to do is implement a new set of classes, respecting the "contract" defined in the common set of interfaces.Let's see a real example:? If we wanted to support both Microsoft SQL Server? And Microsoft? Access AS Data Sources, WE WOULD CREATE TW DIFFERENT PROJECTS IN Microsoft? Visual Studio? Net, ONE for Each Data Source.
The One for SQL Server Would Would Look Like this:
Using system;
Using system.data;
Using system.data.common;
Using system.data.sqlclient;
Using system.configuration;
USING COMMON;
Namespace DAL
{
Public Class Customersdata: IDBCUSTOMERS
{
Public DataTable getCustomers ()
{
String connectionString =
ConfigurationSettings.Appsettings
["Connectionstring"];
Using (SqlConnection CNN = New SQLCONNECTION
Connectionstring)
{
String cmdstring = "Select Customerid,"
"CompanyName, ContactName"
"From customer";
SQLCOMMAND CMD =
New SQLCommand (cmdstring, cnn);
SqlDataAdapter Da = New SqlDataAdapter (CMD);
DataTable DT = New DataTable ("Customers");
Da.fill (DT);
Return DT;
}
}
Public DataTable getCustomerRorders (String Customerid)
{
// TBD
Return NULL;
}
Public DataTable getCustomersBycountry
(String Countrycode)
{
// TBD
Return NULL;
}
Public bool insertcustomer ()
{
// TBD
Return False;
}
}
}
THE CODE for DATA RETRIEVAL from Microsoft? Access Would Look Like this:
Using system; using system.data;
Using system.data.common;
Using system.data.oledb;
Using system.configuration;
USING COMMON;
Namespace DAL
{
Public Class Customersdata: IDBCUSTOMERS
{
Public DataTable getCustomers ()
{
String connectionString =
ConfigurationSettings.Appsettings
["Connectionstring"];
Using (OLEDBConnection CNN = New OLEDBCONNECTION
Connectionstring)
{
String cmdstring = "Select Customerid,"
"CompanyName, ContactName"
"From customer";
OLEDBCOMMAND CMD =
New OLEDBCommand (cmdstring, cnn);
OLEDBDataAdapter Da = New
OLEDBDataAdapter (CMD);
DataTable DT = New DataTable ("Customers");
Da.fill (DT);
Return DT;
}
}
Public DataTable getCustomerRorders (String Customerid)
{
// TBD
Return NULL;
}
Public DataTable getCustomersBycountry
(String Countrycode)
{
// TBD
Return NULL;
}
Public bool insertcustomer ()
{
// TBD
Return False;
}
}
}
.................... ..
An Interface of this Type Can Look Like this:
Using system;
Using system.data;
Namespace Common
{
Public Interface IDBCustomers
{
DataTable getcustomers ();
DataTable getCustomerRorders (String Customerid);
DataTable getCustomersBycountry (String countrycode);
BOOL INSERTCUSTOMER ();
}
}
We can create private or shared assemblies to encapsulate these data access classes; in the first case, the assembly loader will search for the one we specify in the configuration file inside the AppBase folder, or in a child directory using the classic probing rules If. WE Have To Share There Classes with Other Applications, We can Put these assembly in the global assembly cache.using data access classes from the Other
These two almost identical CustomersData classes are contained in two different assemblies that the rest of the application will use. Through the following configuration file, we can now specify which assembly to load and which data source to target.
An Example of a Possible Configuration File Would Be Something Like this:
XML Version = "1.0" encoding = "UTF-8"?>
Value = "Server = (local); Database = Northwind; User ID = UserDemo; PWD = USERDEMO "/>
Value = "provider = microsoft.jet.oledb.4.0; Data Source = .. / .. / .. / northwind.mdb "/> -> appsettings> configure> . We have to specify two pieces of information inside this file The first one is a canonical connection string; to have the opportunity to change, for example, the name of the server, or some other parameter for the connection The second is the fully. Qualified name of the assembly, limited ,,,: Let's Look at this portion of code too: Using system; Using system.data; Using system.configuration; Using system.reflection; USING COMMON; Namespace BLL { Public Class Customers { Public DataTable getAllCustomers () { String assemblyname = ConfigurationSettings.Appsettings ["DALASSEMBLY"]; String Typename = "Dal.customersdata"; IDBCUSTOMERS CD = // (idbcustomers) = AskMBLY.LOAD (AssemblyName). CreateInstance (MyType); DataTable dt = cd.getcustomers (); Return DT; } Public Dataset getCustomerRorders () { // TBD Return NULL; } } } As you can see, the assembly loads Using the name read from the configuration file, and create and uses an instance of the customersdata class. Some Possible IMPROVEMENTS To see an illustration of the approach I'm suggesting, see the .NET Pet Shop v3.0 sample application. I'd recommend downloading the sample and taking an in-depth look at it-not just for portability issues, but also for Other Interesting Areas Like Caching and Performance Optimization. An important area on which to focus your attention during the design of the data access layer for a portable application is how to pass the information back and forth with the other layers In my example, I simply use a generic DataTable instance;. In a production scenario you might want to consider a different solution, based on what kind of data you have to represent (do you have to deal with hierarchy, etc.). I do not want to reinvent the wheel here, and my suggestion is to take a look at the Designing Data Tier Components and Passing Data Through Tiers guide that describes very well the different scenarios and the benefits of the recommended solutions.As I said in the introduction, the particular features that your targeted data sources expose-as well as the overall data access-should be considered during the design phase. This should cover such things as stored procedures, XML serialization, and so forth. Regarding Microsoft? SQL Server? 2000, you can find a discussion of how to opt Imally Use these Features in the .NET Data Access Architecture Guide, Which I strongly suggest you read. I always receive a lot of requests about the Data Access Application Block and how it is related to the arguments I'm describing in this article. These .NET classes act as a layer of abstraction over the SQL Server .NET Data Provider, and let You Write More Elegant Code To Interact with The Database Server. This is an idea of what you can do: DataSet DS = SQLHELPER.EXECUtedataSet CONNECTIONSTRING, CommandType.StoredProcedure, "getProductSBycategory", New Sqlparameter ("@ categoryid", categoryid); There's also an extrapolation of this approach available in the open source Data Access Block 3.0 (Abstract Factory Implementation) sample that you can find on GotDotNet. This release implements the same abstract factory pattern, and lets you use different data sources based on the available. Net data providers.conclusion You should now be able to build business logic classes that do not require modification based on the choice of a particular data source, yet allow you to exploit the unique features of the given data source to obtain a more optimized result. This comes with a cost;. we have to implement multiple sets of classes to encapsulate the low-level operations for a particular data source, together with all the programmable objects that we build for every specific data source (stored procedures, functions, etc.) If you want Performance and Portability, HoWever, this is the price you have to pay. based on My Practical Experiences, It's Worth IT!