Cache Your Data: JDBC Vs. ADO.NET

Most applications use some means of storing data whether it be a flat or XML file, an object-based database, or a relational database any one of which can be designed to fit your application needs.
Since the data is usually physically separated from your application you need to somehow fetch and bring it to your application before it can be displayed to users, manipulated and saved back to your data source. In this article, we will compare and contrast the use of Java and .NET to communicate with a remote relational database.

Disconnected vs. Connected

In the connected data model your application opens a connection to the remote database and retrieves the data then leaves the connection open.

Changes to each row are written back to the database for all users to see as the changes are made. The data is streaming back and forth as it is manipulated. In the disconnected model, a connection is opened to the remote database; the data is retrieved and cached in application objects while the connection is closed. You can then edit the data as you see fit, open a connection back to the database, commit the changes resolving any differences between the application version of the data and the version that in the database.

Advantages of Caching Your Data

Since you are not streaming between your application and your database, you are saving a considerable amount of network traffic thereby speeding up the displaying of data to the user. However, since you are caching a substantial amount of data in RAM you may run out of memory especially if you are running a website that gets lots of traffic. If your application retrieves enough records into memory, say by a high number of users simultaneously requesting a page containing 1,000 records from the server, this can cause your website to run out of memory.

When to Use Data Caching

Whenever you are working with a desktop application feeding off data in a remote server on the Internet, network traffic is a major consideration. Let's say you have returned 1,000 records and you are busy editing a grid when your network connection goes off line. (This happens more frequently than you may think.) A connected model will give you an error message every time the server is not available. In the disconnected model, you cache the data, edit it as you see fit, then wait until your network goes back online to save it to the database. Caching the data is also useful in procedures that need the same data multiple times when processing data in complex calculations. For example: on a payroll system, when you need the exact year-to-date income of a person to calculate five different tax types, it is more efficient for your program to have the data cached inside your application instead of making five round trips to the database server.

You can also cache complete web pages for a desired amount of time so data that is just being displayed without editing will be cached on your web server.

When not to Use Data Caching

As a rule of thumb, the cached model is not to be used on a single tiered web application. It is not necessary for a web page to cache data since at the end of an HTTP Request, the page unloads and the objects are being destroyed anyway. Be wary of the cached model when running a desktop application on a machine with limited memory, or do not retrieve too many records at once. To limit the amount of data retrieved, use paging in your stored procedures or your objects whenever you can.

A Side-By-Side Look at Our Classes and Passages in JDBC and ADO.NET

We are going to be using Microsoft's SQL Server for our examples and working with the following packages:

VB.NET 
   using System.Data; 
   using System.Data.SqlClient; 

   JAVA 
   import java.sql.*; 
   import javax.sql.rowset.*;

The idea behind both Sun and Microsoft's data caching is similar even though the implementation differs.
In both technologies we have objects that hold data and helper objects designed to communicate between the data base and the cached in memory representation of the data.

Microsoft's uses the DataSet class to represent the data in memory. A dataset can hold multiple tables of data represented by DataTable classes. The DataAdapter object is used to communicate between dataset and the database. The adapter is in charge of selecting, updating, inserting, and deleting data according to the differences between the cached data and the data base. The DataRelation object represents the relations between the different tables in the Dataset. For example: if we are selecting all the categories in a certain data base and all the products belonging to the categories, we will probably want to match the category ID in our products table to our category ID in our category table.

Sun's implementation is a little different. Instead of a master Dataset class holding everything, it uses CachedRowSet classes that can be populated from a ResultSet class to hold tables and the JoinRowSet class designed specifically for connecting related tables. Each table uses a SyncProvider to communicate to the data base by using a RowSetReader and a RowSetWriter. If we were to put our main classes side by side it would probably look something like this:

ADO.NET	JAVA
DataSet	JoinRowSet
DataTable (DataRow, DataColumn)	CachedRowSet
DataAdapter( SelectCommand, InsertCommand, DeleteCommand, UpdateCommand)	SyncProvider(RowSetReader on the ResultSet and RowSetWriter)

Selecting Data

To retrieve data with ADO.NET, open a new SqlConnection, define a new SqlDataAdapter, add a SqlCommand to the adapter, define a new DataSet, and fill it out using our adapter.

VB.NET 
   Dim ConnectionStr As string = "database=InvertedSoftware;server= InvertedSoftware " 
   Dim objConnection As SqlConnection = new SqlConnection(ConnectionStr) 
   Dim categoryDataAdapter As SqlDataAdapter = new SqlDataAdapter() 
   categoryDataAdapter.TableMappings.Add("Table", "Category") 
   objConnection.Open() 
   Dim objCommand As SqlCommand = new SqlCommand("SELECT * FROM tbl_category", objConnection) 
   objCommand.CommandType = CommandType.Text 
   categoryDataAdapter.SelectCommand = objCommand 
   Dim objDataSet As DataSet = New DataSet("Inventory") 
   categoryDataAdapter.Fill(objDataSet)

As we mentioned before, a data set can hold multiple tables, so let's go ahead and add a second table into our DataSet set using the same procedure.

VB.NET 
   Dim productDataAdapter As SqlDataAdapter = new SqlDataAdapter() 
   productDataAdapter.TableMappings.Add("Table", "Product") 
   Dim objCommand2 As SqlCommand = new SqlCommand("SELECT * FROM tbl_product", objConnection) 
   productDataAdapter.SelectCommand = objCommand2 
   productDataAdapter.Fill(objDataSet) 
   objConnection.Close()

Creating a relationship between two tables looks like this:

VB.NET 
   Dim dr As DataRelation 
   Dim dc1 As DataColumn 
   Dim dc2 As DataColumn 
   dc1 = ds.Tables("Category").Columns("category_id") 
   dc2 = ds.Tables("product").Columns("category_id") 
   dr = new System.Data.DataRelation("productsForCategory", dc1, dc2) 
   objDataSet.Relations.Add(dr)

Now using Sun's CachedRowSet, let's populate CachedRowSet with a standard ResultSet using a SQL Statement. We can cache another statement and now add them both to our JoinRowSet using our Match Column.

JAVA 
   String strUrl = "jdbc:microsoft:sqlserver://InvertedSoftware.com:1433;DatabaseName=InvertedSoftware"; 
   String strUserId = "USER"; 
   tring strPassword = "PASSWORD"; 
   String className = " com.microsoft.jdbc.sqlserver.SQLServerDriver "; 
   try { 
   Class.forName(className); 
   } catch(java.lang.ClassNotFoundException e) { 
   System.err.println(e.getMessage()); 
   } 
   try { 

     Connection con = DriverManager.getConnection(strUrll, strUserId, strPassWord); 
     Statement stmt = con.createStatement(); 
     ResultSet rs1 = stmt.executeQuery("SELECT * FROM tbl_category"); 
     ResultSet rs2 = stmt.executeQuery("SELECT * FROM tbl_product"); 
     CachedRowSet categoryCachedRowSet = new CachedRowSetImpl(); 
     categoryCachedRowSet.populate(rs1); 
     CachedRowSet productCachedRowSet = new CachedRowSetImpl(); 
     productCachedRowSet.populate(rs2); 
     con.close(); 
     JoinRowSet jrs = new JoinRowSetImpl(); 
     jrs.addRowSet(categoryCachedRowSet, "category_id"); 
     jrs.addRowSet(productCachedRowSet, "category_id"); 
     jrs.close(); 
   } catch(SQLException sqle) { 
   }

Updating Data

Using ADO.NET we can create a data adapter and add an update command to it. (A different way would be to add a select command and use a SqlCommandBuilder to generate our corresponding update and delete commands, but this is an entirely different subject and a topic for another article.) Now let's take our already existing data set and use the adapter to update the data.

VB.NET 
   categoryDataAdapter.UpdateCommand = New SqlCommand("UPDATE tbl_category SET category_name = @CategoryName WHERE category_id = @CategoryID", objConnection) 
   categoryDataAdapter.UpdateCommand.Parameters.Add("@CategoryName", SqlDbType.NVarChar, 15, "CategoryName") 
   Dim IDParm As SqlParameter = categoryDataAdapter.UpdateCommand.Parameters.Add("@CategoryID", SqlDbType.Int) 
   IDParm.SourceColumn = "category_id" 
   IDParm.SourceVersion = DataRowVersion.Original 
   Dim objDataRow As DataRow = objDataSet.Tables("category").Rows(0) 
   objDataRow("category_name") = "New Category" 
   categoryDataAdapter.Update(objDataSet)

Any differences between our current database state and our current data set must be resolved. Imagine a situation where two users connect and cache the data. One user changes a certain column and updates the database while the other user is still looking at the original data representation generated before the database was updated. Our second user now changes the data and tries to commit the changes. Which version should we use? Well, resolving differences is fairly easy if we can decide on certain rules. Each row in the current data base receives a row state status to determine if it is unchanged. We can then decide to update only original values to prevent overwriting any changes by other users or we can update all of the data rows so that the last saved version of the data is the final version.

Using JDBC: Since there is no one major object and no one major data adapter we have to update one table at a time. Using one of our CachedRowSet we can move the curser into the row we need to update and simply set the value in the column.

JAVA 
   categoryCachedRowSet.absolute(2); 
   categoryCachedRowSet.updateString(2, "Apples"); 
   categoryCachedRowSet.updateRow();

Deleting Data

In ADO.NET to delete a row you have to start with a dataset class and refer to the rows collection inside the data table you are working with. Use the delete method on the row you wish to remove.

In JDBC simply call delete row on your CachedRowSet

VB.NET 
   objDataSet.Tables("Product").Rows(3).Delete() 

   JAVA 
   productCachedRowSet.absolute(2); 
   productCachedRowSet.deleteRow();

Inserting data

With ADO.NET you again need to get the specific table you are working with and get a reference for a new row. You can then add values to the columns and call the add method on the rows collection.

In JDBC call insert row on your cached row set, update the values on the columns and call the insert row method.

VB.NET 
   DataRow objDataRow = objDataSet.Tables("Product").NewRow(); 
   objDataRow("product_name") = "new product" 
   objDataSet.Tables("product").Rows.Add(objDataRow); 

   JAVA 
   productCachedRowSet.moveToInsertRow(); 
   crs.updateString("product_name", "new product"); 
   productCachedRowSet.insertRow(); 
   productCachedRowSet.moveToCurrentRow();

Committing changes back to the Database

In ADO.NET call the accept changes method.

JDBC uses the exact same method.

VB.NET 
   objDataSet.AcceptChanges() 

   JAVA 
   productCachedRowSet.acceptChanges();

In JDBC calling the accept changes method will commit the data to the database. Any differences between the cached version of the data and the database would raise a SyncProviderException. Evaluating the SyncResolver object will help us determine all of our conflicts and solve them according to our business rules.

JAVA 
   try { 
     productCachedRowSet.acceptChanges(); 
   } catch (SyncProviderException spe) { 
     SyncResolver resolver = spe.getSyncResolver(); 
     Object crsValue; // value in productCachedRowSet 
     Object resolverValue; // value in the SyncResolver object 
     Object resolvedValue; // value to be persisted 
     while (resolver.nextConflict()) { 
       if (resolver.getStatus() == SyncResolver.UPDATE_ROW_CONFLICT) { 
         int row = resolver.getRow(); 
         productCachedRowSet.absolute(row); 
         int colCount = productCachedRowSet.getMetaData().getColumnCount(); 
         for (int j = 1; j <= colCount; j++) { 
           if (resolver.getConflictValue(j) != null) { 
             crsValue = productCachedRowSet.getObject(j); 
             resolverValue = resolver.getConflictValue(j); 
             resolvedValue = crsValue; 
             resolver.setResolvedValue(j, resolvedValue); 
           } 
       } 
     } 
     } 
   }

Unique Features for JDBC

CachedRowSet provides us with two very important features that help us increase performance in our application as well as communicate better between the different data processing operations. The first is page size. Bringing in a large amount of data requires us to have a large amount of memory available and takes a substantial amount of network traffic. On a CachedRowSet we can set the page size for the amount of data we need for each round trip and then just call the nextPage() method. The second unique feature is listeners. We can define a cursor or a row listener to tell us exactly when we have moved rows or changed values. This can be very effective to notify us in real time about data changes, especially if it is being done by another user.

Advantage of ADO.NET over JDBC

ADO.NET offers better control over the data since it is encapsulated in one object and we do not need to keep track of all of our tables.

Advantages of JDBC over .NET.

JDBC allows more control over an individual table and having an adapter embedded inside the table helps us manipulate the code faster without invoking any outside objects. JDBC also supports data paging, a very important part of keeping your available memory under control. ADO.NET has to resort to paging using stored procedures.

Conclusion

In this article we've seen that catching your data is easy and can be a very powerful tool in helping you reduce bandwidth and speed up complex data manipulation. Both Sun and Microsoft offer data catching solutions that are very similar and work on the same principles. The only major difference between the two is the fact that in Sun's way, we have one adapter per table and an object for the relationships. In Microsoft's way everything is held under one roof while the adapters are not apart of the tables, but stand alone classes. Which of those solutions would work best with your application is totally up to your imagination.

posted on 2006-04-12 11:03 泾溪阅读(615) 评论(1) 编辑收藏举报