NHibernate (Part 1)

Chances are, as a .NET developer, you are already intimately familiar with the ADO.NET dataset. If you are an “enterprise” developer, those odds approach 100%. Database interaction via the FCL centers around retrieving a static snapshot of some portion of the database and manipulating it via the dataset, which mimics the RDBMS in almost every way: the data is tabular, relationships between data are modeled by foreign keys, data is largely untyped. Since the dataset provides a static, in-memory cache of data, it makes the manipulation of that data much more efficient than requiring constant round trips to the live data store.

The problem with the dataset is that it doesn’t fit particularly well with modern object-oriented application design. Whereas datasets have tabular data, we tend to code using objects. Datasets have foreign key relationships; our domain objects use references. Where we want to use only methods, datasets require a certain amount of SQL code. Of course, some of these problems can be solved through the use of “strongly typed” datasets, but the fact remains that you are changing modes as you move from your domain model to your data access and back again. Depending on how you choose to layer that data access code into the application, changes to the data store can have enormous ripple-effects on your codebase.

Last month, Bruce Tate and I released a new book called “Better, Faster, Lighter Java”. Don’t let that “j” word in the title throw you too much; the principles we espouse in the book are equally applicable to any modern development platform. One of those principles is transparency; the key to any enterprise application is the domain model. These are the classes that model, and solve, your customers’ business problems. If your customer is a bank, your domain model is filled with Accounts, Deposits and Loans. If your customer is a travel agent, your domain is filled with Tours and Hotels and Airlines. It is in these classes that your customers’ problems are addressed; everything else is just a service to support the domain: things like data storage, message transport, transactional control, etc. As much as possible, you want those services to be transparent to your domain model. Transparency means that your model benefits from those services without being modified by them. It shouldn’t require special code in your domain to utilize those services, and it shouldn’t require specific containers or interfaces to implement. That means your domain architecture can be 100% focused on the business problem at hand, not technical problems outside the business. A side effect of achieving transparency is that you can replace services with alternate providers or add new services without changing your domain.

Coding directly against the dataset breaks the transparency. It is obvious inside of your code what storage mechanism you use, and it affects the way your code is written. Another approach to storage is the use of an object-relational mapping tool. Microsoft is in the process of building such a framework, called ObjectSpaces, but recently announced it would be delayed until as far as 2006. NHibernate, an open source solution, is available today and solves the same set of problems. With NHibernate, your code and your data schema remain decoupled, and the only visible indicator of the existence of the O/R layer is the set of mapping files. With NHibernate, you’ll see that these consist of configuration settings for the O/R framework itself (connecting to a data source, identifying the data language, etc.) and mappings from your domain objects to the data tables.

There are a variety of other O/R frameworks available today, some commercial and some open source. For the purposes of this article, though, we’ll focus on NHibernate. It has a lot of momentum carrying over from the Java side of the world, and is extremely easy to get started with. However, if the general techniques we see in this article appeal to you, I suggest you take a look at the other options available to see if others are a better fit for your needs.

To get started, you’ll need to download the framework at http://nhibernate.sourceforge.net. Reference the assembly in your project. The next step will be to add the appropriate configuration settings to your application’s config file to tell NHibernate where and what your data store is. For the purposes of this article, we’ll use Microsoft SQL Server, though you could just as easily target Oracle, MySQL, or any number of other vendors. To map to a SQL Server instance, here is what your configuration settings might look like:

<configuration>
  <configSections>
    <section
      name="nhibernate"
      type="System.Configuration.NameValueSectionHandler, System, Version=1.0.5000.0, Culture=neutral, PublicKeyToken=b77a5c561934e089" />
  </configSections>
  <nhibernate>
    <add
      key="hibernate.connection.provider"
      value="NHibernate.Connection.DriverConnectionProvider" />
    <add
      key="hibernate.dialect"
      value="NHibernate.Dialect.MsSql2000Dialect" />
    <add
      key="hibernate.connection.driver_class"
      value="NHibernate.Driver.SqlClientDriver" />
    <add
      key="hibernate.connection.connection_string"
      value="Server=localhost;initial catalog=nhibernate;User ID=someuser;Password=somepwd;Min Pool Size=2" />
  </nhibernate>
</configuration>
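
Targeting a different database mostly comes down to swapping the dialect, the driver and the connection string. As a rough sketch, a MySQL variant of the <nhibernate> section might look like the following. The dialect and driver class names should be checked against the NHibernate release you download, and the server, database and credentials are placeholders.

```xml
<nhibernate>
  <add
    key="hibernate.connection.provider"
    value="NHibernate.Connection.DriverConnectionProvider" />
  <!-- assumed class names for MySQL; verify them against your NHibernate version -->
  <add
    key="hibernate.dialect"
    value="NHibernate.Dialect.MySQLDialect" />
  <add
    key="hibernate.connection.driver_class"
    value="NHibernate.Driver.MySqlDataDriver" />
  <!-- placeholder connection details -->
  <add
    key="hibernate.connection.connection_string"
    value="Server=localhost;Database=nhibernate;User ID=someuser;Password=somepwd" />
</nhibernate>
```

Nothing in the domain classes or the mapping files needs to know which of the two sections is in use.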

Once you have configured the framework to recognize your data store, the next step is to create your domain model and database representation. It is entirely plausible to do those steps in either order – if your application is depending on a highly efficient storage schema, perhaps starting there is appropriate. However, if the database is just a place to store object state, then it probably makes more sense to start in the domain classes. There is a third option – start with the mapping files that will describe the relationships between your classes and tables. NHibernate provides a tool today that can auto-generate DDL from your mapping files. A recent addition to the project is a NAnt task that will auto-generate C# stubs from the mapping files. Taken together, you can adequately construct a base implementation by just coding up the mappings and letting NHibernate’s tools take care of the rest.

The Domain Model

We’ll start with the domain model. For this article, we’ll tackle just a narrow segment of a larger enterprise application. Specifically, we’ll look at part of a university registration system. For our purposes, we’ll examine the following classes:

  1. Department: describes one segment of the University’s curriculum.
  2. UniversityClass: one class offered at the school (the funky name is to prevent using a reserved word as the name of the class).
  3. Professor: person who teaches a class.
  4. Student: person who takes a class.

Each of these classes has its own data fields, but the relationships between them are where the fun starts.

A department can contain zero, one or more professors. A professor can be associated with more than one department. A department contains one or more classes. A class belongs to only one department. A class is taught by a single professor, but a professor can teach multiple classes. Many students can take a single class, and students (should) take more than one class. There is no direct relationship between students and departments or professors (any relationship between student and professor is probably illegal, anyway).

Our domain objects are fairly straightforward. Here’s the list of classes and their data fields:

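using System;
using System.Collections;

public class Department
{
    private int id;
    private string name;
    private IDictionary classes;
    private IDictionary professors;
}

public class Professor : Person
{
    private int id;
    private string firstname;
    private string lastname;
    private string identifier;   // string identifier, e.g. the professor's title
    private IDictionary departments;
    private IDictionary classes;

    // Person members
    public int Id { get { return id; } set { id = value; } }
    public string FirstName { get { return firstname; } set { firstname = value; } }
    public string LastName { get { return lastname; } set { lastname = value; } }
}

public class UniversityClass
{
    private int id;
    private string name;
    private string number;
    private string syllabus;
    private DateTime startDate;
    private Professor professor;
    private IDictionary students;
    private Department department;
}

public class Student : Person
{
    private int id;
    private string firstname;
    private string lastname;
    private string ssn;
    private IDictionary classes;

    // Person members
    public int Id { get { return id; } set { id = value; } }
    public string FirstName { get { return firstname; } set { firstname = value; } }
    public string LastName { get { return lastname; } set { lastname = value; } }
}

// An interface declares properties, not fields, and its members are implicitly public.
public interface Person
{
    int Id { get; set; }
    string FirstName { get; set; }
    string LastName { get; set; }
}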

Of course, you’ll probably want to provide public properties to wrap those fields, as it is good coding practice. Though NHibernate can work directly with private and protected fields, it just generally makes more sense to use properties for field access.
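
As a small sketch of what wrapping those fields might look like, here is Department with one public property per field. The property names (Id, Name, Classes, Professors) follow the names used later in the mapping files; creating the collections as Hashtable instances is my own assumption, made so that a brand-new Department is usable before NHibernate ever touches it.

```csharp
using System;
using System.Collections;

namespace nhRegistration
{
    public class Department
    {
        private int id;
        private string name;
        // Assumption: start with empty collections so a new instance is usable.
        private IDictionary classes = new Hashtable();
        private IDictionary professors = new Hashtable();

        // Each property simply wraps the private field of the same name.
        public int Id
        {
            get { return id; }
            set { id = value; }
        }

        public string Name
        {
            get { return name; }
            set { name = value; }
        }

        // Typed as the System.Collections interface, not a concrete class.
        public IDictionary Classes
        {
            get { return classes; }
            set { classes = value; }
        }

        public IDictionary Professors
        {
            get { return professors; }
            set { professors = value; }
        }
    }
}
```

The other domain classes would follow the same pattern.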

Secondly, remember that our Professor and Student classes will share a table. In the domain, they share several data fields as well (FirstName, LastName and ID). In our domain model, we represent this relationship through some form of inheritance. In this case, they both implement the Person interface, as shown.

Finally, you will see that our collection fields are all defined using IDictionary. When defining your collections in your domain, stick to the interfaces provided for you in System.Collections. It’s generally a good practice, and it gives NHibernate the maximum flexibility in creating the collections for you.

The Database

Now let’s take a look at the database. The first table is Department, which simply stores department IDs and names. Next, we would need to make a table for Student and another for Professor. However, if we look carefully, Professors and Students are almost identical (which stands to reason, since they are both human beings). They have first and last names, and some kind of string identifier (title for professors, ssn for students). Instead of having two tables, then, we’ll create one table called People. It will have a unique ID per person, their first and last names, an identifier field, and a field called PersonType which we will use to distinguish between students and professors. Finally, we need a table for our classes (UniversityClass). It has a unique ID, all the descriptive information about the class, and two foreign keys: PersonID (which maps to the Professor who teaches the class) and DeptID (which matches the department the class belongs to).

The other two tables are join tables, modeling the many-to-many relationships between Students and Classes, and between Departments and Professors. These join tables simply match IDs from the appropriate base tables, forming a link between them.

The Mapping Files

The next step is to provide the mapping files that fill our domain model from the data tables. Each class requires its own mapping file, which can be stored wherever you like. I keep mine mixed in with the class files themselves so that they are easy to find when I make changes to the model. Regardless, you’ll need one mapping file per persistent class in your application.

The mapping files connect your classes and their persistent properties to the database. Your classes can have properties that aren’t persistent; this is one of the beautiful things about a transparent data layer. If your domain calls for runtime-calculated properties, your classes can have them mixed in with the persistent ones. Your mapping files can just ignore the non-persistent ones.
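
To make that concrete, here is a trimmed-down sketch of a class that mixes persistent and calculated properties. EnrollmentCount is a hypothetical property invented for this illustration; it is derived from the Students collection at runtime and would simply never be mentioned in the mapping file.

```csharp
using System;
using System.Collections;

namespace nhRegistration
{
    // Trimmed-down version of UniversityClass, for illustration only.
    public class UniversityClass
    {
        private string name;
        private IDictionary students = new Hashtable();

        // Persistent: this property would appear in the mapping file.
        public string Name
        {
            get { return name; }
            set { name = value; }
        }

        // Persistent: this collection would appear in the mapping file.
        public IDictionary Students
        {
            get { return students; }
            set { students = value; }
        }

        // Not persistent: calculated at runtime (hypothetical property),
        // so the mapping file has nothing to say about it.
        public int EnrollmentCount
        {
            get { return students == null ? 0 : students.Count; }
        }
    }
}
```

Because the property has no setter and no column, there is nothing for the persistence layer to read or write.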

Let’s start building the mapping file for Department. All mapping files are genuine XML files, so they start with the standard declaration:

<?xml version="1.0" encoding="utf-8" ?> 

Next, we’ll need to declare that this file is a mapping file for NHibernate. The root element of an NHibernate mapping file looks like this:

<hibernate-mapping xmlns="urn:nhibernate-mapping-2.0">
</hibernate-mapping>

That’s the boilerplate part. Now, let’s start constructing the actual mappings. First, we need to tell NHibernate which class type we are mapping, using the <class> element. When we pass the class type to NHibernate, we must give it the fully qualified name (including all namespaces) and the name of the assembly containing the type. For our Department class, the fully qualified name is “nhRegistration.Department” and the assembly is “nhRegistration”. We also need to feed the table name that holds our class data; in this case, “department”.

<class name="nhRegistration.Department, nhRegistration" table="department">
</class>

We’re doing well. All we have left to do is map the persistent data fields to our schema.
All property mappings share some common features: the name of the field on the class, the column in the table that contains that field’s data, and the type of the field being persisted. Regardless of anything else you will see for different kinds of fields, all three of those attributes will be present.

Let’s look at a standard property: the Department’s Name field. The field itself is declared as a string, and the data table defines it as a char field of length 50. The property declaration is very straightforward:

<property name="Name" column="deptname" type="String(50)"/>

Any standard properties of a class will look largely the same. Note that the length declaration at the end of the type is no longer needed and is being deprecated as NHibernate will reflectively handle the configuration now. In fact, as of the latest release, you can even drop the type attribute altogether, and NHibernate will just examine the type of the mapped property.
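
Assuming a release that infers types as described above, the same property could be sketched without the type attribute at all:

```xml
<!-- type omitted: NHibernate inspects the Name property on the mapped class -->
<property name="Name" column="deptname"/>
```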

A more interesting case is the Department’s “Id” field. Every class has to have a field on it containing data that uniquely identifies the instance on the persistence table. For our model, every class has this “Id” field which serves that purpose, though we could have used any field with a unique data value. This special field is known as the Id Property. When you map it, you have to provide the standard set of attributes, and two special ones: the generator and the unsaved-value.

The generator of the Id field lets NHibernate know how these unique identifiers will be created: by the programmer, by NHibernate, or by the underlying persistence store. Different applications will have different rules about identifiers, and different databases offer unique services for managing those values, so you have to pick carefully based on your requirements and infrastructure. Common values for generator are:

  • identity: the identity column type in Microsoft SQL Server, MySQL, and others
  • sequence: sequences in DB2, Oracle and others
  • hilo: uses a hi/lo algorithm to generate identity values
  • native: chooses whichever of the first three values is supported by the underlying database.

We’ll choose native for ours. The unsaved-value attribute specifies a default value for the Id Property when an object is created and not yet persisted. In this case, since we are letting the value of the Id property be created and managed by the database, having a default is useful. This is especially true since the default value for this property if not provided will be null, which won’t play nicely with the Int32 type of our property. The declaration looks like this:

<id name="Id" column="deptid" type="Int32" unsaved-value="0">
<generator class="native" />
</id>
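
If you would rather not leave the choice to NHibernate, you can name a strategy explicitly. As a sketch, here is the same Id mapping pinned to identity columns, which is a reasonable assumption for SQL Server but is not what the rest of this article uses:

```xml
<id name="Id" column="deptid" type="Int32" unsaved-value="0">
  <!-- explicit strategy instead of letting NHibernate pick one -->
  <generator class="identity" />
</id>
```

The trade-off is portability: a mapping pinned this way has to be revisited if you move to a database that relies on sequences.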

Taken all together, our file now looks like this:

<?xml version="1.0" encoding="utf-8" ?>
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.0">
  <class name="nhRegistration.Department, nhRegistration" table="department">
    <id name="Id" column="deptid" type="Int32" unsaved-value="0">
      <generator class="native" />
    </id>
    <property name="Name" column="deptname" type="String(50)"/>
  </class>
</hibernate-mapping>

If we just left it at that, and loaded our database using this mapping file, we’d get a list of Departments whose Name and Id fields were populated, but whose Classes and Professors fields were null, since they weren’t mapped in the mapping file. If a field isn’t mapped, it is ignored by NHibernate.

Next up come the collection properties, which have special requirements all their own. Let’s look first at Classes. This is a collection of instances of UniversityClass. Remember our requirements for the model: a Department can contain multiple UniversityClasses, but a UniversityClass can belong to only one department. This is a one-to-many relationship. To model it, we need to use the <set> (or <bag>) element to show that we are mapping a collection, not a single instance, field.

<set name="Classes">
  <key column="deptid"/>
  <one-to-many class="nhRegistration.UniversityClass, nhRegistration"/>
</set>

The “name” attribute of <set> is the field name that will hold the collection, the key column is the name of the column in the collected class’s table that maps to the parent class’s Id Property, and the <one-to-many> element determines the type of the collected class. Our universityclass table has a field called “deptid” which maps back to the “Id” field in our department table.

Finally, we have to map the Professors field. Remembering our model, Departments can have many Professors, and Professors can belong to more than one Department. This is a many-to-many relationship, and is defined in the database using a join table (departmentprofessor) containing the id fields from the two related tables. In order to correctly map this relationship, we have to declare the collection mapping much like in the one-to-many example above, but also include the name of the join table, and the name of the column that contains the identity field for the related class.

<set name="Professors" table="departmentprofessor">
  <key column="deptid"/>
  <many-to-many class="nhRegistration.Person, nhRegistration" column="personid"/>
</set>

That’s everything that defines the Department class. The full mapping file reads:

<?xml version="1.0" encoding="utf-8" ?>
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.0">
  <class name="nhRegistration.Department, nhRegistration" table="department">
    <id name="Id" column="deptid" type="Int32" unsaved-value="0">
      <generator class="native" />
    </id>
    <property name="Name" column="deptname" type="String(50)"/>
    <set name="Classes" cascade="all">
      <key column="deptid"/>
      <one-to-many class="nhRegistration.UniversityClass, nhRegistration"/>
    </set>
    <set name="Professors" table="departmentprofessor">
      <key column="deptid"/>
      <many-to-many class="nhRegistration.Person, nhRegistration" column="personid"/>
    </set>
  </class>
</hibernate-mapping>

However, if we try to load the project using just this mapping file, it will fail. NHibernate will throw an exception because of the one-to-many and many-to-many relationships. When you map a relationship like that, NHibernate expects that both sides of the relationships are persistent types; as of right now, we haven’t mapped the UniversityClass or Professor types that exist on the other end of those relationships. To successfully load a Department, we have to map those types as well.

Here’s the mapping file for UniversityClass.

<?xml version="1.0" encoding="utf-8" ?>
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.0">
  <class name="nhRegistration.UniversityClass, nhRegistration" table="universityclass">
    <!-- "classid" is an assumed name for the table's own key column;
         it must differ from the deptid foreign key mapped below -->
    <id name="Id" column="classid" type="Int32">
      <generator class="assigned" />
    </id>
    <property name="Name" column="classname" type="String(50)"/>
    <many-to-one name="Dept"
                 class="nhRegistration.Department, nhRegistration"
                 column="deptid"/>
  </class>
</hibernate-mapping>

The only new element in this file is the <many-to-one> element that represents the other side of the <one-to-many> element in Department.hbm.xml.

Finally, there’s the Professor class. This one has all kinds of fun. This class is special because Professor is an implementation of the Person interface, and shares a table with Student. The mapping file isn’t targeted at Professor, therefore, but at Person (Person.hbm.xml). The file maps all the implementations of Person. In order to distinguish between Professors and Students, you have to tell NHibernate which field (and values of that field) identify the type of entity in that row. You do this with the <discriminator> element.

If you look back at the domain model, the Person interface identifies three data fields: Id, FirstName and LastName. Both Professor and Student expose these fields, but then they begin to differ. Professor has a field called “Identifier” while Student has “SSN”. Professors have collections of Departments and UniversityClasses, while Students only have a collection of Classes. You can map the common fields in the Person class definition, but the fields that are specific to the subtypes must be mapped inside special <subclass> elements.

(In order to save space, I’ll elide the different collections from this mapping file to highlight the polymorphism-related features. The collections look just like the others given in the examples above.)

<?xml version="1.0" encoding="utf-8" ?>
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.0">
  <class name="nhRegistration.Person, nhRegistration" table="people">
    <id name="Id" column="personid" type="Int32">
      <generator class="assigned" />
    </id>
    <discriminator column="persontype" type="String"/>
    <property name="FirstName" column="firstname" type="String(50)"/>
    <property name="LastName" column="lastname" type="String(50)"/>
    <subclass name="nhRegistration.Professor, nhRegistration"
              discriminator-value="professor">
      <property name="Identifier" column="identifier" type="String"/>
    </subclass>
    <subclass name="nhRegistration.Student, nhRegistration"
              discriminator-value="student">
      <property name="SSN" column="identifier" type="String"/>
    </subclass>
  </class>
</hibernate-mapping>
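
For a sense of what one of the elided collections might look like, here is a sketch of the Student subclass with its Classes collection mapped as a many-to-many. The article never names the Student/Class join table or its columns, so "studentclass" and "classid" below are purely assumed names; substitute whatever your schema actually uses.

```xml
<subclass name="nhRegistration.Student, nhRegistration"
          discriminator-value="student">
  <property name="SSN" column="identifier" type="String"/>
  <!-- "studentclass" and "classid" are assumed names for the join table
       and its class key column -->
  <set name="Classes" table="studentclass">
    <key column="personid"/>
    <many-to-many class="nhRegistration.UniversityClass, nhRegistration"
                  column="classid"/>
  </set>
</subclass>
```

The shape is the same as the Professors collection on Department: a join table, a key column pointing back at the owner, and a column identifying the related class.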

That is finally everything. This set of mapping files can now be used to create, save and load instances in your domain model.

NHibernate Configuration, SessionFactories and Sessions

In order to manipulate your persistent objects, you will have to complete the configuration process. This means telling NHibernate which mapping files to load. You can do this by pointing NHibernate to the physical files, but this means keeping track of the paths to the files no matter where or how your application is installed. Instead, you can let NHibernate find the files as embedded resources in your assembly. To do that, you must first set the “Build Action” of each of the mapping files to “Embedded Resource”, meaning the compiler will add them to the assembly image. Then, you can simply tell NHibernate which classes in your application are the persistent classes, and NHibernate will use the class names to find the embedded mapping files that match them.

Configuration config = new Configuration();
config.AddClass(typeof(nhRegistration.Department));
config.AddClass(typeof(nhRegistration.Person));
config.AddClass(typeof(nhRegistration.UniversityClass));

You actually manage your persistent classes through a Session object. NHibernate provides a SessionFactory which builds the individual Sessions. A Session models a sequence of related persistence methods. As in any database management scenario, performance is tied almost directly to the number of roundtrips from your application to the physical database. If every individual persistent action required a roundtrip, you would be drastically decreasing the overall speed of your application. Conversely, if you simply open a Session and use it for the entire length of your application’s lifetime, you are holding open a valuable and rare resource: the physical database connection. (Strictly speaking, it is possible to disconnect an open session from a physical database connection and still retain state, but for now we’ll operate under the more simplistic Session==Connection assumption.) Managing your Sessions therefore is one of the most important parts of a good NHibernate application. The best strategy for managing your Sessions, and managing persistence logic in general, is to create a generic façade behind which you can plug whatever persistence logic you need. This allows you to write your application code against the generic interface and swap implementations out behind the scenes (say, if you switched from NHibernate to iBATIS.NET or Microsoft’s ObjectSpaces). The manager object also allows you to finely tune your Session management strategy.

Here’s a partial list of the DBMgr interface for the nhRegistration application:

public interface DBMgr
{
    IList getDepartments();
    Department getDepartment(int id);
    void saveDepartment(Department dept);
    IList getClasses();
    //etc...
}

The NHibernate implementation of this interface needs to configure the framework and manage our SessionFactory and Sessions.

public class RegMgr : DBMgr
{
    Configuration config;
    ISessionFactory factory;

    public RegMgr()
    {
        // register the persistent classes; their mapping files are found as embedded resources
        config = new Configuration();
        config.AddClass(typeof(nhRegistration.Department));
        config.AddClass(typeof(nhRegistration.Person));
        config.AddClass(typeof(nhRegistration.UniversityClass));
        factory = config.BuildSessionFactory();
    }

    // implementations of the DBMgr methods go here
}

From then on, the other persistent methods (that implement those found on DBMgr) can make use of the stored SessionFactory to open and use individual Sessions.

Creating, Loading and Saving Objects

Since the purpose of a Session is to provide a mechanism for managing the performance of your application, and the nature of that management is preventing unnecessary round trips to the database, it stands to reason that operations on a Session do not always result in direct operations against the underlying database. In fact, it is the point of the Session to cache operations until they can be batched to the database, thus saving round trips. All of which means that when you call a persistent method on a Session, it may or may not result in immediate changes to the database.

Usually, changes are only written to the database when the Session’s Flush() method is called. You can do this directly, of course, but it is more common to find that Flush() is being called on your behalf during some other operation, namely a transactional commit.

Let’s be frank: if you are building a database application, but aren’t concerned with transactions, you should probably take a few minutes to re-evaluate your design. It is vital, even when you are coding your SQL queries by hand, to make sure that atomic units of change to the database get executed transactionally. This becomes far more important when using an O/R mapping layer like NHibernate, because a simple statement like session.Save(dept) can have cascading effects on the database (changes to professors, or classes, or join tables, etc.). If any one of those cascading changes fails, and you aren’t using transactions to manage your writes, then you will be leaving your database in an unusable state. So, when using NHibernate, don’t use Sessions without Transactions.

The common usage pattern is this:

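ISession session = null;
ITransaction tx = null;
try
{
    session = factory.OpenSession();
    tx = session.BeginTransaction();
    // do database work
    tx.Commit();
}
catch (Exception ex)
{
    // roll back only if the transaction was actually started
    if (tx != null) tx.Rollback();
    // further exception handling
}
finally
{
    // close the session on both the success and the failure path
    if (session != null) session.Close();
}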

The authors of the Hibernate documentation strenuously suggest never treating an exception as recoverable, hence rolling back the transaction as the first operation in your exception handling code. This is largely because NHibernate manages cascading updates depending on the relationships between your objects, and failures might represent failures anywhere along that chain. The best idea is to just roll back everything, fix the problem in the domain model, and try again.

Let’s take a look at the most common types of activities for our persistent objects. We’ll examine the implementation of the first three methods of the DBMgr interface shown above. First up, getDepartments(). This method returns a list of all the departments in the university.

public IList getDepartments()
{
    IList depts = null;
    ISession session = null;
    ITransaction tx = null;
    try
    {
        session = factory.OpenSession();
        tx = session.BeginTransaction();
        // no restrictions on the criteria, so every Department is returned
        depts = session.CreateCriteria(typeof(Department)).List();
        tx.Commit();
    }
    catch (Exception ex)
    {
        if (tx != null) tx.Rollback();
        // handle exception.
    }
    finally
    {
        if (session != null) session.Close();
    }
    return depts;
}

The Session object exposes the CreateCriteria() method, which takes a persistent class Type as its only argument. This tells NHibernate to gear up to interact with the tables mapped to this object; calling List() on the resulting criteria returns all the instances that match it. With no restrictions added, we’ll get back a list containing every instance of Department, fully populated from the database (including all the other persistent objects related to Department). This straightforward method is used to iterate through all instances of a class in your database.

If you want to retrieve a specific instance from the database, you have to know the specific value of the identifier field for the target class that you are looking for.

public Department getDepartment(int i)
{
    Department d = null;
    ISession session = null;
    ITransaction tx = null;
    try
    {
        session = factory.OpenSession();
        tx = session.BeginTransaction();
        // Load() returns an Object, so cast it to the mapped type
        d = (Department)session.Load(typeof(nhRegistration.Department), i);
        tx.Commit();
    }
    catch (Exception ex)
    {
        if (tx != null) tx.Rollback();
        // handle exception.
    }
    finally
    {
        if (session != null) session.Close();
    }
    return d;
}

Just remember to cast the results of session.Load() to the appropriate type, as what you get back is an Object.

Finally, what if you want to save changes to an instance (or create a new instance)? Session provides two methods: Session.Save() for creating a new instance, and Session.Update() for modifying an existing instance. You have to keep the two separate; Session conflicts and database errors occur if you try to use the wrong method. In our interface, though, our DBMgr implementation only has one method, saveDepartment(Department dept). It doesn’t make much sense for such a simple application to force developers to distinguish between persisting a new object versus an existing one, so the interface only exposes that single entry point.

In our implementation, however, we are still required to make that distinction. Here is an (altogether too simple) version:

public void saveDepartment(Department dept)
{
    ISession session = null;
    ITransaction tx = null;
    try
    {
        session = factory.OpenSession();
        tx = session.BeginTransaction();
        if (dept.Id == 0)
        {
            // Id still holds the unsaved-value, so this is a new instance
            session.Save(dept);
        }
        else
        {
            session.Update(dept);
        }
        tx.Commit();
    }
    catch (Exception ex)
    {
        if (tx != null) tx.Rollback();
        // handle exception
    }
    finally
    {
        if (session != null) session.Close();
    }
}

First, before we persist the instance, we determine if it is a new instance or modified pre-existing version. With a simple integer-based identification field, we can just check to see if it has a value of 0, which means it has no corresponding row in our database (remember our unsaved-value attribute above). You can imagine, though, that this logic breaks down in the face of more complicated identification fields. Once we make that determination, we invoke the correct method on Session to do the work for us.
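
To illustrate why the check gets harder, here is a small sketch of what the "is this new?" test might look like for a few identifier types. UnsavedCheck is a hypothetical helper invented for this example, not part of NHibernate, and each sentinel value is an assumption that only holds if your application never hands out that value as a real identifier.

```csharp
using System;

// Hypothetical helper: one "is this instance new?" test per identifier type.
public static class UnsavedCheck
{
    // Integer ids: 0 is the sentinel, matching unsaved-value="0" in the mapping.
    public static bool IsUnsaved(int id)
    {
        return id == 0;
    }

    // Guid ids: the all-zero Guid plays the same role (assumed convention).
    public static bool IsUnsaved(Guid id)
    {
        return id == Guid.Empty;
    }

    // String ids: null or empty means "not yet assigned" (assumed convention).
    // If the application assigns the string before saving, this test cannot
    // tell a new instance from an existing one.
    public static bool IsUnsaved(string id)
    {
        return string.IsNullOrEmpty(id);
    }
}
```

With identifiers the application assigns itself, no sentinel exists at all, which is where the simple test above stops working.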

Other Concerns

This article has been a whirlwind tour of NHibernate, giving you just the basics you’ll need to get started. Clearly, there is much I have left out, glossed over or hidden behind a veil of fog. In my next article, we’ll tackle some of those issues, like:

  • Parameterized queries in HQL (Hibernate Query Language)
  • Caching
  • Advanced collection semantics (lazy loading, bidirectional mapping)
  • Advanced session management (disconnected objects, SaveOrUpdate semantics)
  • Limiting configuration steps (only building the SessionFactory once)

…and more. I hope this article has piqued your interest in the power of O/R mappers. Remember, NHibernate is just one of many, and if this looks interesting but not QUITE to your liking, there are many more out there to choose from.

Authors

Justin Gehtland is a founding member of Relevance, LLC, a consultant group dedicated to elevating the practice of software development. He is the co-author of Windows Forms Programming in Visual Basic .NET (Addison Wesley, 2003) and Effective Visual Basic (Addison Wesley, 2001). Justin is an industry speaker, and instructor with DevelopMentor in the .NET curriculum.
Posted on 2006-11-23 18:09 by Gardener