博客园  :: 首页  :: 新随笔  :: 联系 :: 订阅 订阅  :: 管理

Excel Adapter for ADO.NET

Posted on 2010-10-12 11:32  gczhao  阅读(361)  评论(0编辑  收藏  举报

转自http://www.codeproject.com/KB/grid/ExcelAdapter.aspx

On the recent project I worked on, I had a requirement to load data from an Excel spreadsheet to a database on a regular basis. My goal was to load the data from Excel to a ADO.NET Dataset and run my business rules. I couldn't find any articles addressing this problem and so I decided to write one.

There are two projects included in the source code package.

  • ExcelADOAdapter builds the library, ExcelADOAdapter.dll.
  • ExcelAdapterTest is a test Windows Form application.

In addition, you need to add the following references.

  • Microsoft Excel 11.0 Object Library
  • Microsoft Office 11.0 Object Library

All sample codes in this articles are in C#.

This component uses Microsoft Office 2003 Interop Assemblies to handle the Excel source and the code is written in C#. The individual worksheets on the Excel workbook will be loaded into separate Datatables on the DataSet. I have a workbook, PurchaseOrder, with 5 worksheets; Customers, Orders, OrderDetails, Shippers and Products. The load process will create a dataset with 5 tables. The name of the dataset will be set to the name of the Excel workbook, in this example PurchaseOrder. The DataTable names will be the name of the worksheets.

 

 

 

Background

For more information on the Office 2003 Interop Assemblies, refer MSDN Reference.

Using the code

The most important thing to remember when working with Excel is to get a valid range of cells on the worksheet to work with. We certainly don't want all the empty cells to get loaded into the DataTables. The logic that I used to get the valid range is depicted below:

Collapse | Copy Code
/// <summary />
/// Gets the valid range of cells to work with.
/// </summary />
/// 
/// <returns />Excel Range</returns />
private Microsoft.Office.Interop.Excel.Range getValidRange(Worksheet sheet)
{
    string downAddress = "";
    string rightAddress = "";
    long indexOld = 0;
    long index = 0;
    Microsoft.Office.Interop.Excel.Range startRange;
    Microsoft.Office.Interop.Excel.Range rightRange;
    Microsoft.Office.Interop.Excel.Range downRange;

    try
    {
        // get a range to work with
        startRange = sheet.get_Range("A1", Missing.Value);
        // get the end of values to the right 
        // (will stop at the first empty cell)
        rightRange = startRange.get_End(XlDirection.xlToRight);

        /*get_End method scans the sheet in the direction specified 
        *until it finds an empty cell and returns the previous cell.
        *We need to scan all the columns and find the column with 
        *the highest number of rows (row count).
        *Then use the Prefix character on the right cell and the 
        *row count to determine the address for the valid range.
        */
        while (true)
        {
            downRange = rightRange.get_End(XlDirection.xlDown);
            downAddress = downRange.get_Address(false, false, 
            eferenceStyle.xlA1, Type.Missing, Type.Missing);
            index = getIndex(downAddress);
            if (index >= 65536) index = 0;
            if (index > indexOld) indexOld = index;
            if (rightRange.Column == 1) break;
            rightRange = rightRange.Previous;
        }

        rightRange = startRange.get_End(XlDirection.xlToRight);
        rightAddress = rightRange.get_Address(false, false, 
        ReferenceStyle.xlA1, Type.Missing, Type.Missing);

        return sheet.get_Range("A1", getPrefix(rightAddress) + indexOld);
    }
    catch (Exception)
    {
        throw;
    }
    finally
    {
        startRange = null;
        rightRange = null;
        downRange = null;
    }
}

startRange = sheet.get_Range("A1", Missing.Value); sets the starting cell as A1 which is the top left cell on the Excel worksheet. The get_End method on the Range object scans the sheet in the direction specified until it finds a empty cell and returns the previous cell. But a blank cell can also hold valid values as shown below.

 

To avoid this problem, you need to check all the columns and find the column with the highest number of rows and use that as the boundary for the valid range. But still if all the cells in the row are blank, the get_End method will not find any rows after that row.

Once after a valid range is found, we can create a two dimensional array and load the array to a DataTable. See the code segment below.

Collapse | Copy Code
/// <summary />
/// Loads the data table.
/// </summary />
/// 
/// <returns /></returns />
private System.Data.DataTable loadDataTable(Worksheet sheet)
{
    object[] columnValues;
    System.Data.DataTable dt = new System.Data.DataTable();
    dt.TableName = sheet.Name;
    Microsoft.Office.Interop.Excel.Range range;

    try
    {
        range = getValidRange(sheet);
        object[,] values = (object[,])range.Value2;
        columnValues = new object[values.GetLength(1)];

        for (int i = 1; i <= values.GetLength(0); i++)
        {
            if (i == 1 && _firstRowHeader)
            {
                for (int j = 1; j <= values.GetLength(1); j++)
                {
                    object value = values[i, j];
                    if (value != null) dt.Columns.Add(value.ToString());
                    else dt.Columns.Add("");
                }
            }
            else
            {
                for (int j = 1; j <= values.GetLength(1); j++)
                {
                    object value = values[i, j];
                    if (value != null)
                    {
                        columnValues[j - 1] = value.ToString();
                    }
                    else
                    {
                        columnValues[j - 1] = "";
                    }
                }
            dt.Rows.Add(columnValues);
            }
        }
        return dt;
    }
    catch (Exception)
    {
        throw;
    }
    finally
    {
        range = null;
    }
} 

I hope you will find the code useful. Thank you.