Data Interoperability Using Enterprise-Wide Data Sources
Rob Caron and Paul Larsen
Microsoft Corporation
August 1999
Summary: Discusses the foundations on which to develop a data interoperability solution using enterprise-wide data sources. (16 printed pages)
Contents
Interoperability in the Enterprise
Universal Data Access
OLE DB Providers
Microsoft SQL Server 7.0
Conclusion
Interoperability in the Enterprise
Most mission-critical corporate data resides in diverse data stores on a variety of computing platforms. Thus, successful enterprise solutions must be able to function in heterogeneous environments and access data regardless of form or location.
The foundation for developing enterprise data interoperability solutions on the Microsoft® Windows® platform is Microsoft Windows Distributed interNet Applications Architecture (Windows DNA). Windows DNA, which is based on the widely used Component Object Model (COM), specifies how to do the following:
- Develop robust, scalable, distributed applications using the Windows platform.
- Extend existing data and external applications to support Internet operations.
- Support a wide range of client devices, maximizing the reach of an application.
Figure 1. Microsoft Windows DNA Architecture
Interoperability and reuse are key attributes of Windows DNA. Unlike traditional software development, in which each application was built from scratch, COM enables developers to create complex applications from a series of small software objects (COM components). For example, a component might be a credit card authorization procedure or the business rules for calculating shipping costs. The COM programming model speeds up the development process by enabling multiple development teams to work on different parts of an application simultaneously.
COM also offers the advantage of programming language independence. This means that Windows developers can create COM components using tools and languages with which they are familiar, such as Microsoft Visual Basic® and Visual C++®. For other developers, such as mainframe COBOL programmers and Web publishers, COM components can be accessed from simple scripting languages, such as VBScript and JScript® development software. Windows DNA simplifies development by providing access to a wide range of services and products developed using a consistent object model—COM.
One example of the services available is what we call COM services for interoperability. COM services for interoperability include the network, data, application, and management services that are part of existing Microsoft products, such as Microsoft SNA Server. COM services for interoperability provide a common approach to system integration using the wide range of COM components available today.
An Interoperability Framework
Microsoft has defined a four-layer framework for interoperability based on industry standards for Network, Data, Applications, and Management—or NDAM for short. Microsoft provides access to interoperability components in each of these four categories. This document focuses on the Data interoperability layer, providing an overview of the wide range of COM components available for accessing multiple data stores across an enterprise environment.
Figure 2. Microsoft Interoperability Framework
Universal Data Access
Enterprises run their daily operations relying on multiple data sources, including database servers, legacy flat-file records, e-mail correspondence, personal productivity documents (spreadsheets, reports, presentations), and Web-based information publishing servers. Typically, applications, end-users, and decision-makers access these data sources by employing a variety of nonstandard interfaces. Data interoperability standards offer the transparent and seamless ability to access and modify data throughout the enterprise. Microsoft’s data interoperability strategy is called Universal Data Access. Universal Data Access uses COM objects to provide one consistent programming model for access to any type of data, regardless of where that data may be found in the enterprise.
An easy-to-use programming architecture that is both tool and language independent, Universal Data Access provides COM objects for high-performance access to a variety of relational (SQL) and nonrelational information sources. The technologies that make up the Universal Data Access strategy enable you to integrate diverse data sources, create easy-to-maintain solutions, and use your choice of best-of-breed tools, applications, and platform services.
To leverage existing investments, Universal Data Access does not require expensive and time-consuming movement of data into a single data store, nor does it require commitment to a single vendor’s data products. Universal Data Access is based on open industry specifications with broad industry support, and works with all major established database platforms.
Figure 3. Universal Data Access Architecture
The Microsoft Data Access Components (MDAC) are the key technologies that enable Universal Data Access. Data-driven client/server applications deployed over the Web or a LAN can use these components to easily integrate information from a variety of sources, both relational and nonrelational. These technologies include Open Database Connectivity (ODBC), OLE DB, and Microsoft ActiveX® Data Objects (ADO).
For more information on Universal Data Access, see www.microsoft.com/data/.
ODBC
Open Database Connectivity is an industry standard and a component of Microsoft Windows Open Services Architecture (WOSA). The ODBC interface makes it possible for applications to access relational data stored in almost any database management system (DBMS).
Microsoft's ODBC industry-standard data access interface continues to provide a unified way to access relational data as part of the OLE DB specification. ODBC is a widely accepted application programming interface (API) for database access. It is based on the Call-Level Interface (CLI) specifications from X/Open and ISO/IEC for database APIs and uses Structured Query Language (SQL) as its database access language. Microsoft has implemented a number of ODBC drivers to access diverse data stores. ODBC is widely supported by Microsoft products, third-party application development products, and end-user productivity applications.
Figure 4. ODBC Architecture
Microsoft offers a number of ODBC drivers as part of the Microsoft Data Access Components, which ships as a feature of many popular Microsoft products, including Microsoft SQL Server™, Microsoft Office, the Microsoft BackOffice® family of products, Microsoft SNA Server, and Microsoft Visual Studio®. The following ODBC drivers are included in MDAC version 2.1:
- Microsoft ODBC Driver for SQL Server.
- Microsoft ODBC Driver for Oracle.
- Microsoft ODBC Driver for Microsoft Visual FoxPro®.
- Microsoft ODBC Driver for Access (Jet engine).
Additionally, Microsoft SNA Server 4.0 with Service Pack 2 ships with a Microsoft ODBC Driver for DB2.
A number of third-party ISVs offer ODBC drivers for many data sources. For more information on these third-party offerings, see www.microsoft.com/sql/.
In addition, OLE DB includes a bridge to ODBC to enable continued support for the broad range of ODBC relational database drivers available today. The Microsoft OLE DB Provider for ODBC leverages existing ODBC drivers, ensuring immediate access to databases for which an ODBC driver exists, but for which an OLE DB provider has not yet been written.
For more information on ODBC, see www.microsoft.com/data/odbc/.
Note: JDBC is a technology for accessing SQL database data from a Java client program. Microsoft offers a JDBC-to-ODBC bridge that allows Java programmers to access back-end data sources using available ODBC drivers. The Microsoft JDBC-ODBC bridge is part of the core set of classes that come with the Microsoft Virtual Machine (Microsoft VM) for Java. For more information, see www.microsoft.com/java/.
OLE DB
OLE DB is a strategic system-level programming interface to data across the organization. OLE DB is an open specification designed to build on the success of ODBC by providing an open standard for accessing all kinds of data. Whereas ODBC was created to access relational databases, OLE DB is designed for relational and nonrelational information sources.
Figure 5. OLE DB Components
OLE DB encapsulates various database management system functions that enable the creation of software components implementing such services. OLE DB components consist of data providers, which contain and expose data; data consumers, which use data; and service components, which gather and sort data (such as query processors and cursor engines). OLE DB interfaces are designed to help diverse components integrate smoothly so that OLE DB component vendors can bring high-quality OLE DB products to market quickly.
OLE DB data providers
OLE DB data providers implement a set of core OLE DB interfaces that offer basic functionality. This basic functionality enables other OLE DB data providers, service components, and consumer applications to interoperate in a standard, predictable manner. The MDAC Software Development Kit (SDK) includes a set of OLE DB conformance tests that OLE DB component vendors, as well as end-user consumer developers, can run to ensure a standard level of compatibility. In addition, data providers can implement extended functionality as appropriate for a particular data source.
OLE DB data consumers
OLE DB data consumers can be any software programs that require access to a broad range of data, including development tools, personal productivity applications, database products, or OLE DB service components. A major set of OLE DB data consumers are ActiveX Data Objects (ADO), which provide a means to develop flexible and efficient data interoperability solutions using such high-level programming languages as Visual Basic.
OLE DB service components
OLE DB service components implement functionality not natively supported by some simple OLE DB data providers. For example, some basic OLE DB providers do not support rich sorting, filtering, and finding on their data sources. OLE DB service components, such as the Microsoft Cursor Service for OLE DB and the Microsoft Data Shaping Service for OLE DB, can seamlessly integrate with these basic OLE DB data providers to complete the functionality desired or expected by a given OLE DB consumer application. Universal Data Access allows for the development of “generic” OLE DB consumer applications that access many data sources in a single, uniform manner. Enterprise developers can write COM components that perform a specific function against nonspecific data sources. Such a component might run against a VSAM data set today and run against a SQL Server table tomorrow. This allows enterprises to migrate from one data store to another as efficiencies allow or business needs require.
Resource pooling is another popular function provided by OLE DB service components. When running under Microsoft Transaction Server, Microsoft Internet Information Server, or on a standalone basis, OLE DB and ADO applications can make use of OLE DB resource pooling, supported by the OLE DB core services, to enable reuse of OLE DB Data Source proxy objects. Typically, the OLE DB session start-up, from instantiation of the OLE DB data source object (DSO) to creating the underlying network connection to the data source, is the most expensive part of a given transaction or unit of work. This is a critical issue when developing Web-based or multi-tier applications. Maintaining connections to the database in the resource state of the middle-tier component can create scalability issues, while creating a new connection on every page of a Web application is too slow. The solution is OLE DB resource pooling, which enables better scalability and offers better performance.
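From the application's point of view, pooling requires little more than opening connections with a consistent connection string and releasing them promptly. The following VBScript sketch illustrates the pattern under MDAC 2.x; the server name, catalog, and credentials are placeholders, and the "OLE DB Services" keyword shown explicitly requests all core services, including resource pooling:

```vb
' Sketch only: reusing pooled connections under MDAC 2.x.
' Server, catalog, and credentials below are placeholders.
Dim cn
Set cn = CreateObject("ADODB.Connection")
cn.Open "Provider=SQLOLEDB;Data Source=MYSERVER;" & _
        "Initial Catalog=Northwind;Integrated Security=SSPI;" & _
        "OLE DB Services=-1"
' ... perform a unit of work ...
cn.Close          ' Returns the underlying connection to the pool
Set cn = Nothing  ' A later Open with the same string reuses it
```

Because pooled connections are matched on the full connection string, middle-tier components that share one string benefit the most.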
For more information see www.microsoft.com/data/oledb/.
ADO
Microsoft ActiveX Data Objects (ADO) is a strategic, application-level programming interface to data and information, layered on top of OLE DB. ADO provides consistent, high-performance access to data and supports a variety of development needs, including the creation of front-end database clients and middle-tier business objects, using applications, tools, languages, or Internet browsers. ADO is designed to be the one data interface needed for single and multi-tier client/server and Web-based data-driven solution development. Its primary benefits are ease of use, high speed, low memory overhead, and a small disk footprint.
ADO provides an easy-to-use programming interface to OLE DB, because it uses a familiar metaphor—the COM automation interface, accessible from all leading Rapid Application Development (RAD) tools, database tools, and languages (including scripting languages).
Figure 6. ADO Object Model
ADO uses a flatter and more flexible object model than any previous object-oriented data access technology. Any of the five top-level objects can exist independent of the other four. Unlike Data Access Objects (DAO), which required constructing a hierarchical chain of objects before accessing data, ADO can be used to access data with just a couple of lines of code.
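As a minimal sketch of that flatness, a Recordset can be opened directly, with no object hierarchy to build first. The server, database, and table names below are placeholders (the classic SQL Server pubs sample is assumed):

```vb
' Sketch only: opening a Recordset in a couple of lines.
Dim rs
Set rs = CreateObject("ADODB.Recordset")
rs.Open "SELECT au_lname FROM Authors", _
        "Provider=SQLOLEDB;Data Source=MYSERVER;" & _
        "Initial Catalog=pubs;Integrated Security=SSPI"
WScript.Echo rs.Fields("au_lname").Value
rs.Close
```

No Connection or Command object is required; ADO creates the connection implicitly from the string passed to Recordset.Open.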
ADO's strength lies in its ability to connect to any OLE DB provider while exposing the same programming model, regardless of the specific features offered by a particular provider. However, because each provider is unique in its implementation, how your application interacts with ADO may vary slightly when run against different data providers. Some common differences include ADO connection strings, command execution syntax, and supported data types.
For more information see www.microsoft.com/data/ado/.
OLE DB Providers
Microsoft offers a number of technologies that facilitate interoperability with data stored on non-Windows systems. Developers can create solutions that take advantage of a large installed base of information sources while working in a familiar environment. The following is a collection of some of the Microsoft technologies available today that provide this capability.
Microsoft OLE DB Provider for DB2
The Microsoft OLE DB Provider for DB2 allows application developers to use familiar object-oriented programming techniques to access DB2 databases over SNA LU6.2 and TCP/IP networks, without requiring knowledge of SNA APPC programming.
Implemented using the open protocol of the Distributed Relational Database Architecture (DRDA), the Microsoft OLE DB Provider for DB2 supports access to remote DB2 data using industry-standard Structured Query Language (SQL) statements. Because the provider is part of Microsoft's Universal Data Access strategy, which is based on OLE DB and ADO, it can interoperate with OLE DB-aware tools and applications in Microsoft Visual Studio 6.0, Microsoft SQL Server 7.0, and Microsoft Office 2000 Developer Edition. These generic consumer applications rely on this OLE DB provider's compatibility as verified using the OLE DB conformance tests. A key requirement is that the provider support the IDBSchemaRowset interface. IDBSchemaRowset provides the consumer with a means to query the data source's metadata that describes the target table. Using this schema information, the generic consumer can intelligently process and display the result sets of queries.
Figure 7. OLE DB Provider for DB2 connecting to DB2 for MVS
At run time, generic consumers use IDBSchemaRowset information to make choices on how to behave with a back-end data source. Other information available to generic consumers at run time includes IDBInfo data source-supported keywords and literal information. Typically, consumer applications are designed to modify their behavior based on an expected set of values returned in IDBSchemaRowset and IDBInfo based on well-defined rules in the OLE DB specification. Additionally, the OLE DB Provider for DB2 publishes the data types, including precision and scale limits, in the form of standard OLE DB IDBSchemaRowset DBSCHEMA_PROVIDER_TYPES.
Another use of IDBSchemaRowset data is to populate a list of tables and columns available in the data source’s current collection. The OLE DB Provider for DB2 maps IDBSchemaRowset to DB2 system table information. For example, DBSCHEMA_TABLES are provided using information stored in DB2 for OS/390 SYSIBM.SYSTABLES and DB2 for OS/400 QSYS2.SYSTABLES tables. DBSCHEMA_COLUMNS are mapped to DB2 for OS/390 SYSIBM.SYSCOLUMNS and DB2 for OS/400 QSYS2.SYSCOLUMNS information. The OLE DB provider queries these DB2 system tables at run time, returning the data on calls to IDBSchemaRowset TABLES and COLUMNS.
Two examples of this usage:
- The Visual Studio Data Designer, which allows developers to preview DB2 tables and drag these tables and columns into the query designer.
- The SQL Server Data Transformation Services, which offers the end user an intuitive wizard for picking tables for bulk movement of data between DB2 and any other OLE DB or ODBC data source.
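ADO surfaces IDBSchemaRowset through the Connection.OpenSchema method, so application code can browse the same table and column metadata that these tools use. A hedged sketch, assuming a pre-created UDL file named MyDB2.udl in the standard Data Links directory:

```vb
' Sketch only: listing tables via Connection.OpenSchema, which maps
' to IDBSchemaRowset TABLES. The UDL file name is a placeholder.
Dim cn, rs
Set cn = CreateObject("ADODB.Connection")
cn.Open "File Name=C:\Program Files\Common Files\System\" & _
        "OLE DB\Data Links\MyDB2.udl"
Set rs = cn.OpenSchema(20)   ' 20 = adSchemaTables
Do While Not rs.EOF
    WScript.Echo rs.Fields("TABLE_NAME").Value
    rs.MoveNext
Loop
```

For the OLE DB Provider for DB2, the rows returned here come from the DB2 system tables described above.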
DB2 varies from platform to platform and version to version. The provider can access the following DB2 systems via SNA LU6.2 using Microsoft SNA Server:
- DB2 for MVS V4R1.
- DB2 for OS/390 V5R1 or later.
- DB2 for OS/400 V3R2 or later.
Additionally, the provider supports direct TCP/IP access to the following DB2 systems:
- DB2 for OS/390 V5R1 or later.
- DB2 for OS/400 V4R2 or later.
- DB2 Universal Database for Windows NT® V5R2 or later.
- DB2 Universal Database for AIX V5R2 or later.
OLE DB providers implement a default read-only, forward-only server cursor that is mapped to the data source’s cursor engine whenever possible. In the case of the OLE DB Provider for DB2, the server cursor offered is a forward-only updateable cursor. A server cursor offers the developer the most efficient means to traverse tables on the data source. Some generic consumers may expect and request a scrollable cursor when, as in the case of the OLE DB Provider for DB2, only a forward-only cursor is offered by the provider. In these cases, the generic consumer can request to use a client-side cursor that is implemented on behalf of the provider by the Microsoft Cursor Service for OLE DB. In ADO, a developer can specify the use of the ADO Client Cursor Engine (CCE) by simply specifying CursorLocation=adUseClient. The CCE offers rich scrolling in the form of ADO Recordset.Sort, Recordset.Filter, and Recordset.Find methods.
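The client-side cursor request amounts to one property setting before the Recordset is opened. A sketch, assuming a DB2 UDL file and the DB2 SAMPLE database's EMPLOYEE table (both placeholders):

```vb
' Sketch only: requesting the ADO Client Cursor Engine so that
' scrolling, sorting, and filtering work over a forward-only
' provider such as the OLE DB Provider for DB2.
Const adUseClient = 3
Dim cn, rs
Set cn = CreateObject("ADODB.Connection")
cn.Open "File Name=C:\Program Files\Common Files\System\" & _
        "OLE DB\Data Links\MyDB2.udl"
Set rs = CreateObject("ADODB.Recordset")
rs.CursorLocation = adUseClient        ' Cursor Service steps in here
rs.Open "SELECT * FROM SAMPLE.EMPLOYEE", cn
rs.Sort = "LASTNAME"                   ' Rich scrolling via the CCE
rs.Filter = "WORKDEPT = 'D11'"
rs.Find "JOB = 'DESIGNER'"
```

The provider still delivers rows through its forward-only cursor; the Cursor Service caches them locally and supplies the scrollable behavior.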
In many cases, the most complex issues involve preparing the data source for remote access as well as configuring the OLE DB data source. MDAC 2.x offers a central configuration tool called the Microsoft Data Links. Data Links allows the user to create, test, and save a file-persisted Universal Data Link (or UDL file). Most popular generic consumer applications support the use of UDL files and enable one to load pre-created UDL files from the standard location in Program Files\Common Files\System\OLE DB\Data Links. This location may be familiar to those who have used ODBC file-persisted data sources, which are stored in Program Files\Common Files\ODBC\Data Sources. If the directory location seems long or cryptic, you need not worry; OLE DB service components and generic consumers mask the location from most end users, making use of UDL files intuitive and time-saving.
For additional information see Q218590 INF: "Configuring Data Sources for the Microsoft OLE DB Provider (DB2)" at http://support.microsoft.com/search/default.asp.
Microsoft OLE DB Provider for AS/400 and VSAM
The Microsoft OLE DB Provider for AS/400 and VSAM allows developers to use familiar object-oriented programming techniques to access nonrelational data stored in mainframe Virtual Storage Access Method (VSAM) and midrange AS/400 files without having to master the mainframe or AS/400 programming environments. At its core, the provider gives the data consumer access to the remote host computer's native file system. It does this by complying with the guidelines of the IBM Distributed Data Management (DDM) architecture. IBM developed DDM to enable applications on one computing platform (for example, the PC) to seamlessly access data on another computing platform (for example, the mainframe), without knowing that the data is remotely stored. Microsoft has merged DDM with Universal Data Access to provide Windows applications with transparent access to non-Windows legacy data.
The provider uses the DDM Record Level Input/Output (RLIO) protocol to enable record-level access to files. This provider is an example of a non-SQL provider, interacting with a nondatabase back-end information source. As such, the provider cannot consume the commonly used universal database language, Structured Query Language (SQL). Instead, it enables command access to host files using a command syntax somewhat similar to calling a stored procedure.
OLE DB offers two primary means of accessing data through an OLE DB provider. The first is the IOpenRowset::OpenRowset method, which maps to the ADO Recordset.Open method. The second is the ICommand::Execute method, which maps to the ADO Command.Execute method. Typically, IOpenRowset::OpenRowset is invoked by qualifying a target table name; in contrast, ICommand::Execute is invoked by specifying a command string that typically is a well-understood SQL statement.
In the case of the OLE DB Provider for AS/400 and VSAM, users can open a remote legacy data set or file by using IOpenRowset::OpenRowset and the fully qualified host data set path and member name. Using the ICommand::Execute interface to open an AS/400 file is similar to invoking a SQL stored procedure: “EXEC OPEN LIBRARYNAME.FILENAME(MEMBERNAME)”.
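In ADO terms, the two access paths might look like the following sketch; the UDL file and the library, file, and member names are placeholders:

```vb
' Sketch only: two ways to open an AS/400 file member through the
' OLE DB Provider for AS/400 and VSAM. All names are placeholders.
Const adCmdTable = 2
Dim cn, cmd, rs
Set cn = CreateObject("ADODB.Connection")
cn.Open "File Name=C:\Program Files\Common Files\System\" & _
        "OLE DB\Data Links\MyAS400.udl"

' Path 1: Recordset.Open (IOpenRowset) with a qualified file name.
Set rs = CreateObject("ADODB.Recordset")
rs.Open "MYLIB.MYFILE(MYMBR)", cn, , , adCmdTable

' Path 2: Command.Execute (ICommand::Execute) with EXEC OPEN syntax.
Set cmd = CreateObject("ADODB.Command")
Set cmd.ActiveConnection = cn
cmd.CommandText = "EXEC OPEN MYLIB.MYFILE(MYMBR)"
Set rs = cmd.Execute
```

Either path yields an ordinary ADO Recordset over the host file's records.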
OLE DB and ADO offer interfaces for sorting, filtering, and finding result sets based on indexes implemented by an OLE DB provider's server cursor. The OLE DB Provider for AS/400 and VSAM offers support for the OLE DB IRowsetIndex and IRowsetFind interfaces, as well as the ADO Recordset.Sort, Recordset.Filter, and Recordset.Find methods. These interfaces and methods are mapped to the DDM RLIO indexed access methods and can be employed when searching VSAM Key-Sequenced Data Sets (KSDS), VSAM Relative Record Data Sets (RRDS), VSAM Alternate Indexes on KSDS and RRDS, AS/400 Keyed Physical Files, and AS/400 Logical Files. Searching on host indexes is often the only reasonable means to access the volumes of information stored in a single enterprise-level multi-megabyte VSAM data set or AS/400 file.
An additional feature of the provider in an AS/400 environment is the ability to issue Control Language (CL) commands as command text on the OLE DB ICommand::Execute interface or ADO Command.Execute method. This feature enables developers to prepare or reset AS/400 resources at run-time while concurrently accessing data stored in native data files.
The Microsoft OLE DB Provider for AS/400 and VSAM offers the means to deploy an enterprise-scalable data integration solution. Without remote data access providers, such as the OLE DB Provider for AS/400 and VSAM, developers would be limited to developing host-based applications that push a subset of host data down to the PC in the form of a comma-separated values (CSV) flat file.
The OLE DB provider offers automatic data conversion from host data type to OLE DB data type. In the case of the AS/400, the data mappings are done automatically. Most AS/400 files are externally described to the operating system. The AS/400 contains system data dictionaries in which are published the schema or metadata of most AS/400-stored physical data files. In the case of the mainframe, all data files are program-described. There is no system-wide data dictionary on the mainframe. When creating an OLE DB Provider for AS/400 and VSAM data source (such as file-persisted universal data links), the developer uses an intuitive Microsoft Management Console (MMC)-based snap-in to create a PC-side metadata mapping file of host column lengths and data types to OLE DB data types. The snap-in utility produces a Host Column Description (HCD) file that is stored with the provider and referenced in the data source description. At run-time, the provider reads this metadata map to resolve the host data type to OLE DB data type conversions. Optionally, the AS/400 user can create an HCD file to describe program-described files used by AS/400 application programs that have their own proprietary data dictionary.
The OLE DB Provider for AS/400 and VSAM uses existing IBM DDM server applications developed by IBM and offered as features of popular IBM software applications. In the case of the mainframe, the OLE DB provider interoperates with IBM Distributed FileManager, which is a feature of the popular IBM Data Facility Storage Management Subsystem (DFSMS) for MVS version 1.2 and higher. DFSMS is part of most enterprise accounts’ mainframe systems because it is part of IBM’s Systems-Managed Storage framework for providing reliable and secure mainframe storage. In the case of the AS/400, the DDM server software is integrated with the operating system beginning with OS/400 version 2.2. As such, the OLE DB Provider for AS/400 and VSAM does not require the introduction of any non-IBM host software products. It makes use of existing IBM standard interfaces and software.
The provider can access most popular mainframe data set types, including:
- Sequential Access Method (SAM) data sets.
- Basic Sequential Access Method (BSAM) data sets.
- Queued Sequential Access Method (QSAM) data sets.
- Virtual Storage Access Method (VSAM) data sets.
- Key-Sequenced Data Sets (KSDS).
- Entry-Sequenced Data Sets (ESDS).
- Fixed-length Relative Record Data Sets (RRDS).
- Variable-length Relative Record Data Sets (VRRDS).
- VSAM Alternate Indexes to ESDSs or KSDSs.
- Basic Partitioned Access Method (BPAM) data sets.
- Partitioned Data Set (PDS) members.
- Partitioned Data Set Extended (PDSE) members.
- Read-only support for PDS directories.
- Read-only support for PDSE directories.
Because the provider accesses host resources through the DFM/MVS DDM server, it cannot access some less popular data set types, including the following:
- VSAM Linear Data Sets (LDSs).
- Generation Data Groups (GDGs) and Generation Data Sets (GDSs).
- Basic Direct Access Method (BDAM) data sets.
- Indexed Sequential Access Method (ISAM) data sets.
- Sequential Data Striping data sets.
- OpenEdition MVS Hierarchical File System (HFS) files.
- Tape media.
Using the Microsoft OLE DB Provider for AS/400 and VSAM, enterprises can efficiently develop data interoperability solutions while preserving their investments in host-based data storage, management, backup, and security systems.
For additional information, see Q219077 INF: "Configuring Data Sources for the Microsoft OLE DB Provider (AS/400 and VSAM)" at http://support.microsoft.com/search/default.asp.
Microsoft OLE DB Provider for Oracle
The Microsoft OLE DB Provider for Oracle allows application developers to use familiar data access techniques to access one or more Oracle databases on such platforms as Windows NT, HP-UX, and Sun SPARC Solaris. Network connections between the OLE DB provider and Oracle servers are supported by means of the Oracle Client for Windows.
The Microsoft OLE DB Provider for Oracle implements a forward-only, read-only server cursor. Updates to a row can be made by using a SQL UPDATE statement or by using the client cursor engine. Typically, applications are structured to issue SQL SELECT statements to fetch data and then SQL INSERT, UPDATE, or DELETE statements to post changes to the Oracle database. This type of common application can be supported efficiently using the OLE DB Provider for Oracle's default server cursor.
Another common means to fetch and update Oracle data is to call Oracle stored procedures. Stored procedures in Oracle can be called using either the canonical syntax defined in the ODBC specification or the Oracle native syntax, PL/SQL. Calling stored procedures offers an optimized means to interoperate with Oracle data when compared to dynamic SQL statements. Typically, Oracle applications have evolved from those based on dynamic SQL to those based on stored procedures.
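The ODBC canonical call syntax looks like the following sketch; the procedure name, parameter, and scott/tiger credentials are placeholders:

```vb
' Sketch only: calling an Oracle stored procedure through the
' Microsoft OLE DB Provider for Oracle (MSDAORA) using the ODBC
' canonical call syntax. Procedure and account are hypothetical.
Const adCmdText = 1, adInteger = 3, adParamInput = 1
Dim cn, cmd
Set cn = CreateObject("ADODB.Connection")
cn.Open "Provider=MSDAORA;Data Source=OracleServer;" & _
        "User ID=scott;Password=tiger"
Set cmd = CreateObject("ADODB.Command")
Set cmd.ActiveConnection = cn
cmd.CommandType = adCmdText
cmd.CommandText = "{call raise_salary(?)}"
cmd.Parameters.Append cmd.CreateParameter("empno", adInteger, _
                                          adParamInput, , 7369)
cmd.Execute
```

The same Command object could instead carry native PL/SQL in its CommandText, at the cost of Oracle-specific syntax.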
The Microsoft OLE DB Provider for Oracle supports OLE DB Schema Rowsets, which enables tools such as Visual Studio's Data Designer to view tables and rowsets from Oracle databases. Using developer productivity tools such as the Visual Basic Data Environment, users can create and debug Oracle stored procedures remotely prior to implementing them in their Visual Basic-based COM component or Visual Basic application.
For additional information, see "Microsoft OLE DB Provider for Oracle: Tips, Tricks, and Traps."
Microsoft OLE DB Provider for ODBC
Although specific OLE DB data providers are not yet available for every database, the Microsoft OLE DB Provider for ODBC permits the use of OLE DB with any database for which there is an ODBC driver. This provider alone enables immediate OLE DB access and data interoperability by leveraging the ODBC drivers that already exist for more than fifty of the most popular databases. This provider is also the default OLE DB provider that ADO uses if an OLE DB provider is not specified.
Using this provider can yield differing results depending on the ODBC driver with which it is being used. For example, the actual cursor type received when requesting similar cursors can vary depending on the underlying ODBC driver.
Although ODBC offers excellent performance, a native OLE DB provider will often establish connections and move through server-side cursors faster than the OLE DB Provider for ODBC working through an underlying ODBC driver.
When a native OLE DB provider is unavailable, using the OLE DB Provider for ODBC allows the developer to get a head start on developing an OLE DB and ADO solution. With minimal effort, an application that has been written to use the OLE DB Provider for ODBC can be modified to use the native OLE DB provider. Thus, the OLE DB Provider for ODBC does dual duty as a surrogate gateway to efficient OLE DB and easy-to-use ADO interfaces, while preserving one's development investment for the future.
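In practice, the switch can be as small as changing the connection string; the ADO code on either side stays the same. A sketch, with the DSN, server, and database names as placeholders:

```vb
' Sketch only: the same ADO code served first by the OLE DB Provider
' for ODBC (MSDASQL), then by a native provider. Names are placeholders.
Dim cn
Set cn = CreateObject("ADODB.Connection")

' Today: through MSDASQL and an existing ODBC data source name.
cn.Open "Provider=MSDASQL;DSN=PubsODBC;Trusted_Connection=Yes"
cn.Close

' Tomorrow: the native SQL Server provider, no ODBC layer involved.
cn.Open "Provider=SQLOLEDB;Data Source=MYSERVER;" & _
        "Initial Catalog=pubs;Integrated Security=SSPI"
cn.Close
```

Because MSDASQL is ADO's default, the first form also works with the Provider keyword omitted entirely.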
For additional information, see "OLE DB for the ODBC Programmer."
More OLE DB Providers
The Microsoft Data Access Components install a number of Microsoft OLE DB providers, including:
- Microsoft OLE DB Provider for SQL Server.
- Microsoft OLE DB Provider for Oracle.
- Microsoft OLE DB Provider for Jet.
Many independent software vendors offer native OLE DB providers for various data sources. These providers offer direct support for OLE DB consumers and service components. For more information, see the Data Interoperability Alliance Solutions Web site at www.microsoft.com/SQL/productinfo/datainteropsol.htm.
Microsoft SQL Server 7.0
Microsoft SQL Server offers a number of tools for accessing and integrating enterprise data sources with SQL Server-stored data. Among these tools are three key features of Microsoft SQL Server 7.0:
- Distributed Query Processor.
- Data Transformation Services.
- Replication.
Distributed Query Processor
The Distributed Query Processor (DQP) feature of Microsoft SQL Server 7.0 enables application developers to develop heterogeneous queries that join tables in disparate databases. To access the remote data sources, a user must create a Linked Server definition. The Linked Server encapsulates all of the network, security, and data source-specific information required to connect DQP to the remote data. Linked servers rely on underlying OLE DB providers or ODBC drivers for the actual physical connection to the target data source. Once a linked server has been defined, it can always be referred to with a single logical name as part of a SQL Server dynamic SQL statement or stored procedure. At run time, a linked server resource, such as a remote DB2 table, is treated like a local SQL Server table.
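A linked server definition can be scripted with the sp_addlinkedserver and sp_addlinkedsrvlogin system stored procedures. The following VBScript sketch defines a hypothetical DB2 linked server; the server name, catalog, and login values are placeholders:

```vb
' Sketch only: defining a linked server over the OLE DB Provider
' for DB2 (DB2OLEDB). All names and credentials are placeholders.
Dim cn
Set cn = CreateObject("ADODB.Connection")
cn.Open "Provider=SQLOLEDB;Data Source=MYSERVER;" & _
        "Initial Catalog=master;Integrated Security=SSPI"
cn.Execute "EXEC sp_addlinkedserver @server = 'DB2UK', " & _
           "@srvproduct = 'Microsoft OLE DB Provider for DB2', " & _
           "@provider = 'DB2OLEDB', @catalog = 'UKSALES'"
cn.Execute "EXEC sp_addlinkedsrvlogin 'DB2UK', 'false', " & _
           "NULL, 'db2user', 'db2pwd'"
```

After this, queries reference the remote data through the logical name DB2UK, never through connection details.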
When a client application executes a distributed query using a linked server, SQL Server analyzes the command and sends any requests for linked server data via OLE DB to that data source. If security credentials were specified when the linked server was created, those credentials are passed to the linked server. Otherwise, SQL Server will pass the credentials of the current SQL Server login. When SQL Server receives the returned data, it is processed in conjunction with other portions of the query.
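Credential mappings for a linked server are created with the sp_addlinkedsrvlogin system stored procedure. In this sketch, the linked server name, logins, and password are illustrative assumptions.

```sql
-- Map the local SQL Server login 'ReportUser' to a specific remote
-- login on the linked server UK_DB2 (all names are illustrative).
EXEC sp_addlinkedsrvlogin
    @rmtsrvname  = 'UK_DB2',
    @useself     = 'false',       -- do not pass the current login's credentials
    @locallogin  = 'ReportUser',
    @rmtuser     = 'DB2USER',
    @rmtpassword = 'db2password'
```

With @useself set to 'true' instead, the current SQL Server login's credentials would be passed through, matching the default behavior described above.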
DQP can concurrently access multiple heterogeneous sources on local and remote computers. Additionally, it supports queries against both relational and nonrelational data by using OLE DB interfaces implemented in OLE DB providers. Using DQP, SQL Server administrators and developers can create linked server queries that run against multiple data sources with little or no modification. For example, a Visual Basic developer can create a single linked server query that selects and inserts data stored in DB2 today, change the linked server name, and run the same query against Microsoft SQL Server tomorrow. In this way, DQP insulates the developer from changes in the storage engine.
Further, DQP is an efficient tool with which to join information from multiple tables spanning multiple data sources. For example, let’s say you are a regional sales manager for a large retail company with subsidiaries located in several countries. Because of mergers and acquisitions, some regional offices store their data in different databases from those of the corporate headquarters. The United Kingdom subsidiary stores its data in DB2; the Australian subsidiary stores its data in Microsoft Access; the Spanish subsidiary stores its data in Microsoft Excel; and the United States subsidiary stores its data in Microsoft SQL Server. You want a report that lists, on a quarterly basis for the last three years, the subsidiaries and the sales locations with the highest quarterly sales figures. Joining the required data tables can be accomplished in real time by using a single distributed query, running on Microsoft SQL Server.
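The scenario above can be sketched as a single distributed query, assuming linked servers named UK_DB2, AUS_ACCESS, and ES_EXCEL have already been defined; all table, sheet, and column names here are illustrative.

```sql
-- Combine quarterly sales from four subsidiaries in one query.
-- Linked server, table, and column names are illustrative.
SELECT s.Subsidiary, s.Quarter, s.Sales
FROM QuarterlySales AS s                             -- local SQL Server (US)
UNION ALL
SELECT uk.Subsidiary, uk.Quarter, uk.Sales
FROM UK_DB2.SALES.DBO.QUARTERLY AS uk                -- DB2 via linked server
UNION ALL
SELECT au.Subsidiary, au.Quarter, au.Sales
FROM AUS_ACCESS...QuarterlySales AS au               -- Access (Jet) linked server
UNION ALL
SELECT es.Subsidiary, es.Quarter, es.Sales
FROM OPENQUERY(ES_EXCEL,
    'SELECT Subsidiary, Quarter, Sales FROM [Sales$]') AS es  -- Excel worksheet
```

The combined result set can then be grouped and filtered locally to find the highest quarterly figures, exactly as if all four tables were native SQL Server tables.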
For additional information, see "Creating a Heterogeneous Query with Microsoft SQL Server 7.0."
Data Transformation Services
The Data Transformation Services (DTS) feature of Microsoft SQL Server 7.0 provides bidirectional snapshot data propagation between any data sources accessible via an ODBC driver or OLE DB provider. DTS accesses many popular data sources directly via OLE DB providers for Microsoft SQL Server, Microsoft Access, Oracle, and DB2. Other data sources are accessed through the Microsoft OLE DB Provider for ODBC. DTS can move data to and from ASCII fixed-field-length text files and ASCII-delimited text files using the provided Microsoft SQL Server DTS Flat File OLE DB Provider.
All tasks accomplished by DTS are stored in DTS packages. At run time, DTS opens a package and executes its tasks in turn, in the order defined by the package's task flow. DTS packages contain these types of objects:
- Connection objects, which define each source or destination OLE DB data source.
- Task objects, which define the specific actions to be performed (such as executing an SQL statement, copying the contents of a table, or executing an ActiveX script).
- Step objects, which define the sequence in which tasks are to be performed. (Steps also define whether the execution of one task is dependent on the results of a prior task. For example, step 2 can be defined with an On Success precedence constraint for step 1, so that step 2 is executed only if step 1 completes successfully.)
DTS packages can be persisted in DTS COM-structured storage files, in the Microsoft SQL Server msdb database, or in the Microsoft Repository. File-persisted DTS packages offer the greatest flexibility for sharing between SQL Server computers; they are easily distributed by e-mail or through a file server share. Packages saved in the SQL Server msdb database are easily accessible to applications connected to that server, and because they carry less overhead than packages saved to the Repository, they are a good choice when performance matters. Packages saved in the Repository are also easily accessible to applications connected to that SQL Server installation, and they additionally support exposing package metadata to applications and recording data lineage. Packages saved to the Microsoft Repository and to SQL Server storage are both kept in the msdb database, but in separate sets of system tables.
DTS packages are assigned version numbers. Versioning allows changes to packages to be tracked and also indicates which version of a package performed a specific transformation. DTS packages saved in Data Transformation Services files can be encrypted to secure user IDs and passwords.
To create complex packages involving multiple data sources and numerous tasks, the Microsoft SQL Server Enterprise Manager hosts a graphical DTS Package Designer. Using the package designer, or by writing to the DTS COM interfaces, one can accomplish many advanced tasks, including:
- Data Driven Queries, using SELECT, INSERT, UPDATE, DELETE SQL statements.
- Custom Data Transformation.
- Execute Windows NT Process Task.
- Execute SQL Task, whether dynamic SQL or stored procedure.
- Bulk Insert Task, using the Microsoft SQL Server bulk copy interface.
- Set Tasks within Steps.
- Schedule Package Execution.
- Execute prepared ActiveX script.
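One of the tasks above, the Bulk Insert Task, exercises the same fast-load path exposed by the Transact-SQL BULK INSERT statement. The sketch below shows that statement directly; the table name, file path, and file layout are illustrative assumptions.

```sql
-- Load a comma-delimited text file into a staging table.
-- Database, table, and file names are illustrative.
BULK INSERT Staging.dbo.QuarterlySales
FROM 'C:\data\sales.txt'
WITH (
    FIELDTERMINATOR = ',',    -- column delimiter in the text file
    ROWTERMINATOR   = '\n',   -- one record per line
    FIRSTROW        = 2       -- skip the header row
)
```

Within a DTS package, the equivalent work is defined as a Bulk Insert Task object and sequenced by step objects like any other task.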
DTS offers the end user intuitive wizards with which to get started moving data between two data sources. The DTS Import and DTS Export wizards simplify and guide the user through the steps required to define DTS packages. The wizards assist with data transformations by suggesting likely conversion schemes based on best matches of OLE DB data types. Custom data transformations are supported as well and can be selected by the user or programmatically selected by ActiveX script embedded in the DTS package. Using an ActiveX script, DTS can calculate new values from one or more source columns, or even break a single column into multiple values to be stored in separate destination columns.
The user, database administrator, and developer have many choices when it comes to executing DTS packages. One can execute packages directly from the DTS Import and DTS Export wizards, from the dtsrun command prompt utility, from a COM application by invoking the Execute method of the Package object, or from the Microsoft SQL Server Agent. DTS provides a set of OLE Automation interfaces and a set of COM interfaces that you can use to create customized import, export, and transformation applications in development systems that support OLE Automation or COM.
Today, enterprises need to give decision-makers access to mission-critical data in real time. In many cases, this data is stored in a variety of formats and in a variety of back-end stores. Using DTS, one can import and export data between multiple heterogeneous sources using an OLE DB-based architecture. Additionally, using DTS, one can pull data into Microsoft SQL Server and use the Microsoft OLAP Server to analyze complex information. Further, using DTS, one can create a local data warehouse on which to develop an online transaction processing (OLTP) system for Web commerce. Updates, inserts, and deletes to the data source can be made using DTS or Microsoft SQL Server Replication.
Replication
Replication is a powerful tool for distributing data across an enterprise. The replication technology in Microsoft SQL Server allows one to create duplicate copies of data and move the copies to non-SQL Server data sources, while synchronizing the data types and data values automatically.
The replication feature of Microsoft SQL Server 7.0 is based on the “publish and subscribe” model, consisting of the following objects:
- Publisher, which is a server that makes data available for replication to other servers, based on whether or not the data has changed since the last publication.
- Distributor, which is a server that contains the distribution database that stores the metadata and history.
- Subscribers, which are servers that store replicated data and receive updates.
- Articles, which contain the data and stored procedures to be published.
- Publications, which are collections of articles.
- Subscriptions, which define the publications a Subscriber receives.
The replication feature of Microsoft SQL Server 7.0 enables incremental data movement from SQL Server to other data sources. The Subscribers can be any of the many popular enterprise-wide SQL data sources accessible via an ODBC driver or OLE DB provider, for example, Oracle and DB2 databases.
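Heterogeneous Subscribers are registered with replication system stored procedures. The sketch below assumes an existing publication and an ODBC data source for the remote database; the subscriber name, publication name, and destination value are illustrative, and the exact parameters and type codes should be checked against the SQL Server 7.0 documentation.

```sql
-- Register an ODBC data source as a replication Subscriber and
-- subscribe it to an existing publication (names are illustrative).
EXEC sp_addsubscriber
    @subscriber = 'OracleSub',
    @type       = 1               -- assumed type code for an ODBC data source

EXEC sp_addsubscription
    @publication    = 'SalesPub', -- existing publication on the Publisher
    @subscriber     = 'OracleSub',
    @destination_db = 'OracleDSN' -- destination at the ODBC Subscriber
```

Once subscribed, the Distributor applies published changes to the non-SQL Server store through its ODBC driver.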
Conclusion
With the advent of such technologies as OLE DB and ADO, it is becoming increasingly easy to bridge the gaps between the islands of data that can exist in the enterprise. Developers are now able to build enterprise solutions that leverage the investments companies have made in data storage infrastructure over the past 20 years or so. Microsoft's commitment to developing technologies that simplify interoperability focuses on providing a consistent set of high-performance interfaces for easy access to the widest range of systems.