[3].SSIS Process Control
A significant advancement in SSIS is the package architecture designed for process control management. You've already learned that the SSIS process control architecture includes the control flow, data flow, and event handler components. Each of these process control components offers both common and unique sets of objects for you to use when designing and creating your packages.
SSIS Control Flow
SSIS package objects (containers, data flow tasks, administration tasks, precedence constraints, and variables) are elements of the control flow component of the process control architecture. The control flow is the highest-level control process. It allows you to orchestrate and manage the run-time activities of data flows and other processes within a package. In fact, you can design a control flow that uses Execute Package tasks to manage the processing sequence for a set of existing packages, following the master package concept. This capability allows you to combine individual packages into a highly manageable workflow process. Use precedence constraints to set the process rules and to specify the sequence within the control flow. An SSIS package consists of a control flow and one or more objects. The data flow and event handler process control components are optional.
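The way precedence constraints sequence tasks can be illustrated with a small conceptual model. The following Python sketch is not the SSIS object model: the ControlFlow class, its methods, and the task names are hypothetical and exist only for illustration. It assumes that each constraint links an upstream task to a downstream task and requires a success, failure, or completion outcome, and that a task is skipped when any of its incoming constraints is not satisfied.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Outcomes a constraint can require (modeled on success / failure / completion).
SUCCESS, FAILURE, COMPLETION = "success", "failure", "completion"
SKIPPED = "skipped"  # hypothetical marker for tasks whose constraints were not met


@dataclass
class ControlFlow:
    """Hypothetical stand-in for a package control flow."""
    tasks: Dict[str, Callable[[], None]] = field(default_factory=dict)
    # Each constraint is (upstream task, downstream task, required outcome).
    constraints: List[Tuple[str, str, str]] = field(default_factory=list)

    def add_task(self, name: str, action: Callable[[], None]) -> None:
        self.tasks[name] = action  # an action signals failure by raising

    def add_constraint(self, upstream: str, downstream: str, on: str = SUCCESS) -> None:
        self.constraints.append((upstream, downstream, on))

    def run(self) -> Dict[str, str]:
        results: Dict[str, str] = {}
        pending = list(self.tasks)
        progressed = True
        while pending and progressed:
            progressed = False
            for name in list(pending):
                incoming = [c for c in self.constraints if c[1] == name]
                if any(up not in results for up, _, _ in incoming):
                    continue  # an upstream task has not been decided yet
                pending.remove(name)
                progressed = True
                satisfied = all(
                    results[up] == on
                    or (on == COMPLETION and results[up] in (SUCCESS, FAILURE))
                    for up, _, on in incoming
                )
                if not satisfied:
                    results[name] = SKIPPED
                    continue
                try:
                    self.tasks[name]()
                    results[name] = SUCCESS
                except Exception:
                    results[name] = FAILURE
        return results


def failing_load() -> None:
    raise RuntimeError("destination unavailable")


log: List[str] = []
flow = ControlFlow()
flow.add_task("extract", lambda: log.append("extract"))
flow.add_task("load", failing_load)
flow.add_task("notify_failure", lambda: log.append("notify_failure"))
flow.add_task("archive", lambda: log.append("archive"))
flow.add_constraint("extract", "load", on=SUCCESS)
flow.add_constraint("load", "notify_failure", on=FAILURE)
flow.add_constraint("load", "archive", on=SUCCESS)
results = flow.run()
```

In this run the load task fails, so the failure branch executes and the archive task, which requires a successful load, is skipped. A real package would express the same rules by drawing precedence constraints between tasks in the designer rather than in code.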
SSIS Data Flow
When you want to extract, transform, and load data within a package, you add an SSIS data flow task to the package control flow. Each data flow task creates its own data flow process control component for processing at run time. You configure each data flow to manage data sources, data destinations, and optional data transformations for any kind of data manipulation your packages might require. You can have as many data flow components within a package as you need to handle all the kinds of data sources and destinations you might have.
The SSIS data flow component provides a comprehensive set of predefined data source and destination objects, so you can easily design and develop packages for most of the databases and data source files in your IT environment. You can add custom data sources if you need them. Data destinations allow you to deliver data from a data flow process in a variety of formats. An SSIS package can even provide data directly to an application by exposing it through an ADO.NET DataReader destination object. With this type of destination, you don't have to place the data in a persistent data store, and you can design application integrations that enable near real-time data delivery.
SSIS data flow also provides a set of data transformation task objects. These transformation tasks are designed to cover most, if not all, of the data conversion, manipulation, standardization, merging, splitting, fuzzy matching, and other transformations you are likely to need, without requiring you to write complicated programming code. You will learn about many of these transformation tasks, data sources, and destination objects later, in Part II of this book, "Designing Packages."
SSIS Data Pipeline
The SSIS data flow process control component and its tasks are processed by the data flow engine within SSIS. A key feature of the SSIS data flow engine is the data pipeline, shown in Figure 1-1, which uses memory buffers to improve processing performance. The data pipeline enables parallel data processing options and reduces or eliminates multiple passes of reading and writing of the data during package execution and processing. This level of efficiency means you can process significantly more data in shorter periods of time than is possible if you rely simply on stored procedures for your ETL processes.
SSIS packages achieve maximum data processing performance because the data pipeline uses buffers to manipulate data in memory. Source data, whether it's relational, structured as XML, or stored in flat files such as spreadsheets or comma-delimited text files, is converted into table-like structures of columns and rows and loaded directly into memory buffers, without first being staged in temporary tables. Transformations within a data flow operate on the buffered data in memory, sorting, merging, modifying, and enhancing it before sending it to the next transformation or on to its final destination. By avoiding the overhead of re-reading from and writing to disk, the processes required to move and manipulate data can operate at optimal speed.
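The buffer-based idea can be sketched in a few lines of Python. This is only a conceptual illustration, not how the SSIS data flow engine is implemented: the real engine manages its own typed buffers and can run work in parallel, whereas this sketch is single-threaded. The function names, the column names, and the sample data are all hypothetical. What it shows is a source that converts delimited text into rows, groups them into fixed-size in-memory buffers, and hands each buffer from one transformation to the next without staging anything on disk.

```python
import csv
import io
from itertools import islice
from typing import Dict, Iterable, Iterator, List, TextIO

Row = Dict[str, str]
Buffer = List[Row]  # one in-memory buffer: a small batch of rows


def read_buffers(source: TextIO, buffer_rows: int) -> Iterator[Buffer]:
    """Source: turn delimited text into rows and emit them one buffer at a time."""
    reader = csv.DictReader(source)
    while True:
        buffer = list(islice(reader, buffer_rows))
        if not buffer:
            return
        yield buffer


def standardize_city(buffers: Iterable[Buffer]) -> Iterator[Buffer]:
    """Transformation: clean a column in place, then pass the same buffer on."""
    for buffer in buffers:
        for row in buffer:
            row["city"] = row["city"].strip().title()
        yield buffer


def add_total(buffers: Iterable[Buffer]) -> Iterator[Buffer]:
    """Transformation: derive a new column from existing ones."""
    for buffer in buffers:
        for row in buffer:
            row["total"] = str(int(row["qty"]) * int(row["unit_price"]))
        yield buffer


def load(buffers: Iterable[Buffer], destination: List[Row]) -> int:
    """Destination: append each buffer to the target and count the rows."""
    count = 0
    for buffer in buffers:
        destination.extend(buffer)
        count += len(buffer)
    return count


# Hypothetical sample data standing in for a comma-delimited flat file.
SOURCE = "city,qty,unit_price\n seattle ,2,10\nNEW YORK,1,25\nportland,3,5\n"

destination: List[Row] = []
pipeline = add_total(standardize_city(read_buffers(io.StringIO(SOURCE), buffer_rows=2)))
loaded = load(pipeline, destination)
```

Each row is read once and written once. Every step in between works on the buffer that is already in memory, which is the property the data pipeline relies on for its performance.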
SSIS Event Handler
The event handler process control, unlike the data flow process control, is not managed by the control flow. When you want to control processing when specific events occur during package execution, you use the SSIS event handler process control component. An event handler runs in response to an event raised by the package or by a task or container within the package. Typically, event handlers are created in a package to perform special processing in response to data anomalies, to trigger other programs, or to launch other packages based on the event state within the running package. For example, you can create an event handler that sends an e-mail alert when a task or package succeeds, fails, or simply completes.
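A minimal Python sketch can make the event handler idea concrete. It is a conceptual model only: the Executable class and its methods are hypothetical, and the alert is appended to a list instead of being sent by e-mail. The event names OnPreExecute, OnError, and OnPostExecute are borrowed from SSIS. The sketch also assumes that an event raised by a task reaches handlers attached to the enclosing package, which is taken here to match the default SSIS behavior.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List, Optional

Handler = Callable[["Executable", str], None]


class Executable:
    """Hypothetical stand-in for a package, container, or task."""

    def __init__(self, name: str, parent: Optional["Executable"] = None) -> None:
        self.name = name
        self.parent = parent
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def on(self, event: str, handler: Handler) -> None:
        """Attach a handler for one event to this executable."""
        self._handlers[event].append(handler)

    def raise_event(self, event: str, source: Optional["Executable"] = None) -> None:
        """Run local handlers, then pass the event up to the parent (assumed behavior)."""
        source = source or self
        for handler in self._handlers.get(event, []):
            handler(source, event)
        if self.parent is not None:
            self.parent.raise_event(event, source)

    def execute(self, action: Callable[[], None]) -> bool:
        """Run the action, raising events before, on error, and after."""
        self.raise_event("OnPreExecute")
        try:
            action()
            succeeded = True
        except Exception:
            self.raise_event("OnError")
            succeeded = False
        self.raise_event("OnPostExecute")
        return succeeded


alerts: List[str] = []  # stands in for an outgoing e-mail queue

package = Executable("LoadSales")
task = Executable("Load fact table", parent=package)

# Handlers live on the package but respond to events raised by its tasks.
package.on("OnError", lambda src, ev: alerts.append(f"ALERT: {src.name} raised {ev}"))
package.on("OnPostExecute", lambda src, ev: alerts.append(f"{src.name} completed"))


def broken_load() -> None:
    raise ValueError("bad row")


ok = task.execute(broken_load)
```

The handlers here are registered separately from the work the task performs, which mirrors the point that event handlers sit outside the control flow and run only when their event occurs.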
You will learn more about SSIS package architecture and its objects and process control components later, in Part II of this book.
Figure 1-1: The SSIS data flow data pipeline