Module 9: Processing Dimensions and Cubes

Chia sẻ: Vu Trung | Ngày: | Loại File: PDF | Số trang:64

0
55
lượt xem
3

Module 9: Processing Dimensions and Cubes

Mô tả tài liệu

Tham khảo tài liệu 'module 9: processing dimensions and cubes', công nghệ thông tin, quản trị mạng phục vụ nhu cầu học tập, nghiên cứu và làm việc hiệu quả

Chủ đề:

Bình luận(0)

Lưu

Nội dung Text: Module 9: Processing Dimensions and Cubes

1. Module 9: Processing Dimensions and Cubes Contents Overview 1 Introducing Dimension and Cube Processing 2 Processing Dimensions 5 Processing Cubes 15 Lab A: Processing Dimensions and Cubes 27 Lab B: Updating Dimension Data 30 Lab C: Incrementally Updating Cubes 36 Optimizing Cube Processing 41 Troubleshooting Cube Processing 49 Review 54
3. Module 9: Processing Dimensions and Cubes iii Instructor Notes Presentation: Multidimensional online analytical processing (OLAP) databases include 75 Minutes schema and data, both of which change and need updating from the relational database management system (RDBMS) on a regular basis. In this module, Lab: students learn to manage dimension and cube processing with Microsoft® SQL 75 Minutes Server™ 2000 Analysis Services. After completing this module, students will be able to: Understand the difference between OLAP schema and data. ! Process dimensions. ! Perform the three types of cube processes. ! Optimize cube processing. ! Troubleshoot cube processing. ! Materials and Preparation This section lists the required materials and preparation tasks that you need to teach this module. Required Materials To teach this module, you need Microsoft PowerPoint® file 2074A_09.ppt ! Preparation Tasks To prepare for this module, you should: Read all the student materials. ! Read the instructor notes and margin notes. ! Complete all the demonstrations. ! Practice the lecture presentation and demonstration. ! Complete the lab. ! Review the Trainer Preparation presentation for this module on the Trainer ! Materials compact disc. Review any relevant white papers that are located on the Trainer Materials ! compact disc. BETA MATERIALS FOR MICROSOFT CERTIFIED TRAINER PREPARATION PURPOSES ONLY
4. iv Module 9: Processing Dimensions and Cubes Demonstration: Rebuilding the State Dimension The following demonstration procedures provide information that will not fit in Demonstration: the margin notes or is not appropriate for student notes. 10 Minutes In this demonstration, you will rebuild the State dimension and will show students the impacts to the Sales Units cube after the rebuild. ! To restore a new database and define a data source 1. In Analysis Manager, right-click the server, click Restore Database, click the Look in list, click the file C:\Moc\2074A\Labfiles\L09\Module 09.CAB, click Open, click Restore, and then click Close. 2. Double-click Module 09 to expand the database. 3. Below Module 09, double-click the Data Sources folder, right-click the Module 09 data source, and then click Edit. 4. Click the Connection tab of the Data Link Properties dialog box, and then verify that localhost is selected in step 1. 5. In step 2, verify that Use Windows NT Integrated security is selected. 6. In step 3, verify that Module 09 is selected. 7. Click Test Connection and verify that the test succeeded. Click OK twice. ! To browse the Sales Units cube 1. In the Module 09 database, browse the Sales Units cube. 2. Show students that the cube contains two dimensions—State and Time— and one measure—Sales Units. 3. Point out that the cube is processed and able to accept queries. ! To rebuild the State dimension 1. In the Module 09 database Shared Dimensions folder, right-click the State dimension, and then click Process. The Process a Dimension dialog box opens. Here you see two options for processing a dimension—incrementally updating the dimension and rebuilding the dimension. 2. Click Rebuild the dimension structure, and then click OK. The Process dialog box opens and steps through the dimension rebuild. 3. After the dimension finishes processing, double-click the line in the Process dialog box that begins with a yellow SQL icon. Here you see the SQL that executes to rebuild the State dimension: SELECT DISTINCT "dbo"."State"."Country", "dbo"."State"."Region", "dbo"."State"."STATE_ID", "dbo"."State"."State_Name" FROM "dbo"."State" 4. Close the View Trace Line window, and then click Close. BETA MATERIALS FOR MICROSOFT CERTIFIED TRAINER PREPARATION PURPOSES ONLY
5. Module 9: Processing Dimensions and Cubes v ! To attempt to browse the Sales Units cube 1. In the Module 09 database, right-click the Sales Units cube, and then click Browse Data. 2. Notice that you receive the following error: “Unable to browse the cube ‘Sales Units’. Cube not processed. To browse sample data for this cube, open Cube Editor, and then on the View menu, click Data.” You did not change the State source data, and therefore the dimension did not require a rebuild. However, if the dimension is rebuilt, the Sales Units cube is unavailable. After you rebuild a shared dimension, all cubes containing the shared dimension are unavailable for user access. 3. Click Close. BETA MATERIALS FOR MICROSOFT CERTIFIED TRAINER PREPARATION PURPOSES ONLY
6. vi Module 9: Processing Dimensions and Cubes Other Activities Difficult Questions Below are difficult questions that students may ask you during the delivery of this module and answers to the questions. These materials delve into subjects that are within the scope of the module but are not specifically addressed in the content of the student notes. 1. How are incremental updates useful if the Analysis Server is unaware of data warehouse loads and it therefore does not know what data to load from a table? You can create a column in the fact table that has an identifying record, such as a time stamp or a batch number. You can include the unique record in the WHERE clause of the associated partition definition and filter records based on that identifier. 2. Are multiple partitions of a cube processed sequentially, even if they reside on different computers? Yes, unless you specify that you want certain partitions to be built in parallel. You can do this by using Decision Support Objects (DSO). BETA MATERIALS FOR MICROSOFT CERTIFIED TRAINER PREPARATION PURPOSES ONLY
7. Module 9: Processing Dimensions and Cubes vii Module Strategy Use the following strategy to present this module: Introducing Dimension and Cube Processing ! Define the term processing and explain that Analysis Server creates SQL statements to extract information from the data source. Next, define the terms schema and data as they are used in Analysis Services. Processing Dimensions ! Explain to students that dimensions must be processed when they are first designed and whenever there are changes or updates in the source dimension tables. Describe the interface and the steps to follow to process a shared dimension. Explain that there are two ways to process a shared dimension—a rebuild or an incremental update. Describe when to use each type of processing and the implications of doing so. Explain that private dimensions are created and manipulated in single cubes, and that to process a private dimension, the entire cube must be processed. Finally, define relational OLAP (ROLAP) dimensions and changing dimensions and explain when to use them. Processing Cubes ! Begin with an explanation of when to process a cube and how to get to the Process a Cube dialog box. Describe the three options available in the dialog box—full process, refresh, and incremental update. Describe when to use each option and the implications of doing so. Introduce the Incremental Update Wizard and the steps involved in performing an incremental update—specifying the data source and specifying the filter expression. Explain what happens behind the scenes when using a filter. Discuss available properties that affect cube and dimension processing. Optimizing Cube Processing ! Explain to students that, while query performance is obviously a high priority when using OLAP cubes, processing time is also important. Describe several ways to improve cube processing performance— optimizing the data source, optimizing the cube schema, optimizing cube design, and optimizing Analysis Server. Explain how to perform each of these types of optimization. Troubleshooting Cube Processing ! Describe the three most common cube problems related to processing— missing data, processing errors, and insufficient memory and disk space. Give tips and techniques for solving each type of problem. BETA MATERIALS FOR MICROSOFT CERTIFIED TRAINER PREPARATION PURPOSES ONLY
8. Module 9: Processing Dimensions and Cubes 1 Overview Topic Objective To provide an overview of the module topics and objectives. Introducing Dimension and Cube Processing ! Lead-in Processing Dimensions In this module, you will learn ! about dimension and cube Processing Cubes processing and the various ! ways to perform processes. Optimizing Cube Processing ! Troubleshooting Cube Processing ! Multidimensional online analytical processing (OLAP) databases include schema and data, both of which change and need updating from the relational database management system (RDBMS) on a regular basis. In this module, you will learn to manage Analysis Server dimension and cube processing with Microsoft® SQL Server™ 2000 Analysis Services. After completing this module, you will be able to: Understand the difference between OLAP schema and data. ! Process dimensions. ! Perform the three types of cube processes. ! Optimize cube processing. ! Troubleshoot cube processing. ! BETA MATERIALS FOR MICROSOFT CERTIFIED TRAINER PREPARATION PURPOSES ONLY
9. 2 Module 9: Processing Dimensions and Cubes # Introducing Dimension and Cube Processing Topic Objective To introduce the concept of cube and dimension Definition of Processing processing. ! Lead-in Overview of Schema and Data ! In this section, you are introduced to dimension and cube processing, and will learn the difference between OLAP schema and OLAP data. In this section, you are introduced to dimension and cube processing, and will learn the difference between OLAP schema and OLAP data. BETA MATERIALS FOR MICROSOFT CERTIFIED TRAINER PREPARATION PURPOSES ONLY
10. Module 9: Processing Dimensions and Cubes 3 Definition of Processing Topic Objective To define the term processing, as used in Processing Must Occur Prior to Users Querying the Analysis Services. ! Cube Lead-in Before users can access Dimension Processing Loads Dimension Data ! data from a cube, some form of processing must Cube Processing Loads Cube Data and Creates ! occur in the cube. Aggregations Processing Uses SQL Queries to Populate Dimension ! and Cube Data Before users can access data from an OLAP cube, some form of processing must occur in the cube. In Analysis Services, the term processing means loading dimensions and cube data from the RDBMS data source. When you process dimensions and cubes, Analysis Server creates SQL statements to extract the necessary information from the data warehouse dimension and fact tables. In addition, Analysis Server creates any aggregations in the cube that were designed previously. BETA MATERIALS FOR MICROSOFT CERTIFIED TRAINER PREPARATION PURPOSES ONLY
11. 4 Module 9: Processing Dimensions and Cubes Overview of Schema and Data Topic Objective To describe the differences between OLAP schema and OLAP Schema OLAP data. ! Lead-in Consists of dimensional hierarchies and members $It is important to understand the difference between the Comes from dimension tables$ terms schema and data as Is commonly referred to as OLAP metadata they are used in Analysis $Services. OLAP Data ! Consists of numeric information$ Comes from fact table $To master various dimension and data load processes, it is important to understand the difference between schema and data. Definitions of the two terms as used in Analysis Services may differ from what you are accustomed to when using other databases and technologies. OLAP Schema Unlike relational databases, where member names are considered data, member names in Analysis Services are considered part of the schema. The following is true of OLAP schema: The schema consists of the dimensional hierarchies and the members that ! populate these hierarchies. OLAP schema comes from the dimension tables in the source RDBMS. ! Member names, for example Quarter 1, Canada, and Bread, are ! commonly referred to as OLAP metadata in multidimensional databases. OLAP Data The data in Analysis Services is the numeric information, such as units sold, prices, inventory levels, or dollar revenues. The following is true of OLAP data: This numeric information, or data, is commonly referred to as the measures. ! OLAP data comes from the fact table in the source RDBMS. ! Understanding the distinction between schema and data is important for database administrators and OLAP architects. BETA MATERIALS FOR MICROSOFT CERTIFIED TRAINER PREPARATION PURPOSES ONLY 12. Module 9: Processing Dimensions and Cubes 5 # Processing Dimensions Topic Objective To introduce dimension processing. Dimension Processing Overview ! Lead-in In this section, you will learn Rebuilding Dimensions ! about dimension processing and the various ways of Incrementally Updating a Dimension ! processing dimensions. Processing Private Dimensions ! Understanding ROLAP and Changing Dimensions ! A cube consists of one or more dimensions combined with one or more measures. The dimensions form the structure or organization for the data values in the cube. Before the Analysis Server can process a cube, it must have already processed each dimension that is used in the cube. In this section, you will learn about dimension processing and the various ways to process dimensions. BETA MATERIALS FOR MICROSOFT CERTIFIED TRAINER PREPARATION PURPOSES ONLY 13. 6 Module 9: Processing Dimensions and Cubes Dimension Processing Overview Topic Objective Purpose of Dimension Processing To explain the purpose and ! mechanics of dimension Creates a new dimension$ processing. Lead-in Maintains an existing dimension $Before data can be loaded Shared Dimension Processing Mechanics into a cube, you must first ! create and process the Right-click a single dimension dimensions of the cube.$ Right-click the Shared Dimensions folder $The Process Dialog Box ! Opens at time of dimension process$ Contains dimension processing information $Before data can be loaded into a cube, you must first create and process the Key Point dimensions of the cube. Once you process the dimensions, the Analysis Server Analysis Server creates can load data from the fact table into the cube and can create aggregations. multidimensional OLAP (MOLAP) dimension Purpose of Dimension Processing structures in the Data folder located on the server, You process a dimension when you first create it, when you modify its except in those cases when structure, and when data updates occur in the source data dimension tables. you use ROLAP dimensions. Non-ROLAP After the initial dimension process, you must maintain dimensions on an dimensional structures are ongoing basis to reflect changes in the underlying dimension tables—for created on the server for example, new products are added to the product line, sales representatives cubes that use the MOLAP, change sales regions, and so on. Dimensions must reflect the changes in the hybrid OLAP (HOLAP), and business structure. relational OLAP (ROLAP) storage modes. Two methods are available to the Analysis Server developer for processing shared dimensions: Rebuild the dimension structure completely rebuilds and constructs a Do not spend too much time ! dimensional structure. here describing dimension rebuilds and incremental Incremental update imports new members into an existing dimensional ! updates because they are structure. discussed in detail later in this section. Analysis Server creates multidimensional OLAP (MOLAP) dimension structures in the Data folder located on the server, except in those cases when you use ROLAP dimensions. Non-ROLAP dimensional structures are created on the server for cubes that use the MOLAP, hybrid OLAP (HOLAP), and relational OLAP (ROLAP) storage modes. BETA MATERIALS FOR MICROSOFT CERTIFIED TRAINER PREPARATION PURPOSES ONLY 14. Module 9: Processing Dimensions and Cubes 7 Shared Dimension Processing Mechanics You can initiate the processing of a shared dimension in either of two ways: Expand the Shared Dimensions folder, right-click the dimension, and then ! click Process to display the Process a Dimension dialog box. This interface lets you choose between an incremental update and a dimension structure rebuild. Right-click the Shared Dimensions folder and click Process All ! Dimensions. There is no choice about which type of process will occur when you choose this option. The Analysis Server determines the type of process applied to dimensions by determining which dimensions contain structure changes and which dimensions maintain their original structure. The Process Dialog Box When you process a dimension in Analysis Manager, a Process dialog box opens that steps through each of the processing phases. The dialog contains the following information that you can use to troubleshoot errors or determine the success or failure of the process: Start time, end time, and process duration. ! Number of RDBMS rows processed. ! Initialization and committal information. ! SQL statements performed on the RDBMS to access member and ! dimension data. Information on the success or failure of the dimension build. ! BETA MATERIALS FOR MICROSOFT CERTIFIED TRAINER PREPARATION PURPOSES ONLY 15. 8 Module 9: Processing Dimensions and Cubes Rebuilding Dimensions Topic Objective Situations in which to Rebuild a Dimension ! To describe the process of rebuilding a dimension. A level is added or deleted$ Lead-in A member is deleted $The Rebuild the dimension structure option A member is renamed$ entirely recreates the dimension structure. It is the A child is moved from one parent to another $most comprehensive Implications of Rebuilding a Dimension method for processing ! dimensions. Dimensional structure is rebuilt$ Cubes become unavailable to clients until the cubes are $reprocessed Rebuilding takes time$ The Rebuild the dimension structure option in the Process a Dimension dialog box entirely recreates the dimension structure. It is the most comprehensive method for processing dimensions. Situations in which to Rebuild a Dimension The following situations require a rebuild of the dimension structure to reflect the structural dimension changes: Adding or deleting a level. ! Deleting a member. ! Renaming a member. ! Moving a child from one parent to another. This movement is commonly ! called re-parenting. For example, if you move a product from one business unit to another, the dimensional structure must reflect the change. Implications of Rebuilding a Dimension Rebuilding a dimension is not a minor action. The following are implications of rebuilding a shared dimension: The Analysis Server erases and rebuilds the dimensional structure that is ! stored on the Analysis Server. Cubes that depend on the given dimension are unavailable to clients until ! the cubes are reprocessed. In other words, clients cannot connect to cubes that depend on a rebuilt dimension until the cubes are reprocessed. The rebuilding of the dimensional structure can be a time-consuming ! process if the dimension contains an abundance of members. BETA MATERIALS FOR MICROSOFT CERTIFIED TRAINER PREPARATION PURPOSES ONLY
16. Module 9: Processing Dimensions and Cubes 9 Incrementally Updating a Dimension Topic Objective Situations in which to Incrementally Update a Dimension To describe dimension ! incremental updates. Members are added $Lead-in Member properties are added Incremental updates allow$ you to add members and Implications of Incrementally Updating a Dimension member properties to a ! shared dimension when no Cubes remain available structural changes have $occurred. New members appear$ Existing members remain intact $New associated fact table data requires a cube update$ The other option of the Process a Dimension dialog box is Incremental update. This option allows you to add members and member properties to a shared dimension when no structural changes have occurred. Incremental updates do not force a reprocessing of the cube. Therefore, perform incremental updates of dimensions instead of dimension rebuilds when possible. Situations in which to Incrementally Update a Dimension You can perform incremental updates in the following situations: Adding a member to a dimension. ! Adding a member property to a dimension. ! Implications of Incrementally Updating a Dimension Though the implications are not as severe as dimension rebuilds, it is still important to understand what happens when an incremental update occurs: Cubes that use the given dimension are available to clients during this ! process. The dimension hierarchy reflects the member updates when the incremental ! update is complete. Existing members are left intact. ! New fact table data associated with the new members requires an ! incremental cube update. Note When you process a cube, you have the option to incrementally update BETA MATERIALS FOR MICROSOFT CERTIFIED TRAINER PREPARATION PURPOSES ONLY
17. 10 Module 9: Processing Dimensions and Cubes all shared dimensions found in the cube. You do this by selecting the Incrementally update the shared dimensions used in this cube check box found in the Process a Cube dialog box. BETA MATERIALS FOR MICROSOFT CERTIFIED TRAINER PREPARATION PURPOSES ONLY
18. Module 9: Processing Dimensions and Cubes 11 Processing Private Dimensions Topic Objective To discuss the processing of private dimensions. Exist in Single Cubes ! Lead-in In this section, you will learn Cannot Be Processed Independently ! about private dimension processing. Are Processed When The Cube Is Processed ! Incremental update of cube $Refresh of cube$ Full process of cube $You create and manipulate private dimensions in single cubes. Because they exist in single cubes, you do not affect other cubes by processing private dimensions. Therefore, you can isolate cube-processing needs by defining dimensions as private. You cannot process private dimensions without also processing the cubes in which they reside, and no options or commands exist that process private dimensions independently. There are two methods of processing private dimensions: Perform an incremental update or a refresh of a cube. An incremental ! update or a refresh of a cube performs an incremental update of all private dimensions in the cube. In other words, an incremental update or a refresh of a cube adds new members and member properties in private dimensions in the cube. Perform a full process of a cube. A full process of a cube performs a rebuild ! of all private dimensions in the cube. In other words, a full process of a cube updates the structure and all members in private dimensions in the cube. Note For more information on cube refreshes and full processes, refer to the next section in this module, Processing Cubes. BETA MATERIALS FOR MICROSOFT CERTIFIED TRAINER PREPARATION PURPOSES ONLY 19. 12 Module 9: Processing Dimensions and Cubes Understanding ROLAP and Changing Dimensions Topic Objective ROLAP Dimensions ! To discuss ROLAP and Store dimension data in dimension tables$ changing dimensions. Can support tens of millions of members Lead-in $You can define dimensions Force you to define their cubes as ROLAP cubes$ as being ROLAP or changing. Require SQL Server 2000 Enterprise Edition $Are also considered changing dimensions$ Changing Dimensions ! Are optimized for frequent data source changes $Permit more types of changes with incremental updates$ Include virtual, parent-child, and ROLAP dimensions $Are set within the Dimension Editor$ When you create a dimension, its structure is stored, by default, in multidimensional structures on the Analysis Server computer. You maintain its dimensional structure based on the processing rules presented in the preceding pages of this section. These dimensions are considered MOLAP dimensions. Alternatively, you can define dimensions as ROLAP dimensions to allow for millions of members in your dimension, or you can create changing dimensions to allow for more frequent changes to the dimension structure. ROLAP Dimensions A ROLAP dimension’s data is stored in the dimension table or tables. Therefore, there is no need to build the multidimensional dimension structures that MOLAP dimensions create on the Analysis Server. By defining a dimension as ROLAP, you can maintain a dimension with tens of millions of members. If you attempt to store tens of millions of members in a MOLAP dimension, you receive errors when you process the dimension. Because of the query performance degradation you receive when using ROLAP dimensions, you usually define a dimension as ROLAP only if the dimension contains millions of members. When you add a ROLAP dimension to a cube, the cube must use the ROLAP option for its cube storage mode. You are not given the option to define the cube as MOLAP or HOLAP in the Storage Design Wizard. ROLAP dimensions require SQL Server 2000 Enterprise Edition. All dimensions that use the ROLAP storage mode are also considered to be changing dimensions. BETA MATERIALS FOR MICROSOFT CERTIFIED TRAINER PREPARATION PURPOSES ONLY