# Module 8: Managing Storage and Optimization

Shared by: Vu Trung | Date: | File type: PDF | Pages: 52



Document description

Reference material "Module 8: Managing Storage and Optimization" in the information technology and network administration category, intended to support effective study, research, and work.


## Text Content: Module 8: Managing Storage and Optimization

### Contents

- Overview
- Analysis Server Cube Storage
- The Storage Design Wizard
- Analysis Server Aggregations
- Lab A: Designing Storage for Sales
- Usage-Based Optimization
- Lab B: Implementing Usage-Based Optimization
- Optimization Tuning
- Review
### Instructor Notes

- Presentation: 70 minutes
- Labs: 20 minutes

This module provides students with a comprehensive understanding of Microsoft® SQL Server™ Analysis Services storage options and optimization techniques for online analytical processing (OLAP) cubes. The characteristics of the three storage modes, multidimensional OLAP (MOLAP), relational OLAP (ROLAP), and hybrid OLAP (HOLAP), are reviewed in detail, followed by an overview of aggregations. The module then takes students through the Storage Design Wizard, with discussion of specific aggregation options and further discussion of the contents of aggregations and design guidelines. The module concludes with a review of usage-based optimization and general optimization tuning techniques.

There are two labs in the module. In lab A, students create a storage design and process a cube by using the Storage Design Wizard. In lab B, students learn the interfaces and mechanics of usage-based optimization.

After completing this module, students will be able to:

- Explain the advantages and disadvantages of the three data storage modes.
- Use the Storage Design Wizard to set storage design.
- Describe how aggregations work and design aggregations for cubes.
- Describe the concepts and mechanics of usage-based optimization.
- Override aggregation settings per dimension.

#### Materials and Preparation

This section lists the required materials and preparation tasks that you need to teach this module.

Required Materials

To teach this module, you need the following materials:

- Microsoft PowerPoint® file 2074A_08.ppt

Preparation Tasks

To prepare for this module, you should:

- Read all the student materials.
- Read all the instructor notes and margin notes.
- Practice the lecture presentation and demonstration.
- Complete the labs.
- Review the Trainer Preparation presentation for this module on the Trainer Materials compact disc.
- Review any relevant white papers that are located on the Trainer Materials compact disc.

BETA MATERIALS FOR MICROSOFT CERTIFIED TRAINER PREPARATION PURPOSES ONLY
#### Demonstration: Designing Storage for the Sales Cube

Demonstration: 10 minutes

In this demonstration, you will learn how to create a storage design by using the Storage Design Wizard. The following demonstration procedures contain information that does not fit in the margin notes or is not appropriate for student notes.

To restore a new database and define a data source

1. In Analysis Manager, right-click the server, click Restore Database, click the Look in list, find and click the file C:\Moc\2074A\Labfiles\L08\Module 08.CAB, click Open, and then click Restore.
2. Click Close, and then double-click Module 08 to expand the database.
3. Below Module 08, double-click Data Sources, right-click the Module 08 data source, and then click Edit.
4. Click the Connection tab of the Data Link Properties dialog box, and then verify that localhost is selected in step 1.
5. In step 2, verify that Use Windows NT Integrated security is selected.
6. In step 3, verify that Module 08 is selected.
7. Click Test Connection and verify that the test succeeded. Click OK twice.

To specify storage type

1. In the Module 08 database, right-click the Sales cube and click Design Storage.
2. Click Next to skip the welcome page.
3. From the Select the type of data storage step, click MOLAP, and then click Next.

To design aggregations

1. In the Set aggregation options step, click Performance gain reaches from the Aggregation options pane.
2. Type 20 in the percent box for Performance gain reaches to reflect a 20-percent aggregation target. For the Sales cube, the default value of 50 is unnecessarily high.
3. Click Start to initiate the graphical simulation of Performance vs. Size on the Set aggregation options step, and then click Next.
To process the cube

1. In the Finish the Storage Design Wizard step, with the Process now option selected, click Finish.

   Regardless of the processing option you choose, Analysis Manager stores the definition of the aggregations in the OLAP repository. Storing the definition of the aggregations is different from physically creating them, however. The Storage Design Wizard designs aggregations but does not create them. The Analysis Server does not create aggregations until you process the cube. Processing the cube automatically creates any aggregations that have been designed.
2. Close the Process dialog box when processing is complete.

To examine the metadata

1. In Analysis Manager, click the Sales cube in the Analysis Manager tree pane, and then click Metadata in the right details pane.
2. Scroll down and notice the process and storage mode statistics.
#### Other Activities

Difficult Questions

Below are difficult questions that students may ask you during the delivery of this module, with answers. These materials delve into subjects that are within the scope of the module but are not specifically addressed in the content of the student notes.

1. If ROLAP is slow to process and query, why would an organization use this option?

   ROLAP would be adopted if the organization needs a real-time OLAP solution, that is, one in which data is always updated with the current fact table values. In this scenario, an organization defines its cube as ROLAP with zero aggregations. All detail and aggregate data are calculated as users query the cube. While queries are slow, in some situations perfectly updated data is more important than fast query times.

2. Do Analysis Services MOLAP cubes have the "data explosion" problem common to OLAP solutions?

   MOLAP database engines in competing products often create cubes that grow exponentially from source files to fully calculated cubes. For example, a five-megabyte (MB) source file has been known to grow into a five-gigabyte (GB) cube after processing. The data explosion problem does not exist in Analysis Services MOLAP to the extent experienced with other OLAP products. In many cases, the MOLAP cube may be smaller than the data source. The following are the principal reasons for the MOLAP storage efficiency:

   - Analysis Services MOLAP cubes are completely dense in their data storage, that is, no null values are stored.
   - The Analysis Services query engine is highly optimized, calculating commonly accessed aggregations as the cube is queried so that fewer aggregations need to be precalculated and stored.
   - The Analysis Services data compression algorithms are highly efficient.

   Some multidimensional products presumably solve the data explosion problem by not using an OLAP engine, instead accessing data directly from a relational database. Such products classify themselves as ROLAP solutions because they access relational databases directly and give users a multidimensional view of the data, but do not create cubes that consume large amounts of storage. Such ROLAP solutions typically suffer in query performance compared to MOLAP solutions.
3. MOLAP cubes duplicate the detail data already stored in relational tables. How can MOLAP storage be more efficient than ROLAP cube storage? How can MOLAP be faster to process than ROLAP if Analysis Services brings all the detail data to the Analysis Server?

   Multidimensional structures and data are extremely compressed and optimized compared to the two-dimensional tables in relational databases. In addition, when ROLAP cubes are defined, indexes are automatically created in the relational database management system (RDBMS). Even though MOLAP cubes carry over the detail data, they can still be smaller than their ROLAP cube counterparts. The exception is a ROLAP cube that has few or no aggregations.

   From a processing standpoint, it may be faster to create a multidimensional structure than to create, insert, and update data in relational tables. It also may take a long time to build the indexes that are automatically created in ROLAP cubes. Again, the exception is when the ROLAP cube has few or no aggregations defined.

4. If processing time and disk space are not constraints, should aggregations be set to 100 percent for Performance gain reaches?

   If Performance gain reaches is set to 100 percent, not all of a cube's possible aggregations will necessarily be computed. The setting simply sets a target of a potential 100 percent increase in query performance. As a cube defines more aggregations, query performance improvements reach a point of diminishing returns. Some cubes may slow in their query performance if the aggregation percentage is set too high.

5. How does one estimate the size of a cube based on fact table size?

   You can estimate the data storage for MOLAP data on disk in bytes, assuming zero aggregations, by using the following formula:

   (((2 * total number of levels) + (4 * number of measures)) * number of records) / 3

6. Analysis Services has intelligent algorithms for determining the most optimized aggregation design. Why would you choose to override the dimension level aggregations?

   In most cases, you use the Storage Design Wizard and the Usage-Based Optimization Wizard to define aggregations. However, there may be exceptional situations in which you want to exercise control and override the wizard algorithms. For example, you might not want your cube to contain aggregations for the lowest level of the Product dimension, because users will not be accessing data at that level. Therefore, you have the ability to turn off aggregations for this level.
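The size estimate from question 5 is easy to try out numerically. The following is a minimal sketch of that formula only; the function name and the sample cube figures are hypothetical and chosen purely for illustration.

```python
def estimate_molap_bytes(total_levels: int, measures: int, records: int) -> float:
    """Estimate MOLAP data size on disk in bytes, assuming zero aggregations.

    Implements the formula quoted above:
    (((2 * levels) + (4 * measures)) * records) / 3
    """
    return ((2 * total_levels + 4 * measures) * records) / 3


# Hypothetical cube: 12 levels in total across all dimensions,
# 5 measures, and 1,000,000 fact table records.
size_bytes = estimate_molap_bytes(total_levels=12, measures=5, records=1_000_000)
print(f"Estimated size: {size_bytes / 1_000_000:.1f} million bytes")
```

Because the estimate assumes zero aggregations, it is best read as a lower bound on the data size before any aggregation design is applied.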
### Module Strategy

Use the following strategy to present this module:

- Analysis Server Cube Storage

  Deliver an overview discussion of general server storage issues and then talk through the characteristics of each of the three storage modes: MOLAP, ROLAP, and HOLAP. This discussion leads into a basic introduction to the concept of aggregations.

- The Storage Design Wizard

  The materials in this section can be delivered as lecture by using the slides or integrated with the demonstration Designing Storage for the Sales Cube. Because the demonstration essentially duplicates the materials in the student notes, it is recommended to integrate lecture and demonstration. The following table maps lecture topics to demonstration procedures.

  | Lecture Topic | Demonstration Procedure |
  | --- | --- |
  | Choose Storage Option | To specify storage type |
  | Set Aggregation Options | To design aggregations |
  | How Much Aggregation? | To design aggregations |
  | Estimated Storage Reaches | To design aggregations |
  | Performance Gain Reaches | To design aggregations |
  | Until I Click Stop | To design aggregations |
  | Finishing Up | To examine the metadata and browse the cube |

  There will be substantial interest and questions from students about the storage size and performance implications of choosing from each of the three storage methods. Do not rush through these materials or limit discussion and questions. Review the questions and answers in the previous Difficult Questions section.

  Be prepared to discuss elements of aggregation again, including the specific functioning of the three aggregation options: Estimated storage reaches, Performance gain reaches, and Until I click stop. These three choices represent different conceptual approaches and specific underlying algorithms for implementing aggregations.

- Analysis Server Aggregations

  The subject of aggregation is explored in more detail, including review of aggregation tables, general characteristics of aggregations, and details about ROLAP aggregations. Because students must thoroughly understand the concepts and wizard implementation of aggregations, the subject is approached repeatedly in this module at increasing levels of detail and sophistication.

  Lab A follows the aggregation details section. Students now have an opportunity to create storage designs of their own based on MOLAP and ROLAP storage modes. The lab essentially replicates the steps performed in the demonstration.
- Usage-Based Optimization

  This section presents an important feature set of Analysis Services: usage-based optimization. Your lecture, following the materials in the student notes, takes students through the simple Usage-Based Optimization Wizard. Be prepared to answer detailed questions about how each of the five query options works by itself and in conjunction with the others.

  The section is followed by lab B, Implementing Usage-Based Optimization, which can be conducted as a hands-on exercise with students following your demonstration. The lab allows students to perform their own usage-based optimizations.

- Optimization Tuning

  You complete the module with a discussion of specific optimization tuning methods, including how to override dimension and level settings. No exercises or labs are included in this section. However, you should switch to Analysis Manager to show the settings and to briefly explain their functions.
### Overview

Slide topics:

- Analysis Server Cube Storage
- The Storage Design Wizard
- Analysis Server Aggregations
- Usage-Based Optimization
- Optimization Tuning

Topic Objective: To provide an overview of the module topics and objectives.

Lead-in: In this module, you will learn about aggregation design and storage modes, which are the key factors in enabling fast query response times.

Query performance, meaning how long it takes a user to access requested information, is a primary determining factor for online analytical processing (OLAP) cube storage design. An optimal design produces fast queries for users while maintaining reasonable cube processing times. Designing the storage mode and aggregations for a cube is one of the most crucial steps in cube development.

After completing this module, you will be able to:

- Explain the advantages and disadvantages of the three data storage modes.
- Use the Storage Design Wizard to set storage design.
- Describe how aggregations work and design aggregations for cubes.
- Describe the concepts and mechanics of usage-based optimization.
- Override aggregation settings per dimension.
### Analysis Server Cube Storage

Topic Objective: To introduce Analysis Services storage options.

Lead-in: Analysis Services supports three storage options: MOLAP, ROLAP, and HOLAP.

Delivery Tips: Avoid going into too much detail on the three storage modes here. Wait until the following slides. Point out that in the preceding illustration, Aggs stands for aggregations.

Microsoft® SQL Server™ 2000 Analysis Services supports three storage options:

- Multidimensional OLAP (MOLAP)
- Relational OLAP (ROLAP)
- Hybrid OLAP (HOLAP)

Cube developers design the storage for cubes. The storage design of a cube is transparent to clients: users do not realize that different cubes have different storage designs.

The following list describes some key characteristics of cube storage modes:

- The storage mode is transparent to clients. Users and client applications see only cubes. For users, the only indication of the storage mode is their observations of query performance.
- The storage mode can be changed after the initial storage decision is made. Once you specify storage and put a cube into production, you can change to a different storage type later. After you change the mode, you must reprocess the cube, and then Analysis Server reloads the data and creates new aggregations.
- Each partition of a cube can have a different storage mode. A cube can consist of multiple partitions. One cube might have both a MOLAP partition and a ROLAP partition.
- Analysis Server does not allocate storage for missing values. For example, if no bikinis are sold in Antarctica, no space is allocated to that missing value. Because missing values take up no storage, cubes are 100 percent dense, that is, all storage is efficiently used. This characteristic of Analysis Server helps avoid the data explosion problems of other OLAP products.

Note: For more information about partitions, see module 10, "Managing Partitions," in course 2074A, Designing and Implementing OLAP Solutions with Microsoft SQL Server 2000.
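The idea of allocating no storage for missing values can be illustrated with a small sparse structure. This is only a conceptual sketch, assuming a dictionary keyed by cell coordinates; it does not represent the actual Analysis Services storage format, and the dimension members and values are hypothetical.

```python
from typing import Dict, Optional, Tuple

Cell = Tuple[str, str]  # (product, region): hypothetical dimensions

# Only combinations that actually occur in the fact data get an entry.
# There is no entry, and so no storage, for ("Bikini", "Antarctica").
sales: Dict[Cell, float] = {
    ("Bikini", "Florida"): 1200.0,
    ("Parka", "Antarctica"): 300.0,
}


def lookup(cube: Dict[Cell, float], product: str, region: str) -> Optional[float]:
    """Return the stored value for a cell, or None if the cell is empty."""
    return cube.get((product, region))


products = ["Bikini", "Parka"]
regions = ["Florida", "Antarctica"]
possible_cells = len(products) * len(regions)  # size of a fully allocated grid
stored_cells = len(sales)                      # cells that actually hold data

print(f"{stored_cells} of {possible_cells} possible cells are stored")
```

A fully allocated grid would reserve space for every product and region combination, which is where the data explosion in other designs comes from.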
#### MOLAP Storage Mode

Slide topics:

- Details and Aggregations Stored in Multidimensional Format
- Fastest Storage Option for Queries
- Often the Most Efficient in Terms of Disk Storage, Due to Compression

Topic Objective: To describe the characteristics of MOLAP storage.

Lead-in: In MOLAP cubes, detailed data and aggregations are stored in a multidimensional format on the Analysis Server.

Delivery Tips: Tell students that you design cubes as MOLAP the vast majority of the time because of the fast query times, processing times, and efficient storage of MOLAP cubes. Point out that Aggs stands for aggregations in the preceding illustration.

The following list contains characteristics of MOLAP storage:

- Detailed data and aggregations are stored in a multidimensional format on the Analysis Server.
  - Because the detail data from the fact table is brought into Analysis Server for storage, data is duplicated.
  - The level of detail imported into a cube is based on the grain of the cube's dimensions. For example, if the fact table contains daily data records, but the grain of the cube time dimension is month, the cube will contain data at a month level. The fact table daily records are combined at cube processing time.
  - After a MOLAP cube is processed, all data necessary for querying is located on the Analysis Server. The source relational database management system (RDBMS) is not accessed other than at processing time.
- MOLAP cubes have the fastest query performance for users.
- MOLAP is a very economical mode in terms of disk storage, due to efficient data compression algorithms.
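The grain example above, in which daily fact records are combined to the month level at processing time, can be sketched as a simple roll-up. This is an illustration of the concept only, not of how Analysis Server processes a cube; the fact rows and the function name are hypothetical.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, List, Tuple

# Hypothetical daily fact records: (sale date, sales amount).
daily_facts: List[Tuple[date, float]] = [
    (date(2000, 1, 3), 100.0),
    (date(2000, 1, 17), 50.0),
    (date(2000, 2, 1), 75.0),
]


def roll_up_to_month(facts: List[Tuple[date, float]]) -> Dict[Tuple[int, int], float]:
    """Combine daily records into (year, month) totals, the cube's grain."""
    totals: Dict[Tuple[int, int], float] = defaultdict(float)
    for day, amount in facts:
        totals[(day.year, day.month)] += amount
    return dict(totals)


monthly = roll_up_to_month(daily_facts)
print(monthly)
```

After the roll-up, the individual days are no longer distinguishable, which is why a query below the cube's grain cannot be answered from the cube.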
#### ROLAP Storage Mode

Slide topics:

- Details and Aggregations Stored in RDBMS
- Slowest Query Performance
- Most Often the Slowest to Process
- Analysis Server Can Create Indexed Views
- Useful for Large Data Sources
- Provides Real-Time OLAP Solution

Topic Objective: To describe the characteristics of ROLAP storage.

Lead-in: The following are characteristics of ROLAP storage.

Key Point: Aggregation tables must be stored in the same RDBMS as the data source of a cube.

Delivery Tip: Point out that Aggs stands for aggregations in the preceding illustration.

The following are characteristics of ROLAP storage:

- Detailed data and aggregations are stored in relational tables in the source database.
  - RDBMS indexes are automatically created in the data source to improve cube performance.
  - All queries, other than those satisfied by the client or server caches, must access the source RDBMS tables.
- Under most circumstances, ROLAP cubes are much slower in query performance than MOLAP cube equivalents.
- ROLAP cubes are usually the slowest to process, unless the ROLAP cubes contain few aggregations.
- When assigning the ROLAP storage mode to cubes that have data sources defined in SQL Server 2000 databases, the Analysis Server attempts to create indexed views instead of tables, assuming certain criteria are met in the data source.
- You use the ROLAP storage mode when the data source is too large to be stored and processed effectively in MOLAP or HOLAP.
- You use the ROLAP storage mode when you require a real-time OLAP solution.
#### Creating Real-Time Cubes

Some cubes require immediate refreshing of data when changes occur in the data source. However, with standard cubes, you are forced to reprocess cubes when data changes in the underlying database. To overcome the delay of data updates, you can create real-time OLAP cubes in Analysis Services.

You create a real-time cube by performing the following steps:

- Define the cube by using the ROLAP storage mode.
- Select the Enable real-time updates check box in the Select the type of data storage page in the Storage Design Wizard.

Real-Time Cube Behavior

The following behavior occurs in real-time cubes:

- The Analysis Server polls the database to determine if changes have been made to the data source.
- The Analysis Server flushes the server cache after it detects any database changes to ensure that clients do not query outdated data.
- Cube data automatically refreshes when fact table data changes.

Real-Time Criteria

ROLAP cubes must meet certain criteria before they can behave as real-time cubes:

- Cubes must contain zero aggregations or must store aggregations in SQL Server 2000 indexed views.
- Cube partitions cannot be defined as real-time partitions if they are remote partitions.
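The poll-then-flush behavior described above can be sketched with a small cache class. This is a conceptual illustration under stated assumptions, not the Analysis Server implementation: the class name, the `get_source_version` callback, and the version counter used to detect changes are all hypothetical.

```python
from typing import Callable, Dict


class RealTimeCache:
    """Toy server cache that is flushed when the data source changes."""

    def __init__(self, get_source_version: Callable[[], int]) -> None:
        # get_source_version is a hypothetical callback that returns a value
        # which changes whenever the underlying fact data changes.
        self._get_source_version = get_source_version
        self._seen_version = get_source_version()
        self._cache: Dict[str, float] = {}

    def poll(self) -> bool:
        """Check the source; flush the cache and return True if it changed."""
        current = self._get_source_version()
        if current != self._seen_version:
            self._cache.clear()
            self._seen_version = current
            return True
        return False

    def query(self, key: str, compute: Callable[[], float]) -> float:
        """Answer from the cache when possible, else compute and store."""
        if key not in self._cache:
            self._cache[key] = compute()
        return self._cache[key]


version = [1]  # stands in for the state of the fact table
cache = RealTimeCache(lambda: version[0])
first = cache.query("total_sales", lambda: 10.0)   # computed, then cached
version[0] = 2                                     # fact data changes
flushed = cache.poll()                             # change detected, cache flushed
second = cache.query("total_sales", lambda: 20.0)  # recomputed from fresh data
```

The flush is what keeps clients from seeing the stale cached value after the source changes.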
#### HOLAP Storage Mode

Slide topics:

- Details Maintained in RDBMS
- Aggregations Created in Multidimensional Format
- Good Option where Disk Consumption Is a Concern
- Good Compromise if Details Are Accessed Infrequently

Topic Objective: To describe the characteristics of HOLAP storage.

Lead-in: The following are characteristics of HOLAP storage.

Delivery Tip: Point out that Aggs stands for aggregations in the preceding illustration.

The following are characteristics of HOLAP storage:

- Detailed data is maintained in the RDBMS.
- Aggregations are created in the multidimensional cube format and are stored on the Analysis Server.
- Because detailed data is not duplicated, HOLAP is a reasonable storage compromise where disk consumption is a concern.
- When users do not frequently access the details stored in the RDBMS and the cube contains a high degree of aggregation, HOLAP is a good option for cube storage.

Most cubes use MOLAP as the cube storage mode. However, you can define a cube with a HOLAP design to use less cube storage than if the cube used the MOLAP storage design.

The following are effects of using the HOLAP design in cubes:

- Queries are not as slow as in a ROLAP cube, nor as fast as in a MOLAP cube.
- Processing time for a HOLAP cube is similar to processing time for a MOLAP cube.
  - The same amount of data is read from disk into memory for both HOLAP and MOLAP cube types.
  - The only processing difference between MOLAP and HOLAP cubes is the writing of detail data to the Analysis Server for MOLAP cubes. This process does not add significant processing time because the data has already been read into memory.
#### Cube Aggregations

Slide topics:

- Full Aggregation Not Necessary
- Effects on Cube Size and Processing Time
  - Cube size and processing times increase as aggregations are added to a cube
- Tools for Implementing Aggregations
  - Storage Design Wizard
  - Usage-Based Optimization Wizard
  - Dimension and level aggregation properties

Topic Objective: To explain aggregation topics.

Lead-in: Aggregations are precalculated summaries of detailed data that enable Analysis Server to answer queries quickly.

Delivery Tip: Use the preceding illustration to introduce aggregations, stepping through the bullets on the slide. Point out that there are more sections covering aggregations later in the module.

Aggregations are precalculated summaries of detailed data that enable Analysis Server to answer queries quickly. Cubes contain aggregations designed with the Storage Design Wizard or with the Usage-Based Optimization Wizard.

Precalculated aggregations are fundamental to OLAP cubes, making user queries significantly faster than calculating aggregations at query time. Accessing aggregations is transparent to users and client applications. The Analysis Server accesses aggregations automatically.

Note: A cube can contain multiple partitions. Like a cube's storage mode, aggregations can be designed on a partition-by-partition basis. For more information about partitions, see module 10, "Managing Partitions," in course 2074A, Designing and Implementing OLAP Solutions with Microsoft SQL Server 2000.

Full Aggregation Not Necessary

It is not necessary to fully aggregate a cube in Analysis Services. Analysis Server uses a variety of algorithms to optimize data access, thereby eliminating the need for total cube aggregation.

If an aggregation does not exist to satisfy a query containing summarized data, Analysis Server does not need to query the lowest level of data. Instead, the server uses an intermediate aggregation, if one exists, to satisfy the query.
Here is an example of how Analysis Server uses an intermediate aggregation to satisfy a query:

- Assume a hierarchy for the Time dimension consisting of the levels Year, Quarter, and Month. Aggregations exist for Quarter but not for Year.
- When a user queries the Year level, the server does not access Month level data to calculate the yearly totals. Instead, the server derives the totals from the quarter totals that are already aggregated.

Effects on Cube Size and Processing Time

Cube size increases as aggregations are added to a cube. In addition, processing times increase because pre-aggregations are calculated at process time.

Note: Long processing times tend to be more detrimental to an OLAP application's success than large cube sizes. Disk space is inexpensive compared to time lost waiting for a cube to process.

The size of the cube depends on several factors:

- The number of aggregations
- The number of dimensions
- The number of levels
- The number of measures
- The number of members
- The data distribution of the cube

When designing aggregations, the goal is to maximize query performance while maintaining reasonable cube sizes and processing times.

Tools for Implementing Aggregations

In the following sections, you will learn about three tools available to you for implementing aggregations:

- Storage Design Wizard
- Usage-Based Optimization Wizard
- Dimension and Level aggregation properties
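The Year, Quarter, and Month example earlier in this section can be sketched in a few lines. This is a conceptual illustration of answering a Year query from stored Quarter totals, not the server's actual algorithm; the stored values and the function name are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Tuple

# Stored aggregation at the Quarter level: (year, quarter) -> total.
# No Year aggregation exists, and Month level data is never touched.
quarter_totals: Dict[Tuple[int, int], float] = {
    (2000, 1): 400.0,
    (2000, 2): 350.0,
    (2000, 3): 500.0,
    (2000, 4): 450.0,
    (2001, 1): 420.0,
}


def year_totals_from_quarters(quarters: Dict[Tuple[int, int], float]) -> Dict[int, float]:
    """Derive Year totals by summing the existing Quarter aggregation."""
    years: Dict[int, float] = defaultdict(float)
    for (year, _quarter), total in quarters.items():
        years[year] += total
    return dict(years)


yearly = year_totals_from_quarters(quarter_totals)
print(yearly)
```

Summing at most four quarter totals per year is far cheaper than scanning every month or day record, which is why a partial set of aggregations can still serve many queries well.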
### The Storage Design Wizard

Slide topics:

- Choosing a Storage Mode
- Setting Aggregation Options
- Determining the Level of Aggregation
- Finishing Up

Topic Objective: To introduce the steps involved in using the Storage Design Wizard.

Lead-in: The Storage Design Wizard is the interface that lets you specify the storage mode and aggregation design.

Delivery Tip: Integrate this content into the demonstration Designing Storage for the Sales Cube as an effective method of presenting the material.

The Storage Design Wizard is the interface that lets you specify the storage mode and aggregation design. The next section introduces the steps involved in using the Storage Design Wizard to design storage modes and aggregations:

- Choosing a storage mode
- Setting aggregation options
- Determining the level of aggregation
- Finishing up

There are two entry points into the Storage Design Wizard:

- After building or modifying a cube, you are prompted to set storage options. You start the wizard by clicking Yes.
- Right-click a cube or a partition in a cube, and then click Design Storage.

The user interface of the Storage Design Wizard differs depending on:

- Whether storage has been designed previously for the cube.
- Whether the cube contains partitions.