MASTERING SQL SERVER 2000 - P20

12. Click Next.
13. Under Select Tables to Tune, click Select All Tables.
14. Click Next; the Wizard will now start tuning your indexes.
15. You will be asked to accept the index recommendations; click Next.
16. If there were recommendations, you would be asked to schedule them for later or run them now, but because there are no recommendations for this workload file, you are taken directly to the final screen. Click Finish to complete the Wizard.
17. When you receive a message stating that the Wizard has completed, click OK.
18. Exit Profiler.

Tips and Techniques

If you want the best results from SQL Server's monitoring tools, you need to know and use the proper techniques. If you don't, the end result will not be what you are hoping for—or what you need.

Setting a Measurement Baseline

You will never know if your system is running slower than normal unless you know what normal is, which is what a measurement baseline does: It shows you the resources (memory, CPU, etc.) SQL Server consumes under normal circumstances. You create the measurement baseline before putting your system into production so that you have something to compare your readings to later on.

The first thing you need to create an accurate measurement baseline is a test network with just your SQL Server and one or two client machines. You limit the number of machines involved because all networks have broadcast traffic, which is processed by all the machines on the network. This broadcast traffic can throw your counts off—sometimes a little, sometimes quite a bit. You may instead want to consider shutting down as many machines as possible and generating your baseline off-hours if your budget does not allow for a test network.

You can then start your baseline. The Windows NT counters mentioned at the outset of this chapter as well as the preset SQL Server counters should provide an accurate baseline with which you can compare future readings. Then you can move to the next technique.
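Performance Monitor is the usual way to capture a baseline, but the same SQL Server counters can also be read from Transact-SQL. The sketch below assumes the master.dbo.sysperfinfo system table that SQL Server 2000 uses to expose its counters, plus a hypothetical BaselineCounters table you would create yourself; object and counter names vary between releases, so adjust the filter as needed.

   -- Hypothetical table for holding counter snapshots taken under a normal load.
   CREATE TABLE BaselineCounters
   (
      capture_time  DATETIME    NOT NULL DEFAULT (GETDATE()),
      object_name   NCHAR(128)  NOT NULL,
      counter_name  NCHAR(128)  NOT NULL,
      instance_name NCHAR(128)  NULL,
      cntr_value    INT         NOT NULL
   )

   -- Capture the Buffer Manager counters as part of the baseline.
   INSERT BaselineCounters (object_name, counter_name, instance_name, cntr_value)
   SELECT object_name, counter_name, instance_name, cntr_value
   FROM master.dbo.sysperfinfo
   WHERE object_name LIKE '%Buffer Manager%'

Saving these rows on a schedule also gives you the archive that the next section recommends keeping.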
Data Archiving and Trend Tracking

Although the consequences of throwing away your SQL Server monitoring records are not quite as severe as facing an IRS auditor without records and receipts, you still need to save, or archive, your records. One of the primary reasons to do so is to back up requests for additional equipment. For example, if you ask for funds to buy more memory for the SQL Server, but don't bring any proof that the system needs the RAM, you are probably not going to get the money. If you bring a few months' worth of reports, however, and say, "After tracking SQL Server for a time, we've found this…" management may be far more willing to give you the money you need. Using archived data in such fashion is known as trend tracking.

One of the most valuable functions of using your archived data for trend tracking is proactive troubleshooting—that is, anticipating and avoiding problems before they arise. Suppose you added 50 new users to your network about three months ago and are about to do it again. If you archived your data from that period, you would be able to recall what those 50 users did to the performance of the SQL Server, and you could compensate for it. On the other hand, if you threw that data away, you might be in for a nasty surprise when your system unexpectedly slows to a crawl.

Optimization Techniques

SQL Server can dynamically adjust most of its settings to compensate for problems. It can adjust memory use, threads spawned, and a host of other settings. In some cases, unfortunately, those dynamic adjustments may not be enough—you may need to make some manual changes. We'll look at a few specific areas that may require your personal attention.

Queries and Stored Procedures

The first thing to ask yourself when you are getting slow response times is whether you could be using a stored procedure instead of a local query. Stored procedures differ from local code in two ways: They are stored on the SQL Server, so they do not need to be transmitted over the network, which cuts down on congestion. In addition, stored procedures are precompiled on the server, which saves system resources, because local code must be compiled when it reaches the server.

Overall, stored procedures are the way to go, but if you need to use local queries, you should consider how they are written, because poorly constructed queries can wreak havoc on your system. If, for example, you have a query that is returning every row of a table when only half of that is required, you should consider rewriting the query. Improper use of WHERE clauses can also slow your queries down. Make sure that your WHERE clauses reference indexed columns for optimal performance.

Tempdb

Is your tempdb big enough to handle the load that your queries put on it? Think of tempdb as a scratchpad for SQL Server; when queries are performed, SQL Server uses this scratchpad to make notes about the result set. If tempdb runs out of room to make these notes, system response time can slow down. Tempdb should be between 25 and 40% of the size of your largest database (for example, if your largest database is 100MB, tempdb should be 25 to 40MB).
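As a rough illustration of that sizing guideline, the following sketch pre-sizes tempdb's data file for the 100MB example above. The logical file name tempdev is the SQL Server 2000 default, but the 40MB figure is just the high end of the 25 to 40% rule of thumb, not a recommendation for your hardware.

   -- Grow tempdb's data file up front so it does not have to expand under load.
   ALTER DATABASE tempdb
   MODIFY FILE (NAME = tempdev, SIZE = 40MB)

   -- Confirm the new file size.
   USE tempdb
   EXEC sp_helpfile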
Query Governor

Right out of the box, SQL Server will run any query you tell it to, even if that query is poorly written. You can change that by using the Query Governor. This is not a separate tool, but is part of the database engine and is controlled by the Query Governor Cost Limit. This setting tells SQL Server not to run queries that take longer than x (where x is a value higher than zero). If, for example, the Query Governor Cost Limit is set to 2, any query that is estimated to take longer than 2 seconds would not be allowed to run. SQL Server can estimate the running time of a query because SQL Server keeps statistics about the number and composition of records in tables and indexes. The Query Governor Cost Limit can be set by using the command sp_configure 'query governor cost limit', '1' (the 1 in this code can be higher). The Cost Limit can also be set on the Server Settings tab of the Server Properties page in Enterprise Manager.

NOTE: If the Query Governor Cost Limit is set to zero (the default), all queries will be allowed to run.
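Because query governor cost limit is an advanced option, it has to be exposed with show advanced options before sp_configure will change it. A minimal sketch, run from Query Analyzer by a member of the sysadmin or serveradmin role:

   -- Expose the advanced configuration options.
   EXEC sp_configure 'show advanced options', 1
   RECONFIGURE

   -- Refuse any query with an estimated cost greater than 2.
   EXEC sp_configure 'query governor cost limit', 2
   RECONFIGURE

Setting the limit back to 0 restores the default behavior, and the SET QUERY_GOVERNOR_COST_LIMIT statement applies the same kind of cap to a single connection rather than to the whole server.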
Setting Trace Flags

A trace flag is used to temporarily alter a particular SQL Server behavior. Much like a light switch can be used to turn off a light and then turn it back on again, a trace flag can be used to turn off (or on) a behavior in SQL Server. Trace flags are enabled with DBCC TRACEON and turned off with DBCC TRACEOFF. The command to enable trace flag 1204 would look like this: DBCC TRACEON(1204). Table 26.3 lists some of the trace flags available to you.

TABLE 26.3: USES OF TRACE FLAGS

Trace Flag   Use
107          Instructs the server to interpret numbers with a decimal point as type float instead of decimal.
260          Prints version information for extended stored procedure Dynamic Link Libraries. If you write your own extended stored procedures, this trace flag will prove useful in troubleshooting.
1204         Tells you what type of locks are involved in a deadlock and what commands are affected.
1205         Returns even more detailed information about the commands affected by a deadlock.
1704         Prints information when temporary tables are created or dropped.
2528         Disables parallel checking of objects by the DBCC CHECKDB, DBCC CHECKFILEGROUP, and DBCC CHECKTABLE commands. If you know that the server load is going to increase while these commands are running, you may want to turn this trace flag on so that SQL Server checks only a single object at a time and therefore places less load on the server. Under ordinary circumstances, though, you should let SQL Server decide on the degree of parallelism.
3205         Turns off hardware compression for backups to tape drives.
3604         When turning trace flags on or off, sends output to the client.
3605         When turning trace flags on or off, sends output to the error log.
7505         Enables 6.x handling of return codes when a call to dbcursorfetchx causes the cursor position to follow the end of the cursor set.
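A short sketch of the commands involved, using flags 1204 and 3605 from Table 26.3; DBCC TRACESTATUS is the companion command that reports which flags are currently set.

   -- Record deadlock information (1204) in the error log (3605).
   DBCC TRACEON(1204)
   DBCC TRACEON(3605)

   -- List every trace flag that is currently turned on.
   DBCC TRACESTATUS(-1)

   -- Turn the flags back off when you have finished troubleshooting.
   DBCC TRACEOFF(1204)
   DBCC TRACEOFF(3605)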
Max Async I/O

It should go without saying that SQL Server needs to be able to write to disk, because that's where the database files are stored—but is it writing to disk fast enough? If you have multiple hard disks connected to a single controller, multiple hard disks connected to multiple controllers, or a RAID system involving striping, the answer is probably no. The maximum number of asynchronous input/output (Max Async I/O) threads by default in SQL Server is 32. This means that SQL Server can have 32 outstanding read and 32 outstanding write requests at a time. Thus, if SQL Server needs to write some data to disk, SQL Server can send up to 32 small chunks of that data to disk at a time. If you have a powerful disk subsystem, you will want to increase the Max Async I/O setting. The value to which you increase this setting depends on your hardware, so if you increase the setting, you must then monitor the server. Specifically, you will need to monitor the Physical Disk: Average Disk Queue Performance Monitor counter, which should be less than two (note that any queue should be less than two). If you adjust Max Async I/O and the Average Disk Queue counter goes above two, you have set Max Async I/O too high and will need to decrease it.

NOTE: You will need to divide the Average Disk Queue counter by the number of physical drives to get an accurate count. That is, if you have three hard disks and a counter value of six, you would divide six by three—which tells you that the counter value for each disk is two.

LazyWriter

LazyWriter is a SQL Server process that moves information from the data cache in memory to a file on disk. If LazyWriter can't keep enough free space in the data cache for new requests, performance slows down. To make sure this does not happen, monitor the SQL Server: Buffer Manager – Free Buffers Performance Monitor counter. LazyWriter tries to keep this counter level above zero; if it dips or hits zero, you have a problem, probably with your disk subsystem. To verify this, you need to check the Physical Disk: Average Disk Queue Performance Monitor counter and verify that it is not more than two per physical disk (see above). If the queue is too high, LazyWriter will not be able to move data efficiently from memory to disk, and the free buffers will drop.
RAID

RAID (Redundant Array of Inexpensive Disks) is used to protect your data and speed up your system. In a system without RAID, data that is written to disk is written to that one disk. In a system with RAID, that same data would be written across multiple disks, providing fault tolerance and improved I/O. Some forms of RAID can be implemented inexpensively in Windows NT, but this uses such system resources as processor and memory. If you have the budget for it, you might consider getting a separate RAID controller that will take the processing burden off Windows NT. RAID is discussed in detail in Chapter 4, but here is a quick refresher:

RAID 0 Stripe Set: This provides I/O improvement, but not fault tolerance.

RAID 1 Mirroring: This provides fault tolerance and read-time improvement. This can also be implemented as duplexing, which is a mirror that has separate controllers for each disk.

RAID 0+1 Mirrored Stripe Set: This is a stripe set without parity that is duplicated on another set of disks. This requires a third-party controller, because Windows NT does not support this level of RAID natively.

RAID 5 Stripe Set with Parity: This provides fault tolerance and improved I/O.

Adding Memory

SQL Server, like most BackOffice products, needs significant amounts of RAM. The more you put in, the happier SQL Server will be. There is one caveat about adding RAM, however: your level 2 cache. This is much faster (and more expensive) than standard RAM and is used by the processor for storing frequently used data. If you don't have enough level 2 cache to support the amount of RAM in your system, your server may slow down rather than speed up. Microsoft tells you that the minimum amount of RAM that SQL Server needs is 32 to 64MB, but because SQL Server benefits greatly from added RAM, you should consider using 256MB of RAM, which requires 1MB of level 2 cache.

Manually Configuring Memory Use

Although SQL Server can dynamically assign itself memory, it is not always best to let it do so. A good example of this is when you need to run another BackOffice program, such as Exchange, on the same system as SQL Server. If SQL Server is not constrained, it will take so much memory that there will be none left for Exchange. The relevant constraint is the max server memory setting; by adjusting it, you can stop SQL Server from taking too much RAM. The setting is expressed in megabytes, so if, for example, you set it to 100, SQL Server will never use more than 100MB of RAM. You could also set min server memory, which tells SQL Server to never use less than the set amount; this should be used in conjunction with set working set size.

Windows NT uses virtual memory, which means that data that is in memory and has not been accessed for a while can be stored on disk. The set working set size option stops Windows NT from moving SQL Server data from RAM to disk, even if SQL Server is idle. This can improve SQL Server's performance, because data will never need to be retrieved from disk (which is about 100 times slower than RAM). If you decide to use this option, you should set min server memory and max server memory to the same size, and then change the set working set size option to 1.
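A minimal sketch of those settings, assuming you want to cap SQL Server at roughly 100MB. All three are advanced options, and set working set size is not dynamic; it takes effect only after the SQL Server service is restarted.

   EXEC sp_configure 'show advanced options', 1
   RECONFIGURE

   -- Fix SQL Server's memory at 100MB (the values are in megabytes).
   EXEC sp_configure 'min server memory', 100
   EXEC sp_configure 'max server memory', 100

   -- Keep Windows NT from paging SQL Server's memory out to disk.
   EXEC sp_configure 'set working set size', 1
   RECONFIGURE

With the cap in place, Exchange (or anything else sharing the box) can count on the remaining RAM.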
Summary

This chapter has stressed the importance of monitoring and optimization. Monitoring allows you to find potential problems before your users find them; without it, you have no way of knowing how well your system is performing.

Performance Monitor can be used to monitor both Windows NT and SQL Server. Some of the more important counters to watch are Physical Disk: Average Disk Queue (which should be less than two) and SQLServer:Buffer Manager: Buffer Cache Hit Ratio (which should be as high as possible). Query Analyzer allows you to see how a query will affect your system before you place the query in production. The Profiler is used to monitor queries after they have been placed in general use; it is also useful for monitoring security and user activity. Once you have used Profiler to log information about query use to a trace file, you can run the Index Tuning Wizard to optimize your indexes.

Once you have created all logs and traces, you need to archive them. The various log files can be used later for budget justification and trend tracking. For example, suppose you added 50 users to your system six months ago and are about to add 50 more. If you kept records on what kind of load the last 50 users placed on your system, you will be better prepared for the next 50.

This chapter also presented some tips for repairing a slow-running system. You can change the Max Async I/O setting if your disk is not working hard enough to support the rest of the system, and you may need to upgrade your disk subsystem if the SQL Server: Buffer Manager – Free Buffers Performance Monitor counter hits zero. RAID can also speed up your SQL Server.
If you can afford a separate controller, you should get one to take some of the burden off Windows NT. If you can't afford one, you can use Windows NT RAID level 1 for fault tolerance and speed.

Now that you know how to optimize your server and keep it running at peak performance, it will be much easier to perform all of the tasks on your SQL Server. This is especially true of the next topic that we will discuss—replication.
CHAPTER 27: Replication

FEATURING:
• Understanding Replication
• Setting Up Replication
• Creating and Subscribing to a Transactional Publication
• Creating and Subscribing to a Snapshot Publication
• Creating and Subscribing to a Merge Publication
• Using Replication Monitor
• Summary
For one reason or another, many companies have more than one database system, especially larger companies where there is more than one location or where multiple departments keep their own servers. Regardless of the reason, many of these servers need to have copies of each other's databases. For example, if you have two servers for your human resources department (one in New York and one in Singapore), you may need to keep a copy of each database on each server so that all of your human resources personnel can see the same data. The best way to copy this data is through replication.

Replication is designed specifically for the task of copying data and other objects (such as views, stored procedures, and triggers) between servers and making certain that those copies stay up-to-date. In this chapter, we will look into the inner workings of replication. First we will discuss some terminology that is used to describe the various parts of replication. After you have an understanding of the terms, we can discuss the roles that SQL Servers can play in the replication process. Next we will move into the types and models of replication, and finally we will replicate. Let's get started.

Understanding Replication

The sole purpose of replication is to copy data between servers. There are several good reasons for doing so:

• If your company has multiple locations, you may need to move the data closer to the people who are using it.
• If multiple people want to work on the same data at the same time, replication is a good way of giving them that access.
• Replication can separate the functions of reading from writing data. This is especially true in OLTP (online transaction processing) environments where reading data can place quite a load on the system.
• Some sites may have different methods and rules for handling data (perhaps the site is a sister or child company). Replication can be used to give these sites the freedom of setting their own rules for dealing with data.
• Mobile sales users can install SQL Server 2000 on a laptop, where they might keep a copy of an inventory database. These users can keep their local copy of the database current by dialing in to the network and replicating.

You may be able to come up with even more reasons to use replication in your company, but to do so, you need to understand the publisher/subscriber concept.
The Publisher/Subscriber Metaphor

Microsoft uses the publisher/subscriber metaphor to make replication easier to understand and implement. It works a lot like a newspaper or magazine company. The newspaper company has information that people around the city want to read; therefore the newspaper company publishes this data and has newspaper carriers distribute it to the people who have subscribed. As shown in Figure 27.1, SQL Server replication works much the same in that it too has a publisher, a distributor, and a subscriber:

Publisher: In SQL Server terminology, the publisher is the server with the original copy of the data that others need—much like the newspaper company has the original data that needs to be printed and distributed.

Distributor: Much like the newspaper company needs paper carriers to distribute the newspaper to the people who have subscribed, SQL Servers need special servers called distributors to collect data from publishers to distribute to subscribers.

Subscriber: A subscriber is a server that requires a copy of the data that is stored on the publisher. The subscriber is akin to the people who need to read the news, so they subscribe to the newspaper.

FIGURE 27.1: SQL Server can publish, distribute, or subscribe to publications in replication. (The diagram shows a publication made up of articles flowing from the publisher, which contains the original copy of the data, through the distributor, which collects changes from publishers, to the subscriber, which receives a copy of the data.)

NOTE: A SQL Server can be any combination of these three roles.

The analogy goes even further: All of the information is not just lumped together in a giant scroll and dropped on the doorstep—it is broken up into various publications and articles so that it is easier to find the information you want to read. SQL Server replication follows suit:

Article: An article is just data from a table that needs to be replicated. Of course, you probably do not need to replicate all of the data from the table, so you don't have to. Articles can be horizontally partitioned, which means that not all records in the table are published, and they can be vertically partitioned, which means that not all columns need be published.

Publication: A publication is a collection of articles and is the basis for subscriptions. A subscription can consist of a single article or multiple articles, but you must subscribe to a publication as opposed to a single article.
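To make the partitioning idea concrete, think of an article as the result of a filter over its base table. The queries below only illustrate what horizontal and vertical filters select; the table and column names are hypothetical, and in practice the filters are defined through the publication wizards or the replication stored procedures rather than typed as ad hoc queries.

   -- Horizontal partition: publish only the rows for one region.
   SELECT CustomerID, CustomerName, Region, Phone
   FROM Customers
   WHERE Region = 'West'

   -- Vertical partition: publish every row, but leave sensitive columns out of the article.
   SELECT CustomerID, CustomerName, Region
   FROM Customers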
Now that you know the three roles that SQL Servers can play in replication and that data is replicated as articles that are stored in publications, you need to know the types of replication.

Replication Types

It is important to control how publications are distributed to subscribers. If the newspaper company does not control distribution, for example, many people may not get the paper when they need it, or other people may get the paper for free. In SQL Server, you need to control distribution of publications for similar reasons, so that the data gets to the subscribers when it is needed. There are a few factors to consider when choosing a replication type:

Autonomy: Autonomy is the amount of independence that your subscribers have over the data they receive. Some servers may need a read-only copy of the data, while others may need to be able to make changes to the data they receive.

Latency: This refers to how long a subscriber can go without getting a fresh copy of data from the server. Some servers may be able to go for weeks without getting new data from the publisher, while other instances may require a very short wait time.

Consistency: Possibly the most popular form of replication is transactional replication, where transactions are read from the transaction log of the publisher, moved through the distributor, and applied to the database on the subscriber. This is where transactional consistency comes in. Some subscribers may need all of the transactions in the same order they were applied to the server, while other subscribers may need only some of the transactions.

Once these factors have been considered, you are ready to choose the replication type that will work best for you.
Distributed Transactions

In some instances, multiple servers may need the same transaction at the exact same time, as in a bank, for example. Suppose that the bank has multiple servers for storing customer account data, each server storing a copy of the same data—all servers can modify the data in question. Now suppose that a customer comes to an Automatic Teller Machine and withdraws money from their account. The action of withdrawing money is a simple Transact-SQL transaction that removes money from the customer's checking account record, but remember that more than one server holds this data. If the transaction makes it to only one of the bank's servers, the customer could go to ATMs all over town and withdraw enough money to retire on, and the bank would have a very hard time stopping them.

To avoid such a scenario, you need to get the exact same transaction to all of the subscribers at the same time. If the transaction is not applied to all of the servers, it is not applied to any of the servers. This type of replication is called distributed transactions or two-phase commit (2PC). Technically this is not a form of replication; 2PC uses the Microsoft Distributed Transaction Coordinator and is controlled by the way the Transact-SQL is written. A normal, single-server transaction looks like this:

   BEGIN TRAN
      TSQL CODE
   COMMIT TRAN

A distributed transaction looks like this:

   BEGIN DISTRIBUTED TRAN
      TSQL CODE
   COMMIT TRAN

Using distributed transactions will apply the same transaction to all required servers at once or to none of them at all. This means that this type of replication has very low autonomy, low latency, and high consistency.
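As a concrete version of the bank scenario, the sketch below debits the same account locally and on a second server. The server, database, and table names are hypothetical; the remote server is assumed to be set up as a linked server, and the Microsoft Distributed Transaction Coordinator service must be running on both machines. Either both UPDATE statements commit or neither does.

   BEGIN DISTRIBUTED TRANSACTION

      -- Debit the account on the local server.
      UPDATE Accounts
      SET Balance = Balance - 100
      WHERE AccountID = 1001

      -- Debit the same account on the remote server (hypothetical linked server name).
      UPDATE RemoteBank.BankDB.dbo.Accounts
      SET Balance = Balance - 100
      WHERE AccountID = 1001

   COMMIT TRANSACTION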
Transactional

All data modifications made to a SQL Server database are considered transactions, whether or not they have an explicit BEGIN TRAN command and corresponding COMMIT TRAN (if the BEGIN…COMMIT is not there, SQL Server assumes it). All of these transactions are stored in a transaction log that is associated with the database. With transactional replication, each of the transactions in the transaction log can be replicated. The transactions are marked for replication in the log (because not all transactions may be replicated), then they are copied to the distributor, where they are stored in the distribution database until they are copied to the subscribers.

The only real drawback is that subscribers to a transactional publication must treat the data as read-only, meaning that users cannot make changes to the data they receive. Think of it as being like a subscription to a newspaper—if you see a typo in an ad in the paper, you can't change it with a pen and expect the change to do any good. No one else can see your change, and you will just get the same typo in the paper the next day. So, transactional replication has high consistency, low autonomy, and middle-of-the-road latency.

Transactional with Updating Subscribers

This type of replication is almost exactly like transactional replication, with one major difference: The subscribers can modify the data they receive. You can think of this type of replication as a mix of 2PC and transactional replication in that it uses the Distributed Transaction Coordinator and distributed transactions to accomplish its task. The publisher still marks its transactions for replication, and those transactions get stored on the distributor until they are sent to the subscriber. On the subscriber, though, there is a trigger that is marked NOT FOR REPLICATION. This special trigger watches for changes that come from users of the server, but not for changes that come from the distributor as part of the replication process, and it sends those user changes back to the publisher, where they can be replicated out to any other subscribers of the publication.
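The trigger is generated for you when the publication is created, and the real one is considerably more involved, but the NOT FOR REPLICATION clause it depends on is ordinary Transact-SQL. The fragment below is only a sketch of that syntax against a hypothetical Customers table and audit table; it is not the trigger replication actually builds.

   -- Fires for updates made by local users, but not for updates applied by the
   -- replication agents, because of the NOT FOR REPLICATION clause.
   CREATE TRIGGER trg_Customers_LocalUpdate
   ON Customers
   FOR UPDATE
   NOT FOR REPLICATION
   AS
      INSERT CustomerChanges (CustomerID, ChangedAt)
      SELECT CustomerID, GETDATE()
      FROM inserted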
Snapshot

While transactional replication copies only data changes to subscribers, snapshot replication copies entire publications to subscribers every time it replicates. In essence, it takes a snapshot of the data and sends it to the subscriber every time it replicates. This is useful for servers that need a read-only copy of the data and do not require updates very often—in fact, they could wait for days or even weeks for updated data. A good example of where to use this type of replication is in a department store chain that has a catalog database. The headquarters keeps and publishes the master copy of the database where changes are made. The subscribers can wait for updates to this catalog for a few days if necessary. The data on the subscriber should be treated as read-only here as well, because all of the data is going to be overwritten anyway each time replication occurs. This type of replication is said to have high latency, high autonomy, and high consistency.

Snapshot with Updating Subscribers

The only difference between this type of replication and standard snapshot replication is that this type will allow the users to update the data on their local server. This is accomplished the same way it is accomplished in transactional replication with updating subscribers—a trigger is placed on the subscribing database that watches for local transactions and replicates those changes to the publishing server by means of the Distributed Transaction Coordinator. This type of replication has moderate latency, high consistency, and high autonomy.

Merge

By far, this is the most complex type of replication to work with, but also the most flexible. Merge replication allows changes to be made to the data at the publisher as well as at all of the subscribers. These changes are then replicated to all other subscribers until finally your systems reach convergence, the blessed state at which all of your servers have the same data.

The biggest problem with merge replication is known as a conflict. This problem occurs when more than one user modifies the same record on their copy of the database at the same time. For example, if a user in Florida modifies record 25 in a table at the same time that a user in New York modifies record 25 in their own copy of the table, a conflict will occur on record 25 when replication takes place, because the same record has been modified in two different places, and therefore SQL Server has two values from which to choose. The default method of choosing a winner in this conflict is based on site priority (which you will see how to set later in this chapter).

Merge replication works by adding triggers and system tables to the databases on all of the servers involved in the replication process. When a change is made at any of the servers, the trigger fires off and stores the modified data in one of the new system tables, where it will reside until replication occurs. This type of replication has the highest autonomy, highest latency, and lowest transactional consistency.

But how does all of this occur? What is the driving force behind replication? Let's look at the four agents that make replication run.

Replication Agents

Any of the types of subscriptions listed in the last section can be either push or pull subscriptions. A push subscription is configured and controlled at the publisher. This method of subscription is like the catalogs that you receive in the mail—the publisher decides when you get updates because the publisher knows when changes have been made to the information inside the catalog. The same is true of a push subscription in replication—the publisher decides when changes will be sent to the subscribers.
Pull subscriptions are more like a magazine subscription. You write to the publisher of the magazine and request a subscription—the magazine is not automatically sent to you. Pull subscriptions work much the same in that the subscriber requests a subscription from the publisher—the subscription is not sent unless the subscriber asks for it.

With either method of replication, four agents are used to move the data from the publisher to the distributor and finally to the subscriber:

Log reader agent: This agent is used primarily in transactional replication. It reads the transaction log of the published database on the publisher and looks for transactions that have been marked for replication. When it finds such a transaction, the log reader agent copies the transaction to the distribution server, where it is stored in the distribution database until it is moved to the subscribers. This agent runs on the distributor in both push and pull subscriptions.

Distribution agent: This agent moves data from the distributor to the subscribers. This agent runs on the distributor in a push subscription, but in a pull subscription, it runs on the subscriber. Therefore, if you have a large number of subscribers, you may want to consider using a pull subscription method to lighten the load on the distribution server.

Snapshot agent: Just by reading the name of this agent, you would expect it to work with snapshot replication, but it works with all types of replication. This agent makes a copy of the publication on the publisher and either copies it to the distributor, where it is stored in the distribution working folder (\\distribution_server\Program Files\Microsoft SQL Server\MSSQL$(instance)\REPLDATA), or places it on removable disk (such as a CD-ROM or zip drive) until it can be copied to the subscriber. With snapshot replication, this agent runs every time replication occurs; with the other types of replication, this agent runs on a less frequent basis and is used to make sure that the subscribers have a current copy of the publication, including the most up-to-date structure for the data. This agent runs on the distributor in either a push or a pull subscription.

TIP: New to SQL Server 2000 is the ability to compress snapshot files. This can save quite a bit of hard-disk space, because snapshot files can be sizable.

Merge agent: This agent controls merge replication. It takes changes from all of the subscribers, as well as the publisher, and merges the changes with all other subscribers involved in replication. This agent runs on the distributor in a push subscription and on the subscriber in a pull subscription.
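Once a distributor has been configured (you will see how later in this chapter), a quick way to check which server holds that role and where the working folder lives is the sp_helpdistributor system stored procedure; the exact columns it returns vary by release, so treat this as a sketch.

   -- Run on the publisher to report the distributor, the distribution database,
   -- and the snapshot working folder, among other settings.
   EXEC sp_helpdistributor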
Once you have selected the type of replication you need, you can pick the physical model to go with it.

Replication Models

There are three roles that a SQL Server can play in replication: publisher, distributor, and subscriber. Before you can successfully implement replication, you need to know where to place these servers in your scheme. Microsoft has a few standard replication models that should make it easier for you to decide where to put your servers.

Single Publisher, Multiple Subscribers

In this scenario, there is a single, central publishing server where the original copy of the data is stored and several subscribers that need copies of the data. This model lends itself well to transactional or snapshot replication.

A good example of when to use this is if you have a catalog database that is maintained at company headquarters and your satellite offices need a copy of the catalog database. The database at headquarters could be published, and your satellite offices would subscribe to the publication. If you have a large number of subscribers, you could create a pull subscription so that the load is removed from the distribution server, making replication faster. Figure 27.2 should help you visualize this concept.

FIGURE 27.2: Several servers can subscribe to a single publisher. (The diagram shows one publisher/distributor replicating to three subscribers.)
Multiple Publishers, Single Subscriber

This model has a single server that subscribes to publications from multiple servers. As shown in Figure 27.3, this lends itself to the following scenario: Suppose that you work for a company that sells auto parts and you need to keep track of the inventory at all of the regional offices. The servers at all of the regional offices can publish their inventory databases, and the server at company headquarters can subscribe to those publications. Now the folks at company headquarters will know when a regional office is running low on supplies, because headquarters has a copy of everyone's inventory database.

FIGURE 27.3: A single server can also subscribe to multiple publishers. (The diagram shows four publisher/distributors replicating to a single subscriber.)

Multiple Publishers, Multiple Subscribers

In this model, each server is a publisher, and each server is a subscriber (see Figure 27.4). You may instantly think that this lends itself to merge replication, but that is not always the case. This model can lend itself to other types of replication as well.

For example, suppose that you work at a company that rents videos. Each video store needs to know what the other video stores have in stock so that when a customer wants a specific video, they can be instantly directed to a video store that has a copy of the desired video. To accomplish this, each video store would need to publish a copy of their video inventory, and each store would need to subscribe to the other stores' publications.
In this way, the proprietors of the video store would know what the other video stores have in stock. If this is accomplished using transactional replication, there will be very little latency, because the publication would be updated every time a transaction takes place.

FIGURE 27.4: Servers can both publish and subscribe to one another. (The diagram shows four publisher/subscribers, each replicating with the others.)

Remote Distributor

In many instances, the publishing server also serves as the distributor, and this works fine. However, there are instances when it is advantageous to devote a server to the task of distribution. Take the following scenario, for example (as shown in Figure 27.5): Many international companies need data replicated to all of their subsidiaries overseas. A company with headquarters in New York may need to have data replicated to London, Frankfurt, and Rome, for example. If the server in New York is both the publisher and the distributor, the process of replication would involve three very expensive long-distance calls: one to each of the three subscribers. If you place a distributor in London, though, the publisher in New York would need to make only one call, to the distributor in London. The distributor would then make connections to the other European servers and therefore save money on long-distance calls between servers.