
Lecture Notes on Computer Architecture: Chapter 8 - Prof. Jerry Breecher

Shared by: Codon_03 | File type: PPT | Pages: 24


The lecture Computer Architecture: Chapter 8 - Multiprocessors, Shared Memory Architectures presents the following topics: Introduction – the big picture; Centralized Shared Memory Architectures.


Text content: Lecture Notes on Computer Architecture: Chapter 8 - Prof. Jerry Breecher

  1. Computer Architecture, Chapter 8: Multiprocessors, Shared Memory Architectures. Prof. Jerry Breecher, CSCI 240, Fall 2003
  2. Chapter Overview
     We’re going to cover only one part of this chapter: the part related to how caches from multiple processors interact with each other.
     8.1 Introduction – the big picture
     8.3 Centralized Shared Memory Architectures
     Chap. 8 - Multiprocessors
  3. Introduction. The Big Picture: Where Are We Now?
     The major issue is this: we’ve taken copies of the contents of main memory and put them in caches closer to the processors. But what happens to those copies if someone else wants to use the main memory data? How do we keep all copies of the data in sync with each other?
  4. The Multiprocessor Picture
     Example: Pentium System Organization
     [Figure: processor/memory bus, PCI bus, and I/O buses]
  5. Shared Memory Multiprocessor
     [Figure: four processors, each with registers and caches, connected through a chipset to memory and to disk and other I/O]
     • Memory: centralized, with Uniform Memory Access time (“UMA”) and bus interconnect, I/O
     • Examples: Sun Enterprise 6000, SGI Challenge, Intel SystemPro
  6. Shared Memory Multiprocessor
     • Several processors share one address space
       – conceptually a shared memory
       – often implemented just like a multicomputer
         • address space distributed over private memories
     • Communication is implicit
       – read and write accesses to shared memory locations
     • Synchronization
       – via shared memory locations
         • spin waiting for non-zero
       – barriers
     [Figure: Conceptual Model; processors (P) connected through a network/bus to a memory (M)]
  7. Message Passing Multicomputers
     • Computers (nodes) connected by a network
       – Fast network interface
         • Send, receive, barrier
       – Nodes no different from a regular PC or workstation
     • Cluster conventional workstations or PCs with a fast network
       – cluster computing
       – Berkeley NOW
       – IBM SP2
     [Figure: Node; processors (P), each with its own memory (M), connected by a network]
  8. Large-Scale MP Designs
     Memory: distributed, with Non-Uniform Memory Access time (“NUMA”) and scalable interconnect (distributed memory)
     [Figure labels: 1 cycle, 40 cycles, 100 cycles; Low Latency, High Reliability]
  9. Shared Memory Architectures
     In this section we will look at the issues around:
     • Sharing one memory space among several processors.
     • Maintaining coherence among several copies of a data item.
  10. Shared Memory Architectures: The Problem of Cache Coherency
      Panel  Cache (A’, B’)  Memory (A, B)  I/O activity           Result
      a)     100, 200        100, 200       none                   Cache and memory coherent: A’ = A, B’ = B.
      b)     550, 200        100, 200       Output of A gives 100  Cache and memory incoherent: A’ ≠ A.
      c)     100, 200        100, 440       Input 440 to B         Cache and memory incoherent: B’ ≠ B.
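The three panels above can be replayed in a few lines. This is only a minimal sketch: the dictionaries and the helper `stale_entries` are hypothetical names, and the values are the ones shown on the slide.

```python
def stale_entries(cache, memory):
    """Names whose cached copy differs from the copy in memory."""
    return sorted(name for name in cache if cache[name] != memory[name])

# Panel a: the cache holds exact copies of memory.
memory = {"A": 100, "B": 200}
cache = dict(memory)
panel_a = stale_entries(cache, memory)

# Panel b: the CPU writes 550 to A' in a write-back cache, so memory is not
# updated; an I/O device that outputs A from memory still gets 100.
cache["A"] = 550
io_output_of_a = memory["A"]
panel_b = stale_entries(cache, memory)

# Panel c: start again from the coherent state; an I/O device inputs 440
# into B in memory, and the cached copy B' is now out of date.
memory = {"A": 100, "B": 200}
cache = dict(memory)
memory["B"] = 440
panel_c = stale_entries(cache, memory)

print(panel_a, panel_b, io_output_of_a, panel_c)
```

An empty list means cache and memory agree; a non-empty list names the incoherent item for that panel.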
  11. Shared Memory Architectures: Some Simple Definitions
      • Write Back
        – How it works: write modified data from cache to memory only when necessary.
        – Performance: good, because it doesn’t tie up memory bandwidth.
        – Coherency issues: can have problems with various copies containing different values.
      • Write Through
        – How it works: write modified data from cache to memory immediately.
        – Performance: not so good; uses a lot of memory bandwidth.
        – Coherency issues: modified values are always written to memory; data always matches.
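To make the trade-off concrete, here is a toy model of the two write policies. It is a sketch under simplifying assumptions: the `Cache` class, its `write_through` flag, and the `memory_writes` counter are all hypothetical, and "only when necessary" is modeled as "on eviction".

```python
class Cache:
    """Toy cache in front of a dict-backed memory (hypothetical model)."""

    def __init__(self, memory, write_through):
        self.memory = memory
        self.write_through = write_through
        self.lines = {}         # addr -> cached value
        self.dirty = set()      # addrs modified but not yet written to memory
        self.memory_writes = 0  # how often memory bandwidth was used

    def write(self, addr, value):
        self.lines[addr] = value
        if self.write_through:
            self._to_memory(addr)   # write through: update memory immediately
        else:
            self.dirty.add(addr)    # write back: defer the memory update

    def evict(self, addr):
        if addr in self.dirty:      # write back only when necessary
            self._to_memory(addr)
            self.dirty.discard(addr)
        self.lines.pop(addr, None)

    def _to_memory(self, addr):
        self.memory[addr] = self.lines[addr]
        self.memory_writes += 1


wt_mem, wb_mem = {"x": 0}, {"x": 0}
wt, wb = Cache(wt_mem, write_through=True), Cache(wb_mem, write_through=False)
for v in (1, 2, 3):
    wt.write("x", v)
    wb.write("x", v)
stale_before_evict = wb_mem["x"]   # memory still holds the old value
wb.evict("x")
print(wt.memory_writes, wb.memory_writes, stale_before_evict, wb_mem["x"])
```

Three writes cost three memory writes under write through but one under write back, at the price of memory holding a stale value until the eviction.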
  12. Shared Memory Architectures: What Does Coherency Mean?
      • Informally:
        – “Any read must return the most recent write”
        – Too strict and too difficult to implement
      • Better:
        – “Any write must eventually be seen by a read”
        – All writes are seen in proper order (“serialization”)
      • Two rules to ensure this:
        – “If P writes x and P1 reads it, P’s write will be seen by P1 if the read and write are sufficiently far apart”
        – Writes to a single location are serialized: seen in one order
          • Latest write will be seen
          • Otherwise could see writes in illogical order (could see older value after a newer value)
  13. Shared Memory Architectures: There Are Different Types of Memory in the Cache
      What kinds of memory are there in the cache?
        Test_and_set(lock)
        shared_data = xyz;
        Clear(lock);
      TYPE            Shared?    Writable  How Kept Coherent
      Code            Shared     No        No need.
      Private Data    Exclusive  Yes       Write Back
      Shared Data     Shared     Yes       Write Back *
      Interlock Data  Shared     Yes       Write Through **
      * Write Back gives good performance, but if you use write through here, there will be performance degradation.
      ** Write through here means the lock state is seen immediately. You want a write through here to flush the cache.
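The `Test_and_set(lock)` / `Clear(lock)` pattern on this slide can be sketched as a spin lock. Real hardware provides test-and-set as a single atomic instruction; in this sketch a `threading.Lock` stands in for that atomicity, which is an assumption, and the names `SpinLock`, `worker`, and `shared` are hypothetical.

```python
import threading
import time


class SpinLock:
    """Spin lock built on a simulated atomic test-and-set."""

    def __init__(self):
        self._flag = 0
        self._atomic = threading.Lock()  # stand-in for hardware atomicity

    def test_and_set(self):
        # Atomically return the old flag value and set the flag to 1.
        with self._atomic:
            old, self._flag = self._flag, 1
            return old

    def acquire(self):
        # Spin while another thread holds the lock (old value was 1).
        while self.test_and_set() == 1:
            time.sleep(0)  # yield so the holder can make progress

    def clear(self):
        with self._atomic:
            self._flag = 0


def worker(lock, shared, count):
    for _ in range(count):
        lock.acquire()        # Test_and_set(lock)
        shared["data"] += 1   # shared_data = xyz;
        lock.clear()          # Clear(lock);


lock = SpinLock()
shared = {"data": 0}
threads = [threading.Thread(target=worker, args=(lock, shared, 200))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(shared["data"])
```

Because every update to the shared data happens between acquiring and clearing the lock, no increment is lost.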
  14. Shared Memory Architectures: Potential HW Coherency Solutions
      • Snooping Solution (Snoopy Bus):
        – Send all requests for data to all processors
        – Processors snoop to see if they have a copy and respond accordingly
        – Requires broadcast, since caching information is at processors
        – Works well with bus (natural broadcast medium)
        – Dominates for small scale machines (most of the market)
      • Directory-Based Schemes
        – Keep track of what is being shared in one centralized place
        – Distributed memory => distributed directory for scalability (avoids bottlenecks)
        – Send point-to-point requests to processors via network
        – Scales better than Snooping
        – Actually existed BEFORE Snooping-based schemes
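One rough way to see why a directory scales better is to count how many caches have to examine a single request. The sketch below is deliberately simplistic and rests on assumptions: the function names and the `directory` mapping are hypothetical, and the count ignores the message to the directory itself and any data transfer.

```python
def snooping_lookups(num_processors):
    """With snooping, every other processor's cache examines the request."""
    return num_processors - 1


def directory_lookups(directory, block, requester):
    """With a directory, only caches recorded as sharers are contacted."""
    return len(directory.get(block, set()) - {requester})


# Hypothetical directory: block name -> ids of processors holding a copy.
directory = {"block_7": {0, 3}}

snoop = snooping_lookups(16)
direct = directory_lookups(directory, "block_7", requester=0)
print(snoop, direct)
```

With 16 processors, the broadcast involves 15 caches, while the directory contacts only the one other sharer.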
  15. Shared Memory Architectures: An Example Snoopy Protocol (Maintained by Hardware)
      Invalidation protocol, write-back cache
      Each block of memory is in one state:
      • Clean in all caches and up-to-date in memory (Shared), OR
      • Dirty in exactly one cache (Exclusive), OR
      • Not in any caches
      Each cache block is in one state (track these):
      • Shared: block can be read, OR
      • Exclusive: cache has the only copy, it’s writeable, and dirty, OR
      • Invalid: block contains no data
      Read misses: cause all caches to snoop the bus
      Writes to a clean line are treated as misses
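The relationship between the per-cache states and the three memory-block states can be written as a small check. This is a sketch only; the function name `block_state` and the list-of-strings representation are assumptions.

```python
def block_state(cache_states):
    """Classify one memory block from the state of its copy in each cache.

    cache_states holds one of "Shared", "Exclusive", "Invalid" per cache.
    """
    valid = [s for s in cache_states if s != "Invalid"]
    if not valid:
        return "Not in any caches"
    if valid == ["Exclusive"]:
        return "Dirty in exactly one cache (Exclusive)"
    if all(s == "Shared" for s in valid):
        return "Clean in all caches (Shared)"
    # Anything else (e.g. Exclusive alongside Shared) breaks the protocol.
    raise ValueError("illegal combination: " + repr(cache_states))


print(block_state(["Invalid", "Invalid"]))
print(block_state(["Exclusive", "Invalid"]))
print(block_state(["Shared", "Shared"]))
```

A combination such as one Exclusive copy plus one Shared copy raises an error, which is exactly the situation the invalidation protocol is designed to prevent.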
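  16. Shared Memory Architectures: Snoopy-Cache State Machine-I
      • State machine for CPU requests for each cache block (applies to write-back data)
      Cache block state       CPU request     Action                                           Next state
      Invalid                 CPU read        Place read miss on bus                           Shared (read/only)
      Invalid                 CPU write       Place write miss on bus                          Exclusive (read/write)
      Shared (read/only)      CPU read hit    none                                             Shared (read/only)
      Shared (read/only)      CPU read miss   Place read miss on bus                           Shared (read/only)
      Shared (read/only)      CPU write       Place write miss on bus                          Exclusive (read/write)
      Exclusive (read/write)  CPU read hit    none                                             Exclusive (read/write)
      Exclusive (read/write)  CPU write hit   none                                             Exclusive (read/write)
      Exclusive (read/write)  CPU read miss   Write back block; place read miss on bus         Shared (read/only)
      Exclusive (read/write)  CPU write miss  Write back cache block; place write miss on bus  Exclusive (read/write)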
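The CPU-request side of the state machine can be expressed as one function. This is a sketch; the function name `cpu_request`, the `hit` flag, and the action strings are assumptions chosen to mirror the labels on the slide.

```python
def cpu_request(state, op, hit):
    """Return (next_state, bus_actions) for a CPU "read" or "write".

    hit says whether the requested address matches the block in the cache;
    it is ignored for an Invalid block, which always misses.
    """
    if state == "Invalid":
        if op == "read":
            return ("Shared", ["read miss"])
        return ("Exclusive", ["write miss"])
    if state == "Shared":
        if op == "read":
            return ("Shared", [] if hit else ["read miss"])
        # Writes to a clean line are treated as misses.
        return ("Exclusive", ["write miss"])
    if state == "Exclusive":
        if hit:
            return ("Exclusive", [])
        # The dirty block must be written back before it is replaced.
        if op == "read":
            return ("Shared", ["write back", "read miss"])
        return ("Exclusive", ["write back", "write miss"])
    raise ValueError("unknown state: " + repr(state))


print(cpu_request("Invalid", "write", hit=False))
print(cpu_request("Exclusive", "read", hit=False))
```

Only hits on Shared reads and on Exclusive blocks stay off the bus; every other request places a miss on it.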
  17. Shared Memory Architectures: Snoopy-Cache State Machine-II
      • State machine for bus requests for each cache block
      • Appendix E gives details of bus requests
      Cache block state       Bus request                Action                                   Next state
      Shared (read/only)      Write miss for this block  none                                     Invalid
      Exclusive (read/write)  Write miss for this block  Write back block; (abort memory access)  Invalid
      Exclusive (read/write)  Read miss for this block   Write back block; (abort memory access)  Shared (read/only)
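The bus-request side can be sketched the same way. The function name `bus_request` is hypothetical, and the final fall-through case (combinations the slide does not draw, such as a read miss seen by a Shared block) is an assumption that the block is left unchanged.

```python
def bus_request(state, bus_event):
    """Return (next_state, action) when a miss for this block is snooped."""
    if state == "Shared" and bus_event == "write miss":
        return ("Invalid", None)
    if state == "Exclusive" and bus_event == "write miss":
        return ("Invalid", "write back block; abort memory access")
    if state == "Exclusive" and bus_event == "read miss":
        return ("Shared", "write back block; abort memory access")
    # Assumption: any other combination leaves the block as it is.
    return (state, None)


print(bus_request("Exclusive", "read miss"))
print(bus_request("Shared", "write miss"))
```

The Exclusive owner is the only cache that ever has to supply data, because it holds the only up-to-date copy.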
  18. Shared Memory Architectures: Example
      Assumes the initial cache state is invalid and that A1 and A2 map to the same cache block, but A1 ≠ A2.
      Table columns: step; P1 (State, Addr, Value); P2 (State, Addr, Value); Bus (Action, Proc., Addr, Value); Memory (Addr, Value)
      Steps (all rows still empty):
      • P1: Write 10 to A1
      • P1: Read A1
      • P2: Read A1
      • P2: Write 20 to A1
      • P2: Write 40 to A2
      [Figure: state diagram of the cache for P1, with states Invalid, Shared, and Exclusive. Transition labels: CPU read hit; read miss on bus; write miss on bus; remote write or miss; remote write or miss (write back); remote read (write back); CPU write (place write miss on bus); CPU read hit, CPU write hit.]
  19. Shared Memory Architectures: Example: Step 1
      step                P1 (State Addr Value)  P2 (State Addr Value)  Bus (Action Proc. Addr Value)  Memory (Addr Value)
      P1: Write 10 to A1  Excl. A1 10            -                      WrMs P1 A1                     -
      P1: Read A1         -                      -                      -                              -
      P2: Read A1         -                      -                      -                              -
      P2: Write 20 to A1  -                      -                      -                              -
      P2: Write 40 to A2  -                      -                      -                              -
      [Figure: transition highlighted from Invalid to Exclusive, write miss on bus]
  20. Shared Memory Architectures: Example: Step 2
      Assumes the initial cache state is invalid and that A1 and A2 map to the same cache block, but A1 ≠ A2.
      step                P1 (State Addr Value)  P2 (State Addr Value)  Bus (Action Proc. Addr Value)  Memory (Addr Value)
      P1: Write 10 to A1  Excl. A1 10            -                      WrMs P1 A1                     -
      P1: Read A1         Excl. A1 10            -                      -                              -
      P2: Read A1         -                      -                      -                              -
      P2: Write 20 to A1  -                      -                      -                              -
      P2: Write 40 to A2  -                      -                      -                              -
      [Figure: Exclusive state highlighted, CPU read hit]
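The worked example can be replayed with a small two-cache simulation. The text here fills in only steps 1 and 2; the rows this sketch prints for the remaining three steps are simply what the protocol rules from the earlier slides produce, not values taken from the slides. The `Cache` class, the `access` function, and the tuple layout are hypothetical, and each cache is assumed to hold a single block.

```python
class Cache:
    """A one-block cache; A1 and A2 are assumed to map to this same block."""

    def __init__(self, name):
        self.name = name
        self.state, self.addr, self.value = "Invalid", None, None

    def snapshot(self):
        return (self.state, self.addr, self.value)


def access(caches, memory, who, op, addr, value=None):
    """Perform one CPU read or write; return the bus actions it causes."""
    me = caches[who]
    other = next(c for name, c in caches.items() if name != who)
    present = me.state != "Invalid" and me.addr == addr
    if present and (op == "read" or me.state == "Exclusive"):
        if op == "write":
            me.value = value
        return []  # hit: no bus traffic
    bus = []
    if me.state == "Exclusive" and me.addr != addr:
        # Replacing a dirty block: write it back first.
        memory[me.addr] = me.value
        bus.append(("WrBk", who, me.addr, me.value))
    bus.append(("RdMs" if op == "read" else "WrMs", who, addr))
    # The other cache snoops the miss.
    if other.state != "Invalid" and other.addr == addr:
        if other.state == "Exclusive":
            memory[addr] = other.value
            bus.append(("WrBk", other.name, addr, other.value))
        if op == "read":
            other.state = "Shared"
        else:
            other.state, other.addr, other.value = "Invalid", None, None
    if op == "read":
        me.state, me.addr, me.value = "Shared", addr, memory.get(addr)
    else:
        me.state, me.addr, me.value = "Exclusive", addr, value
    return bus


caches = {"P1": Cache("P1"), "P2": Cache("P2")}
memory = {}
steps = [("P1", "write", "A1", 10), ("P1", "read", "A1", None),
         ("P2", "read", "A1", None), ("P2", "write", "A1", 20),
         ("P2", "write", "A2", 40)]
trace = []
for who, op, addr, value in steps:
    bus = access(caches, memory, who, op, addr, value)
    trace.append((caches["P1"].snapshot(), caches["P2"].snapshot(), bus))
    print(who, op, addr, trace[-1])
```

The first two entries of `trace` correspond to the filled-in rows of Step 1 and Step 2: P1 becomes Exclusive for A1 with value 10 after a write miss on the bus, and the following read is a hit that produces no bus traffic.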