p5-550 Technical Overview and Introduction

3.2 Reliability, availability, and serviceability

Excellent quality and reliability are inherent in all aspects of the IBM eServer p5 design and manufacturing. The fundamental objective of the design approach is to minimize outages. The RAS features help to ensure that the system operates when required, performs reliably, and efficiently handles any failures that might occur. This is achieved using capabilities provided by both the hardware and the AIX 5L operating system.

The p5-550, as a POWER5 server, enhances the RAS capabilities implemented in POWER4-based systems. The RAS enhancements available on POWER5 servers are:

- Most firmware updates allow the system to remain operational.
- ECC has been extended to inter-chip connections for the fabric and processor bus.
- Partial L2 cache deallocation is possible.
- The number of L3 cache line deletes increased from 2 to 10 for better self-healing capability.

The following sections describe in more detail the concepts that form the basis of the leadership RAS features of IBM eServer p5 systems.

3.2.1 Fault avoidance

The p5 systems are built on a quality-based design intended to keep errors from happening in the first place. This design includes the following features:

- Reduced power consumption and cooler operating temperatures for increased reliability, enabled by copper chip circuitry, silicon-on-insulator, and dynamic clock gating
- Mainframe-inspired components and technologies

3.2.2 First Failure Data Capture

If a problem should occur, the ability to diagnose it correctly is a fundamental requirement upon which improved availability is based. The p5-550 incorporates advanced capability in start-up diagnostics and in run-time First Failure Data Capture (FFDC) based on strategic error checkers built into the chips.

Any errors detected by the pervasive error checkers are captured into Fault Isolation Registers (FIRs), which can be interrogated by the service processor (SP).
The SP in the p5-550 can access system components through special-purpose service processor ports or through the error registers (Figure 3-1).
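
The capture-and-interrogate flow described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the actual POWER5 register layout or service processor firmware: the bit assignments in `FIR_BITS`, the `FaultIsolationRegister` class, and the `interrogate` function are all hypothetical names introduced for illustration. The model shows error checkers latching bits into a FIR when they fire, and a separate service-processor routine reading the register afterward to see which checkers tripped and which one tripped first.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical bit assignments for illustration only.
# Real FIR layouts are chip-specific and are not given in this document.
FIR_BITS = {
    0: "L2 cache correctable error",
    1: "L3 cache correctable error",
    2: "processor bus ECC error",
    3: "fabric bus ECC error",
}


@dataclass
class FaultIsolationRegister:
    """Toy FIR: a bit is latched when its checker fires and stays set until cleared."""
    name: str
    value: int = 0
    first_error_bit: Optional[int] = None

    def latch(self, bit: int) -> None:
        """Called by an error checker at the moment it detects a fault."""
        if self.value == 0:
            # Remember which checker fired first (the "first failure").
            self.first_error_bit = bit
        self.value |= 1 << bit

    def clear(self) -> None:
        """Reset the register after the captured data has been collected."""
        self.value = 0
        self.first_error_bit = None


def interrogate(firs: List[FaultIsolationRegister]) -> List[Tuple[str, int, str, bool]]:
    """Service-processor side: read each FIR and list every latched checker.

    Each entry is (register name, bit, description, was_first_error).
    """
    report = []
    for fir in firs:
        for bit, description in FIR_BITS.items():
            if fir.value & (1 << bit):
                report.append((fir.name, bit, description, bit == fir.first_error_bit))
    return report


if __name__ == "__main__":
    fir = FaultIsolationRegister("chip0")
    fir.latch(2)  # a processor bus ECC checker fires first
    fir.latch(0)  # an L2 cache checker fires later
    for entry in interrogate([fir]):
        print(entry)
```

In this sketch the register keeps its contents until it is explicitly cleared, so the data recorded at the time of the fault is still available when the service processor reads it later.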