Chapter 4. Continuous availability and manageability 123

action to deconfigure the faulty hardware to avoid a potential system outage and to enhance system availability.

Persistent deallocation

To enhance system availability, a component that is identified for deallocation or deconfiguration on a POWER processor-based system is flagged for persistent deallocation. Component removal can occur either dynamically (while the system is running) or at boot time (IPL), depending on both the type of fault and when the fault is detected.

In addition, hardware that causes an unrecoverable runtime fault can be deconfigured from the system after the first occurrence. The system can be rebooted immediately after the failure and resume operation on the remaining stable hardware. This approach prevents the same faulty hardware from affecting system operation again, and the repair action is deferred to a more convenient, less critical time.

Persistent deallocation includes the following elements:
- Processor
- L2/L3 cache lines (cache lines are dynamically deleted)
- Memory
- Deconfigure or bypass failing I/O adapters

Processor instruction retry

As in POWER6, the POWER7 processor supports processor instruction retry and alternate processor recovery for a number of core-related faults. This approach significantly reduces exposure to both permanent and intermittent errors in the processor core. Intermittent errors, often the result of cosmic rays or other sources of radiation, are generally not repeatable.

With this function, when an error is encountered in the core, in caches, or in certain logic functions, the POWER7 processor automatically retries the instruction. If the source of the error was truly transient, the instruction succeeds and the system continues as before. On IBM systems prior to POWER6, this error would have caused a checkstop.

Alternate processor retry

Hard failures are more difficult, being permanent errors that are replicated each time the instruction is repeated.
Retrying the instruction does not help in this situation because the instruction continues to fail.

As in POWER6, POWER7 processors have the ability to extract the failing instruction from the faulty core and retry it elsewhere in the system for a number of faults, after which the failing core is dynamically deconfigured and scheduled for replacement.

Dynamic processor deallocation

Dynamic processor deallocation enables automatic deconfiguration of processor cores when patterns of recoverable core-related faults are detected. Dynamic processor deallocation prevents a recoverable error from escalating to an unrecoverable system error, which might otherwise result in an unscheduled server outage. Dynamic processor deallocation relies on the service processor's ability to use FFDC-generated recoverable error information to notify the POWER Hypervisor when a processor core reaches its predefined error limit. Then, the POWER Hypervisor dynamically deconfigures the failing core, which is called out for replacement. The entire process is transparent to the partition owning the failing instruction.
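
The threshold mechanism behind dynamic processor deallocation can be illustrated with a small model. The following Python sketch is only a conceptual illustration, not firmware code: the class names (Core, Hypervisor, ServiceProcessor), the method names, and the error limit of 3 are all hypothetical, because the actual predefined error limit and the service processor interfaces are not described in this section.

```python
from dataclasses import dataclass, field


@dataclass
class Core:
    """One processor core as seen by this toy model."""
    core_id: int
    recoverable_errors: int = 0
    configured: bool = True


@dataclass
class Hypervisor:
    """Hypothetical stand-in for the POWER Hypervisor's view of the cores."""
    cores: dict = field(default_factory=dict)
    replacement_callouts: list = field(default_factory=list)

    def deconfigure(self, core_id: int) -> None:
        # Remove the core from service and call it out for replacement.
        core = self.cores[core_id]
        if core.configured:
            core.configured = False
            self.replacement_callouts.append(core_id)


@dataclass
class ServiceProcessor:
    """Hypothetical stand-in that counts recoverable errors per core."""
    hypervisor: Hypervisor
    error_limit: int = 3  # assumed value; the real limit is not given here

    def report_recoverable_error(self, core_id: int) -> bool:
        """Record one recoverable error; return True if the core was deconfigured."""
        core = self.hypervisor.cores[core_id]
        if not core.configured:
            return False  # core is already out of service
        core.recoverable_errors += 1
        if core.recoverable_errors >= self.error_limit:
            # Limit reached: notify the hypervisor before the fault escalates.
            self.hypervisor.deconfigure(core_id)
            return True
        return False


# Usage: core 2 reports three recoverable errors and is deconfigured on the third.
hyp = Hypervisor(cores={i: Core(i) for i in range(4)})
sp = ServiceProcessor(hyp, error_limit=3)
results = [sp.report_recoverable_error(2) for _ in range(3)]
active_cores = [c.core_id for c in hyp.cores.values() if c.configured]
```

In this model the remaining cores stay configured and keep running, which mirrors the point that the deallocation is transparent to the partition. A real system also persists the deconfiguration across IPLs, which the sketch does not attempt to model.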