
A2. Avoid reliance upon specific instances of infrastructure

Following the principle that cloud infrastructure should be treated as “Cattle, not Pets”, a recommended practice is to design cloud workloads to be resilient to the failure of any one component, such as a VM or a container. This means making it trivial to replace a failed or malfunctioning component at any time, which is facilitated by defining cloud assets declaratively and using Infrastructure as Code. It also means treating the abstract definition of a component as the “critical” part of the application, rather than the running instantiation of that definition on ephemeral infrastructure.
Because we give up a measure of control over specific instances of infrastructure when they run in environments owned by others, it is necessary not only to respond to component failure but to expect it. Whether or not a computing workload runs in the public cloud, it is desirable to anticipate and proactively handle the possibility that an application component becomes unavailable.
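
As a rough illustration of this idea, the following Python sketch keeps a declarative definition of a component separate from its disposable running instances, and replaces any instance that fails. Every name in it (`ComponentSpec`, `Instance`, `Pool`, `reconcile`, and the image name `example/web:1.0`) is hypothetical and chosen for illustration only. The `_launch` method stands in for whatever provisioning call a real platform would supply. It is a minimal sketch under those assumptions, not a definitive implementation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ComponentSpec:
    """Declarative definition: the durable, 'critical' part of the workload."""
    name: str
    image: str
    replicas: int


@dataclass
class Instance:
    """A running, disposable instantiation of a spec."""
    instance_id: str
    healthy: bool = True


@dataclass
class Pool:
    """Holds the spec and whichever instances currently realise it."""
    spec: ComponentSpec
    instances: list = field(default_factory=list)
    next_id: int = 0

    def _launch(self) -> Instance:
        # Hypothetical stand-in for a real provisioning call.
        instance = Instance(f"{self.spec.name}-{self.next_id}")
        self.next_id += 1
        return instance

    def reconcile(self) -> int:
        """Discard unhealthy instances and launch replacements from the spec.

        Returns the number of instances that were discarded.
        """
        survivors = [i for i in self.instances if i.healthy]
        discarded = len(self.instances) - len(survivors)
        while len(survivors) < self.spec.replicas:
            survivors.append(self._launch())
        self.instances = survivors
        return discarded


spec = ComponentSpec(name="web", image="example/web:1.0", replicas=3)
pool = Pool(spec)
pool.reconcile()                    # initial launch: web-0, web-1, web-2
pool.instances[1].healthy = False   # simulate the failure of one instance
pool.reconcile()                    # web-1 is discarded, web-3 replaces it
print([i.instance_id for i in pool.instances])  # ['web-0', 'web-2', 'web-3']
```

The failed instance is never repaired; it is dropped and a fresh one is created from the unchanged spec. That is the sense in which the definition, not the running instance, is the part worth protecting.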