Terraform Tutorial: Drift Detection Methods


A common misconception among DevOps teams using infrastructure as code (IaC) tools is that the templates they use to run their deployments are infallible sources of truth. Instead, a fundamental challenge of architectures built using tools like Terraform is configuration drift. This occurs when the actual state of your infrastructure begins to accumulate changes and deviates from the configurations defined in your code.
Configuration drift can occur for many reasons, regardless of how good your DevOps engineers are at trying to avoid it. Even when your deployments are completely dependent on IaC, there may be situations where drift occurs. Common failure points typically include adding, removing, or modifying remote resources.
Another big risk is that there’s no easy way to guarantee that deployments in cloud environments are only being performed by using IaC. It’s often still possible to deploy manually or semi-manually through the web portal, the command-line interface, or APIs.
Because this is an ongoing, and potentially serious, problem in environments that rely on IaC, this article will show your DevOps teams how to counteract common Terraform drift. It explores a few different strategies for detecting, monitoring, and remediating drift using examples from a sample Azure Virtual Machine deployment.
Terraform state
To better understand how Terraform drift can occur, it’s important to understand the purpose and significance of the Terraform state.
In addition to containing environmental metadata, the most critical function of the Terraform state is to be the single source of truth for your back-end APIs. Terraform uses a declarative approach to configuration and resource mapping, binding each remote object to a resource and then recording this association in a Terraform state file.
To do this, the state file contains the list of deployed resources with their settings and parameters and keeps track of any providers and dependencies.
At the start of a simple terraform plan deployment job, for example, the terraform.tfstate file is initialized with these properties:
Directory of C:\Terraform\my1stlinuxvm
02/09/2022  03:03 PM                 0 terraform.tfstate
After running the deployment, the summary is a little different:
Directory of C:\Terraform\my1stlinuxvm
02/09/2022  03:09 PM            27,589 terraform.tfstate
The state file contains extensive details about all deployed resources. For example, it encodes roughly 27 KB of state information (the 27,589 bytes in the listing above) to deploy your Azure Virtual Machine, including the deployment and state metadata itself. This metadata is found at the beginning of the file:
{
    "version": 4,
    "terraform_version": "0.13.4",
    "serial": 12,
    "lineage": "ba42cc9f-46ae-13f5-4808-08716f7b82b1",
    "outputs": {},
    "resources": [
      {
        "mode": "managed",
        "type": "azurerm_linux_virtual_machine",
        "name": "myterraformvm",
        "provider": "provider[\"registry.terraform.io/hashicorp/azurerm\"]",
        "instances": [
The Terraform state file also contains various environment metadata. In this case, the file clearly identifies the version of Terraform used for running the deployment. This is important to the immutable approach that HashiCorp utilizes for infrastructure management. The metadata also includes a serial parameter, which is the file’s internal counter indicating how many iterations of state change have occurred.
The state is encoded as a JSON file. This makes it more machine-friendly than the native HashiCorp Configuration Language (HCL) used in most Terraform files, which is designed to emphasize readability by humans.
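Because the state is plain JSON, a read-only script can inspect it without any risk of changing it. The following Python sketch is only an illustration: summarize_state and SAMPLE_STATE are names made up for this example, and the sample is a trimmed copy of the metadata shown earlier rather than a complete state file.

```python
import json


def summarize_state(state_text):
    """Return the metadata and managed resource addresses from state JSON."""
    state = json.loads(state_text)
    resources = [
        f"{res['type']}.{res['name']}"
        for res in state.get("resources", [])
        if res.get("mode") == "managed"
    ]
    return {
        "format_version": state.get("version"),
        "terraform_version": state.get("terraform_version"),
        "serial": state.get("serial"),
        "lineage": state.get("lineage"),
        "managed_resources": resources,
    }


# Trimmed, hypothetical sample; a real terraform.tfstate holds far more detail.
SAMPLE_STATE = """
{
  "version": 4,
  "terraform_version": "0.13.4",
  "serial": 12,
  "lineage": "ba42cc9f-46ae-13f5-4808-08716f7b82b1",
  "outputs": {},
  "resources": [
    {
      "mode": "managed",
      "type": "azurerm_linux_virtual_machine",
      "name": "myterraformvm",
      "instances": []
    }
  ]
}
"""

if __name__ == "__main__":
    for key, value in summarize_state(SAMPLE_STATE).items():
        print(f"{key}: {value}")
```

To point it at a real file, you would read terraform.tfstate in text mode and pass the contents to summarize_state. The script only reads the file and never writes to it.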
Although it’s a JSON document, the state file isn’t meant to be edited by hand, because manual changes can leave it inconsistent or corrupt it.
Instead, to get a clearer view of the state’s contents, run the terraform show command, which will output something similar to this:
C:\Terraform>terraform show
# azurerm_linux_virtual_machine.myterraformvm:
resource "azurerm_linux_virtual_machine" "myterraformvm" {
    admin_username                  = "azureuser"
    allow_extension_operations      = true
    computer_name                   = "myvm"
    disable_password_authentication = true
    encryption_at_host_enabled      = false
    extensions_time_budget          = "PT1H30M"
    id                              = "/subscriptions/46801f45-d426-43b3-a094-0781444710a8/resourceGroups/my1stTFRG/providers/Microsoft.Compute/virtualMachines/myVM"
    location                        = "eastus"
    max_bid_price                   = -1
    name                            = "myVM"
    network_interface_ids           = [
        "/subscriptions/46801f45-d426-43b3-a094-0781444710a8/resourceGroups/my1stTFRG/providers/Microsoft.Network/networkInterfaces/myNIC",
Here, you can see all details from the last successful deployment state of these example resources.
How drift creeps in
Although Terraform’s approach keeps things tidy when your DevOps engineers make changes from the Terraform CLI, changes outside the platform remain invisible to the Terraform state until the next terraform plan or terraform apply command.
For example, if a DevOps engineer changes a VM size using a manual update process, like a cloud CLI or portal, or runs an automated process outside Terraform, like CloudFormation, an ARM template, Chef, Puppet, or Ansible, the Terraform state file won’t reflect the change.
DevOps engineers can sometimes identify these differences by using the terraform plan command followed by the terraform apply command, but some changes can invisibly break the state and need to be manually resolved. In particular, aggregate types and API responses should be error-checked.
You should therefore be cautious about modifying resources outside of Terraform and then running terraform apply in your automations, as this command will revert your changes. Modifications like this can result in deployment failures by adversely affecting the availability of deployed resources or even destructively affecting their state.
Detecting Terraform drift
The most basic way to detect drift is by comparing a Terraform state file to monitoring metrics provided by the actual infrastructure. This might involve something like comparing the Terraform state file to information from a cloud provider’s API to find discrepancies that would indicate configuration drift.
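As a rough sketch of that comparison, the Python snippet below diffs the attributes recorded in state against the attributes reported by the live infrastructure. The find_drift function and both dictionaries are hypothetical: in practice the recorded values would come from the state file and the live values from your cloud provider’s API, and the changed VM size here simply stands in for a manual edit.

```python
def find_drift(recorded, actual, ignore=()):
    """Return {attribute: (recorded_value, actual_value)} for every mismatch."""
    drift = {}
    for key in sorted(set(recorded) | set(actual)):
        if key in ignore:
            continue  # e.g. attributes that are expected to change on their own
        before = recorded.get(key)
        after = actual.get(key)
        if before != after:
            drift[key] = (before, after)
    return drift


# Values as recorded in the Terraform state (taken from the example deployment).
recorded = {"name": "myVM", "location": "eastus", "size": "Standard_DS2_v2"}

# Hypothetical live values, as if someone had resized the VM in the portal.
live = {"name": "myVM", "location": "eastus", "size": "Standard_DS3_v2"}

for attribute, (before, after) in find_drift(recorded, live).items():
    print(f"{attribute}: state has {before!r}, infrastructure has {after!r}")
```

An attribute that exists on only one side shows up with None on the other, so additions and removals are reported along with modifications.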
The READ method
You can use a provider’s READ method to capture the state of a schema and ensure that it’s synchronized to your state file. Provider CREATE and UPDATE functions frequently normalize inputs—as in the case of sanitized string inputs—or apply default values to unspecified (often optional) attributes. So, it’s good practice to call the READ method after any modifications to ensure your state is synchronized.
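The effect of skipping that read-back can be shown with a toy model. The ToyProvider class below is an in-memory stand-in invented for this example. It is not the Terraform plugin SDK, and the normalization rules it applies (lowercasing the location, defaulting priority to "Regular") are assumptions chosen only to mimic the kind of changes a real API might make.

```python
class ToyProvider:
    """In-memory stand-in for a remote API (hypothetical, for illustration)."""

    def __init__(self):
        self._remote = {}

    def create(self, name, config):
        # The remote side fills in defaults for attributes the caller omitted...
        stored = {"priority": "Regular"}
        stored.update(config)
        # ...and normalizes string inputs before storing them.
        stored["location"] = stored["location"].replace(" ", "").lower()
        self._remote[name] = stored

    def read(self, name):
        # Return a copy so callers cannot mutate the "remote" object.
        return dict(self._remote[name])


provider = ToyProvider()
config = {"location": "East US", "size": "Standard_DS2_v2"}
provider.create("myVM", config)

# Recording the raw input would leave state out of sync immediately.
state_without_read = dict(config)

# Recording what READ returns keeps state aligned with the remote object.
state_after_read = provider.read("myVM")

print("without READ:", state_without_read)
print("after READ:  ", state_after_read)
```

In this model, a state built from the raw input already disagrees with the remote object on two attributes, which a later comparison would report as drift even though nobody changed anything.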
Terraform refresh
It’s tedious to compare files manually, and you can’t always rely on good practice being implemented universally, so Terraform provides commands for drift detection and remediation.
In the past, developers could use terraform refresh to validate configuration updates. This command reads the state of managed remote objects and updates the state file accordingly. However, terraform refresh is now deprecated, as its behavior can cause serious issues if remote resources are incorrectly configured. It should only be used in versions of Terraform before v0.15.4.
Terraform plan
So, how do you easily remedy drift in your state file after making changes to remote objects outside of Terraform? If you need to perform a simple refresh, it’s now recommended that you use the terraform plan command with the -refresh-only option:
terraform plan -refresh-only
The refresh functionality is also natively integrated in terraform plan and terraform apply, which are the two main commands used to trigger the deployment of resources.
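Since a regular plan refreshes first, a scheduled job can use it as a simple change detector. The Python sketch below relies on the -detailed-exitcode flag of terraform plan, which returns 0 for an empty diff, 1 for an error, and 2 for a non-empty diff. The function names, the status strings, and the working directory are assumptions made for this example, and the sketch uses a regular plan rather than a refresh-only one.

```python
import subprocess


def build_plan_command():
    """Command line for a non-interactive plan with a detailed exit code."""
    return ["terraform", "plan", "-detailed-exitcode", "-input=false", "-no-color"]


def interpret_exit_code(code):
    """Map the exit code of `terraform plan -detailed-exitcode` to a status."""
    if code == 0:
        return "in-sync"           # succeeded, empty diff
    if code == 2:
        return "changes-detected"  # succeeded, non-empty diff
    return "error"                 # 1, or anything unexpected


def check_for_changes(workdir):
    """Run the plan in workdir and return (status, plan output)."""
    result = subprocess.run(
        build_plan_command(),
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return interpret_exit_code(result.returncode), result.stdout


if __name__ == "__main__":
    # "my1stlinuxvm" is the example folder name; adjust it to your layout.
    status, output = check_for_changes("my1stlinuxvm")
    print(status)
    if status != "in-sync":
        print(output)
```

A non-empty diff can mean either drift or an unapplied configuration change, so the plan output still needs a human or a further check to tell the two apart.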
To run a Terraform-based deployment of resources, you’ll need to follow three steps:

1. You’ll start with the terraform init command, which initializes the resource provider (Azure, AWS, Kubernetes, or whichever provider you’re using) and validates your Terraform template file or files for syntax correctness.
2. Next, run terraform plan. This command runs through your Terraform template and validates the deployment in a pre-deployment state. Consider this a check of what will get deployed in the final step.
3. Use terraform apply to run the deployment. This command connects to the target environment and deploys the defined resources from the template file or files.

The terraform.tfstate file is created during the apply step and stores the most up-to-date state of the deployment.
Here’s an example output of the terraform plan phase in your Azure Virtual Machine deployment, where the deploy.tf file is stored in a folder called my1stlinuxvm:
> terraform plan my1stlinuxvm
Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.

------------------------------------------------------------------------

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # azurerm_linux_virtual_machine.myterraformvm will be created
  + resource "azurerm_linux_virtual_machine" "myterraformvm" {
      + admin_username                  = "azureuser"
      + allow_extension_operations      = true
      + computer_name                   = "myvm"
      + disable_password_authentication = true
      + extensions_time_budget          = "PT1H30M"
      + id                              = (known after apply)
      + location                        = "eastus"
      + max_bid_price                   = -1
      + name                            = "myVM"
      + network_interface_ids           = (known after apply)
      + priority                        = "Regular"
      + private_ip_address              = (known after apply)
      + private_ip_addresses            = (known after apply)
      + provision_vm_agent              = true
      + public_ip_address               = (known after apply)
      + public_ip_addresses             = (known after apply)
      + resource_group_name             = "my1stTFRG"
      + size                            = "Standard_DS2_v2"
      + tags                            = {
          + "environment" = "Terraform Demo"
        }
      + virtual_machine_id              = (known after apply)
      + zone                            = (known after apply)
Notice when running this command, it clearly identifies the integrated refresh functionality. You can see this in the second line through the Refreshing Terraform state… operation.
From there, initiate the actual deployment by running and examining the output of a terraform apply command:
> terraform apply my1stlinuxvm
An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # azurerm_linux_virtual_machine.myterraformvm will be created
  + resource "azurerm_linux_virtual_machine" "myterraformvm" {
      + admin_username                  = "azureuser"
      + allow_extension_operations      = true
      + computer_name                   = "myvm"
      + disable_password_authentication = true
      + extensions_time_budget          = "PT1H30M"
      + id                              = (known after apply)
      + location                        = "eastus"
      + max_bid_price                   = -1
      + name                            = "myVM"
      + network_interface_ids           = (known after apply)
      + priority                        = "Regular"
      + private_ip_address              = (known after apply)
      + private_ip_addresses            = (known after apply)
      + provision_vm_agent              = true
      + public_ip_address               = (known after apply)
      + public_ip_addresses             = (known after apply)
      + resource_group_name             = "my1stTFRG"
      + size                            = "Standard_DS2_v2"
      + tags                            = {
          + "environment" = "Terraform Demo"
        }
Once the deployment is complete, you can validate the Azure VM and, for example, check the size in the Azure Portal.
