Planning multiple environments back to back fails getting providers

I have created a demo project using Terraspace 0.6.2 to show the issue. Basically, the generated .terraform.lock.hcl loses its provider hashes when running the second plan:

TS_ENV=dev terraspace plan dev # valid lock file
TS_ENV=staging terraspace plan staging # invalid lock file

Check out this repo here which has everything needed to work through this issue: https://github.com/ellisio/terraspace-demo-bug

It looks like this is not limited to TS_ENV switching. It also happens across multiple stacks. For example, if we have two stacks that require the hashicorp/google provider, the following behavior happens.

TL;DR

terraspace init stack-a # produces good lock file hashes
terraspace init stack-b # produces broken lock file hashes
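
For context, each stack declares a dependency on the google provider, something like this (a sketch; the version constraint is taken from the lock files below, the rest is assumed):

terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = ">= 3.59.0, < 4.0.0"
    }
  }
}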

Stack A

terraspace init stack-a

The above produces the following in .terraspace-cache/us-central1/dev/stacks/stack-a/.terraform.lock.hcl:

# This file is maintained automatically by "terraform init".
# Manual edits may be lost in future updates.

provider "registry.terraform.io/hashicorp/google" {
  version     = "3.59.0"
  constraints = ">= 2.12.0, ~> 3.45, >= 3.59.0, < 4.0.0"
  hashes = [
    "h1:rubiy+932DM9kWMJSz5u8zgUGj7Iza6m5krr0FuPi9E=",
    "zh:1210d60719470b32d979390a73fa2405ceb9702f2728854cac3c3804bf774442",
    "zh:1e0cec25c527cd09d94ddcea55522e3d75a600745f3d8cd46296e610dde41abf",
    "zh:3eff1094a52a680d044ed8182ca1b70a8a509e4200fd89deae220b21503832a2",
    "zh:604c5fdb7d15268e4a5210cfcc5630f34c9a0a06d8ef5f6f3a93513aad278e11",
    "zh:6c02ff804cfa2fd7dda4c090f06ee999ce6fed2bc7fe408fa3ba312d57b64d56",
    "zh:8954c3691d665f44ed7bda1c7f5d02f4980698657b6518b4445842f80c146481",
    "zh:8e1f53a315341285b04aa50dda086be1f84d02ab92a9f4a3875e648374829a7b",
    "zh:e0b1f047f65a8403ea16157d4f3f8492d4b23ceab85b939f2bcd368e2d8f0252",
    "zh:f795a80a734d7730fe0b876f16705964a80bd155925aecc60026c0e8dab145ca",
    "zh:ffdcdebaabc34467db790a8c3e769fa6e44f580e4a162de1ad4f7156e54064fd",
  ]
}

Stack B

terraspace init stack-b

The above produces the following in .terraspace-cache/us-central1/dev/stacks/stack-b/.terraform.lock.hcl:

# This file is maintained automatically by "terraform init".
# Manual edits may be lost in future updates.

provider "registry.terraform.io/hashicorp/google" {
  version     = "3.59.0"
  constraints = ">= 3.59.0, < 4.0.0"
  hashes = [
    "h1:rubiy+932DM9kWMJSz5u8zgUGj7Iza6m5krr0FuPi9E=",
  ]
}

The Issue

Because of the limited hashes for stack-b, that stack fails to plan/apply properly on TFC. The lock file only records the h1: hash computed from the locally cached provider package; without the registry's zh: hashes, Terraform on TFC's Linux workers likely cannot verify the provider it downloads.
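
As a one-off repair, Terraform 0.14's providers lock subcommand can regenerate the full hash set for the platforms involved. A sketch (the platform names are assumptions for a macOS workstation plus TFC's Linux workers):

cd .terraspace-cache/us-central1/dev/stacks/stack-b
terraform providers lock \
  -platform=darwin_amd64 \
  -platform=linux_amd64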

Dug into this with the provided example project. Here are different debugging sessions:

  1. Reproduction of issue where .terraform.lock.hcl loses info with terraform version v0.14.4: https://gist.github.com/tongueroo/5e64bdc6d2fd74d71520d294a20f32e8
  2. Reproduction of issue where .terraform.lock.hcl loses info with terraform version v0.14.7: https://gist.github.com/tongueroo/238b9eed692d40bc6b60fe3f639a8229
  3. If staging is run first, then its .terraform.lock.hcl has all the info: https://gist.github.com/tongueroo/39912904e7e37f6d54914bffa5ca898d
  4. Here’s the key: with the plugin cache disabled, both .terraform.lock.hcl files have all the info: https://gist.github.com/tongueroo/2eb04c801268393b700ebed6897139e3
  5. Interestingly, the apply still works on some versions of terraform: https://gist.github.com/tongueroo/509751a5a14627bd31abbeafdde17bc1

#5 shows that terraform apply seems to work even though the staging .terraform.lock.hcl file is missing lock info, at least with terraform v0.14.7. In your provided debugging output, terraform produces a hard fail instead of applying successfully like the gist output. Suspect different versions of terraform behave differently. :face_with_monocle:

#4 is key: you can disable the Terraform plugin cache like so:

config/app.rb

Terraspace.configure do |config|
  config.terraform.plugin_cache.enabled = false
end

Docs: https://terraspace.cloud/docs/config/reference/

This results in consistent .terraform.lock.hcl files with the full lock information. As an immediate step, you can disable the cache.
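
For anyone who wants to see the interaction outside Terraspace, here is a minimal reproduction sketch with plain Terraform, assuming the plugin_cache setting maps to Terraform's TF_PLUGIN_CACHE_DIR (paths and project names are illustrative):

export TF_PLUGIN_CACHE_DIR="$HOME/.terraform.d/plugin-cache"

cd project-a
terraform init    # cache miss: provider downloaded from the registry, full zh: hashes recorded

cd ../project-b
terraform init    # cache hit: provider linked from the cache, only the local h1: hash recorded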

The .terraform.lock.hcl concept did not exist in Terraform when the plugin cache feature was added a while back. It looks like using the plugin cache has side effects on the .terraform.lock.hcl files. Thinking of changing the default to disabled, but have some ideas that may be a better approach. Note, the plugin cache helps speed up the terraspace all commands. Will have to think about this some more.


@tung Confirmed adding config.terraform.plugin_cache.enabled = false fixes the lock files.

What made me find this error was tinkering with our second stack and running a fresh terraspace all up. The second stack failed to plan because of this issue. Running Terraform 0.14.7 both locally and in TFC.

PS: Thanks for your help on the other issue. We were able to leverage Terraspace by converting our tfc-workspaces workspace into just a remote state data source. Now Terraspace uses data sources to get the google credentials and passes them into the provider for the stacks. :metal:
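
For anyone following along, the pattern looks roughly like this (a hypothetical sketch; the organization, workspace, and output names are assumptions):

data "terraform_remote_state" "tfc_workspaces" {
  backend = "remote"

  config = {
    organization = "example-org"
    workspaces = {
      name = "tfc-workspaces"
    }
  }
}

provider "google" {
  credentials = data.terraform_remote_state.tfc_workspaces.outputs.google_credentials
}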


Changed the default to false: https://github.com/boltops-tools/terraspace/pull/92. Would like to make it so there’s a cache per module/stack, but that requires a little more effort. Will get to it in time. The default is false for now. Rather have it work consistently than fast but broken.
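
To sketch the per-module/stack cache idea (not implemented; the path is an assumption): each stack would get its own cache directory, so a cache hit in one stack can never strip hashes from another stack's lock file, while repeat runs of the same stack keep the speedup.

# hypothetical: one plugin cache per stack instead of one shared cache
export TF_PLUGIN_CACHE_DIR=".terraspace-cache/us-central1/dev/stacks/stack-a/.plugin-cache"
terraform init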