Planning multiple environments back to back fails getting providers

I have created a demo project using Terraspace 0.6.2 to show the issue. Basically, the generated .terraform.lock.hcl loses its provider hashes when running the second plan:

TS_ENV=dev terraspace plan dev # valid lock file
TS_ENV=staging terraspace plan staging # invalid lock file

Check out this repo here which has everything needed to work through this issue: https://github.com/ellisio/terraspace-demo-bug

It looks like this is not limited to TS_ENV switching. It also happens across multiple stacks. For example, if we have two stacks that require the hashicorp/google provider, the following behavior happens.

TL;DR

terraspace init stack-a # produces good lock file hashes
terraspace init stack-b # produces broken lock file hashes
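
For context, each stack declares a dependency on the google provider, something like this (a sketch; the version constraint is taken from the lock files below, the rest is assumed):

terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = ">= 3.59.0, < 4.0.0"
    }
  }
}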

Stack A

terraspace init stack-a

The above produces the following in .terraspace-cache/us-central1/dev/stacks/stack-a/.terraform.lock.hcl:

# This file is maintained automatically by "terraform init".
# Manual edits may be lost in future updates.

provider "registry.terraform.io/hashicorp/google" {
  version     = "3.59.0"
  constraints = ">= 2.12.0, ~> 3.45, >= 3.59.0, < 4.0.0"
  hashes = [
    "h1:rubiy+932DM9kWMJSz5u8zgUGj7Iza6m5krr0FuPi9E=",
    "zh:1210d60719470b32d979390a73fa2405ceb9702f2728854cac3c3804bf774442",
    "zh:1e0cec25c527cd09d94ddcea55522e3d75a600745f3d8cd46296e610dde41abf",
    "zh:3eff1094a52a680d044ed8182ca1b70a8a509e4200fd89deae220b21503832a2",
    "zh:604c5fdb7d15268e4a5210cfcc5630f34c9a0a06d8ef5f6f3a93513aad278e11",
    "zh:6c02ff804cfa2fd7dda4c090f06ee999ce6fed2bc7fe408fa3ba312d57b64d56",
    "zh:8954c3691d665f44ed7bda1c7f5d02f4980698657b6518b4445842f80c146481",
    "zh:8e1f53a315341285b04aa50dda086be1f84d02ab92a9f4a3875e648374829a7b",
    "zh:e0b1f047f65a8403ea16157d4f3f8492d4b23ceab85b939f2bcd368e2d8f0252",
    "zh:f795a80a734d7730fe0b876f16705964a80bd155925aecc60026c0e8dab145ca",
    "zh:ffdcdebaabc34467db790a8c3e769fa6e44f580e4a162de1ad4f7156e54064fd",
  ]
}

Stack B

terraspace init stack-b

The above produces the following in .terraspace-cache/us-central1/dev/stacks/stack-b/.terraform.lock.hcl:

# This file is maintained automatically by "terraform init".
# Manual edits may be lost in future updates.

provider "registry.terraform.io/hashicorp/google" {
  version     = "3.59.0"
  constraints = ">= 3.59.0, < 4.0.0"
  hashes = [
    "h1:rubiy+932DM9kWMJSz5u8zgUGj7Iza6m5krr0FuPi9E=",
  ]
}

The Issue

Because of the limited hashes for stack-b, that stack fails to plan/apply properly on TFC. The lock file only records the h1: hash computed from the locally cached provider package; without the registry's zh: hashes, Terraform on TFC's Linux workers likely cannot verify the provider it downloads.
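
As a one-off repair, Terraform 0.14's providers lock subcommand can regenerate the full hash set for the platforms involved. A sketch (the platform names are assumptions for a macOS workstation plus TFC's Linux workers):

cd .terraspace-cache/us-central1/dev/stacks/stack-b
terraform providers lock \
  -platform=darwin_amd64 \
  -platform=linux_amd64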

Dug into this with the provided example project. Here are different debugging sessions:

  1. Reproduction of issue where .terraform.lock.hcl loses info with terraform version v0.14.4: https://gist.github.com/tongueroo/5e64bdc6d2fd74d71520d294a20f32e8
  2. Reproduction of issue where .terraform.lock.hcl loses info with terraform version v0.14.7: https://gist.github.com/tongueroo/238b9eed692d40bc6b60fe3f639a8229
  3. If staging is run first, then its .terraform.lock.hcl has all the info: https://gist.github.com/tongueroo/39912904e7e37f6d54914bffa5ca898d
  4. Here’s the key: with the plugin cache disabled, both .terraform.lock.hcl files have all the info: https://gist.github.com/tongueroo/2eb04c801268393b700ebed6897139e3
  5. Interestingly, the apply still works on some versions of terraform: https://gist.github.com/tongueroo/509751a5a14627bd31abbeafdde17bc1

#5 shows that terraform apply seems to work even though the staging .terraform.lock.hcl file is missing lock info, at least with terraform v0.14.7. In your provided debugging output, terraform produces a hard fail instead of applying successfully like the gist output. Suspect different versions of terraform behave differently. :face_with_monocle:

#4 is key: you can disable the Terraform plugin cache like so:

config/app.rb

Terraspace.configure do |config|
  config.terraform.plugin_cache.enabled = false
end

Docs: https://terraspace.cloud/docs/config/reference/

This results in consistent .terraform.lock.hcl files with the full lock information. As an immediate step, you can disable the cache.
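
For anyone who wants to see the interaction outside Terraspace, here is a minimal reproduction sketch with plain Terraform, assuming the plugin_cache setting maps to Terraform's TF_PLUGIN_CACHE_DIR (paths and project names are illustrative):

export TF_PLUGIN_CACHE_DIR="$HOME/.terraform.d/plugin-cache"

cd project-a
terraform init    # cache miss: provider downloaded from the registry, full zh: hashes recorded

cd ../project-b
terraform init    # cache hit: provider linked from the cache, only the local h1: hash recorded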

The .terraform.lock.hcl concept did not exist in Terraform when the plugin cache feature was added a while back. It looks like using the plugin cache has side effects on the .terraform.lock.hcl files. Thinking of changing the default to disabled, but have some ideas that may be a better approach. Note, the plugin cache helps speed up the terraspace all commands. Will have to think about this some more.


@tung Confirmed adding config.terraform.plugin_cache.enabled = false fixes the lock files.

What made me find this error was tinkering with our second stack and running a fresh terraspace all up. The second stack failed to plan because of this issue. Running Terraform 0.14.7 both locally and in TFC.

PS: Thanks for your help on the other issue. We were able to leverage Terraspace by converting our tfc-workspaces workspace into just a remote state data source. Now Terraspace uses data sources to get the google credentials and passes them into the provider for the stacks. :metal:
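
For anyone following along, the pattern looks roughly like this (a hypothetical sketch; the organization, workspace, and output names are assumptions):

data "terraform_remote_state" "tfc_workspaces" {
  backend = "remote"

  config = {
    organization = "example-org"
    workspaces = {
      name = "tfc-workspaces"
    }
  }
}

provider "google" {
  credentials = data.terraform_remote_state.tfc_workspaces.outputs.google_credentials
}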


Changed the default to false: https://github.com/boltops-tools/terraspace/pull/92. Would like to make it so there’s a cache per module/stack, but that requires a little more effort. Will get to it in time. The default is false for now. Rather have it work consistently than fast but broken.
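
To sketch the per-module/stack cache idea (not implemented; the path is an assumption): each stack would get its own cache directory, so a cache hit in one stack can never strip hashes from another stack's lock file, while repeat runs of the same stack keep the speedup.

# hypothetical: one plugin cache per stack instead of one shared cache
export TF_PLUGIN_CACHE_DIR=".terraspace-cache/us-central1/dev/stacks/stack-a/.plugin-cache"
terraform init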