Azul: Thank you so much for posting this! Using data sources is a great approach, and it's clean. Have been thinking about Terraspace helpers that make AWS API calls to look up shared things like the VPC id. Using Terraform data sources is similar to Terraspace helpers in that sense. I like it.
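For reference, here's a minimal sketch of the data source approach, assuming the shared VPC is tagged with Name = "main" (the tag and names are made up):

```hcl
# Consumer stack: look up the shared VPC directly with a data source.
data "aws_vpc" "main" {
  tags = {
    Name = "main" # assumes the shared VPC is tagged this way
  }
}

# Use the looked-up id wherever it's needed.
resource "aws_security_group" "app" {
  name   = "app"
  vpc_id = data.aws_vpc.main.id
}
```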
Using a glue stack like a `service_discovery` stack is interesting. Played with this approach a little bit. There seem to be some pros and cons:

- Con: Would have to remember to run `terraspace up service_discovery` first for the envs that need it.
- Pro: The gluing is centralized in one spot, the `service_discovery` stack.
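To make the glue-stack shape concrete, here's a minimal sketch with made-up names: the `service_discovery` stack does the lookup once and exposes it as an output, and other stacks consume it with Terraspace's `output` helper in their tfvars.

```hcl
# app/stacks/service_discovery/main.tf: centralize the shared lookups here.
data "aws_vpc" "main" {
  tags = { Name = "main" }
}

output "vpc_id" {
  value = data.aws_vpc.main.id
}
```

```text
# app/stacks/app/tfvars/dev.tfvars: consume the glue stack's output.
vpc_id = <%= output('service_discovery.vpc_id') %>
```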
Gut says using data sources directly is probably less overhead and work.
Additional Thoughts
Been reading what the Terraform docs say.

Docs: State: Workspaces | Terraform by HashiCorp

Ran into this a while back and have been thinking about it for some time. Here's the relevant section of the docs:
> Where multiple configurations are representing distinct system components rather than multiple deployments, data can be passed from one component to another using paired resource types and data sources. For example:
>
> - Where a shared Consul cluster is available, use `consul_key_prefix` to publish to the key/value store and `consul_keys` to retrieve those values in other configurations.
> - In systems that support user-defined labels or tags, use a tagging convention to make resources automatically discoverable. For example, use the `aws_vpc` resource type to assign suitable tags and then the `aws_vpc` data source to query by those tags in other configurations.
> - For server addresses, use a provider-specific resource to create a DNS record with a predictable name and then either use that name directly or use the `dns` provider to retrieve the published addresses in other configurations.
> - If a Terraform state for one configuration is stored in a remote backend that is accessible to other configurations then `terraform_remote_state` can be used to directly consume its root module outputs from those other configurations. This creates a tighter coupling between configurations, but avoids the need for the "producer" configuration to explicitly publish its results in a separate system.
To make these approaches easier to refer to, labeling them:
1. shared storage: IE: the Consul approach above.
2. data source: If the cloud provider supports user-defined labels or tags, that helps with querying.
3. conventional naming: Example: server addresses. Essentially, resources are sometimes named with a pattern, and we can make use of it. IE: convention over configuration: dev.example.com or prod.example.com (see the sketch after this list).
4. `terraform_remote_state`: When the state is accessible. IE: Believe this can be used when the backend state file is accessible, so it might not work between state files in different buckets. IE: TS_ENV=common vs TS_ENV=prod.
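Here's what #3 could look like; a minimal sketch with made-up names, where the producer publishes a DNS record under a predictable name and consumers simply use that name:

```hcl
# Producer: publish the service under a predictable, convention-based name,
# e.g. api.dev.example.com or api.prod.example.com.
resource "aws_route53_record" "api" {
  zone_id = var.zone_id # hosted zone for example.com (assumed to exist)
  name    = "api.${var.env}.example.com"
  type    = "CNAME"
  ttl     = 300
  records = [aws_lb.api.dns_name]
}
```

Consumers don't need a lookup at all; they rely on the convention and reference api.prod.example.com directly.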
What you’re doing above with #2 (data source) can work quite well.
Some more relevant Terraform docs about #4 (terraform_remote_state): The terraform_remote_state Data Source | Terraform by HashiCorp
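A minimal sketch of #4, assuming the producer stack's state lives in an S3 backend the consumer is allowed to read (bucket, key, and region are made up):

```hcl
# Consumer: read the producer configuration's root module outputs from its state.
data "terraform_remote_state" "vpc" {
  backend = "s3"
  config = {
    bucket = "my-terraform-state" # assumed state bucket
    key    = "vpc/terraform.tfstate"
    region = "us-west-2"
  }
}

resource "aws_security_group" "app" {
  name   = "app"
  vpc_id = data.terraform_remote_state.vpc.outputs.vpc_id
}
```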
Instead of remote state (#4), the docs generally recommend #1 (shared storage):
> When possible, we recommend explicitly publishing data for external consumption to a separate location instead of accessing it via remote state. This lets you apply different access controls for shared information and state snapshots.
The docs recommend storing the info somewhere explicit and fetching it from there:
> A key advantage of using a separate explicit configuration store instead of `terraform_remote_state` is that the data can potentially also be read by systems other than Terraform.
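As one illustration of that pattern (the SSM Parameter Store choice and the names are mine, not from the docs), the producer explicitly publishes the value and consumers read it back with a data source:

```hcl
# Producer: explicitly publish the value for external consumption.
resource "aws_ssm_parameter" "vpc_id" {
  name  = "/shared/vpc_id" # assumed naming convention
  type  = "String"
  value = aws_vpc.main.id
}

# Consumer (in another configuration): fetch the published value.
data "aws_ssm_parameter" "vpc_id" {
  name = "/shared/vpc_id"
}
```

Because the value lives in SSM rather than only in Terraform state, non-Terraform tooling can read it too.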
On the same doc page, there's a `tfe_outputs` data source. It's basically #1 (shared storage), grabbing the values from Terraform Cloud/Enterprise.
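A minimal sketch, assuming a Terraform Cloud workspace named vpc-prod in an org named my-org (both made up):

```hcl
# Consumer: read another workspace's outputs from Terraform Cloud.
data "tfe_outputs" "vpc" {
  organization = "my-org"
  workspace    = "vpc-prod"
}

resource "aws_security_group" "app" {
  name   = "app"
  # values is marked sensitive; wrap it in nonsensitive() if needed.
  vpc_id = data.tfe_outputs.vpc.values.vpc_id
}
```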
Thinking will add a similar Terraspace Cloud `tsc_output` helper that will fetch from Terraspace Cloud. It's shared storage, but even more interesting because it's accessible at Terraspace build/compile time, during the preprocessing of tfvars. That's a more generalized way.
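Since the helper doesn't exist yet, this is purely a hypothetical sketch of what usage in a tfvars file could look like (the name and signature are guesses):

```text
# app/stacks/app/tfvars/prod.tfvars
# tsc_output is hypothetical: it would fetch another stack's output from
# Terraspace Cloud while the tfvars file is being preprocessed.
vpc_id = "<%= tsc_output('vpc.vpc_id') %>"
```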
For non-Terraspace-Cloud users, there are custom helpers to fetch the info from wherever the user needs. IE: the AWS API or some shared storage like Consul. Can also add, or consider PRs for, additional helper support in the specific Terraspace cloud provider plugins. Went ahead and improved the way Terraspace core processes tfvars to help with this. This is released in Terraspace v2.2+.
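For example, a custom helper could wrap an AWS API call. A hypothetical usage sketch in a tfvars file (`aws_vpc_id` is a made-up helper name; it would be defined in Ruby as a Terraspace custom helper):

```text
# app/stacks/app/tfvars/dev.tfvars
# aws_vpc_id is a hypothetical custom helper that calls the AWS API
# to look up the shared VPC by its Name tag.
vpc_id = "<%= aws_vpc_id('main') %>"
```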
Summary of the approaches for passing data from global or shared stacks:
- Data source: Use native Terraform data sources.
- `tsc_output`: Think will add support to Terraspace core and Terraspace Cloud.
- Terraspace plugin helpers: Would like to add additional helpers. Will also consider PRs.
- Custom helpers: Users can define their own helpers.