Azul: Thank you so much for posting this! Using data sources is a great approach, and I think it's clean. I've been thinking about terraspace helpers that make AWS API calls to look up shared things like the VPC id. Using Terraform data sources is similar to terraspace helpers in that sense. I like it.
Using a glue stack like a `service_discovery` stack is interesting. I've played with this approach a little bit. There seem to be some pros and cons:

- Con: You would have to remember to run `terraspace up service_discovery` first for the envs that need it.
- Pro: The gluing is centralized in one spot, the `service_discovery` stack.

My gut says using data sources directly is maybe less overhead and work.
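For concreteness, the glue-stack wiring might look roughly like this. This is just a sketch; the `service_discovery` stack and the `vpc_id` output name are hypothetical:

```hcl
# app/stacks/service_discovery/main.tf (hypothetical glue stack)
# Centralize the shared lookups here once and publish them as outputs.
data "aws_vpc" "shared" {
  tags = {
    Name = "shared" # assumed tagging convention
  }
}

output "vpc_id" {
  value = data.aws_vpc.shared.id
}
```

A consumer stack could then reference it in its tfvars with Terraspace's `output` helper, e.g. `vpc_id = <%= output('service_discovery.vpc_id') %>`. That dependency is what creates the ordering requirement in the con above.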
I've been reading what the Terraform docs say:
Docs: State: Workspaces | Terraform by HashiCorp
I ran into this a while back and have been thinking about it for some time. Here's the relevant section of the docs:
> Where multiple configurations are representing distinct system components rather than multiple deployments, data can be passed from one component to another using paired resource types and data sources. For example:
>
> - Where a shared Consul cluster is available, use `consul_key_prefix` to publish to the key/value store and `consul_keys` to retrieve those values in other configurations.
> - In systems that support user-defined labels or tags, use a tagging convention to make resources automatically discoverable. For example, use the `aws_vpc` resource type to assign suitable tags and then the `aws_vpc` data source to query by those tags in other configurations.
> - For server addresses, use a provider-specific resource to create a DNS record with a predictable name and then either use that name directly or use the `dns` provider to retrieve the published addresses in other configurations.
> - If a Terraform state for one configuration is stored in a remote backend that is accessible to other configurations then `terraform_remote_state` can be used to directly consume its root module outputs from those other configurations. This creates a tighter coupling between configurations, but avoids the need for the "producer" configuration to explicitly publish its results in a separate system.
To help identify them, some labels:

1. shared storage: e.g. the Consul approach above
2. data source: if the cloud provider supports user-defined labels or tags, it helps with querying
3. conventional naming: e.g. server addresses. Essentially, sometimes resources are named with a pattern, and we can make use of it. IE: convention over configuration: `dev.example.com` or `prod.example.com`
4. terraform_remote_state: when the state is available. I believe this can only be used when the backend state file is accessible, so it might not work between state files in different buckets. IE: `TS_ENV=common` vs `TS_ENV=prod`
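As an illustration of #3 (conventional naming), the `dns` provider can resolve a predictably named record. A sketch; `dev.example.com` is just the placeholder name from above:

```hcl
# Look up the addresses behind a conventionally named server record.
# "dev.example.com" is a placeholder; substitute your naming pattern.
data "dns_a_record_set" "app" {
  host = "dev.example.com"
}

# Use the resolved addresses elsewhere in the configuration.
output "app_addrs" {
  value = data.dns_a_record_set.app.addrs
}
```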
What you're doing above with #2 (data source) can work quite well.
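For reference, a tag-based `aws_vpc` lookup directly in the consuming stack looks something like this. A sketch; the tag name and value are assumptions:

```hcl
# Query the shared VPC by a tagging convention instead of hardcoding the id.
data "aws_vpc" "shared" {
  tags = {
    Name = "shared" # assumed tag on the shared VPC
  }
}

# Reference the looked-up id from resources in this configuration.
resource "aws_security_group" "example" {
  name   = "example"
  vpc_id = data.aws_vpc.shared.id
}
```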
Some more relevant Terraform docs about #4 (terraform_remote_state): The terraform_remote_state Data Source | Terraform by HashiCorp

Instead of remote state, the docs generally recommend #1 (shared storage):

> When possible, we recommend explicitly publishing data for external consumption to a separate location instead of accessing it via remote state. This lets you apply different access controls for shared information and state snapshots.

The docs recommend storing the info somewhere and fetching it from there:

> A key advantage of using a separate explicit configuration store instead of `terraform_remote_state` is that the data can potentially also be read by systems other than Terraform.
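For completeness, consuming another configuration's root module outputs via #4 looks something like this. A sketch; the bucket, key, region, and `vpc_id` output name are all assumptions:

```hcl
# Read the root module outputs of another configuration's state.
# The bucket/key/region are placeholders for wherever that state lives.
data "terraform_remote_state" "network" {
  backend = "s3"
  config = {
    bucket = "my-terraform-state"
    key    = "network/terraform.tfstate"
    region = "us-west-2"
  }
}

# The producer's outputs are available under .outputs.
resource "aws_security_group" "example" {
  name   = "example"
  vpc_id = data.terraform_remote_state.network.outputs.vpc_id
}
```

Note this only works when the backend state is accessible to the consuming configuration, which is the limitation called out in #4 above.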
On the same doc page, there's a `tfe_output` data source. It's basically #1 (shared storage), grabbing the data from Terraform Cloud/Enterprise.
I'm thinking of adding a similar terraspace cloud `tsc_output` helper that will fetch from Terraspace Cloud. It's shared storage, but even more interesting because it's accessible at terraspace build/compile/preprocessing time for tfvars. That's a generalized way to do it.
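Usage might look something like this in a tfvars file. This is purely hypothetical since the helper doesn't exist yet; the name and signature could change:

```hcl
# app/stacks/demo/tfvars/base.tfvars (hypothetical)
# tsc_output would fetch the value from Terraspace Cloud at build time,
# before terraform even runs, similar to the existing output helper.
vpc_id = <%= tsc_output('service_discovery.vpc_id') %>
```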
For non-Terraspace-Cloud users, custom helpers can fetch the info from wherever the user needs, e.g. the AWS API or some shared storage like Consul. I can also add, or consider PRs for, additional helper support in the specific terraspace cloud provider plugins. I went ahead and improved the way terraspace core processes tfvars to help with this. This is released in Terraspace v2.2+.
To summarize the approaches for passing data from global or shared stacks:

- Data source: use native Terraform data sources
- tsc_output: thinking of adding support in terraspace core and cloud
- Terraspace plugin helpers: would like to add additional helpers; will also consider PRs
- Custom helpers: users can define their own helpers