Azul: Thank you so much for posting this! Using data sources is a great approach, and I think it's clean. I've been thinking about terraspace helpers that make AWS API calls to look up shared things like the VPC id. Using Terraform data sources is similar to terraspace helpers in that sense. I like it.
Using a glue stack like a `service_discovery` stack is interesting. I've played with this approach a little bit. There seem to be some pros and cons:

- Con: You would have to remember to run `terraspace up service_discovery` first for the envs that need it.
- Pro: The gluing is centralized in one spot, the `service_discovery` stack.

My gut says using data sources directly is maybe less overhead and work.
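For concreteness, the glue-stack wiring might look roughly like this. This is just a sketch; the `service_discovery` stack and the `vpc_id` output name are hypothetical:

```hcl
# app/stacks/service_discovery/main.tf (hypothetical glue stack)
# Centralize the shared lookups here once and publish them as outputs.
data "aws_vpc" "shared" {
  tags = {
    Name = "shared" # assumed tagging convention
  }
}

output "vpc_id" {
  value = data.aws_vpc.shared.id
}
```

A consumer stack could then reference it in its tfvars with Terraspace's `output` helper, e.g. `vpc_id = <%= output('service_discovery.vpc_id') %>`. That dependency is what creates the ordering requirement in the con above.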
I've been reading what the Terraform docs say:
Docs: State: Workspaces | Terraform by HashiCorp
I ran into this a while back and have been thinking about it for some time. Here's the relevant section of the docs:
> Where multiple configurations are representing distinct system components rather than multiple deployments, data can be passed from one component to another using paired resource types and data sources. For example:
>
> - Where a shared Consul cluster is available, use `consul_key_prefix` to publish to the key/value store and `consul_keys` to retrieve those values in other configurations.
> - In systems that support user-defined labels or tags, use a tagging convention to make resources automatically discoverable. For example, use the `aws_vpc` resource type to assign suitable tags and then the `aws_vpc` data source to query by those tags in other configurations.
> - For server addresses, use a provider-specific resource to create a DNS record with a predictable name and then either use that name directly or use the `dns` provider to retrieve the published addresses in other configurations.
> - If a Terraform state for one configuration is stored in a remote backend that is accessible to other configurations then `terraform_remote_state` can be used to directly consume its root module outputs from those other configurations. This creates a tighter coupling between configurations, but avoids the need for the "producer" configuration to explicitly publish its results in a separate system.
To help identify them, some labels:

1. shared storage: e.g. the Consul approach above
2. data source: if the cloud provider supports user-defined labels or tags, it helps with querying
3. conventional naming: e.g. server addresses. Essentially, sometimes resources are named with a pattern, and we can make use of it. IE: convention over configuration: `dev.example.com` or `prod.example.com`
4. terraform_remote_state: when the state is available. I believe this can only be used when the backend state file is accessible, so it might not work between state files in different buckets. IE: `TS_ENV=common` vs `TS_ENV=prod`
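As an illustration of #3 (conventional naming), the `dns` provider can resolve a predictably named record. A sketch; `dev.example.com` is just the placeholder name from above:

```hcl
# Look up the addresses behind a conventionally named server record.
# "dev.example.com" is a placeholder; substitute your naming pattern.
data "dns_a_record_set" "app" {
  host = "dev.example.com"
}

# Use the resolved addresses elsewhere in the configuration.
output "app_addrs" {
  value = data.dns_a_record_set.app.addrs
}
```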
What you're doing above with #2 (data source) can work quite well.
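For reference, a tag-based `aws_vpc` lookup directly in the consuming stack looks something like this. A sketch; the tag name and value are assumptions:

```hcl
# Query the shared VPC by a tagging convention instead of hardcoding the id.
data "aws_vpc" "shared" {
  tags = {
    Name = "shared" # assumed tag on the shared VPC
  }
}

# Reference the looked-up id from resources in this configuration.
resource "aws_security_group" "example" {
  name   = "example"
  vpc_id = data.aws_vpc.shared.id
}
```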
Some more relevant Terraform docs about #4 (terraform_remote_state): The terraform_remote_state Data Source | Terraform by HashiCorp

Instead of remote state, the docs generally recommend #1 (shared storage):

> When possible, we recommend explicitly publishing data for external consumption to a separate location instead of accessing it via remote state. This lets you apply different access controls for shared information and state snapshots.

The docs recommend storing the info somewhere and fetching it from there:

> A key advantage of using a separate explicit configuration store instead of `terraform_remote_state` is that the data can potentially also be read by systems other than Terraform.
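For completeness, consuming another configuration's root module outputs via #4 looks something like this. A sketch; the bucket, key, region, and `vpc_id` output name are all assumptions:

```hcl
# Read the root module outputs of another configuration's state.
# The bucket/key/region are placeholders for wherever that state lives.
data "terraform_remote_state" "network" {
  backend = "s3"
  config = {
    bucket = "my-terraform-state"
    key    = "network/terraform.tfstate"
    region = "us-west-2"
  }
}

# The producer's outputs are available under .outputs.
resource "aws_security_group" "example" {
  name   = "example"
  vpc_id = data.terraform_remote_state.network.outputs.vpc_id
}
```

Note this only works when the backend state is accessible to the consuming configuration, which is the limitation called out in #4 above.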
On the same doc page, there's a `tfe_output` data source. It's basically #1 (shared storage), grabbing the data from Terraform Cloud/Enterprise.
I'm thinking of adding a similar terraspace cloud `tsc_output` helper that will fetch from Terraspace Cloud. It's shared storage, but even more interesting because it's accessible at terraspace build/compile/preprocessing time for tfvars. That's a generalized way to do it.
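Usage might look something like this in a tfvars file. This is purely hypothetical since the helper doesn't exist yet; the name and signature could change:

```hcl
# app/stacks/demo/tfvars/base.tfvars (hypothetical)
# tsc_output would fetch the value from Terraspace Cloud at build time,
# before terraform even runs, similar to the existing output helper.
vpc_id = <%= tsc_output('service_discovery.vpc_id') %>
```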
For non-Terraspace-Cloud users, custom helpers can fetch the info from wherever the user needs, e.g. the AWS API or some shared storage like Consul. I can also add, or consider PRs for, additional helper support in the specific terraspace cloud provider plugins. I went ahead and improved the way terraspace core processes tfvars to help with this. This is released in Terraspace v2.2+.
To summarize the approaches for passing data from global or shared stacks:

- Data source: use native Terraform data sources
- tsc_output: thinking of adding support in terraspace core and cloud
- Terraspace plugin helpers: would like to add additional helpers; will also consider PRs
- Custom helpers: users can define their own helpers