GKE tells you Error 400/403: Missing edit permissions on account, but something else is actually off. Something more fundamental, yet easy to fix. Spoiler alert: you may want to check your project ID.

Hello, GKE!

Kubernetes is great. Maybe even the cloud we always wanted. And quite amazing on GKE. Plus, $300 in credits for Google Cloud (at the time of writing) don't exactly make it a tough decision either.

We didn't really have to think very long when looking for a managed Kubernetes service that could host our blog, How Hard Can It Be?! When setting up the Google Cloud account, we put everything under the project how-hard-can-it-be, as it seemed like the obvious choice.

However, Kubernetes clusters can go through a lot of money in a rather limited amount of time. So, after 90 days, it was time to move on. Thankfully, our entire infrastructure is defined in Terraform.

So, it's a case of simply creating a new account, pointing Terraform at it, pressing the button, getting a coffee, and coming back to see everything working?

Well, almost.

Automation For the Win!

The first step is indeed creating a new account. Google made that very easy. Next, a new project needs to be created; naming it how-hard-can-it-be makes sense in order to keep things in line with the old account. After that, it's creating a service account and enabling the GKE API. Straightforward, for the most part.

Finally, all variables need to be plugged into the corresponding Terraform .tfvars file. And then, after hitting the button and just before rushing off to the coffee machine, it happens...

Error: Error applying plan:

2 error(s) occurred:

* module.ghost-gke-cluster.module.gke-cluster.google_container_cluster.primary: 1 error(s) occurred:

* google_container_cluster.primary: googleapi: Error 403: Required "container.clusters.create" permission(s) for "projects/how-hard-can-it-be". See https://cloud.google.com/kubernetes-engine/docs/troubleshooting#gke_service_account_deleted for more info., forbidden
* google_compute_global_address.prod: 1 error(s) occurred:

* google_compute_global_address.prod: Error creating GlobalAddress: googleapi: Error 403: Required 'compute.globalAddresses.create' permission for 'projects/how-hard-can-it-be/global/addresses/how-hard-can-it-be-prod', forbidden

Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error
above and apply again to incrementally change your infrastructure.

Blimey.

I Did What?!

So, something seems to be off with the service account.

According to the documentation at https://cloud.google.com/kubernetes-engine/docs/troubleshooting#gke_service_account_deleted, the likely cause is that the "Compute Engine and/or Kubernetes Engine service account has been deleted or edited". As clearly stated:

When you enable the Compute Engine or Kubernetes Engine API, a service account is created and given edit permissions on your project. If at any point you edit the permissions, remove the account entirely, or disable the API, cluster creation and all management functionality will fail.

No rogue tinkering with these service accounts then. Except that I can't really recall ever doing something like that.

Anyway, there's also a helpful gcloud command that should sort things out. First authenticating to the right account via

gcloud auth login

before running the suggested

gcloud services enable container.googleapis.com

command should restore some order. Except, it doesn't.
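One thing worth ruling out at this point is which project gcloud itself is talking to. Without an explicit --project flag, gcloud services enable acts on whatever project the local gcloud configuration points at, and that may not be the one you have in mind. The following is only a sketch of how to make that explicit; the function name enable_gke_api is made up for illustration, and the project ID is whatever you pass in.

```shell
#!/bin/sh
# Enable the GKE API on an explicitly named project instead of relying on
# whatever project the local gcloud configuration currently points at.
# enable_gke_api is a hypothetical helper, not part of gcloud.
enable_gke_api() {
  project_id="$1"
  if [ -z "$project_id" ]; then
    echo "usage: enable_gke_api PROJECT_ID" >&2
    return 2
  fi
  # Show the configured default first, so a mismatch is visible.
  echo "gcloud default project: $(gcloud config get-value project 2>/dev/null)"
  gcloud services enable container.googleapis.com --project="$project_id"
}
```

Called with a project ID as its only argument, it prints the configured default project before enabling the API on the project you named.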

The Solution

After what feels like forever, the solution to the problem turns out to be painfully obvious. It's actually already in the error message.

Error 403: Required "container.clusters.create" permission(s) for "projects/how-hard-can-it-be"

Note that it says projects/how-hard-can-it-be. Makes sense, since that's the name of both the old and the new project. Or is it?!

The Answer is in the JSON

The JSON key files for the service accounts already hold the answer. The JSON for the old service account contains

...
  "project_id": "how-hard-can-it-be",
...

The JSON for the new service account contains

...
  "project_id": "how-hard-can-it-be-239820",
...

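Both projects are displayed as how-hard-can-it-be in the GCP console, but that is only the project name. Project IDs are globally unique, so the new project could not reuse the ID the old one already held and ended up with a numeric suffix instead. Terraform needs the project ID, not the name, to make its calls to the right project.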
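A quick way to see the real ID is to read it straight out of the service account key. The sketch below assumes the key file has the pretty-printed layout shown in the excerpts above, with "project_id" on its own line. The function name key_project_id and the file name in the comment are made up for illustration.

```shell
#!/bin/sh
# Print the project_id stored in a service account key file.
# Assumes the pretty-printed layout shown above: "project_id" on its own line.
# key_project_id is a hypothetical helper name.
key_project_id() {
  sed -n 's/.*"project_id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Hypothetical usage; service-account.json is a placeholder file name:
#   key_project_id service-account.json
```

Running gcloud projects list shows the same thing from the other side: it prints PROJECT_ID and NAME in adjacent columns, which makes a mismatch like this one easy to spot.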

Fixing Terraform

In the end, the solution is to replace

...
project = "how-hard-can-it-be"
...

with

...
project = "how-hard-can-it-be-239820"
...

in the corresponding .tfvars file. Job done.
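To keep this from happening again, a small pre-flight check can compare the two files before Terraform runs. This is only a sketch under a few assumptions: the variable is called project as in the snippet above and sits on a single line, the key file is pretty-printed JSON, and all function names are invented for this example.

```shell
#!/bin/sh
# Read the value of `project = "..."` from a .tfvars file.
tfvars_project() {
  sed -n 's/^[[:space:]]*project[[:space:]]*=[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Read "project_id" from a service account key file.
key_project_id() {
  sed -n 's/.*"project_id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Return 0 when both files name the same project, 1 otherwise.
# Arguments: path to the key file, path to the .tfvars file.
check_project() {
  want=$(key_project_id "$1")
  have=$(tfvars_project "$2")
  if [ -n "$want" ] && [ "$want" = "$have" ]; then
    echo "ok: project is $have"
  else
    echo "mismatch: key says '$want', tfvars says '$have'" >&2
    return 1
  fi
}
```

Something like check_project key.json prod.tfvars && terraform apply -var-file=prod.tfvars would then refuse to start when the two disagree. Both file names are placeholders.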

Conclusion

To add to what the documentation at https://cloud.google.com/kubernetes-engine/docs/troubleshooting#gke_service_account_deleted states: it is worth double-checking the Terraform GCP provider settings.

The Error 403: Required "container.clusters.create" permission(s) error message can also occur when simply pointing Terraform at the wrong project.

Oh, and by the way: the page you're reading right now is being served from a new Google Cloud account. Thanks, Google!