The lost OpenStack project

 keystone  project  debug  Sat 29 April 2017

Another fun issue with an OpenStack platform this week: a lost Keystone project. This is the story of how we brought this project back to life without losing existing resources.

We have a small OpenStack platform running in our Objectif Libre office in Toulouse, France. We use it internally to run test instances. It's running Ocata, and the Keystone setup uses the domains feature to separate service and temporary accounts (default domain) from LDAP-backed accounts (olcorp domain). The only project in the olcorp domain, lab, holds all our virtual resources.

Luke's problem

My colleague Luke (fictional name) could no longer log in at some point this week. He received this very explicit message: "You are not authorized for any projects or domains."

Not cool.

He uses OpenStack a lot, knows what he's doing, and his account had not been suspended. I tried with my own account: same error. I tried again, this time with the cloud-admin account, which is stored in the Keystone database rather than on the LDAP server. Everything went fine and I could perform requests. One of those requests was:

openstack project list --domain olcorp

Empty answer. No project means no way to create or access resources, even if authentication is valid.

The lab project had disappeared.

Restoring the project

When a project is removed from the Keystone database, the associated resources (instances, volumes, networks, ...) are not destroyed. This can look like a maintenance problem, but in our case it turned out to be quite useful.

I hoped that Keystone used soft-deletion of database resources (the data would still be there, but marked as deleted), but no luck there.

The revival of the project required a few steps:

  1. Creation of a new lab project. This is a start but is not enough: the ID of the new project doesn't match the ID of the removed one. All the OpenStack resources are associated with a project through its ID, so we needed the same ID. It is not possible to change/define the project ID using the API (AFAIK).

  2. A bit of MySQL tweaking. I try to avoid modifying data directly on the SQL server as much as I can, but it can be very handy:

    . openrc.sh  # source the OpenStack env file to get the old project ID
    mysql keystonedb -e "update project set id='$OS_PROJECT_ID' where name='lab'"
    
  3. Setup of the roles for users. We use LDAP group-based authorization, with only 2 roles (admin and _member_), so restoring the permissions was easy. It might have been more painful with more roles, groups or users.
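The three steps above can be sketched as a small shell function. This is only a sketch under assumptions, not the exact commands we ran: the `run` wrapper, the `DRY_RUN` variable, the `restore_project` name and the `lab-users` group are hypothetical and only there for illustration. By default the commands are printed rather than executed, which seems a reasonable precaution for anything that writes to the Keystone database.

```shell
#!/bin/sh
# Print commands instead of running them unless DRY_RUN=0 (hypothetical safety switch).
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# restore_project OLD_ID GROUP
#   OLD_ID: ID of the deleted project (for instance taken from the old openrc file)
#   GROUP:  LDAP group that should get the _member_ role (hypothetical name)
restore_project() {
    old_id="$1"
    group="$2"

    # Refuse anything that is not lowercase hex before putting it into SQL.
    case "$old_id" in
        ''|*[!0-9a-f]*) echo "invalid project ID: $old_id" >&2; return 1 ;;
    esac

    # Step 1: recreate the project (it gets a new, random ID).
    run openstack project create --domain olcorp lab

    # Step 2: force the old ID back, directly in the Keystone database.
    run mysql keystonedb -e "update project set id='$old_id' where name='lab'"

    # Step 3: give the group its role back on the project.
    run openstack role add --project lab --project-domain olcorp \
        --group "$group" --group-domain olcorp _member_
}
```

Calling `restore_project "$OS_PROJECT_ID" lab-users` after sourcing the old env file prints the three commands for review. The admin role would be restored with a similar `openstack role add` call.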

The process was straightforward and restoring the project took very little time.

We still don't know what happened on the platform, or why the project disappeared, but the Keystone access log is quite clear:

10.78.1.21 - - [28/Apr/2017:22:24:20 +0200] "DELETE /v3/projects/68a93cc709b44de08cfd11e6bdac2b9b HTTP/1.1" 204 281 "-" "python-keystoneclient"
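To find this kind of entry without scrolling through the whole log, a small filter helps. The sketch below assumes the access log uses the format of the line above (client IP first, request in the sixth and seventh whitespace-separated fields); the `find_project_deletes` name is hypothetical.

```shell
#!/bin/sh
# Print "timestamp client-ip project-id" for every DELETE on /v3/projects/<id>.
# Reads the files given as arguments, or stdin when there are none.
find_project_deletes() {
    awk '$6 == "\"DELETE" {
        # "/v3/projects/<id>" splits into "", "v3", "projects", "<id>"
        n = split($7, parts, "/")
        if (n == 4 && parts[2] == "v3" && parts[3] == "projects")
            print substr($4, 2), $1, parts[4]
    }' "$@"
}
```

Run against the access log, it gives one line per deleted project, which is enough to match the timestamp and source address with whoever (or whatever) was talking to Keystone at that time.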

Could be a human error or a bug (seems unlikely but eh). Will be worth a new blog post if we ever find out :)
