Talos v0.5, our modern operating system for Kubernetes, has just been released and we are excited to share the new features that we have added.
In this version of Talos, we've added three major new APIs: control plane recovery, control plane bootstrapping, and a new events subsystem that can be used for sophisticated automation at the OS level. As part of this release, we also refactored the init system, which has made it much easier to implement new APIs.
Running a self-hosted control plane carries an inherent risk of losing it. I won't go into the details of a self-hosted control plane here, but in this release of Talos we have added an API for recovering a lost control plane. To recover the control plane, configure talosctl to point at any control plane node and run:
It's as simple as that. In practice we have found it extremely difficult to get into a state that would require recovery, but should you need to, you can easily recover the control plane.
As we have built out the foundation of our next-generation OS, we have had to depend on the config file in places where we would eventually like to see an API.
The bootstrap API removes the need for managing a special type of config used to bootstrap etcd and Kubernetes. This special config requires careful handling in upgrade and recovery scenarios, and can present edge cases that require the user to be aware of implementation details.
Our goal is to automate all of these processes so that the node becomes less and less of a "thing" that you have to manage, and eventually fades away into the background. The bootstrap API moves us in the direction of a higher level of automation and away from config management.
To bootstrap a cluster, spin up any number of control plane nodes and run the following against any one of them:
In just a few minutes, the cluster will be bootstrapped and ready for workloads.
To reiterate, our goal is to bring a level of automation that is unprecedented at the host OS level. The new Events API is a great example of the ways we are working to tie low-level operating system functionality into a modern API.
Events at the operating system level open up interesting possibilities that remain largely unexplored. Imagine having programmatic, automation-driven access to network events like interface changes, security events, or boot progress. We are excited about the functionality that can be built on these new APIs, and we already plan to use events for a more robust upgrade controller.
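To make the idea concrete, here is a minimal sketch of what event-driven automation on top of an OS-level event stream could look like. It is not the Talos Events API: the `Event` fields, the `Dispatcher` class, and kind names such as `boot.progress` are all hypothetical, chosen only to illustrate routing events to handlers.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Callable, DefaultDict, Dict, List


@dataclass
class Event:
    """A hypothetical OS-level event (not the Talos wire format)."""
    kind: str                                   # e.g. "network.interface", "boot.progress"
    node: str                                   # name of the node that emitted the event
    data: Dict[str, str] = field(default_factory=dict)


class Dispatcher:
    """Routes each event to the handlers registered for its kind."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Callable[[Event], None]]] = defaultdict(list)

    def on(self, kind: str, handler: Callable[[Event], None]) -> None:
        """Register a handler for one event kind."""
        self._handlers[kind].append(handler)

    def dispatch(self, event: Event) -> int:
        """Call every handler registered for event.kind; return how many ran."""
        handlers = self._handlers.get(event.kind, [])
        for handler in handlers:
            handler(event)
        return len(handlers)


# Example automation: track which nodes have finished booting.
ready_nodes: List[str] = []


def on_boot_progress(event: Event) -> None:
    # Treat a node as ready once it reports the (hypothetical) "done" phase.
    if event.data.get("phase") == "done":
        ready_nodes.append(event.node)


dispatcher = Dispatcher()
dispatcher.on("boot.progress", on_boot_progress)
dispatcher.dispatch(Event("boot.progress", "cp-1", {"phase": "done"}))
dispatcher.dispatch(Event("boot.progress", "cp-2", {"phase": "kubelet"}))
```

In a real controller the events would arrive from a stream rather than from direct `dispatch` calls, but the shape stays the same: small handlers keyed by event kind, each reacting to one class of OS-level change.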
In the future we plan to make the Events API extensible, so that the list of capabilities can be expanded dynamically, and so that Talos users can publish events of their own. We would love your input on what types of OS-level events you would like to see. Help us shape what the next generation of Linux distributions looks like, and let us know!
We put in some work for this release that should make navigating documentation much easier.
The Path to 1.0
You might have noticed that this release came more quickly after 0.4 than previous releases. We have changed our model and will now be releasing new versions of Talos on a more regular basis. Our previous cycle was about 12 weeks long, and the new cycle will be 6 weeks long. This will let us iterate more quickly and get new features out to our users faster.
Note that the APIs released in this version of Talos are still considered alpha. Things may break and the APIs are subject to change.
With the work done in this release of Talos, we expect to add features at an accelerated rate, beyond the fundamentals we have been working on for so long. Although Talos is production ready, it is not feature complete, and this year we will focus on getting to something we feel is worthy of the name 1.0.