In my previous post, I outlined some feature implementation goals for the Bitcoin/Lightning Kubernetes operator I am developing called
kiln. These goals will require the operator to interact with Bitcoin nodes in order to read state and execute operations. The Bitcoin node implementation that I'm using (
btcd) exposes an RPC API. For the past couple of days, I've been exploring that API and how I can integrate the
kiln operator with this RPC endpoint.
btcd is written in Go, so it was not surprising to find some useful client and utility libraries and even a few Go examples. Since I am using the Go operator SDK, using the native
btcd packages was a natural choice.
But before jumping into code, I had some questions along the way, not the least of which was:
What exactly is this Bitcoin RPC API?
In my observation, the primary function of this API is to support observability of the node state. The
kiln operator will need to make observations about nodes, so that's promising. For example, the procedure called
GetBlockCount() was an easy place to get started.
The API also supports some operations that result in node state changes. One category of operations relates to mining. These operations are not useful in production environments because if you are serious about mining, you will be running on specialized equipment, not commodity hardware. Profitable Bitcoin mining on Kubernetes will probably never be a thing, and mining is not the target use case for
kiln. However, on the
simnet (see previous post), being able to issue arbitrary commands to generate blocks is useful.
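For illustration, here is a minimal sketch of driving block generation with btcd's rpcclient package. The host, credentials, and TLS settings below are placeholders for local experimentation, not kiln's actual configuration:

```go
package main

import (
	"log"

	"github.com/btcsuite/btcd/rpcclient"
)

func main() {
	// Placeholder connection details for a local simnet btcd node;
	// 18556 is btcd's default simnet RPC port.
	client, err := rpcclient.New(&rpcclient.ConnConfig{
		Host:         "localhost:18556",
		User:         "rpcuser",
		Pass:         "rpcpass",
		HTTPPostMode: true, // btcd's RPC server accepts HTTP POST requests
		DisableTLS:   true, // for local experimentation only
	}, nil)
	if err != nil {
		log.Fatal(err)
	}
	defer client.Shutdown()

	// Ask the node to mine 10 blocks. This only makes sense on
	// simnet-style networks where blocks can be generated on demand.
	hashes, err := client.Generate(10)
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("generated %d blocks, new tip %s", len(hashes), hashes[len(hashes)-1])
}
```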
The btcd RPC API is not an implementation of a standard Bitcoin specification as far as I know. So while it may provide some utility for observing and managing the
btcd node that the
kiln operator deploys, I don't think I can expect to manage and observe other node implementations in this way. So for now, the
kiln operator will not be interoperable with all Bitcoin node implementations. I will stick to
btcd for now, but I think node interoperability may be a topic worth revisiting. It comes down to a set of philosophical questions for me:
- Should there be a standard API specification for Bitcoin node management? Does this exist?
- Should we embrace the differences in operational practices and procedures across the various Bitcoin node implementations, on the theory that such diversity is the whole point and a main reason several implementations exist?
- How sufficient is the Bitcoin protocol itself for supporting observation and management?
Okay, that's enough pontificating; hopefully it gives the reader some background about this particular RPC endpoint I'm using.
Getting started by observing the block count
As an incremental step toward my goals for managing block production in a
simnet setting, I started by attempting to log the node's block count to the
Status field of the
BitcoinNode CR instance that owns the node deployment.
The first thing I need is for my operator to be able to trust the RPC server's certificate. Since the
BitcoinNode CR already tells the operator the name of the RPC certificate secret, I can assume that the secret is a standard Kubernetes TLS secret with a copy of the CA's certificate stored under a well-known key.
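As a rough sketch of that step using the controller-runtime client, it might look something like the following. The helper name and the secret key `ca.crt` are my assumptions (it is the conventional key for a CA bundle in TLS-type secrets), not necessarily what kiln actually uses:

```go
import (
	"context"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/types"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// caCertFromSecret is a hypothetical helper that reads the CA certificate
// out of the TLS secret named by the BitcoinNode CR.
func caCertFromSecret(ctx context.Context, c client.Client, namespace, name string) ([]byte, error) {
	var secret corev1.Secret
	if err := c.Get(ctx, types.NamespacedName{Namespace: namespace, Name: name}, &secret); err != nil {
		return nil, err
	}
	// "ca.crt" is an assumption: the conventional key for a CA bundle in
	// TLS-type secrets. The CR may point at a different key.
	return secret.Data["ca.crt"], nil
}
```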
Now, with the CA certificate in hand, I can configure the RPC client and invoke the GetBlockCount() procedure.
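A sketch of that configuration, assuming the CA bytes came from the secret above; the service address, credentials, and helper shape are placeholders:

```go
import (
	"github.com/btcsuite/btcd/rpcclient"
)

// blockCount is a hypothetical helper: it builds an RPC client that trusts
// the node's CA certificate and reads the current block count.
func blockCount(caCert []byte) (int64, error) {
	client, err := rpcclient.New(&rpcclient.ConnConfig{
		Host:         "my-node.default.svc:18556", // placeholder service address
		User:         "rpcuser",                   // placeholder credentials
		Pass:         "rpcpass",
		HTTPPostMode: true,
		Certificates: caCert, // PEM-encoded CA certificate from the TLS secret
	}, nil)
	if err != nil {
		return 0, err
	}
	defer client.Shutdown()
	return client.GetBlockCount()
}
```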
Finally, I can write the block count value to the
BitcoinNode CR instance.
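In controller-runtime terms, the status write might look roughly like this. The BitcoinNode type, its BlockCount status field, and the reconciler (assumed to embed client.Client) are illustrative stand-ins for whatever the CRD actually defines:

```go
import (
	"context"

	ctrl "sigs.k8s.io/controller-runtime"
)

// updateBlockCount is a hypothetical sketch of publishing the observed
// block count to the CR's status.
func (r *BitcoinNodeReconciler) updateBlockCount(ctx context.Context, node *BitcoinNode, count int64) (ctrl.Result, error) {
	node.Status.BlockCount = count // illustrative status field
	// Writing through the status subresource keeps spec and status separate.
	if err := r.Status().Update(ctx, node); err != nil {
		return ctrl.Result{}, err
	}
	return ctrl.Result{}, nil
}
```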
So this was good incremental progress. At least the operator is interacting directly with the Bitcoin node.
Some notes about testing
As I mentioned on Twitter, a developer tool called
kubefwd was really useful for testing. I could access the Bitcoin node Kubernetes service URL while running the operator on my local machine.
Beyond letting me run the operator locally,
kubefwd also allowed me to programmatically invoke arbitrary operations to support my testing scenarios.
So for example, I could issue a command to my Bitcoin node to generate some blocks in order to test that the new block count is published to the
BitcoinNode CR instance status.
Challenges and next steps
As many of us know, in software the solution to one problem is often the cause of another. In this case, when I added logic to report the block count, I introduced a race condition when creating new
BitcoinNodes: the operator tries to query the RPC server before the Bitcoin pod is fully up and running. In general, I get the impression that Kubernetes operator development is going to involve a lot of thinking about the timing of events and operations. This part isn't ironed out yet, so the implementation of the block count status update remains rough around the edges, though I'm hopeful that I can find some good practices demonstrated in other operator repositories.
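One pattern I'm considering for smoothing over that startup race: treat an unreachable RPC server as a transient condition and requeue the reconcile request, rather than surfacing a hard error. A sketch, reusing the same hypothetical helpers and status field as above:

```go
import (
	"context"
	"time"

	ctrl "sigs.k8s.io/controller-runtime"
)

// reconcileBlockCount sketches the retry idea: if the RPC server is not
// reachable yet, requeue instead of returning an error.
func (r *BitcoinNodeReconciler) reconcileBlockCount(ctx context.Context, node *BitcoinNode, caCert []byte) (ctrl.Result, error) {
	count, err := blockCount(caCert) // hypothetical helper
	if err != nil {
		// The pod is likely still starting; back off and retry shortly.
		return ctrl.Result{RequeueAfter: 15 * time.Second}, nil
	}
	node.Status.BlockCount = count // illustrative status field
	if err := r.Status().Update(ctx, node); err != nil {
		return ctrl.Result{}, err
	}
	return ctrl.Result{}, nil
}
```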