* Fix force-renewing revoked on-demand certs
Follow-up to 9245be5a2f
* One more fix for on-demand logic of revoked certs
* OCSP revocation checks at startup, too
Required significant refactoring, hope it works.
Yet again way too late at night for this...
When I initially wrote the auto-replace feature, it was for the standard mode of operation,
which I presumed the vast majority of CertMagic deployments use. At the time, On-Demand
mode of operation was fairly niche. And at the time, it looked tricky to properly enable this feature for on-demand certificates, so I shelved it considering it would be low-impact anyway.
So on-demand certificates didn't benefit from auto-replace in the case of revocation (oh well,
no other servers / ACME clients do that at all anyway).
I guess since that time, the use of CertMagic's exclusive on-demand feature has risen in
popularity. But there is no way to tell, and I had no real way of knowing whether the
feature sees any significant use, since Caddy has no telemetry. (We used to
have telemetry -- benign, anonymous technical stats to help us understand usage -- but
unfortunately public backlash forced us to end the program.) Based on public feedback
forced by external events, it seems that on-demand TLS deployments are probably rare,
but each of those few deployments actually serves thousands of sites/domains. (The
true importance of this feature would have been clear months ago if Caddy had telemetry,
as Caddy is the primary importer of CertMagic.)
This commit should enable auto-replace for on-demand certificates. It required some
refactoring and some decisions that I'm not *entirely* sure are right, but that's how it
goes.
I haven't tested this. (Last time I worked on this feature it took me about 2 days to test properly.)
I got the following panic while Caddy was running:
2021/10/26 08:06:34 panic: certificate worker: runtime error: invalid memory address or nil pointer dereference
goroutine 43 [running]:
github.com/caddyserver/certmagic.(*jobManager).worker.func1()
github.com/caddyserver/certmagic@v0.14.5/async.go:58 +0x65
panic({0x145d400, 0x23d6c50})
runtime/panic.go:1038 +0x215
github.com/caddyserver/certmagic.decodePrivateKey({0xc000738c00, 0x0, 0x0})
github.com/caddyserver/certmagic@v0.14.5/crypto.go:75 +0x2a
github.com/caddyserver/certmagic.(*Config).reusePrivateKey(0xc0003b77c0, {0xc0003b1640, 0x32})
github.com/caddyserver/certmagic@v0.14.5/config.go:602 +0x2b9
github.com/caddyserver/certmagic.(*Config).obtainCert.func2({0x190d3b8, 0xc000655920})
github.com/caddyserver/certmagic@v0.14.5/config.go:487 +0x1d6
github.com/caddyserver/certmagic.doWithRetry({0x190d310, 0xc0000b0440}, 0xc00003bd40, 0xc0007afba8)
github.com/caddyserver/certmagic@v0.14.5/async.go:106 +0x1cc
github.com/caddyserver/certmagic.(*Config).obtainCert(0xc0003b77c0, {0x190d310, 0xc0000b0440}, {0xc0003b1640, 0x32}, 0x0)
github.com/caddyserver/certmagic@v0.14.5/config.go:572 +0x58e
github.com/caddyserver/certmagic.(*Config).ObtainCertAsync(...)
github.com/caddyserver/certmagic@v0.14.5/config.go:427
github.com/caddyserver/certmagic.(*Config).manageOne.func1()
github.com/caddyserver/certmagic@v0.14.5/config.go:332 +0x6f
github.com/caddyserver/certmagic.(*jobManager).worker(0x23e0c60)
github.com/caddyserver/certmagic@v0.14.5/async.go:73 +0x112
created by github.com/caddyserver/certmagic.(*jobManager).Submit
github.com/caddyserver/certmagic@v0.14.5/async.go:50 +0x288
According to the Go documentation (https://pkg.go.dev/encoding/pem#Decode),
p (the first value returned by pem.Decode) can be nil, so it should be checked
before continuing, as in this example:
https://pkg.go.dev/encoding/pem#example-Decode
I also added a test to verify that the fix works. Running the test
without the fix causes a panic.
Test: go test -count=1 './...'
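For reference, a minimal sketch of the nil check described above. The helper name decodeKeyPEM and the order in which key formats are tried are assumptions made for illustration; this is not the actual decodePrivateKey in crypto.go.

```go
package main

import (
	"crypto"
	"crypto/x509"
	"encoding/pem"
	"errors"
	"fmt"
)

// decodeKeyPEM is a hypothetical helper (not CertMagic's decodePrivateKey).
// It shows the guard that prevents the panic: pem.Decode returns a nil
// block when the input contains no PEM data, so block must be checked
// before block.Bytes or block.Type is touched.
func decodeKeyPEM(keyPEM []byte) (crypto.PrivateKey, error) {
	block, _ := pem.Decode(keyPEM)
	if block == nil {
		return nil, errors.New("no PEM block found in private key data")
	}

	// The order of attempts below is an assumption for this sketch.
	if key, err := x509.ParsePKCS8PrivateKey(block.Bytes); err == nil {
		return key, nil
	}
	if key, err := x509.ParseECPrivateKey(block.Bytes); err == nil {
		return key, nil
	}
	if key, err := x509.ParsePKCS1PrivateKey(block.Bytes); err == nil {
		return key, nil
	}
	return nil, fmt.Errorf("unsupported private key type %q", block.Type)
}

func main() {
	// Empty input (as in the stack trace, where the key bytes had length 0)
	// now yields an error instead of a nil pointer dereference.
	_, err := decodeKeyPEM(nil)
	fmt.Println("error:", err)
}
```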
* Fix TLS-ALPN-01 challenge for IP Identifiers
See #133
* Add tests for challengeKey function
* Add more tests
* Fix PR comments
* Remove deletion of TLS-ALPN-01 challenge certificate
* Begin refactor of ObtainCert and RenewCert to allow force renews
* Don't reuse private key in case of revocation due to key compromise
* Improve logging in renew
* Run OCSP check at start of cache maintenance
Otherwise we wait until the first tick (currently 1 hour), which might be too long
* Fix obtain; move some things around
Obtain now tries to reuse the private key if it exists; if it doesn't exist, that shouldn't be an error (so we clear the error in that case).
Moved the removal of compromised private keys to have logging make more sense.
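A rough sketch of the "missing key is not an error" behavior described above. Everything here is hypothetical: keyLoader stands in for a storage lookup, and fs.ErrNotExist stands in for whatever "not found" error the storage layer actually returns.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
)

// keyLoader is a hypothetical stand-in for loading a key from storage.
type keyLoader func(name string) ([]byte, error)

// reuseKeyIfPresent returns the stored key if there is one. A missing key
// is not treated as a failure: the error is cleared and a nil key is
// returned, so the caller knows to generate a fresh key instead.
func reuseKeyIfPresent(load keyLoader, name string) ([]byte, error) {
	keyPEM, err := load(name)
	if errors.Is(err, fs.ErrNotExist) {
		return nil, nil // nothing to reuse; that's fine
	}
	if err != nil {
		return nil, fmt.Errorf("loading existing private key: %w", err)
	}
	return keyPEM, nil
}

// Example loaders used to exercise the three cases.
func missingKey(string) ([]byte, error)  { return nil, fs.ErrNotExist }
func brokenStore(string) ([]byte, error) { return nil, errors.New("storage unavailable") }
func storedKey(string) ([]byte, error)   { return []byte("KEY"), nil }

func main() {
	for _, load := range []keyLoader{missingKey, brokenStore, storedKey} {
		key, err := reuseKeyIfPresent(load, "example.com")
		fmt.Printf("key=%q err=%v\n", key, err)
	}
}
```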
* feature: add optional !important suffix
if !important is added to any of the resolvers, then all are considered
exclusive and no other fallbacks will be added.
* fix: !important can be on its own
* simplify recursiveNameservers
- use custom OR default nameservers
- add testing
* removed print line
* tests: fixed defaults when resolv.conf is found
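To illustrate the !important semantics from the first two bullets, here is a sketch under assumptions. The function name chooseNameservers, the exact placement of the suffix, and the whitespace handling are all made up for this example; it is not the recursiveNameservers code itself.

```go
package main

import (
	"fmt"
	"strings"
)

// importantSuffix marks the custom resolvers as exclusive.
const importantSuffix = "!important"

// chooseNameservers is a hypothetical helper. If any custom resolver
// carries the !important suffix (or an entry is just "!important" on its
// own), only the custom resolvers are returned. Otherwise the defaults
// are appended as fallbacks.
func chooseNameservers(custom, defaults []string) []string {
	exclusive := false
	var out []string
	for _, r := range custom {
		if strings.HasSuffix(r, importantSuffix) {
			exclusive = true
			r = strings.TrimSuffix(r, importantSuffix)
		}
		// A bare "!important" entry leaves nothing behind; skip it.
		if r = strings.TrimSpace(r); r != "" {
			out = append(out, r)
		}
	}
	if !exclusive {
		out = append(out, defaults...)
	}
	return out
}

func main() {
	defaults := []string{"8.8.8.8:53"}
	fmt.Println(chooseNameservers([]string{"1.1.1.1:53"}, defaults))
	fmt.Println(chooseNameservers([]string{"1.1.1.1:53!important"}, defaults))
	fmt.Println(chooseNameservers([]string{"1.1.1.1:53", "!important"}, defaults))
}
```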
On-demand certs are managed at handshake-time. Doing so in the background was
a temporary holdover until on-demand maintenance improved, which it since has.
Since background maintenance did not consult the "ask" endpoint or decision func,
it would sometimes renew certificates that were not desirable to renew.
See https://caddy.community/t/clean-up-caddy-certificates/11429/11?u=matt
CertMagic has always been useful for TLS servers, with its Cache
type, which enables long-term automation of managed certs.
But there has never been a good way to use CertMagic with
client certificates, which can be automated the same way, but
which are used sporadically and instantaneously, rather than
during the long-running lifetime of a server.
This is a simple addition which provides a lot of value, so that
TLS clients can use CertMagic to automate their certificates.
The ClientCredentials() method returns chains of TLS certs that
are ready to be used in tls.Config structs to enable client auth.
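A small sketch of the consuming side, using only the standard library. It assumes the chains have already been obtained (for example from ClientCredentials(), whose exact signature is not shown here); clientTLSConfig is a hypothetical helper name.

```go
package main

import (
	"crypto/tls"
	"fmt"
)

// clientTLSConfig is a hypothetical helper: it takes certificate chains
// (assumed to come from something like ClientCredentials()) and places
// them in a tls.Config so a TLS client can present them for client auth.
func clientTLSConfig(chains []tls.Certificate) *tls.Config {
	return &tls.Config{
		Certificates: chains,
		MinVersion:   tls.VersionTLS12,
	}
}

func main() {
	// Placeholder chains; a real caller would pass the loaded certificates.
	cfg := clientTLSConfig(make([]tls.Certificate, 1))
	fmt.Println("client certificates configured:", len(cfg.Certificates))
}
```

The resulting config can then be passed to tls.Dial or set on an http.Transport's TLSClientConfig field.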
Turns out this is needed when solving the HTTP challenge in Caddy, in certain situations.
This does not provide access to challenge info in distributed challenge storage (that would require a Config, and isn't exported anyway since it is handled internally).
This allows any challenges initiated within the process to be solved by whatever HTTP or TLS server is running, even if they do not know about the challenges themselves.
This is useful when a process has multiple servers running, but only one can solve the challenges (which is often the case, since a socket belongs to one listener at a time) and they do not know about each other or share configs. The trick is to wrap the solvers with a thin wrapper that stores all the challenge info in memory while the challenge is active.
A nice side-effect is I've simplified/unified the code that gets the challenge info when actually solving the challenges.
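The wrapper idea can be sketched roughly as follows. The Challenge and Solver types here are simplified stand-ins invented for this example (the real solver interface and challenge type differ), and keying the map by identifier is an assumption.

```go
package main

import (
	"fmt"
	"sync"
)

// Challenge and Solver are simplified, hypothetical stand-ins.
type Challenge struct {
	Identifier string
	Token      string
	KeyAuth    string
}

type Solver interface {
	Present(Challenge) error
	CleanUp(Challenge) error
}

// Process-wide record of challenges that are currently active.
var (
	activeMu sync.Mutex
	active   = make(map[string]Challenge)
)

// rememberingSolver wraps another solver and keeps the challenge info in
// memory for as long as the challenge is active.
type rememberingSolver struct{ inner Solver }

func (s rememberingSolver) Present(c Challenge) error {
	activeMu.Lock()
	active[c.Identifier] = c
	activeMu.Unlock()
	return s.inner.Present(c)
}

func (s rememberingSolver) CleanUp(c Challenge) error {
	activeMu.Lock()
	delete(active, c.Identifier)
	activeMu.Unlock()
	return s.inner.CleanUp(c)
}

// lookupChallenge lets any server in the process find an active challenge,
// even if that server was configured without knowledge of it.
func lookupChallenge(identifier string) (Challenge, bool) {
	activeMu.Lock()
	defer activeMu.Unlock()
	c, ok := active[identifier]
	return c, ok
}

// noopSolver does nothing; it only exists to exercise the wrapper.
type noopSolver struct{}

func (noopSolver) Present(Challenge) error { return nil }
func (noopSolver) CleanUp(Challenge) error { return nil }

func main() {
	s := rememberingSolver{inner: noopSolver{}}
	c := Challenge{Identifier: "example.com", Token: "tok", KeyAuth: "tok.thumbprint"}

	_ = s.Present(c)
	_, found := lookupChallenge("example.com")
	fmt.Println("active during challenge:", found)

	_ = s.CleanUp(c)
	_, found = lookupChallenge("example.com")
	fmt.Println("active after cleanup:", found)
}
```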
This makes it possible to use an existing ACME account when you have the private key but not the contact information. This is often the case when the ACME account is created out-of-band of the ACME client.
I suppose * is a valid subject -- technically -- but it probably won't
be accepted by browsers. They usually only accept wildcards
for subdomains.
Related, but only tangentially:
https://github.com/caddyserver/caddy/issues/3977