A couple of reported cases using AWS EFS have had empty lock files. This is unusual, but has now been observed multiple times. Most recent documented case: https://github.com/caddyserver/caddy/issues/3954
We now try to force a sync to the device to see if that helps.
Related to: https://github.com/caddyserver/caddy/issues/3939
Avoids a panic in the event ALL items listed are "terminal" - the linked specific case is surely a bug in the upstream storage implementation, but we shouldn't panic anyway.
Simply specifying a common name may not be enough, like in the case of
Let's Encrypt's new alternate chains, where one chain is a superset of
the other, and the difference is the root.
Should fix https://github.com/caddyserver/caddy/issues/3911
CertMagic does a sanity check before obtaining certs by checking names for invalid characters, and % is not a character that is accepted for IP addresses. I don't actually know how clients do validation on TLS connections to scoped IPs, but presumably we should just strip the zone before applying the IP to certificates server-side.
Also unexport the needlessly-exported NormalizedName function, which is no longer known to be used by any external libraries. (It used to be used by Caddy but we've since better contained the relevant logic within CertMagic.)
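A minimal sketch of the zone stripping, assuming the zone is everything from the first % onward. The function name stripZone is made up for this example and is not the identifier used in CertMagic.

```go
package main

import "strings"

// stripZone is a hypothetical helper that removes a zone identifier
// (for example the "%eth0" in "fe80::1%eth0") from a subject name.
// Names without a '%' are returned unchanged.
func stripZone(name string) string {
	if i := strings.IndexByte(name, '%'); i >= 0 {
		return name[:i]
	}
	return name
}
```

This is a simplification: it does not check that the name is actually an IP address before stripping.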
* Implement multiple issuer support
This change refactors Config.Issuer to be Config.Issuers, an array of
issuers. Each Issuer will be tried in turn until one succeeds. During
retries, each attempt will try each configured Issuer. When loading
certs from storage, CertMagic will look in each Issuer's storage
location for a qualifying asset. If multiple Issuers have one in storage
then the most-recently-issued cert will be selected.
This is a breaking change in that Config now accepts a slice of Issuers
rather than a single Issuer. The Revoker field is removed, as supporting
it is optional anyway. If the Issuer is also a Revoker, it can be used
implicitly to revoke certificates.
Also added a const for ZeroSSL's ACME endpoint.
* Load matching wildcard on-demand from storage
With this change, a config using on-demand TLS can load a certificate
for "sub.example.com" from storage using a matching wildcard cert
(i.e. "*.example.com") if no better matching certificate is available.
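Roughly, the lookup order could be computed like this. Both function names are hypothetical and the real storage lookup is more involved; this only shows how the wildcard candidate is derived from the requested name.

```go
package main

import "strings"

// wildcardFor returns the wildcard name that would cover the given
// name by replacing its left-most label with "*". ok is false when
// the name has no dot or ends with one.
func wildcardFor(name string) (wildcard string, ok bool) {
	i := strings.IndexByte(name, '.')
	if i < 0 || i == len(name)-1 {
		return "", false
	}
	return "*" + name[i:], true
}

// candidateNames lists the names to look for in storage, in order of
// preference: the exact name first, then the matching wildcard.
func candidateNames(name string) []string {
	names := []string{name}
	if w, ok := wildcardFor(name); ok && w != name {
		names = append(names, w)
	}
	return names
}
```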
* Fix distributed solving with tls-alpn challenges
The type assertion in handshake.go was problematic since there's no
guarantee that an ACME issuer would be a concrete ACMEManager type.
Refactored the code to accept IssuerKey values generally, rather than
specific ACMEManager values only.
This fixes solving tls-alpn challenges in distributed settings.
More cleanup can be done, another time.
Significantly, on-demand renew operations no longer block unless the
certificate is already expired. It serves existing certs when possible,
and performs renewals in the background.
Also minor improvements to debug and error logging.
This is necessary to support a nuance in Caddy where we have to see if a
subject qualifies for a public certificate but with custom wildcard
checking. So we separate the wildcard check from other checks.
When checking whether a new DNS TXT record is deployed, as part of the
DNS challenge procedure, checkAuthoritativeNss is called in a loop until
the requested TXT value is found in one of the records, or until a
timeout.
Previously, if there were other DNS TXT records for the same FQDN, the
call to checkAuthoritativeNss failed and the whole DNS challenge was
canceled. This means for example that if there was any previous
_acme-challenge TXT for the domain, the DNS challenge would always fail.
This change fixes the issue by returning "not ready" instead of an
error when that DNS TXT record request returns other values.
Co-authored-by: Matt Holt <mholt@users.noreply.github.com>
Wildcard domain names use the same subdomain for the ACME TXT record as
the non-wildcard parent domain (for example, example.com and
*.example.com both use _acme-challenge.example.com), so the two collide
and we need to solve those challenges mutually exclusively.
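One way to picture the mutual exclusion is a lock keyed by the challenge record name, so that both names end up contending for the same mutex. This is only a sketch under that assumption; challengeFQDN and nameLocks are hypothetical and not how the solver is actually structured.

```go
package main

import (
	"strings"
	"sync"
)

// challengeFQDN returns the TXT record name used for the DNS challenge.
// A wildcard name maps to the same record as its parent domain.
func challengeFQDN(domain string) string {
	return "_acme-challenge." + strings.TrimPrefix(domain, "*.")
}

// nameLocks hands out one mutex per key, creating it on first use.
type nameLocks struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex
}

// lockFor returns the mutex associated with key. Callers solving a
// challenge would Lock it before creating the TXT record and Unlock
// it after cleanup.
func (n *nameLocks) lockFor(key string) *sync.Mutex {
	n.mu.Lock()
	defer n.mu.Unlock()
	if n.locks == nil {
		n.locks = make(map[string]*sync.Mutex)
	}
	l, ok := n.locks[key]
	if !ok {
		l = new(sync.Mutex)
		n.locks[key] = l
	}
	return l
}
```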
One potential problem with this current implementation is that we don't
wait for the DNS record to un-propagate after it is deleted; I've found
that re-running it works fine, after waiting just a few seconds. I am
not sure how to generalize this logic in all cases though. It is likely
provider-dependent. (I was testing with Cloudflare.)
Should fix https://github.com/caddyserver/caddy/issues/3474
If the machine goes to sleep or the process gets suspended, background
maintenance won't happen, so we need to check for expiration of all
managed, on-demand certificates at every handshake. Fortunately, this is
pretty cheap because it's simple date math.
https://caddy.community/t/local-certificates-not-renewing-on-demand/9482
* Minor improvement to DNS request handling
Sometimes incoming UDP traffic on port 53 is blocked to
prevent DDoS attacks. In those cases only TCP will work
for DNS requests, as the UDP request will time out. As
a result, the DNS challenge will fail while the server is
trying to verify whether the challenge has propagated through
the NS.
Now, instead of returning immediately when a UDP request times
out, the request is tried again using TCP.
* Formatting and comment
Co-authored-by: Georg Friedrich <g.friedrich@sonnenwagen.org>
Co-authored-by: Matthew Holt <mholt@users.noreply.github.com>
This is necessary for a downstream requirement where the ACME CA offers
an API key to generate EAB credentials, but each time their API call is
used, new credentials are generated, so we need to be sure to use it
only once (when an account is actually being created). Thus, CertMagic
needs a way to tell the application when the account is actually being
created versus being reused. This allows the application to make an API
call just before account registration and fill the EAB credentials into
the ACMEManager struct.
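As a rough picture of the idea, here is a toy model where a hook runs only when an account does not exist yet. Everything in it is hypothetical: the manager type stands in for ACMEManager, and the NewAccountFunc field name, its signature, and the EAB type are illustrative assumptions rather than the real definitions.

```go
package main

// EAB holds external account binding credentials (illustrative type).
type EAB struct {
	KeyID  string
	MACKey string
}

// manager is a toy stand-in for ACMEManager.
type manager struct {
	ExternalAccount *EAB

	// NewAccountFunc is a hypothetical hook that is called only when a
	// new account is about to be registered, never when an existing
	// account is reused.
	NewAccountFunc func(m *manager) error

	accounts map[string]bool // stand-in for accounts kept in storage
}

// getAccount reuses an existing account if there is one; otherwise it
// runs the hook and then "registers" a new account.
func (m *manager) getAccount(email string) (created bool, err error) {
	if m.accounts[email] {
		return false, nil
	}
	if m.NewAccountFunc != nil {
		if err := m.NewAccountFunc(m); err != nil {
			return false, err
		}
	}
	if m.accounts == nil {
		m.accounts = make(map[string]bool)
	}
	m.accounts[email] = true
	return true, nil
}
```

An application would put its one-time API call for EAB credentials inside the hook, so the call happens once per account instead of once per run.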
Before, when we used lego as our ACME library, DNS solvers abounded in
the lego repository and they could be used directly. Our new acmez lib
is very lightweight and "bring-your-own-solvers", and that applies even
more to DNS provider implementations.
DNS providers are implemented in libdns: https://github.com/libdns
This commit adds an implementation of acmez.Solver that solves the DNS
challenge using libdns providers.
Unlike the other solvers, this one is exported because it is not a
challenge type that is enabled by default, and there is more config
surface.
We borrowed some DNS utility functions and tests from the lego repo.
But this is a very lightweight implementation that has a much, much
simpler API and smaller footprint.
Logging is now configurable through setting the Logging field on the
various relevant struct types. This is a more useful, consistent, and
higher-performing experience with logs than the std lib logger we used
before.
This isn't a 100% complete transition because there are some parts of
the code base that don't have obvious or easy access to a logger.
They are mostly fringe/edge cases though, and most are error logs, so
you shouldn't see them under normal circumstances. They still emit to
the std lib logger, so it's not like any errors get hidden: they are
just unstructured until we find a way to give them access to a logger.
* Lock now takes a context and should honor cancellation
This allows callers to give up if they can't obtain a lock in a certain
timeframe and for resources to be cleaned up, avoiding potential
resource leaks.
Breaking change for any Storage implementations, sorry about that. (It's
why we're not 1.0 yet.) I'll reach out to known implementations; it's a
simple change.
* Rename obtainLock to acquireLock to be less ambiguous
In our package, "obtain" has a more common meaning related to certs