* Initial implementation of ZeroSSL API issuer
Still needs CA support for CommonName-less certs
* Accommodate ZeroSSL CSR requirements; fix DNS prop check
* Fix README example
* Fix comment
These are useful for advanced applications (like Caddy) which would
like to remove certificates from the cache in a controlled way, and
operate the cache with new settings while running.
Eliminates a bajillion nil checks and footguns
(except in tests, which bypass exported APIs, but that is expected)
Most recent #207
Logging can still be disabled via zap.NewNop(), if necessary.
(But disabling logging in CertMagic is a really bad idea.)
OnEvent can now control basic program flow for certain events.
For example, it can cancel a cert_obtaining or cert_renewing operation
before it happens. A slight API change adds a context parameter and
changes the event data to map[string]any.
This is easier to work with in practice and conforms more closely to Caddy's
new event system.
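A minimal sketch of how an event handler could cancel an operation, assuming the handler returns an error to abort. The eventHandler type, the "identifier" data key, the blockedNames list, and the obtain stand-in are hypothetical names for illustration, not the library's confirmed API:

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// eventHandler is a hypothetical callback shape: a context, an event name,
// and a map[string]any carrying event data.
type eventHandler func(ctx context.Context, event string, data map[string]any) error

// blockedNames is a made-up deny list used only for this illustration.
var blockedNames = map[string]bool{"blocked.example.com": true}

// onEvent cancels obtaining or renewing for blocked names by returning an error.
func onEvent(ctx context.Context, event string, data map[string]any) error {
	switch event {
	case "cert_obtaining", "cert_renewing":
		name, _ := data["identifier"].(string) // assumed data key
		if blockedNames[name] {
			return errors.New("operation not allowed for " + name)
		}
	}
	return nil
}

// obtain is a stand-in for library code that emits the event before doing work.
func obtain(ctx context.Context, handler eventHandler, name string) error {
	data := map[string]any{"identifier": name}
	if err := handler(ctx, "cert_obtaining", data); err != nil {
		return fmt.Errorf("obtaining canceled by event handler: %w", err)
	}
	// ... the certificate would be obtained here ...
	return nil
}

// tryObtain runs obtain with a background context and the handler above.
func tryObtain(name string) error {
	return obtain(context.Background(), onEvent, name)
}

func main() {
	fmt.Println(tryObtain("blocked.example.com"))
	fmt.Println(tryObtain("ok.example.com"))
}
```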
* Add context propagation to the Storage interface
Signed-off-by: Dave Henderson <dhenderson@gmail.com>
* Bump to Go 1.17
* Minor cleanup
* filestorage: Honor context cancellation in List()
Co-authored-by: Matthew Holt <mholt@users.noreply.github.com>
This work made possible by Tailscale: https://tailscale.com - thank you to the Tailscale team!
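To illustrate what honoring context cancellation in a List() operation can look like, here is a rough sketch. It uses an in-memory slice of keys instead of a real file walk, and listKeys and demo are hypothetical names, not the actual filestorage code:

```go
package main

import (
	"context"
	"fmt"
	"strings"
)

// listKeys returns the keys that start with prefix. It checks the context
// before each item so a canceled caller does not wait for the whole listing.
func listKeys(ctx context.Context, keys []string, prefix string) ([]string, error) {
	var out []string
	for _, k := range keys {
		if err := ctx.Err(); err != nil {
			return nil, err // context canceled or deadline exceeded
		}
		if strings.HasPrefix(k, prefix) {
			out = append(out, k)
		}
	}
	return out, nil
}

// demo lists a fixed set of keys, optionally canceling the context first.
func demo(cancelFirst bool) ([]string, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	if cancelFirst {
		cancel()
	}
	keys := []string{"certificates/a", "certificates/b", "locks/c"}
	return listKeys(ctx, keys, "certificates/")
}

func main() {
	fmt.Println(demo(false))
	fmt.Println(demo(true))
}
```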
* Implement custom GetCertificate callback
Useful if another entity is managing certificates and can
provide its own dynamically during handshakes.
* Refactor CustomGetCertificate into OnDemandConfig
* Set certs to managed=true
This is only sorta true, but it allows handshake-time maintenance of the
certificates that are cached from CustomGetCertificate.
Our background maintenance routine skips certs that are OnDemand so it
should be fine.
* Change CustomGetCertificate into interface value
Instead of a function
* Case-insensitive subject name comparison
Hostnames are case-insensitive
Also add context to GetCertificate
* Export a couple of outrageously useful functions
* Allow multiple custom certificate getters
Also minor refactoring and enhancements
* Fix tests
* Rename Getter -> Manager; refactor
And don't cache externally managed certs
* Minor updates to comments
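The idea of several external managers providing certificates during handshakes, with case-insensitive name lookup, could be sketched as follows. The certManager interface, mapManager type, and the choice to stop at the first error are assumptions made for this example:

```go
package main

import (
	"context"
	"crypto/tls"
	"fmt"
	"strings"
)

// certManager is a hypothetical interface for an external entity that can
// provide its own certificate during a handshake.
type certManager interface {
	GetCertificate(ctx context.Context, hello *tls.ClientHelloInfo) (*tls.Certificate, error)
}

// mapManager serves certificates from a map keyed by lowercase hostname.
type mapManager struct {
	certs map[string]*tls.Certificate
}

func (m mapManager) GetCertificate(_ context.Context, hello *tls.ClientHelloInfo) (*tls.Certificate, error) {
	// Hostnames are case-insensitive, so normalize before looking up.
	return m.certs[strings.ToLower(hello.ServerName)], nil
}

// firstCertificate asks each manager in order and returns the first
// non-nil certificate. A nil result means no manager had one.
func firstCertificate(ctx context.Context, hello *tls.ClientHelloInfo, managers []certManager) (*tls.Certificate, error) {
	for _, m := range managers {
		cert, err := m.GetCertificate(ctx, hello)
		if err != nil {
			return nil, err
		}
		if cert != nil {
			return cert, nil
		}
	}
	return nil, nil
}

// lookup reports whether any of two sample managers has a cert for serverName.
func lookup(serverName string) bool {
	managers := []certManager{
		mapManager{certs: map[string]*tls.Certificate{}},
		mapManager{certs: map[string]*tls.Certificate{"example.com": {}}},
	}
	hello := &tls.ClientHelloInfo{ServerName: serverName}
	cert, err := firstCertificate(context.Background(), hello, managers)
	return err == nil && cert != nil
}

func main() {
	fmt.Println(lookup("Example.COM"))
	fmt.Println(lookup("other.test"))
}
```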
* Fix force-renewing revoked on-demand certs
Follow-up to 9245be5a2f
* One more fix for on-demand logic of revoked certs
* OCSP revocation checks at startup, too
Required significant refactoring, hope it works.
Yet again way too late at night for this...
* Begin refactor of ObtainCert and RenewCert to allow force renews
* Don't reuse private key in case of revocation due to key compromise
* Improve logging in renew
* Run OCSP check at start of cache maintenance
Otherwise we wait until first tick (currently 1 hour) which might be too long
* Fix obtain; move some things around
Obtain now tries to reuse private key if exists, but if it doesn't exist, that shouldn't be an error (so we clear the error in that case).
Moved the removal of compromised private keys to have logging make more sense.
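The "missing key is not an error" behavior described for Obtain might look roughly like this. The sketch assumes keys live on the local filesystem, which is only one possible storage backend, and loadExistingKey and demo are hypothetical names:

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// loadExistingKey returns the stored key bytes for reuse. If no key exists,
// it returns (nil, nil) so the caller can generate a fresh key instead.
func loadExistingKey(path string) ([]byte, error) {
	keyPEM, err := os.ReadFile(path)
	if errors.Is(err, fs.ErrNotExist) {
		return nil, nil // not an error: clear it and fall back to a new key
	}
	if err != nil {
		return nil, err
	}
	return keyPEM, nil
}

// demo looks for a key in a temp folder, optionally writing one first.
// It returns the number of key bytes found.
func demo(writeKey bool) (int, error) {
	dir, err := os.MkdirTemp("", "keydemo")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)

	keyPath := filepath.Join(dir, "example.com.key")
	if writeKey {
		if err := os.WriteFile(keyPath, []byte("placeholder key"), 0o600); err != nil {
			return 0, err
		}
	}
	key, err := loadExistingKey(keyPath)
	return len(key), err
}

func main() {
	fmt.Println(demo(false))
	fmt.Println(demo(true))
}
```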
I suppose * is a valid subject -- technically -- but it probably won't
be accepted by browsers. They usually only accept wildcards
for subdomains.
Related, but only tangentially:
https://github.com/caddyserver/caddy/issues/3977
* Implement multiple issuer support
This change refactors Config.Issuer to be Config.Issuers, an array of
issuers. Each Issuer will be tried in turn until one succeeds. During
retries, each attempt will try each configured Issuer. When loading
certs from storage, CertMagic will look in each Issuer's storage
location for a qualifying asset. If multiple Issuers have one in storage
then the most-recently-issued cert will be selected.
This is a breaking change in that Config now accepts a slice of Issuers
rather than a single Issuer. The Revoker field is removed, as supporting
it is optional anyway. If the Issuer is also a Revoker, it can be used
implicitly to revoke certificates.
Also added a const for ZeroSSL's ACME endpoint.
* Load matching wildcard on-demand from storage
With this change, a config using on-demand TLS can load a certificate
for "sub.example.com" from storage using a matching wildcard cert
(i.e. "*.example.com") if no better matching certificate is available.
* Fix distributed solving with tls-alpn challenges
The type assertion in handshake.go was problematic since there's no
guarantee that an ACME issuer would be a concrete ACMEManager type.
Refactored the code to accept IssuerKey values generally, rather than
specific ACMEManager values only.
This fixes solving tls-alpn challenges in distributed settings.
More cleanup can be done, another time.
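As a rough sketch of the multiple-issuer behavior described above (each Issuer tried in turn until one succeeds), consider the following. The issuer interface and fakeIssuer type are simplified stand-ins invented for this example and do not mirror the real Issuer interface:

```go
package main

import (
	"errors"
	"fmt"
)

// issuer is a simplified, hypothetical stand-in for a certificate issuer.
type issuer interface {
	Name() string
	Issue(domain string) (string, error)
}

// fakeIssuer either always fails or always succeeds.
type fakeIssuer struct {
	name string
	fail bool
}

func (f fakeIssuer) Name() string { return f.name }

func (f fakeIssuer) Issue(domain string) (string, error) {
	if f.fail {
		return "", errors.New("unavailable")
	}
	return "cert for " + domain + " from " + f.name, nil
}

// issueWithFallback tries each issuer in order and returns the first
// success. If all fail, the returned error lists every failure.
func issueWithFallback(issuers []issuer, domain string) (string, error) {
	var errs []error
	for _, iss := range issuers {
		cert, err := iss.Issue(domain)
		if err == nil {
			return cert, nil
		}
		errs = append(errs, fmt.Errorf("%s: %w", iss.Name(), err))
	}
	return "", fmt.Errorf("all %d issuers failed: %v", len(issuers), errs)
}

func main() {
	issuers := []issuer{
		fakeIssuer{name: "first", fail: true},
		fakeIssuer{name: "second"},
	}
	fmt.Println(issueWithFallback(issuers, "example.com"))
}
```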
This is necessary to support a nuance in Caddy where we have to see if a
subject qualifies for a public certificate but with custom wildcard
checking. So we separate the wildcard check from other checks.
If the machine goes to sleep or the process gets suspended, background
maintenance won't happen, so we need to check for expiration of all
managed, on-demand certificates at every handshake. Fortunately, this is
pretty cheap because it's simple date math.
https://caddy.community/t/local-certificates-not-renewing-on-demand/9482
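The handshake-time check is cheap because it only compares timestamps. A minimal sketch, assuming a fixed renewal window before expiry (the 24-hour window and the function names are illustrative, not the library's defaults):

```go
package main

import (
	"fmt"
	"time"
)

// needsRenewal reports whether a certificate has expired or is within the
// renewal window before its NotAfter time. It is pure date math.
func needsRenewal(notAfter, now time.Time, window time.Duration) bool {
	return now.After(notAfter.Add(-window))
}

// checkAt evaluates needsRenewal at a moment the given number of hours
// before expiry (negative means after expiry), using a 24h window.
func checkAt(hoursBeforeExpiry int) bool {
	notAfter := time.Date(2030, time.January, 1, 0, 0, 0, 0, time.UTC)
	now := notAfter.Add(-time.Duration(hoursBeforeExpiry) * time.Hour)
	return needsRenewal(notAfter, now, 24*time.Hour)
}

func main() {
	fmt.Println(checkAt(48)) // well before the window
	fmt.Println(checkAt(12)) // inside the window
	fmt.Println(checkAt(-1)) // already expired
}
```

Running this check on every handshake covers the case where the machine slept through its scheduled background maintenance.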
Logging is now configurable through setting the Logging field on the
various relevant struct types. This is a more useful, consistent, and
higher-performing experience with logs than the std lib logger we used
before.
This isn't a 100% complete transition because there are some parts of
the code base that don't have obvious or easy access to a logger.
They are mostly fringe/edge cases though, and most are error logs, so
you shouldn't see them under normal circumstances. They still emit to
the std lib logger, so it's not like any errors get hidden: they are
just unstructured until we find a way to give them access to a logger.
This allows two certs (say, RSA and ECDSA) for the same names to be
loaded, and CertMagic will consider which one the client supports and
use that.
We used to extract just select fields from the leaf certificate so that
we didn't need to fill memory with more data than necessary, but in
order to use the stdlib's SupportsCertificate() method, we have to keep
the full tls.Certificate.Leaf field set for speed during handshakes.
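Choosing between several loaded certificates for the same name could be sketched like this, using the standard library's ClientHelloInfo.SupportsCertificate method. The selectCertificate helper is a hypothetical name, and real selection logic likely considers more than the first compatible candidate:

```go
package main

import (
	"crypto/tls"
	"fmt"
)

// selectCertificate returns the first candidate the client supports, as
// judged by the stdlib's SupportsCertificate, or nil if none qualifies.
// Candidates should have their Leaf field populated to avoid re-parsing
// the certificate on every handshake.
func selectCertificate(hello *tls.ClientHelloInfo, candidates []*tls.Certificate) *tls.Certificate {
	for _, cert := range candidates {
		if err := hello.SupportsCertificate(cert); err == nil {
			return cert
		}
	}
	return nil
}

func main() {
	hello := &tls.ClientHelloInfo{ServerName: "example.com"}
	// With no candidates loaded, nothing can be selected.
	fmt.Println(selectCertificate(hello, nil) == nil)
}
```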
CertMagic currently does wildcard matching in two places:
- Cache.AllMatchingCertificates() for finding all certs in cache
- Config.getCertificate() for finding one cert in cache at handshake
But those implementations will not use MatchWildcard() because their
looping logic is slightly customized.
Caddy, however, needs to compare DNS names with wildcards in at
least two places:
- Matching TLS connection policies by ServerName (SNI)
- Matching TLS automation policies by subject names
So this function is a good implementation for that.
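For illustration, a label-by-label wildcard comparison might look like the sketch below. This is a simplified assumption of the semantics (a "*" label matches exactly one label, comparison is case-insensitive), written under a lowercase hypothetical name rather than reproducing the exported function:

```go
package main

import (
	"fmt"
	"strings"
)

// matchWildcard reports whether subject matches wildcard, where each "*"
// label in wildcard matches exactly one label of subject.
func matchWildcard(subject, wildcard string) bool {
	subjLabels := strings.Split(strings.ToLower(subject), ".")
	wildLabels := strings.Split(strings.ToLower(wildcard), ".")
	if len(subjLabels) != len(wildLabels) {
		return false
	}
	for i, wl := range wildLabels {
		if wl != "*" && wl != subjLabels[i] {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(matchWildcard("sub.example.com", "*.example.com"))
	fmt.Println(matchWildcard("example.com", "*.example.com"))
	fmt.Println(matchWildcard("a.b.example.com", "*.example.com"))
}
```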
This allows CertMagic to accommodate certificates with extremely short
lifetimes (new defaults work with cert lifetimes < 24h, but I wouldn't
want to push it < 30m with these defaults).
Breaking changes; thank goodness we're not 1.0 yet 😅 - read on!
This change completely separates ACME-specific code from the rest of the
certificate management process, allowing pluggable sources for certs
that aren't ACME.
Notably, most of Config was spliced into ACMEManager. Similarly, there's
now Default and DefaultACME.
Storage structure had to be reconfigured. Certificates are no longer in
the acme/ subfolder since they can be obtained by ways other than ACME!
Certificates moved to a new certificates/ subfolder. The subfolders in
that folder use the path of the ACME endpoint instead of just the host,
so that also changed. Be aware that unless you move your certs over,
CertMagic will not find them and will attempt to get new ones. That is
usually fine for most users, but for extremely large deployments, you
will want to move them over first.
Old certs path:
acme/acme-staging-v02.api.letsencrypt.org/...
New certs path:
certificates/acme-staging-v02.api.letsencrypt.org-directory/...
That's all for significant storage changes!
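For anyone moving certs over, the mapping between the two layouts shown above can be sketched as follows. The exact sanitization rule (host plus path, slashes replaced with dashes) is an assumption inferred from the example paths, and the function names are hypothetical:

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"strings"
)

// oldCertsPath builds the previous layout: acme/<host>.
func oldCertsPath(directoryURL string) (string, error) {
	u, err := url.Parse(directoryURL)
	if err != nil {
		return "", err
	}
	return path.Join("acme", u.Host), nil
}

// newCertsPath builds the new layout: certificates/<host and path, with
// slashes replaced by dashes>. The replacement rule is an assumption.
func newCertsPath(directoryURL string) (string, error) {
	u, err := url.Parse(directoryURL)
	if err != nil {
		return "", err
	}
	key := strings.Trim(u.Host+u.Path, "/")
	return path.Join("certificates", strings.ReplaceAll(key, "/", "-")), nil
}

func main() {
	const staging = "https://acme-staging-v02.api.letsencrypt.org/directory"
	fmt.Println(oldCertsPath(staging))
	fmt.Println(newCertsPath(staging))
}
```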
But this refactor also vastly improves performance, especially at scale,
and makes CertMagic way more resilient to errors. Retries are done on
the staging endpoint by default, so they won't count against your rate
limit. If your hardware can handle it, I'm now pretty confident that you
can give CertMagic a million domain names and it will gracefully manage
them, as fast as it can within internal and external rate limits, even
in the presence of errors. Errors will of course slow some things down,
but you should be good to go if you're monitoring logs and can fix any
misconfigurations or other external errors!
Several other mostly-minor enhancements fix bugs, especially at scale.
For example, duplicated renewal tasks (that continuously fail) will not
pile up on each other: only one will operate, under exponential backoff.
Closes #50 and fixes #55
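One way to keep duplicate renewal tasks from piling up is to track active job names and retry the single surviving job with exponential backoff. The sketch below shows that idea only; jobManager, backoff, and the one-minute base and one-hour cap are hypothetical choices, not the library's internals:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// jobManager remembers which named jobs are active so that a duplicate
// submission for the same name is rejected instead of queued again.
type jobManager struct {
	mu     sync.Mutex
	active map[string]bool
}

func newJobManager() *jobManager {
	return &jobManager{active: make(map[string]bool)}
}

// start reports whether the caller may run the job; false means a job
// with the same name is already in progress.
func (jm *jobManager) start(name string) bool {
	jm.mu.Lock()
	defer jm.mu.Unlock()
	if jm.active[name] {
		return false
	}
	jm.active[name] = true
	return true
}

// finish marks the job as done so it can be started again later.
func (jm *jobManager) finish(name string) {
	jm.mu.Lock()
	defer jm.mu.Unlock()
	delete(jm.active, name)
}

// backoff doubles the base delay once per prior attempt, capped at max.
func backoff(attempt int, base, max time.Duration) time.Duration {
	d := base
	for i := 0; i < attempt && d < max; i++ {
		d *= 2
	}
	if d > max {
		d = max
	}
	return d
}

// backoffMinutes uses an illustrative 1-minute base and 1-hour cap.
func backoffMinutes(attempt int) int {
	return int(backoff(attempt, time.Minute, time.Hour) / time.Minute)
}

func main() {
	jm := newJobManager()
	fmt.Println(jm.start("renew_example.com"), jm.start("renew_example.com"))
	jm.finish("renew_example.com")
	fmt.Println(backoffMinutes(0), backoffMinutes(3), backoffMinutes(10))
}
```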
This allows for user-loaded certificates to be associated with arbitrary
values such as user-provided IDs or categories. This can be useful if
multiple certificates satisfy a ClientHello but a specific one still
needs to be chosen. See for example:
https://github.com/mholt/caddy/issues/2588
This is a breaking API change since we need to expose a tags parameter
to the caching functions, but we're not 1.0 yet so we will try this
API change and see how it goes.
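Here is a small sketch of how tags could disambiguate between certificates that all satisfy the same ClientHello. The certificate struct and selectByTag function are hypothetical simplifications, not the package's actual types:

```go
package main

import "fmt"

// certificate is a simplified stand-in holding names and user-provided tags.
type certificate struct {
	Names []string
	Tags  []string
}

// hasTag reports whether the certificate carries the given tag.
func (c certificate) hasTag(tag string) bool {
	for _, t := range c.Tags {
		if t == tag {
			return true
		}
	}
	return false
}

// selectByTag returns the first candidate carrying tag. The boolean is
// false if no candidate has it.
func selectByTag(candidates []certificate, tag string) (certificate, bool) {
	for _, c := range candidates {
		if c.hasTag(tag) {
			return c, true
		}
	}
	return certificate{}, false
}

func main() {
	candidates := []certificate{
		{Names: []string{"example.com"}, Tags: []string{"tenant-a"}},
		{Names: []string{"example.com"}, Tags: []string{"tenant-b"}},
	}
	cert, ok := selectByTag(candidates, "tenant-b")
	fmt.Println(cert.Tags, ok)
}
```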
* Significant refactor
This refactoring expands the capabilities of the library for advanced
use cases, as well as improving the overall architecture, including
possible memory leak fixes if used over a long period with many certs
loaded into memory. This refactor enables using different configs
depending on the certificate.
The public API has changed slightly, however, and arguably it is
slightly less convenient/elegant. I have never quite found the perfect
design for this package, and this certainly isn't it, but I think it's
better than what we had before.
There is still work to be done, but this is a good step forward. I've
decoupled Storage from Cache, and made it easier and more correct for
Configs (and Storage values) to be short-lived. Cache is the only value
that should be long-lived.
Note that CertMagic no longer automatically takes care of storage (i.e.
it used to delete old OCSP staples, but now it doesn't). The functions
to do this are still there and even exported, and now we expect the
application to call the cleanup functions when it wants to.
* Fix little oopsies
* Create Manager abstraction so obtain/renew isn't limited to ACME
* Replace TryLock and Wait with Lock, and check for idempotency (issue #5)
* Fix logic of lock waiter creation in FileStorage (+ improve client log)
* Return from Wait() if lock file becomes stale
* Remove racy deletion of empty lock folder
* move all (FileStorage) methods to (*FileStorage) so assignments to fields like fileStorageNameLocks aren't lost
* rework lock acquisition
* Create lockDir just before lock file creation to reduce the chance that another process calls Unlock() and removes lockDir while we were waiting, preventing us from creating the lock file.
* Use the same strategy that Wait() uses to avoid depending on internal state.
* fix unlock of unlocked mutex
* Move fileStorageNameLocksMu into FileStorage struct
* implement new lockfile removal strategy and simplify the lock acquisition loop.
* readme: Add link to full examples
* Rework file lock obtaining and waiting logic
* Remove not-useful optimization to simplify file-locking logic
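The file-locking approach discussed in this list can be sketched in general terms: create the lock folder right before an exclusive lock file creation, and treat an old lock file as stale. This is only an illustration under assumed names (tryLock, isStale, unlock, demoLock), not the FileStorage implementation:

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"time"
)

// tryLock creates the lock folder just before attempting an exclusive
// create of the lock file. It returns false if the lock is already held.
func tryLock(lockPath string) (bool, error) {
	if err := os.MkdirAll(filepath.Dir(lockPath), 0o700); err != nil {
		return false, err
	}
	f, err := os.OpenFile(lockPath, os.O_CREATE|os.O_EXCL|os.O_WRONLY, 0o644)
	if errors.Is(err, fs.ErrExist) {
		return false, nil // someone else holds the lock
	}
	if err != nil {
		return false, err
	}
	return true, f.Close()
}

// isStale reports whether the lock file is missing or older than maxAge,
// in which case a waiter may stop waiting.
func isStale(lockPath string, maxAge time.Duration) (bool, error) {
	info, err := os.Stat(lockPath)
	if errors.Is(err, fs.ErrNotExist) {
		return true, nil
	}
	if err != nil {
		return false, err
	}
	return time.Since(info.ModTime()) > maxAge, nil
}

// unlock releases the lock by removing the lock file.
func unlock(lockPath string) error {
	return os.Remove(lockPath)
}

// demoLock locks twice, unlocks, then locks again in a temp folder.
func demoLock() (first, second, third bool, err error) {
	dir, err := os.MkdirTemp("", "lockdemo")
	if err != nil {
		return false, false, false, err
	}
	defer os.RemoveAll(dir)
	lockPath := filepath.Join(dir, "locks", "example.com.lock")

	if first, err = tryLock(lockPath); err != nil {
		return
	}
	if second, err = tryLock(lockPath); err != nil {
		return
	}
	if err = unlock(lockPath); err != nil {
		return
	}
	third, err = tryLock(lockPath)
	return
}

func main() {
	fmt.Println(demoLock())
}
```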