27 Commits

Author SHA1 Message Date
a
1a2275d54c
fs storage: Use temporary files when writing (#300)
* fix: use a tmp file to flush new certs to disk

* add readme
2024-08-04 13:37:03 -06:00
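The temporary-file approach named in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: writeFileAtomic and roundTrip are hypothetical names, and the ".tmp-*" suffix pattern is an assumption. The idea is to write to a temporary file in the same directory, sync it, and rename it over the destination so a reader never sees a half-written file.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeFileAtomic is a hypothetical helper, not CertMagic's actual code.
// It writes data to a temporary file in the same directory as path, then
// renames it over path so readers never observe a partially written file.
func writeFileAtomic(path string, data []byte) error {
	tmp, err := os.CreateTemp(filepath.Dir(path), filepath.Base(path)+".tmp-*")
	if err != nil {
		return err
	}
	tmpName := tmp.Name()
	// Best-effort cleanup; after a successful rename the temp name no
	// longer exists and the returned error is deliberately ignored.
	defer os.Remove(tmpName)

	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	// Flush to the storage device before the rename makes the file visible.
	if err := tmp.Sync(); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmpName, path)
}

// roundTrip writes data atomically into a fresh temp directory and reads
// it back, returning what was read.
func roundTrip(data string) (string, error) {
	dir, err := os.MkdirTemp("", "atomic-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "example.crt")
	if err := writeFileAtomic(path, []byte(data)); err != nil {
		return "", err
	}
	got, err := os.ReadFile(path)
	return string(got), err
}

func main() {
	got, err := roundTrip("certificate bytes")
	fmt.Println(got, err)
}
```

Creating the temporary file in the destination directory keeps the rename on a single file system, which is what lets the rename replace the old file in one step.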
Goksan
6cb1f8262d
filestorage: Use RemoveAll() to delete directories (#282)
According to the godoc
2024-04-15 08:52:29 -06:00
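For context on the difference between the two standard-library calls, here is a small sketch (removeBoth is a hypothetical name, and the directory layout is made up for illustration). os.Remove refuses to delete a directory that still has entries, while os.RemoveAll deletes the path together with any children.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// removeBoth creates a non-empty directory, then tries os.Remove followed
// by os.RemoveAll, and reports whether each call succeeded.
func removeBoth() (removeOK, removeAllOK bool, err error) {
	root, err := os.MkdirTemp("", "removeall-demo")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(root)

	dir := filepath.Join(root, "locks")
	if err := os.Mkdir(dir, 0o700); err != nil {
		return false, false, err
	}
	if err := os.WriteFile(filepath.Join(dir, "a.lock"), []byte("x"), 0o600); err != nil {
		return false, false, err
	}

	removeOK = os.Remove(dir) == nil       // fails: the directory is not empty
	removeAllOK = os.RemoveAll(dir) == nil // succeeds: children are removed too
	return removeOK, removeAllOK, nil
}

func main() {
	fmt.Println(removeBoth())
}
```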
Matthew Holt
0bc747093f
Tune empty-lock retry mechanism (issue #232) 2023-06-16 10:19:46 -06:00
Matthew Holt
321ed64912
Wait and retry if lockfile is empty (fix #232) 2023-06-15 15:32:25 -06:00
Matthew Holt
79babffe28
Treat empty lockfiles as stale
Had this happen when testing something in Caddy. A crash at startup left
a lockfile created but empty.
(This was not a production crash, just dev.)

Empty lockfiles have been reported before. I think we should
treat them as stale.
It's not perfect but it's best-effort.
2022-09-29 10:11:32 -06:00
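A best-effort staleness check along the lines this commit describes might look like the sketch below. Everything in it is assumed for illustration: the lockMeta payload, the staleAfter window, and the choice to treat unparseable contents like empty ones are not taken from the commit.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// lockMeta is a hypothetical lockfile payload; the real format may differ.
type lockMeta struct {
	Updated time.Time `json:"updated"`
}

// staleAfter is an assumed freshness window chosen for this sketch.
const staleAfter = 10 * time.Second

// lockIsStale reports whether lockfile contents should be treated as stale.
// Empty contents count as stale (best effort), as do unparseable contents
// (an assumption of this sketch) and timestamps older than staleAfter.
func lockIsStale(contents []byte, now time.Time) bool {
	if len(contents) == 0 {
		return true
	}
	var meta lockMeta
	if err := json.Unmarshal(contents, &meta); err != nil {
		return true
	}
	return now.Sub(meta.Updated) > staleAfter
}

// staleDemo checks an empty lockfile and a freshly written one.
func staleDemo() (emptyStale, freshStale bool) {
	now := time.Now()
	fresh, err := json.Marshal(lockMeta{Updated: now})
	if err != nil {
		return false, true
	}
	return lockIsStale(nil, now), lockIsStale(fresh, now)
}

func main() {
	fmt.Println(staleDemo())
}
```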
Matthew Holt
46a4436693
Clarify storage documentation (close #196) 2022-08-08 10:44:46 -06:00
Dave Henderson
9a56fcd4f9
Propagate context in the Storage interface methods (#155)
* Add context propagation to the Storage interface

Signed-off-by: Dave Henderson <dhenderson@gmail.com>

* Bump to Go 1.17

* Minor cleanup

* filestorage: Honor context cancellation in List()

Co-authored-by: Matthew Holt <mholt@users.noreply.github.com>
2022-03-07 12:26:52 -07:00
Matt Holt
2d114193c3
storage: Require fs.ErrNotExist (fix #168) (#170)
Also stop using the deprecated io/ioutil package.
Update dependencies.
Update Go version in go.mod.
2022-03-07 11:11:20 -07:00
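From a caller's point of view, requiring fs.ErrNotExist means a missing key can be detected with errors.Is regardless of the storage backend. A minimal sketch, where load is a hypothetical stand-in for a file-backed storage method rather than the library's real one:

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// load is a hypothetical stand-in for a file-backed storage Load method.
// os.ReadFile already returns an error that satisfies
// errors.Is(err, fs.ErrNotExist) when the file is missing.
func load(path string) ([]byte, error) {
	return os.ReadFile(path)
}

// missingIsNotExist loads a key that was never stored and reports whether
// the resulting error is recognizable as fs.ErrNotExist.
func missingIsNotExist() bool {
	dir, err := os.MkdirTemp("", "notexist-demo")
	if err != nil {
		return false
	}
	defer os.RemoveAll(dir)

	_, err = load(filepath.Join(dir, "no-such-key"))
	return errors.Is(err, fs.ErrNotExist)
}

func main() {
	fmt.Println(missingIsNotExist())
}
```

A backend that is not file-based would need to return or wrap fs.ErrNotExist itself so the same errors.Is check keeps working.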
Matthew Holt
b6b3db32bc
ci: Update Go versions (Go 1.16 required)
We now use the io/fs package.
2022-01-05 17:47:33 -07:00
Matthew Holt
468bfd25e4
Avoid infinite loop in rare cases with stale locks
Should fix caddyserver/caddy#4448

Weaker mutual exclusion guarantees, but probably the better alternative
2022-01-05 17:32:21 -07:00
Matthew Holt
7eaf4e7a41
Sync writes to storage device
A couple cases reported using AWS EFS have empty lock files. This is unusual, but has now been observed multiple times. Most recent documented case: https://github.com/caddyserver/caddy/issues/3954

We now try to force a sync to the device to see if that helps.
2021-01-04 12:14:41 -07:00
Matthew Holt
2b98009606
Improve on-demand logic, logging, error handling
Significantly, on-demand renew operations no longer block unless the
certificate is already expired. It serves existing certs when possible,
and performs renewals in the background.

Also minor improvements to debug and error logging.
2020-11-12 13:12:07 -07:00
Matt Holt
82040fdb58
Lock now takes a context and should honor cancellation (#66)
* Lock now takes a context and should honor cancellation

This allows callers to give up if they can't obtain a lock in a certain
timeframe and for resources to be cleaned up, avoiding potential
resource leaks.

Breaking change for any Storage implementations, sorry about that. (It's
why we're not 1.0 yet.) I'll reach out to known implementations; it's a
simple change.

* Rename obtainLock to acquireLock to be less ambiguous

In our package, "obtain" has a more common meaning related to certs
2020-05-27 15:05:53 -06:00
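The behavior described here, a lock call that gives up when its context is cancelled, could be sketched as follows. The function signature, the polling interval, and the use of exclusive file creation are assumptions made for this illustration and are not the library's implementation.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// acquireLock is a hypothetical sketch: it polls until it can exclusively
// create lockfile, or returns ctx.Err() once ctx is cancelled or expires.
func acquireLock(ctx context.Context, lockfile string, pollInterval time.Duration) error {
	for {
		f, err := os.OpenFile(lockfile, os.O_CREATE|os.O_EXCL|os.O_WRONLY, 0o644)
		if err == nil {
			return f.Close() // lock acquired
		}
		if !errors.Is(err, os.ErrExist) {
			return err // unexpected error, not just "someone else holds it"
		}
		select {
		case <-time.After(pollInterval):
			// try again
		case <-ctx.Done():
			return ctx.Err()
		}
	}
}

// lockDemo acquires a lock, then tries to acquire it again with a short
// timeout. The second attempt should give up with a deadline error.
func lockDemo() (acquired, gaveUp bool) {
	dir, err := os.MkdirTemp("", "lock-demo")
	if err != nil {
		return false, false
	}
	defer os.RemoveAll(dir)
	lockfile := filepath.Join(dir, "example.lock")

	first := acquireLock(context.Background(), lockfile, 10*time.Millisecond)

	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()
	second := acquireLock(ctx, lockfile, 10*time.Millisecond)

	return first == nil, errors.Is(second, context.DeadlineExceeded)
}

func main() {
	fmt.Println(lockDemo())
}
```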
Matthew Holt
5ed364019b
Add nil check; recover from all goroutines 2020-05-12 09:28:56 -06:00
Matthew Holt
b9edcb838b
mholt/certmagic -> caddyserver/certmagic
And update dependencies
2020-03-06 18:05:05 -07:00
Matthew Holt
37e754b40c
Major refactor to improve performance, correctness, and extensibility
Breaking changes; thank goodness we're not 1.0 yet 😅 - read on!

This change completely separates ACME-specific code from the rest of the
certificate management process, allowing pluggable sources for certs
that aren't ACME.

Notably, most of Config was spliced into ACMEManager. Similarly, there's
now Default and DefaultACME.

Storage structure had to be reconfigured. Certificates are no longer in
the acme/ subfolder since they can be obtained by ways other than ACME!
Certificates moved to a new certificates/ subfolder. The subfolders in
that folder use the path of the ACME endpoint instead of just the host,
so that also changed. Be aware that unless you move your certs over,
CertMagic will not find them and will attempt to get new ones. That is
usually fine for most users, but for extremely large deployments, you
will want to move them over first.

Old certs path:
  acme/acme-staging-v02.api.letsencrypt.org/...

New certs path:
  certificates/acme-staging-v02.api.letsencrypt.org-directory/...

That's all for significant storage changes!

But this refactor also vastly improves performance, especially at scale,
and makes CertMagic way more resilient to errors. Retries are done on
the staging endpoint by default, so they won't count against your rate
limit. If your hardware can handle it, I'm now pretty confident that you
can give CertMagic a million domain names and it will gracefully manage
them, as fast as it can within internal and external rate limits, even
in the presence of errors. Errors will of course slow some things down,
but you should be good to go if you're monitoring logs and can fix any
misconfigurations or other external errors!

Several other mostly-minor enhancements fix bugs, especially at scale.
For example, duplicated renewal tasks (that continuously fail) will not
pile up on each other: only one will operate, under exponential backoff.

Closes #50 and fixes #55
2020-02-21 14:32:57 -07:00
Matthew Holt
adb47e0d77
Keep file locks fresh by updating them at regular intervals
This allows much longer-lived locks and much shorter expiry times, so
if the process is force-closed, the lock becomes available in a matter
of seconds instead of hours. This also means locks can be accurately
acquired for hours without having to guess how long before a lock will
be stale.

Cost: one small goroutine per active lock. The goroutine may live a
little longer than the actual lock since its termination is
polling-based.
2020-02-11 12:24:38 -07:00
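One way to picture the "one small goroutine per active lock" cost is the sketch below. It is illustrative only: keepLockFresh is a hypothetical name, the timestamp-as-text payload is made up, and termination here uses a channel, whereas the commit says the real goroutine's termination is polling-based.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"time"
)

// keepLockFresh rewrites lockfile with the current time on every tick
// until done is closed. It is a hypothetical sketch of the technique.
func keepLockFresh(lockfile string, interval time.Duration, done <-chan struct{}) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case now := <-ticker.C:
			stamp := strconv.FormatInt(now.UnixNano(), 10)
			_ = os.WriteFile(lockfile, []byte(stamp), 0o644) // best effort
		case <-done:
			return
		}
	}
}

// freshDemo starts the refresher, waits a little, stops it, and reports
// whether the lockfile contents were updated in the meantime.
func freshDemo() (refreshed bool, err error) {
	dir, err := os.MkdirTemp("", "fresh-demo")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(dir)

	lockfile := filepath.Join(dir, "example.lock")
	if err := os.WriteFile(lockfile, []byte("0"), 0o644); err != nil {
		return false, err
	}

	done := make(chan struct{})
	exited := make(chan struct{})
	go func() {
		defer close(exited)
		keepLockFresh(lockfile, 5*time.Millisecond, done)
	}()

	time.Sleep(100 * time.Millisecond)
	close(done)
	<-exited // wait so the read below cannot race with a write

	contents, err := os.ReadFile(lockfile)
	if err != nil {
		return false, err
	}
	return string(contents) != "0", nil
}

func main() {
	fmt.Println(freshDemo())
}
```

Because the timestamp is refreshed continuously, other processes can use a short expiry window and still avoid stealing a lock that is actively held.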
Matthew Holt
6666db6352
Update rate limits
I've decided that the purpose of the internal rate limiter is not to
enforce the CA's rate limits, which only the CA can really do properly.
Instead, they are to avoid hammering the CA endpoint with excessive
requests.
2019-12-17 09:26:54 -07:00
securityclippy
8261565d73
exported KeyBuilder.Safe to use with external pkgs (#13)
* exported KeyBuilder.Safe to use with external pkgs

* fixing comment for Safe function
2019-01-06 20:59:41 -07:00
Matt Holt
a3b276a1b4
storage: Replace TryLock and Wait with Lock; simplify FileStorage
* Replace TryLock and Wait with Lock, and check for idempotency (issue #5)

* Fix logic of lock waiter creation in FileStorage (+ improve client log)

* Return from Wait() if lock file becomes stale

* Remove racy deletion of empty lock folder

* move all (FileStorage) methods to (*FileStorage) so assignments to fields like fileStorageNameLocks aren't lost

* rework lock acquisition

* Create lockDir just before lock file creation to reduce the chance that another process calls Unlock() and removes lockDir while we were waiting, preventing us from creating the lock file.
* Use the same strategy that Wait() uses to avoid depending on internal state.

* fix unlock of unlocked mutex

* Move fileStorageNameLocksMu into FileStorage struct

* implement new lockfile removal strategy and simplify the lock acquisition loop.

* readme: Add link to full examples

* Rework file lock obtaining and waiting logic

* Remove not-useful optimization to simplify file-locking logic
2018-12-19 14:25:11 -07:00
Matthew Holt
fe722057f2
UnlockAllObtained() method; FileStorage handles stale locks
It's still pretty early (day 2!) of the library so I'm OK with adding
a necessary method that removes locks that would become stale.

Also handle stale locks in the FileStorage implementation of Storage.
2018-12-13 07:01:13 -07:00
Matthew Holt
318e24ccb2
Print which cache is doing maintenance in log entries 2018-12-12 15:43:34 -07:00
Matthew Holt
b2a67f0504
FileStorage: Fix List(); modify Storage interface (fixes #4)
Adding a recursive option to List(), which, if true, causes List to
act like a walk function.

Also differentiating between "terminal" keys and "non-terminal" in
KeyInfo, since sometimes directories are useful, like listing user
accounts.
2018-12-12 14:47:46 -07:00
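The recursive option described in this commit can be illustrated with a small sketch built on filepath.WalkDir. The list function, its signature, and the sample keys are hypothetical and do not reproduce the Storage interface.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// list is a hypothetical sketch. It returns keys under dir relative to dir.
// With recursive set it walks the whole subtree; otherwise it returns only
// the immediate children and does not descend into directories.
func list(dir string, recursive bool) ([]string, error) {
	var keys []string
	err := filepath.WalkDir(dir, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if path == dir {
			return nil // skip the root itself
		}
		rel, err := filepath.Rel(dir, path)
		if err != nil {
			return err
		}
		keys = append(keys, filepath.ToSlash(rel))
		if d.IsDir() && !recursive {
			return fs.SkipDir // list the directory but not its contents
		}
		return nil
	})
	return keys, err
}

// listDemo builds a tiny tree and lists it both ways.
func listDemo() (flat, deep string, err error) {
	root, err := os.MkdirTemp("", "list-demo")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(root)

	if err := os.Mkdir(filepath.Join(root, "accounts"), 0o700); err != nil {
		return "", "", err
	}
	for _, name := range []string{"accounts/alice.json", "cert.pem"} {
		if err := os.WriteFile(filepath.Join(root, filepath.FromSlash(name)), nil, 0o600); err != nil {
			return "", "", err
		}
	}

	flatKeys, err := list(root, false)
	if err != nil {
		return "", "", err
	}
	deepKeys, err := list(root, true)
	if err != nil {
		return "", "", err
	}
	return strings.Join(flatKeys, ","), strings.Join(deepKeys, ","), nil
}

func main() {
	fmt.Println(listDemo())
}
```

In this sketch the "accounts" entry corresponds to what the commit calls a non-terminal key, while "cert.pem" is a terminal one.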
Matthew Holt
5b3085c491
Export methods to build storage keys and prefixes
Also adjust clients so that they use the configured HTTPPort or
HTTPSPort for solving challenges, if different from the default
challenge port (not as preferred as the Alt*Port values, of course)
2018-12-11 15:48:47 -07:00
Matthew Holt
d2f9fba738
Combine Locker interface into Storage; improve docs 2018-12-11 11:46:55 -07:00
Matthew Holt
4dd0c62355
filestorage: Fix little bug in Exists check
Key must be converted to a filename
2018-12-10 19:24:07 -07:00
Matthew Holt
bea13a36c8
Initial commit 2018-12-09 20:15:26 -07:00