Apache config errors when configtest shows no errors
Recently I ran into an issue where a configtest with apachectl -t reported "Syntax OK", but when I reloaded Apache everything went down.
It turned out to be an issue where the SSL cert for one of the sites was invalid.
For context, I'd changed the config to use just the bundle and key, rather than a bundle plus a separate crt and key. The old setup put a duplicate crt in the chain, since the bundle already contains it; that still works, but it's best to fix. In making the change I exposed an issue with an old SSL crt file.
The reload command initially didn't show an issue but I spotted the sites were down, along with a flood of alerts.
The main Apache error.log file only reported that there was a config issue. Each site/vhost has its own log files in our setup, so grep came to the rescue to track down which one was reporting new errors in its error log.
I spotted one specific site having issues each time I restarted apache to bring it back up after the change:
"AH02565: Certificate and private key site.com:443:0 from /path/to/ssl.bundle and /path/to/ssl.key do not match"
That'd do it.
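For anyone who would rather script that search than grep by hand, here is a minimal Python sketch. It assumes the per-vhost logs sit in one directory and have names ending in "error.log"; the function name and the directory used in the usage note are hypothetical, so adjust them to your own layout.

```python
import re
from pathlib import Path

# AH02565 is the code Apache logged for the cert/key mismatch in this incident.
PATTERN = re.compile(r"AH02565")


def find_mismatch_logs(log_dir, pattern=PATTERN):
    """Return {log file name: [matching lines]} for logs that contain the pattern.

    Assumption: every per-vhost error log lives directly in log_dir and its
    name ends in "error.log".
    """
    hits = {}
    for log_file in sorted(Path(log_dir).glob("*error.log")):
        # errors="replace" keeps odd bytes in a log from aborting the scan.
        with log_file.open(errors="replace") as fh:
            lines = [line.rstrip("\n") for line in fh if pattern.search(line)]
        if lines:
            hits[log_file.name] = lines
    return hits
```

Calling find_mismatch_logs("/var/log/apache2") (a guess at a typical path, not necessarily yours) returns only the logs that mention the mismatch, which narrows things down to the affected vhost quickly.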
It seems the old config had hidden this: it was slightly misconfigured, duplicating the crt in the chain, but it still worked. The site was also behind Cloudflare, so it's hard to know if that further hid the issue.
Like many sites we use automatic renewals, but it transpired this certificate's dates were several years old from a prior server migration and the cert hadn't been set up for the new server to auto renew.
As they had Cloudflare in front, the site was serving fine over HTTPS, and Apache didn't seem to mind either, with no errors in the log.
Backups are always key and I was able to bring sites back initially in minutes by rolling back the config files, giving me time to debug what had happened.
It's a tricky issue to prevent: the config change had been tested on several sites before it was rolled out to all of them. The backups worked well, and the per site/vhost error logging was also useful for debugging here.
We have automated checks that the SSL certs for sites are valid, but I plan to rework this to also check the raw files on the server itself and ensure the certificate and key files do match up.
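One way such an on-disk check could look, as a sketch rather than what we actually run: Python's standard ssl module refuses to load a certificate chain whose private key doesn't match, so trying to load the pair is enough to detect the mismatch. This is a swap-in for the openssl command line, not the same tool. The function name is hypothetical, and it assumes the key is not passphrase-protected and that the site's own certificate comes first in the bundle.

```python
import ssl


def cert_matches_key(cert_path, key_path):
    """Return True if the certificate (or bundle) and private key load as a pair.

    Assumptions: the key is unencrypted (an encrypted key may prompt for a
    passphrase) and the leaf certificate is the first one in cert_path.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    try:
        # Raises ssl.SSLError if the key does not match the certificate,
        # and OSError if either file is missing or unreadable.
        ctx.load_cert_chain(certfile=cert_path, keyfile=key_path)
    except OSError:  # ssl.SSLError is a subclass of OSError
        return False
    return True
```

This only proves the two files belong together. It says nothing about expiry, so a date check would still need to sit alongside it.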
Ultimately I'm writing this up to help my future self and anyone else that may run into a similar issue.
If Apache's configtest (-t) reports that the files are OK but Apache then fails to restart due to a config issue, a likely culprit, or at least something to check, is the certs. Of course, always have backups ready beforehand so you can roll the config changes back, check the error logs, and use openssl to verify the certs are correct, or force-renew any that the error log reports as failing.
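To know which files to check before a reload, the certificate and key paths can be pulled out of the vhost configs. The sketch below is a simplified illustration, not a full Apache config parser: it assumes one SSLCertificateFile and one SSLCertificateKeyFile per VirtualHost block, with paths that contain no spaces. The function name is hypothetical.

```python
import re

# Apache directive names are case-insensitive; paths may be wrapped in quotes.
DIRECTIVE = re.compile(
    r'^\s*(SSLCertificateFile|SSLCertificateKeyFile)\s+"?([^"\s]+)"?\s*$',
    re.IGNORECASE,
)
VHOST_OPEN = re.compile(r"^\s*<VirtualHost\s+([^>]+)>", re.IGNORECASE)
VHOST_CLOSE = re.compile(r"^\s*</VirtualHost>", re.IGNORECASE)


def ssl_pairs(config_text):
    """Return a list of (vhost address, cert path, key path) tuples."""
    pairs = []
    vhost, cert, key = None, None, None
    for line in config_text.splitlines():
        opened = VHOST_OPEN.match(line)
        if opened:
            vhost, cert, key = opened.group(1).strip(), None, None
            continue
        if VHOST_CLOSE.match(line):
            # Only report vhosts that define both halves of the pair.
            if cert and key:
                pairs.append((vhost, cert, key))
            vhost, cert, key = None, None, None
            continue
        found = DIRECTIVE.match(line)
        if found and vhost is not None:
            name, path = found.group(1).lower(), found.group(2)
            if name == "sslcertificatefile":
                cert = path
            else:
                key = path
    return pairs
```

Each pair returned can then be verified with whatever check you prefer before reloading, since configtest alone did not catch the mismatch in this case.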
Hope this helps someone else as well.