tlsa binary fails with certificate error
Hello list,
Not sure this is the right place to post, maybe I'd better mail the maintainer of the package, but you might have encountered the same issue.
I've always published TLSA records for my domains/subdomains, using an automated cron job that invokes the tlsa script (provided by the hash-slinger package on my Fedora machines).
For a few weeks now, the tlsa script has been failing with the following error message:
Could not verify local certificate: no start line.
Traceback (most recent call last):
  File "/usr/bin/tlsa", line 889, in <module>
    genRecords(args.host, args.protocol, args.port, chain, args.output, args.usage, args.selector, args.mtype)
NameError: name 'chain' is not defined
I'm using LetsEncrypt for my certificates, and I can't see what changed recently. I'm running the tlsa script against a concatenated (intermediate + domain certificate) PEM file, and it has always worked just fine.
During my investigations, I found that an "openssl verify" will fail on the file, saying "unable to get local issuer certificate". I have no way to tell if this has always failed, or if this is new behavior.
I'd be glad to hear if you have any thoughts about my issue.
Thanks!
Hoggins!
On May 13, 2018, at 4:55 AM, Hoggins! fuckspam@wheres5.com wrote:
Not sure this is the right place to post, maybe I'd better mail the maintainer of the package, but you might have encountered the same issue.
This list is a reasonable place to share experiences with the use of DANE-related tools. Though you may have trouble receiving email from this list, until your TLSA records are correct. The list MTA may be doing DANE-validation, and your TLSA records have been incorrect since ~28/Apr/2018. [This reply is also Bcc'd to you directly, just in case].
For a few weeks now, the tlsa script has been failing with the following error message:
Could not verify local certificate: no start line.
This suggests that the file it is trying to use is either not a PEM file, or contains no certificate. A PEM certificate is enclosed between two lines of the form:
    -----BEGIN CERTIFICATE-----
    -----END CERTIFICATE-----
Double check that the file exists and is well-formed.
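A quick way to run that check from a script is to count the certificate blocks in the file. The sketch below is a minimal illustration using only the Python standard library; the function name `pem_certificates` is hypothetical, and it only checks the PEM armor and the base64 payload, not whether the decoded bytes form a valid certificate.

```python
import base64
import re

# Matches one PEM certificate block and captures its base64 body.
PEM_CERT_RE = re.compile(
    r"-----BEGIN CERTIFICATE-----\s*(.*?)\s*-----END CERTIFICATE-----",
    re.DOTALL,
)

def pem_certificates(text):
    """Return the decoded (DER) bytes of each certificate block found in text."""
    blocks = []
    for match in PEM_CERT_RE.finditer(text):
        # Join the wrapped base64 lines before decoding.
        body = "".join(match.group(1).split())
        blocks.append(base64.b64decode(body, validate=True))
    return blocks
```

For a concatenated intermediate + domain certificate file, the returned list should have two entries; an empty list means no start line was found at all.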
I'm using LetsEncrypt for my certificates, and I can't see what changed recently. I'm running the tlsa script against a concatenated (intermediate + domain certificate) PEM file, and it has always worked just fine.
This should be OK, but you're not posting the syntax you're using, and not showing the file permissions, what id the cron job is running as, ...
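One way to collect those details from inside the cron job itself is sketched below. This is only an illustration with the standard library on a Unix-like system; the function name `describe_access` and the example path are hypothetical.

```python
import os
import stat

def describe_access(path):
    """Report which user is running and whether the file at path can be read."""
    info = {"uid": os.getuid(), "exists": os.path.exists(path)}
    if info["exists"]:
        st = os.stat(path)
        info["mode"] = stat.filemode(st.st_mode)      # e.g. '-rw-r--r--'
        info["owner_uid"] = st.st_uid
        info["size"] = st.st_size
        info["readable"] = os.access(path, os.R_OK)   # checked with the real uid/gid
    return info

# Hypothetical path; replace with the PEM file passed to tlsa.
print(describe_access("/path/to/chain.pem"))
```

Logging this from the cron job and from an interactive shell makes it easy to compare the two environments.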
During my investigations, I found that an "openssl verify" will fail on the file, saying "unable to get local issuer certificate".
To verify a chain file, try:
    $ chainfile=chain.pem   # adjust appropriately
    $ rootCA=root.pem       # adjust appropriately
    $ openssl verify -untrusted $chainfile -CAfile $rootCA $chainfile
The "-untrusted ..." option makes the intermediate certificates in the chain file available for verification, and of course the root CA needs to be locally available. Mind you, for DANE-EE(3) you really don't need to verify your certificate, so this is likely a distraction.
I have no way to tell if this has always failed, or if this is new behavior.
Your TLSA records used to be correct until ~28/Apr/2018
I'd be glad to hear if you have any thoughts about my issue.
Please review the slides (and, if you wish, the audio) from the ICANN61 talk, which recommends a more robust key rotation approach. Stick with "3 1 1" rather than "3 1 2": in DNS, smaller replies work better, and SHA2-256 is plenty secure.
http://imrryr.org/~viktor/ICANN61-viktor.pdf
http://imrryr.org/~viktor/icann61-viktor.mp3
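To make the size difference concrete, here is a minimal sketch that formats the record data for both matching types. The function name `tlsa_rdata` and the placeholder input are hypothetical; a real record would hash the DER-encoded SubjectPublicKeyInfo extracted from the certificate, which this sketch does not do.

```python
import hashlib

def tlsa_rdata(spki_der, mtype):
    """Format DANE-EE(3) SPKI(1) TLSA data for matching type 1 (SHA2-256) or 2 (SHA2-512)."""
    digests = {1: hashlib.sha256, 2: hashlib.sha512}
    digest = digests[mtype](spki_der).hexdigest()
    return "3 1 %d %s" % (mtype, digest)

# Placeholder bytes standing in for a real DER-encoded SubjectPublicKeyInfo.
spki = b"hypothetical SubjectPublicKeyInfo bytes"
rdata_256 = tlsa_rdata(spki, 1)
rdata_512 = tlsa_rdata(spki, 2)
print(len(rdata_256), len(rdata_512))  # prints: 70 134
```

The SHA2-512 digest is 32 bytes longer on the wire, which adds up when several records are published side by side during a key rollover.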
Also see the listing for your domains at:
https://github.com/danefail/list/blob/master/dane_fail_list.dat
Please do read the DANE misconfiguration notices sent when the DANE survey finds problems; you were notified on 28/Apr, 30/Apr and 05/May.
Hello Viktor,
Thank you for your answers. I've performed all the checks I could: the certificates are well formed ("openssl x509 -in cert.pem -text -noout" succeeds). I'm starting to think there may be a problem with my Python version. At least I'm pretty sure that the issue comes from a system update on both machines I use for generating the signatures: on an old system, I can successfully generate the records with the exact same files.
I'm really not a Python guy, so I tried some ugly debugging with "print" placed here and there to check what was going on. Here are the differences between a "working" and a "non-working" system (both using Python 2.7, although the binaries don't have the same checksums), inside the routine getLocalChain():
-- non-working system: Fedora 28, python2-libs-2.7.15-1.fc28.x86_64
<snip>
    while True:
        cptr = m2.x509_read_pem(bio._ptr())
        if not cptr:
            break
        chain.append(X509.X509(cptr, _pyfree=1))
        print chain
    if not chain:
<snip>
If I put the "print chain" inside the while loop, I get the correct chain array (one pass with only one item, and second pass with two items, output is as expected).
*BUT*
<snip>
    while True:
        cptr = m2.x509_read_pem(bio._ptr())
        if not cptr:
            break
        chain.append(X509.X509(cptr, _pyfree=1))
    print chain
    if not chain:
<snip>
If the "print chain" is placed after the loop, *it does not print anything*, so the script will eventually complain about "chain" not being defined.
-- working system: Fedora 24, python-libs-2.7.13-2.fc24.x86_64
With this system and Python version, both tests work: whether the print is inside the loop or after it, I get something in the "chain" variable.
So that makes me look at the Python version, but I could be wrong. I'm open to discussion.
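One thing the print placement alone cannot tell apart is whether the loop ends through its "break" or through an exception raised by the read call. In both cases a print inside the loop shows the growing list, but only a normal exit reaches a print placed after it. The "no start line" message printed just before the NameError would be consistent with the second case, though that is only a guess. The sketch below illustrates the difference with a stand-in reader: `make_reader`, `load_chain` and `EndOfInput` are hypothetical names, not the M2Crypto API, and whether the Fedora 28 build really raises at end of input is an assumption that is not confirmed here.

```python
class EndOfInput(Exception):
    """Stand-in for an error raised when no further PEM block is found."""

def make_reader(items, raise_at_end):
    """Build a hypothetical read_next() that returns items, then signals end of input."""
    pending = list(items)

    def read_next():
        if pending:
            return pending.pop(0)
        if raise_at_end:
            raise EndOfInput("no start line")
        return None  # falsy sentinel, as the original loop expects

    return read_next

def load_chain(read_next):
    """Collect items until the reader signals end of input, whichever way it does so."""
    chain = []
    while True:
        try:
            item = read_next()
        except EndOfInput:
            break  # treat the error as a normal end of input
        if not item:
            break
        chain.append(item)
    return chain

print(load_chain(make_reader(["leaf", "intermediate"], raise_at_end=False)))
print(load_chain(make_reader(["leaf", "intermediate"], raise_at_end=True)))
```

Both calls return the same two-element list, because the loop tolerates either way of signalling the end of the file.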
And as you were saying, Viktor, it is confirmed here: since my servers have bad DANE signatures, communicating on this list is quite complicated for me.
Thank you!
Hoggins!
On 13/05/2018 at 18:10, Viktor Dukhovni wrote:
On May 22, 2018, at 5:05 AM, Hoggins! fuckspam@wheres5.com wrote:
I think I see the bug:
-- non working system : Fedora 28, python2-libs-2.7.15-1.fc28.x86_64
<snip>
    while True:
        cptr = m2.x509_read_pem(bio._ptr())
        if not cptr:
            break
        chain.append(X509.X509(cptr, _pyfree=1))
You're telling Python it owns the certificate object reference and should free it when no longer needed. Then add the certificate to the chain, but this call may not bump the certificate reference count.
print chain
Here you print the chain. And the certificate itself goes out of scope and is freed, the chain no longer holds a valid reference.
If I put the "print chain" inside the while loop, I get the correct chain array (one pass with only one item, and second pass with two items, output is as expected).
*BUT*
<snip>
    while True:
        cptr = m2.x509_read_pem(bio._ptr())
        if not cptr:
            break
        chain.append(X509.X509(cptr, _pyfree=1))
    print chain
    if not chain:
<snip>
If the "print chain" is placed after the loop, *it does not print anything*, so the script will eventually complain about "chain" not being defined.
I think all the certificates are freed leaving no valid references in the chain.
In any case, you should fix your TLSA records to be correct first, and then fix the script... Perhaps "_pyfree = 0" would work better. If the script does not run forever, but is just a cron job, freeing memory just slows it down...
Hello Viktor,
I have published the correct TLSA records (generated with my "old" system) this morning, they are fixed for now.
On 22/05/2018 at 16:11, Viktor Dukhovni wrote:
On May 22, 2018, at 5:05 AM, Hoggins! fuckspam@wheres5.com wrote:
I think I see the bug:
-- non working system : Fedora 28, python2-libs-2.7.15-1.fc28.x86_64
<snip>
    while True:
        cptr = m2.x509_read_pem(bio._ptr())
        if not cptr:
            break
        chain.append(X509.X509(cptr, _pyfree=1))
You're telling Python it owns the certificate object reference and should free it when no longer needed. Then add the certificate to the chain, but this call may not bump the certificate reference count.
print chain
Here you print the chain. And the certificate itself goes out of scope and is freed, the chain no longer holds a valid reference.
Actually when inside the loop, chain is not empty, it's only outside of it that it seems to be freed.
In any case, you should fix your TLSA records to be correct first, and then fix the script... Perhaps "_pyfree = 0" would work better. If the script does not run forever, but is just a cron job, freeing memory just slows it down...
Changing _pyfree=1 to _pyfree=0 did not help, unfortunately.
On May 22, 2018, at 10:39 AM, Hoggins! fuckspam@wheres5.com wrote:
Hello Viktor,
I have published the correct TLSA records (generated with my "old" system) this morning, they are fixed for now.
Yes, I see that too. I've removed your domains from:
https://github.com/danefail/list
On 22/05/2018 at 16:11, Viktor Dukhovni wrote:
On May 22, 2018, at 5:05 AM, Hoggins! fuckspam@wheres5.com wrote:
I think I see the bug:
-- non working system : Fedora 28, python2-libs-2.7.15-1.fc28.x86_64
<snip>
    while True:
        cptr = m2.x509_read_pem(bio._ptr())
        if not cptr:
            break
        chain.append(X509.X509(cptr, _pyfree=1))
You're telling Python it owns the certificate object reference and should free it when no longer needed. Then add the certificate to the chain, but this call may not bump the certificate reference count.
print chain
Here you print the chain. And the certificate itself goes out of scope and is freed, the chain no longer holds a valid reference.
Actually when inside the loop, chain is not empty, it's only outside of it that it seems to be freed.
Yes, perhaps because the certificate object is still in scope. What happens if you load all the certificates into a list in the loop, and build the chain from the list outside the loop? Then the array would still reference the certificates.
If we get too deep into Python, we'll be too far off topic, but for now, we're still vaguely talking about certificate management...
On 22/05/2018 at 16:49, Viktor Dukhovni wrote:
On May 22, 2018, at 10:39 AM, Hoggins! fuckspam@wheres5.com wrote:
Hello Viktor,
I have published the correct TLSA records (generated with my "old" system) this morning, they are fixed for now.
Yes, I see that too. I've removed your domains from:
Thank you
Anyway, I raised an issue on their GitHub; the maintainers should be able to have a look: https://github.com/letoams/hash-slinger/issues/20
Hoggins!