Troy Hunt: Home Assistant, Pwned Passwords and Security Misconceptions
Home Assistant, Pwned Passwords and Security Misconceptions
11 MARCH 2021
Two of my favourite things these days are Have I Been Pwned and Home Assistant. The former is an obvious choice, the latter I've come to love as I've embarked on my home automation journey. So, it was with great pleasure that I saw the two integrated recently:
Awesome! Pwned Passwords is a repository of 613M passwords exposed in previous data breaches, which makes them very poor choices for future use. They're totally free and they have a really cool anonymity API that ensures no useful information about the password being searched for is ever exposed. Because it's free, works well and does something genuinely useful, it's become quite popular:
That's not including all the queries against the freely downloadable data either so really, I have no idea how much it's used. A lot. However, not everyone is happy with Home Assistant's decision to steer people away from bad passwords:
I read through this thread earlier today and some of it is fair, namely that it should be configurable. I'm all for forcing customers of a web application to avoid known bad passwords because the account takeovers directly impact the service operator, but I think there's a valid argument to be made that people running their own internal service should be able to shoot themselves in the foot if they so please (Home Assistant typically runs on a device within the home). And soon, they'll be able to do precisely that:
But let's talk about all the misconceptions in that thread as it relates to passwords in general, Pwned Passwords in particular and use within Home Assistant because some of these comments are way off the mark. Let's begin:
Your Home Network Should Not be a "Trusted Environment"
We'll start here because other headings logically follow. There were a number of responses along these lines:
The assertion is that somehow, the home network is "trusted" and you can take shortcuts. Just yesterday I was listening to the Darknet Diaries episode about how the LinkedIn breach was orchestrated by pivoting through a developer's home network where he had a weak SSH password on his Mac which was successfully brute forced (sidenote: this podcast rocks!). Then there are all the occasions where hackers end up controlling devices in the home network, again due to password reuse. Just access to a webcam with no ability to pivot from there? The Mirai botnet taught us how far vulnerable IoT devices can be pushed and let's face it, those of us running Home Assistant are putting a lot of IoT stuff in the network that creates some level of risk; we just don't know how much risk.
This is why Zero Trust has become such a buzzword in recent years. I've got a whole section about this in Part 3 of my IoT series so go and read the details over there. The guts of it is to secure the devices on your network as though none of them can be trusted because sooner or later, that assumption is probably going to be true.
Software Should Guide People Down the "Path of Success"
Software products like Home Assistant end up running in a huge number of environments with all sorts of security postures and are maintained by people with enormously varying backgrounds and expertise. The challenge is to establish a baseline of security posture around things like acceptable passwords such that it doesn't get in the way of good security decisions whilst still providing useful protection for those who don't know any better.
Imagine Home Assistant did no checking whatsoever on passwords - you're going to end up with troves of "password123" and "qwerty". This is unquestionably true because I see it all the time in data breaches where there haven't been controls put on passwords. Poor decisions on password security lead to compromises and inevitably, Home Assistant took the view that they needed to save people from themselves. The fact that so many people chimed in on this thread after finding they were using weak passwords merely serves to validate that view.
And we are talking about facts here as well, not "opinions". Pwned Passwords was born out of NIST's guidance which was based on huge volumes of experience around the way people create and manage secrets, specifically the following:
When processing requests to establish and change memorized secrets, verifiers SHALL compare the prospective secrets against a list that contains values known to be commonly-used, expected, or compromised. For example, the list MAY include, but is not limited to: Passwords obtained from previous breach corpuses.
That's from my post almost 4 years ago now on Authentication Evolved which was the catalyst for Pwned Passwords. It's an opinion that the home network is somehow immune from account takeover, and it's wrong.
Understanding k-Anonymity
A misconception I deal with all the time is that somehow, k-anonymity puts passwords at risk:
Ironically, this flips the discussion from being about "it's in my home network therefore it's safe anyway" to "how dare you put my password at risk"! Equally, Home Assistant sends outbound requests all the time; it's how you know when there's an update! You could argue that this doesn't meet the "third party" criteria but the premise of sending outbound requests in the background still holds true.
I explained the k-anonymity implementation in detail in the launch blog post but in short, it only ever sends the first 5 characters of a SHA-1 hash of the password. The reason this service is approaching a billion hits a month is because this approach poses no practical risk whatsoever to the password. The closest anyone has ever come to demonstrating a workable risk is that someone who can observe the HTTPS request could potentially derive the hash prefix alone based on packet size and we addressed that with padding a year ago.
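To make the mechanics concrete, here is a minimal Python sketch of a k-anonymity lookup. It is not Home Assistant's code; the function names and the User-Agent value are made up for illustration. The endpoint is the public Pwned Passwords range API and the `Add-Padding` header requests the padded responses mentioned above. The point to notice is that only `prefix` ever goes over the wire; the comparison against the remaining 35 characters happens locally.

```python
import hashlib
import urllib.request

RANGE_URL = "https://api.pwnedpasswords.com/range/"  # public range endpoint


def split_hash(password: str) -> tuple:
    """Return (prefix, suffix): the first 5 hex characters of the
    SHA-1 hash and the remaining 35."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_response(suffix: str, body: str) -> int:
    """Look for our suffix in a range response of 'SUFFIX:COUNT' lines.
    Returns 0 when the suffix is absent (or is a zero-count padding row)."""
    for line in body.splitlines():
        candidate, _, count = line.partition(":")
        if candidate.strip().upper() == suffix:
            return int(count)
    return 0


def pwned_count(password: str) -> int:
    """Query the range API; only the 5-character prefix leaves the machine."""
    prefix, suffix = split_hash(password)
    request = urllib.request.Request(
        RANGE_URL + prefix,
        # Hypothetical User-Agent value, purely for this sketch.
        headers={"Add-Padding": "true", "User-Agent": "k-anonymity-sketch"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        body = response.read().decode("utf-8")
    return count_in_response(suffix, body)
```

Calling `pwned_count("some candidate password")` returns how many times that password appears in the corpus, or 0 if it was not found. The service only ever sees a 5-character prefix shared by every other password whose hash starts the same way.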
Firstly, "they" is just me (HIBP is a one-person community-centric project I alone run). Secondly, per the explanation above you never send your password so no, I don't know what it is. And finally, no requests are ever logged beyond very short-term transient data as part of the underlying platforms it runs on so still no. But that last point is just my personal guarantee, which is why there's k-anonymity which means even if I'm lying, the request data is still useless.
I appreciate Thomas' attempt here, but his maths is off. My guess is that he's looking at the number of results in a query; for example, there are 621 hash suffixes returned by a query to the prefix of EB6E2. What this means is that I've seen 621 passwords that, when hashed with SHA-1, begin with that prefix. Now, maybe the password you're searching for is one of those 621. But maybe it's one that hashes down to a different suffix and with a SHA-1 hash being 40 hex characters, of which only the first 5 are sent, that's 16^35 possible suffixes it could have. And then there are hash collisions; the SHA-1 character space can represent 16^40 different values, but there are way more possible passwords than that (essentially infinity), which means there are multiple passwords that create the same hash.
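The numbers are easy to sanity check. The sketch below is plain arithmetic on figures already quoted in this post (a 5-character prefix, a 40-character hash, 613M passwords) and assumes hashes are spread roughly evenly across prefixes, which is what you'd expect from SHA-1 output. The variable names are purely illustrative.

```python
PREFIX_CHARS = 5           # hex characters sent to the service
HASH_CHARS = 40            # hex characters in a full SHA-1 hash
CORPUS_SIZE = 613_000_000  # passwords in the corpus, per this post

possible_prefixes = 16 ** PREFIX_CHARS                 # 1,048,576
possible_suffixes = 16 ** (HASH_CHARS - PREFIX_CHARS)  # 16^35
average_known_per_prefix = CORPUS_SIZE / possible_prefixes

print(f"{possible_prefixes:,} possible prefixes")
print(f"about {average_known_per_prefix:.0f} known hashes per prefix on average")
print(f"{possible_suffixes:.2e} possible suffixes behind each prefix")
```

That works out to roughly 585 known hashes per prefix on average, which is in the same ballpark as the 621 returned for EB6E2, set against 16^35 suffixes the searched password could actually have.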
Unless Home Assistant is sending a UA string to identify it (and there's no requirement from my side to do so), I have no idea which client is sending the request. Also, it's not thousands of requests per day, it's tens of millions.
By all means, argue about the philosophy of sending outbound requests, but don't argue about the security or privacy unless there's a valid basis for it.
But Someone Else Must Have Used the Bad Password
Over the years of running Pwned Passwords, I've seen a lot of this:
Think through what this means and let's say, for example, their password was "Baseball123!" which, by most password complexity criteria is just fine; uppercase, lowercase, number and symbol, plus it's 12 characters long. But it's in Pwned Passwords 57 times so would be rejected by Home Assistant's implementation. Now, let's say Hellis81 has never been in a data breach and as such, there's never been an association between their email address and this password. Does that make the password ok? No, it's still a terrible password because if other people are using the same one to this extent, it's obviously way too predictable which means there's no way it was created with anything resembling good password practice.
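To illustrate why complexity rules alone aren't enough, here's a small sketch of a typical composition check. The rule set and the `meets_complexity` name are assumptions made up for this example, not any particular product's policy.

```python
import string


def meets_complexity(password: str, min_length: int = 8) -> bool:
    """A typical composition rule: minimum length plus at least one
    uppercase letter, lowercase letter, digit and symbol."""
    return (
        len(password) >= min_length
        and any(c.isupper() for c in password)
        and any(c.islower() for c in password)
        and any(c.isdigit() for c in password)
        and any(c in string.punctuation for c in password)
    )


# Passes every rule, yet it has been seen in breaches many times over.
print(meets_complexity("Baseball123!"))
```

"Baseball123!" sails through a check like this, which is exactly why comparing against a list of breached passwords catches what composition rules miss.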
The first 2 sentences contradict each other! And just for reference, people never* (with an infinitesimally small rounding error) have a 100-character password and very rarely even have 20 characters. But you can have 20 characters and it can be a much stronger password than one that's 30 characters simply because the former is randomly generated and the latter is a predictable pattern that's been previously leaked. Assumptions about password strength purely based on a metric like length lead people to make bad choices.
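As a rough sketch of what "randomly generated" means in practice, the snippet below builds a 20-character password with Python's `secrets` module and estimates its entropy. The alphabet choice and function names are assumptions for illustration; a password manager does the equivalent for you.

```python
import math
import secrets
import string

# 52 letters + 10 digits + 32 symbols = 94 printable characters
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length: int = 20) -> str:
    """Pick each character independently from a cryptographically secure source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def entropy_bits(length: int, alphabet_size: int = len(ALPHABET)) -> float:
    """Entropy of a uniformly random password: length * log2(alphabet size)."""
    return length * math.log2(alphabet_size)


print(generate_password())
print(f"{entropy_bits(20):.0f} bits of entropy")
```

The entropy figure only holds because every character is chosen at random. A 30-character password built from a predictable pattern has no such guarantee, and once it has leaked its length counts for nothing.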
An Open Source Home Automation System is Not a Car
I know, shocking right? Somehow this thread speared off into lots of comparing guidance on password strength to warnings in cars:
I've written before about how IRL analogies are terrible and this one is no exception. You will not die if you use a weak password. There aren't government regulations defining how the software is built. You can be any age to operate it. Home Assistant is free. And so on and so forth.
Cars warn you about a number of unsafe decisions and so does Home Assistant, but that's where the similarities end. Everything else can be adequately discussed by simply talking about the technology rather than trying to find things IRL to compare to.
Pwned Passwords Queries are Tiny and Massively Fast
This is another implementation misunderstanding in terms of the overhead of Pwned Passwords:
It's easy to establish how much "unneeded data" is transferred just by looking at the request pattern, and you can do that directly via the Pwned Passwords interface in the browser because it hits the same API as every other client. Here's what I see:
This request was 13.4kB at 31ms. Remember, that's searching through 613M records too. You can see why it's so fast by looking at the response headers:
This request was returned directly from the cache at Cloudflare's edge node in Brisbane, about 70km away from me. They almost certainly have an edge node within 10ms latency of you (that's a long-stated goal of theirs), and almost every request comes from their cache (always in the high 90%+ range). It's also brotli encoded which is a super efficient compression algorithm and if Home Assistant doesn't use this encoding, gzip would likely only be a few kB different at most.
I don't know what cadence Home Assistant makes these requests with, but one would assume it's infrequent (perhaps on start or config change) and a request like this would never be noticed in terms of the impact on a metered connection. We're living in an era where the average webpage is about 150 times larger than this request; you're never going to notice a Pwned Passwords query.
If This Feature Sucks, Your Password Sucks
Another thing I regularly see independently of the whole Home Assistant situation is people talking about how great their password is even when it's in Pwned Passwords:
If it's in Pwned Passwords, I've seen it in plain text. If I've seen it in plain text, hackers have seen it in plain text. It doesn't matter how many letters and numbers and symbols you've got in your password, if it's in Pwned Passwords then it's floating around the web where plenty of other people have access to it. The guidance from NIST I quoted earlier said not to use "passwords obtained from previous breach corpuses" - there is no caveat that says "unless you think it's a really good one"!
But Security is Hard
Frankly, I despair when the answer for doing something you know is poor security posture boils down to something as simple as taking a little bit of time out to do it right:
I've already addressed the internal network fallacy so let's move past that. If the barrier to doing security right is a unit of time measured in "over an hour", stay up a bit later one night. Or set the alarm a bit earlier (I got up at 05:00 today to get on top of a few things... before my time went writing this blog post!) Or just do it right in the first place! Parking security because it requires a small time investment is almost certainly going to come back to bite you sooner or later.
There's an Easy Fix
If you're the sort of person that's sufficiently skilled to be standing up your own home automation server, you're well and truly the sort of person who shouldn't ever have a problem creating strong passwords:
If you have a password manager already then you know what you're doing and this will be a breeze (you're probably also not one of the ones complaining about this feature!) If you don't have one already, my blog post from a decade ago on the only secure password being the one you can't remember has well and truly stood the test of time. Read it, get a password manager and stop worrying.
Finally, Let People Shoot Themselves in the Foot if They so Desire, But Not Getting Shot is Even Better
Pascal has already indicated this will be configurable and as I said earlier, I agree. With the benefit of hindsight, I suspect he'd do things differently in the first place, even though the intention was good.
I'm admittedly curious as to how an isolated network with no internet connection is checking the Pwned Passwords service, but I do agree with the comments that people should be free to be dicks and deal with the consequences. What I'd like to encourage people to do, however, is to take this as an opportunity to get a password manager (I've used 1Password for years and full disclosure, I'm on their advisory board) and strengthen all passwords that require it.
If this feature is bugging you, you almost certainly have a bigger problem. Turn it off as a last resort and even then, use weak passwords only where absolutely necessary. I've never had to think twice about it in Home Assistant as I've done this from the outset. It's easy, just do it so you can get back to dealing with the myriad of other problems home automation brings with it.