Mystery Database Exposed Info on 80 Million US Households
Researchers Locate an Unprotected 24 GB Database With Names, Addresses and Incomes
A mysterious, unsecured database hosted on Microsoft's cloud platform contained personal information on nearly 80 million U.S. households, according to two researchers who found it.
In a blog post published Monday, Noam Rotem and Ran Locar, self-described security researchers and hacktivists, describe finding the exposed 24 GB database, which included residents' full names, ages, marital status, income brackets and other details.
A Microsoft spokesperson told Information Security Media Group late Monday that the owner of the database had been notified and that the database was no longer publicly accessible via the internet. Microsoft declined to reveal the owner, and on Tuesday, Rotem told ISMG that the researchers still have not determined who the owner is.
It's not clear if anyone exploited the exposed data or tried to download or steal it. It's also not clear how long the data had been exposed to the internet.
Rotem and Locar note in the blog that while they examined the data, they did not download the information because that would raise ethical questions about possessing personal data without permission.
To put in perspective how much data these files contained, there were approximately 127 million households in the U.S. in 2017, according to the website Statista. So this particular database apparently contained information on more than 62 percent of U.S. households.
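The share can be checked with a few lines of arithmetic. This is only a sketch using the two rounded figures quoted above; the variable names are illustrative, and because the researchers say "nearly 80 million," the true share would be slightly lower than the result.

```python
# Rounded figures quoted in the article; variable names are illustrative.
exposed_households = 80_000_000    # "nearly 80 million," per the researchers
total_us_households = 127_000_000  # approximate 2017 total attributed to Statista

# Fraction of U.S. households represented in the exposed database.
share = exposed_households / total_us_households
print(f"Share of U.S. households in the database: {share:.1%}")
```

With these rounded inputs, the result falls just under 63 percent, consistent with the "more than 62 percent" figure.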
A household consists of all the people who occupy a housing unit, according to the U.S. Census Bureau. So it's not clear how many individuals' data could have been in the exposed database.
Rotem and Locar are working on a large-scale web mapping project, using port scanning techniques to examine known IP blocks and addresses. During this project, they have found weaknesses and data leaks in numerous files and systems that are stored in the cloud and exposed to the public-facing internet, according to their blog post, published Monday on VPNMentor.
In most cases, the researchers have been able to identify the owner of an exposed database and make contact before disclosing the leak. In the case of the database with information on nearly 80 million U.S. households, however, Rotem and Locar could not identify the owner. They noted only that it was hosted on Microsoft's Azure cloud service, according to their blog.
"Unlike previous leaks we've discovered, this time, we have no idea who this database belongs to," according to the post. "It's hosted on a cloud server, which means the IP address associated with it is not necessarily connected to its owner."
While providers such as Microsoft Azure and Amazon Web Services supply and secure the underlying cloud infrastructure, they do not secure the individual databases and files that their customers store there. That part of the security equation is the responsibility of the customer.
The two researchers note in the blog that the database appears to be connected to some type of "service" because the words "member_code" and "score" appear in every entry. They also note that they could not find anyone listed under the age of 40 in the data.
"This made us suspect that the database is owned by an insurance, healthcare or mortgage company," according to the blog post. "However, information one may expect to find in a database owned by brokers or banks is missing. For example, there are no policy or account numbers, Social Security numbers or payment types."
One reason for reporting this exposed database, the researchers say, is to attempt to identify the owner via crowdsourcing.
The database the two researchers found contains full addresses, including street addresses, cities, counties, states and ZIP codes; exact longitude and latitude; full names, including first, last and middle initial; ages; and dates of birth. Other data was encoded for internal use, but the two researchers determined that it covered title, gender, marital status, income, homeowner status and dwelling type.
If they gained access to this wealth of data, cybercriminals could use it to commit various kinds of identity theft.
Rotem and Locar also warn that a criminal gang planning a ransomware attack could use this data to learn how much income a victim has, which could help it determine how big a ransom to demand.
"The major issue here is the scope and detail of the data," Rotem tells ISMG. "It can be used for marketing, of course, but it can also be used for malicious purposes. If you know the location, income level and age of the person, you can target rich elderly people for a wide range of intentions. The fact that the data was accessible freely over the internet is very problematic."
If someone had accessed the database, it's possible that all this personal information could have been sold on the dark web to other cybercriminals, says Timur Kovalev, chief technology officer at Untangle, a networking and security firm based in San Jose, California.
"The level of detail this database has on 80 million households should have all consumers and businesses worried," Kovalev says. "That information could not only be scraped and used for identity theft, but it could also be sold."
The organization that owns the database is likely to face questions from states that have started adding data privacy and protection laws to protect consumers, Kovalev warns. These laws include the California Consumer Privacy Act, which mirrors some of the protections offered by the European Union's General Data Protection Regulation and allows residents to request the information that companies collect on them (see: California's New Privacy Law: It's Almost GDPR in the US).
Before Monday's disclosure, Rotem had published other research with VPNMentor concerning a Chinese e-commerce site called Gearbest that owned an unsecured database that exposed 1.5 million customer records, including payment information, email addresses and other personal data (see: Gearbest Database Leaks 1.5 Million Customer Records).
And earlier this month, security researchers with UpGuard found that two third-party Facebook application developers had exposed users' personal information by storing it without password protection in unsecured Amazon Web Services S3 buckets. In that case, more than 500 million records were exposed (see: Millions of Facebook Records Found Unsecured on AWS).