COMB: The Big Password Leak

Paper by Felipe Daragon and Syhunt Icy Team. April 26, 2021

Our Analysis

Following customer and media requests, we have now analyzed COMB21, the biggest known compilation of password leaks, published on Feb 2, 2021 by a hacker on the same Internet forum that last month hosted links and information about the mega leak of Brazilian data.

We concluded that the leak not only exposes current and past passwords, but also gives insight into key password elements and patterns, and into the reuse and changing habits of individuals and organizations from all around the world, in a dangerous and unprecedented way. In many cases, between 3 and 30 passwords linked to a unique email were exposed, which gives insight into a person's password changing habits. And when a password repeats with an identical username at multiple domains, someone with a password reuse habit is exposed.
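The two habits described above can be illustrated with a short sketch. It assumes, purely for illustration, that each record is a line of the form email:password; the function name summarize_habits and the sample records are hypothetical and are not taken from the COMB archive or its scripts.

```python
from collections import defaultdict


def summarize_habits(lines):
    """Group 'email:password' lines and surface the two habits described above.

    Returns (passwords_per_email, reused) where:
      - passwords_per_email maps each email to the set of passwords seen for it
      - reused maps (username, password) to the set of domains where that pair
        appears, keeping only pairs seen at more than one domain
    """
    passwords_per_email = defaultdict(set)
    domains_per_pair = defaultdict(set)

    for line in lines:
        # Split on the first colon only, since passwords may contain colons.
        email, sep, password = line.rstrip("\n").partition(":")
        if not sep or "@" not in email:
            continue  # skip malformed lines
        email = email.lower()
        username, _, domain = email.rpartition("@")
        passwords_per_email[email].add(password)
        domains_per_pair[(username, password)].add(domain)

    reused = {pair: doms for pair, doms in domains_per_pair.items() if len(doms) > 1}
    return dict(passwords_per_email), reused


# Synthetic records, invented for illustration only.
sample = [
    "alice@example.com:Summer2019",
    "alice@example.com:Summer2020",
    "alice@example.com:Summer2021",
    "bob@example.org:hunter2",
    "bob@example.net:hunter2",
]

per_email, reused = summarize_habits(sample)
print(len(per_email["alice@example.com"]))  # 3 passwords tied to one email
print(sorted(reused[("bob", "hunter2")]))   # ['example.net', 'example.org']
```

In this synthetic sample, the first email shows a changing habit (one password pattern updated yearly), while the second username shows a reuse habit (the same password at two domains).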

A staggering total of 3.28 billion passwords were exposed, linked to 2.18 billion unique emails, compiled into a single file and published through a link on the forum. This time the leak was published in full for free, and it is being actively shared among hackers and cybercriminals in the form of a single 7zip compressed archive.

The Leak in Numbers

Total of Passwords Exposed: 3.28 billion (3,279,064,312)
Total of Unique Emails in Leak: 2.18 billion
Total of Domains in Leak: 26 million
Uncompressed Password Database Size: 100GB
Compressed Password Database Size: 18.6GB (7Z)
Total of World Gov Passwords Exposed: 1.5 million (1,502,909)
Total of US Gov Passwords Exposed: 625,505

This compilation of leaks contains more than twice as many unique email and password pairs as the Breach Compilation from 2017, which exposed 1.4 billion credentials. It includes the script named count_total.sh, just like the 2017 compilation, and adds two new scripts: query.sh, for querying emails, and sorter.sh, for sorting the password leak data.

How World Government Is Affected

During our analysis, we concluded that the COMB leak includes millions of passwords linked to emails from government domains, which poses a major threat to government entities around the globe. The COMB leak may be exploited not only by hackers and cybercriminals, but also by hostile foreign actors.

Gov Email Passwords In The Leak (By Country) - Top 50 Countries

Country: Total of Exposed Passwords

United States of America (*.gov): 625,505
United Kingdom (*.gov.uk): 205,099
Australia (*.gov.au): 136,025
Brazil (*.gov.br): 68,535
Canada (*.gc.ca): 50,726
South Africa (*.gov.za): 48,838
Mexico (*.gob.mx): 31,995
France (*.gouv.fr): 24,002
China (*.gov.cn): 18,282
South Korea (*.go.kr): 17,560
Taiwan (*.gov.tw): 17,007
Argentina (*.gov.ar): 15,604
New Zealand (*.govt.nz): 15,488
Malaysia (*.gov.my): 12,463
Turkey (*.gov.tr): 11,469
Austria (*.gv.at): 9,529
Colombia (*.gov.co): 9,428
Thailand (*.go.th): 7,913
Japan (*.go.jp): 7,650
Ukraine (*.gov.ua): 6,206
Peru (*.gob.pe): 6,038
Chile (*.gob.cl): 5,843
Singapore (*.gov.sg): 5,470
Israel (*.gov.il): 4,984
Costa Rica (*.go.cr): 4,402
India (*.gov.in): 4,253
Poland (*.gov.pl): 4,194
Indonesia (*.go.id): 4,040
United Arab Emirates (*.gov.ae): 3,672
Switzerland (*.gov.ch): 3,310
Ecuador (*.gov.ec): 2,792
Italy (*.gov.it): 2,593
Saudi Arabia (*.gov.sa): 2,564
Hungary (*.gov.hu): 2,166
Pakistan (*.gov.pk): 2,123
Russia (*.gov.ru): 1,964
Philippines (*.gov.ph): 1,921
Hong Kong (*.gov.hk): 1,795
Vietnam (*.go.vn): 1,725
Latvia (*.gov.lv): 1,647
El Salvador (*.gob.sv): 1,640
Mozambique (*.gov.mz): 1,493
Fiji (*.gov.fj): 1,492
Venezuela (*.gob.ve): 1,461
Kenya (*.go.ke): 1,407
Namibia (*.gov.na): 1,354
Jordan (*.gov.jo): 1,340
Jamaica (*.gov.jm): 1,298
Morocco (*.gov.ma): 1,235
Uganda (*.gov.ug): 1,228

All countries of the globe combined: 1,502,909

How we got to the above numbers: we scanned the entire 100GB COMB archive.

Note: Germany is not listed because the gov domain extension is not used in the country.
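A scan of this kind can be sketched as follows. This is a minimal illustration, not the tooling actually used for the analysis: it assumes each record is a line of the form email:password, and the names GOV_SUFFIXES, match_suffix and count_gov_passwords are hypothetical. Only a handful of the suffixes from the table above are included.

```python
from collections import Counter

# Hypothetical subset of the government suffixes listed in the table above.
GOV_SUFFIXES = {
    "gov": "United States of America",
    "gov.uk": "United Kingdom",
    "gov.br": "Brazil",
    "gc.ca": "Canada",
}


def match_suffix(domain, suffixes):
    """Return the suffix in `suffixes` that `domain` equals or ends with, else None.

    Whole dot-separated labels are compared, so 'gov.example.com' does not
    count as a government domain, and longer suffixes are tried first.
    """
    labels = domain.lower().split(".")
    for start in range(len(labels)):
        candidate = ".".join(labels[start:])
        if candidate in suffixes:
            return candidate
    return None


def count_gov_passwords(lines, suffixes=GOV_SUFFIXES):
    """Count records per government suffix in an iterable of 'email:password' lines."""
    counts = Counter()
    for line in lines:
        # Split on the first colon only, since passwords may contain colons.
        email, sep, _password = line.partition(":")
        if not sep or "@" not in email:
            continue  # skip malformed lines
        domain = email.rpartition("@")[2]
        suffix = match_suffix(domain, suffixes)
        if suffix is not None:
            counts[suffix] += 1
    return counts


# Synthetic records, invented for illustration only.
sample = [
    "a@nasa.gov:pw1",
    "b@mail.nih.gov:pw2",
    "c@caixa.gov.br:pw3",
    "d@gov.example.com:pw4",
    "e@gmail.com:pw5",
]
print(dict(count_gov_passwords(sample)))  # {'gov': 2, 'gov.br': 1}
```

Because count_gov_passwords accepts any iterable of lines, an open file object can be passed directly, so a large archive would be streamed line by line rather than loaded into memory.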

How USA Is Affected

Total of .US Domain Passwords Exposed*: 2.78 million (2,780,342)
Total of GOV Passwords Exposed: 625,505

(*) The actual number is much bigger because international email domains used by Americans, such as gmail.com, were not considered.

Top 20 USA Government Domains In The Leak (.GOV)

Rank Position, Domain, Number of Exposed Passwords (All names below have a .gov extension)

1. state: 29,144
2. va: 28,937
3. dhs: 21,575
4. nasa: 15,665
5. irs: 10,480
6. cdc: 8,904
7. usdoj: 8,857
8. ssa: 8,747
9. usps: 8,205
10. epa: 7,986
11. dc: 7,790
12. schools.nyc: 7,761
13. ky: 7,314
14. mail.nih: 7,302
15. faa: 7,159
16. michigan: 7,053
17. bop: 7,051
18. noaa: 6,682
19. gsa: 6,456
20. med.va: 6,345

How we got to the above numbers: we scanned the entire 100GB COMB archive.

Oldsmar Florida Water Facility Attack

According to an article by CyberNews, the COMB leak included 13 credentials linked to emails of the Oldsmar water plant in Florida, which, three days after COMB was published, suffered a cyber attack that attempted to poison the water supply by boosting lye levels by 100 times. There is no confirmation, however, that the COMB leak was used in the cyber attack.

How Brazil Is Affected

Total of .BR Passwords Exposed*: 9.78 million (9,785,714)
Total of GOV.BR Passwords Exposed: 68,535
Total of JUS.BR Passwords Exposed: 4,589

(*) The actual number is much bigger because international email domains used by Brazilians, such as gmail.com, were not considered.

Top 20 GOV.BR Domains In The Leak

Rank Position, Domain, Number of Exposed Passwords (All names below have a .gov.br extension)

1. caixa: 2,197
2. fatec.sp: 2,035
3. see.sp: 1,665
4. pbh: 1,008
5. macae.rj: 1,004
6. bcb: 999
7. camara: 985
8. previdencia: 870
9. policiamilitar.sp: 831
10. escola.ce: 805
11. etec.sp: 796
12. seed.pr: 796
13. prefeitura.sp: 787
14. tj.rs: 769
15. polmil.sp: 642
16. chesf: 593
17. dpf: 576
18. brigadamilitar.rs: 493
19. fazenda.sp: 466
20. agricultura: 451

How we got to the above numbers: we scanned the entire 100GB COMB archive.

Exploitation of Combined Mega Leaks for Brazil

The Big Brazil Data Leak that we recently analyzed exposed millions of emails, tied to individuals through their CPF numbers and to companies through their CNPJ numbers, but it did not include passwords. However, with the leaks being actively sold and shared online, cybercriminals and hackers, based on their known modus operandi, will definitely take advantage of the combination of both leaks.

As we mentioned above, the COMB password leak gives insight into past password elements, password patterns, and the password changing and reuse habits of individuals and organizations in an unprecedented way. The CPF/CNPJ data leak, on the other hand, gives insight into specific key details, such as email, date of birth, family member names and so on, that individuals may be using as part of their current passwords. When both leaks are exploited together, the chances of accurately guessing the current password of a target increase significantly.

Following a request by Estadão, we processed the hacker's catalog information of the Big Brazil Data Leak, and learned that 77.8 million individuals and 15.8 million companies have emails catalogued for sale by the hacker, all tied to their CPF and CNPJ numbers. This is the number of individuals and companies that are particularly vulnerable to the combined exploitation of both the COMB and the Brazil mega data leak.

The first leak also includes the date on which each email was added to the database, which gives insight into the year in which an email or multiple emails associated with a person or business were in use.

The Source of The Leak

We named this leak PWCOMB21 (PassWord Compilation Of Many Breaches Of 2021), and we sometimes refer to it as just COMB. The COMB leak is, as we explained in the first section of this document, a compilation of leaks, and as such, the impressive number of leaked passwords comes from multiple leaks at different companies and organizations that happened over the years. The passwords were exposed through well-known techniques such as cracking password hashes after they were stolen, and sometimes through phishing attacks or eavesdropping on insecure, plaintext connections.


Despite the efforts made in recent years by companies and organizations to monitor and respond to password leaks, harden the security of web applications and login mechanisms, and switch to HTTPS, the publication and active sharing of this password leak compilation is a major blow to Internet security.

While some of the domains, organizations, agencies and companies listed above may have publicly acknowledged breaches over the years and adopted appropriate response and countermeasure actions, a significant number of leaked passwords appear to originate from breaches that affected other companies and websites that simply allowed users to create accounts linked to their emails. This means services like LinkedIn, among other social networks, and multiple other Internet websites not referenced in the COMB archive.

Syhunt recommends, among other things, that:

  • Innovations in the field of authentication should be supported, pursued and put in place.
  • Multi-factor authentication (MFA) and tokens, more than ever, should be widely deployed and encouraged.
  • The replacement of broken password hashing (MD5, SHA1, etc.) should be more aggressively pursued through source code analysis (SAST), deprecation and additional means.
  • Users should be advised not only to change existing passwords, but to completely break with their password naming habits and patterns when changing a password. More than ever, they should be encouraged and assisted to adopt strong passwords.
  • Administrative and advanced users should use password managers that include a strong, secure password generation feature.
  • Password policies and practices around the globe should be discussed, reviewed and improved.
  • The implications of deep learning applied to the COMB leak for password guessing should be discussed.
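Two of the recommendations above, replacing broken password hashing and generating strong passwords, can be sketched with the Python standard library. This is an illustrative example rather than a Syhunt recommendation of specific parameters: the choice of PBKDF2-HMAC-SHA256, the iteration count, the salt size and the function names are all assumptions made for the sketch.

```python
import hashlib
import hmac
import secrets
import string

# Illustrative parameters; these values are assumptions, not from the paper.
ITERATIONS = 600_000
SALT_BYTES = 16


def hash_password(password, iterations=ITERATIONS):
    """Return (salt, iterations, digest) using salted PBKDF2-HMAC-SHA256.

    Unlike a bare MD5 or SHA1 hash, the random salt and the iteration count
    make large-scale offline cracking of a stolen hash more expensive.
    """
    salt = secrets.token_bytes(SALT_BYTES)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, iterations, digest


def verify_password(password, salt, iterations, expected):
    """Recompute the digest and compare it in constant time."""
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(digest, expected)


def generate_password(length=20):
    """Generate a random password that follows no personal naming pattern."""
    alphabet = string.ascii_letters + string.digits + string.punctuation
    return "".join(secrets.choice(alphabet) for _ in range(length))


new_password = generate_password()
salt, iterations, digest = hash_password(new_password)
print(verify_password(new_password, salt, iterations, digest))  # True
print(verify_password("Summer2021", salt, iterations, digest))  # False
```

The salt, iteration count and digest would be stored together so that the iteration count can be raised over time without invalidating existing records.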

Exploitation Through Deep Learning

If deep learning tools such as PassGAN are successfully applied to the COMB leak, the threat posed by the leak compilation increases. PassGAN applies a Generative Adversarial Network to password leaks in order to learn the distribution of these passwords and then uses this knowledge to guess passwords.


