Twitch is hacked, and its source code leaked

2021-10-0616:18556298kotaku.com

Big-name streamers' incomes are revealed in the 125GB data leak

According to a report by VGC, an anonymous hacker has posted a 125GB torrent link containing, well, all of Twitch, including its source code and commit history going back to the start. The leak also contains streamers’ incomes since 2019, and information that suggests the Amazon-owned streaming platform’s Steam rival Vapor may really exist.

Update, 10/6/21, 11:30 a.m. ET: Twitch confirmed the hack on Twitter, but wrote that it still doesn’t know the extent of the breach.

The download was posted to 4chan today by an unidentified source, who described it as “part one” of “an extremely poggers leak” and stated that it contains the following:

> Entirety of twitch.tv, with commit history going back to its early beginnings

> Mobile, desktop and video game console Twitch clients

> Various proprietary SDKs and internal AWS services used by Twitch

> Every other property that Twitch owns including IGDB and CurseForge

> An unreleased Steam competitor from Amazon Game Studios

> Twitch SOC internal red teaming tools (lol)

And, the poster notes, “Creator payout reports from 2019 until now. Find out how much your favorite streamer is really making!”

Calling Twitch a “disgusting toxic cesspool,” the poster says the motivation for the leak is to “foster more disruption and competition in the online video streaming space.” They add, “we have completely pwnd them.” They also include the #DoBetterTwitch hashtag.

Kotaku has verified that this is a working torrent, and VGC claims anonymous sources within Twitch have confirmed the data is legitimate. We’ve obviously contacted Twitch for further comment, but haven’t heard back at the time of publication.

The consequences for a leak like this can be huge. Clearly the first thing anyone who has a Twitch account should do is change their password immediately, and set up two-factor authentication. It’s also advised to reset your stream key to protect data.

But the longer-term issues will be far more complex. Just the financial information for big-name streamers getting out there will be hugely serious for Twitch, with figures in the millions of dollars.

Within the leaked data is reportedly information on Vapor, Amazon’s rumored rival to Steam, which would integrate a store into the Twitch platform. More information will likely come to light as the leak is pored over. And of course this is marked as “part one,” suggesting that much more information may have been compromised in the hack.

This all comes at a time of much tribulation for Twitch, with the #DoBetterTwitch/#TwitchDoBetter hashtags at the forefront of efforts by users to demand a better service from the platform, including boycotts to demand action over hate raids. Twitch seems to be making some positive moves, but then always finds a way to do something terrible too.

We will keep you up to date with the consequences of the leak, which will likely be causing serious consternation at Amazon. As the hacker said on 4chan, “Jeff Bezos paid $970 million for this, we’re giving it away FOR FREE.”


Comments

  • By nemothekid 2021-10-0616:5015 reply

    This is a pretty thorough and high profile hack on a major tech company - this isn't something I'd expect from an Amazon owned property. The hack (allegedly, I haven't downloaded it) includes

    * Entire git histories

    * Internal/Private AWS SDKs

    * Encrypted Password dumps and payout reports

    It's so comprehensive I'm very curious about how an attacker got that level of access. I can't think of another large, corporate web 2.0 startup that's gotten owned in a similar fashion. Could the same attack work on Amazon? YouTube?

    It's also strange that someone who has this level of access to what is presumably a multi-billion dollar company decided to just leak the data? Maybe they did try to ransom it, but I'd imagine someone with this kind of access inside Twitch must have had some creative way of making money.

    • By madrox 2021-10-0619:108 reply

      There were no encrypted password dumps. No production secrets were leaked (according to the article). What's here is no more than what your average Twitch engineer has access to.

      Yes, that included payout data. Anyone with "staff" access to the site (which any employee can have) has access to any streamer's dashboard, which includes payout data.

      I don't think this was an attack. Based on the data so far I think it was a disgruntled engineer. Obviously if more gets leaked later I may revise that opinion.

      • By ergerger 2021-10-0619:563 reply

        I also worked for Twitch and can confirm what you're saying is true. Any staff member had access to these repos, including non-engineering staff.

        For the longest time, seeing revenue was as simple as navigating to a streamer's dashboard as staff. They did finally gate that away from staff who don't need to see that info, but I am sure there are other ways to obtain revenue reporting info.

        I am assuming all data - including personal - has been compromised but so far, the data leaked is data that most staff would have access to in some way or another. Some may find that shocking, but this was not a "high level hack"

        • By madrox 2021-10-0620:001 reply

          I'm actually very happy to hear they finally added a flag for payout access. It's been years since I was there and my eyes bugged out when I saw what I had access to without needing it.

          • By ogn3rd 2021-10-0622:22

            Parent company was no different.

        • By bengale 2021-10-0621:471 reply

          Why did non engineers have access to repos?

          • By vkou 2021-10-0621:50

            The better question is, why did random engineers have access to the financials of the streamers on the platform, without having to go through a break-glass, audited, emergency access escalation.

        • By realyashnag 2021-10-0620:47

          Allegedly it also contains AWS access keys. I feel bad for the engineers who will have to answer for this.

      • By twistedpair 2021-10-0619:144 reply

        So much for information compartmentalization. Does the typical engineer need access to payment details for their daily work?

        • By nonameiguess 2021-10-0619:57

          No, but doing least privilege, separation of privileges, and RBAC correctly is tedious and difficult and slows development velocity, so companies rarely do it well if they even bother trying unless some outside force compels them.

          I highly doubt it would be possible to do something like this at AWS, just because hosting multitenant infrastructure and working with the government forces you to implement security since you're being audited and awarded contracts on that basis. Twitch users don't give a crap about the security of the platform. They just want to monetize as quickly as they can, too.

          So I'm not hugely surprised that practices and culture would be different even if they have the same parent company, especially since Twitch was an acquisition. Even if not, though, I'd expect security at Prime to be better than Twitch but worse than Marketplace, Marketplace to be worse than AWS, etc. All speculation since I've never worked at any Amazon product, but that's what I would expect.

        • By oconnor663 2021-10-0619:391 reply

          The tradeoffs for any individual piece of data are different from the tradeoffs of a company-wide policy. Siloing off one little thing (e.g. credit card info) usually doesn't inconvenience very many people, but at the same time it only provides marginal security. No front page headline has ever read "At Least The Credit Card Info Was Safe". On the other hand, a company-wide policy of siloing everything can have more of a security impact, but it also inconveniences everyone frequently. That's the tradeoff that many tech companies don't want to make.

          • By zerkten 2021-10-0620:131 reply

            I don't see how this precludes just-in-time access. Even if people can re-up on their own, you can still observe the data access patterns and manage the risk. Further, when you see someone is getting blocked a lot you can improve the experience for them so they are unblocked, or have more efficient access to the data. This is just mature data and security management.

            Quality of life and developer experience are important topics in many ways, but should they really trump security consistently? It's always going to be dependent on people's risk assessment and comfort, but frequently it skews the wrong way because the people making the decisions know that they'll be gone.
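The just-in-time idea discussed here can be sketched in a few lines. This is only an illustration under stated assumptions, not a description of any system at Twitch or Amazon: the class `JitAccess`, its fields, and the 15-minute default lifetime are all hypothetical. Access is granted for a short window on request, and every request and read lands in an audit log that can be reviewed for unusual patterns.

```python
import time
from dataclasses import dataclass, field


@dataclass
class JitAccess:
    """Hypothetical just-in-time access broker with an audit trail."""

    ttl_seconds: float = 900.0                      # assumed 15-minute grants
    grants: dict = field(default_factory=dict)      # (user, resource) -> expiry time
    audit_log: list = field(default_factory=list)   # (time, user, resource, event, reason)

    def request(self, user, resource, reason, now=None):
        """Self-service grant: nobody is blocked, but the reason is recorded."""
        now = time.time() if now is None else now
        self.grants[(user, resource)] = now + self.ttl_seconds
        self.audit_log.append((now, user, resource, "granted", reason))

    def check(self, user, resource, now=None):
        """Return True only while an unexpired grant exists; log every attempt."""
        now = time.time() if now is None else now
        allowed = self.grants.get((user, resource), 0.0) > now
        event = "read" if allowed else "denied"
        self.audit_log.append((now, user, resource, event, ""))
        return allowed
```

Even when people can re-up on their own, the log shows who is reading what and how often, which is the observation point made in the comment above.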

            • By myohmy 2021-10-0621:13

              Implementing just-in-time access on legacy systems that pre-date just-in-time architectures is extremely expensive. It's cheaper to either give all info or no info. Which is what every legacy company does instead.

              My company can shut off my access to all the databases when they stop asking me to troubleshoot any and all data issues. Which will never happen.

        • By ABeeSea 2021-10-0619:24

          Part of the appeal of working at a place like Amazon is having a voice in decision making in the product you’re building. Hard to make informed decisions or opinions without the data. Engineers in Amazon retail definitely have broad access to sales data.

      • By ljm 2021-10-0619:212 reply

        Why would an intern at Twitch have access to data in production?

        Saying that no 'secrets' were leaked is effectively burying the lede.

        • By ijcd 2021-10-0622:33

          In general the broad access was to code repos early on. Some were gated. There’s lots of collaboration and the need to study other code bases for learning and collaboration, read only. It’s micro services galore there so one didn’t tend to have access to production databases for services or systems you didn’t work on. You were opted in there. Teams did their own devops for the most part.

          The payout data likely wasn’t ripped from a DB but rather dashboards which customer service or partnerships likely had access to. Tier1 or Tier2 support kinda stuff.

          This smells like a stolen backup or maybe network access and http scanning, finding the internal GitHub and maybe a support admin cred that allowed dashboard view.

        • By madrox 2021-10-0619:253 reply

          By secrets, I mean salts, password hashes, etc.

          • By ihattendorf 2021-10-0619:282 reply

            > You also get access to every streamer's dashboard and their analytics

            I would classify that as access to production systems.

          • By ljm 2021-10-0619:412 reply

            Why is getting access to prod, or prod data, considered a perk, exactly?

            • By manquer 2021-10-0619:501 reply

              The perk is the wrench UX denoting to the community that you are an employee. Reddit/Twitch allow employees to communicate with the users. It is a social media platform, and being able to indicate that you are special is street cred.

              The other access rights that come with staff access are either incidental or a miss/debt in architecture.

              • By zerkten 2021-10-0620:241 reply

                It's understandable why this is a neat perk, but it also seems absurd when you look at Twitch as an entity owned by a global corporation.

                • By VRay 2021-10-0620:571 reply

                  Man oh man, big "No thanks" to a perk like that from me

                  "Hey, pick through everything I say with a fine-toothed comb and treat it as the official company stance!"

                  • By iotku 2021-10-0623:00

                    From prior (user) experience, typically staff have non-wrenched alt accounts for just chilling out in streams (with less concern about conduct, but generally tame by twitch standards). But will wrench up for higher profile streams or folks they're otherwise pretty close with personally.

                    I suspect that's a lot more controlled these days, but it wasn't very uncommon for signified staff to be trolling along with everyone else.

          • By skilled 2021-10-0619:302 reply

            This statement makes no sense.

            The leak includes source code of multiple active websites and applications that are operated under the umbrella of Twitch/Amazon.

            Why would an intern have access to this data?

            • By isbvhodnvemrwvn 2021-10-0619:451 reply

              In many companies source code for all products is available to each and every developer.

              • By skilled 2021-10-0620:082 reply

                In what world would all this data in the leak be stored together in a unified ecosystem? It makes absolutely no sense.

                If you're saying that Twitch runs their developer environment in a lousy manner (and you have proof of this), then please go ahead.

                But to imply that an intern/average developer would be given access to all this branching information is ignorant.

                • By barkingcat 2021-10-0713:41

                  I think many people are trying to say that in this world at many companies all that data is indeed stored and accessed together.

                  Maybe the super secure siloed world doesn't really exist outside of military/government organizations.

                • By imwillofficial 2021-10-0620:37

                  I have first hand info, and this is how it’s done. Don’t call somebody ignorant if you don’t have first hand info. Leave that for somebody who does.

            • By nemothekid 2021-10-0620:051 reply

              >Why would an intern have access to this data?

              monorepos are a thing at several companies (e.g. Google).

              • By enneff 2021-10-0620:48

                Monorepos can and often do have ACLs per directory.
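As a rough illustration of per-directory ACLs in a monorepo, the sketch below resolves a path to the most specific directory that has a rule. The directory names, group names, and the "most specific rule wins" policy are all assumptions made for the example; real code-hosting systems each have their own rules.

```python
from pathlib import PurePosixPath

# Hypothetical ACL table: directory prefix -> groups allowed to read it.
# The empty prefix is the repository root.
ACLS = {
    "": {"all-engineers"},
    "payments": {"payments-team"},
    "payments/reports": {"finance"},
}


def allowed_groups(path):
    """Return the groups for the deepest directory prefix that has a rule."""
    parts = PurePosixPath(path).parts
    for depth in range(len(parts), -1, -1):
        prefix = "/".join(parts[:depth])
        if prefix in ACLS:
            return ACLS[prefix]
    return set()


def can_read(user_groups, path):
    """True if the user belongs to at least one group allowed on this path."""
    return bool(allowed_groups(path) & set(user_groups))
```

With this table, a member of `all-engineers` can read `web/player/main.go` but not `payments/ledger.py`, even though both live in the same repository.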

      • By popotamonga 2021-10-0620:14

        I worked for a multi billion company and even 6 month contractors had access to basically everything with little effort.

      • By unethical_ban 2021-10-0620:052 reply

        No one in IT should have access to business data. That's simply best practice. Worst case would be a database engineer who has access to backups or some prod data for troubleshooting, and even that should be under tight control with good access accounting.

        • By hamburglar 2021-10-0620:201 reply

          Welcome to devops. Ask Mike down the hall to add you to the “admin” group. Tell him you’re a new dev so you need everything.

          (This is a joke but also, at many companies, it’s not. Twitch was once small and grew. Who knows what ancient all-access switches are still critical to running the systems, marked “tech debt” in someone’s backlog)

          • By unethical_ban 2021-10-0621:482 reply

            The whole point of devops is to automate everything according to best practices, so fuckups are a thing of the past! The only fuckups, of course, will be Terraform state issues.

            • By syshum 2021-10-070:15

              No the whole point of devops is to get rid of those terrible sysadmins always keeping the devs from doing anything...

              Once you have "DevOps" the devs are ops, your head count drops, and all that pesky security and other things those dirty sysadmins wanted are gone

              kinda /s sometimes I think that is really what managers think about devops

            • By hamburglar 2021-10-070:22

              As it turns out, the entire industry doesn't quite agree on "the whole point of devops."

        • By cube00 2021-10-0622:21

          Until the business raises a priority one incident that their monthly reports are not looking right and you need to dive into the data to find out why some other API back end decided to present its numbers this month divided by 1000 for ease of display to their own users.

          I know, I know, service contracts, but my point is that engineers sometimes need at least temporary access to provide support.

      • By weaksauce 2021-10-0620:161 reply

        Could have been a hack of a twitch engineer's laptop or something like that.

        • By bdreadz 2021-10-0622:33

          This is what I thought of as well. Maybe just an engineer was hacked.

      • By syshum 2021-10-0622:19

        Sounds like someone in Twitch Security needs to take a course on Least Privileged Access then

    • By 63 2021-10-0617:194 reply

      > It's also strange that someone who has this level of access to what is presumably a multi-billion dollar company decided to just leak the data? Maybe they did try to ransom it, but I'd imagine someone with this kind of access inside Twitch must have had some creative way of making money.

      Notably, the initial leak didn't actually include the password data which the leaker claims to have, just source code and payment data which has been verified by several affected streamers. It's possible that this first leak was just to establish trust so they can ransom or auction password hashes later.

      • By ganoushoreilly 2021-10-0618:14

        Given the torrent is labeled "twitch-leaks-part-one" I'm curious too as to what they have. The torrent breaks out into a lot of compressed volumes, so it's clear this wasn't just a backup file, but a curated collection of files. I'm very curious if we will see any other amazon related leaks come from it.

        Either way, I can only imagine the chaos inside as they try to figure out what has transpired here.

      • By nemothekid 2021-10-0617:423 reply

        >It's possible that this first leak was just to establish trust so they can ransom or auction password hashes later.

        Password hashes are relatively useless though? Once the leak is announced I imagine most of the big targets will rotate their credentials. Then the next thing you need to do is spend possibly thousands in CPU time bruteforcing bcrypt hashes. Then I'm not sure what you can even do with those.

        I'm not criminally creative but I imagine you could make more by abusing trust with payment processors or fraudulent invoices.

        • By tyingq 2021-10-0618:052 reply

          >Then I'm not sure what you can even do with those

          Assume some end users used the same passwords on other, non-twitch accounts. That's what makes hacked passwords valuable, no matter where they came from.

          • By twofornone 2021-10-0618:193 reply

            That's something I've wondered - do password hashes tend to be the same across platforms? Is everyone using the same hashing algorithm? Isn't this also what salting is for?

            Never implemented auth myself.

            • By tyingq 2021-10-0618:24

              Yes, the hashes are (usually) different due to different algorithms and/or salts. But, if you've brute forced one by using good guesses, and know the email/userid for other sites, and the user used the same or a similar password...that doesn't matter.

            • By franga2000 2021-10-0618:361 reply

              If everyone did things the way they're supposed to then no, hashes should never be the same between platforms. Using the same algorithm is likely, but as you said, salting solves that.

              But mistakes such as salting with just the username are sometimes made even by very large companies and in that case, hashes could be the same.

              • By evandwight 2021-10-0618:553 reply

                Why does it matter if hashes are the same?

                That only tells you the passwords are the same.

                • By mindwok 2021-10-0619:371 reply

                  If they are the same everywhere, you can precompute a huge database of hashes (called a rainbow table) and simply lookup the hash in the table when breaches occur to find the password. By salting, every provider who stores credentials has different hashes for the same inputs which makes the approach far less attractive at a large scale.

                  • By thaumasiotes 2021-10-0622:13

                    > If they are the same everywhere, you can precompute a huge database of hashes (called a rainbow table) and simply lookup the hash in the table when breaches occur to find the password.

                    You can do this anyway. But the space requirements of a rainbow table are so large that including an account's username in the password would make a rainbow table completely unfeasible.

                • By thaumasiotes 2021-10-0622:16

                  It doesn't matter at all if one person's hashed password is identical across two of that person's accounts on two different websites. The identical hash will instantly let an attacker (with access to both hashes) know that this person shares the same password across two accounts. But that is of no value; the attacker is going to start by assuming that it's true anyway.

                  Salts are there to ensure that two accounts on the same website which have identical passwords nevertheless have different password hashes.
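The effect of a per-user salt described in this sub-thread can be shown with a short sketch. It is not Twitch's scheme, which the thread does not describe; the function names, the 16-byte salt, and the PBKDF2 iteration count are assumptions chosen for illustration. Without a salt, two users with the same password get the same hash; with a random salt stored next to each hash, they do not, yet verification still works.

```python
import hashlib
import hmac
import os


def unsalted(password):
    """Plain digest: identical passwords always give identical hashes."""
    return hashlib.sha256(password.encode()).hexdigest()


def salted(password, salt=None):
    """Return (salt, digest). A fresh random salt is drawn for each new account."""
    salt = os.urandom(16) if salt is None else salt
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest


def verify(password, salt, digest):
    """Recompute with the stored salt and compare in constant time."""
    _, candidate = salted(password, salt)
    return hmac.compare_digest(candidate, digest)


# Two accounts that happen to share the password "bobrules".
alice_salt, alice_digest = salted("bobrules")
bob_salt, bob_digest = salted("bobrules")
print(unsalted("bobrules") == unsalted("bobrules"))  # same hash without a salt
print(alice_digest == bob_digest)                    # differs once salted
```

The salt is not a secret here. It is stored alongside the digest, and its job is only to make each stored hash unique.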

            • By bduerst 2021-10-0618:23

              In a perfect world, no, but lazily someone could skip salting and/or use common hashing functions. IIRC this was a problem at Sony not too long ago.

          • By axpy906 2021-10-0618:162 reply

            Pretty much this. If they gain one email/username password combination - they can use it elsewhere.

            • By growt 2021-10-0619:062 reply

              If they are properly hashed and salted, they can not.

              • By xorcist 2021-10-0619:271 reply

                The point here is that once you brute force the plaintext password, the same password might be used elsewhere.

                • By mercurywells 2021-10-0620:543 reply

                  What if you did something like hash(plaintext_pw+"twitchsalt") <browser> ---> <server> hash(browser_hash + db_salt)

                  • By growt 2021-10-0621:30

                    If I understand this right, the problem is "twitchsalt" has to be known so that you can generate the same hash for future logins. So it's just one iteration of hashing more for a brute force attempt (modern hashing algorithms already use multiple iterations of hashing to make brute forcing harder)

                  • By ocdtrekkie 2021-10-0621:25

                    Well, bear in mind, the hacker also has the exact code Twitch uses to salt its hashes.

                  • By xorcist 2021-10-0915:51

                    The browser_hash is now the password.
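The client-side hashing proposal and the objection to it can be sketched as follows. The function names are hypothetical and SHA-256 stands in for whatever hash a real site would use; only the string "twitchsalt" comes from the comment above. The sketch shows why the value the browser sends is "now the password": the server accepts it without ever seeing the plaintext, so anyone who captures it can replay it.

```python
import hashlib
import hmac
import os

SITE_SALT = b"twitchsalt"  # public: it has to ship to every browser


def browser_hash(plaintext):
    """What the browser would send instead of the plaintext password."""
    return hashlib.sha256(plaintext.encode() + SITE_SALT).digest()


def server_store(submitted, db_salt):
    """What the server would keep in its database."""
    return hashlib.sha256(submitted + db_salt).digest()


def server_login(submitted, db_salt, stored):
    """The server only ever checks the submitted value, not the plaintext."""
    return hmac.compare_digest(server_store(submitted, db_salt), stored)


db_salt = os.urandom(16)
token = browser_hash("hunter2")
stored = server_store(token, db_salt)

# An attacker who intercepts `token` logs in without knowing "hunter2".
print(server_login(token, db_salt, stored))
```

What the scheme does buy is that the intercepted value is specific to this site, so it cannot be replayed elsewhere. An attacker with the database dump and the public site salt still only faces one extra hashing step per guess.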

              • By thaumasiotes 2021-10-0622:111 reply

                Password salting has nothing to do with password reuse.

                Imagine two people have accounts on each of two websites:

                             eBay           YouTube
                   
                   Alice     sunlight       bobrules
                   
                   Bob       bobrules       bobrules
                
                A password reuse attack dumps the YouTube database, cracks Bob's password, and then accesses Bob's eBay account. The fix for this is that Bob should use different passwords on his different accounts. Hashing helps by making step 2 ("crack Bob's password") more difficult. Salting does not affect this attack in any way. Note that the attacker didn't bother to dump the eBay database.

                The attack that salting protects against dumps the YouTube database, cracks Bob's password, and then accesses Alice's YouTube account.

                • By growt 2021-10-077:361 reply

                  "Salting does not affect this attack in any way." Yes it does. If you have unsalted passwords you can just use a rainbow table to look passwords up.

                  • By thaumasiotes 2021-10-078:201 reply

                    And that is not affected by salting. You can use a rainbow table to look passwords up whether or not those passwords are salted. There is zero conceptual connection between the two ideas.

                    Now, realistically, you can't use a rainbow table on passwords of any noticeable length, and a salt may push the password over the edge of that threshold. If that's really what you want... enforce a minimum password length.

                    • By growt 2021-10-079:061 reply

                      "Use of a key derivation that employs a salt makes this attack infeasible." https://en.wikipedia.org/wiki/Rainbow_table

                      "Salts defend against attacks that use precomputed tables (e.g. rainbow tables)" https://en.wikipedia.org/wiki/Salt_(cryptography)

                      • By tyingq 2021-10-0721:291 reply

                        Salts do nothing for people with predictable passwords though. The salt is in the dump, so I can hash known plaintext with the algorithm and the dumped data.

                        Even if I can only hash a million a day, if your password is one of the top million most popular, and I have a good list, I'll have your password in a day. And if you re-used it...

                        Salts do make naïve brute-force, all-possible-strings approaches useless, yes.

                        • By growt 2021-10-087:29

                          Yes, but nothing will make predictable passwords safe (at least when you have the hash). Enforcing password guidelines helps a bit.
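A small sketch of the dictionary attack described above, against a dump that is properly salted. Everything here is illustrative: the account names, the wordlist, and the deliberately low PBKDF2 iteration count (chosen so the example runs quickly) are assumptions. Because each salt sits in the dump next to its hash, the attacker simply re-hashes each guess with that account's salt.

```python
import hashlib
import os


def hash_pw(password, salt):
    """Salted hash; 1,000 iterations is far too low for real use."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 1_000)


def make_dump(users):
    """Simulate a leaked table: username -> (salt, digest)."""
    dump = {}
    for name, password in users.items():
        salt = os.urandom(16)
        dump[name] = (salt, hash_pw(password, salt))
    return dump


def dictionary_attack(dump, wordlist):
    """Try every wordlist entry against every account, using the leaked salt."""
    cracked = {}
    for name, (salt, digest) in dump.items():
        for guess in wordlist:
            if hash_pw(guess, salt) == digest:
                cracked[name] = guess
                break
    return cracked


dump = make_dump({"alice": "uwv&6qu_brusb618", "bob": "bobrules"})
print(dictionary_attack(dump, ["123456", "password", "bobrules"]))
```

The salt forces the attacker to repeat the work per account instead of looking hashes up in a precomputed table, but it does nothing for a password that is on the list.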

            • By skeeks 2021-10-0619:112 reply

              I would be very deeply concerned if Twitch, a multi-billion dollar company owned by Amazon, does not properly hash and salt the passwords of its users.

              • By isbvhodnvemrwvn 2021-10-0619:43

                You don't brute force it, you find the password for accounts with the same e-mail in leaks from other sites and try only those.

              • By tyingq 2021-10-0621:42

                You can still run those "top billion popular password" lists against properly salted/hashed passwords.

        • By errantspark 2021-10-0618:515 reply

          A few things here. If you're the sort of person who runs a crypto mine, which I assume many of the people interested in breaking hashes are, you have enough firepower at your disposal to at least perform a targeted attack on a few hashes with relative ease.

          Ideally that would be useless because things are properly salted and you don't know the salt, however with access to all of the source code as we have here I think it isn't as clear cut, as it may be possible to reverse out the salts as well.

          I'm not a cybersec guy so please take my speculation with a grain of salt.

          • By exitheone 2021-10-0619:40

            The salt is usually stored next to the password. The point of a salt is just to make the hash unique to prevent the use of rainbow tables, it's not a separate secret.

          • By shkkmo 2021-10-0619:59

            I think it is pretty common to store the salts alongside the password hashes. They are used by the same pieces of code so it is generally unrealistic to think that your salts will be secure if your hashes are obtained.

            Salting isn't really supposed to make a hashing algorithm secure by being secret but by being unique. Unique salts make hashing more secure because an attacker can't re-use a single rainbow table for multiple hashed passwords. That, combined with a sufficiently computationally difficult hashing algorithm, makes it prohibitively expensive to reverse the hashes of all your users.

            This may not be enough to protect high value users or those who use fairly common or easily guessable passwords. This is part of why it is so important that you don't reuse passwords. It's also why your application should reject all known passwords using something like https://haveibeenpwned.com/Passwords or any of the "common password" lists you can find online.

            Edit: If you do include a secret that is stored separately and is added to the password and salt when hashing, this is called "peppering", and these peppers are generally not unique per user.
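One possible way to combine a per-user salt with a site-wide pepper is sketched below. This is an assumption-laden illustration rather than a recommended construction: the function name is hypothetical, and applying the pepper through HMAC before a salted PBKDF2 step is just one of several approaches people use.

```python
import hashlib
import hmac
import os


def hash_with_pepper(password, salt, pepper):
    """Mix in the secret pepper first, then run a salted, slow hash."""
    peppered = hmac.new(pepper, password.encode(), hashlib.sha256).digest()
    return hashlib.pbkdf2_hmac("sha256", peppered, salt, 100_000)


salt = os.urandom(16)    # unique per user, stored next to the hash
pepper = os.urandom(32)  # one per application, kept outside the database
stored = hash_with_pepper("bobrules", salt, pepper)
```

The pepper only helps if it is not in the leak, for example when it lives in a secrets manager or hardware module rather than in the database or the source repository.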

          • By maccard 2021-10-0619:372 reply

            I've heard this before, and queried how feasible an attack would be, as people always talk about just how bad this is but yet I've _never_ heard of someone having an account compromised through this vector, and I'd like to know how feasible it really is. Here's the sha1 of an unsalted password b85ffa7dae2cbed04e7d3335f6ebc43c8a5764dd

            How long does it actually take in practice to break something like this? I would love it if someone could prove it to me.

            • By jaredsohn 2021-10-0619:401 reply

              Is the password ncc1701e?

              I just googled it and found https://hashtoolkit.com/decrypt-sha1-hash/b85ffa7dae2cbed04e... along with other results.

              • By maccard 2021-10-0619:522 reply

                It is! I guess using a password from Google isn't the best idea, and kind of defeated the point of what I wanted to ask (if your password isn't already hashed online how long does it actually take to break a sha1 hash), but definitely proves the point.

                Can I try again? Sha1 e7b7cdf949007abe7e8a190ba8eae56c60018c1f

                • By Behemoth66 2021-10-0620:161 reply

                  Couldn’t find it in 1.4 Trillion combinations. Used rockyou.txt with dive.rule.

                  Took me 6 minutes to try all 1.4 trillion passwords. So either you have a strong password or I messed something up. What is it?

                  In theory if your password was weak enough to be on this list it would take on average 3 minutes to break it on a GTX 1080.

                  • By maccard 2021-10-0620:452 reply

                    Thanks for trying! This somewhat supports what I'm suggesting - because that password hasn't been leaked by being posted in plaintext as a verified password, it's not available as a lookup, therefore it doesn't matter whether they used bcrypt, sha1 or md5, or even just pgp encrypted it, the password is likely "secure".

                    • By Behemoth66 2021-10-0621:041 reply

                      It depends. It doesn’t have to strictly be a leaked password. If it’s similar to a leaked password then the permutation rule-set will catch it.

                      Anything under 9 characters I can brute force in minutes. 9 character passwords would take me 9 hours.

                      Obviously if someone has a nest of the latest GPUs then they could go a lot faster.

                      But yes if your password is uwv&6qu_brusb618_$@618jg then it doesn’t really matter how you hash it.
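The figures quoted in this sub-thread can be sanity-checked with some arithmetic. The guess rate below is derived from the "1.4 trillion in 6 minutes" run reported above; the alphabet sizes (36 for lowercase letters plus digits, 95 for printable ASCII) are assumptions, since the thread does not say which character set the brute-force estimates refer to.

```python
def hashes_per_second(candidates, seconds):
    """Average guess rate implied by a finished cracking run."""
    return candidates / seconds


def exhaustive_search_hours(alphabet_size, length, rate):
    """Hours needed to try every string of exactly `length` symbols."""
    return alphabet_size ** length / rate / 3600


# 1.4 trillion candidates in 6 minutes, as reported in the thread.
RATE = hashes_per_second(1.4e12, 6 * 60)

# Assumed alphabets: 36 = lowercase letters plus digits, 95 = printable ASCII.
for alphabet in (36, 95):
    for length in (8, 9):
        hours = exhaustive_search_hours(alphabet, length, RATE)
        print(f"alphabet={alphabet} length={length}: {hours:,.1f} hours")
```

Under the 36-symbol assumption, nine characters come out at roughly seven hours at that rate, the same order of magnitude as the nine hours quoted above. With the full printable ASCII set, the same length would take years on that hardware.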

                      • By maccard 2021-10-0621:231 reply

                        The reason I didn't give any more information on the password above is because you don't have any extra information on a dump of hashes from a twitch database either. If a password is only feasibly brute forceable for a specific algorithm by reducing the search space by many orders of magnitude, it kind of shows that there's not really any risk even if the passwords are unsalted for a person who hasn't reused a password.

                        • By thaumasiotes 2021-10-0622:37

                          > it kind of shows that there's not really any risk even if the passwords are unsalted for a person who hasn't reused a password.

                          No, it doesn't. You could reuse uwv&6qu_brusb618_$@618jg everywhere and it wouldn't get cracked. If the plaintext password leaked, then you'd be in more trouble.

                          What matters is whether your password is easy to guess, not whether you've reused it. If you have all unique passwords, they can still all be trivial to crack.

                    • By ghostway-chess 2021-10-075:021 reply

                      Well. Sha1 is not _that_ hard to break. It's a solved algorithm

                      • By astrange 2021-10-0710:08

                        That's for generating collisions, not preimage resistance. It's not particularly easy to reverse.

                • By shkkmo 2021-10-0620:082 reply

                  The point of the salt isn't that it makes it take longer to break any one password. What it does is prevent you from re-using the rainbow table you generate breaking one password when you break the next one.

                  Sha1 is not a very secure/expensive hashing algorithm and thus does make it significantly cheaper to break even with a unique salt.

                  • By thaumasiotes 2021-10-070:31

                    > What [a password salt] does is prevent you from re-using the rainbow table you generate breaking one password when you break the next one.

                    Your idea of what a rainbow table is appears to be unrelated to what a rainbow table actually is. A rainbow table is prepared in advance, not generated in the process of cracking an individual password.

                  • By maccard 2021-10-0620:131 reply

                    > Sha1 is not a very secure/expensive hashing algorithm and thus does make it significantly cheaper to break even with a unique salt.

                    Ok, so how long does it take to break the hash I've provided if it's not very secure?

                    • By shkkmo 2021-10-0620:401 reply

                      It's not so much "how long does it take" as it is "how much does it cost" and the answer to that really depends on what sort of compute infrastructure you have access to. Using a more appropriate hashing algorithm with a sufficient cost factor can massively increase the amount of compute needed. Preventing the re-use of that computational effort on additional users is why unique salts are important.

                      • By maccard 2021-10-0621:173 reply

                        > It's not so much "how long does it take" as it is "how much does it cost"

                        So the answer is "It's too expensive to figure out in practice, unless you're being explicitly targeted by someone with nation state level credentials?", i.e. it's pretty much fine?

                        > Using a more appropriate hashing algorithm with a sufficient cost factor can massively increase the amount of compute needed.

                        But by the sounds of it, SHA1 is more than enough (given that nobody here is willing to brute force the hash I shared above?)

                        > Preventing the re-use of that computational effort on additional users is why unique salts are important.

                        The person who "cracked" my first hash found it in a list of passwords which was actually gotten from a plain text dump 15 years ago. That wasn't found by reversing a hash, so the compute wasn't reused. You are right that once it's cracked, it's cracked and that's that, but if your password _isn't_ cracked it's moot whether it's hashed with SHA1 or something more secure, as per above?

                        • By theneworc 2021-10-0622:411 reply

                          >But by the sounds of it, SHA1 is more than enough (given that nobody here is willing to brute force the hash I shared above?)

                          SHA1 is "more than enough" for this specific interaction in which you chose a complex password and/or your only opponents are unmotivated/non-incentivized HN commenters that don't have a password cracker at their immediate disposal. That doesn't mean anything outside of this context.

                          If your opponent was a motivated hacker with dedicated password cracking machines (which do not require anything even close to a nation-state budget, btw), your SHA1 hash would be much more likely to be cracked. If you were a specific target of a hacker group, such as an employee of a company that is being targeted by an attack or someone known to have a BTC wallet with $10 million in it, your SHA1 hash would be much more likely to be cracked. If your password was a relatively simple phrase like "dog$aregreat2019", like the vast majority of user passwords are, it would almost certainly be cracked.

                          SHA1 is not even anywhere close to "enough" for general password hashing use. Don't think otherwise just because a couple of random HNers failed your little game.

                          edit: The premise of your "challenge" is also not equivalent to the goals of most hackers. Unless you are a specifically known and prioritized target (because you're a celeb, VIP, wealthy person or something like that), the goal of a hacker is not to take one specific hash and crack it, because the success of that will depend a lot on the complexity of your password. The goal of most hackers in a breach like this Twitch one is more like "just throw it all at the wall and see what sticks". They take a massive database of thousands of hashes and spend a few hours to see what can be cracked, taking advantage of the fact that while some people may have complex passwords, most do not. After a few hours, maybe they crack 90% of the SHA1 hashes in a leak. Maybe your password was complex enough that it was in the 10% that wasn't cracked; good for you, but just because your password remained uncracked doesn't mean SHA1 is "enough". The hackers still got the other 90%.

                          • By ghostway-chess 2021-10-075:22

                            But you shared a hash of an uncommon password. We probably have the salt (probably somewhere in the code) and people don't use password managers. So rainbow tables are enough. Oh, I thought the first sentence was you and not quoted. Agreed with the above.

                        • By shkkmo 2021-10-0622:47

                          > But by the sounds of it, SHA1 is more than enough (given that nobody here is willing to brute force the hash I shared above?)

                          Absolutely not, and that is a ridiculous conclusion to draw. State-level resources are absolutely not required to break sha1.

                          > but if your password _isn't_ cracked it's moot whether it's hashed with SHA1 or something more secure, as per above?

                          Again, absolutely not. The algorithm and cost setting have a huge impact on the practical likelihood that an attacker will crack your password.

            • By isoskeles 2021-10-0620:011 reply

              Many hashes are trivial to target, until you start getting to password hashers that force you to use lots of RAM or CPU (or ideally both) to check a single password. As long as you know what hashing algorithm was used (often inferred by the hash length or other details), you can shove it into hashcat or some alternatives and wait, either using a good dictionary or bruteforce. If you've configured hashcat to work well with a decent GPU, you're good to go.

              Even bcrypt is not that hard to find a solution to a hash if it didn't use enough rounds.

              I learned a bunch of this when a company I worked for was breached and wanted to see just how easy it was to solve out weaker passwords in our db.
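
              To make the "enough rounds" point concrete, here is a minimal Python sketch comparing one salted SHA-1 call with PBKDF2-HMAC-SHA256 from the standard library. The function names, the 200,000-iteration default, and the sample password are assumptions for illustration, not anything taken from the breach or from a real site's code. bcrypt is not in the standard library, so PBKDF2 stands in for it as the tunable-cost hasher.

```python
import hashlib
import os
import time


def fast_hash(password: str, salt: bytes) -> bytes:
    # One salted SHA-1 call: cheap for the defender, equally cheap for an attacker.
    return hashlib.sha1(salt + password.encode()).digest()


def slow_hash(password: str, salt: bytes, iterations: int = 200_000) -> bytes:
    # PBKDF2-HMAC-SHA256; `iterations` is the cost factor (an assumed default).
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)


def time_one_guess(fn, *args) -> float:
    # Wall-clock seconds for a single hash evaluation, i.e. a single guess.
    start = time.perf_counter()
    fn(*args)
    return time.perf_counter() - start


if __name__ == "__main__":
    salt = os.urandom(16)
    print(f"sha1:   {time_one_guess(fast_hash, 'correct horse', salt):.6f} s per guess")
    print(f"pbkdf2: {time_one_guess(slow_hash, 'correct horse', salt):.6f} s per guess")
```

              Whatever the ratio is on a given machine, an attacker pays roughly the same ratio per guess, which is why a low round count takes most of the protection away.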

              • By maccard 2021-10-0620:021 reply

                As I said, I've heard the claim, but still question it. Here's a sha1 e7b7cdf949007abe7e8a190ba8eae56c60018c1f, how long does it take hashcat to break it?

                • By filleokus 2021-10-0620:381 reply

                  I don't really follow your argument. You've never heard of a hash being brute forced? I've done it myself multiple times, both for pen testing purposes and for password recovery on systems I control myself.

                  The LinkedIn password leak contained hashed (but not salted) passwords, and some of those were cracked and exploited in the wild.

                  My old gaming PC with a 1060 can apparently do ≈ 6300 * 10^6 hashes per second. Assuming your password above is a-z, A-Z, 0-9 = 62 possibilities (with no salt) it would take me 10 seconds to test all combinations for 6 characters and 30 days for 9 characters. And it's a trivially parallel problem, making it easy to throw money at to make it wall-clock quicker.

                  It's just a simple brute force problem, I don't see what there is to question (beside the choice of SHA1 for password hashing...).
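
                  The estimate above can be reproduced with a few lines of Python. This is a rough sketch: the 6300 * 10^6 hashes per second figure is the one quoted for the GTX 1060, the function and constant names are made up for illustration, and it computes the worst case of exhausting the whole keyspace (the average case is about half of that).

```python
def keyspace(alphabet_size: int, length: int) -> int:
    # Number of distinct passwords of exactly `length` characters.
    return alphabet_size ** length


def seconds_to_exhaust(alphabet_size: int, length: int, hashes_per_second: float) -> float:
    # Worst case: every candidate is hashed once before the match is found.
    return keyspace(alphabet_size, length) / hashes_per_second


GTX_1060_SHA1_RATE = 6300 * 10**6   # hashes per second, as quoted above
ALPHANUMERIC = 26 + 26 + 10         # a-z, A-Z, 0-9

if __name__ == "__main__":
    for length in (6, 8, 9, 10):
        secs = seconds_to_exhaust(ALPHANUMERIC, length, GTX_1060_SHA1_RATE)
        print(f"{length} chars: {secs:,.0f} s ({secs / 86400:,.1f} days)")
```

                  With those inputs the 6-character case comes out at about 9 seconds and the 9-character case at about 25 days, in line with the ballpark figures above. Each extra character multiplies the time by 62.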

                  • By maccard 2021-10-0621:112 reply

                    > The LinkedIn password leak contained hashed (but not salted) passwords, and some of those were cracked and exploited in the wild.

                    The hashes of previously unused passwords were brute forced, or passwords were reused across sites from a previous plain text dump and exploited? Because there's a big difference between those two things. If your password is reused and originally compromised, you're screwed regardless, and having the leaked hashed passwords doesn't leave you in any worse a situation than before.

                    > My old gaming PC with a 1060 can apparently do ≈ 6300 * 10^6 hashes per second. Assuming your password above is a-z, A-Z, 0-9 = 62 possibilities (with no salt) it would take me 10 seconds to test all combinations for 6 characters and 30 days for 9 characters. And it's a trivially parallel problem, making it easy to throw money at to make it wall-clock quicker.

                    So practically infeasible to exploit? The claims that are being made (even in this thread) are that having a mining rig would let you brute force a SHA1 hash, but based on the numbers

                    > It's just a simple brute force problem, I don't see what there is to question

                    If it's "just a simple brute force problem", and SHA1 is the only issue, then my question is what's the password in the hash above? You (and others here, on reddit, online) are telling us that this is a trivial problem.

                    • By filleokus 2021-10-0621:47

                      > The hashes of previously unused passwords were brute forced, or passwords were reused across sites from a previous plain text dump and exploited?

                      I believe there are documented instances where previously unleaked passwords were cracked. Of course not 128-bit random strings, but still passwords more "complex" than what you previously posted. If you have 100 million hashes to try, you will crack some. People generally have bad passwords, and especially did in 2012, even if the plaintext wasn't available anywhere...

                      > So practically infeasible to exploit?

                      It depends on how strong the password is and how much money you have to spend. For 32 USD I get an hour with a p4d.24xlarge that has 8 graphics cards, which in total can do about 175 * 10^9 hashes per second. 20 hours (and 640 USD) of machine time (not wall clock time) on that machine can do what 30 days on my old PC does.

                      > If it's "just a simple brute force problem" […]

                      If you can give me a bound on the number of combinations, and an AWS account to bill, I and many others would gladly attempt to crack your hash :-). But if your second hash is >9 alphanumerical characters we will probably just burn electricity to no avail.

                      I don't even know what you are arguing?

                      EDIT: Now that you have some numbers for hashing rates and cost, you can figure out how expensive different passwords are to crack with different approaches. Two common dictionary words with two numbers appended? 6 random alphanumeric characters? Then think about how expensive the cheapest non-leaked password in a database of 100 million users is...

                      Is it bad to store plaintext passwords? Yes, obviously. Is some hashing better than none? Yes, obviously. Is salting your hashes much better than not? Yes, because with a salt, your first password wouldn't have turned up on Google / in rainbow tables. Is it even better to use a proper PBKDF? Yes, with a pretty aggressive PBKDF, brute forcing even low-complexity passwords becomes expensive very quickly, and we get the benefits of salting "built in".

                      Can SHA1 / MD5 hashes be cracked even if the _exact_ password-hash pair hasn't been leaked previously? Yes, very much so.
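
                      The "figure out how expensive" exercise can be sketched in Python. The throughput (175 * 10^9 hashes per second) and price (32 USD per hour) are the figures quoted above; the `Rig` class, the password shapes, and the 10,000-word dictionary size are assumptions chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rig:
    """A cracking setup described only by throughput and hourly price."""
    hashes_per_second: float
    usd_per_hour: float

    def hours(self, candidates: int) -> float:
        # Machine hours needed to hash every candidate once (worst case).
        return candidates / self.hashes_per_second / 3600

    def cost_usd(self, candidates: int) -> float:
        return self.hours(candidates) * self.usd_per_hour


# Figures quoted in the comment above for an 8-GPU cloud instance.
CLOUD = Rig(hashes_per_second=175e9, usd_per_hour=32.0)

WORDS = 10_000  # assumed dictionary size, purely illustrative
SHAPES = {
    "6 random alphanumerics": 62 ** 6,
    "9 random alphanumerics": 62 ** 9,
    "two dictionary words + two digits": WORDS * WORDS * 100,
}

if __name__ == "__main__":
    for name, candidates in SHAPES.items():
        print(f"{name}: {CLOUD.hours(candidates):.4f} h, "
              f"about {CLOUD.cost_usd(candidates):,.2f} USD")
```

                      Under these assumptions the 9-character case lands near 21 machine hours and a bit under 700 USD, close to the rounded figures above, while the two-words-plus-digits shape costs a fraction of a cent against unsalted SHA-1.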

                    • By vel0city 2021-10-0622:16

                      Right? "It's just a simple brute force problem", but sometimes that still takes a lot of force. Sometimes far more force than breaking a single account password.

                      I managed to lock myself out of a dogecoin wallet. I have the hash of the passphrase, so I figured I'd give it a go cracking it. After a few weeks (and a larger than usual power bill) I sent it to some friends with good mining rigs to try and take a stab at it, willing to split the amount 50/50. It's only the passphrase, not the full wallet, so I'm not worried about someone stealing the doge.

                      The passphrase is probably 15-25 characters, mostly not dictionary words or simple letter/number/symbol substitution, only symbols easy to type on a US keyboard. I'm now about 6 months trying to crack that password with probably a few hundred dollars of electricity used overall between myself and friends (I don't know their power bill), excluding hardware cost as it was already owned, and I'm not even halfway through the search space.

                      Can it be done? Sure. Will I be able to crack that password with a cost that's less than the value of the DOGE in the wallet? Probably not. Right now it's really more of a gamble that I'll get lucky with the rigs running. I had to tone down some of my rigs as it was getting quite hot over the summer, but over the winter I'll be chugging away as the waste heat is just additional home heat. I'll probably need to rent a considerable amount of GPU power on a cloud provider to crack it, at which point maybe it'll take me days to crack it but ultimately cost me many, many thousands of dollars in GPU-time.

          • By Deathmax 2021-10-0620:01

            Salts being exposed is not a massive risk in and of itself, as the purpose of the salt is to prevent the use of pre-computed tables to reverse a hash into plaintext, forcing an attacker to bruteforce each individual hash+salt instead of being able to reuse work.

            With regards to crypto mines being used for breaking hashes, if you have one based on GPUs, yes, you could reuse GPU mining hardware for cracking hashes, albeit with relatively low hashrates for current best practice hashing algorithms.

            If you're looking at something like Bitcoin's hashrate and thinking that it could be used to break SHA2 hashes, as far as I understand ASIC miners, this is not possible, as ASIC miners are designed only for mining, and they don't really accept non-mining related inputs (ie, no arbitrary inputs to be hashed, unless it matches Bitcoin's specific steps for iterating over nonces).

          • By thaumasiotes 2021-10-0622:33

            > Ideally that would be useless because things are properly salted and you don't know the salt

            I'm really curious where people get their ideas about salting. It's not just a word. It doesn't make one password any more difficult to crack. It makes cracking every password in a given database more difficult to do. A password's salt is public information.
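
            A small Python sketch of that point, under stated assumptions: the three-entry dictionary stands in for a real precomputed table, SHA-1 is used only because it is the hash under discussion, and all names are hypothetical. It shows that a public per-user salt defeats the precomputed lookup and the reuse of work across users, while a guess against one known salt costs exactly what it did before.

```python
import hashlib
import os


def sha1_hex(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()


# Stand-in for a table prepared in advance over unsalted SHA-1 hashes.
COMMON = ["password", "123456", "qwerty"]
PRECOMPUTED = {sha1_hex(p.encode()): p for p in COMMON}


def store(password: str) -> tuple:
    # What a site would keep per user: a random salt (not secret) and the hash.
    salt = os.urandom(16)
    return salt, sha1_hex(salt + password.encode())


def guess_matches(guess: str, salt: bytes, stored_hash: str) -> bool:
    # An attacker who knows the salt can still test one guess at normal cost.
    return sha1_hex(salt + guess.encode()) == stored_hash


if __name__ == "__main__":
    salt_a, hash_a = store("password")
    salt_b, hash_b = store("password")
    print("same password, same stored hash?", hash_a == hash_b)
    print("unsalted hash in table?", sha1_hex(b"password") in PRECOMPUTED)
    print("salted hash in table?", hash_a in PRECOMPUTED)
    print("guess with known salt works?", guess_matches("password", salt_a, hash_a))
```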

        • By j_walter 2021-10-0618:03

          Relatively useless...but if even a few percent of people recycle passwords used for banking or crypto platforms it could be a profitable cache of data.

      • By zinekeller 2021-10-0617:30

        Maybe Twitch is competent in the password department, so they decided against it? Thinking about it, though: it's unclear whether two-factor secrets are included in the leak, but if they are, they may be usable by someone who already has a victim's password. Unless it's the dongle type (WebAuthn/FIDO), the secret is shared between the server and the user, so two-factor bypass is almost certain in that case.

      • By mdoms 2021-10-0618:07

        Doesn't seem likely to me. If the attacker has password hashes then they would want to keep this attack quiet so that the buyer of the hashes would have time to compute the passwords. If Twitch gets wind of this happening then a simple password reset would foil any efforts.

    • By skilled 2021-10-0616:591 reply

      I'm hoping we will get to see a transparent report (from hacker or Twitch) on how this happened.

      I think anyone would be excited to hack Twitch as the site alone - or any big platform for that matter - but this is quite literally someone just downloading the entire Twitch ecosystem and publishing it online.

      • By ergerger 2021-10-0618:25

        Twitch has not been known to be transparent about anything.

    • By leros 2021-10-0617:0913 reply

      It's something I would expect security hardware to have automatically stopped. Even an employee shouldn't be able to download 125GB of stuff without flipping a safety switch somewhere.

      • By munk-a 2021-10-0618:011 reply

        Gosh - I've worked at shops where we handled multi-terabyte images and we'd regularly stream large chunks of that while debugging tools. I've also worked at places where data was king and 125GB of stuff might be a reasonable dispatch of data to help someone debug.

        The volume of data is irrelevant - source code is usually teensy tiny and of far more value to companies than, say, three months of livestream chat logs.

        I'm not certain what security hardware you're thinking of - but I'm pretty sure I hate it already since it doesn't effectively guard anything while making everyone's lives difficult. For effective corporate security you need 1) data use policies and 2) access control lists - both of those are generally more effectively implemented at an entirely software level.

        • By retbull 2021-10-0618:491 reply

          Yeah, volume is a terrible metric to go by. I work as a data engineer, and a lot of the time, if I am working between environments or migrating between data centers, I will have a copy of the data locally that I can write tests against or move to somewhere I can compare it to a running output. This would be possible to do entirely remotely, I guess, but not nearly as easily. (Note: I never do this with anything that contains PII.)

          • By manquer 2021-10-0619:55

            It is still fraught with problems. That you wouldn't (knowingly) do it with PII is not all that reassuring: others could, or a compromised system could be used to exfiltrate this data, if the only control is trusting users to behave well with their access.

            The fact that, across the industry, controls on how PII is accessed internally are so lightly managed should worry everyone.

      • By AshamedCaptain 2021-10-0617:264 reply

        Trying to protect against leaking developers/employees is like trying to protect against lone gunman terrorists: useless. And, if you try anyway, it is likely to cause more annoyance to everyone involved than actual protection (think TSA).

        • By yupper32 2021-10-0617:362 reply

          I disagree. Locking down and logging access to raw data like password hashes or payout information to only those who absolutely need it doesn't cause much annoyance and is very useful.

          It protects the company against rogue employees (not even strictly malicious, but also curious employees who want to see more than they should). It limits exposure if an employee's account gets hacked (my pet theory for this Twitch hack). And if something does go wrong, logs help track down the issue/leak.

          And at the end of the day, there should be a lightweight way to request access. Many times I've seen people request access that they didn't actually need. And most other times they have access pretty quickly.

          • By AshamedCaptain 2021-10-0617:552 reply

            Note that it was code that was leaked. Preventing developers from leaking the codebase they are working with is outright impossible. Now combine that with a "monorepo" and even the most junior developer has access to practically the entire company codebase and version control history.

            And you can try to prevent them from accessing live/real customer data, but the cost is that they will never be able to debug issues in production. Most companies, even very large ones, are just not able to pay that cost. Not to mention that once you have access to the codebase there are a million ways to leak customer data anyway -- it is a lost battle.

            • By yupper32 2021-10-0618:051 reply

              Of course, some stuff you can't avoid, especially code leaking. Luckily code isn't usually that interesting or useful to external parties which is the only reason it isn't leaked more.

              For the rest of the stuff, there's a sliding scale. In no universe does your average twitch developer need raw access to password hashes, for example.

              • By AshamedCaptain 2021-10-0815:48

                What with security as it is at these companies, the code is literally the most sensitive information they can hold, especially in terms of value to the company. With the code out, expect lots more high-profile cracks in the coming months...

                "your average twitch developer" needs access to the password hashes or at least the code that checks these hashes the moment they need to debug an issue which involves logging in, and from then it's all downhill.

            • By lrem 2021-10-085:471 reply

              Nope, it was code AND data, including the sensitive type (e.g. user payouts).

          • By xmprt 2021-10-0617:53

            Adding to your pet theory I think that WFH has led to a lot of people being casual about their workplace security. For example, leaving a laptop unattended at a Starbucks.

            This is just a guess but I wouldn't be surprised if companies have to start taking stricter precautions with their security in a WFH world.

        • By hn_throwaway_99 2021-10-0619:00

          This isn't accurate. There are certainly companies that have extremely in-depth Data Loss Prevention toolsets and teams - everything anyone downloads or moves is logged and alerts fire if things look out of the ordinary. Google clearly had tons of data about how Anthony Levandowski was able to exfiltrate lots of info when he left.

          The issue is that building these systems accurately so they are NOT a constant annoyance is difficult, expensive, and takes a large team to support well.

        • By unethical_ban 2021-10-0620:08

          There are ways to look for anomalous behavior without creeping too hard (even though it's a business's right to view and monitor all network traffic on their system).

          If someone who doesn't have a business need to upload lots of traffic begins uploading large amounts of data, you may ask questions. Maybe you kick off a scripted playbook that then checks for increased logins to other privileged systems, or for large transfers of data from internal sources to the user's desktop.
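
          As a rough illustration of that first check, here is a Python sketch that compares a user's upload volume for the day against their own recent baseline. Everything in it is an assumption for the example (the function name, the z-score threshold of 3, the 0.5 GB floor on the spread, and the sample histories); it is not how any particular SIEM or DLP product works.

```python
from statistics import mean, pstdev


def egress_is_anomalous(history_gb, today_gb, z_threshold=3.0, min_sigma_gb=0.5):
    """Return True if today's upload volume is far above this user's own baseline."""
    baseline = mean(history_gb)
    # Floor the spread so a very steady user does not alert on tiny changes.
    sigma = max(pstdev(history_gb), min_sigma_gb)
    return (today_gb - baseline) / sigma > z_threshold


if __name__ == "__main__":
    office_worker = [1.0, 1.2, 0.8, 1.1, 0.9]          # GB uploaded per day
    data_engineer = [90.0, 140.0, 60.0, 200.0, 110.0]  # GB uploaded per day
    print(egress_is_anomalous(office_worker, 125.0))
    print(egress_is_anomalous(data_engineer, 125.0))
```

          The same 125 GB day flags the first user and not the second, which is the limitation others in the thread point out: a per-user baseline only helps when the person's normal work does not already involve moving that much data.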

        • By xwolfi 2021-10-0716:531 reply

          I don't know dude, I work in an enormous company that you've heard of, and it's impossible for me to imagine how to extract code out. I can't do it, except if I get remote access and film my screen while scrolling.

          Anything else is found quickly. I certainly wouldn't even dream of someone extracting the repo.

          • By AshamedCaptain 2021-10-0815:52

            Really, you can't simply copy files from a code repo you're working on? You work on an isolated workstation, not connected to any external network, where you are not allowed to bring anything other than plain clothes (TSA-style)? With a sizable army of developers all working this way?

            And if it's a remote FB/VNC connection, what is preventing you from just recording the screen? Not really hard...

            Most companies I've seen could see all their code extracted with one malformed NFS packet. These are "air gapped" systems holding the type of industrial secrets that we don't want to leak to China. Practically the only real line of defense they have is employee screening, which does not really stop the lone man guy.

      • By CobrastanJorji 2021-10-0617:502 reply

        If the bulk of it is a git repo, it's probably expected that every engineer will download it regularly.

        • By manojlds 2021-10-0617:523 reply

          Case against monorepos?

          • By crdrost 2021-10-0618:011 reply

            There are much better cases than this; in this case a monorepo makes it slightly more likely to be caught rather than less. (A monorepo can get to Google size and then you can't check it all out at once and it needs bespoke tooling, which can make it harder to pull this off.)

            On the flip side while many smaller repos _can_ have independent ACLs, you are very unlikely to set those up until you reach a certain scale -- and then when you reach that scale it gets hard to implement ACLs across everything at once. So your engineers probably all have access to all your repos until you reach a very large size anyway. So the question becomes just "can someone write a for-loop over all of the repo names and check them all out," and it's like, yeah, that's not terribly hard, I as a programmer can do that pretty easily in bash.

            Ideal repo size should not in my view be directed at "how do I prevent compromise to the external world," because VCS is not designed to give you the superpower of being resilient around being compromised. Rather VCS is trying to give you the superpower of time travel. So you should probably scope your repo to "what is the unit that makes sense to time travel with?" -- in other words if you are adamant that you have these independent services which operate decoupled and running this one backwards by a year should not affect that one, then those services should be in separate repos. If on the other hand they have some moderate coupling and rewinding this service by 1 year would break the APIs that that service uses to communicate... then those should ideally be in the same repo so that you can coordinate changes between them to their shared protocol.

            • By vineyardmike 2021-10-0618:22

              > So your engineers probably all have access to all your repos until you reach a very large size anyway.

              Happens at my company. We have a rudimentary ACL, but I'm not sure how it's implemented, because you can find things via explicit searching, or via "organic finding" via links from repo->repo, but it won't be surfaced if you just search for code.

          • By packetslave 2021-10-0619:50

            You can still have a monorepo and restrict who has access to certain parts of it. You just have to build the tools to do it.

            Google, for example, has a small number of subdirectories in the tree that only certain engineers can view (the really sensitive stuff, like the actual ranking algorithms for search and ads) but the build system is set up to allow you to still link against it.

          • By munk-a 2021-10-0618:02

            Not particularly - unless different teams are highly focused on certain subsections of the repository. If everyone might have to look anywhere then you'll need to download all the repos - whether that's one or five hundred.

        • By javajosh 2021-10-0617:553 reply

          How often do devs delete and re-clone?

          • By Retric 2021-10-0618:022 reply

            Clean OS install or new hardware should both be daily events at even mid sized companies. Because even if it’s once every 2-4 years per developer that still becomes extremely common in aggregate.

            • By Marsymars 2021-10-0619:18

              I think the tech giants have warped some people's expectations of what a "mid-sized" company is. I work for a mid-sized company where we roll our own ERP system and we probably average about two clean OS installs per year across the entire development team.

            • By javajosh 2021-10-0618:26

              Yes, but GP said "every engineer, regularly" which seems odd.

          • By vineyardmike 2021-10-0618:23

            I suspect lots of junior devs will clone fresh, push changes, nuke repo and repeat. I did when I was young instead of syncing state and rebasing.

      • By com2kid 2021-10-0617:431 reply

        > Even an employee shouldn't be able to download 125GB of stuff without flipping a safety switch somewhere.

        I am trying to recall, but I am pretty sure when I worked in Microsoft Office that a build would pull down many tens of gigabytes of data.

        125GB in one day from the build system wouldn't be uncommon!

        • By Raidion 2021-10-0618:151 reply

          That's ingress though. Companies should be monitoring and worrying about egress.

          Edit: This won't help against a thumbdrive, but that type of thing should be also tracked.

          • By AustinDev 2021-10-0619:37

            I'm working on a project and just had to repull my workspace after some local corruption. I pulled 1.2TB out of the office and never got an email. I think it's pretty common for places not to monitor egress that closely.

      • By tptacek 2021-10-0618:234 reply

        There was a fad for tools that accomplished this in enterprise networks, with much clearer rules for who needs to access what (it was called "data loss prevention", or DLP) and those tools for the most part don't work. This is a harder problem than it looks like.

        • By unethical_ban 2021-10-0620:101 reply

          DLP products tend to be more about scanning the contents of data for sensitive patterns, at least in my observation of the market. There are other products (typically built into SIEM) that do correlation on login events, network traffic and whatnot to detect anomalous behavior.

          • By AmericanChopper 2021-10-0622:09

            I’ve worked on a lot of DLP projects in big enterprise, and I have a very dim view of the entire category of product. A lot of their functionality is just magic black boxes that, unsurprisingly, achieve very little. The primary motive for deploying them is not that they’re particularly effective, it’s so that you can tell auditors and other scrutineers that you’ve got a “DLP solution”. The idea that you can grant people access to huge quantities of information, but then very strictly control what they do with it, is fundamentally flawed. Especially on networks that require large amounts of in and outflow for BAU. Even the most tightly controlled data in the world cannot be protected from an inside leaker (or an adversary who has taken control of an insider’s access), because it runs into the same “analog hole” issue that DRM products have.

        • By tfigment 2021-10-0619:26

          My company has this. It encrypts any file touched on USB. And other software logs every app run. Prevents casual copying but easily circumvented. But somewhere logs may have enough info to trace the source of leak I guess.

        • By realitylabs 2021-10-0619:551 reply

          These tools (DLP) have gotten better with app migration to K8s, since traffic can be watched prior to encryption in a standardized way. Just an FYI….

          • By tptacek 2021-10-0620:20

            The enterprise DLP tools were deployed fleetwide as agents and at network choke points; getting access to the raw data wasn't the problem.

        • By cpach 2021-10-0619:21

          Thank you for mentioning this. I always had a gut feeling that it seems like an extremely hard problem to solve in a sensible way.

      • By outworlder 2021-10-070:49

        > It's something I would expect security hardware to have automatically stopped. Even an employee shouldn't be able to download 125GB of stuff without flipping a safety switch somewhere.

        Remember that Twitch handles streams. Good luck implementing this without having all sorts of false alarms everywhere.

        Plus, you don't have to exfiltrate 125GB in one go.

      • By cheeze 2021-10-0618:35

        I feel like once you have it pulled down, it would be as simple as an upload to S3 (which wouldn't trigger any flags), then making the bucket public whenever you want. Hell, S3 used to (still does?) support being part of a torrent swarm...

      • By ljm 2021-10-0619:28

        Why would that help? They just have to accumulate work over a period of time and then 'lose' their laptop.

      • By toomuchtodo 2021-10-0617:211 reply

        That's 6.25GB/day over a 20 day working month. More time, less data per work day, harder to detect.
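The arithmetic above is easy to check. A minimal sketch, assuming the 125GB figure from the article and treating the longer time windows (60 and 250 working days) as made-up illustrations rather than anything known about this breach:

```python
# Average daily transfer needed to move a fixed amount of data
# over a given number of working days.
LEAK_SIZE_GB = 125  # size reported for the torrent


def gb_per_day(total_gb: float, working_days: int) -> float:
    """Return the average GB per working day."""
    return total_gb / working_days


# 20 days is the figure from the comment; 60 and 250 are hypothetical.
for days in (20, 60, 250):
    print(f"{days} working days: {gb_per_day(LEAK_SIZE_GB, days):.2f} GB/day")
```

Stretching the window from one working month to roughly a working year drops the average from 6.25GB to half a gigabyte per day, which is the point being made: more time, less data per day.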

        • By jandrese 2021-10-0617:451 reply

          And it might be disguised as a video stream coming out of the video streaming servers.

          But it could also be a 128GB thumb drive plugged into the system somewhere.

          • By vineyardmike 2021-10-0618:25

            > And it might be disguised as a video stream coming out of the video streaming servers.

            Just log in to FB messenger or Discord and egress it as small data chunks that way. Lots of people have private chats on work computer for practical purposes.

            Discord allows for bots, so you could easily write a script to chunk data and egress, and another to re-assemble.

      • By ABeeSea 2021-10-0619:26

        ML engineers / data scientists are regularly moving terabytes of data around at Amazon.

      • By yawaworht1978 2021-10-0618:58

        Indeed, how could this happen? Really curious.

        So let's say someone with access to all GitHub repos gave the password to someone else, maybe then it was downloaded from another machine?

        Or someone stole the credentials and downloaded from another machine?

        Or someone got access to such a machine?

        Is it not possible to prevent these cases?

        How long does such a download take?

      • By stefan_ 2021-10-0617:421 reply

        Cue monorepo discussion

        • By munk-a 2021-10-0618:04

          Cue "Don't check payment receipts into git" discussion - although I strongly suspect this hack wasn't just about acquiring appropriate credentials and then running `git clone`. It sounds to me like a backup service was compromised.

    • By ArlenBales 2021-10-0617:101 reply

      There are so many indiscreet USB pentesting devices easily purchasable by anyone today, I'm actually surprised this sort of thing doesn't happen more often.

      • By SketchySeaBeast 2021-10-0617:431 reply

        Shouldn't that be discreet devices? Or do they make a really high pitched whine with a big flashing light when they start transferring data?

        • By angst_ridden 2021-10-0617:49

          "Hey, Jeff, what's that weird thumb-drive over there that keeps texting me `I'm in your datacenter downloading your datas'?"

    • By aahortwwy 2021-10-0623:101 reply

      ITT: people shocked that something like this could happen at a company the size and profile of Twitch.

      Running security at scale in a hypergrowth B2C company is very difficult. It's also completely different from running security at a startup, in a B2B company, or a slower-growth situation. _Every_ security executive and manager I've met has given up in frustration after 12-24 months and gone to take a cushy FAANG job instead.

      I'm not surprised at all. My experience in security at a larger SV unicorn was that changes only happened in the immediate aftermath of a security crisis. Otherwise, there was incredible inertia and you just wouldn't be able to get the institutional support you needed to make progress.

      • By xwolfi 2021-10-0716:561 reply

        It's funny because for me each letter of FAANG is a hypergrowth B2C company...

        • By aahortwwy 2021-10-111:38

          All of them have very significant B2B products.

    • By koolba 2021-10-0617:093 reply

      How much of this is a holdover of lax security practices from before they were acquired? I can’t imagine AWS being managed in a way where local network access gives you keys to the kingdom. Then again, EC2 instance profiles do let you do quite a bit.

      • By lamontcg 2021-10-0618:041 reply

        Conflating AWS security with twitch security is probably the wrong way to think about it.

        Within Amazon those are almost going to be two entirely separate companies, with very different security focuses.

        The idea that Amazon is monolithic and uniform wasn't true when I left there in 2006, and I'm certain it is less so now.

        And that isn't just because of the merger; fundamentally, they're different business orgs with different focuses.

        • By vineyardmike 2021-10-0618:292 reply

          But does twitch not share the same Amazon wide git service? Could most of Amzn code be leaked or compromised? Seems like all of amazon internals that shares security measures is at risk...

          • By cheeze 2021-10-0618:42

            I've heard (but don't have any actual evidence more than hearsay) that Twitch generally operates independently of Amazon/AWS. I'm sure that they share some things, but I wouldn't be surprised if their source was separate from the "main repo"

          • By bleepblooop 2021-10-071:20

            Remember that Amazon runs one of the biggest multi-tenant service platforms in the industry! A separate business unit like Twitch is likely to be set up a lot like any other random AWS customer, and you wouldn't expect that compromising servers used by one AWS customer to automatically compromise the underlying infrastructure.

            (I would also expect that the Amazon retail systems are in most senses "just another tenant" on AWS, albeit with much more liberal quotas!)

      • By this_user 2021-10-0620:19

        I always had the impression that Twitch were operating in a largely independent fashion. For instance, it had been an open secret for years that one of their executives had been sexually harassing female streamers. Only a year ago he was finally fired. If Amazon had a firmer grip on Twitch, I'm sure they would have stepped in much earlier.

      • By ganoushoreilly 2021-10-0618:19

        If you go back to the Adobe software breach circa 2013, a large part of their issues were the bolt on connections between acquisitions. It's honestly the most common thing I see in the startup world.

    • By slightwinder 2021-10-0622:431 reply

      > It's also strange that someone who has this level of access to what is presumably a multi-billion dollar company decided to just leak the data?

      From what I heard about Twitch-interns over the years, it seems the company is more a third-rate-s**hole that grew too big too fast and accumulated a huge amount of technical debt and fatal security flaws. Making billions doesn't mean anything if you don't invest them back into the important corners of the company. It's considered a miracle that the platform is still working that well in that state. And what comes from the leaks so far supports this view.

      Though, said that, it seems they did start to improve one or two years ago, just too late to prevent this critical hit. But considering this was also a strike that avoided the deadly parts (yet), maybe there is a different aim here and the company can grow from this? It will be interesting to see how Amazon will react to this.

      • By superfrank 2021-10-0623:161 reply

        > From what I heard about Twitch-interns over the years, it seems the company is more a third-rate-s*hole that grew too big too fast and accumulated a huge amount of technical debt and fatal security flaws.

        I mean this as a genuine question, but is there any company that didn't end up like this after an exponential growth phase? I'm not saying it's okay, but this feels par for the course. I've now been at two start ups during that hockey stick growth time and both went through this as well.

        I'd be curious if anyone here has worked at a large, fast growing tech company where they didn't accumulate a ton of technical debt during growth. If so, what did the company do to prevent that?

        • By slightwinder 2021-10-0623:35

          Generally yes, but Twitch is not your average startup. It's now 10 years old, and 7 of those years it was owned by Amazon, which should have enough competence and manpower for bringing it onto a good course. But from what I heard, Amazon did neglect Twitch for a long time and focused too much on making it a profitable business by all costs. Because of which they had all those scandals and problems in the last years. It's a business-platform, where technology is just an afterthought.

    • By yupper32 2021-10-0617:242 reply

      Does anyone know if Twitch employees have two factor auth? Having access to an employee's account would be the easiest way to pull this off.

      It'd be strange if they don't have two factor auth, of course, but it's just as strange to have this large of a hack.

      I think if it is a simple case of an employee account takeover, then the attack would "work" to some extent at any company. Larger companies typically have strict data access requirements, though. Good luck finding the few employees who have raw access to Google password hashes, for example. And even more luck knowing how to get that data if you do.

      • By some_furry 2021-10-0617:341 reply

        > Does anyone know if Twitch employees have two factor auth?

        Yes, IIRC everyone at Amazon has a hardware security key (which is more secure than the standard mobile app TOTP most of us use everywhere online).

        • By ramesh31 2021-10-0617:402 reply

          >(which is more secure than the standard mobile app TOTP most of us use everywhere online).

          Is it though? The "wrench theory" applies here. It's not unthinkable that an employee was stalked on social media and had their key stolen.

          • By bawolff 2021-10-0617:491 reply

            It's still more secure. Rubber hose cryptanalysis applies to both equally, but that doesn't mean there aren't other attacks that apply to TOTP which don't apply to yubikeys.

            More secure != perfectly secure.

            • By xmprt 2021-10-0617:554 reply

              With a phone you need my passcode to accept the 2FA request (assuming lock screen notifications are disabled). I think yubikeys can work without a passcode as long as you plug it in, right?

              • By bawolff 2021-10-0618:06

                Right, but presumably the site is already asking for a password, and if the attacker can bypass one password, I'm not sure it's a safe assumption that they can't bypass two. However, fair enough. Some yubikeys do involve fingerprint scans too though.

                The main security benefit is unphishability. With yubikey/WebAuthn, crypto is used so you can't give the code to the wrong website. Phishing is a pretty major cause of account hacks generally, so pragmatically that is a very big win.
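To make the phishing point concrete, here is a minimal sketch of how an app-based TOTP code is computed, following RFC 6238 (HMAC-SHA1 over a time counter). The function name and defaults are illustrative, not any particular authenticator's API. Note that the only inputs are the shared secret and the clock: nothing ties the code to the site that asked for it, so a code typed into a look-alike page can be relayed to the real one.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Compute a TOTP code per RFC 6238 using HMAC-SHA1."""
    # The moving factor is just the number of `step`-second windows since the epoch.
    counter = unix_time // step
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation (RFC 4226): the low nibble of the last byte picks an offset.
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


# RFC 6238 test vector: ASCII secret "12345678901234567890", time 59, 8 digits.
print(totp(b"12345678901234567890", 59, digits=8))  # → 94287082
```

A hardware key speaking WebAuthn instead signs a server challenge together with the origin the browser reports, so a response produced for a phishing domain does not verify on the real site.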

              • By cheeze 2021-10-0618:391 reply

                It's still the same, 2fa.

                With a Yubikey, you need to use your password to log in to your computer, and then need to auth using Yubikey.

                With OTP app, you need to use your password to log into your computer, passcode for phone, and then auth.

                In both cases, it's something you know, and something you have. You could argue that the app-based one is a bit more secure in that you need two passwords. On the flipside, if your phone gets pwned, someone can get access completely remotely.

                Everything is a tradeoff.

                • By bawolff 2021-10-0619:00

                  Why would you need to log into your computer with a yubikey? Wouldn't any computer (including the attacker's computer) work?

              • By tenryuu 2021-10-0621:31

                Amazon still has a passkey requirement, it's not just a touch of the key, and these passwords are different to your user passwords at login.

              • By some_furry 2021-10-0618:03

                They require a physical touch.

          • By some_furry 2021-10-0617:42

            Yes.

            I don't know which protocols they use (obviously), but if they use WebAuthn, everything is public-key signatures. Even if you leak everything from the server, public keys buy you nothing.

            https://webauthn.guide/

      • By AustinDev 2021-10-0619:391 reply

        Every Twitch developer has 2FA; even 3rd party developers are required to have it. I also think, but don't know, that this applies to Twitch Broadcaster Partners as well, in order to have their tax information in the system.

        Luckily iirc from a conversation with a senior Twitch engineer the Tax information backend has been migrated to Amazon. So hopefully that did not leak... Because that would be full legal name and addresses of a ton of streamers that likely have stalkers.

        • By lrae 2021-10-0620:23

          Twitch partners also have forced 2FA for quite some time now, should be a couple of years now - at least more than a year though. Covid killed my sense of time.

    • By gorgoiler 2021-10-0619:19

      Facebook [2011] was pretty bad…

      https://www.theguardian.com/technology/2012/feb/17/facebook-...

      …except Mangham didn’t ever get to release his spoils to The Internet?

    • By dilyevsky 2021-10-0620:09

      > I can't think of another, large, corporate web 2.0 startup who's gotten owned in a similar fashion

      Linkedin, Microsoft, Yahoo, Google

    • By pizzazzaro 2021-10-0617:46

      Considering who owns it? There's conversations with government agencies asking "are we okay?" To the world's richest man.

    • By FormerBandmate 2021-10-0617:05

      I mean, it did work on Amazon (a division with poorer security probably, but still). 4chan is a truly special place

    • By kordlessagain 2021-10-0618:021 reply

      From an ethical standpoint, any code that amplifies and profits from radical speech should be fair game for release. If employees or hackers feel the need to release info in that regard, so be it. This is the risk defined in such models and should be mitigated accordingly.

      • By heurisko 2021-10-0618:081 reply

        Who decides what speech is radical enough to compromise the privacy of users?

        And if speech is "radical" meaning to the point of illegality, shouldn't the legal system decide, rather than the court of public opinion?

    • By Hokusai 2021-10-0617:052 reply

      > this isn't something I'd expect from an Amazon owned property

      Because you expect Amazon to put security priority over new features and profit? We have very different understandings of what Amazon stands for.

      • By nemothekid 2021-10-0617:281 reply

        >Because you expect Amazon to put security priority over new features and profit?

        I don't know what you think Amazon stands for, but Amazon runs the largest cloud hosting service in the world - AWS, which not only runs a large number of other large companies but governments as well. I know, first hand, that their datacenter security protocols are state of the art.

        Amazon has a much larger surface attack area so if they were playing fast and loose with security, chances are we would know already.

        • By Hokusai 2021-10-0618:182 reply

          > Amazon has a much larger surface attack area so if they were playing fast and loose with security, chances are we would know already.

          I get your point, and I am not talking about AWS but about Twitch. Each part of the company has its own incentives. Amazon is well known for not caring about quality nor its employees. In my experience with corporations there is little to no technical sharing between different parts of the company. AWS could have the best SecOps in the world and Twitch could have no security at all. Is your experience different?

          • By nemothekid 2021-10-0619:21

            I'm not sure what point you are trying to make. If you look at most of the high profile hacks and leaks in the past 20 years, very few of them are from web 2.0 tech companies (e.g. Google, Facebook) rather than dinosaurs (ex. Target). Those that have (like Google) have only been successfully breached by nation actors (e.g. China, NSA).

            As far as I can tell, there's no data to back up the assertion that these large tech companies are disregarding security in favor of profits, except for Twitch now, which is why this leak is interesting to me.

          • By vineyardmike 2021-10-0618:30

            > In my experience with corporations there is little to no technical sharing between different parts of the company.

            Amazon is all about sharing efforts within the company. That's the whole point of AWS - it's a monetization of these efforts. Most older AWS services started out as internal services that someone realized were generally useful.

      • By adrusi 2021-10-0617:15

        EC2, Amazon's cash cow, competes with nearly identical offerings from Microsoft and Google, and is not a place where additional features are often all that valuable to customers. Any sort of breach like this on EC2 would seriously hurt Amazon's bottom line and they know it.

  • By dolores_ab 2021-10-0617:1212 reply

    Someone actually started streaming going through the code ... on twitch.

    https://www.twitch.tv/deepfrieddev

      • By kuroguro 2021-10-0618:461 reply

        On one hand I understand why you'd ban that kind of content, on the other it's essentially public information now... what's the point.

        • By AnIdiotOnTheNet 2021-10-0619:562 reply

          Because everyone else doing it still doesn't make it right.

          • By kuroguro 2021-10-0621:38

            The streaming part or the downloading/looking at code?

            You can look at leaked source code for educational purposes in most places (not legal advice). As far as I understand leaks are commonly used in vulnerability research for example (if the bad guys can use it so can bug hunters).

            Streaming copyrighted material is a separate issue - but using it for "criticism, comment, news reporting, teaching" should fall under fair use, no?

          • By OnlineGladiator 2021-10-0620:281 reply

            What's wrong with looking at public code? The code is public, regardless of how it became public - this isn't someone's personal life being exposed. If twitch is damaged by streaming this, it's only because their poor code quality is being examined publicly.

            I can certainly understand why twitch banned this and don't blame them (although I think it's stupid), but I see nothing unethical about openly talking about this code in the public now that it's already there.

            • By AnIdiotOnTheNet 2021-10-0622:361 reply

              > What's wrong with looking at public code? The code is public, regardless of how it became public

              Copyright would disagree with you, and I would say that ethically it is basically the same as stealing it yourself. You're profiting off of someone else having done the dirty work for you.

              > this isn't someone's personal life being exposed.

              Apparently a lot of payment information, telephone numbers, etc. was also in the leak. I don't think we should be downloading or encouraging people to download and peruse that stuff.

              • By OnlineGladiator 2021-10-0623:01

                > You're profiting off of someone else having done the dirty work for you.

                I don't think anybody is streaming this stuff on twitch with the intention to make money, anymore than someone sharing it on a blog is trying to make money. Sure, in that edge case I'd agree with you, but it seems like the exception to the rule (after all people can just go look at the code themselves for free). I'm not talking about the guy who stole the code and is likely ransoming Amazon with it - I'm talking about people that just like to talk about code because it's something they like to do (there's an entire category for it on twitch already).

                > Apparently a lot of payment information, telephone numbers, etc. was also in the leak. I don't think we should be downloading or encouraging people to download and peruse that stuff.

                My limited understanding is none of this information actually has been leaked yet, and is likely part of a future ransom (I could be wrong, I haven't looked because I don't care). I don't condone sharing that either, but that's not what the guy streaming was sharing. I'm talking about discussing the source code which is already publicly available.

                > Copyright would disagree with you

                I know very little about copyright so I'll just assume you're right. I still see no ethical problem with openly discussing this code publicly though. Anyway, agree to disagree.

      • By Philip-J-Fry 2021-10-0618:26

        They = you. It's fine to be honest, you're not exactly making it unobvious.

    • By CoolGuySteve 2021-10-0617:564 reply

      "Sorry. Unless you’ve got a time machine, that content is unavailable."

      Too bad, it would be nice to see someone go through and document how Twitch works. I've never worked at "web scale" so I'd probably learn a lot.

      • By yupper32 2021-10-0618:011 reply

        > I've never worked at "web scale" so I'd probably learn a lot.

        As someone who has worked at both large and small companies, you'd probably be disappointed.

        • By AustinDev 2021-10-0619:401 reply

          It's likely lots of bubble gum and chicken wire. I'm sure in the video ingest and transcode side of things there are some really interesting bits though. When you're owned by Amazon you don't need to optimize too much to achieve web scale... just leverage AWS services. It's not like you're going to get a bill.

          • By whack 2021-10-0620:15

            > When you're owned by Amazon you don't need to optimize too much to achieve web scale... just leverage AWS services. It's not like you're going to get a bill.

            Oh, you'd be surprised. Divisions get billed constantly for the AWS resources they consume, and this bill gets taken out of their annual budget. From what I hear, this is a common practice in most large organizations.

            Also, the AWS services you can access from within Amazon are almost identical to the AWS services you can access as an external customer. It's equally easy/hard for a random company to achieve web scale, compared to Twitch.

      • By peterkos 2021-10-0618:253 reply

        A lot of it is probably hacked together -- like, embarrassingly hacked together lol

        • By cheeze 2021-10-0618:431 reply

          This is true about almost any company. Closed source generally means you can have lower standards.

        • By dijit 2021-10-0622:251 reply

          You’re being downvoted for being overly negative, but the ops code is of (literally) shockingly poor quality.

          This leak has made me understand clearly that code quality is not what makes a product great.

          I guess that’s something.

          The jenkinsfiles are mostly nice and clean though. I’ve definitely seen worse of those.

          • By peterkos 2021-10-081:44

            Oops, didn't mean to be too too negative. I say embarrassing in the sense of, I've definitely shoved out awful code because something needed to get out(tm). And with large companies, deadlines that cause that situation are inevitable.

            But I also say it like that because, well, I've seen code that causes (objectively easy-to-fix) crashes but still ships because of one reason or another: laziness, politics, inexperience. It's a part of software engineering I'm still trying to accept.

        • By phgn 2021-10-0620:45

          Yep, there are lots of small services that don't seem production ready in the source code. Though admittedly we don't know which of those are deprecated.

      • By Arnavion 2021-10-0621:41

        Well, you know what they say, "Self help is the best help."

      • By wesleytodd 2021-10-0620:01

        I hear Netflix has a good tech blog ;)

    • By jedberg 2021-10-0617:46

      Hah. This is like when reddit does something people don't like and there is a huge thread about it ... on reddit.

    • By phgn 2021-10-0620:42

      It is really fun to go through the source code. You'll find interesting architecture diagrams, documentation etc. It's like joining a new job and being amazed at how a service you actually use was built.

      Everyone interested, just download the code :)

    • By onnnon 2021-10-0617:551 reply

      Channel is gone, banned?

      • By jeffalo 2021-10-0617:56

        Yep, we saw it happen live.

    • By echelon 2021-10-0617:541 reply

      It just got disconnected.

      The chat had a few Amazon insiders, which was interesting to read their perspectives.

      • By treesknees 2021-10-0618:16

        Any bits you recall from the chat?

    • By mawaldne 2021-10-0617:58

      This no longer works. Guy got banned I think.

    • By Orphis 2021-10-0617:55

      And banned

    • By Avery3R 2021-10-0617:54

      got banned

    • By Nickoladze 2021-10-0617:55

      aaaand it's gone

    • By inputError 2021-10-0617:55

      they just got banned as of 10:55 AM October 6, 2021 PDT

  • By mastermojo 2021-10-0618:172 reply

    There's something about this sentence that I find hilarious:

    The download was posted to 4chan today, described by its unidentified source as “part one” of “an extremely poggers leak,”

    • By wchar_t 2021-10-0621:111 reply

      I find it extremely ironic that they whine about Twitch being a "disgusting cesspool"... on 4chan.

      > Calling Twitch a “disgusting toxic cesspool,”

      • By snvzz 2021-10-0622:011 reply

        Ironic? Why?

        • By adamwk 2021-10-0622:361 reply

          Because as far as cesspools go, 4chan is the most toxic and disgusting

          • By snvzz 2021-10-072:08

            That's a bold claim.

    • By jallen_dot_dev 2021-10-0621:00

      This hack was not very xqcL of them.

HackerNews