>Randomise your bucket names! There is no need to use company-backup.s3.amazonaws.com

This is really poor advice. It offers no real benefit, especially since any asset you access will betray your bucket name because it's part of the DNS resolution. Bucket names are emphatically public as much as a DNS name is public.

It can also create more problems. If you name something like companyname-production vs companyname-qa, you pretty much know right off the bat which environment you are about to mess up. Not so with random names or UUIDs.

This is also security by obscurity. If all one needs to know is the bucket name, you have already lost.

EDIT: As an exception to this, I randomize a portion of the bucket name when it is created by automation. But this is solely to avoid name clashes across separate clusters. The prefix will still be the same.
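For what it's worth, here is a minimal sketch of what I mean by randomizing only a portion of the name. The function name, the `prefix`/`environment` parameters and the suffix length are all made up for illustration; the only hard facts relied on are that S3 bucket names must be lowercase and between 3 and 63 characters long.

```python
import secrets


def make_bucket_name(prefix: str, environment: str, suffix_bytes: int = 4) -> str:
    """Build '<prefix>-<environment>-<random hex>' (hypothetical naming scheme).

    The readable part tells you which environment you are touching; the
    random suffix only exists to avoid name clashes between clusters.
    """
    suffix = secrets.token_hex(suffix_bytes)  # 2 hex characters per byte
    name = f"{prefix}-{environment}-{suffix}".lower()
    # S3 bucket names must be between 3 and 63 characters long.
    if not 3 <= len(name) <= 63:
        raise ValueError(f"bucket name has invalid length: {name!r}")
    return name


if __name__ == "__main__":
    print(make_bucket_name("companyname", "production"))
    print(make_bucket_name("companyname", "qa"))
```

The suffix is not meant to be a secret, so its length only needs to be large enough to make collisions unlikely.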

> I see this being claimed a lot, but isn’t all security by obscurity at the end of the day?

I Do Not Think It Means What You Think It Means [1].

To elaborate, the concept is not formal/mathematical, it's a design concept. You can distinguish between a security implementation that explicitly depends on a secret key or password, and an implementation that implicitly relies upon secret implementation details for its security. The latter is not intentionally designed as a carefully-controlled secret, and is therefore much easier to accidentally leak.

Whilst they don't in your example, choosing a password between 1 and 65K is a very bad decision to begin with (assuming the attacker knows this; if they don't, the password search space is far larger than the port search space).

In general, A does not improve your security 65K times, since a single attempt will tell you whether there is telnet on the port or not, whereas with B all you learn is whether you got the wrong password.

Now if you ran a dummy telnet that always gave slow 'wrong password' responses on the other (65K-1) ports, that would potentially increase the security 65K times, but it still isn't really a meaningful thing to do.

A is a realistic case of security by obscurity - there's a sizable number of people who believe that to be secure.

B is in my opinion much less realistic: very few people believe a password two bytes long (or better, with two bytes of entropy) to be secure. Even a trivial password like "TelnetSucks" scores 31 bits of entropy with https://apps.cygnius.net/passtest/.

B) Unless you know the password MUST be a number and MUST be between 1 and 65K (which is a terrible password requirement, e.g. a numeric password with a maximum value of 65,000 is as good as no password), you need to brute force the entire known character space up to some finite length. The sun will die first.
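To put rough numbers on the A vs. B comparison, here is a back-of-the-envelope sketch. It assumes the idealized case where the secret is chosen uniformly at random from a known space; that is not how the password tester linked above scores things, so don't expect the figures to match it. The function names are just for illustration.

```python
import math


def bits(search_space: int) -> float:
    """Entropy in bits of a secret chosen uniformly from `search_space` options."""
    return math.log2(search_space)


def uniform_password_bits(alphabet_size: int, length: int) -> float:
    """Entropy of a random string of `length` symbols over `alphabet_size` symbols."""
    return length * math.log2(alphabet_size)


def average_guesses(search_space: int) -> float:
    """Expected number of tries to hit a uniform secret, never repeating a guess."""
    return (search_space + 1) / 2


# A hidden port and a "password between 1 and 65K" are both roughly two bytes.
print("two-byte secret:", bits(65536), "bits,",
      average_guesses(65536), "guesses on average")

# An 11-character password over upper and lower case letters, if truly random.
print("11 random letters:", round(uniform_password_bits(52, 11), 1), "bits")
```

The point is only the order of magnitude: anything in the two-byte range falls to a trivial scan, while even a modest random password is far out of reach for the same approach.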

The problem is one of probabilities - even the most basic script-kiddie scanner is set up to find your telnet server. Right now there are hundreds, if not thousands, of machines scanning the entire IPv4 space over and over for exactly this kind of silly configuration. If you do something like this it will eventually be found and used.

No, the responses to the comment are a common fallacy on display, where rather than addressing the point of the thought experiment, which is clear enough, people attack the premise. There is no amount of defensive writing[1] that can bring relief to this situation.

I actually somewhat agree with your point; your example is simply not realistic. Your point is correct because people are using the term security by obscurity wrong. Security by obscurity means that you rely on the secret implementation of your algorithm. Our best encryption algorithms are public so they can be poked and peer reviewed. You are right that, with enough obscurity in the key, you attain security because brute forcing it is statistically infeasible.

At the most abstract level, security is RISK management which is related to SECRETS management. So, on some level it is true that security is equivalent to obscurity. But, that's like saying that cars are molecules. It is true, but it is not a useful statement.

There are two operative principles of security that you should research. 1) Defense in depth, where there is more than one layer of security that must be pierced. 2) Assume that the attacker knows absolutely everything about your system, design, ports, and so on - except for the key material.

I can think of one advantage: it makes it difficult for somebody to attack you with a typo attack. If all your buckets have a consistent naming scheme that is very strict, then somebody else could make a bucket named very similarly to one of yours, where a typo would be likely, and your data starts going to them.

I wouldn't call it poor advice. It isn't a control, more security by obscurity, but it doesn't exactly hurt anything either. I saw a situation recently where a bucket was accidentally opened to the world, but the name was a UUID and in the entire history of the bucket no request was logged other than from the intended clients.

Is fc20d856-2a7e-41ab-b072-9bb9a68c6bda production or 193565ac-9121-4071-8aeb-62f3111c4c97 or is that the dev setup or the staging data for the other service or...

To me the big question here is why these names have to be global. Why can't I have a UUID externally but a name and an account internally? Honest question, I assume there may be a significant issue as smarter people than me decided not to do it that way.

> in the entire history of the bucket no request was logged other than from the intended clients

This sounds sort of like dumb luck. It just means no one was looking for it; that doesn't mean it's secure. This all reminds me of the xkcd about making passwords that are easy for computers to guess and hard for people to remember[0].

Your security on buckets should be the bucket policy/permissions themselves, not the arbitrary naming of them. Security by obscurity is rarely secure and more about the illusion of security.
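As a rough illustration of checking the policy rather than trusting the name, here is a minimal sketch that flags `Allow` statements with a wildcard principal in a bucket policy document. It is deliberately simplified and is an assumption-laden toy, not an audit tool: it ignores ACLs, `NotPrincipal`, and what any `Condition` actually says (it just skips conditioned statements). The helper names and the example policy are made up.

```python
import json
from typing import Any, Dict, List


def _as_list(value: Any) -> List[Any]:
    """Policy fields may hold a single item or a list; normalise to a list."""
    if value is None:
        return []
    if isinstance(value, list):
        return value
    return [value]


def public_allow_statements(policy_json: str) -> List[Dict[str, Any]]:
    """Return unconditioned Allow statements whose principal is a wildcard."""
    policy = json.loads(policy_json)
    hits = []
    for statement in _as_list(policy.get("Statement")):
        if statement.get("Effect") != "Allow":
            continue
        principal = statement.get("Principal")
        wildcard = principal == "*" or (
            isinstance(principal, dict) and "*" in _as_list(principal.get("AWS"))
        )
        if wildcard and "Condition" not in statement:
            hits.append(statement)
    return hits


# Hypothetical policy granting anonymous read on every object in the bucket.
example_policy = json.dumps({
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::companyname-production/*",
    }],
})

for statement in public_allow_statements(example_policy):
    print("world-readable:", statement["Action"], "on", statement["Resource"])
```

A bucket with a UUID for a name and this policy attached is exactly as open as one called companyname-production.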

I couldn't agree more with your second point, but risk is usually considered the product of likelihood and impact. If I name my bucket 'bestbuy' vs '4fc6-43b0-bc19-75fe07e06133', the likelihood that some random is going to find my bucket increases dramatically.

The chance of it being found by someone guessing the name would increase dramatically. The chance of it being found by someone running a script that searches for buckets using DNS logs, code searches, etc would be the same.

Hackers don't often try to guess things. They run scripts. That's why it doesn't matter what you call the bucket.

IMO, the problem is in the use of the CA system, where control over "names" (e.g. subdomains) is shared with third parties (certificate issuers) instead of being solely with the user who wants to reserve names.

It is possible to have a non-CA PKI system where the user controls both the issuance of the public key and the associated name she will use. In such a system, no third party has control over names. People learn the user's name and the user's key from the same source: the user.

Thus there is no issue of trust re: using third parties, and thus no need for monitoring what names the third parties are issuing, e.g. via "certificate transparency" logs. CT logs do not need to exist.

This is not a new idea and it has been proven to work. I can prepare a post with examples if anyone is interested.

Yep. Can use crt.sh for this on a per domain level, I also wrote ausdomainledger.net as an experiment to index all subdomains in the .au TLD, querying the CT logs directly, which was a bunch of fun.

> How to "hide" private subdomains?

Symantec provides the option of label redaction (using the '?' symbol) for CT precerts with the certificates they issue. For example: https://crt.sh/?q=?.amazon.com.au . However, I'm pretty sure it's not supported by the CT RFC ...

Otherwise, I'd say wildcards.

Replacing the CA PKI with something else is very drastic and if possible, will probably take a very long time ...

Basically some resolvers submit all (some?) of their DNS query responses to a central database so that it can be searched later. It seems you can also install a passive "sensor" in your network that (presumably) passively MITMs DNS queries and then sends off the responses.

I don't know how hard it is to get access to the data, but:

> programs like RiskIQ's DNSIQ allow organizations to install a sensor on their network that reports back to RiskIQ and in exchange, the organization gains access to all the passive DNS traffic inside the central repository.

I did some analysis a few months ago and collected the names of approximately 100,000 buckets in the wild. Rough numbers, about 5% are open to the public for anonymous read, and about 5% of those are open for anonymous write.

I'm convinced that Chris Vickery, the guy behind a good many of the open bucket finds this year, has access to enterprise firewall/proxy logs. Not because the buckets would have been hard to find, but because you could spend a lifetime looking through thousands upon thousands of open buckets before you find anything interesting.

This is concerning b/c there have been a number of high-profile data breaches caused by over-reliance on S3 bucket obscurity, where the buckets were left with minimal or misconfigured permissions and GBs of data there for the downloading.

Concerning in the sense of "if you aren't sure why this is a story on HN" -> that you may be unaware that many large and generally technically competent firms are screwing this up and this repo/tool is yet one more reason to take this seriously.

Correct me if I’m wrong but last time I tried to make a new bucket’s contents public it was a real PITA. The default configuration is very locked down. So I think it’s never a case of minimal configuration and always misconfiguration.

I was curious, so I tried to see whether I could find anything compromising with it. It's mostly just public buckets of images used for websites, so nothing strange. Maybe the README is a bit too dramatic.

The code takes the CT hostname and tries to access a bunch of different buckets that might exist related to that hostname. So if you get a cert for foo.example.com it will ask s3 if foo.example.com.s3.amazonaws.com and www-foo.example.com.s3.amazonaws.com exist.
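Roughly, the candidate generation looks like the sketch below. This is my own reconstruction from the description above, not the tool's actual code: the function name is made up, only the two patterns mentioned (the bare hostname and the `www-` prefix) are included, and stripping a leading `*.` from wildcard entries is an assumption. It only builds the names and makes no network requests.

```python
from typing import List

S3_SUFFIX = "s3.amazonaws.com"


def candidate_bucket_hosts(hostname: str) -> List[str]:
    """Turn a hostname seen in a CT log entry into S3 hostnames to probe."""
    hostname = hostname.strip().lower().rstrip(".")
    # Assumption: wildcard certificate entries are reduced to their base name.
    if hostname.startswith("*."):
        hostname = hostname[2:]
    bucket_names = [hostname, f"www-{hostname}"]
    return [f"{name}.{S3_SUFFIX}" for name in bucket_names]


for host in candidate_bucket_hosts("foo.example.com"):
    print(host)
```

Each resulting hostname would then be requested to see whether a bucket by that name exists and what it allows anonymously.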