May 6th, 2007

Michael Bernstein pointed me to a discussion at the S3 Forum on the new pricing infrastructure and how it penalizes smaller files. From there, I found a more disquieting thread.

When asked, "Are 404 and 403 errors charged under the new pricing plan?" the answer was:

As our intent is to charge equitably for system resources used, we will be charging the owner of the bucket for 403s and 404s, since they consume system resources (as do all requests). Note that we will not be charging for requests which fail due to an Amazon S3 internal system error (all other requests will be billed).

(emphasis mine)

In case you missed it, let me repeat this: we will be charging the owner of the bucket for 403s and 404s, since they consume system resources. What this means is that anyone can set up a simple script that sends requests for non-existent files to my S3 account, and I'll be charged for each request. That's true even if I set the bucket to be private and the requests return a 403, not authorized. A badly behaved bot, aggregator, or other application could do the same. The risk of maintaining my photos at the site has now become too great, and I'll have to plan on moving them by the end of the month.

Then there are 304 responses: they're also being charged. When asked, "HEAD and conditional/range GET requests will be billed at the GET request rate?" the answer was, "Yes, that is correct."
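To put a rough number on that exposure, here is a minimal back-of-the-envelope sketch. The per-request rate in it is a placeholder assumption, not a figure from the forum thread, so substitute whatever the current price sheet says. The function names are made up for illustration.

```python
from decimal import Decimal

# ASSUMPTION: placeholder rate for GET-class requests (which, per the forum
# answer, includes HEAD, conditional GETs, 403s and 404s). Not from the post.
ASSUMED_PRICE_PER_10K_REQUESTS = Decimal("0.01")


def request_cost(num_requests, price_per_10k=ASSUMED_PRICE_PER_10K_REQUESTS):
    """Dollars billed to the bucket owner for num_requests requests."""
    return Decimal(num_requests) / Decimal(10_000) * price_per_10k


def monthly_cost(requests_per_second, days=30):
    """Dollars billed if someone sustains the given request rate all month."""
    total_requests = requests_per_second * 60 * 60 * 24 * days
    return request_cost(total_requests)


if __name__ == "__main__":
    for rate in (1, 100, 1000):
        print(f"{rate} req/s for 30 days: ${monthly_cost(rate):.2f}")
```

Under that assumed rate a single slow script is cheap, but the bill scales linearly with whatever rate the script or botnet can sustain, and the bucket owner has no way to cap it.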

As was mentioned in another thread, Amazon had no intention of S3 being used for web access. It had no intention of this being used by anyone outside of larger, corporate uses through the company's EC2 CPU enabled applications. Surprising considering how this was the way the system was packaged for sale. So much for Web 2.0 thinking. So much for Amazon being one of the 'cool kids'. Be interesting to chat with Jeff Bezos about this at the next O'Reilly conference.

"So, Jeff, how does it feel to be the Web 2.0 man who broke Web 1.0 functionality?"

If you're using S3 in a public-facing capacity, I suggest, strongly, that you work on your exit from the system before June. If you're using the site for backup, make doubly sure that your site can't be accessed by the public. Even then, you might want to consider an exit of your own, or hope no one figures out your private bucket names.

S3 is effectively dead as a web service.

I looked more closely at a site such as SmugMug, which uses S3. I'm not sure if the company is using EC2, but regardless, the site is serving images through its own server, which then accesses S3. This means that the requests aren't direct to the S3 service.

If they can cache images, the company can probably limit GET requests. I imagine it might be able to work something out with Amazon not to be charged when any of their buckets is accessed anonymously, resulting in a 403, forbidden access.

I thought about putting something similar in place locally, but it's not worth it. Every request would have to go to my server, which would then have to turn it into a request to S3, cache the most recent photos, and serve up the image. Talk about 'break the web'.
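For the curious, the arrangement just described might look something like the following minimal sketch. It assumes a hypothetical `fetch_from_s3` callable that takes an object key and returns the image bytes; the class name and the `s3_requests` counter are likewise made up for illustration, and a real front end would also need HTTP handling, expiry, and error handling.

```python
from collections import OrderedDict


class RecentPhotoCache:
    """Keep the most recently requested photos locally (simple LRU cache)."""

    def __init__(self, fetch_from_s3, max_items=100):
        # fetch_from_s3 is a HYPOTHETICAL callable: key -> bytes.
        self._fetch = fetch_from_s3
        self._max_items = max_items
        self._items = OrderedDict()
        self.s3_requests = 0  # how many billable requests went out to S3

    def get(self, key):
        if key in self._items:
            # Cache hit: serve locally and mark the photo as recently used.
            self._items.move_to_end(key)
            return self._items[key]
        # Cache miss: this is the request that gets billed.
        self.s3_requests += 1
        data = self._fetch(key)
        self._items[key] = data
        if len(self._items) > self._max_items:
            # Evict the least recently used photo.
            self._items.popitem(last=False)
        return data


if __name__ == "__main__":
    cache = RecentPhotoCache(lambda key: key.encode(), max_items=2)
    for name in ("a.jpg", "a.jpg", "b.jpg", "a.jpg"):
        cache.get(name)
    print(cache.s3_requests)
```

This only trims the GETs that pass through my own server. It does nothing about requests sent straight at the bucket, which is the part I can't control.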

No, it's obvious that S3 was not intended for this type of service, and this is a move to chase those of us who 'corrupted' the service off. Though the costs seem low, it's the lack of control that's really becoming the issue.

As for bandwidth, it was never the issue for me that physical storage was.

Comments
1

That's EC2, not E2.

This isn't new. Using S3 as personal web storage has always been dangerous, if you're worried about Bad Guys causing you to be charged massive rates; they could always just request some large (existing) document millions of times.

2
Shelley - 5:41 pm 5/6/2007

Yup, dumb idea on my part.

3

Being a developer of S3 Backup, I'm as upset about this as you are and probably more, but I still hope Amazon will decide not to charge for anonymous requests, and that would solve most of the issues. Still, this trend is quite unnerving to people basing their business on Amazon WebServices.

4
Shelley - 6:13 pm 5/6/2007

Sergey, that is very true. I'm already fixing my problem by downloading the buckets to various drives, and then will figure out what to do with the material. For larger sites that have based businesses on this (such as SmugMug), well, I guess it really depends on how the service manages this access.

One could handle requests so that they go through a local service and then out to S3, but that's horrid from a web standpoint. Really horrid.

5
Shelley - 6:24 pm 5/6/2007

Looking at SmugMug more closely, they do serve the images up through their application, so they don't have direct public access to the images. They probably could control the GETs with judicious use of caching.

But they can't control the 403, which is a valid request type against a bucket when accessed without authorization.

6
Bud Gibson - 6:27 pm 5/6/2007

Well, looking at the pricing analyses on the forums and considering my own use case, sticking with my ISP makes the most sense.

The question in my mind is, will this ever be a profitable business for them? Generally, people want predictable pricing, but their own cost structure seems not to allow them to do that. Charging per request makes pricing unpredictable. Even if you could generally afford what the price turns out to be, the unpredictable nature drives people off. What exactly is Amazon's scale providing? Nothing that customers perceive as a benefit. This has been the problem of all computing-as-a-utility models that I've seen.

7

SmugMug actually welcomed the change in pricing as they are mostly uploading big objects and the new pricing encourages exactly that. If I understand correctly they only use S3 as a backup provider.

8
Shelley - 6:36 pm 5/6/2007

Sergey, that makes more sense, and fits what I've seen from trying to reverse engineer the site. But that's not a web service, that's just remote storage. And companies like SmugMug bring more money and bring Web 2.0 goodness, where folks like me are probably seen more as a barnacle, to be scraped off.

Perhaps Amazon needs to reframe how they describe the service.

9

This seems an obvious case where (in Schneieresque terms) capabilities are not in line with responsibilities. Essentially, S3 customers seem like they are going to be vulnerable to a new class of DoS attack, even if they are not using S3 as a webserver, and even though they can't do anything about it.

Anyone want to place bets on how long it takes some script-kiddie with a botnet to inflate an S3 customer's bill?

10
Allan - 12:30 pm 5/7/2007

I think a service like http://www.stradcom.com will help reduce these charges. Your bucket info needs to be cached somewhere so that you hit the Amazon servers only if you are sure the file is there.

11

Allan, we're talking about rogue requests; this service does nothing to prevent them.

12

It occurs to me rather belatedly that the first line of defense to prevent the DOS scenario for non-public buckets should be a simple setting that says "reject utterly any requests that don't come from within this range of IP addresses".
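As a rough illustration of what such a setting would check, here is a minimal sketch. Nothing in the discussion suggests S3 offers this; the network ranges below are reserved documentation addresses used as placeholders, and `is_allowed` is a made-up name.

```python
import ipaddress

# ASSUMPTION: placeholder ranges (reserved documentation networks).
ALLOWED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_allowed(client_ip, networks=ALLOWED_NETWORKS):
    """Return True only if client_ip falls inside one of the allowed ranges."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in networks)


if __name__ == "__main__":
    print(is_allowed("192.0.2.17"))   # inside the first range
    print(is_allowed("203.0.113.5"))  # outside both ranges
```

To help with the billing problem, a check like this would have to run on Amazon's side and rejected requests would have to go unbilled, since the bucket owner never sees the request before it is counted.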
