Macie: Automatically Discover, Classify, and Secure Content at Scale (amazon.com)
104 points by forrestbrazeal on Aug 14, 2017 | 36 comments



Dear aws,

Could you guys take it easy for a moment with spawning new weird services like horny hamsters, reflect on the ~372 you already have, and fix some aspects of them?

  How about letting me add basic auth to S3 or CloudFront?
  How about non-expiring signed s3 urls?
  How about restricted access to api-gateway?
  Hey, how about fixing Athena to stop throwing random errors, and actually support the fully featured presto syntax?

I could extend this list with 64 more items without even opening my dedicated aws complaints notebook.


You're more likely to get a big promotion for creating a bold new v1.0 than refining and polishing v3.5. This seems to be a big problem with Google, too. Out of whack incentivization at work, I suppose.

I wish more companies rewarded employees based upon NPS or some similar metric. I think that would address your complaint in fairly short order.


If you have examples for the Google side, I'd legitimately love to hear 'em (email dollars@ if the list is too long for character limit).

This has been far from my experience internally and what we're trying to share with users, but that doesn't mean we can't also improve.


How many chat apps and platforms has Google launched over the past 10 years? I can't even keep track of them anymore. Here are six'ish: https://www.digitaltrends.com/mobile/google-chat-video-apps-...

Edit: I don't have any complaints about GCP. I think you folks have done a great job of clearly positioning and segmenting your product lines. Props to you for that.


Tell me again which Android chat app I am supposed to use? Oh I guess it's that new one that barely works that replaced the less new one that barely works that replaced the old one that barely works. Or maybe it's that other new one Google just released...


And if they decide to bring out a new service they could for once continue building on one of the 10 existing different UI design languages instead of introducing yet another "sporty user interface".


Just curious: why do you want lower-bar security? If you don't care about security, then don't sign the S3 request (make it public). If you want it signed, then why is expiration an issue?


It's a spectrum right? Having a signed S3 request is slightly better than having a public link.


For all intents and purposes, a signed, non-expiring S3 request is a public link, isn't it? Anybody can access it in perpetuity just by following the link. The only difference is that the signed request is unpredictable and public links are probably predictable by default. So just include a UUID in your key and you are set.
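A minimal sketch of the "UUID in your key" idea, assuming a made-up `shared/` prefix and a helper name (`unguessable_key`) that is purely illustrative:

```python
import uuid


def unguessable_key(filename: str, prefix: str = "shared") -> str:
    """Build an S3 object key with a random UUID4 path segment."""
    # uuid4() carries 122 random bits, so the key cannot be enumerated in practice.
    return f"{prefix}/{uuid.uuid4()}/{filename}"


print(unguessable_key("report.pdf"))
```

Anyone who has the resulting link can still fetch the object forever, which is exactly the point being made: unpredictability is the only protection either approach offers.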


The public link can be just as unpredictable.


You can implement basic auth using Lambda@Edge.
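For anyone who hasn't seen it, a rough sketch of a basic-auth check as a CloudFront viewer-request function might look like this. It assumes the Python runtime for Lambda@Edge and hard-coded demo credentials (`EXPECTED_USER` and `EXPECTED_PASSWORD` are made up for illustration), so treat it as a starting point rather than a recommended setup.

```python
import base64
import hmac

# Hypothetical credentials for illustration only; do not hard-code real ones.
EXPECTED_USER = "demo"
EXPECTED_PASSWORD = "change-me"


def _expected_header() -> str:
    """Build the Authorization header value the client is expected to send."""
    token = base64.b64encode(f"{EXPECTED_USER}:{EXPECTED_PASSWORD}".encode()).decode()
    return f"Basic {token}"


def handler(event, context):
    """Viewer-request handler: pass the request through or answer 401."""
    request = event["Records"][0]["cf"]["request"]
    # CloudFront lower-cases header names and stores each as a list of {key, value}.
    auth = request["headers"].get("authorization", [])
    supplied = auth[0]["value"] if auth else ""

    # compare_digest avoids leaking the match position through timing.
    if hmac.compare_digest(supplied.encode(), _expected_header().encode()):
        return request  # returning the request lets CloudFront continue

    return {
        "status": "401",
        "statusDescription": "Unauthorized",
        "headers": {
            "www-authenticate": [
                {"key": "WWW-Authenticate", "value": 'Basic realm="restricted"'}
            ],
        },
    }
```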

You could also implement your own non-expiring signed URLs with Lambda@Edge too.
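One way to picture a home-grown non-expiring signed URL is an HMAC over the path, checked at the edge. The sketch below is an assumption about how that could be done, not an AWS feature: `SECRET`, the `sig` parameter name, and the helper names are all invented, and in a Lambda@Edge function the `path` and `querystring` arguments would come from the request's `uri` and `querystring` fields.

```python
import hashlib
import hmac

# Hypothetical signing secret, assumed to be shared by the URL issuer and the edge function.
SECRET = b"replace-with-a-real-secret"


def sign_path(path: str) -> str:
    """Return a hex HMAC-SHA256 signature for a URL path."""
    return hmac.new(SECRET, path.encode(), hashlib.sha256).hexdigest()


def signed_url_path(path: str) -> str:
    """Append the signature as a 'sig' query parameter (no expiry is encoded)."""
    return f"{path}?sig={sign_path(path)}"


def is_valid(path: str, querystring: str) -> bool:
    """Check the 'sig' parameter of a raw querystring against the path."""
    params = dict(p.split("=", 1) for p in querystring.split("&") if "=" in p)
    supplied = params.get("sig", "")
    # Constant-time comparison on bytes.
    return hmac.compare_digest(supplied.encode(), sign_path(path).encode())
```

Since nothing in the signature expires, revoking a link means rotating the secret, which invalidates every link issued with it.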


Yes. I know how to solve all the issues on my list. But I'm paying aws a fortune for them to solve my problems, not for me to find workarounds.

/end rant

[EDIT: Also, slightly drifting - me solving an aws issue is inefficient in the big scheme of things. We'd have N developers solving the same problem, each to himself. That's a DRY violation. And we can't even estimate the waste, lacking an aws issues/voting system]


rolls eyes

Is there some reason programmers feel this need to bring up workarounds when I'm looking for an actual solution? If I could count how many answers on StackOverflow are just "well if you use JQuery..." or "Well if you use Boost...".

Address the point the person you're responding to is making, and answer only the questions they are asking. Making assumptions just wastes people's time. I understand you're just trying to help, but it comes across as really condescending.

"You should be doing this you pleb!"

is how I always read these types of responses in my head, and it's really frustrating because often-times I'm already aware of whatever workaround you've mentioned, have already tried it, and know it's not suitable. It's doubly frustrating when you're specifically looking for the solution that's not the workaround.


There's no need to be like that.

> Is there some reason programmers feel this need to bring up workarounds when I'm looking for an actual solution?

It is a solution! It's not the same solution you want, but it's an actual solution to the problem you described.

> Address the point the person you're responding to is making, and answer only the questions they are asking.

They are addressing the point, and "answer only the questions they are asking" is a dreadful idea to me. So often people are asking how to solve a problem in a specific way, but there's no good reason to be limiting themselves in that way. Given that they currently can't easily solve their problem in that specific way, it suggests that if there is a nice solution they are likely looking in the wrong place.

> Making assumptions just wastes people's time.

I feel like not sharing a solution because you think that the person you're trying to help has some unexpected extra

> is how I always read these types of responses in my head, and it's really frustrating because often-times I'm already aware of whatever workaround you've mentioned, have already tried it, and know it's not suitable. It's doubly frustrating when you're specifically looking for the solution that's not the workaround.

Then you should make it more explicit what you've already tried. We're not inside your head, and many people haven't tried these things before. This is a good guide: http://www.catb.org/esr/faqs/smart-questions.html

I didn't realise lambda@edge was a thing, or that I could use it to solve these problems. If I'd asked the question, these would have helped me. Why should someone refrain from writing a concise and polite response helping me purely because they think I may have tried that before and it will annoy me?

> "You should be doing this you pleb!" is how I always read these types of responses in my head,

Then you may benefit from trying to work on this. Their reply has none of this snark or rudeness at all, and is simply listing some ways of solving the problem. You are the one who added this mentally, and then it annoys you.

Go back and read what they said. They very simply explained that AWS lets you do those things using lambda@edge. There was absolutely no reason to reply so rudely.


So, I totally agree with you on the tone of the GP, and that workarounds are worthwhile to share, but I feel like they have a point.

I feel the problem is more pronounced in the Javascript/webdev community (caveat: I'm recently dabbling in web front end development after a good number of years in Java-land) - the willingness to throw a poorly understood workaround or npm-incantation out there ("it usually works if I.."), rather than trying to actually get at the root of the problem is frustrating. I don't just want to make it work, I want to understand why it didn't before. It seems, from still shallow observation, that the community (such as a singular community exists) has an ethos of doing whatever to make it work and moving on.

Sometimes the root of the issue is a fundamental limitation that you need to work around, and workarounds are definitely useful in those cases, but they are definitely frustrating when they aren't accompanied by reasonably precise diagnosis of the problem they are working around.


And once you make those changes, please update the documentation on your website.


> once your data has been classified by Macie, it assigns each data item a business value

... and a thousand startup founders started the process, looked at the results, put their heads down on their desks, and wept quietly.


I don't get it. Are they crying because they have a ton of unsecured PII or because all their data is worthless?


When reading one of my jokes, feel free to use the funniest interpretation.


Yes.


Would you like Coffee or Tea? Yes.


Whiskey or Vodka?


Is there an actual use for this? Seems like it's another useless service from the AWS team instead of focusing on improving their interfaces, APIs and pricing options.


Data Loss Prevention (DLP) and related things, mostly. Like alerting you that you just wrote a bunch of private customer data to a world-readable bucket. Or that some random employee is downloading all of your privileged and confidential reports, which could mean their credentials have been compromised. Very very nice if you have data stored in S3 that you really want to keep secure.

DISCLAIMER: work for AWS, have met and talked with the Macie team on several occasions. Opinions on here are my own


Still, this seems like something for a poorly managed enterprise, probably one with a lazy CTO or no CISO. Personally, I'm kind of skeptical that these services actually offer value in a well-managed enterprise, especially if you work under the assumption that everything on the internet, or "the cloud", is inherently insecure. Let's see how long those "valley people" keep milking the "AI", "Deep learning", "Machine Learning" hype.


As many large enterprise shops migrate to the cloud, they need a way to handle data classification and sprawl in their S3 buckets. Magic tools that can label data for them are more useful than nothing, and those big shops won't blink at the price tag.


If you ever have to deal with large financial institutions, Data Loss Prevention is one of the items that will come up as part of their standard audit of your systems. It can be really hard (and expensive) to put in place, and something like this solves it in a flash.


Not cheap: https://aws.amazon.com/macie/pricing/

After first GB, $5 per GB processed by the content classification engine


That's $5 once for each GB, not $5 per GB per month. Seems reasonable to me compared to the cost of implementing a DLP solution.
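A rough sketch of that arithmetic, assuming the first GB is free and everything after it is billed once at the quoted $5 per GB (the function name and defaults are illustrative, taken only from the figures quoted in this thread):

```python
def classification_cost(gb_processed: float, free_gb: float = 1.0,
                        usd_per_gb: float = 5.0) -> float:
    """One-time content classification cost in USD under the quoted rate."""
    billable = max(gb_processed - free_gb, 0.0)
    return billable * usd_per_gb


# 500 GB processed once: 499 billable GB at $5 each.
print(classification_cost(500))  # 2495.0
```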


> When Jeff and I heard about this service, we both were curious on the meaning of the name Macie. Of course, Jeff being a great researcher looked up the name Macie and found that the name Macie has two meanings.

I somehow initially parsed the author's mention of Jeff as Jeff Bezos but soon realized she was referring to Jeff Barr, chief evangelist of AWS.


I don't think their SVM classifier would work over encrypted data, so if a service stores data encrypted at rest, is this useful at all?


Most people using S3 are encrypting their data with KMS. Haven't looked, but if it's like every other S3 consumer, you can write a trust policy for Macie.


> The first meaning of Macie that was found, said that that name meant “weapon”.

Is macie a French word for weapon? I'm familiar with the English word "mace", which is a weapon, but not macie.


I ask myself how many people are storing _potentially_ sensitive information without application-level encryption, such that AWS decides to build such a tool... slightly distressing.


I don't get it. In what scenarios would this be useful?


How do people train and build a service like this?



