We map the TUS[0] protocol to S3 multipart upload operations. This lets us obscure the S3 bucket from the client and authorize each interaction. The TUS operations are handled by a dedicated micro-service. It could be done in a Lambda or anything.
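A minimal sketch of how such a mapping can look, not the actual service. The class and method names are made up for illustration, state is kept in memory, and it assumes every PATCH body except the last is at least 5 MiB (S3's minimum part size). The `s3` argument is assumed to be something like `boto3.client("s3")`; it is injected so it can be swapped for a fake.

```python
from dataclasses import dataclass, field


@dataclass
class UploadState:
    key: str
    length: int            # total size, from the TUS Upload-Length header
    s3_upload_id: str      # UploadId returned by S3
    offset: int = 0        # bytes accepted so far (TUS Upload-Offset)
    parts: list = field(default_factory=list)


class OffsetConflict(Exception):
    """Client offset does not match server offset (HTTP 409 in TUS)."""


class TusToS3:
    """Hypothetical adapter: one TUS upload maps to one S3 multipart upload."""

    def __init__(self, s3, bucket):
        self.s3 = s3
        self.bucket = bucket
        self.uploads = {}  # in-memory only; a real service needs durable state

    def create(self, tus_id, key, length):
        # TUS POST -> S3 CreateMultipartUpload
        resp = self.s3.create_multipart_upload(Bucket=self.bucket, Key=key)
        self.uploads[tus_id] = UploadState(key, length, resp["UploadId"])

    def head(self, tus_id):
        # TUS HEAD -> report the current offset so the client can resume
        return self.uploads[tus_id].offset

    def patch(self, tus_id, client_offset, chunk):
        # TUS PATCH -> S3 UploadPart, then CompleteMultipartUpload at the end
        st = self.uploads[tus_id]
        if client_offset != st.offset:
            raise OffsetConflict(f"expected {st.offset}, got {client_offset}")
        part_number = len(st.parts) + 1
        resp = self.s3.upload_part(
            Bucket=self.bucket, Key=st.key, UploadId=st.s3_upload_id,
            PartNumber=part_number, Body=chunk,
        )
        st.parts.append({"ETag": resp["ETag"], "PartNumber": part_number})
        st.offset += len(chunk)
        if st.offset == st.length:
            self.s3.complete_multipart_upload(
                Bucket=self.bucket, Key=st.key, UploadId=st.s3_upload_id,
                MultipartUpload={"Parts": st.parts},
            )
        return st.offset
```

The authorization check would sit in front of each of these three methods, since the client only ever talks to the service and never sees the bucket or the S3 upload ID.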
Once the upload completes, we kick off a workflow to virus scan, unzip, decrypt, and process the file depending on what it is. We do some preliminary checks in the service, looking at the file name, extension, magic bytes, that sort of stuff, and reject anything that is obviously wrong.
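As a rough idea of what such a preliminary check can look like, here is a small sketch. The allow-list, the function name `precheck`, and the exact rejection rules are assumptions for illustration; only the magic-byte signatures themselves are standard.

```python
from pathlib import PurePosixPath

# Allowed extensions and the byte prefixes a file of that type should start with.
# The set of allowed types here is an example, not a recommendation.
MAGIC = {
    ".png": (b"\x89PNG\r\n\x1a\n",),
    ".jpg": (b"\xff\xd8\xff",),
    ".jpeg": (b"\xff\xd8\xff",),
    ".gif": (b"GIF87a", b"GIF89a"),
    ".pdf": (b"%PDF-",),
    ".zip": (b"PK\x03\x04",),
}


def precheck(filename: str, head: bytes) -> bool:
    """Return True if the name and the first bytes of the upload look plausible."""
    # Reject empty names and anything that smells like a path or contains NUL.
    if not filename or any(c in filename for c in ("/", "\\", "\x00")):
        return False
    ext = PurePosixPath(filename).suffix.lower()
    prefixes = MAGIC.get(ext)
    if prefixes is None:
        return False  # extension is not on the allow-list
    # bytes.startswith accepts a tuple of candidate prefixes.
    return head.startswith(prefixes)
```

A check like this only catches the obviously wrong cases, such as a PDF renamed to `.png`; it is a cheap filter in front of the scan, not a replacement for it.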
For virus scanning, we started with ClamAV[1], but eventually bought a Trend Micro product[2] for reasons that may not apply to you. It is serverless, based on SQS, Lambda, and SNS. Works fine.
Once scanned, we do a number of things. For images that you are going to serve back out, you for sure want to re-encode those and strip metadata. I haven't worked directly on that part in years, but my prototype used ImageMagick[3] to do this. I remember being annoyed with a Java binding for it.
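For reference, a sketch of the re-encode-and-strip step by shelling out to ImageMagick rather than going through a language binding. This is not the original prototype: the function names are made up, and it assumes the ImageMagick 7 `magick` binary is on the PATH (on ImageMagick 6 the command is `convert`).

```python
import subprocess


def build_strip_command(src: str, dst: str) -> list:
    """Build an ImageMagick command that re-encodes src to dst without metadata."""
    return [
        "magick", src,
        "-auto-orient",  # apply the EXIF orientation before it is removed
        "-strip",        # drop profiles and comments (EXIF, ICC, and so on)
        dst,
    ]


def reencode(src: str, dst: str) -> None:
    # Paths are assumed to be chosen by the service, never taken from user input.
    # check=True raises CalledProcessError if ImageMagick fails on the file.
    subprocess.run(build_strip_command(src, dst), check=True, timeout=60)
```

Writing to a fresh output file means the bytes you serve back out were produced by your own encoder, not by the uploader.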
[0] https://tus.io/
[1] https://www.clamav.net/
[2] https://cloudone.trendmicro.com/
[3] https://imagemagick.org/index.php