bucket-antivirus-function

This is a fork that:

  • supports Python 3.10
  • uses up-to-date versions of the libraries
  • can set the maximum file size and scan size via the environment variables AV_SCAN_MAX_FILESIZE and AV_SCAN_MAX_SCANSIZE
  • does not maintain the Datadog-related functions

This repository is maintained personally because the upstream project has stopped updating.
The main goals are keeping up with AWS Lambda Python version upgrades and adding more ClamAV configuration options.
I am not familiar with the Datadog-related features, so I have no plans to maintain them.
If you have any requests, please send me a DM on X (Twitter); I may be able to address them.
Note: because of AWS Lambda's ephemeral storage limits, this project is not suited to scanning very large files.

The following Make commands are not currently maintained:

  • pre_commit_install
  • pre_commit_tests
  • scan
  • update

Scan new objects added to any S3 bucket using AWS Lambda.

Features

  • Easy to install
  • Send events from an unlimited number of S3 buckets
  • Prevent reading of infected files using S3 bucket policies
  • Accesses the end user's separate installation of the open-source antivirus engine ClamAV

How It Works

[architecture diagram]

  • Each time a new object is added to a bucket, S3 invokes the Lambda function to scan the object
  • The function package downloads (if needed) current antivirus definitions from an S3 bucket. Transfer speeds between an S3 bucket and Lambda are typically faster and more reliable than from other sources
  • The object is scanned for viruses and malware. Archive files are extracted, and the files inside are scanned as well
  • The object's tags are updated to reflect the result of the scan, CLEAN or INFECTED, along with the date and time of the scan
  • Object metadata is updated to reflect the result of the scan (optional)
  • Metrics are sent to DataDog (optional)
  • Scan results are published to an SNS topic (optional); optionally, only INFECTED results are published
  • Files found to be INFECTED are automatically deleted (optional)

Installation

Build from Source

To build the archive to upload to AWS Lambda, run make all. The build process is completed using the amazonlinux Docker image. The resulting archive will be built at build/lambda.zip. This file will be uploaded to AWS for both Lambda functions below.

Create Relevant AWS Infra via CloudFormation

Use CloudFormation with the cloudformation.yaml located in the deploy/ directory to quickly spin up the AWS infrastructure needed to run this project. CloudFormation will create:

  • An S3 bucket that will store the antivirus definitions.
  • A Lambda function called avUpdateDefinitions that will update the AV definitions in the S3 bucket every 3 hours. This function accesses the S3 bucket above to download updated definitions using freshclam.
  • A Lambda function called avScanner that is triggered on each new S3 object creation, scans the object, and tags it appropriately. It is created with 1600 MB of memory, which should be enough; however, if you start to see function timeouts, this memory may have to be increased. In the past we recommended 1024 MB, but that started causing Lambda timeouts, and increasing the memory resolved them.

When run, CloudFormation will ask for two inputs for this stack (a scripted example of creating the stack follows this list):

  1. BucketType: private (default) or public. This is applied to the S3 bucket that stores the antivirus definitions. We recommend using public only when other AWS accounts need access to this bucket.
  2. SourceBucket: [a non-empty string]. The name (do not include s3://) of the S3 bucket whose objects will be scanned. Note - this is only used to create the IAM Policy; you can add or change source buckets later via the IAM Policy that CloudFormation outputs.
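
If you prefer to create the stack from code, a minimal boto3 sketch might look like the following (the stack name and source bucket name are placeholders; CAPABILITY_NAMED_IAM is assumed to be required because the template creates named IAM roles):

import boto3

cloudformation = boto3.client("cloudformation")

# Read the template shipped in this repository
with open("deploy/cloudformation.yaml") as f:
    template_body = f.read()

cloudformation.create_stack(
    StackName="bucket-antivirus",  # placeholder stack name
    TemplateBody=template_body,
    Parameters=[
        {"ParameterKey": "BucketType", "ParameterValue": "private"},
        {"ParameterKey": "SourceBucket", "ParameterValue": "my-source-bucket"},  # placeholder
    ],
    Capabilities=["CAPABILITY_NAMED_IAM"],  # assumption: the template defines named IAM resources
)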

After the stack has been created successfully, there are three manual steps that still have to be done (steps 1 and 3 can also be scripted; see the sketch after this list):

  1. Upload the build/lambda.zip file that was created by running make all to the avUpdateDefinitions and avScanner Lambda functions via the Lambda Console.
  2. To trigger the Scanner function on new S3 objects, go to the avScanner Lambda function console, navigate to Configuration -> Trigger -> Add Trigger -> Search for S3, and choose your bucket(s) and select All object create events, then click Add. Note - if you chose more than 1 bucket as the source, or chose a different bucket than the Source Bucket in the CloudFormation parameter, you will have to also edit the IAM Role to reflect these new buckets (see "Adding or Changing Source Buckets")
  3. Navigate to the avUpdateDefinitions Lambda function and manually trigger the function to get the initial ClamAV definitions into the bucket (instead of waiting for the 3-hour trigger). Do this by clicking the Test section and then clicking the orange test button. The function should take a few seconds to execute, and when finished you should see the ClamAV definition files in the AV definitions S3 bucket.
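
If you prefer to script steps 1 and 3, here is a minimal boto3 sketch, assuming the function names avUpdateDefinitions and avScanner created by the stack above:

import boto3

lambda_client = boto3.client("lambda")

# Step 1: upload the built archive to both functions
with open("build/lambda.zip", "rb") as f:
    zip_bytes = f.read()
for function_name in ("avUpdateDefinitions", "avScanner"):
    lambda_client.update_function_code(FunctionName=function_name, ZipFile=zip_bytes)

# Step 3: run the definitions update once so the definitions bucket is populated
lambda_client.invoke(FunctionName="avUpdateDefinitions", InvocationType="Event")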

Adding or Changing Source Buckets

Changing or adding Source Buckets is done by editing the AVScannerLambdaRole IAM Role. More specifically, the S3AVScan and KmsDecrypt parts of that IAM Role's policy.

S3 Events

Configure scanning of additional buckets by adding a new S3 event to invoke the Lambda function. This is done from the properties of any bucket in the AWS console.

[S3 event configuration screenshot]

Note: If configured to update object metadata, events must only be configured for PUT and POST. Metadata is immutable, which requires the function to copy the object over itself with updated metadata. This can cause a continuous loop of scanning if improperly configured.
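
The same PUT/POST-only trigger can also be added programmatically; this is a sketch with placeholder names, and it assumes the scanner function has already granted S3 permission to invoke it (for example via the Lambda add_permission API):

import boto3

s3 = boto3.client("s3")

s3.put_bucket_notification_configuration(
    Bucket="my-source-bucket",  # placeholder
    NotificationConfiguration={
        "LambdaFunctionConfigurations": [
            {
                # Placeholder ARN of the avScanner function
                "LambdaFunctionArn": "arn:aws:lambda:us-east-1:111111111111:function:avScanner",
                # PUT and POST only, per the note above about metadata updates
                "Events": ["s3:ObjectCreated:Put", "s3:ObjectCreated:Post"],
            }
        ]
    },
)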

Configuration

Runtime configuration is accomplished using environment variables. See the table below for reference.

Variable | Description | Default | Required
--- | --- | --- | ---
AV_DEFINITION_S3_BUCKET | Bucket containing antivirus definition files | | Yes
AV_DEFINITION_S3_PREFIX | Prefix for antivirus definition files | clamav_defs | No
AV_DEFINITION_PATH | Path containing files at runtime | /tmp/clamav_defs | No
AV_SCAN_START_SNS_ARN | SNS topic ARN to publish a notification about the start of a scan | | No
AV_SCAN_START_METADATA | The tag/metadata indicating the start of the scan | av-scan-start | No
AV_SIGNATURE_METADATA | The tag/metadata name representing the file's AV type | av-signature | No
AV_STATUS_CLEAN | The value assigned to clean items in tags/metadata | CLEAN | No
AV_STATUS_INFECTED | The value assigned to infected items in tags/metadata | INFECTED | No
AV_STATUS_METADATA | The tag/metadata name representing the file's AV status | av-status | No
AV_STATUS_SNS_ARN | SNS topic ARN to publish scan results (optional) | | No
AV_STATUS_SNS_PUBLISH_CLEAN | Publish AV_STATUS_CLEAN results to AV_STATUS_SNS_ARN | True | No
AV_STATUS_SNS_PUBLISH_INFECTED | Publish AV_STATUS_INFECTED results to AV_STATUS_SNS_ARN | True | No
AV_TIMESTAMP_METADATA | The tag/metadata name representing the file's scan time | av-timestamp | No
CLAMAVLIB_PATH | Path to ClamAV library files | ./bin | No
CLAMSCAN_PATH | Path to the ClamAV clamscan binary | ./bin/clamscan | No
FRESHCLAM_PATH | Path to the ClamAV freshclam binary | ./bin/freshclam | No
DATADOG_API_KEY | API key for pushing metrics to DataDog (optional) | | No
AV_PROCESS_ORIGINAL_VERSION_ONLY | Process only the original version of an S3 key (if bucket versioning is enabled) | False | No
AV_DELETE_INFECTED_FILES | Whether infected files should be automatically deleted | False | No
EVENT_SOURCE | The source of the antivirus scan event, "S3" or "SNS" (optional) | S3 | No
S3_ENDPOINT | The endpoint to use when interacting with S3 | None | No
SNS_ENDPOINT | The endpoint to use when interacting with SNS | None | No
LAMBDA_ENDPOINT | The endpoint to use when interacting with Lambda | None | No
AV_SCAN_MAX_FILESIZE | Files larger than this will be skipped and assumed clean (passed to ClamAV) | Depends on ClamAV | No
AV_SCAN_MAX_SCANSIZE | The maximum amount of data to scan for each container file (passed to ClamAV) | Depends on ClamAV | No

2023/05/08: AV_SCAN_MAX_FILESIZE and AV_SCAN_MAX_SCANSIZE were added.
For details on these settings, see ClamAV - ArchWiki.
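
For illustration, here is a minimal sketch of how these two variables could be turned into clamscan flags (--max-filesize and --max-scansize are standard clamscan options; the exact wiring inside this function may differ):

import os

def max_size_args():
    # Translate the optional size-limit environment variables into clamscan arguments.
    args = []
    max_filesize = os.environ.get("AV_SCAN_MAX_FILESIZE")
    if max_filesize:
        args.append("--max-filesize=" + max_filesize)
    max_scansize = os.environ.get("AV_SCAN_MAX_SCANSIZE")
    if max_scansize:
        args.append("--max-scansize=" + max_scansize)
    return args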

S3 Bucket Policy Examples

Deny downloading an object unless it is "CLEAN"

This policy prevents downloading an object until:

  1. The Lambda that runs ClamAV has finished (so the object has an av-status tag), and
  2. That tag's value is CLEAN

Make sure to check CloudTrail for the arn:aws:sts principal: find the relevant event, open it, and copy the STS ARN. It should be in the format shown below:

{
    "Effect": "Deny",
    "NotPrincipal": {
        "AWS": [
            "arn:aws:iam::<<aws-account-number>>:role/<<bucket-antivirus-role>>",
            "arn:aws:sts::<<aws-account-number>>:assumed-role/<<bucket-antivirus-role>>/<<bucket-antivirus-role>>",
            "arn:aws:iam::<<aws-account-number>>:root"
        ]
    },
    "Action": "s3:GetObject",
    "Resource": "arn:aws:s3:::<<bucket-name>>/*",
    "Condition": {
        "StringNotEquals": {
            "s3:ExistingObjectTag/av-status": "CLEAN"
        }
    }
}

Deny downloading and re-tagging "INFECTED" objects

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Deny",
      "Action": ["s3:GetObject", "s3:PutObjectTagging"],
      "Principal": "*",
      "Resource": ["arn:aws:s3:::<<bucket-name>>/*"],
      "Condition": {
        "StringEquals": {
          "s3:ExistingObjectTag/av-status": "INFECTED"
        }
      }
    }
  ]
}

Manually Scanning Buckets

You may want to scan all the objects in a bucket that have not previously been scanned or were created prior to setting up your lambda functions. To do this you can use the scan_bucket.py utility.

pip install boto3
scan_bucket.py --lambda-function-name=<lambda_function_name> --s3-bucket-name=<s3-bucket-to-scan>

This tool will scan all objects in the bucket that have not previously been scanned and invoke the Lambda function asynchronously. As such, you'll have to check your CloudWatch logs to see the scan results or failures. Additionally, the script uses the same environment variables you'd use in your Lambda, so you can configure them similarly.
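
To spot-check a single object's result without digging through CloudWatch logs, you can read its tags directly; this sketch assumes the default tag names from the configuration table above:

import boto3

s3 = boto3.client("s3")
response = s3.get_object_tagging(Bucket="my-source-bucket", Key="path/to/object")  # placeholders
tags = {tag["Key"]: tag["Value"] for tag in response["TagSet"]}
print(tags.get("av-status"), tags.get("av-timestamp"))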

Testing

There are two types of tests in this repository: pre-commit tests and Python tests. All of these tests are run by CircleCI.

pre-commit Tests

The pre-commit tests ensure that code submitted to this repository meets the standards of the repository. To get started with these tests, run make pre_commit_install. This installs the pre-commit tool and then installs its hook in this repository. The git pre-commit hook will then run these tests before you commit your code.

To run the tests manually run make pre_commit_tests or pre-commit run -a.

Python Tests

The Python tests in this repository use unittest and are run via the nose utility. To run them, install the developer dependencies and then run the tests:

pip install -r requirements.txt
pip install -r requirements-dev.txt
make test

Local lambdas

You can run the Lambdas locally to test what they do without deploying to AWS. This is accomplished by using Docker containers that act similarly to Lambda. You will need to set some local variables in your .envrc.local file and modify them appropriately before running direnv allow. If you do not have direnv, it can be installed with brew install direnv.

For the Scan lambda you will need a test file uploaded to S3 and the variables TEST_BUCKET and TEST_KEY set in your .envrc.local file (a sample is shown below). Then you can run:

direnv allow
make archive scan
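
For reference, the relevant part of an .envrc.local might look like this (values are placeholders):

export TEST_BUCKET=my-test-bucket
export TEST_KEY=path/to/test-file.txt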

If you want a file that will be recognized as a virus, you can download a test file from the EICAR website and upload it to your bucket.

For the Update lambda you can run:

direnv allow
make archive update

License

Upside Travel, Inc.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

ClamAV is released under the GPL Version 2 license, and all source for ClamAV is available for download on GitHub.
