AWS S3 configuration for backup

Amazon Simple Storage Service (AWS S3) is a cloud storage service that offers great flexibility for storing and maintaining data. You can use it to keep data files, inbound files, or outbound files, or to serve files for your web project.

We will focus on creating an S3 bucket for backing up files, with versioning and expiration.

Versions

I am using the below versions in this post.

Python - 3.7
Boto3 - 1.13.3

Creating AWS IAM user

You need a user to work with S3. You can create one with AWS IAM (Identity and Access Management). Follow the steps below if you are unfamiliar with creating a user.

  1. Go to the IAM service in your AWS console and click "Users".

  2. Enter the user name you want and select programmatic access for the user. It is better to keep API access separate from console access.
  3. Create a new policy for the S3 upload by clicking "Create policy". If you are okay with broad permissions, you can use "AdministratorAccess" instead (and skip the next step).
  4. Select the "S3" service, allow all S3 actions, and create the policy.


  5. Assign the policy created above to the user. You can skip the tags section and create the user.

  6. Download the credentials .csv and keep it safe. You need the Access key ID and Secret access key to upload files. (If you prefer to script these steps, see the boto3 sketch below.)
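The console flow above can also be scripted. Below is a minimal sketch using boto3's IAM client, assuming you already have admin credentials configured locally; the user name backup-user and policy name backup-s3-upload are placeholders of my choosing.

import json

import boto3

iam = boto3.client('iam')

# Policy allowing all S3 actions, mirroring the console steps above.
policy_document = {
    'Version': '2012-10-17',
    'Statement': [{
        'Effect': 'Allow',
        'Action': 's3:*',
        'Resource': '*',
    }],
}

iam.create_user(UserName='backup-user')
policy = iam.create_policy(PolicyName='backup-s3-upload',
                           PolicyDocument=json.dumps(policy_document))
iam.attach_user_policy(UserName='backup-user',
                       PolicyArn=policy['Policy']['Arn'])

# Programmatic access: the secret key is returned only here,
# so store it safely (the console step downloads it as a .csv).
access_key = iam.create_access_key(UserName='backup-user')
print(access_key['AccessKey']['AccessKeyId'])
print(access_key['AccessKey']['SecretAccessKey'])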

S3 bucket configuration

We will use the below configuration for our S3 bucket.

  1. Versioning - you can upload the same file many times and S3 will keep a version for each upload. This is perfect for full backups. It does not apply to delta or incremental backups, which we are not covering here.
  2. Expiration - we will keep only 15 days of backups. You can configure this as per your need.

Creating S3 bucket

We will create the S3 bucket as below.

  1. Navigate to the S3 service in the AWS console and click "Create bucket".
  2. Provide a unique bucket name, select the region closest to you to reduce the network latency of transfers, and block all public access, since this bucket is only for backups.

  3. Open the advanced settings and click "Enable" to enable object lock. You have to type "enable" in the input box to confirm, then create the bucket. Enabling object lock also turns on versioning for the bucket.
  4. Navigate to the bucket and create a folder, which S3 calls a prefix. Creating a prefix keeps your bucket organized. (These steps are sketched with boto3 below.)
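The same bucket setup as a boto3 sketch, assuming the bucket name backup-sample123, the prefix database, and a region of eu-west-1 as placeholders:

import boto3

s3 = boto3.client('s3', region_name='eu-west-1')

# Create the bucket with object lock, which also enables versioning.
s3.create_bucket(Bucket='backup-sample123',
                 CreateBucketConfiguration={'LocationConstraint': 'eu-west-1'},
                 ObjectLockEnabledForBucket=True)

# Block all public access, since this bucket is only for backups.
s3.put_public_access_block(
    Bucket='backup-sample123',
    PublicAccessBlockConfiguration={
        'BlockPublicAcls': True,
        'IgnorePublicAcls': True,
        'BlockPublicPolicy': True,
        'RestrictPublicBuckets': True,
    })

# The console's "create folder" is just a zero-byte object ending in "/".
s3.put_object(Bucket='backup-sample123', Key='database/')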

Add expiration

A lifecycle rule will handle the expiration. You can find "Lifecycle" under the bucket's "Management" tab.

  1. Click "Add lifecycle rule" to add expiration.
  2. Provide rule name and you can set this run to specific prefix like we have created database in the previous step and click next.
  3. You can skip the transition and go to Expiration.
  4. Check current and previous version.            
    1. Expire current version - for 15 days, files will be kept in the bucket. When you download the file, it will provide you latest
    2. Previous version - after 15 days, current version becomes previous version. After It becomes previous version, it will be deleted in 1 day. You can change this as per your need.
    3. If there is any incomplete upload, it will be removed in 7 days.
  5. You can see the configured values in the next section. You can review and save it.
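Here is the same lifecycle rule as a boto3 sketch, with the 15-day, 1-day, and 7-day values from the steps above; the rule ID expire-backups is a placeholder:

import boto3

s3 = boto3.client('s3')

s3.put_bucket_lifecycle_configuration(
    Bucket='backup-sample123',
    LifecycleConfiguration={
        'Rules': [{
            'ID': 'expire-backups',
            'Status': 'Enabled',
            'Filter': {'Prefix': 'database/'},
            # Current versions expire (become previous versions) after 15 days.
            'Expiration': {'Days': 15},
            # Previous versions are deleted one day after they stop being current.
            'NoncurrentVersionExpiration': {'NoncurrentDays': 1},
            # Incomplete multipart uploads are cleaned up after 7 days.
            'AbortIncompleteMultipartUpload': {'DaysAfterInitiation': 7},
        }]
    })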

Now you are set to upload files.

Installing boto3

Boto3 is the Python SDK for Amazon Web Services. You can install it with pip.

pip install boto3

Script to upload

We need the key ID and secret key of the AWS user we created, plus the bucket name and prefix, for the S3 upload. The function below copies a file to S3 using the given configuration.

import os

import boto3
from botocore.exceptions import ClientError


def copy_file_to_s3(file_path, file_name, aws_params, s3_params):
    """
    Copies a file from local disk to an AWS S3 bucket.

    :param file_path: source file path
    :param file_name: name of the file to create in s3
    :param aws_params: aws dict with key id, secret key
    :param s3_params: s3 dict with bucket name and prefix
    :return: True on success, False otherwise
    """
    status = True
    key = aws_params['key']
    secret_key = aws_params['secret_key']
    bucket = s3_params['bucket']
    prefix = s3_params.get('prefix', None)

    # Prepend the prefix (folder) to the target key if one is configured.
    s3_target_file_path = os.path.join(prefix, file_name) if prefix else file_name
    s3_client = boto3.client('s3',
                             aws_access_key_id=key,
                             aws_secret_access_key=secret_key)

    try:
        s3_client.upload_file(file_path, bucket, s3_target_file_path)
        print("Backup completed to s3!")
    except ClientError as e:
        print(f"{e}")
        status = False
    return status

You can call the above function as below from anywhere.

backup_file_path = "/hobby/db/db.sqlite3"
backup_file = "db.sqlite3"
aws_params = {
    'key': 'your_aws_key_id',
    'secret_key': 'your_aws_secret_key',
}
s3_params = {
    'bucket': 'backup-sample123',
    'prefix': 'database'
}
copy_file_to_s3(file_path=backup_file_path,
                file_name=backup_file,
                aws_params=aws_params,
                s3_params=s3_params)

Viewing versions in the S3 console

You can upload the same file multiple times and view its versions in the console. Toggle "Versions" to "Show" and you will be able to see every version of each object.
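You can also list the versions programmatically. A short sketch, reusing the bucket and prefix from the upload example:

import boto3

s3_client = boto3.client('s3')

response = s3_client.list_object_versions(Bucket='backup-sample123',
                                          Prefix='database/')
for version in response.get('Versions', []):
    # IsLatest marks the current version; the rest are previous versions.
    print(version['Key'], version['VersionId'],
          version['LastModified'], version['IsLatest'])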

Github gist

I have created a GitHub gist. You can check that out.

Happy reading. Please comment if you are stuck or you have any questions.

