Scripting and Automation Fundamentals
January 17, 2025
Master scripting and automation with Bash and Python: build CI/CD pipelines, automate deployments, and eliminate manual toil.
Scripting and Automation Fundamentals
Manual work doesn't scale. If you're SSH-ing into servers, running the same commands repeatedly, or clicking through cloud consoles to deploy infrastructure, you're doing it wrong. Modern cloud engineering is built on automation-- scripts, pipelines, and infrastructure as code that eliminate repetitive tasks, reduce human error, and enable teams to ship faster.
This guide covers the scripting and automation fundamentals every cloud engineer, DevOps engineer, and SRE needs to know. You'll learn Bash scripting, Python automation, CI/CD concepts, and how to build reliable, repeatable deployment workflows.
Why Cloud Engineers Need Scripting and Automation
Automation isn't optional. Unless your goal is job security through long, convoluted manual processes, you need to be automating. Here's what automation buys you:
- Eliminate toil: Automate repetitive tasks (deployments, backups, log rotation, security scans).
- Ensure consistency: Scripts execute the same way every time. No human error.
- Speed up deployments: CI/CD pipelines deploy infrastructure in minutes instead of hours.
- Improve reliability: Add error handling and catch errors before they reach production.
If you're not automating, you're wasting your time and your company's money.
Core Automation Concepts
Idempotency
An idempotent operation produces the same result whether you run it once or multiple times. This is critical for automation.
Non-idempotent (bad):
# Appends another line every time it runs
echo "data" >> config.txt
Idempotent (good):
# Replaces file, same result every time
echo "data" > config.txt
Infrastructure tools like Terraform and Ansible are designed to be idempotent-- running them multiple times won't break your infrastructure.
Declarative vs Imperative
- Imperative: Specify the exact steps to achieve a result (Bash scripts, Python scripts).
- Declarative: Describe the desired end state, let the tool figure out the steps (Terraform, CloudFormation, Kubernetes).
Declarative is usually better for managing infrastructure; imperative works well for one-off tasks and glue scripts.
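To make the distinction concrete, here's roughly what the imperative version of "make sure this S3 bucket exists" looks like in Bash (the bucket name is a placeholder). A declarative tool like Terraform would instead be told that the bucket should exist and would work out the check-and-create steps itself.
#!/bin/bash
# Imperative: spell out every step yourself
if ! aws s3api head-bucket --bucket my-example-bucket 2>/dev/null; then
    aws s3 mb s3://my-example-bucket
fi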
Environment Variables
Store configuration outside your code:
export AWS_REGION="us-east-1"
export DB_PASSWORD="secretpass"
Never hardcode secrets. Use environment variables, secret managers, or config files (excluded from Git).
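As a rough sketch, a script can fail fast when a required secret isn't set and fall back to a sane default for everything else (DB_PASSWORD and AWS_REGION here are just the example variables from above):
#!/bin/bash
# Abort with a clear message if the secret is missing
: "${DB_PASSWORD:?DB_PASSWORD must be set}"
# Fall back to a default region if none is provided
AWS_REGION="${AWS_REGION:-us-east-1}"
echo "Using region $AWS_REGION"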
Bash Scripting Fundamentals
Bash is the default shell on most Linux systems; on Windows, Git Bash (installed with Git for Windows) gives you a similar environment. It's perfect for system administration, glue scripts, and quick automation tasks.
Basic Script Structure
#!/bin/bash
# This is a comment
echo "Hello, Cloud Engineer"
Make it executable:
chmod +x script.sh
./script.sh
Variables
#!/bin/bash
# Define variables
NAME="CloudJobs"
COUNT=42
# Use variables
echo "Welcome to $NAME"
echo "Count: $COUNT"
# Command substitution
CURRENT_DATE=$(date +%Y-%m-%d)
echo "Today is $CURRENT_DATE"
User Input
#!/bin/bash
echo "Enter your name:"
read NAME
echo "Hello, $NAME"
Conditionals
#!/bin/bash
if [ -f /etc/nginx/nginx.conf ]; then
    echo "Nginx config exists"
else
    echo "Nginx config not found"
fi
# Numeric comparison
if [ $COUNT -gt 10 ]; then
    echo "Count is greater than 10"
fi
# String comparison
if [ "$ENV" == "production" ]; then
    echo "Running in production"
fi
Loops
#!/bin/bash
# For loop
for i in 1 2 3 4 5; do
    echo "Number: $i"
done
# Loop over files
for file in /var/log/*.log; do
    echo "Processing $file"
done
# While loop
COUNT=0
while [ $COUNT -lt 5 ]; do
    echo "Count: $COUNT"
    COUNT=$((COUNT + 1))
done
Functions
#!/bin/bash
deploy_app() {
    APP_NAME=$1
    echo "Deploying $APP_NAME..."
    # deployment commands here
}
# Call function
deploy_app "my-web-app"
Error Handling
#!/bin/bash
set -e # Exit on any error
set -u # Exit on undefined variable
set -o pipefail # Exit on pipe failure
# Check command exit status
if ! aws s3 ls s3://my-bucket; then
    echo "Bucket not found"
    exit 1
fi
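Beyond set -euo pipefail, trap lets a script clean up after itself and report where it failed. A minimal sketch (the temp directory is just an example resource):
#!/bin/bash
set -euo pipefail
TMP_DIR=$(mktemp -d)
# Always remove the temp directory, whether the script succeeds or fails
trap 'rm -rf "$TMP_DIR"' EXIT
# Report the line number of the failing command
trap 'echo "Error on line $LINENO" >&2' ERR
cp /etc/hosts "$TMP_DIR/"  # example work that might fail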
Practical Bash Example: Backup Script
#!/bin/bash
set -euo pipefail
BACKUP_DIR="/backups"
DATE=$(date +%Y%m%d_%H%M%S)
SOURCE="/var/www/html"
# Create backup directory if it doesn't exist
mkdir -p "$BACKUP_DIR"
# Create tar archive
BACKUP_FILE="$BACKUP_DIR/backup_$DATE.tar.gz"
tar -czf "$BACKUP_FILE" "$SOURCE"
# Upload to S3
aws s3 cp "$BACKUP_FILE" s3://my-backups/
# Delete local backup
rm "$BACKUP_FILE"
echo "Backup completed: backup_$DATE.tar.gz"
Python for Cloud Automation
Python is more powerful than Bash for complex automation, API interactions, and data processing. It's a full programming language with plenty of libraries and a large community.
Why Python for Cloud Engineering?
- SDKs for every cloud provider: boto3 for AWS, the Azure SDK for Python, and the Google Cloud client libraries for GCP
- Readable syntax: Easier to maintain than long Bash scripts
- Rich ecosystem: Libraries for everything (requests, data processing, etc.)
- Error handling: Better exception handling than Bash
- Data processing: Parse JSON, CSV, YAML with built-in libraries
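For contrast, pulling the same kind of data out of JSON from Bash usually means leaning on the AWS CLI's --query option or an external tool like jq, which gets awkward as the logic grows; Python parses the same responses natively. A rough Bash equivalent of listing running instance IDs:
# List running instance IDs with the AWS CLI and jq
aws ec2 describe-instances \
    --filters "Name=instance-state-name,Values=running" \
    | jq -r '.Reservations[].Instances[].InstanceId'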
Basic Python Automation
#!/usr/bin/env python3
import os
import subprocess
import requests
# Environment variables
aws_region = os.getenv("AWS_REGION", "us-east-1")
# Run shell command
result = subprocess.run(["ls", "-la"], capture_output=True, text=True)
print(result.stdout)
# HTTP request
response = requests.get("https://api.github.com")
print(response.json())
AWS Automation with boto3
#!/usr/bin/env python3
import boto3
# Create EC2 client
ec2 = boto3.client('ec2', region_name='us-east-1')
# List running instances
response = ec2.describe_instances(
    Filters=[{'Name': 'instance-state-name', 'Values': ['running']}]
)
for reservation in response['Reservations']:
    for instance in reservation['Instances']:
        instance_id = instance['InstanceId']
        instance_type = instance['InstanceType']
        print(f"Instance {instance_id}: {instance_type}")
Practical Python Example: Instance Cleanup
#!/usr/bin/env python3
import boto3
from datetime import datetime, timedelta
ec2 = boto3.client('ec2', region_name='us-east-1')
# Find instances older than 7 days
cutoff_date = datetime.now() - timedelta(days=7)
response = ec2.describe_instances()
for reservation in response['Reservations']:
    for instance in reservation['Instances']:
        launch_time = instance['LaunchTime'].replace(tzinfo=None)
        if launch_time < cutoff_date:
            instance_id = instance['InstanceId']
            print(f"Terminating old instance: {instance_id}")
            # ec2.terminate_instances(InstanceIds=[instance_id])
Python Script Best Practices
#!/usr/bin/env python3
import argparse
import logging
# Set up logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
# Parse command-line arguments
parser = argparse.ArgumentParser(description='Deploy application')
parser.add_argument('--env', required=True, help='Environment (dev/staging/prod)')
parser.add_argument('--version', required=True, help='App version to deploy')
args = parser.parse_args()
def deploy(env, version):
    logger.info(f"Deploying version {version} to {env}")
    # deployment logic here
if __name__ == '__main__':
    deploy(args.env, args.version)
CI/CD Pipelines
Continuous Integration/Continuous Deployment (CI/CD) automates building, testing, and deploying code so that every change moves through the same repeatable steps.
CI/CD Concepts
- Continuous Integration (CI): Automatically test code on every commit.
- Continuous Deployment (CD): Automatically deploy passing code to production.
- Pipeline: Series of automated steps (build, test, deploy).
- Artifact: Build output (container image, compiled binary, zip file).
Example: GitHub Actions Pipeline
# .github/workflows/deploy.yml
name: Deploy to AWS
on:
  push:
    branches: [main]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v2
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1
      - name: Deploy with Terraform
        run: |
          terraform init
          terraform apply -auto-approve
Example: Jenkins Pipeline
// Jenkinsfile
pipeline {
    agent any
    stages {
        stage('Build') {
            steps {
                sh 'docker build -t myapp:latest .'
            }
        }
        stage('Test') {
            steps {
                sh 'docker run myapp:latest npm test'
            }
        }
        stage('Deploy') {
            steps {
                sh 'kubectl apply -f k8s/'
            }
        }
    }
}
Best Practices
Make Scripts Reusable
#!/bin/bash
# Bad: Hardcoded values
aws s3 cp /data/backup.tar.gz s3://my-bucket/
# Good: Parameterized
BACKUP_FILE=$1
S3_BUCKET=$2
aws s3 cp "$BACKUP_FILE" "s3://$S3_BUCKET/"
Log Everything
import logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler('/var/log/deploy.log'),
        logging.StreamHandler()
    ]
)
logger = logging.getLogger(__name__)
logger.info("Starting deployment")
Test Before Production
#!/bin/bash
# Dry run mode
if [ "$DRY_RUN" == "true" ]; then
echo "Would execute: aws s3 rm s3://bucket/file"
else
aws s3 rm s3://bucket/file
fi
Handle Failures Gracefully
import sys
try:
    deploy_application()
except Exception as e:
    logger.error(f"Deployment failed: {e}")
    rollback_deployment()
    sys.exit(1)
Document Your Scripts
#!/bin/bash
# deploy.sh - Deploys application to AWS
#
# Usage: ./deploy.sh <environment> <version>
# Example: ./deploy.sh production v1.2.3
#
# Environment variables:
# AWS_REGION - AWS region (default: us-east-1)
# DRY_RUN - If true, only show what would be done
Learning Path
- First: Master Bash basics-- variables, conditionals, loops, functions.
- Second: Write practical scripts-- backups, log rotation, system health checks.
- Third: Learn Python basics and boto3 for AWS automation.
- Fourth: Set up a simple CI/CD pipeline with GitHub Actions or Jenkins.
- Fifth: Learn Terraform or Ansible for infrastructure as code.
- Sixth: Automate a full deployment workflow-- build, test, deploy, rollback (a skeleton sketch follows below).
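As a rough target for that final step, here's a skeleton in Bash. The deployment name, container name, and registry are placeholders, and it assumes the app already runs as a Kubernetes Deployment:
#!/bin/bash
set -euo pipefail
APP="my-web-app"                            # placeholder deployment/container name
VERSION=$1
IMAGE="registry.example.com/$APP:$VERSION"  # placeholder registry
# Build, test, and publish the artifact
docker build -t "$IMAGE" .
docker run --rm "$IMAGE" npm test
docker push "$IMAGE"
# Deploy, and roll back automatically if the rollout doesn't finish
kubectl set image "deployment/$APP" "$APP=$IMAGE"
if ! kubectl rollout status "deployment/$APP" --timeout=120s; then
    echo "Rollout failed, rolling back" >&2
    kubectl rollout undo "deployment/$APP"
    exit 1
fi
echo "Deployed $IMAGE"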
Practical Exercises
- Backup automation: Write a Bash script that backs up a directory, compresses it, uploads to S3, and cleans up old backups.
- Instance management: Write a Python script using boto3 to list, start, stop, or terminate EC2 instances.
- CI/CD pipeline: Set up GitHub Actions to run tests and deploy a static site to S3 on every push.
- Infrastructure as code: Write Terraform configs to create a VPC, subnets, and an EC2 instance.
- Monitoring script: Write a health check script that pings a URL and sends an alert if it fails (a starter version appears below).
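To get you started on that last exercise, here's a minimal Bash version. The URL and the Slack webhook are placeholders; swap in whatever alerting channel you actually use:
#!/bin/bash
set -euo pipefail
URL="${1:-https://example.com/health}"   # placeholder health endpoint
WEBHOOK="${SLACK_WEBHOOK_URL:-}"         # optional placeholder alert webhook
# -f fails on HTTP errors; --retry tolerates brief blips before alerting
if ! curl -fsS --retry 3 --max-time 10 "$URL" > /dev/null; then
    echo "Health check failed for $URL" >&2
    if [ -n "$WEBHOOK" ]; then
        curl -fsS -X POST -H 'Content-Type: application/json' \
            -d "{\"text\": \"Health check failed for $URL\"}" "$WEBHOOK"
    fi
    exit 1
fi
echo "OK: $URL"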
The Bottom Line: Automate Everything
Manual work is slow, error-prone, and doesn't scale. Every task you do more than twice should be automated. Start with simple Bash scripts, progress to Python automation, and eventually build full CI/CD pipelines and infrastructure as code workflows. The more you automate, the more valuable you become as a cloud engineer.
Pair scripting and automation with Linux fundamentals, networking knowledge, and Git skills, and you'll have everything you need to succeed in cloud engineering.
For more cloud engineering guidance, check out what does a cloud engineer do and how to get a cloud engineering job with no experience.