Scripting and Automation Fundamentals

Master scripting and automation with Bash and Python: build CI/CD pipelines, automate deployments, and eliminate manual toil.

Manual work doesn't scale. If you're SSH-ing into servers, running the same commands repeatedly, or clicking through cloud consoles to deploy infrastructure, you're doing it wrong. Modern cloud engineering is built on automation: scripts, pipelines, and infrastructure as code that eliminate repetitive tasks, reduce human error, and enable teams to ship faster.

This guide covers the scripting and automation fundamentals every cloud engineer, DevOps engineer, and SRE needs to know. You'll learn Bash scripting, Python automation, CI/CD concepts, and how to build reliable, repeatable deployment workflows.


Why Cloud Engineers Need Scripting and Automation

Automation isn't optional. Unless your goal is to give yourself job security by making every process long and convoluted, you need to be automating.

  • Eliminate toil: Automate repetitive tasks (deployments, backups, log rotation, security scans).
  • Ensure consistency: Scripts execute the same way every time. No human error.
  • Speed up deployments: CI/CD pipelines deploy infrastructure in minutes instead of hours.
  • Improve reliability: Add error handling and catch errors before they reach production.

If you're not automating, you're wasting your time and your company's money.


Core Automation Concepts

Idempotency

An idempotent operation produces the same result whether you run it once or multiple times. This is critical for automation.

Non-idempotent (bad):

# Appends another line on every run, so the file keeps growing
echo "data" >> config.txt

Idempotent (good):

# Replaces file, same result every time
echo "data" > config.txt

Infrastructure tools like Terraform and Ansible are designed to be idempotent: running them multiple times won't break your infrastructure.

Declarative vs Imperative

  • Imperative: Specify the exact steps to achieve a result (Bash scripts, Python scripts).
  • Declarative: Describe the desired end state, let the tool figure out the steps (Terraform, CloudFormation, Kubernetes).

Declarative is usually better for infrastructure; imperative is better for one-off tasks and glue scripts.
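
For instance, ensuring a config file exists with known content imperatively means writing out every step and decision yourself; a declarative tool such as Ansible or Terraform lets you state the desired end result once and computes the steps for you. A minimal imperative sketch in Python (the paths are illustrative):

#!/usr/bin/env python3

import os

# Imperative: we spell out each step and decision ourselves
if not os.path.isdir("/etc/myapp"):          # illustrative path
    os.makedirs("/etc/myapp")

# Overwriting (not appending) keeps the script idempotent
with open("/etc/myapp/config.txt", "w") as f:
    f.write("data\n")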

Environment Variables

Store configuration outside your code:

export AWS_REGION="us-east-1"
export DB_PASSWORD="secretpass"

Never hardcode secrets. Use environment variables, secret managers, or config files (excluded from Git).
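
If you need more than environment variables, a dedicated secret manager keeps the value out of your shell history and version control entirely. Here is a minimal sketch using AWS Secrets Manager through boto3 (covered later in this guide); the secret name is illustrative:

#!/usr/bin/env python3

import boto3

# Fetch the secret at runtime instead of hardcoding it anywhere
secrets = boto3.client("secretsmanager", region_name="us-east-1")
response = secrets.get_secret_value(SecretId="prod/db_password")  # illustrative secret name

# Assumes the secret was stored as a plain string
db_password = response["SecretString"]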


Bash Scripting Fundamentals

Bash is the default shell on most Linux systems; on Windows you can use Git Bash. It's perfect for system administration, glue scripts, and quick automation tasks.

Basic Script Structure

#!/bin/bash
# This is a comment

echo "Hello, Cloud Engineer"

Make it executable:

chmod +x script.sh
./script.sh

Variables

#!/bin/bash

# Define variables
NAME="CloudJobs"
COUNT=42

# Use variables
echo "Welcome to $NAME"
echo "Count: $COUNT"

# Command substitution
CURRENT_DATE=$(date +%Y-%m-%d)
echo "Today is $CURRENT_DATE"

User Input

#!/bin/bash

echo "Enter your name:"
read NAME
echo "Hello, $NAME"

Conditionals

#!/bin/bash

if [ -f /etc/nginx/nginx.conf ]; then
  echo "Nginx config exists"
else
  echo "Nginx config not found"
fi

# Numeric comparison
if [ $COUNT -gt 10 ]; then
  echo "Count is greater than 10"
fi

# String comparison
if [ "$ENV" == "production" ]; then
  echo "Running in production"
fi

Loops

#!/bin/bash

# For loop
for i in 1 2 3 4 5; do
  echo "Number: $i"
done

# Loop over files
for file in /var/log/*.log; do
  echo "Processing $file"
done

# While loop
COUNT=0
while [ $COUNT -lt 5 ]; do
  echo "Count: $COUNT"
  COUNT=$((COUNT + 1))
done

Functions

#!/bin/bash

deploy_app() {
  APP_NAME=$1
  echo "Deploying $APP_NAME..."
  # deployment commands here
}

# Call function
deploy_app "my-web-app"

Error Handling

#!/bin/bash

set -e  # Exit immediately if any command fails
set -u  # Treat unset variables as errors
set -o pipefail  # Fail the pipeline if any command in it fails

# Check command exit status
if ! aws s3 ls s3://my-bucket; then
  echo "Bucket not found"
  exit 1
fi

Practical Bash Example: Backup Script

#!/bin/bash

set -euo pipefail

BACKUP_DIR="/backups"
DATE=$(date +%Y%m%d_%H%M%S)
SOURCE="/var/www/html"

# Create backup directory if it doesn't exist
mkdir -p "$BACKUP_DIR"

# Create tar archive
BACKUP_FILE="$BACKUP_DIR/backup_$DATE.tar.gz"
tar -czf "$BACKUP_FILE" "$SOURCE"

# Upload to S3
aws s3 cp "$BACKUP_FILE" s3://my-backups/

# Delete local backup
rm "$BACKUP_FILE"

echo "Backup completed: backup_$DATE.tar.gz"

Python for Cloud Automation

Python is more powerful than Bash for complex automation, API interactions, and data processing. It's a full programming language with plenty of libraries and a large community.

Why Python for Cloud Engineering?

  • SDKs for all cloud providers: boto3 for AWS, the Azure SDK for Python, and the Google Cloud client libraries for GCP
  • Readable syntax: Easier to maintain than long Bash scripts
  • Rich ecosystem: Libraries for everything (requests, data processing, etc.)
  • Error handling: Better exception handling than Bash
  • Data processing: Parse JSON and CSV with the standard library, and YAML with the third-party PyYAML package (see the sketch below)
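
As a quick illustration of the data-processing point, here is a short sketch that reads JSON with the standard library and YAML with PyYAML (the file names are illustrative):

#!/usr/bin/env python3

import json

import yaml  # third-party: pip install pyyaml

# JSON support ships with the standard library
with open("config.json") as f:       # illustrative file name
    config = json.load(f)
print(config.get("region"))

# YAML needs PyYAML, but the usage is just as short
with open("deploy.yaml") as f:       # illustrative file name
    manifest = yaml.safe_load(f)
print(manifest.get("replicas"))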

Basic Python Automation

#!/usr/bin/env python3

import os
import subprocess
import requests  # third-party: pip install requests

# Environment variables
aws_region = os.getenv("AWS_REGION", "us-east-1")

# Run shell command
result = subprocess.run(["ls", "-la"], capture_output=True, text=True)
print(result.stdout)

# HTTP request
response = requests.get("https://api.github.com")
print(response.json())

AWS Automation with boto3

#!/usr/bin/env python3

import boto3

# Create EC2 client
ec2 = boto3.client('ec2', region_name='us-east-1')

# List running instances
response = ec2.describe_instances(
    Filters=[{'Name': 'instance-state-name', 'Values': ['running']}]
)

for reservation in response['Reservations']:
    for instance in reservation['Instances']:
        instance_id = instance['InstanceId']
        instance_type = instance['InstanceType']
        print(f"Instance {instance_id}: {instance_type}")

Practical Python Example: Instance Cleanup

#!/usr/bin/env python3

import boto3
from datetime import datetime, timedelta, timezone

ec2 = boto3.client('ec2', region_name='us-east-1')

# Find instances launched more than 7 days ago (LaunchTime is timezone-aware UTC)
cutoff_date = datetime.now(timezone.utc) - timedelta(days=7)

response = ec2.describe_instances()
for reservation in response['Reservations']:
    for instance in reservation['Instances']:
        launch_time = instance['LaunchTime']  # already timezone-aware (UTC)
        
        if launch_time < cutoff_date:
            instance_id = instance['InstanceId']
            print(f"Terminating old instance: {instance_id}")
            # ec2.terminate_instances(InstanceIds=[instance_id])

Python Script Best Practices

#!/usr/bin/env python3

import argparse
import logging

# Set up logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# Parse command-line arguments
parser = argparse.ArgumentParser(description='Deploy application')
parser.add_argument('--env', required=True, help='Environment (dev/staging/prod)')
parser.add_argument('--version', required=True, help='App version to deploy')
args = parser.parse_args()

def deploy(env, version):
    logger.info(f"Deploying version {version} to {env}")
    # deployment logic here

if __name__ == '__main__':
    deploy(args.env, args.version)
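
Run it with explicit arguments, for example python3 deploy.py --env staging --version v1.2.3 (the script name is illustrative); argparse also generates a --help flag for you automatically.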

CI/CD Pipelines

Continuous Integration/Continuous Deployment (CI/CD) automates building, testing, and deploying your code.

CI/CD Concepts

  • Continuous Integration (CI): Automatically test code on every commit.
  • Continuous Deployment (CD): Automatically deploy passing code to production.
  • Pipeline: Series of automated steps (build, test, deploy).
  • Artifact: Build output (container image, compiled binary, zip file).

Example: GitHub Actions Pipeline

# .github/workflows/deploy.yml
name: Deploy to AWS

on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
      
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1
      
      - name: Deploy with Terraform
        run: |
          terraform init
          terraform apply -auto-approve

Example: Jenkins Pipeline

// Jenkinsfile
pipeline {
    agent any
    
    stages {
        stage('Build') {
            steps {
                sh 'docker build -t myapp:latest .'
            }
        }
        
        stage('Test') {
            steps {
                sh 'docker run myapp:latest npm test'
            }
        }
        
        stage('Deploy') {
            steps {
                sh 'kubectl apply -f k8s/'
            }
        }
    }
}

Best Practices

Make Scripts Reusable

#!/bin/bash

# Bad: Hardcoded values
aws s3 cp /data/backup.tar.gz s3://my-bucket/

# Good: Parameterized
BACKUP_FILE=$1
S3_BUCKET=$2
aws s3 cp "$BACKUP_FILE" "s3://$S3_BUCKET/"

Log Everything

import logging

logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler('/var/log/deploy.log'),
        logging.StreamHandler()
    ]
)

logger = logging.getLogger(__name__)
logger.info("Starting deployment")

Test Before Production

#!/bin/bash

# Dry run mode
if [ "$DRY_RUN" == "true" ]; then
  echo "Would execute: aws s3 rm s3://bucket/file"
else
  aws s3 rm s3://bucket/file
fi

Handle Failures Gracefully

import sys

try:
    deploy_application()
except Exception as e:
    logger.error(f"Deployment failed: {e}")
    rollback_deployment()
    sys.exit(1)

Document Your Scripts

#!/bin/bash

# deploy.sh - Deploys application to AWS
#
# Usage: ./deploy.sh <environment> <version>
# Example: ./deploy.sh production v1.2.3
#
# Environment variables:
#   AWS_REGION - AWS region (default: us-east-1)
#   DRY_RUN - If true, only show what would be done

Learning Path

  1. First: Master Bash basics (variables, conditionals, loops, functions).
  2. Second: Write practical scripts (backups, log rotation, system health checks).
  3. Third: Learn Python basics and boto3 for AWS automation.
  4. Fourth: Set up a simple CI/CD pipeline with GitHub Actions or Jenkins.
  5. Fifth: Learn Terraform or Ansible for infrastructure as code.
  6. Sixth: Automate a full deployment workflow (build, test, deploy, rollback).

Practical Exercises

  1. Backup automation: Write a Bash script that backs up a directory, compresses it, uploads to S3, and cleans up old backups.
  2. Instance management: Write a Python script using boto3 to list, start, stop, or terminate EC2 instances.
  3. CI/CD pipeline: Set up GitHub Actions to run tests and deploy a static site to S3 on every push.
  4. Infrastructure as code: Write Terraform configs to create a VPC, subnets, and an EC2 instance.
  5. Monitoring script: Write a health check script that pings a URL and sends an alert if it fails (a starter sketch follows below).
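
If you want a starting point for the monitoring exercise, here is a minimal sketch; the URL is illustrative, and the alert is just a message you would replace with email, Slack, or SNS:

#!/usr/bin/env python3

import sys

import requests  # third-party: pip install requests

URL = "https://example.com/health"  # illustrative endpoint

try:
    response = requests.get(URL, timeout=5)
    response.raise_for_status()
    print(f"OK: {URL} returned {response.status_code}")
except requests.RequestException as exc:
    # Replace this message with your real alerting (email, Slack, SNS, ...)
    print(f"ALERT: health check for {URL} failed: {exc}", file=sys.stderr)
    sys.exit(1)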

The Bottom Line: Automate Everything

Manual work is slow, error-prone, and doesn't scale. Every task you do more than twice should be automated. Start with simple Bash scripts, progress to Python automation, and eventually build full CI/CD pipelines and infrastructure as code workflows. The more you automate, the more valuable you become as a cloud engineer.

Pair scripting and automation with Linux fundamentals, networking knowledge, and Git skills, and you'll have everything you need to succeed in cloud engineering.

For more cloud engineering guidance, check out "What Does a Cloud Engineer Do?" and "How to Get a Cloud Engineering Job With No Experience".
