AWS Lambda function with requests module

Introduction

In this tutorial, we will create an AWS Lambda function that uses the requests module and build a .zip deployment package containing its dependencies.

Prerequisites

Before starting, you should have the following prerequisites configured:

  • An AWS account
  • AWS CLI on your computer

Walkthrough

  • Create the deployment package
  • Create AWS Lambda function with requests module

Create the deployment package

Create a project directory for your lambda_function.py source code file and navigate into it. In this example, the directory is named my_function.

mkdir my_function
cd my_function
ls -alt

Install the requests library and its dependencies into the my_function directory.

pip3 install requests --target .

Create the lambda_function.py source code file. This sample uses region_name="ap-northeast-1" (the Tokyo region).

import boto3
import requests

def lambda_handler(event, context):
    file_name = "2023-11-26-0.json.gz"
    bucket_name = "hieu320231129"
    print(f'Getting the {file_name} from gharchive')
    res = requests.get(f'https://data.gharchive.org/{file_name}', timeout=30)
    res.raise_for_status()  # stop here on an HTTP error instead of uploading an error page
    print(f'Uploading {file_name} to s3 under s3://{bucket_name}')
    s3_client = boto3.client('s3', region_name="ap-northeast-1")
    upload_res = s3_client.put_object(
        Bucket=bucket_name,
        Key=file_name,
        Body=res.content
    )

    objects = s3_client.list_objects(Bucket=bucket_name)['Contents']
    objectname= []
    for obj in objects:
        objectname.append(obj['Key'])

    return {
        'object_names': objectname,
        'status_code': '200'
    }
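In this handler the listing runs right after an upload, so the response contains a Contents entry. If you reuse the listing code elsewhere, note that the response for an empty bucket has no Contents key, and the direct lookup then raises KeyError. A minimal sketch of a more defensive extraction follows; the helper name extract_keys is hypothetical and the sample responses are made up for illustration.

```python
def extract_keys(list_response):
    """Return the object keys from an S3 list_objects-style response dict.

    An empty bucket yields a response without a 'Contents' entry,
    so fall back to an empty list instead of raising KeyError.
    """
    return [obj["Key"] for obj in list_response.get("Contents", [])]


# Hypothetical sample responses, shaped like the dicts boto3 returns.
print(extract_keys({"Contents": [{"Key": "2023-11-26-0.json.gz"}]}))
print(extract_keys({}))
```

The handler could call such a helper in place of the explicit loop over objects.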

Create a .zip file containing the installed libraries and the Lambda source code file.

zip -r ../my_function.zip .
cd ..
ls -alt
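If the zip command is not available on your machine, the same archive can be built with Python's standard zipfile module. The following is only a sketch under that assumption: the function name zip_directory is hypothetical, and the demonstration uses a throwaway directory with placeholder files standing in for my_function.

```python
import os
import tempfile
import zipfile


def zip_directory(source_dir, zip_path):
    """Recursively add every file under source_dir to zip_path.

    Paths are stored relative to source_dir, so lambda_function.py
    ends up at the root of the archive.
    """
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(source_dir):
            for name in files:
                full_path = os.path.join(root, name)
                archive.write(full_path, os.path.relpath(full_path, source_dir))


# Demonstration on a throwaway directory (a stand-in for my_function).
with tempfile.TemporaryDirectory() as workdir:
    project = os.path.join(workdir, "my_function")
    os.makedirs(os.path.join(project, "requests"))
    for rel in ("lambda_function.py", os.path.join("requests", "__init__.py")):
        with open(os.path.join(project, rel), "w") as handle:
            handle.write("# placeholder\n")

    # Write the archive outside the project directory so it does not include itself.
    target = os.path.join(workdir, "my_function.zip")
    zip_directory(project, target)
    with zipfile.ZipFile(target) as archive:
        archived_names = sorted(archive.namelist())
    print(archived_names)
```

As with zip -r ../my_function.zip ., the archive is written one level above the project directory.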

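Test the Lambda function from your local computer. Run the python3 command from inside the my_function directory so that lambda_function.py can be imported.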

# check bucket
aws s3 ls s3://hieu320231129/ --recursive

#invoke lambda function from local
python3 -c "import lambda_function;lambda_function.lambda_handler(None, None)"

#delete uploaded file
aws s3 rm s3://hieu320231129/ --recursive

Create AWS Lambda function with requests module

Deploy the .zip file using the Python 3.11 runtime, then change the environment settings.

Change the environment settings

Test the Lambda function with the requests module

Conclusion

These steps provide an example of creating a Lambda function with dependencies. I used the requests module to read a file from a website and put it into an AWS S3 bucket. The specific configuration details may vary depending on your environment and setup. It’s recommended to consult the relevant documentation from AWS for detailed instructions on setting up. I hope you find this helpful. Thank you for reading the DevopsRoles page!

Refer

https://docs.aws.amazon.com/lambda/latest/dg/python-package.html

A Comprehensive Guide to Validating Kubernetes Cluster Installed using Kubeadm

Introduction

When setting up a Kubernetes cluster using Kubeadm, it’s essential to validate the installation to ensure everything is functioning correctly. In this blog post, we will guide you through the steps to validate a Kubernetes cluster installed using Kubeadm, with the help of Kubectl.

This guide assumes you have already installed Kubernetes by following Installing Kubernetes using Kubeadm on Ubuntu: A Step-by-Step Guide.

Validating Kubernetes Cluster Installed using Kubeadm: Step-by-Step Guide

Validating CMD Tools: Kubeadm & Kubectl

First, let’s check the versions of Kubeadm and Kubectl to ensure they match your cluster setup.

Checking “kubeadm” version

kubeadm version

Checking “kubectl” version

kubectl version

Make sure the versions of Kubeadm and Kubectl are compatible with your Kubernetes cluster.

Validating Cluster Nodes

Next, we need to ensure that all nodes in the cluster, including both Master and Worker nodes, are in the “Ready” state.

To check the status of all nodes:

kubectl get nodes
kubectl get nodes -o wide

These commands display a list of all nodes in the cluster along with their status; the -o wide variant adds columns such as the internal IP address and container runtime. Ensure that all nodes are marked as “Ready.”
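On a larger cluster it can help to check the status column programmatically. The following is a minimal sketch, assuming the default table layout of kubectl get nodes; the function name not_ready_nodes, the sample output, and the version strings in it are hypothetical.

```python
def not_ready_nodes(kubectl_output):
    """Return names of nodes whose STATUS column is not exactly 'Ready'."""
    lines = kubectl_output.strip().splitlines()
    problems = []
    for line in lines[1:]:  # skip the header row
        columns = line.split()
        if len(columns) >= 2 and columns[1] != "Ready":
            problems.append(columns[0])
    return problems


# Hypothetical output, shaped like `kubectl get nodes`.
sample = """\
NAME          STATUS     ROLES           AGE   VERSION
k8s-master    Ready      control-plane   10d   v1.28.2
k8s-worker1   Ready      <none>          10d   v1.28.2
k8s-worker2   NotReady   <none>          10d   v1.28.2
"""
print(not_ready_nodes(sample))
```

A node that is cordoned shows a combined status such as Ready,SchedulingDisabled, which this sketch also reports, since it compares against the exact string Ready.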

Validating Kubernetes Components

It’s crucial to verify that all Kubernetes components on the Master node are running correctly.

To check the status of Kubernetes components:

kubectl get pods -n kube-system
kubectl get pods -n kube-system -o wide

These commands show the status of the various Kubernetes components in the kube-system namespace. Ensure that all components are in the “Running” state.
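The same check can be scripted against the JSON form of the pod list. This is a sketch only: it assumes you saved the output of kubectl get pods -n kube-system -o json, the function name unhealthy_pods is hypothetical, and the sample document is heavily trimmed and invented for illustration.

```python
import json


def unhealthy_pods(pod_list_json):
    """Return (name, phase) pairs for pods not in the Running or Succeeded phase."""
    pod_list = json.loads(pod_list_json)
    return [
        (item["metadata"]["name"], item["status"].get("phase", "Unknown"))
        for item in pod_list.get("items", [])
        if item["status"].get("phase") not in ("Running", "Succeeded")
    ]


# Hypothetical, trimmed output of `kubectl get pods -n kube-system -o json`.
sample = json.dumps({
    "items": [
        {"metadata": {"name": "etcd-k8s-master"}, "status": {"phase": "Running"}},
        {"metadata": {"name": "coredns-abc12"}, "status": {"phase": "Pending"}},
    ]
})
print(unhealthy_pods(sample))
```

The pod phase is a coarse signal: a pod whose container keeps restarting may still report the Running phase, so the STATUS and RESTARTS columns of the table output remain worth reading.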

Validating Services: Docker & Kubelet

To ensure the proper functioning of your cluster, we need to validate the services Docker and Kubelet on all nodes.

Checking Docker service status

systemctl status docker

This command will display the status of the Docker service. Ensure that it is “Active” and running without any errors.

Checking Kubelet service status

systemctl status kubelet

This command will show the status of the Kubelet service. Verify that it is “Active” and running correctly.

Deploying Test Deployment

To further validate your cluster, let’s deploy a sample Nginx deployment and check its status.

Deploying the sample “nginx” deployment:

kubectl apply -f https://k8s.io/examples/controllers/nginx-deployment.yaml

This command will create the Nginx deployment in your cluster.

Validate the deployment:

kubectl get deploy
kubectl get deploy -o wide

These commands will display the status of the Nginx deployment, including the number of replicas and the desired and current states.

Check if the pods are in the “Running” state:

kubectl get pods
kubectl get pods -o wide

Make sure all pods are running without any errors.

Verify that containers are running on the respective worker nodes. Run the following on each worker node:

docker ps

This command will show the running containers on each worker node. Ensure that the Nginx containers are running as expected.

Delete the deployment:

kubectl delete -f https://k8s.io/examples/controllers/nginx-deployment.yaml

This command will delete the Nginx deployment from your cluster.

Conclusion

By following these steps, you can validate your Kubernetes cluster installation using Kubeadm and Kubectl. It’s essential to ensure that all the components, services, and deployments are running correctly to have a reliable and stable Kubernetes environment. I hope you find this helpful. Thank you for reading the DevopsRoles page!

Installing Kubernetes using Kubeadm on Ubuntu: A Step-by-Step Guide

Introduction

Kubernetes has emerged as the go-to solution for container orchestration and management. If you’re looking to set up a Kubernetes cluster on a Ubuntu server, you’re in the right place. In this step-by-step guide, we’ll walk you through the process of installing Kubernetes using Kubeadm on Ubuntu.

Prerequisites

I created three VMs on Google Compute Engine (GCE) to serve as the Kubernetes cluster nodes:

  • Master(1): 2 vCPUs – 4 GB RAM
  • Worker(2): 2 vCPUs – 2 GB RAM
  • OS: Ubuntu 16.04 or CentOS/RHEL 7

I configured the following ingress firewall rules in Google Compute Engine (GCE):

  • Master Node: 2379,6443,10250,10251,10252
  • Worker Node: 10250,30000-32767

Installing Kubernetes using Kubeadm on Ubuntu

Set hostname on Each Node

# hostnamectl set-hostname "k8s-master"    # For Master node
# hostnamectl set-hostname "k8s-worker1"   # For 1st worker node
# hostnamectl set-hostname "k8s-worker2"   # For 2nd worker node

Add the following entries in /etc/hosts file on each node

192.168.1.14   k8s-master
192.168.1.16   k8s-worker1
192.168.1.17   k8s-worker2

Disable Swap and Bridge Traffic

Kubernetes does not work well with swap enabled. Run the following on the MASTER and WORKER nodes.

Disable SWAP

# swapoff -a
# sed -i.bak -r 's/(.+ swap .+)/#\1/' /etc/fstab
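To see what the sed expression does before running it against /etc/fstab, the same regular expression can be tried on sample text. The sketch below is an illustration in Python rather than a replacement for the command; the fstab contents are hypothetical.

```python
import re

# Hypothetical /etc/fstab contents for illustration.
fstab = """\
UUID=1111-aaaa /     ext4 defaults 0 1
/swapfile      none  swap sw       0 0
"""

# Same pattern as the sed command: prefix any line containing " swap " with '#'.
commented = re.sub(r"(.+ swap .+)", r"#\1", fstab)
print(commented)
```

Only the line with a swap entry is commented out; the root filesystem entry is left unchanged. The -i.bak flag in the sed command additionally keeps the original file as /etc/fstab.bak.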

Load the following kernel modules on all the nodes:

# tee /etc/modules-load.d/containerd.conf <<EOF
overlay
br_netfilter
EOF
# modprobe overlay
# modprobe br_netfilter

Set the following kernel parameters for Kubernetes by running the tee command below:

# tee /etc/sysctl.d/kubernetes.conf <<EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF

Reload the above changes

# sysctl --system
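A quick way to confirm that the configuration file contains all three settings is to parse it and compare against the expected values. This is a minimal sketch, not part of the original procedure; the names REQUIRED and missing_settings are hypothetical, and the sample text deliberately omits one setting.

```python
REQUIRED = {
    "net.bridge.bridge-nf-call-ip6tables": "1",
    "net.bridge.bridge-nf-call-iptables": "1",
    "net.ipv4.ip_forward": "1",
}


def missing_settings(conf_text, required=REQUIRED):
    """Return required keys that are absent or set to a different value."""
    found = {}
    for line in conf_text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not key = value.
        if not line or line.startswith(("#", ";")) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        found[key.strip()] = value.strip()
    return sorted(key for key, value in required.items() if found.get(key) != value)


# Hypothetical file contents with ip_forward left out.
sample_conf = """\
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
"""
print(missing_settings(sample_conf))
```

On a node, the text would come from reading /etc/sysctl.d/kubernetes.conf; an empty result means all three parameters are present with the value 1.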

Installing Docker

Run it on MASTER & WORKER Nodes. Kubernetes requires a container runtime, and Docker is a popular choice. To install Docker, run the following commands:

apt-get update  
apt-get install -y  apt-transport-https ca-certificates curl software-properties-common gnupg2

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -

sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
  $(lsb_release -cs) \
  stable"

Installing the Docker packages

apt-get update && sudo apt-get install \
  containerd.io=1.6.24-1 \
  docker-ce=5:20.10.24~3-0~ubuntu-$(lsb_release -cs) \
  docker-ce-cli=5:20.10.24~3-0~ubuntu-$(lsb_release -cs)

Setting up the Docker daemon

cat <<EOF | sudo tee /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2"
}
EOF

mkdir -p /etc/systemd/system/docker.service.d
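Because a syntax error in daemon.json can keep the Docker daemon from starting, it may be worth validating the file before restarting the service. The following sketch parses the configuration shown above with Python's json module; the function name uses_systemd_cgroups is hypothetical.

```python
import json

# The daemon.json content from this section, as a string.
daemon_json = """
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2"
}
"""


def uses_systemd_cgroups(config_text):
    """Parse daemon.json text and report whether the systemd cgroup driver is set."""
    config = json.loads(config_text)  # raises ValueError on malformed JSON
    return "native.cgroupdriver=systemd" in config.get("exec-opts", [])


print(uses_systemd_cgroups(daemon_json))
```

On a node, the text would be read from /etc/docker/daemon.json instead of being embedded in the script.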

Start and enable Docker

systemctl daemon-reload
systemctl enable docker
systemctl restart docker
systemctl status docker

Install Kubeadm, Kubelet, and Kubectl

Add the Kubernetes repository and install Kubeadm, Kubelet, and Kubectl

apt-get update && sudo apt-get install -y apt-transport-https curl

curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -

cat <<EOF | sudo tee /etc/apt/sources.list.d/kubernetes.list
deb https://apt.kubernetes.io/ kubernetes-xenial main
EOF

Installing Kubeadm, Kubelet, Kubectl

apt-get update
apt-get install -y kubelet kubeadm kubectl

apt-mark hold kubelet kubeadm kubectl

Start and enable Kubelet

systemctl daemon-reload
systemctl enable kubelet
systemctl restart kubelet
systemctl status kubelet

Initializing CONTROL-PLANE

Run it on MASTER Node only. On your master node, initialize the Kubernetes cluster with the command below:

kubeadm init

Make note of the kubeadm join command that’s provided at the end; you’ll need it to join worker nodes.

Installing POD-NETWORK add-on

Run it on MASTER Node only

Configure kubectl access for your user

mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config

Installing “Weave CNI” (Pod-Network add-on)

kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"

NOTE: Multiple CNI plug-ins are available, and you can install the one of your choice. If the command above does not work, consult the documentation of your chosen CNI plug-in.

Joining Worker Nodes

Run it on WORKER Node only

On your worker nodes, use the kubeadm join command from the kubeadm init output above to join them to the cluster.

kubeadm join <...>

Run this command on the master node if you no longer have the join command above or need to create a new one.

kubeadm token create --print-join-command

Verify the Cluster

On the master node, ensure your cluster is up and running

kubectl get nodes

You should see the master node marked as “Ready” and any joined worker nodes.

Conclusion

Congratulations! You’ve successfully installed Kubernetes using Kubeadm on Ubuntu. With your Kubernetes cluster up and running, you’re ready to deploy and manage containerized applications and services at scale.

Kubernetes offers vast capabilities for container orchestration, scaling, and management. As you become more familiar with Kubernetes, you can explore advanced configurations and features to optimize your containerized environment.

I hope you find this helpful. Thank you for reading the DevopsRoles page!

HISTCONTROL ignorespace Force history in Linux

Introduction

This article shows how to force the shell history not to remember a particular command using HISTCONTROL ignorespace in Linux. When HISTCONTROL includes ignorespace, you can precede a command with a space to keep it out of your command history.

This might be tempting for junior sysadmins seeking discretion, but it’s essential to grasp how ignorespace functions. As a best practice, it’s generally discouraged to purposefully hide commands from your history, as transparency and accountability are crucial in system administration and troubleshooting.

What is HISTCONTROL?

HISTCONTROL is a Bash shell variable that defines how your command history is managed. It lets you specify which commands are recorded in your history and which are excluded, which helps you maintain a cleaner command history.

ignorespace – An Option for HISTCONTROL

One of the settings you can use with HISTCONTROL is ignorespace. When ignorespace is included in the value of HISTCONTROL, any command line that begins with a space character will not be recorded in your command history. This can be incredibly handy for preventing sensitive information, such as passwords, from being stored in your history.
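The decision Bash makes for each command line can be modelled in a few lines. The sketch below is a simplified simulation in Python, not Bash's actual implementation; the function name should_record is hypothetical. It covers ignorespace together with the related ignoredups and ignoreboth values, which the Bash manual documents alongside it.

```python
def should_record(command, histcontrol, previous=None):
    """Decide whether a command line would be saved to history.

    histcontrol is the colon-separated HISTCONTROL value;
    previous is the most recent history entry, if any.
    """
    options = set(histcontrol.split(":")) if histcontrol else set()
    if "ignoreboth" in options:  # shorthand for ignorespace plus ignoredups
        options |= {"ignorespace", "ignoredups"}
    if "ignorespace" in options and command.startswith(" "):
        return False
    if "ignoredups" in options and command == previous:
        return False
    return True


print(should_record(" ls -l", "ignorespace"))
print(should_record("ls -l", "ignorespace"))
```

The first call models a command typed with a leading space, which is skipped; the second models the same command without the space, which is recorded.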

Working with HISTCONTROL ignorespace

Step 1: Check Your Current HISTCONTROL Setting

Before you start using HISTCONTROL with ignorespace, it’s a good idea to check your current HISTCONTROL setting. Open a terminal and run the following command:

echo $HISTCONTROL

This will display your current HISTCONTROL setting. If it’s empty or doesn’t include ignorespace, you can proceed to the next step.

Step 2: Set HISTCONTROL to ignorespace

To enable ignorespace in your HISTCONTROL, you can add the following line to your shell configuration file (e.g., ~/.bashrc for Bash users):

export HISTCONTROL=ignorespace

After making this change, be sure to reload your shell configuration or start a new terminal session for the changes to take effect.

Step 3: Test ignorespace

Now that you’ve set HISTCONTROL to ignorespace, you can test its functionality. Try entering a command with a leading space, like this:

 ls -l

Notice that the space at the beginning of the command is intentional. This command will not be recorded in your command history because of the ignorespace setting.

Step 4: Verify Your Command History

To verify that the command you just entered is not in your history, you can display your command history using the history command:

history

Conclusion

Utilizing HISTCONTROL with ignorespace gives you finer control over your Linux command history. This feature is especially useful for excluding commands that contain sensitive data or temporary experiments. Understanding HISTCONTROL and its options, such as ignorespace, improves both the efficiency and security of your work on the Linux command line.

Remember that these settings are user-specific, so each user on a multi-user system must configure them individually. With HISTCONTROL ignorespace, you can force history not to remember a particular command. Thank you for reading the DevopsRoles page!

Dive view the contents of docker images

Introduction

How to view the contents of docker images? “Dive” is a command-line tool for exploring and analyzing Docker images. It allows you to inspect the contents of a Docker image, view its layers, and understand the file structure and sizes within those layers.

This tool can be helpful for optimizing Docker images and gaining insights into their composition.

On macOS, Dive can be installed with Homebrew; on Windows, it can be installed with a downloaded installer file.

What You’ll Need

  • Dive: You’ll need to install the Dive tool on your system to use it.
  • Docker: Dive works with Docker images, so you should have Docker installed on your system to pull and work with Docker images.

Installing Dive

To install Dive, you can use package managers like Homebrew (on macOS) or download the binary from the Dive GitHub repository.

Using Homebrew (on macOS)

brew install dive

Downloading the binary

You can visit the Dive GitHub repository and download the binary for your platform from the “Releases” section. The following commands install Dive on Ubuntu.

$ export DIVE_VERSION=$(curl -sL "https://api.github.com/repos/wagoodman/dive/releases/latest" | grep '"tag_name":' | sed -E 's/.*"v([^"]+)".*/\1/')
$ curl -OL https://github.com/wagoodman/dive/releases/download/v${DIVE_VERSION}/dive_${DIVE_VERSION}_linux_amd64.deb
$ sudo apt install ./dive_${DIVE_VERSION}_linux_amd64.deb
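The first command extracts the version number from the tag_name field of the GitHub API response with grep and sed. The sketch below shows the same extraction using a JSON parser, which is less sensitive to formatting changes in the response. The response body and its version number are hypothetical samples, and the function name version_from_release is made up for illustration.

```python
import json

# Hypothetical, trimmed response body from the GitHub "latest release" endpoint.
response_body = '{"tag_name": "v0.12.0", "name": "v0.12.0"}'


def version_from_release(body):
    """Return the release version without its leading 'v', like the sed expression."""
    tag = json.loads(body)["tag_name"]
    return tag[1:] if tag.startswith("v") else tag


dive_version = version_from_release(response_body)
print(f"dive_{dive_version}_linux_amd64.deb")
```

The printed file name follows the same pattern that the curl command uses to build the download URL.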

Using Dive

Once you have Dive installed, you can use it to view the contents of a Docker image as follows:

  1. Open your terminal or command prompt.
  2. Run the dive command followed by the name or ID of the Docker image you want to inspect (dive IMAGE).
  3. Dive will launch a text-based interface that allows you to navigate through the layers of the Docker image. You can explore the file structure, check the sizes of individual layers, and gain insights into the image’s contents.

View the contents of docker images

To examine the latest Alpine Docker image

dive alpine:latest

You can fetch the image from a different source using the --source option

dive IMAGE --source SOURCE

SOURCE is the container engine to fetch the image from, for example docker, podman, or docker-archive.

The features of Dive

  • Layer Visualization: Dive provides a visual representation of a Docker image’s layers, showing how they are stacked on top of each other.
  • Layer Size Information: Dive displays the size of each individual layer in the Docker image.
  • File and Directory Listing: You can navigate through the contents of each layer and view the files and directories it contains.
  • Image Efficiency Analysis: Dive helps you identify inefficiencies in your Docker images.
  • Image Build Context Analysis: Dive can analyze the build context of a Docker image.
  • Image Diffing: Dive allows you to compare two Docker images and see the differences between them.

Conclusion

Dive is a powerful tool for image analysis and optimization, and it can help you gain insights into what’s inside a Docker image. It’s particularly useful for identifying large files or unnecessary dependencies that can be removed to create smaller and more efficient Docker images.

You can view the contents of docker images using Dive.

Create a Lambda to access ElastiCache

Introduction

In this tutorial, you will create a Lambda function to access an ElastiCache cluster. When you create the Lambda function, you provide subnet IDs in your Amazon VPC and a VPC security group to allow the Lambda function to access resources in your VPC. For illustration in this tutorial, the Lambda function generates a UUID, writes it to the cache, and retrieves it from the cache.

Invoke the Lambda function and verify that it accessed the ElastiCache cluster in your VPC.

Prerequisites

Before starting, you should have the following prerequisites configured:

  • An AWS account
  • AWS CLI on your computer
  • A Memcached cluster (refer to the Memcached tutorial to create one)

Create a Lambda to access ElastiCache in an Amazon VPC

  • Create the execution role
  • Create an ElastiCache cluster
  • Create a deployment package
  • Create the Lambda function
  • Test the Lambda function
  • Clean up

Create the execution role

Create the execution role that gives your function permission to access AWS resources. To create an execution role with the AWS CLI, use the create-role command.

In the following example, you specify the trust policy inline.

aws iam create-role --role-name lambda-vpc-role --assume-role-policy-document '{"Version": "2012-10-17","Statement": [{ "Effect": "Allow", "Principal": {"Service": "lambda.amazonaws.com"}, "Action": "sts:AssumeRole"}]}'

You can also define the trust policy for the role using a JSON file and pass it as --assume-role-policy-document file://trust-policy.json. In the following example, trust-policy.json is a file in the current directory. Example trust-policy.json:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "lambda.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
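The inline and file-based forms carry the same policy document. As a sketch of how the two relate, the policy can be held as a Python dictionary and serialized either compactly or pretty-printed; the variable names below are hypothetical.

```python
import json

trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "lambda.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

# Compact form, suitable for the inline --assume-role-policy-document argument.
inline_document = json.dumps(trust_policy, separators=(",", ":"))
# Pretty-printed form, suitable for writing to trust-policy.json.
file_document = json.dumps(trust_policy, indent=2)
print(inline_document)
```

Generating both from one dictionary avoids the inline string and the file drifting apart when the policy changes.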

To add permissions to the role, use the attach-role-policy command. Start by adding the AWSLambdaVPCAccessExecutionRole managed policy.

aws iam attach-role-policy --role-name lambda-vpc-role --policy-arn arn:aws:iam::aws:policy/service-role/AWSLambdaVPCAccessExecutionRole

Create an ElastiCache cluster

Refer to the Memcached tutorial to create a Memcached cluster.

The following command retrieves the configuration endpoint (ConfigurationEndpoint)

aws elasticache describe-cache-clusters \
    --cache-cluster-id my-cluster --query 'CacheClusters[].ConfigurationEndpoint'

Create a deployment package

In the following example, create a file named app.py in the current directory. Example app.py:

from __future__ import print_function
import time
import uuid
import sys
import socket
import elasticache_auto_discovery
from pymemcache.client.hash import HashClient

#elasticache settings
elasticache_config_endpoint = "your-elasticache-cluster-endpoint:port"
nodes = elasticache_auto_discovery.discover(elasticache_config_endpoint)
nodes = map(lambda x: (x[1], int(x[2])), nodes)
memcache_client = HashClient(nodes)

def handler(event, context):
    """
    This function puts into memcache and get from it.
    Memcache is hosted using elasticache
    """

    #Create a random UUID... this will be the sample element we add to the cache.
    uuid_inserted = uuid.uuid4().hex
    #Put the UUID to the cache.
    memcache_client.set('uuid', uuid_inserted)
    #Get item (UUID) from the cache.
    uuid_obtained = memcache_client.get('uuid')
    if uuid_obtained.decode("utf-8") == uuid_inserted:
        # this print should go to the CloudWatch Logs and Lambda console.
        print ("Success: Fetched value %s from memcache" %(uuid_inserted))
    else:
        raise Exception("Value is not the same as we put :(. Expected %s got %s" %(uuid_inserted, uuid_obtained))

    return "Fetched value from memcache: " + uuid_obtained.decode("utf-8")

Dependencies

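  • pymemcache – The Lambda function code uses this library to create a HashClient object to set and get items from memcache.
  • elasticache-auto-discovery – The Lambda function code uses this library to discover the nodes in your ElastiCache cluster.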
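The step in app.py that turns the discovery result into a node list for HashClient can be illustrated without a cluster. The sketch below is hedged: the sample data is hypothetical, and the layout of each entry (hostname, IP address, and port at positions 0, 1, and 2) is an assumption inferred from the x[1] and x[2] indexing in app.py, not taken from the library's documentation.

```python
# Hypothetical discovery result: one entry per cache node, assumed to hold
# the hostname, IP address, and port at positions 0, 1, and 2.
discovered = [
    ("node-0001.example.internal", "10.0.1.15", "11211"),
    ("node-0002.example.internal", "10.0.2.27", "11211"),
]

# Build a concrete list of (ip, port) pairs rather than a lazy map object,
# so the node list can be printed or iterated more than once.
nodes = [(entry[1], int(entry[2])) for entry in discovered]
print(nodes)
```

A list comprehension gives the same pairs as the map call in app.py while making the result easier to log when debugging connection problems.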

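Install the dependencies into the current directory so that the pymemcache and elasticache_auto_discovery folders exist, then create a deployment package.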

zip -r function.zip app.py pymemcache/* elasticache_auto_discovery/*

Create the Lambda function

Create the Lambda function with the create-function command.

aws lambda create-function --function-name AccessMemCache --timeout 30 --memory-size 1024 \
--zip-file fileb://function.zip --handler app.handler --runtime python3.8 \
--role arn:aws:iam::123456789012:role/lambda-vpc-role \
--vpc-config SubnetIds=subnet-0a8aaace20a7efd26,subnet-0daa531c4e748062d,subnet-0de820fd0f0efded5,SecurityGroupIds=sg-083f2ca0560111a3b

Test the Lambda function

In this step, you invoke the Lambda function manually using the invoke command. When the Lambda function runs, it generates a UUID and writes it to the ElastiCache cluster specified in your Lambda code. The Lambda function then retrieves the item from the cache.

Invoke the Lambda function with the invoke command. The function's response is written to the file named out.

aws lambda invoke --function-name AccessMemCache --cli-binary-format raw-in-base64-out --payload '{"key": "value"}' out

Clean up

Run the following delete-function command to delete the AccessMemCache function.

aws lambda delete-function --function-name AccessMemCache

Run the following commands to detach the managed policy and delete the IAM role.

aws iam list-attached-role-policies --role-name lambda-vpc-role
aws iam detach-role-policy --role-name lambda-vpc-role --policy-arn arn:aws:iam::aws:policy/service-role/AWSLambdaVPCAccessExecutionRole
aws iam delete-role --role-name lambda-vpc-role

Conclusion

These steps provide an example of accessing an ElastiCache Memcached cluster from a Lambda function. The specific configuration details may vary depending on your environment and setup. It’s recommended to consult the relevant documentation from AWS for detailed instructions on setting up. I hope you find this helpful. Thank you for reading the DevopsRoles page!

Refer

https://docs.aws.amazon.com/lambda/latest/dg/services-elasticache-tutorial.html#vpc-ec-deployment-pkg

https://docs.aws.amazon.com/lambda/latest/dg/gettingstarted-awscli.html#with-userapp-walkthrough-custom-events-upload

https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_manage_delete.html#roles-managingrole-deleting-cli

Memcached tutorial

Introduction

In this Memcached tutorial, you will create an Amazon ElastiCache for Memcached cluster in your default Amazon Virtual Private Cloud, then operate on the cluster and manage its nodes using AWS CLI commands. For more information about Amazon ElastiCache, see Amazon ElastiCache.

Prerequisites

Before starting, you should have the following prerequisites configured:

  • An AWS account
  • AWS CLI on your computer

Memcached tutorial

  • Creating a Memcached cluster with AWS CLI
  • Modifying a Memcached cluster with AWS CLI
  • Viewing the elements in a Memcached cluster with AWS CLI
  • Rebooting a Memcached cluster with AWS CLI
  • Discovering the endpoints of Memcached cluster with AWS CLI
  • Adding nodes to a Memcached cluster with AWS CLI
  • Removing nodes from a Memcached cluster with AWS CLI
  • Scaling Memcached vertically
  • Configuring a Lambda function to access Amazon ElastiCache in an Amazon VPC
  • Deleting a Memcached cluster with AWS CLI

Creating a Memcached cluster with AWS CLI

Before you begin, install and configure the AWS CLI if you have not already done so. This tutorial uses the us-east-1 region.

Now we’re ready to launch a Memcached cluster by using the AWS CLI.

You can set up a cluster with a specific number of nodes and a parameter group that controls the properties for each node. All nodes within a cluster are designed to be of the same node type and have the same parameter and security group settings. 

Every cluster must have a cluster identifier. The cluster identifier is a customer-supplied name for the cluster. This identifier specifies a particular cluster when interacting with the ElastiCache API and AWS CLI commands. The cluster identifier must be unique for that customer in an AWS Region. For more information, see create-cache-cluster

  • --engine-version: Use one of the supported Memcached engine versions.
  • --cache-parameter-group-name: If this argument is omitted, the default parameter group for the specified engine is used. Alternatively, you can use the create-cache-parameter-group command to create a parameter group.
  • If you’re going to launch your cluster in a VPC, make sure to create a subnet group in the same VPC before you start creating a cluster.

The following CLI code creates a Memcached cache cluster with 3 nodes.

#Creating a subnet group
aws elasticache create-cache-subnet-group \
    --cache-subnet-group-name my-subnetgroup \
    --cache-subnet-group-description "Testing" \
    --subnet-ids "subnet-0a8aaace20a7efd26" "subnet-0daa531c4e748062d" "subnet-0de820fd0f0efded5"

#Creating cluster
aws elasticache create-cache-cluster \
    --cache-cluster-id my-cluster \
    --cache-node-type cache.t2.medium \
    --engine memcached \
    --engine-version 1.5.16 \
    --cache-subnet-group-name my-subnetgroup \
    --num-cache-nodes 3

Modifying a Memcached cluster with AWS CLI

In addition to adding or removing nodes from a cluster, there can be times when you need to make other changes to an existing cluster, such as adding a security group or changing the maintenance window or parameter group.

You can modify an existing cluster using the AWS CLI modify-cache-cluster operation. To modify a cluster’s configuration value, specify the cluster’s ID, the parameter to change, and the parameter’s new value.

The --apply-immediately parameter applies only to modifications in the engine version and changing the number of nodes in a cluster. If you want to apply any of these changes immediately, use the --apply-immediately parameter. If you prefer postponing these changes to your next maintenance window, use the --no-apply-immediately parameter. Other modifications, such as changing the maintenance window, are applied immediately.

The following example changes the maintenance window for a cluster named my-cluster and applies the change immediately.

aws elasticache modify-cache-cluster \
    --cache-cluster-id my-cluster \
    --preferred-maintenance-window sun:23:00-mon:02:00

Viewing the elements in a Memcached cluster with AWS CLI

You can view detailed information about one or more clusters using describe-cache-clusters

By default, abbreviated information about the clusters is returned. You can use the optional --show-cache-node-info flag to retrieve detailed information about the cache nodes associated with the clusters. These details include the DNS address and port for the cache node endpoint.

The following code lists the details for my-cluster

aws elasticache describe-cache-clusters --cache-cluster-id my-cluster

Rebooting a Memcached cluster with AWS CLI

Some changes require that the cluster be rebooted for the changes to be applied. For example, for some parameters, changing the parameter value in a parameter group is only applied after a reboot.

When you reboot a cluster, the cluster flushes all its data and restarts its engine. During this process, you cannot access the cluster. Because the cluster flushed all its data, when it is available again, you start with an empty cluster.

To reboot specific nodes in the cluster, use the --cache-node-ids-to-reboot parameter to list the node IDs to reboot.

To reboot a cluster (AWS CLI), use the reboot-cache-cluster CLI operation.

Run the following command to reboot a cluster.

aws elasticache reboot-cache-cluster --cache-cluster-id my-cluster --cache-node-ids-to-reboot "0001"

Discovering the endpoints of the Memcached cluster with AWS CLI

Your application connects to your cluster using endpoints. An endpoint is a node’s or cluster’s unique address. Which endpoints you use depends on whether you use Automatic Discovery:

  • If you use Automatic Discovery, you can use the cluster’s configuration endpoint to configure your Memcached client. This means you must use a client that supports Automatic Discovery.
  • If you don’t use Automatic Discovery, you must configure your client to use the individual node endpoints for reads and writes. You must also keep track of them as you add and remove nodes.

You can use the AWS CLI to discover the endpoints for a cluster and its nodes with the describe-cache-clusters command. For more information, see the topic describe-cache-clusters.

The following command retrieves the configuration endpoint (ConfigurationEndpoint)

aws elasticache describe-cache-clusters \
    --cache-cluster-id my-cluster --query 'CacheClusters[].ConfigurationEndpoint'

For Memcached clusters, the command returns the configuration endpoint. If you include the optional parameter --show-cache-node-info, the command returns both the configuration endpoint (ConfigurationEndpoint) and the individual node endpoints (Endpoint) for the Memcached cluster.

aws elasticache describe-cache-clusters \
    --cache-cluster-id my-cluster \
    --show-cache-node-info
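The JSON that this command prints can be post-processed to collect the endpoints in the host:port form that Memcached clients usually expect. The sketch below assumes the command's output was captured as text; the function name cluster_endpoints is hypothetical, and the sample document is trimmed with invented host names.

```python
import json


def cluster_endpoints(describe_output):
    """Return (configuration_endpoint, node_endpoints) as 'host:port' strings."""
    cluster = json.loads(describe_output)["CacheClusters"][0]
    config = cluster["ConfigurationEndpoint"]
    nodes = [
        f'{node["Endpoint"]["Address"]}:{node["Endpoint"]["Port"]}'
        for node in cluster.get("CacheNodes", [])
    ]
    return f'{config["Address"]}:{config["Port"]}', nodes


# Hypothetical, trimmed output of describe-cache-clusters --show-cache-node-info.
sample = json.dumps({
    "CacheClusters": [{
        "CacheClusterId": "my-cluster",
        "ConfigurationEndpoint": {"Address": "my-cluster.cfg.example.com", "Port": 11211},
        "CacheNodes": [
            {"CacheNodeId": "0001", "Endpoint": {"Address": "my-cluster.0001.example.com", "Port": 11211}},
            {"CacheNodeId": "0002", "Endpoint": {"Address": "my-cluster.0002.example.com", "Port": 11211}},
        ],
    }]
})
config_endpoint, node_endpoints = cluster_endpoints(sample)
print(config_endpoint)
print(node_endpoints)
```

Without --show-cache-node-info the CacheNodes list is absent, in which case the sketch returns an empty node list alongside the configuration endpoint.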

Adding nodes to a Memcached cluster with AWS CLI

Adding nodes to a Memcached cluster increases the number of your cluster’s partitions.
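A change in the number of partitions matters because clients decide which node holds a key by hashing the key. The sketch below uses a deliberately naive scheme, hash modulo node count, to illustrate how many keys can map to a different node when a fourth node is added. It is an assumption for illustration only and not necessarily what your Memcached client does; many clients use consistent or rendezvous hashing to reduce this remapping. The names node_for_key and the node labels are hypothetical.

```python
import hashlib


def node_for_key(key, nodes):
    """Pick a node by hashing the key and taking it modulo the node count."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]


keys = [f"user:{i}" for i in range(1000)]
three_nodes = ["node-0001", "node-0002", "node-0003"]
four_nodes = three_nodes + ["node-0004"]

# Count keys whose owning node changes once the fourth node is added.
moved = sum(node_for_key(k, three_nodes) != node_for_key(k, four_nodes) for k in keys)
print(f"{moved} of {len(keys)} keys map to a different node after adding node-0004")
```

Keys that move are looked up on a node that does not hold them yet, so expect a temporary rise in cache misses after scaling out.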

To add nodes to a cluster using the AWS CLI, use the AWS CLI operation modify-cache-cluster. For more information, see the AWS CLI topic modify-cache-cluster.

Run the following command to add nodes to a cluster

aws elasticache modify-cache-cluster \
    --cache-cluster-id my-cluster \
    --num-cache-nodes 4 \
    --apply-immediately

Removing nodes from a Memcached cluster with AWS CLI

To remove nodes from a cluster using the command-line interface, use the command modify-cache-cluster with the following parameters:

  • --cache-cluster-id The ID of the cache cluster that you want to remove nodes from.
  • --num-cache-nodes The number of nodes that you want in this cluster after the modification is applied.
  • --cache-node-ids-to-remove A list of node IDs that you want removed from this cluster.
  • --apply-immediately or --no-apply-immediately Specifies whether to remove these nodes immediately or at the next maintenance window.
  • --region Specifies the AWS Region of the cluster that you want to remove nodes from.

The following example immediately removes node 0004 from the cluster my-cluster.

aws elasticache modify-cache-cluster \
    --cache-cluster-id my-cluster \
    --num-cache-nodes 3 \
    --cache-node-ids-to-remove 0004 \
    --region us-east-1 \
    --apply-immediately  

This command returns the following result.
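Because --num-cache-nodes is the node count after the change, it has to agree with the number of IDs passed to --cache-node-ids-to-remove. The sketch below shows one way to derive it in a script. The function name build_remove_nodes_command is hypothetical, and the check that at least one node must remain is an assumption of this sketch.

```python
def build_remove_nodes_command(cluster_id, current_node_count, node_ids_to_remove,
                               region=None, apply_immediately=True):
    """Build the argument list for `aws elasticache modify-cache-cluster` when removing nodes."""
    if not node_ids_to_remove:
        raise ValueError("no node IDs given")
    remaining = current_node_count - len(node_ids_to_remove)
    if remaining < 1:
        # Assumption of this sketch: the cluster keeps at least one node.
        raise ValueError("at least one node must remain in the cluster")
    command = [
        "aws", "elasticache", "modify-cache-cluster",
        "--cache-cluster-id", cluster_id,
        "--num-cache-nodes", str(remaining),
        "--cache-node-ids-to-remove", *node_ids_to_remove,
    ]
    if region:
        command += ["--region", region]
    command.append("--apply-immediately" if apply_immediately else "--no-apply-immediately")
    return command


# Reproduces the example above: 4 nodes, remove node 0004, leaving 3.
command = build_remove_nodes_command("my-cluster", 4, ["0004"], region="us-east-1")
print(" ".join(command))
```

The resulting list could be passed to subprocess.run if you want the script to execute the command rather than print it.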

Scaling Memcached vertically

To scale a Memcached cache cluster vertically

  • Create a new cache cluster with the new node type. 
  • In your application, update the endpoints to the new cluster’s endpoints.
  • Delete the old cache cluster. 

Deleting a Memcached cluster with AWS CLI

For more information, see the AWS CLI for ElastiCache topic delete-cache-cluster

Run the following command to delete a cluster.

aws elasticache delete-cache-cluster --cache-cluster-id my-cluster

This command returns the following result.

Conclusion

These steps provide an example of managing a Memcached cluster. The specific configuration details may vary depending on your environment and setup. It’s recommended to consult the relevant documentation from AWS for detailed instructions on setting up. I hope this will be helpful. Thank you for reading the DevopsRoles page!

Refer

https://docs.aws.amazon.com/AmazonElastiCache/latest/mem-ug/WhatIs.html

https://docs.aws.amazon.com/lambda/latest/dg/services-elasticache-tutorial.html

S3 to Redshift

Introduction

This tutorial shows you how to create a Redshift cluster, connect to Amazon Redshift, load sample data from S3 into Redshift, and run queries using command-line tools.

You can use SQL Workbench or Amazon Redshift Query Editor v2.0 (web-based analyst workbench). In this tutorial, we load sample data from an Amazon S3 bucket into Amazon Redshift using the psql command-line tool.

psql is a terminal-based front-end to PostgreSQL. It enables you to type in queries interactively, issue them to PostgreSQL, and see the query results. Alternatively, input can be from a file or command line arguments. In addition, psql provides several meta-commands and various shell-like features to facilitate writing scripts and automating a wide variety of tasks.

Prerequisites

Before starting, you should have the following prerequisites configured

  • An AWS account
  • AWS CLI on your computer

Load data from S3 to Redshift example with AWS CLI

  • Install PSQL on MacOS
  • Creating a data warehouse with Amazon Redshift using AWS CLI
  • Connect to the Redshift cluster using PSQL
  • Create Redshift cluster tables using PSQL
  • Set the Redshift default IAM role using the AWS Console
  • Loading sample data from S3 to Redshift with PSQL
  • Delete the sample cluster using AWS CLI

Install PSQL on MacOS

We can choose a version from the PostgreSQL page or run the following command on macOS

brew install postgresql

Creating a data warehouse with Amazon Redshift using AWS CLI

Before you begin, if you have not installed the AWS CLI, see Setting up the Amazon Redshift CLI. This tutorial uses the us-east-1 region.

Now we’re ready to launch a cluster by using the AWS CLI.

The create-cluster command has a large number of parameters. For this tutorial, you will use the parameter values that are described in the following table. Before you create a cluster in a production environment, we recommend that you review all the required and optional parameters so that your cluster configuration matches your requirements. For more information, see create-cluster

Parameter name          Parameter value for this exercise
cluster-identifier      examplecluster
master-username         awsuser
master-user-password    Awsuser123
node-type               dc2.large
cluster-type            single-node

Run the following command to create a cluster.

aws redshift create-cluster --cluster-identifier examplecluster --master-username awsuser --master-user-password Awsuser123 --node-type dc2.large --cluster-type single-node

This command returns the following result.

The cluster creation process will take several minutes to complete. To check the status, enter the following command.

aws redshift describe-clusters --cluster-identifier examplecluster | grep ClusterStatus

When the ClusterStatus field changes from creating to available, the cluster is ready for use.
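Checking the status by hand works, but the wait can also be scripted as a polling loop. The following is a minimal sketch of that idea. The function wait_for_status and its parameters are hypothetical; get_status stands for any callable that returns the current ClusterStatus string, for example one that wraps the describe-clusters call above.

```python
import time


def wait_for_status(get_status, target="available", interval_seconds=30.0,
                    timeout_seconds=1800.0, sleep=time.sleep):
    """Poll get_status() until it returns target; return the seconds spent waiting."""
    waited = 0.0
    while True:
        status = get_status()
        if status == target:
            return waited
        if waited >= timeout_seconds:
            raise TimeoutError(f"status is still {status!r} after {waited:.0f} seconds")
        sleep(interval_seconds)
        waited += interval_seconds


# Simulated status sequence so the sketch runs without an AWS account.
statuses = iter(["creating", "creating", "available"])
waited = wait_for_status(lambda: next(statuses), sleep=lambda seconds: None)
print(f"cluster became available after about {waited:.0f} seconds of polling")
```

Passing sleep as a parameter keeps the loop easy to test; in real use the default time.sleep applies.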

Connect to the Redshift cluster using PSQL

Run the following command to connect to the Redshift cluster.

psql -h examplecluster.ccfmryooawwy.us-east-1.redshift.amazonaws.com -U awsuser -d dev -p 5439

You must explicitly grant inbound access to your client to connect to the cluster. When you created a cluster in the previous step, because you did not specify a security group, you associated the default cluster security group with the cluster.

The default cluster security group contains no rules to authorize any inbound traffic to the cluster. To access the new cluster, you must add rules for inbound traffic, which are called ingress rules, to the cluster security group. If you are accessing your cluster from the Internet, you will need to authorize a Classless Inter-Domain Routing IP (CIDR/IP) address range.

#get VpcSecurityGroupId
aws redshift describe-clusters --cluster-identifier examplecluster | grep VpcSecurityGroupId

Run the following command to allow your computer to connect to your Redshift cluster. Then log in to your cluster using psql.

#allow connect to cluster from my computer
aws ec2 authorize-security-group-ingress --group-id sg-083f2ca0560111a3b --protocol tcp --port 5439 --cidr 111.111.111.111/32

This command returns the following result.
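The --cidr value above ends in /32, which limits the rule to a single IPv4 address. The sketch below uses Python's standard ipaddress module to build such a value from a plain IP address and to flag a range that would open the port to everyone. The helper names single_host_cidr and is_open_to_world are hypothetical.

```python
import ipaddress


def single_host_cidr(ip):
    """Return a CIDR block that covers exactly one address (/32 for IPv4, /128 for IPv6)."""
    address = ipaddress.ip_address(ip)  # raises ValueError on malformed input
    prefix = 32 if address.version == 4 else 128
    return f"{address}/{prefix}"


def is_open_to_world(cidr):
    """True when the CIDR block matches every address, such as 0.0.0.0/0."""
    return ipaddress.ip_network(cidr, strict=False).prefixlen == 0


cidr = single_host_cidr("111.111.111.111")
print(cidr, is_open_to_world(cidr))
```

Validating the address before calling authorize-security-group-ingress avoids creating a rule that is broader than intended.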

Now test the connection by querying the system table

Create Redshift cluster tables using PSQL

In this tutorial, I use sample data from AWS. Run the following command to create Redshift tables.

create table users(
userid integer not null distkey sortkey,
username char(8),
firstname varchar(30),
lastname varchar(30),
city varchar(30),
state char(2),
email varchar(100),
phone char(14),
likesports boolean,
liketheatre boolean,
likeconcerts boolean,
likejazz boolean,
likeclassical boolean,
likeopera boolean,
likerock boolean,
likevegas boolean,
likebroadway boolean,
likemusicals boolean);                        

create table event(
eventid integer not null distkey,
venueid smallint not null,
catid smallint not null,
dateid smallint not null sortkey,
eventname varchar(200),
starttime timestamp);

create table sales(
salesid integer not null,
listid integer not null distkey,
sellerid integer not null,
buyerid integer not null,
eventid integer not null,
dateid smallint not null sortkey,
qtysold smallint not null,
pricepaid decimal(8,2),
commission decimal(8,2),
saletime timestamp);

This command returns the following result.

Test by querying the public.sales table as follows

select * from public.sales;

Set the Redshift default IAM role using the AWS Console

Before you can load data from Amazon S3, you must first create an IAM role with the necessary permissions and attach it to your cluster. To do this, refer to the AWS documentation.

Loading sample data from S3 to Redshift with PSQL

Use the COPY command to load large datasets from Amazon S3 into Amazon Redshift. For more information about COPY syntax, see COPY in the Amazon Redshift Database Developer Guide.

Run the following SQL commands in PSQL to load data from S3 to Redshift

COPY users 
FROM 's3://redshift-downloads/tickit/allusers_pipe.txt' 
DELIMITER '|' 
TIMEFORMAT 'YYYY-MM-DD HH:MI:SS'
IGNOREHEADER 1 
REGION 'us-east-1'
IAM_ROLE default;                    
                    
COPY event
FROM 's3://redshift-downloads/tickit/allevents_pipe.txt' 
DELIMITER '|' 
TIMEFORMAT 'YYYY-MM-DD HH:MI:SS'
IGNOREHEADER 1 
REGION 'us-east-1'
IAM_ROLE default;

COPY sales
FROM 's3://redshift-downloads/tickit/sales_tab.txt' 
DELIMITER '\t' 
TIMEFORMAT 'MM/DD/YYYY HH:MI:SS'
IGNOREHEADER 1 
REGION 'us-east-1'
IAM_ROLE default;
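The three COPY statements differ only in table name, source file, delimiter, and time format. As a sketch of that pattern, the hypothetical helper below generates the same statement text from parameters. It does no escaping, so it assumes trusted input and is meant for illustration rather than as a general SQL builder.

```python
def build_copy_statement(table, s3_path, delimiter, timeformat,
                         region="us-east-1", ignore_header=1, iam_role="default"):
    """Return a Redshift COPY statement in the same shape as the examples above."""
    # The keyword `default` is written unquoted; a role ARN would be quoted.
    role = "default" if iam_role == "default" else f"'{iam_role}'"
    lines = [
        f"COPY {table}",
        f"FROM '{s3_path}'",
        f"DELIMITER '{delimiter}'",
        f"TIMEFORMAT '{timeformat}'",
        f"IGNOREHEADER {ignore_header}",
        f"REGION '{region}'",
        f"IAM_ROLE {role};",
    ]
    return "\n".join(lines)


# "\\t" keeps the backslash so the SQL text contains '\t', as in the sales example.
sales_copy = build_copy_statement(
    "sales", "s3://redshift-downloads/tickit/sales_tab.txt", "\\t", "MM/DD/YYYY HH:MI:SS"
)
print(sales_copy)
```

The printed text can be pasted into psql or saved to a file and run with psql's -f option.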

After loading data, try some example queries. 

\timing

SELECT firstname, lastname, total_quantity 
FROM   (SELECT buyerid, sum(qtysold) total_quantity
        FROM  sales
        GROUP BY buyerid
        ORDER BY total_quantity desc limit 10) Q, users
WHERE Q.buyerid = userid
ORDER BY Q.total_quantity desc;
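To see what the query above computes without a running cluster, the sketch below runs the same statement shape against an in-memory SQLite database. SQLite only stands in for Redshift here, the tables are cut down to the columns the query needs, and the rows are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (userid INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT);
CREATE TABLE sales (salesid INTEGER PRIMARY KEY, buyerid INTEGER, qtysold INTEGER);
INSERT INTO users VALUES (1, 'Ana', 'Lee'), (2, 'Bo', 'Kim'), (3, 'Cy', 'Roe');
INSERT INTO sales VALUES (1, 1, 2), (2, 2, 5), (3, 1, 4), (4, 3, 1);
""")

# Inner query: total quantity per buyer, keeping only the top buyers.
# Outer query: join those buyers to users to show their names.
TOP_BUYERS = """
SELECT firstname, lastname, total_quantity
FROM   (SELECT buyerid, sum(qtysold) total_quantity
        FROM  sales
        GROUP BY buyerid
        ORDER BY total_quantity DESC LIMIT 2) Q, users
WHERE Q.buyerid = userid
ORDER BY Q.total_quantity DESC;
"""

rows = conn.execute(TOP_BUYERS).fetchall()
for row in rows:
    print(row)
```

With this sample data, buyer 1 has bought 6 tickets and buyer 2 has bought 5, so those two rows are returned in that order.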

You have now loaded data into Redshift and queried it.

Delete the sample cluster using AWS CLI

When you delete a cluster, you must decide whether to create a final snapshot. Because this is an exercise and your test cluster should not have any important data in it, you can skip the final snapshot.

To delete your cluster, enter the following command.

aws redshift delete-cluster --cluster-identifier examplecluster --skip-final-cluster-snapshot

Congratulations! You successfully launched, authorized access to, connected to, and terminated a cluster.

Conclusion

These steps provide an example of loading data from S3 into Redshift with the psql tool. The specific configuration details may vary depending on your environment and setup. It’s recommended to consult the relevant documentation from AWS for detailed instructions on setting up. I hope this will be helpful. Thank you for reading the DevopsRoles page!

Refer

https://docs.aws.amazon.com/redshift/latest/mgmt/getting-started-cli.html#getting-started-create-sample-db-cli

https://docs.aws.amazon.com/redshift/latest/gsg/new-user-serverless.html

Docker data science image

Introduction

This tutorial shows how to create a simple Docker data science image.

  • Docker offers an efficient way to establish a Python data science environment, ensuring a smooth workflow from development to deployment.
  • Begin by creating a Dockerfile that precisely defines the environment’s specifications, dependencies, and configurations, serving as a blueprint for the Docker data science image.
  • Use the Docker command to build the image by incorporating the Dockerfile along with your data science code and necessary requirements.
  • Once the image is constructed, initiate a container to execute your analysis, maintaining environment consistency across various systems.
  • Docker simplifies sharing your work by encapsulating the entire environment, eliminating compatibility issues that can arise from varied setups.
  • For larger-scale collaboration or deployment needs, Docker Hub provides a platform to store and distribute your Docker images.
  • Pushing your image to Docker Hub makes it readily available to colleagues and collaborators, allowing effortless integration into their workflows.
  • This comprehensive process of setting up, building, sharing, and deploying a Python data science environment using Docker significantly enhances reproducibility, collaboration, and the efficiency of deployment.

Why Docker for Data Science?

Here are some reasons why Docker is commonly used in the data science field:

  1. Reproducibility: Docker allows you to package your entire data science environment, including dependencies, libraries, and configurations, into a single container. This ensures that your work can be reproduced exactly as you intended, even across different machines or platforms.
  2. Isolation: Docker containers provide a level of isolation, ensuring that the dependencies and libraries used in one project do not interfere with those used in another. This is especially important in data science, where different projects might require different versions of the same library.
  3. Portability: With Docker, you can package your entire data science stack into a container, making it easy to move your work between different environments, such as from your local machine to a cloud server. This is crucial for collaboration and deployment.
  4. Dependency Management: Managing dependencies in traditional environments can be challenging and error-prone. Docker simplifies this process by allowing you to specify dependencies in a Dockerfile, ensuring consistent and reliable setups.
  5. Version Control: Docker images can be versioned, allowing you to track changes to your environment over time. This can be especially helpful when sharing projects with collaborators or when you need to reproduce an older version of your work.
  6. Collaboration: Docker images can be easily shared with colleagues or the broader community. Instead of providing a list of instructions for setting up an environment, you can share a Docker image that anyone can run without worrying about setup complexities.
  7. Easy Setup: Docker simplifies the process of setting up complex environments. Once the Docker image is created, anyone can run it on their system with minimal effort, eliminating the need to manually install libraries and dependencies.
  8. Security: Docker containers provide a degree of isolation, which can enhance security by preventing unwanted interactions between your data science environment and your host system.
  9. Scalability: Docker containers can be orchestrated and managed using tools like Kubernetes, allowing you to scale your data science applications efficiently, whether you’re dealing with large datasets or resource-intensive computations.
  10. Consistency: Docker helps ensure that the environment you develop in is the same environment you’ll deploy to. This reduces the likelihood of “it works on my machine” issues.

Create A Simple Docker Data Science Image

Step 1: Create a Project Directory

Create a new directory for your Docker project and navigate into it:

mkdir data-science-docker
cd data-science-docker

Step 2: Create a Dockerfile for Docker data science

Create a file named Dockerfile (without any file extensions) in the project directory. This file will contain instructions for building the Docker image. You can use any text editor you prefer.

# Use an official Python runtime as a parent image
FROM python:3.8-slim

# Set the working directory to /app
WORKDIR /app

# Copy the current directory contents into the container at /app
COPY . /app

# Install any needed packages specified in requirements.txt
RUN pip install --no-cache-dir -r requirements.txt

# Make port 8888 available to the world outside this container
EXPOSE 8888

# Define environment variable
ENV NAME=DataScienceContainer

# Run jupyter notebook when the container launches
CMD ["jupyter", "notebook", "--ip=0.0.0.0", "--port=8888", "--no-browser", "--allow-root"]

Step 3: Create requirements.txt

Create a file named requirements.txt in the same directory and list the Python libraries you want to install. For this example, we’ll include pandas and numpy, plus notebook, because the Dockerfile’s CMD starts Jupyter Notebook:

pandas==1.3.3
numpy==1.21.2
notebook
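Pinned versions only help reproducibility if the container really ends up with them. The sketch below shows one way to check this from inside the container: it parses the `name==version` lines and compares them with the installed versions reported by importlib.metadata. The function names parse_pins and find_mismatches are hypothetical, and the parser handles only simple lines like the ones above.

```python
from importlib import metadata


def parse_pins(requirements_text):
    """Map package name to pinned version (None when the line has no == pin)."""
    pins = {}
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        name, separator, version = line.partition("==")
        pins[name.strip().lower()] = version.strip() if separator else None
    return pins


def find_mismatches(pins, installed_version=metadata.version):
    """Return {name: (wanted, found)} for packages that are missing or at another version."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            found = installed_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found is None or (wanted is not None and found != wanted):
            mismatches[name] = (wanted, found)
    return mismatches


pins = parse_pins("pandas==1.3.3\nnumpy==1.21.2\n")
print(pins)
```

Inside the container you could call find_mismatches(pins) with the default lookup; an empty dict means every pin is satisfied.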

Step 4: Build the Docker Image

Open a terminal and navigate to the project directory (data-science-docker). Run the following command to build the Docker image:

docker build -t data-science-image .

Step 5: Run the Docker Container

After the image is built, you can run a container based on it:

docker run -p 8888:8888 -v $(pwd):/app --name data-science-container data-science-image

Here:

  • -p 8888:8888 maps port 8888 from the container to the host.
  • -v $(pwd):/app mounts the current directory from the host to the /app directory in the container.
  • --name data-science-container assigns a name to the running container.

In your terminal, you’ll see a URL with a token that you can copy and paste into your web browser to access the Jupyter Notebook interface. This will allow you to start working with data science libraries like NumPy and pandas.

Remember, this is a simple example. Depending on your specific requirements, you might need to add more configurations, libraries, or dependencies to your Docker image.

Step 6: Sharing and Deploying the Image

To save the image to a tar archive (docker save takes the image name, data-science-image, not the container name):

docker save -o data-science-image.tar data-science-image

This tarball can then be loaded on any other system with Docker installed via

docker load -i data-science-image.tar

Push the image to Docker Hub to share it publicly or privately within an organization.

To push the image to Docker Hub:

  1. Create a Docker Hub account if you don’t already have one
  2. Log in to Docker Hub from the command line using docker login
  3. Tag the image with your Docker Hub username: docker tag data-science-image yourusername/data-science-image
  4. Push the image: docker push yourusername/data-science-image
  5. The data-science-image image is now hosted on Docker Hub. Other users can pull the image by running:
docker pull yourusername/data-science-image

Conclusion

The process of creating a simple Docker data science image provides a powerful solution to some of the most pressing challenges in the field. By encapsulating the entire data science environment within a Docker container, practitioners can achieve reproducibility, ensuring that their work remains consistent across different systems and environments. The isolation and dependency management offered by Docker addresses the complexities of library versions, enhancing the stability of projects.

I hope this will be helpful. Thank you for reading the DevopsRoles page!

Boto3 DynamoDB

Introduction

In this Boto3 DynamoDB tutorial, we’ll walk through the process of creating tables, loading data, and executing fundamental CRUD operations in AWS DynamoDB using Python and the Boto3 library.

Boto3, the Python SDK for AWS, is primarily known for its two widely used features: Clients and Resources.

  • The boto3 dynamodb client provides a low-level interface to the AWS service. It maps 1:1 with the actual AWS service API.
  • The boto3 dynamodb resource, by contrast, is a higher-level abstraction than the client. It provides an object-oriented interface for interacting with AWS services. Resources aren’t available for all AWS services.

In this tutorial, we use Boto3 DynamoDB resource methods.
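One practical difference between the two interfaces is the item format. The low-level client expects every value wrapped in a DynamoDB type descriptor such as {'S': ...} or {'N': ...}, while the resource accepts plain Python values and converts them for you. The sketch below is a simplified, hand-written illustration of that conversion, not Boto3's own code (Boto3 ships its own serializer in boto3.dynamodb.types); the function name to_attribute_value is hypothetical and only a few types are covered.

```python
from decimal import Decimal


def to_attribute_value(value):
    """Wrap a plain Python value in the DynamoDB JSON type descriptor used by the client API."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return {"BOOL": value}
    if value is None:
        return {"NULL": True}
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, (int, Decimal)):
        return {"N": str(value)}  # numbers travel as strings
    if isinstance(value, list):
        return {"L": [to_attribute_value(element) for element in value]}
    if isinstance(value, dict):
        return {"M": {key: to_attribute_value(element) for key, element in value.items()}}
    raise TypeError(f"unsupported type in this sketch: {type(value).__name__}")


# What the resource lets you write ...
plain_item = {"Name": "Amazon DynamoDB", "Threads": 2, "Views": 1000}
# ... and the shape the client API expects for the same item.
client_item = {key: to_attribute_value(value) for key, value in plain_item.items()}
print(client_item)
```

This is why the resource-based examples in this tutorial can pass ordinary strings and integers.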

Boto3 DynamoDB Prerequisites

Before starting, you should have the following prerequisites configured

  • An AWS account
  • AWS CLI on your computer
  • Download and unzip the sample source code from GitHub

Boto3 DynamoDB CRUD Operations example

  • Create table
  • Batch Write Items
  • Read Item
  • Add new item
  • Full scan table
  • Update item
  • Delete item
  • List all tables
  • Delete table

Create table

The CreateTable operation adds a new table to your account. In an Amazon Web Services account, table names must be unique within each Region. That is, you can have two tables with the same name if you create the tables in different Regions.

CreateTable is an asynchronous operation. Upon receiving a CreateTable request, DynamoDB immediately returns a response with a TableStatus of CREATING. After the table is created, DynamoDB sets the TableStatus to ACTIVE. You can perform read and write operations only on an ACTIVE table. To block until the table exists, call the wait_until_exists() method on the table resource.

The following code example shows how to create a DynamoDB table.

Python (Boto3)

    def create_table(self, table_name):
        """
        Creates an Amazon DynamoDB table that can be used to store forum data.
        The table partition key(S): Name 

        :param table_name: The name of the table to create.
        :return: The newly created table.
        """
        try:
            self.table = self.dyn_resource.create_table(
                TableName=table_name,
                KeySchema=[
                    {'AttributeName': 'Name', 'KeyType': 'HASH'},  # Partition key
                ],
                AttributeDefinitions=[
                    {'AttributeName': 'Name', 'AttributeType': 'S'}
                ],
                ProvisionedThroughput={'ReadCapacityUnits': 10, 'WriteCapacityUnits': 5})
            self.table.wait_until_exists()
        except ClientError as err:
            logger.error(
                "Couldn't create table %s. Here's why: %s: %s", table_name,
                err.response['Error']['Code'], err.response['Error']['Message'])
            raise
        else:
            return self.table

Call the function to create the table as below

    forums = Forum(dynamodb)
    #Check for table existence, create table if not found
    forums_exists = forums.exists(table_name)
    if not forums_exists:
        print(f"\nCreating table {table_name}...")
        forums.create_table(table_name)
        print(f"\nCreated table {forums.table.name}.")

This command returns the following result.

Batch Write Items

The BatchWriteItem operation puts or deletes multiple items in one or more tables. A single call to BatchWriteItem can transmit up to 16 MB of data over the network, consisting of up to 25 item put or delete operations. While individual items can be up to 400 KB once stored, an item’s representation might be greater than 400 KB while being sent in DynamoDB’s JSON format for the API call.

BatchWriteItem cannot update items. 

If DynamoDB returns any unprocessed items, you should retry the batch operation on those items. However, AWS strongly recommends that you use an exponential backoff algorithm. If you retry the batch operation immediately, the underlying read or write requests can still fail due to throttling on the individual tables. If you delay the batch operation using exponential backoff, the individual requests in the batch are much more likely to succeed.

For more information, see Batch Operations and Error Handling in the Amazon DynamoDB Developer Guide & Exponential Backoff And Jitter

The following code example shows how to write a batch of DynamoDB items.

    def write_batch(self, forums):
        """
        Fills an Amazon DynamoDB table with the specified data, using the Boto3
        Table.batch_writer() function to put the items in the table.
        Inside the context manager, Table.batch_writer builds a list of
        requests. On exiting the context manager, Table.batch_writer starts sending
        batches of write requests to Amazon DynamoDB and automatically
        handles chunking, buffering, and retrying.

        :param forums: The data to put in the table. Each item must contain at least
                       the keys required by the schema that was specified when the
                       table was created.
        """
        try:
            with self.table.batch_writer() as writer:
                for forum in forums:
                    writer.put_item(Item=forum)
        except ClientError as err:
            logger.error(
                "Couldn't load data into table %s. Here's why: %s: %s", self.table.name,
                err.response['Error']['Code'], err.response['Error']['Message'])
            raise

Call function to write data to DynamoDB as below

    #Load data into the created table
    forum_data = forums.get_sample_forum_data(forum_file_name)
    print(f"\nReading data from '{forum_file_name}' into your table.")
    forums.write_batch(forum_data)
    print(f"\nWrote {len(forum_data)} forums into {forums.table.name}.")
    print('-'*88)

This command returns the following result.
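Table.batch_writer() hides the batching and retrying described above. To make those two ideas concrete, here is a minimal sketch of what a caller of the low-level BatchWriteItem API would have to do: split the items into chunks of at most 25 and retry unprocessed items with exponential backoff and jitter. The names chunked, write_with_backoff, and send_batch are hypothetical; send_batch stands for any callable that sends a list of items and returns the ones that were not processed. The delay values are arbitrary choices for this sketch.

```python
import random
import time


def chunked(items, size=25):
    """Yield successive slices of at most `size` items (25 is the BatchWriteItem limit)."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def write_with_backoff(send_batch, items, max_attempts=5,
                       base_delay=0.05, max_delay=2.0, sleep=time.sleep):
    """Send items in chunks; retry unprocessed items with exponential backoff and full jitter."""
    for batch in chunked(items):
        pending = batch
        for attempt in range(max_attempts):
            pending = send_batch(pending)
            if not pending:
                break
            # Full jitter: wait a random time up to the capped exponential delay.
            sleep(random.uniform(0, min(max_delay, base_delay * 2 ** attempt)))
        else:
            raise RuntimeError(f"{len(pending)} items still unprocessed after {max_attempts} attempts")


# Fake sender for demonstration: the last item of any multi-item batch is "throttled" once.
calls = []

def fake_send(batch):
    calls.append(len(batch))
    return batch[-1:] if len(batch) > 1 else []

write_with_backoff(fake_send, list(range(30)), sleep=lambda seconds: None)
print(calls)
```

In the fake run, 30 items become one batch of 25 and one of 5, and each batch needs one retry for its leftover item.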

Read Item

The GetItem operation returns a set of attributes for the item with the given primary key. If there is no matching item, GetItem does not return any data and there will be no Item element in the response.

GetItem provides an eventually consistent read by default. If your application requires a strongly consistent read, set ConsistentRead to true.

The following code example shows how to get an item from a DynamoDB table.

    def get_forum(self, name):
        """
        Gets forum data from the table for a specific forum.

        :param name: The name of the forum.
        :return: The data about the requested forum.
        """
        try:
            response = self.table.get_item(Key={'Name': name})
        except ClientError as err:
            logger.error(
                "Couldn't get forum %s from table %s. Here's why: %s: %s",
                name, self.table.name,
                err.response['Error']['Code'], err.response['Error']['Message'])
            raise
        else:
            return response.get('Item')  # None when no item matches the key

Call function to get data items from DynamoDB as below

    #Get forum data with hash key = 'Amazon DynamoDB'  
    forum = forums.get_forum("Amazon DynamoDB")
    print("\nHere's what I found:")
    pprint(forum)
    print('-'*88)

This command returns the following result.
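Two details from the GetItem description are easy to miss in code: the response has no Item key when nothing matches, and strongly consistent reads must be requested explicitly. The sketch below shows both. The helper get_item_or_none is hypothetical, and FakeTable is a stand-in written for this example so that it runs without AWS; a real Boto3 Table object accepts the same Key and ConsistentRead keyword arguments.

```python
def get_item_or_none(table, key, consistent=False):
    """Return the item for `key`, or None when the response has no Item element."""
    response = table.get_item(Key=key, ConsistentRead=consistent)
    return response.get("Item")


class FakeTable:
    """Stand-in for a Boto3 Table, keyed on the Name attribute (illustration only)."""

    def __init__(self, items):
        self._items = items

    def get_item(self, Key, ConsistentRead=False):
        item = self._items.get(Key["Name"])
        # Like DynamoDB, omit the Item element entirely when nothing matches.
        return {"Item": item} if item is not None else {}


table = FakeTable({"Amazon DynamoDB": {"Name": "Amazon DynamoDB", "Threads": 2}})
print(get_item_or_none(table, {"Name": "Amazon DynamoDB"}))
print(get_item_or_none(table, {"Name": "No such forum"}, consistent=True))
```

Returning None lets the caller decide how to handle a missing forum instead of failing with a KeyError.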

Add new item

The PutItem operation creates a new item, or replaces an old item with a new item. If an item that has the same primary key as the new item already exists in the specified table, the new item completely replaces the existing item. You can perform a conditional put operation (add a new item if one with the specified primary key doesn’t exist), or replace an existing item if it has certain attribute values. You can return the item’s attribute values in the same operation, using the ReturnValues parameter.

The following code example shows how to put an item in a DynamoDB table.

    def add_forum(self, name, category, messages, threads, views):
        """
        Adds a forum to the table.

        :param name: The name of the forum.
        :param category: The category of the forum.
        :param messages: The messages of the forum.
        :param threads: The number of threads in the forum.
        :param views: The number of views of the forum.
        """
        try:
            self.table.put_item(
                Item={
                    'Name': name,
                    'Category': category,
                    'Messages': messages,
                    'Threads': threads,
                    'Views': views
                    })
        except ClientError as err:
            logger.error(
                "Couldn't add forum %s to table %s. Here's why: %s: %s",
                name, self.table.name,
                err.response['Error']['Code'], err.response['Error']['Message'])
            raise

Call the function to add an item to DynamoDB as below

    #Add new forum data with hash key = 'SQL server'
    forums.add_forum("SQL server","Amazon Web Services",4,2,1000)
    print(f"\nAdded item to '{forums.table.name}'.")
    print('-'*88)

This command returns the following result.
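The conditional put mentioned above (add the item only if the key is not already present) is expressed with a ConditionExpression. The sketch below only builds the keyword arguments so that it runs without AWS; the helper name conditional_put_kwargs is hypothetical. The placeholder #pk is used because Name is a DynamoDB reserved word and cannot appear directly in an expression.

```python
def conditional_put_kwargs(item, key_attribute="Name"):
    """Build put_item arguments that refuse to overwrite an existing item."""
    return {
        "Item": item,
        # Succeeds only when no item with this key exists yet.
        "ConditionExpression": "attribute_not_exists(#pk)",
        "ExpressionAttributeNames": {"#pk": key_attribute},
    }


kwargs = conditional_put_kwargs({"Name": "SQL server", "Category": "Amazon Web Services"})
print(kwargs)
```

With a real table you would call table.put_item(**kwargs). If the item already exists, the call raises a ClientError whose error code is ConditionalCheckFailedException.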

Full scan table

The Scan operation returns one or more items and item attributes by accessing every item in a table or a secondary index. To have DynamoDB return fewer items, you can provide a FilterExpression.

If the total size of scanned items exceeds the maximum dataset size limit of 1 MB, the scan stops and the results so far are returned to the user. The LastEvaluatedKey value is also returned, and the requester can use it to continue the scan in a subsequent operation.

The following code example shows how to scan a DynamoDB table.

    def scan_forums(self):
        """
        Scans for forums.

        :param n/a
        :return: The list of forums.
        """
        forums = []
        scan_kwargs = {}
        try:
            done = False
            start_key = None
            while not done:
                if start_key:
                    scan_kwargs['ExclusiveStartKey'] = start_key
                response = self.table.scan(**scan_kwargs)
                forums.extend(response.get('Items', []))
                start_key = response.get('LastEvaluatedKey', None)
                done = start_key is None
        except ClientError as err:
            logger.error(
                "Couldn't scan for forums. Here's why: %s: %s",
                err.response['Error']['Code'], err.response['Error']['Message'])
            raise

        return forums

Call function to scan items from DynamoDB as below

    #Full scan table
    releases = forums.scan_forums()
    if releases:
        print(f"\nHere are your {len(releases)} forums:\n")
        pprint(releases)
    else:
        print(f"I don't know about any forums released\n")
    print('-'*88)

The boto3 dynamodb scan returns the following result.

Update item

The UpdateItem operation edits an existing item’s attributes, or adds a new item to the table if it does not already exist. We can put, delete, or add attribute values. We can also perform a conditional update on an existing item (insert a new attribute name-value pair if it doesn’t exist, or replace an existing name-value pair if it has certain expected attribute values).

We can also return the item’s attribute values in the same UpdateItem operation using the ReturnValues parameter.

The following code example shows how to update an item in a DynamoDB table.

    def update_forum(self, name, category, messages, threads, views):
        """
        Updates the category, messages, threads, and views of a forum in the table.

        :param name: The name of the forum.
        :param category: The category of the forum.
        :param messages: The messages of the forum.
        :param threads: The number of threads in the forum.
        :param views: The number of views of the forum.
        :return: The fields that were updated, with their new values.
        """
        try:
            response = self.table.update_item(
                Key={'Name': name},
                UpdateExpression="set Category=:c, Messages=:m, Threads=:t, #Views=:v",
                ExpressionAttributeValues={
                    ':c': category,
                    ':m': messages,
                    ':t': threads,
                    ':v': views
                    },
                ExpressionAttributeNames={"#Views" : "Views"},
                ReturnValues="UPDATED_NEW")
        except ClientError as err:
            logger.error(
                "Couldn't update forum %s in table %s. Here's why: %s: %s",
                name, self.table.name,
                err.response['Error']['Code'], err.response['Error']['Message'])
            raise
        else:
            return response['Attributes']

Call function to update an item of DynamoDB as below

    #Update data: update the forum's view count from 1000 to 2000
    updated = forums.update_forum("SQL server","Amazon Web Services",4,2,2000)
    print(f"\nUpdated :")
    pprint(updated)
    print('-'*88)

This command returns the following result.
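The update_forum method above hard-codes its UpdateExpression. When the set of attributes varies, the expression and its placeholder maps can be generated instead. The following is a minimal sketch of that idea; the helper name build_update_kwargs and the placeholder naming scheme (#a0, :v0, ...) are choices made for this example. Using a name placeholder for every attribute avoids clashes with reserved words.

```python
def build_update_kwargs(key, changes):
    """Build update_item arguments that SET every attribute in `changes`."""
    if not changes:
        raise ValueError("nothing to update")
    names, values, assignments = {}, {}, []
    for index, (attribute, value) in enumerate(changes.items()):
        name_token, value_token = f"#a{index}", f":v{index}"
        names[name_token] = attribute
        values[value_token] = value
        assignments.append(f"{name_token} = {value_token}")
    return {
        "Key": key,
        "UpdateExpression": "SET " + ", ".join(assignments),
        "ExpressionAttributeNames": names,
        "ExpressionAttributeValues": values,
        "ReturnValues": "UPDATED_NEW",
    }


update_kwargs = build_update_kwargs({"Name": "SQL server"}, {"Views": 2000, "Threads": 2})
print(update_kwargs["UpdateExpression"])
```

With a real table, table.update_item(**update_kwargs) would apply the change and return the updated attributes under the Attributes key.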

Delete item

The DeleteItem operation deletes a single item in a table by primary key. You can perform a conditional delete operation that deletes the item if it exists, or if it has an expected attribute value.

In addition to deleting an item, you can also return the item’s attribute values in the same operation, using the ReturnValues parameter.

Unless you specify conditions, DeleteItem is an idempotent operation; running it multiple times on the same item or attribute does not result in an error response.

The following code example shows how to delete an item from a DynamoDB table.

    def delete_forum(self, name):
        """
        Deletes a forum from the table.

        :param name: The title of the forum to delete.
        """
        try:
            self.table.delete_item(Key={'Name': name})
        except ClientError as err:
            logger.error(
                "Couldn't delete forum %s. Here's why: %s: %s", name,
                err.response['Error']['Code'], err.response['Error']['Message'])
            raise

Call function to delete the item of DynamoDB as below

    # Delete data
    forums.delete_forum("SQL server")
    print("\nRemoved item from the table.")
    print('-'*88)
    # Full table scan to confirm the item is gone
    remaining = forums.scan_forums()
    if remaining:
        print(f"\nHere are your {len(remaining)} forums:\n")
        pprint(remaining)
    else:
        print("No forums found in the table.\n")
    print('-'*88)

This command returns the following result.
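
The conditional delete and the ReturnValues parameter described in this section can be sketched as follows. This is a minimal sketch, not part of the original example: the function name delete_forum_if_views_below, the Views attribute name, and the max_views threshold are assumptions made for illustration, and table is assumed to be a Boto3 DynamoDB Table resource such as self.table.

```python
def delete_forum_if_views_below(table, name, max_views):
    """
    Deletes a forum only if its Views attribute is below max_views.

    :param table: A Boto3 DynamoDB Table resource (assumed).
    :param name: The title of the forum to delete.
    :param max_views: Hypothetical threshold used in the condition.
    :return: The attributes of the deleted item, or None if none were returned.
    """
    response = table.delete_item(
        Key={'Name': name},
        # '#v' is a placeholder for the assumed attribute name 'Views';
        # placeholders avoid clashes with DynamoDB reserved words.
        ConditionExpression='#v < :max_views',
        ExpressionAttributeNames={'#v': 'Views'},
        ExpressionAttributeValues={':max_views': max_views},
        # ALL_OLD asks DynamoDB to return the item as it was before deletion.
        ReturnValues='ALL_OLD',
    )
    return response.get('Attributes')
```

If the condition is not met, Boto3 raises a ClientError with the error code ConditionalCheckFailedException, which can be caught and logged the same way as in delete_forum.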

List all tables

Returns an array of table names associated with the current account and endpoint. The output from ListTables is paginated, with each page returning a maximum of 100 table names by default.

The following code example shows how to list DynamoDB tables.

    # List all tables
    print('-'*88)
    print("Table list:\n")
    print(list(dynamodb.tables.all()))

This command returns the following result.
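
The pagination behavior of ListTables described in this section can also be handled explicitly with the low-level client. The following is a minimal sketch under stated assumptions: the function name list_all_table_names is hypothetical, and client is assumed to be a Boto3 DynamoDB client (for example, dynamodb.meta.client when dynamodb is a resource).

```python
def list_all_table_names(client, page_size=100):
    """
    Collects every table name by following ListTables pagination.

    :param client: A Boto3 DynamoDB low-level client (assumed).
    :param page_size: Names requested per page; ListTables allows 1 to 100.
    :return: A list of all table names.
    """
    names = []
    kwargs = {'Limit': page_size}
    while True:
        page = client.list_tables(**kwargs)
        names.extend(page.get('TableNames', []))
        # LastEvaluatedTableName is present only when more pages remain.
        last_name = page.get('LastEvaluatedTableName')
        if last_name is None:
            return names
        # Resume the listing right after the last name of this page.
        kwargs['ExclusiveStartTableName'] = last_name
```

The resource call dynamodb.tables.all() used in the example in this section follows these pages automatically, so the explicit loop is mainly useful when working with the client directly.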

Delete table

Deletes a table and all of its items. After a DeleteTable request, the table stays in the DELETING state until DynamoDB completes the deletion.

You can delete a table only when it is in the ACTIVE state. If the table is still being created or updated, DynamoDB returns a ResourceInUseException. If the specified table does not exist, DynamoDB returns a ResourceNotFoundException.

The following code example shows how to delete a DynamoDB table.

    def delete_table(self):
        """
        Deletes the table.
        """
        try:
            self.table.delete()
            self.table = None
        except ClientError as err:
            logger.error(
                "Couldn't delete table. Here's why: %s: %s",
                err.response['Error']['Code'], err.response['Error']['Message'])
            raise

Call the delete_table function to delete the DynamoDB table, as follows:

    # Delete table
    forums.delete_table()
    print(f"Deleted {table_name}.")

This command returns the following result.
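
Table deletion is asynchronous, so the call returns while the table is still being removed. If later steps depend on the table being gone, a waiter can block until the deletion finishes. The following is a minimal sketch: the function name delete_table_and_wait is hypothetical, and table is assumed to be a Boto3 DynamoDB Table resource such as self.table.

```python
def delete_table_and_wait(table):
    """
    Deletes the table and blocks until DynamoDB reports that it no longer exists.

    :param table: A Boto3 DynamoDB Table resource (assumed).
    :return: The name of the deleted table.
    """
    table_name = table.name
    # Starts the deletion; the table enters the DELETING state.
    table.delete()
    # Polls DynamoDB until the table can no longer be found.
    table.wait_until_not_exists()
    return table_name
```

This trades a short wait for the certainty that a table with the same name can be created again immediately afterwards.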

Conclusion

These steps provide an example of CRUD operations on DynamoDB using Boto3. The specific configuration details may vary depending on your environment and setup. Consult the relevant AWS documentation for detailed setup instructions. I hope you find this helpful. Thank you for reading the DevopsRoles page!
