
Thursday, April 27, 2017

Random thoughts by Rick Hightower: DR and multi-region

DR based on region is silly for most apps and services. It is an expensive bet.

Multi-region is great for reducing latency, for sure, and it helps DR, for sure, but multi-region hot standbys are silly for most apps.

Multi-AZ deployments are enough for DR, IMO, for 99% of use cases. 
If your app/service can survive a single-AZ outage, it is better than 99.999% of apps out there. 

I am not saying don't do multi-region deploys (hot standbys), merely that they have a cost, and your app may not need them. 


If you have a regular backup and a way to restore from another region, you are ahead of the game.
  • frequent EBS snapshots sent to another region, 
  • back things up to S3, replicate S3 bucket to S3 bucket, 
  • read replicas for DBs in another region if you must


For many services and applications, you don’t have to run a hot standby if you are spread across three AZs. 
Focus on surviving a single AZ failure. Get that right. Then focus on how to recover in another region from backups:

  • snapshots, AMIs, etc. ready to go, ready to be spun up (see the snapshot-copy sketch below), 
  • backups to S3 with S3 bucket replication. Cheap and easy.
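A minimal sketch of the snapshot-copy piece, assuming the aws CLI is configured; the snapshot id and both regions are placeholders you would substitute:

#!/usr/bin/env bash
set -e

# Copy a recent EBS snapshot to a second region for DR.
# The snapshot id and the regions below are placeholders.
aws ec2 copy-snapshot \
  --region us-east-1 \
  --source-region us-west-2 \
  --source-snapshot-id snap-0123456789abcdef0 \
  --description "Nightly DR copy of Cassandra data volume"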

If all hell breaks loose and it takes you 15 minutes to an hour to spin up in a new region, that is a lot cheaper than running a hot standby in a second region 24/7, 365 days a year. Weigh the probability of a complete region failure, and the cost to your business of being down for 15 minutes to an hour, against the cost of running a second set of servers all of the time. 

Engineers love to over-engineer (especially bad ones). Hot standbys are expensive, unless you also need to run in multiple regions to reduce latency. 

If CA falls into the ocean, no one is going to care if your app serving virtual tractors is down for a few hours. 
If Ohio is nuked, and your app is down for an hour, no one will care that they saw the same ad twice.
We can serve a default ad without personalization for an hour. 


Monday, April 17, 2017

Cassandra AWS Cluster with CloudFormation, bastion host, Ansible, ssh and the aws-command line

This Cassandra tutorial is useful for developers and DevOps/DBA staff who want to launch a Cassandra cluster in AWS.
The cassandra-image project has been using Vagrant and Ansible to set up a Cassandra Cluster for local testing. Then we used Packer, Ansible, and EC2. We used Packer to create AWS images in the last tutorial. In this tutorial, we will use CloudFormation to create a VPC, Subnets, security groups and more to launch a Cassandra cluster in EC2 using the AWS AMI image we created with Packer in the last article. The next two tutorials after this one will set up Cassandra to work in multiple AZs and multiple regions using custom snitches for Cassandra.

Overview

This article covers the following:
  • CloudFormation
  • CloudFormer
  • Setting up VPC, NAT, Subnets, CIDRs, and more
  • AWS command line tools to launch CloudFormations
  • Setting up a bastion server for ssh and ansible in AWS
  • Setting up ansible to tunnel through our bastion server to manage AWS Cassandra instances
  • Using ansible to install Oracle JDK 8 instead of OpenJDK

Getting started

We will create a VPC, subnets, security groups and more. Then we will expand the CloudFormation as we need to set up EC2Snitch and EC2MultiRegionSnitch in later tutorials. We also set up a bastion host in our new public subnet of our new VPC. The bastion host allows us to tunnel ansible commands to our Cassandra or Kafka cluster.

Retrospective - Past Articles in this Cassandra Cluster DevOps/DBA series

The first tutorial in this series was about setting up a Cassandra cluster with Vagrant (which also appeared on DZone with some additional content as DZone Setting up a Cassandra Cluster with Vagrant). The second tutorial in this series was about setting up SSL for a Cassandra cluster using Vagrant (which also appeared with more content as DZone Setting up a Cassandra Cluster with SSL). The third article in this series was about configuring and using Ansible (building on the first two articles). The last article (the 4th), Cassandra Tutorial: AWS Ansible Packer and the AWS command line, covered applying the tools and techniques from the first three articles to produce an image (an EC2 AMI to be precise) that we can deploy to AWS/EC2. This article uses that AWS AMI image and deploys it into a VPC that we create with CloudFormation.

Where do you go if you have a problem or get stuck?

We set up a Google group for this project and set of articles. If you just can’t get something to work or you are getting an error message, please report it here. Between the mailing list and the GitHub issues, we can support you with quite a few questions and issues. You can also find new articles in this series by following Cloudurable™ at our LinkedIn page, Facebook page, Google Plus, or Twitter.

Creating a simple VPC with one private subnet and one public subnet for our Cassandra Cluster

We describe the process here to create a VPC, but we have a script (CloudFormation template) to save you the trouble.
Recall that an AWS VPC is a virtual private cloud. You can create multiple Amazon VPCs within a region, and each VPC spans multiple availability zones, which is useful for Amazon Cassandra deploys and Amazon Kafka deploys. A VPC is an isolated area to deploy EC2 instances.
Let’s create a new VPC for our cluster (Kafka or Cassandra). To start things off, you can use the AWS VPC creation wizard. Before you do that, create an elastic IP, which you will use for the NatGateway.
Recall that Amazon EC2 instances launched in a private subnet cannot access the Internet to do updates unless there is a NAT gateway. A NAT is a network address translator. Even if you wanted to update your Cassandra or Kafka EC2 instances with yum install foo, you could not do it because they have no route to the public Internet. AWS provides NAT gateways, which are similar to IGWs, but unlike IGWs they do not allow incoming connections; they only allow responses to outgoing traffic from your Amazon EC2 instances.
Before we create the NatGateway, we need to create an EIP. First, create a new EIP to associate with the new VPC. The wizard will ask you to select a VPC template; pick the one with one private network and one public network. It will ask you for the EIP id for the NatGateway.
Recall that an EIP is an Elastic IP Address which is a public IP address. AWS has a pool of public IP addresses available to rent per region, and an EIP is taken from this pool.
Don’t worry, we did all of this and created a CloudFormation template, which we will cover in a bit. You can use the CloudFormation script instead of the wizard, but we want to describe how we created the CloudFormation template.
When you use the VPC wizard, it says it is waiting for a NAT gateway, and the NAT gateway seems to be waiting for a subnet, but it is not. All you need is the EIP to give the VPC wizard; then it creates the NatGateway for you.
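If you prefer the command line to the console for the EIP step, here is a sketch with the aws CLI (the values in the sample output are illustrative); the AllocationId is what you hand the wizard for the NatGateway:

$ aws ec2 allocate-address --domain vpc --region us-west-2
{
    "PublicIp": "52.0.0.100",
    "Domain": "vpc",
    "AllocationId": "eipalloc-0123456789abcdef0"
}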

Using CloudFormer

After you are done creating something in AWS/EC2 that you want to automate, do the following: tag all of the resources (VPC, NAT gateway, etc.). For this, we used the tag cloudgen=cassandra-test.
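If you want to script the tagging instead of clicking through the console, the same create-tags call we use later for EC2 instances also works for VPCs, NAT gateways, and subnets; a sketch with placeholder resource ids:

aws ec2 create-tags \
  --resources vpc-0a1b2c3d4e5f67890 nat-0a1b2c3d4e5f67890 subnet-0a1b2c3d4e5f67890 \
  --tags Key=cloudgen,Value=cassandra-test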
Then start up the AWS CloudFormer. To do this in the AWS Console, go to CloudFormation; you should see a wizard. Pick the third option down on the CloudFormation home page (or pick create-stack and choose CloudFormer from the sample templates dropdown). Then select CloudFormer from the list of templates and run that CloudFormation template to create a CloudFormer. Give it a username and password that you will have to use later. Run the CloudFormation for CloudFormer (very meta). After the CloudFormation completes, go to the link provided in the CloudFormation stack output, enter your username and password, launch the CloudFormer, and then filter on cloudgen=cassandra-test.
Walk through the wizard, and it will create a CloudFormation template that you can run (it won’t work as-is, but it gets you 99% of the way there, as CloudFormer came out before NAT gateways). The first time I created it, the username and password did not work; I had to pick a shorter password. After you are done creating your CloudFormation template, you can shut down the CloudFormer.

CloudFormation template created from CloudFormer

Here is the CloudFormation template that we derived from the above process with some edits to make it more readable.

CloudFormation Template

{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Description": "Setup VPC for Cassandra",
  "Resources": {
    "vpcMain": {
      "Type": "AWS::EC2::VPC",
      "Properties": {
        "CidrBlock": "10.0.0.0/16",
        "InstanceTenancy": "default",
        "EnableDnsSupport": "true",
        "EnableDnsHostnames": "true",
        "Tags": [
          {
            "Key": "cloudgen",
            "Value": "cassandra-test"
          },
          {
            "Key": "Name",
            "Value": "CassandraTestCluster"
          }
        ]
      }
    },
    "subnetPublic": {
      "Type": "AWS::EC2::Subnet",
      "Properties": {
        "CidrBlock": "10.0.0.0/24",
        "AvailabilityZone": "us-west-2a",
        "VpcId": {
          "Ref": "vpcMain"
        },
        "Tags": [
          {
            "Key": "cloudgen",
            "Value": "cassandra-test"
          },
          {
            "Key": "Name",
            "Value": "Public subnet"
          }
        ]
      }
    },
    "subnetPrivate": {
      "Type": "AWS::EC2::Subnet",
      "Properties": {
        "CidrBlock": "10.0.1.0/24",
        "AvailabilityZone": "us-west-2a",
        "VpcId": {
          "Ref": "vpcMain"
        },
        "Tags": [
          {
            "Key": "cloudgen",
            "Value": "cassandra-test"
          },
          {
            "Key": "Name",
            "Value": "Private subnet"
          }
        ]
      }
    },
    "internetGateway": {
      "Type": "AWS::EC2::InternetGateway",
      "Properties": {
      }
    },
    "dhcpOptions": {
      "Type": "AWS::EC2::DHCPOptions",
      "Properties": {
        "DomainName": "us-west-2.compute.internal",
        "DomainNameServers": [
          "AmazonProvidedDNS"
        ]
      }
    },
    "networkACL": {
      "Type": "AWS::EC2::NetworkAcl",
      "Properties": {
        "VpcId": {
          "Ref": "vpcMain"
        },
        "Tags": [
          {
            "Key": "cloudgen",
            "Value": "cassandra-test"
          },
          {
            "Key": "Name",
            "Value": "CassandraTestNACL"
          }
        ]
      }
    },
    "routeTableMain": {
      "Type": "AWS::EC2::RouteTable",
      "Properties": {
        "VpcId": {
          "Ref": "vpcMain"
        },
        "Tags": [
          {
            "Key": "cloudgen",
            "Value": "cassandra-test"
          }
        ]
      }
    },
    "routeTablePublic": {
      "Type": "AWS::EC2::RouteTable",
      "Properties": {
        "VpcId": {
          "Ref": "vpcMain"
        },
        "Tags": [
          {
            "Key": "cloudgen",
            "Value": "cassandra-test"
          }
        ]
      }
    },
    "eipForNatGateway": {
      "Type": "AWS::EC2::EIP",
      "Properties": {
        "Domain": "vpc"
      }
    },
    "natGateway": {
      "Type": "AWS::EC2::NatGateway",
      "Properties": {
        "AllocationId": {
          "Fn::GetAtt": [
            "eipForNatGateway",
            "AllocationId"
          ]
        },
        "SubnetId": {
          "Ref": "subnetPublic"
        }
      }
    },
    "securityGroupDefault": {
      "Type": "AWS::EC2::SecurityGroup",
      "Properties": {
        "GroupDescription": "default VPC security group",
        "VpcId": {
          "Ref": "vpcMain"
        },
        "Tags": [
          {
            "Key": "cloudgen",
            "Value": "cassandra-test"
          },
          {
            "Key": "Name",
            "Value": "CassandraTestSG"
          }
        ]
      }
    },
    "aclEntryAllowAllEgress": {
      "Type": "AWS::EC2::NetworkAclEntry",
      "Properties": {
        "CidrBlock": "0.0.0.0/0",
        "Egress": "true",
        "Protocol": "-1",
        "RuleAction": "allow",
        "RuleNumber": "100",
        "NetworkAclId": {
          "Ref": "networkACL"
        }
      }
    },
    "aclEntryAllowAllIngress": {
      "Type": "AWS::EC2::NetworkAclEntry",
      "Properties": {
        "CidrBlock": "0.0.0.0/0",
        "Protocol": "-1",
        "RuleAction": "allow",
        "RuleNumber": "100",
        "NetworkAclId": {
          "Ref": "networkACL"
        }
      }
    },
    "subnetAclAssociationPublic": {
      "Type": "AWS::EC2::SubnetNetworkAclAssociation",
      "Properties": {
        "NetworkAclId": {
          "Ref": "networkACL"
        },
        "SubnetId": {
          "Ref": "subnetPublic"
        }
      }
    },
    "subnetAclAssociationPrivate": {
      "Type": "AWS::EC2::SubnetNetworkAclAssociation",
      "Properties": {
        "NetworkAclId": {
          "Ref": "networkACL"
        },
        "SubnetId": {
          "Ref": "subnetPrivate"
        }
      }
    },
    "vpcGatewayAttachment": {
      "Type": "AWS::EC2::VPCGatewayAttachment",
      "Properties": {
        "VpcId": {
          "Ref": "vpcMain"
        },
        "InternetGatewayId": {
          "Ref": "internetGateway"
        }
      }
    },
    "subnetRouteTableAssociationPublic": {
      "Type": "AWS::EC2::SubnetRouteTableAssociation",
      "Properties": {
        "RouteTableId": {
          "Ref": "routeTablePublic"
        },
        "SubnetId": {
          "Ref": "subnetPublic"
        }
      }
    },
    "routeNatGateway": {
      "Type": "AWS::EC2::Route",
      "Properties": {
        "DestinationCidrBlock": "0.0.0.0/0",
        "NatGatewayId": {
          "Ref": "natGateway"
        },
        "RouteTableId": {
          "Ref": "routeTableMain"
        }
      }
    },
    "routeInternetGateway": {
      "Type": "AWS::EC2::Route",
      "Properties": {
        "DestinationCidrBlock": "0.0.0.0/0",
        "RouteTableId": {
          "Ref": "routeTablePublic"
        },
        "GatewayId": {
          "Ref": "internetGateway"
        }
      },
      "DependsOn": "vpcGatewayAttachment"
    },
    "vpcDHCPOptionsAssociation": {
      "Type": "AWS::EC2::VPCDHCPOptionsAssociation",
      "Properties": {
        "VpcId": {
          "Ref": "vpcMain"
        },
        "DhcpOptionsId": {
          "Ref": "dhcpOptions"
        }
      }
    },
    "securityGroupIngressDefault": {
      "Type": "AWS::EC2::SecurityGroupIngress",
      "Properties": {
        "GroupId": {
          "Ref": "securityGroupDefault"
        },
        "IpProtocol": "-1",
        "SourceSecurityGroupId": {
          "Ref": "securityGroupDefault"
        }
      }
    },
    "securityGroupEgressDefault": {
      "Type": "AWS::EC2::SecurityGroupEgress",
      "Properties": {
        "GroupId": {
          "Ref": "securityGroupDefault"
        },
        "IpProtocol": "-1",
        "CidrIp": "0.0.0.0/0"
      }
    }
  }
}
We define the following resources in the above CloudFormation template, which was generated with CloudFormer.
  • vpcMain which is the VPC with CIDR 10.0.0.0/16
  • subnetPublic which is the public Subnet with CIDR 10.0.0.0/24
  • subnetPrivate which is the private Subnet with CIDR 10.0.1.0/24
  • internetGateway of type InternetGateway
  • dhcpOptions of type DHCPOptions
  • networkACL of type NetworkAcl
  • natGateway of type NatGateway
  • routeTableMain of type RouteTable
  • routeTablePublic of type RouteTable
  • eipForNatGateway of type EIP
  • securityGroupDefault of type SecurityGroup
The vpcMain resource is the AWS VPC we use to deploy instances into. The subnetPublic (Subnet) with CIDR 10.0.0.0/24 is a part of the VPC’s IP address range. Just like for the VPC, you need to specify CIDR blocks for the subnets. Subnets are associated with availability zones (independent power source and network). Subnets can be public or private. A private subnet is one that is not routable from the internetGateway; the subnetPrivate (Subnet) does not have a route to the internetGateway.

The internetGateway (InternetGateway) enables traffic from the public Internet to the vpcMain VPC. The internetGateway (IGW) does network address translation from the public IPs of EC2 instances to their private IPs for incoming traffic. When an EC2 instance sends IP traffic from a public subnet, the IGW acts as the NAT for the public subnet and translates the reply address to the EC2 instance’s public IP (EIP). The IGW keeps track of the mappings between EC2 instances’ private IP addresses and their public IP addresses. AWS ensures that the IGW is highly available and handles horizontal scaling and redundancy as needed.

The dhcpOptions (DHCPOptions) resource is associated with vpcMain and is used for Dynamic Host Configuration Protocol (DHCP) config; DHCP provides a standard for configuring TCP/IP networks. The networkACL (NetworkAcl) - Network Access Control List (NACL) - is a stateless layer of security for subnets; NACLs act as a stateless firewall. The natGateway (NatGateway) is similar to an IGW, but unlike an IGW it does not allow incoming connections; it only allows responses to outgoing traffic from your Amazon EC2 instances. NAT gateways are simple to manage and highly available. The securityGroupDefault (SecurityGroup) provides a stateful firewall that is applied directly to EC2 instances, ELBs, and Auto Scaling group launches. For more details on what the above CloudFormation creates and why, see this short guide to VPC or this AWS Cassandra deployment guide.

Using the new CloudFormation

We use the AWS CloudFormation CommandLine to create the VPC, subnets, Network ACL, etc. for our Cassandra Cluster.

Using aws cloudformation command line to create VPC for Cassandra Cluster or Kafka Cluster

#!/usr/bin/env bash
set -e

source bin/ec2-env.sh

aws --region ${REGION} s3 cp cloud-formation/vpc.json s3://$CLOUD_FORMER_BUCKET
aws --region ${REGION} cloudformation create-stack --stack-name ${ENV}-vpc-cassandra \
--template-url "https://s3-us-west-2.amazonaws.com/$CLOUD_FORMER_BUCKET/vpc.json"

Notice that we upload the CloudFormation template to S3 using the AWS command-line. Then, we call create-stack to run the CloudFormation stack. This is our base VPC setup. We will add to it as we continue.
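If you want the script to block until the VPC stack is fully created instead of polling the console, you could add a wait call at the end; a sketch using the same stack name and environment variables as above:

aws --region ${REGION} cloudformation wait stack-create-complete \
    --stack-name ${ENV}-vpc-cassandra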

Modifying ec2-env.sh

We added three more variables to our ec2-env.sh file as follows:

ec2-env.sh - KEY name, aws REGION, name of S3 bucket to store CloudFormation templates

export KEY=KEY_NAME_CASSANDRA
export REGION=us-west-2
export CLOUD_FORMER_BUCKET=cloudurable-cloudformer-templates
You might recall that our ec2-env.sh file specifies security group ids, subnet ids, the IAM profile name, etc.

Adding outputs to CloudFormation.

We were using the default VPC, but now we want to use the VPC, subnet, etc. that we just created. We could just look that up using the console, but a better way would be to add Outputs to our VPC CloudFormation for our Cassandra Cluster. The CloudFormation Outputs section declares values that can be imported into other CloudFormation stacks or queried by the command line, or just displayed in the AWS CloudFormation console.
CloudFormation templates once deployed are called CloudFormation stacks. CloudFormation stacks can depend on outputs from other CloudFormation stacks.
Here are the updates we make to create output variables from our CloudFormation.

cloud-formation/vpc.json - adding output variables

{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Description": "Setup VPC for Cassandra",
  "Outputs": {
      "subnetPublicOut": {
        "Description": "Subnet Public Id",
        "Value": {
          "Ref": "subnetPublic"
        },
        "Export": {
          "Name": {
            "Fn::Sub": "${AWS::StackName}-subnetPublic"
          }
        }
      },
      "subnetPrivateOut": {
        "Description": "Subnet Private Id",
        "Value": {
          "Ref": "subnetPrivate"
        },
        "Export": {
          "Name": {
            "Fn::Sub": "${AWS::StackName}-subnetPrivate"
          }
        }
      }
  },
  ...
Notice that we put our output variables under the key "Outputs", which is a map of output variables.
We define subnetPublicOut and subnetPrivateOut, which get exported as ${AWS::StackName}-subnetPublic and ${AWS::StackName}-subnetPrivate.
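Another stack can later pull in these exported values with Fn::ImportValue. A trimmed sketch (the importing resource is hypothetical, and a real AWS::EC2::Instance would also need ImageId and other properties):

"cassandraNode": {
  "Type": "AWS::EC2::Instance",
  "Properties": {
    "SubnetId": {
      "Fn::ImportValue": "dev-vpc-cassandra-subnetPrivate"
    }
  }
}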

Setting up Bastion Security Group

Next, we need to set up the Security Group for our Bastion Host. A bastion host will allow us to manage the Cassandra/Kafka EC2 instances via ssh/ansible. It is the bridge to our private subnet where we will keep our Cassandra/Kafka EC2 instances.
The security group needs to open up port 22 as follows.

cloud-formation/vpc.json - Bastion Security Group for Ansible Mgmt of Cassandra Database Nodes

    "securityGroupBastion": {
      "Type": "AWS::EC2::SecurityGroup",
      "Properties": {
        "GroupDescription": "Security group for bastion server.",
        "VpcId": {
          "Ref": "vpcMain"
        },
        "SecurityGroupIngress": [
          {
            "IpProtocol": "tcp",
            "FromPort": "22",
            "ToPort": "22",
            "CidrIp": "0.0.0.0/0"
          }
        ],
        "SecurityGroupEgress": [
          {
            "IpProtocol": "-1",
            "CidrIp": "0.0.0.0/0"
          }
        ],
        "Tags": [
          {
            "Key": "Name",
            "Value": "bastionSecurityGroup"
          },
          {
            "Key": "cloudgen",
            "Value": "cassandra-test"
          }
        ]
      }
    },

Setting up Security Group for Cassandra Nodes

This example will focus on Cassandra nodes, not Kafka, but the ideas are similar. This security group uses the CIDR of the VPC to open up all traffic to all subnets in this VPC.

cloud-formation/vpc.json - Security group for Cassandra Database nodes in Cassandra Cluster

    "securityGroupCassandraNodes": {
      "Type": "AWS::EC2::SecurityGroup",
      "Properties": {
        "GroupDescription": "Security group for Cassandra Database nodes in Cassandra Cluster",
        "VpcId": {
          "Ref": "vpcMain"
        },
        "SecurityGroupIngress": [
          {
            "IpProtocol": "-1",
            "CidrIp": "10.0.0.0/8"
          }
        ],
        "SecurityGroupEgress": [
          {
            "IpProtocol": "-1",
            "CidrIp": "0.0.0.0/0"
          }
        ],
        "Tags": [
          {
            "Key": "Name",
            "Value": "cassandraSecurityGroup"
          },
          {
            "Key": "cloudgen",
            "Value": "cassandra-test"
          }
        ]
      }
    }

Output new security groups as CloudFormation outputs.

We will want to add securityGroupCassandraNodes and securityGroupBastion to the output of the CloudFormation so we can use them from our AWS EC2 scripts.

cloud-formation/vpc.json - output new security groups

{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Description": "Setup VPC for Cassandra Cluster for Cassandra Database",

  "Outputs": {
    "subnetPublicOut": {
      "Description": "Subnet Public Id",
      "Value": {
        "Ref": "subnetPublic"
      },
      "Export": {
        "Name": {
          "Fn::Sub": "${AWS::StackName}-subnetPublic"
        }
      }
    },
    "subnetPrivateOut": {
      "Description": "Subnet Private Id",
      "Value": {
        "Ref": "subnetPrivate"
      },
      "Export": {
        "Name": {
          "Fn::Sub": "${AWS::StackName}-subnetPrivate"
        }
      }
    },
    "securityGroupBastionOutput": {
      "Description": "Security Group Bastion for managing Cassandra Cluster Nodes with Ansible",
      "Value": {
        "Ref": "securityGroupBastion"
      },
      "Export": {
        "Name": {
          "Fn::Sub": "${AWS::StackName}-securityGroupBastion"
        }
      }
    },
    "securityGroupCassandraNodesOutput": {
      "Description": "Cassandra Database Node security group for Cassandra Cluster",
      "Value": {
        "Ref": "securityGroupCassandraNodes"
      },
      "Export": {
        "Name": {
          "Fn::Sub": "${AWS::StackName}-securityGroupCassandraNodes"
        }
      }
    }
  },
  ...
Notice that we added securityGroupBastionOutput and securityGroupCassandraNodesOutput to the above CloudFormation.

Cloudurable specialize in AWS DevOps Automation for Cassandra, Spark and Kafka

We hope you find this Cassandra tutorial useful. We also provide Spark consulting, Cassandra consulting, and Kafka consulting to get you set up fast in AWS with CloudFormation and CloudWatch. Support us by checking out our Spark training, Cassandra training, and Kafka training.

Updating CloudFormation

As we iteratively develop our CloudFormation, for example adding new security groups, we do not have to rebuild everything. Instead, we can update the CloudFormation stack. CloudFormation is smart enough to see what has changed and only add/update those areas.

bin/update-vpc-cloudformation.sh

#!/usr/bin/env bash
set -e

source bin/ec2-env.sh

aws --region ${REGION} s3 cp cloud-formation/vpc.json s3://$CLOUD_FORMER_BUCKET
aws --region ${REGION} cloudformation update-stack --stack-name ${ENV}-vpc-cassandra \
--template-url "https://s3-us-west-2.amazonaws.com/$CLOUD_FORMER_BUCKET/vpc.json"

The above uses CloudFormation update-stack to update a stack as specified by the template. After the update-stack call completes successfully, the stack update starts.
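If you script this, you may want to block until the update has actually finished; a small sketch using the same wait facility (stack name as above):

aws --region ${REGION} cloudformation wait stack-update-complete \
    --stack-name ${ENV}-vpc-cassandra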
We can see our output variable from our CloudFormation template from the command line as follows.

List the output variables of CloudFormation with the aws command-line


$ aws cloudformation describe-stacks --stack-name dev-vpc-cassandra | jq .Stacks[].Outputs[]

Output
{
  "Description": "Subnet Private Id",
  "OutputKey": "subnetPrivateOut",
  "OutputValue": "subnet-XXe5453a"
}
{
  "Description": "Cassandra Database Node security group for Cassandra Cluster",
  "OutputKey": "securityGroupCassandraNodesOutput",
  "OutputValue": "sg-XX527048"
}
{
  "Description": "Subnet Public Id",
  "OutputKey": "subnetPublicOut",
  "OutputValue": "subnet-XXe5453c"
}
{
  "Description": "Security Group Bastion for managing Cassandra Cluster Nodes with Ansible",
  "OutputKey": "securityGroupBastionOutput",
  "OutputValue": "sg-XX527040"
}
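If you only need a single value, say for a script that exports SUBNET_PRIVATE, the aws CLI's --query option can pull one output without jq; a sketch against the same stack:

$ aws cloudformation describe-stacks --stack-name dev-vpc-cassandra \
  --query "Stacks[0].Outputs[?OutputKey=='subnetPrivateOut'].OutputValue" \
  --output text
subnet-XXe5453a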

Then we just modify our bin/ec2-env.sh script to use these values.

env file

#!/bin/bash
set -e


export REGION=us-west-2
export ENV=dev
export KEY_PAIR_NAME="cloudurable-$REGION"
export PEM_FILE="${HOME}/.ssh/${KEY_PAIR_NAME}.pem"
export SUBNET_PUBLIC=subnet-XXe5453a
export SUBNET_PRIVATE=subnet-XXe5453b
export CLOUD_FORMER_S3_BUCKET=cloudurable-cloudformer-templates
export HOSTED_ZONE_ID="XXNXWWXWZXEXHJ-NOT-REAL"



export BASTION_NODE_SIZE=t2.small
export BASTION_SECURITY_GROUP=sg-XX527040
export BASTION_AMI=ami-XXb3310e
export BASTION_EC2_INSTANCE_NAME="bastion.${ENV}.${REGION}"
export BASTION_DNS_NAME="bastion.${ENV}.${REGION}.cloudurable.com."


export CASSANDRA_NODE_SIZE=m4.large
export CASSANDRA_AMI=ami-XXb3310f
export CASSANDRA_SECURITY_GROUP=sg-XX527048
export CASSANDRA_IAM_PROFILE=IAM_PROFILE_CASSANDRA
export CASSANDRA_EC2_INSTANCE_NAME="cassandra-node.${ENV}.${REGION}"
export CASSANDRA_DNS_NAME="node0.${ENV}.${REGION}.cloudurable.com."

Just like a war plan does not survive the first battle, variable names do not survive the first refactor to add a feature. We could also use these CloudFormation outputs as input variables to another CloudFormation stack.
CloudFormation is the AWS way to create immutable infrastructure.

Why a Bastion server

A bastion host is a computer that is locked down and fully exposed to attack, but in our case the bastion has a firewall so that only port 22 (SSH) is open, and in fact we only run the bastion host when we want to ssh into our private subnet or run ansible playbooks. The bastion host is on the public side of the DMZ.

Creating bastion server

We updated the log-into-server bash script, the associate-DNS-with-IP bash script, and the get-IP-address-of-EC2-instance bash script to take arguments, and renamed them to work with both the bastion EC2 instance and the Cassandra Database instances. Then we created a new script called bin/create-ec2-instance-bastion.sh to use the new scripts and the appropriate environment variables.
Here is the create bastion script.

create-ec2-instance-bastion.sh - bastion for ansible and ssh bridge

#!/bin/bash
set -e

source bin/ec2-env.sh

instance_id=$(aws ec2 run-instances --image-id "$BASTION_AMI" --subnet-id  "$SUBNET_PUBLIC" \
 --instance-type "$BASTION_NODE_SIZE" --iam-instance-profile "Name=$CASSANDRA_IAM_PROFILE" \
 --associate-public-ip-address --security-group-ids "$BASTION_SECURITY_GROUP" \
 --key-name "$KEY_PAIR_NAME" | jq --raw-output .Instances[].InstanceId)

echo "bastion ${instance_id} is being created"

aws ec2 wait instance-exists --instance-ids "$instance_id"

aws ec2 create-tags --resources "${instance_id}" --tags Key=Name,Value="${BASTION_EC2_INSTANCE_NAME}" \
Key=Role,Value="Bastion" Key=Env,Value="DEV"

echo "${instance_id} was tagged waiting to login"

aws ec2 wait instance-status-ok --instance-ids "$instance_id"


bin/associate-route53-DNS-with-IP.sh ${BASTION_EC2_INSTANCE_NAME} ${BASTION_DNS_NAME}
bin/login-ec2-instance.sh ${BASTION_EC2_INSTANCE_NAME}

If you followed along with the previous tutorials, the above will all make sense. Essentially, we are just launching an EC2 instance using the AMI/image we created with Packer in the Packer/Ansible/Cassandra tutorial. Then the script waits for the instance to become active, and then it associates the bastion DNS name with the IP of this instance.
Note that since we are launching the Cassandra node in a private subnet, we will not be able to log into it directly any longer. We will have to log into the bastion server first.
Now let’s run this script.

Running bin/create-ec2-instance-bastion.sh - to create ansible bastion AWS ec2 instance

$ bin/create-ec2-instance-bastion.sh
bastion i-069819c22bbd379ab is being created
i-069819c22bbd379ab was tagged waiting to login
IP ADDRESS 55.222.33.66 bastion.dev.us-west-2.cloudurable.com.

{
"Changes":[
    {
        "Action": "UPSERT",
        "ResourceRecordSet": {
                "Type": "A",
                "Name": "bastion.dev.us-west-2.cloudurable.com.",
                "TTL": 300,
                "ResourceRecords": [{
                    "Value": "55.222.33.66"
                }]
        }
    }
]
}

IP ADDRESS 55.222.33.66
ECDSA key fingerprint is SHA256:DzyRqdhPPUlTf8ZPAH6XtGe0SRNthSoMXK4cZCpGGME.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '54.202.31.60' (ECDSA) to the list of known hosts.
...
Let’s also startup the Cassandra Database Node which will be a seed server in the Cassandra Cluster. We do this with the create-ec2-instance-cassandra.sh as follows.

Run bin/create-ec2-instance-cassandra.sh - Create Cassandra instance

 $ bin/create-ec2-instance-cassandra.sh
Cassandra Database: Cassandra Cluster Node i-0602e8b4d75020438 is being created
Cassandra Node i-0602e8b4d75020438 was tagged waiting for status ready
Now we can log into the bastion.

Logging into Bastion running in our public subnet

We will log into Bastion so we can then log into our Cassandra or Kafka nodes.

Using login-ec2-instance.sh

$ bin/login-ec2-instance.sh bastion.dev.us-west-2
[centos@ip-10-0-0-220 ~]$
The private IP address of the Cassandra instance is 10.0.1.10. To log into that server, we would first need to log into the bastion (private IP 10.0.0.220) as follows.

Logging into Cassandra node from bastion

[centos@ip-10-0-0-220 ~]$ ssh -i ~/resources/server/certs/test_rsa ansible@10.0.1.10
[ansible@ip-10-0-1-10 ~]$

SSH Setup for Ansible/SSH to managed Cassandra nodes

Before we set things up, let’s make sure we can log into the Cassandra node through the bastion host.

Try to connect to Cassandra seed node via ssh bastion tunnel

$ ssh -t bastion ssh -t -i /home/ansible/.ssh/test_rsa ansible@10.0.1.10
This trick of ssh bastion tunneling from the command line was described in Using SSH through a Bastion tunnel.
We modify our ~/.ssh/config to tunnel requests for 10.0.1.10 (our Cassandra node), through the bastion bastion.dev.us-west-2.cloudurable.com.

~/.ssh/config - configure ssh bridge from bastion to main cassandra node

Host *.us-west-2.compute.amazonaws.com
  ForwardAgent yes
  IdentityFile ~/.ssh/test_rsa
  User ansible

Host bastion
  Hostname bastion.dev.us-west-2.cloudurable.com
  ForwardAgent yes
  IdentityFile ~/.ssh/test_rsa
  User ansible

Host cassandra.node0
  Hostname 10.0.1.10
  ForwardAgent yes
  IdentityFile ~/.ssh/test_rsa
  ProxyCommand ssh bastion  -W  %h:%p
  User ansible

First, we create a Host bastion alias that sets up access to bastion.dev.us-west-2.cloudurable.com (your DNS name will vary). Then we use this bastion to tunnel ssh to the Cassandra instance using the ProxyCommand. The ProxyCommand setting of the ssh client config runs the command ssh bastion -W host:port and then talks to the standard in/out of that command as if it were the remote connection (specified by -W). The %h is the hostname and the %p is the port.
Now, cassandra.node0 is a bit special because it is a seed server, but the other Cassandra nodes will be a bit nameless, so to speak. We want a way to configure them without naming each one, and while we are at it, for speed, we want to use SSH multiplexing.

~/.ssh/config - create the ssh bridge for the rest of the Cassandra nodes

Host 10.0.1.*
    ForwardAgent yes
    IdentityFile ~/.ssh/test_rsa
    ProxyCommand ssh bastion  -W  %h:%p
    User ansible
    ControlMaster auto
    ControlPath ~/.ssh/ansible-%r@%h:%p
    ControlPersist 5m

Ideas for this setup came from Running Ansible through an SSH bastion host on Scott’s Weblog.
We make the following changes to get ansible to work through bastion.
We add this workaround to ansible.cfg.

ansible.cfg -

...
[ssh_connection]
ssh_args = -o ControlMaster=auto -o ControlPersist=30m
control_path = %(directory)s/%%h-%%p-%%r

Then we setup our inventory file with our new ssh aliases that we defined in ~/.ssh/config earlier.

inventory.ini -

[cassandra-nodes]
cassandra.node0


[bastion]
bastion

Side Note: Local to project ssh config so we can check it in.

If we pass the -F parameter in ssh_args under [ssh_connection] in ansible.cfg, then we can specify a project-local ssh config file. Now we can keep all of these files in source control by adding ssh_args = -F ssh/ssh.config -o ControlMaster=auto -o ControlPersist=30m under [ssh_connection] in the ansible.cfg in the project directory of this Ansible/Cassandra tutorial, as shown below. This is another trick from the blog post running ansible through a bastion.
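Laid out as a file, that project-local ansible.cfg would look roughly like this; ssh/ssh.config would then hold the same Host entries we put in ~/.ssh/config above:

[ssh_connection]
ssh_args = -F ssh/ssh.config -o ControlMaster=auto -o ControlPersist=30m
control_path = %(directory)s/%%h-%%p-%%r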
The next thing we want to do is ping our servers with ansible just to show that it is working.

Running ansible commands via a bastion server


$ ansible cassandra.node0 -m ping

cassandra.node0 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}

$ ansible bastion -m ping

bastion | SUCCESS => {
    "changed": false,
    "ping": "pong"
}


Installing more than one node to the cluster

To automate the configuration of the Cassandra instances, we will use Cassandra Cloud.
CassandraCloud is a tool that helps you configure Cassandra for clustered environments. It works well in Docker, AWS, Mesos, EC2, and VirtualBox environments (and similar environments). It allows you to configure Cassandra easily. For example, it could be kicked off as a USER_DATA script in Amazon EC2 (AWS EC2). CassandraCloud usually runs once when an instance is first launched and then never again. CassandraCloud allows you to override values via OS ENVIRONMENT variables. There is an HCL config file, and there are command line arguments. The HCL config file can be overridden with ENVIRONMENT variables, which can be overridden with command line arguments. CassandraCloud will generate the ${CASSANDRA_HOME}/conf/cassandra.yaml file. You can specify a custom template (usually found in ${CASSANDRA_HOME}/conf/cassandra-yaml.template).
Here is the EC2 user-data script where we invoke Cassandra Cloud.

resources/user-data/cassandra - AWS EC2 User-Data for Cassandra Database Node

#!/bin/bash
set -e

export BIND_IP=`curl http://169.254.169.254/latest/meta-data/local-ipv4`

/opt/cloudurable/bin/cassandra-cloud -cluster-name test \
                -client-address  ${BIND_IP} \
                -cluster-address  ${BIND_IP} \
                -cluster-seeds 10.0.1.10

/bin/systemctl restart  cassandra

Notice that we are passing the client BIND_IP, which we get from the EC2 meta-data.
We also added an extra param to bin/create-ec2-instance-cassandra.sh so we can pin a deployment to a certain IP in the CIDR range of our AWS VPC private subnet.

bin/create-ec2-instance-cassandra.sh -

#!/bin/bash
set -e

source bin/ec2-env.sh

if [ -z "$1" ]
    then
        PRIVATE_IP_ADDRESS=10.0.1.10
    else
        PRIVATE_IP_ADDRESS=$1
fi

instance_id=$(aws ec2 run-instances --image-id "$CASSANDRA_AMI" --subnet-id  "$SUBNET_PRIVATE" \
 --instance-type ${CASSANDRA_NODE_SIZE} --private-ip-address ${PRIVATE_IP_ADDRESS}  \
 --iam-instance-profile "Name=$CASSANDRA_IAM_PROFILE" \
 --security-group-ids "$CASSANDRA_SECURITY_GROUP" \
 --user-data file://resources/user-data/cassandra \
 --key-name "$KEY_PAIR_NAME" | jq --raw-output .Instances[].InstanceId)

echo "Cassandra Database: Cassandra Cluster Node ${instance_id} is being created"

aws ec2 wait instance-exists --instance-ids "$instance_id"

aws ec2 create-tags --resources "${instance_id}" --tags Key=Name,Value="${CASSANDRA_EC2_INSTANCE_NAME}" \
Key=Cluster,Value="Cassandra" Key=Role,Value="Cassandra_Database_Cluster_Node" Key=Env,Value="DEV"

echo "Cassandra Node ${instance_id} was tagged waiting for status ready"

aws ec2 wait instance-status-ok --instance-ids "$instance_id"

If you run it with no IP address, then it creates the Cassandra Seed Node EC2 instance. If you run it with an IP address, then it creates an instance with that private IP. (Note that the IP must be in the range of the private subnet CIDR that we created earlier.)
Let’s use this to create a second instance.

Running bin/create-ec2-instance-cassandra.sh 10.0.1.11

$ bin/create-ec2-instance-cassandra.sh 10.0.1.11
Cassandra Database: Cassandra Cluster Node i-0e9939e9f62ae33d4 is being created
Cassandra Node i-0e9939e9f62ae33d4 was tagged waiting for status ready
Now we need to make sure it is working. We can do this by ssh-ing into cassandra.node0, checking the status, and describing the Cassandra cluster with nodetool.

Check Cassandra node status with nodetool

$ ssh cassandra.node0
...
$ /opt/cassandra/bin/nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens       Owns (effective)  Host ID                               Rack
UN  10.0.1.10  132.06 KiB  32           100.0%            95794596-7dbe-4ec9-8c35-f4f49a5bb999  rack1
UN  10.0.1.11   94.97 KiB  32           100.0%            eb35bb65-c582-4fa9-9069-fd5222830c99  rack1
Notice that both 10.0.1.10 (the seed server) and 10.0.1.11 are seen. We will set up a seed server per availability zone.
Let’s also describe the Cassandra cluster with nodetool describecluster.

Describe the Cassandra cluster with nodetool

[ansible@ip-10-0-1-10 ~]$ /opt/cassandra/bin/nodetool describecluster
Cluster Information:
        Name: test
        Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
        Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
        Schema versions:
                86afa796-d883-3932-aa73-6b017cef0d19: [10.0.1.10, 10.0.1.11]


Up next

In the next article, we will use Ec2Snitch and set up a second subnet and availability zone.

Bonus lap using an ansible playbook to install Oracle JDK 8

A big mandate came down from the corporate home office: switch all instances from OpenJDK to the Oracle JDK. We protest that we ran benchmarks and burn-ins with OpenJDK and there is no difference. The home office has silenced our pleas. No remorse.
We are using OpenJDK, and Cassandra warns against it. Let’s use Ansible to fix that. We will use an Ansible playbook for installing JDK 8 on CentOS.
First let’s add our extra Cassandra node 10.0.1.11 to the list of cassandra-nodes that we are managing.

inventory.ini - add new Cassandra node to file

[cassandra-nodes]
cassandra.node0
10.0.1.11

...
Then let’s create an ansible playbook that installs the Oracle JDK on our Cassandra nodes.

playbooks/install-oracle-8-jdk.yml

---
- hosts: cassandra-nodes
  gather_facts: no
  become: true
  remote_user: ansible
  vars:
    download_url: http://download.oracle.com/otn-pub/java/jdk/8u121-b13/e9e7ea248e2c4826b92b3f075a80e441/jdk-8u121-linux-x64.tar.gz
    download_folder: /opt
    java_name: "{{download_folder}}/jdk1.8.0_121"
    java_archive: "{{download_folder}}/jdk-8u121-linux-x64.tar.gz"


  tasks:
  - name: Download Java
    command: "curl -L -b 'oraclelicense=a' {{download_url}} -o {{java_archive}} creates={{java_archive}}"

  - name: Unpack archive
    command: "tar -zxf {{java_archive}} -C {{download_folder}} creates={{java_name}}"

  - name: Fix ownership
    file: state=directory path={{java_name}} owner=root group=root recurse=yes

  - name: Remove previous
    command: 'alternatives --remove "java" /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.121-0.b13.el7_3.x86_64/jre/bin/java'

  - name: Make Java available for system
    command: 'alternatives --install "/usr/bin/java" "java" "{{java_name}}/bin/java" 2000'

  - name: Clean up
    file: state=absent path={{java_archive}}

Next up we just need to run the playbook.

Running playbook

$ ansible-playbook playbooks/install-oracle-8-jdk.yml

PLAY [cassandra-nodes] *********************************************************

TASK [Download Java] ***********************************************************
changed: [cassandra.node0]
 [WARNING]: Consider using get_url or uri module rather than running curl

changed: [10.0.1.11]

TASK [Unpack archive] **********************************************************
changed: [cassandra.node0]
 [WARNING]: Consider using unarchive module rather than running tar

changed: [10.0.1.11]

TASK [Fix ownership] ***********************************************************
changed: [10.0.1.11]
changed: [cassandra.node0]

TASK [Remove previous] *********************************************************
changed: [cassandra.node0]
changed: [10.0.1.11]

TASK [Make Java available for system] ******************************************
changed: [10.0.1.11]
changed: [cassandra.node0]

TASK [Clean up] ****************************************************************
changed: [cassandra.node0]
changed: [10.0.1.11]

PLAY RECAP *********************************************************************
10.0.1.11                  : ok=6    changed=6    unreachable=0    failed=0   
cassandra.node0            : ok=6    changed=6    unreachable=0    failed=0   

Now imagine that we did not have just two servers but 50. This playbook approach is a lot nicer than doing it by hand.

Conclusion

We used CloudFormer to create a starter CloudFormation template, which created an AWS VPC, NAT gateway, Subnets, InternetGateway, CIDRs, and more. Then we used AWS command line tools to launch the CloudFormation template as a stack. Then we added some additional security groups and used the AWS command line tools to update the CloudFormation stack that we set up earlier. We set up a bastion host which allows us to tunnel ansible commands. We then set up a Cassandra cluster using Cassandra Cloud, an EC2 USER DATA script, and EC2 instance meta-data to generate the Cassandra YAML config. Then, as a bonus lap, we used ansible to run a playbook to replace our OpenJDK usage with the Oracle JDK.

More about Cloudurable™

Cloudurable is focused on AWS DevOps automation for Cassandra and Kafka. We also focus on supporting the SMACK stack in AWS. All of our training and mentoring has a special focus on AWS deployments. Our Apache Spark course, for example, covers running Spark on EMR and using Spark SQL with data from S3 and DynamoDB as well as using Spark Streaming with Kinesis. We also cover how to use Cassandra from Spark (for example).
Consulting
Training

Resources

Kafka and Cassandra support, training for AWS EC2 Cassandra 3.0 Training