Tagging all ec2 instances for all EKSs in account.
Lambda, EventBridge and CloudTrail
I recently came across an interesting task. For cost management, it was necessary to tagging all ec2 instances on the AWS account. The tag should contain Name = EKS-$CLUSTER-NAME
.
As you know, ec2 clusters created for EKS do not have the Name
tag by default, they are created within the Node Group from a custom Launch Template (if you explicitly specified and created it). Or they are created with an AWS managed Launch Template that controls ec2 in your Node Group, unless you explicitly specified otherwise.
With the first case, when you have a custom Launch Template, everything is clear, you can simply add custom tags to it, and get ec2 instances with Name = EKS-$CLUSTER-NAME
at the output. But what if some of the Node Group EKS are not managed through a separate Launch Template?
By default, AWS does not have a property that allows you to create an tag for an EKS node and link it to ec2. As a result, when listing ec2 in your account, you can see a huge number of instances without a name, which will only have service tags that EKS uses to manage ec2. One of them is: kubernetes.io/cluster/$CLUSTER-NAME = owned
, let’s try to use it.
I used this tag to assign a custom tag Name = EKS-$CLUSTER-NAME
for new ec2 instances at the time of their creation, using AWS Lambda and events from EventBridge for this purpose.
Step 1. Prepare the IAM policy and the IAM role
Create a lambda_ec2_policy policy file.json:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"ec2:DescribeInstances",
"ec2:CreateTags"
],
"Resource": "*"
}
]
}
Create a policy in AWS:
aws iam create-policy \
--policy-name LambdaEC2TaggingPolicy \
--policy-document file://lambda_ec2_policy.json
Create an IAM role for the Lambda function:
aws iam create-role \
--role-name LambdaEC2TaggingRole \
--assume-role-policy-document '{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": { "Service": "lambda.amazonaws.com" },
"Action": "sts:AssumeRole"
}]
}'
Link the IAM policy to the role:
export ACCOUNT_ID=$(aws sts get-caller-identity --query "Account" --output text)
aws iam attach-role-policy \
--role-name LambdaEC2TaggingRole \
--policy-arn arn:aws:iam::${ACCOUNT_ID}:policy/LambdaEC2TaggingPolicy
You also need to add a standard policy for logging logs to CloudWatch.:
aws iam attach-role-policy \
--role-name LambdaEC2TaggingRole \
--policy-arn arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
Step 2. Creating a Lambda Function
Save your code to a file lambda_function.py:
import boto3
import json
import logging
logger = logging.getLogger()
logger.setLevel(logging.INFO)
def lambda_handler(event, context):
logger.info(f"Event received: {json.dumps(event)}")
ec2_client = boto3.client('ec2')
instance_ids = extract_instance_ids(event)
if not instance_ids:
logger.warning("Instance IDs were not found in the event.")
return {'status': 'no instance ids found'}
logger.info(f"Instance IDs are extracted: {instance_ids}")
tagged_instances = []
for instance_id in instance_ids:
if tag_instance(ec2_client, instance_id):
tagged_instances.append(instance_id)
return {
'status': 'processed',
'instances_total': len(instance_ids),
'instances_tagged': len(tagged_instances),
'instances': tagged_instances
}
def extract_instance_ids(event):
instance_ids = []
if 'detail-type' in event:
detail_type = event.get('detail-type')
logger.info(f"Type of event: {detail_type}")
if detail_type == "AWS API Call via CloudTrail":
try:
items = event.get('detail', {}).get('responseElements', {}).get('instancesSet', {}).get('items', [])
logger.info(f"instancesSet elements found: {json.dumps(items)}")
for item in items:
instance_id = item.get('instanceId')
if instance_id:
instance_ids.append(instance_id)
except Exception as e:
logger.error(f"Error extracting instance IDs from CloudTrail: {str(e)}")
elif detail_type == "EC2 Instance State-change Notification":
instance_id = event.get('detail', {}).get('instance-id')
if instance_id:
instance_ids.append(instance_id)
elif 'resources' in event:
resources = event.get('resources', [])
for resource in resources:
if resource.startswith('arn:aws:ec2:') and '/instance/' in resource:
instance_id = resource.split('/instance/')[1]
instance_ids.append(instance_id)
if not instance_ids and 'instance_id' in event:
instance_ids.append(event['instance_id'])
if not instance_ids and 'instanceId' in event:
instance_ids.append(event['instanceId'])
return instance_ids
def tag_instance(ec2_client, instance_id):
logger.info(f"Instance Processing {instance_id}")
try:
import time
time.sleep(30)
retries = 3
for attempt in range(retries):
try:
response = ec2_client.describe_instances(InstanceIds=[instance_id])
logger.info(f"Received information about the instance on the attempt {attempt+1}")
break
except Exception as e:
if attempt < retries - 1:
logger.warning(f"Couldn't get instance data {instance_id}, attempt {attempt+1}: {str(e)}")
time.sleep(10 * (attempt + 1))
else:
logger.error(f"Couldn't get instance data {instance_id} adter {retries} attempts: {str(e)}")
return False
reservations = response.get('Reservations', [])
if not reservations or len(reservations[0].get('Instances', [])) == 0:
logger.warning(f"The instance was not found: {instance_id}")
return False
instance = reservations[0]['Instances'][0]
instance_tags = {}
for tag in instance.get('Tags', []):
instance_tags[tag['Key']] = tag['Value']
logger.info(f"Current instance tags {instance_id}: {json.dumps(instance_tags)}")
if 'Name' in instance_tags:
logger.info(f"Instance {instance_id} already has the Name tag: {instance_tags['Name']}")
return False
cluster_name = None
for key in instance_tags.keys():
if key.startswith('kubernetes.io/cluster/'):
cluster_name = key.split('/', 2)[-1]
logger.info(f"Cluster tag found: {key}={instance_tags[key]}")
break
if cluster_name:
logger.info(f"Adding the tag Name=EKS-{cluster_name} for the instance {instance_id}")
try:
ec2_client.create_tags(
Resources=[instance_id],
Tags=[{'Key': 'Name', 'Value': f'EKS-{cluster_name}'}]
)
logger.info(f"Tag Name=EKS-{cluster_name} successfully added for instance {instance_id}")
return True
except Exception as e:
logger.error(f"Error when creating the tag: {str(e)}")
return False
else:
logger.info(f"For the instance {instance_id} cluster EKS tags not found")
return False
except Exception as e:
logger.error(f"Unexpected error during instance processing {instance_id}: {str(e)}")
return False
Create a ZIP archive with the function:
zip lambda_function.zip lambda_function.py
Create a Lambda function in the AWS CLI:
aws lambda create-function \
--function-name ec2-auto-tagging \
--runtime python3.11 \
--zip-file fileb://lambda_function.zip \
--handler lambda_function.lambda_handler \
--role arn:aws:iam::${ACCOUNT_ID}:role/LambdaEC2TaggingRole \
--timeout 60
Step 3. Configure EventBridge and CloudTrail
Create an EventBridge rule that triggers Lambda when creating new EC2 instances.:
aws events put-rule \
--name "trigger-on-ec2-instance-creation" \
--event-pattern '{
"source": ["aws.ec2"],
"detail-type": ["AWS API Call via CloudTrail"],
"detail": {
"eventSource": ["ec2.amazonaws.com"],
"eventName": ["RunInstances"]
}
}'
Configuring CloudTrail to record API events
CloudTrail is necessary for the system to receive instance startup events. If CloudTrail is not configured, follow these steps:
export S3_BUCKET=cloudtrail-logs-$(aws sts get-caller-identity --query "Account" --output text)
export ACCOUNT_ID=$(aws sts get-caller-identity --query "Account" --output text)
export REGION=$(aws configure get region)
aws s3 mb s3://$S3_BUCKET --region $REGION
cat > bucket-policy.json << EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "AWSCloudTrailAclCheck",
"Effect": "Allow",
"Principal": {
"Service": "cloudtrail.amazonaws.com"
},
"Action": "s3:GetBucketAcl",
"Resource": "arn:aws:s3:::${S3_BUCKET}"
},
{
"Sid": "AWSCloudTrailWrite",
"Effect": "Allow",
"Principal": {
"Service": "cloudtrail.amazonaws.com"
},
"Action": "s3:PutObject",
"Resource": "arn:aws:s3:::${S3_BUCKET}/AWSLogs/${ACCOUNT_ID}/*",
"Condition": {
"StringEquals": {
"s3:x-amz-acl": "bucket-owner-full-control"
}
}
}
]
}
EOF
aws s3api put-bucket-policy --bucket $S3_BUCKET --policy file://bucket-policy.json
aws cloudtrail create-trail \
--name api-events-trail \
--s3-bucket-name $S3_BUCKET \
--is-multi-region-trail \
--enable-log-file-validation
aws cloudtrail start-logging --name api-events-trail
aws cloudtrail put-event-selectors \
--trail-name api-events-trail \
--event-selectors '[{"ReadWriteType": "All", "IncludeManagementEvents": true}]'
Add the Lambda permission and configure the EventBridge
aws lambda add-permission \
--function-name ec2-auto-tagging \
--statement-id AllowEventBridgeInvoke \
--action lambda:InvokeFunction \
--principal events.amazonaws.com \
--source-arn arn:aws:events:${REGION}:${ACCOUNT_ID}:rule/trigger-on-ec2-instance-creation
aws events put-targets \
--rule trigger-on-ec2-instance-creation \
--targets '[{"Id": "1", "Arn": "arn:aws:lambda:'${REGION}':'${ACCOUNT_ID}':function:ec2-auto-tagging"}]'
Step 4. Checking the work
Create a new EC2 instance with a tag like:
aws ec2 run-instances \
--image-id ami-XXXXXXXXXX \
--count 1 \
--instance-type t2.micro \
--subnet-id subnet-XXXXXXXXXX \
--tag-specifications 'ResourceType=instance,Tags=[{Key=kubernetes.io/cluster/my-cluster,Value=owned}]'
After starting the instance, make sure that a new tag is automatically added:
INSTANCE_ID=i-XXXXXXXXXX
sleep 60
aws ec2 describe-tags --filters "Name=resource-id,Values=${INSTANCE_ID}" --query "Tags[?Key=='Name']"
The tag should appear:
[
{
"Key": "Name",
"ResourceId": "i-XXXXXXXXXX",
"ResourceType": "instance",
"Value": "EKS-my-cluster"
}
]
Step 5. Tag existing ec2 instances
After all the actions above, at the time of creation, an tag with the cluster name will be created on the instance, but right now there are instances created without tags, let’s tag them too:
#!/bin/bash
set -e
if ! command -v aws &> /dev/null; then
echo "Error: AWS CLI is not installed. Please install it and configure it."
exit 1
fi
echo "Getting a list of all EKS clusters from existing EC2 instances..."
CLUSTER_NAMES=($(aws ec2 describe-instances \
--filters "Name=tag-key,Values=aws:eks:cluster-name" \
--query "Reservations[].Instances[].Tags[?Key=='aws:eks:cluster-name'].Value" \
--output text | sort | uniq))
if [ ${#CLUSTER_NAMES[@]} -eq 0 ]; then
echo "No instances were found with the aws:eks:cluster-name tag."
exit 0
fi
echo "The following EKS clusters were found:"
for CLUSTER in "${CLUSTER_NAMES[@]}"; do
echo "- $CLUSTER"
done
TOTAL_INSTANCES=0
for CLUSTER_NAME in "${CLUSTER_NAMES[@]}"; do
NAME_TAG_VALUE="EKS-${CLUSTER_NAME}"
echo "========================================"
echo "Processing cluster: ${CLUSTER_NAME}"
echo "Searching for EC2 instances with the aws tag:eks:cluster-name=${CLUSTER_NAME}..."
INSTANCE_IDS=$(aws ec2 describe-instances \
--filters "Name=tag:aws:eks:cluster-name,Values=${CLUSTER_NAME}" \
--query "Reservations[].Instances[].InstanceId" \
--output text)
if [ -z "$INSTANCE_IDS" ]; then
echo "Instances with the tag aws:eks:cluster-name=${CLUSTER_NAME} not found."
continue
fi
INSTANCE_COUNT=$(echo $INSTANCE_IDS | wc -w)
echo "Found $INSTANCE_COUNT instances for the cluster ${CLUSTER_NAME}."
TOTAL_INSTANCES=$((TOTAL_INSTANCES + INSTANCE_COUNT))
for INSTANCE_ID in $INSTANCE_IDS; do
echo "Adding a tag Name=${NAME_TAG_VALUE} to the instance $INSTANCE_ID..."
EXISTING_NAME_TAG=$(aws ec2 describe-tags \
--filters "Name=resource-id,Values=${INSTANCE_ID}" "Name=key,Values=Name" \
--query "Tags[0].Value" \
--output text)
if [ "$EXISTING_NAME_TAG" != "None" ] && [ ! -z "$EXISTING_NAME_TAG" ]; then
echo " The instance already has the tag Name=${EXISTING_NAME_TAG}. Updating it..."
fi
aws ec2 create-tags \
--resources "$INSTANCE_ID" \
--tags "Key=Name,Value=${NAME_TAG_VALUE}"
echo " The tag has been successfully added."
done
echo "Processing of the cluster ${CLUSTER_NAME} has been completed."
done
echo "========================================"
echo "The operation is completed. A total of $TOTAL_INSTANCES instances were processed across all clusters."
Debugging when problems occur
If the tags do not appear automatically, follow these steps for debugging:
1. Checking Lambda logs
LOG_STREAM=$(aws logs describe-log-streams \
--log-group-name /aws/lambda/ec2-auto-tagging \
--order-by LastEventTime \
--descending \
--limit 1 \
--query 'logStreams[0].logStreamName' \
--output text)
aws logs get-log-events \
--log-group-name /aws/lambda/ec2-auto-tagging \
--log-stream-name $LOG_STREAM \
--limit 20
2. Checking EventBridge settings
aws events describe-rule --name trigger-on-ec2-instance-creation
aws events list-targets-by-rule --rule trigger-on-ec2-instance-creation
3. Checking CloudTrail
aws cloudtrail get-trail-status --name api-events-trail
aws cloudtrail get-event-selectors --trail-name api-events-trail
4. Verifying the rights of the IAM role
aws iam list-attached-role-policies --role-name LambdaEC2TaggingRole
5. Manual Lambda testing with test event
cat > test-event.json << EOF
{
"version": "0",
"id": "6a7e8feb-b491-4cf7-a9f1-bf3703467718",
"detail-type": "AWS API Call via CloudTrail",
"source": "aws.ec2",
"account": "$(aws sts get-caller-identity --query "Account" --output text)",
"time": "2021-12-03T17:31:20Z",
"region": "$(aws configure get region)",
"resources": [],
"detail": {
"eventSource": "ec2.amazonaws.com",
"eventName": "RunInstances",
"responseElements": {
"instancesSet": {
"items": [
{
"instanceId": "i-YOUR_INSTANCE_ID"
}
]
}
}
}
}
EOF
INSTANCE_ID=i-XXXXXXXXXX
sed -i "s/i-YOUR_INSTANCE_ID/$INSTANCE_ID/g" test-event.json
aws lambda invoke \
--function-name ec2-auto-tagging \
--payload fileb://test-event.json \
response.json
cat response.json
6. The main causes of inactivity and their solutions
EventBridge does not have Lambda as a target
aws events put-targets \
--rule trigger-on-ec2-instance-creation \
--targets '[{"Id": "1", "Arn": "arn:aws:lambda:'${REGION}':'${ACCOUNT_ID}':function:ec2-auto-tagging"}]'
Lambda does not have permission to receive events from EventBridge
aws lambda add-permission \
--function-name ec2-auto-tagging \
--statement-id AllowEventBridgeInvoke \
--action lambda:InvokeFunction \
--principal events.amazonaws.com \
--source-arn arn:aws:events:${REGION}:${ACCOUNT_ID}:rule/trigger-on-ec2-instance-creation
CloudTrail is not configured or enabled
aws cloudtrail start-logging --name api-events-trail
The Lambda function shuts down too quickly, without waiting for the tags to become available.
aws lambda update-function-configuration \
--function-name ec2-auto-tagging \
--timeout 120