AWS Form Processing

Serverless form processing for (static) websites

Easy and cheap (and privacy aware?)

Your team is building a static website for another super frugal client, Nofrills Coffee. You’ve already figured out how to host it in the easiest and cheapest way possible, but the client also needs to be able to receive feedback from users through a web form. You wish the front end developers could just submit the form to one of those “free” services like Formspree and Simple Form, but you’re too wary of security and privacy to hand things over like that.

If you trust Amazon not to read and analyze every single byte that hits its APIs, you may find some valuable information in this article. Otherwise… See you next week.

Overview

In this article, I’ll assume the web app you want to connect with AWS is already capable of sending POST requests to APIs and web services on the Internet (on static websites, that’s usually achieved through Ajax). We’ll work on an AWS API endpoint that takes form content from the front end (a web app, website etc.), parses it and sends an email about it to one or more staff members at Nofrills Coffee. Last but not least, this is the kind of payload we’re expecting from that web form:

{
    "name": "Ricky de Picquey",
    "email": "ricky@entitled.omg",
    "subject": "Wrong order",
    "message": "I ordered a skinny venti latte extra dry and it came with the regular amount of foam. I've never been so disrespected in my life! (╯°□°)╯︵ ┻━┻"
}

The front end will send that JSON payload to a public endpoint on AWS (API Gateway), which will forward it to a Lambda function, which will then massage the payload and send an email with the parsed information to one or more of the poor souls at Nofrills Coffee. Easy biz, right? Let’s get going, then.

Project structure and functionality

These are the only two CloudFormation templates we’ll be dealing with:

  • lambda.yaml: this template creates a Lambda function responsible for parsing a payload and sending its contents to one or more email addresses through Amazon SES (Simple Email Service);
  • api-gateway.yaml: this template creates an API Gateway endpoint (/contact) that accepts POST requests and forwards them to the Lambda function above.

Create and verify identity in SES

Before we can start sending emails through SES, we first need to create and verify an SES identity—a domain name or email address that we can prove we own. I’m going to use an email address because that’s all we need for our current use case (email messages will be sent from one address only), but the process for creating and validating a domain identity is just a bit more involved.

In order to add and verify a new email address in SES, we can call the verify-email-identity command of the AWS CLI, which “Adds an email address to the list of identities for your Amazon SES account and attempts to verify it.” Because, at the time of writing, SES is only available in a few regions (Northern Virginia, Oregon and Ireland), we’ll start by setting environment variables for our default profile and region of choice. This step is optional, but by performing it we can skip the --profile and --region options when using the AWS CLI:

export AWS_DEFAULT_PROFILE="cwtf"
export AWS_DEFAULT_REGION="us-east-1"

We’re ready to call verify-email-identity now:

aws ses verify-email-identity --email-address website@nofrills.coffee

A verification email from no-reply-aws@amazon.com should arrive in a few seconds at the address specified (website@nofrills.coffee). To complete the verification process, all you have to do is click on the link embedded in the email.

[Image: SES identity verification result.]

We can confirm that the new identity was created and verified in SES by calling get-identity-verification-attributes:

$ aws ses get-identity-verification-attributes --identities "website@nofrills.coffee"
{
    "VerificationAttributes": {
        "website@nofrills.coffee": {
            "VerificationStatus": "Success"
        }
    }
}
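
If you'd rather check the verification status from a script, here's a minimal Python sketch. The helper names (is_identity_verified, fetch_verification_response) are made up for this example; the only real API call is boto3's get_identity_verification_attributes, which needs boto3 and valid AWS credentials and is therefore not executed below:

```python
def is_identity_verified(response, identity):
    """Return True if the SES response reports `identity` as verified."""
    attributes = response.get("VerificationAttributes", {})
    return attributes.get(identity, {}).get("VerificationStatus") == "Success"


def fetch_verification_response(identity):
    """Query SES; needs boto3 and AWS credentials, so it is not run here."""
    import boto3  # imported lazily so the parser above works without boto3

    return boto3.client("ses").get_identity_verification_attributes(
        Identities=[identity]
    )


# Same shape as the CLI output shown above
sample = {
    "VerificationAttributes": {
        "website@nofrills.coffee": {"VerificationStatus": "Success"}
    }
}
print(is_identity_verified(sample, "website@nofrills.coffee"))  # True
```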

We’re ready to start playing with CloudFormation now.

Lambda

Parameters

We start by defining a few parameters that will later be referenced by CloudFormation resources:

---
# Stack name: form-processing-lambda
AWSTemplateFormatVersion: "2010-09-09"
Description: >
  Forward payloads from ApiGateway to Lambda and then send emails with processed data.

Parameters:
  SESEmailAddresses:
    Type: "String"
    Default: "ermenegildo@nofrills.coffee,leda@nofrills.coffee"
    Description: "Email addresses of employees that should be notified of message."
  SESSourceEmailAddress:
    Type: "String"
    Default: "website@nofrills.coffee"
    Description: >
      Source email address (SES email address that will "send" emails to other addresses).
  SESIdentityARN:
    Type: "String"
    Default: "arn:aws:ses:us-east-1:860793216825:identity/website@nofrills.coffee"
    Description: >
      ARN of SES identity (source email address).
  SESTemplateName:
    Type: "String"
    Default: "message-from-website"
    Description: "SES template name. (AWS::SES::Template has no !GetAtt...)"

The first default value is a comma-delimited string containing the target email addresses, i.e. those that will receive the notifications. The second one is the source email address, which we’ve just verified. The third default value is set to the source email address’s ARN (Amazon Resource Name), which you can get from the AWS web console. The fourth and last value will be used in a few resource declarations in a bit.

Bear in mind that an AWS account that’s still in the SES sandbox will only be allowed to send emails to the addresses listed in SESEmailAddresses if those addresses have been verified as well. If that applies to your account, you can use the same process described above (calling verify-email-identity) to verify those addresses.
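
One thing to watch out for with a comma-delimited parameter: a stray space after a comma ends up inside the address once the string is split. The sketch below shows one way to split the value defensively; parse_recipients is a hypothetical helper, not something the script in this article defines:

```python
def parse_recipients(raw):
    """Split a comma-delimited string into a clean list of addresses."""
    if not raw:
        return []
    # Trim whitespace around each entry and drop empty entries
    return [address.strip() for address in raw.split(",") if address.strip()]


print(parse_recipients("ermenegildo@nofrills.coffee, leda@nofrills.coffee"))
# ['ermenegildo@nofrills.coffee', 'leda@nofrills.coffee']
```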

SES template

After the parameter declaration, we’re going to create an SES template with a default subject plus HTML and text blueprints containing placeholders. We’ll see later on how those get substituted:

Resources:
  # SES email template (used inside Lambda function)
  SESTemplate:
    Type: "AWS::SES::Template"
    Properties:
      Template:
        TemplateName: !Ref "SESTemplateName"
        SubjectPart: "New message from website"
        TextPart: |
          New message from website.
          Name: {{name}}
          Subject: {{subject}}
          Message: {{message}}
          Someone please do something about it...
        HtmlPart: >
          <!doctype html>
            <html>
                <body>
                    <p>New message from website.</p>
                    <ul>
                        <li>Name: {{name}}</li>
                        <li>Subject: {{subject}}</li>
                        <li>Message: {{message}}</li>
                    </ul>
                    <p><b>Someone</b> please do something about it...</p>
                </body>
            </html>

You may be wondering why I chose to declare the template name as a CloudFormation parameter (SESTemplateName) instead of just assigning a string to the TemplateName attribute, like this:

TemplateName: "message-from-website"

Then, later on, we should be able to reference the template name with !GetAtt "SESTemplate.TemplateName", right? Wrong. The reason for this is that, at the time of this writing, AWS::SES::Template doesn’t return any value after its creation, so there are no attributes that !GetAtt can fetch for us. That’s why we have to declare a parameter and reference it like that. CloudFormation is full of these little “exceptions”—quite frustrating, indeed.

IAM role

The IAM role defines an inline policy granting Lambda a few capabilities pertaining to SES (sending email, bulk email etc.), CloudWatch (creating logs) and X-Ray (recording traces):

# This snippet should go under "Resources", after "SESTemplate"
  # IAM role and inline policy defining permissions for Lambda function
  LambdaRole:
    Type: "AWS::IAM::Role"
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          -
            Effect: "Allow"
            Principal:
              Service:
                - "lambda.amazonaws.com"
            Action:
              - "sts:AssumeRole"
      Path: "/"
      Policies:
        -
          PolicyName: "form-processing-lambda-role-policy"
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              -
                Effect: "Allow"
                Action:
                  - "ses:SendBulkTemplatedEmail"
                  - "ses:SendEmail"
                  - "ses:SendRawEmail"
                Resource: !Ref "SESIdentityARN"
              -
                Effect: "Allow"
                Action:
                  - "logs:CreateLogGroup"
                  - "logs:CreateLogStream"
                  - "logs:PutLogEvents"
                Resource: "arn:aws:logs:*:*:*"
              -
                Effect: "Allow"
                Action:
                  - "xray:PutTelemetryRecords"
                  - "xray:PutTraceSegments"
                Resource: "*"

Note how the inline policy is referencing the SES identity specified in the template parameters (SESIdentityARN). The policy grants a given Lambda function permissions over that SES identity (the source email address, website@nofrills.coffee) to send bulk templated emails, regular emails and raw emails. We’ll only send bulk templated emails in this article, so feel free to remove the remaining SES actions if you feel you won’t need them.
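
To make the trimmed-down version concrete, here's a small Python sketch that builds the SES statement with only the action this article relies on. build_ses_statement is a hypothetical helper and the account ID is a placeholder; treat the output as a starting point to paste back into the YAML, not as a verified least-privilege policy:

```python
import json


def build_ses_statement(identity_arn, actions=("ses:SendBulkTemplatedEmail",)):
    """Build one IAM policy statement scoped to a single SES identity."""
    return {
        "Effect": "Allow",
        "Action": list(actions),
        "Resource": identity_arn,
    }


# Placeholder account ID; use your own identity's ARN
statement = build_ses_statement(
    "arn:aws:ses:us-east-1:123456789012:identity/website@nofrills.coffee"
)
print(json.dumps(statement, indent=2))
```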

Lambda function

After that, we define the Lambda function itself, telling it to assume the IAM role defined above and passing it environment variables that will later be accessed from the Python script:

# This snippet should go under "Resources", after "LambdaRole"
  # Lambda function to catch ApiGateway events and send SES emails
  LambdaFunction:
    Type: "AWS::Lambda::Function"
    Properties:
      FunctionName: "form-processing-lambda"
      Description: "Process ApiGateway payloads and send SES emails"
      Handler: "index.handler"
      Role: !GetAtt "LambdaRole.Arn"
      Environment:
        Variables:
          SES_EMAIL_ADDRESSES: !Ref "SESEmailAddresses"
          SES_SOURCE_EMAIL_ADDRESS: !Ref "SESSourceEmailAddress"
          SES_TEMPLATE_NAME: !Ref "SESTemplateName"
      Code:
        ZipFile: |
            # Python script goes here
      Runtime: "python3.6"
      Timeout: "90"
      TracingConfig:
        Mode: "Active"

The easiest way to write a Lambda function is by dumping the code directly in the CloudFormation template. Next, we’ll see the Python script that should replace the comment # Python script goes here above.

Python script

We’re going to use a Python 3 script to process the payload and make calls to Amazon SES through the Boto 3 library (boto3):

import os
import re
import json
import html
import boto3
from botocore.exceptions import ClientError

# Entry point
def handler(event, context):
    # Get SES email addresses from environment
    try:
        ses_email_addresses = os.getenv("SES_EMAIL_ADDRESSES").strip().split(",")
    except Exception as exception:
        print("Failed to parse SES email addresses. Original exception: " + str(exception))
        return get_response(500, "Internal server error.")

    # Get SES source email address from environment
    ses_source_email_address = os.getenv("SES_SOURCE_EMAIL_ADDRESS", None)
    if not ses_source_email_address:
        print("No SES source email address specified.")
        return get_response(500, "Internal server error.")

    # Get SES template name from environment
    ses_template_name = os.getenv("SES_TEMPLATE_NAME", None)
    if not ses_template_name:
        print("No SES template name specified.")
        return get_response(500, "Internal server error.")

    # Get payload
    try:
        payload = json.loads(event["body"])
        # Missing or null fields become "" so the validation below returns a 400
        from_name = html.escape(payload.get("name") or "")
        from_email = html.escape(payload.get("email") or "")
        from_subject = html.escape(payload.get("subject") or "")
        from_message = html.escape(payload.get("message") or "")
    except Exception as exception:
        print("Failed to parse payload. Original exception: " + str(exception))
        return get_response(500, "Internal server error.")

    # Perform minimal validation
    if not from_name or not from_email or not from_subject or not from_message:
        return get_response(400, "Missing fields.")

    if not is_email_address_valid(from_email):
        return get_response(400, "Bad email address.")

    # Send emails to SES recipients
    try:
        template_data_string = json.dumps({
            "name": from_name,
            "subject": from_subject,
            "message": from_message
        })

        result = boto3.client("ses").send_bulk_templated_email(
            Source=ses_source_email_address,
            ReplyToAddresses=[from_email],
            Template=ses_template_name,
            DefaultTemplateData=template_data_string,
            Destinations=[
                {
                    'Destination': {
                        'ToAddresses': ses_email_addresses
                    },
                    'ReplacementTemplateData': template_data_string
                }
            ]
        )

        print(result)
    except ClientError as exception:
        print("Failed to send bulk templated email. Original exception: " + str(exception))
        return get_response(500, "Internal server error.")

    if result['Status'][0]['Status'] == 'Success':
        return get_response(200, "Message sent with success.")
    else:
        return get_response(500, "Internal server error.")

# Return JSON response in API Gateway format
def get_response(status_code, response_message):
    return {
        "statusCode": status_code,
        "body": response_message,
        "headers": {},
        "isBase64Encoded": False
    }

# Return correctness of email address' syntax
def is_email_address_valid(email_address):
    regexp = r"^[_A-Za-z0-9-]+(\.[_A-Za-z0-9-]+)*@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)*(\.[A-Za-z-]+)$"
    return re.match(regexp, email_address) is not None

There’s a lot going on here, so let’s start with the helper functions. As the name implies, is_email_address_valid will simply return a boolean according to its input parameter’s value: True if it’s a syntactically valid email address and False otherwise. get_response is just a wrapper to make our lives easier—it returns a dictionary in the format expected by API Gateway.
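
To get a feel for what that regular expression accepts, here's a standalone sketch that runs the same pattern against a few sample addresses (the addresses are invented for illustration, and the result is coerced to a real boolean with "is not None"). Note that the pattern is fairly strict: a plus-addressed mailbox, for instance, is rejected even though such addresses are valid in practice.

```python
import re

EMAIL_REGEXP = (
    r"^[_A-Za-z0-9-]+(\.[_A-Za-z0-9-]+)*"
    r"@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)*(\.[A-Za-z-]+)$"
)


def is_email_address_valid(email_address):
    """Return True if the address matches the article's syntax check."""
    return re.match(EMAIL_REGEXP, email_address) is not None


for candidate in ["ricky@entitled.omg", "ricky@entitled", "ricky+latte@entitled.omg"]:
    print(candidate, is_email_address_valid(candidate))
# ricky@entitled.omg True
# ricky@entitled False
# ricky+latte@entitled.omg False
```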

handler, which is the Lambda’s entry point, starts by getting the environment variables that were originally set up and populated by CloudFormation (SES_EMAIL_ADDRESSES, SES_SOURCE_EMAIL_ADDRESS and SES_TEMPLATE_NAME). It then parses the contents of the Lambda event, performs a little input filtering and validates the email address sent by Nofrills Coffee’s imaginary end user. Finally, it makes a call to the send_bulk_templated_email method of the Boto 3 library using all the variables previously set up in the script. If the call succeeds, we return a 200 status code wrapped in a Python dictionary. Otherwise, we return a 500 and hide from the end user the fact that we’ve probably done something really wrong on the “server” side.

Note that the payload is being read from a field inside the Lambda event dictionary (event["body"]). We need to do it like that because the API Gateway will send its events to Lambda in the following format:

{
    'resource': '/contact',
    'path': '/contact',
    'httpMethod': 'POST',
    # [...]
    'body': '{"name": "Test", "email": "test@example.com", "subject": "Testing", "message": "Testing"}',
    'isBase64Encoded': False
}
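
Since the body arrives as a string, it can be handy to isolate the decoding step and try it out locally. The following is only a sketch: parse_event_body is a hypothetical helper, and the base64 branch is an assumption covering events that arrive with isBase64Encoded set to True, which the script in this article doesn't handle:

```python
import base64
import json


def parse_event_body(event):
    """Decode the JSON body of an API Gateway proxy event into a dict."""
    body = event.get("body") or ""
    if event.get("isBase64Encoded"):
        body = base64.b64decode(body).decode("utf-8")
    payload = json.loads(body)  # raises ValueError on malformed JSON
    if not isinstance(payload, dict):
        raise ValueError("Payload must be a JSON object.")
    return payload


event = {
    "resource": "/contact",
    "path": "/contact",
    "httpMethod": "POST",
    "body": '{"name": "Test", "email": "test@example.com", '
            '"subject": "Testing", "message": "Testing"}',
    "isBase64Encoded": False,
}
print(parse_event_body(event)["email"])  # test@example.com
```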

Also, pay attention to the template_data_string variable and where it’s being used—that’s how the SES template placeholders get substituted.
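
If you want a rough idea of what the substituted text looks like without sending anything, you can approximate the substitution locally. This is just a sketch: render_preview is a hypothetical helper that only handles plain {{key}} placeholders, and it is not SES's actual rendering engine (which supports a richer template syntax and may treat missing variables differently):

```python
import json
import re


def render_preview(template_text, template_data_string):
    """Replace {{key}} placeholders with values from a JSON string."""
    data = json.loads(template_data_string)
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        # Unknown keys become empty strings in this local preview
        lambda match: str(data.get(match.group(1), "")),
        template_text,
    )


template_data_string = json.dumps(
    {"name": "Test", "subject": "Testing", "message": "Hello"}
)
print(render_preview("Name: {{name}} / Subject: {{subject}}", template_data_string))
# Name: Test / Subject: Testing
```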

Once you’ve fully understood the Python script and tweaked it according to your needs, don’t forget to dump it inside the CloudFormation template (right below ZipFile: |).

Output

As a final step, we need to output and export the Lambda’s ARN so we can import it from the API Gateway template we’ll write next:

Outputs:
  LambdaFunctionARN:
    Description: "ARN of Lambda function responsible for sending emails"
    Value: !GetAtt "LambdaFunction.Arn"
    Export:
      Name: "form-processing-lambda-arn"

That’s it for the Lambda template. To create a CloudFormation stack from it, use the following:

aws cloudformation create-stack \
--stack-name form-processing-lambda \
--template-body file://lambda.yaml \
--capabilities CAPABILITY_IAM
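
Before moving on, it may be worth confirming that the export is actually there, since the next template imports it. Here's a minimal sketch that looks it up through boto3's list_exports call; find_export is a hypothetical helper, and the client is passed in as an argument so the function can be exercised without touching AWS:

```python
def find_export(cloudformation_client, export_name):
    """Return the value of a CloudFormation export, or None if absent."""
    kwargs = {}
    while True:
        page = cloudformation_client.list_exports(**kwargs)
        for export in page.get("Exports", []):
            if export["Name"] == export_name:
                return export["Value"]
        next_token = page.get("NextToken")
        if not next_token:
            return None
        # Ask for the next page of exports
        kwargs = {"NextToken": next_token}


# Usage (needs boto3 and AWS credentials):
# import boto3
# print(find_export(boto3.client("cloudformation"), "form-processing-lambda-arn"))
```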

API Gateway

The CloudFormation template we’re going to use in order to create the API endpoint contains a bunch of interconnected resources. A lot of them. (I’ve no idea why we need that many separate chunks just to have a working RESTful endpoint, so save your questions for Amazon.) Another annoying thing is that we’ll need to build two very long strings from values coming out of all sorts of places because CloudFormation doesn’t offer a more elegant way of doing what we need it to do.

Enough with the ranting. Here’s the template in its monumental entirety:

---
# Stack name: form-processing-api-gateway
AWSTemplateFormatVersion: "2010-09-09"
Description: "Forward ApiGateway requests to Lambda function"

Resources:
  # ApiGateway REST API
  RestApi:
    Type: "AWS::ApiGateway::RestApi"
    Properties:
      Name: "form-processing-rest-api"
      Description: "REST API for /contact endpoint"
      FailOnWarnings: true

  # Lambda permission to let ApiGateway invoke Lambda function
  LambdaPermission:
    Type: "AWS::Lambda::Permission"
    Properties:
      Action: "lambda:InvokeFunction"
      Principal: "apigateway.amazonaws.com"
      SourceArn:
        Fn::Join:
          - ""
          - - "arn:aws:execute-api:"
            - !Ref "AWS::Region"
            - ":"
            - !Ref "AWS::AccountId"
            - ":"
            - !Ref "RestApi"
            - "/*"
      FunctionName: !ImportValue "form-processing-lambda-arn"

  # ApiGateway REST API resource
  RestApiResource:
    Type: "AWS::ApiGateway::Resource"
    Properties:
      RestApiId: !Ref "RestApi"
      ParentId: !GetAtt "RestApi.RootResourceId"
      PathPart: "contact"

  # ApiGateway REST API method (POST)
  RestApiMethodPost:
    Type: "AWS::ApiGateway::Method"
    Properties:
      RestApiId: !Ref "RestApi"
      ResourceId: !Ref "RestApiResource"
      HttpMethod: "POST"
      AuthorizationType: "NONE"
      Integration:
        Type: "AWS_PROXY"
        IntegrationHttpMethod: "POST"
        Uri:
          Fn::Join:
            - ""
            - - "arn:aws:apigateway:"
              - !Ref "AWS::Region"
              - ":lambda:path/2015-03-31/functions/"
              - !ImportValue "form-processing-lambda-arn"
              - "/invocations"
      MethodResponses:
        - StatusCode: "200"
        - StatusCode: "400"
        - StatusCode: "500"

  # ApiGateway REST API deployment
  RestApiDeployment:
    Type: "AWS::ApiGateway::Deployment"
    DependsOn: "RestApiMethodPost"
    Properties:
      Description: "REST API deployment for /contact endpoint"
      RestApiId: !Ref "RestApi"
      StageName: "latest"

As you can see, the first resource (RestApi) is pretty much just holding a name and a description. FailOnWarnings is a CloudFormation thingy: it’s only used to indicate “whether to roll back the resource if a warning occurs while API Gateway is creating the RestApi resource.” Why do we need it? We don’t. It just sounds like the right thing to do…

The second resource grants API Gateway permission to invoke our Lambda function. Pay attention to how the value assigned to the SourceArn attribute has to be assembled from all sorts of piecemeal information belonging to the current AWS account. (Triple facepalm.) The last attribute (FunctionName) makes it clear why we had to export the Lambda function’s ARN from our previous template.
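
To see what that Fn::Join actually produces, here's the same concatenation as a tiny Python sketch. build_source_arn is a hypothetical helper, and the account ID is a placeholder:

```python
def build_source_arn(region, account_id, rest_api_id):
    """Assemble the execute-api ARN that the Fn::Join above builds."""
    return f"arn:aws:execute-api:{region}:{account_id}:{rest_api_id}/*"


print(build_source_arn("us-east-1", "123456789012", "d4tshh3lqp"))
# arn:aws:execute-api:us-east-1:123456789012:d4tshh3lqp/*
```

The trailing /* covers every stage, method and path of that API. If the Fn::Join noise bothers you, CloudFormation's Fn::Sub should be able to express the same string on a single line.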

Wait, it gets better. The third resource is of type AWS::ApiGateway::Resource. Do me a favor and open its AWS documentation page right now. There you’ll find the description for that resource, and I quote:

The AWS::ApiGateway::Resource resource creates a resource in an Amazon API Gateway (API Gateway) API.

You don’t say!!!

The fourth resource is one that actually makes some sense as it defines a few things that are definitely necessary if we want our endpoint to work properly: the HTTP method type (POST), the integration URI that points API Gateway at our Lambda function (also built from a shitload of scattered data), and the status codes we want to return.
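
For reference, this is roughly what the Uri value looks like once CloudFormation has glued the pieces together. build_integration_uri is a hypothetical helper, and the Lambda ARN below uses a placeholder account ID:

```python
def build_integration_uri(region, lambda_arn):
    """Assemble the Lambda proxy integration URI built by the Fn::Join."""
    return (
        f"arn:aws:apigateway:{region}:lambda:path/2015-03-31/functions/"
        f"{lambda_arn}/invocations"
    )


lambda_arn = "arn:aws:lambda:us-east-1:123456789012:function:form-processing-lambda"
print(build_integration_uri("us-east-1", lambda_arn))
```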

The fifth and last (thank God!) resource creates an API Gateway deployment, which publishes the API to a stage (latest, in our case). Without it we wouldn’t be able to access our endpoint, and the stage name ends up in the endpoint’s URL. I’ve only put it there because I had to…

Enough with the ranting (this time I mean it). Let’s deploy this thing, make the front end people happy and call it a day, shall we?

$ aws cloudformation create-stack \
--stack-name form-processing-api-gateway \
--template-body file://api-gateway.yaml

Testing

Before running any tests, we need to get the API Gateway’s identifier:

$ aws apigateway get-rest-apis \
--query "items[?name == 'form-processing-rest-api'].id" \
--output text
d4tshh3lqp

You can also run that command and assign its result to an environment variable in one shot:

API_ID=$(aws apigateway get-rest-apis \
--query "items[?name == 'form-processing-rest-api'].id" \
--output text)

We’re going to use curl in the next command in order to call our new API endpoint. If you haven’t set AWS_DEFAULT_REGION as suggested in the beginning of this article, make sure you replace ${AWS_DEFAULT_REGION} below with the AWS region you deployed the CloudFormation stacks to (us-east-1, us-west-2 etc.):

$ curl -i -X POST -d '{"name": "Test", "email": "test@example.com", "subject": "Test", "message": "Testing"}' "https://${API_ID}.execute-api.${AWS_DEFAULT_REGION}.amazonaws.com/latest/contact"

# curl output
HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 26
Connection: keep-alive
Date: Mon, 12 Nov 2018 18:47:32 GMT
x-amzn-RequestId: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
x-amz-apigw-id: xxxxxxxxxxxxxxx=
X-Amzn-Trace-Id: Root=x-xxxxxxxx-xxxxxxxxxxxxxxxxxxxxxxxx;Sampled=1
X-Cache: Miss from cloudfront
Via: 1.1 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.cloudfront.net (CloudFront)
X-Amz-Cf-Id: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx==

Message sent with success.

Look how nice! That’s the response your front end team should expect to receive from your lightweight, low cost, and elegant serverless form processor. And below is the type of email Nofrills’ staff can expect to receive from it:

[Image: 1000th happy customer.]

You can also test for missing fields and a bad email address to confirm that the Lambda function returns a 400 response in such cases.
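
If you'd rather script those checks than retype curl commands, here's a minimal sketch using only Python's standard library. build_contact_request is a hypothetical helper, the API ID is the example one from above, and the actual network call is left commented out:

```python
import json
import urllib.request


def build_contact_request(api_id, region, payload, stage="latest"):
    """Build a POST request for the /contact endpoint."""
    url = f"https://{api_id}.execute-api.{region}.amazonaws.com/{stage}/contact"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# A payload with a bad email address, which should come back as a 400
request = build_contact_request(
    "d4tshh3lqp",
    "us-east-1",
    {"name": "Test", "email": "not-an-email", "subject": "Test", "message": "Testing"},
)
print(request.full_url)

# To actually send it (a 4xx/5xx status raises urllib.error.HTTPError):
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.read().decode("utf-8"))
```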

Delete the stacks

To delete the AWS resources created above, call delete-stack following the correct dependency order:

aws cloudformation delete-stack --stack-name form-processing-api-gateway
aws cloudformation delete-stack --stack-name form-processing-lambda

Conclusion

Despite the insane Javaesque red tape involved in the API Gateway resources, it wasn’t too bad, was it? A serverless form processor like the one shown here has many advantages, such as low cost and simplicity of use, and you can also extend it to other use cases beyond email alerts.