Voice Transcription Pipeline in PHP With Vonage
Published on May 4, 2021

In this post, you'll create a voice transcription pipeline. The goal is to use Amazon Transcribe to transcribe an entire recorded conversation, separated into channels, and then insert the results into an RDS MySQL database instance. This takes two AWS Lambda functions: an HTTP application that retrieves the MP3 recording and submits it to Amazon Transcribe, and a callback function that runs when the transcription completes and stores the results in the MySQL database.

Prerequisites

Vonage API Account

To complete this tutorial, you will need a Vonage API account. If you don’t have one already, you can sign up today and start building with free credit. Once you have an account, you can find your API Key and API Secret at the top of the Vonage API Dashboard.

This tutorial also uses a virtual phone number. To purchase one, go to Numbers > Buy Numbers and search for one that meets your needs.

Setup Instructions

Clone the nexmo-community/voice-channels-aws-transcribe-php repo from GitHub, and navigate into the newly created directory to proceed.

Use Composer to Install Dependencies

This example requires the use of Composer to install dependencies and set up the autoloader.

Assuming you have Composer installed globally, run:

composer install

AWS Setup

You will need to create AWS credentials for the Serverless Framework, as described in the Serverless documentation.

Also, create a new AWS S3 bucket and make note of the URL for later use.

Create a Vonage Application Using the Command Line Interface

Install the Vonage CLI by following its installation instructions. You'll use it to create a new Vonage Voice application that also sets up an answer_url and event_url for the app running in AWS Lambda:

vonage apps:create aws-transcribe --voice_answer_url=https://<your_hostname>/webhooks/answer --voice_event_url=https://<your_hostname>/webhooks/event

NOTE: You'll be using <your_hostname> as a placeholder in this command. Later, after you know the URLs provided by deploying to AWS Lambda, you'll need to update these pieces of the URLs via the Vonage API Dashboard settings for your application.

IMPORTANT: This will return an application ID and a private key. The application ID will be needed for the vonage apps:link command as well as the .env file later. By default, a file named private.key will be created in the directory where you run the command, so run it from the project root where the .env file expects to find it.

Obtain a New Virtual Number

If you don't have a number already in place, obtain one from Vonage. This can also be achieved using the CLI by running this command:

vonage numbers:search US

And purchasing one of the available numbers given back by running:

vonage numbers:buy <number>

Finally, link the new number to the created application by running:

vonage apps:link YOUR_APPLICATION_ID --number=<number>

Update Environment

Rename the provided .env.dist file to .env and update the values as needed:

APP_ID=voice-aws-transcribe-php
LANG_CODE=en-US
SAMPLE_RATE=8000
AWS_VERSION=latest
AWS_S3_ARN=
AWS_S3_BUCKET_NAME=''
AWS_S3_RECORDING_FOLDER_NAME=''
VONAGE_APPLICATION_PRIVATE_KEY_PATH='./private.key'
VONAGE_APPLICATION_ID=

NOTE: All placeholders noted by <>, as well as any empty values, need to be updated with your own settings.
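A quick way to catch a forgotten value before deploying is to check the environment for empty settings. The following is a minimal sketch, not part of the repository: the missingEnvKeys helper is a hypothetical name, and the key list is simply copied from the .env file shown above.

```php
<?php
/**
 * Return the names of required settings that are missing or empty.
 * (Hypothetical helper for illustration; not part of the example repo.)
 *
 * @param array    $env      settings to inspect, e.g. $_ENV or getenv()
 * @param string[] $required keys that must have a non-empty value
 * @return string[]
 */
function missingEnvKeys(array $env, array $required): array
{
    $missing = [];
    foreach ($required as $key) {
        $value = $env[$key] ?? '';
        // Treat unset, false, and whitespace-only values as missing.
        if ($value === false || trim((string) $value) === '') {
            $missing[] = $key;
        }
    }
    return $missing;
}

// Keys copied from the .env file in this tutorial.
const REQUIRED_ENV_KEYS = [
    'APP_ID', 'LANG_CODE', 'SAMPLE_RATE', 'AWS_VERSION', 'AWS_S3_ARN',
    'AWS_S3_BUCKET_NAME', 'AWS_S3_RECORDING_FOLDER_NAME',
    'VONAGE_APPLICATION_PRIVATE_KEY_PATH', 'VONAGE_APPLICATION_ID',
];

// getenv() with no arguments returns all environment variables as an array.
$missing = missingEnvKeys(getenv(), REQUIRED_ENV_KEYS);
if ($missing !== []) {
    fwrite(STDERR, 'Missing settings: ' . implode(', ', $missing) . PHP_EOL);
}
```

Running this after exporting the .env values lists any setting that still needs attention.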

Serverless Plugin

Install the serverless-dotenv-plugin with the following command:

npm i -D serverless-dotenv-plugin

Deploy to Lambda

With all the above updated successfully, you can now use Serverless to deploy the app to AWS Lambda.

serverless deploy

Note: Make sure to visit the Vonage API Dashboard and update the answer and event URLs for your application with what is provided by the deployment.

Migrate Transcription to a Database

If you only need the transcription, you're done. To automatically migrate the transcription results to the database, however, you need to deploy a second function. Clone the nexmo-community/aws-voice-transcription-rds-callback-php repo to another location and follow the instructions in its README to get it up and running. The process is the same as for the first function above.

Create The Trigger

After adding the second function, you can navigate to CloudWatch in your AWS Console and select Events and Get Started to create a new Event Rule.

Set the Rule as follows:

  • Event Pattern

  • Build event pattern to match events by service

  • Service Name = Transcribe

  • Event Type = Transcribe Job State Change

  • Specific status(es) = COMPLETED

  • As the Target, select the second Lambda function created above

  • Scroll down and click Configure Details.

  • Give the rule a meaningful name and description, and enable it.

  • Click Create rule to complete it.
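The console choices above correspond to a JSON event pattern. As a rough sketch, the same pattern can be expressed in PHP, along with a small check the callback function could use on the event it receives. The function names here are hypothetical, and the event field names (TranscriptionJobName, TranscriptionJobStatus) are assumed from the shape of the Transcribe job state change event rather than taken from the repository.

```php
<?php
/**
 * Event pattern matching completed Amazon Transcribe jobs.
 * (Hypothetical helper; mirrors the rule settings listed in this section.)
 */
function transcribeCompletedPattern(): array
{
    return [
        'source'      => ['aws.transcribe'],
        'detail-type' => ['Transcribe Job State Change'],
        'detail'      => [
            'TranscriptionJobStatus' => ['COMPLETED'],
        ],
    ];
}

/**
 * Return the job name if the event reports a completed Transcribe job,
 * or null for any other event.
 */
function completedJobName(array $event): ?string
{
    $isTranscribe = ($event['source'] ?? '') === 'aws.transcribe';
    $isCompleted  = ($event['detail']['TranscriptionJobStatus'] ?? '') === 'COMPLETED';

    if ($isTranscribe && $isCompleted) {
        return $event['detail']['TranscriptionJobName'] ?? null;
    }
    return null;
}

// Print the pattern as JSON, for comparison with the rule preview in the console.
echo json_encode(transcribeCompletedPattern(), JSON_PRETTY_PRINT), PHP_EOL;
```

Checking the status again inside the function is a defensive choice: the rule should already filter events, but the extra guard keeps the function safe if it is ever invoked manually.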

Now you're ready to test.

Usage

With the deployment completed, you should be able to place a call to your virtual number from any phone. You will hear a message about being connected, and then the recipient number will be called.
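This call flow is driven by the NCCO that the answer webhook returns. The sketch below shows roughly what such an NCCO might look like; it is an illustration under assumptions, not the repository's actual code. The function name, the greeting text, and the phone numbers are placeholders, while the talk, record, and connect actions and the split option are standard Vonage Voice API NCCO features.

```php
<?php
/**
 * Build an NCCO that greets the caller, records the conversation with
 * the two parties on separate channels, and connects to the recipient.
 * (Hypothetical helper for illustration.)
 */
function buildAnswerNcco(string $recipientNumber, string $vonageNumber, string $eventUrl): array
{
    return [
        [
            'action' => 'talk',
            'text'   => 'Please wait while we connect you.',
        ],
        [
            'action'   => 'record',
            'eventUrl' => [$eventUrl],      // receives the recording URL when the call ends
            'split'    => 'conversation',   // keep each side of the call on its own channel
            'format'   => 'mp3',
        ],
        [
            'action'   => 'connect',
            'from'     => $vonageNumber,
            'endpoint' => [
                ['type' => 'phone', 'number' => $recipientNumber],
            ],
        ],
    ];
}

// Placeholder numbers and URL; an answer webhook would return this JSON
// with a Content-Type of application/json.
echo json_encode(
    buildAnswerNcco('15551230001', '15551230000', 'https://example.com/webhooks/event'),
    JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES
), PHP_EOL;
```

Recording with split set to conversation is what makes channel-based transcription possible later, since each party ends up on a separate audio channel.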

After you hang up, the MP3 file will be retrieved from Vonage and uploaded to AWS S3. Following that, a transcription job will be started. You can monitor the job in the Amazon Transcribe section of the AWS Console.
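To make the transcription step more concrete, here is a sketch of the request that could start such a job once the MP3 is in S3. It only builds the parameter array; the function name, job name, bucket, and object key are placeholders, and the parameter names are those of Amazon Transcribe's StartTranscriptionJob operation. The repository's own code may differ.

```php
<?php
/**
 * Build parameters for Amazon Transcribe's StartTranscriptionJob.
 * (Hypothetical helper; values such as the bucket and key are placeholders.)
 */
function buildTranscriptionJobParams(
    string $jobName,
    string $bucket,
    string $objectKey,
    string $languageCode,
    int $sampleRate
): array {
    return [
        'TranscriptionJobName' => $jobName,
        'LanguageCode'         => $languageCode,
        'MediaFormat'          => 'mp3',
        'MediaSampleRateHertz' => $sampleRate,
        'Media'                => ['MediaFileUri' => "s3://{$bucket}/{$objectKey}"],
        // Transcribe each audio channel separately, one per call participant.
        'Settings'             => ['ChannelIdentification' => true],
    ];
}

// LANG_CODE and SAMPLE_RATE come from the .env file; fall back to its sample values.
$params = buildTranscriptionJobParams(
    'example-call-recording',
    'example-bucket',
    'recordings/example-call.mp3',
    getenv('LANG_CODE') ?: 'en-US',
    (int) (getenv('SAMPLE_RATE') ?: 8000)
);

// With the AWS SDK for PHP installed, the job could then be started with:
//   $client = new Aws\TranscribeService\TranscribeServiceClient([
//       'region' => 'us-east-1', 'version' => 'latest',
//   ]);
//   $client->startTranscriptionJob($params);
echo json_encode($params, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), PHP_EOL;
```

Keeping the parameter building separate from the SDK call makes it easy to inspect or unit test the request without touching AWS.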

Upon completion of the transcription, CloudWatch will trigger the second Lambda function, which parses the transcription and inserts it into the database.
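As an illustration of the parsing step, the sketch below turns a channel-identified transcript into one line of text per channel and shows how those lines might be stored with PDO. This is an assumption-laden example rather than the callback repository's code: the function names and the transcripts table with its columns are hypothetical, and the sample data is made up to follow the general shape of Amazon Transcribe's JSON output.

```php
<?php
/**
 * Collapse a channel-identified transcript into text per channel.
 * (Hypothetical helper; expects the decoded Transcribe result JSON.)
 *
 * @return array<string, string> channel label => spoken text
 */
function channelTexts(array $transcript): array
{
    $texts = [];
    $channels = $transcript['results']['channel_labels']['channels'] ?? [];
    foreach ($channels as $channel) {
        $text = '';
        foreach ($channel['items'] ?? [] as $item) {
            $content = $item['alternatives'][0]['content'] ?? '';
            if ($content === '') {
                continue;
            }
            // Punctuation attaches to the previous word; words are space-separated.
            $isPunctuation = ($item['type'] ?? '') === 'punctuation';
            $text .= ($isPunctuation || $text === '') ? $content : ' ' . $content;
        }
        $texts[$channel['channel_label'] ?? 'unknown'] = $text;
    }
    return $texts;
}

/**
 * Insert one row per channel. The table and column names are hypothetical.
 */
function storeChannelTexts(PDO $pdo, string $jobName, array $texts): void
{
    $stmt = $pdo->prepare(
        'INSERT INTO transcripts (job_name, channel, content) VALUES (?, ?, ?)'
    );
    foreach ($texts as $channel => $content) {
        $stmt->execute([$jobName, $channel, $content]);
    }
}

// Made-up sample in the general shape of a Transcribe result.
$sample = ['results' => ['channel_labels' => ['channels' => [
    ['channel_label' => 'ch_0', 'items' => [
        ['type' => 'pronunciation', 'alternatives' => [['content' => 'Hello']]],
        ['type' => 'punctuation',   'alternatives' => [['content' => ',']]],
        ['type' => 'pronunciation', 'alternatives' => [['content' => 'world']]],
    ]],
    ['channel_label' => 'ch_1', 'items' => [
        ['type' => 'pronunciation', 'alternatives' => [['content' => 'Hi']]],
    ]],
]]]];

print_r(channelTexts($sample));
```

A prepared statement is used for the insert so that transcript text is never interpolated into the SQL string.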

Next Steps

If you have any questions or run into trouble, you can reach out to @VonageDev on Twitter or ask in the Vonage Developer Community Slack team. Good luck.

Adam Culp, Vonage Alumni

Adam is a developer and consultant who enjoys ultra-running, blogging/vlogging, and helping others tame technology to accomplish amazing things with an insatiable desire to mentor and help.
