“Hey Facebook, What Type of Dog Is That?” Adding ML to Messenger
Published on May 13, 2021

Convolutional Neural Networks (CNNs) provide a powerful and scalable mechanism for performing image classification. They can be relatively difficult to build, train, and tune from scratch, which is what makes tools like TensorFlow and the Inception models so indispensable to improving our ML workflows.

That said, for us .NET folks, running Python scripts out of an in-app shell is a less-than-ideal solution, which is what makes the release of the ML.NET TensorFlow library so exciting.

What if I told you that with just a couple hundred lines of C# code and a little configuration you could build an ASP.NET Core app that houses a powerful CNN, one you can interact with as simply as sending a picture message to a Facebook page?

With training as simple as:

[Image: Training Image]

And a classification request as simple as:

[Image: Classification Request]

Well, that's precisely what we're going to do. Using ML.NET, we're going to build a powerful classifier, and then, using Nexmo's Messages API and Messenger, we're going to create a powerful, easy-to-use vector for training and classification.

Learning Objectives

In this tutorial, we will:

  • Create an ML.NET TensorFlow Neural Network

  • Train that Neural Network to recognize different types of dogs

  • Create a Messaging vector to ask the Neural Network to classify dogs it's never seen before

  • Create a Learning vector to allow the Neural Network to learn new types of dogs dynamically.

Prerequisites

  • Visual Studio 2019 version 16.3 or higher

  • A Facebook Page linked to your Nexmo account (see here for setup)

  • Optional: Ngrok for test deployment

DT API Account

To complete this tutorial, you will need a DT API account. If you don’t have one already, you can sign up today and start building with free credit. Once you have an account, you can find your API Key and API Secret at the top of the DT API Dashboard.

Project Setup

First things first: let's open Visual Studio, create a new ASP.NET Core 3.0 API application, and call it MessagesTensorFlow. Now let's add the following NuGet packages to the solution:

  • BouncyCastle

  • jose-jwt

  • Microsoft.ML

  • Microsoft.ML.ImageAnalytics

  • Microsoft.ML.TensorFlow

  • Newtonsoft.Json

We're going to be starting off our neural net with the Inception V1 model and then seeding it with images / labels off disk. Create a folder under the MessagesTensorFlow directory called assets.

In assets, download and unzip the Inception V1 model.

Also, under assets, create two folders called train and predict. Under each of those directories, add a tags.tsv file. Your directory structure should now look something like this:

[Image: Directory structure]

Now go to each file and, in the advanced properties section, set Copy to Output Directory to Copy if newer.

Creating the Learner

Let's now create the class that's going to actually hold our neural network. Create a file called TFEngine.cs.

Imports

Add the following imports to the top of the file:

using Microsoft.ML;
using Microsoft.ML.Data;
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Net;

Class Setup

Then, inside the TFEngine class, let's add some paths so we can access all the files we'll be ingesting into our model, as well as some settings for managing the Inception data.

static readonly string _assetsPath = Path.Combine(Environment.CurrentDirectory, "assets");
static readonly string _imagesFolder = Path.Combine(_assetsPath, "train");
static readonly string _savePath = Path.Combine(_assetsPath, "predict");
static readonly string _trainTagsTsv = Path.Combine(_imagesFolder, "tags.tsv");
static readonly string _inceptionTensorFlowModel = Path.Combine(_assetsPath, "inception5h", "tensorflow_inception_graph.pb");

const int ImageHeight = 224;
const int ImageWidth = 224;
const float Mean = 117;
const bool ChannelsLast = true;

Let's also set this class up as a singleton and allow only one caller to access it at a time. We're also going to add a WebClient for downloading images from their URLs.

static readonly object _lock = new object();

private static WebClient _client = new WebClient();
private static TFEngine _instance;
public static TFEngine Instance
{
    get
    {
        lock (_lock)
        {
            if (_instance == null)
            {
                _instance = new TFEngine();
            }
            return _instance;
        }

    }
}

private TFEngine()
{
    _mlContext = new MLContext();
    GenerateModel();
}
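As an aside, .NET's Lazy<T> can give the same create-once guarantee without an explicit lock. The sketch below is not part of the original project: LazyEngine is a hypothetical stand-in for TFEngine, and the ConstructorCalls counter exists only so the single-instance behavior can be observed.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for TFEngine, used only to illustrate the pattern.
public sealed class LazyEngine
{
    // Counts constructor calls so the single-instance behavior can be checked.
    public static int ConstructorCalls;

    // With ExecutionAndPublication, the factory runs at most once,
    // even when several threads read Instance at the same time.
    private static readonly Lazy<LazyEngine> _lazy =
        new Lazy<LazyEngine>(() => new LazyEngine(), LazyThreadSafetyMode.ExecutionAndPublication);

    public static LazyEngine Instance => _lazy.Value;

    private LazyEngine()
    {
        Interlocked.Increment(ref ConstructorCalls);
        // In TFEngine, this is where the MLContext would be created
        // and GenerateModel() would be called.
    }
}
```

Either approach only protects creation of the instance; calls made on the instance afterwards are not serialized by it.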

We're also going to create some fields to hold our pipeline that will be used to create our model, the model that will be used to perform the prediction, and the MLContext.

private IEstimator<ITransformer> _pipeline;
private ITransformer _model;
private MLContext _mlContext;

Next, add a class called ImageData, which will hold the image data as it passes through the model:

public class ImageData
{
    [LoadColumn(0)]
    public string ImagePath;

    [LoadColumn(1)]
    public string Label;
}

Then create a structure to house the prediction data as it flows out of the model:

public class ImagePrediction : ImageData
{
    public float[] Score;

    public string PredictedLabelValue;
}

Score will be an array containing the probabilities the neural net assigns to each possible label, and PredictedLabelValue will, of course, be the prediction from the network (the label with the highest score).
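To make the relationship between the two fields concrete, here is a small sketch of turning a score array into a readable confidence value. ScoreFormatting and ToConfidence are hypothetical names introduced for illustration; the tutorial's own code simply calls Score.Max() inline.

```csharp
using System.Globalization;
using System.Linq;

// Hypothetical helper, not part of the original project.
public static class ScoreFormatting
{
    // Returns the highest probability in the score array as a percentage string.
    public static string ToConfidence(float[] scores)
    {
        if (scores == null || scores.Length == 0)
        {
            return "unknown";
        }

        float best = scores.Max();

        // "F1" keeps one digit after the decimal point;
        // InvariantCulture fixes the decimal separator to '.'.
        return (best * 100).ToString("F1", CultureInfo.InvariantCulture) + "%";
    }
}
```

For example, ScoreFormatting.ToConfidence(new[] { 0.25f, 0.5f, 0.25f }) returns "50.0%".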

Model Training

Now it's time to train our model!

Add a method called GenerateModel

public void GenerateModel()
{
    // Load each image listed in tags.tsv from the training folder
    _pipeline = _mlContext.Transforms.LoadImages(outputColumnName: "input", imageFolder: _imagesFolder, inputColumnName: nameof(ImageData.ImagePath))
        // Resize the images to a size the Inception model can work with
        .Append(_mlContext.Transforms.ResizeImages(outputColumnName: "input", imageWidth: ImageWidth, imageHeight: ImageHeight, inputColumnName: "input"))
        // Extract the pixels from the images into a numeric vector
        .Append(_mlContext.Transforms.ExtractPixels(outputColumnName: "input", interleavePixelColors: ChannelsLast, offsetImage: Mean))
        // Load the TensorFlow model from the Inception .pb file and score each image against its
        // softmax2_pre_activation layer, a vector of features that might describe an input image
        .Append(_mlContext.Model.LoadTensorFlowModel(_inceptionTensorFlowModel)
            .ScoreTensorFlowModel(outputColumnNames: new[] { "softmax2_pre_activation" }, inputColumnNames: new[] { "input" }, addBatchDimensionInput: true))
        // Map the ImageData's Label to the key column LabelKey
        .Append(_mlContext.Transforms.Conversion.MapValueToKey(outputColumnName: "LabelKey", inputColumnName: "Label"))
        // Create the multiclass classifier that trains on the TensorFlow features
        .Append(_mlContext.MulticlassClassification.Trainers.LbfgsMaximumEntropy(labelColumnName: "LabelKey", featureColumnName: "softmax2_pre_activation"))
        // For the predictor: map the PredictedLabel key back to its string value in PredictedLabelValue
        .Append(_mlContext.Transforms.Conversion.MapKeyToValue("PredictedLabelValue", "PredictedLabel"))
        // Cache the data at this point in the pipeline
        .AppendCacheCheckpoint(_mlContext);

    // Fit the pipeline to the training data - et voila - we have our classifier
    IDataView trainingData = _mlContext.Data.LoadFromTextFile<ImageData>(path: _trainTagsTsv, hasHeader: false);
    _model = _pipeline.Fit(trainingData);
}

This is really the heart of what's going to make our predictor work. The '_pipeline =' section is a chain of commands that will:

  • Load the images off of the disk

  • Resize the images for ingestion

  • Extract and vectorize the pixels in the images

  • Load the inception TensorFlow model (essentially our pre-made neural net)

  • Create a training model and run the training data through it to create a prediction model for us to use

Classifying a Single Image

With our model trained, we can now create a method that takes an image URL, saves the file to disk, classifies the image, and returns a string containing the classifier's guess along with its confidence in that guess.

public string ClassifySingleImage(string imageUrl)
{
    try
    {
        var filename = Path.Combine(_savePath, $"{Guid.NewGuid()}.jpg");
        _client.DownloadFile(imageUrl, filename);
        var imageData = new ImageData()
        {
            ImagePath = filename
        };

        var predictor = _mlContext.Model.CreatePredictionEngine<ImageData, ImagePrediction>(_model);
        var prediction = predictor.Predict(imageData);
        var response = $"I'm about {prediction.Score.Max() * 100}% sure that the image you sent me is a {prediction.PredictedLabelValue}";
        Console.WriteLine($"Image: {Path.GetFileName(imageData.ImagePath)} predicted as: {prediction.PredictedLabelValue} with score: {prediction.Score.Max() * 100} ");
        return response;
    }
    catch (Exception)
    {
        return "Something went wrong when trying to classify image";
    }
}

Adding Training Data

The final operation we're going to ask of the TensorFlow engine is essentially the reverse of prediction: we'll ask it to accept an image URL and a label and update itself to better recognize images with that label. The AddTrainingImage method saves the provided image to disk, appends information about that image to the tags.tsv file, and regenerates the model.

public string AddTrainingImage(string imageUrl, string label)
{
    try
    {
        var id = Guid.NewGuid();
        var fileName = Path.Combine(_imagesFolder, $"{id}.jpg");
        _client.DownloadFile(imageUrl, fileName);
        File.AppendAllText(_trainTagsTsv, $"{id}.jpg\t{label}" + Environment.NewLine);
        IDataView trainingData = _mlContext.Data.LoadFromTextFile<ImageData>(path: _trainTagsTsv, hasHeader: false);
        _model = _pipeline.Fit(trainingData);
        return $"I have trained myself to recognize the image you sent me as a {label}. Your teaching is appreciated";
    }
    catch (Exception)
    {
        return "something went wrong when trying to train on image";
    }
}

Using the Messages API to Drive Classification and Training

Messages Objects

Next, we're going to add some POCOs to hold our messaging data as it comes in from and goes out to the Messages API. These objects are fairly verbose and don't do anything particularly interesting aside from allowing the serialization and deserialization of JSON, so for the sake of brevity, feel free to simply use the following structures:
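The original structures are not reproduced in this text, so what follows is only a minimal sketch inferred from how the later code uses these objects (MessageRequest in SendMessage, InboundMessage in the inbound controller). The property names match that usage; the nested type names Party, MessageBody, and Image are assumptions, and StatusMessage is left out because none of its fields appear in the tutorial code.

```csharp
// Outbound request, shaped after the object initializer used in SendMessage.
public class MessageRequest
{
    public To to { get; set; }
    public From from { get; set; }
    public Message message { get; set; }

    public class To { public string id { get; set; } public string type { get; set; } }
    public class From { public string id { get; set; } public string type { get; set; } }

    public class Message
    {
        public Content content { get; set; }
        public Messenger messenger { get; set; }

        public class Content { public string type { get; set; } public string text { get; set; } }
        public class Messenger { public string category { get; set; } }
    }
}

// Inbound webhook body, covering only the fields the inbound controller reads.
public class InboundMessage
{
    public Party to { get; set; }
    public Party from { get; set; }
    public MessageBody message { get; set; }

    // Party, MessageBody, and Image are hypothetical type names.
    public class Party { public string id { get; set; } public string type { get; set; } }
    public class MessageBody { public Content content { get; set; } }
    public class Content
    {
        public string type { get; set; }
        public string text { get; set; }
        public Image image { get; set; }
    }
    public class Image { public string url { get; set; } }
}
```

The lowercase property names mirror the JSON field names the tutorial code relies on, which keeps serialization working without extra attributes.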

Interacting With the API

Creating these structures frees us up to manage the data we are getting from and sending out to the Messages API. However, we need one more step before we can actually use the API: we'll need to generate a JWT to authenticate our application with the Messages API. To this end, let's create the following files:

  • TokenGenerator.cs

  • MessageSender.cs

Generate JWT

TokenGenerator is going to have one static method, GenerateToken, which will accept a list of Claims and the privateKey for your application:

using Org.BouncyCastle.Crypto.Parameters;
using Org.BouncyCastle.OpenSsl;
using Org.BouncyCastle.Security;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Security.Claims;
using System.Security.Cryptography;

namespace MessagesTensorFlow
{
    public class TokenGenerator
    {
        public static string GenerateToken(List<Claim> claims, string privateKey)
        {
            RSAParameters rsaParams;
            using (var tr = new StringReader(privateKey))
            {
                var pemReader = new PemReader(tr);
                var kp = pemReader.ReadObject();
                var privateRsaParams = kp as RsaPrivateCrtKeyParameters;
                rsaParams = DotNetUtilities.ToRSAParameters(privateRsaParams);
            }
            using (RSACryptoServiceProvider rsa = new RSACryptoServiceProvider())
            {
                rsa.ImportParameters(rsaParams);
                Dictionary<string, object> payload = claims.ToDictionary(k => k.Type, v => (object)v.Value);
                return Jose.JWT.Encode(payload, rsa, Jose.JwsAlgorithm.RS256);
            }
        }
    }
}

This will generate a JWT for your usage with the Messages API.
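When debugging authentication problems, it can help to look at the claims inside a generated token. The sketch below decodes the payload segment of a JWT without verifying its signature; JwtDebug and DecodePayload are hypothetical names, not part of the original project, and this is for inspection only.

```csharp
using System;
using System.Text;

// Hypothetical debugging helper, not part of the original project.
public static class JwtDebug
{
    // Returns the JSON payload of a JWT. The signature is NOT verified.
    public static string DecodePayload(string jwt)
    {
        var parts = jwt.Split('.');
        if (parts.Length != 3)
        {
            throw new ArgumentException("Expected header.payload.signature", nameof(jwt));
        }

        // JWT segments are Base64Url: swap the URL-safe characters back
        // and restore the padding that Base64Url leaves off.
        string s = parts[1].Replace('-', '+').Replace('_', '/');
        switch (s.Length % 4)
        {
            case 2: s += "=="; break;
            case 3: s += "="; break;
        }

        return Encoding.UTF8.GetString(Convert.FromBase64String(s));
    }
}
```

Calling JwtDebug.DecodePayload on a token from GenerateToken should show the claims that were passed in, such as application_id, iat, exp, and jti.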

Generate a Claims List for JWT

In the MessageSender.cs we'll have a method to generate the claims for the JWT from your appId:

private static List<Claim> GetClaimsList(string appId)
{
    const int SECONDS_EXPIRY = 3600;
    var t = DateTime.UtcNow - new DateTime(1970, 1, 1);
    var iat = new Claim("iat", ((Int32)t.TotalSeconds).ToString(), ClaimValueTypes.Integer32); // Unix Timestamp for right now
    var application_id = new Claim("application_id", appId); // Current app ID
    var exp = new Claim("exp", ((Int32)(t.TotalSeconds + SECONDS_EXPIRY)).ToString(), ClaimValueTypes.Integer32); // Unix timestamp for when the token expires
    var jti = new Claim("jti", Guid.NewGuid().ToString()); // Unique Token ID
    var claims = new List<Claim>() { iat, application_id, exp, jti };

    return claims;
}
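The iat and exp claims are plain Unix timestamps in seconds. As a sketch of an alternative to the manual subtraction from 1970, DateTimeOffset can produce the same values directly; JwtTimes and For are hypothetical names introduced here for illustration.

```csharp
using System;

// Hypothetical helper, not part of the original project.
public static class JwtTimes
{
    // Returns the "iat" (issued at) and "exp" (expiry) values as Unix timestamps in seconds.
    public static (long iat, long exp) For(DateTimeOffset now, int secondsValid)
    {
        long iat = now.ToUnixTimeSeconds();
        return (iat, iat + secondsValid);
    }
}
```

A call such as JwtTimes.For(DateTimeOffset.UtcNow, 3600) gives a token lifetime of one hour, matching SECONDS_EXPIRY in the claims method. Using long rather than Int32 also avoids the 32-bit limit that a Unix timestamp reaches in January 2038.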

Read App Settings and Create JWT

Then we'll have another method that reads the relevant items out of your configuration (which the controller will hand us through Dependency Injection), retrieves the claims list, and builds the JWT:

private static string BuildJwt(IConfiguration config)
{
    var appId = config["Authentication:appId"];
    var privateKeyPath = config["Authentication:privateKey"];
    string privateKey = "";
    using (var reader = File.OpenText(privateKeyPath)) // file containing RSA PKCS1 private key
        privateKey = reader.ReadToEnd();

    var jwt = TokenGenerator.GenerateToken(GetClaimsList(appId), privateKey);
    return jwt;
}

This will, of course, require a couple of items in your appsettings.json file. Add the following object to it and fill in the appropriate values:

"Authentication": {
    "appId": "app_id",
    "privateKey": "path_to_key_file"
  }

Send a Message

Now we're going to tie this all together with our SendMessage method, which will take our message, fromId, toId, and config. This method will generate a JWT and send a request to the Messages API to deliver a message containing the feedback from our classifier to our end user.

public static void SendMessage(string message, string fromId, string toId, IConfiguration config)
{
    const string MESSAGING_URL = @"https://api.nexmo.com/v0.1/messages";
    try
    {
        var jwt = BuildJwt(config);

        var requestObject = new MessageRequest()
        {
            to = new MessageRequest.To()
            {
                id = toId,
                type = "messenger"
            },
            from = new MessageRequest.From()
            {
                id = fromId,
                type = "messenger"
            },
            message = new MessageRequest.Message()
            {
                content = new MessageRequest.Message.Content()
                {
                    type = "text",
                    text = message
                },
                messenger = new MessageRequest.Message.Messenger()
                {
                    category = "RESPONSE"
                }
            }
        };
        var requestPayload = JsonConvert.SerializeObject(requestObject, new JsonSerializerSettings() { NullValueHandling = NullValueHandling.Ignore, DefaultValueHandling = DefaultValueHandling.Ignore });
        var httpWebRequest = (HttpWebRequest)WebRequest.Create(MESSAGING_URL);
        httpWebRequest.ContentType = "application/json";
        httpWebRequest.Accept = "application/json";
        httpWebRequest.Method = "POST";
        httpWebRequest.PreAuthenticate = true;
        httpWebRequest.Headers.Add("Authorization", "Bearer " + jwt);
        using (var streamWriter = new StreamWriter(httpWebRequest.GetRequestStream()))
        {
            streamWriter.Write(requestPayload);
        }
        using (var httpResponse = (HttpWebResponse)httpWebRequest.GetResponse())
        {
            using (var streamReader = new StreamReader(httpResponse.GetResponseStream()))
            {
                var result = streamReader.ReadToEnd();
                Console.WriteLine(result);
                Console.WriteLine("Message Sent");
            }
        }
    }
    catch (Exception e)
    {
        Debug.WriteLine(e.ToString());
    }
}
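For comparison, the same POST request can be described with HttpClient types instead of HttpWebRequest. This is only a sketch under that substitution, not the tutorial's method: MessageRequestBuilder and Build are hypothetical names, and the function only constructs the request so that it can be inspected before anything is sent.

```csharp
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;

// Hypothetical helper, not part of the original project.
public static class MessageRequestBuilder
{
    // Builds (but does not send) a JSON POST request with a Bearer token.
    public static HttpRequestMessage Build(string url, string jwt, string jsonPayload)
    {
        var request = new HttpRequestMessage(HttpMethod.Post, url);

        // Equivalent of the "Authorization: Bearer <jwt>" header added in SendMessage.
        request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", jwt);
        request.Headers.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));

        // StringContent sets the Content-Type header from its third argument.
        request.Content = new StringContent(jsonPayload, Encoding.UTF8, "application/json");
        return request;
    }
}
```

The resulting request would then be passed to SendAsync on a shared HttpClient instance, with the JSON produced by JsonConvert.SerializeObject as the payload.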

Classification Handler

We're going to want the handling of inbound webhooks to be asynchronous and respond immediately, so we're going to create a ClassificationHandler.cs file to actually handle the classify / reply operations. This file will contain a couple of small structures to allow us to unpack, classify or train, and reply to inbound messages.

In ClassificationHandler.cs add the following code:

public static void ClassifyAndRespond(object state)
{
    var request = state as ClassifyRequest;
    var response = TFEngine.Instance.ClassifySingleImage(request.imageUrl);
    MessageSender.SendMessage(response, request.toId, request.fromid, request.Configuration);
}

public static void AddTrainingData(object state)
{
    var request = state as TrainRequest;
    var response = TFEngine.Instance.AddTrainingImage(request.imageUrl, request.Label);
    MessageSender.SendMessage(response, request.toId, request.fromid, request.Configuration);
}
public class TrainRequest : Request
{
    public string Label { get; set; }
}
public class ClassifyRequest : Request{}
public abstract class Request
{
    public string imageUrl { get; set; }
    public string toId { get; set; }
    public string fromid { get; set; }

    public IConfiguration Configuration { get; set; }
}

Handle Incoming Messages Webhooks

From our code's perspective the final thing we're going to need to do is to create a couple of controllers to handle the incoming messages and status from the Messages API.

In the Controllers folder, add two "API Controller - Empty" items called InboundController and StatusController.

Status Controller

The Status controller receives status updates for our application's messages as they flow through the API. To keep track of what's going on, let's add a POST method to the Status controller that writes the status contents out to the debug console:

[HttpPost]
public HttpStatusCode Post([FromBody]StatusMessage message)
{
    Debug.WriteLine(JsonConvert.SerializeObject(message));
    return HttpStatusCode.NoContent;
}

Inbound Controller

The Inbound Controller is going to be managing the Inbound Messages from our webhook.

Class Setup

Let's first set it up by creating a dictionary for the pending training labels, a Configuration object for the controller to access the configuration, and by Dependency Injecting the Configuration into the Inbound Controller constructor:

public static Dictionary<string, string> _pendingTrainLabels = new Dictionary<string, string>();
public IConfiguration Configuration { get; set; }
public InboundController(IConfiguration configuration)
{
    Configuration = configuration;
}

Handling Inbound Messages

Next, we'll write the actual InboundMessage handler. This handler will respond to POST requests. It will check to see if there's any text in the message. If there is, it will see if the first word in that message is 'train'. If so, it will save the rest of the message as a training label, and the next time that user sends a message with a picture, the classifier will be trained with that image and label.

On any other image message it will simply classify the image and send the output of the classification back to the message sender.

In both cases, it starts a WorkItem in the ThreadPool, passing in one of those handy ClassificationHandler request objects we created earlier. This unblocks the controller to send a status back to the Messages API (in this case, a 204 to inform it that the message was received).

[HttpPost]
public HttpStatusCode Post([FromBody]InboundMessage message)
{
    const string TRAIN = "train";
    try
    {
        Debug.WriteLine(JsonConvert.SerializeObject(message));
        if (!string.IsNullOrEmpty(message.message?.content?.text))
        {
            var split = message.message.content.text.Split(new[] { ' ' }, 2);
            if (split.Length > 1)
            {
                if (split[0].ToLower() == TRAIN)
                {
                    var label = split[1];
                    var requestor = message.from.id;
                    if (!_pendingTrainLabels.ContainsKey(requestor))
                    {
                        _pendingTrainLabels.Add(requestor, label);
                    }
                    else
                    {
                        _pendingTrainLabels[requestor] = label;
                    }
                }
            }
        }
        if (_pendingTrainLabels.ContainsKey(message.from.id) && message.message.content?.image?.url != null)
        {
            ThreadPool.QueueUserWorkItem(ClassificationHandler.AddTrainingData, new ClassificationHandler.TrainRequest()
            {
                toId = message.to.id,
                fromid = message.from.id,
                imageUrl = message.message.content.image.url,
                Label = _pendingTrainLabels[message.from.id],
                Configuration = Configuration
            });
            _pendingTrainLabels.Remove(message.from.id);
        }
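        else if (message.message.content?.image?.url != null) // only classify when an image is actually attached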
        {
            ThreadPool.QueueUserWorkItem(ClassificationHandler.ClassifyAndRespond,
            new ClassificationHandler.ClassifyRequest()
            {
                toId = message.to.id,
                fromid = message.from.id,
                imageUrl = message.message.content.image.url,
                Configuration = Configuration
            });
        }

        return HttpStatusCode.NoContent;
    }
    catch (Exception ex)
    {
        // Log the failure, but still acknowledge the webhook with a 204
        Debug.WriteLine(ex.ToString());
        return HttpStatusCode.NoContent;
    }
}
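The 'train <label>' check in the handler above can also be isolated into a small function, which makes it easier to test on its own. The sketch below is an assumption-laden variant rather than the handler's exact logic: TrainCommand and TryParse are hypothetical names, and unlike the handler it trims surrounding whitespace and rejects an empty label.

```csharp
using System;

// Hypothetical helper, not part of the original project.
public static class TrainCommand
{
    // Returns true and sets label when text looks like "train <label>".
    public static bool TryParse(string text, out string label)
    {
        label = null;
        if (string.IsNullOrWhiteSpace(text))
        {
            return false;
        }

        // Split into at most two parts: the command word and the rest of the message.
        var split = text.Trim().Split(new[] { ' ' }, 2);
        if (split.Length < 2 || !split[0].Equals("train", StringComparison.OrdinalIgnoreCase))
        {
            return false;
        }

        label = split[1].Trim();
        if (label.Length == 0)
        {
            label = null;
            return false;
        }
        return true;
    }
}
```

With this helper, "train whippet" yields the label "whippet", "Train border collie" yields "border collie", and a plain "hello" is not treated as a training command.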

Seeding With a Little Data

You can add whatever images and tags you want to get yourself started. For the sake of simplicity I'm only going to start with one image - an image of my dog (aptly named Zero).

[Image: Training data]

I'm going to put that image in the assets/train directory.

Now, since Zero is a whippet, I'm going to open the tags.tsv file in the assets/train folder and add the file name 'zero.jpg', followed by a tab, followed by the label 'whippet', followed by a new line:

zero.jpg    whippet
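Because the file is tab-separated, a line that uses spaces instead of a tab will not be split into a file name and a label. As a rough sanity check, the sketch below reports which lines do not have exactly two tab-separated fields; TagsFile and FindMalformedLines are hypothetical names, not part of the original project.

```csharp
using System.Collections.Generic;

// Hypothetical helper, not part of the original project.
public static class TagsFile
{
    // Returns the 1-based numbers of lines that are not "fileName<TAB>label".
    public static List<int> FindMalformedLines(IEnumerable<string> lines)
    {
        var bad = new List<int>();
        int lineNumber = 0;
        foreach (var line in lines)
        {
            lineNumber++;
            if (string.IsNullOrWhiteSpace(line))
            {
                continue; // blank lines are skipped in this sketch
            }

            var parts = line.Split('\t');
            if (parts.Length != 2 || parts[0].Trim().Length == 0 || parts[1].Trim().Length == 0)
            {
                bad.Add(lineNumber);
            }
        }
        return bad;
    }
}
```

It could be called as TagsFile.FindMalformedLines(File.ReadLines(pathToTagsTsv)), where pathToTagsTsv is whatever path your tags.tsv lives at; an empty result means every non-blank line has the expected shape.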

Testing

With this done, all that's left to do is fire it up, expose it to the internet, and test it out. I use ngrok and IIS Express to test it.

IIS Express Config

First, go into the Debug tab of the project properties and look for the App URL, specifically which port it's going to be using. I uncheck the Enable SSL box for testing.

[Image: Debug]

Then launch the site from Visual Studio using IIS Express. You'll see the port in the address bar of the browser that pops up. In my sample, I cleaned out all the weather controller code that comes out of the box, so I get a 404 when I fire it up. That's fine, as this is really only acting as a web service that listens for and replies to webhooks; there aren't any GET requests that return a page to your web browser.

Using Ngrok to Expose the Port to the Internet

For the Messages API to forward the messages, we'll need to expose the site to the internet. For testing purposes, we'll use ngrok to expose our IIS Express port. Open up your command line and run this command, replacing PORT with your port number:

ngrok http --host-header="localhost:PORT" http://localhost:PORT

This command produces an output like this:

[Image: Command Output]

Configuring the Webhooks

Using the http link we just got from ngrok, you can create the URL that the webhook will call back on. The Route of the controllers we just made shows what the route will look like:

[Image: Route]

It's going to work out to be http://dc0feb1d.ngrok.io/api/Status for status messages and http://dc0feb1d.ngrok.io/api/Inbound for inbound messages.

NOTE: The first part of the URL (dc0feb1d) will change whenever you restart ngrok on the free tier.

We'll use those callback URLs to register our webhooks with Nexmo.

Go to ${CUSTOMER_DASHBOARD_URL} and log in to your Nexmo account.

Go to Messages and Dispatch -> Your applications and select the edit button for your application.

On the edit screen, change the Status URL and Inbound URL fields to the values noted above and click the blue save button in the lower right-hand corner.

And that's it. Now you have a classifier / learner that you can feed images to over Messenger.

Steve Lorello, Vonage Alumni

Former .NET Developer Advocate @Vonage, full-stack polyglottic Software Engineer, AI/ML Grad Student, avid runner, and an even more avid traveler.
