TensorFlow Image Recognition Using Python & C++


1. Objective

This TensorFlow tutorial covers image recognition with TensorFlow. We will classify an image with the Inception-v3 model and look at how to run that model through both the Python API and the C++ API.

Telling images apart is hard for a computer because it does not process visual information the way our brains do. Machine learning has made tremendous progress on this problem, and TensorFlow image recognition is one of the available solutions. In particular, convolutional neural networks and deep learning have achieved strong results on visual recognition tasks.

So, let’s study TensorFlow Image Recognition in detail.

2. TensorFlow Image Recognition 

Researchers have demonstrated steady progress in computer vision by validating their work against ImageNet, an academic benchmark for computer vision. Successive models for image recognition include QuocNet, AlexNet, Inception (GoogLeNet), and BN-Inception-v2. The next step in that line is Inception-v3, whose code has been released for TensorFlow. Inception-v3 is trained for the ImageNet Large Visual Recognition Challenge using the data from 2012.

The figure below shows example classification results from AlexNet.

[Figure: AlexNet classification results]

To compare models, we measure how often a model fails to include the correct category among its five highest-scoring guesses. This is called the “top-5 error rate”. In this TensorFlow tutorial, you will learn how to use the Inception-v3 model and how it can be reused for other visual tasks.
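
To make the metric concrete, here is a minimal sketch in plain Python of how a top-5 error rate could be computed. The function name top_k_error_rate and the toy scores are hypothetical, made up for illustration; they are not taken from TensorFlow or ImageNet.

```python
def top_k_error_rate(score_rows, true_labels, k=5):
    """Fraction of examples whose true label is not among the k highest scores.

    score_rows and true_labels are hypothetical inputs for this sketch:
    one list of per-class scores per image, and one correct class index per image.
    """
    misses = 0
    for scores, truth in zip(score_rows, true_labels):
        # Class indices sorted from highest to lowest score; keep the first k.
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        if truth not in ranked[:k]:
            misses += 1
    return misses / len(true_labels)


# Toy data: 2 images, 6 classes each.
toy_scores = [
    [0.50, 0.20, 0.10, 0.10, 0.06, 0.04],  # class 5 has the lowest score
    [0.05, 0.05, 0.10, 0.20, 0.25, 0.35],  # class 5 has the highest score
]
toy_labels = [5, 5]
error = top_k_error_rate(toy_scores, toy_labels, k=5)
```

In this toy case the first image counts as a miss, because class 5 is ranked sixth, and the second counts as a hit, so the error rate is one out of two.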

3. TensorFlow Image Recognition Using Python API

The classify_image.py script downloads the trained model from tensorflow.org the first time it runs, so you will need about 200 MB of free hard disk space.

Start by cloning the TensorFlow models repo from GitHub. Then run the following commands from the directory that contains the clone:

cd models/tutorials/image/imagenet
python classify_image.py

The script classifies a default image of a panda that is supplied with the model.

[Figure: the default panda image classified through the Python API]

If the model runs correctly, the script prints output like the following:

giant panda, panda, panda bear, coon bear, Ailuropoda melanoleuca (score = 0.88493)
indri, indris, Indri indri, Indri brevicaudatus (score = 0.00878)
lesser panda, red panda, panda, bear cat, cat bear, Ailurus fulgens (score = 0.00317)
custard apple (score = 0.00149)
earthstar (score = 0.00127)

4. TensorFlow Image Recognition Using C++ API

You can run the same Inception-v3 model through the C++ API. First download an archive containing the GraphDef that defines the model by running the following from the root directory of the TensorFlow repository:

curl -L "https://storage.googleapis.com/download.tensorflow.org/models/inception_v3_2016_08_28_frozen.pb.tar.gz" |
  tar -C tensorflow/examples/label_image/data -xz

Next, compile and run the C++ binary. The first command builds the label_image example, and the second runs the resulting executable:

bazel build tensorflow/examples/label_image/...
bazel-bin/tensorflow/examples/label_image/label_image

With the default image, the output looks like this:

I tensorflow/examples/label_image/main.cc:206] military uniform (653): 0.834306
I tensorflow/examples/label_image/main.cc:206] mortarboard (668): 0.0218692
I tensorflow/examples/label_image/main.cc:206] academic gown (401): 0.0103579
I tensorflow/examples/label_image/main.cc:206] pickelhaube (716): 0.00800814
I tensorflow/examples/label_image/main.cc:206] bulletproof vest (466): 0.00535088

Here the binary uses the default image of Admiral Grace Hopper, and the network correctly identifies that she is wearing a military uniform, with a high score of about 0.8.

[Figure: Admiral Hopper, the default image for the C++ example]

If you look inside the tensorflow/examples/label_image/main.cc file, you can see how this works. We will walk through the main functions step by step:

The Inception-v3 model expects square images of 299x299 pixels, which are set through the input_height and input_width flags. The pixel values also have to be converted from integers in the range 0-255 to the floating point values that the graph operates on. The input_mean and input_std flags control this scaling: input_mean is first subtracted from each pixel value, and the result is then divided by input_std.

These values are chosen to match the preprocessing that was used when the model was trained. If you run a graph you trained yourself, adjust them to match whatever you used during your training process.
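
The scaling rule can be sketched in a few lines of plain Python. This is only an illustration of the arithmetic: normalize_pixels is a hypothetical helper, and the mean and standard deviation of 128 are assumed values for the example, not settings read from main.cc.

```python
def normalize_pixels(pixels, input_mean, input_std):
    """Convert integer pixel values to floats using (value - input_mean) / input_std."""
    return [(float(p) - input_mean) / input_std for p in pixels]


# Assumed example values: a mean and std of 128 map 0..255 to roughly -1..1.
normalized = normalize_pixels([0, 128, 255], input_mean=128.0, input_std=128.0)
```

With these assumed values, the darkest pixel maps to -1.0, a mid-grey pixel to 0.0, and the brightest pixel to just under 1.0.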

You can see how they are applied to an image in ReadTensorFromImageFile().

// Given an image file name, read in the data, try to decode it as an image,
// resize it to the requested size, and then scale the values as desired.
Status ReadTensorFromImageFile(string file_name, const int input_height,
                               const int input_width, const float input_mean,
                               const float input_std,
                               std::vector<Tensor>* out_tensors) {
  tensorflow::GraphDefBuilder b;

We start by creating a GraphDefBuilder, an object we can use to specify a model to run or load.

  string input_name = "file_reader";
  string output_name = "normalized";
  tensorflow::Node* file_reader =
      tensorflow::ops::ReadFile(tensorflow::ops::Const(file_name, b.opts()),
                                b.opts().WithName(input_name));

We then create nodes for the small model we want to run, which loads, resizes, and scales the pixel values to produce the input the main model expects. The first node is a Const op that holds a tensor with the file name of the image. It is passed as the first input to the ReadFile op. The b.opts() argument ensures that the node is added to the model definition held in the GraphDefBuilder. We also name the ReadFile operator through the WithName() call on b.opts(). Naming the node is not strictly necessary, since an automatic name is assigned otherwise, but it makes debugging a bit easier.

  // Now try to figure out what kind of file it is and decode it.
  const int wanted_channels = 3;
  tensorflow::Node* image_reader;
  if (tensorflow::StringPiece(file_name).ends_with(".png")) {
    image_reader = tensorflow::ops::DecodePng(
        file_reader,
        b.opts().WithAttr("channels", wanted_channels).WithName("png_reader"));
  } else {
    // Assume if it's not a PNG then it must be a JPEG.
    image_reader = tensorflow::ops::DecodeJpeg(
        file_reader,
        b.opts().WithAttr("channels", wanted_channels).WithName("jpeg_reader"));
  }
  // Now cast the image data to float so we can do normal math on it.
  tensorflow::Node* float_caster = tensorflow::ops::Cast(
      image_reader, tensorflow::DT_FLOAT, b.opts().WithName("float_caster"));
  // The convention for image ops in TensorFlow is that all images are expected
  // to be in batches, so that they're four-dimensional arrays with indices of
  // [batch, height, width, channel]. Because we only have a single image, we
  // have to add a batch dimension of 1 to the start with ExpandDims().
  tensorflow::Node* dims_expander = tensorflow::ops::ExpandDims(
      float_caster, tensorflow::ops::Const(0, b.opts()), b.opts());
  // Bilinearly resize the image to fit the required dimensions.
  tensorflow::Node* resized = tensorflow::ops::ResizeBilinear(
      dims_expander, tensorflow::ops::Const({input_height, input_width},
                                            b.opts().WithName("size")),
      b.opts());
  // Subtract the mean and divide by the scale.
  tensorflow::ops::Div(
      tensorflow::ops::Sub(
          resized, tensorflow::ops::Const({input_mean}, b.opts()), b.opts()),
      tensorflow::ops::Const({input_std}, b.opts()),
      b.opts().WithName(output_name));

The code above keeps adding nodes to the graph: one decodes the file data as an image, one casts the integers to floating point values, one adds the batch dimension, and one resizes the image. Finally, the subtraction and division ops normalize the pixel values.
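
To give a feel for what a bilinear resize does, here is a small sketch in plain Python that works on a 2-D grid of floats. It is illustrative only: resize_bilinear is a hypothetical helper, and it aligns the corner pixels of input and output, which may differ from the sampling convention that TensorFlow's ResizeBilinear op uses by default.

```python
def resize_bilinear(grid, out_h, out_w):
    """Resize a 2-D grid of floats by bilinear interpolation (corners aligned)."""
    in_h, in_w = len(grid), len(grid[0])
    result = []
    for y in range(out_h):
        # Fractional source row that this output row maps back to.
        src_y = y * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(src_y)
        y1 = min(y0 + 1, in_h - 1)
        fy = src_y - y0
        row = []
        for x in range(out_w):
            src_x = x * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(src_x)
            x1 = min(x0 + 1, in_w - 1)
            fx = src_x - x0
            # Blend horizontally on the two neighbouring rows, then vertically.
            top = grid[y0][x0] * (1 - fx) + grid[y0][x1] * fx
            bottom = grid[y1][x0] * (1 - fx) + grid[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        result.append(row)
    return result


small = [[0.0, 2.0],
         [4.0, 6.0]]
resized = resize_bilinear(small, 3, 3)
```

Upscaling the 2x2 grid to 3x3 keeps the four corner values and fills the new cells with averages of their neighbours.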

  // This runs the GraphDef network definition that we've just constructed, and
  // returns the results in the output tensor.
  tensorflow::GraphDef graph;
  TF_RETURN_IF_ERROR(b.ToGraphDef(&graph));

At this point the model definition is stored in the b variable, and we turn it into a full graph definition with ToGraphDef().

  std::unique_ptr<tensorflow::Session> session(
      tensorflow::NewSession(tensorflow::SessionOptions()));
  TF_RETURN_IF_ERROR(session->Create(graph));
  TF_RETURN_IF_ERROR(session->Run({}, {output_name}, {}, out_tensors));
  return Status::OK();
}

Then we create a tensorflow::Session object, which is the interface for actually running the graph. We run it by specifying the node we want the output from and where to put the output data.
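
The build-first, run-later pattern can be sketched with a toy graph in plain Python. TinyGraph and its add() and run() methods are hypothetical names invented for this illustration; they are not part of the TensorFlow API and only mimic the idea of naming nodes and then asking for one output by name.

```python
class TinyGraph:
    """Toy deferred-execution graph; hypothetical, not part of TensorFlow."""

    def __init__(self):
        self.nodes = {}

    def add(self, name, fn, *input_names):
        # Record the node definition; nothing is computed yet.
        self.nodes[name] = (fn, input_names)
        return name

    def run(self, output_name):
        # Evaluate only the requested node and the nodes it depends on.
        fn, input_names = self.nodes[output_name]
        return fn(*(self.run(n) for n in input_names))


graph = TinyGraph()
graph.add("pixels", lambda: [0.0, 128.0, 255.0])
graph.add("centered", lambda px: [p - 128.0 for p in px], "pixels")
graph.add("normalized", lambda px: [p / 128.0 for p in px], "centered")
result = graph.run("normalized")
```

Nothing is computed while the nodes are added. The work happens only when run() is asked for the "normalized" node, which is roughly the role the session plays in the C++ code.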

Running this graph gives us a vector of Tensor objects, which in this case holds only a single object. You can think of a Tensor here as a multi-dimensional array: it holds a 299-pixel-high, 299-pixel-wide, 3-channel image as float values. If you already have your own image-processing framework, you should be able to use that instead, as long as you apply the same transformations before feeding images into the main graph.
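
As a rough picture of that layout, the following plain Python sketch builds a nested list with the same [batch, height, width, channel] shape. The helpers shape_of and make_image are hypothetical and exist only for this illustration; a real Tensor stores its data far more compactly.

```python
def shape_of(nested):
    """Return the shape of a regular nested list, outermost dimension first."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return dims


def make_image(height, width, channels=3, fill=0.0):
    """Build a height x width x channels image filled with one float value."""
    return [[[fill] * channels for _ in range(width)] for _ in range(height)]


image = make_image(299, 299)  # [height, width, channel]
batch = [image]               # wrap in a batch of one, like ExpandDims at index 0
```

Wrapping the single image in an outer list corresponds to the batch dimension of 1 that the C++ code adds with ExpandDims().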

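This was a simple example of creating a small TensorFlow graph dynamically in C++. For the pre-trained Inception model, however, we want to load a much larger definition from a file. You can see how we do that in LoadGraph().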

// Reads a model graph definition from disk, and creates a session object you
// can use to run it.
Status LoadGraph(string graph_file_name,
                 std::unique_ptr<tensorflow::Session>* session) {
  tensorflow::GraphDef graph_def;
  Status load_graph_status =
      ReadBinaryProto(tensorflow::Env::Default(), graph_file_name, &graph_def);
  if (!load_graph_status.ok()) {
    return tensorflow::errors::NotFound("Failed to load compute graph at '",
                                        graph_file_name, "'");
  }

If you have followed the image loading code, many of the terms should look familiar. Rather than using a GraphDefBuilder to produce a GraphDef object, we load a protobuf file that directly contains the GraphDef.

  session->reset(tensorflow::NewSession(tensorflow::SessionOptions()));
  Status session_create_status = (*session)->Create(graph_def);
  if (!session_create_status.ok()) {
    return session_create_status;
  }
  return Status::OK();
}

Then we create a Session object from that GraphDef and pass it back to the caller so that it can be run later.

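GetTopLabels() works much like the image loader, except that here we take the results of running the main graph and turn them into a sorted list of the highest-scoring labels. Just like the image loader, it creates a GraphDefBuilder, adds a couple of nodes to it, and runs the short graph to get a pair of output tensors: the sorted scores and the index positions of the highest results.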

// Analyzes the output of the Inception graph to retrieve the highest scores and
// their positions in the tensor, which correspond to categories.
Status GetTopLabels(const std::vector<Tensor>& outputs, int how_many_labels,
                    Tensor* indices, Tensor* scores) {
  tensorflow::GraphDefBuilder b;
  string output_name = "top_k";
  tensorflow::ops::TopK(tensorflow::ops::Const(outputs[0], b.opts()),
                        how_many_labels, b.opts().WithName(output_name));
  // This runs the GraphDef network definition that we've just constructed, and
  // returns the results in the output tensors.
  tensorflow::GraphDef graph;
  TF_RETURN_IF_ERROR(b.ToGraphDef(&graph));
  std::unique_ptr<tensorflow::Session> session(
      tensorflow::NewSession(tensorflow::SessionOptions()));
  TF_RETURN_IF_ERROR(session->Create(graph));
  // The TopK node returns two outputs, the scores and their original indices,
  // so we have to append :0 and :1 to specify them both.
  std::vector<Tensor> out_tensors;
  TF_RETURN_IF_ERROR(session->Run({}, {output_name + ":0", output_name + ":1"},
                                  {}, &out_tensors));
  *scores = out_tensors[0];
  *indices = out_tensors[1];
  return Status::OK();
}
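
The idea behind the TopK node can be sketched in plain Python with the standard heapq module. The helper top_k and the score values are hypothetical examples; the sketch only mirrors the two outputs described above, the sorted scores and their original indices.

```python
import heapq


def top_k(scores, k):
    """Return (top_scores, top_indices) for the k highest scores, best first."""
    top_indices = heapq.nlargest(k, range(len(scores)), key=lambda i: scores[i])
    top_scores = [scores[i] for i in top_indices]
    return top_scores, top_indices


# Made-up scores for five categories.
scores = [0.01, 0.83, 0.02, 0.10, 0.04]
top_scores, top_indices = top_k(scores, 2)
```

Here the two best categories are index 1 and index 3, returned together with their scores in descending order.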

PrintTopLabels() takes those sorted results and prints them in a readable form.
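
A minimal Python sketch of that last step could look as follows. The function format_top_labels and the three-entry label list are hypothetical; the line format is only modelled on the log output shown earlier, with the label, its index in parentheses, and the score.

```python
def format_top_labels(labels, top_indices, top_scores):
    """Build one 'label (index): score' line per result."""
    return [
        f"{labels[i]} ({i}): {score:g}"
        for i, score in zip(top_indices, top_scores)
    ]


# Made-up label list and results for illustration.
labels = ["cat", "dog", "panda"]
lines = format_top_labels(labels, [2, 0], [0.9, 0.05])
for line in lines:
    print(line)
```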

Finally, main() ties all of these calls together.

int main(int argc, char* argv[]) {
  // We need to call this to set up global state for TensorFlow.
  tensorflow::port::InitMain(argv[0], &argc, &argv);
  Status s = tensorflow::ParseCommandLineFlags(&argc, argv);
  if (!s.ok()) {
    LOG(ERROR) << "Error parsing command line flags: " << s.ToString();
    return -1;
  }

  // First we load and initialize the model.
  std::unique_ptr<tensorflow::Session> session;
  string graph_path = tensorflow::io::JoinPath(FLAGS_root_dir, FLAGS_graph);
  Status load_graph_status = LoadGraph(graph_path, &session);
  if (!load_graph_status.ok()) {
    LOG(ERROR) << load_graph_status;
    return -1;
  }

This first part of main() loads the main graph.

  // Get the image from disk as a float array of numbers, resized and normalized
  // to the specifications the main graph expects.
  std::vector<Tensor> resized_tensors;
  string image_path = tensorflow::io::JoinPath(FLAGS_root_dir, FLAGS_image);
  Status read_tensor_status = ReadTensorFromImageFile(
      image_path, FLAGS_input_height, FLAGS_input_width, FLAGS_input_mean,
      FLAGS_input_std, &resized_tensors);
  if (!read_tensor_status.ok()) {
    LOG(ERROR) << read_tensor_status;
    return -1;
  }
  const Tensor& resized_tensor = resized_tensors[0];

Next, we load, resize, and normalize the input image.

  // Actually run the image through the model.
  std::vector<Tensor> outputs;
  Status run_status = session->Run({{FLAGS_input_layer, resized_tensor}},
                                   {FLAGS_output_layer}, {}, &outputs);
  if (!run_status.ok()) {
    LOG(ERROR) << "Running model failed: " << run_status;
    return -1;
  }

Here we run the loaded graph with the image as input.

  // This is for automated testing to make sure we get the expected result with
  // the default settings. We know that label 653 (military uniform) should be
  // the top label for the Admiral Hopper image.
  if (FLAGS_self_test) {
    bool expected_matches;
    Status check_status = CheckTopLabel(outputs, 653, &expected_matches);
    if (!check_status.ok()) {
      LOG(ERROR) << "Running check failed: " << check_status;
      return -1;
    }
    if (!expected_matches) {
      LOG(ERROR) << "Self-test failed!";
      return -1;
    }
  }

For testing purposes, we can check here that we get the expected output.

  // Do something interesting with the results we've generated.
  Status print_status = PrintTopLabels(outputs, FLAGS_labels);

Finally, we print the labels we found.

  if (!print_status.ok()) {
    LOG(ERROR) << "Running print failed: " << print_status;
    return -1;
  }

  return 0;
}

The error handling here uses TensorFlow’s Status object, which is convenient because its ok() checker tells you whether an error has occurred, and the object can be printed to give a readable error message.
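
The same check-and-return-early pattern can be sketched in plain Python. The Status class, load_graph, and run_pipeline below are hypothetical stand-ins written for this illustration; they only imitate the shape of the C++ code and do not call TensorFlow. Idiomatic Python would more often use exceptions for this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Status:
    """Hypothetical stand-in for a status object: an empty message means success."""
    message: str = ""

    def ok(self):
        return self.message == ""


def load_graph(path):
    # Pretend loader that fails for an empty path; not a real TensorFlow call.
    if not path:
        return Status("Failed to load compute graph at ''")
    return Status()


def run_pipeline(path):
    """Return 0 on success and -1 on the first failed step."""
    status = load_graph(path)
    if not status.ok():
        print("ERROR:", status.message)
        return -1
    return 0


exit_code = run_pipeline("graph.pb")
```

Each step returns a status, and the caller stops at the first one that is not ok(), which keeps the failure message close to the step that produced it.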

So, this was all about TensorFlow Image Recognition using the Python and C++ APIs. We hope you liked our explanation.

5. Conclusion

Hence, in this TensorFlow image recognition tutorial, we learned how to classify images using the Inception-v3 model, which achieves higher accuracy than its predecessors. There are many other models for image recognition, such as AlexNet, GoogLeNet, and VGGNet. We saw image recognition through both the Python API and the C++ API, and walked through the C++ example step by step. Next, we will discuss CNNs using TensorFlow. If you have any questions about TensorFlow image recognition, feel free to ask in the comment section.
