{"id":15996,"date":"2018-05-31T06:00:16","date_gmt":"2018-05-31T06:00:16","guid":{"rendered":"https:\/\/data-flair.training\/blogs\/?p=15996"},"modified":"2021-05-14T11:00:16","modified_gmt":"2021-05-14T05:30:16","slug":"tensorflow-audio-recognition","status":"publish","type":"post","link":"https:\/\/data-flair.training\/blogs\/tensorflow-audio-recognition\/","title":{"rendered":"TensorFlow Audio Recognition in 10 Minutes"},"content":{"rendered":"<p><span style=\"font-weight: 400\">By now you\u2019ve already learned how to create and train your own model. In this <strong>TensorFlow<\/strong> tutorial, you\u2019ll recognize audio using TensorFlow and go through deep learning for audio applications.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Along the way, we will look at the training process and the confusion matrix. We will also touch on TensorBoard and the working model for audio recognition in TensorFlow. Lastly, we will study command recognition and how we can customize our audio model.<\/span><\/p>\n<p>So, let&#8217;s begin TensorFlow Audio Recognition.<\/p>\n<h2><span style=\"font-weight: 400\">TensorFlow Audio Recognition<\/span><\/h2>\n<p><span style=\"font-weight: 400\">This tutorial will show you how to build a basic TensorFlow speech recognition network that recognizes ten words. Real speech and audio recognition systems are far more complex and are beyond the scope of this tutorial.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Just like the <strong>MNIST<\/strong> tutorial for images, this should give you a basic understanding of the techniques involved. 
Once you&#8217;ve completed this TensorFlow Audio Recognition tutorial, you\u2019ll have a model that tries to classify a one-second audio clip as one of the following: <\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">silence<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">an unknown word<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">yes<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">no<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">up<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">down<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">left<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">right<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">on<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">off<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">stop<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">go<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Before starting, you should have TensorFlow installed on your system, along with a good internet connection and some free hard disk space.<\/span><\/p>\n<h2><span style=\"font-weight: 400\">Training in TensorFlow Audio Recognition<\/span><\/h2>\n<p><span style=\"font-weight: 400\">To begin the training process in TensorFlow Audio Recognition, go to the root of the TensorFlow source tree and run the following<\/span><span style=\"font-weight: 400\">:<\/span><\/p>\n<pre class=\"EnlighterJSRAW\">python tensorflow\/examples\/speech_commands\/train.py<\/pre>\n<p><span style=\"font-weight: 400\">This command will download the\u00a0speech dataset, which consists of about 65,000 .wav audio files of people saying 30 different words.<\/span><\/p>\n<p><span style=\"font-weight: 
400\">As training proceeds, you&#8217;ll see logging output that includes lines like the one given below:<\/span><br \/>\n<strong>I0730 16:54:41.813438 55030 train.py:252] Saving to &#8220;\/tmp\/speech_commands_train\/conv.ckpt-100&#8221;<\/strong><\/p>\n<p><span style=\"font-weight: 400\">The trained weights are saved to a checkpoint file, so if training is ever interrupted, you can go back to the checkpoint file and resume from the last saved point. <\/span><\/p>\n<h2><span style=\"font-weight: 400\">Confusion Matrix in TensorFlow<\/span><\/h2>\n<p><span style=\"font-weight: 400\">After the first 400 steps, you will see:<\/span><\/p>\n<pre class=\"EnlighterJSRAW\">I0730 16:57:38.073667   55030 train.py:243] Confusion Matrix:\r\n [[258   0   0   0   0   0   0   0   0   0   0   0]\r\n [  7   6  26  94   7  49   1  15  40   2   0  11]\r\n [ 10   1 107  80  13  22   0  13  10   1   0   4]\r\n [  1   3  16 163   6  48   0   5  10   1   0  17]\r\n [ 15   1  17 114  55  13   0   9  22   5   0   9]\r\n [  1   1   6  97   3  87   1  12  46   0   0  10]\r\n [  8   6  86  84  13  24   1   9   9   1   0   6]\r\n [  9   3  32 112   9  26   1  36  19   0   0   9]\r\n [  8   2  12  94   9  52   0   6  72   0   0   2]\r\n [ 16   1  39  74  29  42   0   6  37   9   0   3]\r\n [ 15   6  17  71  50  37   0   6  32   2   1   9]\r\n [ 11   1   6 151   5  42   0   8  16   0   0  20]]<\/pre>\n<p><span style=\"font-weight: 400\">The first section of this output is a\u00a0confusion matrix. Each column represents the set of samples that were predicted to be a particular keyword. In the above matrix, the first column represents all the clips that were predicted to be silence, the second those predicted to be unknown words, the third &#8220;yes&#8221;, and so on.<\/span><\/p>\n<p><span style=\"font-weight: 400\">The rows represent clips by their correct, ground-truth keywords. 
The first row is all the clips that were silence, the second the clips that were unknown words, the third &#8220;yes&#8221;, etc.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Thus, the confusion matrix reflects the mistakes the network makes. All the entries in the first row are zero except the first, because the first row contains all the clips that are actually silence, and the network classified every one of them correctly. <\/span><\/p>\n<p><span style=\"font-weight: 400\">Looking down the first column, however, there are many non-zero values. That column represents all the clips that were predicted to be silence, so positive numbers outside the first cell are errors. This means that there are some false positives in the network: it is recognizing clips of spoken words that are not \u201csilence\u201d as silence.<\/span><\/p>\n<h3><span style=\"font-weight: 400\">a. Validating<\/span><\/h3>\n<pre class=\"EnlighterJSRAW\">I0730 16:57:38.073777 55030 train.py:245] Step 400: Validation accuracy = 26.3% (N=3093)<\/pre>\n<p><span style=\"font-weight: 400\">You should separate your data set into three categories: the biggest one for training the network, a smaller one for calculating the accuracy during training, and another one for measuring the accuracy after the training has been completed.<\/span><\/p>\n<p><span style=\"font-weight: 400\">The script does this division into categories for you, and the logging line shown above tells you the accuracy on the validation set. Overfitting occurs when the training accuracy keeps increasing but the validation accuracy doesn\u2019t.<\/span><\/p>\n<h2><span style=\"font-weight: 400\">TensorBoard in TensorFlow<\/span><\/h2>\n<p><span style=\"font-weight: 400\">You can visualize how the training progresses using TensorBoard. 
The events are saved to \/tmp\/retrain_logs, and you can load them using:<\/span><\/p>\n<pre class=\"EnlighterJSRAW\">tensorboard --logdir \/tmp\/retrain_logs<\/pre>\n<div id=\"attachment_16034\" style=\"width: 2410px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/asfg-1.png\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-16034\" class=\"wp-image-16034 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/asfg-1.png\" alt=\"TensorFlow Audio Recognition\" width=\"2400\" height=\"1080\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/asfg-1.png 2400w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/asfg-1-150x68.png 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/asfg-1-300x135.png 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/asfg-1-768x346.png 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/asfg-1-1024x461.png 1024w\" sizes=\"auto, (max-width: 2400px) 100vw, 2400px\" \/><\/a><p id=\"caption-attachment-16034\" class=\"wp-caption-text\">Audio Recognition in TensorFlow- TensorBoard<\/p><\/div>\n<p><span style=\"font-weight: 400\">Then open\u00a0<\/span><span style=\"font-weight: 400\">http:\/\/localhost:6006<\/span><span style=\"font-weight: 400\">\u00a0in your browser to see the charts and graphs in TensorBoard<\/span><span style=\"font-weight: 400\">.<\/span><\/p>\n<h2><span style=\"font-weight: 400\">Finished Training in TensorFlow Audio Recognition<\/span><\/h2>\n<p><span style=\"font-weight: 400\">After a few hours of training, the script completes about 18,000 steps with the default settings, printing out a final confusion matrix and the accuracy percentage.<\/span><\/p>\n<p><span style=\"font-weight: 400\">You can export the trained model for mobile devices in a 
compact form using:<\/span><\/p>\n<pre class=\"EnlighterJSRAW\">python tensorflow\/examples\/speech_commands\/freeze.py \\\r\n--start_checkpoint=\/tmp\/speech_commands_train\/conv.ckpt-18000 \\\r\n--output_file=\/tmp\/my_frozen_graph.pb<\/pre>\n<h2><span style=\"font-family: Georgia, Georgia, serif\">Running on Android<\/span><\/h2>\n<p><span style=\"font-weight: 400\">Download\u00a0the prebuilt demo apps from GitHub:<\/span> <a href=\"https:\/\/github.com\/tensorflow\/tensorflow\/tree\/master\/tensorflow\/examples\/android#prebuilt-components\"><span style=\"font-weight: 400\">https:\/\/github.com\/tensorflow\/tensorflow\/tree\/master\/tensorflow\/examples\/android#prebuilt-components<\/span><\/a><span style=\"font-weight: 400\"> and install them on your phone. <\/span><\/p>\n<p><span style=\"font-weight: 400\">You&#8217;ll see &#8216;TF Speech&#8217; in your app list. Once opened, it shows the list of words that you&#8217;ve just trained your model with. <\/span><span style=\"font-weight: 400\">After letting the app use your microphone, you should be able to try the words and see them highlighted in the interface when the model recognizes them.<\/span><\/p>\n<h2><span style=\"font-weight: 400\">Working- TensorFlow Speech Recognition Model<\/span><\/h2>\n<p><span style=\"font-weight: 400\">This TensorFlow Audio Recognition tutorial is based on the kind of <strong>CNN<\/strong> that is very familiar to anyone who&#8217;s worked with image recognition, as you already have in one of the previous tutorials. Audio, however, is a 1-D signal over time and should not be confused with a 2-D spatial problem.<\/span><\/p>\n<p><span style=\"font-weight: 400\">We solve this issue by defining a window of time in which your spoken words should fit, and converting the signal in that window into an image. You can do this by grouping the incoming audio into short segments, and calculating the strength of the frequencies in each one. 
<\/span><\/p>\n<p><span style=\"font-weight: 400\">Each segment is treated as a vector of numbers, which are arranged in time to form a 2D array. This array of values can then be treated like a one-channel image, also known as a spectrogram. You can view what kind of image an audio sample produces with<\/span><span style=\"font-weight: 400\">:<\/span><\/p>\n<pre class=\"EnlighterJSRAW\">bazel run tensorflow\/examples\/wav_to_spectrogram:wav_to_spectrogram -- \\\r\n--input_wav=\/tmp\/speech_dataset\/happy\/ab00c4b2_nohash_0.wav \\\r\n--output_png=\/tmp\/spectrogram.png<\/pre>\n<p><strong style=\"font-family: Verdana, Geneva, sans-serif\">\/tmp\/spectrogram.png<\/strong><span style=\"font-weight: 400\">\u00a0will show you<\/span><span style=\"font-weight: 400\">:<\/span><\/p>\n<div id=\"attachment_16035\" style=\"width: 266px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/spectro.jpg\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-16035\" class=\"wp-image-16035 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/spectro.jpg\" alt=\"TensorFlow Audio Recognition\" width=\"256\" height=\"246\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/spectro.jpg 256w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/spectro-150x144.jpg 150w\" sizes=\"auto, (max-width: 256px) 100vw, 256px\" \/><\/a><p id=\"caption-attachment-16035\" class=\"wp-caption-text\">Working Model Of TensorFlow Audio Recognition<\/p><\/div>\n<p><span style=\"font-weight: 400\">This is also a 2D, one-channel representation so we can treat it like an image too.<\/span><br \/>\n<span style=\"font-weight: 400\">The image that&#8217;s produced is then fed into a multi-layer convolutional neural network, with a fully-connected layer followed by a softmax at the end. 
<\/span><\/p>\n<h2><span style=\"font-weight: 400\">Command Recognition in TensorFlow<\/span><\/h2>\n<div id=\"attachment_16044\" style=\"width: 1210px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Command-Recognition-in-TensorFlow-01.jpg\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-16044\" class=\"wp-image-16044 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Command-Recognition-in-TensorFlow-01.jpg\" alt=\"TensorFlow Audio Recognition\" width=\"1200\" height=\"628\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Command-Recognition-in-TensorFlow-01.jpg 1200w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Command-Recognition-in-TensorFlow-01-150x79.jpg 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Command-Recognition-in-TensorFlow-01-300x157.jpg 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Command-Recognition-in-TensorFlow-01-768x402.jpg 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Command-Recognition-in-TensorFlow-01-1024x536.jpg 1024w\" sizes=\"auto, (max-width: 1200px) 100vw, 1200px\" \/><\/a><p id=\"caption-attachment-16044\" class=\"wp-caption-text\">TensorFlow Command Recognition<\/p><\/div>\n<p><span style=\"font-weight: 400\">RecognizeCommands<\/span><span style=\"font-weight: 400\"> is fed the output of running the TensorFlow model. It averages the signals over time and returns the keyword label when it thinks a recognized word has been found. We can do this at the <strong>Java<\/strong> level on Android, or in <strong>Python<\/strong> on the Raspberry Pi. 
<\/span><\/p>\n<p><span style=\"font-weight: 400\">As long as these implementations share the same logic, you can tune the parameters that control the averaging, and then transfer them over to your application to get similar results.<\/span><\/p>\n<h3><span style=\"font-weight: 400\">a. Unknown Class<\/span><\/h3>\n<p><span style=\"font-weight: 400\">Your app may hear sounds that are not a part of your training set. To make the network learn which sounds to ignore, you need to provide clips of audio that are not a part of your classes. To do this, you can create\u00a0<\/span><span style=\"font-weight: 400\">boo<\/span><span style=\"font-weight: 400\">,\u00a0meow, and\u00a0<\/span><span style=\"font-weight: 400\">shoo<\/span><span style=\"font-weight: 400\">\u00a0subfolders and fill them with noises from animals. <\/span><\/p>\n<p><span style=\"font-weight: 400\">The Speech Commands dataset includes 20 words in its unknown classes, including the digits zero through nine along with some random names.<\/span><\/p>\n<p><span style=\"font-weight: 400\">You can control the proportion of training samples picked from the unknown classes using the <\/span><span style=\"font-weight: 400\">--unknown_percentage<\/span><span style=\"font-weight: 400\"> flag, which defaults to 10%.<\/span><\/p>\n<h3><span style=\"font-weight: 400\">b. Background Noise<\/span><\/h3>\n<p><span style=\"font-weight: 400\">There is background noise in almost any captured audio clip. To build a model that&#8217;s robust to such noise, you need to train the model against recorded audio with similar properties. 
<\/span><\/p>\n<p><span style=\"font-weight: 400\">The files in the Speech Commands dataset were recorded on multiple devices and in many different surroundings, which helps the training.<\/span><\/p>\n<p><span style=\"font-weight: 400\">In addition, small excerpts of background noise are chosen at random and mixed into clips at a low volume during training. The loudness is controlled by the <\/span><span style=\"font-weight: 400\">--background_volume<\/span><span style=\"font-weight: 400\"> flag. Not all clips have background noise added; the\u00a0--background_frequency\u00a0flag controls what proportion have it mixed in.<\/span><\/p>\n<h2><span style=\"font-weight: 400\">Customizing<\/span><\/h2>\n<p><span style=\"font-weight: 400\">The default conv model used by the script is fairly large, with about 940k weight parameters, which means too many calculations to run at usable speeds on devices with limited resources. <\/span><\/p>\n<p><span style=\"font-weight: 400\">The other options to counter this are:<\/span><br \/>\n<b>low_latency_conv:<\/b><span style=\"font-weight: 400\">\u00a0The accuracy here is lower than conv, but the number of weight parameters is nearly the same and it is much faster.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Specify\u00a0<\/span><span style=\"font-weight: 400\">--model_architecture=low_latency_conv<\/span><span style=\"font-weight: 400\">\u00a0on the command line to use this model. <\/span><br \/>\n<span style=\"font-weight: 400\">You should also set training parameters such as a learning rate of 0.01 and 20,000 steps.<\/span><\/p>\n<p><b>low_latency_svdf:<\/b><span style=\"font-weight: 400\">\u00a0Here too, the accuracy is lower than conv but it only uses about 750k parameters, and has an optimized execution. 
<\/span><\/p>\n<p><span style=\"font-weight: 400\">Pass <\/span><span style=\"font-weight: 400\">--model_architecture=low_latency_svdf<\/span><span style=\"font-weight: 400\">\u00a0on the command line to use this model, and specify the learning rate and the number of steps along with it:<\/span><\/p>\n<pre class=\"EnlighterJSRAW\">python tensorflow\/examples\/speech_commands\/train.py \\\r\n--model_architecture=low_latency_svdf \\\r\n--how_many_training_steps=100000,35000 \\\r\n--learning_rate=0.01,0.005<\/pre>\n<p><b>Other parameters to customize:<\/b><span style=\"font-weight: 400\">\u00a0You can also change the spectrogram parameters, which changes the size of the input image to the model. If the input is smaller, the model requires fewer computations, so this is a good way to trade some accuracy for improved latency.<\/span><\/p>\n<p>So, this was all about TensorFlow Audio Recognition. We hope you liked our explanation.<\/p>\n<h2><span style=\"font-weight: 400\">Conclusion<\/span><\/h2>\n<p><span style=\"font-weight: 400\">That was how you perform simple TensorFlow audio recognition of ten words. In conclusion, we discussed TensorBoard and the confusion matrix. We also learned about the working model of TensorFlow audio recognition and the training process. <\/span><\/p>\n<p><span style=\"font-weight: 400\">Next up is a tutorial for <strong>Linear Model in TensorFlow<\/strong>. 
Furthermore, if you have any doubt regarding TensorFlow Audio Recognition, feel free to ask through the comment section.<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>By now you\u2019ve already learned how to create and train your own model. In this Tensorflow tutorial, you\u2019ll be recognizing audio using TensorFlow. 
Moreover, in this TensorFlow Audio Recognition tutorial, we will go through&#46;&#46;&#46;<\/p>\n","protected":false},"author":6,"featured_media":16036,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[73],"tags":[1232,2890,3240,14510,14511,14530,14607,14879],"class_list":["post-15996","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-tensorflow","tag-audio-recognition-in-tensorflow","tag-confusion-matrix-in-tensorflow","tag-customizing-in-audio-recognition","tag-tensoflow-audio-recognition","tag-tensorboard","tag-tensorflow-command-recognition","tag-tensorflow-speech-recognition","tag-training-in-audio-recognition"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.8 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>TensorFlow Audio Recognition in 10 Minutes - DataFlair<\/title>\n<meta name=\"description\" content=\"TensorFlow Audio recognition- training, confusion matrix, tensorboard, working of tensorflow model, Command recognition and customizing tensoorflow audio\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/data-flair.training\/blogs\/tensorflow-audio-recognition\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"TensorFlow Audio Recognition in 10 Minutes - DataFlair\" \/>\n<meta property=\"og:description\" content=\"TensorFlow Audio recognition- training, confusion matrix, tensorboard, working of tensorflow model, Command recognition and customizing tensoorflow audio\" \/>\n<meta property=\"og:url\" content=\"https:\/\/data-flair.training\/blogs\/tensorflow-audio-recognition\/\" \/>\n<meta property=\"og:site_name\" content=\"DataFlair\" \/>\n<meta property=\"article:publisher\" 
content=\"https:\/\/www.facebook.com\/DataFlairWS\/\" \/>\n<meta property=\"article:published_time\" content=\"2018-05-31T06:00:16+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2021-05-14T05:30:16+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/TensorFlow-Audio-Recognition-01.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"628\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"DataFlair Team\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@DataFlairWS\" \/>\n<meta name=\"twitter:site\" content=\"@DataFlairWS\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"DataFlair Team\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"8 minutes\" \/>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"TensorFlow Audio Recognition in 10 Minutes - DataFlair","description":"TensorFlow Audio recognition- training, confusion matrix, tensorboard, working of tensorflow model, Command recognition and customizing tensoorflow audio","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/data-flair.training\/blogs\/tensorflow-audio-recognition\/","og_locale":"en_US","og_type":"article","og_title":"TensorFlow Audio Recognition in 10 Minutes - DataFlair","og_description":"TensorFlow Audio recognition- training, confusion matrix, tensorboard, working of tensorflow model, Command recognition and customizing tensoorflow audio","og_url":"https:\/\/data-flair.training\/blogs\/tensorflow-audio-recognition\/","og_site_name":"DataFlair","article_publisher":"https:\/\/www.facebook.com\/DataFlairWS\/","article_published_time":"2018-05-31T06:00:16+00:00","article_modified_time":"2021-05-14T05:30:16+00:00","og_image":[{"width":1200,"height":628,"url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/TensorFlow-Audio-Recognition-01.jpg","type":"image\/jpeg"}],"author":"DataFlair Team","twitter_card":"summary_large_image","twitter_creator":"@DataFlairWS","twitter_site":"@DataFlairWS","twitter_misc":{"Written by":"DataFlair Team","Est. 
reading time":"8 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/data-flair.training\/blogs\/tensorflow-audio-recognition\/#article","isPartOf":{"@id":"https:\/\/data-flair.training\/blogs\/tensorflow-audio-recognition\/"},"author":{"name":"DataFlair Team","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/person\/2c58ecb4f73a39f0ef993f1ddfcd7b89"},"headline":"TensorFlow Audio Recognition in 10 Minutes","datePublished":"2018-05-31T06:00:16+00:00","dateModified":"2021-05-14T05:30:16+00:00","mainEntityOfPage":{"@id":"https:\/\/data-flair.training\/blogs\/tensorflow-audio-recognition\/"},"wordCount":1494,"commentCount":6,"publisher":{"@id":"https:\/\/data-flair.training\/blogs\/#organization"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/tensorflow-audio-recognition\/#primaryimage"},"thumbnailUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/TensorFlow-Audio-Recognition-01.jpg","keywords":["Audio recognition in TensorFlow","confusion matrix in tensorflow","customizing in Audio recognition","Tensoflow audio recognition","Tensorboard","tensorFlow command recognition","tensorflow speech recognition","Training in audio recognition"],"articleSection":["Tensorflow Tutorials"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/data-flair.training\/blogs\/tensorflow-audio-recognition\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/data-flair.training\/blogs\/tensorflow-audio-recognition\/","url":"https:\/\/data-flair.training\/blogs\/tensorflow-audio-recognition\/","name":"TensorFlow Audio Recognition in 10 Minutes - 
DataFlair","isPartOf":{"@id":"https:\/\/data-flair.training\/blogs\/#website"},"primaryImageOfPage":{"@id":"https:\/\/data-flair.training\/blogs\/tensorflow-audio-recognition\/#primaryimage"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/tensorflow-audio-recognition\/#primaryimage"},"thumbnailUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/TensorFlow-Audio-Recognition-01.jpg","datePublished":"2018-05-31T06:00:16+00:00","dateModified":"2021-05-14T05:30:16+00:00","description":"TensorFlow Audio recognition- training, confusion matrix, tensorboard, working of tensorflow model, Command recognition and customizing tensoorflow audio","breadcrumb":{"@id":"https:\/\/data-flair.training\/blogs\/tensorflow-audio-recognition\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/data-flair.training\/blogs\/tensorflow-audio-recognition\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/data-flair.training\/blogs\/tensorflow-audio-recognition\/#primaryimage","url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/TensorFlow-Audio-Recognition-01.jpg","contentUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/TensorFlow-Audio-Recognition-01.jpg","width":1200,"height":628,"caption":"Audio Recognition in TensorFlow"},{"@type":"BreadcrumbList","@id":"https:\/\/data-flair.training\/blogs\/tensorflow-audio-recognition\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Blog Home","item":"https:\/\/data-flair.training\/blogs\/"},{"@type":"ListItem","position":2,"name":"Tensorflow Tutorials","item":"https:\/\/data-flair.training\/blogs\/category\/tensorflow\/"},{"@type":"ListItem","position":3,"name":"TensorFlow Audio Recognition in 10 
Minutes"}]},{"@type":"WebSite","@id":"https:\/\/data-flair.training\/blogs\/#website","url":"https:\/\/data-flair.training\/blogs\/","name":"DataFlair","description":"Learn Today. Lead Tomorrow.","publisher":{"@id":"https:\/\/data-flair.training\/blogs\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/data-flair.training\/blogs\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/data-flair.training\/blogs\/#organization","name":"DataFlair","url":"https:\/\/data-flair.training\/blogs\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/logo\/image\/","url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/07\/Data-Flair.png","contentUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/07\/Data-Flair.png","width":106,"height":48,"caption":"DataFlair"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/DataFlairWS\/","https:\/\/x.com\/DataFlairWS","https:\/\/www.linkedin.com\/company\/dataflair-web-services-pvt-ltd\/","https:\/\/www.youtube.com\/user\/DataFlairWS"]},{"@type":"Person","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/person\/2c58ecb4f73a39f0ef993f1ddfcd7b89","name":"DataFlair Team","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/1ce4a0e3e542444fc73bbebf83e89e8b73e2d95ccb1fcee64da9945f078b97c5?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/1ce4a0e3e542444fc73bbebf83e89e8b73e2d95ccb1fcee64da9945f078b97c5?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/1ce4a0e3e542444fc73bbebf83e89e8b73e2d95ccb1fcee64da9945f078b97c5?s=96&d=mm&r=g","caption":"DataFlair Team"},"description":"The 
DataFlair Team provides industry-driven content on programming, Java, Python, C++, DSA, AI, ML, data Science, Android, Flutter, MERN, Web Development, and technology. Our expert educators focus on delivering value-packed, easy-to-follow resources for tech enthusiasts and professionals.","url":"https:\/\/data-flair.training\/blogs\/author\/dfteam2\/"}]}},"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/15996","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/users\/6"}],"replies":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/comments?post=15996"}],"version-history":[{"count":7,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/15996\/revisions"}],"predecessor-version":[{"id":94964,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/15996\/revisions\/94964"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/media\/16036"}],"wp:attachment":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/media?parent=15996"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/categories?post=15996"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/tags?post=15996"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}