Performance Enhancement of Image Captioning Technique Using Machine Learning Approach

Authors

  • Deepti Goyal, Dr. S. V. A. V. Prasad

Abstract

This research aims to maximize the accuracy of a model used for image captioning. The fusion of computer vision and natural language processing has received considerable interest recently thanks to the advent of deep learning. Image captioning is representative of this field: it trains a computer to understand an image's visual content and express it in one or more sentences. Constructing a coherent description of high-level image semantics requires an understanding of the objects in the image and of their state, attributes, and relationships. Although image captioning is a difficult and involved task, several researchers have made significant progress in this area. In this work, we use a convolutional neural network (CNN) as the encoder and a recurrent neural network (RNN) as the decoder; specifically, image captioning is handled using VGG16 and an LSTM. The proposed model achieves an accuracy of 96.75% and a loss of 0.22, which are the highest accuracy and the lowest loss among the models it is compared with.
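The encoder-decoder idea described in the abstract can be illustrated with a minimal sketch of caption generation. Everything here is an assumption made for illustration and is not taken from the paper: the function names `greedy_caption` and `toy_decoder`, the `<start>` and `<end>` tokens, the greedy decoding strategy, and the 4096-long feature vector (the size of VGG16's penultimate fully connected layer; the abstract does not say which layer is used). The toy lookup table stands in for a trained LSTM decoder.

```python
from typing import Callable, Dict, List, Sequence

START, END = "<start>", "<end>"  # hypothetical boundary tokens


def greedy_caption(
    image_features: Sequence[float],
    score_next_word: Callable[[Sequence[float], List[str]], Dict[str, float]],
    max_len: int = 20,
) -> str:
    """Build a caption word by word, always taking the top-scoring next word."""
    words = [START]
    for _ in range(max_len):
        # The decoder sees the encoded image and the words generated so far.
        scores = score_next_word(image_features, words)
        best = max(scores, key=scores.get)
        if best == END:
            break
        words.append(best)
    return " ".join(words[1:])  # drop the start token


def toy_decoder(image_features, words_so_far):
    """Hypothetical stand-in for a trained LSTM decoder: a fixed lookup table
    keyed on the previous word. It ignores the image features entirely."""
    table = {
        "<start>": {"a": 0.9, "dog": 0.1},
        "a": {"dog": 0.8, "<end>": 0.2},
        "dog": {"runs": 0.6, "<end>": 0.4},
        "runs": {"<end>": 1.0},
    }
    return table[words_so_far[-1]]


# A zero vector stands in for features extracted by the CNN encoder.
caption = greedy_caption([0.0] * 4096, toy_decoder)
print(caption)
```

In a real system, `score_next_word` would wrap the trained decoder's softmax output over the vocabulary, and the feature vector would come from passing the image through the CNN encoder.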

Published

2022-08-17