Bimonthly Since 1986
ISSN 1004-9037
Publication Details
Edited by: Editorial Board of Journal of Data Acquisition and Processing
P.O. Box 2704, Beijing 100190, P.R. China
Sponsored by: Institute of Computing Technology, CAS & China Computer Federation
Undertaken by: Institute of Computing Technology, CAS
Published by: SCIENCE PRESS, BEIJING, CHINA
Distributed by:
China: All Local Post Offices
09 May 2023, Volume 38 Issue 3
IMAGE CAPTION GENERATOR WITH VOICE USING LSTM AND CNN ALGORITHMS
Dr. Dattatray G. Takale, Dr. Dattatray S. Galhe, Dr. Parishit N. Mahalle, Dr. Chitrakant O. Banchhor, Prof. Piyush P. Gawali, Prof. Gopal Deshmukh, Dr. Vajid Khan, Prof. Madhuri Karnik
Journal of Data Acquisition and Processing, 2023, 38 (3): 1121-1132.
Abstract
The combination of the VGG16 Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) networks has shown promise for image caption generation with voice output. In this study, we present a system that uses this combination to produce captions and spoken descriptions for images. The VGG16 CNN extracts high-level features from each input image, giving a rich representation of its content. These features are then passed to the LSTM network, which models the sequential structure of language to generate descriptive captions. The widely used Flickr8k dataset, a collection of photographs paired with human-written captions, serves as the basis for training and evaluating the system. By combining the strengths of the CNN and the LSTM, our method generates accurate and contextually appropriate captions and audio descriptions for a variety of images. The experimental results demonstrate the value of the proposed approach and point toward further developments in image captioning and accessibility for people with visual impairments.
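The caption-generation step can be illustrated with a minimal sketch. The abstract does not state how words are selected at inference time, so greedy decoding is assumed here. The sentinel tokens `startseq` and `endseq`, the function `greedy_caption`, and the lookup-table `toy_scorer` are all hypothetical: `toy_scorer` stands in for a trained LSTM decoder and ignores the image features that a real decoder would condition on.

```python
from typing import Callable, Dict, List, Sequence

# Assumed sentinel tokens marking caption start and end (not taken from the paper).
START, END = "startseq", "endseq"


def greedy_caption(
    features: Sequence[float],
    next_word_scores: Callable[[Sequence[float], List[str]], Dict[str, float]],
    max_len: int = 20,
) -> str:
    """Build a caption word by word, always taking the top-scoring next word."""
    words = [START]
    for _ in range(max_len):
        scores = next_word_scores(features, words)
        best = max(scores, key=scores.get)
        if best == END:
            break
        words.append(best)
    # Drop the start token before returning the caption text.
    return " ".join(words[1:])


def toy_scorer(features: Sequence[float], words: List[str]) -> Dict[str, float]:
    """Hypothetical stand-in for the LSTM decoder: scores depend only on the previous word."""
    table = {
        START: {"a": 0.8, "dog": 0.2},
        "a": {"dog": 0.7, "runs": 0.3},
        "dog": {"runs": 0.6, END: 0.4},
        "runs": {END: 0.9, "a": 0.1},
    }
    return table[words[-1]]


if __name__ == "__main__":
    # The feature vector is a placeholder for the CNN output.
    print(greedy_caption([0.0], toy_scorer))  # prints "a dog runs"
```

In a full system, the scorer would be a trained network that takes the CNN feature vector and the words generated so far, and the resulting caption string would then be handed to a text-to-speech component to produce the audio output.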
Keywords
Image caption generation, voice synthesis, VGG16, Convolutional Neural Network, LSTM, Long Short-Term Memory, Flickr8k dataset