Image Caption Generator with CNN & LSTM In Python With Source Code
The Image Caption Generator with CNN & LSTM in Python is a project written in Python that combines a CNN with an LSTM. Its goal is to teach the concepts behind both models by building a working image caption generator.
In this project we implement the caption generator using a CNN (Convolutional Neural Network) and an LSTM (Long Short-Term Memory) network. Image features are extracted with Xception, a CNN model trained on the ImageNet dataset, and those features are then fed into the LSTM model, which is responsible for generating the image caption.
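Before an image reaches Xception, the script in this article rescales its pixel values by dividing by 127.5 and subtracting 1.0. The short sketch below shows what that arithmetic does to a few sample values; the helper name scale_pixel is made up for illustration and does not appear in the project.

```python
def scale_pixel(value):
    """Map an 8-bit pixel value in [0, 255] to the range [-1.0, 1.0]."""
    return value / 127.5 - 1.0


# Black, mid-gray and white pixel values.
scaled = [scale_pixel(v) for v in (0, 127.5, 255)]
print(scaled)
```

Centering the inputs around zero in this way matches the input range that the Keras Xception model expects.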
What is CNN?
Convolutional Neural Networks are specialized deep neural networks that process data with a grid-like shape, such as a 2D matrix. Images are easily represented as 2D matrices (one per color channel), which makes CNNs very useful for working with images.
CNNs are mainly used for image classification, for example identifying whether an image shows a bird, a plane, or Superman. A CNN scans an image from left to right and top to bottom to pull out important features, then combines those features to classify the image. When trained on sufficiently varied data, it can cope with images that have been translated, rotated, scaled, or viewed from a different perspective.
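The left-to-right, top-to-bottom scan described above can be sketched in plain Python. This is a minimal illustration of one convolution filter (strictly speaking a cross-correlation, which is what CNN layers compute), not the Xception implementation; the toy image and the hand-written edge kernel are assumptions chosen to keep the numbers easy to follow. In a real CNN the kernel values are learned during training.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for top in range(out_h):            # move top to bottom
        row = []
        for left in range(out_w):       # move left to right
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[top + i][left + j] * kernel[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Toy 4x4 grayscale image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-written kernel that responds to a dark-to-bright vertical edge.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, kernel)
for row in feature_map:
    print(row)
```

The feature map is non-zero only in the column where the window straddles the edge, which is the sense in which a filter "pulls out" a feature.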
What is LSTM?
LSTM stands for Long Short-Term Memory. LSTMs are a type of RNN (recurrent neural network) well suited to sequence prediction problems: based on the previous text, we can predict what the next word will be. They improve on the traditional RNN by overcoming its short-term memory limitation. An LSTM carries relevant information forward as it processes its inputs, and its forget gate discards information that is no longer relevant.
This article also includes the downloadable Python project with source code for free; just find the downloadable source code below and click to start downloading.
By the way, if you are new to Python programming and don’t know which Python IDE to use, I have a list of the Best Python IDE for Windows, Linux, Mac OS that will suit you. I also have a guide on How to Download and Install Latest Version of Python on Windows.
To start executing Image Caption Generator with CNN & LSTM In Python With Source Code, make sure that you have installed Python 3.9 and PyCharm on your computer.
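To make the role of the forget gate concrete, here is a minimal sketch of a single LSTM time step using scalar values and only the standard library. It follows the standard LSTM equations but is not the Keras implementation; the function names and the hand-picked weights are assumptions for illustration (real weights are learned during training).

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, weights):
    """One LSTM step for scalar input and state.

    `weights` maps a gate name to a tuple (w_x, w_h, bias).
    """
    def gate(name, activation):
        w_x, w_h, bias = weights[name]
        return activation(w_x * x + w_h * h_prev + bias)

    f = gate("forget", sigmoid)        # how much of the old cell state to keep
    i = gate("input", sigmoid)         # how much new information to write
    o = gate("output", sigmoid)        # how much of the cell state to expose
    g = gate("candidate", math.tanh)   # the new information itself

    c = f * c_prev + i * g             # updated cell state (long-term memory)
    h = o * math.tanh(c)               # updated hidden state (output)
    return h, c


# Illustrative weights: only the biases are non-zero.
keep = {"forget": (0, 0, 20), "input": (0, 0, -20),
        "output": (0, 0, 0), "candidate": (0, 0, 0)}
discard = dict(keep, forget=(0, 0, -20))

_, c_kept = lstm_step(x=1.0, h_prev=0.0, c_prev=0.8, weights=keep)
_, c_dropped = lstm_step(x=1.0, h_prev=0.0, c_prev=0.8, weights=discard)
print(c_kept, c_dropped)
```

With the forget gate close to 1 the previous cell state is carried forward almost unchanged, and with the gate close to 0 it is almost entirely discarded.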
Image Caption Generator with CNN & LSTM In Python With Source Code: Steps on how to run the project
Time needed: 5 minutes.
These are the steps on how to run Image Caption Generator with CNN & LSTM In Python With Source Code:
- Step 1: Download the given source code below.
First, download the given source code below and unzip the source code.
- Step 2: Import the project to your PyCharm IDE.
Next, import the source code you’ve downloaded into your PyCharm IDE.
- Step 3: Run the project.
Last, run the project with the command “py main.py -i example.jpg” (on Linux or macOS, use “python3” in place of “py”).
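Before running the project, it may help to confirm that the files the script loads are actually present. The sketch below is an optional helper, not part of the downloaded project: the function name find_missing_files is hypothetical, and the file names are taken from the script shown in this article, so adjust them if your copy differs.

```python
from pathlib import Path

# File names as they appear in the script in this article.
REQUIRED_FILES = ["tokenizer.p", "models/model_9.h5"]


def find_missing_files(project_dir, image_path, required=REQUIRED_FILES):
    """Return the paths the script needs that do not exist as files."""
    project_dir = Path(project_dir)
    candidates = [project_dir / name for name in required] + [Path(image_path)]
    return [str(path) for path in candidates if not path.is_file()]


if __name__ == "__main__":
    missing = find_missing_files(".", "example.jpg")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All required files found.")
```

Run it from the project folder; an empty result means the tokenizer, the trained model, and the example image were all found.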
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
from keras.applications.xception import Xception
from keras.models import load_model
from pickle import load
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt
import argparse
Complete Source Code
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
from keras.applications.xception import Xception
from keras.models import load_model
from pickle import load
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt
import argparse

# Read the image path from the command line, e.g. "py main.py -i example.jpg"
ap = argparse.ArgumentParser()
ap.add_argument('-i', '--image', required=True, help="Image Path")
args = vars(ap.parse_args())
img_path = args['image']

def extract_features(filename, model):
    # Stop early if the image cannot be opened; nothing below works without it
    try:
        image = Image.open(filename)
    except OSError:
        raise SystemExit("ERROR: Couldn't open image! Make sure the image path and extension are correct")
    # Force 3 channels so that 4-channel (RGBA) and grayscale images also work
    image = image.convert('RGB')
    image = image.resize((299, 299))
    image = np.array(image)
    # Add a batch dimension: (299, 299, 3) -> (1, 299, 299, 3)
    image = np.expand_dims(image, axis=0)
    # Scale pixel values from [0, 255] to [-1, 1]
    image = image / 127.5
    image = image - 1.0
    feature = model.predict(image)
    return feature

def word_for_id(integer, tokenizer):
    # Reverse lookup: find the word that maps to the given index
    for word, index in tokenizer.word_index.items():
        if index == integer:
            return word
    return None

def generate_desc(model, tokenizer, photo, max_length):
    # Build the caption one word at a time, starting from the 'start' token
    in_text = 'start'
    for i in range(max_length):
        # texts_to_sequences returns one list per input text; take the first
        sequence = tokenizer.texts_to_sequences([in_text])[0]
        sequence = pad_sequences([sequence], maxlen=max_length)
        pred = model.predict([photo, sequence], verbose=0)
        # Pick the most probable next word
        pred = np.argmax(pred)
        word = word_for_id(pred, tokenizer)
        if word is None:
            break
        in_text += ' ' + word
        if word == 'end':
            break
    return in_text

#path = 'Flicker8k_Dataset/111537222_07e56d5a30.jpg'
max_length = 32
with open("tokenizer.p", "rb") as f:
    tokenizer = load(f)
model = load_model('models/model_9.h5')
xception_model = Xception(include_top=False, pooling="avg")

photo = extract_features(img_path, xception_model)
img = Image.open(img_path)

description = generate_desc(model, tokenizer, photo, max_length)
print("\n\n")
print(description)
plt.imshow(img)
# Open the window that displays the image
plt.show()
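The caption loop in generate_desc is a greedy decoder: at each step it keeps only the single most likely next word and stops at the 'end' token or at the length limit. The sketch below isolates that loop so it can be run without Keras or a trained model. Everything in it is a stand-in for illustration: greedy_caption, toy_predict_next and the four-word vocabulary are hypothetical. It also uses a precomputed index-to-word dictionary, which avoids the linear scan that word_for_id performs on every step.

```python
def greedy_caption(predict_next, index_to_word, max_length,
                   start="start", end="end"):
    """Build a caption word by word, always taking the predicted next word."""
    words = [start]
    for _ in range(max_length):
        next_id = predict_next(words)
        word = index_to_word.get(next_id)   # constant-time reverse lookup
        if word is None:                    # unknown index: stop generating
            break
        words.append(word)
        if word == end:
            break
    return " ".join(words)


# Toy vocabulary and a scripted "model" that returns the next word index.
index_to_word = {1: "start", 2: "dog", 3: "runs", 4: "end"}
script = {
    ("start",): 2,
    ("start", "dog"): 3,
    ("start", "dog", "runs"): 4,
}


def toy_predict_next(words):
    # Index 0 is not in the vocabulary, so an unscripted prefix ends the loop.
    return script.get(tuple(words), 0)


caption = greedy_caption(toy_predict_next, index_to_word, max_length=32)
print(caption)
```

With a real tokenizer, an equivalent reverse dictionary could be built once from tokenizer.word_index before the loop starts.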
Download Source Code below
In this advanced Python project, we implemented a CNN-RNN model by building an image caption generator. A key point to note is that the model depends on its training data, so it cannot predict words that are outside its vocabulary. We used a small dataset of 8,000 images. Production-level models need to be trained on datasets of more than 100,000 images to produce more accurate captions.
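The vocabulary limitation can be illustrated with a small, self-contained sketch. It is a simplified stand-in for a tokenizer, not the Keras Tokenizer used in the project (which, among other differences, orders its indices by word frequency); build_vocab, encode and the sample captions are all made up for illustration.

```python
def build_vocab(captions):
    """Assign an integer index to every word seen in the training captions."""
    vocab = {}
    for caption in captions:
        for word in caption.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab) + 1   # index 0 is left for padding
    return vocab


def encode(text, vocab):
    """Convert text to indices, silently dropping words the vocabulary lacks."""
    return [vocab[word] for word in text.lower().split() if word in vocab]


training_captions = ["a dog runs on grass", "a child plays on grass"]
vocab = build_vocab(training_captions)

# "zebra" never appeared in training, so it has no index.
encoded = encode("a zebra runs on grass", vocab)
print(encoded)
```

Because the model's output layer has one entry per vocabulary word, a word such as "zebra" can never be produced in a caption unless it appeared in the training captions.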
- Code For Game in Python: Python Game Projects With Source Code
- Best Python Projects With Source Code 2020 FREE DOWNLOAD
- How to Make a Point of Sale In Python With Source Code 2021
- Python Code For Food Ordering System | FREE DOWNLOAD | 2020
- Inventory Management System Project in Python With Source Code
If you have any questions or suggestions about Image Caption Generator with CNN & LSTM In Python With Source Code, please feel free to leave a comment below.