Using Machine Learning and Python to Keyword Images

in #python, 6 years ago (edited)


You may have seen earlier that I am building an image viewer application.

One of the reasons for this application is my store of purchased stock photography, and also my library of personally taken images. When I need an appropriate image for an article, I have to scroll through visually.

What if I could add keywords to the file names so I could search more easily?

TensorFlow and Machine Learning with Python

It turns out there has been a heck of a lot of work done in the field of categorizing and analyzing images. While image recognition is still being developed, the last 5 or so years have seen great strides from companies such as Microsoft and Google, and also universities such as the University of Montreal.

For Python users, we can leverage all this hard work even on a tool as humble as the Raspberry Pi or an old laptop, because we don't need to train the machine ourselves!

It's not R2-D2

Now don't get too excited; it's not quite there yet. This is how it managed to interpret my first image:

1.jpg

YOUR PICTURE IS OF A:
 - fountain: 0.533854 likelihood
 - paddlewheel: 0.097188 likelihood
 - park_bench: 0.044255 likelihood
 - birdhouse: 0.031683 likelihood
 - breakwater: 0.021008 likelihood
 - paintbrush: 0.018073 likelihood
 - boathouse: 0.014875 likelihood
 - barn: 0.014575 likelihood
 - mailbox: 0.012850 likelihood
 - shopping_cart: 0.012731 likelihood

Then I realized it had exported from my camera rotated by 90 degrees, so I ran it again ...

YOUR PICTURE IS OF A:
 - lakeside: 0.342271 likelihood
 - freight_car: 0.259517 likelihood
 - boathouse: 0.118738 likelihood
 - paddlewheel: 0.054276 likelihood
 - container_ship: 0.032019 likelihood
 - steam_locomotive: 0.031085 likelihood
 - trailer_truck: 0.017880 likelihood
 - tractor: 0.013872 likelihood
 - canoe: 0.013488 likelihood
 - electric_locomotive: 0.008250 likelihood

Some interpretations were way off, but lakeside and boathouse were excellent matches, as was canoe further down.
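
If the camera stored the rotation as an EXIF orientation tag (an assumption; not every export does), the picture can be turned upright before it is classified. Here is a minimal sketch using Pillow's ImageOps.exif_transpose. The helper name load_upright is made up for this example.

```python
from PIL import Image, ImageOps


def load_upright(path, size=(224, 224)):
    """Open an image, apply its EXIF orientation (if any), and resize it.

    Hypothetical helper: it assumes the rotation is recorded in the
    file's EXIF metadata. Images without that tag are left as they are.
    """
    with Image.open(path) as img:
        # Returns a rotated/flipped copy when an orientation tag is present
        upright = ImageOps.exif_transpose(img)
        # Force three colour channels and resize to the requested size
        return upright.convert("RGB").resize(size)
```

The returned picture is a normal Pillow image, so it can be converted with Keras's image.img_to_array() just like one loaded with image.load_img().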

Python Code

Full code in this Gist

Let's load in one image and get the top 10 results.

We will use this cat selfie:

2.jpg

import numpy as np
from keras.preprocessing import image
from keras.applications import resnet50

# Load the ResNet50 model with its weights pre-trained on ImageNet
model = resnet50.ResNet50()

# Load the picture resized to 224x224 (the input size this model expects)
picture = image.load_img("2.jpg", target_size=(224, 224))

# Convert the picture to a NumPy array
x = image.img_to_array(picture)

# Add a batch dimension, because the model expects an array of images
x = np.expand_dims(x, axis=0)

# Pre-process the pixel values the same way as the training images
x = resnet50.preprocess_input(x)

# Run the prediction
predictions = model.predict(x)

# Decode the top 10 results into (ImageNet ID, name, likelihood) tuples
predicted_classes = resnet50.decode_predictions(predictions, top=10)

print("YOUR PICTURE IS OF A:")

# predicted_classes holds one list per input image; we only passed one
for imagenet_id, name, likelihood in predicted_classes[0]:
    print(" - {}: {}".format(name, likelihood))

Results

YOUR PICTURE IS OF A:
 - tabby: 0.37501177191734314
 - Egyptian_cat: 0.19036126136779785
 - lynx: 0.09264399111270905
 - tiger_cat: 0.07313445210456848
 - Persian_cat: 0.07192806154489517
 - Siamese_cat: 0.04455263167619705
 - carton: 0.027719130739569664
 - window_screen: 0.014117571525275707
 - plastic_bag: 0.013977281749248505
 - bow_tie: 0.008933120407164097

It even knew it was a tabby! It would have been useful if it had also said "cat".

What about Benji? He is an English Cocker Spaniel.

3.jpg

YOUR PICTURE IS OF A:
 - cocker_spaniel: 0.48787274956703186
 - curly-coated_retriever: 0.2491113841533661
 - bluetick: 0.048936877399683
 - standard_poodle: 0.045590486377477646
 - Irish_water_spaniel: 0.026697995141148567
 - Labrador_retriever: 0.022910259664058685
 - Chesapeake_Bay_retriever: 0.012821889482438564
 - Bouvier_des_Flandres: 0.012804904952645302
 - Great_Dane: 0.009580017998814583
 - American_Staffordshire_terrier: 0.006594178732484579

Perfect first result!
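
Coming back to the original goal of adding keywords to file names, here is a minimal sketch of how results like these could be turned into a new file name. The function names, the 0.05 cut-off, and the limit of three keywords are all assumptions chosen for illustration. The sample data is the cat result from above with the likelihoods rounded.

```python
import os


def pick_keywords(predictions, min_likelihood=0.05, limit=3):
    """Keep the labels of the most likely predictions.

    `predictions` is a list of (imagenet_id, name, likelihood) tuples,
    best match first, like decode_predictions(...)[0] in the code above.
    """
    keywords = []
    for _imagenet_id, name, likelihood in predictions:
        # Stop at the first weak match, or once we have enough keywords
        if likelihood < min_likelihood or len(keywords) >= limit:
            break
        keywords.append(name.lower())
    return keywords


def keyworded_name(path, keywords):
    """Insert the keywords between the base name and the extension."""
    if not keywords:
        return path
    base, extension = os.path.splitext(path)
    return "{}_{}{}".format(base, "_".join(keywords), extension)


# Cat results from above (IDs left out, likelihoods rounded)
cat_predictions = [
    (None, "tabby", 0.375),
    (None, "Egyptian_cat", 0.190),
    (None, "lynx", 0.093),
    (None, "tiger_cat", 0.073),
]

print(keyworded_name("2.jpg", pick_keywords(cat_predictions)))
# prints: 2_tabby_egyptian_cat_lynx.jpg
```

The new name could then be applied with os.rename(), ideally after a human has glanced at the suggested keywords.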

Conclusion

It still needs some human interpretation. The cat and dog pictures could definitely use the cat and dog keywords ;)

But still, impressive!
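
One simple way to get the missing "cat" and "dog" keywords would be a small lookup table from the specific labels to broader ones. The table below is hand-written and hypothetical; it only covers a few labels that appeared in the results above, and a real one would need many more entries.

```python
# Hypothetical, hand-written lookup: ImageNet label -> broader keyword
GENERAL_KEYWORDS = {
    "tabby": "cat",
    "Egyptian_cat": "cat",
    "tiger_cat": "cat",
    "Persian_cat": "cat",
    "Siamese_cat": "cat",
    "cocker_spaniel": "dog",
    "curly-coated_retriever": "dog",
    "standard_poodle": "dog",
    "Labrador_retriever": "dog",
}


def add_general_keywords(labels, lookup=GENERAL_KEYWORDS):
    """Return the labels plus any broader keywords, without duplicates."""
    result = []
    for label in labels:
        # Put the broader keyword (if we know one) ahead of the label
        for keyword in (lookup.get(label), label):
            if keyword and keyword not in result:
                result.append(keyword)
    return result


print(add_general_keywords(["tabby", "Egyptian_cat", "lynx"]))
# prints: ['cat', 'tabby', 'Egyptian_cat', 'lynx']
```

Labels that are not in the table, such as lynx here, simply pass through unchanged.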



I'm thinking of building a trained model to classify and tag images from construction projects that I'm involved in. It would save me hours and hours of time writing photo descriptions, as well as trying to find images later on to remember what happened when.
