[Tutorial] Image Recognition and Screen Mapping in Python Development (Part 2)


Repository: Python, Open Source Repository

Software: Download a Python 3.x release compatible with your OS here

Difficulty: Basic
What you will learn:
In this tutorial you will learn how to

  • Use the PyAutoGUI and PIL modules

  • Create a Steemit feed-reading bot with Python

  • Set up a bot to run in any system process

For my previous posts in the Python tutorial series:
- Creating a simple calculator using Python 3.8 (CPython)

Creating an encryption key(Cryptography)
- Part 1
- Part 2
- Part 3
- Part 4

Developing an institution's data collection backend and frontend with Python 3.6 (series)
Part 1 of this series
Part 2 of this series
Part 3 of this series
Part 4 of this series
Part 5 of this series
Part 6 of this series

Earlier in this series
Automating OnScreen Processes

PyCharm
For this tutorial I recommend PyCharm, which has a free Community edition. PyCharm is an IDE built for Python users, and it makes it easier to download and install modules on your Windows, Mac or Linux system. Click here to download PyCharm

Paint.net
Paint.net or any other image editor on your system is required.
For Paint.net, follow this link

Tutorial
This series covers the process of automating system processes in Python. See the curriculum for part 1.
In the last part of this series we built a bot that opens your Steemit feed and goes through it, prompting you regularly.
The problem with that bot is that it runs on fixed coordinates, so if we change the arrangement of our desktop it can no longer find the browser and open the site.
This is where image recognition comes in. Image recognition helps our bot find Chrome regardless of where it is on the screen.

Development
In this tutorial we will use the pyautogui module, which relies on the PyScreeze module for its screenshot functions. It lets us carry out various cross-platform tasks such as taking screenshots and locating a given image on the screen, along with others that we will see in the course of this tutorial.
First, since we won't be using the coordinates method anymore, head to your Python code and delete the Coordinates class and everything in it, as we no longer need it.
We then edit our functions to use image recognition, beginning with the openchrome function:

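import time
import pyautogui

def openchrome():
    # pyautogui.locateCenterOnScreen locates the center of the image whose file path is given.
    # It returns None if the image is not found (some PyAutoGUI versions raise ImageNotFoundException instead).
    chromeIcon = pyautogui.locateCenterOnScreen('C:/Users/ZEOY/Pictures/Screenshots/python/chromeIcon.png')
    print(chromeIcon)  # Prints the coordinates of the center of the image, or None if it was not found
    if chromeIcon is not None:  # Only click when the icon was actually found
        pyautogui.doubleClick(chromeIcon)
        time.sleep(3.0)
    return chromeIcon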

For the image to be recognized, take a screenshot of your home screen, then use Paint or any other image editing software to crop out the image of your browser icon.
Note: The crop should be a very small portion of your icon rather than the whole thing, and it should not include the colour of your desktop wallpaper or any other pattern from your desktop that could change.
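
To see why the crop should leave out the wallpaper, here is a toy sketch of exact image matching on tiny grids of numbers. It is only an illustration of the idea, not the code PyAutoGUI actually runs, and the names `find_template`, `blue_desktop`, `green_desktop`, `tight_crop` and `loose_crop` are made up for this example.

```python
def find_template(screen, template):
    """Return (row, col) of the first exact match of template in screen, or None."""
    rows, cols = len(screen), len(screen[0])
    t_rows, t_cols = len(template), len(template[0])
    for r in range(rows - t_rows + 1):
        for c in range(cols - t_cols + 1):
            # Every row of the template must equal the matching slice of the screen
            if all(
                screen[r + i][c:c + t_cols] == template[i]
                for i in range(t_rows)
            ):
                return (r, c)
    return None


# Toy "screens": 0 = blue wallpaper, 5 = green wallpaper, 1 and 2 = icon pixels
blue_desktop = [
    [0, 0, 0, 0, 0],
    [0, 1, 2, 1, 0],
    [0, 2, 1, 2, 0],
    [0, 0, 0, 0, 0],
]
# The same icon after the wallpaper has been changed
green_desktop = [[5 if p == 0 else p for p in row] for row in blue_desktop]

tight_crop = [[1, 2], [2, 1]]        # icon pixels only
loose_crop = [[0, 0, 0], [0, 1, 2]]  # includes blue wallpaper pixels

print(find_template(blue_desktop, tight_crop))   # found
print(find_template(green_desktop, tight_crop))  # still found
print(find_template(blue_desktop, loose_crop))   # found on the old wallpaper
print(find_template(green_desktop, loose_crop))  # None: wallpaper pixels no longer match
```

The tight crop is found on both desktops, while the crop that includes wallpaper pixels stops matching as soon as the wallpaper changes.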

Next, we make crops for our maximize button, exit button, Chrome app or any other feature you would like to add to this bot, and store them in a folder that can be referenced from our code.
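
Since every crop lives in the same folder, one way to avoid retyping the full path in each function is to build it from a single constant. This is just a sketch: `IMAGE_DIR` and `image_path` are names invented here, and the folder is the one used in this tutorial, so change it to your own.

```python
from pathlib import Path

# Folder that holds every cropped image (the path used in this tutorial)
IMAGE_DIR = Path('C:/Users/ZEOY/Pictures/Screenshots/python')


def image_path(name):
    """Return the full path, as a string, of a cropped image stored in IMAGE_DIR."""
    # as_posix() keeps forward slashes, matching the paths written elsewhere in the code
    return (IMAGE_DIR / name).as_posix()


print(image_path('chromeIcon.png'))
print(image_path('maximize.png'))
```

A call such as `pyautogui.locateCenterOnScreen(image_path('chromeIcon.png'))` would then replace the long string, and moving the folder later means changing only one line.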


We then repeat this process for all the other functions in the code.

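def openchrome():
    # Find the center of the Chrome icon on screen (None if it is not visible)
    chromeIcon = pyautogui.locateCenterOnScreen('C:/Users/ZEOY/Pictures/Screenshots/python/chromeIcon.png')
    print(chromeIcon)
    if chromeIcon is not None:
        # Only click when the icon was actually found
        pyautogui.doubleClick(chromeIcon)
        time.sleep(3.0)
    return chromeIcon  # Lets the caller check whether Chrome was found

def maximizetab():
    # Find the center of the maximize button (None if it is not visible)
    maximizebutton = pyautogui.locateCenterOnScreen('C:/Users/ZEOY/Pictures/Screenshots/python/maximize.png')
    print(maximizebutton)
    if maximizebutton is not None:
        pyautogui.click(maximizebutton)
        time.sleep(4)
    return maximizebutton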

def starttyping():
    # Press and release each key of the address, holding it down for a moment
    for key in 'www.steemit.com':
        pyautogui.keyDown(key)
        time.sleep(0.1)
        pyautogui.keyUp(key)
    time.sleep(2)
    # Press Enter to load the page, then wait for it to load
    pyautogui.keyDown('enter')
    time.sleep(0.1)
    pyautogui.keyUp('enter')
    time.sleep(4)
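
The functions above look for each image once and then sleep for a fixed time. On a slow system the browser may not be on screen yet when the next function runs, so a possible refinement is to keep looking until the image appears or a time limit passes. The sketch below is one way to do that; `wait_for_image` and `fake_locate` are hypothetical names, and the locate function is passed in as a parameter so the example can run without a screen.

```python
import time


def wait_for_image(locate, image, timeout=10.0, interval=0.5):
    """Call locate(image) until it returns a position or the timeout expires.

    locate is any function that returns a position, or None when the image
    is not on screen. Returns the position, or None on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        position = locate(image)
        if position is not None:
            return position
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)


# Stand-in for a real locate function: it "finds" the image on the third call
calls = {'count': 0}

def fake_locate(image):
    calls['count'] += 1
    return (100, 200) if calls['count'] >= 3 else None


print(wait_for_image(fake_locate, 'maximize.png', timeout=2.0, interval=0.01))
```

In the bot itself you would pass `pyautogui.locateCenterOnScreen` in place of `fake_locate`, assuming your PyAutoGUI version returns None rather than raising an exception when the image is missing.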

The next thing we have to do is create our main function, which manages how our program runs. This is necessary because if we simply call each function in turn and one of them fails, our code will keep running and end up clicking on things we do not want it to click. We will also add a dialog asking whether the user prefers another browser, such as Mozilla Firefox or Internet Explorer, so that our program can run in that case too.
Our code should look like this:

def main():
    # pyautogui.confirm shows a message box with one button per browser and returns the text of the button clicked
    x = pyautogui.confirm('Which browser do you prefer?', buttons=['Chrome', 'Firefox', 'Explorer'])
    if x == 'Chrome':
        # Each step must return None when its image is not found, so the bot stops instead of clicking blindly
        if openchrome() is None:
            print("Can't find Chrome")
        elif maximizetab() is None:
            print("Can't maximize tab")
        else:
            starttyping()
            explore()
    elif x == 'Firefox':
        openfirefox()  # Write this using the Chrome template above, and add the Firefox images to your folder
    elif x == 'Explorer':
        openexplorer()  # Likewise for Internet Explorer
    else:
        pyautogui.alert('Browser not available in program', 'Oops')
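
The idea behind main, stopping as soon as one step fails, can also be written as a small loop instead of nested if statements. The sketch below is an alternative layout, not the code used in this series: `run_steps` and the three stand-in steps are made-up names, and the stand-ins only pretend to find images so the example runs without a screen.

```python
def run_steps(steps):
    """Run (name, function) pairs in order and stop at the first that returns None.

    Returns the name of the failed step, or None if every step succeeded.
    """
    for name, step in steps:
        if step() is None:
            print("Can't complete step:", name)
            return name
    return None


# Stand-in steps that record when they are called
log = []

def open_browser():
    log.append('open')
    return (40, 60)  # pretend the icon was found

def maximize():
    log.append('maximize')
    return None  # pretend the maximize button was not found

def type_address():
    log.append('type')
    return True

failed = run_steps([
    ('open browser', open_browser),
    ('maximize tab', maximize),
    ('type address', type_address),
])
print(failed, log)
```

The typing step never runs because the maximize step failed before it. Note that with this layout every step has to return something other than None on success, so a function like starttyping would need a `return True` at the end.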

Conclusion
Image recognition is essential when automating system processes and, used properly, can simplify many automated on-screen tasks. It can fail, though, when the image type is changed from .jpeg to .png or vice versa, since JPEG compression is lossy and can alter pixel values, so the saved crop may no longer match the screen. It is therefore advisable to use an app, such as your default Paint app, that saves files as .png.
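
Renaming a file from .jpeg to .png does not change what is inside it, so if a crop refuses to match it can help to check what format it really is. PNG files always begin with the same 8 bytes and JPEG files begin with the bytes FF D8 FF, which the sketch below uses. The name `image_format` is made up for this example, and the sample byte strings are just short headers rather than full images.

```python
PNG_SIGNATURE = b'\x89PNG\r\n\x1a\n'  # first 8 bytes of every PNG file
JPEG_SIGNATURE = b'\xff\xd8\xff'       # first 3 bytes of a JPEG file


def image_format(header):
    """Guess the format from the first bytes of a file: 'png', 'jpeg' or 'unknown'."""
    if header.startswith(PNG_SIGNATURE):
        return 'png'
    if header.startswith(JPEG_SIGNATURE):
        return 'jpeg'
    return 'unknown'


print(image_format(PNG_SIGNATURE + b'\x00\x00\x00\rIHDR'))
print(image_format(b'\xff\xd8\xff\xe0\x00\x10JFIF'))
```

For a real file you could read the first few bytes with `open(path, 'rb').read(8)` and pass them to the function.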

You can find my proof of work on GitHub
