Simulating virtual key-presses and mouse clicks seems like something that only an experienced programmer with a detailed understanding of how computer operating systems work can tackle. However, if you know a little Python, you can take advantage of a really handy module that will allow you to control your computer programmatically. This module is called PyAutoGUI, and it makes it incredibly easy to do almost everything you could do on a computer using a keyboard and mouse with a few simple commands.
To follow along with this article, all you need are the basics of Python (variables, loops, functions). However, PyAutoGUI makes everything so simple that you should still be able to follow along if you come from an R background or have any familiarity with any programming language.
You can install PyAutoGUI by running
pip install pyautogui
on Windows and
pip3 install pyautogui
on macOS.
If you’re on Linux, open your terminal and run the following commands:
sudo apt-get install scrot
sudo apt-get install python3-tk
sudo apt-get install python3-dev
pip install --user pyautogui
If you’ve never worked with Python locally or come from a different programming language, here’s a video that walks through setting up Python on a computer: Python Tutorial for Beginners 1: Install and Setup for Mac and Windows - YouTube
Before we get started, we first need to understand how locations on our screens are represented in PyAutoGUI. In a nutshell, it works the same way as the coordinate system in math except for the fact that the origin (0,0) is the top-left corner of the screen and the y-axis is “flipped”; the y-coordinates increase as you go down.
Another way of thinking about this is to read the coordinates (x, y) as a shorthand instruction for the computer: “Start at the top-left corner of the screen and move x pixels to the right and y pixels down”. For example, (200, 300) means 200 pixels to the right and 300 pixels down. This will become important when we want to move our mouse to a certain location; whereas we humans can estimate where we want the cursor to end up, PyAutoGUI requires the precise location.
To get the precise location, we can also use the MouseInfo application that comes with PyAutoGUI.
- Open up your command prompt and start Python (typically by typing python, or python3 on macOS)
- You should now see an interactive Python shell where you can type:
import pyautogui; pyautogui.mouseInfo()
You should now see a MouseInfo application that will display the coordinates of your cursor in real-time with additional information. Try moving the mouse around and see the coordinates change to get a feel for where things are on your screen.
Say you want to click on a particular location on your screen. You hover your mouse over that point and see the corresponding coordinates on the MouseInfo application.
In this case, my cursor was 645 pixels to the right and 604 pixels down from the top-left corner when I took this screenshot.
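If you would rather read coordinates from a script than from the MouseInfo window, PyAutoGUI also exposes size(), position(), and onScreen(). The sketch below is only an illustration: describe_point and report_cursor are hypothetical helper names, not part of PyAutoGUI, and the import sits inside the function so that defining it has no side effects.

```python
def describe_point(x, y):
    """Spell out an (x, y) screen coordinate in plain words."""
    return f"{x} pixels right and {y} pixels down from the top-left corner"


def report_cursor():
    """Print the screen size and where the cursor currently is."""
    import pyautogui  # imported here so nothing runs until you call this function

    width, height = pyautogui.size()   # screen resolution in pixels
    x, y = pyautogui.position()        # current cursor coordinates
    print(f"Screen size: {width} x {height}")
    print("Cursor is " + describe_point(x, y))
    # onScreen() reports whether a coordinate lies within the screen bounds
    print("Is (200, 300) on the screen?", pyautogui.onScreen(200, 300))


print(describe_point(200, 300))
# prints: 200 pixels right and 300 pixels down from the top-left corner
```

Calling report_cursor() in the interactive shell prints the values for your own screen, which will differ from mine.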
- Move your mouse to any location on the screen:
pyautogui.moveTo(100, 100)  # move to point (100, 100)
pyautogui.moveTo(100, 100, duration=2)  # motion lasts for 2 seconds
- Click on a particular location
pyautogui.click()  # left click at the current position of the cursor
pyautogui.click(300, 500)  # left click at screen location (300, 500)
pyautogui.click(button='right')  # right click
pyautogui.click(button='middle')  # middle click (scroll button)
- Double Click
pyautogui.doubleClick()  # same optional parameters as click()
- Drag your mouse (useful for dragging application windows)
- Scroll (the actual scroll amount depends on your computer settings so you may need to experiment a few times before you get the desired amount of scrolling)
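As a rough illustration of the last two items, here is a minimal sketch using dragTo(), drag(), and scroll(). The coordinates, durations, and scroll amounts are arbitrary values I picked for the example, and drag_and_scroll_demo is a hypothetical wrapper name; the import is placed inside the function so that nothing moves until you call it.

```python
def drag_and_scroll_demo():
    """Drag the mouse twice, then scroll up and back down."""
    import pyautogui  # imported here so defining the function has no side effects

    # Hold the left button and drag from the current position to the point (400, 300)
    pyautogui.dragTo(400, 300, duration=1, button='left')

    # Drag relative to the current position: 150 pixels right, 0 pixels down
    pyautogui.drag(150, 0, duration=0.5)

    # Positive values scroll up, negative values scroll down
    pyautogui.scroll(10)
    pyautogui.scroll(-10)
```

A non-zero duration tends to make drags more reliable, since some applications ignore a drag that happens instantly.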
We can also simulate keyboard presses by passing in the key we want to press into various functions related to the keyboard.
- Press any key on your keyboard
pyautogui.press('enter')  # press and release the Enter key
pyautogui.press(['shift', 'left'])  # press Shift, then the left arrow, one after the other
- Write text into a text field (e.g., a search bar)
pyautogui.write('Hello, World! This text is being typed from my Python script!')
pyautogui.write('I pause for 0.25 seconds after each character', interval=0.25)
- Issue keyboard shortcuts
- Hold down on a key
- Release a held-down key (press() is just shorthand for keyDown() followed by keyUp() applied to the same key)
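The last three items can be sketched with hotkey(), keyDown(), and keyUp(). This is only an example under my own assumptions: the Ctrl+C shortcut and the Shift plus left-arrow selection are arbitrary choices, and keyboard_demo is a hypothetical wrapper name.

```python
def keyboard_demo():
    """Issue a shortcut, then hold Shift while pressing the left arrow."""
    import pyautogui  # imported here so nothing is typed until you call this function

    # Keyboard shortcut: keys are pressed in order, then released in reverse order
    pyautogui.hotkey('ctrl', 'c')

    # Hold Shift down, tap the left arrow three times, then let go of Shift
    pyautogui.keyDown('shift')
    try:
        pyautogui.press('left', presses=3)
    finally:
        # Always release the key, even if something goes wrong in between
        pyautogui.keyUp('shift')
```

The try/finally is a design choice rather than a requirement: it avoids leaving Shift stuck down if the script is interrupted halfway.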
The full list of keys PyAutoGUI supports can be found here: Keyboard Control Functions — PyAutoGUI documentation
Note that these mouse clicks and key-presses often happen faster than your computer can load applications. You can set pyautogui.PAUSE = 0.5 at the beginning of your file to pause your script for 0.5 seconds after every PyAutoGUI command (see the 2nd project in the next section for an example). I found 0.5 works well for my computer, but you can experiment with various values to find the optimal delay for your system. You can also delay a single command by importing the time module and calling time.sleep(), passing in the number of seconds you want to wait before the next command runs.
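Here is a small sketch that combines the two kinds of delay. It assumes a Windows machine where the win key opens the Start menu search; open_and_type and the 3-second default are hypothetical choices of mine, not anything PyAutoGUI prescribes.

```python
import time


def open_and_type(app_name, text, load_seconds=3):
    """Launch an app from the Start menu, wait for it to load, then type into it."""
    import pyautogui  # imported here so nothing happens until you call this function

    pyautogui.PAUSE = 0.5      # short pause after every PyAutoGUI command

    pyautogui.press('win')     # open the Start menu (Windows only)
    pyautogui.write(app_name)
    pyautogui.press('enter')

    time.sleep(load_seconds)   # one-off, longer wait while the application starts
    pyautogui.write(text)
```

For example, open_and_type('Notepad', 'Hello from Python') should open Notepad and type the greeting once the window has had time to appear.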
Finally, PyAutoGUI also comes with a built-in fail-safe feature: if you accidentally lose control of your computer to the program (e.g., commands in an infinite while loop), you can quickly slide your mouse to the top-left corner of the screen (0, 0) to force-quit the program at any point.
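Under the hood, the fail-safe works by raising pyautogui.FailSafeException from the next PyAutoGUI call, so a script can also catch it and shut down cleanly. The sketch below assumes a harmless loop that only moves the cursor in a small square; wander_until_stopped and its sizes and timings are hypothetical.

```python
def wander_until_stopped(max_laps=20):
    """Move the cursor in a square until done or until the fail-safe fires."""
    import pyautogui  # imported here so nothing moves until you call this function

    pyautogui.FAILSAFE = True  # already the default; set only to make it explicit
    laps = 0
    try:
        for _ in range(max_laps):
            # Trace a 100-pixel square relative to the current cursor position
            pyautogui.move(100, 0, duration=0.25)
            pyautogui.move(0, 100, duration=0.25)
            pyautogui.move(-100, 0, duration=0.25)
            pyautogui.move(0, -100, duration=0.25)
            laps += 1
    except pyautogui.FailSafeException:
        # Raised when the cursor is found in the screen corner
        print("Fail-safe triggered, stopping early.")
    return laps
```

Run it and shove the mouse into the top-left corner partway through: the function should return early with the number of squares it completed.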
As you can see, PyAutoGUI makes it very simple to access your keyboard and mouse. You can sort of think of pyautogui as an object that represents the input devices (mouse and keyboard) and has various predefined methods you can use.
Here are some random projects that I made with this module:
- A small script that draws a perfect star in a paint application such as MS Paint or Paintbrush:
import math
import pyautogui
pyautogui.drag(100, 0)
pyautogui.drag(-100 * math.cos(math.radians(36)), 100 * math.sin(math.radians(36)))
pyautogui.drag(100 * math.cos(math.radians(72)), -100 * math.sin(math.radians(72)))
pyautogui.drag(100 * math.cos(math.radians(72)), 100 * math.sin(math.radians(72)))
pyautogui.drag(-100 * math.cos(math.radians(36)), -100 * math.sin(math.radians(36)))
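The five drag calls follow a pattern: every stroke has the same length and points 144 degrees further around than the previous one. One way to make that explicit is with a loop. This is a sketch of the same geometry rather than a replacement for the script, and star_offsets and draw_star are hypothetical names.

```python
import math


def star_offsets(side=100):
    """Return the five (dx, dy) drag offsets that trace a five-pointed star."""
    offsets = []
    for i in range(5):
        angle = math.radians(144 * i)  # each stroke turns another 144 degrees
        offsets.append((side * math.cos(angle), side * math.sin(angle)))
    return offsets


def draw_star(side=100):
    """Drag the mouse along each stroke; click into a paint canvas first."""
    import pyautogui  # imported here so nothing is drawn until you call this function

    for dx, dy in star_offsets(side):
        pyautogui.drag(dx, dy)
```

Because the five offsets add up to zero, the cursor ends where it started and the star closes.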
- This one opens up Dataquest on Chrome and Notepad, then splits the screen so I can take notes as I go along.
import pyautogui
pyautogui.PAUSE = 0.5  # Both Chrome and Notepad need time to load
pyautogui.press('win')
pyautogui.write('Chrome')
pyautogui.press('enter')
pyautogui.write('https://app.dataquest.io/dashboard')
pyautogui.press('enter')
pyautogui.hotkey('win', 'left')
pyautogui.press('win')
pyautogui.write('Notepad')
pyautogui.press('enter')
pyautogui.hotkey('win', 'right')
At this point, you should know how to write a Python script for most things you would normally do on a computer with a keyboard and mouse. However, PyAutoGUI does have some cool additional functionality that turns the module into an even more powerful tool. This article was largely based on things I found interesting in the official PyAutoGUI documentation. If you want to dig deeper, here’s a link to the documentation: Welcome to PyAutoGUI’s documentation!
If you’re a beginner and are just starting out with Python, you might be intimidated by the convoluted and technical way that documentation is often written. The documentation for PyAutoGUI is actually very simple and readable, so if you followed along with this article, you’ll be able to understand the documentation as well. This is a great way to practice reading documentation, something that you have to get used to at a certain point on your Data Science journey anyway.
The creator of PyAutoGUI, Al Sweigart, also has a book called Automate the Boring Stuff with Python available for free online. It’s a great book that dives deeper into PyAutoGUI and other ways of automating repetitive tasks on a computer using Python, like sending emails and organizing your files.
Thank you so much for reading. I hope this module was as interesting and fun for you as it was for me. I wanted to share this module because it was a fun way to use Python and a reminder that programming is a powerful tool with many applications outside of Data Science. I personally found PyAutoGUI really motivating and a fun way to keep up my Python skills. I encourage you to try it out yourself because there’s something very satisfying about running the script and seeing the computer simulate mouse clicks and keypresses by itself.