Programmatically Control Your Keyboard and Mouse Through a Python Script

Simulating virtual key-presses and mouse clicks seems like something that only an experienced programmer with a detailed understanding of how computer operating systems work can tackle. However, if you know a little Python, you can take advantage of a really handy module that will allow you to control your computer programmatically. This module is called PyAutoGUI and makes it incredibly easy to do everything you could ever do on a computer using a keyboard and mouse with a few simple commands.

To follow along with this article, all you need are the basics of Python (variables, loops, functions). However, PyAutoGUI makes everything so simple that you should still be able to follow along if you come from an R background or have any familiarity with any programming language.

Installation

You can install PyAutoGUI by running pip install pyautogui on Windows and pip3 install pyautogui on macOS.
If you’re on Linux, open your terminal and run the following commands:

sudo apt-get install scrot
sudo apt-get install python3-tk
sudo apt-get install python3-dev
pip install --user pyautogui

If you’ve never worked with Python locally or come from a different programming language, here’s a video that walks through setting up Python on a computer: Python Tutorial for Beginners 1: Install and Setup for Mac and Windows - YouTube

The Screen Coordinate System

Before we get started, we first need to understand how locations on our screens are represented in PyAutoGUI. In a nutshell, it works the same way as the coordinate system in math except for the fact that the origin (0,0) is the top-left corner of the screen and the y-axis is “flipped”; the y-coordinates increase as you go down.

Another way of thinking about this is to read the coordinates (x, y) as shorthand instructions for the computer: “Start at the top-left corner of the screen and move x pixels to the right and y pixels down”. For example, (200, 300) means 200 pixels to the right and 300 pixels down. This will become important when we want to move our mouse to a certain location; whereas we humans can estimate where we want the cursor to end up, PyAutoGUI requires the precise location.
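
To make the “flipped” y-axis concrete, here is a minimal sketch that converts a point measured the math-class way (from the bottom-left corner, y growing upward) into screen coordinates. The helper name math_to_screen and the 1920x1080 display are assumptions for illustration only; they are not part of PyAutoGUI.

```python
def math_to_screen(x, y, screen_height):
    """Convert a point measured from the bottom-left corner (y grows upward)
    into screen coordinates measured from the top-left corner (y grows downward)."""
    # Pixel rows run from 0 to screen_height - 1, so the flip is around the last row.
    return x, screen_height - 1 - y


# Example on a hypothetical 1920x1080 display. With PyAutoGUI installed,
# pyautogui.size() reports the real width and height of your screen.
print(math_to_screen(0, 0, 1080))      # bottom-left corner -> (0, 1079)
print(math_to_screen(200, 779, 1080))  # -> (200, 300)
```

The x-coordinate is unchanged; only y needs flipping.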

To get the precise location, we can also use the MouseInfo application that comes with PyAutoGUI.

  1. Open up your command prompt and type in python
  2. You should now see an interactive Python shell where you can type: import pyautogui; pyautogui.mouseInfo()

You should now see a MouseInfo application that will display the coordinates of your cursor in real-time with additional information. Try moving the mouse around and see the coordinates change to get a feel for where things are on your screen.

Say you want to click on a particular location on your screen. You hover your mouse over that point and see the corresponding coordinates on the MouseInfo application.

In this case, my cursor was 645 pixels to the right of and 604 pixels down from the top-left corner when I took this screenshot.

Scripting Your Mouse

  • Move your mouse to any location on the screen:
pyautogui.moveTo(100, 100)  # move to point (100, 100)
pyautogui.moveTo(100, 100, duration=2)  # motion lasts for 2 seconds
  • Click on a particular location
pyautogui.click()  # left click current position of cursor
pyautogui.click(300, 500)  # left click at screen location (300, 500)
pyautogui.click(button='right')  # right click
pyautogui.click(button='middle')  # middle click (scroll button)
  • Double Click
pyautogui.doubleClick()  # same optional parameters as click()
  • Drag your mouse (useful for dragging application windows)
pyautogui.dragTo(200, 300)
  • Scroll (the actual scroll amount depends on your computer settings so you may need to experiment a few times before you get the desired amount of scrolling)
pyautogui.scroll(500)
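
The mouse functions above come in absolute and relative flavors, which is easy to mix up. The following is a small sketch, not a definitive recipe: the function name mouse_demo and its gui parameter are assumptions made so the routine can be dry-run with a stand-in object. In real use you would pass the pyautogui module itself.

```python
def mouse_demo(gui):
    """Run a short mouse routine; pass the pyautogui module as `gui`."""
    gui.moveTo(100, 100, duration=0.5)  # absolute: go to screen point (100, 100)
    gui.move(50, 0)                     # relative: 50 pixels right of the current position
    gui.dragTo(200, 300, duration=0.5)  # absolute drag with the left button held down
    gui.drag(0, -100, duration=0.5)     # relative drag: 100 pixels up (y decreases upward)
    gui.scroll(-500)                    # negative values scroll down, positive values scroll up
```

To run it for real: import pyautogui, then call mouse_demo(pyautogui).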

Scripting Your Keyboard

We can also simulate keyboard presses by passing in the key we want to press into various functions related to the keyboard.

  • Press any key on your keyboard
pyautogui.press('enter')
pyautogui.press(['shift', 'left'])
  • Write text into a text field (e.g., a search bar)
pyautogui.write('Hello, World! This text is being typed from my Python script!')
pyautogui.write('I pause for 0.25 seconds after each character', interval=0.25)
  • Issue keyboard shortcuts
pyautogui.hotkey('ctrl', 'c')
  • Hold down on a key
pyautogui.keyDown('space')
  • Release a held-down key (press() is just shorthand for keyDown() and keyUp() applied to the same key)
pyautogui.keyUp('space')
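
One risk with keyDown() is forgetting the matching keyUp(), which leaves the key "stuck" if the script errors out in between. Here is a hedged sketch of one way to guard against that. The names held_key and select_three_characters, and the gui parameter, are hypothetical helpers for illustration; pass the pyautogui module as gui when running it for real.

```python
from contextlib import contextmanager


@contextmanager
def held_key(gui, key):
    """Hold `key` down for the duration of a with-block, then always release it."""
    gui.keyDown(key)
    try:
        yield
    finally:
        gui.keyUp(key)  # runs even if the block raises, so the key is never left held


def select_three_characters(gui):
    """Hold Shift and press the left arrow three times to select text."""
    with held_key(gui, 'shift'):
        gui.press(['left', 'left', 'left'])  # a list is pressed one key at a time, in order
```

Usage: import pyautogui, click into a text field, then call select_three_characters(pyautogui).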

The full list of keys PyAutoGUI supports can be found here: Keyboard Control Functions — PyAutoGUI documentation

Note that these mouse clicks and key-presses often happen faster than your computer can load applications. You can set pyautogui.PAUSE = 0.5 at the beginning of your file to pause your script for 0.5 seconds after every PyAutoGUI command (see the 2nd project in the next section for an example). I found 0.5 works well for my computer, but you can experiment with various values to find the optimal delay for your system. You can also delay a single command by importing the time module and calling time.sleep() with the number of seconds to wait before the next command runs.
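
As a rough sketch of how the two kinds of delay can be combined: the function below sets the global pause and adds one longer, one-off wait for a slow step. The name open_app_and_type, the gui parameter, and the 3-second default are assumptions for illustration, and the 'win' key press only makes sense on Windows.

```python
import time


def open_app_and_type(gui, app_name, text, load_delay=3.0):
    """Launch an app from the Start menu and type into it; pass pyautogui as `gui`."""
    gui.PAUSE = 0.5         # global: wait 0.5 s after every PyAutoGUI call
    gui.press('win')        # open the Start menu (Windows only)
    gui.write(app_name)
    gui.press('enter')
    time.sleep(load_delay)  # one-off: give the application extra time to finish loading
    gui.write(text)
```

For example, open_app_and_type(pyautogui, 'Notepad', 'Hello!') would wait about three seconds for Notepad before typing. The right load_delay depends on your machine.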

Finally, PyAutoGUI also comes with a built-in fail-safe feature: if you accidentally lose control of your computer to the program (e.g., commands in an infinite while loop), you can quickly slide your mouse to the top-left corner of the screen (0, 0). The next PyAutoGUI call then raises a pyautogui.FailSafeException, which stops the script unless you catch it.
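
If you would rather have a long-running loop stop cleanly than crash with a traceback, you can catch the fail-safe exception. This is a minimal sketch under stated assumptions: repeat_until_failsafe and its gui and action parameters are hypothetical names, and action is expected to call PyAutoGUI functions, since the fail-safe check only happens inside those calls.

```python
def repeat_until_failsafe(gui, action, max_steps=1000):
    """Call action() up to max_steps times; stop cleanly if the fail-safe fires.

    Pass the pyautogui module as `gui`. Returns how many calls completed.
    """
    done = 0
    try:
        for _ in range(max_steps):
            action()  # should use PyAutoGUI calls, which perform the fail-safe check
            done += 1
    except gui.FailSafeException:
        print("Fail-safe triggered; stopping.")
    return done
```

Usage: repeat_until_failsafe(pyautogui, lambda: pyautogui.click()) keeps clicking until you slide the mouse into the corner or the step limit is reached.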

Sample Projects:

As you can see, PyAutoGUI makes it very simple to access your keyboard and mouse. You can sort of think of pyautogui as an object that represents the input device (mouse and keyboard) which has various predefined methods you can use.

Here are some random projects that I made with this module:

  • A small script that draws a five-pointed star in a paint application such as MS Paint or Paintbrush:
import math
import pyautogui

pyautogui.drag(100, 0)
pyautogui.drag(-100 * math.cos(math.radians(36)), 100 * math.sin(math.radians(36)))
pyautogui.drag(100 * math.cos(math.radians(72)), -100 * math.sin(math.radians(72)))
pyautogui.drag(100 * math.cos(math.radians(72)), 100 * math.sin(math.radians(72)))
pyautogui.drag(-100 * math.cos(math.radians(36)), -100 * math.sin(math.radians(36)))
  • This one opens up Dataquest on Chrome and Notepad, then splits the screen so I can take notes as I go along.
import pyautogui

pyautogui.PAUSE = 0.5  # Both Chrome and Notepad need time to load

pyautogui.press('win')
pyautogui.write('Chrome')
pyautogui.press('enter')
pyautogui.write('https://app.dataquest.io/dashboard')
pyautogui.press('enter')
pyautogui.hotkey('win', 'left')

pyautogui.press('win')
pyautogui.write('Notepad')
pyautogui.press('enter')
pyautogui.hotkey('win', 'right')
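
Looking back at the star script, each of its five drags points 144 degrees further around than the previous one, so the offsets can be generated in a loop instead of being written out by hand. The sketch below reproduces the same five offsets; star_offsets, draw_star, and the gui parameter are hypothetical names for illustration.

```python
import math


def star_offsets(side=100):
    """Return the five (dx, dy) drag offsets of a five-pointed star.

    Each edge turns 144 degrees from the previous one. On screen, positive dy is down.
    """
    return [
        (side * math.cos(math.radians(144 * i)),
         side * math.sin(math.radians(144 * i)))
        for i in range(5)
    ]


def draw_star(gui, side=100):
    """Drag out the star from the current cursor position; pass pyautogui as `gui`."""
    for dx, dy in star_offsets(side):
        gui.drag(dx, dy)
```

Because the five offsets sum to zero, the cursor ends where it started, which is what closes the star.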

Further Research

At this point, you should know how to write a Python script for most things you could do on a computer with a keyboard and mouse. However, PyAutoGUI does have some cool additional functionality that turns the module into an even more powerful tool. This article was largely based on things I found interesting in the official PyAutoGUI documentation. If you want to dig deeper, here’s a link to the documentation: Welcome to PyAutoGUI’s documentation! — PyAutoGUI documentation.

If you’re a beginner and are just starting out with Python, you might be intimidated by the convoluted and technical way that documentation is often written. The documentation for PyAutoGUI is actually very simple and readable, so if you followed along with this article, you’ll be able to understand it as well. This is a great way to practice reading documentation, which is something you have to get used to at some point on your Data Science journey anyway.

The creator of PyAutoGUI, Al Sweigart, also has a book called Automate the Boring Stuff with Python available for free online. It’s a great book that dives deeper into PyAutoGUI and other ways of automating repetitive tasks on a computer using Python, like sending emails and organizing your files.

Conclusion

Thank you so much for reading. I hope this module was interesting and fun for you as much as it was for me. I wanted to share this module because it was a fun way to use Python and a reminder that programming is a powerful tool with many applications outside of Data Science. I personally found PyAutoGUI really motivating and a fun way to keep up my Python skills. I encourage you to try it out yourself because there’s something very satisfying about running the script and seeing the computer simulate mouse clicks and keypresses by itself.


I assume the last line here writes into the url bar of chrome? How does it know to do this? Is there something under the hood that automatically moves mouse to url bar and clicks there so it becomes active?

This makes me think pyautogui.write behaves differently based on what application is currently open?

Actually, that’s not a feature of PyAutoGUI. Chrome has the search bar selected by default when it first opens. Other browsers should work the same way, but you can always click the search bar explicitly with pyautogui.click(), using coordinates you looked up in the MouseInfo application.
The pyautogui.write() function should behave the same way on any application; it simply sends text into whatever input field you have selected. If the search bar wasn’t selected for example, it would basically do nothing because there is nowhere to send the text to.
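
If you don’t want to rely on the browser focusing the address bar for you, one option is to focus it explicitly before typing. This is only a sketch under assumptions: Ctrl+L is the address-bar shortcut in common browsers on Windows and Linux (it is not something PyAutoGUI provides), and open_url_in_focused_browser and its gui parameter are hypothetical names.

```python
def open_url_in_focused_browser(gui, url):
    """Type a URL into the active browser window; pass pyautogui as `gui`."""
    gui.hotkey('ctrl', 'l')  # assumption: Ctrl+L focuses the address bar (Windows/Linux browsers)
    gui.write(url)           # the text goes to whatever field currently has focus
    gui.press('enter')
```

The browser window itself still has to be the active window when this runs.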


Thanks for turning me on to this!
I was able to automate opening my password manager, saving me a lot of time!


@kimhyunseop: Interesting read… I’m thinking of the other side of things since I’m a security student. Is it possible for it to be used in a malicious way? I could see it being packaged with other libraries or being used after attackers compromise systems (if they have shell access), somewhat like running Remote Desktop Protocol and controlling one’s screen but in this case using the terminal (and if of course the victim has python installed and added to path for pip to work lol).

Just some thoughts that came about when I read the article… I think it’s good that we can look at the pluses and minuses of such applications lol… since computers just happily execute code they are given :sweat_smile:

@masterryan.prof:
Hmmm… I didn’t think of that. Python is actually used extensively in ethical hacking, meaning you technically can use it to create malicious scripts. However, I don’t really see attackers using PyAutoGUI because there are much more sophisticated modules designed specifically for compromising systems. That would be very inefficient and sort of silly :sweat_smile:. (Attackers will also probably get caught if they use Python illegally since cybersecurity agencies are bound to know about these open-source tools as well.)

You did remind me to add something I forgot to the article though: You can always force-quit whatever PyAutoGUI is doing by quickly sliding your mouse to the top-left corner of your screen.

@kimhyunseop: Hmm true in ways…

Ahh i see… interesting. Will explore more about it when I have the time!