AutoHotKey: Automate Windows with a Third-party Scripting Language

If you’re like me, you’re always looking for ways to increase productivity. This search can involve lots of different technologies. One limitation on any technology is how well it fits the scope of the problem: if you need to open 20 documents in Open Office and run a filter to convert them to Docbook, the best solution would probably not involve writing Java. For these simple operations one tends to use scripting languages that have interfaces appropriate to the task at hand.

There are problems inherent in using a scripting language:

  1. You will need to learn the interfaces the language provides for your task.
  2. Sometimes your preferred language does not provide any easy interface, so you are stuck either doing the task manually, doing the task in a difficult manner, or learning a new language for the task.

In the example of converting 20 documents to Docbook you’d probably find it makes sense to do this task manually because time constraints make this temporarily the most productive solution.

In Windows, which I expect a lot of us have to deal with in one way or another, there is a solution: use the graphical nature of Windows to automate things. This solution provides several benefits:

  1. Often the interfaces provided by a program menu or hotkey combination are not matched directly by the API of a particular program, thus something that can be represented by one or two mouse clicks must be done by many lines in a scripting language.
  2. By being able to send keystrokes and mouse movements directly to an application you can create a cross-application scripting language that will use the same interfaces for a large number of applications.

There are a number of ways to automate keystrokes and other user inputs in Windows, but the best I have found for my needs is the AutoHotKey scripting language.

Running AutoHotKey
As its name implies, AutoHotKey is useful for making HotKeys. A HotKey is a combination of keys pressed together that causes an application to do something; an example is the well-known Ctrl-Alt-Del keystroke combination.

AutoHotKey scripts have the extension .ahk; when you right-click one you can choose to run or compile it. When you choose to run the script, it is loaded into the notification area of your taskbar and can be accessed from there.

You can run multiple AutoHotKey scripts at the same time; all of them will load in the taskbar. If you are using the scripts to catch keyboard or other user interface input you will have to make sure that concurrently running scripts do not have key collisions or conflict in other ways (see Figure 1).

Figure 1. The screenshot shows an AutoHotKey script running in the taskbar with the right-click context menu open. The big green H is the icon for an AutoHotKey script.

AutoHotKey can do pretty much anything a normal scripting language can. It has the ability to build GUIs and interact with protocols such as HTTP, but where it really shines is catching data from user interface devices and sending user interface device data to running applications. In this article I will focus on catching and sending keyboard data to applications as an example of this functionality.

Using AutoHotKey to Make HotKeys
The following is an example of a functioning AutoHotKey script from the AutoHotKey documentation:

    #n::Run Notepad

The # sign is shorthand for the Windows Key found on many keyboards, the n is the letter n, and the double colon (::) assigns the Run Notepad command to the combination of the Windows Key and the letter n pressed simultaneously. Of course, the Windows Key is already used for a lot of hotkey combinations: pressing it alone opens the Start menu in Windows XP, while pressing the Windows Key and the letter e simultaneously opens Windows Explorer rooted on the ‘My Computer’ special folder. Remember that overriding Windows Key functionality can have unintended side effects if you are not careful.
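
The # sign is only one of several modifier symbols. As a minimal sketch in the same syntax, the hotkeys below combine other modifiers; the key choices and the programs launched are illustrative assumptions, not taken from the AutoHotKey documentation example above.

```autohotkey
; Modifier symbols used in hotkey definitions:
;   #  Windows Key    ^  Ctrl    !  Alt    +  Shift
; The key choices below are illustrative assumptions.
^!n::Run Notepad     ; Ctrl+Alt+n launches Notepad
#+c::Run calc.exe    ; Windows Key+Shift+c launches Calculator
```

Picking combinations that Windows does not already use, such as Ctrl+Alt plus a letter, is one way to avoid the side effects of overriding built-in Windows Key shortcuts.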

Run is a command that runs an external program. If you pass it a system-wide command (for those applications contained in the Windows folder), such as Notepad, it will execute that command. If you pass it a predefined registry CLSID, such as the CLSID for My Computer, {20d04fe0-3aea-1069-a2d8-08002b30309d}, it will open the My Computer folder, as shown:

    #e::Run ::{20d04fe0-3aea-1069-a2d8-08002b30309d}

This combination overrides the Windows Key and letter e combination to do exactly the same thing that Windows does anyway. If the string is the name of a file or shortcut in the same folder as your AutoHotKey script it will open that file or shortcut; if the string is a URL it will open that URL in your default browser. If a file, command, or URL cannot be found, AutoHotKey displays an error message. (See the sidebar “Three HotKey Examples, One Error Message,” for more examples.)

One of the benefits of Run is that it will also pass parameters to the application it launches, for example the command below not only runs Internet Explorer but it tells IE what Web address to direct to:

    Run iexplore.exe http://www.devx.com

What if you want to open an application and send keystrokes or other input to it? The way I have been doing it so far won’t work because text to the right of the file or URL in the Run command is a parameter passed to the appropriate application. Here’s one solution that performs several actions from one key combination:

    #n::
    Run Notepad
    Sleep, 200
    Send "hello Notepad"
    return

This example allows multiple actions and introduces two more commands. The Sleep command pauses script execution for a time between 0 milliseconds and 24 days. For the uses demonstrated in this article most sleep times will be a few hundred milliseconds, long enough to open the new application. The Send command sends the sequence of keys that follows it; be aware that in the classic command syntax used here the quotation marks are typed along with the text between them, so omit them if you do not want them to appear. Note that a fixed delay is somewhat fragile: if you want to make sure Notepad is open before sending the text to it, there are ways to wait until the opened application has the focus. For a quick and dirty automation, though, this is more than adequate.
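
One way to wait for the window instead of sleeping for a fixed time is the WinWaitActive command. The following is a sketch of a variant of the hotkey, not the article's own listing; it assumes that the Notepad window has the window class Notepad and that a three-second timeout is acceptable.

```autohotkey
#n::
Run Notepad
; Wait up to 3 seconds for a Notepad window to become active.
; Assumption: the window class is "Notepad".
WinWaitActive, ahk_class Notepad, , 3
if ErrorLevel    ; ErrorLevel is set to 1 when the wait times out
{
    MsgBox, Notepad did not become active in time.
    return
}
Send hello Notepad   ; no quotation marks, so only the words are typed
return
```

The script only types once the target window actually has the focus, which avoids sending keystrokes to whatever application happened to be active when a fixed delay ran out.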

Author’s Note: An alternate syntax uses the & sign to separate the keys in your HotKey combination; for example, LWin & n::Run Notepad works much the same as #n::Run Notepad. With this syntax both keys are written as key names, so the # shorthand is replaced by LWin (the left Windows Key). The reason for doing this is that you may want to combine a named key, such as Space, with a single-character key such as n.

Remapping Keys with AutoHotKey
You could use what we’ve already learned to remap keys with AutoHotKey, for example:

    x::y

remaps the letter x to the letter y. Reasons to remap keys could be:

  1. You have a ‘non-standard’ keyboard.
  2. You want to send characters that are not otherwise accessible.
  3. You find certain keys easier to use than others.
  4. You want to use a non-standard keyboard layout, such as the Colemak keyboard layout implemented using AutoHotKey.

One thing to note is that remapping is case insensitive when you use lowercase, so remapping x::y will also remap X to Y; if you use uppercase, only the uppercase is remapped. For example, X::Y remaps uppercase X to uppercase Y and leaves lowercase x unaffected, while the remapping x::Y remaps both x and X to Y. (See the sidebar “Simple, Impermanent Mapping,” for more on the pros and cons of remapping keys.)

Remember that other input devices can be remapped besides keyboards. The following remaps the right mouse button to the letter y (no reason to do that, it’s just an example):

    Rbutton::y

A Simple Example of Usage
It is useful to provide HotKeys that work not just across the operating system but also in specific applications. One strategy is to have a script with a set of HotKeys useful for working in a particular application. When you intend to work with that application, you run that script from a master script that holds the starting hotkeys. This could also be achieved by having the script check whether the application you want to use your hotkeys with is active, but I prefer to do it with a hotkey that I can control.
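
For comparison, the alternative of checking which application is active can be sketched with the #IfWinActive directive. This is not the approach used in the rest of this article; the Notepad window class and the Ctrl+j hotkey are assumptions chosen only for illustration.

```autohotkey
; Hotkeys below this directive fire only while a Notepad window is active.
; Assumption: the window class is "Notepad".
#IfWinActive ahk_class Notepad
^j::Send Hello from a Notepad-only hotkey
#IfWinActive   ; an empty directive ends the context-sensitive section
```

With this style the hotkeys are always loaded but stay dormant in other applications, whereas the master-script approach keeps them unloaded until you explicitly ask for them.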

Below is an example AutoHotKey script for starting more scripts. It uses the Windows Key + g keystrokes to Run the address of Gmail and then load a script called gmail.ahk. The same thing is done with the Windows Key and the f key for Google Reader, and the Windows Key and the p key loads Acrobat.ahk.

    #g::
      Run http://mail.google.com
      Run gmail.ahk
    return

    #f::
      Run http://www.google.com/reader
      Run reader.ahk ;this overrides the Windows hotkey for search, you may want to change it
    return

    #p::Run Acrobat.ahk

The different scripts started of course need to be defined. Gmail.ahk, which provides some hotkeys for use with Gmail, is shown in Listing 1; Reader.ahk, which shows some hotkeys for use with Google Reader, is shown in Listing 2; and Acrobat.ahk, which has some hotkeys for Adobe Acrobat, is shown in Listing 3.

As you can see from the three listings, you now have functions. The functions, although named differently in each example, are all essentially the same. They take a numerical parameter and send a sequence of keys, after which they recursively call themselves, passing the parameter minus 1, until the input parameter equals 0. If the input parameter equals 0 the functions won’t send any keys. (See the sidebar, “Automating Text Output,” for an example of this method.)
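
The recursive pattern described above can be sketched as follows. This is not one of the listings: the function name SelectDown, the keys it sends, and the Ctrl+Alt+3 hotkey are all hypothetical, chosen to match Gmail's x (select) and j (move down) shortcuts.

```autohotkey
; Hypothetical helper, not taken from the article's listings.
; Selects "count" messages, moving down the list.
SelectDown(count)
{
    if (count <= 0)        ; base case: send nothing when the parameter reaches 0
        return
    Send xj                ; x selects the current message, j moves to the next one
    SelectDown(count - 1)  ; recurse with the parameter reduced by 1
}

; Assumed example hotkey: Ctrl+Alt+3 selects three messages.
^!3::SelectDown(3)
```

A Loop would do the same job; the recursive form is shown only because it mirrors the description of the listings.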

Listing 1 (gmail.ahk) is the only example that should require more detailed explanation. Basically it has different ways to select and deselect multiple emails at a time. The keys j and x are built-in Gmail hotkeys: the j key moves down the list of emails, and the x key selects an email. Note that you will have to enable keyboard shortcuts in Gmail for this to work. Because of these built-in hotkeys you can essentially make super hotkeys that perform a number of common hotkey combinations at once. (Two things to note quickly here. First, the number 2, as used in Listing 1, is the number at the top of a standard keyboard; the number 2 on a numerical pad is addressed by a different key name. Second, hitting the delete button and the down arrow button simultaneously when gmail.ahk is running will create an input box. The numerical value passed to this input box will cause the function to select that number of emails down from the current position. The delete and up buttons will select emails up from the current position.)
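
The input box behavior described above might look roughly like this. It is a sketch under assumptions rather than the contents of Listing 1: the variable name, the dialog text, and the short delay are invented for illustration, and a Loop replaces the recursive function only to keep the example self-contained.

```autohotkey
; Delete+Down: ask how many messages to select, then select that many going down.
Delete & Down::
InputBox, count, Select messages, How many messages should be selected?
if ErrorLevel              ; the dialog was cancelled
    return
if count is not integer    ; ignore anything that is not a whole number
    return
if (count < 1)
    return
Sleep, 200                 ; assumed delay so focus can return to the browser
Loop, %count%
    Send xj                ; x selects the message, j moves down one
return

; Delete is now a prefix key, so give it back its normal function when used alone.
Delete::Send {Delete}
```

A matching Delete & Up hotkey would send xk instead, since k is Gmail's shortcut for moving up the list.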

Once you’ve loaded your application-specific AutoHotKey script you will want a way to unload it too. This is provided by the command ExitApp, which exits a script with hotkeys in it. The keys used are delete and uppercase G for gmail.ahk, delete and uppercase F for the reader script, and delete and uppercase P for the PDF script. You will remember that the lowercase versions of these keys were used to open the scripts. With a lot of extended hotkey functionality in your system you will want an easy way to start, shut down, and list the hotkeys in your scripts. Listing of hotkeys is provided by the command ListHotkeys.

Each of the scripts has a hotkey that calls ListHotKeys with the insert key and the same letter that is used to start the script from the master script.
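
As a rough sketch of how the unload and listing hotkeys described for gmail.ahk could be written (the exact definitions in the listings may differ):

```autohotkey
Delete & G::ExitApp      ; Delete+G unloads this script
Insert & G::ListHotkeys  ; Insert+G shows the hotkeys this script defines
```

Keeping the same letter for starting, listing, and exiting each script means there is only one key to remember per application.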

Taking AutoHotKey Further
I hope this article gives a nice overview of some of AutoHotKey’s capabilities. Some other things that AutoHotKey makes especially easy that this article does not cover are:

  1. HotStrings (Make sure to check out this cool AutoHotKey script for Gmail from LifeHacker, based on the HotStrings concept.)
  2. Remapping joystick controls
  3. Controlling individual windows (including partial transparency)

There are lots of little open source scripts written in AutoHotKey that give you some productivity-enhancing functionality. For example, sites such as 1 Hour Software have a great number of useful widgets. The AutoHotKey forum also has useful scripts to download and customize.
