Introduction to Virtual Desktop Automation

Kasper Fehrend

Senior Product Evangelist at Leapwork

In this lesson we will go through the basics of Leapwork and build a small automation flow that can be used for virtual desktop applications.

You will learn:

  • What virtual desktop automation is, and when to use this automation approach
  • The steps involved in building a simple flow with Leapwork for automating virtual desktop applications
  • The different tools available when automating virtual desktops: image recognition, text recognition, keyboard instrumentation, and mouse control


When we want to automate an application running in a browser, or a desktop application such as an office product, we use Leapwork Web Automation or Desktop Automation, respectively.

These kinds of automation use what is called Object Inspection to interact with the applications. This means Leapwork has access to the individual elements and components inside the application, which typically makes the application easier to automate.

When the application to automate is running in a virtual environment, like a Citrix desktop or receiver, the application behaves as if it were installed and accessed on a local computer. From an automation point of view, however, there is a boundary between the local computer and the applications.

Virtual applications run on a server, and what we see on the local computer is basically an image reflecting the changes to the application running on the server.

To automate this kind of application we need to interact with the boundary using the virtual desktop tools in Leapwork, like image and text recognition.
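To make the idea of image recognition concrete, here is a minimal Python sketch of the underlying principle: search a screen image for a small captured image and work out where a click would land. This is not Leapwork's implementation or API (Leapwork flows are built visually, without code). The names `find_template` and `click_point`, and the representation of images as lists of grayscale values, are assumptions made for illustration. Real tools generally allow for some tolerance when matching, whereas this sketch only finds exact matches.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: a grayscale image as rows of pixel values.
Image = List[List[int]]


def find_template(screen: Image, template: Image) -> Optional[Tuple[int, int]]:
    """Return (x, y) of the top-left corner of the first exact match, or None."""
    th, tw = len(template), len(template[0])
    sh, sw = len(screen), len(screen[0])
    for y in range(sh - th + 1):
        for x in range(sw - tw + 1):
            # Compare the template row by row against the screen region at (x, y).
            if all(screen[y + dy][x:x + tw] == template[dy] for dy in range(th)):
                return (x, y)
    return None


def click_point(screen: Image, template: Image) -> Optional[Tuple[int, int]]:
    """Return the center of the matched region, where a click would be sent."""
    pos = find_template(screen, template)
    if pos is None:
        return None
    x, y = pos
    return (x + len(template[0]) // 2, y + len(template) // 2)


# A tiny "screen" that contains a 3x3 "button" starting at column 1, row 1.
screen = [
    [0, 0, 0, 0, 0],
    [0, 9, 8, 7, 0],
    [0, 6, 5, 4, 0],
    [0, 3, 2, 1, 0],
]
button = [
    [9, 8, 7],
    [6, 5, 4],
    [3, 2, 1],
]

print(find_template(screen, button))  # top-left corner of the match
print(click_point(screen, button))    # center of the match
```

The point of the sketch is that the automation only ever sees pixels: it has no knowledge of buttons or fields, only of where a known image appears on the screen.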

Another case where we use the virtual desktop automation tools is mainframe applications.

A lot of companies, especially in the financial sector, developed large, extensive mainframe applications in the 1980s and 1990s, and these applications are very often still in use.

As with applications running in a Citrix client, we can't get access to the individual elements in a mainframe application, so we must identify objects using image and text recognition, keyboard actions, etc.

To sum up:

  • If you are automating applications running in a terminal client like Citrix, VNC, or VMware, or if you are automating a mainframe application, use virtual desktop automation.
  • If the application under automation is installed on the local computer, or on the computer where the automation will run, you should not use virtual desktop automation. This includes the vast majority of desktop applications, and it also applies to web applications.
  • For some very old desktop applications there might be instances where desktop automation doesn’t work. In these cases, you can use the image and text recognition tools to handle these specific areas of the application.

A general rule of thumb: You can combine all the automation types in a single flow - so always use the best tool for the job.
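The summary above can be read as a small decision rule. The following Python sketch is only an illustration of that rule; the `Approach` enum, the `choose_approach` function, and its parameter names are hypothetical and are not part of Leapwork.

```python
from enum import Enum


class Approach(Enum):
    WEB = "web automation"
    DESKTOP = "desktop automation"
    VIRTUAL_DESKTOP = "virtual desktop automation"


def choose_approach(
    runs_in_browser: bool,
    behind_terminal_client_or_mainframe: bool,
    object_inspection_works: bool = True,
) -> Approach:
    """Pick an automation type for one part of an application (illustrative only)."""
    # Citrix, VNC, VMware, or mainframe: only an image of the application is available.
    if behind_terminal_client_or_mainframe:
        return Approach.VIRTUAL_DESKTOP
    # Very old local applications where object inspection fails in specific areas.
    if not object_inspection_works:
        return Approach.VIRTUAL_DESKTOP
    # Locally accessible applications: use object inspection.
    return Approach.WEB if runs_in_browser else Approach.DESKTOP


print(choose_approach(runs_in_browser=False, behind_terminal_client_or_mainframe=True).value)
print(choose_approach(runs_in_browser=True, behind_terminal_client_or_mainframe=False).value)
```

Because the automation types can be combined in one flow, a rule like this would apply to each part of the application separately rather than to the flow as a whole.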

The tutorial above covers when to use virtual desktop automation, and when to use web and desktop automation. Citrix, Remote Desktop, and mainframe applications are examples of when to use virtual desktop automation.

In the tutorial, we create a simple automation flow using image recognition to click various places in an application. The example also shows how to insert text into the application and how to use text recognition to read text from applications.

Learn more: Best practices for text and image recognition.
