Sikulix 2014
Sikulix 2014
Release 1.1.0-Beta1
7 Miscellanous 85
7.1 Can I do X or Y or Z in SikuliX? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
7.2 How to run SikuliX from Command Line . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
7.3 How to use SikuliX API in your JAVA programs or Java aware scripting . . . . . . . . . . . . . . . . 88
7.4 Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
i
8.1 Hello World (Mac) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
8.2 Hello World (Windows) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
8.3 Goodbye Trash (Mac) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
8.4 Uncheck All Checkboxes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
8.5 Check the Right Checkbox . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
8.6 Working with Sliders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
8.7 Desktop Surveillance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
ii
SikuliX Documentation, Release 1.1.0-Beta1
Getting Help
Looking for specific information?
Try the Table of Contents
Look through the genindex
Use the search
See other peoples questions or ask a question yourself.
If you think youve found bugs, search or report bugs in our bug tracker.
Contents 1
SikuliX Documentation, Release 1.1.0-Beta1
2 Contents
CHAPTER 1
SikuliX at the top supports scripting via SikuliX IDE (a basic script editor to load, edit, save and run scripts including
the creation/organization of the needed images for your visual workflow).
Supported scripting languages:
Python (language level 2.7) supported by the Jython interpreter.
Ruby (language level 1.9/2.0) supported by the JRuby interpreter.
JavaScript supported by the Java builtin scripting engine (Java 7: Rhino, Java 8: Nashorn).
If you are new to programming, you can still enjoy using SikuliX to automate simple repetitive tasks without learning
one of the supported scripting languages using the SikuliX IDE.
A good start might be to have a look at the tutorials.
If you plan to write more complex scripts, which might even be structured in classes and modules, you have to dive
into the Python Language, the Ruby Language or JavaScript.
NOTE: Since Jython and JRuby are based on Java, the modules available for Python or Ruby might not be available
in the Sikulix environment. So before trying to use any non-standard modules or extension packages, you have to
check, wether they are supported in this SikuliX environment.
NOTE on Java usage The features in SikuliX at the bottom line are implemented with Java. So you might as well
use SikuliX at this Java API level in your Java projects or other Java aware environments (see how to). Though this
documentation is targeted at the scripting people it contains basic information about the Java level API as well at
places, where there are major differences between the two API level.
Additionally you might look through the JavaDocs (temporary location).
Each chapter in this documentaton briefly describes a class or a group of methods regarding their basic features.
General usage information are provided and hints, that apply to all methods in that chapter. We recommend to read
carefully before using the related features.
If you are totally new with Sikuli, it might be a good idea to just read through this documentation sequentially. An
alternative way might be to jump to the chapters that you are interested in by scanning the table of contents.
A way in the middle would be reading the core classes in this sequence: Region, then Match, and finally Screen.
After that, you can go to any places of interest using the table of contents or use the genindex to browse all
classes, methods and functions in alphabetical order.
3
SikuliX Documentation, Release 1.1.0-Beta1
5
SikuliX Documentation, Release 1.1.0-Beta1
Visually the tool sees images (usually retangular pixel areas) on the screen and allows to act on these images
using mouse and keyboard simulation. There might be some recorder feture as well with such a tool.
SikuliX belongs to the 3rd category and currently does not have a recorder feature. While you work through your
workflow you capture some images, that you want to act on or that you expect to appear after an action. These images
are used by click and type actions or are used to wait for the screen to be ready for the next action.
SikuliX is a WYSIWYS-Tool: What You See Is What You Script.
So again taking the above workflow, now using SikuliX commands:
openApp(someApp) # we use an application someApp
click(imageButton) # we click some button
wait(imageExpected1) # we wait that the app reacts and shows the expected result on the screen
type(some text); type(Key.ENTER) # we fill in some text and press ENTER
wait(imageExpected2) # again we wait for some expected reaction or result
click(...) # we click ...
wait(...) # we wait ...
type(...) # we type ...
wait(...) # we wait ...
...
Using the SikuliX IDE, you can setup and maintain such visual workflows including capturing and organizing the
needed images. Besides knowing the options of the basic commands, you do not need any programming or scripting
knowledge. Only in the moment, you want to optimize, repeat or augment such a basic linear workflow, you have to
dive deeper into the scripting language of your choice (supported in the IDE: Python, Ruby, JavaScript).
In software testing people use tools, to check, wether an application driven by some workflow reveals the expected
results. With applications having a visually complex GUI, sooner or later the testers want to check some visual content
against their expectation. This might be the presentation of GUI elements on the screen or the visual content of some
part of the screen.
SikuliX can be integrated in various ways into such testing environments, either on the scripting level or using the Java
based API (where the SikuliX features are implemented). Prominent examples for this approach are RobotFramework
or Cucumber.
Also a combination of GUI aware tools and SikuliX are already reality (e.g. together with Selenium for web apps).
The challenge here usually is, that SikuliXs image based features are pixel-aware in the sense, that an image is on the
screen exactly pixel-wise or SikuliX fails (with some slight tolerances as long as width and height in pixels match).
This usually leads to the neeed to have different image sets for different environments. The version 2 will have some
more features, that will make it easier to handle such situations.
Other aspects important for testers:
SikuliX needs a real screen running the application under test or at least some equivalent virtual solution
SikuliX is only available on PCs/Workstations running Windows, Mac or Linux and having a Java version 6+
SikuliX uses the OpenCV package for finding an image on the screen.
The SikuliX feature is based on OpenCVs method matchTemplate(), which is rather good explained on this example
page. If you are not familiar with how it works, you should just have a look there and then come back to here and read
further.
A basic feature in Sikulix is to wait for an image to appear in a given region:
# some top left part of the screen
aRegion = Region(0, 0, 500, 500)
# a png image file on the file system
# this is the image we want to look for in the given Region
aImage = someImage.png
# search and get the result
aMatch = aRegion.find(aImage)
To not make it too complicated here, I do not talk about how you create the aImage - we just assume it is there and
accessible.
The matchTemplate() expects an even sized or larger image (base), where the given image (target) should be searched.
To prepare that, we internally make a screenshot (using Java Robot class) of the screen area defined by the given
aRegion. This now is the base image and held in memory. The target image is also created as in memory image read
form the image file. Both images then are converted to the needed OpenCV objects (CVMat).
Now we run the matchTemplate() function and get a matrix in the size of the base image, that contains for each pixel
a similarity score for the target image compared pixel by pixel with its top left corner at this pixel location. If this is
not clear here now, go back to the above example and try to understand. The score values at each pixel location vary
between 0.0 and 1.0: the lower the value, the lower the probability, that the area with its top left corner at this pixel
location contains the target image. Score values above 0.7 - 0.8 signal a high probability, that the image is here.
In the next step, we use another OpenCV method, to get the relevant max value (result score) from the mentioned
result matrix, meaning that we are looking for a pixel, that most probable is the top left corner of the target image in
the base image.
If nothing else is said, only a result score > 0.7 is taken as found. The other values will signal a FindFailed exception.
Depending on various aspects of the target image (mainly how much even background towards the edges in contained
in the target image), one usually get result scores > 0.8 or even 0.9. If one follows SikuliXs recommendation how to
create target images, one should in most cases get result scores >0.95 or even >0.99 (internally taken as exact match
with 1.0).
If the result score is accepted as found, in the next step we create a match object, that denotes the region on the screen,
that most probably contains the image (aMatch in the above snippet).
If the image is not found (result score not acceptable), we either terminate the search operation signalling failed or
start a new search with a new screenshot of the given region. This is repeated until the image is either found or the
given or implicit waiting time (3 seconds in the standard) has elapsed, which also results in a FindFailed signal. The
rate of this repetition can be specified, to reduce the cpu usage by SikuliX, since the search process is pure number
crunching.
A word on elapsed time for search operations: The larger the base image the longer the search. The smaller the size
difference of the 2 images, the faster. On modern systems with large monitors searching a small to medium sized
image (up to 10.000 pixels), the elapsed time might be between 0.5 and 1 second or even more. The usual approach,
to reduce search time is to reduce the search region as much as possible to the area, one expects the target image to
appear. Small images of some 10 pixels in search regions of some 1000 pixels are found within some 10 milliseconds
or even faster.
The actual version 1.1.0 of Sikulix implements a still-there-feature: before searching in the search region, it is first
checked, wether the image is still in the same place as at the time of the last search (if the search region contains
this last match). On success, this preflight operation usually takes some milliseconds, which speeds up workflows
enormously if they contain repetitive tasks with the same images.
Not knowing the magic behind SikuliXs search feature and the matchTemplate() function, people always wonder,
why images showing up multiple times on the screen, are not found in some regular order (e.g. top left to bottom
right). That this is not the case is caused by the implementation of the matchTemplate() function as some statistical
numeric matrix calculations. So never expect SikuliX to return the top left appearance of a visual being more than
once on the screen at time of search. The result is not predictable in this sense.
If you want to find a specific item of these multiple occurrences, you have to restrict the search region, so that only the
one you are looking for is found.
For cases where this is not suitable or if you want to cycle through all appearances, we have the findAll() method, that
returns a list of matches in decreasing result score order. You might work through this list according to their position
on the screen by using their (x,y) top left corner coordinates. findAll internally evaluates the search result matrix, by
repetitively looking for the next max value after having switched off some area around the last max.
To use images with the features of SikuliX like click(someImage), you need to store these images as image files
preferably in the PNG format (someImage.png) somewhere on the file system or somewhere in the internet.
An image in this sense is some rectengular pixel area taken from the screen (captured or made a screenshot - with
Sikuli we use the verb capture as the process of taking the image and save it in an image file and the name screenshot
for the artifact visual object = image).
Capturing is supported by the IDE or programmatically via the respective SikuliX features. You might use any capture
tool instead to get your images (preferably in PNG format).
To load the images SikuliX has 2 principles:
bundle path: the images are stored together with the script file (.py for Python, .rb for Ruby, .js for JavaScript) in
a folder named someScript.sikuli, where the scriptfile must be named the same as the folder (e.g. someScript.py).
This all is automatically assured, when working with the SikuliX IDE.
image path: additionally SikuliX supports a list of places as an image path. Possible places are folders in the
file system, folders in a jar-file and folders somewhere in the internet. There are functions available to manage
your own image path. When an image has to be loaded (exception: the absolute path is given), the places are
sequentially checked for the existence of the image. The first matching place wins.
It is strongly recommended, to have a naming sceme for the image files and to not rely on the basic timestamped image
file naming of the SikuliX IDE, which is basically for new users with little programming experience.
Version 2 will have a capturing tool as a standalone app, that supports the basic aspects of image handling:
capture and recapture screeshots along a workflow (some kind of recorder)
organize your image path
organize groups of same images, that can be switched depending on environment aspects
organize a group of images, that somehow relate to each other and should be found together
organize different states of an image (e.g. selected/not selected)
optimize screenshots to get the highest possible scores at find
some kind of basic support for transparency (e.g. ignore inner part of button)
A major aspect of SikuliX is to be available on Windows, Mac and Linux with as little differences as possible. This
means, that features will only be added to SikuliX as standard, if they can be made available on all these systems.
Nevertheless it will be possible beginning with version 2, to add extensions or plugins, that might not be available on
all systems from the beginning or forever. Version 2 will have a suitable eco-system for that.
Mainly bacause of this major aspect SikuliX is a Java based application or library. Hence the usable artifacts are
delivered as jar-files. Were possible the SikuliX IDE is delivered as application (currently Mac only, Windows .exe
planned for version 2).
To use the SikuliX features you need a valid Java runtime installation (JRE, preferably the Oracle versions) of at least
version 6. SikuliX works with version 7 and 8 too and version 2 will need at least Java 7.
With version 1.1.x, there are still vital parts of SikuliX written in C++, which makes a SikuliX artefact system specific
in the end. This currently is supported by an initial setup process, that produces the finally usable artefacts for this
system environment.
The only exception is Java programming with some Maven compatible build system, that allows to simply start pro-
gramming without having done a setup. The needed artifacts for this system are dynamically loaded according to the
Maven dependency concept.
Beginning with version 1.1.0 the resulting artefacts (currently sikulix.jar and/or sikulixapi.jar) can be moved around
as needed (though it is still recommended to have the SikuliX stuff in one well defined place, to avoid update/upgrade
problems). Everything else SikuliX needs during runtime is stored either in the systems temp space or in a special
system specific area in the users home space (see the system specific topics below). Missing or outdated things in
these areas are created/recreated at runtime by SikuliX automatically (means: you can delete everything at any time,
as long as you keep the jars).
The current layout of this space is as follows (we call it SikulixAppData):
Extensions (place for extensions/Plugins)
Lib (the stuff to support Jython/JRuby usage)
SikulixDownloads (non SikuliX artefacts like Jython, JRuby, Tesseract support, ...)
SikulixDownloads_TIMESTAMP (versioned SikuliX stuff needed for setup)
SikulixLibs_TIMESTAMP (the place for the exported native libraries)
SikulixSetup (optional: used when the setup is run from the project context)
SikulixStore (place for persistent or optional information)
SikulixTesseract (place for language specific tessdata files)
Currently there is no need to step into these folders except for debugging purposes.
SikuliX in the standard does not need any environment settings anymore.
The IDE currently (1.1.0) still is only available as jar-file, that can be double-clicked to start it.
Setup installs a runsikulix.cmd, that can be used to start the IDE from commandline or to run scripts.
For version 2 an application is planned as .exe based on Launch4J.
The SikulixAppData is stored in the folder Sikulix inside the folder the environment variable %APPDATA% points to.
Besides Java there are no prerequisites. All native libraries for 32Bit and/or 64Bit are bundled in the jar-files and
exported at runtime as needed. Be aware: the bitness of the native libraries depends on the bitness of the used Java
version, which might as well be a 32Bit version on a 64Bit Windows (though not recommended).
The IDE is available as Sikulix.app after setup and should be moved to the /Applications folder.
Additionally there is a command script runsikulix to run scripts from commandline (Terminal).
The SikulixAppData folder is here ~/Library/Application Support/Sikulix
Besides Java there are no prerequisites. All native libraries (64Bit, since SikuliX needs OSX 10.6+) are bundled in the
jar-files and exported at runtime as needed.
The IDE is only available as jar-file, that can be double-clicked to start it (setting the executable bit of the jar-file might
be necessary).
Setup installs a runsikulix command script, that can be used to start the IDE from commandline or to run scripts.
As Java you might use the OpenJDK versions (though the Oracle versions are recommended). A JRE is sufficient, if
you do not need a JDK for your own purposes.
Usually on Linux systems SikuliX does not run out of the box, since it is not possible to bundle all native libraries for
all possible Linux flavors. It is intended, to have at least a version for the latest Ubuntu systems, that run out of the
box.
If setup fails due to library problems, have a look at the setup log to get hints on how to proceed.
About how to get the native libraries working in case look here.
11
SikuliX Documentation, Release 1.1.0-Beta1
13
SikuliX Documentation, Release 1.1.0-Beta1
<jython-folder>/bin/easy_install exists
if this is not the case run <jython-folder>/bin/jython -m ensurepip on a command-
line and check again
if this is still not the case follow the steps further below Fallback without pip
run <jython-folder>/bin/pip install jip to install the package jip, which allows
to add Java libraries easily to your Jython environment
add any needed Python package (must not depend on C-based stuff) using pip, easy-install or
manual methods into <jython-folder>/Lib/site-packages and/or use jip for adding
Java libraries preferably from Maven Central
4.3.2 Access Python packages from SikuliX scripts run by SikuliX (GUI or comman-
dline)
The following approaches apply to situations, where you want to use Python modules installed somewhere on your
system, without the need to manipulate sys.path, meaning, that when using mport moduleXYZ this package
is found automatically.
SikuliX uses a central repository (SikulixRepo in the following) for internal stuff (native libraries, downloaded artifacts, reso
Windows: %APPDATA%\Sikulix
Mac: ~/Library/Application Support/Sikulix
Linux: ~/.Sikulix
Basic preparation To SikulixRepo add a folder Lib (if not already there) and inside add a folder
site-packages
Approach 1 Since an existing folder SikulixRepo/Lib/site-packages will be recognized and added au-
tomatically as the 1st entry to sys.path, modules/packages contained in here will be found when
imported without any further preperations. This approach can be used, to overwrite modules/packages, that
otherwise would be found elsewhere on sys.path (e.g. for testing)
Approach 2 In the folder SikulixRepo/Lib/site-packages have a file sites.txt, that contains absolute
paths one per line, that point to other places, where modules packages can be found. These paths will be added
automatically at startup to the end of sys.path in the given sequence. With this approach, you might for
example add the Lib/site-packages folder of your own Jython installation.
4.3.3 Prepare and use your own jar files in the Jython environment
You might prepare jar files containing Python scripts/modules/packages, Java classes and other stuff like images, that
are intended to be used in the scripting context.
possible use cases
you want to pack scripted stuff together with other resources into a container ready to be used by yourself
or others via import (which is not supported by the .skl packaging method).
you want to secure your script code against modifications by others, that use your distributed jar.
Later (possibly only with version 2) there will be a feature available, to run such script containers directly from
commandline (java -jar mystuff.jar parameters) or by double clicking.
typical jar file structure:
14 Chapter 4. SikuliX scripting and usage in programming scenarios (preferably Java based)
SikuliX Documentation, Release 1.1.0-Beta1
-- jar rootlevel
module1.py # Python module
module2.py
- folder1 # Python package
__init__.py
stuff1.py
stuff2.py
- images # image folder
img1.png
img2.png
- org # Java package
- mystuff
class1.class
class1.class
becomes a jar
-- jar rootlevel
- packagefolder
__init__.py
stuff.py
16 Chapter 4. SikuliX scripting and usage in programming scenarios (preferably Java based)
CHAPTER 5
SikuliX now has a basic implementation of a globel options repository, that is recognized and loaded at startup time
(IDE and API usage). At runtime, this options store can be accessed to get and set values. Persistence (save) and
merge of additional options files is not yet supported.
Since internally the options are stored in a Java Properties object as key-value-pairs in string representation, the options
file must conform to the applicable rules (look here).
At startup these places are searched for a file SikulixOptions.txt (first appearence wins):
the working folder
the users home folder
the folder SikulixStore in the Sikulix repository
Special for the options defined in the Settings class
see Controlling Sikuli Scripts and their Behavior
any Settings.option = value will overwrite the existing option value in the Settings class at
startup
Other Options globally recognized and processed on startup
Debug.level = n
preferably n as 3, debug information produced from beginning of startup
Settings.OverwriteImages = yes
when saving images in the IDE, usually when using a name, that already exists, this is not overwritten,
but stored with a name having a suffix -n appended. With this option switched on (default is off)
naming images with an existing name will replace the image file without notice.
classpath = list of absolute-paths
will be added to the end of the classpath in the given sequence. Rules are the same as for Java
classpath on that system. On Windows backslashes have to be doubled (escaped).
17
SikuliX Documentation, Release 1.1.0-Beta1
setShowActions(False | True)
If set to True, when a script is run, Sikuli shows a visual effect (a blinking double lined red circle) on the spot
where the action will take place before executing actions (e.g. click(), dragDrop(), type(), etc) for
about 2 seconds in the standard (see Settings.SlowMotionDelay ). The default setting is False.
exit([value ])
Stops the script gracefully at this point. The value is returned to the calling environment.
class Settings
Settings.ActionLogs
Settings.InfoLogs
Settings.DebugLogs
see the section about debug and log messages.
Settings.MinSimilarity
The default minimum similiarty of find operations. While using a Region.find() operation, if only an
image file is provided, Sikuli searches the region using a default minimum similarity of 0.7.
Settings.MoveMouseDelay
Control the time taken for mouse movement to a target location by setting this value to a decimal value (default
0.5). The unit is seconds. Setting it to 0 will switch off any animation (the mouse will jump to the target
location).
As a standard behavior the time to move the mouse pointer from the current location to the target location given
by mouse actions is 0.5 seconds. During this time, the mouse pointer is moved continuosly with decreasing
speed to the target point. An additional benefit of this behavior is, that it gives the active application some time
to react on the previous mouse action, since the e.g. click is simulated at the end of the mouse movement:
mmd = Settings.MoveMouseDelay # save default/actual value
click(image1) # implicitly wait 0.5 seconds before click
Settings.MoveMouseDelay = 3
click(image2) # give app 3 seconds time before clicking again
Settings.MoveMouseDelay = mmd # reset to original value
Settings.DelayBeforeMouseDown
Settings.DelayBeforeDrag
19
SikuliX Documentation, Release 1.1.0-Beta1
Settings.DelayBeforeDrop
DelayBeforeMouseDown specifies the waiting time before mouse down at the source location as a decimal value
(seconds).
DelayBeforeDrag specifies the waiting time after mouse down at the source location as a decimal value (sec-
onds).
DelayBeforeDrop specifies the waiting time before mouse up at the target location as a decimal value (seconds).
Usage: When using Region.dragDrop(), Region.drag() and Region.dropAt() you may have
situations, where the operation is not processed as expected. This may be due to the fact, that the Sikuli actions
are too fast for the target application to react properly. With these settings the waiting time befor and after the
mouse down at the source location and before the mouse up at the target location of a dragDrop operation are
controlled. The standard settings are 0.3 seconds for each value. The time that is taken, to move the mouse from
source to target is controlled by Settings.MoveMouseDelay::.
Be aware The given values are only valid for the next following action. The inner timing will be reset to the
defaults after the actions completion.
Settings.DelayBeforeMouseDown = 0.5 Settings.DelayBeforeDrag = 0.2 Settings.DelayBeforeDrop
= 0.2 Settings.MoveMouseDelay = 3 dragDrop(source_image, target_image) # time for complete
dragDrop: about 4 seconds + search times
Settings.ClickDelay
Specify a delay between the mouse down and up in seconds as 0.nnn. This only applies to the next click action
and is then reset to 0 again. A value > 1 is cut to 1.0 (max delay of 1 second)
Settings.TypeDelay
Specify a delay between the key presses in seconds as 0.nnn. This only applies to the next click action and is
then reset to 0 again. A value > 1 is cut to 1.0 (max delay of 1 second)
NOTE: If the internal timing of the compound mouse functions like click() or dragDrop() is not suit-
able in your special situation, you might as well build your own functions using the basic mouse functions
Region.mouseDown(), Region.mouseMove() and Region.mouseUp()
Settings.SlowMotionDelay
Control the duration of the visual effect (seconds).
Settings.WaitScanRate
Settings.ObserveScanRate
Specify the number of times actual search operations are performed per second while waiting for a pattern to
appear or vanish.
As a standard behavior Sikuli internally processes about 3 search operations per second, when processing a
Region.wait(), Region.exists(), Region.waitVanish(), Region.observe()). In cases
where this leads to an excessive usage of system ressources or if you intentionally want to look for the visual
object not so often, you may set the respective values to what you need. Since the value is used as a rate per
second, specifying values between 1 and near zero, leads to scans every x seconds (e.g. specifying 0.5 will lead
to scans every 2 seconds):
def myHandler(e):
print "it happened"
# you may wish to save the actual settings before
Settings.ObserveScanRate = 0.2
onAppear(some_image, myHandler)
observe(FOREVER, background = True)
# the observer will look every 5 seconds ;-)
Settings.ObserveMinChangedPixels
The minimum area size in pixels that changes its content to trigger a change event when using
Region.onChange() when no value is specified. The default value is 50 (a rectangle of about 7x7 Pix-
els).
New in version 1.1.0.
these are the relevant Settings for user logging showing defaults: (False = switched off, True = switched on)
Settings.UserLogs = True (False: user log calls are ignored)
Settings.UserLogPrefix = "user" (message prefix)
Settings.UserLogTime = True
Debug.setUserLogFile("absolute-path-to-file") (no default)
to write a user log message:
Debug.user("text with %placeholders", args ...) where text is a string according to the
rules of Java String.format().
Information about Java String formatting can be found here (rather formal, look for tutorials in the net addition-
ally if this is new for you)
the messages look so: [prefix optional-timestamp] message-text with filled in arg
values
Being in Jython scripting one might as well use this: Debug.user("some text with
%placeholders" % (list-of-args ...))
the settings for Sikulis logging with the defaults: (False = switched off (message type not created), True =
switched on)
Settings.ActionLogs = True (message prefix: [log])
Settings.InfoLogs = True (message prefix: [info])
Settings.DebugLogs = False (message prefix: [debug])
Settings.LogTime = False
Debug.setLogFile("absolute-path-to-file") to redirect the Sikuli messages to a file, no default
Debug messages Sikuli internally issues debug messages all over the place, to show, what it is doing. Creating debug
messages is dependant on the current DEBUG_LEVEL value:
if 0, no debug messages are shown
if >0, debug messages having a level <= DEBUG_LEVEL are created
The initial DEBUG_LEVEL is 0 and can be set with
the Java command line parameter -Dsikuli.Debug=n or
the command line parameter -d n when using SikuliX jars or command scripts.
Currently a suitable DEBUG_LEVEL is 3, that shows enough valuable information about what is going on internally.
If you ever encounter problems, that might have to do with SikuliXs internal processing, switch on debug messaging
with level 3.
To avoid tons of not needed messages, you might switch debugging on and off on the fly for only critical sections in
your workflow:
switch on: Debug.on(n) setting the DEBUG_LEVEL=n (recommended: 3)
Selective log message processing (callback is the name of your specific callback function):
Debug.setLoggerUser("callback") # redirect messages [user]
Debug.setLoggerInfo("callback") # redirect messages [info]
Debug.setLoggerAction("callback") # redirect messages [log]
Debug.setLoggerError("callback") # redirect messages [error]
Debug.setLoggerDebug("callback") # redirect messages [debug]
You might suppress the creation of the message header for all messages, so you only get the message body:
use Debug.setLoggerNoPrefix(myLogger()) instead of the initial
Debug.setLogger(myLogger())
New in version 1.1.0.
getBundleFolder()
same as getBundlePath() but with trailing separator to make it suitable for string concatenation.
getParentPath()
returns the path to the parent folder of the current .sikuli folder without trailing separator.
getParentFolder()
same as getParentPath() but with trailing separator to make it suitable for string concatenation.
makePath(path1, path2, path3, ...)
returns a path with the correct path seperators for the system running on by concatenating the given path elements
from left to right (given as strings). There is no trailing path seperator.
makeFolder(path1, path2, path3, ...)
same as makePath() but trailing path seperator to make it suitable for string concatenation.
NOTE makePath and makeFolder on Windows the first path element can be specified as a drive letter X:
unzip(fromFile, toFolder)
A convenience function to unzip a zipped container to a folder (implemented using the Java builtin support for
zip files). The ending of the file does not matter, the content of the file is examined to find out, whether it is a
valid zip container. A zipped folder structure is preserved in the target folder. Relative paths would be resolved
against the current working folder. This can for example be used, to unpack jar files.
Parameters
fromFile a file with a zipped content given as path string
toFolder the folder where to place the unzipped content given as path string
Returns True if it worked, False otherwise
NOTE The complementary feature zip(fromFolder, toFile) will follow soon.
NOTE on Java usage:
import org.sikuli.basics.FileManager;
FileManager.unzip(fromFile, toFolder);
6.1.4 Image Search Path - where SikuliX looks for image files
SikuliX maintains a list of locations to search for images when they are not found in the current .sikuli folder (a.k.a.
BundlePath). This list is maintained internally but can be inspected and/or modified using the following functions.
GENERAL NOTEs:
as long as an image file has the ending .png, this might be omitted.
you might use subfolders as well, to form a relative path to an image file
an image path might point to a location inside a jar file or a location on the Java classpath
an image path might point to a folder in the net, that is accessible via HTTP
SikuliX internally manages a cache for the imagefile content (standard 64 MB), where images are held in
memory, thus avoiding a reload on subsequent references to the same image file.
The bundle path can be accessed and modified so:
NOTE: the bundle path can only be on the (local) file system, not in a jar, nor in the net (access via HTTP). If you
need places in a jar or in the HTTP net, use the add function.
setBundlePath(path-to-a-folder)
Set the base path for searching images. Sikuli IDE sets this automatically to the path of the folder of the script
(.sikuli). Therefore, you should use this function only if you really know what you are doing. Using it generally
means that you would like to take care of your captured images by yourself.
Additionally images are searched for in the image path, that is a global list of other places to look for images
and the bundle path being the first entry. It is implicitly extended by script folders, that are imported (see: Reuse
of Code and Images).
Currently (will be revised in version 1.2), you should not use a jar file folder, Use addImagePath() instead.
getBundlePath()
Get a string containing the absolute path to a folder containing your images used for finding images and which
is set by SikuliX IDE automatically to the script folder (.sikuli). You may use this function for example, to
package your private files together with the script or to access the image files in the bundle for other purposes.
Be aware of the convenience functions to manipulate paths.
NOTE for Java usage: Since there is no default BundlePath, when not running a script, like in the situation, when
using the Java API in Java program or other situations with the direct use of Java aware scripting languages, you can
use this feature to set the one place, where you have all your images:
import org.sikuli.script.ImagePath;
ImagePath.setBundlePath("path to your image folder");
screen.find("image1");
screen.find("imageset1/image2");
NOTE: first find omits .png, second find uses a relative path with a subfolder
Other places, where Sikuli looks for images, are stored internally in the image path list.
When searching images, the paths are scanned in the order of the list. The first image file with a matching image file
name is used.
Use the following functions to manipulate this list.
NOTE for Java usage: Class of the mentioned functions:
import org.sikuli.script.ImagePath
getImagePath()
Get a list of paths where Sikuli will search for images.
imgPath = getImagePath() # get the list
# to loop through
for p in imgPath:
print p
addImagePath(a-new-path)
Add a new folder path to the end of the current list (avoids double entries) Java API:
ImagePath.add(path)
As a convenience you might use this function also to add a path to a HTTP net folder like so sikulix.com: or
sikulix.com:somefolder/images (see addHTTPImagePath)
addHTTPImagePath(a-new-path)
Add a new folder path to the end of the current list (avoids double entries) Java API:
ImagePath.addHTTP(a-new-path)
a-new-path is a net url like sikulix.com optionally with a folder structure attached like so: sikulix.com/images (a
leading http:// or https:// is optional, so one might copy and paste links)
The folder must be accessible via HTTP and must allow HTTP-HEAD requests on the contained
image files (this is checked at time of trying to add the path entry).
NOTE on Java usage: images in a jar
It is possible to access images, that are stored inside of jar files. So you might develop a Java app, that comes bundled
with the needed images in one jar file.
To support the development cycle in IDEs, you might specify an alternate path, where the images can be found, when
running inside the IDE.
Usage in Maven Projects:
Following the conventions of Maven projects you should store your images in a subfolder at
src/main/resources for example src/main/resources/images, which then at jar produc-
tion will be copied to the root level of the jar. Not following this suggestion you have to work according
to the case other projects.
ImagePath.add("someClass/images") where someClass is the name of a class contained in a
jar or folder on the class path containing the images folder.
Usage in other Projects:
ImagePath.add("someClass/images", alternatePath) where someClass is the name of
a class contained in a jar on the class path containing the images folder at the root level of the jar.
where alternatePath is a valid path specification, where the images are located, when running from
inside an IDE.
Be aware: that you might use the Sikuli IDE, to maintain a script, that only contains the image filenames and then
is used as image path in your Java app like ImagePath.add("myClass/myImages.sikuli"), which e.g. in
the Maven context will assume as image path src/main/ressources/myImages.sikuli.
Note for Jython scripting: use load() without the import to use the feature images in jars:
from org.sikuli.script import ImagePath
load("absolute path to someJar")
ImagePath.add("someClass/someFolder")
removeImagePath(a-path-already-in-the-list)
Remove the given path from the current list Java API: ImagePath.remove(path)
resetImagePath(a-path)
Clears the current list and sets the first entry to the given path (hence gets the BundlePath). This gets you a fresh image en
Java API: ImagePath.reset(path)
Note: paths must be specified using the correct path separators (slash on Mac and Unix and double blackslashes on
Windows). The convenience functions in File and Path handling might be helpful.
This list is automatically extended by Sikuli with script folders, that are imported (see: Importing other Sikuli Scripts),
so their contained images can be accessed by only using their plain filenames. If you want to be sure of the results of
your manipulations, you can use getImagePath() and check the content of the returned list.
NOTE: at all time the first entry in the list is internally taken as BundlePath, where appropriate.
Note on contained images: Together with the import, Sikuli internally uses the feature SIKULI_IMAGE_PATH to
make sure that images contained in imported .sikulis are found automatically.
Some comments on general rules for Python import
An import is only processed once (the first time it is found in the program flow). So be aware:
If your imported script contains code outside of any function definitions ( def() ), this code is only
processed once at the first time, when the import is evaluated
Since the IDE does not reload the modules when running a script the next time, you have to use the Jythons
reload() function, if you are changing imported scripts while they are in use:
# instead of: import module
import module
reload(module)
Python has a so called namespace concept: names (variables, functions, classes) are only known in its names-
pace:
your main script has its own namespace
Each imported script has its own namespace. So names contained in an imported script have to be qualified
with the module name (e.g. myLib.myFunction() )
You may use from myLib import *, which adds all names from myLib into your current namespace.
So you can use myFunction() directly. When you decide to use this version, be sure you have a naming
convention that prevents naming conflicts.
New in version 1.1.0.
Loading a jar-file containing Java/Python modules and additional resources as needed
load(jar-file)
Loads a jar-file and puts the absolute path to it into sys.path, so the Java or Python code in that jar-file can be
imported afterwards.
Parameters jar-file either a relative or absolute path to filename.jar
Returns True if the file was found, otherwise False
load(jar-file, image-folder)
same as load(jar-file), but additionally adds the given folder to the image path. image-folder is assumed to be a
foldername available in the jars rootlevel (not checked though).
Parameters
jar-file either a relative or absolute path to filename.jar
image-folder a relative path (always use / as path separator, no leading /)
Returns True if the file was found, otherwise False
Search startegy The given jar is searched as following (first match wins):
if given as absolute path it is checked for existence and processed (if not exists: no further action)
if given as relative path:
the current path (Jython: sys.path, Java: classpath)
the current folder (Jython only: bundle path)
the SikuliX Extensions folder
the SikuliX Lib folder
Note for Java usage at the Java level, this feature is available as Sikulix.load(jar [, folder]) and adds
the given jar to the end of the classpath on the fly. A given folder is added to the image path as mentioned above.
Note on Python usage more details and usage cases are discussed in Using Python. After a successful load(), you
might use the standard import something, to make the module something available in your scripting context.
New in version 1.1.0.
6.1.6 Running scripts and snippets from within other scripts and run scripts one
after the other
If the snippet produces some output on stdout and/or stderror, this is accessible after return using:
commandOutput = RunTime.get().getLastCommandResult()
where the error output comes after a line containing ***** error *****
In the standard the following dialog boxes are shown in the middle of the screen, where SikuliX (IDE or from com-
mandline) is running (usually the primary screen).
New in version 1.1.0.
They are always on top from the beginning, no matter which application currently is the frontmost. While the dialog
is visible, you might move it around and act on other applications, until you work with the dialog box.
New in version 1.1.0.
If you want the dialog to appear in a special location on the screen (even on other screens in multimonitor situations),
you can use the function popat() to define this location. The dialog will be positioned here with the center of its
dialog panel. Be aware, that locations near the edge of the screen might make parts of the dialog not accessible (this
is not checked). This location will stay in effect until changed by another use of popat(). A popat() without
parameters will reset it to the standard (center of primary screen).
popat(x, y)
popat(location)
popat(region)
popat()
Define the location, where the center of popup dialogs should be positioned from now on.
Parameters
x x value of the location
y y value of the location
location the location as a Location
region the location as the center of the given Region
If no parameter is given, the location will be reset to the center primary screen (default).
popup(text[, title ])
Display a dialog box with an OK button and text as the message. The script then waits for the user to click the
OK button.
Parameters
text text to be displayed as message
title optional title for the messagebox (default: Sikuli Info)
Example:
popup("Hello World!\nHave fun with Sikuli!")
A dialog box that looks like below will popup Note: \n inserts a line break
A dialog box that looks like above will appear to allow the user to interactively enter some text. This text is then
assigned to the variable name, which can be used in other parts of the script, such as paste(name) to paste
the text to a login box.
Example: input with preset:
name = input("Please enter your name to log in:", "anonymous") # a preset input text
When using the parameter default, the text input field will be pre-populated with the given text, so the user
might just click OK/Cancel or edit the content of the input field.
Example: input with hidden input:
password = input("please enter your secret", hidden = True)
As the user inputs his secret infoemation, the text is shown as one asterisk per character.
New in version 1.1.0.
inputText([msg ][, title ][, lines ][, width ])
Parameters
msg text to be displayed as message (default: nothing)
title optional title for the messagebox (default: Sikuli Text)
lines how many lines the text box should be high (default: 9)
width how many characters the box should have as width (default: 20)
Returns the possible multiline text entered by the user (might be empty)
A message box with the given height and width is displayed and allows the user to input as many lines of text as
needed. The display area is fix and not scrollable (means visble is only the top left part, that fits into the visible
area). But you might enter as much text as you like, e.g. via CopyAndPaste.
Example:
Sikuli can listen to global hotkeys that you register with Env.addHotkey and call the corresponding handler (sikuli
functions) when the user presses the hotkeys.
Env.addHotkey(key, modifiers, handler)
Register the specified key + modifiers as a global hotkey. When the hotkey is pressed, the specified function
handler will be called.
Parameters
key a character or a constant value defined in Key.
modifiers Key modifiers, which can be one or multiple constants defined in
KeyModifier.
Returns True if success.
Env.removeHotkey(key, modifiers)
Unregister the registered global hotkey key + modifiers.
Parameters
key a character or a constant value defined in Key.
6.2.3 Starting and stopping other apllications and bring them to front
switchApp(application)
Bring the matching application or window to front (make it the active/focused application/window). If no
matching application/window can be found, it is tried to open an application using the given string as program
name or location.
Parameters application the name of an application (case-insensitive) or (part of) a window
title (Windows/Linux) (case-sensitive).
:return None if an error occured, on success a new App class object (look App)
This function switches the input focus to the specified application (brings it to front).
Windows: In the first step, the given text is taken as part of a program name (not case sensitive). If it is found in
the process list, it will be switched to front, if it has a main window (registered in the process list). Otherwise
the text will be used to search for a matching window title.
Windows/Linux: the window is identified by scanning the titles of all accessible windows for the occurence of
the application string. The first window in the system specific order, whose title contains the given text, is given
focus.
Mac: the string application is used to identify the application. If the application has multiple windows
opened, all these windows will be brought to the front with unchanged z-order, which cannot be influenced
currently.
Examples:
# Windows: switches to an already opened Firfox or opens it otherwise
switchApp("c:\\Program Files\\Mozilla Firefox\\firefox.exe")
# Windows: switches to the frontmost opened browser window (or does nothing
# if no Firefox window is currently opened)
# works, because all Firefox window titles contain "Mozilla Firefox"
switchApp("Mozilla Firefox")
closeApp(application)
Close the specified application.
Parameters application the name of an application (case-insensitive) or (part of) a window
title (Windows/Linux)
:return None if an error occured, on success a new App class object (look App)
This function closes the application indicated by the string application (Mac) or the windows whose titles
contain the string application (Windows/Linux). On Windows/Linux, the application itself may be closed if the
main window is closed or if all the windows of the application are closed.
Example:
# Windows: closes Firefox if it is running, does nothing otherwise
closeApp("c:\\Program Files\\Mozilla Firefox\\firefox.exe")
run(command)
Run command in the command line
Parameters command a command that can be run from the command line.
Returns a multiline string containing the result of the execution
This function executes the command and the script waits for its completion.
structure of the result (comments after #, not part of the result)
Multiline string:
N # a number being the return code
text
text
text
text # no, one or more lines execution output (stdout)
*****error***** # if the execution ended with an error
error text # or the return code was not 0
error text
error text # no, one or more lines error output (stderr)
NOTE for usage variants of the command run() and for the Java usage see class App
Java Level
Java maintains a global storage for settings (key/value pairs), that can be accessed by the program/script. Sikuli uses it
for some of its settings too. Normally it is not necessary to access these settings at the Java level from a Sikuli script,
since Sikuli provides getter and setter methods for accessing values, that make sense for scripting. One example is the
list of paths, that Sikuli maintains to specify additional places to search for images (please refer to Importing other
Sikuli Scripts for more information).
If needed, you may access the java settings storage as shown in the following example:
import java.lang.System
# get a value
val = System.getProperty("key-of-property")
Jython/Python Level
You may use all settings, that are defined in standard Python/Jython and that are available in your system environment.
The modules sys and time are already imported, so you can use their methods without the need for an import
statement.
sys.path may be one of the most valuable settings, since it is used by Python/Jython to locate modules, that are
referenced using import module. It is a list of paths, that is e.g. maintained by Sikuli to implement Importing
other Sikuli Scripts as a standard compliant feature (exception: .Sikuli scripts cannot form a module tree).
If you want to use sys.path, it is recommended to do it as shown in the following example, to avoid appending the
same entry again:
myPath = "some-absolute-path"
if not myPath in sys.path:
sys.path.append(myPath)
Sikuli Level
Sikuli internally uses the class Settings to store globally used settings. Publicly available attributes may be accessed
by using Settings.[name-of-an-attribute] to get its value and Settings.attribute = value to
set it. It is highly recommended to only modify attributes, that are described in this document or when you really
know, what you are doing.
Actually all attributes of some value for scripting are described in the topic Controlling Sikuli Scripts and their Behav-
ior.
To store some settings across SikuliX IDE sessions, SikuliX utilizes the Java feature Preferences.
As persistent storage Java uses:
on Windows the registry branch HKCUSoftwareJavaSoftPrefsorgsikuli...
on Mac a plist file in ~/Library/Preferences/org.sikuli.....plist
on Linux usually at ~/.java/.userPrefs/org/sikuli/prefs.xml
The content is controlled by the IDEs Preferences panel. It is safe to delete this branch/file, to get a default setup and
might help in some situations, wher the startup of the IDE does not work or crashes.
6.3.1 Store some of your information persistently and reload it later (same machine)
You might use SikuliXs persistent storage, to store and reload your own information accross SikuliX sessions or
only across different runs of same or different scripts/programs. Internally the item names get a unique prefix, to avoid
name mangling (currently nonSikuli_).
Sikulix.prefStore(key, value)
Store a key-value-pair in Javas persistent preferences store
Parameters
key an item name as string
value a string value to be stored as the items content
Sikulix.prefLoad(key[, value ])
Retrieve the value of a previously stored key-value-pair using key as the items name
Parameters
key an item name as string
value an optional string value to be returned, if the item was not yet stored like some
default
Returns the items content if the item exists, otherwise an empty string or the given default
Sikulix.prefRemove(key)
Permanently remove the key-value-pair using key as the items name
Parameters key an item name as string
Returns the items content if the item exists, otherwise an empty string
Sikulix.prefRemove()
Permanently remove all key-value-pairs stored before using Sikulix.prefStore()
Parameters key an item name as string
Returns the items content if the item exists, otherwise an empty string
The class Env is deprecated and should not be used anymore. The contained features are moved to other places
and redirected from inside class Env to be downward compatibel.
NOTE: In the following the non-Env methods are the replacements, that should be used instead.
Settings.getOS()
Env.getOS()
Settings.getOSVersion()
Env.getOSVersion()
Get the type ( getOS() ) and version string ( getOSVersion() ) of the operating system your script is
running on.
An example using these methods on a Mac is shown below:
# on a Mac
myOS = Settings.getOS()
myVer = Settings.getOSVersion()
if myOS == OS.MAC:
print "Mac " + myVer # e.g., Mac 10.6.3
else:
print "Sorry, not a Mac"
myOS = Settings.getOS()
if myOS == "MAC" or myOS.startswith("M"):
print "Mac " + myVer # e.g., Mac 10.6.3
else:
print "Sorry, not a Mac"
There are convenience functions, to check wether we are running on a specific system:
Settings.isWindows()
Settings.isMac()
Settings.isLinux()
Returns True if we are running on this system, False otherwise
Settings.getSikuliVersion()
Env.getSikuliVersion()
Get the version of Sikuli.
Returns a string containing the version string
if not Settings.getSikuliVersion().contains("1.0.1"):
print "This script needs SikuliX 1.0.1"
exit(1)
App.getClipboard()
Env.getClipboard()
Get the content of the clipboard if it is text, otherwise an empty string.
NOTE: Be careful, when using Env.getClipboard() together with paste(), since paste internally uses
the clipboard to transfer text to other applications, the clipboard will contain what you just pasted. Therefore, if
you need the content of the clipboard, you should call Env.getClipboard() before using paste().
Tipp: When the clipboard content was copied from a web page that mixes images and text, you should be aware,
that there may be whitespace characters around and inside your text, that you might not have expected. In this
case, you can use Env.getClipboard().strip() to at least get rid of surrounding white spaces.
Key.isLockOn(key-constant)
Env.isLockOn(keyConstant)
Get the current status ( on / off ) off the respective key. Only one key can be specified.
Parameters keyConstant one of the key constants Key.CAPS_LOCK, Key.NUM_LOCK,
Key.SCROLL_LOCK
Returns True if the specified key is on, False otherwise
Further information about key constants can be found in Class Key.
Mouse.at()
Env.getMouseLocation()
Get the current location of the mouse cursor.
Returns a Location object of the position of the mouse cursor on the screen.
Setting the size to a smaller value would make the matching algorithm be faster.
class Region
The Region is besides Images/Patterns (called Visuals) and Matches (where a Visual was found and how) the basic
element in the SikuliX concept. So be sure, you have understood all aspects of a Region.
A Region is a rectangular area on a Screen and is defined by
1. its upper left corner (x, y) being the pixel with this offset relative to the upper left corner of the screen
(usually (0, 0) ) and
2. its dimension (w, h) as its width and height in pixels.
x, y, w, h are integer numbers counting a distance in pixels.
A Region does not know anything about its visual content (windows, pictures, graphics, text, ...). It only knows
the position on the screen and its dimension.
A Match, being the result of a Region.find() operation, basically is a Region in all aspects, just having a few
additional attributes specific for a find result.
New Regions can be created in various ways:
specify their position and dimension
extend a given Region in all directions (expand or shrink)
based on adjacent rectangles up to the bounds of the screen horizontally or vertically.
based on their corners
as subregions being rows, columns or cells of a regular grid
combine different Regions or use their intersection
You can use Region.find(), to search a given Visual being a rectangular pixel pattern (given as an Image
(filename or Image) or a Pattern object) within this Region. If this Visual is found in the Region, the resulting
Match object has a similarity score between >0 and 1. The lower the similarity score, the higher the chance, that you
got a false positive (found something else). To make your scripts robust against false positives, you should take care,
to get similarity scores >0.85 or even >0.9.
If the Visual is given for the search as an Image , Sikuli uses a minimum similarity of 0.7, which only returns Matches
with a score >0.7. This default value can be changed in Settings.MinSimilarity. A Pattern is searched with
the optionally given minimum similarity using Pattern.similar().
Find operations return a Match object, which has all attributes and methods of a Region and can be used in the same
way as a Region (e.g. find something or click another target inside). A Match has the size in pixels of the Visual used
for searching, the position where it was found, the similarity score and the elapsed time.
Look here for more detailed information on How SikuliX finds images on the screen.
Be aware: every mouse or keyboard action, that specifies a Visual to search for, will internally do the respective find
operation first, to evaluate the action target.
A Region remembers the match of the last successful find operation, all matches of the last
successful Region.findAll() and the elapsed time. With Region.getLastMatch(),
Region.getLastMatches() and Region.getLastTime() you can get these objects/value.
You can wait for a Pattern to appear using Region.wait() or wait for it to vanish using
Region.waitVanish()
Every not successful find operation (even those done internally with a click() ...) will raise a FindFailed exception,
that has to be handled in your script. If you do not do that, your script will simply stop here with an error.
If you do not want to handle these FindFailed exceptions, you might search for a Pattern using exists, which
just returns nothing (None/null) in case of not found. So you simply check the return value for being a Match.
For other options to handle FindFailed situations see FindFailed exception.
During a find operation internally the search is repeated with a scan rate (standard 3 per second) until success or an
optionally given timeout (standard 3 seconds) is reached, which then results in a FindFailed exception.
Sikuli supports visual event driven programming: You can tell a Region to observe that something appears, vanishes
or changes. It is possible to wait for the completion of an observation or let it run in the background, while your script
continues running. When one of the visual events happens, a handler in your script is called. Each Region can only
have one observer, but each observer can observe multiple visual events in that Region. You might also check the
status of a background observation later in your workflow, to handle events inline. Generally it is your responsibility
to stop observations, but at termination of your script or Java program all observations are terminated automatically.
NOTE: For hints and tips how to get robust and fast acting workflows look into the Best Practices.
NOTES
In any case a newly created Region will be restricted to the boundaries of the screen containing the largest part
of the new Region.
It displays an error, if no part of the newly created Region is contained by any of the available screens. Subsequent
usages of such a Region might result in errors, exceptions or even crashes, if features are used, that access the screen.
Use Region.isValid() to check, wether a Region is contained by a screen.
Create a new Region based on top left corner and size
class Region
Region(x, y, w, h)
Region(region)
Region(Rectangle)
Create a region object
Parameters
x x position of a rectangle.
y y position of a rectangle.
w height of a rectangle.
h width of a rectangle.
region an existing region object.
rectangle an existing object of Java class Rectangle
Returns a new region object.
For other ways to create new Regions see: Extend Regions ....
NOTE: The position and dimension attributes are named x, y representing the top left corner and w, h being
width and height. You might access/change these values directly or use the available getter/setter methods.
Note: Additionally you might use selectRegion() to interactively create a new region at runtime.
NOTE: Using Region(someOtherRegion) just duplicates this region (creates a new object). This can be
useful, when you need the same Region with different attributes, such as another observation loop or
another setting for Region.setThrowException() to control whether throwing an exception or not
when find ops fail.
Change a Regions position and/or size
class Region
setX(number)
setY(number)
setW(number)
setH(number)
Set the respective attribute of the region to the new value. This effectively moves the region around and/or
changes its dimension.
Parameters number the new value
moveTo(location)
Set the position of this region regarding its top left corner to the given location (the x and y values are
modified).
Parameters location location object becomes the new top left corner
Returns the modified region object
reg.moveTo(anotherLocation) # equivalent to
reg.setX(anotherLocation.x); reg.setY(anotherLocation.y)
setROI(x, y, w, h)
setROI(rectangle)
setRect(x, y, w, h)
setRect(rectangle)
All these methods are doing exactly the same: setting position and dimension to new values. The moti-
vation for two names is to make scripts more readable: setROI() is intended to restrict the search
to a smaller area to speed up processing searches (region of interest), whereas setRect() should
be used to change a region (move and/or shrink or enlarge).
Parameters
x, y, w, h (all) the attributes of a rectangle
rectangle a rectangle object
Returns None
morphTo(region)
Set the position and dimension of this region to the corresponding values of the region given as parameter.
(see: setRect())
Parameters region a region object
Returns the modified region object
reg.morphTo(anotherRegion) # equivalent to
r = anotherRegion; reg.setX(r.x); reg.setY(r.y); reg.setW(r.w); reg.setH(r.h)
getX()
getY()
getW()
getH()
Get the respective attribute of the region.
Returns integer value
getCenter()
Get the center of the region.
Returns an object of Location
getTopLeft()
getTopRight()
getBottomLeft()
getBottomRight()
Get the location of the regions respective corner
Returns Location object
getScreen()
Returns the screen object that contains this region.
Returns a new Screen object
See Multi Monitor Environments.
getLastMatch()
getLastMatches()
To access the Matches returned by the last find op in this Region.
Returns a Match object or a list of Match objects
All basic find operations (explicit like Region.find() or implicit like Region.click()) store
the match in lastMatch and multi-find ops (like Region.findAll()) all found matches into
lastMatches of the Region that was searched.
How to go through the Matches returned by getLastMatches().
TIPP: The LastMatch can be used to avoid a second search for the same Visual in sequences like:
wait(someVisual)
click(someVisual)
# or
if exists(someOtherVisual):
click(someOtherVisual)
To avoid the second search with the click() you can use:
wait(someVisual)
click(getLastMatch())
# or
if exists(someOtherVisual):
click(getLastMatch())
See Region.click() for the usage of these convenience shortcuts. A someRegion.click() will
either click the center of the given Region or the lastMatch, if any is available.
getTime()
Returns the elapsed time in number of milli-seconds of the last find op in this Region
Attributes influencing the behavior of features of a Region
class Region
NOTE For settings influencing the handling of Visual-not-found situations in this Region look here: FindFailed
Exceptions.
isRegionValid()
Returns False, if the Region is not contained by a Screen and hence cannot be used with faetures,
that need a Screen (find, capture, ...), otherwise True.
setAutoWaitTimeout(seconds)
Set the maximum waiting time for all subsequent find operations in that Region.
Parameters seconds a number, which can have a fraction. The internal granularity is milli-
seconds.
All subsequent find ops will be run with the given timeout instead of the current value of
Settings.AutoWaitTimeout, to which the region is initialized at time of creation (default 3 sec-
onds).
getAutoWaitTimeout()
Get the current value of the maximum waiting time for find ops in this region.
Returns timeout in seconds
setWaitScanRate(rate)
Set this Regions value: A find op should repeat the search for the given Visual rate times per second
until found or the maximum waiting time is reached. At time of Region creation the value is initialized
from Settings.WaitScanRate, which has a default of 3.
Parameters rate a value > 0. values < 1 will lead to scans every x seconds and hence longer
pauses between the searches (reduces cpu load).
TIPP Since on average the shortes search times are some milli seconds, rate > 100 will lead to a conti-
nous search under all circumstances.
getWaitScanRate()
Get the current value of this Regions WaitScanRate.
Returns the rate number
6.4.2 Get evenly sized parts of a Region (as rows, columns and cells based on a
raster)
In many cases, one has parts of a GUI, that are to some extent evenly structured, having some virtual raster (rows,
columns and/or cells), that one wants to use for restricting searches or walk through this parts for other reasons.
Typical examples are tables like in an Excel sheet, boxes in some GUI or on a webpage or dropdown lists and menues.
A given Region can be set to have some evenly sized raster, so that one can access these subregions and create new
Regions.
Convenience functions, to get a subregion from a specified raster in one step
class Region
get(somePart)
Select a part of the given Region based on the given part specifier.
Parameters somePart a constant as Region.CONSTANT or an integer between 200 and 999
(see below)
Returns a new Region created from the selected part
Usage based on the javadocs:
Constants for the top parts of a region (Usage: Region.CONSTANT)
shown in brackets: possible shortcuts for the part constant
NORTH (NH, TH) - upper half
NORTH_WEST (NW, TL) - left third in upper third
NORTH_MID (NM, TM) - middle third in upper third
NORTH_EAST (NE, TR) - right third in upper third
... similar for the other directions:
right side: EAST (Ex, Rx)
bottom part: SOUTH (Sx, Bx)
left side: WEST (Wx, Lx)
Based on the scheme behind these constants there is another possible usage:
specify part as e 3 digit integer where the digits xyz have the following meaning
1st x: use a raster of x rows and x columns
2nd y: the row number of the wanted cell
3rd z: the column number of the wanted cell
y and z are counting from 0
valid numbers: 200 up to 999 (< 200 are invalid and return the region itself)
example: get(522) will use a raster of 5 rows and 5 columns and return the cell in the middle
special cases:
if either y or z are == or > x: returns the respective row or column
example: get(525) will use a raster of 5 rows and 5 columns and return the row in the middle
setRows(numberRows)
setCols(numberColumns)
Define a rows or columns only raster, by dividing the Regions respective dimension into even parts. The
corresponding Regions will only be created, when the respective access methods are used later.
Parameters
numberRows the number of rows the Region should be devided in
numberColumns the number of columns the Region should be devided in
Returns the first element as new Region if successful or the region itself otherwise
setRaster(numberRows, numberColumns)
Define a raster, by deviding the Regions height in numberRows even sized rows and its width into
numberColumns even sized columns.
Parameters
numberRows the number of rows the Region should be devided in
numberColumns the number of columns the Region should be devided in
Returns the top left cell (getCell(0, 0)) if success, the Region itself if not
getRow(whichRow)
getCol(whichColumn)
Get the Region of the whichRow row or whichColumn column in the Regions valid raster counting
from 0. Negative values will count backwards from the end. Invalid indexes will return the last or first
element respectively.
Parameters
whichRow the number of the row to create a new Region from
whichColumn the number of the column to create a new Region from
Returns a new Region representing the selected element or the Region if no raster
getCell(whichRow, whichColumn)
Get the cell with the coordinates (whichRow, whichColumn) in the Regions valid raster counting from
0. Negative values will count backwards from the end. Invalid indexes will return the last or first element
respectively. If the current raster only has rows or columns, the element of the corresponding index will
be returned.
Parameters
whichRow the number of the row
whichColumn the number of the column
Returns a new Region representing the selected element or the Region itself if no raster
getting information about the current raster
class Region
isRasterValid()
Can be used to check, wether the Region currently has a valid raster
Returns True if it has a valid raster (either getCols or getRows or both would return > 0)
getRows()
getCols()
Returns the current raster setting (0 means not set) as number of rows/columns
getRowH()
getColW()
Returns the current raster setting (0 means not set) as height of one row or width of one column.
6.4.3 Extend Regions and create new Regions based on existing Regions
NOTES:
Except otherwise noted
these methods return new Region objects, whose location and size are based on the specified region.
the given base Region remains unchanged.
In any case the new Region will be restricted to the boundaries of the screen containing the largest part of the
new Region.
It displays an error, if no part of the new Region is contained by any of the available screens. Subsequent usages
of such a Region object might result in errors, exceptions or even crashes, if features are used, that acces the screen.
Use Region.isValid() to check, wether a Region is contained by a screen.
class Region
offset(location)
offset(x, y)
Creates a new Region object, whose upper left corner is relocated adding the given x and y values to the
respective values of the given Region. Width and height are the same.
Parameters
location a Location object providing the relocating x and y values
x a number being the offset horizontally (< 0 to the left, > 0 to the right)
y a number being the offset vertically (< 0 to the top, > 0 to the bottom)
Returns the new Region object
new_reg = reg.offset(Location(xoff, yoff)) # same as
new_reg = Region(reg.x + xoff, reg.y + yoff, reg.w, reg.h)
inside()
Returns the same object. Retained for upward compatibility. region.inside().find() is totally
equivalent to region.find().
Returns Region itself
NOTE: Besides the individual methods like nearby, left, right, above, below there is one new
method grow with some more options and different signatures. Where documented together, they are fully equiv-
alent. The reason behind is some better compatibility to the usage of Java Rectangle.
class Region
grow([range ])
nearby([range ])
The new region is defined by extending (>0) or shrinking (<0) the current regions dimensions in all
directions by range number of pixels. The center of the new region remains the same.
The default is taken from Settings.DefaultPadding (standard value 50)
Parameters range an integer indicating the number of pixels or the current default if omitted.
Returns a new Region object
above([range ])
below([range ])
left([range ])
right([range ])
Returns a new Region that is defined with respect to the given region:
above: new bottom edge next pixel row above given regions top edge
below: new top edge next pixel row below given regions bottom edge
left: new right edge next pixel clomn left of given regions left edge
right: new left edge next pixel column right of given regions right edge
It does not include the current region. If range is omitted, it reaches to the corresponding edge of the
screen.
Parameters range a positive integer defining the new demension aspect (width or height)
Returns a new Region object
grow(width, height)
Besides acting on visual objects, finding them is one of the core functions of Sikuli.
PS: means, that either a Pattern or a string (path to an image file or just plain text) can be used as parameter. A
find operation is successful, if the given image is found with the given minimum similarity or the given text is found
exactly. Similarity is a value between 0 and 1 to specify how likely the given image looks like the target. By default,
the similarity is 0.7 if an image rather than a pattern object with a specific similarity is given to Region.find().
Normally all these region methods are used as reg.find(PS), where reg is a region object. If written as find(PS)
it acts on the default screen, which is an implicit region in this case (see: SCREEN as Default Region). But in
most cases it is a good idea to use region.find() to restrict the search to a smaller region in order to speed up
processing.
If a find operation is successful, the returned match is additionally stored internally with the region that was
used for the search. So instead of using a variable to store the match ( m = reg.find() ), you can use
reg.getLastMatch() to access it afterwards. Unsuccessful find operations will leave these values unchanged.
By default, if the visual object (image or text) cannot be found, Sikuli will stop the script by raising an Exception
FindFailed. This follows the standards of the Python language, so that you could handle such exceptions using try:
... except: ....
If you are not used to programming using the Python language or because of other reasons, you might just want to
bypass the exception handling, which means just ignoring it (None is returned in that case). Or you might interactively
react on a FindFailed situation (e.g. optionally repeat the find). Read more about concepts and options at: Exception
FindFailed.
If you have multiple monitors, please read Multi Monitor Environments.
Note on IDE: Capturing is a tool in the IDE, to quickly set up images to search for. These images are named
automatically by the IDE and stored together with the script, at the time it is saved (we call the location in the file
system bundle-path). Behind the curtain, the images itself are specified simply by using a string containing the file
name (path to an image file).
class Region
find(PS)
Parameters PS a Pattern object or a string (path to an image file or just plain text)
Returns a Match object that contains the best match or fails if not found
Find a particular GUI element, which is seen as the given image or just plain text. The given file name of an
image specifies the elements appearance. It searches within the region and returns the best match, which
shows a similarity greater than the minimum similarity given by the pattern. If no similarity was set for
the pattern by Pattern.similar() before, a default minimum similarity of 0.7 is set automatically.
If autoWaitTimeout is set to a non-zero value, find() just acts as a wait().
Side Effect lastMatch: the best match can be accessed using Region.getLastMatch() afterwards.
findAll(PS)
Parameters PS a Pattern object or a string (path to an image file or just plain text)
Returns one or more Match objects as an iterator object or fails if not found
How to iterate through is documented here.
Repeatedly find ALL instances of a pattern, until no match can be found anymore, that meets the require-
ments for a single Region.find() with the specified pattern.
By default, the returned matches are sorted by the similiarty. If you need them ordered by their positions,
say the Y coordinates, you have to use Pythons sorted function. Here is a example of sorting the matches
from top to bottom.
Side Effect lastMatches: a reference to the returned iterator object containing the found matches is stored
with the region that was searched. It can be accessed using getLastMatches() afterwards. How to iterate
through an iterator of matches is documented here.
how to proceed in your workflow. So it is typically used with an if statement. At least one find operation
is performed, even if 0 seconds is specified. So specifying 0 seconds saves some time, in case there is no
need to wait, since its your intention to get the information not found directly.
Side Effect lastMatch: the best match can be accessed using Region.getLastMatch() afterwards.
Note: You may adjust the scan rate (how often a search during the wait takes place) by setting
Settings.WaitScanRate appropriately.
the scan rate (how often a search during the observation takes place) by setting Settings.ObserveScanRate
appropriately.
PS: as a parameter in the following methods you have to specify a Pattern or a String (path to an image file or just
plain text). handler: as a parameter in the following methods you have to specify the name of a function, which will
be called by the observer, in case the observed event happens. The function name (and usually the function itself) has
to be defined in your script before using the appropriate functions to register an observe event. The existance of the
function will be checked after starting the script, but before running it.
So to get your script running, you have to have at least the following statements in your script:
def myHandler(event): # you can choose any valid function name
# event: can be any variable name, it references the ObserveEvent object
pass # add your statements here
Note for Java And this is how you setup a handler in your Java program:
someRegion.onAppear("path-to-an-image-file",
new ObserveCallback() {
@Override
public void appeared(ObserveEvent event) {
// here goes your handler code
}
}
);
Here ObserveCallback is an abstract class defining the three possible callback funtions ``appeared``,
``vanished`` and ``changed``, that have to be overwritten as needed in your implementation
of the ObserveCallback.
Read :py:class:`ObserveEvent` to know what is contained in the event object and what its features are
NOTE ON CONCURRENCY with ObserveInBackground, the callback concept and Mouse/Keyboard usage In
Sikuli version prior to 1.1.0 it could happen, that mouse actions in the handler callback could interfere with mouse
actions in the main workflow or other callback handlers, since these threads work in parallel without any automatic
synchronization.
Beginning with 1.1.0 mouse actions like click are safe in the way, that they always are completed, before any other
click operation can be started (internally handled like a transaction).
So parallel clicks in main workflow and handler should do their job correctly, but might be run in a sequence, that
cannot be foreseen. Look here, if you want to have more control over mouse and keyboard usage in parallel processes.
class Region
onAppear(PS, handler)
With the given region you register an APPEAR event, whose pattern/image/text is looked for to be there or
to appaear while running an observation with the next call of observe(). In the moment the observation
is successful for that event, your registered handler is called and the observation is paused until you return
from your handler.
With the first appearence, the observation for this event is terminated. If you want the observation for this
event to be continued, you have to use ObserveEvent.repeat() before leaving the handler.
Parameters
observe([seconds ])
Begin observation within the region. The script waits for the completion of the observation (meaning until
the observation is stopped by intention or timed out).
Parameters seconds a number, which can have a fraction, as maximum observation time in
seconds. Omit it or use the constant FOREVER to tell the observation to run for an infinite
time (or until stopped by stopObserve()).
Returns True, if the observation could be started, False otherwise
For each region object, only one observation can be running at a given time, meaning, that a call to
observe(), while an observe for that region is running, is ignored with an error message, returning False.
Note: You may adjust the scan rate (how often a search during the observation takes place) by setting
Settings.ObserveScanRate appropriately.
observeInBackground([seconds ])
The observation is run in the background, meaning that the observation will be run in a subthread and
processing of your script is continued immediately.
Take care, that your script continues with some time consuming stuff. Additionally Named Events might
be of interest.
The over all behavior and the features are the same as Region.observe().
observe([seconds ], background = True)
DEPRECATED (will not be in version 2+) Only available in Python scripts for some limited backward
compatibility, with the impact, that the Region object must be a Python level Region. In case you have to
cast a Java level Region using Region(someRegion).
It is strongly recommended to revise your scripts using this observe feature as soon as possible.
stopObserver()
Stop observation for this region.
The source region of an observed visual event is available from the event that is passed as parameter to the
handler function.
Additionally there is a convenience feature to stop observation within a handler function: simply call
event.stopObserver() inside the handler function.:
def myHandler(event):
event.stopObserver() # stops the observation
# instead of
# event.getRegion().stopObserver()
onAppear("path-to-an-image-file", myHandler)
observe(FOREVER) # observes until stopped in handler
class ObserveEvent
When processing an observation in a region, a handler function is called, when one of your registered events
Region.onAppear(), Region.onVanish() or Region.onChange() happen.
The one parameter, you have access to in your handler function is an instance of ObserveEvent. You have
access to the following features of the event, that might help to identify the cause of the event, act on the resulting
matches and optionally modify the behavior of the observation.
New in version X1.1.0: Note on versions prior to 1.1.0 The event class was SikuliEvent and it allowed to
directly access the attributes like type, match, region, ... . This class no longer exists and its follow up is the class
ObserveEvent. This break of downward compatibility is by intention, to force the revision of scripts, that use the
observe feature and are run with version 1.1.0+
New in version X1.1.1.
The feature FindFailed and/or ImageMissing handler allows to specify functions that are visited in the case of these
failures happening. The handler gets an ObserveEvent object as parameter, that can be used to analyse the situation
and define how the situation should be handled finally.
getType()
get the type of the event
Returns a string containing APPEAR, VANISH, CHANGE, GENERIC, FINDFAILED,
MISSING
isAppear(), isVanish(), isChange(), isGeneric(), isFindFailed(), isMissing()
convenience methods, to check the type
Returns True or False
getRegion()
The observing region of this event.
Returns the region object
getPattern()
Get the pattern that triggered this event. A given image is packed into the pattern. This is only valid
for APPEAR and VANISH events as well as for FINDFAILED and MISSING.
Returns the pattern object (which allows to access the given image if needed)
getImage()
Directly access the given image in case of FINDFAILED and MISSING.
Returns the image object
getMatch()
For an APPEAR you get the Match object that appeared in the observed region (same as with
wait()).
For a VANISH event, you get the last Match object that was found in the observed region before it
vanished.
This method is not valid in a CHANGE event.
Returns the match object
getChanges()
Get a list of Match objects that represent the rectengular areas that changed their content. Their
sizes are at least minChangedSize pixels.
This attribute is valid only in a CHANGE event.
Returns an unsorted list of match objects
getCount()
Get the count how often the handler was visited.
Returns the count as number
getTime()
Get the time, when the event happened.
Returns a long integer value according to the Java feature new Date().getTime()
repeat([waitTime ])
Specify the time in seconds, that the observation of this event should pause after returning from the
handler.
Remember APPEAR and VANISH events are stopped after the first occurence. You have to use an
approriate repeat(), to continue the observation.
Parameters waitTime seconds to pause, taken as 0 if not given
getResponse()
In case of FINDFAILED or MISSING get the current setting of the FindFailedResponse of the event
region
Returns PROMPT, RETRY, SKIP or ABORT
setResponse(response)
In case of FINDFAILED or MISSING set the FindFailedResponse of the event region. This will be
the option, that is used after return from the handler for the final reaction.
Parameters response PROMPT, RETRY, SKIP or ABORT
getName()
Get the unique name of this event for use with the appropriate features (see working with named
events)
Returns a string containing the name
stopObserver()
Stop observation for this region (shortcut for event.getRegion().stopObserver()).
Working with named observe events
Additionally to the callback-concept of the observation feature, it is possible, to start one or more observations in
background, having registered events without handlers. When these events happen, the event is stored in a list and its
observation is paused until the event is taken from the list. Both concepts can be combined per observation.
Events without handlers are registered by omitting the handler parameter in the methods Region.onAppear(),
Region.onVanish() and Region.onChange() and storing the returned name for later use.
After having started the observation the usual way using Region.observe(), you can check, wether any events
have happened until now, you can access the events using their name or get a list of all events that happened until now.
With the events themselves you can work exactly like in the handler concept (see: ObserveEvent).
The following methods are bound to the region under observation.
class Region
hasObserver()
Check wether at least one event is registered for this region. The observation might be running or not.
Returns True or False
isObserving()
Check wether currently an observation is running for that region
Returns True or False
hasEvents()
Check wether any events have happened for that region
Returns True or False
getEvents()
Get the events, that have happened until this moment. The events are purged from the internal event list.
Returns a list of ObserveEvent (might be empty)
getEvent(name)
Get the named event and purge it from the internal event list
Parameters name the name of the event (string)
Returns the named event or None/null if it is not on the internal event list
setInactive(name)
The named event is paused during the running observation until activated again or the observation is
restarted.
Parameters name the name of the event (string)
setActive(name)
The named event is activated, so it is observed during the running observation.
Parameters name the name of the event (string)
Besides finding visual objects on the screen, acting on these elements is one of the kernel operations of Sikuli. Mouse
actions can be simulated as well as pressing keys on a keyboard.
The place on the screen, that should be acted on (in the end just one specific pixel, the click point), can be given either
as a pattern like with the find operations or by directly referencing a pixel location or as center of a region
object (match or screen also) or the target offset location connected with a match. Since all these choices can be
used with all action methods as needed, they are abbreviated and called like this:
PSMRL: which means, that either a Pattern object or a string (path to an image file or just plain text) or a Match
or a Region or a Location can be used as parameter, in detail:
P: pattern: a Pattern object. An implicit find operation is processed first. If successful, the center of the
resulting matches rectangle is the click point. If the pattern object has a target offset specified, this is used as
click point instead.
S: string: a path to an image file or just plain text. An implicit find operation with the default minimum similarity
0.7 is processed first. If successful, the center of the resulting match object is the click point.
M: match: a match object from a previous find operation. If the match has a target specified it is used as the
click point, otherwise the center of the matchs rectangle.
R: region: a region object whose center is used as click point.
L: location: a location object which by definition represents a point on the screen that is used as click point.
It is possible to simulate pressing the so called key modifiers together with the mouse operation or when simulating
keyboard typing. The respective parameter is given by one or more predefined constants. If more than one modifier is
necessary, they are combined by using + or |.
Normally all these region methods are used as reg.click(PS), where reg is a region object. If written as
click(PS) the implicit find is done on the default screen being the implicit region in this case (see: SCREEN
as Default Region). But using reg.click(PS) will restrict the search to the regions rectangle and speed up pro-
cessing, if region is significantly smaller than the whole screen.
Generally all aspects of find operations and of Region.find() apply.
If the find operation was successful, the match that was acted on, can be recalled using Region.getLastMatch().
As a default, if the visual object (image or text) cannot be found, Sikuli will stop the script by raising an Exception
FindFailed (details see: not found).
Note on IDE: Capturing is a tool in the IDE, to quickly set up images to search for. These images are named
automatically by the IDE and stored together with the script, at the time it is saved (we call the location in the file
system bundle-path). Behind the curtain the images itself are specified by using a string containing the file name (path
to an image file).
Note: If you need to implement more sophisticated mouse and keyboard actions look at Low Level Mouse and Key-
board Actions.
Note: In case of having more than one Monitor active, refer to Multi Monitor Environments for more details.
Note on Mac: it might be necessary, to use switchApp() before, to prepare the application for accepting the action.
class Region
click(PSMRL[, modifiers ])
Perform a mouse click on the click point using the left button.
Parameters
PSMRL a pattern, a string, a match, a region or a location that evaluates to a click point.
modifiers one or more key modifiers
Returns the number of performed clicks (actually 1). A 0 (integer null) means that because of
some reason, no click could be performed (in case of PS may be not Found).
Side Effect if PS was used, the match can be accessed using Region.getLastMatch() afterwards.
Example:
doubleClick(PSMRL[, modifiers ])
Perform a mouse double-click on the click point using the left button.
Parameters
PSMRL a pattern, a string, a match, a region or a location that evaluates to a click point.
modifiers one or more key modifiers
Returns the number of performed double-clicks (actually 1). A 0 (integer null) means that
because of some reason, no click could be performed (in case of PS may be not Found).
Side Effect if PS was used, the match can be accessed using Region.getLastMatch() afterwards.
rightClick(PSMRL[, modifiers ])
Perform a mouse click on the click point using the right button.
Parameters
PSMRL a pattern, a string, a match, a region or a location that evaluates to a click point.
modifiers one or more key modifiers
Returns the number of performed right cilicks (actually 1). A 0 (integer null) means that because
of some reason, no click could be performed (in case of PS may be not Found).
Side Effect if PS was used, the match can be accessed using Region.getLastMatch() afterwards.
highlight([seconds ])
Highlight the region for some period of time.
Parameters seconds a decimal number taken as duration in seconds
The region is highlighted showing a red colored frame around it. If the parameter seconds is given, the
script is suspended for the specified time. If no time is given, the highlighting is started and the script
continues. When later on the same highlight call without a parameter is made, the highlighting is stopped
(behaves like a toggling switch).
Example:
m = find(some_image)
Note: The red frame is just an overlay in front of all other screen content and stays in its place, independently
from the behavior of this other content, which means it is not connected to the content of the defining region.
But it will be adjusted automatically, if you change position and/or dimension of this region in your script, while
it is highlighted.
Note: Due to the implementation of this function, the target application might loose focus and opened menus
or lists get closed again. In other cases the highlight frame is not or not completely visible (not getting to the
front). In these cases the highlight feature cannot be used for tracking the search results.
A possible workaround is to use hover(), to move the mouse over the match or even use a function like this:
def hoverHighlight(reg, loop = 1):
for n in range(loop):
hover(reg.getTopLeft())
hover(reg.getTopRight())
hover(reg.getBottomRight())
hover(reg.getBottomLeft())
hover(reg.getTopLeft())
Using this function instead of highlight will let the mousepointer visit the corners of the given region clockwise,
starting and stopping top left. With the standard move delay of 0.5 seconds this will last about 2 seconds for one
loop (second parameter, default 1).
hover(PSMRL)
Move the mouse cursor to hover above a click point.
Parameters
PSMRL a pattern, a string, a match, a region or a location that evaluates to a click point.
modifiers one or more key modifiers
Returns the number 1 if the mousepointer could be moved to the click point. A 0 (integer null)
returned means that because of some reason, no move could be performed (in case of PS may
be not Found).
Side Effect if PS was used, the match can be accessed using Region.getLastMatch() afterwards.
dragDrop(PSMRL, PSMRL[, modifiers ])
Perform a drag-and-drop operation from a starting click point to the target click point indicated by the two
PSMRLs respectively.
Parameters
PSMRL a pattern, a string, a match, a region or a location that evaluates to a click point.
modifiers one or more key modifiers
If one of the parameters is PS, the operation might fail due to not Found.
Sideeffect: when using PS, the match of the target can be accessed using Region.getLastMatch()
afterwards. If only the first parameter is given as PS, this match is returned by
Region.getLastMatch().
If the operation does not perform as expected (usually caused by timing problems
due to delayed reactions of applications), you may adjust the internal timing param-
eters Settings.DelayBeforeMouseDown, Settings.DelayBeforeDrag and
Settings.DelayBeforeDrop (default value is 0.3 seconds) for the next action (timing is re-
set to default after the operation is completed). In case this might be combined with the internal timing
parameter Settings.MoveMouseDelay.
Note: If you need to implement more sophisticated mouse and keyboard actions look at Low Level Mouse
and Keyboard Actions.
drag(PSMRL)
Start a drag-and-drop operation by starting the drag at the given click point.
Parameters PSMRL a pattern, a string, a match, a region or a location that evaluates to a click
point.
Returns the number 1 if the operation could be performed. A 0 (integer null) returned means
that because of some reason, no move could be performed (in case of PS may be not Found).
The mousepointer is moved to the click point and the left mouse button is pressed and held, until the button
is released by a subsequent mouse action. (e.g. a Region.dropAt() afterwards).
If the operation does not perform as expected (usually caused by timing problems
due to delayed reactions of applications), you may adjust the internal timing parameters
Settings.DelayBeforeMouseDown, and Settings.DelayBeforeDrag (default value
is 0.3 seconds) for the next action (timing is reset to default after the operation is completed).
Side Effect if PS was used, the match can be accessed using Region.getLastMatch() afterwards.
dropAt(PSMRL[, delay ])
Complete a drag-and-drop operation by dropping a previously dragged item at the given target click point.
Parameters PSMRL a pattern, a string, a match, a region or a location that evaluates to a click
point.
Returns the number 1 if the operation could be performed. A 0 (integer null) returned means
that because of some reason, no move could be performed (in case of PS may be not Found).
The mousepointer is moved to the click point and the left mouse button is released. If it is necessary to
visit one ore more click points after dragging and before dropping, you can use Region.mouseMove()
or Region.hover() inbetween and dropAt only for the final destination.
If the operation does not perform as expected (usually caused by timing problems due to delayed reac-
tions of applications), you may adjust the internal timing parameter Settings.DelayBeforeDrop
(default value is 0.3 seconds) for the next action (timing is reset to default after the operation is completed).
Side Effect if PS was used, the match can be accessed using Region.getLastMatch() afterwards.
type([PSMRL ], text[, modifiers ])
Type the text at the current focused input field or at a click point specified by PSMRL.
Parameters
PSMRL a pattern, a string, a match, a region or a location that evaluates to a click point.
modifiers one or more modifier keys (Class Key)
Returns the number 1 if the operation could be performed, otherwise 0 (integer null), which
means, that because of some reason, it was not possible or the click could be performed (in
case of PS may be not Found).
This method simulates keyboard typing interpreting the characters of text based on the layout/keymap of
the standard US keyboard (QWERTY).
Special keys (ENTER, TAB, BACKSPACE, ...) can be incorporated into text using the constants defined
in Class Key using the standard string concatenation +.
If PSMRL is given, a click on the clickpoint is performed before typing, to gain the focus. (Mac: it might
be necessary, to use switchApp() to give focus to a target application before, to accept typed/pasted
characters.)
If PSMRL is omitted, it performs the typing on the current focused visual component (normally an input
field or an menu entry that can be selected by typing something).
Side Effect if PS was used, the match can be accessed using Region.getLastMatch() afterwards.
Note: If you need to type international characters or you are using layouts/keymaps other than US-
QWERTY, you should use Region.paste() instead. Since type() is rather slow because it simulates
each key press, for longer text it is preferrable to use Region.paste().
Best Practice: As a general guideline, the best choice is to use paste() for readable text and type()
for action keys like TAB, ENTER, ESC, .... Use one type() for each key or key combination and be
aware, that in some cases a short wait() after a type() might be necessary to give the target application
some time to react and be prepared for the next Sikuli action.
paste([PSMRL ], text)
Paste the text at a click point.
Parameters
PSMRL a pattern, a string, a match, a region or a location that evaluates to a click point.
modifiers one or more key modifiers
Returns the number 1 if the operation could be performed, otherwise 0 (integer null), which
means, that because of some reason, it was not possible or the click could be performed (in
case of PS may be not Found).
Pastes text using the clipboard (OS-level shortcut (Ctrl-V or Cmd-V)). So afterwards your clipboard con-
tains text. paste() is a temporary solution for typing international characters or typing on keyboard
layouts other than US-QWERTY.
If PSMRL is given, a click on the clickpoint is performed before typing, to gain the focus. (Mac: it might
be necessary, to use switchApp() to give focus to a target application before, to accept typed/pasted
characters.)
If PSMRL is omitted, it performs the paste on the current focused component (normally an input field).
Side Effect if PS was used, the match can be accessed using Region.getLastMatch() afterwards.
Note: Special keys (ENTER, TAB, BACKSPACE, ...) cannot be used with paste(). If needed, you
have to split your complete text into two or more paste() and use type() for typing the special keys
inbetween. Characters like \n (enter/new line) and \t (tab) should work as expected with paste(), but be
aware of timing problems, when using e.g. intervening \t to jump to the next input field of a form.
class Region
text()
Extract the text contained in the region using OCR.
Returns the text as a string. Multiple lines of text are separated by intervening n.
Note: Since this feature is still in an experimental state, be aware, that in some cases it might not work
as expected. If you face any problems look at the Questions & Answers / FAQs and the Bugs.
class Region
mouseDown(button)
Press the mouse button down.
Parameters button one or a combination of the button constants Button.LEFT,
Button.MIDDLE, Button.RIGHT.
Returns the number 1 if the operation is performed successfully, and zero if otherwise.
The mouse button or buttons specified by button are pressed until another mouse action is performed.
mouseUp([button ])
Release the mouse button previously pressed.
Parameters button one or a combination of the button constants Button.LEFT,
Button.MIDDLE, Button.RIGHT.
Returns the number 1 if the operation is performed successfully, and zero if otherwise.
The button specified by button is released. If button is omitted, all currently pressed buttons are released.
mouseMove(PSRML)
Move the mouse pointer to a location indicated by PSRML.
Parameters PSMRL a pattern, a string, a match, a region or a location that evaluates to a click
point.
Returns the number 1 if the operation could be performed. If using PS (which invokes an
implicity find operation), find fails and you have switched off FindFailed exception, a 0
(integer null) is returned. Otherwise, the script is stopped with a FindFailed exception.
Sideeffects: when using PS, the match can be accessed using Region.getLastMatch() afterwards
mouseMove(xoff, yoff )
Move the mouse pointer from its current position to the position given by the offset values (<0 left, up >0
right, down)
Parameters
xoff horizontal offset
yoff vertical offset
Returns 1 if possible, 0 otherwise
As a default, find operations (explicit and implicit) when not successful raise an Exception FindFailed, that
will stop the script immediately.
To implement some checkpoints, where you want to asure your workflow, use Region.exists(), that reports a
not found situation without raising FindFailed (returns False instead).
To run all or only parts of your script without FindFailed exceptions to be raised, use
Region.setThrowException() or Region.setFindFailedResponse() to switch it on and off
as needed.
For more sophisticated concepts, you can implement your own exception handling using the standard Python construct
try: ... except: ... .
New in version X1.1.1.
Generally a FindFailed situation is also signalled (besides that the image could not befound on the screen), if the image
could not be found on the current image path and hence could not be loaded for the find process.
New in version X1.1.1.
To implement even more sophisticated concepts, it is possible to declare handler functions, that are visited in case of
a FindFailed and/or ImageMissing situations and allow to take corrective actions. Before leaving the handler you can
specify how the case should finally be handled (ABORT, SKIP, RETRY or PROMPT). If specified, a handler is always
visited before any other action is taken.
The PROMPT response now allows to recapture the image on the fly or just to capture an image, that is not loadable.
New in version X1.0-rc2.
These are the possibilities to handle not found situations:
generally abort a script, if not handled with try: ... except: ... (the default setting or
using setThrowException(True) or setFindFailedResponse(ABORT))
generally ignore all not found situations (using setThrowException(False) or
setFindFailedResponse(SKIP)),
want to be prompted in such a case (using setFindFailedResponse(PROMPT))
advise Sikuli to wait forever (be careful with that!) (using setFindFailedResponse(RETRY))
New in version X1.1.1:
advise Sikuli to visit the specified handler before taking any other action (using
setFindFailedHandler(handler))
Comment on using PROMPT:
This feature is helpful in following situations:
you are developing something, that needs an application with its windows to be in place, but this workflow you
want to script later. If it comes to that point, you get the prompt, arrange the app and click Retry. Your workflow
should continue.
you have a workflow, where the user might do some corrective actions, if you get a FindFailed
guess you find more ;-)
In case of a FindFailed, you get the following prompt:
class Region
Reminder If used without specifying a region, the default/primary screen (default region SCREEN) is used.
New in version X1.0-rc2.
setFindFailedResponse(ABORT | SKIP | PROMPT | RETRY)
For the specified region set the option, how Sikuli should handle not found situations. The option stays
in effect until changed by another setFindFailedResponse().
Parameters
ABORT all subsequent find failed operations (explicit or implicit) will raise exception
FindFailed (which is the default when a script is started).
SKIP all subsequent find operations will not raise exception FindFailed. Instead, explicit
find operations such as Region.find() will return None. Implicit find operations (ac-
tion functions) such as Region.click() will do nothing and return 0.
PROMPT all subsequent find operations will not raise exception FindFailed. Instead
you will be prompted for the way to handle the situation (see using PROMPT). Only the
current find operation is affected by your response to the prompt.
RETRY all subsequent find operations will not raise exception FindFailed. Instead, Sikuli
will try to find the target until it gets visible. This is equivalent to using wait( ... ,
FOREVER) instead of find() or using setAutoWaitTimeout(FOREVER).
New in version X1.1.1.
setFindFailedHandler(handler)
For all subsequent find failed operations (explicit or implicit) the specified handler should be visited in
case of FindFailed or image not loadable. (see using a FindFailed handler)
Parameters handler the name of the handler function that should be visited.
getFindFailedResponse()
Get the current setting in this region.
Returns ABORT or SKIP or PROMPT or RETRY
Usage:
val = getFindFailedResponse()
if val == ABORT:
print "not found will stop script with Exception FindFailed"
elif val == SKIP:
print "not found will be ignored"
elif val == PROMPT:
print "when not found you will be prompted"
elif val == RETRY:
print "we will always wait forever"
else:
print val, ": this is a bug :-("
True all subsequent find operations (explicit or implicit) will raise exception FindFailed
(which is the default when a script is started) in case of not found.
False all subsequent find operations will not raise exception FindFailed. Instead, ex-
plicit find operations such as Region.find() will return None. Implicit find operations
(action functions) such as Region.click() will do nothing and return 0.
getThrowException()
Returns True or False
Get the current setting as True or False (after start of script, this is True by default) in this region.
Instead of:
# reg is a region object
if not reg.exists(image1):
reg.click(image2)
reg.wait(image3, 10)
reg.doubleClick(image4)
you can group methods applied to the same region using Pythons with syntax:
# reg is a region object
with reg:
if not exists(image1):
click(image2)
wait(image3, 10)
doubleClick(image4)
All methods inside the with block (mind indentation) that have the region omitted are redirected to the region object
specified at the with statement.
IMPORTANT Usage Note
This only works without problems for region objects created on the scripting level using one of the constructors
Region().
Region objects created with Region methods, that return new region objects, might not work though in some cases.
If you get strange results or errors in the with block (e.g. syntax error __enter__ is not defined for
this region) cast your Region object to a scripting level Region object using
castedRegion = Region(regionNotWorking) and use castedRegion in the with clause
or do it like this:
# reg is a scripting level region object
regNew = reg.left() # returns a non-scripting-level region object
with Region(regNew):
if not exists(image1):
click(image2)
wait(image3, 10)
doubleClick(image4)
6.5 Location
This class is there as a convenience, to handle single points on the screen directly by its position (x, y). It is mainly
used in the actions on a region, to directly denote the click point. It contains methods, to move a point around on the
screen.
class Location
Location(x, y)
Parameters
x x position
y y position
Returns a new location object representing the position (x,y) on the screen
getX()
getY()
Get the x or y value of a location object
It is possible to get the values directly by location.x or location.y. It is also possible to set these
values directly by location.x = value or location.y = value.
Note: getX()and getY() currently (versions 0.10.2 and X 1.0rc2) return float values (Java:
double), whereas location.x and location.y return integer values.
setLocation(x, y)
Set the location of this object to the specified coordinates.
offset(dx, dy)
Get a new location which is dx and dy pixels away horizontally and vertically from the current location.
above(dy)
Get a new location which is dy pixels vertically above the current location.
below(dy)
Get a new location which is dy pixels vertically below the current location.
left(dx)
Get a new location which is dx pixels horizontally to the left of the current location.
right(dx)
Get a new location which is dx pixels horizontally to the right of the current location.
6.6 Screen
class Screen
Class Screen is there, to have a representation for a pysical monitor where the capturing process (grabbing a rect-
angle from a screenshot, to be used for further processing with find operations is implemented. For Multi Monitor
Environments it contains features to map to the relevant monitor.
Since Screen extends class Region, all methods of class Region can be used with a screen object.
Of special interest might be the grouping of region method calls using with: in Multi Monitor Environments: use it
for other screens, than the default/primary screen, where you have this feature by default.
Be aware, that using the whole screen for find operations may have an impact on performance. So if possible either
use setROI() or restrict a find operation to a smaller region object (e.g. reg.find()) to speed up processing.
class Screen
Screen([id ])
Create a new Screen object
Parameters id an integer number indicating which monitor in a multi-monitor environment.
Returns a new screen object.
It creates a new screen object, that represents the default/primary monitor (whose id is 0), if id is omitted.
Numbers 1 and higher represent additional monitors that are available at the time, the script is running
(read for details).
Using numbers, that do not represent an existing monitor, will stop the script with an error. So you may
either use getNumberScreens() or exception handling, to avoid this.
Note: If you want to access the default/primary monitor ( Screen(0) ) without creating a new screen object,
use the constant reference SCREEN, that is initiated when your script starts: SCREEN=Screen(0).
getNumberScreens()
Get the number of screens in a multi-monitor environment at the time the script is running
getBounds()
Get the dimensions of monitor represented by the screen object.
Returns a rectangle object
The width and height of the rectangle denote the dimensions of the monitor represented by the screen
object. These attributes are obtained from the operating system. They can not be modified using Sikuli
script.
Normally all region methods are used as reg.find(PS), where reg is a region object (or a screen or a match object). If
written as find(PS) it acts on the default screen being the implicit region in this case (mapped to the constant reference
SCREEN). In Multi Monitor Environments this is the primary monitor (use the constant reference SCREEN, to access
it all the time), that normally is Screen(0), but might be another Screen() object depending on your platform.
So its a convenience feature, that can be seen as an implicit use of the python construct with object:.
On the other hand this may slow down processing speed, because of time consuming searches. So to speed up
processing, saying region.find() will restrict the search to the specified rectangle. Another possibility is to say setROI()
to restrict the search for all following find operations to a smaller region than the whole screen. This will speed up
processing, if the region is significantly smaller than the whole screen.
6.6.3 Capturing
Capturing is the feature, that allows to grab a rectangle from a screenshot, to save it for later use. At each time, a
capturing is initiated, a new screenshot is taken.
6.6. Screen 71
SikuliX Documentation, Release 1.1.0-Beta1
There are two different versions: the first one Screen.capture() saves the content of the selected rectangle in
a file and returns its file name, whereas the second one Screen.selectRegion() just returns the position and
dimension of the selected rectangle.
Both features are available in the IDE via the buttons in the toolbar.
class Screen
If more than one monitor is available, Sikuli is able to manage regions and click points on these monitors.
The base is the coordinate system (picture above), that positions the primary monitor with its upper left corner at
(0,0) extending the x-direction to the right and the y-direction towards the lower boundary of the screen. The position
of additional monitors can be configured in the operating system to be on either side of the primary monitor, with
different positions and sizes. So monitors left of the primary will have pixels with negative x-values and monitors
above will have negative y-values (left and above both x and y are negative).
At script start, Sikuli gets the relevant information from the operating system and creates respective screen objects, that
have an ID (0 for the first or primary monitor, 1 and higher for additional monitors with a maximum of one less than
the number of screens) and know the rectangle, they cover in the coordinate system. These informations are readonly
for a script.
These predefined screen objects can be accessed with Screen(0), Screen(1), ... and are normally used to create your
own screen objects. The possibility to use the region methods on a default region mapped to the primary monitor is
implemented with the constant reference SCREEN. This concept is only available for the primary monitor.
How to get the relevant information:
getNumberScreens() returns the number of available screens.
getBounds() returns the rectangle covered by the default/primary monitor.
Screen.getBounds() returns the rectangle covered by a screen object created using Screen(id).
Be aware: Changes in your system settings are only recognized by the IDE, when it is started.
Windows: The monitor, that is the first one based on hardware mapping (e.g. the laptop monitor), will always be
Screen(0). In the Windows settings it is possible to place the taskbar on one of the secondary monitors, which makes it
the primary monitor getting the base coordinates (0,0). The other available monitors are mapped around based on your
settings. But the Sikuli internal mapping is not changed, so the primary monitor might be any of your Screen() objects.
Sikuli takes care for that and maps SCREEN always to the primary monitor (the one with the (0,0) coordinates). So
for example you have a laptop with an external monitor, that shows the taskbar (is primary monitor):
SCREEN maps to Screen(1)
Screen(0) is your laptop monitor
6.6. Screen 73
SikuliX Documentation, Release 1.1.0-Beta1
Mac: The monitor, that has the System Menu Bar, is always Screen(0) and mapped to the default SCREEN.
Linux (Under construction)
With its rectangle, a screen object is always identical with the monitor it was created using Screen(id). Using
Region.setROI() to restrict the region of interest for find operations has no effect on the base rectangle of the
screen object.
On the other hand region objects and location objects can be positioned anywhere in the coordinate system. Only when
a find operation or a click action has to be performed, the objects rectangle or point has to be inside the rectangle of
an existing monitor (basically repersented by Screen(0), Screen(1), ...). When this condition is met, everything works
as expected and known from a single monitor system.
With finding and acting there are the following exceptions:
Point Outside: a click point is outside any monitor rectangle. The clickpoint will be mapped to the edges or
corners of the primary monitor according to the relative position:
to the edges if its x or y value is in the range of the respective edge (right, left, above, below)i
to the corners, if x and y are outside any range of any edge (left/above -> upper left corner, ...)
Region Outside: a region is completely outside any monitor
a click action is handled in thesame way as Point Outside
a find operation will always fail
Region Partially Outside: a region is partially outside a monitor but not overlapping another monitor
a click action is handled in the same way as Point Outside
a find operation will be carried out only in the part of region within the bounds of the monitor, excluding
the area outside the monitor.
Region Across Monitors: a region lies across multiple monitors:
a click action is handled in the same way as Point Outside
a find operation will be restricted to the region within the bounds of the monitor that has a smaller id.
An interactive capture (the user is asked to select an image or a rectangle via Screen.capture() or
Screen.selectRegion()) will automatically be restricted to the monitor, where it was started.
A scripted capture using a rectangle or a region (i.e. Screen.capture( region | rectangle )), will be
handled accordingly:
Region Outside: no image is captured, None is returned
Region Partially Outside: the returned image will only cover the part inside the monitor
Region Across Monitors: the returned image will only cover the part inside the monitor with the smallest id.
Based on the knowledge of your monitor configuration, you can now start some further evaluations using e.g.
Region.hover() together with setShowActions(True) and highlighting using Region.highlight().
6.7 Pattern
class Pattern
A pattern is used, to associate an image file with additional attributes used in find operations and when acting on a
match object.
Minimum Similarity:
While using a Region.find() operation, if only an image file is provided, Sikuli searches the region using a default
minimum similarity of 0.7. This default value can be changed in Settings.MinSimilarity.
Using similar() you can associate a specific similarity value, that will be used as the minimum value, when this
pattern object is searched. The IDE supports adjusting the minimum similarity of captured images using the Preview
Pane (internally in the script, the images are turned into a pattern object automatically).
Click Point:
Normally when clicking on a match, the center pixel of the associated rectangle is used. With a pattern object, you
can define a different click point relative to the center using targetOffset().
class Pattern
Pattern(string)
Parameters string a path to an image file
Returns a new pattern object
This will initialize a new pattern object without any additional attributes. As long as no pattern methods
are used additionally, it is the same as just using the image file name itself in the find operation.
similar(similarity)
Return a new Pattern object containing the same attributes (image, click point) with the minimum similarity
set to the specified value.
Parameters similarity the minimum similarity to use in a find operation. The value
should be between 0 and 1.
Returns a new pattern object
exact()
Return a new Pattern object containing the same attributes (image, click point) with the minimum similarity
set to 1.0, which means exact match is required.
Returns a new pattern object
targetOffset(dx, dy)
Return a new Pattern object containing the same attributes (image, similarity), but a different definition
for the click. By default, the click point is the center of the found match. By setting the target offset, it
is possible to specify a click point other than the center. dx and dy will be used to calculate the position
relative to the center.
Parameters
dx x offset from the center
dy y offset from the center
Returns a new pattern object
getFilename()
Get the filename of the image contained in the Pattern object.
Returns a filename as a string
getTargetOffset()
Get the target offset of the Pattern object.
Returns a Location object as the target offset
6.7. Pattern 75
SikuliX Documentation, Release 1.1.0-Beta1
6.8 Match
class Match
An object of class Match represents the result of a successful find operation. It has the rectangle dimension of the
image, that was used to search. It knows the point of its upper left corner on an existing monitor, where it was found.
Since class Match extends class Region, all methods of class Region can be used with a match object.
A match object is created as the result of an explicit find operation. It can be saved in a variable for later use with
actions like click().
It has the rectangle dimension of the image, that was used to search. It knows the point of its upper left corner on an
existing monitor, where it was found. It knows the similarity it was found with and a click point to be used, if set by a
pattern.
For all other aspects, the features and attributes of class Region apply.
class Match
getScore()
Get the similarity score the image or pattern was found. The value is between 0 and 1.
getTarget()
Get the location object that will be used as the click point.
Typically, when no offset was specified by Pattern.targetOffset(), the click point is the center
of the matched region. If an offset was given, the click point is the offset relative to the center.
A find operation Region.findAll() returns an iterator object that can be used to fetch all found matches as
match objects one by one. A reference to the iterator is stored in the respective region and can be accessed using
Region.getLastMatches().
Important to know:
per definition, an iterator can be stepped through only once - it is empty afterwards
You can read more about the basics of operations with iterators from the description of Finder class. To save
contained matches for later use, you can convert them to list.
Example: using while: with default screen
Example: using with: with default screen
6.9 Finder
class Finder
New in version 1.1.0.
The behavior is changed compared to previos versions, to be consistent with Region.find()/findAll().
A Finder object implements an iterator of matches and allows to search for a visual object in an image file that you
provide (e.g. a screenshot taken and saved in a file before). After setting up the finder object and doing a find operation,
you can iterate through the found matches if any.
Important to know:
per definition, an iterator can be stepped through only once - it is empty afterwards
it has to be destroyed using finder.destroy(), especially when used with for: or while:
when used in a with: construct, it is destroyed automatically
Compared with the region based find/findAll operation, no exception FindFailed is raised in case nothing is found at
all (use hasNext() to check).
The result using a finder object can be compared to what you get with region.getLastMatch() when using
find() or with region.getLastMatches() when using findAll().
Note: There is no chance, to get the number of matches in advance. If you iterate through to count, afterwards your
finder is empty. So you have to save your matches somehow while counting, if you need them later (one possible
solution see example below).
The workflow always is:
setup a Finder
do a find or findAll operation
check with hasNext(), wether anything was found at all
get the available matches with next() if hasNext() says more available
After a complete iteration, the finder object is empty.
You can start a new find or findAll operation at any time.
class Finder
Finder(path-to-imagefile)
Create a new finder object.
Parameters path-to-imagefile filename to a source image to search within
find(path-to-imagefile[, similarity ])
Find a given image within a source image previously specified in the constructor of the finder object. If
more than one match is possible, yet only one is returned and which one is not predictible.
Parameters
path-to-imagefile the target image to search for
similarity the minimum similarity a match should have. If omitted, the default is
used.
New in version 1.1.0.
findAll(path-to-imagefile[, similarity ])
Find all occurences of the given image within a source image previously specified in the constructor of the
finder object.
Parameters
path-to-imagefile the target image to search for
similarity the minimum similarity a match should have. If omitted, the default is
used.
6.9. Finder 77
SikuliX Documentation, Release 1.1.0-Beta1
hasNext()
Check whether there are more matches available that satisfy the minimum similarity requirement. A False
returned after the first hasNext() signals, that nothing was found at all.
Returns True if more matches exist.
next()
Get the next match.
Returns a Match object or None, if empty or no more matches.
The returnd reference to a match object is no longer available in the finder object afterwards. So if it is
needed later, it has to be saved to another variable.
Example 1: basic operations using a Finder
Example 2: we want to know how many matches in advance and want to save the matches for later use (based on the
previous example).
class Key
Applicable usage situations for these predefined constants of special keys and key modifiers can be found in Acting on
a Region and Low Level Mouse and Keyboard Actions.
miscellanous keys
ENTER, TAB, ESC, BACKSPACE, DELETE, INSERT
miscellanous keys
SPACE
function keys
F1, F2, F3, F4, F5, F6, F7, F8, F9, F10, F11, F12, F13, F14, F15
navigation keys
HOME, END, LEFT, RIGHT, DOWN, UP, PAGE_DOWN, PAGE_UP
special keys
Note: The status ( on / off ) of the keys Key.CAPS_LOCK, Key.NUM_LOCK and Key.SCROLL_LOCK can be
evaluated with the method Env.isLockOn().
numpad keys
NUM0, NUM1, NUM2, NUM3, NUM4, NUM5, NUM6, NUM7, NUM8, NUM9
SEPARATOR, ADD, MINUS, MULTIPLY, DIVIDE
modifier keys
ALT, CMD, CTRL, META, SHIFT, WIN, ALTGR
A key is called a modifier key, when it is used in conjunction with other keys by pressing and holding it while typing
the other keys. The keys used most commonly as modifier keys are SHIFT, ALT (left alt key, CTRL (might be strg on
Windows) and the CMD key (Apple key on OS X). On Windows you additionally have the Windows key and the alt
key on the right.
the plain type() command has 2 parameters:
parameter 1: the keys, that should be pressed/released one after the other
parameter 2: the modifier keys, that alltogether should be pressed and held during the typing and released
at the end.
For parameter 2 there are 2 options:
new version: (a key is a key ;-): Key.XXX
old version (kept for upwards compatibility): KeyModifier.XXX
... and these early versions should not be used anymore KEY_ALT, KEY_CTRL, KEY_SHIFT, KEY_WIN,
KEY_CMD, KEY_META
The modifier keys can be combined using + , if more than one key modifier is needed.
# example with the recommended Key.XXX version
type(Key.ESC, Key.CTRL + Key.ALT)
Note for Java programming: These constants are mapped to the according constants of
java.awt.event.InputEvent.
class App
Generally you have the choice between using the class methods (e.g.
App.open("application-identifier")) or first create an App instance and use the instance meth-
ods afterwards (e.g. myApp = App("application-identifier") and then later on myApp.open()).
There is no recomendation for a preferred usage. The only real difference is, that you might save some ressources,
when using the instance approach, since using the class methods produces more intermediate objects. So if you plan
to act on the same app or window more often, it might be more transparent, to use the instance approach How to
create an App instance
The basic choice is to just say someApp = App("some-app-identifier") and you have your app instance,
that you can later use together with its methods, without having to specify the string again.
Generally all class methods return an app instance, that you might save in a variable to use it later in your script.
At time of instance creation the process list is scanned for the name of the executable using the given text. If this is
found, the app instance is initialized with the respective information (PID, executable, window title of main/frontmost
window). So you could directly ask the app instance, wether it is running (isRunning()), has a main window
(hasWindow()), get the title of that main window (getWindow()) and get the process id (PID) (getPID).
The string representation of an app instance looks like this: [nPID:executableName
(main/frontmost window title)] given text
(Windows/Linux) If it is not found, the titles of the currently known windows are scanned, wether they contain the
given text. The first matching window found evaluates to the application, that initializes the app instance.
If neither the executable is found in the process list, nor a matching window, the app instance is initialized as not
running (the PID is set to -1). The given text is remembered.
If you specify the exact window title of an open window, you will get exactly this one. But if you specify some text,
that is found in more than one open window title, you will get the first in the row of all these windows. So if you want
exactly one specific window, you either need to know the exact window title or at least some part of the title text, that
makes this window unique in the current context (e.g. save a document with a specific name, before accessing its
window).
(Mac OS X) not yet possible, to identify a running app by part of the title of one of its windows. The window title
you get by getWindow()is the one of the currently frontmost window of that application.
NOTE Currently the information, wether a window is hidden or minimized, is not available and it is not possible yet,
to bring such a window to front with a compound SikuliX feature.
class App
classmethod pause(waitTime)
Usage: App.pause(someTime) (convenience function)
Just do nothing for the given amount of time in seconds (integer or float).
classmethod App(application)
Usage: someApp = App(application)
Open the specified application, if it is not yet opened and bring it to front
Parameters application The name of an application (case-insensitive), that can be found
in the path used by the system to locate applications, or the full path to an application.
Optionally you might add parameters, that will be given to the application at open (see
setUsing()).
Returns an App object, that can be used with the instance methods
isRunning([waitTime ])
Usage: if not someApp.isRunning(): someApp.open() where App instance someApp
was created before.
Parameters waitTime optional: seconds as integer, that should be waited for the app to get
running
Returns True if the app is running (has a process ID), False otherwise
Windows It is common, to identify an app by (part of) its window title. If it is not open yet, one has to
use methods, to open it first before proceeding.
So this is a typical example, how to deal with that:
Example:
# we want to act in a VirtualBox VM window and use the VM's name
# which is always part of the Window title when running
vb = App("VM-name")
if not vb.isRunning():
App.open(r'"C:\Program Files\Oracle\VirtualBox\VBoxManage.exe" startvm VM-name')
while not vb.isRunning():
wait(1)
vb.focus()
appWindow = App.focusedWindow()
hasWindow()
Usage: if not someApp.hasWindow(): openNewWindow() # some private
function where App instance someApp was created before.
Returns True if the app is running and has a main window registered, False otherwise
getWindow()
Usage: title = someApp.getWindow() where App instance someApp was created before.
Returns the title of the frontmost window of this application, might be an empty string
getPID()
Usage: pid = someApp.getPID() where App instance someApp was created before.
Returns the process ID as number if app is running, -1 otherwise
getName()
Usage: appName = someApp.getName() where App instance someApp was created before.
Returns the short name of the app as it is shown in the process list
setUsing(parametertext)
Usage: appName = someApp.setUsing("parm1 x parm2 y parm3 z") where App in-
stance someApp was created before.
Parameters parametertext a string, that is given to the applications startup as you would
give it, if you would start the app from a commandline.
Returns the app instance
classmethod open(application)
Usage: App.open(application)
Open the specified application, if it is not yet opened and bring it to front
Parameters application The name of an application (case-insensitive), that can be found
in the path used by the system to locate applications, or the full path to an application (Win-
dows: use double backslash \ in the path string to represent a backslash)
Returns an App object, that can be used with the instance methods, None in case of failing
This method is functionally equivalent to openApp().
open([waitTime ])
Usage: someApp.open() where App instance someApp was created before.
Open this application.
Parameters waitTime optional: seconds as integer, that should be waited for the app to get
running
Returns the app instance or null/None if open failed
classmethod focus(application)
Usage: App.focus(application)
Switch the input focus to an application/window.
Parameters application The name of an application (case-insensitive) or (part of) a win-
dow title (Windows/Linux) (case-sensitive).
Returns an App object, that can be used with the instance methods, , None in case of failing
focus()
Usage: someApp.focus() where App instance someApp was created before.
Switch the input focus to this application/window.
classmethod close(application)
Usage: App.close(application)
It closes the given application or the matching windows (Windows/Linux). It does nothing if no running
application or opened window (Windows/Linux) can be found. On Windows/Linux, whether the applica-
tion itself is closed depends on weather all open windows are closed or a main window of the application
is closed, that in turn closes all other opened windows.
Parameters application The name of an application (case-insensitive) or (part of) a win-
dow title (Windows/Linux)(case-sensitive).
This method is functionally equivalent to closeApp().
close()
Usage: someApp.close() where App instance someApp was created before.
Close this application.
classmethod focusedWindow()
Usage: App.focusedWindow()
Identify the currently focused or the frontmost window and switch to it. Sikuli does not tell you, to
which application this window belongs.
Returns a Region object representing the window or None if there is no such window.
On Mac, when starting a script, Sikuli hides its window and starts processing the script. In this
moment, no window has focus. Thus, it is necessary to first click somewhere or use App.focus()
to focus on a window. In this case, this method may return None.
On Windows, this method always returns a region. When there is no window opened on the desktop,
the region may refer to a special window such as the task bar or an icon in the system tray.
Example:
# highlight the currently fontmost window for 2 seconds
App.focusedWindow().highlight(2)
window([n ])
Usage 1: App(application).window([n]) an App instance is created on the fly.
Usage 2: someApp.window([n]) where App instance someApp was created before.
Get the region corresponding to the n-th window of this application (Mac) or a series of windows
with the matching title (Windows/Linux).
Parameters n 0 or a positive integer number. If ommitted, 0 is taken as default.
Returns the region on the screen occupied by the window, if such window exists and None
if otherwise.
Below is an example that tries to open a Firefox browser window and switches to the address field
(Windows):
# using an existing window if possible
myApp = App("Firefox")
if not myApp.window(): # no window(0) - Firefox not open
App.open("c:\\Program Files\\Mozilla Firefox\\Firefox.exe")
wait(2)
myApp.focus()
wait(1)
type("l", KEY_CTRL) # switch to address field
Afterwards, it focuses on the Firefox application, uses the window() method to obtain the region
of the frontmost window, applies some operations within the region, and finally closes the window:
# using a new window
firefox = App.open("c:\\Program Files\\Mozilla Firefox\\Firefox.exe");
wait(2)
firefox.focus()
wait(1)
# now your just opened new window should be the frontmost
with firefox.window(): # see the general notes below
# some actions inside the window(0)'s region
click("somebutton.png")
firefox.close() # close the window - stop the process
Below is another example that highlights all the windows of an application by looping through them
(Mac):
# not more than 100 windows should be open ;-)
myApp = App("Safari")
for n in range(100):
w = myApp.window(n)
if not w: break # no more windows
w.highlight(2) # window highlighted for 2 second
Be aware, that especially the window handling feature is experimental and under further develop-
ment.
Especially on Windows be aware, that there might be many matching windows and windows, that
might not be visible at all. Currently the window() function has no feature to identify a special
window besides returning the region. So you might need some additional checks to be sure you are
acting on the right window.
Windows/Linux: The close() function currently kills the application, without closing its win-
dows before. This is an abnormal termination and might be recognized by your application at the
next start (e.g. Firefox usually tries to reload the pages).
Even if the windows are hidden/minimized, their region that they have in the visible state is returned.
Currently there is no Sikuli feature, to decide wether the given window(n) is visible or not or if it
is currently the frontmost window. The only guarentee: window()/window(0) is the topmost
window of an application (Mac) or a series of matching windows (Windows/Linux).
Currently there are no methods available to act on such a window (resize, bring to front, get the
window title, ...).
Some tips:
Check the position of a windows returned region: some apps hide there windows by giving them
outside coordinates (e.g. negative)
Check the size of a windows returned region: normally your app windows will occupy major parts
of the screen, so a windows returned region of e.g. 150x30 might be some invisible stuff or an
overlay on the real app window (e.g. the search in history input field on the Safari Top-Sites page,
which is reported as windows(0))
If you have more than one application window, try to position them at different coordinates, so you
can decide which one you act on in the moment.
It is sometimes possible to use the OCR text extraction feature Region.text() to obtain the
window title.
Miscellanous
If you are wondering if Sikuli can do X or Y, these two general rules apply:
If you can do X with Java, you can also do it in SikuliX ...
... by simply adding the respective Java resources to the classpath (the standard Java classes are
already there).
For example, you know how to create a GUI with Java Swing, you can do it in the same way in
SikuliX.
If you can do X with Python, you probably can do it in SikuliX as well.
This actually depends on what Python modules you want to use. SikuliX is on Python language level
2.5 and 2.7. So everything available in the respective Python base package is available in SikuliX
too.
If modules are written in pure Python, you can use them in SikuliX as well. A typical example are
the Excel access modules xlrd and xlwt.
If modules are written in C or depend on C-based stuff, unfortunately, you cant use them in SikuliX
(e.g. the support for Win32API calls).
But it always is a good idea to check in the net, wether there either is a feature compatible module
already available for Jython or you might use an appropriate Java library package.
To Ruby the same rule applies accordingly.
For any specific question it is always a good idea to first consult SikuiXs FAQs and Questions
Then search with Google for the topic mentioning Jython, JRuby or even Sikuli - you will get tons of hints and
snippets.
Yes. You can create GUIs with Java Swing or any other Java/Jython/JRuby GUI toolkit.
85
SikuliX Documentation, Release 1.1.0-Beta1
SikuliX can be used on command line to run a Sikuli script or open the IDE.
The usage on each platform:
Windows
PATH-TO-SIKULIX/runsikulix.cmd [options]
Mac OS X
PATH-TO-SIKULIX/runsikulix [options]
Linux
PATH-TO-SIKULIX/runsikulix [options]
PATH-TO-SIKULIX/runsikulix(.cmd)
-d,-debug <value> (up to 3 makes sense)
raise the verbosity level of SikuliXs internal debug messages - might be useful to clarify odd situations
-f,-logfile [<path to log file>]
a valid filename (if omitted: WorkingDir/SikuliLog.txt) where SikliXs log messages should be written to
-u,-userlog [<path to log file>]
a valid filename (if omitted: WorkingDir/UserLog.txt) where user log messages created using Debug.user()
should be written to
PATH-TO-SIKULIX/runsikulix(.cmd)
86 Chapter 7. Miscellanous
SikuliX Documentation, Release 1.1.0-Beta1
-h,-help
print a help message showing the available options and exit
-i,-interactive
open an interactive Jython session that is prepared for the usage of the SikuliX features
PATH-TO-SIKULIX/runsikulix(.cmd)
-c,-console
all output goes to stdout
7.2.4 Command Line Options (intention: run a script without opening the IDE)
PATH-TO-SIKULIX/runsikulix(.cmd)
-r,-run <sikuli-folder/file> (one or more entries seperated by space)
run one or more .sikuli or .skl files one after the other
<sikuli-folder/file> can be
a relative or absolute path with or without dotted parts (e.g. ../some-script)
a pointer to a location in the HTTP net (for details look here). The contained script file is downloaded
and run, while the image files are downloaded when used in the script at runtime.
Having more than one script to run, the folder containing the script folder is remembered and applied to a
following entry, that has a preceding ./ - example
sikulix.com:scripts/test1 ./test2 ./test2 will reuse the location sikulix.com:scripts/ for test2 and test3
Having more than one script specified: a return code of -1 will stop the complete execution.
Having more than one script specified: the next script can get the return code of the script run before using
ScriptingSupport.getLastReturnCode()
7.2.5 Command Line Options (intention: run the experimental scriptrun server)
PATH-TO-SIKULIX/runsikulix(.cmd)
-s,-server [<port>] (optional port not yet supported, 50001 is used as default)
start a scriptrun server (more information<http://www.sikulix.com/support.html>_)
7.2.6 Command Line Options (intention: provide user parameters for running
scripts)
PATH-TO-SIKULIX/runsikulix(.cmd)
-- <arguments>
the space delimited and optionally quoted arguments (only apostrophes are supported) are passed to Jythons
sys.argv and hence are available in your script. A parameter containing intermediate blanks MUST be quoted
to get it into one sys.argv entry.
This option must go after all the other options mentioned above.
7.3 How to use SikuliX API in your JAVA programs or Java aware
scripting
The core of SikuliX is written in Java, which means you can use the SikuliX API as a standard JAVA library in your
program.
This applies to any Java aware scripting environment like Jython, JRuby, Scala, Groovy, Clojure and more, where you
write your scripts in other IDEs and run them using the respective runtime support directly.
NOTE: When using the scripting support provided by the SikuliX IDE and when running Jython/JRuby scripts from command
For information what you can do when using the Java API to have this convenience too, look here
SIKULI_IMAGE_PATH.
After having setup SikuliX on your system, as recommended at Getting started, you have to do the following:
After adding sikulixapi.jar as a library reference into your project, the project hierarchy in Eclipse might look like this:
Files view
88 Chapter 7. Miscellanous
SikuliX Documentation, Release 1.1.0-Beta1
7.3. How to use SikuliX API in your JAVA programs or Java aware scripting 89
SikuliX Documentation, Release 1.1.0-Beta1
It is planned, to publish sikulixapi.jar version 1.1.0+ on MavenCentral, so having a dependency in your project pom
would be sufficient.
The coordinates:
<groupId>com.sikulix</groupId>
<artifactId>sikulixapi</artifactId>
<version>1.1.0</version>
Snapshots of developement head can be loaded from OSSRH they are created daily (most of the time ;-)
use this repository setting:
<repository>
<! OSSRH: com.sikulix >
<id>com.sikulix</id>
<name>com.sikulix</name>
<url>https://oss.sonatype.org/content/groups/public</url>
<layout>default</layout>
<snapshots>
<enabled>true</enabled>
<updatePolicy>always</updatePolicy>
</snapshots>
</repository>
and as version use: <version>1.1.0-SNAPSHOT</version>
Be aware, that some method signatures in the Java API differ from the scripting level.
Javadoc of SikuliX (temporary location).
90 Chapter 7. Miscellanous
SikuliX Documentation, Release 1.1.0-Beta1
images
image1.png
image2.png
the standard Maven project setup would be
pom.xml
src
main
java
mypackage
ImageContainer.class
resources
images
image1.png
image2.png
In other non-Maven project setups in NetBeans, Eclipse and other IDEs this might look different. But what counts is
the final position of the images folder in the resulting jar file relative to its root.
This would try to load image files from the jar file of the given class name, supposing the jar packaging during project
build produces a jar like this:
TestSikuli.class
images
spotlight.png
spotlight-input.png
import org.sikuli.script.*;
As a goody you might use the SikuliX IDE to manage your image folders inside your java projects.
Just capture the images and save the script appropriately (e.g. as images.sikuli) in your project structure in a suitable
place and add the path to your java program having the jar file on the class path at runtime.
Taking the above ImageContainer example ina Maven context:
project structure:
pom.xml
7.3. How to use SikuliX API in your JAVA programs or Java aware scripting 91
SikuliX Documentation, Release 1.1.0-Beta1
src
main
java
mypackage
ImageContainer.class
resources
images.sikuli
images.py
spotlight.png
spotlight-input.png
The ImageContainer class need not have any features, but must be valid (compileable).
Have the resulting jar on classpath at runtime and access the images just using their base names:
import org.sikuli.script.*;
7.4 Extensions
This feature is currently under developement and might only be available on request.
Extensions allow to implement new Sikuli features by adding packages to your current Sikuli installation.
They are maintained by the developers (see Technical Details below).
If you want to contribute a new extension or a modified exisiting one, please look at How to contribute an extension
below.
The download of an extension is supported by the IDE through the menu Tools -> Extensions. You get a popup, that
lists the available and already installed extensions and allows to download new packages or updates for installed ones.
This popup shows a new package not yet installed:
92 Chapter 7. Miscellanous
SikuliX Documentation, Release 1.1.0-Beta1
If you need more information about the features of the extension, just click More Info - this will open the related
documentation from the web in a browser window.
If you want to install the extension, just click the Install... button. The package will be downloaded and added to your
extensions repository.
This popup shows an installed package:
If a new version would be available at that time, the Install... button would be active again, showing the new version
number. Now you could click and download the new version.
How to Use an Extension
To use the features of an installed extension in one of your scripts, just say from extension-name import *.
For an usage example read Sikuli Guide.
For information about features, usage and API use menu Tools -> Extensions -> More Info in the IDE.
Technical Details
Extensions are Java JAR files containing some Java classes (usually the core functions) and/or Python modules, which
define the API to be used in a script.
Sikuli maintains a local extensions directory, where downloaded extensions are stored together with a hidden list
of the installed extensions (Windows: %APPDATA%\Sikuli\extensions, Mac: ~/Library/Application
Support/Sikuli/extensions, Linux: ~/.sikuli/extensions).
Once an extension is imported using import extension-name, Sikuli automatically searches and loads the JAR
file of that extension into the current context with load(path-to-jar-file).
7.4. Extensions 93
SikuliX Documentation, Release 1.1.0-Beta1
The final structure of a JAR (filename extension-name-X.Y where X.Y is the version string) looks like this:
org/com
- your-organization-or-company
-- extension-name
--- yourClass1.class
--- yourClass2.class
--- .... more classes
extension-name
- __init__.py
- extension-name.py
META-INF
- MANIFEST.MF
The file __init__.py contains at least from extension-name import * to avoid one qualification level.
So in a script you might either use:
import extension-name
extension-name.functionXYZ()
or:
from extension-name import *
functionXYZ()
The second case requires more investement in a naming convention, that avoids naming conflicts.
The file extension-name.py contains the classes and methods, that represent the API, that one might use in a
Sikuli script.
As an example you may take the source of the extension Sikuli Guide.
94 Chapter 7. Miscellanous
SikuliX Documentation, Release 1.1.0-Beta1
While developing your extensions, you can put the JAR file in Sikulis extension directory or in the same .sikuli folder
as your test script. The JAR file should not have a version number in its file name, e.g. extension-name.jar.
Because Sikuli starts to search extensions in the .sikuli folder of the running script and then in the Sikuli extensions
folder, it is usually a good idea to put your developing extensions in the .sikuli folder of your test script.
Another option is to use the load() function with an absolute path to your extension-name.jar. If this fails,
Sikuli goes on searching in the current .sikuli folder and then in the Sikuli extensions folder. If load() succeeds, it
returns True and puts absolute-path-to-your-extension-name.jar into sys.path, so you can use
import extension-name afterwards.
Currently you have to contact the developers of Sikuli and agree on how to proceed.
Quick Start
First Example
In our first example, suppose we want to create a guided tour of this very documentation page you are currently
reading. We want to bring your attention to the logo picture to the right. Using the functions provided by Sikuli Guide,
we can write the following script to accomplish this:
When you run this script, Sikuli Guide will search for the logos image on the screen, highlight it, and display the text
This is Sikulis logo below the image, like the figure below:
7.4. Extensions 95
SikuliX Documentation, Release 1.1.0-Beta1
Again, this happens in the actual interface, rather than in a video or a screenshot. The logo image that is highlighted
is the actual interface element users can click on.
Lets explain the script line by line. The first line is an import statement that tells Sikuli to load the Sikuli Guide
extension. The secod line uses the text(pattern, text) function to add text next to a given pattern, in
this case, the logo image. Note that by default the text added is not displayed immediately, it is only internally added to
the visual element. In the third line, we call show(secs) to explicitly tell Sikuli Guide to now display all registerd
annotation elements (in this case only the text) for the duration specified by secs.
Below is a YouTube video of this example.
It is possible to add text or other annotations to multiple visual elements before calling show(secs) in order to show
them on the screen at the same time.
The script above uses the function tooltip(text) to add tooltips to three links in addition to the text annotation.
The result of running this script is shown below:
96 Chapter 7. Miscellanous
SikuliX Documentation, Release 1.1.0-Beta1
Rather than showing the annotations all at once, we can also show them one by one using separate show() statements.
Below is an example where we cycle through the three links and show the tooltip of each link one at a time.
The result of running this script is shown below (sorry, no animation):
Adding Interaction
Another way to control the flow of a guided tour is to display a dialog box and let users click on a button to continue
to the next part of the tour. Sikuli Guide provides a function dialog(message) to accomplish this easily. Below
is an example using this function to create a two-part guided tour.
The tour presented by the script above introduces the navigation links above and the Sikulis logo as a shortcut to go
back to the documentations HOME page. The function call dialog("Part 1") indicates the tour will show a
7.4. Extensions 97
SikuliX Documentation, Release 1.1.0-Beta1
dialog that displays the message specified by the string argument (i.e., Part 1: Navigation Links). The following call
to show() will actually display the dialog along with the text elements spcified earlier.
The figure below shows what happens after Line 3:
After users click on the Next button, the tour moves to the next part. The screen will look like below:
You might use the feature clickable(PSRM), to make a region sensitive for clicks.
The script waits until the user clicks one of the two highlighted areas.
98 Chapter 7. Miscellanous
SikuliX Documentation, Release 1.1.0-Beta1
Function References
element: when used as a parameter, it can either be something that can be used with a find() (Pattern or string as image
file name or just plain text), a Region or Match object or another Guide element.
All funtions return the created element, so later the layout can be changed by specific setters or they can be used as
target elements for other elements
Static Annotations
guide.rectangle(element)
Add a rectangular overlay as frame on the specified elements region.
Parameters element a suitable
guide.circle(element)
Add a circle around the specified targets region.
Parameters element a pattern, string, region or match
guide.text(element, txt)
Add some text to one edge of the specified element
Parameters
element a suitabel element
txt a string as text to display
guide.tooltip(element, txt)
Add a tooltip (small font in a light yellow box). same as text(), but with predefined layout. As usual for tooltips:
the text should be a short oneliner
Parameters
element a suitabel element
txt a string as text to display
7.4. Extensions 99
SikuliX Documentation, Release 1.1.0-Beta1
Interactive Elements
guide.button(element, name)
A clickable button showing its name as the button text.
Parameters name a string as text to display, later used as reference to check how the button was
used
Control
guide.show([seconds ])
Show static and interactive components added so far for the specified amount of time.
Parameters seconds a decimal number as display duration in seconds
The default duration is 10 seconds. If interactive elements (either one or more clickable elements or a dialog
box) were previously added, it waits until the user interacts with one of these elements. At this time all elements
vanish and are discarded.
Now, open the Sikuli IDE. We begin by taking the screenshot of our target, the Spotlight symbol in the upper-right
corner, so that we can tell Sikuli Script what to look for and click on.
To simulate a mouse click on the Spotlight symbol, we are going to use the click function. To tell Sikuli how the
target looks like, we need to capture the targets image on the screen.
Sikuli IDE provides two methods to capture screen images. The first method is to click on the camera button in the
toolbar. This will bring you to the screen capturing mode.
The second method is to press a hot-key (Command + Shift + 2). Often the target whose image you wish to capture
may be covered by the Sikuli IDEs window. You can minimize the IDEs window and use this hot-key to switch to
the capturing mode.
In the screen capturing mode, the screen will look darker and freeze momentarily. The entire desktop becomes like a
canvas where you can draw a rectangle around the target you want to capture an image of. In this case, the target is
the spotlight symbol. The cross of red dotted lines shows the center of the rectangle you just drew.
101
SikuliX Documentation, Release 1.1.0-Beta1
After you have drawn (or selected) a rectangle, the image within the rectangle will be captured and inserted into the
script editor at the current cursor position.
Now, you can write the click function using this image as an argument to tell Sikuli to click on spotlight symbol.
As a convenience, Sikuli IDE provides a Command List on the left panel. It shows a list of the most often used
functions. Camera icons in the functions indicate these functions expect a captured image as an argument.
Locate the click() function in the list and click on it. If Auto Capture is on (default), you will be directed to the screen
capturing mode in which you can capture an image of an interface target to be inserted into the click() function as an
argument.
The next step is to tell Sikuli to enter the string Hello World into spotlights search box, which can be done with a
simple type function.
This function will type the string given in the argument into whichever input control that has the focus. After clicking
on the spotlight symbol, we can expect the spotlight search box will be the input that has the focus.
Congratulations! You have just completed your first Sikuli Script. Press the run button to see this script in action!
Now, open the Sikuli IDE. We begin by taking the screenshot of our target, the Start menu symbol that is usually
located in the lower-left corner of the desktop. Using this screenshot, we can tell Sikuli script what to click on.
To simulate a mouse click on the Start symbol, we are going to use the click function. To tell Sikuli how the Start
symbol look like, we need to capture its image on teh screen.
Sikuli IDE provides two methods to capture screen images. The first method is to click on the camera button in the
toolbar. This will bring you to the screen capturing mode.
The second method is to press a hot-key (Ctrl + Shift + 2). Often the target whose image you wish to capture may
be covered by the Sikuli IDEs window. You can minimize the IDEs window and use this hot-key to switch to the
capturing mode.
In the screen capturing mode, the screen will look darker and freeze momentarily. The entire desktop becomes like a
canvas where you can draw a rectangle around the target you want to capture an image of. In this case, the target is
the Start symbol. The cross of red dotted lines shows the center of the rectangle you just drew.
After you have drawn (or selected) a rectangle, the image within the rectangle will be captured and inserted into the
script editor at the current cursor position.
Now, you can write the click function using this image as an argument to tell Sikuli to click on start symbol.
As a convenience, Sikuli IDE provides a Command List on the left panel. It shows a list of the most often used
functions. Camera icons in the functions indicate these functions expect a captured image as an argument.
Locate the click() function in the list and click on it. If Auto Capture is on (default), you will be directed to the screen
capturing mode in which you can capture an image of an interface target to be inserted into the click() function as an
argument.
The next step is to tell Sikuli to enter the string Hello World into search box, which can be done with a simple type
function.
This function will type the string given in the argument into whichever input control that has the focus. After clicking
on the Start symbol, we can expect the search box will be the input that has the focus.
Congratulations! You have just completed your first Sikuli Script. Press the run button to see this script in action!
In this tutorial, we will write a Sikuli Script to automate the operation of clearing the content of the trash bin. On Mac
OS X, this operation takes three steps:
1. Click on the Trash icon in the dock
2. Click on the Empty button in the container window
3. Click on the Empty Trash button to confirm
To automate this operation, first we need to capture the images of the GUI elements that need to be clicked.
Then, we can write a sequence of three click() statements to click on the three elements that need to be clicked in
order.
Notice how closely these three lines of code are mapped to the text description of the three-step operation earlier. We
simply replace the description of each GUI element (e.g., Trash bin) by its image directly. How intuitive it is!
Before running the script, make sure the Trash icon is visible on the screen, otherwise Sikuli Script can not find it.
In this tutorial, we will demonstrate how to use a for loop to interact with multiple instances of a GUI component.
Suppose we want to uncheck all the check boxes in a window, such as the Sharing preferences window shown below:
Unfortunately, there is no uncheck all function available. The solution? Write a Sikuli Script to look for ALL the
checked items and uncheck them automatically. The function needed for this operation is findAll().
First, lets capture the screenshot image of a checked item.
findAll() searches the entire screen for all the matching visual patterns and returns a list of locations of those
similar patterns. This capability allows us to obtain all the checked items are on the screen. Then, we can simply write
a for loop in standard Python syntax and call click() on each element in the list.
When this script is executed, Sikuli will find all the items that are currently checked and click on each item one by one
in the loop.
This tutorial demonstrates how to check a particular checkbox when there are multiple checkboxes in a window.
Consider the following window, which is the window for setting the preferences for Dock on Mac, and we want to
check the checkbox indicated by the circle that reads Minimize Windows into Application Icons.
Simply looking for the image of the checkbox like below will not work.
Sikuli does not know which checkbox we are referring to; it simply clicks on the first one it finds. Below is what we
will do instead.
First we want to obtain the region covering the content area of the Dock pane. One way to accomplish this is to use a
spatial operator to obtain a region below the title bar of the Dock pane. The figure below illustrates this strategy.
In this interface, we can click on the checkbox to indicate the desired location of the click point. In this example,
the offset is then determined to be -137 in x, which means 137 pixels to the left of the center of the text label. After
selecting the offset, the thumbnail in the script editor will be updated with a small red cross to indicate the new click
point.
Then, the call click(t) will do the right thing, clicking on the checkbox instead of the center of the text label.
In this tutorial, we will learn how to use dragDrop() and spatial operators by writing a number of scripts to
manipulate sliders.
Suppose we wish to lower the speaking rate of the Text to Speech function. We want to drag the slider to the slow side
in the Speech preferences window shown below.
The function that can perform the dragging is dragDrop(). This function takes two images as arguments. The
first image describes the source GUI object to drag and the second image describes the appearance of the destination
location where the GUI object should be dragged to and dropped.
Let us capture the source and destination images respectively.
What happen if there are more than two sliders. How can we make sure the right slider is dragged? The above example
works because the particular window only has one slider. How can we deal with cases when there are several instances
of similar looking GUI components? Let us consider the Sound preferences window shown below.
Suppose we wish to lower the Alert volume. To make sure Sikuli Script drags the right slider, we need a way to tell
Sikuli Script to look for the one that is to the right of the Alert volume label, not the one next to the Output volume
label. Sikuli Script provides a set of spatial operators to do exactly this. Here we will apply the right() operator as
follows.
This statement tells Sikuli Script to first find the Alert volume label and then find the slider thumb only within the
region strictly to the right of the result of the first find. The slider thumb found is then stored in the variable t. Now
that we have identified the desired slider thumb, we can call dragDrop() to drag it to the left by giving the image
of the Alter volume as the target.
In the above example, we use the image of the Alert volume label to implicitly guide the direction of dragging to the
left. It is also possible to use relative coordinates to explicitly drag to the left, as shown below.
Here, the (x,y) coordinates of the slider thumb are accessible as the attributes of t. We can thus calculate the position
200 pixels to the left of t and ask Sikuli Script to drag the thumb to that position.
Sikuli Script can be used to perform desktop surveillance. In this tutorial, we will go through a number of exercises
to create scripts that can monitor the screen and notify us when certain interesting visual event happens.
The first exercise is to create a Facebook app to periodically check our Facebook homepage visually and see if a
particular friend has recently updated the status message. One easy way to detect this event is to look for the friends
face image on our Facebook homepage. If found, the friend must have posted a new status message. If not found, we
should check back again in a few moments.
Lets implement this operation using a while: loop. First we need to capture a screenshot of the friends face image.
To check if the friends face can be seen on the screen, we use exists(), which returns True when the face image
is found. We set the looping condition to Not Found so that the while loop will terminate only when the face image is
found. We add a sleep(n) statement in the body of the loop to introduce a 5 second interval between attempts to
look for the face image on the screen.
Alternatively, Sikuli provides a convenient wait() function that periodically checks the screen to wait for a given
image pattern to appear. Using wait(), the above code can be rewritten as:
The constant FOREVER means we want Sikuli to wait indefinitely. If we do not want to wait forever, we can replace
FOREVER with a number to indicate the number of seconds to wait until Sikuli should giveup.
After the while loop exits or the wait function returns, we can call popup() to display a notification message.
This will display a popup message that looks like below:
Now we can run this Sikuli Facebook App, sit back and relax, and get notified when our friend updates his message.
In the previous exercise, we wrote a script to detect an images appearance. In this exercise, we will do the opposite
detecting the disappearance of a visual pattern.
Skype is a great tool that allows us to stay in close contact with our friends even if they are in remote parts of the
world. However, there might be some unfortunate circumstances we may want to avoid being seen online by a certain
individual. Perhaps the individual talks too much. Perhaps we owe the individual some money. It would be nice to
know when the individual is offline so that it is safe to get online. While Skype does provide a feature to notify us
when an individual is online, there is no notification feature when the opposite happens.
An automatic logoff notifier would be desirable to deal with this situation. Let us build this tool using Sikuli Script.
Notice that if an individual is no longer online, the combined visual pattern of the green status icon and the individuals
screen name will disappear. Thus, we can take a screenshot that includes both the green icon and the screen name as
follows.
Then, we can write a Sikuli Script to watch for the disappearance of the screenshot image we just captured.
This script is very similar to the one in the previous exercise. The only difference is the removal of the NOT operator
from the condition statement for the while loop, since we are trying to do the opposite.
Another way to wait for the disappearance of an image is to use the waitVanish() function. The above script can
be rewritten as follows:
The third exercise is to build a bus arrival notification tool. For many bus riders, online GPS-based tracking services
are very useful. Instead of patiently standing outside at a bust stop, braving the freezing wind in the winter or scorching
sun in the summer, riders can sit comfortably inside in front of their computers, checking emails, updating Facebook
status, or watching YouTube? videos, or what have you. They only need to look at the map every few moments to
check the location of the bus symbol on the map. Only when the bus is close enough do they have to finally get out
and walk to the bus top.
Since we care about whether the bus is getting close to the stop, we only need to look at the neighborhood around the
stop. Thus, we can resize the browser to show just that portion of the map, while leaving a lot of screen space to do
something else, in this case, reading CNN news.
Let us write a Sikuli Script to do the bus tracking for us. It is possible to define a region and ask Sikuli Script to focus
only on that region to search for a particular visual pattern. This way, Sikuli dont have to waste precious computing
cycles scanning the whole screen. To do so, let us click on the select a region button in the toolbar as indicated
below.
The entire screen will freeze and turn darker, similar to what happen in the screen capture mode. Simply draw a
rectangle to cover the entire neighborhood map. The region covered by this rectangle is visually represented as a
thumbnail image of the entire desktop where the region is shaded in red.
Next, we capture the screenshot image of the bus symbol so that we can tell Sikuli Script to watch for its appearance.
Now. we can write the following script to check the map and watch for the appearance of a bus symbol:
Interpreting the meaning of the while loop is straightforwardwhile it is not the case that a bus symbol can be found
inside the region, sleep for 5 seconds. In other words, the while loop will exit only when the bus symbol is found
inside the region. Then, the popup statement will be executed to notify us the bus has arrived.
Again, the same effect can be achived using wait(). The script can be rewritten as:
For Developers
Sikuli Script is a Jython and Java library that automates GUI interaction using image patterns to direct keyboard/mouse
events. The core of Sikuli Script is a Java library that consists of two parts: java.awt.Robot, which delivers keyboard
and mouse events to appropriate locations, and a C++ engine based on OpenCV, which searches given image patterns
on the screen. The C++ engine is connected to Java via JNI and needs to be compiled for each platform. On top of the
Java library, a thin Jython layer is provided for end-users as a set of simple and clear commands. Therefore, it should
be easy to add more thin layers for other languages running on JVM, e.g. JRuby, Scala, Javascript, etc.
117
SikuliX Documentation, Release 1.1.0-Beta1
9.1.2 The Structure of a Sikuli source folder or zipped file (.sikuli, .skl)
A Sikuli script (.sikuli) is a directory that contains a Python source file (.py) representing the automation workflow or
the test cases and all the image files (.png) used by the source file. All images used in a Sikuli script are simply a path
to the .png file in the .sikuli bundle. Therefore, the Python source file can also be edited by any text editor.
While saving a script using Sikuli IDE, an extra HTML file may optionally be created in the .sikuli directory so that
users can share a visual copy of the scripts on the web easily.
A Sikuli zipped script (.skl) is simply a zipped file of all files in the .sikuli folder. It is intended for distribution via mail
or web upload, can also be run from command line and reopened in the Sikuli IDE. (The previous naming as Sikuli
executable is deprecated, since this is misleading: people most often thought, it is something like a self-contained
and self-running package comparable to a Windows EXE, but it is not).
Have fun working with SikuliX? You can do more than just being a user! There are many ways you can help SikuliXs
development:
Blog or tweet about SikuliX. Share your cool scripts and ideas to the world and let more people know how cool
SikuliX is.
Dont forget to integrate a link to the official homepage of SikuliX http://sikulix.com
Report bugs or request new features in our bug tracker.
Visit the question board and the bug tracker regularly and answer peoples questions there or comment posted
bugs. (You may want to subscribe to the bug tracker or subscribe to all questions.) Many people have questions
that you may know how to deal with. Help them to get through the obstacles and they may help you in the
future.
Submit patches to fix bugs or add features.
Translate SikuliX into your language and help more people who speak different languages to access SikuliX.
You can help us to
translate SikuliX IDEs interface and SikuliXs messages, or
translate SikuliXs documentation (this site you are reading).
Read Documentation, internationalization and localization for more details.
If you are interested in making SikuliX better: Any patches to SikuliX are always welcome
1. Fork the SikuliX source tree from GitHub
Version 1.1.x
Version 1.2.x
1. Fix bugs or add new features.
2. Post pull requests
3. In any case please leave documentation in the comment area.
We hope SikuliX can be used by anyone from anywhere in the world. You can help us to translate the user interface
of SikuliX IDE, any SikuliX messages or to improve and translate this documentation.
SikuliX Translation
If you find an incorrect or missing translation, or if you would like to add a language that isnt yet translated, heres
you can do:
When ready, Visit our translations page on transifex and use the provided tools to translate the relevant items.
It is also possible to post pull requests against the sources on GitHub.
This documentation is created using Sphinx, and written in the reStructuredText format. You can view how the source
code looks like using the link Show Source in the side bar.
The docs are hosted on readthedocs with the sources being on Github.
Changes or additions to the docs should be posted as pull requests on GitHub.
If you want to contribute to the docs or its translation on a regular base, you can be added as a maintainer on readthe-
docs. In case use one of the contacts on sikulix.com.
g
guide, 99
121
SikuliX Documentation, Release 1.1.0-Beta1
Symbols C
<arguments> capture() (Screen method), 72
command line option, 87 circle() (in module guide), 99
-c,console click() (Region method), 60
command line option, 87 ClickDelay (Settings attribute), 20
-d,debug <value> (up to 3 makes sense) close() (App class method), 82
command line option, 86 close() (App method), 82
-f,logfile [<path to log file>] closeApp() (built-in function), 37
command line option, 86 command line option
-h,help <arguments>, 87
command line option, 86 -c,console, 87
-i,interactive -d,debug <value> (up to 3 makes sense), 86
command line option, 87 -f,logfile [<path to log file>], 86
-r,run <sikuli-folder/file> (one or more entries seperated -h,help, 86
by space) -i,interactive, 87
command line option, 87 -r,run <sikuli-folder/file> (one or more entries
-s,server [<port>] (optional port not yet supported, seperated by space), 87
50001 is used as default) -s,server [<port>] (optional port not yet supported,
command line option, 87 50001 is used as default), 87
-u,userlog [<path to log file>] -u,userlog [<path to log file>], 86
command line option, 86
D
A debugging and logging, 21
above() (Location method), 70 DebugLogs (Settings attribute), 19
above() (Region method), 50 DelayBeforeDrag (Settings attribute), 19
ActionLogs (Settings attribute), 19 DelayBeforeDrop (Settings attribute), 19
addHotkey() (Env method), 35 DelayBeforeMouseDown (Settings attribute), 19
addHTTPImagePath() (built-in function), 24 doubleClick() (Region method), 60
addImagePath() (built-in function), 24 drag() (Region method), 62
addImportPath() (built-in function), 26 dragDrop() (Region method), 61
App (built-in class), 79, 80 dropAt() (Region method), 62
App() (App class method), 80
at() (Mouse method), 41 E
exact() (Pattern method), 75
B exists() (Region method), 52
below() (Location method), 70 exit() (built-in function), 19
below() (Region method), 50
Bundle Path, 23 F
button() (in module guide), 100 file and path handling, 22
find() (Finder method), 77
123
SikuliX Documentation, Release 1.1.0-Beta1
124 Index
SikuliX Documentation, Release 1.1.0-Beta1
Index 125
SikuliX Documentation, Release 1.1.0-Beta1
T
targetOffset() (Pattern method), 75
text() (in module guide), 99
text() (Region method), 64
tooltip() (in module guide), 99
type() (Region method), 62
TypeDelay (Settings attribute), 20
U
unzip() (built-in function), 23
W
wait() (Region method), 51
WaitScanRate (Settings attribute), 20
waitVanish() (Region method), 52
wheel() (Region method), 64
window(), 82
with, 69
126 Index