Insights of Mobile Test Automation Frameworks — What is happening under the hood? {Chapter One}

Stale Element
10 min read · Feb 27, 2021

In this fast-moving, competitive digital era, mobile test automation has become a near-unavoidable requirement for software organizations. Manually testing an application against every combination of fragmented devices and platform versions is next to impossible, so test engineering teams have to adopt a mobile test automation approach. That raises the question: which mobile test automation framework best fits our requirements, and why?

Before selecting a test automation framework, we need to look at the following parameters:

  • Which platform does the app under test run on: Android, iOS, or both?
  • Does the app contain web views?
  • Is the app native or hybrid by implementation?
      • A native app is developed for a particular mobile device or platform (such as Android, iOS, or Windows)
      • A hybrid app is essentially a website packaged in a native wrapper
  • Does the framework have a strong testing community?
  • Which languages and technologies does each framework support?
  • How does the framework interact with the app under test, and how does that affect execution speed?
  • Are we looking for grey box or black box test automation?

Let’s look at a few native and third-party automation frameworks and observe how they interact with the app under test under the hood. That will help us find the right choice for our mobile test automation. We have divided the automation frameworks into two parts:

  • Frameworks/Drivers with Appium
  • Native frameworks: part of the Android SDK and Xcode themselves

In this blog, we will cover the major frameworks that we use with Appium.

Frameworks/Drivers with Appium

Appium {except with the Espresso driver} is considered a BLACK BOX testing framework: it has no access to the internal methods or state of the application under test. Test engineers regard it as one of the most popular open source test automation frameworks because it can automate both NATIVE and HYBRID mobile applications on both Android and iOS. It follows a client-server architecture: the server is built on Node.js, and client libraries are available for Java, Python, Ruby, C# and other languages. Test engineers use Appium the way a user would (interacting with the surface of the app, i.e. the accessibility layer), not the way an app developer would (calling internal methods directly).

In this client-server architecture, the Appium server receives a request from the client in the form of a JSON object over HTTP. The server then creates a session as specified in that JSON and returns a session ID, which stays valid until the session is deleted or the Appium server stops. All subsequent test commands are performed in the context of this newly created session. We will see the practical aspects of this with each of the Appium drivers.

Appium client-server architecture

In reality, Appium is an abstraction over the following UI test automation frameworks:

  • Android’s UI Automator framework with the UIAutomator2 driver
  • Android’s Espresso framework with the Espresso driver
  • iOS’s XCUITest framework with the XCUITest driver
  • Windows UIA framework with WinAppDriver {we will not discuss it here}

# UIAUTOMATOR2 DRIVER WITH APPIUM FOR ANDROID

Google’s UI Automator 2 is a test automation framework based on Android instrumentation that allows one to build and run UI test scripts. When an Appium client {the test script} sends an HTTP request to create a new AndroidDriver session, it passes the desired capabilities {JSON payload} to the Appium Node.js server. The UIAutomator2 driver module creates the session and hands the details to ADB. ADB then installs the UIAutomator2 server APK {Android Package Kit} on the connected Android device, installs the AUT {application under test} if it is not already installed, and starts the Netty server on the device. Once the Netty server is running, the UIAutomator2 server keeps listening on the device for requests and sending back responses. Finally, the instrumentation tests run on a real device or an emulator.
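One way to observe this on a connected device is to list its installed packages and look for the server APKs. The following is a minimal Python sketch, assuming `adb` is on the PATH and that the server packages are named `io.appium.uiautomator2.server` and `io.appium.uiautomator2.server.test` {the names Appium uses, to our knowledge}. The helper names are made up for illustration.

```python
import subprocess

# Assumed package names of the UIAutomator2 server APKs installed by Appium.
SERVER_PACKAGES = (
    "io.appium.uiautomator2.server",
    "io.appium.uiautomator2.server.test",
)


def parse_packages(pm_output):
    """Turn `pm list packages` output ("package:<name>" per line) into a set."""
    prefix = "package:"
    return {
        line.strip()[len(prefix):]
        for line in pm_output.splitlines()
        if line.strip().startswith(prefix)
    }


def missing_server_packages(pm_output):
    """Return the server packages that are not installed on the device."""
    installed = parse_packages(pm_output)
    return [pkg for pkg in SERVER_PACKAGES if pkg not in installed]


def list_device_packages():
    """Ask the connected device for its installed packages through adb."""
    result = subprocess.run(
        ["adb", "shell", "pm", "list", "packages"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Usage with a device attached (not executed here):
#   print(missing_server_packages(list_device_packages()))
```

An empty list after a session has been created suggests that both server APKs were installed as described above.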

If we use the UIAutomator2 driver for our mobile app automation, we only need a publishable APK file {the same APK that is published on the Google Play Store}. Why do we say UIAutomator2 does black box automation? Because the driver cannot look inside the APK: it can only do what a user can do, so it raises no additional security concern for the application under test. In return, it has full accessibility-layer access to all of the applications and aspects of the device.

Appium UIAutomator2 Driver architecture

UIAutomator2 driver actions under the hood
Let’s send JSON payloads over HTTP to the UIAutomator2 driver through Postman:

  • Start the Appium server at http://127.0.0.1:4723
  • Connect a real Android device to our machine
  • Before going further, have one PUBLISHABLE APK file ready
  • To start a UIAutomator2 driver session, send an HTTP POST request to http://127.0.0.1:4723/wd/hub/session with the JSON payload below:

{
  "desiredCapabilities": {
    "platformName": "Android",
    "platformVersion": "9.0",
    "deviceName": "Mi A1",
    "automationName": "UiAutomator2",
    "app": "/path/test-app.apk"
  }
}
Response: sessionID

  • Now we have the sessionID; the test app and the UIAutomator2 server are installed, and the server is running on our device
  • To find an element of the installed app by its resource ID, send an HTTP POST request to http://127.0.0.1:4723/wd/hub/session/<sessionID>/element with the JSON payload below:

{
  "using": "id",
  "value": "loginicon"
}
Response: elementID

  • To act on that element {e.g. click} using the elementID, send an HTTP POST request to http://127.0.0.1:4723/wd/hub/session/<sessionID>/element/<elementID>/click. The elementID in the URL already identifies the element, so the body carries no locator and an empty JSON object is enough:

{}

  • To close the session, send an HTTP DELETE request to http://127.0.0.1:4723/wd/hub/session/<sessionID>
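The same four requests can be scripted instead of sent by hand. The following is a minimal Python sketch using only the standard library, assuming an Appium server at the address used above. The helper names are made up for illustration, and the response parsing accepts both the legacy shape {top-level `sessionId`, element under `ELEMENT`} and the W3C shape {values nested under `value`}, since the exact shape depends on the server version.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:4723/wd/hub"  # server address from the steps above

# Key under which the W3C WebDriver protocol returns an element ID.
W3C_ELEMENT_KEY = "element-6066-11e4-a52e-4f735466cecf"


def build_request(method, path, payload=None):
    """Create (but do not send) an HTTP request for the Appium server."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )


def send(request):
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def session_id_from(body):
    """Read the session ID from a legacy (top-level) or W3C (nested) response."""
    value = body.get("value")
    nested = value.get("sessionId") if isinstance(value, dict) else None
    return body.get("sessionId") or nested


def element_id_from(body):
    """Read the element ID from a legacy ("ELEMENT") or W3C response."""
    value = body["value"]
    return value.get(W3C_ELEMENT_KEY) or value.get("ELEMENT")


def run_walkthrough(capabilities, resource_id):
    """Create a session, find an element, click it, then delete the session."""
    created = send(build_request(
        "POST", "/session", {"desiredCapabilities": capabilities}))
    session = session_id_from(created)
    try:
        found = send(build_request(
            "POST", f"/session/{session}/element",
            {"using": "id", "value": resource_id}))
        element = element_id_from(found)
        # The click command needs no locator; an empty JSON object is enough.
        send(build_request(
            "POST", f"/session/{session}/element/{element}/click", {}))
    finally:
        # Always release the session, even if a step above failed.
        send(build_request("DELETE", f"/session/{session}"))


# Usage with a running server and a connected device (not executed here):
#   run_walkthrough({"platformName": "Android", "platformVersion": "9.0",
#                    "deviceName": "Mi A1", "automationName": "UiAutomator2",
#                    "app": "/path/test-app.apk"}, "loginicon")
```

Deleting the session in a `finally` block is a deliberate choice: a session left open keeps the device busy for the next run.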

When to choose the UIAutomator2 driver?

  • Good choice for black box testing of Android apps {both hybrid and native}
  • Good if we only need to act on the accessibility layer of the app
  • Good for automating all aspects of the device UI, such as the home screen
  • But execution is slower than with Espresso, because Espresso interacts with app internals rather than the accessibility layer

# ESPRESSO DRIVER WITH APPIUM FOR ANDROID

Google’s Espresso is a native open source framework used to automate Android applications. It is distributed with Google’s Android testing libraries {AndroidX Test} and is used primarily by developers during native app development, because they know the internal code. Espresso’s full power is unlocked by those who are familiar with the codebase under test. This is why we need a DEBUG APK file for test automation with Espresso. To run this debug APK on a real phone, we also need to enable the debugging options on the phone.

Appium Espresso Driver architecture

So for Espresso we must have a debug APK; it cannot be an APK that is merely installed on the device or downloaded from the Google Play Store. It has to be one that we built in debug mode, otherwise Espresso will not be able to automate it. Being able to look inside the application is powerful, but it would also be a serious security risk, so Espresso only allows it for applications that we have debug access to. Why do we say Espresso does grey box automation? Because it has access to the debug APK’s internals through the backdoor method, which lets us get inside our application code and call internal methods that a user would never see from the outside.

We cannot use Espresso to automate all aspects of the device UI, such as the home screen or switching to other non-debug apps. For that we must use the UIAutomator2 driver, which automates elements exposed on the device’s accessibility layer, a layer that sits above all the applications installed on the device.

Through Espresso we can also find elements by view tag, an Android-specific tag that developers can attach to elements. This is especially helpful for React Native applications, because React Native puts test IDs into the view tag on Android. At the time of writing, the Espresso driver is the only Appium driver that gives access to an element’s view tag.

Matcher<View> withTagValue (Matcher<Object> tagValueMatcher)

Returns a matcher that matches Views based on tag property values.
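From the Appium side, a view-tag lookup is an ordinary find-element request with a different locator strategy. The following is a small Python sketch, assuming the Espresso driver’s `-android viewtag` strategy name; `login_button` is a hypothetical tag {for example a React Native testID}, not one taken from a real app.

```python
import json


def view_tag_locator(tag):
    """Build a find-element payload that uses the Espresso view-tag strategy."""
    return {"using": "-android viewtag", "value": tag}


# "login_button" is a hypothetical view tag used only for illustration.
payload = view_tag_locator("login_button")
print(json.dumps(payload, indent=2))
```

This body would be sent with an HTTP POST to /wd/hub/session/<sessionID>/element, exactly like a lookup by resource ID.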

The best thing about Espresso is “idle synchronization”: during automated script execution, it waits until the spinner/loader disappears, i.e. until the screen is in a resting state and ready for the next action.

The Espresso driver has a mobile: backdoor method that we call from our test code to tell the Espresso driver what to run inside our app. We call it a BACKDOOR because Appium gets inside our app through the back door of Espresso, not through the front door of the app UI that a user would use.
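On the wire, a backdoor call is an ordinary execute-script command whose script name is `mobile: backdoor`. The Python sketch below builds such a payload. The shape {`target`, `methods`, and `args` entries with `value` and `type`} follows the Appium Espresso driver documentation as we understand it, so treat it as an assumption to verify against your driver version. `showToast` is a hypothetical public method on the app’s activity.

```python
def backdoor_command(target, method_name, args=()):
    """Build an execute-script payload for the `mobile: backdoor` extension.

    `target` names where the method lives (for example "activity"), and
    `args` is a sequence of (value, java_type) pairs for the method call.
    """
    method = {"name": method_name}
    if args:
        method["args"] = [
            {"value": value, "type": java_type} for value, java_type in args
        ]
    return {
        "script": "mobile: backdoor",
        "args": [{"target": target, "methods": [method]}],
    }


# Hypothetical example: call showToast("Hello") on the current activity.
payload = backdoor_command(
    "activity", "showToast", [("Hello", "java.lang.String")]
)
```

The payload would be posted to the session’s execute endpoint {`/execute/sync` in the W3C protocol, `/execute` in the legacy one}, and the method only needs to exist in the debug build of the app.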

Espresso driver actions under the hood

Let’s send JSON payloads over HTTP to the Espresso driver through Postman. To start an Espresso driver session, send an HTTP POST request to http://127.0.0.1:4723/wd/hub/session with the JSON payload below:

{
  "desiredCapabilities": {
    "platformName": "Android",
    "platformVersion": "10.0",
    "deviceName": "Nexus6_API_29_Q",
    "automationName": "espresso",
    "app": "/path/app-debug.apk",
    "appPackage": "com.example.youngwind.helloworld"
  }
}
Response: sessionID

  • Now we have the sessionID; the test app and the Espresso server are installed, and the server is running on our device
  • To find an element of the installed app by its resource ID, send an HTTP POST request to http://127.0.0.1:4723/wd/hub/session/<sessionID>/element with the JSON payload below:

{
  "using": "id",
  "value": "hometown"
}
Response: elementID

  • To act on that element {e.g. send keys} using the elementID, send an HTTP POST request to http://127.0.0.1:4723/wd/hub/session/<sessionID>/element/<elementID>/value with the JSON payload below:

{
  "text": "StaleElement"
}

  • To close the session, send an HTTP DELETE request to http://127.0.0.1:4723/wd/hub/session/<sessionID>
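The `text` body used for send keys above is the W3C WebDriver form of the command. Older JSON Wire Protocol clients send the characters as a list under `value` instead, and some client libraries send both keys for compatibility. A small Python sketch of the three variants follows; the function and dialect names are illustrative and not part of any Appium client.

```python
def send_keys_body(text, dialect="w3c"):
    """Build the body for POST /session/<sessionID>/element/<elementID>/value."""
    if dialect == "w3c":
        # W3C WebDriver: the whole string under "text".
        return {"text": text}
    if dialect == "jsonwp":
        # JSON Wire Protocol: a list of key strings under "value".
        return {"value": list(text)}
    if dialect == "both":
        # Compatibility form that carries both representations.
        return {"text": text, "value": list(text)}
    raise ValueError(f"unknown dialect: {dialect!r}")


body = send_keys_body("StaleElement")
```

Which form a given server accepts depends on its version, so the compatibility form is a reasonable default when that is unknown.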

When to choose the Espresso driver?

  • Good choice for grey box testing of Android apps {both hybrid and native}
  • Good if we need to call methods inside the debug app
  • Good if we need idle synchronization while acting on the app
  • Execution is faster than with UIAutomator2, because Espresso interacts with app internals rather than the accessibility layer

# XCUITEST DRIVER WITH APPIUM FOR iOS

Apple’s XCUITest is an automation framework that Appium supports from iOS 9.3 onwards. From iOS 10 onwards, it is the only automation framework Apple supports. The Appium XCUITest driver performs automated black box testing of iOS and tvOS native applications and WebKit web views. Internally, Appium uses the WebDriverAgent project, originally developed by Facebook, to drive XCUITest. WebDriverAgent is a WebDriver server implementation for iOS: it remote-controls connected devices or simulators and allows one to launch an app, perform commands (such as tap and scroll), and kill applications.

Appium XCUITest Driver architecture

XCUITest driver actions under the hood

Let’s send JSON payloads over HTTP to the XCUITest driver through Postman:

  • Start the Appium server at http://127.0.0.1:4723
  • Since we have a .app file for testing, start an iPhone simulator on our machine {with an .ipa file, we can instead attach a real iPhone to our machine}
  • To start an XCUITest driver session, send an HTTP POST request to http://127.0.0.1:4723/wd/hub/session with the JSON payload below:

{
  "desiredCapabilities": {
    "platformName": "iOS",
    "platformVersion": "14.4",
    "deviceName": "iPhone 12 Pro Max",
    "automationName": "XCUITest",
    "app": "/path/test-app.app",
    "useNewWDA": true
  }
}
Response: sessionID

  • Now we have the sessionID; the test app is installed and the XCUITest server {WebDriverAgent} is running on our simulator
  • To find an element of the installed app by its ID, send an HTTP POST request to http://127.0.0.1:4723/wd/hub/session/<sessionID>/element with the JSON payload below:

{
  "using": "id",
  "value": "cta_login"
}
Response: elementID

  • To act on that element {e.g. click} using the elementID, send an HTTP POST request to http://127.0.0.1:4723/wd/hub/session/<sessionID>/element/<elementID>/click. The elementID in the URL already identifies the element, so an empty JSON object is enough as the body:

{}
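The session payloads in this walkthrough use the legacy `desiredCapabilities` envelope. Newer Appium servers expect the W3C WebDriver shape instead, in which capabilities that are not part of the standard carry an `appium:` vendor prefix. The following Python sketch shows that conversion; the helper name `to_w3c` is made up for illustration.

```python
# Capability names defined by the W3C WebDriver specification; these stay
# unprefixed. Everything else needs a vendor prefix such as "appium:".
STANDARD_CAPS = {
    "acceptInsecureCerts",
    "browserName",
    "browserVersion",
    "pageLoadStrategy",
    "platformName",
    "proxy",
    "setWindowRect",
    "strictFileInteractability",
    "timeouts",
    "unhandledPromptBehavior",
}


def to_w3c(desired):
    """Convert a legacy desiredCapabilities dict into a W3C new-session body."""
    always_match = {}
    for name, value in desired.items():
        if name in STANDARD_CAPS or ":" in name:
            # Standard or already-prefixed capability: keep the name as-is.
            always_match[name] = value
        else:
            always_match["appium:" + name] = value
    return {"capabilities": {"alwaysMatch": always_match, "firstMatch": [{}]}}


body = to_w3c({
    "platformName": "iOS",
    "platformVersion": "14.4",
    "deviceName": "iPhone 12 Pro Max",
    "automationName": "XCUITest",
    "app": "/path/test-app.app",
    "useNewWDA": True,
})
```

Only `platformName` stays unprefixed here; the remaining keys become `appium:platformVersion`, `appium:deviceName`, and so on.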

When to choose the XCUITest driver?

  • Good choice for black box testing of iOS/tvOS apps {both hybrid and native}
  • Good if we only need to act on the accessibility layer of the app
  • Good for automating all aspects of the device UI, such as the home screen
  • But execution is slower than with KIF, because KIF integrates tightly with our code and minimizes the number of layers
  • Many client libraries are available for writing test scripts: Java, Python, C# etc.

# WRAPPING UP

To wrap up: Appium provides a complete mobile test automation ecosystem that covers what most test engineers need. While using Appium, we just need to check whether each driver flavor fits our requirements.
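The “when to choose” lists from the three driver sections can be condensed into a small decision helper. The Python sketch below is only a summary of the guidance in this post, not an official Appium rule; the function name and parameters are made up for illustration.

```python
def suggest_driver(platform, needs_app_internals=False, needs_device_ui=False):
    """Suggest an Appium driver based on the criteria discussed in this post."""
    platform = platform.lower()
    if platform in ("ios", "tvos"):
        # XCUITest is the only iOS/tvOS driver covered here.
        return "XCUITest"
    if platform != "android":
        raise ValueError(f"platform not covered in this post: {platform!r}")
    if needs_app_internals and needs_device_ui:
        # Espresso cannot leave the debug app; UIAutomator2 cannot look inside it.
        raise ValueError("no single Android driver covers both needs")
    if needs_app_internals:
        return "Espresso"      # grey box, requires a debug APK
    return "UiAutomator2"      # black box, works with a publishable APK


print(suggest_driver("Android", needs_app_internals=True))
```

When a suite needs both app internals and device-level UI, the helper raises an error on purpose, because that case calls for a judgment about which need matters more.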

# KEY TAKEAWAYS

  • We identified how to pick the best-fit mobile test automation framework for our requirements
  • We understood the internal workings of different mobile automation frameworks
  • We learned the internal Appium architecture with its different drivers
  • We gained a good understanding of black box and grey box mobile automation frameworks

# OPEN QUESTIONS

  • Native frameworks that are part of the Android SDK
  • Native frameworks that are part of Xcode

We will address these open questions in chapter two of this blog series. So now it’s time to bring down the curtain. For further discussion or any doubts, please comment below and start an open collaboration on mobile test automation.
