UI Automation – Page Object Model and other Design Patterns

This post has been republished via RSS; it originally appeared at: New blog articles in Microsoft Tech Community.

Author : Edwin Hernandez Maynez

In this article we will do an in-depth examination of Code Design Patterns for UI test automation solutions and will refer mainly to Selenium and C#. If you want to know more about Functional Test Automation tools in general please have a look at our previous article: What are the best UI Test Automation Tools?

We will review the advantages of using a design pattern to develop an automated UI test solution and why you should probably take it into consideration for large projects. This information applies to most test frameworks, but it is most commonly implemented on projects that use Selenium and Selenium-like frameworks such as Appium and WinAppDriver.

Challenges associated with Large UI automation projects

As a project and its number of test cases grow, test code becomes increasingly difficult to maintain and keep up to date with changes in the application. This is a time-consuming task, which works against the goal of saving time by automating your regression suite.
There is potentially a lot of repetition in a typical functional test suite: some steps are performed again and again by different tests with little variance. Data-driven testing helps with some of this, but it still leaves you with many different pieces of code that perform actions on the same elements.
If a change happens in the application, every place in the code that references the affected UI element locators must be updated, increasing maintenance effort and the risk of human error.
If code is not reused, new test cases take about as long to develop as earlier ones.

By applying a design pattern that allows for code reuse and a single point of maintenance for UI elements, and by also implementing data-driven testing, you can greatly reduce the effort associated with updating your test scripts on each test cycle.

Design Patterns

There are a few design patterns that can be used to help you overcome the mentioned challenges. For the purposes of this article we will look into the Page Object Model, but just for clarification, here is a quick list of possible designs and components:

Page Object Model. The basic idea is to divide the application into modules/pages/panels as needed and to abstract object recognition and the actions on those objects away from the test level. I'll expand on that a bit later. POM is normally paired with other common elements such as an Object Repository and a Driver Factory. If implemented efficiently, it really improves the maintainability and reusability of code. It's the most used design pattern for UI automation, especially with Selenium-based frameworks.
ScreenPlay Model. This model takes POM further by organizing the page objects, their actions and other elements such as inputs, goals, actors, etc. into a more readable (and supposedly more maintainable) screenplay organization.
Façade Design Pattern. It's similar to the Page Object Model, but it's geared toward full façades or forms in which many inputs and possibly more than one action need to happen. The test level calls the whole façade class and provides an object that contains all the inputs needed. The main drawback is that any variance in the workflow forces you to create another façade class. This is similar to how API testing is sometimes organized.
Fluent Design Pattern. It's a different flavor of POM that is supposed to conform better with Behavior Driven Development since it forces the test to be done in a "logical chain" or workflow. The page objects are written as a fluent interface, in which methods can be cascaded or chained in a flow of calls. This is achieved by making each method return a page class object of the type required to continue the flow. Here is how a POM test method compares with the same test written using a Fluent design:

POM

[TestMethod]
public void POM_test()
{
    NavigationBar navigationBar = new NavigationBar(driver);
    navigationBar.SubmitSearch("123");

    ResultsPane resultsPane = new ResultsPane(driver);
    resultsPane.Select_Row(1);

    // Get_InvoiceNumber returns an int, so compare against an int rather than a string.
    int invoiceNumber = resultsPane.Get_InvoiceNumber();
    Assert.AreEqual(83812302, invoiceNumber);
}

Fluent Design

[TestMethod]
public void Fluent_test()
{
    NavigationBar navigationBar = new NavigationBar(driver);

    // Each call returns the page object needed by the next call in the chain.
    Assert.AreEqual(83812302,
        navigationBar.SubmitSearch("123")
                     .Select_Row(1)
                     .ClickOnInitialPopUps()
                     .Get_InvoiceNumber());
}
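The chaining in the Fluent test works because each page method returns the page object that the next call needs. Below is a minimal sketch of what the page classes behind that test might look like. The locators (By.Name("search") and so on) and the way the invoice number is read are placeholders assumed for illustration; they are not taken from a real application or from the attached solution.

```csharp
using OpenQA.Selenium;

public class NavigationBar
{
    private readonly IWebDriver driver;

    public NavigationBar(IWebDriver driver)
    {
        this.driver = driver;
    }

    // Submitting a search lands on the results pane, so return it to continue the chain.
    public ResultsPane SubmitSearch(string searchStr)
    {
        driver.FindElement(By.Name("search")).SendKeys(searchStr);   // hypothetical locator
        driver.FindElement(By.Name("searchButton")).Click();         // hypothetical locator
        return new ResultsPane(driver);
    }
}

public class ResultsPane
{
    private readonly IWebDriver driver;

    public ResultsPane(IWebDriver driver)
    {
        this.driver = driver;
    }

    // Actions that stay on the same pane return "this".
    public ResultsPane Select_Row(int row)
    {
        // hypothetical locator
        driver.FindElement(By.XPath($"//table[@id='results']//tr[{row}]")).Click();
        return this;
    }

    public ResultsPane ClickOnInitialPopUps()
    {
        // hypothetical locator; closes every pop-up currently found
        foreach (IWebElement button in driver.FindElements(By.ClassName("popup-close")))
        {
            button.Click();
        }
        return this;
    }

    // Terminal call: returns data instead of a page object, which ends the chain.
    public int Get_InvoiceNumber()
    {
        string text = driver.FindElement(By.Id("invoiceNumber")).Text; // hypothetical locator
        return int.Parse(text);
    }
}
```

The trade-off is that the return types encode the expected workflow: if a search can land on a different page, you need a second method (or a different return type) for that path.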

It's important to point out that these designs are not necessarily mutually exclusive, and some of these components can be used in the same solution. Also, there are sources that cite Page Factory, Driver Factory and Object Repository as design patterns in their own right, but since they mostly require a POM to function, I will describe them in the next section.

Page Object Model Design

There are many conceptual maps of how this looks and how it works. I personally like this one by Mohsin, which is similar to this other one by Vrishali T. For this article I will describe a simple approach based on the projects I have worked on in the past, and I will create my own conceptual map with a few pieces that I think should also be mentioned as part of the design. This should be just enough to get a newcomer started, and I will include code examples that should help you set up the minimal structure.

First of all, a POM has the following elements/layers:

Test Method level. This is where the steps of a test case are executed and where we handle all inputs and validations. Data-source binding should be done at this level.
Page Factory. This may or may not be part of a POM solution, depending on whether or not you want to "pre-initialize" your UI elements before run-time using something called Selenium FindBy annotations (FindsBy attributes in C#). You set the elements as properties of a page class and use an attribute to set the locator that will be used to find the element. This is an alternative to calling the driver.FindElement(s) functions directly. You can find more information about this model here. I prefer not to use annotations; I find that using the FindElement functions gives me more flexibility to find an element, wait for it, validate that an element is not on the page, or iterate through a list of elements.
Page Object Classes and their Methods. Each logical division in an application should have its own page class (not necessarily a page; it could be a panel, a bar, a dialog, etc.), and each page class should expose methods to perform actions or get information. E.g. NavigationBar.SearchByTitle("Nurse");
Driver Factory. Initializes the Selenium driver to provide support for cross-browser testing or local/remote drivers.
Object Repository. Not always implemented but definitely useful. You can abstract the locator (By) properties from the page class and keep all the UI element properties in an XML or JSON file that can be queried from an Object Repository class. The idea is to have a single point of maintenance when the application updates existing elements. Having an object repository adds another level of modularity to your solution and separates the object locators from the page classes. This is useful when you have to update several objects in one go (e.g. when a new build is deployed for testing) but little of the actual test logic. In that case you only have to update a single file (an XML file, for example) that holds all the object locators.
Helper/Wrapper Class. It's very useful to create a class where you write wrappers around common Selenium functionality such as FindElement, Click or SendKeys. By implementing your own wrappers, you can add error handling, smarter waits, timeouts, fail-safes or retries.
Test Data Source. You can define a data source for the test method and use test data from a CSV, XML or Excel file, or even a SQL database. Data-driven testing is vital to cover more ground by executing multiple iterations of the same test case with different data scenarios, saving code and time. You can find more information about data-driven testing here. It's also useful to create a wrapper around the data binding process; that way you can select or combine data from different data sources, even ones not supported by MSTest (JSON, Redis, etc.).
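For reference, the annotation-based Page Factory style described in the list looks roughly like this in C#, where the attribute is called FindsBy and comes from the SeleniumExtras.PageObjects namespace (the DotNetSeleniumExtras package). This is only a sketch: the class name, field names and locators are made up for illustration.

```csharp
using OpenQA.Selenium;
using SeleniumExtras.PageObjects;

public class NavigationBarWithFactory
{
    // Each element is declared once, with its locator carried by the attribute.
    [FindsBy(How = How.Name, Using = "search")]          // hypothetical locator
    private IWebElement searchTextBox;

    [FindsBy(How = How.Name, Using = "searchButton")]    // hypothetical locator
    private IWebElement searchButton;

    public NavigationBarWithFactory(IWebDriver driver)
    {
        // Populates the annotated fields with proxies that look the element up when it is used.
        PageFactory.InitElements(driver, this);
    }

    public void SubmitSearch(string searchStr)
    {
        searchTextBox.SendKeys(searchStr);
        searchButton.Click();
    }
}
```

Compared with calling FindElement inside each method, the locators are easier to see at a glance, but custom waits or "element should not exist" checks have to be handled separately.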

The conceptual map below takes into consideration all the elements above except for the Page Factory. Instead, I'm using Page Objects straight from the test method; the page objects have methods that use the Selenium FindElement(s) functions:

[Figure: conceptual map of the POM elements described above (original image: Untitled picture.png)]

Example of the Minimal Structure for a POM

Below you can find a demo of what the minimal structure of a POM project looks like, based on the conceptual/class diagram above. I am showcasing the main features of the code in this article, but you can also find the solution zip file for download at the bottom of the page. This is a Visual Studio unit test project, written in C# and using the Selenium libraries.

Some notes about this project:

This is a C# unit test project that uses the .NET Framework (as opposed to .NET Core).
You will need to add the NuGet packages for WebDriver, WebDriver.Support, Selenium Extras and any browsers you wish to support. You can also try to restore the NuGet packages from the attached solution.
The POM classes are implemented as a class library and referenced from the unit test project.
The methods on the Page Object classes are static; they take the driver as an argument and use it to perform an action.

This is just one option. Alternatively, you could use non-static methods: instantiate a page class object, pass the driver as an argument to the constructor, and then call the methods without having to pass a driver to them.
Both options are good; using static methods is probably simpler, which is why I'm using them for this demo.

Test Method Example:

[DataSource("Microsoft.VisualStudio.TestTools.DataSource.CSV",
    @"|DataDirectory|\TestData\Test1_Data.csv",
    "Test1_Data#csv",
    DataAccessMethod.Sequential)]
[TestMethod]
public void Test1()
{
    IWebDriver driver = DriverFactory.InitBrowser("Chrome");
    DriverFactory.LoadApplication(driver, "https://demoPOM.org");

    // Inputs come from the current row of the CSV data source.
    NavigationBar.SubmitSearch(driver, TestContext.DataRow["SubmitStr"].ToString());
    ResultsPane.Select_Row(driver, Int32.Parse(TestContext.DataRow["Row"].ToString()));

    // Get_InvoiceNumber returns an int, so the expected value must be an int as well.
    int invoiceNumber = ResultsPane.Get_InvoiceNumber(driver);
    Assert.AreEqual(83812302, invoiceNumber);
}

Page Object Class Example:

public class NavigationBar
{
    public static void SubmitSearch(IWebDriver driver, string searchStr)
    {
        // The repository is queried with XPath, so element names are prefixed with "//".
        By searchTxBxLocator = ObjRepoQuery.GetLocator("//searchtxbx");
        Wrappers.SmartFindElement(driver, searchTxBxLocator).SendKeys(searchStr);

        By searchBtnLocator = ObjRepoQuery.GetLocator("//searchbtn");
        Wrappers.Click(driver, searchBtnLocator);
    }
}

Driver Factory Example:

public static IWebDriver InitBrowser(string browserName)
{
    IWebDriver driver = null;
    switch (browserName)
    {
        case "Firefox":
            FirefoxOptions fOptions = new FirefoxOptions();
            fOptions.AddAdditionalOption("platform", "LINUX");
            fOptions.AddAdditionalOption("version", "66");
            driver = new FirefoxDriver(fOptions);
            break;
        case "Chrome":
            ChromeOptions cOptions = new ChromeOptions();
            cOptions.PlatformName = "windows";
            cOptions.AddAdditionalOption("platform", "WIN10");
            cOptions.AddAdditionalOption("version", "latest");
            driver = new ChromeDriver(cOptions);
            break;
        default:
            // Fail fast instead of returning a null driver for an unsupported browser.
            throw new System.ArgumentException("Unsupported browser: " + browserName);
    }
    return driver;
}

Wrappers/Helpers Example (partial):

public class Wrappers
{
    public const int TIMEOUT_CONST = 10;

    // Waits up to "timeout" seconds for the element to become visible, then returns it.
    public static IWebElement SmartFindElement(IWebDriver driver, By byLocator, int timeout = TIMEOUT_CONST)
    {
        var wait = new WebDriverWait(driver, TimeSpan.FromSeconds(timeout));
        var element = wait.Until(
            SeleniumExtras.WaitHelpers.ExpectedConditions.ElementIsVisible(byLocator));
        return element;
    }
    ...
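The retries mentioned earlier for the helper class can be kept independent of Selenium. Here is a minimal sketch of a generic retry helper; the class name RetryHelper, its parameters and the default values are my own assumptions and are not part of the Wrappers class in the attached solution.

```csharp
using System;
using System.Threading;

public static class RetryHelper
{
    // Runs the action until it succeeds or the attempts are used up.
    // If every attempt fails, the exception from the last attempt propagates to the caller.
    public static T Retry<T>(Func<T> action, int attempts = 3, int delayMs = 500)
    {
        if (attempts < 1) throw new ArgumentOutOfRangeException(nameof(attempts));

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return action();
            }
            catch (Exception) when (attempt < attempts)
            {
                // Give the UI some time to settle before trying again.
                Thread.Sleep(delayMs);
            }
        }
    }
}
```

A wrapper could then call something like RetryHelper.Retry(() => Wrappers.SmartFindElement(driver, locator)) so that a transient failure does not immediately fail the test. In a real project you would probably retry only on specific exception types rather than on every exception.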

Object Repository Query Class Example:

public class ObjRepoQuery
{
    // Looks up an element in the XML repository by XPath and builds the matching Selenium locator.
    public static By GetLocator(string Xpath)
    {
        XmlDocument repo = new XmlDocument();
        repo.Load(@"\ObjectRepository\XMLRepo.xml");
        XmlNode node;
        XmlElement root = repo.DocumentElement;
        node = root.SelectSingleNode(Xpath);

        By b = null;
        string valueString = "";
        if (node != null)
        {
            // <value> holds the locator string, <byType> selects the locator strategy.
            valueString = node.ChildNodes.OfType<XmlElement>().Where(e => e.Name == "value").FirstOrDefault().InnerText;
            switch (node.ChildNodes.OfType<XmlElement>().Where(e => e.Name == "byType").FirstOrDefault().InnerText)
            {
                case "xpath":
                    b = By.XPath(valueString);
                    break;
                case "name":
                    b = By.Name(valueString);
                    break;
                case "class name":
                    b = By.ClassName(valueString);
                    break;
                case "UAI":
                    b = By.Name(valueString);
                    break;
                default:
                    break;
            }
        }

        if (b == null)
            throw new NotFoundException("Locator not found on Object Repository");
        return b;
    }
}

XML Object Repository (partial):

<?xml version="1.0" encoding="utf-8" ?>
<DemoApp>
  <LeftPanel>
    <NotImplemented></NotImplemented>
  </LeftPanel>
  <Navigationbar>
    <searchtxbx>
      <byType>xpath</byType>
      <value>//Frame[@ClassName="Top"][@Name="Main"]/Edit[@ClassName="Input"][@Name="Search Criteria"]</value>
    </searchtxbx>
    ...
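Note that GetLocator, as shown, loads and parses the XML file on every call. One possible refinement, sketched here under my own assumptions, is to parse the repository once into a dictionary keyed by element name. The names ObjRepoCache and LocatorEntry are hypothetical, and the sketch assumes element names such as searchtxbx are unique across the file (ToDictionary throws on duplicates). It returns the raw byType/value pair so it stays independent of Selenium; mapping it to a By would work the same way as the switch in ObjRepoQuery.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public class LocatorEntry
{
    public string ByType { get; set; }
    public string Value { get; set; }
}

public static class ObjRepoCache
{
    // Builds a lookup from element name (e.g. "searchtxbx") to its locator entry.
    // Any element that has both a <byType> and a <value> child is treated as a locator.
    public static Dictionary<string, LocatorEntry> Load(string xml)
    {
        return XDocument.Parse(xml)
            .Descendants()
            .Where(e => e.Element("byType") != null && e.Element("value") != null)
            .ToDictionary(
                e => e.Name.LocalName,
                e => new LocatorEntry
                {
                    ByType = e.Element("byType").Value,
                    Value = e.Element("value").Value
                });
    }
}
```

The dictionary could be built once (for example in a static field or during test initialization) from the contents of the repository file, and each page method would then read its locators from memory.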

Full Example-POM Solution available below.
