Selenium WebDriver has been popular since its inception, and it was one of the biggest changes to the Selenium suite of tools. In our "what is Selenium" guide, we discussed what made Selenium WebDriver popular. In this article, we’ll take a deep dive into Selenium WebDriver itself.
Selenium Webdriver Architecture
First, let’s look at how the Selenium WebDriver API interacts with real browsers through browser drivers, and at the major blocks that make up the Selenium WebDriver architecture. Selenium WebDriver comprises four major blocks:
- Selenium Client Libraries
- JSON Wire Protocol
- Browser Drivers
- Browsers
Let’s get to know each one of them in detail and what they help us with:
1. Selenium Client Libraries
Developers and software testers prefer to write automation scripts in the language they are most comfortable with, and there are many to choose from: C#, Java, Python, Perl, and so on. Supporting all of them could have been difficult, but the Selenium client libraries (also called language bindings) provide this multi-language support. You write your automation script in the language of your choice, and the binding for that language translates it into Selenium commands. For example, if you want to write an automation script in PHP, you only need the PHP client library. You can download the Selenium bindings from the official Selenium website.
2. JSON Wire Protocol
JSON (JavaScript Object Notation) Wire Protocol facilitates the capability of transferring data between the Client and Server on the web. It is a REST (Representational State Transfer) API that provides a transport mechanism and defines a RESTful web service using JSON over HTTP.
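To make this concrete, the sketch below builds the HTTP request a client library might produce for a "navigate to URL" command. The POST /session/{sessionId}/url endpoint and the {"url": ...} body follow the JSON Wire Protocol. The class name WireCommand, the navigateTo helper, and the session id abc123 are hypothetical names chosen for illustration, and nothing is actually sent over the network.

```java
/** Minimal sketch of a JSON Wire Protocol command; nothing is sent over the network. */
final class WireCommand {
    final String method;
    final String path;
    final String body;

    WireCommand(String method, String path, String body) {
        this.method = method;
        this.path = path;
        this.body = body;
    }

    /** Builds the request for "navigate to URL" in the given session. */
    static WireCommand navigateTo(String sessionId, String url) {
        String path = "/session/" + sessionId + "/url";
        String body = "{\"url\":\"" + escape(url) + "\"}";
        return new WireCommand("POST", path, body);
    }

    /** Escapes backslashes and double quotes only, which is enough for typical URLs. */
    static String escape(String value) {
        return value.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        // "abc123" is a made-up session id; a real one is issued by the driver.
        WireCommand cmd = navigateTo("abc123", "http://www.google.com");
        System.out.println(cmd.method + " " + cmd.path);
        System.out.println(cmd.body);
    }
}
```

In a real client library this request is produced for you whenever the script calls a method such as driver.get(...).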
3. Browser Drivers
Browser drivers are used for interacting with browsers and relaying the automation script’s instructions to them. They let a script control the browser without exposing the internal logic of the browser’s functionality. Each browser has its own specific driver.
The following steps are involved in running an automation script through a specific browser driver:
- An HTTP request is generated for every Selenium command and sent to the browser driver.
- The browser driver receives the HTTP request through its HTTP server.
- The HTTP server determines the steps needed to carry out the command, and they are executed on the browser.
- The execution status is sent back to the HTTP server, which returns it to the automation script.
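The round trip described in these steps can be sketched with the JDK alone. The example below starts a stand-in "driver" HTTP server on a free local port, posts one command to it, and reads the reply. It is only an illustration: the class FakeDriverRoundTrip and its methods are hypothetical, no browser is involved, and the reply {"status":0,"value":null} is a simplified stand-in for a driver response (in the JSON Wire Protocol, status 0 means success).

```java
import com.sun.net.httpserver.HttpServer;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;

/** Sketch of the script -> driver -> script round trip using a stand-in driver. */
final class FakeDriverRoundTrip {

    /** Starts a stand-in "browser driver" HTTP server on a free local port. */
    static HttpServer startFakeDriver() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/session", exchange -> {
            // The driver's HTTP server receives the command.
            // A real driver would now execute it in the browser.
            readAll(exchange.getRequestBody());
            // The execution status goes back to the calling script.
            byte[] reply = "{\"status\":0,\"value\":null}".getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().add("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, reply.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(reply);
            }
        });
        server.start();
        return server;
    }

    /** Sends one command as an HTTP POST and returns the response body. */
    static String sendCommand(int port, String path, String json) throws IOException {
        URL url = new URL("http://127.0.0.1:" + port + path);
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/json");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(json.getBytes(StandardCharsets.UTF_8));
        }
        try (InputStream in = conn.getInputStream()) {
            return new String(readAll(in), StandardCharsets.UTF_8);
        }
    }

    /** Reads a stream to the end. */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[1024];
        int read;
        while ((read = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, read);
        }
        return buffer.toByteArray();
    }

    /** Runs the whole round trip once and returns the stand-in driver's reply. */
    static String roundTrip(String path, String json) {
        try {
            HttpServer server = startFakeDriver();
            try {
                return sendCommand(server.getAddress().getPort(), path, json);
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "abc123" is a made-up session id used only for this illustration.
        System.out.println(roundTrip("/session/abc123/url", "{\"url\":\"http://www.google.com\"}"));
    }
}
```

A real browser driver such as ChromeDriver plays the role of the stand-in server here, and the Selenium client library plays the role of sendCommand.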
4. Browsers
The best part about Selenium WebDriver is that it supports all the major browsers, such as Google Chrome, Mozilla Firefox, Internet Explorer, and Safari. Every browser has its own driver for executing automation scripts.
Types of Browser specific Web Drivers
There are various drivers, such as HtmlUnit Driver, Chrome Driver, Firefox Driver, Internet Explorer Driver, and Opera Driver, that are required to run automation scripts. Let’s look at several of them in detail:
1. HtmlUnit Driver
As the name suggests, this driver is based on HtmlUnit and is one of the fastest and most lightweight implementations of WebDriver. If you’re using a language binding other than Java, you’ll need a running Selenium Server to use this driver, which is why the non-Java examples below connect to a remote URL.
Code to use the HtmlUnit Driver:
java
WebDriver driver = new HtmlUnitDriver();
csharp
IWebDriver driver = new RemoteWebDriver(new Uri("http://127.0.0.1:4444/wd/hub"), DesiredCapabilities.HtmlUnit());
python
driver = webdriver.Remote("http://localhost:4444/wd/hub", webdriver.DesiredCapabilities.HTMLUNIT.copy())
ruby
driver = Selenium::WebDriver.for :remote, :url => "http://localhost:4444/wd/hub", :desired_capabilities => :htmlunit
perl
my $driver = Selenium::Remote::Driver->new(browser_name => 'htmlunit', remote_server_addr => 'localhost
2. Firefox Driver
As the name suggests, Firefox Driver is used to control the Firefox browser while running an automation script.
Commands to use in your code for using Firefox Driver:
java
WebDriver driver = new FirefoxDriver();
csharp
IWebDriver driver = new FirefoxDriver();
python
driver = webdriver.Firefox()
ruby
driver = Selenium::WebDriver.for :firefox
perl
my $driver = Selenium::Remote::Driver->new;
3. Internet Explorer Driver
Internet Explorer Driver is used to run automation scripts on Internet Explorer. It supports IE 7, 8, 9, 10, and 11. It used to support IE 6 as well, but support for it was dropped in 2014.
Code to use the Internet Explorer Driver:
java
WebDriver driver = new InternetExplorerDriver();
csharp
IWebDriver driver = new InternetExplorerDriver();
python
driver = webdriver.Ie()
ruby
driver = Selenium::WebDriver.for :ie
perl
my $driver = Selenium::Remote::Driver->new(browser_name => 'internet explorer');
4. ChromeDriver
ChromeDriver works with the Chrome browser to run automation scripts. For WebDriver to discover ChromeDriver, you need to provide ChromeDriver’s path in the test script.
Code to use ChromeDriver:
java
WebDriver driver = new ChromeDriver();
csharp
IWebDriver driver = new ChromeDriver();
python
driver = webdriver.Chrome()
ruby
driver = Selenium::WebDriver.for :chrome
perl
my $driver = Selenium::Remote::Driver->new(browser_name => 'chrome');
Features of Selenium Webdriver
Selenium WebDriver is one of the best choices for developers, so it is worth understanding what makes it stand out from the crowd. Let’s have a look at some of the top features of Selenium WebDriver:
Multi-Browser Compatibility
Selenium WebDriver interacts with a website and its web elements in a browser just as a real user would. It uses the browser’s native support to make direct calls, without any intermediate software or device. It supports all modern web browsers, such as Chrome, Firefox, Opera, Safari, and Internet Explorer, and you can launch any of them with a simple command. For example, to launch the Chrome browser:
WebDriver driver = new ChromeDriver();
Selenium WebDriver also supports AndroidDriver, HtmlUnitDriver, and IPhoneDriver.
Multiple Language Support
It supports most of the commonly used programming languages, such as Java, JavaScript, Python, PHP, Ruby, C#, and Perl, which gives us the freedom to write our automation scripts in any of them.
It also lets us use the constructs of those languages, such as switch statements, conditional statements, and other decision-making statements, to make the automation script robust enough to handle a variety of situations.
Speed and Performance
Selenium WebDriver executes test scripts faster than the other tools in the Selenium suite. Unlike Selenium RC, it communicates directly with the browser without any intermediate server.
Better Handling of Dynamic Web Elements
Handling dynamic web elements is one of the most common challenges in automation testing. Locating a web element by XPath or ID is easy when the element is static, but when an element’s XPath or ID keeps changing, it becomes much harder to handle. Selenium WebDriver offers ways to handle such dynamic elements, including checkboxes, dropdowns, and alerts.
Selenium uses some of the following methods to handle dynamic elements:
- Absolute XPath: This contains the complete path of the web UI element starting from the root node. It does not rely on an attribute value that changes, but it breaks as soon as the page structure changes.
- contains(): This XPath function matches an element by part of an attribute value or text, so it can be used when only a portion of the value stays constant.
- starts-with(): This XPath function matches elements whose attribute value begins with the supplied text, which helps when only the suffix of the value changes.
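The three approaches above can be tried without a browser. The sketch below evaluates such expressions with the JDK’s built-in XPath engine against a small, made-up XML stand-in for a page. The markup, the id values, and the class DynamicXPathDemo are hypothetical. In a Selenium script, the same expression strings would be passed to By.xpath.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Evaluates XPath expressions against a tiny stand-in page using only the JDK. */
final class DynamicXPathDemo {
    // Hypothetical markup: imagine the numeric suffix of each id changes on every load.
    static final String PAGE =
        "<html><body><form>"
        + "<input id='user_48213' name='username'/>"
        + "<input id='pass_90377' name='password'/>"
        + "<button id='submit_17'>Sign in</button>"
        + "</form></body></html>";

    /** Counts the nodes in PAGE matched by the given XPath expression. */
    static int countMatches(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(PAGE)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Absolute path: independent of the id, but tied to the page structure.
        System.out.println(countMatches("/html/body/form/input[1]"));
        // contains(): matches on the stable part of the id.
        System.out.println(countMatches("//input[contains(@id, 'user_')]"));
        // starts-with(): matches on the stable prefix of the id.
        System.out.println(countMatches("//*[starts-with(@id, 'submit_')]"));
    }
}
```

Each expression selects exactly one element here, even though no full id value is ever written into the locator.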
Easy to Identify and Use Web Elements
Selenium WebDriver has a set of locators for finding web elements on a web page, which makes it easier to use those elements in the test automation suite. The most commonly used locators are:
- Name
- ClassName
- ID
- TagName
- LinkText
- PartialLinkText
- Xpath
- CSS Selector
- DOM (through JavaScript; not one of WebDriver’s built-in By locators)
Selenium Webdriver vs Selenium RC
Selenium RC had some restrictions that eventually led to the development of Selenium WebDriver, which is an enhanced successor to Selenium RC. Let’s look at what Selenium WebDriver offers in comparison to Selenium RC.
Architecture
The Selenium WebDriver architecture is simple and efficient. It controls web browsers such as Chrome and Firefox directly at the OS level. All you need is an IDE for writing the Selenium automation script and a browser.
The Selenium RC architecture, by contrast, includes an intermediate server, the Selenium Remote Control Server, which acts as a middle layer between your automation script and the web browser. The RC server injects JavaScript code into the browser; that code fetches instructions from the RC server and executes them in the browser. The browser’s response is then relayed to the RC server, and RC displays the results to you.
Speed
As Selenium Webdriver directly interacts with the browser, it is much faster and has better performance than Selenium RC which requires an intermediate Javascript program to reach the browser.
Real life interaction
Selenium WebDriver acts like a real person when interacting with a web browser. For example, Selenium WebDriver cannot enter a value into a textbox field if the textbox is disabled.
Selenium RC, on the other hand, can access a disabled element just as any other JavaScript program can. Software testers have reported RC’s ability to access disabled fields as an issue.
Differences in API
The Selenium WebDriver API is simpler than Selenium RC’s because it does not contain unnecessary and baffling commands, whereas Selenium RC’s API is full of redundant and confusing ones. For example, testers are often unsure whether to use type or typeKeys, or whether to use click, mouseDown, or mouseDownAt.
Worse still, browsers interpret each of these commands differently.
Browser Support
Selenium WebDriver supports the headless HtmlUnit browser. Because HtmlUnit has no GUI, it spends no time waiting for page elements to render, which makes it very fast and can shorten your test execution cycles. Selenium RC, however, does not support the HtmlUnit browser, so its test runs take longer by comparison.
Setting up a Selenium Webdriver project
Configuring Selenium WebDriver on your local system involves the following steps:
1. Download and install Java 8 and set up the environment variables on your local system
Since I am going to use Java to write the automation script, I need Java configured on my system to run Java code in Eclipse.
- You can download the latest version of the JDK (Java Development Kit) from http://www.oracle.com/technetwork/java/javase/downloads/index.html
- Set up the environment variable on your local system: go to My Computer properties -> Advanced System Settings -> Environment Variables -> New under User variables -> enter Path as the variable name -> enter the path of Java’s bin folder as the variable value -> OK -> OK -> OK
- Now check whether Java is configured correctly by running the javac command in the command prompt.
If the command prompt prints the list of javac options, the Java environment is set up on your system.
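As an additional check, you can compile and run a tiny program. The sketch below only prints the version and install location of the Java runtime that executes it, read from standard JVM system properties. The class name JavaEnvCheck is hypothetical. If javac JavaEnvCheck.java followed by java JavaEnvCheck both work from the command prompt, the bin folder is on your path.

```java
/** Prints which Java runtime is executing this program. */
final class JavaEnvCheck {
    /** Returns a one-line summary built from standard JVM system properties. */
    static String describeRuntime() {
        return "Java " + System.getProperty("java.version")
            + " at " + System.getProperty("java.home");
    }

    public static void main(String[] args) {
        System.out.println(describeRuntime());
    }
}
```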
2. Download and Configure Integrated Development Environment (I am using Eclipse for this article)
- Download the latest version of Eclipse for Java developers from this link: https://www.eclipse.org/downloads/
- Double-click the Eclipse installer file and install it on your system.
- Choose a directory for the workspace and click the Launch button to open the Eclipse IDE. You are now ready to write an automation test suite.
3. Download and Configure Selenium WebDriver Java Client
- Download the Selenium WebDriver Java client from https://docs.seleniumhq.org/download/
- The downloaded Java binding is a zip archive. Extract it into a folder. It contains the jar files required to configure Selenium WebDriver in the Eclipse IDE.
- Create a new Java project in Eclipse in which to configure Selenium WebDriver.
- Name the new project and click the Finish button.
- Create a public class in the new project.
- Now add the Selenium jar files to the test project (Hackr-Test). Open the properties of the project.
- Click Java Build Path and choose to add external JAR files.
- Add the Selenium WebDriver jar files.
- Add the jar files under the libs folder as well. Selenium WebDriver is now configured in the Eclipse IDE, and you can write a Selenium automation script and run it with WebDriver.
4. Download and Configure Browser Specific Webdriver in your local system
Since I am going to use the Chrome-specific driver, I’ll explain how to download ChromeDriver and configure it in the Eclipse IDE.
- Download the latest version of ChromeDriver from http://chromedriver.chromium.org/downloads
- Pick the ChromeDriver build that matches your operating system.
- Unpack the ChromeDriver zip file and copy the path of the directory.
- Set the ChromeDriver path in your automation script.
With the Selenium dependencies and ChromeDriver configured in the Eclipse IDE, we are ready to run a sample automation script.
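The last step, setting the driver path, is a one-line System.setProperty call on the webdriver.chrome.driver property. The sketch below wraps that call in a small helper that also fails early when the file does not exist, which can save some debugging time. The class ChromeDriverPath and its configure method are hypothetical helpers, not part of Selenium.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Sets the ChromeDriver location before any ChromeDriver object is created. */
final class ChromeDriverPath {
    static final String PROPERTY = "webdriver.chrome.driver";

    /** Points the system property at the driver binary, failing early if the file is missing. */
    static String configure(String driverLocation) {
        Path driver = Paths.get(driverLocation).toAbsolutePath();
        if (!Files.isRegularFile(driver)) {
            throw new IllegalArgumentException("No ChromeDriver binary at " + driver);
        }
        System.setProperty(PROPERTY, driver.toString());
        return driver.toString();
    }

    public static void main(String[] args) {
        // Pass the full path of the driver file you unpacked in the previous step.
        if (args.length != 1) {
            System.out.println("usage: java ChromeDriverPath <path-to-chromedriver>");
            return;
        }
        System.out.println(PROPERTY + " = " + configure(args[0]));
    }
}
```

In a test script, you would call the helper once at the start, before creating the ChromeDriver object.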
Live Example with Chrome Selenium-WebDriver and running Automation script in the local Chrome browser
Here is the sample automation script which can be run to automate the web browser using Selenium Webdriver.
import java.util.ArrayList;
import java.util.concurrent.TimeUnit;

import org.openqa.selenium.By;
import org.openqa.selenium.JavascriptExecutor;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;

public class Test1 {
    public static void main(String[] args) {
        // Give your driver path here.
        System.setProperty("webdriver.chrome.driver", "D:\\Chrome-Webdriver\\chromedriver.exe");
        WebDriver driver = new ChromeDriver();
        driver.get("http://www.lambdatest.com");
        try {
            driver.manage().window().maximize();

            // Open the login form and sign in.
            WebElement sign = driver.findElement(By.xpath("//*[@id=\"bs-example-navbar-collapse-1\"]/ul/li[6]/a"));
            sign.click();
            WebElement textBox = driver.findElement(By.xpath("//*[@id=\"app\"]/section/form/div/div/input[1]"));
            textBox.sendKeys("your-email@example.com"); // replace with your registered email
            WebElement password = driver.findElement(By.xpath("//*[@id=\"app\"]/section/form/div/div/input[2]"));
            password.sendKeys("your-password"); // replace with your password
            WebElement login = driver.findElement(By.xpath("//*[@id=\"app\"]/section/form/div/div/button"));
            login.click();

            // Open the menu and enter the URL to test.
            WebElement menu = driver.findElement(By.xpath("//*[@id=\"Layer_1\"]"));
            menu.click();
            WebElement urlField = driver.findElement(By.xpath("//*[@id=\"input-text\"]"));
            urlField.sendKeys("https://hackr.io/blog");

            // Fixed pauses give the page and the test session time to load.
            Thread.sleep(30000);
            WebElement startButton = driver.findElement(By.xpath("/html/body/app-root/app-console/app-header/section/app-test-detail/div[1]/div[1]/div/div[1]/form/div[3]/button"));
            startButton.click();
            Thread.sleep(30000);

            // End the session and confirm in the popup.
            WebElement endSession = driver.findElement(By.xpath("//*[@id=\"drog-nav\"]/div[2]/ul/li[8]/a"));
            endSession.click();
            WebElement confirmEnd = driver.findElement(By.xpath("//*[@id=\"terminate-session-popup\"]/div/div/div[1]"));
            confirmEnd.click();

            // Wait until the browser list is visible.
            WebDriverWait wait = new WebDriverWait(driver, 40);
            wait.until(ExpectedConditions.visibilityOfElementLocated(By.xpath("//div[@class='mCSB_container']/ul [@class='list-unstyled real-browser-test__list-browser text-center']/li")));
            driver.manage().timeouts().implicitlyWait(15, TimeUnit.SECONDS);

            // Open a new tab, switch to it, and go to the blog.
            ((JavascriptExecutor) driver).executeScript("window.open()");
            ArrayList<String> tabs = new ArrayList<>(driver.getWindowHandles());
            driver.switchTo().window(tabs.get(1));
            driver.get("https://hackr.io");
            WebElement blog = driver.findElement(By.xpath("//*[@id=\"programming\"]/ul/li[5]/a/span/img"));
            blog.click();
            WebElement blog1 = driver.findElement(By.xpath("//*[@id=\"post-2789\"]/div[3]/div[2]/a"));
            blog1.click();
        } catch (Exception e) {
            System.out.println(e.getMessage());
        }
    }
}
This code launches the LambdaTest website and clicks the login button to sign in with a registered email and password. It then runs a real-time test by entering the input URL https://hackr.io/blog and selecting a browser and operating system. Finally, it closes the session by clicking the End Session button. After that, the script opens https://hackr.io in a new tab and navigates to the blog page by finding the blog element.
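The script relies on fixed Thread.sleep pauses in some places and on WebDriverWait in others. The idea behind an explicit wait is to poll a condition until it holds or a timeout expires, so the script continues as soon as the page is ready. The sketch below shows that polling idea with the JDK alone. The PollingWait class and its waitUntil method are hypothetical, and this is an illustration of the concept, not how Selenium implements WebDriverWait.

```java
import java.util.function.BooleanSupplier;

/** Polls a condition instead of sleeping for a fixed time. */
final class PollingWait {
    /**
     * Checks the condition repeatedly until it holds or the timeout elapses.
     * Returns true as soon as the condition is met, false on timeout or interruption.
     */
    static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Stand-in for "the element is visible": becomes true after about 300 ms.
        boolean ready = waitUntil(() -> System.currentTimeMillis() - start >= 300, 5_000, 50);
        System.out.println("ready=" + ready + " after " + (System.currentTimeMillis() - start) + " ms");
    }
}
```

With a fixed 30-second sleep the script always pays the full 30 seconds. With polling it pays only as long as the condition takes to become true.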
Running Selenium Script Using Remote WebDriver(optional)
Selenium-WebDriver API Commands and Operations
Let’s have a quick look at the most used Selenium WebDriver commands and operations. Since I am using Java to write the automation test suite, most of the frequently used commands below are written in Java syntax.
1. Fetching a web page
This is the first command to write in an automation script. It fetches and opens a web page in the browser.
Java: driver.get("http://www.google.com");
csharp: driver.Url = "http://www.google.com";
Ruby: driver.get "http://www.google.com"
Python: driver.get("http://www.google.com")
Perl: $driver->get('http://www.google.com')
Javascript: driver.get('http://www.google.com');
2. Locating web elements
These commands find the web UI elements that the automation script works with. Elements are fetched with the Find Element or Find Elements method, which most language bindings provide.
Following are the web element locators:
- By ID
This is the most efficient way to locate an element on a web page. The value of the element’s id attribute is used to find it.
WebElement element = driver.findElement(By.id("coolestWidgetEvah"));
- By Class Name
Web elements can be identified by the class attribute of the DOM element. Several elements can share a class, so findElements returns a list.
List<WebElement> cheeses = driver.findElements(By.className("cheese"));
- By XPath
At a high level, WebDriver uses the browser’s native XPath capabilities wherever possible to find web UI elements.
List<WebElement> inputs = driver.findElements(By.xpath("//input"));
- Using JavaScript
You can also execute JavaScript to get a web UI element. This example assumes that jQuery is loaded on the page.
WebElement element = (WebElement) ((JavascriptExecutor)driver).executeScript("return $('.cheese')[0]");
- By Link Text
You can find a link element by its matching visible text.
WebElement cheese = driver.findElement(By.linkText("cheese"));
3. Fetching text values
Developers and testers often need to retrieve the inner text of a web element. This command returns the element’s visible text.
WebElement element = driver.findElement(By.id("elementID"));
String text = element.getText();
4. Switching Between Windows and Frames
Selenium WebDriver supports switching between multiple frames and windows. Most web applications use multiple frames and windows to make them more usable. This WebDriver command helps you switch between windows easily.
driver.switchTo().window("windowName");
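In practice, the argument is usually a window handle rather than a fixed name. WebDriver’s getWindowHandles() returns the handles of all open windows as a Set of strings, and getWindowHandle() returns the current one. The sketch below shows one way to pick "the other window" from such a set using plain Java collections. The WindowHandles class, its firstOtherHandle method, and the handle values are hypothetical and used only for illustration.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Picks a window handle to switch to from a set of handles. */
final class WindowHandles {
    /** Returns the first handle that differs from the current one, or null if there is none. */
    static String firstOtherHandle(Set<String> allHandles, String currentHandle) {
        for (String handle : allHandles) {
            if (!handle.equals(currentHandle)) {
                return handle;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Made-up handle values; real ones come from driver.getWindowHandles().
        Set<String> handles = new LinkedHashSet<>();
        handles.add("window-main");
        handles.add("window-popup");
        System.out.println(firstOtherHandle(handles, "window-main"));
    }
}
```

In a Selenium script, the result would be passed to driver.switchTo().window(...), after checking that it is not null.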
5. Navigation: History and Location
Common commands for navigating to the webpages.
driver.navigate().to("http://www.example.com");
driver.navigate().forward();
driver.navigate().back();
6. Drag And Drop
Commands to perform drag and drop action.
WebElement element = driver.findElement(By.name("source"));
WebElement target = driver.findElement(By.name("target"));
(new Actions(driver)).dragAndDrop(element, target).perform();
To Sum Up
Selenium WebDriver is one of the most popular choices for automating cross-browser test scripts because of its ease of use, its flexibility in language choice, and its architecture. It supports various languages, browsers, and operating systems. It is also open source, which is the icing on the cake.
Hope you liked our article. Do let us know your thoughts in the comments section below.