Selenium-By-Arun: A Brief Overview on Selenium Test Automation Tool

1. Introduction to Selenium

Overview

The sole purpose of this post is to give you a high-level, big-picture view of the Selenium Test Automation Tool. I believe this post will be very useful for beginners who want to know the ins and outs of Selenium at a high level, so that they can begin their Selenium journey well. It will also be useful for those who believe they have already learnt Selenium, and for those who are still learning it, as they can cross-check what they have learnt so far and see whether they are up to the mark.

"The main motive of this post is to make everyone aware of Selenium and help them plan their learning in a better and more organized way"

What is Selenium?

In simple words, Selenium is a Test Automation Tool that was originally developed at a company named "ThoughtWorks" in 2004.

Things that Selenium can automate

Selenium can automate Web Applications, which run in Web Browsers. For example, Selenium can automate a web application like www.gmail.com, as it runs in a web browser.

Things that Selenium cannot automate

Selenium cannot automate Desktop / Windows-based applications which don't run in web browsers. For example, Selenium cannot automate desktop / Windows-based applications like MS Word, MS Excel etc. Luckily, most of the applications in the market are Web Applications.

Selenium is an open source and free tool

Unlike proprietary tools, which require us to purchase a license to use them, Selenium is an open source and free tool. We don't have to spend a single penny to use Selenium. Read more at the below link:

Selenium is a free and open source functional test automation tool

Different components in Selenium

Selenium is not a single tool; it is a set of tools. Below are the four components which we together call Selenium:

Selenium IDE
Selenium RC
Selenium WebDriver
Selenium Grid

Browsers supported by Selenium

Most projects automate their applications for the Firefox, Chrome and Internet Explorer browsers. Below are the 5 browser types that Selenium supports in total:

Firefox
Chrome
Internet Explorer
Safari
Opera

Operating Systems supported by Selenium

Selenium supports all the popular operating systems in the market. Below are the 3 operating systems that Selenium supports:

Windows
Macintosh
Linux

Programming languages supported by Selenium

Selenium supports multiple programming languages for creating automation scripts. Hence we can use any of the below programming languages supported by Selenium:

Java
C#
Ruby
Python
Perl
Haskell
Objective-C
R
JavaScript

Out of all the supported languages, Selenium has the most support for Java, and most projects use Java, C#, Ruby or Python for writing automation scripts with Selenium. Also, the automation language is independent of the application's language: we can write the Selenium scripts in Java even though the application under test is developed in C#.

Different versions of Selenium

So far, Selenium has been released in 3 major versions. Below are the three versions of Selenium:

Selenium 1
Selenium 2
Selenium 3

At the time of writing, the latest version of Selenium is Selenium 3, which was released on Oct 13, 2016.

History of Selenium

Selenium Core

Selenium was created by Jason Huggins in 2004. To avoid executing the same test cases manually day after day, he created a JavaScript program and named it JavaScriptTestRunner, which was later renamed Selenium Core. Over time, browsers began allowing only JavaScript that comes from the same origin as the application (the same-origin policy), treating JavaScript from any other origin as a security breach. This became a drawback for Selenium Core, as it uses JavaScript programs to interact with the application under test. To overcome this problem, testers using Selenium Core had to install Selenium Core's JavaScript programs into a local copy of the application's code.

Selenium RC

To overcome the drawback of Selenium Core, Paul Hammant created a proxy server named 'Selenium RC', which tricks the browser into believing that Selenium Core's JavaScript programs come from the same origin as the application under test, even though they do not. The drawback of the Selenium RC approach is that all communication between the automation code and the application has to go through this proxy server.

Selenium Grid

Selenium Grid was developed by Patrick Lightbody to reduce the execution time of automation scripts by running them in parallel on different machines, i.e. instead of executing all the scripts on a single machine, the scripts are divided across different machines and executed simultaneously.
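
The idea of dividing scripts across machines can be sketched in a few lines of Python. This is only an illustration of parallel execution and not Selenium Grid's own API: worker threads stand in for the machines, and the names run_script and run_in_parallel are made up for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def run_script(script_name):
    # Stand-in for executing one automation script on a remote machine.
    return f"{script_name}: passed"


def run_in_parallel(script_names, machine_count):
    # Each worker thread plays the role of one machine in the grid.
    with ThreadPoolExecutor(max_workers=machine_count) as pool:
        # map() returns the results in the same order as the input scripts.
        return list(pool.map(run_script, script_names))


if __name__ == "__main__":
    for line in run_in_parallel(["login", "search", "checkout", "logout"], 2):
        print(line)
```

With two workers, two scripts can be in progress at the same time, which is where the saving in execution time comes from.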

Selenium IDE

Selenium IDE was developed by Shinya Kasatani to record tests, like recording a video, and to execute the recorded tests, like playing a video. Hence Selenium IDE is a record and playback tool. Selenium IDE was released as a Firefox add-on/plugin/extension and can be installed on top of the Firefox browser. Once installed, we can simply record tests in the Firefox browser and play them back when required.

Selenium 1

Selenium 1 is nothing but the combination of Selenium IDE, Selenium RC and Selenium Grid.

Selenium WebDriver

Selenium WebDriver was created by Simon Stewart in 2006. Browsers and web applications were becoming more powerful and more restrictive towards external JavaScript programs. Selenium RC worked around this by using the proxy server as a communicator to trick the browser into treating Selenium Core's JavaScript program as part of the application code. Selenium WebDriver instead used the concept of native drivers to interact directly with the browser running the application under test, and hence eliminated the need for a proxy server. Selenium WebDriver got support from the largest browser vendors, covering Firefox, Chrome, Internet Explorer, Opera and Safari, to make Selenium a native part of their browsers. Hence the browsers won't treat Selenium's commands as external. The native drivers available for the supported browsers are FirefoxDriver, ChromeDriver, InternetExplorerDriver, SafariDriver and OperaDriver. The development of these drivers was done by the Selenium team, even though the drivers are native to the browsers.

Selenium 2

Later, Selenium 1 (i.e. Selenium IDE + Selenium RC + Selenium Grid) was merged with Selenium WebDriver to make Selenium more powerful, and was released as Selenium 2 on July 8th, 2011. Before this merger, there were separate teams developing Selenium RC and Selenium WebDriver. As the two teams realized that Selenium could become more powerful by merging Selenium RC and Selenium WebDriver, they released the merged version as Selenium 2.

i.e. Selenium 2 is the combination of Selenium 1 and Selenium WebDriver.

Selenium 3

At the time of writing, Selenium 3 is the latest version of Selenium; it was released on Oct 13, 2016.

Change # 1 - The big change in this version is that Selenium RC, which was part of Selenium 2, is dropped in Selenium 3. Selenium RC, which internally used Selenium Core's JavaScript libraries, is now replaced with a WebDriver-backed implementation of the same API, i.e. the tasks that could previously be performed only by Selenium RC's JavaScript implementation can now be performed using the WebDriver-backed implementation, which is more flexible. There is no change in the WebDriver implementation from Selenium 2 to Selenium 3; only the Selenium RC JavaScript libraries were replaced. The impact of this change is small for most users, as they are already writing tests in WebDriver. It is significant for users who are writing tests in Selenium RC: Selenium RC won't be supported by the Selenium team in future releases, so they should migrate to WebDriver.

Change # 2 - Another big change in Selenium 3 is that we have to use Mozilla's gecko driver to run Selenium automation scripts on the Firefox browser. Up to Selenium 2, the Selenium driver for Firefox was developed and supported by the Selenium team through the FirefoxDriver class, but from Selenium 3 onwards the driver for Firefox is developed and supported by the browser vendor, Mozilla, in the form of the gecko driver. Hence, from Selenium 3, we have to use the gecko driver in order to execute Selenium automation scripts on the Firefox browser.

Change # 3 - Apple provides the Safari driver for executing Selenium automation scripts on the Safari browser of the macOS Sierra operating system.

Change # 4 - Microsoft provides the Edge driver for executing Selenium automation scripts on the Edge browser.

Hence the major change from Selenium 2 to Selenium 3 is that the Selenium RC JavaScript libraries were dropped and replaced with a WebDriver-backed implementation. The other major change is that the browser vendors have taken ownership of the drivers required to run automation scripts on their browsers, i.e. the gecko driver is provided by Mozilla, the Safari driver by Apple and the Edge driver by Microsoft.

2. Selenium Prerequisites

Prerequisites required for learning Selenium

Below are the prerequisites that you need to know before learning Selenium:

Software Testing Basics
Automation Basics
Core Java
Web Technologies like HTML, CSS, XML, XPATH, DOM and JavaScript

Software Testing Basics required for learning Selenium

Below are a few software testing basics that are required for learning Selenium:

What is Software Testing?
Why do we perform Testing?
What is a Defect?
Why do we find Defects?
What are Test Cases?
What is Manual Testing?
What is Regression Testing?
What is Automation Testing?

If you are new to testing, go through the below blog post to get started.

What is Software Testing ?

Automation Basics required for learning Selenium

Below are a few automation basics that are required for learning Selenium:

What is Automation Testing?
How can an Automation tool perform testing without a tester?
What is the purpose of automating the testing?
Advantages of Automation testing
Applications suitable for Automation Testing
Which tests can be automated?
What is the goal of Automation Testing?
What are the most popular Automation tools?

If you want to learn all of the above Automation basics, go through the below post:

Test Automation Basics

Java concepts required for learning Selenium

The Java programming language plays a vital role in Selenium automation. It's not required to learn the complete Java language; learning Core Java is enough for Selenium. Below are the Core Java concepts that are required for Selenium at a high level:

OOPS Concepts
Primitive Data Types
Variables and Values
Operators
Decision Making Statements
Iterative Statements
Methods
Constructors
Arrays
Strings
Access Modifiers
Wrapper Classes
Exception Handling
Files
Collections Framework
And other miscellaneous Core Java concepts

"Knowledge of Core Java is enough for Selenium. It is not required to learn the Advanced Java concepts."

Web Technologies required for learning Selenium

Apart from learning Core Java, it's required to learn a few Web Technologies for Selenium. Below are the Web Technologies that are required for learning Selenium:

HTML
CSS
XPath
DOM
JavaScript
XML

If you want to learn the HTML concepts, go through the below post:

HTML Basics

3. Locators and Add-ons

Purpose of Locators in Selenium

Locators are used in Selenium to identify the Graphical User Interface (GUI) elements on web pages. Let's say we want to click the 'Login' button on the Facebook page: we have to use the locator of the 'Login' button in the Selenium automation code. Similarly, to automate any application using Selenium, we have to provide the locators of the application's GUI elements to the Selenium code. Selenium then uses the provided locators to identify the respective GUI elements in the application.
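
A minimal sketch of this idea in Python is given below. So that it runs without a browser, FakeDriver and FakeElement are hypothetical stand-ins written for this example, and the id "loginbutton" is made up. The find_element(by, value) shape loosely mirrors the call of the same name in Selenium's Python bindings, where the strategy "id" corresponds to By.ID.

```python
class FakeElement:
    """Stand-in for a web element; it only remembers whether it was clicked."""

    def __init__(self):
        self.clicked = False

    def click(self):
        self.clicked = True


class FakeDriver:
    """Hypothetical stand-in for a browser driver holding a tiny 'page'."""

    def __init__(self, elements_by_id):
        self.elements_by_id = elements_by_id

    def find_element(self, by, value):
        # Only the "id" strategy is modelled in this sketch.
        if by != "id" or value not in self.elements_by_id:
            raise LookupError(f"no element for {by}={value}")
        return self.elements_by_id[value]


def click_login(driver, login_locator):
    # The locator is a (strategy, value) pair supplied by the test author.
    strategy, value = login_locator
    driver.find_element(strategy, value).click()


if __name__ == "__main__":
    login_button = FakeElement()
    driver = FakeDriver({"loginbutton": login_button})
    click_login(driver, ("id", "loginbutton"))
    print(login_button.clicked)
```

The automation code never refers to the button directly; it only hands over a locator and lets the driver identify the element.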

Different types of Locator Strategies

Below are the different types of locators that we can provide in Selenium code for identifying the web elements in the application:

ID
Name
Link Text
Class Name
XPath Expressions
CSS Selectors
DOM

Locators Priority

As several locator strategies can often locate the same GUI element, we have to choose a single strategy out of those available. As a first priority we use the ID strategy, followed by Name, then Class Name and then Link Text. If the web element cannot be identified using the ID / Name / Class Name / Link Text strategies, we go for the CSS Selectors strategy. If the web element still cannot be identified, we go for the XPath Expressions strategy. And finally, if the web element cannot be identified using XPath Expressions, we go for the DOM strategy.
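
The priority order above can be written down as a small helper. This is a sketch only: choose_locator and the strategy names used as dictionary keys are assumptions made for this example, not part of Selenium.

```python
# Priority order described above, highest priority first.
LOCATOR_PRIORITY = ["id", "name", "class name", "link text",
                    "css selector", "xpath", "dom"]


def choose_locator(candidates):
    """Return the (strategy, value) pair with the highest priority.

    `candidates` maps a strategy name to a locator value that is known to
    identify the element; strategies that do not work are simply absent.
    """
    for strategy in LOCATOR_PRIORITY:
        if strategy in candidates:
            return strategy, candidates[strategy]
    raise ValueError("no usable locator for this element")


if __name__ == "__main__":
    # The element has no usable id, so Name wins over XPath.
    print(choose_locator({"xpath": "//form/button", "name": "login"}))
```

Keeping the order in one list means the team's preference can be changed in one place without touching the tests.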

XPath Expressions versus CSS Selectors

Out of all the locator strategies, XPath Expressions is the most powerful one, but CSS Selectors is the recommended one.

Below is the reason for recommending CSS Selectors over XPath Expressions:

CSS Selectors are faster at identifying GUI elements and hence improve the performance of your Selenium automation scripts.

But practically speaking, the performance difference between CSS Selectors and XPath Expressions in identifying GUI elements is negligible. Hence this is not the main reason for recommending CSS Selectors over the more powerful XPath Expressions.

Below is the reason why XPath Expressions are more powerful than CSS Selectors:

XPath Expressions can traverse both backward and forward in the HTML code to identify GUI elements, whereas CSS Selectors can only traverse forward, i.e. using XPath Expressions, we can locate a child element using the parent element's unique id, name etc., and we can also locate a parent element using the child element's unique id, name etc. CSS Selectors cannot locate a parent element using the child element's unique id, name etc. Hence XPath Expressions are more powerful than CSS Selectors in identifying GUI elements.
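
The forward and backward traversal can be tried out without a browser. The sketch below uses Python's standard xml.etree.ElementTree module, which supports only a limited subset of XPath and is used here purely as a stand-in for the browser's XPath engine. The page markup and the ids are made up for the example.

```python
import xml.etree.ElementTree as ET

PAGE = """
<html>
  <body>
    <form id="login-form">
      <input id="username"/>
      <input id="password"/>
    </form>
  </body>
</html>
"""

root = ET.fromstring(PAGE)

# Forward: from the parent's unique id down to a child element.
child = root.find(".//form[@id='login-form']/input[@id='password']")

# Backward: from the child's unique id up to its parent with "..".
parent = root.find(".//input[@id='username']/..")

if __name__ == "__main__":
    print(child.get("id"))
    print(parent.tag, parent.get("id"))
```

The second expression is the kind of lookup that the text above says a CSS Selector cannot express: it starts from the child and ends at the parent.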

Going by the above explanation, XPath Expressions should be recommended over CSS Selectors. But the truth is, in some cases XPath Expressions are recommended over CSS Selectors, and in other cases CSS Selectors are recommended over XPath Expressions. Now, let's find out why:

XPath Expressions are not suitable if you want your automation scripts to run on the Internet Explorer browser. When XPath Expressions are used, Selenium may not be able to identify a few GUI elements on web pages rendered in Internet Explorer. Hence, if your requirement is to run the Selenium automation scripts on Internet Explorer, CSS Selectors are recommended over XPath Expressions. But if the CSS Selectors are not able to identify the GUI elements, then the more powerful XPath Expressions need to be used.

But if your requirement is to run your automation scripts on the Firefox and Chrome browsers only, then XPath Expressions are a better choice than CSS Selectors, as they are more powerful.

Hence, in the case of Internet Explorer, CSS Selectors are recommended over XPath Expressions. But if the list of browsers for running the automation scripts doesn't include Internet Explorer, then the more powerful XPath Expressions can be used instead of CSS Selectors.

Add-ons required for generating Locators

Though we can manually create XPath Expressions and CSS Selectors by understanding their syntax, the Firefox browser provides a few add-ons which can be installed and used for any of the below:

Automatically generating the XPath Expressions / CSS Selectors
Helping us manually create the locators in a better way
Automatically generating the locators and the required Selenium statement

Firefox Add-ons: Firebug, FirePath and WebDriver Element Locator are a few add-ons which do this job.

4. Selenium Components

Selenium Tools Set

Selenium is not a single tool; it is a set of tools. Below are the four tools which we together call Selenium:

Selenium IDE
Selenium RC
Selenium WebDriver
Selenium Grid

We can also call this set of tools the Selenium components. It's not mandatory to use all these components to automate applications; instead we select them based on our application's requirements.

Selenium IDE

Selenium IDE is a 'Record and Playback' tool and is available for us to use in the form of a Firefox Add-on. So, before understanding the Selenium IDE, we have to first know about 'Record and Playback' tools.

In simple terms, Record and Playback tools are used to automate any application by recording the tests and playing back the recorded tests:

Recording the tests

Recording the tests is just like recording a video using a cellphone
Recording in a 'Record & Playback' tool is performed to create automated tests
Practically speaking, the 'Record' option of the 'Record & Playback' tool is turned on, and the activities performed on the application displayed in the browser are recorded until the 'Record' option is turned off.

Playing the recorded tests

Playback is just like playing the recorded video
Playback in a 'Record & Playback' tool is performed to execute the recorded automated tests
Practically speaking, the 'Play' option of the 'Record & Playback' tool is selected to execute the recorded automated tests
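
The record and playback cycle described above can be modelled with a toy class. This is only a sketch of the concept and not how Selenium IDE is implemented; the Recorder class and its method names are invented for this example.

```python
class Recorder:
    """Toy record-and-playback tool: stores actions while recording is on."""

    def __init__(self):
        self.recording = False
        self.steps = []

    def start(self):
        self.recording = True

    def stop(self):
        self.recording = False

    def perform(self, action, target):
        # An action is only captured while the 'Record' option is on.
        if self.recording:
            self.steps.append((action, target))

    def play(self, execute):
        # Playback re-issues every recorded step, in order, via `execute`.
        for action, target in self.steps:
            execute(action, target)


if __name__ == "__main__":
    recorder = Recorder()
    recorder.perform("click", "Home")  # ignored: recording is not on yet
    recorder.start()
    recorder.perform("type", "username")
    recorder.perform("click", "Login")
    recorder.stop()
    recorder.play(lambda action, target: print(action, target))
```

Only the steps performed between start() and stop() are replayed, which matches the description of turning the 'Record' option on and off.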

Advantages of using 'Record & Playback' tools

Very easy to automate tests - They can be used by non-programmers, as programming knowledge is not required for using 'Record & Playback' tools
Takes less time to automate tests - It won't take much time to automate tests using a 'Record & Playback' tool, as we simply record the actions performed on the application in the browser and later execute the recorded actions using the play option.
Suitable for less complex projects - 'Record & Playback' tools are very useful for automating projects of low complexity, as creating the automation tests is easier and takes less time when compared to other approaches.

Disadvantages of using 'Record & Playback' tools

Automation script maintenance is high - When any button, link or other part of the UI changes in the application, we need to re-record all the earlier recorded automation tests which are impacted by the change. But when we use programming to create automation scripts, this maintenance is reduced, as we update the change in a single file in the automation framework to which all the tests refer, instead of editing every impacted automation test. Hence, even though creating automation scripts using a 'Record & Playback' tool is easy, the maintenance of the recorded scripts will be high.
Re-usability is low - If we want to automate 100 tests, all of which need to be performed after logging into the application, then using a 'Record and Playback' tool we need to record the login functionality separately for all 100 tests. But if we use programming instead, then as part of the framework we can keep the reusable code for the login functionality in a single file and use the same code in all 100 tests. Hence re-usability is low when automating tests using a 'Record and Playback' tool.
Not suitable for complex applications - Due to the maintenance and re-usability problems of 'Record and Playback' tools, using them to automate test cases for complex applications is not recommended. Complex applications have a huge list of tests, and it is difficult to re-record such a list whenever anything changes in the application. Also, with a huge list of scenarios, the need for re-usability increases, which these tools cannot meet. Hence 'Record and Playback' tools are not suitable for complex applications.
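
The maintenance and re-usability points can be illustrated with a small sketch of the programming approach. Everything here is hypothetical: FakeApp stands in for the application under test, and the locator values, user name and function names are made up for the example.

```python
# Shared "framework" part: locators and login steps live in one place.
LOCATORS = {
    "username": ("id", "user"),
    "password": ("id", "pass"),
    "login_button": ("id", "login"),
}


def login(app, username, password):
    """Reusable login steps that every test calls instead of re-recording."""
    app.type(LOCATORS["username"], username)
    app.type(LOCATORS["password"], password)
    app.click(LOCATORS["login_button"])


class FakeApp:
    """Hypothetical stand-in for the application; it logs each action."""

    def __init__(self):
        self.log = []

    def type(self, locator, text):
        self.log.append(("type", locator, text))

    def click(self, locator):
        self.log.append(("click", locator))


def check_profile_page(app):
    login(app, "arun", "secret")
    app.click(("link text", "Profile"))


def check_orders_page(app):
    login(app, "arun", "secret")
    app.click(("link text", "Orders"))


if __name__ == "__main__":
    app = FakeApp()
    check_orders_page(app)
    for entry in app.log:
        print(entry)
```

If the id of the login button changes, only the LOCATORS dictionary needs an update, and both checks pick up the change. With recorded tests, each recording that includes the login steps would have to be redone.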

Selenium IDE

Selenium IDE, as mentioned earlier, is a 'Record and Playback' tool and is one of the 4 components of Selenium. Selenium IDE is provided in the form of a Firefox add-on / plug-in and is free to install on your Firefox browser.

"As part of learning Selenium, a basic understanding of how to use Selenium IDE is enough; it is not required to learn it in detail."

As you have now understood the usage of ‘Record and Playback’ tool, the following are the different posts using which you can explore the functioning of ‘Selenium IDE’:

Hence Selenium IDE is a 'Record and Playback' tool and is useful for creating automation tests for simpler applications in the easiest and fastest way. No programming knowledge is required for using this tool.