How to get the innerHTML of a whole page in Selenium Java driver?
Last Updated: 30 Sep, 2024
When automating web applications with Selenium WebDriver in Java, it's often necessary to retrieve the entire HTML content of a webpage. This can be useful for testing purposes, data extraction, or validating the structure of the page.
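Selenium WebDriver offers two straightforward ways to do this in Java: the getPageSource() method, which returns the source of the current page, and JavascriptExecutor, which can run a script that returns the innerHTML of an element such as the body. Once the HTML is available as a string, you can analyze or manipulate it programmatically.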
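As a rough illustration of what analyzing the retrieved HTML can look like, here is a minimal sketch that works on a plain string standing in for the value returned by getPageSource(). The class and method names (PageHtmlChecks, extractTitle, countOpeningTags) are hypothetical and not part of Selenium, and regular expressions are only a quick check, not a real HTML parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative, not part of Selenium.
class PageHtmlChecks {

    // Returns the trimmed text of the first <title> element, or null if none is found.
    static String extractTitle(String html) {
        Pattern p = Pattern.compile("<title[^>]*>(.*?)</title>",
                Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
        Matcher m = p.matcher(html);
        return m.find() ? m.group(1).trim() : null;
    }

    // Counts opening tags with the given name, e.g. "h2" (case-insensitive).
    static int countOpeningTags(String html, String tagName) {
        Pattern p = Pattern.compile("<" + Pattern.quote(tagName) + "(\\s[^>]*)?>",
                Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(html);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Stand-in for the string returned by driver.getPageSource()
        String html = "<html><head><title>Sample Page</title></head>"
                + "<body><h1>Welcome</h1><div><h2>Section 1</h2></div>"
                + "<div><h2>Section 2</h2></div></body></html>";
        System.out.println("Title: " + extractTitle(html));
        System.out.println("h2 count: " + countOpeningTags(html, "h2"));
    }
}
```

For anything beyond simple checks like these, a dedicated HTML parser is usually a safer choice than regular expressions.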
Prerequisites
You will need three things:
- Java Development Kit (JDK) installed.
- Browser Driver (e.g., ChromeDriver for Chrome).
- IDE like Eclipse or IntelliJ IDEA.
Dependencies for Selenium
You also need the Selenium dependency. In a Maven project, add it to the pom.xml file.
pom.xml
XML
<dependencies>
    <!-- Selenium Java Dependency -->
    <dependency>
        <groupId>org.seleniumhq.selenium</groupId>
        <artifactId>selenium-java</artifactId>
        <version>4.21.0</version>
    </dependency>
</dependencies>
Example
index.html
HTML
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Sample Page</title>
</head>
<body>
    <h1>Welcome to My Website</h1>
    <p>This is a simple paragraph on my web page.</p>
    <div>
        <h2>Section 1</h2>
        <p>This is some content in section 1.</p>
    </div>
    <div>
        <h2>Section 2</h2>
        <p>This is some content in section 2.</p>
    </div>
</body>
</html>
Application.java
Java
import org.openqa.selenium.JavascriptExecutor;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

public class Application {
    public static void main(String[] args) {
        // Set the path to your ChromeDriver executable.
        // (Optional with Selenium 4.6+, where Selenium Manager locates the driver automatically.)
        System.setProperty("webdriver.chrome.driver", "path_to_chromedriver");

        // Initialize ChromeDriver
        WebDriver driver = new ChromeDriver();
        try {
            // Load the local HTML file
            driver.get("file:///path_to_your_html_file/index.html");

            // Option 1: Get the entire page source
            String pageSource = driver.getPageSource();
            System.out.println("Page Source:");
            System.out.println(pageSource);

            // Option 2: Get innerHTML of the body using JavascriptExecutor
            JavascriptExecutor js = (JavascriptExecutor) driver;
            String bodyInnerHTML = (String) js.executeScript("return document.body.innerHTML;");
            System.out.println("\nBody InnerHTML:");
            System.out.println(bodyInnerHTML);
        } finally {
            // Close the browser and end the WebDriver session
            driver.quit();
        }
    }
}
GetInnerHTMLExample.java
Java
import org.openqa.selenium.JavascriptExecutor;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

public class GetInnerHTMLExample {
    public static void main(String[] args) {
        // Set the path to your ChromeDriver executable.
        // (Optional with Selenium 4.6+, where Selenium Manager locates the driver automatically.)
        System.setProperty("webdriver.chrome.driver", "path_to_chromedriver");

        // Initialize ChromeDriver
        WebDriver driver = new ChromeDriver();
        try {
            // Load a live web page
            driver.get("https://www.geeksforgeeks.org");

            // Option 1: Get the entire page source
            String pageSource = driver.getPageSource();
            System.out.println("Page Source:");
            System.out.println(pageSource);

            // Option 2: Get innerHTML of the body using JavascriptExecutor
            JavascriptExecutor js = (JavascriptExecutor) driver;
            String bodyInnerHTML = (String) js.executeScript("return document.body.innerHTML;");
            System.out.println("\nBody InnerHTML:");
            System.out.println(bodyInnerHTML);
        } finally {
            // Close the browser and end the WebDriver session
            driver.quit();
        }
    }
}
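The examples above read document.body.innerHTML, which covers only the body. To get the innerHTML of the whole document, the script can target document.documentElement instead. The sketch below keeps the two scripts as constants and wraps them in a hypothetical helper (WholePageHtml is not part of Selenium). It takes a Function<String, Object> as a stand-in for JavascriptExecutor.executeScript, an assumption made only so the sketch compiles and runs without a browser.

```java
import java.util.function.Function;

// Hypothetical helper; Function<String, Object> stands in for a call to
// JavascriptExecutor.executeScript so the sketch runs without Selenium.
class WholePageHtml {

    // Markup inside the root <html> element (head and body, without the <html> tag itself).
    static final String INNER_HTML_SCRIPT = "return document.documentElement.innerHTML;";

    // The same markup, including the <html> tag and its attributes.
    static final String OUTER_HTML_SCRIPT = "return document.documentElement.outerHTML;";

    static String innerHtml(Function<String, Object> runScript) {
        return (String) runScript.apply(INNER_HTML_SCRIPT);
    }

    static String outerHtml(Function<String, Object> runScript) {
        return (String) runScript.apply(OUTER_HTML_SCRIPT);
    }

    public static void main(String[] args) {
        // Fake script runner, used only to show the call shape without a browser.
        Function<String, Object> fake = script ->
                script.contains("outerHTML")
                        ? "<html><head></head><body></body></html>"
                        : "<head></head><body></body>";
        System.out.println(innerHtml(fake));
        System.out.println(outerHtml(fake));
    }
}
```

With a real driver, the runner could be a lambda such as script -> ((JavascriptExecutor) driver).executeScript(script), placed inside the try block of the examples above.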
Conclusion
Retrieving the HTML of a whole page in Selenium Java is simple: getPageSource() returns the page source, and JavascriptExecutor can run a script that returns the innerHTML of the body or of any other element. Both are particularly useful for debugging or validating the HTML during test automation, and the same code works across the browsers that Selenium WebDriver supports.
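For debugging, it can help to keep a copy of the retrieved HTML on disk. The following is a minimal sketch, assuming Java 11 or later (for Files.writeString) and using a hard-coded string in place of the value returned by getPageSource(). The HtmlSnapshot class and its save method are hypothetical names.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for saving retrieved HTML; not part of Selenium.
class HtmlSnapshot {

    // Writes the HTML to the given file as UTF-8, replacing any existing content.
    static Path save(String html, Path target) throws IOException {
        return Files.writeString(target, html, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the string returned by driver.getPageSource()
        String html = "<html><body><h1>Welcome to My Website</h1></body></html>";
        Path file = Files.createTempFile("page-snapshot", ".html");
        save(html, file);
        System.out.println("Saved " + Files.size(file) + " bytes to " + file);
    }
}
```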