How to parse HTML in Ruby?

Last Updated : 23 Jul, 2025
Author: boora_harsha_vardhan

Many languages can parse HTML files. In Python, for example, HTML can be parsed with the pandas library or with Beautiful Soup, which is mainly used for web scraping. Similarly, Ruby can parse HTML with a library called Nokogiri, which makes working with HTML documents much easier.

Nokogiri is distributed as a gem and must be installed before it can be used. Run the following command to install it:

    gem install nokogiri

Table of Content
1. Extracting the tags from the HTML File
2. Extracting the tags from the URL
Conclusion

1. Extracting the tags from the HTML File

In this program, we parse an HTML string using the Nokogiri library. We pass the string to the parse method, which returns a document object, and then read the title of the HTML through that object's title method.

Ruby

    # Import the Nokogiri library
    require 'nokogiri'

    # The HTML text to parse
    html_text = "<title>MyFirstWebSite</title>"

    # Parse the HTML text into a Nokogiri document
    html_doc = Nokogiri::HTML.parse(html_text)

    # Print the title of the HTML document
    puts html_doc.title

Output:

    MyFirstWebSite

Program Explanation: In the above program we first import the Nokogiri library. We then create a string containing HTML tags and pass it to Nokogiri's parse() method. Finally, we print the title of the HTML text by calling title on the parsed document object.

2. Extracting the tags from the URL

In this program we use open-uri to read the HTML from a URL, parse it with Nokogiri, and extract the title of the page.

Let's consider an example page at https://newpage.com/ with the following content:

    <html>
      <head>
        <title>MyFirstWebSite</title>
      </head>
      <body>
        <h1> Hi </h1>
      </body>
    </html>

Program:

Ruby

    require 'nokogiri'
    require 'open-uri'

    # Read the HTML from the URL, parse it, and print the page title.
    # URI.open is used because Kernel#open no longer fetches URLs in Ruby 3.0 and later.
    puts Nokogiri::HTML.parse(URI.open('https://newpage.com/')).title

Output:

    MyFirstWebSite

Program Explanation: In the above program we import Nokogiri together with Ruby's open-uri module. The open method provided by open-uri reads the entire content available at the URL, and that content is passed to Nokogiri's parse method. Finally, we print the title of the HTML page from the parsed document.

Conclusion: HTML is usually parsed in order to scrape data from the web. Web scraping has become an important part of data science, where it feeds into data wrangling, a task often done in Python. In Ruby, libraries such as Nokogiri and open-uri make it easy to read and extract data from HTML files, URLs, and HTML strings.