Project Final PDF

The document describes a mini project submitted to fulfill requirements for a Master of Science in Computer Science degree from Periyar University in Salem, India by G. Menaga. It involves developing a PDF to audiobook converter using Python libraries like pyttsx3 and PyPDF2. The project aims to provide an alternative way for blind, lazy, or busy readers to access PDF content by converting it to audio format.


PDF TO AUDIO BOOK CONVERTER

A Mini Project submitted in partial fulfillment of


the requirements for the degree of

Master of Science in Computer Science


to the

Periyar University, Salem-11
By
G. MENAGA
Reg No: C21PG185CSC005

ADHIYAMAN ARTS AND SCIENCE COLLEGE FOR WOMEN, UTHANGARAI.

(Affiliated to Periyar University, Salem-11)

Recognized under Sec 2(f) & 12(B) of the UGC Act, 1956


Srinivasa Nagar, Uthangarai-635 207
2022
Ms. N. JEEVANANTHINI, M.Sc., M.Phil., MCA
HEAD & ASSISTANT PROFESSOR,
DEPARTMENT OF COMPUTER SCIENCE AND APPLICATIONS,
ADHIYAMAN ARTS AND SCIENCE COLLEGE FOR WOMEN, UTHANGARAI.

Place: Uthangarai.
Date:

CERTIFICATE
This is to certify that the Mini Project Work using Open Source entitled
“PDF TO AUDIO BOOK CONVERTER”, submitted in partial fulfillment of
the requirements for the degree of Master of Science in Computer Science to
Periyar University, Salem, is a record of bonafide work carried out by
G. MENAGA (Reg. No. C21PG185CSC005) under my supervision and
guidance.

HEAD OF THE DEPARTMENT INTERNAL GUIDE

Date of Viva-voce ………………….

INTERNAL EXAMINER EXTERNAL EXAMINER


ACKNOWLEDGEMENT

The completion of this project brings a sense of satisfaction, but it would be incomplete
without thanking the people who were responsible for its successful completion. First and
foremost, I wish to express my thanks to Almighty God, who helped me throughout the
preparation of this project.
My sincere thanks to our Management for having given all support throughout the
period of the project. I sincerely acknowledge my deep sense of gratitude to the Principal of
Adhiyaman Arts and Science College for Women, Dr. S. Thirumal Murugan, M.Sc.,
M.Ed., Ph.D., for his encouragement to carry out this project.

I convey my most sincere and heartfelt thanks to the Head of the Department of Computer
Science & Applications, Ms. N. Jeevananthini, M.C.A., M.Phil., for her guidance and support
in finishing this project successfully.

I also wish to express my sincere gratitude to our college, Adhiyaman
Arts and Science College for Women, Srinivasa Nagar, Uthangarai, for providing me an
opportunity to accomplish the project.

I thank my family members and friends for their immense encouragement and for the help
they rendered to me in many ways to complete the project. I am thankful to
all those who have directly or indirectly helped me in the successful completion of this project.

G.MENAGA
CONTENT
S.NO TITLE

1. INTRODUCTION
1.1 SYSTEM CONFIGURATION
1.1.1 HARDWARE CONFIGURATION
1.1.2 SOFTWARE CONFIGURATION

2. SYSTEM STUDY
2.1 EXISTING SYSTEM
2.2 FEATURES
2.3 PROPOSED SYSTEM
2.4 PROJECT DESCRIPTION

3. SYSTEM DEVELOPMENT

4. IMPLEMENTATION
4.1 SOURCE CODE
4.2 SAMPLE INPUT
4.3 SAMPLE OUTPUT

5. CONCLUSION
PDF TO AUDIOBOOK CONVERTER

ABSTRACT
This project proposes a PDF to Audio Converter. It provides an
alternative way to access books and any other PDF file for the blind, for lazy readers, and for others. Using the
PDF to Audio Converter, users can listen to their favorite PDF while going about
their daily routine.

The application can be used to read any PDF. It converts the text of the PDF to audio
using Tkinter and Python files, functions, and definitions.

Tkinter is the Python binding to the Tk toolkit, which is used from many programming
languages for building graphical user interfaces (GUIs).

The main packages used in this audiobook converter are the pyttsx3 and PyPDF2 libraries.
pyttsx3 is a Python library for text-to-speech, and PyPDF2 is a Python library built as a PDF toolkit.
The PDF to Audio Converter extracts the text from the PDF selected by the user and converts it to
audio so that it can be read out loud.

It also lets the user take down important notes and supports save, cut,
copy, and paste.

Index Terms -
Python, pyttsx3, PyPDF2, Text to Speech, Converter, Tkinter.
1. INTRODUCTION

Text-to-speech and related read-aloud tools are being widely adopted in an attempt
to support students’ reading comprehension. The PDF-to-audio system is a screen reader
application designed and built for effective audio communication.
PDF was designed to present and exchange documents reliably. It is an open
standard document format used globally and maintained by the International Organization for
Standardization (ISO).
The format is one of the most convenient means of electronic
communication and of exchanging information.
The PDF-to-audio system reads the text on screen aloud, with
support for many languages. The PDF to Audio Converter project provides an alternative way to
access PDF books for the blind, for lazy readers, and for others.
The application converts text from a PDF to audio using
Python libraries such as pyttsx3 and PyPDF2, together with Tkinter for the interface.
In today's busy routine, people do not have time to pick up a book and spend time
reading it; instead, they need an alternative way to access the content. A person who is
traveling may not be able to read a book, but he/she can listen to it.
Using the PDF to Audio Converter, users can listen to their favorite PDF while carrying on
with their daily routine.
1.1 SYSTEM CONFIGURATION

1.1.1 HARDWARE CONFIGURATION

• PROCESSOR : 11th Gen Intel(R) i3-1115G4 @ 3.00GHz
• RAM : 8 GB
• MEMORY SPACE : Minimum 32 MB, recommended 64 MB
• HARD DISK
• INPUT : Keyboard, Mouse
• MOUSE : Touch Pad
• ETHERNET CONNECTION : Wireless adapter (Wi-Fi)

1.1.2 SOFTWARE CONFIGURATION

• OPERATING SYSTEM : Windows 11
• CODING LANGUAGE : Python
• PYTHON : Version 3.11
• TOOL : IDLE, PyCharm

1.1.3 SOFTWARE SPECIFICATION

PYTHON LANGUAGE
Python is a high-level, general-purpose programming language with a large
collection of packages. It is dynamically typed and garbage-collected.
Python was conceived in the late 1980s by Guido van Rossum at Centrum
Wiskunde & Informatica (CWI) in the Netherlands as a successor to the ABC programming
language (itself inspired by SETL), capable of exception handling and of interfacing with
the Amoeba operating system. Its implementation began in December 1989.
It interfaces with many operating system calls and libraries and can be extended in C or
C++. Many large organizations use Python, including
NASA, Google, and YouTube.

The main characteristics of Python

➢ Multi-paradigm programming language
➢ Interpreted language
➢ Interactive language

PYTHON FEATURES

1. Easy to learn - Python has a small number of keywords, a simple structure, and a
well-defined syntax.
2. Easy to read - Python code is clearly laid out and easy on the eye.
3. Easy to maintain - Python source code is easy to maintain.
4. Broad standard library - the bulk of Python's library is portable and cross-platform
compatible on UNIX, Windows, and Macintosh.
5. It can be used as a scripting language or compiled to byte-code for large-scale
application development.
2. SYSTEM STUDY:

Background Study:
On July 1, 2008, ISO published PDF as the open standard ISO 32000-1:2008.
Responsibility for the specification then passed to an ISO committee of
industry experts. In 2008, Adobe also published a Public Patent License to ISO 32000-1,
granting royalty-free rights to all Adobe-owned patents necessary to make, use, sell,
and distribute PDF-compliant implementations. Adobe Systems had made the PDF
specification freely available as early as 1993. In its early years, PDF competed with
formats such as Common Ground Digital Paper and Farallon Replica, as well as Adobe's
own PostScript format. In this project, the user
provides the location of a PDF file, which is then read aloud; this is a simple way to
combine various Python modules.

2.1 EXISTING SYSTEM:

During the development of the research study, the following limitations were
encountered:

1. Limited research material available in the school library and on the internet.
2. High cost of setting up the system, as it requires a high-level programming language.
3. Combining school work with carrying out the research.

2.1.1 DRAWBACKS:

➢ The aim is to create a PDF to Audiobook Converter using Tkinter and Python
files, functions, and definitions.
➢ The main packages used in this audiobook converter are pyttsx3 and PyPDF2.
➢ pyttsx3 is a Python library for text-to-speech. It is what lets
the machine speak to us.
➢ It reads the selected PDF aloud in a suitable voice, with adjustable reading speed
and volume. The application also contains a notepad.
➢ PyPDF2 extracts the text from the PDF. It is a pure-Python library built as a
PDF toolkit.
➢ It can extract document information and split and merge documents page by
page.

2.1.2 DEFINITION OF TERMS

▪ PDF – Portable Document Format.
▪ Audio – Sound, most commonly as it is transmitted in signal form.
▪ Speech – The expression of, or the ability to express, thoughts and feelings by articulate
sounds.
▪ Oral – Relating to the transmission of information or literature by word of mouth
rather than in writing.
▪ Reading disabilities – A condition in which a person has difficulty reading.
▪ File – A collection of data treated as a unit by a computer.
▪ Document – A written, drawn, presented, or memorialized
representation of thought.

2.2 FEATURES:
The basic feature of our project is converting a .pdf file into a .mp3 file. For
this, we have used two libraries, pyttsx3 and PyPDF2.

1. pyttsx3 – A Python library for text-to-speech. It has many functions that
let the machine speak to us.

2. PyPDF2 – It handles the text from the PDF and is capable of extracting
document information, splitting documents page by page, merging documents page by
page, etc.

• First, the program imports the PyPDF2 and pyttsx3 modules.
• To open the PDF file, we used PdfFileReader().
• To select the page to be read, we used the getPage() method.
• To extract the text from the page, we used extractText().
• pyttsx3 then converts the text to speech.
• From the pyttsx3 engine object, we used the say() and runAndWait() methods to speak out
the text.
We have added a few extra features to make our project more user-friendly, such as a voice
changer, speed and volume controls, and a notepad.
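
The steps listed above can be sketched as follows. This is a minimal sketch, not the project's actual listing: the function names normalize_text and read_page_aloud are hypothetical, and it assumes a recent PyPDF2 release, where PdfReader, reader.pages and extract_text() replace the older PdfFileReader(), getPage() and extractText() names used in this report. The two libraries are imported inside the function so that the text-cleaning helper can be used on its own.

```python
import re


def normalize_text(raw: str) -> str:
    """Collapse the line breaks and repeated spaces that PDF extraction leaves behind."""
    return re.sub(r"\s+", " ", raw).strip()


def read_page_aloud(pdf_path: str, page_number: int) -> str:
    """Speak one page (numbered from 1) of a PDF and return the text that was spoken."""
    # Imported here so normalize_text() works even without the libraries installed.
    import pyttsx3
    from PyPDF2 import PdfReader  # newer name for PdfFileReader

    reader = PdfReader(pdf_path)
    page = reader.pages[page_number - 1]               # replaces getPage()
    text = normalize_text(page.extract_text() or "")   # replaces extractText()

    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()  # blocks until the speech has finished
    return text
```

A call such as read_page_aloud("book.pdf", 1) would read the first page; the file name here is only a placeholder.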

OTHERS

❖ Minimum time required.
❖ It is easy to read.
❖ It is easy to understand.
❖ It can also be used for reading practice.
❖ The PDF to audiobook converter can be used anywhere.

2.3 PROPOSED SYSTEM:

In today's busy routine, people do not find time to read a
book, or to convert a PDF file into an MP3 file using third-party applications or web
applications.
I have a directory in which I store PDF books that I plan on reading, but I never do.
So I thought: why not make them audiobooks and listen to them while I do
something else?
In this system, we are developing a GUI application using Python to convert the PDF file into
audio and read it out to the user.

The application is user-friendly, as it does not require a separate audio file or MP3 player. The
user only has to select the PDF file to listen to.
2.4 PROJECT DESCRIPTION:
In the current generation, students, researchers, and authors do not find
time to read a book on an electronic device, and doing so might strain their eyes. We
have therefore designed an application that extracts the text from the selected PDF and reads it out to
the user.

Best free PDF to Audio converter software and online tools

Here is the list of tools considered for converting PDF to audio:

1. Balabolka
2. Online-Convert
3. Zamzar
4. AnyPDF

1. Balabolka

Balabolka is a free portable tool that lets you convert PDF to audio files.
It lets you convert PDF to MP3, WAV, WMA, OGG, MP4, M4A, M4B, OPUS,
and AWB format audio files. One of its unique features is that you can use its split and
convert feature to convert a PDF file by size of text blocks, lines where all letters are capital,
by named bookmarks, etc.

This tool also lets you open multiple PDFs in different tabs and then you can listen to its text
content or simply create the audio files of those PDFs. In addition to that, you can also
change the speech type (3 different speeches are available) and set the pitch, rate, and volume
of the selected speech.

Apart from converting PDF to an audio file, this tool can be used to view and read the text
from ODP, Markdown file, TXT, RTF, DOCX, DOC, XLSX, DJVU, XLS, EPUB, FB2, and
other supported format files. So the tool is feature-rich and works quite well.

Now let’s check how to use this tool for converting PDF to audio. Once you have
downloaded this tool and opened a PDF file on its interface, follow these steps:

1. Click on the SAPI 5 tab at the top left of the interface.
2. Select a speech from the drop-down menu.
3. Set the pitch, volume, and rate.
4. Listen to the text content to find out whether the audio is good enough for you.
5. Click on the Save Audio File button just below the Speech menu.
6. A Save Audio File window will pop up.
7. There, select the audio format.
8. Give a name to the audio file.


2. Online-Convert

Online-Convert is an all-in-one service for different conversion types. It offers
multiple converters, such as an archive converter, eBook converter, video converter,
image converter, etc. A PDF converter tool is also available, with which you can convert
PDF to MP3 files.

With its free registered plan, you can upload a maximum of three PDF files per
conversion. However, the free plan has limitations: you can upload a PDF file of at most
100 MB, and you can convert only 3 PDF files in 24 hours. If these limitations
do not bother you, then this service is good to use.
3. Zamzar

The Zamzar service is similar to Online-Convert in that it also provides many
conversion types. It has a document converter, audio and video converter tools, an eBook converter,
an archive converter, etc. Hundreds of file formats are supported for conversion, and PDF to
MP3 conversion is also possible.

The features are pretty good, but the free plan is very limited. On the free plan, you can upload a
PDF file of at most 50 MB (less than 50 MB for 2 PDF files), and only 2 conversions are
supported per day.

To convert PDF to MP3, open the Zamzar homepage. After that, click on the Choose Files button to
add PDF files from your computer. You can also use the drop-down icon next
to the Choose Files button to add a PDF document from your OneDrive, Dropbox, Box,
or Google Drive account.
Once the PDF file is added, click on the Convert To drop-down menu and select the mp3
option in the Audio Formats section. Then press the Convert Now button.
The service does the rest of the work: it uploads the added PDF file(s), runs
the conversion, and provides download links for the output audio
file. The website does not provide options to set pitch, volume, or speech, but the output
is good.

4. AnyPDF

AnyPDF offers a very simple PDF to MP3 converter tool. It does not mention
any size limit or any limit on the number of conversions per day. However, based on my testing, I found
that the tool is good only for basic PDFs that contain just text. While the other tools covered
here were able to convert the other PDF files that I used for testing, AnyPDF showed errors in
processing such PDFs.
To use this PDF to MP3 converter tool, go to the any-pdf.com service. After that,
click on the Choose files button to add a PDF file from your desktop. Once done, hit
the Convert button. When the file is processed, click on the DOWNLOAD button to store
the output MP3 on your computer.
Steps to Build Python Pdf to Audio Converter Project
Below is the list of steps for converting PDF text to speech, and audio to a PDF
file, using Python:

1. Import modules
2. Write the constructor of the Application class
3. Functions to draw the main frame
4. Function to delete a frame
5. Speech-to-text function
6. Text-to-speech function
7. Conversion functions
8. Functions for reading and clearing files
9. Main function
Modules

• os – This module interacts with the operating system.
• tkinter – Tkinter is the standard Python interface for
creating a graphical user interface.
• from tkinter import messagebox – The message box is imported separately for showing
messages on the screen.
• speech_recognition – This library converts speech from an audio file into text.
• win32com.client – The win32com.client module provides access to automation
objects.
3. SYSTEM DEVELOPMENT:

Definition

The objective of this project is to create a GUI-based PDF text to Audio and
Audio to PDF converter. To build it, you will need an intermediate understanding of the Tkinter,
os, path, SpeechRecognition, and PyPDF4 libraries and a basic understanding of the pydub
and pyttsx3 libraries and the messagebox module.

pdf_to_audio()

In this method, which converts the text of one page of a PDF file to speech,
the GUI window is a Toplevel widget with two entry fields: the first asks for
the path to the PDF file, and the second asks for the number of the page from which the text
should be read.

speak_text()

The button in the window passes the values of these two entry fields as arguments
to the speak_text() function, which is described below.

audio_to_pdf()

In this method, which converts the speech of an audio file to text in a PDF
file, the window has two entry fields: the first asks for the path to an audio file, which
must be in .wav or .mp3 format, and the second takes the path to a PDF file.

speech_recognition()

The button in this window passes the two paths as arguments
to the speech_recognition() function, where the audio file is transcribed.
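
The transcription step described above might look like the following sketch. It is an illustration under stated assumptions rather than the project's own code: the helper names audio_extension and transcribe are hypothetical, MP3 input is first converted to WAV with pydub (which needs ffmpeg installed) because AudioFile reads WAV, AIFF and FLAC files, and recognize_google() sends the audio to an online service, so an Internet connection is required.

```python
import os


def audio_extension(path: str) -> str:
    """Return the lower-case extension without the dot, for example 'mp3'."""
    return os.path.splitext(path)[1].lstrip(".").lower()


def transcribe(audio_path: str) -> str:
    """Transcribe a .wav or .mp3 file to text."""
    # Imported here so audio_extension() can be used without these libraries.
    from pydub import AudioSegment
    from speech_recognition import AudioFile, Recognizer

    extension = audio_extension(audio_path)
    if extension not in ("wav", "mp3"):
        raise ValueError("Only .wav and .mp3 files are supported")

    wav_path = audio_path
    if extension == "mp3":
        # Write a WAV copy next to the MP3 so that AudioFile can read it.
        wav_path = os.path.splitext(audio_path)[0] + ".wav"
        AudioSegment.from_mp3(audio_path).export(wav_path, format="wav")

    recognizer = Recognizer()
    with AudioFile(wav_path) as source:
        audio = recognizer.record(source)  # read the whole file into memory
    return recognizer.recognize_google(audio)
```

Using os.path.splitext keeps the extension check correct for file names that contain more than one dot.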

@staticmethod

@staticmethod is a decorator that converts a method indented under a class definition
into an autonomous function that is not bound to an instance of the class.
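
As a small illustration of the decorator, consider the sketch below. The class Converter and both of its methods are hypothetical and exist only for this example.

```python
class Converter:
    """Hypothetical class used only to illustrate @staticmethod."""

    @staticmethod
    def to_zero_based(page: str) -> int:
        # No 'self' parameter: the method does not touch instance state.
        return int(page) - 1

    def describe(self, page: str) -> str:
        # A static method is still reachable through an instance.
        return f"reading index {self.to_zero_based(page)}"
```

The static method can be called on the class itself, as in Converter.to_zero_based("3"), or through an instance, which is how a GUI callback bound to self would reach it.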
The speak_text() method takes two arguments: the path to
the PDF file and the page number that the user wants the computer to read.

▪ First, we check that the file path and the page number
arguments are not empty. If either is empty, we stop the function from
going any further.

▪ Then, we initialize the pyttsx3 engine.

▪ After that, we open the PDF file in read mode, get
the page the user asked for, extract the text from that page,
and use the pyttsx3 engine to read the text aloud.
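
The first of these checks can be separated into a small helper, sketched below. The function name resolve_page_index and the extra range check against the page count are assumptions added for illustration; the report's own code only tests for empty fields.

```python
def resolve_page_index(filename: str, page: str, page_count: int) -> int:
    """Return the zero-based page index, or raise ValueError with a message for the user."""
    if not filename or not page:
        raise ValueError("One of the fields is missing")
    try:
        number = int(page)
    except ValueError:
        raise ValueError("The page must be a whole number") from None
    if not 1 <= number <= page_count:
        raise ValueError(f"The page must be between 1 and {page_count}")
    # Users count pages from 1, while PDF readers index them from 0.
    return number - 1
```

A caller could catch the ValueError and show its message in a message box before returning, which keeps the validation separate from the speech code.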

DESIGNING

Design patterns are an essential part of software engineering, as
they provide general, repeatable solutions to commonly occurring problems in software
design. They usually represent some of the best practices adopted by experienced
object-oriented software developers.

1. METHODOLOGY:
System Design
People these days do not have enough time to read a book or to use
internet services to convert a PDF file to an MP3 file. As a fellow bibliophile, I have a folder
where I save all of the books I would like to read but never get around to. As a result,
I reasoned, why not turn them into audiobooks so I could listen to them while doing other
work? A Python GUI program is being developed to convert the PDF file to audio
and play it back for the user.

No audio file or MP3 player is required for the application, making it easier for users to get
started. After selecting a PDF file, the user can begin listening. By contrast,
typical audiobook converters convert PDF text (or pictures) to speech and offer
volume control for a single voice (either male or female). When it comes to
changing the voice, the user has only one option. They provide the ability to play and pause the
audio. The pace of the voice is fixed.

Reading stories, essays, or any other text can be arduous, whereas listening to an audio
reading of the text is convenient and does not require as much concentration as reading
does. In this project, I implemented a simple PDF to audio converter. The code scans
the page(s) of a PDF and reads them aloud to the user.

2. PROJECT SCOPE
The conversion of PDF text to audio is in high demand because
PDF is the world's most popular document format. In addition to
instructional purposes, it may be used for a variety of other tasks, such as
automobile navigation, train station announcements, telecommunications response
services, and reading e-mail. PDF files cannot be read by people with visual
impairments, which is a big disadvantage. This study examines the problems of turning
PDF text into spoken word. One way to make PDF-based text come across more
accurately over an audio system is to make synthetic speech sound more
natural.

Literature Survey:
Before starting the project, we conducted a survey on why people are interested in
audiobooks rather than reading, and we also noted the reasons given by those who are not
interested in audiobooks. Most of the interested respondents said that listening to a book is
more convenient than reading it, and that audiobooks are very useful for people who can
understand the language but cannot read it: through an audiobook system they can listen to
the books they like even though they cannot read. An audiobook system is also useful for blind
people, who can listen to a book and understand or feel it rather than read it [4,5]. The main
reason given by people who do not like audiobooks is that reading a book, rather than
listening to it, improves their reading skills and so their command of the
language [6]. Some respondents said that they are happy not to have to carry a book
everywhere in case they suddenly feel like reading, and others said that they can
multitask, i.e., do some other work while listening to a book. After
reading the responses to the survey, we concluded that the audiobook system has more
advantages than disadvantages.

3. Problem statement:

Because PDF is the most widely used document format in the world, there
is a demand for PDF text to be converted to audio signals. In addition to instructional
purposes, such conversion may be used for a variety of other tasks, such as automobile
navigation, train station announcements, telecommunications response services, and reading
e-mail. PDF files are not accessible to people with poor vision, which is a severe
drawback. This project examines the difficulties of converting PDF text to
speech.

One challenge with PDF-based text is to make the audio output of synthetic speech
sound more natural. Our audiobook converter
converts PDF text (or pictures) into speech and adds a few refinements.
The voice can be changed: with a single button press, the user can
switch between male and female voices. The speaking rate can also be
changed (fast, normal, or slow).

The volume can be adjusted as well. Since the audiobook
converter turns the PDF into text, we have complete control over what to include in or
exclude from the final output. This additional capability, a built-in notepad, is new to
our audiobook converter.
It makes it easier for the user to add or remove lines: the
user can use the notepad to combine or delete sections of text as they usually would.
While the audio is playing, we can also open the relevant PDF or any other PDF.

ADVANTAGES OF THIS SYSTEM

• The application strengthens its users' listening skills.
• Users can listen to audiobooks with ease.
• User-friendly interface.
• No prior environment, such as Java or Python, is required to run the application.
• The application runs only through the command-line interface and the graphical user
interface.
• The project costs nothing to build.
• The user can get a text file as well as an audio file.

DISADVANTAGES OF THIS SYSTEM

• The user can choose between only two voices.
• While running in the GUI, the user has to kill the program in order to stop the audio.
• It works only on the Windows family; the application is not compatible with macOS and Linux.
TESTING
Let us see how to read a PDF aloud, that is, how to convert a textual PDF file into audio.
Packages Used:

• pyttsx3: A Python library for text-to-speech. It has many functions that
help the machine speak to us.

• PyPDF2: It extracts the text from the PDF. A pure-Python library built as a
PDF toolkit, it is capable of extracting document information, splitting
documents page by page, merging documents page by page, etc.

Both of these modules need to be installed:

pip install pyttsx3
pip install PyPDF2

# PDF to Audio Conversion in Python

# Install the required packages from the command prompt with the commands below:
# pip install gTTS
# pip install PyPDF2
# Note: PdfFileReader, getPage and extractText belong to PyPDF2 releases before 3.0.

import os  # used to open the saved audio file with the default player

import PyPDF2  # reads the PDF and extracts its text
from gtts import gTTS  # converts the text to speech (needs an Internet connection)

pdffile = open(r'F:\w2a.pdf', 'rb')  # input file; set this to the correct path

readfileip = PyPDF2.PdfFileReader(pdffile)  # create a PDF reader object

print(readfileip.numPages)  # print the number of pages in the PDF file

pageinfo = readfileip.getPage(0)  # create a page object for the first page

text = pageinfo.extractText()  # extract the text from the page

print(text)  # print the extracted text

pdffile.close()  # close the PDF file object

mytext = text  # the text that you want to convert to audio

language = 'en'  # language in which you want to convert

# Pass the text and language to gTTS. slow=False tells the module
# to read the text at normal speed rather than slowly.
audiofile = gTTS(text=mytext, lang=language, slow=False)

audiofile.save("audiofile.mp3")  # save the converted audio as audiofile.mp3

os.system("audiofile.mp3")  # play audiofile.mp3 (on Windows this opens the default player)


OUTPUT
WORKFLOW:

1. When the user selects a PDF file, the PDF to audiobook converter begins the process of
converting it into an audiobook.
2. After selecting a PDF file, we type in the rate of speech, in words per minute, using the
keyboard.
3. In this way, we can set the number of words per minute that we would like to hear.
4. Next, we select the voice type, either a male voice or a female voice. We
can also control the volume of the speech produced.
5. We can also extract a .txt file using the command prompt, which saves the content
in text format.
6. The exit button can be used to exit the application.
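
Steps 2 to 4 of the workflow map onto the rate, volume and voice properties of a pyttsx3 engine. The sketch below shows one way to apply them; the function name apply_settings and the 50 to 400 words-per-minute bounds are assumptions made for illustration, not values taken from the project.

```python
def apply_settings(engine, rate_wpm: int, volume: float, voice_index: int) -> None:
    """Apply the rate, volume and voice chosen by the user to a pyttsx3-style engine."""
    # Illustrative bounds that keep a mistyped rate from producing unusable speech.
    engine.setProperty("rate", max(50, min(rate_wpm, 400)))
    # pyttsx3 expects a volume between 0.0 and 1.0.
    engine.setProperty("volume", max(0.0, min(volume, 1.0)))
    voices = engine.getProperty("voices")
    if voices:
        # Which index is a male or a female voice depends on the voices installed.
        engine.setProperty("voice", voices[voice_index % len(voices)].id)
```

With the real library, the engine would come from pyttsx3.init(), for example apply_settings(pyttsx3.init(), 150, 0.8, 0).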

WORKING
The core functionality of our project is converting a .pdf file into a .mp3 file.
Two main libraries, pyttsx3 and pdfminer, were used to accomplish this.

• PDFMiner is one of the most popular tools for extracting data from PDF files.
It concentrates entirely on extracting and analyzing text data. With
PDFMiner, you can obtain the location of text on a page, along with font
and line information.

• pyttsx3 is a Python library for converting text to speech. It works
offline, so it can be used without access to the Internet.

• PyPDF2 can extract document information and split and merge documents
page by page, among other things.

• The PyPDF2, pdfminer, and pyttsx3 modules are imported.

• Using pdfminer, the text is extracted from the PDF file and saved as a text file.

• The text is converted to an mp3 file using Google's text-to-speech module, gtts.

• Windows player is used to play the output.
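
The pipeline described above (extract the text with pdfminer, save it as a text file, then convert it to MP3 with gtts) can be sketched as follows. The function names output_paths and pdf_to_mp3 are hypothetical, the sketch assumes the pdfminer.six distribution, which provides pdfminer.high_level.extract_text, and gTTS needs an Internet connection.

```python
from pathlib import Path


def output_paths(pdf_path: str) -> tuple[Path, Path]:
    """Derive the .txt and .mp3 paths that sit next to the source PDF."""
    source = Path(pdf_path)
    return source.with_suffix(".txt"), source.with_suffix(".mp3")


def pdf_to_mp3(pdf_path: str, language: str = "en") -> Path:
    """Extract the text of a PDF, save it as a text file, and convert it to MP3."""
    # Imported here so output_paths() can be used without these libraries.
    from gtts import gTTS
    from pdfminer.high_level import extract_text

    text_path, audio_path = output_paths(pdf_path)
    text = extract_text(pdf_path)                 # text of every page in the PDF
    text_path.write_text(text, encoding="utf-8")  # keep a copy of the text
    gTTS(text=text, lang=language).save(str(audio_path))
    return audio_path
```

On Windows, the returned MP3 path could then be opened with the default player, as in the testing script.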


BLOCK DIAGRAM

FRONT-END AND BACK END


4. IMPLEMENTATION:

4.1 SOURCE CODE:

from tkinter import *
import tkinter.messagebox as mb
from path import Path
from PyPDF4.pdf import PdfFileReader as PDFreader, PdfFileWriter as PDFwriter
import pyttsx3
from speech_recognition import Recognizer, AudioFile
from pydub import AudioSegment
import os


# Initializing the GUI window
class Window(Tk):
    def __init__(self):
        super(Window, self).__init__()
        self.title("ProjectGurukul PDF to Audio and Audio to PDF converter")
        self.geometry('400x250')
        self.resizable(0, 0)
        self.config(bg='Burlywood')

        Label(self, text='ProjectGurukul PDF to Audio and Audio to PDF converter',
              wraplength=400, bg='Burlywood', font=("Comic Sans MS", 15)).place(x=0, y=0)
        Button(self, text="Convert PDF to Audio", font=("Comic Sans MS", 15), bg='Tomato',
               command=self.pdf_to_audio, width=25).place(x=40, y=80)
        Button(self, text="Convert Audio to PDF", font=("Comic Sans MS", 15), bg='Tomato',
               command=self.audio_to_pdf, width=25).place(x=40, y=150)

    # Sub-window that asks for a PDF path and the page to read aloud
    def pdf_to_audio(self):
        pta = Toplevel(self)
        pta.title('Convert PDF to Audio')
        pta.geometry('500x300')
        pta.resizable(0, 0)
        pta.config(bg='Chocolate')

        Label(pta, text='Convert PDF to Audio', font=('Comic Sans MS', 15),
              bg='Chocolate').place(relx=0.3, y=0)
        Label(pta, text='Enter the PDF file location (with extension): ', bg='Chocolate',
              font=("Verdana", 11)).place(x=10, y=60)
        filename = Entry(pta, width=32, font=('Verdana', 11))
        filename.place(x=10, y=90)
        Label(pta, text='Enter the page to read from the PDF (only one can be read): ',
              bg='Chocolate', font=("Verdana", 11)).place(x=10, y=140)
        page = Entry(pta, width=15, font=('Verdana', 11))
        page.place(x=10, y=170)
        Button(pta, text='Speak the text', font=('Gill Sans MT', 12), bg='Snow', width=20,
               command=lambda: self.speak_text(filename.get(), page.get())).place(x=150, y=240)

    # Sub-window that asks for an audio file and the PDF file to save its text in
    def audio_to_pdf(self):
        atp = Toplevel(self)
        atp.title('Convert Audio to PDF')
        atp.geometry('675x300')
        atp.resizable(0, 0)
        atp.config(bg='FireBrick')

        Label(atp, text='Convert Audio to PDF', font=("Comic Sans MS", 15),
              bg='FireBrick').place(relx=0.36, y=0)
        Label(atp, text='Enter the Audio File location that you want to read '
                        '[in .wav or .mp3 extensions only]:',
              bg='FireBrick', font=('Verdana', 11)).place(x=20, y=60)
        audiofile = Entry(atp, width=58, font=('Verdana', 11))
        audiofile.place(x=20, y=90)
        Label(atp, text='Enter the PDF File location that you want to save the text in '
                        '(with extension):',
              bg='FireBrick', font=('Verdana', 11)).place(x=20, y=140)
        pdffile = Entry(atp, width=58, font=('Verdana', 11))
        pdffile.place(x=20, y=170)
        Button(atp, text='Create PDF', bg='Snow', font=('Gill Sans MT', 12), width=20,
               command=lambda: self.speech_recognition(audiofile.get(), pdffile.get())).place(x=247, y=230)

    @staticmethod
    def speak_text(filename, page):
        if not filename or not page:
            mb.showerror('Missing field!',
                         'Please check your responses, because one of the fields is missing')
            return

        # Keep the file open while the requested page is being read
        with Path(filename).open('rb') as pdf_file:
            reader = PDFreader(pdf_file)
            page_to_read = reader.getPage(int(page) - 1)  # getPage counts pages from zero
            text = page_to_read.extractText()

        engine = pyttsx3.init()
        engine.say(text)
        engine.runAndWait()

    @staticmethod
    def write_text(filename, text):
        writer = PDFwriter()
        writer.addBlankPage(72, 72)
        pdf_path = Path(filename)
        # 'wb' replaces any existing file; appending a second PDF would corrupt it
        with pdf_path.open('wb') as output_file:
            writer.write(output_file)
            # The file is opened in binary mode, so the text must be encoded first.
            # Note: these bytes follow the PDF data and are not drawn on the page.
            output_file.write(text.encode('utf-8'))

    def speech_recognition(self, audio, pdf):
        if not audio or not pdf:
            mb.showerror('Missing field!',
                         'Please check your responses, because one of the fields is missing')
            return

        # splitext also handles file names that contain more than one dot
        audio_file_name, audio_file_extension = os.path.splitext(os.path.basename(audio))
        audio_file_extension = audio_file_extension.lower()
        if audio_file_extension not in ('.wav', '.mp3'):
            mb.showerror('Error!', 'The format of the audio file should only be either "wav" or "mp3"!')
            return

        source_file = audio
        if audio_file_extension == '.mp3':
            # AudioFile does not read MP3, so convert the file to WAV first
            audio_file = AudioSegment.from_file(audio, format='mp3')
            source_file = f'{audio_file_name}.wav'
            audio_file.export(source_file, format='wav')

        r = Recognizer()
        with AudioFile(source_file) as source:
            r.pause_threshold = 5
            speech = r.record(source)
        text = r.recognize_google(speech)
        self.write_text(pdf, text)


# Finalizing the GUI window
app = Window()
app.update()
app.mainloop()
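
The write_text method in the listing above adds only a blank page, because PyPDF4 is mainly a page-manipulation library. If the recognized text should actually appear on the page, one possible approach is sketched below. It is an assumption, not part of the project: it relies on the third-party fpdf2 package, and the names to_latin1 and save_text_as_pdf are hypothetical.

```python
def to_latin1(text):
    """Replace characters that the built-in PDF fonts cannot encode with '?'."""
    return text.encode("latin-1", "replace").decode("latin-1")


def save_text_as_pdf(text, pdf_path):
    """Write text onto an A4 page and save it (assumes fpdf2 is installed)."""
    from fpdf import FPDF  # imported lazily; provided by the fpdf2 package

    pdf = FPDF()
    pdf.add_page()
    pdf.set_font("Helvetica", size=12)
    # multi_cell wraps long text across lines; width 0 means "up to the right margin"
    pdf.multi_cell(0, 8, to_latin1(text))
    pdf.output(pdf_path)
```

With such a helper, speech_recognition could call save_text_as_pdf(text, pdf) in place of write_text. The Latin-1 conversion is a simplification; text in other scripts would need a Unicode font to be registered first.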
4.2 SAMPLE INPUT:
SAMPLE OUTPUT:
Future Scope:
As the project stands, it helps a person who is interested in learning English and comes
from a native-language background. At present it has four modules: speech-to-text conversion,
text-to-speech conversion, PDF-to-speech conversion, and image-to-PDF conversion. Bringing
these together on a single platform adds to the appeal of our project.

In the future, we want to extend this project with more modules that help coming
generations access a variety of library-related functions on this platform, making the
student library system easier to use. With present technology, conversions of this type have
a strong impact on this generation; voice assistants such as Alexa are one example of how
they make life smoother.

In the same manner, the conversions also play a major role. In the future, we
will put our project into practice as a virtual library where we can upload any book and get
its content as voice, so that we can gain knowledge just by listening. As everyone says, the
things we hear make a major impact.
FINAL OUTPUT:
CONCLUSION
Conclusion:

This mini project was developed to make PDF files easy to read. The code was seen to
perform well in reading straightforward PDF text files.
It should enable users to select the desired PDF, convert it to audio, and display the text,
so that the user can tell that a particular text has been read. It should also help students with
reading disabilities. The success of this research project is significant given the broad use of
audiobooks in literacy and library programs across the United States.
Teachers and school librarians may also use these findings as a rationale for adding
audiobooks to the list of reading strategies used successfully with struggling readers.
We are interested in future research on the use of audiobooks with struggling readers
who are younger and older than those who participated in the study, and on audiobook usage
with English Language Learners.
At this point, the code does not have a stop feature. I intend to add one and do more
interesting things by applying Machine Learning to the audiobook.
With the help of machine learning, we can add features that recognize the voice of
the user and carry out the function the user wants. This feature will mostly help
disabled users, such as the blind and the handicapped.
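
The missing stop feature mentioned above could be approached by speaking the text one sentence at a time and checking a stop flag between sentences. The sketch below is only one possible design under that assumption: split_sentences, speak_until_stopped, and pyttsx3_speaker are hypothetical names, and the sentence splitting is deliberately simple.

```python
import re
import threading


def split_sentences(text):
    """Split text after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def speak_until_stopped(text, speak, stop_event):
    """Speak sentence by sentence; return how many sentences were spoken."""
    spoken = 0
    for sentence in split_sentences(text):
        if stop_event.is_set():  # a Stop button would set this flag
            break
        speak(sentence)
        spoken += 1
    return spoken


def pyttsx3_speaker():
    """Build a speak() callable backed by pyttsx3 (imported lazily)."""
    import pyttsx3

    engine = pyttsx3.init()

    def speak(sentence):
        engine.say(sentence)
        engine.runAndWait()

    return speak
```

In the Tkinter application, speak_until_stopped would have to run in a worker thread (for example threading.Thread) so that the window stays responsive and a Stop button can call stop_event.set(). Speech then ends after the sentence currently being read.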
