Ashish Rokade
ON
“VOICE COMMANDER”
SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE
DEGREE OF
BACHELOR OF COMPUTER SCIENCE
BY
ASHISH SONYABAPU ROKADE
Seat No. CS102405078
CERTIFICATE
This is to certify that the project entitled "Voice Commander" is the bona fide work of
ASHISH SONYABAPU ROKADE of T.Y.B.Sc. (Sem V), bearing Seat
No. CS102405078, submitted in partial fulfillment of the requirements for the award
of the degree of BACHELOR OF SCIENCE in COMPUTER SCIENCE
from the University of Mumbai during the academic year 2024-2025.
External Examiner
I, Ashish Sonyabapu Rokade, hereby declare that the project entitled "Voice
Commander", submitted in partial fulfillment of the requirements for the award of the
degree of Bachelor of Science in Computer Science during the academic year
2024-2025, is my original work and has not formed the basis for the award of any
other degree, associateship, fellowship or any other similar title.
Place: Rasayani
Date:
ACKNOWLEDGEMENT
Thanking you,
ASHISH.S.ROKADE
TABLE OF CONTENTS
1. Introduction 1
1.1 General Theory 2
1.2 OBJECTIVES 3
1.3 Purpose 4
1.4 Implementation Overview 5
2. Literature Survey 8
2.1 Scope 9
2.2 Applicability 10
2.3 Survey of technology 11
5. Result Analysis 29
5.1 Results and discussion 30
5.2 Conclusion 51
1 INTRODUCTION
1.2 OBJECTIVES
The primary objective of Voice Commander is to create a sophisticated voice-
activated assistant designed to significantly enhance desktop productivity and user
convenience. This innovative assistant aims to provide users with hands-free
interaction capabilities across various applications and functionalities, allowing
them to perform a wide range of tasks efficiently and intuitively using voice
commands.
Key goals include:
1. Streamlined Task Management: Empower users to manage their schedules,
send messages, and control media playback seamlessly, reducing the need
for manual input and allowing for multitasking.
2. Enhanced Communication: Facilitate effortless communication through
voice dictation for messaging apps, enabling users to connect with others
without the need for typing.
3. Accessibility and Usability: Create an inclusive tool that enhances usability
for individuals with disabilities or those who prefer voice interactions,
promoting a more accessible computing environment.
4. Integration with Existing Applications: Ensure smooth integration with
popular applications such as WhatsApp, YouTube, and web browsers,
allowing users to perform familiar tasks in a new, efficient manner.
5. Real-time Information Access: Provide users with quick access to essential
information, including weather updates, internet speed checks, and
translations, enhancing their ability to stay informed.
6. Data Privacy and Security: Prioritize user privacy by implementing robust
security measures to protect personal data and ensure safe interactions with
the assistant.
1.3 Purpose
We can create a new function for greeting the user and use an if-else condition to
choose the greeting. For example, if the time is between 12:00 and 18:00, the
system will say "Good Afternoon". Along with this, we can add a welcome message,
e.g. "Welcome, what can I do for you?". After that, we have to install the speech
recognition package and then import it.
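A minimal sketch of such a greeting function is shown below; the function name wishMe and the exact greeting strings are illustrative, not taken verbatim from the project source:

import datetime
import pyttsx3

engine = pyttsx3.init()

def speak(audio):
    engine.say(audio)
    engine.runAndWait()

def wishMe():
    # Choose a greeting based on the current hour
    hour = datetime.datetime.now().hour
    if hour < 12:
        speak("Good Morning")
    elif 12 <= hour < 18:
        speak("Good Afternoon")
    else:
        speak("Good Evening")
    speak("Welcome, what can I do for you?")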
Define a new function for taking a command from the user. In it, create the speech
recognition Recognizer, choose the input source (the microphone), and set the
pause_threshold. Set the language for voice recognition in the query; we can use the
Google speech engine to convert your voice input to text.
We have to install and import some other packages like pyttsx3, wikipedia, etc.
Pyttsx3 helps us convert text to speech. If you ask for any information, the result
comes back in textual format, and we can convert it to voice very easily because we
have already defined a speak function in our code.
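As an illustration, a hedged sketch of how such a Wikipedia lookup might be handled (the command phrase and the two-sentence summary length are assumptions):

import wikipedia

def handle_wikipedia(query):
    # e.g. query = "wikipedia alan turing"
    topic = query.replace("wikipedia", "").strip()
    results = wikipedia.summary(topic, sentences=2)   # first two sentences of the article
    print(results)
    return results   # in the assistant, this text would be passed to the speak() function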
Next, we have to import the webbrowser package. In the same way, we can add
queries for many websites like Google, Instagram, Facebook, etc. The next task is to
play songs. It is the same as before: add a query for "play songs" and also add the
location of the songs folder so that the assistant plays songs from that folder only.
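A minimal sketch of these two kinds of commands; the songs folder path D:\Music is purely illustrative:

import os
import random
import webbrowser

def handle_query(query):
    if "open google" in query:
        webbrowser.open("https://www.google.com")
    elif "open instagram" in query:
        webbrowser.open("https://www.instagram.com")
    elif "play songs" in query:
        music_dir = r"D:\Music"                        # assumed location of the songs folder
        songs = os.listdir(music_dir)
        song = random.choice(songs)                    # pick one song at random
        os.startfile(os.path.join(music_dir, song))    # Windows-only, like the rest of the project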
2 LITERATURE SURVEY
2.1 Scope
The Voice Commander project aims to enhance user experience through advanced
voice recognition technologies that facilitate individualized interactions. As voice
assistants improve in their ability to differentiate between users’ voices, it becomes
essential for developers to navigate the complexities of voice interface design.
Brands must also understand the capabilities of various voice-enabled devices and
determine which integrations align best with their specific needs and objectives.
As the project develops, it will focus on future-proofing the voice assistant to keep
pace with user expectations and technological advancements. This involves
exploring future trends in voice technology and preparing for potential shifts in
user interaction paradigms. Ultimately, the Voice Commander project seeks to
create an innovative voice assistant that not only enhances productivity but also
adapts to the changing landscape of voice technology.
2.2 Applicability
2.3 Survey of Technology
2.3.1 Python
Python is a high-level, interpreted programming language; the entire Voice Commander project is written in Python.
2.3.2 Quepy
Quepy is a Python framework for transforming natural language questions into queries in a
database query language. It can be easily customized to different kinds of questions in
natural language and database queries, so with little coding you can build your own
system for natural language access to your database.
2.3.4 Pyttsx
Pyttsx stands for Python Text to Speech. It is a cross-platform Python wrapper for
textto-speech synthesis. It is a Python package supporting common text-to-speech
engines on Mac OS X, Windows, and Linux. It works for both Python2.x and 3.x
versions. Its main advantage is that it works offline.
Problem definition
Modern desktop interactions are often limited by manual input methods, which can
hinder productivity and multitasking. Users, especially those with physical
limitations or those needing to manage multiple tasks at once, lack an efficient,
hands-free solution. While mobile voice assistants are common, there is a gap in
similar desktop functionality.
Voice Commander solves this problem by offering a voice-activated assistant that
allows users to manage tasks, communicate, control applications, and retrieve
information through simple voice commands. This enhances productivity,
accessibility, and overall ease of use in desktop environments.
Personal assistant software is required to act as an interface into the digital world by
understanding user requests or commands and then translating them into actions or
recommendations based on the agent's understanding of the world.
Voice Commander focuses on relieving the user of entering text input, using voice as the
primary means of user input. The agent applies voice recognition algorithms to this
input and records it. It then uses this input to call one of the personal information
management applications, such as a task list or calendar, to record a new entry, or to
search for it on search engines like Google, Bing or Yahoo.
The focus is on capturing the user's input through voice, recognizing it, and then
executing the task if the agent understands it. The software takes this input in natural
language, making it easier for the user to express what he or she wants done.
The Voice Commander project has been assessed for feasibility across technical,
operational, and economic dimensions, ensuring that it can be successfully developed
and implemented.
Technical Feasibility:
The project leverages widely available technologies such as Google Speech API for
voice recognition, pyttsx3 for text-to-speech conversion, and Python libraries for
natural language processing (NLP) and application control. The integration of these
tools with desktop systems like Windows, macOS, and Linux is achievable, and the
use of Python ensures compatibility and scalability across various platforms.
Operational Feasibility:
The Voice Commander is designed to improve workflow efficiency by automating
tasks through voice commands. Its intuitive interface makes it accessible to both tech-
savvy users and those less familiar with technology. Moreover, the assistant can easily
be integrated into daily routines without disrupting existing workflows, making it
highly feasible for both personal and business use.
Economic Feasibility:
The development of the Voice Commander utilizes open-source libraries and APIs,
reducing upfront costs. Additionally, the hardware requirements (standard microphone
and speakers) are minimal, which lowers the barrier to entry. The potential
productivity gains and time savings outweigh the initial development and deployment
costs, making the project economically viable.
Software:
• Windows 7 (32-bit) or above
• Python 3.6 or later (the project code uses f-strings)
• Chrome Driver
UML DIAGRAMS:
1. Use Case Diagram
2. Class Diagram
3. Sequence Diagram
4. Activity Diagram
5. ER Diagram
1. Use Case Diagram
Notation
In this project there is only one user. The user issues a command to the system; the
system interprets it, fetches the answer, and sends the response back to the user.
2. Class Diagram
Notation
The user class has two attributes: the command that it sends as audio and the response it
receives, which is also audio. It performs functions to listen to the user's command,
interpret it, and then send back a response accordingly. The question class holds the
command in string form, as interpreted by the interpret class, and passes it to the general,
about, or search function based on its identification. The task class also holds the
interpreted command in string format and has various functions such as reminder, note,
mimic, research, and reader.
3. Sequence Diagram
A sequence diagram shows object interactions arranged in time sequence. It depicts
the objects and classes involved in the scenario and the sequence of messages
exchanged between the objects needed to carry out the functionality of the scenario.
Sequence diagrams are typically associated with use case realizations in the Logical
View of the system under development. Sequence diagrams are sometimes called
event diagrams or event scenarios. A sequence diagram shows, as parallel vertical
lines (lifelines), different processes or objects that live simultaneously, and, as
horizontal arrows, the messages exchanged between them, in the order in which they
occur. This allows the specification of simple runtime scenarios in a graphical manner
and implementation of change requests (CRs) which is not possible in conventional
models like a waterfall.
Notation:
Lifeline: Represented by a vertical dashed line below a participant, the lifeline shows the
participant's existence over the course of the scenario, sending and
responding to messages.
Self-message: Represented by an arrow pointing back to the same participant, a self-
message indicates an action or interaction within the same participant, often used to
depict internal processing or decision-making.
Message: Represented by arrows between participants, messages depict interactions
or communications between different participants in the system, indicating the flow of
information or requests.
Reply Message: A reply message is a specific type of message indicating a response
to a previously received message. In the sequence diagram, it's represented by a
dashed arrow returning from the recipient back to the sender, indicating the response
to a preceding message.
The above sequence diagram shows how the answer to a question asked by the user is
fetched from the internet. The audio query is interpreted and sent to the web scraper. The
web scraper searches and finds the answer, which is then sent back to the speaker, which
speaks the answer to the user.
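The web-scraper step itself is not listed in this report; the following is only a rough sketch of how such a lookup could be written with requests and BeautifulSoup, which the project already imports. The search URL and the CSS class used to pick out the temperature are assumptions and may break if Google changes its markup:

import requests
from bs4 import BeautifulSoup

def fetch_weather(city):
    # Scrape the temperature shown on a Google search results page
    url = f"https://www.google.com/search?q=weather+{city}"
    headers = {"User-Agent": "Mozilla/5.0"}           # plain requests are often served a different page
    html = requests.get(url, headers=headers).text
    soup = BeautifulSoup(html, "html.parser")
    temp = soup.find("div", class_="BNeawe").text     # assumed CSS class for the temperature
    return temp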
4. Activity Diagram
In UML, the activity diagram is used to demonstrate the flow of control within the
system rather than the implementation. It models concurrent and sequential activities and
helps in envisioning the workflow from one activity to another. It puts emphasis on the
conditions of flow and the order in which they occur. The flow can be sequential,
branched, or concurrent, and to deal with such flows the activity diagram provides
constructs such as fork and join. It is also termed an object-oriented flowchart and
encompasses activities composed of a set of actions or operations, making it a
behavioural diagram.
Notation
Initial Node: This is denoted by a solid circle and indicates the starting point of the
diagram. It represents the initial state of the system before any activity takes place.
Final Node: Represented by a circle with a solid circle inside, the final node signifies the
end of the activity diagram. It indicates the final state of the system after all activities
have been completed.
Decision Node: Shown as a diamond shape, decision nodes represent points in the
process where the system must make a decision based on certain conditions.
Depending on the condition evaluation, the process may follow different paths.
Initially, the system is in idle mode. When it receives a wake-up call, it begins execution.
The received command is identified as either a question or a task to be performed, and
the appropriate action is taken. After the question is answered or the task is performed,
the system waits for another command. This loop continues until it receives a quit
command, at which point it goes back to sleep.
5. ER DIAGRAM
The above diagram shows the entities and their relationships for the Voice Commander
system. The system has a user, who can have keys and values that store information
about the user; for example, for the key "name" the value might be "Jim". The user might
like to keep some keys secure, in which case a lock can be enabled and a password
(voice clip) set. A single user can ask multiple questions; each question is given an ID so
it can be recognized, along with the query and its corresponding answer. A user can also
have any number of tasks, each with its own unique ID and a status describing its current
state. A task also has a priority value and a category indicating whether it is a parent task
or the child of an earlier task.
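The report does not give a concrete schema for these entities. Purely as an illustration, they could be represented with Python dataclasses along the following lines; every field name here is an assumption drawn from the description above, not from the project source:

from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class UserKey:
    key: str                              # e.g. "name"
    value: str                            # e.g. "Jim"
    locked: bool = False                  # whether this key is password protected
    password_clip: Optional[str] = None   # path to the voice-clip password

@dataclass
class Question:
    question_id: int
    query: str
    answer: str

@dataclass
class Task:
    task_id: int
    status: str                           # current state of the task
    priority: int
    parent_id: Optional[int] = None       # None for a parent task, else the older task's id

@dataclass
class User:
    name: str
    keys: List[UserKey] = field(default_factory=list)
    questions: List[Question] = field(default_factory=list)
    tasks: List[Task] = field(default_factory=list)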
5 Result Analysis
We have used the VS Code IDE in this project; feel free to use any other IDE you are
comfortable with. We start a new project and create a file called jarvis.py. Visual
Studio Code is a freeware source-code editor made by Microsoft for Windows, Linux
and macOS. Features include support for debugging, syntax highlighting, intelligent code
completion, snippets, code refactoring, and embedded Git.
How to Install
During installation, tick "Create a desktop icon" so that VS Code can be accessed from
the desktop, and click Next. Finally, after the installation completes, click the Finish
button and Visual Studio Code will open.
VS Code comes with a straightforward and intuitive layout that maximizes the space
provided for the editor while leaving ample room to browse. Additionally, it allows
access to the full context of your folder or project. The UI is divided into five areas:
Editor – The main area to edit your files. You can open as many editors as you like,
side by side, vertically and horizontally.
Side Bar – Contains different views, like the Explorer, to assist you while working on
your project.
Status Bar – Contains information about the opened project and the files you edit.
Activity Bar – Located on the far left-hand side, it lets you switch between views
and gives you additional context-specific indicators, like the number of outgoing changes
when Git is enabled.
Panels – Displays different panels below the editor region for output or debug
information, errors and warnings, or an integrated terminal. The panel can also be moved
to the right for more vertical space. VS Code opens up in the same state it was last in
every time you start it; it also preserves the folder, layout, and opened files.
Here’s a breakdown of the Voice Assistant code, explaining its structure, key
functionalities, and how each part works:
1. Imports and Setup
import datetime
from email import message
import subprocess
import webbrowser
from numpy import tile
import pyttsx3
import qrcode
import speech_recognition
import requests
from bs4 import BeautifulSoup
import os
import pyautogui
import random
import geocoder
from plyer import notification
from pygame import mixer
import speedtest
import googletrans
from googletrans import Translator
Libraries:
o datetime: To manage time-based functionalities (like getting the current
time).
o subprocess & webbrowser: For interacting with system apps and opening
web pages.
o pyttsx3: For text-to-speech functionality (voice output).
o qrcode: To generate QR codes.
o speech_recognition: For voice command input (speech-to-text).
o requests & BeautifulSoup: To scrape and fetch data from web pages (like
weather).
o pyautogui: For automating tasks like controlling apps or the mouse.
o random: For random selections, like choosing which song to play.
o geocoder: To get current location based on IP.
o plyer.notification: For sending desktop notifications.
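Of the imports above, qrcode is not demonstrated later in this chapter; a minimal example of how it might be used (the data string and output filename are illustrative only):

import qrcode

def make_qr(data, filename="myqr.png"):
    # Generate a QR code image for the given text and save it to disk
    img = qrcode.make(data)
    img.save(filename)

make_qr("https://www.example.com")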
2. Password Protection
elif (a != pw):
    print("Try Again")
Password Check: Before loading the assistant, the user must enter the correct
password. The password is stored in a file (password.txt). If the password is
incorrect after 3 attempts, the program exits.
This section also allows the user to change the password:
o input(): The new password is entered via the keyboard.
o new_password.write(): The new password is saved to the file (password.txt).
o speak(): The assistant confirms the new password vocally.
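Only a fragment of this logic appears above; a hedged sketch of the full check, following the description (the prompts and variable names are assumptions):

# Read the stored password and give the user three attempts to match it
with open("password.txt", "r") as f:
    pw = f.read().strip()

for attempt in range(3):
    a = input("Enter the password to open the assistant: ")
    if a == pw:
        print("Welcome")
        break
    elif a != pw:
        print("Try Again")
else:
    # three wrong attempts: exit the program
    exit()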
3. Text-to-Speech Functionality
engine = pyttsx3.init("sapi5")
voices = engine.getProperty("voices")
engine.setProperty("voice", voices[0].id)
engine.setProperty("rate", 170)

def speak(audio):
    engine.say(audio)
    engine.runAndWait()
Voice Engine Initialization: Using pyttsx3 to initialize and configure the voice
(here, it's set to male with a speech rate of 170 words per minute).
speak() Function: Converts text into speech and speaks it out.
def takeCommand():
    r = speech_recognition.Recognizer()
    with speech_recognition.Microphone() as source:
        print("Listening.....")
        r.pause_threshold = 1
        r.energy_threshold = 300
        audio = r.listen(source, 0, 4)
    try:
        print("Understanding..")
        query = r.recognize_google(audio, language='en-in')
        print(f"You Said: {query}\n")
    except Exception as e:
        print("Say that again")
        return "None"
    return query
def alarm(query):
    timehere = open("Alarmtext.txt", "a")
    timehere.write(query)
    timehere.close()
    os.startfile("alarm.py")
import pyttsx3
import datetime
import os

engine = pyttsx3.init("sapi5")
voices = engine.getProperty("voices")
engine.setProperty("voice", voices[0].id)
engine.setProperty("rate", 200)

def speak(audio):
    engine.say(audio)
    engine.runAndWait()

# Read the requested alarm time saved by the main assistant, then clear the file
extractedtime = open("Alarmtext.txt", "rt")
time = extractedtime.read()
Time = str(time)
extractedtime.close()
deletetime = open("Alarmtext.txt", "r+")
deletetime.truncate(0)
deletetime.close()

def ring(time):
    # Strip the command words so only the spoken time (e.g. "07:00:00") is left
    timeset = str(time)
    timenow = timeset.replace("jarvis", "")
    timenow = timenow.replace("set an alarm", "")
    timenow = timenow.replace(" and ", ":")
    Alarmtime = str(timenow)
    print(Alarmtime)
    while True:
        currenttime = datetime.datetime.now().strftime("%H:%M:%S")
        if currenttime == Alarmtime:
            speak("Alarm ringing, sir")
            os.startfile("music.mp3")
            break   # stop checking once the alarm has fired

ring(time)
elif "set an alarm" in query:
alarm(): Saves the alarm time in a text file (Alarmtext.txt) and starts an external
alarm script.
6. YouTube Search
def search_on_youtube(song_name):
    try:
        # Construct the YouTube search URL for the requested song
        search_url = f"https://www.youtube.com/results?search_query={song_name.replace(' ', '+')}"
        webbrowser.open(search_url)   # open the results page in the default browser
    except Exception:
        speak("Unable to search on YouTube")
def check_internet_speed():
    st = speedtest.Speedtest()
    st.get_best_server()
    download_speed = st.download() / (1024 * 1024)   # bits per second -> megabits per second
    upload_speed = st.upload() / (1024 * 1024)       # bits per second -> megabits per second
    ping = st.results.ping
    # report the measured values to the user
    speak(f"Download speed is {download_speed:.1f} megabits per second, upload speed is "
          f"{upload_speed:.1f} megabits per second, and ping is {ping:.0f} milliseconds")
while True:
    query = takeCommand().lower()
    if "go to sleep" in query:
        speak("Ok SIR , You can call me anytime")
        break
takeCommand(): A function that listens for voice input and converts it into text.
query = takeCommand().lower(): Captures the user’s voice input, converts it to
lowercase to ensure uniformity in processing.
if "wake up" in query: When the assistant hears "wake up," it starts a new while
loop to continue processing commands until the user says "go to sleep."
11. Volume Control
elif "volume up" in query:
    # volumeup/volumedown are assumed to be helpers from a project-local keyboard module
    from keyboard import volumeup
    speak("Turning volume up, sir")
    volumeup()
elif "volume down" in query:
    from keyboard import volumedown
    speak("Turning volume down, sir")
    volumedown()
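If the keyboard helper above is unavailable, roughly the same effect can be sketched with pyautogui's media keys; this is an alternative illustration, not the project's code:

import pyautogui

def volume_up(steps=5):
    # Press the OS volume-up media key a few times
    for _ in range(steps):
        pyautogui.press("volumeup")

def volume_down(steps=5):
    for _ in range(steps):
        pyautogui.press("volumedown")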
12. Schedule Management
elif "schedule my day" in query:
    tasks = []   #Empty list
    speak("Do you want to clear old tasks (Plz speak YES or NO)")
    query = takeCommand().lower()
    if "yes" in query:
        file = open("tasks.txt", "w")
        file.write("")
        file.close()
        no_tasks = int(input("Enter the no. of tasks :- "))
        for i in range(no_tasks):
            tasks.append(input("Enter the task :- "))
            file = open("tasks.txt", "a")
            file.write(f"{i}. {tasks[i]}\n")
            file.close()
    elif "no" in query:
        no_tasks = int(input("Enter the no. of tasks :- "))
        for i in range(no_tasks):
            tasks.append(input("Enter the task :- "))
            file = open("tasks.txt", "a")
            file.write(f"{i}. {tasks[i]}\n")
            file.close()
elif "show my schedule" in query:
    file = open("tasks.txt", "r")
    content = file.read()
    file.close()
    mixer.init()
    mixer.music.load("notification.mp3")
    mixer.music.play()
    notification.notify(
        title="My schedule :-",
        message=content,
        timeout=15
    )
schedule my day: The assistant helps the user manage tasks for the day:
o Clears old tasks if requested and appends new ones.
o Saves tasks in tasks.txt for later retrieval.
show my schedule: Reads out the tasks from tasks.txt and displays them as a
system notification.
13. Translation
# translation helper: imports inferred from the functions used below (googletrans, gTTS, playsound)
import os, time, pyttsx3, speech_recognition, googletrans
from googletrans import Translator
from gtts import gTTS
from playsound import playsound

engine = pyttsx3.init("sapi5")
voices = engine.getProperty("voices")
engine.setProperty("voice", voices[0].id)
engine.setProperty("rate", 170)

def speak(audio):
    engine.say(audio)
    engine.runAndWait()
def takeCommand():
    r = speech_recognition.Recognizer()
    with speech_recognition.Microphone() as source:
        print("Listening.....")
        r.pause_threshold = 1
        r.energy_threshold = 300
        audio = r.listen(source, 0, 4)
    try:
        print("Understanding..")
        query = r.recognize_google(audio, language='en-in')
        print(f"You Said: {query}\n")
    except Exception as e:
        print("Say that again")
        return "None"
    return query
def translategl(query):
    speak("SURE Sir")
    print(googletrans.LANGUAGES)          # show the supported language codes
    translator = Translator()
    speak("Choose the language in which you want to translate")
    b = input("To_Lang :- ")
    text_to_translate = translator.translate(query, src="auto", dest=b)
    text = text_to_translate.text
    try:
        # Speak the translated text in the target language using Google TTS
        speakgl = gTTS(text=text, lang=b, slow=False)
        speakgl.save("voice.mp3")
        playsound("voice.mp3")
        time.sleep(5)
        os.remove("voice.mp3")
    except:
        print("Unable to translate")
import os
import pyautogui
import webbrowser
import pyttsx3
from time import sleep

engine = pyttsx3.init("sapi5")
voices = engine.getProperty("voices")
engine.setProperty("voice", voices[0].id)
engine.setProperty("rate", 200)

def speak(audio):
    engine.say(audio)
    engine.runAndWait()

# Mapping of spoken application names to the commands that launch them
dictapp = {
    "commandprompt": "cmd",
    "paint": "paint",
    "word": "winword",
    "excel": "excel",
    "chrome": "chrome",
    "vscode": "code",
    "powerpoint": "powerpnt",
    "explorer": "explorer",
    "edge": "msedge",
}
def openappweb(query):
    speak("Launching, sir")
    if ".com" in query or ".co.in" in query or ".org" in query:
        query = query.replace("open", "")
        query = query.replace("jarvis", "")
        query = query.replace("launch", "")
        query = query.replace(" ", "")
        webbrowser.open(f"https://www.{query}")
    else:
        keys = list(dictapp.keys())
        for app in keys:
            if app in query:
                os.system(f"start {dictapp[app]}")
def closeappweb(query):
    speak("Closing, sir")
    if "one tab" in query or "1 tab" in query:
        pyautogui.hotkey("ctrl", "w")
        speak("All tabs closed")
    elif "2 tab" in query:
        pyautogui.hotkey("ctrl", "w")
        sleep(0.5)
        pyautogui.hotkey("ctrl", "w")
        speak("All tabs closed")
    elif "3 tab" in query:
        pyautogui.hotkey("ctrl", "w")
        sleep(0.5)
        pyautogui.hotkey("ctrl", "w")
        sleep(0.5)
        pyautogui.hotkey("ctrl", "w")
        speak("All tabs closed")
    else:
        keys = list(dictapp.keys())
        for app in keys:
            if app in query:
                os.system(f"taskkill /f /im {dictapp[app]}.exe")
open and close: The assistant can open and close applications by looking up the spoken
name in the dictapp dictionary shown above.
The assistant can also control YouTube videos via keyboard shortcuts (e.g., k to
pause/play and m to mute), sent through the pyautogui module.
Reminders: Allows the user to store reminders in Remember.txt and recall them
later.
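Neither the reminder feature nor the YouTube shortcut handling is shown in the listings above; the following is only a rough sketch of how they might look, with the command phrasing, file handling, and function names assumed rather than taken from the project source (speak() is the text-to-speech helper defined earlier):

import pyautogui

def remember(query):
    # Store whatever follows "remember that" in Remember.txt
    text = query.replace("remember that", "").strip()
    with open("Remember.txt", "a") as f:
        f.write(text + "\n")
    speak("I will remember that")

def recall():
    # Read back everything stored in Remember.txt
    with open("Remember.txt", "r") as f:
        notes = f.read()
    speak("You told me to remember that " + notes)

def control_youtube(query):
    # Send YouTube's own keyboard shortcuts to the active browser tab
    if "pause" in query or "play" in query:
        pyautogui.press("k")    # k toggles pause/play
    elif "mute" in query:
        pyautogui.press("m")    # m toggles mute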
22. Photo Capture:
23. Playing a Game:
import pyttsx3
import speech_recognition as sr
import random

engine = pyttsx3.init('sapi5')
voices = engine.getProperty('voices')
engine.setProperty('voice', voices[0].id)
engine.setProperty("rate", 170)

def speak(audio):
    engine.say(audio)
    engine.runAndWait()

def takeCommand():
    r = sr.Recognizer()
    with sr.Microphone() as source:
        print("Listening.....")
        r.pause_threshold = 1
        r.energy_threshold = 300
        audio = r.listen(source, 0, 4)
    try:
        print("Recognizing..")
        query = r.recognize_google(audio, language='en-in')
        print(f"You Said : {query}\n")
    except Exception as e:
        print("Say that again")
        return "None"
    return query
def game_play():
    speak("Lets Play ROCK PAPER SCISSORS !!")
    print("LETS PLAYYYYYYYYYYYYYY")
    Me_score = 0
    Com_score = 0
    for i in range(5):
        choose = ("rock", "paper", "scissors")   # tuple of possible moves
        com_choose = random.choice(choose)
        query = takeCommand().lower()
        if query == "rock":
            if com_choose == "rock":
                speak("ROCK")                    # draw
            elif com_choose == "paper":
                speak("paper")
                Com_score += 1                   # paper beats rock
            else:
                speak("Scissors")
                Me_score += 1                    # rock beats scissors
        elif query == "paper":
            if com_choose == "rock":
                speak("ROCK")
                Me_score += 1                    # paper beats rock
            elif com_choose == "paper":
                speak("paper")                   # draw
            else:
                speak("Scissors")
                Com_score += 1                   # scissors beat paper
        elif query == "scissors":
            if com_choose == "rock":
                speak("ROCK")
                Com_score += 1                   # rock beats scissors
            elif com_choose == "paper":
                speak("paper")
                Me_score += 1                    # scissors beat paper
            else:
                speak("Scissors")                # draw
        print(f"Score:- ME :- {Me_score} : COM :- {Com_score}")
engine = pyttsx3.init("sapi5")
voices = engine.getProperty("voices")
engine.setProperty("voice", voices[0].id)
engine.setProperty("rate", 170)

def speak(audio):
    engine.say(audio)
    engine.runAndWait()

def takeCommand():
    r = speech_recognition.Recognizer()
    with speech_recognition.Microphone() as source:
        print("Listening.....")
        r.pause_threshold = 1
        r.energy_threshold = 300
        audio = r.listen(source, 0, 4)
    try:
        print("Understanding..")
        query = r.recognize_google(audio, language='en-in')
        print(f"You Said: {query}\n")
    except Exception as e:
        print("Say that again")
        return "None"
    return query
# WhatsApp message helper: pywhatkit and datetime are needed by the code below
import pywhatkit
from datetime import datetime, timedelta

strTime = int(datetime.now().strftime("%H"))
update = int((datetime.now() + timedelta(minutes=2)).strftime("%M"))

def sendMessage():
    speak("Whom do you want to send a message")
    a = int(input('''ashish -> 1
pratiksha -> 2 '''))
    if a == 1:
        speak("Whats the message")
        message = str(input("Enter the message- "))
        pywhatkit.sendwhatmsg("+9186222249354", message, time_hour=strTime, time_min=update)  # enter the number here instead of the placeholder
    elif a == 2:
        speak("Whats the message")
        message = str(input("Enter the message- "))
        pywhatkit.sendwhatmsg("+918692022285", message, time_hour=strTime, time_min=update)
Summary of Features
Open/Close Any App Using Voice: The AI listens for commands to open or close
specific applications, making it easier to switch between tasks hands-free.
Check Internet Speed: The AI can check your internet speed through voice
commands, providing you with real-time updates.
Schedule Your Day: The AI assistant can set schedules and send
notifications through voice commands.
Set Alarm: By saying commands like "set an alarm for 7 AM," you can configure
alarms and reminders through the assistant's alarm module.
Remember, News, and Playlist Function: The AI can remember user preferences,
provide news updates, and organize playlists.
Volume Control: You can adjust the system's volume up or down through simple voice
commands, offering a hands-free audio control experience.
Searching the Web: You can perform web searches using voice commands,
such as "search for the latest news".
CONCLUSION
The voice assistant project is a comprehensive tool that integrates voice recognition and
command execution. It showcases various functionalities, including application
management, sending WhatsApp messages, checking the weather, and generating QR
codes. The use of text-to-speech enhances user interaction, and the modular structure
allows for easy updates and extensions. This project effectively demonstrates how
natural language processing can be applied to create a useful personal assistant for
everyday tasks.
Future Scope
The future of Voice Commander is promising, with potential to expand into controlling
smart home devices like lights and thermostats through voice commands. It could
become more personalized, learning user preferences to offer better suggestions. As
voice recognition improves, it will handle more complex commands and conversations.
Multilingual support could make it accessible worldwide, and enhanced security features
like voice-based authentication will improve user privacy. Additionally, Voice
Commander may integrate with augmented reality, offering a more interactive
experience, making it a powerful tool for productivity and convenience.