Volume 5, Issue 5, May – 2020 International Journal of Innovative Science and Research Technology

ISSN No:-2456-2165

Elements: Software for Image Editing Over Voice Commands
Akash Johnny Kunnath, Albin Saji, Benjamin G Nechicattu,
Done Maria James, Suma R
St. Joseph’s College of Engineering and Technology, Palai

Abstract:- Image editing is the process of manipulating images in whatever format they exist, for instance digital photographs, old photo-chemical photographs, or illustrations. Traditional analogue image manipulation involves photo retouching, using mechanical tools such as an airbrush to modify photographs, or editing illustrations with any traditional art technique. Graphics software can be broadly categorized into vector graphics editors, raster graphics editors, and 3D modellers. Such software is the primary tool with which a user may manipulate, enhance, and transform images. Many image editing programs are also used to render or create computer art from scratch; here this is done over voice commands and direct interaction.

I. INTRODUCTION

Image editing from scratch has become a time-consuming process for non-professionals as well as for professionals upgrading their skills. Learning large sets of shortcuts and reaching every editing tool through mouse and keyboard is difficult, time-consuming and, for at least some users, a real overhead. Here we introduce an image editing interface that includes a voice command recognizer. Because image editing is difficult to perform with voice alone, we combine voice with manual interaction through mouse and keyboard for flexible and easy editing control. Selecting an object or a layer within the workspace becomes easier. The editing panel is a grid-style workspace whose x-y axes (rulers) are scaled so that the points where the image is to be edited can be selected. The application contains an image storage directory linked to the desktop, so importing images that are already in the application storage is easy. Combinations of filters give the images a professional touch. Elements such as text with formatting and colour-comb are add-on features. Functions that take varying values can be adjusted by percentage or value by speaking the argument along with the command. A voice interface makes complex tasks easier and more accessible because it allows users to simply state goals without learning an interface.

Fig 1:- Info Screen

II. INTENDED AUDIENCE

This software is aimed at people involved in image editing: professionals, freelancers, tutors, students, and anyone who enjoys making images look better. It especially concerns differently-abled people who are skilled at image editing but may be physically challenged.

'Elements' is a user-friendly, voice-enabled editing platform that uses voice commands to alter images. Of the numerous editing programs currently available, none provides voice capability for editing, which makes these editors harder to use. A major added benefit is that 'Elements' can be used by differently-abled persons, mainly those with a physical disability, which makes it adaptable to a wide range of users. All you have to do is say what you want done. This makes it easier to use, and you no longer need to memorize commands. Image editing software such as Photoshop, Fotos, GIMP, etc. has a wide variety of codes and commands that are somewhat difficult to learn, and without memorizing them no one can work on these platforms with high efficiency and full accuracy. With 'Elements', the same performance and accuracy can be achieved without memorizing a single command. You can do what you need to do just by saying what is to be done.

IJISRT20MAY903 www.ijisrt.com 1853


III. TECHNOLOGIES USED

• Python
Core programming is done in the Python programming language, which is convenient, flexible and fast. Python is easy to understand and to read. Programs are comparatively easy to execute and less complex. Python is an interpreted language, which helps in the sequential execution of the program.

• Visual Studio
We used the Visual Studio programming platform for the development of the project. WPF along with C# framework makes it easier to integrate all the functions and make the software

• Tkinter Python GUI
Python has the Tkinter GUI toolkit, which makes it possible to combine the scripts together. This makes the program executable on any machine that has Python installed, and therefore cross-platform.

• Pillow Library
The Python Imaging Library[iii] (PIL/Pillow) adds image processing capabilities to the Python interpreter. Basically every operation on an image can be done with Pillow. It offers wide file format support, an efficient internal representation, and fairly powerful image processing capabilities. The PhotoImage and BitmapImage interfaces help to display the image. Pillow supports image resizing, rotation and arbitrary affine transforms.

• Natural Language Toolkit
The Natural Language Toolkit[iv] [vi] is used to convert the recognized speech into a machine-understandable form so that the machine can derive meaning from it. Every command given to the system is tokenized by NLTK, which enables the system to find out which operation is to be performed on the image.

• Google Speech Recognition Engine
The Google speech recognition engine[x] converts the captured speech to the corresponding text[i]. This text is then used by the Natural Language Toolkit (NLTK). The speech is recorded by the system, which calls the Google API for speech recognition and uploads the speech to generate the corresponding text[ii].

IV. PLATFORM

The software works on Windows PCs. The Tkinter GUI[vii] runs on any device that supports Python, so the software can also be installed on Linux machines and macOS, making it a cross-platform program.

V. DESIGN AND IMPLEMENTATION CONSTRAINTS

The application uses the Google speech recognizer, which works over a fast internet connection. Hence the application requires a fast internet connection. Access to the microphone and to recording is a must.

VI. ASSUMPTIONS AND DEPENDENCIES

The application uses the Google speech recognizer to take in voice commands from the user, so the Speech Recognition library must be installed as a dependency. Python's PyAudio package must also be taken into account, since the software uses the system microphone. Python's Pillow library provides powerful image processing support; image manipulation and enhancement can easily be done with it, so Pillow is a major dependency. The Python GUI toolkit Tkinter[vii][viii] is used for the development of the project.

VII. SYSTEM FEATURES

An image editing project requires a recommended 4 GB of memory, so the system must provide the recommended space. An average processor speed of 2.0 GHz is suggested, since the application continuously renders the image. A graphics card is recommended, if not available, process.

VIII. USER INTERFACES

The user interface is kept simple so that everyone can use it, even without prior knowledge of how the software works. It has a cleaner interface than the UIs of similar products.

Fig 2:- Screenshot of the Workspace with filters
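The speech-to-text step described under Technologies Used and Assumptions and Dependencies can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes the third-party SpeechRecognition package (with PyAudio for microphone access), and the helper names `clean_transcript` and `listen_once` are hypothetical.

```python
def clean_transcript(text):
    """Lower-case the recognised text and collapse repeated whitespace."""
    return " ".join(text.lower().split())


def listen_once():
    """Record one phrase from the default microphone and return its text.

    Returns None if the speech was not understood or the Google service
    could not be reached (for example, when there is no internet access).
    """
    # Imported lazily so the pure helper above works without the package.
    import speech_recognition as sr  # third-party: SpeechRecognition

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:  # requires PyAudio
        recognizer.adjust_for_ambient_noise(source)
        audio = recognizer.listen(source)
    try:
        return clean_transcript(recognizer.recognize_google(audio))
    except sr.UnknownValueError:  # speech was unintelligible
        return None
    except sr.RequestError:  # API unreachable or request failed
        return None
```

Calling `listen_once()` needs a working microphone and an internet connection, which matches the constraints stated in the paper; the returned text would then be handed to the tokenizer.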

An example of the UI is shown above (fig 2). Tools are arranged on one side, which makes them easier to access. The workspace[viii] is equipped with suggestions and fine-tuning sliders (fig 3, text options) for precision.

Fig 3:- Add Text Option

The tools are likely to be accessed less often, since the project works over voice commands.

Fig 4:- Create board

The in-app file browser makes it easier to browse through the image files on your system using voice commands. You can start either with an image or with a board (fig 4). The board feature needs more customisation before it can work with voice, and this is not available in this version. In later versions, using the board will be as easy as drawing on paper with voice.

The workspace imports images to work with. Fig 5 shows the voice-activated saving screen.

Fig 5:- Save options

Images can be edited in the workspace with great precision.

IX. HARDWARE/COMMUNICATION INTERFACES

The hardware interface is a microphone, which comes built into most personal computers. If one is not available, an external microphone can be plugged in. A microphone is a must, since we need to capture the voice commands.

X. PROGRAM EXECUTION FLOW

Fig 6:- Sequence diagram

The program begins when the application is opened. The application welcomes the user with a splash screen. Once the application files and libraries are loaded, it checks for internet access. If internet access is available, the microphone is activated and listens for a 'do' command. After a 'do' audio fingerprint is detected, you can say any command to be performed on the image.

Selecting the image is much easier with the in-app file browser, which shows the images on the PC. All you have to do is say the name of the image or select the image manually. The selected image is brought to the workspace window, where you can perform image editing. Next, you say which operation is to be performed on the image; this is a command. The command is converted to its corresponding text via Google's Speech Recogniser API, which returns the text.

The command is then split into tokens. Each token is compared with the keywords in the keyword file. If the token is found, the corresponding function is called and the action is performed. If no token is found in the keyword file, the token is compared with a file of similar words, to avoid mispredictions. If a similar keyword is found, the function corresponding to that 'similar keyword' is called and performs the action on the image. Some functions need arguments. Rotation, for instance, needs an angle: when the rotation function is called, the program listens for the angle and passes it to the function. Any number of actions can be performed on the image until a save or quit command is given. The save command confirms the edited image and saves it under a new name and extension of the user's choice. The quit command closes the image editing window without saving the changes.

XI. IMAGE MANIPULATION OPERATIONS

The Pillow library opens up a large space of image editing operations[v]. A few of them are loaded into the application.

• Rotation. Rotates the image in the workspace window. The user can specify the angle by which the image is to be rotated. Say rotate to activate the command by voice, press r on the keyboard, or select rotate.

• Brightness. The user can change the brightness by saying change brightness. Brightness also needs a parameter: a floating point number is passed as the argument, and the brightness is increased by that amount. Say brightness to activate the command by voice, press B on the keyboard, or select brightness.

• Contrast. Increases the contrast of the image by a floating point value. Say contrast to activate the command by voice, press c on the keyboard, or select contrast.

• Saturation. Increases or reduces the saturation of the image. Say saturation to activate the command by voice, press S on the keyboard, or select saturation.

• Flip. Flips the image left to right or top to bottom, so that the image looks mirrored or turned upside down. Say flip right or flip up to activate the command by voice, press F/f on the keyboard, or select flip.

• Warmth. Increases the red in every pixel and makes the image warmer. Say warmth to activate the command by voice, press w on the keyboard, or select warmth.

• Text. Adds text to an image. When the text function is called, it asks for the sentence to be added to the image. A variety of fonts is included; the font type is chosen by specifying the font name. The position of the text in the image is specified by passing the x, y co-ordinates or by saying how far to the right or down. Say add text to activate the command by voice, press T on the keyboard, or select text.

• Black and white filter. Black and white gives an image a nostalgic feel. The filter is applied by calling it and specifying its strength. Say black and white to activate the command by voice, or select black and white.

• Sharpness. Increases the sharpness of the image and makes it crisper and less blurred. Say sharpness to activate the command by voice, or select sharpness.

• Detail. Increases the detail and the structure of the objects in the image. Say details to activate the command by voice, press d on the keyboard, or select details.

• Crop. Crops the image using four arguments corresponding to the left, upper, right and lower edges of the image. Say crop to activate the command by voice, press C on the keyboard, or select crop.

• Blur. Blurs the image, reducing its sharpness and giving it a spread appearance. Say blur to activate the command by voice, or select blur.

• Contour. Selects the pixels with the same intensity and finds the edges in an image. Say contour to activate the command by voice, or select contour.

• Edge Enhance. Enhances the edges of objects in an image. For stronger enhancement, use more edge enhance. Say edge to activate the command by voice, or select edge enhance.

• Emboss. Applies an emboss effect to the input image. Say emboss to activate the command by voice, or select emboss.

• Smooth. Like blurring, this function smooths the objects in the image. For stronger smoothing, use more smooth. Say smooth to activate the command by voice, or select smooth.

• Resize. Reduces the image size and resolution, thus reducing the memory space occupied by the image while saving. Say resize to activate the command by voice, or select resize in save.

• Save. Saves the image under the name and extension format that the user specifies, in a user space. Say save to activate the command by voice.
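Several of the operations listed in this section map onto standard Pillow calls. The sketch below shows one plausible mapping, assuming Pillow is installed; the function names, the `FILTERS` table and the way warmth is computed (adding a constant to the red channel) are illustrative assumptions, not the authors' implementation.

```python
from PIL import Image, ImageDraw, ImageEnhance, ImageFilter, ImageOps


def rotate(img, angle):
    # expand=True grows the canvas so the corners are not clipped.
    return img.rotate(angle, expand=True)


def brightness(img, factor):
    # In Pillow, 1.0 leaves the image unchanged; >1 brightens, <1 darkens.
    return ImageEnhance.Brightness(img).enhance(factor)


def contrast(img, factor):
    return ImageEnhance.Contrast(img).enhance(factor)


def saturation(img, factor):
    return ImageEnhance.Color(img).enhance(factor)


def flip(img, direction):
    # "right" mirrors left-to-right, "up" flips top-to-bottom.
    return ImageOps.mirror(img) if direction == "right" else ImageOps.flip(img)


def warmth(img, amount):
    # Assumed approach: raise the red channel by `amount`, clamped to 0..255.
    r, g, b = img.convert("RGB").split()
    r = r.point(lambda v: max(0, min(255, v + amount)))
    return Image.merge("RGB", (r, g, b))


def black_and_white(img):
    return img.convert("L")  # 8-bit greyscale


def crop(img, left, upper, right, lower):
    return img.crop((left, upper, right, lower))


def add_text(img, text, x, y):
    out = img.copy()
    ImageDraw.Draw(out).text((x, y), text, fill="white")  # default font
    return out


# Built-in Pillow filters matching the named operations.
FILTERS = {
    "blur": ImageFilter.BLUR,
    "contour": ImageFilter.CONTOUR,
    "detail": ImageFilter.DETAIL,
    "edge": ImageFilter.EDGE_ENHANCE,
    "more edge": ImageFilter.EDGE_ENHANCE_MORE,
    "emboss": ImageFilter.EMBOSS,
    "sharpness": ImageFilter.SHARPEN,
    "smooth": ImageFilter.SMOOTH,
    "more smooth": ImageFilter.SMOOTH_MORE,
}


def apply_filter(img, name):
    return img.filter(FILTERS[name])
```

For example, `apply_filter(rotate(Image.open("photo.jpg"), 90), "emboss")` would rotate and then emboss an image; the file name here is a placeholder.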

XII. BLOCK REPRESENTATION

Fig 7:- Block Diagram
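The token lookup described under Program Execution Flow (keyword file first, then a file of similar words, then spoken arguments) can be sketched as below. This is only an illustration under stated assumptions: the keyword set and the similar-word table are hypothetical stand-ins for the paper's files, and plain `str.split` stands in for NLTK tokenization.

```python
# Hypothetical stand-in for the keyword file.
KEYWORDS = {"rotate", "brightness", "contrast", "saturation",
            "flip", "crop", "blur", "save", "quit"}

# Hypothetical stand-in for the "similar" file: misheard word -> keyword.
SIMILAR = {"rotates": "rotate", "brightens": "brightness", "blurred": "blur"}


def parse_command(text):
    """Return (keyword, args) for a recognised command.

    The first token found in KEYWORDS (or mapped through SIMILAR) becomes
    the keyword; any later tokens that parse as numbers become arguments.
    Returns (None, []) when no keyword is recognised.
    """
    keyword = None
    args = []
    for token in text.lower().split():
        if keyword is None:
            if token in KEYWORDS:
                keyword = token
            elif token in SIMILAR:
                keyword = SIMILAR[token]
        else:
            try:
                args.append(float(token))
            except ValueError:
                pass  # not a number, so not an argument
    return keyword, args
```

With this sketch, `parse_command("please rotate the image 90")` yields the keyword `rotate` with the single argument `90.0`, which a dispatcher could then pass to the matching image function.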

XIII. RESULT

We designed our evaluation around a few tasks. How does our proposed multimodal interface compare with a traditional image editing interface? The success rate for both interfaces was identical, although the multimodal interface was slightly more appealing. Fig 8 is the test image. The following figures (Fig 9 to 23) show the output of individual operations on the image.

• Output

Fig 8:- Test Image

Fig 9:- Rotated Image

Fig 10:- Brightened Image


Fig 11:- Contrast varied Image

Fig 12:- Saturated Image

Fig 13:- Flip vertical

Fig 14:- Flip horizontal

Fig 15:- Adding text

Fig 16:- Black and white

Fig 17:- Sharpened Image

Fig 18:- Detailed Image

Fig 19:- Cropped Image

Fig 20:- Blurred Image

Fig 21:- Edge enhanced Image

Fig 22:- Embossed Image

Fig 23:- Smoothed Image

REFERENCES

[1]. Research on Speech Recognition Technology and Its Application, Youhao Yo, International Conference on Computer Science and Electronics Engineering, 2012
[2]. Speech Recognition System: A Review, Sandheep Sharma, Nithin Washani, International Journal of Computer Applications, April 2015
[3]. Pillow 7.1.2, https://pypi.org/project/Pillow/, 2020
[4]. NLTK 3.5 documentation, https://www.nltk.org/, 2020
[5]. Python | Working with the Image Data Type in pillow, GeeksforGeeks, https://www.geeksforgeeks.org/python-working-with-the-image-data-type-in-pillow/, 2020
[6]. Natural Language Processing - Python, Tutorials Point, https://www.tutorialspoint.com/natural_language_processing/natural_language_processing_python.htm, 2020
[7]. Python - GUI Programming (Tkinter), Tutorials Point, https://www.tutorialspoint.com/python/python_gui_programming.htm, 2020
[8]. tk-tools 0.12.0, https://pypi.org/project/tk-tools/, 2020
[9]. pocketsphinx, https://github.com/cmusphinx/pocketsphinx, 2020
[10]. SpeechRecognition 3.8.1, https://pypi.org/project/SpeechRecognition/, 2020

XIV. CONCLUSION

Here we introduced ELEMENTS, a multimodal interface system to enhance image editing tasks through
voice and conventional direct manipulation. Besides its editing functions, "Elements" supports browsing for an image as well as saving an image after editing. We can browse the file manager or even the internet using appropriate voice commands. After editing is complete, the user can save the image with the "save" command and specify the location and the name under which the image is to be saved. Every function is thereby implemented with voice. As for editing, we have implemented all the features that are essential for an editing tool, including brightness, contrast, crop, rotate and a total of 9 filters, all driven by voice commands. "Elements" has the add-on functionality of image compression: an image selected for editing may be large, and it can be compressed afterwards as required, with the compression ratio on a scale of 0-100. The key feature that distinguishes "Elements" from other editing tools is that it is voice enabled; because it is voice controlled, it can be used by differently-abled people. The board facilities planned for later versions will make it more advanced. Voice commands are less complex than shortcuts, and the UI is user-friendly, which together make it easy to use. Editing is no longer a complex task: just say what to do and it's done.
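The compression option mentioned above, with its 0-100 scale, could be realised with Pillow's JPEG `quality` setting. The sketch below assumes that mapping, which the paper does not confirm; the function names are hypothetical, and the value is clamped to 1..95 because Pillow's documentation advises against going above 95.

```python
import io

from PIL import Image


def compress_to_jpeg(img, level):
    """Encode `img` as JPEG bytes; `level` is on a 0-100 scale (higher = better).

    Assumption: the paper's 0-100 compression scale is treated here as
    JPEG quality, clamped to the 1..95 range.
    """
    if not 0 <= level <= 100:
        raise ValueError("level must be between 0 and 100")
    quality = max(1, min(95, int(level)))
    buffer = io.BytesIO()
    img.convert("RGB").save(buffer, format="JPEG", quality=quality)
    return buffer.getvalue()


def shrink(img, scale):
    """Reduce resolution by `scale` (for example 0.5 halves each side)."""
    width, height = img.size
    new_size = (max(1, int(width * scale)), max(1, int(height * scale)))
    return img.resize(new_size)
```

Writing the returned bytes to a file gives the compressed image; lowering the level or shrinking the resolution first both reduce the space the saved image occupies, in line with the Resize operation.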
