Readme Eng
#########################################################################################################
About Program
#########################################################################################################
The main purpose of this program is to provide functionality for extracting hardcoded
text (hardsubs) from video.
To run this program you will need the "Microsoft Visual C++ Redistributable
for Visual Studio 2015, 2017 and 2019":
https://support.microsoft.com/en-us/help/2977003/the-latest-supported-visual-c-downloads
x64: https://aka.ms/vs/16/release/vc_redist.x64.exe
x86: https://aka.ms/vs/16/release/vc_redist.x86.exe
The latest versions were built and tested on Windows 10.
#########################################################################################################
Quick Start Guide
#########################################################################################################
How to use it, without going into deep detail:
1) In the menu, click "File->Open Video" (any variant).
2) Set the bounding box in the "Video Box" to the area where most subs appear (for that you can
move the split lines inside it); by default it is the whole video frame.
It is recommended to reduce the search area in order to get fewer wrong detections and
fewer timing splits.
3) Check which horizontal alignment the subtitles have relative to the selected
bounding box ("Center/Left/Right/Any") and set the related value in the "Text Alignment"
property in the "Settings" tab.
"Center" covers most cases, so it is set by default.
4) It is strongly recommended to use "Use Filter Colors", but you can skip this
step.
To do this you need to:
* - scroll the video to a subtitle frame
* - press 'U' in the Video Box and select a subtitle pixel with a 'Left Mouse' click
* - copy the Lab color record from the bottom-right part of the "Settings" tab to "Use Filter
Colors" in the top-left side of the "Settings" tab
* - if there are many subtitles with different colors, you can add all of them to
"Use Filter Colors" by adding new line records with "Ctrl+Enter"
(a small sketch of what such a Lab record corresponds to is shown right after this list)
5) Click "Run Search" on the first tab page. If you only need the timing and the
original images with potential subs, go to the last tab page after this step and
press "Create Empty Sub";
the found original images with subtitles will be located in the "RGBImages" folder.
6) Check the ILA images in the "ILAImages" folder: by default, subtitle symbols will be
searched only inside the white pixels of the ILA images. If the white pixels in some ILA images
do not cover some symbols, or the symbols are broken
(which is possible if you use too strong color filters, or if subtitles pop up on the video
and then disappear), it is better to change the program settings or delete such
ILA images.
7) [MOST IMPORTANT PART IF YOU DON'T USE COLOR FILTERING]
Before continuing: check whether the subtitles have a border color that is darker than the
subtitle text color.
In most cases they do; if not, disable the "Characters Border Is Darker" checkbox,
the first setting on the right in the "Settings" tab.
In most cases the program correctly identifies which color belongs to the subtitle text,
but in some cases it is too complicated; in such cases the decision is made
according to this setting.
8) If you are using "Use Filter Colors" and get very good ILA images (all
characters separated from the background),
it is recommended to turn on "Use ILAImages for getting TXT symbols areas", which
can reduce the amount of garbage.
9) Click "Create Cleared TXTImages" on the last tab page to get the text symbols
separated from the background; after this you can OCR the text in other software, as
described in the separate topic "For OCR".
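A note on step 4: the Lab record shown by the program is the color of the picked pixel in the
Lab color space. A minimal sketch (Python with OpenCV, not the program's own code, and with an
example file path and pixel coordinates) of obtaining such a value for a pixel:

    # Illustration only: what the Lab record of a picked subtitle pixel corresponds to.
    # OpenCV stores 8-bit Lab with all three channels mapped into the 0..255 range.
    import cv2

    frame = cv2.imread("RGBImages/some_frame.jpeg")   # example path, use one of your frames
    lab_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2LAB)
    y, x = 500, 640                                   # example coordinates of a subtitle pixel
    L, a, b = lab_frame[y, x]
    print("Lab record of the picked pixel:", L, a, b)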
Video instructions:
There are many instructions on YouTube made by users of this program.
Some of the most recommended by them are:
https://www.youtube.com/watch?v=Cd36qODmYF8
https://www.youtube.com/watch?v=VHsUfqqAkWY&t=124s
#########################################################################################################
Known Issues
#########################################################################################################
1) Timings differ between opening video with OpenCV and with FFMPEG, but they are now very
close, with a difference of about 0-1 milliseconds.
2) In the case of Center alignment, which is the default, note that if a whole subtitle
lies in the right half of the selected bounding box, it will be removed.
3) Missed subtitles. Check whether the missed subtitles are shorter than 12 frames (lasting
less than ~0.5 s); if so, you can change "Sub Frames Length" to 6 or another value in the
"Settings" tab.
#########################################################################################################
Recommended Settings And Some Solutions For "Run Search" and "Create Cleared TXTImages"
#########################################################################################################
1) For getting the best results during "Run Search" and "Create Cleared TXTImages":
--------------------------------
1-1) Before starting "Run Search":
--------------------------------
After opening the video:
*) Test all settings in the "Settings" tab by pressing the "Test" button while selecting
different video frames with very light and very dark backgrounds and so on.
*) Set the bounding box in the "Video Box" to the area where most subs appear (you can move
the split lines inside it for that); by default it is the whole video frame.
It is recommended to reduce the search area in order to get fewer wrong detections and
fewer timing splits.
In the worst cases, you can detect each subtitle line separately by running the program
multiple times with different video area selections (this can fix
possible incorrect splits of multi-line subtitles within a single frame).
*) Check which horizontal alignment the subtitles have relative to the selected
bounding box ("Center/Left/Right/Any") and set the related value in the "Text Alignment"
property in the "Settings" tab.
alignment: Center - covers most cases, so it is set by default. Note that if a whole
subtitle lies in the right half of the selected bounding box, it will be
removed.
alignment: Any - currently supported but does not work as well as the other types.
*) To decrease the amount of wrong timing splits during "Run Search", wrong produced
timings, undetected subtitles, and missed multiple lines during "Create Cleared TXTImages" on
found images with subs,
you can do the following - tune "Moderate Threshold" (moderate_threshold) (located in the
left panel of the "Settings" tab) in the range [0.25, 0.6] with the "Test" button.
You need to find the optimal value at which the subtitle symbols in the 'After First/Second/Third
Filtration' images have unbroken edges and
are filled with white (not empty), while most of the background disappears. This depends heavily
on the video and even on the concrete frame,
so try very bright and very dark scenes and so on. The higher the value, the more pixels
will be filtered/removed as not related to the subtitles.
0.25 - works on most videos, especially 1080p, but can produce
a lot of garbage.
0.5-0.6 - very often optimal for low-resolution subtitles (~480p) with
solid outlines/borders (like white subs with black outlines).
0.1 - in some cases when subtitles have no outlines and/or are transparent (but many
splits as well as garbage can appear).
*) It is recommended to set "Use Filter Colors", which can greatly improve the accuracy of
the results:
more accurately produced timing, fewer wrong splits and better cleared TXT image
results (read below in a separate topic).
*) Splits are also affected by two parameters:
vedges_points_line_error = 0.3
ila_points_line_error = 0.3
0.3 means roughly a 30% allowed difference; if the difference is larger, two subtitles that
are nearest in time are treated as different.
The higher the values, the more rarely the time code will be split.
The lower the values, the more often the time code will be split.
I don't recommend changing them. Only if you are sure that there are no sequential
subtitles in the video (ones that follow each other without a pause) can you try to increase
them. (A small sketch of this kind of relative-difference split check is shown after this list.)
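The exact formula the program uses for these two parameters is not documented here; the
following is only a rough, hypothetical illustration in Python of a 30% relative-difference
check of the kind described above:

    def should_split(prev_points: int, cur_points: int, line_error: float = 0.3) -> bool:
        """Rough illustration: treat two neighboring frames as different subtitles
        when their point counts differ by more than line_error (30%)."""
        if max(prev_points, cur_points) == 0:
            return False
        relative_diff = abs(prev_points - cur_points) / max(prev_points, cur_points)
        return relative_diff > line_error

    # Example: 1000 vs 650 white points -> 35% difference -> split the time code.
    print(should_split(1000, 650))  # True
    print(should_split(1000, 800))  # 20% difference -> False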
-------------------------------------------
1-2) Before starting "Create Cleared TXTImages":
-------------------------------------------
*) Test all settings in the "Settings" tab by pressing the "Test" button while selecting
different video frames with very light and very dark backgrounds and so on.
*) Check whether the subtitles have a border color that is darker than the subtitle text color
(in most cases they do;
if not, disable the "Characters Border Is Darker" checkbox, the first setting on the right
in the "Settings" tab).
In most cases the program correctly identifies which color belongs to the subtitle text,
but in some cases it is too complicated;
in such cases the decision is made according to this setting.
*) If symbols go missing while clearing images or on a selected video frame, tune
"Moderate Threshold For Scaled Image" (moderate_threshold_for_scaled_image) (located
in the right panel of the "Settings" tab) in the range [0.1, moderate_threshold].
0.25 works in most cases; if some symbols are removed, try lower
values.
It should not be higher than the optimal value for "Moderate Threshold"
(moderate_threshold).
In some cases, when subtitles have bad/missing borders (outline color) and have
mostly the same color as the background
(so they are hard to separate from the background), it needs to be set to 0.1 or 0.15;
in such cases it is also recommended
to turn on "Use ILAImages before clear TXT images from borders", which can improve the
results.
*) It is recommended to set "Use Filter Colors", which can greatly improve the accuracy of
the results:
more accurately produced timing, fewer wrong splits and better cleared TXT image
results (read below in a separate topic).
*) If you are using "Use Outline Filter Colors" or get very good ILA images (all
characters separated from the background),
it is recommended to turn on "Use ILAImages for getting TXT symbols areas", which
can reduce the amount of garbage.
*) To decrease the amount of garbage in the cleaned TXT images, you can try turning on
"Clear Images Logical" (located in the right panel of the "Settings" tab), which is turned
off by default.
This option can remove garbage, but it can also remove good elements by mistake. It tries to
remove figures that mostly appear inside other figures, and so on.
It is mostly useful if all of the following are true:
* - the language is not a hieroglyphic or Arabic type
* - symbols are not broken in the produced results (the subtitles are of good quality,
with stable luminance)
* - "Use Outline Filter Colors" is used (the subtitles have solid outlines on all
sides)
* - "Use ILAImages for getting TXT symbols areas" is turned on
*) To decrease the amount of garbage in the cleaned TXT images, you can also try turning on
"Remove too wide symbols".
Don't use it for Arabic or handwritten subs, which can have very long symbols or symbols
that are not written separately.
5) "Create Cleared TXTImages" for subs with multiple colors in a single subtitle
line:
There are possible cases when, in a single subtitle line, some symbols have one color
while other symbols have another;
for example, the left part of the subtitle line has one color and the right part
another.
In this case you can use "Use Filter Colors" with all of the colors used in the
subtitle defined, and turn on "Combine To Single Cluster" in the "Settings" tab
(a small sketch of combining several filter colors into one mask is shown below).
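The following is only an illustration (Python with OpenCV, not the program's internal code) of
the idea of combining several Lab filter colors into a single mask; the tolerance value and
function name are hypothetical:

    import cv2
    import numpy as np

    def combined_color_mask(bgr_frame, lab_filter_colors, tolerance=40):
        """Mark pixels close to ANY of the given Lab filter colors (8-bit OpenCV Lab values)."""
        lab = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2LAB).astype(np.int32)
        mask = np.zeros(bgr_frame.shape[:2], dtype=bool)
        for color in lab_filter_colors:                # e.g. [(235, 128, 128), (180, 140, 120)]
            dist = np.linalg.norm(lab - np.array(color), axis=2)
            mask |= dist <= tolerance                  # OR the per-color masks together
        return mask.astype(np.uint8) * 255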
#########################################################################################################
Used terms:
#########################################################################################################
#########################################################################################################
For OCR (conversion of images of text into machine-encoded text), the following can be used:
#########################################################################################################
#-----------------------------------------------------
#-----------------------------------------------------
Useful links:
http://www.baidu.com
https://rapidapi.com/blog/directory/baidu-ocr-text-recognition/
https://github.com/Baidu-AIP/python-sdk
https://programmer.ink/think/5d35803c404e4.html
https://ai.baidu.com/ai-doc/OCR/Dk3h7yf8m
https://ai.baidu.com/ai-doc/OCR/Kk3h7y7vq
https://ai.baidu.com/sdk#ocr
#-----------------------------------------------------
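A minimal sketch of feeding the cleared images to OCR using the Baidu OCR Python SDK linked
above (package "baidu-aip"). The APP_ID/API_KEY/SECRET_KEY values are placeholders you get
from your own Baidu AI account, and the "TXTImages" folder and ".jpeg" extension are
assumptions about where and how this program saved the cleared images:

    import glob
    from aip import AipOcr

    APP_ID, API_KEY, SECRET_KEY = "your_app_id", "your_api_key", "your_secret_key"
    client = AipOcr(APP_ID, API_KEY, SECRET_KEY)

    # Adjust the folder name and extension to match your actual output files.
    for path in sorted(glob.glob("TXTImages/*.jpeg")):
        with open(path, "rb") as f:
            result = client.basicGeneral(f.read())   # basicAccurate() gives higher accuracy
        text = " ".join(item["words"] for item in result.get("words_result", []))
        print(path, "->", text)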
#########################################################################################################
OUTDATED OLD INFORMATION:
#########################################################################################################
2) In the menu select File->Open Video and choose the video file in which
subs need to be found (avi, mpeg, mpg, asf, asx, wmv, ...).
A frame of the current video will appear in the Video Box.
3) You can immediately press Run (by default you must currently be in the Search tab);
the Video Box will then only display the currently processed frame (refreshed about once
per second). If at that moment at least one sub has been found, its image will be displayed
in the Image Box. The search results are saved into the RGBImages folder (the original
screenshots from the video) and ISAImages (Intersected Subtitles Areas, obtained by using
multiple frames); make sure that you have at least 100 MB of free disk space.
4) You can set the start and end time of the search for subs; for this use the
hotkeys Ctrl+Z and Ctrl+X (or the Edit -> ... menu).
Use the slider in the Video Box to navigate through the video :),
and for frame-by-frame navigation use the arrows <-, ->, Up, Down
or the mouse wheel.
5) You can reduce the detection area by moving the vertical and horizontal
separating lines in the Video Box with the mouse.
6) At the end of the subtitle search it is desirable to go to the RGBImages folder and
delete all frames that are false detections; then from these images you can
create an empty sub with timings: go to the OCR tab and click
on Create Empty Sub. There is also the possibility of extending subs by
setting the "Min Sub Duration" value (the minimum duration of a sub in
seconds); a logical attempt will then be made to extend each sub
by changing the time it ends.
Moderate Threshold - by means of complex operators, the source frame (during
primary processing) is turned into an image where
each point of the original frame is assigned a strength (the strength
of the color difference),
averaged so that the color-difference strength in regions of bright text and pale text
does not differ too much. Then the maximum color-difference strength on this
image is found, this value is
multiplied by "Moderate Threshold", and all points with a strength below the resulting
value are filtered out, so that a black-and-white (binary) image remains.
(A rough sketch is shown below.)
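The exact operators used by the program are not described here; the following is only a rough
Python illustration of the thresholding step, with a simple Sobel gradient magnitude standing
in for the program's "strength of the color difference":

    import cv2
    import numpy as np

    def moderate_threshold_binarize(gray_frame, moderate_threshold=0.25):
        gx = cv2.Sobel(gray_frame, cv2.CV_32F, 1, 0)
        gy = cv2.Sobel(gray_frame, cv2.CV_32F, 0, 1)
        strength = cv2.magnitude(gx, gy)               # stand-in for the color-difference strength
        cutoff = strength.max() * moderate_threshold   # maximum strength * "Moderate Threshold"
        return np.where(strength >= cutoff, 255, 0).astype(np.uint8)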
Between Text Distance - used in secondary image processing. As you
should already understand, the result of the primary processing is a binary
image (put more simply, one consisting of points with only two
possible colors, black and white). This image is broken
into horizontal lines of thickness "Line Height". In each of these lines, blocks are searched
for
(presumably textual ones), by the principle: if the next column of height
"Line Height" contains at least one white point, then the block has not yet ended. Then
the distances between all the found blocks are checked; a distance should not exceed
the width of the video frame multiplied by "Between Text Distance". If it does,
the block whose center is farthest from the center of the frame is deleted.
Min Points Number - the number of white points in all remaining blocks in the line (see
above) is counted; if their number is less than "Min Points Number", then
all blocks in the line are deleted.
Min Points Density - the total area of the remaining blocks in the line is counted
(= SS) (see above), as well as the total number of white points in them (= S).
It is checked that S / SS >= "Min Points Density"; if not, all of these
blocks are deleted. (A rough sketch of this per-line block logic is shown below.)
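The following is only a rough Python illustration of the per-line block logic described above
(Between Text Distance is omitted); "strip" is one "Line Height" row band of the binary image
as a 2D numpy array of 0/255 values, and the default parameter values are hypothetical:

    import numpy as np

    def find_blocks(strip):
        """Return (start, end) column ranges where columns contain at least one white point."""
        has_white = (strip > 0).any(axis=0)
        blocks, start = [], None
        for x, white in enumerate(has_white):
            if white and start is None:
                start = x
            elif not white and start is not None:
                blocks.append((start, x))
                start = None
        if start is not None:
            blocks.append((start, len(has_white)))
        return blocks

    def filter_line(strip, min_points_number=10, min_points_density=0.3):
        blocks = find_blocks(strip)
        total_white = sum(int((strip[:, s:e] > 0).sum()) for s, e in blocks)
        if total_white < min_points_number:             # "Min Points Number" check
            return []
        total_area = sum((e - s) * strip.shape[0] for s, e in blocks)
        if total_area == 0 or total_white / total_area < min_points_density:  # "Min Points Density"
            return []
        return blocks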
Sum Color Difference - used at the very beginning of the initial processing to
reduce the detection area. The original video frame is split into lines of
thickness 1, and each line is divided into blocks of equal width "Segment Width".
In each block the total color difference of neighboring points is calculated;
if the line contains "Min Segments Count" neighboring blocks with this sum >= "Sum
Color Difference", then the line is assumed to be text-containing.
(A rough sketch is shown below.)
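A rough Python illustration of this row check (the default parameter values are hypothetical;
"row" is a 1D numpy array of pixel intensities for one line of thickness 1):

    import numpy as np

    def row_contains_text(row, segment_width=16, min_segments_count=2, sum_color_difference=600):
        diffs = np.abs(np.diff(row.astype(np.int32)))    # color difference of neighboring points
        consecutive = best = 0
        for i in range(len(row) // segment_width):
            seg_sum = diffs[i * segment_width:(i + 1) * segment_width].sum()
            consecutive = consecutive + 1 if seg_sum >= sum_color_difference else 0
            best = max(best, consecutive)
        return best >= min_segments_count                # enough neighboring "strong" segments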
Min Sum Multiple Color Diff - used at the end of the secondary processing. As you
should remember, at the end of the secondary processing some blocks
(potentially text-containing blocks) may still be left in a
line of thickness "Line Height". The whole area from their beginning to their end
is divided into blocks of width "Segment Width". In each row (there will be
"Line Height" such rows) of each such block, the color-difference sum is calculated.
If in every row of some block this sum is >= "Min Sum Multiple
Color Diff", then such a block is considered potentially text-containing.
If, as a result, this area contains "Min Segments Count" such
neighboring blocks, then the entire line is assumed to be text-containing.
OCR Settings
Clear Images Logical (Remove Garbage) - performs an analysis of the cleaned images for
present symbols and garbage.
When turned off, it will not try to remove potential garbage from the cleaned
images;
sometimes this produces better results.
Sub Square Error - in fact, the initial detection of a sub works as follows:
the first frame potentially containing text is found; then, if at least "Sub
Frames Length" frames (including that one) are also potentially
text-containing and the areas of their potentially text-containing regions do not
differ
by more than "Sub Square Error", the sub is considered found (after that there is a test
for the presence of a text line; if the test does not fail, then the search
for the end of the sub continues,
matching identical subs). (A rough sketch is shown below.)
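A rough Python illustration of this detection rule (the default parameter values are
hypothetical, and "Sub Square Error" is treated here as a relative area tolerance):

    def sub_detected(frame_text_areas, sub_frames_length=12, sub_square_error=0.3):
        """frame_text_areas: text-area sizes (in pixels) of consecutive frames,
        starting from the first potentially text-containing frame (0 = no text)."""
        if len(frame_text_areas) < sub_frames_length:
            return False
        window = frame_text_areas[:sub_frames_length]
        if 0 in window:
            return False                  # a frame without text breaks the run
        base = window[0]
        return all(abs(area - base) / base <= sub_square_error for area in window)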
Text Procent - used in the test for the presence of a text line. It looks for at
least
one line of thickness "Line Height" in which the sum of the lengths of all potentially
text-containing blocks, in relation to the extent of the area on which they
are located, is >= "Text Procent". It is also checked that this sum is >=
"Min Text Length" * "video frame width".
Note:
The long search time is due to the fact that text detection uses
very complex and time-consuming algorithms developed by Chinese researchers (whose articles I
studied and implemented in this program); namely, the main
(studied) works used are: