Analyzing Malicious PDF Files - Part 21

Uploaded by

tw626
Copyright
© © All Rights Reserved

0

1
00:00:01,940 --> 00:00:03,660
So let's go to PDF parser.
1

2
00:00:10,640 --> 00:00:21,670
We will again run as
>python pdf-parser.py
then give the location of the pdf file
2

3
00:00:21,670 --> 00:00:27,100
example1.pdf. Press 'Enter' and it throws a bunch of results at us.
3

4
00:00:27,100 --> 00:00:37,150
So the first result of PDF parser is nothing but the complete raw output of the PDF
file.
4

5
00:00:37,150 --> 00:00:46,730
You can see that it begins with the PDF magic bytes, which tell us that it's a PDF file
of version 1.4.
5
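
The header check described here can be reproduced in a few lines of Python. This is a minimal sketch, not how pdf-parser itself does it; the function name `pdf_header_version` and the sample bytes are made up for illustration, and the 1024-byte search window is an assumption.

```python
import re

def pdf_header_version(data: bytes):
    """Return the version from a '%PDF-x.y' header, or None if absent."""
    # Assumption: the header sits within the first 1024 bytes of the file.
    match = re.search(rb"%PDF-(\d\.\d)", data[:1024])
    return match.group(1).decode("ascii") if match else None

# Hypothetical file start, standing in for example1.pdf.
sample = b"%PDF-1.4\n1 0 obj\n<< /Type /Catalog >>\nendobj\n"
print(pdf_header_version(sample))  # prints 1.4
```

A file that does not carry the magic bytes simply returns None, which is a quick hint that it is not a PDF at all.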

6
00:00:46,760 --> 00:00:49,290
Then we have objects inside it.
6

7
00:00:49,340 --> 00:00:54,960
You can just keep scrolling down, and you can see there is one object that contains
a stream
7

8
00:00:58,590 --> 00:01:00,200
as you move down.
8

9
00:01:00,270 --> 00:01:02,060
So there is another object.
9

10
00:01:02,070 --> 00:01:09,000
This might seem suspicious, but you have to look at what exactly is there
inside this particular
10

11
00:01:09,000 --> 00:01:09,510
dictionary.
11

12
00:01:09,510 --> 00:01:16,440
So it seems like it's a font setting element where this PDF has some specific font
setting element
12

13
00:01:16,470 --> 00:01:17,970
these are basically
13

14
00:01:18,000 --> 00:01:24,680
the hex representation of the value of that font.
14

15
00:01:24,720 --> 00:01:31,680
So it's not really something critical in terms of maliciousness of the file. You
can further come down.
15

16
00:01:34,010 --> 00:01:38,060
So these objects that contain streams can be of interest.
16

17
00:01:38,210 --> 00:01:45,650
But as you see, these objects have been referenced, so we have to look at who is actually
trying to reference
17

18
00:01:45,650 --> 00:01:55,340
to these or whether they are actually being referenced or they are just some
placeholders.
18

19
00:01:55,390 --> 00:02:05,020
So if you move down, object 24 tells us that it basically has a javascript, and the
javascript
19

20
00:02:05,020 --> 00:02:12,560
is executing a URL with unescape. If you move further down, there is object 25.
20

21
00:02:12,620 --> 00:02:18,430
That's more about the title of the PDF, and then we have the end of the file.
21

22
00:02:22,680 --> 00:02:23,000
OK.
22

23
00:02:23,030 --> 00:02:30,740
In order to quickly search for anything inside the PDF, the option that pdf parser
gives us is '-s'
23

24
00:02:30,830 --> 00:02:35,960
With this parameter, you can search for any string inside the PDF.
24

25
00:02:36,110 --> 00:02:39,460
Let's say I want to look for 'javascript'
25

26
00:02:41,860 --> 00:02:45,550
So it gets me all the locations where javascript is found.
26

27
00:02:45,640 --> 00:02:51,880
For example object number 24 contains javascript and it has the actual script as
well.
27

28
00:02:52,800 --> 00:03:03,010
And there is another object, object 26, which contains a dictionary that is calling
the
28

29
00:03:03,010 --> 00:03:07,620
javascript and referencing object number 23.
29
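
To make the idea of the search concrete, here is a rough Python sketch of a string search over indirect objects. It is a simplification and not pdf-parser's own code: it scans the raw object bodies with a regular expression and ignores streams, filters, and object versions. The names `split_objects`, `search_objects`, and `SAMPLE` are hypothetical, and the sample bytes are only modeled on the objects 23, 24, and 26 mentioned in the walkthrough.

```python
import re

# Matches "<num> <gen> obj ... endobj" and captures the object number and body.
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)endobj", re.DOTALL)

def split_objects(data: bytes) -> dict:
    """Map each object number to its raw body."""
    return {int(m.group(1)): m.group(3) for m in OBJ_RE.finditer(data)}

def search_objects(data: bytes, needle: str) -> list:
    """Return the numbers of objects whose body contains needle (case-insensitive)."""
    target = needle.lower().encode()
    objects = split_objects(data)
    return sorted(num for num, body in objects.items() if target in body.lower())

# Hypothetical fragment; the real example1.pdf is not reproduced here.
SAMPLE = b"""23 0 obj
<< /Names [(New_Script) 24 0 R] >>
endobj
24 0 obj
<< /S /JavaScript /JS (var x = 1;) >>
endobj
26 0 obj
<< /JavaScript 23 0 R >>
endobj
"""

print(search_objects(SAMPLE, "javascript"))  # prints [24, 26]
```

On this fragment the search reports objects 24 and 26, which mirrors the two hits described above.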

30
00:03:07,660 --> 00:03:14,090
So let us see what exactly is there in object number 26.
30

31
00:03:14,110 --> 00:03:22,370
I think that is going to be the same data that we see here but let's run '-o' which
is for object
31

32
00:03:22,520 --> 00:03:25,090
and pass it object number which is 26.
32

33
00:03:25,250 --> 00:03:29,580
So if we press enter it gives us the content of object number 26.
33

34
00:03:29,780 --> 00:03:37,700
So again the object number 26 says that it's trying to call a javascript that is
34

35
00:03:37,710 --> 00:03:38,420
at object number 23
35

36
00:03:38,420 --> 00:03:42,460
So let's go to object 23 and see what's there.
36

37
00:03:43,730 --> 00:03:46,990
So object 23 is interesting here.
37

38
00:03:47,060 --> 00:03:49,190
It's not really doing anything.
38

39
00:03:49,190 --> 00:03:52,370
It is just referencing to object number 24.
39

40
00:03:53,200 --> 00:03:56,600
And you guys know what is there in object number 24.
40

41
00:03:57,700 --> 00:04:00,340
It's our javascript that we just now saw.
41

42
00:04:00,340 --> 00:04:09,220
So this is basically a kind of way by which malware authors try to create a sort of
loop so that the
42

43
00:04:09,220 --> 00:04:14,280
PDF tools are not able to quickly recognize where the javascript is located.
43

44
00:04:14,470 --> 00:04:22,330
So if you see, object 26 was referencing object number 23,
44

45
00:04:22,340 --> 00:04:25,610
and object number 23 referenced object number 24.
45

46
00:04:25,620 --> 00:04:30,870
And it was object 24 that actually contained the javascript inside it.
46
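
The chain just described (26 to 23 to 24) can be followed mechanically. The sketch below is an illustration under simplifying assumptions: it only follows the first indirect reference ("N G R") found in each object body, and it stops if it revisits an object. The names `follow_chain` and `SAMPLE` are hypothetical, and the sample bytes are invented to resemble the objects in the walkthrough.

```python
import re

OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)endobj", re.DOTALL)
REF_RE = re.compile(rb"(\d+)\s+(\d+)\s+R\b")  # indirect reference such as "23 0 R"

def follow_chain(data: bytes, start: int) -> list:
    """Follow the first indirect reference in each object, starting at start."""
    objects = {int(m.group(1)): m.group(3) for m in OBJ_RE.finditer(data)}
    chain, seen, current = [start], {start}, start
    while True:
        body = objects.get(current)
        ref = REF_RE.search(body) if body is not None else None
        if ref is None:
            break  # no outgoing reference: end of the trail
        nxt = int(ref.group(1))
        if nxt in seen:
            break  # guard against a genuine reference loop
        chain.append(nxt)
        seen.add(nxt)
        current = nxt
    return chain

# Hypothetical fragment modeled on objects 23, 24 and 26.
SAMPLE = b"""23 0 obj
<< /Names [(New_Script) 24 0 R] >>
endobj
24 0 obj
<< /S /JavaScript /JS (var x = 1;) >>
endobj
26 0 obj
<< /JavaScript 23 0 R >>
endobj
"""

print(follow_chain(SAMPLE, 26))  # prints [26, 23, 24]
```

A real document can hold several references per object, so a fuller version would walk all of them rather than only the first.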

47
00:04:31,390 --> 00:04:33,100
So we have the javascript here.
47

48
00:04:33,220 --> 00:04:35,980
Now it's a simple unescape script.
48

49
00:04:36,040 --> 00:04:42,220
All you have to do is just append a document.write to it and you can see what
exactly this javascript
49

50
00:04:42,220 --> 00:04:53,230
translates into. Let us quickly analyze another example.
50
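
Before moving on, the unescape step from the first example can also be reproduced offline, without running the script in a browser. The following is a minimal Python sketch of what JavaScript's unescape does with %XX and %uXXXX sequences; the function name `js_unescape` and the encoded string are made up for illustration and are not taken from example1.pdf.

```python
import re

def js_unescape(text: str) -> str:
    """Decode %XX and %uXXXX sequences the way JavaScript's unescape() does."""
    def decode(match):
        hex_digits = match.group(1) or match.group(2)
        return chr(int(hex_digits, 16))
    # %uXXXX is tried first so that it is not misread as a %XX sequence.
    return re.sub(r"%u([0-9a-fA-F]{4})|%([0-9a-fA-F]{2})", decode, text)

# Hypothetical encoded URL, not the one from the analyzed file.
encoded = "%68%74%74%70%3A%2F%2Fexample.test"
print(js_unescape(encoded))  # prints http://example.test
```

Decoding in Python keeps the suspicious script from ever executing, which is a safer habit than rendering it with document.write on an analysis machine that is not isolated.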

51
00:04:53,260 --> 00:04:57,880
So again it's a pretty long output, and we already have a result from PDFiD
51

52
00:04:57,890 --> 00:05:01,450
that example2.pdf also contains javascript
52

53
00:05:01,570 --> 00:05:02,900
So let us search for that
53

54
00:05:08,290 --> 00:05:09,090
OK.
54

55
00:05:09,110 --> 00:05:18,640
So it's saying that there is a script that is referencing to an action.
55

56
00:05:18,810 --> 00:05:24,120
So let's search for action here.
56

57
00:05:24,150 --> 00:05:25,820
What exactly it does.
57

58
00:05:25,890 --> 00:05:34,710
OK so if we look at the referencing action, this javascript is trying to launch
command.exe. From there
58

59
00:05:34,860 --> 00:05:43,810
it's going to the home drive. It's checking whether template.pdf exists on the desktop
59

60
00:05:43,810 --> 00:05:45,260
or not.
60

61
00:05:45,280 --> 00:05:49,370
If that file exists it's actually executing it.
61

62
00:05:50,270 --> 00:05:55,070
So this is what this javascript is trying to do. It's basically a launch action: as
soon as you launch
62

63
00:05:55,070 --> 00:05:55,980
that PDF,
63

64
00:05:56,060 --> 00:05:59,560
This is the particular portion of the script that will get executed.
64

65
00:05:59,560 --> 00:06:05,870
So that is how we follow the trails and try to understand what the javascript is
trying to do.
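
As a starting point for following such trails, a quick keyword count over the raw file shows which names are worth searching for. This is a rough sketch in the spirit of the PDFiD result mentioned earlier, not its actual implementation: it does not handle names obfuscated with hex escapes, and the keyword list, the function name `count_keywords`, and the sample bytes are assumptions made for illustration.

```python
import re

# Assumed shortlist of names that often deserve a closer look.
KEYWORDS = ("/JavaScript", "/JS", "/Launch", "/OpenAction")

def count_keywords(data: bytes) -> dict:
    """Count occurrences of each keyword as a complete PDF name."""
    counts = {}
    for keyword in KEYWORDS:
        # The lookahead stops a short name from matching inside a longer one.
        pattern = re.escape(keyword.encode()) + rb"(?![A-Za-z0-9])"
        counts[keyword] = len(re.findall(pattern, data))
    return counts

# Hypothetical fragment with a launch action; not taken from example2.pdf.
SAMPLE = (
    b"1 0 obj\n<< /Type /Catalog /OpenAction 2 0 R >>\nendobj\n"
    b"2 0 obj\n<< /S /Launch /Win << /F (example.exe) >> >>\nendobj\n"
)

counts = count_keywords(SAMPLE)
print(counts["/Launch"], counts["/OpenAction"], counts["/JavaScript"])  # prints 1 1 0
```

Any keyword with a non-zero count is then a natural argument for the '-s' search shown in the walkthrough.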
