01 - 04 - Detecting Encodings With Python - en

This video discusses encoding schemes, particularly URL encoding and Base64 encoding, which are used to transmit data that doesn't conform to specific protocol rules. It explains how these encoding methods can be utilized for obfuscation in cybersecurity, making it harder for unauthorized users to identify sensitive information in network traffic. The video also introduces a helper function to check if data is likely encoded, demonstrating the process with examples and potential pitfalls in identifying encoded data.

Uploaded by

rasha.ziad.share


Hello and welcome back to this course.

In the past few videos, we've been talking about identifying a good network protocol, and fields within its packets, for command and control. In the first of the three videos, we talked about the code that we use to accomplish this. In the previous video, we talked about entropy, one of our measures of suitability, and now, in this video, we're going to talk about encoding schemes.

Encoding schemes were originally designed to allow data that doesn't follow the rules of a particular protocol to be transmitted over that protocol. In some cases, this means we have protocols that can only carry printable data.
If you have unprintable characters and want to send them over that particular protocol, you need to convert the unprintable characters to printable ones. Another case is where you have protocols that have reserved or special characters. For example, in a URL, a question mark is a reserved character, so if you want to use a question mark somewhere in the URL and don't want it interpreted as that reserved character, you need to encode it. In a moment, we'll talk about URL encoding, or percent encoding, which is designed to do exactly that.

These are the original purposes for various encoding schemes. However, they are also commonly applied, especially in offensive cybersecurity, for obfuscation. For example, if you're sending a username and a password or other sensitive data over the network, then it's easy for anyone monitoring that network traffic to do a keyword search for "username" or "password", find the packet they want, and see that data. However, if that username or password is encoded, then that keyword search won't match unless you know to reverse the encoding.
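The keyword-search point can be illustrated with a minimal sketch. The payload, the field names, and the `keyword_hit` function below are hypothetical and only stand in for whatever a monitor might actually search for.

```python
import base64

# Hypothetical payload; the field names and values are made up for illustration.
plaintext = b"username=alice&password=hunter2"

# Base64-encode the payload before it would be placed in a packet field.
encoded = base64.b64encode(plaintext)


def keyword_hit(data: bytes) -> bool:
    """Naive monitor: flag data that contains an obvious credential keyword."""
    return b"username" in data or b"password" in data


print(keyword_hit(plaintext))                  # True: keywords visible in the clear
print(keyword_hit(encoded))                    # False: encoded form hides the keywords
print(keyword_hit(base64.b64decode(encoded)))  # True: reversing the encoding restores them
```

Encoding here is not encryption: anyone who thinks to decode the field recovers the data, which is why the search only fails for a monitor that does not reverse the encoding.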
We're talking about encoding schemes here because if we're going to use a network protocol for command and control and put our data in a particular field, we might want to have the option to encode that data. If so, it would be useful to choose a field where encoded data is not unusual, if possible. So in this video, we're going to talk about two encoding schemes: URL encoding and Base64 encoding.

Our main function here, or the helper function, is called check encoding. We give it some data, and it tells us whether or not that data is likely to be encoded. Our first test is: if the length of the data is zero, then return false, because zero-length data can be successfully decoded by any scheme, so it would be confusing. If we have non-zero-length data, we check for URL encoding and Base64 encoding. If we find that it matches our rules for those, then we return either URL or Base64, respectively. That goes back to the traffic analyzer script we looked at a couple of videos ago, which includes that information in its output, as we saw. So let's talk about URL encoding first.
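The control flow just described might look roughly like the sketch below. The video's actual source is not shown in this transcript, so the function names, the exact return values, and the bodies of the two checks are assumptions based on the spoken description.

```python
import base64
import re
from typing import Union


def check_url_encoding(data: str) -> bool:
    # Assumed rule: the whole string is one or more "%XX" units.
    return re.fullmatch(r"(%[0-9A-Fa-f]{2})+", data) is not None


def check_base64(data: str) -> bool:
    # Assumed rule: the string decodes as strict Base64.
    # validate=True is a choice made for this sketch; without it,
    # b64decode silently discards characters outside the alphabet.
    try:
        base64.b64decode(data, validate=True)
        return True
    except ValueError:  # binascii.Error is a subclass of ValueError
        return False


def check_encoding(data: str) -> Union[str, bool]:
    # Zero-length data "decodes" under any scheme, so report nothing.
    if len(data) == 0:
        return False
    if check_url_encoding(data):
        return "URL"
    if check_base64(data):
        return "Base64"
    # Assumption: data that matches neither rule is reported as False.
    return False


print(check_encoding(""))                  # False
print(check_encoding("%48%69"))            # URL
print(check_encoding("SGVsbG8gV29ybGQ="))  # Base64
print(check_encoding("Hello World"))       # False
```

Checking the stricter URL rule first is a design choice of this sketch, so that a fully percent-encoded string is never tested as Base64.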
With URL encoding, we're going to focus on things that are completely encoded. Often in a URL, the only characters that are encoded are the ones that break the rules, the ones that are reserved characters. So you might have something that's mostly printable, with the occasional encoded character. We certainly could use that approach for command and control by randomly encoding characters to break up text matching, and we could easily modify this code to look for those opportunities. However, in this case, we're just going to look for something that's completely URL encoded.

URL encoding gets its other name, percent encoding, from how it encodes data. Each character that's encoded in the string is written as a percent sign followed by the hexadecimal representation of the corresponding ASCII character. For example, a space, which has a hex value of 20, would be represented as %20 in percent encoding.

For our check URL encoding function here, we're going to look for things that match a rule that says it should be a percent sign followed by two hexadecimal digits, followed by potentially more of the same: percent, hex, hex. We're going to use Python's re library to do that, because it lets us match the string using regular expressions, and here is the regular expression that we'll be using.

Starting in the middle here, let's take a look. We've got our percent sign that we want to match, and then we have this section in square brackets; square brackets mean any of these. This particular section says: if it is a digit 0-9, or capital A through F, or lowercase a through f, then match that character, because those are the allowable values for hex digits. After that we have this 2 in curly braces, which means match exactly two of whatever is previous. So we have a percent sign and something that matches a hex character, and we want two of those, which would match something like %20, our URL encoding for a space.
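The pieces described so far, one percent sign followed by exactly two hex digits, can be tried out in isolation. This is only a sketch: the names `UNIT` and `percent_encode_all` are hypothetical and do not come from the video.

```python
import re

# One percent-encoded unit: "%" followed by exactly two hex digits.
UNIT = re.compile(r"%[0-9A-Fa-f]{2}")


def percent_encode_all(text: str) -> str:
    """Hypothetical helper: percent-encode every byte of the UTF-8 form of text."""
    return "".join(f"%{byte:02X}" for byte in text.encode("utf-8"))


print(percent_encode_all(" "))      # %20
print(percent_encode_all("Hi?"))    # %48%69%3F
print(bool(UNIT.fullmatch("%20")))  # True
print(bool(UNIT.fullmatch("%2G")))  # False: G is not a hex digit
print(bool(UNIT.fullmatch("%2")))   # False: only one hex digit
```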
So all of this is wrapped up in a set of parentheses, saying treat this all as one unit, so we only want to match if we see percent, hex, hex, percent, hex, hex. And then we want one or more of them, so if we can't match at least one, we return false. Then we pass in our data, and if the entire string of data that we pass in matches this, so it's percent, hex, hex, percent, hex, hex, et cetera, then we return true, saying yes, it is URL encoded. Otherwise, we return false, saying it doesn't match our rules. So it's entirely possible that it is a field that uses URL encoding, but with only some characters URL encoded, the ones that are reserved. Because we're using full match for this, we won't match, but we could modify this to allow partial URL encoding if we chose.

The other, more difficult one that we want to test for is Base64 encoding. Base64 encoding gets its name from the fact that it uses 64 characters as an alphabet for encoding data. Those are the alphanumeric characters, so capital A to Z, lowercase a to z, and 0-9, and then a couple of special characters.
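The alphabet can be written out in Python to confirm its size. The video does not name the two special characters; in the standard Base64 variant they are `+` and `/`, which is what this sketch assumes.

```python
import string

# Standard Base64 alphabet: uppercase, lowercase, digits, then "+" and "/".
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

# 26 letters, doubled for both cases, plus 10 digits, plus 2 special characters.
print(len(string.ascii_uppercase) * 2 + len(string.digits) + 2)  # 64
print(len(ALPHABET), len(set(ALPHABET)))                         # 64 64
```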
Take the number of letters, double that, add 10 for 0-9, and then add 2, and you get 64. The simple way to test for Base64 encoding is to try to decode it and see if it fails. Python has a base64 library from which we can import b64decode. If we call b64decode on the data and it decodes to a plaintext, we return true, meaning that it could be Base64 encoded. If something goes wrong, that means it wasn't a valid Base64 encoding, so we return false.

As we're going to see when we run our main function in a moment, this is a bit of a shaky way of testing. The reason is that we don't know the data that's stored within our Base64-encoded data. All we're testing is whether it decodes as Base64, which essentially just means that its length is a multiple of four characters and that it's limited to the 64-character alphabet I just mentioned, possibly ending with one or two equals signs, which are used for padding in Base64.

Down here in our main function, we have three messages that we're going to check the encoding for. We'll use Hello World, which is actually Base64 encoded; we'll use a URL-encoded string, so you see the percent, hex, hex, et cetera; and then we'll use the string FFFFFFFF, so eight Fs. For each of these, we call check encoding and print out the results. So now I'll call python CheckEncoding.py, hit Enter, and we see our three results. Hello World Base64 encodes to this string here, and on testing that, it determines that yes, it does successfully decode to something. When we use our regular expression to test for URL encoding, this matches because we have our percent sign, two characters that are valid hex characters, percent, two valid hex characters, et cetera. Those are both good, because they mean that in the correct case, the positive case, we successfully identify something that's Base64 encoded and something that's URL encoded. However, we also get some false positives.
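Decoding the first and third test messages directly shows where the false positive comes from. This is a small sketch rather than the video's own code; `validate=True` is a choice made here so that characters outside the alphabet raise an error instead of being ignored.

```python
import base64

# "Hello World" in Base64, and the eight-F string from the test run.
messages = ("SGVsbG8gV29ybGQ=", "FFFFFFFF")

# Both strings decode without error, so a decode-only test accepts both.
results = {m: base64.b64decode(m, validate=True) for m in messages}

for message, decoded in results.items():
    print(message, "->", decoded)
# SGVsbG8gV29ybGQ= -> b'Hello World'
# FFFFFFFF -> b'\x14QE\x14QE'
```

Both inputs pass the decode test, but only the first yields something that looks like an intended message.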
FFFFFFFF, essentially F eight times, is technically a valid Base64-encoded string. However, it's not a particularly useful one if you decode it to the plaintext, because it's just the same thing repeated, so this probably wasn't actually intended to be Base64 encoded. It's probably padding or something else. However, we match it as a valid Base64-encoded string because it is decodable. So without knowledge about the plaintext that goes into it and what's considered a valid plaintext, which we don't necessarily have when we're analyzing packet fields, we can't be 100% certain whether our result for decoding actually means that this field carries encoded data or whether it just happens to be decodable. But identifying fields where, say, the majority of values are decodable indicates that we might have a field that actually is encoded, which would be useful for command and control.

Again, this is just one of the two helper functions that we're looking at in relation to the traffic analyzer script from a couple of videos ago, for identifying fields in network packets that might be useful for command and control infrastructure. Thank you.
