1402 Columnar Transposition
Chris Christensen
CSS/MAT 483
Columnar Transposition
Classically, ciphers that rearrange the letters of the plaintext were called transposition
ciphers. They can be recognized because the ciphertext letter frequencies are the same
as the plaintext letter frequencies.
Consider the following plaintext message:
The nose is pointing down and the houses are getting bigger.
There are 49 letters in the message. We want to place the letters of the message in
a rectangular array. Because we would like the array to have 49 cells, a 7 × 7 array
may be used. We also need a keyword whose length equals the number of columns;
we will use analyst.
A N A L Y S T
1 4 2 3 7 5 6
t h e n o s e
i s p o i n t
i n g d o w n
a n d t h e h
o u s e s a r
e g e t t i n
g b i g g e r
The ciphertext is obtained by reading down the columns in the order of the column
numbers, which rank the keyword letters alphabetically (the two As are numbered
from left to right).
TIIAOEGEPGDSEINODTETGHSNNUGBSNWEAIEETNHRNROIOHSTG
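The encryption step above can be sketched in Python. This is a minimal illustration rather than part of the original notes: the function names column_order and encrypt are made up for the example, and the code assumes the message exactly fills the rectangle.

```python
def column_order(keyword):
    """Return the column positions (0-based) in the order they are read out.

    Columns are ranked alphabetically by keyword letter; repeated letters
    are ranked from left to right, so ANALYST is numbered 1 4 2 3 7 5 6.
    """
    return sorted(range(len(keyword)), key=lambda i: (keyword[i], i))


def encrypt(plaintext, keyword):
    """Encrypt with a completely filled columnar transposition."""
    letters = [c for c in plaintext.lower() if c.isalpha()]
    width = len(keyword)
    if len(letters) % width != 0:
        raise ValueError("message does not exactly fill the rectangle")
    # Writing in by rows puts letters i, i + width, i + 2*width, ... in column i.
    columns = ["".join(letters[i::width]) for i in range(width)]
    # Reading out takes whole columns in keyword order.
    return "".join(columns[i] for i in column_order(keyword.lower())).upper()


message = "The nose is pointing down and the houses are getting bigger."
print(encrypt(message, "analyst"))
# TIIAOEGEPGDSEINODTETGHSNNUGBSNWEAIEETNHRNROIOHSTG
```

Sorting the ciphertext letters and the plaintext letters gives the same list, which is the frequency property that identifies a transposition cipher.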
Our message exactly fit the rectangular array. If the message does not completely
fill the array, nulls (i.e., padding) may be added to fill the array (this is the easier
cipher to break) or not (this is harder to break because the columns do not all have
the same length). In the latter case, the length of the keyword determines the
number of columns, and the number of letters in the message determines the
number of complete and partial rows.
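To see why the unpadded case is harder, it helps to compute the column lengths. The sketch below is only an illustration: column_lengths is a hypothetical helper, the 30-letter message length is made up, and the code assumes the partial last row is filled from the left.

```python
def column_lengths(message_length, keyword):
    """Length of each column, listed in the order the columns are read out."""
    width = len(keyword)
    full_rows, extra = divmod(message_length, width)
    # ASSUMPTION: the partial last row is filled from the left, so the
    # first `extra` columns (by position) hold one more letter.
    by_position = [full_rows + 1 if i < extra else full_rows for i in range(width)]
    read_order = sorted(range(width), key=lambda i: (keyword[i], i))
    return [by_position[i] for i in read_order]


# A hypothetical 30-letter message under the keyword analyst:
print(column_lengths(30, "analyst"))  # [5, 4, 4, 5, 4, 4, 4]
```

Which chunks of the ciphertext are long and which are short depends on the keyword, so without the keyword the cryptanalyst does not know where one column ends and the next begins.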
If the plaintext message is longer than 49 letters, the transposition should be
applied several times.
Here is a message that was encrypted using a rectangular array with keyword
analyst.
TRLEELIGCIGEHALANTNCTECYENEN
Because the keyword has 7 letters, we know that the rectangular array has 7
columns. The message has 28 letters; therefore, the array must be 4 × 7. Each
column must have 4 entries.
First, we place the letters of the keyword in alphabetical order: aalnsty. Then write
the ciphertext into the columns, four letters at a time, from left to right.
A A L N S T Y
t e c h n t e
r l i a t e n
l i g l n c e
e g e a c y n
Now rearrange the columns so that the letters of the keyword spell analyst.
A N A L Y S T
t h e c e n t
r a l i n t e
l l i g e n c
e a g e n c y
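The decryption just carried out by hand can be sketched as follows. This is an illustrative sketch, not code from the notes: decrypt is a made-up name, and a completely filled rectangle is assumed.

```python
def decrypt(ciphertext, keyword):
    """Undo a completely filled columnar transposition."""
    keyword = keyword.lower()
    width = len(keyword)
    height, remainder = divmod(len(ciphertext), width)
    if remainder:
        raise ValueError("ciphertext does not fill a rectangle of this width")
    # Positions of the columns in the order they were read out
    # (alphabetical by keyword letter, ties broken left to right).
    read_order = sorted(range(width), key=lambda i: (keyword[i], i))
    # Cut the ciphertext into equal chunks and put each chunk back
    # under the keyword letter it came from.
    columns = [None] * width
    for rank, position in enumerate(read_order):
        columns[position] = ciphertext[rank * height:(rank + 1) * height]
    # Read the rebuilt array by rows.
    rows = ("".join(column[r] for column in columns) for r in range(height))
    return "".join(rows).lower()


print(decrypt("TRLEELIGCIGEHALANTNCTECYENEN", "analyst"))
# thecentralintelligenceagency
```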
We will do only "the easy case"; i.e., we will assume that the columnar
transposition uses a rectangular array that was completely filled.
Either
A I T M T S E
S R F I K O E
A I N M L I M
or
A F L
S N S
A M O
I I I
R M E
I T E
T K M
The 7 × 3 arrangement seems unlikely because it has a row, TKM, with no vowels.
The row III is also unlikely. So, let us try the 3 × 7 arrangement.
Notice that there are 7! = 5040 arrangements of the columns. We would like to not
have to try all of them!
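The two counts used so far, the possible shapes of a full rectangle and the number of column orders, are easy to check. The sketch below is only an illustration: full_rectangles is a hypothetical helper, and 21 is the number of letters in the arrays above.

```python
from math import factorial


def full_rectangles(n):
    """All (rows, columns) shapes with rows * columns == n, both at least 2."""
    return [(rows, n // rows) for rows in range(2, n) if n % rows == 0]


print(full_rectangles(21))  # [(3, 7), (7, 3)]
print(factorial(7))         # 5040 column orders for a 7-column array
```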
A I T M T S E
S R F I K O E
A I N M L I M
In the first row, MATE seems to leap out. This leaves ITS. Perhaps that is a slightly
wrong guess; ESTIMAT- seems to be a possibility.
E S T I M A T
E O K R I S F
M I L I M A N
Not quite, but there are two Ts in the first row. Let us swap those columns.
E S T I M A T
E O F R I S K
M I N I M A L
This works. Notice that because we have multiple rows that are permuted the same
way, we can use multiple anagramming for cryptanalysis.
It is often worthwhile to write the ciphertext in columns, cut out the columns, and
rearrange the columns to do the anagramming.
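Rearranging cut-out columns can be imitated in a few lines. The sketch below is an illustration only: permute_columns is a made-up name, and the two column orders are the guesses made above, written as 0-based positions in the 3 × 7 array.

```python
rows = ["AITMTSE", "SRFIKOE", "AINMLIM"]


def permute_columns(rows, order):
    """Rebuild every row by taking its columns in the given order (0-based)."""
    return ["".join(row[i] for i in order) for row in rows]


# First guess: spell ESTIMAT in row 1, using the T of column 5 before the T of column 3.
first_try = [6, 5, 4, 1, 3, 0, 2]
print(permute_columns(rows, first_try))   # ['ESTIMAT', 'EOKRISF', 'MILIMAN']

# Second guess: swap the two T columns.
second_try = [6, 5, 2, 1, 3, 0, 4]
print(permute_columns(rows, second_try))  # ['ESTIMAT', 'EOFRISK', 'MINIMAL']
```

Because every row is permuted the same way, a guess that makes sense in one row can be confirmed or rejected by the others.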
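For a 3 × 7 rectangle, each row should contain approximately 2.8 vowels (about 40%
of 7 letters). Let us note the difference between this estimate and the actual count:
Row       Vowels   Difference
AITMTSE   3        0.2
SRFIKOE   3        0.2
AINMLIM   3        0.2
The sum of the differences is 0.6.
For a 7 × 3 rectangle, each row should contain approximately 1.2 vowels:
Row       Vowels   Difference
AFL       1        0.2
SNS       0        1.2
AMO       2        0.8
III       3        1.8
RME       1        0.2
ITE       2        0.8
TKM       0        1.2
The sum of the differences is 6.2. It appears that the 3 × 7 rectangle is more likely.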
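The vowel test can be sketched in Python. This is a rough illustration, not the notes' own procedure: vowel_deviation is a made-up name, and the 40% vowel rate is an assumption inferred from the estimate of 2.8 vowels in a 7-letter row.

```python
VOWELS = set("AEIOU")
VOWEL_RATE = 0.4  # ASSUMPTION: inferred from 2.8 expected vowels per 7 letters


def vowel_deviation(rows):
    """Sum over the rows of |actual vowel count - expected vowel count|."""
    expected = VOWEL_RATE * len(rows[0])
    return sum(abs(sum(c in VOWELS for c in row) - expected) for row in rows)


wide = ["AITMTSE", "SRFIKOE", "AINMLIM"]                # 3 x 7 arrangement
tall = ["AFL", "SNS", "AMO", "III", "RME", "ITE", "TKM"]  # 7 x 3 arrangement

print(round(vowel_deviation(wide), 1))  # 0.6
print(round(vowel_deviation(tall), 1))  # 6.2
```

The arrangement with the smaller total deviation spreads its vowels more evenly across the rows, which is what English plaintext tends to do.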
A I T M T S E
S R F I K O E
A I N M L I M
We will pair the first column with each of the other columns placed on its right and
consider how likely it is that such digraphs will occur in English. The frequencies
we will use come from Sinkov (see the appendix). Recall that there are
26 × 26 = 676 digraph frequencies.
Column   2         3         4         5         6         7
Row 1    AI  311   AT 1019   AM  182   AT 1019   AS  648   AE   13
Row 2    SR    9   SF    8   SI  390   SK   30   SO  234   SE  595
Row 3    AI  311   AN 1216   AM  182   AL  681   AI  311   AM  182
Sum          631      2243       754      1730      1193       790
The largest sum, 2243, suggests placing column 3 to the right of column 1:
AT
SF
AN
Oops! We know from the solution found earlier that this is not the correct pairing,
but the second most likely pairing (AT, SK, AL, with sum 1730) is correct. (During
cryptanalysis, we don’t always get the correct result on the first try.)
Once we have a pairing, we could continue using digraph frequencies to select
columns to add on the left and on the right, and so on.
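The scoring of pairings can be sketched as follows. This is an illustration only: the dictionary holds just the digraph figures quoted in the table above, and pairing_score is a made-up name. A full implementation would need the complete 676-entry table from the appendix.

```python
# Digraph figures copied from the table above (only the 13 that are needed).
DIGRAPH = {
    "AI": 311, "AT": 1019, "AM": 182, "AS": 648, "AE": 13,
    "SR": 9, "SF": 8, "SI": 390, "SK": 30, "SO": 234, "SE": 595,
    "AN": 1216, "AL": 681,
}

rows = ["AITMTSE", "SRFIKOE", "AINMLIM"]


def pairing_score(rows, left, right):
    """Sum of digraph figures when column `right` is placed after column `left`."""
    return sum(DIGRAPH[row[left] + row[right]] for row in rows)


# Score every other column as the right-hand neighbour of the first column.
scores = {right: pairing_score(rows, 0, right) for right in range(1, 7)}
ranked = sorted(scores, key=scores.get, reverse=True)
for right in ranked:
    print(f"column {right + 1}: {scores[right]}")
# column 3: 2243
# column 5: 1730
# column 6: 1193
# column 7: 790
# column 4: 754
# column 2: 631
```

Keeping the whole ranking, rather than only the top candidate, makes it easy to fall back to the second choice when the first one fails.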
More columnar transposition
In by rows:
n o r s e
g e r m a
n y s e e
k s a n a
l l i a n
c e
Out by columns:
Because the columns do not have the same length, this would not be as easy to
cryptanalyze. It would not be obvious how many columns were used. (The size of
the rectangle would be either 2 × 11 or 11 × 2 if we knew that a full rectangle had
been used; i.e., the keyword would have length either 11 or 2.)