Amlan Chakraborty 0440954: Amlan@cs - Washington.edu
Steganography and Digital Watermarking: Applications, Attacks and Countermeasures
Introduction
Steganography is the science of hiding information in data. Normally steganography is done intelligently, such that it is difficult for an adversary to detect the existence of a hidden message in the otherwise innocuous data. The piece of data that has the message embedded in it is visible to the world in the clear and appears harmless and normal. This is in stark contrast with cryptography, where the message is scrambled to make it extremely difficult or impossible for an adversary to put together. A message in ciphertext arouses some sort of suspicion, whereas an invisible message embedded in clear text does not. This is the advantage of steganography. Generally, a steganographic message will appear to be something else, such as a picture, an audio file, a video file or a message in clear text: the covertext.

Historically, messages were written using invisible ink between the visible lines of innocuous documents, or even written onto clothing. Other techniques used were writing messages in Morse code in knitting yarn, or marking particular words or letters in a message, using invisible ink or pin pricks, so that they form the secret message. During WWII Germans used the microdot technology, where an image the size of a period had the clarity of typewritten pages. In this case the period was the covertext and the image was the message.

Though smart and innocuous hiding techniques are used to hide the stegotext, the algorithm itself is kept secret and is known only to the communicating parties and not to the world. This is in slight contrast to classical cryptography, where the algorithm is well known and only the key(s) are secret. Though data is not encrypted in steganography, authenticity of a message is normally established by using a MAC or a signature. Steganography can be used to code messages in any transport layer: an image (GIF/BMP/JPEG), an MP3 file, a communications protocol like UDP, etc.
Steganographic information can also be added to richer multimedia content like DVDs. There are normally two motivations: to send a secret message, or to establish authenticity of a piece of information, usually a multimedia file. The latter is a major application of modern steganography and is known as Digital Watermarking and Fingerprinting. Watermarks establish ownership of an artifact, while fingerprints or labels help to identify intellectual property violators. They are different protocol implementations of the same basic idea.
redundancy. For a digital image, this may be noise from the imaging element; for digital audio, it may be noise from recording techniques such as amplitude or frequency modulation. Any system with an analog (signal) amplification stage will also introduce thermal noise, which can be exploited as a noise cover. A steganographic channel is a covert channel in information theory terms, since it transfers some kind of information using a method originally not intended to transfer this kind of information. Steganography supports both storage and timing covert channels. This report primarily discusses storage covert channels, where a covert message is communicated by manipulating a stored object like an image. Ron Rivest's "Chaffing and Winnowing" protocol, discussed later, can be argued to be an example of a timing covert channel.

It is fairly obvious that the more data the cover message contains, the easier it is to hide the message. In the case of images, bitmaps are better fits than GIFs and JPEGs, because GIF is limited to 8 bits per pixel and JPEG is a lossy compression technique. On the flip side, bigger images will attract more attention than smaller images as suspect stego-images. Subtlety is a very important feature, and stego-images should only have subtle changes. An image with large areas of solid colors would be a bad fit, since large variances created by the embedded message would cause drastic differences easily spotted by the human eye. The spatial frequency distribution of the image (spatio-temporal in the case of audio or video content) is also a determining factor in the efficiency of the hiding process. As we will see later, we have techniques for both Gaussian and Laplacian distributions using maximum likelihood estimators for the stego-messages. Often the embedded message is itself encrypted using a key that may or may not be known to the adversary.
Since steganography requires that the communicating parties have some prior shared information, a symmetric key is a natural fit. However, public key steganography with steganographic key exchanges is also possible.
However, weak MAC functions can potentially leak information in this protocol. It is also important to note that it is not possible to use digital signatures here, since anyone would then be able to verify the signatures and tell "chaff" from "wheat". However, "designated verifier signature" schemes, where only the designated verifiers can verify a signature, would work fine. The other key idea is that since the creation of "chaff" involves generation of a bad MAC and not the knowledge of a secret key, any entity can play the role of a "chaffer".
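The wheat and chaff roles can be illustrated with a minimal sketch. It assumes HMAC-SHA256 as the MAC, one message bit per packet, and a packet layout of (sequence number, bit, MAC); the function names are hypothetical and chosen only for this illustration. Note that add_chaff takes no key, which mirrors the observation that any entity can act as a chaffer.

```python
import hashlib
import hmac
import os


def mac(key, seq, bit):
    """HMAC-SHA256 over the sequence number and the bit value (assumed layout)."""
    msg = seq.to_bytes(4, "big") + bytes([bit])
    return hmac.new(key, msg, digestmod=hashlib.sha256).digest()


def authenticate(key, bits):
    """Sender: produce one 'wheat' packet (seq, bit, valid MAC) per message bit."""
    return [(seq, bit, mac(key, seq, bit)) for seq, bit in enumerate(bits)]


def add_chaff(wheat):
    """Chaffer: add a packet with the opposite bit and a random (bad) MAC.

    No key is needed here. Each pair is ordered by bit value so that the
    position of a packet does not reveal which one is wheat.
    """
    mixed = []
    for seq, bit, tag in wheat:
        chaff = (seq, 1 - bit, os.urandom(len(tag)))
        mixed.extend(sorted([(seq, bit, tag), chaff], key=lambda p: p[1]))
    return mixed


def winnow(key, packets):
    """Receiver: keep only the packets whose MAC verifies under the shared key."""
    kept = {}
    for seq, bit, tag in packets:
        if hmac.compare_digest(tag, mac(key, seq, bit)):
            kept[seq] = bit
    return [kept[seq] for seq in sorted(kept)]
```

A receiver holding the shared key recovers the bits with winnow(key, add_chaff(authenticate(key, bits))); without the key, both packets of every pair look equally plausible.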
process is iterated and the whole image palette is laid out in a mosaic of bright and dark patches, one of which is used to hide data. This patch information is vital to decode the hidden message later. This is clearly a frequency distribution method. Patchwork makes the assumption that the image has a Gaussian distribution.

Texture Block coding
In this method, pairs of areas of similar texture are found and one area is copied over the other. Thus we have identical blocks of texture in the image. Iterating a few times, we can get two large blocks of identical texture. These two blocks would be altered identically by all non-geometric alterations of the image, so they can be used to carry information about the image.

M-Sequences using linear shift registers
M-sequences are based on starting vectors of a Fibonacci recursion relation which form a Galois field of finite cardinality. Mathematically and statistically these numbers are known to have desirable autocorrelation functions; the distribution of Galois field numbers is known to be normal, thus resembling Gaussian noise in an image. So images encoded using m-sequences are statistically very hard to distinguish from the original, as the embedded data is similar to normally distributed noise. If the stego message is encoded using m-sequences, it can easily be embedded in the image by LSB substitution. A more secure implementation would be to use LSB addition instead to embed the watermark. Recovering the message then requires examination of the complete bit pattern and knowledge of the current linear shift register implementation. This is more secure because, to crack it, the adversary would have to do the same computations without any a priori knowledge.

Frequency hopping
In this method, scattering of the message is done on the basis of rules that change cumulatively. The idea is similar to DES block encryption; bits are swapped according to rules that are dictated by the stego-key and random data from the previous round.
White Noise Storm, an implementation of this methodology, creates a message space of 8 channels where each channel has a window of W bytes, where W is a random number. Each channel, however, carries only one bit of the message and a lot of unused bits. The bits inside a window are permuted and rotated according to an algorithm that is regulated by the previous window's operations and the stego-key. Finally, this encoded message is embedded in the image using LSB substitution. The idea again is to simulate a distribution that is similar to a Gaussian distribution.
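To make the m-sequence and LSB substitution ideas more concrete, here is a small sketch. Several things in it are assumptions made for illustration and are not taken from the schemes above: the register is a 4-bit Fibonacci LFSR with feedback polynomial x^4 + x^3 + 1 (period 15), the message is "encoded" by XORing it with the LFSR output, the seed plays the role of the stego-key, and the cover is a flat list of 8-bit pixel values. A realistic system would use a much longer register.

```python
def lfsr_bits(seed, count, taps=(0, 1), nbits=4):
    """Output bits of a Fibonacci LFSR (right-shifting, feedback enters at the top).

    The default taps correspond to x^4 + x^3 + 1, which is maximal length:
    any non-zero 4-bit seed cycles through all 15 non-zero states.
    """
    if not 0 < seed < (1 << nbits):
        raise ValueError("seed must be a non-zero value that fits the register")
    state = seed
    out = []
    for _ in range(count):
        out.append(state & 1)
        feedback = 0
        for t in taps:
            feedback ^= (state >> t) & 1
        state = (state >> 1) | (feedback << (nbits - 1))
    return out


def embed(pixels, message_bits, seed):
    """LSB substitution: replace each pixel's lowest bit with message XOR keystream."""
    if len(message_bits) > len(pixels):
        raise ValueError("cover is too small for the message")
    keystream = lfsr_bits(seed, len(message_bits))
    stego = list(pixels)
    for i, (m, k) in enumerate(zip(message_bits, keystream)):
        stego[i] = (stego[i] & ~1) | (m ^ k)
    return stego


def extract(stego, nbits_message, seed):
    """Read the LSBs back and remove the keystream generated from the same seed."""
    keystream = lfsr_bits(seed, nbits_message)
    return [(stego[i] & 1) ^ keystream[i] for i in range(nbits_message)]
```

Each pixel changes by at most one intensity level, which is the reason LSB substitution is hard to see, and also the reason it is fragile against the noise-removal attacks discussed later.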
These attacks attempt to remove the watermark from the data completely. Since a lot of steganography algorithms try to hide data as noise, removal of noise should obliterate the watermark. These algorithms try to estimate the cover data using a given statistic for the noise in it, assuming the noise to be the watermark. Langelaar et al. propose a sequence of filtering operations (median filtering, highpass filtering) to denoise the image, which will likely get rid of the digital watermark. There are several other watermark estimator algorithms that use either Maximum A Posteriori Probability (MAP) estimators, if we know the image statistics, or Maximum Likelihood (ML) classifier algorithms, if we do not know anything about the images, to find an estimate of the digital watermark. Voloshynovskiy proposed an algorithm that uses the MAP estimator and then remodulates the image to find the least favorable noise distribution. This is guessed to be the watermark. Often lossy compression of uncompressed image data, such as JPEG, would completely wipe out the watermark, since the raw data would be replaced by Discrete Cosine Transforms of the data. However, this is mitigated by algorithms that can hide information directly in compressed data.

Geometric Attacks: Warping, Transforming, Jittering etc.
These attacks are the easiest to implement and often very effective. Instead of removing the watermark, they rely on distorting the embedded data by spatial or temporal alterations (in the case of audio and video data). The result of these attacks is to scatter and alter the way the watermark is laid out in the image. As a simple attack, if an image is rotated by a slight angle, say 1 degree, and the edges are filled with the average texture of adjacent pixels, there is a high likelihood that the watermark would fall out of sync with the watermark detector.
The key idea here is that though the digital watermark data still exists in the artifact, it has moved in such a way that the watermark detector can no longer detect it. Jittering is another effective attack that works extremely well for audio data. An audio signal is split into n chunks, one chunk is either deleted or duplicated, and the chunks are assembled back together, ending up with either (n - 1) or (n + 1) chunks. This introduces a jitter in the signal that is not detectable by humans. Digital watermarks would be totally destroyed by this attack. UnZign implements a pixel jittering algorithm that works well on spatial domain watermarks. Another important observation is that though some algorithms survive basic geometric attacks like rotation, shearing, resizing etc., they succumb to a combination of different attacks. StirMark is an implementation based on these principles that simulates an iterative resampling process where the image is slightly resized, sheared and rotated by a small random amount. However, repeated iterations of StirMark degrade the image to the point that humans can detect the difference between the original and the processed image.

Crypto attacks: Exhaustive key search, Collusion, Averaging, Oracle attack
These are similar to normal cryptographic attacks, where the steganographic key is searched exhaustively. Statistical averaging attacks involve taking the same data set with different instances of watermarks and then averaging them to find the attacked data set. A modification of the averaging algorithm is the collusion attack, where smaller portions of the data set are taken and the attacked data set is found using averaging algorithms. These smaller data sets are then combined to get a new attacked data set.

Protocol attacks: Watermark inversion, Copy Attack
These attacks do not aim to detect, destroy or disable the watermark, but to attack the basic tenets of watermarking, e.g. that watermarks cannot be extracted from non-watermarked data.
The Watermark inversion attack uses the feature of overmarking, that is, the ability to mark an image more than once. Bob gets an image from Alice that has her watermark. Bob subsequently generates his own watermark and subtracts it from the image he got from Alice. Due to overmarking, Alice's signature would still be readable from this image, which is almost identical to the image Alice circulated. Bob can now argue that Alice has removed his signature and added hers to generate her image. This lets Bob claim that he was the actual owner of the image.
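The ownership deadlock created by the inversion attack can be reproduced with a toy model. The sketch below assumes an additive watermark of +1/-1 values, a non-blind detector that correlates the difference between an image and a claimed original against the claimed watermark, and no clipping of pixel values; all names are hypothetical.

```python
import random


def make_watermark(seed, n):
    """Pseudo-random +1/-1 watermark derived from a secret seed."""
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) for _ in range(n)]


def add(image, watermark):
    return [p + w for p, w in zip(image, watermark)]


def subtract(image, watermark):
    return [p - w for p, w in zip(image, watermark)]


def detect(image, claimed_original, watermark, threshold=0.5):
    """Non-blind detection: does (image - claimed original) correlate with the mark?"""
    residual = [p - q for p, q in zip(image, claimed_original)]
    score = sum(r * w for r, w in zip(residual, watermark)) / len(watermark)
    return score > threshold


def inversion_attack(published_image, attacker_seed):
    """Bob invents a watermark and a fake 'original' by subtracting it."""
    fake_mark = make_watermark(attacker_seed, len(published_image))
    fake_original = subtract(published_image, fake_mark)
    return fake_original, fake_mark
```

Under these assumptions, Alice's mark is detected in her published image relative to the true original, and Bob's mark is detected in the same image relative to his fake original, so the detector alone cannot settle who marked first.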
The Copy Attack gets an estimate of the watermark using a MAP or an ML estimator. It then processes this watermark using the least favorable noise function (mentioned in replacement attacks) to smoothen the watermark. It then adds the watermark to a new document. The copy attack allows anyone to pass off his own document as being watermarked by a well-known entity, by placing on it a watermark copied from a document published by that entity. This is a very serious attack that Kutter et al. succeeded in carrying out experimentally.
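A rough sketch of the estimate-and-copy idea follows. It is a simplification under stated assumptions and not the method of Kutter et al.: a moving average stands in for the MAP/ML estimator, the least favorable noise step is skipped, a 1-D signal stands in for an image, and the detector is a simple normalised correlation on the high-frequency residual. All function names are hypothetical.

```python
import math


def smooth(signal, radius=2):
    """Moving average; a crude stand-in for a MAP/ML estimate of the cover."""
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - radius): i + radius + 1]
        out.append(sum(window) / len(window))
    return out


def residual(signal):
    """High-frequency part of the signal, taken here as the watermark estimate."""
    return [s - m for s, m in zip(signal, smooth(signal))]


def correlation(a, b):
    """Normalised correlation in [-1, 1]; 0.0 if either input has no energy."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return dot / norm if norm else 0.0


def detect(signal, watermark, threshold=0.3):
    """Blind detector: correlate the signal's residual with the known watermark."""
    return correlation(residual(signal), watermark) > threshold


def copy_attack(marked, target):
    """Estimate the mark from a marked signal and add it to an unrelated target."""
    estimate = residual(marked)
    return [t + e for t, e in zip(target, estimate)]
```

The attacker never learns the watermark key; the estimate taken from the marked signal is enough for the detector to report the mark in the forged target.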
Conclusion
The challenges in digital watermarking stem from the fact that the attacks derive from the same phenomenon as the watermarking technology itself: small noise insertion doesn't create humanly noticeable changes to an artifact. Clearly, watermarking technology is currently not robust enough to withstand a combination of attacks. Introduction of new authentication schemes, as proposed by public key steganography, would add another layer of security but does not in itself guarantee universal, absolute robustness of watermarks. I think the solution may very well lie in better statistical models based on information theory. We can mitigate some attacks using authentication and authorization, but pattern detection and obfuscation should be mitigated by better scattering algorithms.
References
1. Neil Johnson and Sushil Jajodia, Exploring Steganography: Seeing the Unseen.
2. Gustavus J. Simmons, The Prisoner's Problem and the Subliminal Channel.
3. Ronald L. Rivest, Chaffing and Winnowing: Confidentiality without Encryption.
4. Fabien A. Petitcolas, Ross J. Anderson and Markus Kuhn, Information Hiding: A Survey.
5. Pierre Moulin and Joseph O'Sullivan, Information-Theoretic Analysis of Information Hiding.
6. Nicholas J. Hopper, John Langford and Luis von Ahn, Provably Secure Steganography.
7. Neil Johnson and Sushil Jajodia, Steganalysis: The Investigation of Hidden Information.
8. Bender, Gruhl, Morimoto and Lu, Techniques for Data Hiding.
9. Neil F. Johnson, An Introduction to Watermark Recovery from Images.
10. Fabien A. Petitcolas, Ross J. Anderson and Markus Kuhn, Attacks on Copyright Marking Systems.
11. Martin Kutter and Sviatoslav Voloshynovskiy, The Watermark Copy Attack.
12. Niels Provos, Defending against Statistical Steganalysis.
13. Barr, Bradley and Hannigan, Using Digital Watermarks to Mitigate the Threat of Copy Attacks.
14. Karen Su, Deepa Kundur and Dimitrios Hatzinakos, A Novel Approach to Collusion Resistant Video Watermarking.
15. Stefan Katzenbeisser and Helmut Veith, Securing Symmetric Watermarking Schemes against Protocol Attacks.