Steganography is the art of hiding information within other information so as not to arouse suspicion, and the process of introducing information into covert channels so as to conceal it. It is an effective tool for protecting personal information, and organizations are investing significant resources in analysing steganographic techniques to protect the integrity of their data.
However, steganography can also be harmful. It hinders law enforcement authorities in gathering evidence to stop illegal activities, because the techniques for hiding information are becoming more sophisticated.
While steganography has been used in various forms for thousands of years, it has become more visible recently because of its use to hide terrorist communications and child pornography. Steganography conceals the existence of a message by transmitting information through various carriers. Its goal is to prevent the detection of secret messages. The most common use of steganography is hiding the information of one file within the information of another file. For example, cover carriers such as digitally represented images, audio, video, text or code wrap up the hidden information. The hidden information may be plaintext, ciphertext, images or information inserted in a bit stream.
As criminals become more aware of the capabilities of forensic examiners to recover computer evidence, they are making more use of encryption techniques to conceal incriminating data. Online child pornographers use steganography to create private communications and to hide the files they exchange inside normal computer files. There are more than 4,000 web sites dedicated to steganography, and many of them give information on hiding and detecting illicit material.
The two major branches of information hiding are steganography and watermarking. Their fundamental difference is that in watermarking the object of communication is the host signal, and the embedded data, which contain information such as owner identification and a digital timestamp, provide copyright protection. In steganography the object to be transmitted is the embedded message, and the cover signal serves as an innocuous cover chosen arbitrarily by the user. In watermarking the existence of the mark may be declared openly, and any attempt to remove or invalidate the embedded content makes the host useless. Steganography tries to avoid detection by both human perception and computer algorithms.
Steganography differs from cryptography, which does not conceal the data to be communicated but scrambles them so that their contents cannot be understood. The two techniques are considered orthogonal and complementary: anyone wanting to communicate in a hidden manner can apply a cryptographic algorithm to the secret data and then proceed to embed the result.
Images are the most widely used cover media for steganography and can be stored in various formats (BMP, GIF, JPEG). The information hiding process can be summarized in two steps: identification of the redundant bits that can be modified without degrading the quality of the cover medium, followed by selection of a subset of those redundant bits to be replaced with data from the secret message.
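The two steps can be illustrated with a minimal sketch. It assumes that the cover is a flat list of 8-bit sample values, that the redundant bit of each sample is its least significant bit, and that the subset is chosen by a pseudo-random generator seeded with a shared key; the function names and the key handling are hypothetical and do not reproduce any particular tool.

```python
import random

def embed_lsb(pixels, message_bits, key):
    """Hide message_bits in the least significant bits of a keyed subset of pixels."""
    if len(message_bits) > len(pixels):
        raise ValueError("message is larger than the cover capacity")
    # Step 1 (assumption): each 8-bit sample offers one redundant bit, its LSB.
    # Step 2: a generator seeded with the key selects which of those bits are replaced.
    positions = random.Random(key).sample(range(len(pixels)), len(message_bits))
    stego = list(pixels)
    for pos, bit in zip(positions, message_bits):
        stego[pos] = (stego[pos] & ~1) | bit  # overwrite only the lowest bit
    return stego

def extract_lsb(stego, n_bits, key):
    """Recover n_bits by visiting the same keyed positions in the same order."""
    positions = random.Random(key).sample(range(len(stego)), n_bits)
    return [stego[pos] & 1 for pos in positions]

# Usage: each sample changes by at most one grey level.
cover = [200, 13, 77, 54, 91, 128, 255, 0]
secret = [1, 0, 1, 1]
stego = embed_lsb(cover, secret, key=42)
recovered = extract_lsb(stego, len(secret), key=42)
```

Because the receiver regenerates the positions from the key, the position list itself never has to be transmitted.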
The techniques are evaluated according to their security against detection, the size of the payload (the amount of data that can be hidden) and their robustness against malicious and unintentional attacks. They are classified in the following groups, and software tools are available for each of them:
- modification of the Least Significant Bit (LSB); it is applied to the LSBs of the pixels; the stego image is visually identical to the cover;
- masking; the pixel values in masked areas are altered by some percentage; a kind of patchwork is built where pairs of patches are randomly selected and their pixels are raised and lowered by the same amount;
- domain transform; data are embedded according to mathematical transform methods, i.e. by modulating coefficients in a transform domain such as the discrete cosine, discrete wavelet or discrete Fourier transform; the concealed data are spread across the entire image, then scrambled and submitted to a second-layer transformation;
- compression; data embedding is integrated with an image compression algorithm, such as JPEG; due to the popularity of JPEG images on the Internet these techniques are very attractive;
- spread spectrum; the hidden data is spread throughout the cover image. A key is used to randomly select the frequency channels of the colours and then an encryption process occurs.
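As an illustration of the masking group, the patchwork idea can be sketched as follows. This is a simplified sketch under stated assumptions: it works on pairs of single pixel values in a flat list rather than on two-dimensional patches, the pairs are chosen with a keyed pseudo-random generator, and the names `patchwork_mark`, `patchwork_statistic` and the parameter `delta` are hypothetical.

```python
import random

def _pairs(n_pixels, n_pairs, key):
    """Pick n_pairs disjoint (a, b) index pairs from a keyed generator."""
    idx = random.Random(key).sample(range(n_pixels), 2 * n_pairs)
    return list(zip(idx[:n_pairs], idx[n_pairs:]))

def patchwork_mark(pixels, n_pairs, key, delta=2):
    """Raise the first pixel of each pair and lower the second by the same amount."""
    marked = list(pixels)
    for a, b in _pairs(len(pixels), n_pairs, key):
        marked[a] = min(255, marked[a] + delta)  # clamp to the 8-bit range
        marked[b] = max(0, marked[b] - delta)
    return marked

def patchwork_statistic(pixels, n_pairs, key):
    """Sum of differences over the keyed pairs; large values suggest a mark."""
    return sum(pixels[a] - pixels[b] for a, b in _pairs(len(pixels), n_pairs, key))

# Usage on a flat grey cover.
cover = [128] * 1000
marked = patchwork_mark(cover, n_pairs=200, key=7)
score_marked = patchwork_statistic(marked, 200, key=7)
score_cover = patchwork_statistic(cover, 200, key=7)
```

On an unmarked image the differences tend to cancel out, while on a marked one each pair contributes about twice `delta`, so the holder of the key can test for the mark without access to the original.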
Even if steganographic tools alter only the least significant image components, they leave detectable traces in the stego image. Steganalysis refers to detecting the presence of hidden information in a given image, usually in a setting where the cover image is not available to the attacker.
The simplest way to detect modified files is to compare them with the originals. The usual approach to detecting hidden information is to build a library of hash sets and compare them with the hash values of the files under investigation; the hash set will identify files that match known steganography files. Investigators must also use safe hash sets to exclude trustworthy files from their investigations; system files that have not been modified since their installation are included in a safe hash set. NIST started the National Software Reference Library research project, which computes a unique identifier for each file in an operating system based on the file's contents; the identifiers are created with the SHA-1 algorithm. If a perpetrator tries to hide a pornographic image by renaming it as an ordinary operating system file, or by renaming a .JPG file as a .EXE file, the hash value derived from the image will not match that of the known operating system file, and the file will thus be identified.
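The hash set comparison can be sketched with Python's standard `hashlib` module. The sketch assumes that file contents are already available as bytes in a dictionary and that the safe and suspect hash sets are plain sets of hexadecimal SHA-1 digests; the function `triage` and its labels are hypothetical and do not reflect the actual format of the National Software Reference Library data.

```python
import hashlib

def sha1_of(data: bytes) -> str:
    """Content-based identifier: the file name plays no role."""
    return hashlib.sha1(data).hexdigest()

def triage(files, safe_hashes, suspect_hashes):
    """Label each file by looking up the SHA-1 digest of its contents.

    files maps a file name to its contents as bytes (an assumption of this sketch).
    """
    report = {}
    for name, content in files.items():
        digest = sha1_of(content)
        if digest in suspect_hashes:
            report[name] = "suspect"   # matches a known steganography file
        elif digest in safe_hashes:
            report[name] = "safe"      # unmodified known file, can be excluded
        else:
            report[name] = "unknown"   # needs further examination
    return report

# Usage: an image renamed as a system file is not excluded,
# because its digest is not in the safe set.
safe = {sha1_of(b"original system library")}
suspect = {sha1_of(b"known stego tool binary")}
files = {
    "kernel32.dll": b"original system library",
    "notepad.exe": b"bytes of a renamed JPEG image",
    "tool.exe": b"known stego tool binary",
}
report = triage(files, safe, suspect)
```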
Steganalysis involves two major types of techniques:
- visual analysis; it tries to reveal the presence of secret communication through inspection, either through the eye or with the help of a computer system, typically decomposing the image into its bit planes;
- statistical analysis; it is able to reveal whether an image has been modified by testing whether its statistical properties deviate from a norm. It can detect tiny alterations in the statistical behaviour caused by steganographic embedding; several different methods, aimed at different types of the above mentioned embedding techniques, are available.
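Both kinds of analysis can be sketched on a flat list of 8-bit pixel values. The first function extracts a single bit plane, as used for visual inspection. The second computes a chi-square statistic over pairs of values (2k, 2k+1), one well-known statistical test for LSB embedding; it is chosen here only as an example and is not necessarily the method the text has in mind. The function names are hypothetical, and the sketch stops at the raw statistic without converting it into a probability.

```python
from collections import Counter

def bit_plane(pixels, plane):
    """Return bit number `plane` (0 = least significant) of every pixel."""
    return [(p >> plane) & 1 for p in pixels]

def pov_chi_square(pixels):
    """Chi-square statistic over the value pairs (2k, 2k+1).

    LSB replacement with random-looking data tends to equalize the two
    frequencies in each pair, so a value close to zero is suspicious.
    """
    hist = Counter(pixels)
    chi2 = 0.0
    for even in range(0, 256, 2):
        expected = (hist[even] + hist[even + 1]) / 2
        if expected > 0:
            chi2 += (hist[even] - expected) ** 2 / expected
    return chi2

# Usage: compare a skewed histogram with a perfectly equalized one.
skewed = [10] * 100
equalized = [10] * 50 + [11] * 50
stat_skewed = pov_chi_square(skewed)
stat_equalized = pov_chi_square(equalized)
lowest_plane = bit_plane(equalized, 0)
```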
Even though the manuals of US law enforcement agencies (e.g. the American Prosecutors Research Institute) provide no guidelines on steganography, investigators use rules of thumb when looking for indications that may suggest its use:
- technical capabilities or sophistication of the computer's owner;
- software clues from the computer; e.g. file names, Web site references, cookies, history files, cryptography;
- program files; e.g. hex editor, disk wiping software, specialized chat software;
- multimedia files; e.g. files large enough for steganographic use, or present in duplicate;
- type of crime; e.g. child pornography, accounting fraud, identity theft, terrorism.
Steganography and steganalysis can be used by law enforcement forces as ordinary tools to combat crime and, in particular, child pornography.
In the Italian legal system, law n. 269 of August 3, 1998 is the main tool in the fight against child pornography; its article n. 14 provides legal instruments to ascertain the most serious conduct and allows police investigators to work under cover to uncover liabilities and the trade in pornographic media. A few investigative techniques have passed the judges' examination of legitimacy and are therefore considered legitimate and practicable.
Other techniques, such as honey pots, were declared illegitimate by the Court of Cassation (ref. 37074/2004) in relation to a few cases involving less serious offences. According to this decision, the use of an agent provocateur is allowed only when his or her behaviour does not conflict with constitutional norms, and it must be limited to exceptional circumstances under strict procedures and legally binding obligations. Outside these restrictive conditions the activities are not permitted and are considered illegitimate and, in some cases, unlawful; the clues gathered through them are therefore not admitted in court.
Thus there is a need to define novel techniques and new tools that make possible the identification, combating and punishment of illicit behaviour, within a framework of lawful operations closely tied to advanced information and communication technology.
The above mentioned article n. 14 allows both purchasing and intermediation activities. If the intermediation activity includes the dynamic action of trading (purchasing and selling) child pornography materials, then the chances of combating the crime successfully may rest on the new stego techniques and no longer on honey pots.
Steganographic techniques can be used by law enforcement bodies to:
- steganographically watermark the intermediation materials with predefined marks, after authorization by the public prosecutor and under his or her control;
- allow the tracing of the traded materials;
- define and provide network nodes where the traded materials are monitored;
- design and build an international data bank (following the Europol model) to collect data on the trading, under the control of investigative bodies;
- gather and analyse evidence to verify that the buying and trading can be attributed unambiguously to the person under investigation.
The struggle between steganography and steganalysis represents an important part of cyber warfare with deep influence on computer security. The attempt to transmit secret messages, under cover of innocuous multimedia signals, is at odds with the effort to detect or prevent such hidden communication.
From the technical point of view, various steganographic tools have been developed and a few are available online. Some simple methods are defeated by steganalysis, but countermeasures against steganalysis are also emerging, and steganographic tools that can resist statistical attacks are being introduced. For example, in data embedding much attention is dedicated to preserving the statistical characteristics of the cover media; to withstand steganalytic tools based on analysing the increase in the number of unique colours in an image, new embedding methods may be designed to avoid the creation of new colours; alternatively, modifications that would leave detectable traces may be compensated in different ways, while ensuring that the intended recipient is still able to extract the secret message.
From the legal point of view, it must be stressed that the use of stego techniques by law enforcement agencies may increase the chances of presenting well defined and verified evidence to the court at trial, thus reducing the chances of repudiation and making the link to the person under investigation more credible. The success of police investigations will grow as international cooperation grows; in this area steganography, used by transnational police bodies to mark traded materials and combined with shared international data banks, will become an important tool to trace, e.g., the child pornography trade and to make investigations more effective.
Cesare Maioli and Antonio Gammarota, Law School, University of Bologna