lsb documentation
TRANSCRIPT
Implementation of LSB Steganography and its Evaluation for
Various File Formats
Abstract
Steganography is derived from the Greek word steganos which literally means
“Covered” and graphy means “Writing”, i.e. covered writing. Steganography refers to
the science of “invisible” communication. For hiding secret information in various file
formats, there exists a large variety of steganographic techniques some are more
complex than others and all of them have respective strong and weak points. The Least
Significant Bit (LSB) embedding technique suggests that data can be hidden in the
least significant bits of the cover image and the human eye would be unable to notice
the hidden image in the cover file. This technique can be used for hiding images in 24-
Bit, 8-Bit, Gray scale
EXISTING SYSTEM
Image compression techniques are extensively used in steganography. Among the two
types of image compressions, lossy compression and loss less compression; lossless
compression formats offer more promises. Lossy compression compression may not
maintain the original image’s integrity. Lossless compression maintains the original
image data exactly, hence it is prefered. Example of Lossy compression format is
JPEG format files. Examples of Lossless compression formats are GIF[3] and BMP
formats.
PROPOSED SYSTEM
The “Data Hiding in Audio Files” mainly developed to embed or extract the
messages into audio files. This project basically deals with two important network
security concepts namely steganography and encryption. For encryption AES
1
algorithm is used. The plaintext is given as input. The plaintext is encrypted using
AES algorithm. The ciphertext is given as output. The output ciphertext is hidden into
the audio files using Seganography. For steganography LSB algorithm is used. The
audio file is in the wave format is the chosen medium to conceal and transmit the
secret information. Since the audio file is in ASCII format, the contents of the text file
are also converted to the bit stream. The encrypted file is now embedded behind the
audio file by mixing the contents together using LSB algorithm. At the other end, the
encrypted file is separated from audio file. The encrypted file is then decrypted and the
original text file contents are then viewed.
SYSTEM REQUIREMENTS
Hardware Specification
Processor : Pentium IV
RAM : 256 MB
CPU Clock : 1.6 GHz
Software Specification
Front End : JAVA
Back End : MS-Access
Platform : Windows XP
2
HIDING METHODS IN IMAGE STEGANOGRAPHY
In Image Steganography, There are a variety of methods using which information
can be hidden in images. Least Significant Bit Replacement Technique: In image
steganography almost all data hiding techniques try to alter insignificant information in
the cover image. Least significant bit (LSB) insertion is a common, simple approach to
embedding information in a cover image. For instance, a simple scheme proposed, is to
place the embedding data at the least significant bit (LSB) of each pixel in the cover
image[7,8,9] . The altered image is called stego-image. Altering LSB doesn’t change the
quality of image to human perception but this scheme is sensitive a variety of image
processing attacks like compression, cropping etc. We will be emphasizing more on this
technique for the various image formats. Moderate Significant Bit Replacement
Technique: The moderate significant bits of each pixel in the cover image can be used to
embed the secret message. This method improves sensitivity to modification, but it
degrades the quality of stego-image. Experiments have shown that the length of hidden
messages embedded in the least significant bits of signal samples can be estimated with
relatively high precision.
THE LSB TECHNIQUE
The least significant bit i.e. the eighth bit inside an image is changed to a bit of
the secret message. When using a 24- bit image, one can store 3 bits in each pixel by
changing a bit of each of the red, green and blue colour components, since they are each
represented by a byte. An 800×600 pixel image, can thus store a total amount of
3
1,440,000 bits or 180,000 bytes of embedded data. As an example, suppose that we have
three adjacent pixels (9 bytes) with the RGB encoding.
10010101 00001101 11001001
10010110 00001111 11001011
10011111 00010000 11001011
When the number 300, can be which binary representation is 100101100 embedded into
the least significant bits of this part of the image. If we overlay these 9 bits over the LSB
of the 9 bytes above, we get the following (where bits in bold have been changed)
10010101 00001100 11001000
10010111 00001110 11001011
10011111 00010000 11001010
Here the number 300 was embedded into the grid, only the 5 bits needed to be changed
according to the embedded message. On average, only half of the bits in an image will
need to be modified to hide a secret message using the maximum cover size. Since there
are 256 possible intensities of each primary colour, changing the LSB of a pixel results in
small changes in the intensity of the colours. The human eye cannot perceive these
changes - thus the message is successfully hidden. With a well-chosen image, one can
even hide the message in the LSB without noticing the difference[10].
4
Algorithm for Hiding (Steganography)
1. Read the original image and the image which is to be hidden in the original image
2. Shift the image to hide in the cover image by X bits.
3. And the original image or cover image with 240 which is 11110000 So four
MSB’s set to 0. Because of this only four LSB’s considered further.
4. The shifted hidden image and the result of step 3 are bitored. This makes changes
only in the X LSB bits so that the image is hidden in the original image.
In MATLAB we convert it to unit8 format. This image can be called as the stego image
Algorithm for Steganalysis
5
1. The stego image is bit shifted by 4 bits since it was shifted by 4 bits to insert it into the
original image.
2. The image is the ANDED with 255 i.e., 11111111, which gives the original image. It is
ANDED with 255 because initially all the LSB’s were made 0. Now it is recovered back.
3. To get it to Unit8 format we, convert it back to unit8 which is the extracted image.
SYSTEM DESIGN
The system design is the most creative and challenging phase of the system
development life cycle. It is an approach for the creation of the proposed system, which
still helps the system coding. It provides the understanding and procedural details
necessary for implementing the system. A number of subsystems are to be identified
which constitutes the whole system. In this phase the data organization is to be discussed,
in which the output formats are to be designed. The system design is composed of several
steps. Here the emphasis is on translating the performance requirements in to the design
specification.
4.1 System Flow Diagram:
4.1.1 Encryption
The process of converting from plaintext to cipher text is known as encryption or
enciphering. The plaintext is the original message and the ciphertext is the
unintelligible message. The encryption algorithm takes secret key and plaintext, and
6
Message
produces the ciphertext. The encryption algorithm performs various substitutions and
transformations on the plaintext.
4.1.2 Decryption
The process of restoring the plaintext from ciphertext is known as decryption or
deciphering. The decryption algorithm takes the cipher text and the secret key, and
produces the original plaintext.
7
ENCRYPTIONALGORITHM
STEGOSYSTEM
STEGOSYSTEM
DECRYPTIONALGORITHM
Secretkey
Secret key
AudioFile
AudioFile
Message
Cipher
Cipher
Fig: 4.1 System design
4.2 Data Flow Diagram
LEVEL O
Fig: 4.2 Level 0 DFD
LEVEL 1
8
Encryption & Audio Steganography
User Data Steganograped audio file
Steganograped audio file
DeStegnographed / Decryption
User Data
File
User DataEncryption
Secret Key
Steganography
Secret Key
DeSteganography
Secret Key
DeCryptionPlain text
Secret Key
Fig 4.3 Level 1 DFD
The level 0 of DFD specifies the user data that is encrypted and then the cipher
text is hiding into the audio file using steganography. The output obtained is
steganographed audio file. Using steganographic algorithm the encrypted text is
separated from the audio file. Using decryption the cipher text is converted to user data.
The level 1 of DFD specifies the user data which is encrypted using secret key
and then the cipher text is hiding into the audio file using steganography with the help of
secret key. Thus the audio file with encrypted data is stored in a file. The audio file is
steganographed with the help of secret key. Thus the encrypted text is separated from
audio file. Using decryption cipher text is converted to plain text using secret key.
UML DIAGRAMS
The Unified Modeling Language allows the software engineer to express an analysis
model using the modeling notation that is governed by a set of syntactic semantic and
pragmatic rules.
A UML system is represented using five different views that describe the system from
distinctly different perspective. Each view is defined by a set of diagram, which is as
follows.
User Model View
i. This view represents the system from the users perspective.
9
ii. The analysis representation describes a usage scenario from the
end-users perspective.
Structural model view
i. In this model the data and functionality are arrived from inside the
system.
ii. This model view models the static structures.
Behavioral Model View
It represents the dynamic of behavioral as parts of the system, depicting
the interactions of collection between various structural elements
described in the user model and structural model view.
Implementation Model View
In this the structural and behavioral as parts of the system are represented
as they are to be built.
Environmental Model View
In this the structural and behavioral aspects of the environment in which
the system is to be implemented are represented.
UML is specifically constructed through two different domains they are:
UML Analysis modeling, this focuses on the user model and structural model
views of the system.
UML design modeling, which focuses on the behavioral modeling,
implementation modeling and environmental model views.
Use case Diagrams represent the functionality of the system from a user’s point of view.
Use cases are used during requirements elicitation and analysis to represent the
10
functionality of the system. Use cases focus on the behavior of the system from external
point of view.
Actors are external entities that interact with the system. Examples of actors include users
like administrator, bank customer …etc., or another system like central database.
USE CASE DIAGRAM
Use Case: Use case describes the behavior of a system. It is used to structure things in a
model. It contains multiple scenarios, each of which describes a sequence of actions that
is clear enough for outsiders to understand.
Actor: An actor represents a coherent set of roles that users of a system play when
interacting with the use cases of the system. An actor participates in use cases to
accomplish an overall purpose. An actor can represent the role of a human, a device, or
any other systems.
11
Embed message
embed file
select file type
set password
retrieve message
retrieve file
about stegnograph
send data
User
help
SEQUENCE DIAGRAM:
This diagram is simple and visually logical, so it is easy to see the sequence of the
flow of control. It also clearly shows concurrent processes and activations in a design.
12
Object: Object can be viewed as an entity at a particular point in time with a specific
value and as a holder of identity that has different values over time. Associations among
objects are not shown. When you place an object tag in the design area, a lifeline is
automatically drawn and attached to that object tag.
Actor: An actor represents a coherent set of roles that users of a system play when
interacting with the use cases of the system. An actor participates in use cases to
accomplish an overall purpose. An actor can represent the role of a human, a device, or
any other systems.
Message: A message is a sending of a signal from one sender object to other receiver
object(s). It can also be the call of an operation on receiver object by caller object. The
arrow can be labeled with the name of the message (operation or signal) and its argument
values
Duration Message: A message that indicates an action will cause transition from one
state to another state.
Self Message: A message that indicates an action will perform at a particular state and
stay there.
Create Message: A message that indicates an action that will perform between two
states.
13
senduser select file embed file stegnograph
1: selects a file
2: embed the file
3: convert to stegnograph
4: send data
5: retrieve file
CLASS DIAGRAM:
Class: A Class is a description for a set of objects that shares the same attributes, and has
similar operations, relationships, behaviors and semantics.
Generalization: Generalization is a relationship between a general element and a more
specific kind of that element. It means that the more specific element can be used
whenever the general element appears. This relation is also known as specialization or
inheritance link.
Realization: Realization is the relationship between a specialization and its
implementation. It is an indication of the inheritance of behavior without the inheritance
of structure.
14
Association: Association is represented by drawing a line between classes. Associations
represent structural relationships between classes and can be named to facilitate model
understanding. If two classes are associated, you can navigate from an object of one class
to an object of the class.
Aggregation: Aggregation is a special kind of association in which one class represents
as the larger class that consists of a smaller class. It has the meaning of “has-a”
relationship.
EmbedMessageGUI
filenamefiletypefilesizepasswordmessage
EmbedMessageGUI()actionPerformed()stateChanged()
EmbedFileGUI
filenamefilesizepassword
EmbededFileGUI()actionPerformed()itemstateChanged()
4.3. DATABASE DESIGN
The file design is totally based on the users requirements, which holds
the normalized tables done in the system analysis. Every possible measures has taken to
15
avoid duplication of data. After identifying all the requirements the files was created
using SQL Server special care was taken.
To reduce redundancy
To increase integrity & accuracy and
For data independence
LOGIN TABLE
Field Name Data Type
User Name Text
Password Text
Table: 4.1
16
5. SYSTEM IMPLEMENTATION
5.1 Implementation Procedures
Implementation is one of the most important tasks in project implementation is
the phase, in which one has to be cautious, because all the efforts undertaken during the
project will be fruitful only if the software is properly implemented according to the plans
made.
The implementation phase is less creative than system design. It is primarily
concerned with user-training, site preparation and file-sites, the tests of the network along
with the systems are also included under implementation.
Depending on the nature of the systems extensive user training may be required.
Programming is itself a design works. The initial parameters of the management
information systems should be modified as a result of programming efforts. Programming
provides a real test for the assumption made by the analyst.
System testing checks the readiness and accuracy of the system to access, update
and retrieve data from the new files. Once the programs become available the test data
are read into the computer and processed. In most conventions parallel run was conducted
to establish the efficiency of the system.
Implementation is used here to mean the process of converting a new or revised
system in to an operational one. Conversion is one aspect of one implementation.
17
There are two types of Implementation
Implementation of a computer system to replace a manual one.
Implementation of a new computer system to replace an existing one.
AES
Fig: 5.1 AES Encryption & Decryption
18
The Advanced Encryption Standard (AES Algorithm) is a computer security
standard that became effective on May 26, 2002 by NIST to replace DES. The
cryptography scheme is a symmetric block cipher that encrypts and decrypts 128-bit
blocks of data. Lengths of 128, 192, and 256 bits are standard key lengths used by AES
Algorithm. The algorithm consists of four stages that make up a round which is iterated
10 times for a 128-bit length key, 12 times for a 192-bit key, and 14 times for a 256-bit
key. The first stage "SubBytes" transformation is a non-linear byte substitution for each
byte of the block. The second stage "Shift Rows" transformation cyclically shifts
(permutes) the bytes within the block. The third stage "Mix Columns" transformation
groups 4-bytes together forming 4-term polynomials and multiplies the polynomials with
a fixed polynomial mod (x^4+1). The fourth stage "AddRoundKey" transformation adds
the round key with the block of data. In most ciphers, the iterated transform (or round)
usually has a Feistel Structure. Typically in this structure, some of the bits of the
intermediate state are transposed unchanged to another position (permutation). AES
Algorithm does not have a Feistel structure but is composed of three distinct invertible
transforms based on the Wide Trial Strategy design method. The Wide Trial Strategy
design method provides resistance against linear and differential cryptanalysis. In the
Wide Trail Strategy, every layer has its own function: The linear mixing layer: guarantees
high diffusion over multiply rounds.The non-linear layer: parallel application of S-boxes
that have the optimum worst-case non-linearity properties.
19
LSB
Data hiding in the least significant bits (LSBs) of audio samples in the time
domain is one of the simplest algorithms with very high data rate of additional
information. The LSB watermark encoder usually selects a subset of all available host
audio samples chosen by a secret key. The substitution operation on the LSBs is
performed on this subset, where the bits to be hidden substitute the original bit values.
Extraction process simply retrieves the watermark by reading the value of these bits from
the audio stego object. Therefore, the decoder needs all the samples of the stego audio
that were used during the embedding process. The random selection of the samples used
for embedding introduces low power additive white Gaussian noise (AWGN). It is well
known from the psychoacoustics literature [6] that the human auditory system (HAS) is
highly sensitive to AWGN. That fact limits the number of LSB that can be imperceptibly
modified during watermark embedding. As the number of used LSBs during LSB coding
increases or, equivalently, depth of the modified LSB layer becomes larger, perceptual
transparency of stego objects is decreased. Therefore, there is a limit for the depth of the
used LSB layer in each sample of host audio that can be used for data hiding. Subjective
listening test showed that, in average, the maximum LSB depth that can be used for LSB
based watermarking without causing noticeable perceptual distortion is the fourth LSB
layer when 16 bits per sample audio sequences are used. The tests were performed with a
large collection of audio samples and individuals with different background and musical
experience. None of the tested audio sequences had perceptual artifacts when the fourth
20
LSB has been used for data hiding, although in certain music styles, the limit is even
higher than the fourth LSB layer. Robustness of the watermark, embedded using the LSB
coding method, increases with increase of the LSB depth used for data hiding. Therefore,
improvement of watermark robustness obtained by increase of depth of the used LSB
layer is limited by perceptual transparency bound, which is the fourth LSB layer for the
standard LSB coding algorithm.
21
6. SYSTEM TESTING
Testing is the process of evaluating a system by manual or automatic means to
verify that it satisfies the specified requirements or to identify differences between the
actual and expected results. During system testing, the system is used experimentally to
ensure that the software does not fail. Special test data are input for processing and the
results are examined. If the program fails to behave as expected, then the conditions under
which such a failure occurs are noted for debugging and correction. Program testing
represents the logical elements of the system. The testing ultimately leads to suffice the
quality factors such as correctness, reliability, efficiency, usability, maintainability,
portability, accuracy, error tolerance and expandability.
6.1 Test strategies:
For testing software, various test strategies can be used such as Unit Testing,
Integration Testing, Black-box Testing, White-box Testing, Regression Testing, and
Acceptance Testing etc.
6.1.1 Unit Testing:
Unit test focuses verification effort on the smallest unit of software design, the
software components or modules. By testing in this method we should be very clear of the
bugs occurred. The module interface is tested to ensure that information property flows into
and out of the program under test.
22
6.1.2 Integration Testing:
Integration testing is a systematic technique for constructing the program structure
while at the same time conducting tests to uncover errors associated with interfacing. The
objective is to take unit tested components and build a program structure that has been
dedicated by design.
6.1.3 Black-box Testing:
Black box testing alludes to test that are conducted at the software interface.
Although they are designed to uncover errors, black-box tests are used to demonstrate that
software functions are operational, that input is properly accepted and the output is
correctly produced. A black-box test examines some fundamental aspect of a system with
title regard for the internal logical structure of the software.
6.1.4 White-box Testing:
White box testing is predicated on close examination of procedural detail. Logical
paths through software are tested by providing test cases that exercise specific set of
conditions and loops.
6.1.5 Regression Testing:
Regression testing involves executing old test cases to test that no new errors have
been introduced. This testing is performed when some changes are made to an existing
23
system. The modified system needs to be tested to make sure that the new features to be
added to work.
6.1.6 Acceptance Testing:
Acceptance testing involves planning and execution of functional tests,
performance test to verify that implemented system satisfies its requirements. Acceptance
tests are typically performed by the quality assurance and customer organization.
CONCLUSION AND FUTURE ENHANCEMENT
In this project it is mainly used for hiding the data in audio files and sends
the data in secure manner in audio file. This project implemented our sending the
data in secure manner .In this project hiding the data in the audio files. So the
hackers can’t access the data. In the receiver side LSB algorithm extract the text
from the audio file. Using the AES Algorithm we decrypt the data form the
encrypted format.
Future Enhancement:
In future, the data can be hidden using video files. Also enhancement like
compression of audio files, more security can be performed.
24
25