Page 1: IT Text Book

IT Text Book for Semester 5

INFORMATION TECHNOLOGY

Search Engine for a Book

Textbook for Semester 5

AP IIIT – BASARA

ADILABAD

Information taken from RGUKT hub

HTTPS://192.168.1.1/hub


INDEX

This book contains the following modules:

MODULE NAME

1. Count the Number of Words
2. Count the Number of Words in a given File
3. Reading Text from Multiple Files
4. Accessing Values in Strings
5. String Slicing
6. Count the number of occurrences of a given word (Unigram) in a file
7. Count the given bigram
8. Trigram concept
9. Counting vowels in the text
10. Dictionary
11. Hash Table
12. Counting bigrams in a text file
13. Comparing two words with the same length
14. Compare two different length strings
15. Sorting of three strings


INTRODUCTION TO THE COURSE

Data Structures & Algorithms using Python

The objective of this course is to impart basic understanding and hands-on training on

basic data structures and algorithms. As a part of this course, the student is exposed to

basic data structures including arrays (lists), strings, hashing tables / dictionaries and

inverted index list. The students would also be exposed to sorting and searching

algorithms.

This course is structured as a set of 42 modules. These modules are linked together as

a project referred to as book search engine. This search engine is similar to Google

search engine, except that the search takes place in a given book. As a process of

building this search engine, we explain the concepts of data structures and

algorithms. The concepts provided in each module would help the students to realize

a useful search engine. Exercise problems are given at the end of each module. These

problems provide hands-on training and implementation details of the search engine.

At the end of this course, each student is expected to demonstrate his/her search

engine for a given book.

As such, the structure and the concepts used in this course are language-independent.

However, as this course is prescribed to be implemented in Python, there are a few

details and reading material which are specific to Python programming language. All


the modules in this course are to be attempted sequentially, as there is an inherent link

between each one of them.


Module 1:

Count the Number of Words

Strings

A string is simply a list of characters in order. A character is anything you can type on the keyboard

in one keystroke, like a letter, a number, or a backslash. For example, "hello" is a string. It is five

characters long — h, e, l, l, o. Strings can also have spaces: "hello world" contains 11 characters,

including the space between "hello" and "world".

There are no limits to the number of characters you can have in a string — you can have anywhere

from one to a million or more. You can even have a string that has 0 characters, which is usually

called "the empty string."

There are three ways you can declare a string in Python: single quotes ('), double quotes ("), and

triple quotes ("""). In all cases, you start and end the string with your chosen string declaration. For

example:

print ('I am a single quoted string')

I am a single quoted string

print ("I am a double quoted string")

I am a double quoted string

print ("""I am a triple quoted string""")

I am a triple quoted string


You can use quotation marks within strings by placing a backslash directly before them, so that

Python knows you want to include the quotation marks in the string, instead of ending the string

there. Placing a backslash directly before another symbol like this is known as escaping the symbol.

Note that if you want to put a backslash into the string, you also have to escape the backslash, to tell

Python that you want to include the backslash, rather than using it as an escape character.

print ("So I said, \"You don't know me! You'll never understand me!\"")

So I said, "You don't know me! You'll never understand me!"

print ('So I said, "You don\'t know me! You\'ll never understand me!"')

So I said, "You don't know me! You'll never understand me!"

print ("This will result in only three backslashes: \\ \\ \\")

This will result in only three backslashes: \ \ \

print ("""The double quotation mark (") is used to indicate direct quotations.""")

The double quotation mark (") is used to indicate direct quotations.

As you can see from the above examples, only the specific character used to quote the string needs

to be escaped. This makes for more readable code.

To see how to use strings, let's go back for a moment to an old, familiar program:

print("Hello, world!")

Hello, world!

Strings and Variables

Now that you've learned about variables and strings separately, let's see how they work together.

Variables can store much more than just numbers. You can also use them to store strings! Here's an example:

question = "What did you have for lunch?"   # question is the variable; the quoted text is its value

print (question)

In this program, we are creating a variable called question, and storing the string "What did you

have for lunch?" in it. Then, we just tell Python to print out whatever is inside the question variable.

Notice that when we tell Python to print out question, there are no quotation marks around the

word question: this is to signify that we are using a variable, instead of a string. If we put in

quotation marks around question, Python would treat it as a string, and simply print out question

instead of What did you have for lunch?.

Let's try something different. Sure, it's all fine and dandy to ask the user what they had for lunch,

but it doesn't make much difference if they can't respond! Let's edit this program so that the user can

type in what they ate.

question = "What did you have for lunch?"

print (question)

answer = raw_input()

print ("You had " + answer + "! That sounds delicious!")

To ask the user to write something, we used a function called raw_input(), which waits until the

user writes something and presses enter, and then returns what the user wrote. Don't forget the

parentheses! Even though there's nothing inside of them, they're still important, and Python will

give you an error if you don't put them in.

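You can also use a different function called input(), which works in nearly the same way. In Python 2, raw_input() always returns what the user typed as a string, whereas input() evaluates it as a Python expression; in Python 3, raw_input() was renamed to input(). We will learn more about the differences between these two functions later.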

What is a word?

A word is a string that contains no space, tab or newline; that is, words are separated by spaces, tabs or newlines. For example, "hello world" is a string which has two words, "hello" and "world".

Basic String Operations

String Concatenation:

Look at that! You've been using strings since the beginning! You can also add two strings together

using the + operator: this is called concatenating them.

Example:

print ("Hello, " + "world!")

Hello, world!

Notice that there is a space at the end of the first string. If you don't put that in, the two words will

run together, and you'll end up with Hello,world!

String Multiplication:

The * operator repeats a string n times. Example:

n = 10
print ("bouncy, " * n)

bouncy, bouncy, bouncy, bouncy, bouncy, bouncy, bouncy, bouncy, bouncy, bouncy,

If you want to find out how long a string is, we use the len() function, which simply takes a string

and counts the number of characters in it. (len stands for "length.") Just put the string that you want

to find the length of, inside the parentheses of the function.

For example:

print (len("Hello, world!"))


13

len():

We can use the len() function to calculate the length of the string in characters.

Example:

var = 'eagle'
print var, "has", len(var), "characters"

Output: eagle has 5 characters

int(), float(), str():

We use a built-in int() function to convert a string to integer. And there is also a built-in str()

function to convert a number to a string. And we use the float() function to convert a string to a

floating point number.

Example:

print int("12") + 12

print "There are " + str(22) + " oranges."

print float('22.33') + 22.55
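The three conversions can be tried together in a short sketch. The values below are illustrative only, and print() is called with a single argument so that the sketch should run under both Python 2 and Python 3.

```python
# Converting between strings and numbers (illustrative values only).
total = int("12") + 12                           # string -> integer, then add
message = "There are " + str(22) + " oranges."   # number -> string, then concatenate
price = float("22.5") + 0.25                     # string -> floating point number

print(total)
print(message)
print(price)
```

Note that "12" + 12 without int() would raise a TypeError, because a string and a number cannot be added directly.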

split():

This function splits the given string into words.

Example:

sentence = 'It is raining cats and dogs'
splitwords = sentence.split()

Here the sentence is split into words wherever whitespace occurs. The variable splitwords is a list that contains all the words. This is how it looks:

print splitwords

['It', 'is', 'raining', 'cats', 'and', 'dogs']

Note: Explore all string functions
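Putting split() and len() together gives the word count that this module is named after. The following is a minimal sketch; the function name count_words is made up for illustration.

```python
def count_words(text):
    """Return the number of whitespace-separated words in text."""
    return len(text.split())

print(count_words("It is raining cats and dogs"))
```

Because split() with no argument treats any run of spaces, tabs or newlines as one separator, extra whitespace does not change the count.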


Module 2:

Count the Number of Words in a given File

Splitting a sentence

We hope that in Module 1 you learnt what a string is, how to take an input string and how to print strings, and that you used the split() function to count the number of words in a given string. Here we are going to learn how to open a file, read it, and count the number of words, spaces, lines, etc. in it.

Before we read a file you need to know about the split() and count() functions. Called with one argument, split() uses that single delimiter; called with no argument, it splits on any run of whitespace. To split a text file into words you need multiple delimiters like blank, punctuation, math signs (+-*/), parentheses and so on.

Here is a quick example:

Splitting a sentence:

sent = "Jack ate the apple."   # assign a statement to a variable
splitsent = sent.split(' ')    # split the statement at each space
print splitsent                # print the list produced by the split

Output: ['Jack', 'ate', 'the', 'apple.']

When we split a statement, it is converted into a list and every word is stored as an element of the list.

['Jack', 'ate', 'the', 'apple.']

So the length of the list is equal to the number of words in the statement. This is one way of counting words in the data. There is also another way.

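Counting number of substrings:

Syntax: count(sub[, start[, end]])

For a string, count() returns the number of occurrences of the substring sub. Lists have a similar method, count(item), which counts the number of occurrences of the given item in the list.

l = ['a','b','a','c','d','e']
l.count('a')
=> 2
l.count('d')
=> 1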

Counting the number of spaces in a statement:

When we use the count() function to count the number of spaces, it is very easy to find the number of words in the given data.

sent = "Jack ate the apple."     # assign a statement to a variable
spaces = sent.count(' ')         # count the spaces in the statement
print "No of spaces", spaces     # print the number of spaces
print "No of words", spaces+1    # words = spaces + 1 when words are separated by single spaces

This will produce:

No of spaces 3
No of words 4
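The "spaces plus one" rule only holds when every pair of words is separated by exactly one space. The sketch below, using a made-up sentence with repeated spaces, compares it with counting the elements returned by split().

```python
# A sentence with repeated spaces: 2 after "Jack", 1 after "ate", 3 after "the".
sent = "Jack" + " " * 2 + "ate the" + " " * 3 + "apple."

words_by_spaces = sent.count(" ") + 1   # over-counts when spaces repeat
words_by_split = len(sent.split())      # split() ignores runs of whitespace

print(words_by_spaces)
print(words_by_split)
```

For text typed by a user or read from a file, len(sent.split()) is usually the safer of the two ways.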

Reading a file

Files in a programming sense are really not very different from files that you use in a word

processor or other application: you open them, do some work and then close them again.

You can open files with the open function, which has the following syntax:

open(name[, mode])

open() takes two arguments. The first is the filename, which is mandatory and may be passed as a variable or a literal string. The second is the mode, which is optional. The mode determines whether we are opening the file for reading (r) or writing (w).

Ex: file = open("text1.txt", 'r') # This is in reading mode

We close the file at the end with the close() method.

Ex: file.close()

Here's a quick example. Suppose there is a txt file, somefile.txt, which contains the statement "Hello, World!".

f = open('somefile.txt', 'r')   # open file in reading mode
print f.read()                  # reading the data

Hello, World!

We can assign the data to a variable, and it will be treated as a string.

f = open('somefile.txt', 'r')   # open file in reading mode
a = f.read()                    # read all the data at once and assign it to a variable
print a                         # printing data

This will produce: Hello, World!

read() : reads the whole file at once and returns it as one string

readline() : reads just a single line from the file at a time

readlines() : reads ALL lines and returns them as a list, split at the line delimiter
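The difference between the three functions is easiest to see side by side. This sketch first writes a small throwaway file (the file name and contents are made up for the demonstration) and then reads it back in each of the three ways.

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "somefile.txt")
f = open(path, "w")
f.write("first line\nsecond line\n")
f.close()

f = open(path, "r")
whole = f.read()          # the entire file as one string
f.close()

f = open(path, "r")
first = f.readline()      # only the first line, including its newline
f.close()

f = open(path, "r")
lines = f.readlines()     # a list with one string per line
f.close()

print(repr(whole))
print(repr(first))
print(lines)
```

The file is reopened before each call because every read continues from where the previous one stopped; a second read() on the same open file would return an empty string.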

Worked Example 1:

Take an input statement from the user and count how many words there are in it.

a = raw_input("Enter your statement \n")
b = a.split()   # splitting the sentence into words
print len(b)

Here b is a list that contains the words.

Input: "This is my first program."

Output: 5

Worked Example 2:

Open a txt file called weeks.txt which consists of week days and then read that file.

f = open("weeks.txt",'r') # open file in reading mode


data = f.read() # assigning data to a variable

print data

This will produce:

Sunday

Monday

Tuesday

Wednesday

Thursday

Friday

Saturday

Note: you need to create a text file called weeks.txt in the same folder as the program file.

Worked Example 3: Count how many lines of data there are in the above txt file.

f = open("weeks.txt",'r')
b = f.readlines()      # reading all the lines into a list
count = 0              # assign 0 to a variable to count lines
for i in b:            # this for loop visits the data line by line
    count = count+1
print count

This will produce: 7

Worked Example 4:

Read a file name from the user and count the number of spaces in the file.

filename = raw_input("Enter the name of a file which already exists \n")
file = open(filename,'r')
data = file.read()
count = data.count(" ")
print count

This will print the number of spaces in the given file.

(Note: when taking a file name as input you need to give extension of that file. Ex: t
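The worked examples of this module can be combined into one helper that reports lines, words and spaces for a file. This is only a sketch: the function name file_stats and the sample file are assumptions made for the demonstration.

```python
import os
import tempfile

def file_stats(path):
    """Return (lines, words, spaces) for the text file at path."""
    f = open(path, "r")
    data = f.read()
    f.close()
    lines = len(data.splitlines())   # number of lines
    words = len(data.split())        # number of whitespace-separated words
    spaces = data.count(" ")         # number of space characters
    return lines, words, spaces

# Throwaway sample file; its name and contents are made up for the demo.
sample = os.path.join(tempfile.mkdtemp(), "sample.txt")
f = open(sample, "w")
f.write("Jack ate the apple.\nJill ate the pear.\n")
f.close()

print(file_stats(sample))
```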


Module 3:

Reading Text from Multiple Files


So far we have covered how to open a text file, read the text, count the words and close the file. In this module we will learn how to read multiple text files by using a single text file that lists their names.

Let us see an example of reading the text from multiple files.

Worked out example:

We have a file 'list.txt' that contains two file names, file1.txt and file2.txt, one per line. Some text has been added to each of those two files.

Write a program to read the text in file1.txt and file2.txt and print it.

list.txt contains following text :

file1.txt

file2.txt

file1.txt contains following text:

This is file1. Here I am adding some text.

file2.txt contains following text:


This is file2. Here I am adding some more text.

Algorithm:

• Open the file list.txt, which has the file names
• Read its text line by line
  1. Open the file named on each line (every line is a file name)
  2. Read the entire text and print it on the console
  3. Close the file
• Close the list.txt file

Output:

This is file1. Here I am adding some text.
This is file2. Here I am adding some more text.
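One possible way to turn this algorithm into code is sketched below. To keep it self-contained, the sketch first creates list.txt, file1.txt and file2.txt in a temporary folder; in the exercise itself these files would already exist next to the program.

```python
import os
import tempfile

# Build the three sample files in a throwaway folder.
folder = tempfile.mkdtemp()
samples = {
    "list.txt": "file1.txt\nfile2.txt\n",
    "file1.txt": "This is file1. Here I am adding some text.",
    "file2.txt": "This is file2. Here I am adding some more text.",
}
for name in samples:
    f = open(os.path.join(folder, name), "w")
    f.write(samples[name])
    f.close()

texts = []
listing = open(os.path.join(folder, "list.txt"), "r")
for line in listing:                 # read list.txt line by line
    name = line.strip()              # drop the trailing newline
    if name == "":
        continue                     # skip blank lines
    f = open(os.path.join(folder, name), "r")
    text = f.read()                  # read the entire text of that file
    f.close()
    texts.append(text)
    print(text)
listing.close()
```

The strip() call matters: each line read from list.txt ends with a newline character, and a file name with a trailing newline would not be found.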

Exercise problems:

1. Read a file name "list.txt" which has several file names. Count number of words in each file.

2. Read a file name 'list.txt' which has several file names. Each file is a chapter in a Telugu text

book. Count the number of words in each file, as well as the total number of words in all the

files.

Algorithm for Program 2:

• Open the "list.txt" file
• Read the "list.txt" file and assign the text to a variable
• Split the entire text into words and store them as a list (each word is a file name)
• Set totalWords to zero
• Do the following steps for each element of the list:
  1. Open the file named by that element
  2. Read the entire text
  3. Split the text of the file and count the words
  4. Print the number of words in that file
  5. Add the count to totalWords
• Print the total number of words
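The steps above could be sketched as follows. The function name, the chapter file names and their contents are all hypothetical, chosen only so that the example runs on its own; it is one possible approach, not the required solution.

```python
import os
import tempfile

def count_words_in_listed_files(folder, list_name):
    """Return ({file name: word count}, total) for the files named in list_name."""
    f = open(os.path.join(folder, list_name), "r")
    names = f.read().split()          # every word in the list file is a file name
    f.close()
    counts = {}
    total_words = 0
    for name in names:
        f = open(os.path.join(folder, name), "r")
        words = f.read().split()
        f.close()
        counts[name] = len(words)
        total_words = total_words + len(words)
    return counts, total_words

# Throwaway sample data (hypothetical chapter files).
folder = tempfile.mkdtemp()
for name, text in [("list.txt", "ch1.txt ch2.txt"),
                   ("ch1.txt", "one two three"),
                   ("ch2.txt", "four five")]:
    f = open(os.path.join(folder, name), "w")
    f.write(text)
    f.close()

counts, total = count_words_in_listed_files(folder, "list.txt")
print(counts)
print(total)
```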


Module 4:

Accessing Values in Strings


In the previous modules we have seen how to open, read and close a file, learned about strings and variables, and counted the number of words in a given file. Now we are going to use string functions to count the words in a given file that start with a given character. To solve this kind of problem you should know some basic string operations.

Accessing Values in Strings:

Python does not support a character type; these are treated as strings of length one, thus also

considered a substring.

To access substrings, use the square brackets for slicing along with the index or indices to obtain

your substring:

The best way to remember how slices work is to think of the indices as pointing between characters,

with the left edge of the first character numbered 0. Then the right edge of the last character of a

string of n characters has index n, for example:

 +---+---+---+---+---+
 | H | e | l | p | A |
 +---+---+---+---+---+
 0   1   2   3   4   5
-5  -4  -3  -2  -1

Worked out example:

>>> s = "HelpA"
>>> print s[0]   # To print the first character
H
>>> print s[2]   # To print the middle character; you can also write s[len(s)/2]
l
>>> print s[4]   # To print the last character; you can also write s[-1]
A
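A short sketch of positive and negative indexing on the same string. It uses // for integer division and print() with a single argument, so it should behave the same under Python 2 and Python 3.

```python
s = "HelpA"

first = s[0]                 # leftmost character
middle = s[len(s) // 2]      # // is integer division, so the index is a whole number
last = s[-1]                 # same as s[len(s) - 1]

print(first + " " + middle + " " + last)

try:
    s[len(s)]                # one position past the last character
except IndexError:
    print("index " + str(len(s)) + " is out of range")
```

A string of n characters has valid indices 0 to n-1 (or -n to -1); asking for index n raises an IndexError, although n is allowed as the end of a slice.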

We can also use some predefined string functions. See the table below.

Common string operations

String Method: string.startswith(character)
Description: Returns True if the string starts with the specified character, otherwise returns False. The comparison is case-sensitive.
Example:
>>> s = "India"
>>> print s.startswith("I")
True

String Method: string.find(substring)
Description: Returns the lowest index in the string where the substring is found, optionally within the slice range of start and end. Returns -1 if the substring is not found.
Example:
>>> s = "india"
>>> s.find("d")
2

Worked out problems:

1) Using startswith()

>>> x = "string"
>>> print x.startswith('s')
True

2) Count the number of words which start with a given character in the given string

s = "Personal firewall software may warn about the connection IDLE"
char = raw_input("Enter a character:")
words = s.split()
count = 0
for i in range(len(words)):
    if words[i].startswith(char):
        count = count+1
print "number of words which starts with given character(",char,")",count

Output: Enter a character: s

number of words which starts with given character( s ) 1

Solve the problems given below and submit your answers:

Write a program to print the even characters in the given string "Country".

Print all the first characters in the string "India is my country". Hint: use the split() function.

Print the middle character of a given odd-length string.

Module 5:

String Slicing


Reading Material:

In the previous modules we have seen how to access the characters in a string and how to access a file. Now it is very easy to access the last letter of any string read from a file.

To access the last character we can use two methods:

1) Slicing

Python supports reading part, or a slice, of a larger string:

>>> s = "Peter, Paul, and Mary"

>>> print s[0:5]

Peter

>>> print s[7:11]

Paul

>>> print s[17:21]

Mary

The operator [n:m] returns the part of the string from the nth character to the mth character,

including the first, but excluding the last.

• This behavior is counterintuitive, but it might make more sense if you picture the indices pointing between the characters, as in the following diagram:

 +---+---+---+---+---+---+
 | b | a | n | a | n | a |
 +---+---+---+---+---+---+
 0   1   2   3   4   5   6

1) If you omit the first index (before the colon), the slice starts at the

beginning of the string.

2) If you omit the second index, the slice goes to the end of the string.

>>> x = 'banana'

>>> x[:3]

'ban'

>>> x[3:]

'ana'

3) s[:] returns the entire string:

>>> x = 'banana'

>>> x[:]

'banana'

4) If you want to count from the end of the string, you can put '-' before the number.

>>> x = 'new_string'
>>> x[-3:]   # gives 'ing', the last three characters of the string x
'ing'

2) endswith()

Print True if the given string ends with 'A':

>>> s = "HelpA"
>>> print s.endswith('A')
True
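The slicing rules and endswith() can be checked together on one string. This is a small sketch; print() is called with a single argument so it should run under both Python 2 and Python 3.

```python
s = "Peter, Paul, and Mary"

print(s[:5])                # from the start up to (not including) index 5
print(s[7:11])              # the characters at indices 7, 8, 9 and 10
print(s[-4:])               # the last four characters
print(s[-1])                # the last character
print(s.endswith("Mary"))   # True when the string ends with the given text
```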

Solving the given problems

Steps to solve problem 1:

1) Read an input string from the user.

>>> s = raw_input("enter a string: ")

2) Find the first letter and the last letter of the input string and compare the two characters.

3) Print the output based on the comparison: if both characters are the same print True, else print False.

Steps to solve Problem 2:

1) Open a file and read it, using whichever read method suits the problem.

>>> file = open('test.txt', 'r')   # opening a file
>>> read = file.read()             # reads the whole file as one string
>>> # Alternatively, instead of read():
>>> textlist = file.readlines()    # reads the file as a list of lines
>>> # Probably more efficient:
>>> line = file.readline()         # reads one line at a time

2) Read the length of the last substring, to print that many last characters.

>>> length = int(raw_input("enter the length of string to count words"))

3) Print the output for all words using the given input length.

Steps to solve Problem 3:

1) Open a file, as you have already done in previous modules, and read it using whichever read method suits the problem.

>>> file = open('test.txt', 'r')   # opening a file
>>> read = file.read()             # reads the whole file as one string
>>> # Alternatively, instead of read():
>>> textlist = file.readlines()    # reads the file as a list of lines
>>> # Probably more efficient:
>>> line = file.readline()         # reads one line at a time

2) Read an input string from the user and store it in a variable.

>>> ch = raw_input("enter a word to count")

3) Read the length of the last substring to compare with the words in the file.

>>> length = int(raw_input("enter the length of string to count words"))

4) Search each word from the file and count the total number of words which end with the last characters of the input string.

• Compare the ending of each word from the file with the ending of the input string.

>>> if ch[-length:] == word[-length:]:
...     count = count + 1   # increase the count of words

5) Print the desired count.
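The counting step can be sketched with endswith(), which avoids handling the slice length by hand. The function name and the sample sentence are made up for illustration; in the exercise the text would come from a file.

```python
def count_words_ending_with(text, suffix):
    """Count the whitespace-separated words in text that end with suffix."""
    count = 0
    for word in text.split():
        # For a non-empty suffix this is the same test as
        # word[-len(suffix):] == suffix
        if word.endswith(suffix):
            count = count + 1
    return count

text = "walking talking sing bring table"
print(count_words_ending_with(text, "ing"))
```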

Solve the problems given below and submit your answers:

Print True if the string has the same first and last character.

Read a letter from the user and store it in a variable. Count the number of words which END with the character stored in the variable.

Print all the last characters in the given file.

Module 6:

Count the number of occurrences of a given word (Unigram) in a file

We hope that you have learnt how to count the words which start with a given character. Now we are going to learn how to search for a substring in a given file.

We can count the occurrences of a given word in different ways: we can use the count() function, which is built into Python and counts the occurrences of the given word in a file or string, or we can use a for loop.

Method 1

Syntax: S.count(sub[, start[, end]])

It counts the number of occurrences of a substring. Optionally takes a starting and ending index between which to search. Here is an example to understand.

String = "I am Raj. I am in iiit"
sub = "am"
count = String.count(sub)
print count

Output: 2

Method 2

We can count the number of times a given word occurs in a string in another way.

String = "I am is Raju. I am IIIT,HYD "
a = String.split()
sub = "am"
count = 0
for i in a:
    if i==sub:
        count = count+1
print count

Output: 2
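The two methods do not always agree: count() on a string also finds the word inside longer words, while comparing the elements of the split list matches whole words only. The sentence below is made up to show the difference.

```python
text = "I am Raj. I am in iiit. Sam came to the camp."
sub = "am"

substring_hits = text.count(sub)            # also counts "am" inside Sam, came and camp
whole_word_hits = text.split().count(sub)   # counts only the words equal to "am"

print(substring_hits)
print(whole_word_hits)
```

Note that the whole-word comparison is exact, so a word followed by punctuation (for example "am.") would not be counted unless the punctuation is removed first.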

Worked Example 1:

Open a text file in reading mode and count the number of occurrences of a given word in that file, using the count function.

file = open('text.txt','r')                   # opening a file in reading mode
read = file.read()                            # reading the file and assigning it to a variable
split = read.split()                          # splitting the data into a list of words
sub = raw_input("Enter a word to count\n")    # taking a word
count = split.count(sub)                      # counting the list elements equal to sub
print "Total Number of ",sub," are ",count    # printing the count

(Note: create a txt file in the same folder where the python file is saved.)

Worked Example 2:

Open a text file in reading mode and count the number of occurrences of a given word in that file, without using the count function.

file = open('text.txt','r')                   # opening a file in reading mode
read = file.read()                            # reading the file and assigning it to a variable
split = read.split()                          # splitting the data into a list of words
sub = raw_input("Enter a word to count\n")    # taking a word
count = 0
for i in split:                               # using a for loop to compare each word with sub
    if i == sub:
        count = count+1
print "Total Number of ",sub," are ",count    # printing the count

(Note: create a txt file in the same folder where the python file is saved.)

Steps to solve prob-2:

1. Open a file

>>> file = open("text.txt", 'r')

2. Read the file

>>> read = file.read()   # use related file operations based on usage.

3. Read an input from the user and assign it to a variable.

>>> var = raw_input("enter a word to search")

4. Read the file line by line till the end of the file.

   a. Assign each line to a temporary variable.
      i. If it ends with '.txt':
         1. Open that file, read it line by line and compare with the input string.
            • If both the strings are equal then increase the count.
      ii. Else compare that word with the input string.
         1. If both the strings are equal then increase the count.
   b. Compare each time with the given input string.
      i. If the input string is the same as the string from the file:
         1. Increase the count.

5. Print the count.

Steps to solve prob-3:

1. Open a file

>>> file = open("text.txt", 'r')

2. Read the file

>>> read = file.read()   # use related file operations based on usage

3. Read an input from the user and assign it to a variable.

>>> var = raw_input("enter a word to search")

4. Read the file line by line till the end of the file.

   a. Compare each time with the given input string.
      i. If the input string is the same as the string from the file:
         1. Increase the count.

5. Print the count.

Solve the problems given below and submit your answers:

1. What is the output when you run the following program?

a = "Ramu is Ramu and Raju is Ramu"
print a.count("ramous")

A. 2   B. 1   C. 0   D. Error

2. Read a file named "family.txt" which has several family members' file names. Count the number of times the word "is" occurs in each file, as well as the total number of times it occurs in all the files.

3. Read a file name such as "List.txt" and a word from the user, and count the occurrences of that word in the file.

Module 7:

Count the given bigram


In the previous module we learned how to find a unigram in a file that lists multiple files. In this module we are going to work out how to count a given bigram.

Let us look at what a bigram is:

string = "this is a bigram program"

As we saw in the previous module, every word in the above string is called a unigram. A bigram is two consecutive letters, words or syllables; here, two consecutive words separated by a single space.

Bigrams in above string :

this is

is a

a bigram

bigram program

Worked out example:

Write a program to find whether the given bigram is present in the string or not.

string = "hello good morning to all"

search = "good morning"

Solution:

Method 1:

1. Enter a text to check.

Ex: text = "hello good morning to all"   # String assignment

2. Take an input string from the user to check whether the bigram exists or not.

Ex: ch = "good morning"

3. Split the text and the bigram and store them in other variables.

s_text = text.split()   # Splitting the string 'text'
s_ch = ch.split()

4. Find the length of each variable.

length = len(s_text)

5. Initialize the count to 0.

count = 0

6. Create a for loop that runs up to the second-to-last word of the split text.

for i in range(length-1):
    if s_text[i]==s_ch[0] and s_text[i+1]==s_ch[1]:   # conditional checking
        count = count+1
    else:
        continue
if count==0:
    print "The given bigram is NOT FOUND"
else:
    print "The given bigram is found ",count," times"

Method 2:

text="hello good morning to all"

ch="good morning"

print text.count(ch) # count is a built in function
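Method 2 is shorter, but count() compares characters rather than words, so it can also match a bigram that starts or ends in the middle of a word. The sketch below pairs each word with its successor using zip(), which is one possible way to compare whole words; the function name count_bigram is made up for illustration.

```python
def count_bigram(text, bigram):
    """Count how often the two-word bigram occurs as consecutive words in text."""
    words = text.split()
    target = tuple(bigram.split())
    count = 0
    for pair in zip(words, words[1:]):    # (w0, w1), (w1, w2), ...
        if pair == target:
            count = count + 1
    return count

print(count_bigram("hello good morning to all", "good morning"))
print(count_bigram("this apple", "is a"))   # no two consecutive words are "is" and "a"
print("this apple".count("is a"))           # the characters "is a" do occur in "this apple"
```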


Module 8:

Trigram concept


Reading Material:

In the previous modules we learned how to find a unigram and a bigram in a file that lists multiple files. In this module we are going to find a trigram.

Let us look at what a trigram is:

Example: string = "this is a trigram python program"

As we saw in the previous module, every word in the above string is called a unigram, and a bigram is two consecutive letters, words or syllables separated by a single space. A trigram consists of three consecutive letters or words separated by single spaces.

Trigrams in above string:

this is a

is a trigram

a trigram python

trigram python program

Worked out example:


1) Write a program to find whether the given trigram is present in the string or not.

Steps to solve:

1. Enter a text to check.

Ex: text = "hello good morning all for you"   # String assignment

2. Take an input string from the user to check whether the trigram exists or not.

Ex: ch = "good morning all"

3. Split the text and the trigram and store them in other variables.

s_text = text.split()   # Splitting the string 'text'
s_ch = ch.split()

4. Find the length of each variable.

length = len(s_text)

5. Initialize the count to 0.

count = 0

6. Create a for loop that runs up to the third-to-last word of the split text.

for i in range(length-2):
    if s_text[i]==s_ch[0] and s_text[i+1]==s_ch[1] and s_text[i+2]==s_ch[2]:   # conditional checking
        count = count+1
    else:
        continue
if count==0:
    print "The given trigram is NOT FOUND"
else:
    print "The given trigram is found ",count," times"
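Unigrams, bigrams and trigrams follow the same pattern, so the check can be written once for any number of words. The following is a sketch under that assumption; the function names ngrams and count_ngram are made up for illustration.

```python
def ngrams(text, n):
    """Return the list of n-word sequences (as tuples) found in text."""
    words = text.split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

def count_ngram(text, phrase):
    """Count how often phrase occurs as consecutive words in text."""
    target = tuple(phrase.split())
    return ngrams(text, len(target)).count(target)

text = "this is a trigram python program"
print(ngrams(text, 3))
print(count_ngram(text, "a trigram python"))
```

With n set to 1 or 2 the same ngrams() function lists the unigrams or bigrams of the text.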

Solve the problems given below and submit your answers:

1. Take the file "list.txt" (which you created in the previous module, with the names of multiple files) and write a Python program to print how many times a given trigram (taken from the user) is found.

2. Create a file with some text and save it as "trigram.txt". Write a program to check whether a trigram (taken from the user) is in the file "trigram.txt" or not.

Module 9:

Counting vowels in the text

Until now, you have been searching for unigrams, bigrams and trigrams in the text. Now
we will see how to read a sentence from the user and count the total number of occurrences of all
vowels in that sentence (the vowels are a, e, i, o, u, A, E, I, O, U).

Here we are going to design and implement an application that reads a string from the user, then
determines and prints how many times each vowel (a/A, e/E, i/I, o/O and u/U) appears in the string. Let
us keep a separate counter for each vowel (case sensitive). Punctuation is not counted. In
the program below you will learn how to count the vowels 'a' and 'e' in a string. A sentence of your
own choice is assigned, and then you get the number of those vowels in that string.

sentence = "This turns out to be a very powerful technique for a problem"
a=[0,0]                 # a[0] counts 'a', a[1] counts 'e'
for i in sentence:      # i is each character of the sentence in turn
    if i=='a':
        a[0]=a[0]+1
    elif i=='e':
        a[1]=a[1]+1
print "No. of occurrences of a and e:\n","a:",a[0],"\n","e:",a[1]

Output: this will produce:

No. of occurrences of a and e:
a: 2
e: 6

The loop "for i in sentence" goes through the string one character at a time, and the program adds 1
to the matching counter each time the character is 'a' or 'e'. The same idea works for a file: if you
read the file one line at a time and add up the counts for each line, you avoid the memory problems
that loading a large file all at once might cause.
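
The two counters can be generalised to any set of letters. The following is a sketch, not the only way to do it: the function name count_letters and its parameters are our own, and print is written with parentheses so that the code also runs under Python 3.

```python
def count_letters(text, letters):
    """Return a list with one counter per character in letters."""
    counts = [0] * len(letters)
    for ch in text:                  # one character at a time
        position = letters.find(ch)  # -1 when ch is not one we count
        if position != -1:
            counts[position] = counts[position] + 1
    return counts


sentence = "This turns out to be a very powerful technique for a problem"
result = count_letters(sentence, "ae")
print("a: %d" % result[0])
print("e: %d" % result[1])
```

Because the counters live in a list, counting more letters only means passing a longer string in place of "ae".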

Solve the problems given below and submit your answers:

1. Write a program to read a sentence from the user and count the total number
of occurrences of all vowels (case sensitive) in that sentence. The test
cases for this program are i) "hello world", ii) "a e i o u", iii) "A E I O U",
and iv) "tO bE, Or not to bE: thAt is thE quEstiOn".

2. Write a program to open an existing file and count the total number of
occurrences of all vowels (case sensitive) in that file.

Module 10:

Dictionary

Operations on Dictionaries

The operations on dictionaries are somewhat unique. Slicing is not supported, since the items have

no intrinsic order.

>>> d = {'a':1,'b':2, 'cat':'Fluffers'}

>>> d.keys()

['a', 'b', 'cat']

>>> d.values()

[1, 2, 'Fluffers']

>>> d['a']

1

>>> d['cat'] = 'Mr. Whiskers'

>>> d['cat']

'Mr. Whiskers'

>>> 'cat' in d

True

Combining two Dictionaries

You can combine two dictionaries by using the update method of the primary dictionary. Note that
when a key exists in both dictionaries, update overwrites the existing value with the value from the
dictionary passed in (see 'pears' below).

>>> d = {'apples': 1, 'oranges': 3, 'pears': 2}

>>> ud = {'pears': 4, 'grapes': 5, 'lemons': 6}

>>> d.update(ud)

>>> d

{'grapes': 5, 'pears': 4, 'lemons': 6, 'apples': 1, 'oranges': 3}

Add elements to dictionary

#This is a dictionary

>>> d = {'apples': 1, 'oranges': 3, 'pears': 2}

#Adding new element to the dictionary

>>> d['banana'] = 4

#Printing dictionary

>>> d

{'pears': 2, 'apples': 1, 'oranges': 3, 'banana': 4}

Deleting from dictionary

del dictionaryName[membername]
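
As an illustration of the del statement, here is a short sketch that uses the fruit dictionary from the earlier examples. print is written with parentheses so that the code also runs under Python 3.

```python
d = {'apples': 1, 'oranges': 3, 'pears': 2}

del d['oranges']           # remove the key 'oranges' and its value
print('oranges' in d)      # False
print(sorted(d.keys()))    # ['apples', 'pears']

# Deleting a key that is not present raises KeyError,
# so check with "in" first when the key may be missing.
if 'grapes' in d:
    del d['grapes']
```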

Dictionary

Let us recall what we learned in the previous module. There we counted the number of times each
vowel occurs in a given string, which means only five letters (a, e, i, o, u). But with many letters, or
all the letters of the alphabet, that program becomes complicated. This is where dictionaries are
useful, so we should first have an idea of what a dictionary is.

Dictionary:

A dictionary in python is a collection of unordered values which are accessed by key.

Dictionary notation

Dictionaries may be created directly or converted from sequences. Dictionaries are enclosed in

curly braces, {}

>>> d = {'city':'Paris', 'age':38, (102,1650,1601):'A matrix coordinate'}

>>> seq = [('city','Paris'), ('age', 38), ((102,1650,1601),'A matrix coordinate')]

>>> d

{'city': 'Paris', 'age': 38, (102, 1650, 1601): 'A matrix coordinate'}

>>> dict(seq)

{'city': 'Paris', 'age': 38, (102, 1650, 1601): 'A matrix coordinate'}

>>> d == dict(seq)

True

Also, dictionaries can be easily created by zipping two sequences.

>>> seq1 = ('a','b','c','d')

>>> seq2 = [1,2,3,4]

>>> d = dict(zip(seq1,seq2)) # zip pairs up the two sequences; dict turns the pairs into a dictionary

>>> d

{'a': 1, 'c': 3, 'b': 2, 'd': 4}
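
To connect this with the vowel counting of the previous module, the sketch below builds one counter per vowel with zip and dict and then fills the counters. The variable names are our own, only lowercase vowels are counted, and print is written with parentheses so that the code also runs under Python 3.

```python
vowels = ('a', 'e', 'i', 'o', 'u')
counts = dict(zip(vowels, [0, 0, 0, 0, 0]))   # every vowel starts at 0

sentence = "This turns out to be a very powerful technique for a problem"
for ch in sentence:
    if ch in counts:             # only the five keys are counted
        counts[ch] = counts[ch] + 1

for v in vowels:
    print("%s: %d" % (v, counts[v]))
```

One dictionary replaces five separate counters, and the loop body stays the same however many letters are counted.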

Solve the problems given below and submit your answers:

1. Write a python program to count occurrences of the letters 'a' and 'z' in the string
below. check_letters="this is a small string"

2. Write a python program to count the number of occurrences of each
character. (Take a string from the user.)

Module 11:

Hash Table

(Hash table for given sentence)

In the previous module you learned how to associate a value with a character. Now we will see how
to associate a value with a word.

A hash table is a list of strings in which each item is in the form
Name=Value. It can be illustrated as follows:

KEY      Value
Name1    Value1
Name2    Value2
Name3    Value3

There is no strict rule as to when, where, why, or how to use a hash table. Everything depends on
the programmer. For example, it can be used to create a list that would replace a 2-dimensional
array.

Example of assigning a value to a string:

>>> string="word"
>>> value=ord(string[0])+ord(string[1])+ord(string[2])+ord(string[3])
>>> print value
444

In the above example the string "word" is given the value 444.
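
The same sum can be computed for a word of any length with a loop. This is a sketch only: the function name word_value is our own, and print is written with parentheses so that the code also runs under Python 3.

```python
def word_value(word):
    """Add up the character codes of every letter in word."""
    total = 0
    for ch in word:
        total = total + ord(ch)
    return total


print(word_value("word"))   # 119 + 111 + 114 + 100 = 444
print(word_value("drow"))   # same letters, so the same value: 444
```

Note that two different words can receive the same value, as "word" and "drow" do here, so this simple sum does not identify a word uniquely.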

One of Python's built-in data types is the dictionary, which defines one-to-one relationships between
keys and values.

Dictionaries:

A dictionary is mutable and is another container type that can store any number of Python objects,
including other container types.

Dictionaries consist of pairs (called items) of keys and their corresponding values.

Python dictionaries are also known as associative arrays or hash tables. Their general form is
described below.

It is best to think of a dictionary as an unordered set of key: value
pairs, with the requirement that the keys are unique (within one dictionary). A
pair of braces creates an empty dictionary: {}. Placing a comma-separated list
of key: value pairs within the braces adds initial key: value pairs to the
dictionary; this is also the way dictionaries are written on output.

Here is a small example using a dictionary:

Example defining a dictionary

>>> tel={'jack':4098,'sape':4139}
>>> tel['guido']=4127
>>> tel
{'sape': 4139, 'guido': 4127, 'jack': 4098}
>>> tel['jack']
4098
>>> del tel['sape']
>>> tel['irv']=4127
>>> tel
{'guido': 4127, 'irv': 4127, 'jack': 4098}
>>> tel.keys()
['guido', 'irv', 'jack']
>>> 'guido' in tel
True

Keys are unique within a dictionary while values may not be.

>>> dictionary = {'apple':1,'apple':2,'apple':3,'ball':4,'cat':5}
>>> print dictionary
{'ball': 4, 'apple': 3, 'cat': 5}
>>> dictionary.keys()
['ball', 'apple', 'cat']
>>> dictionary.values()
[4, 3, 5]

Properties of Dictionary Keys

Dictionary values have no restrictions. They can be any arbitrary Python object, either standard
objects or user-defined objects. However, the same is not true for the keys.

There are two important points to remember about dictionary keys:

1) More than one entry per key is not allowed, which means no duplicate key is allowed. When
duplicate keys are encountered during assignment, the last assignment wins.

2) Keys must be immutable, which means you can use strings, numbers, or tuples as dictionary
keys, but something like ['key'] is not allowed.
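
Both points can be checked with a few lines. The following is a small sketch with variable names of our own; print is written with parentheses so that the code also runs under Python 3.

```python
# 1) Duplicate keys: the value assigned last is the one that is kept.
d = {'apple': 1, 'apple': 2, 'apple': 3}
print(d['apple'])        # 3
print(len(d))            # 1, there is only one 'apple' entry

# 2) Immutable keys: strings, numbers and tuples are accepted.
d[(102, 1650)] = 'a tuple key works'

# A list is mutable, so Python refuses it as a key.
try:
    d[['key']] = 'not allowed'
    list_key_accepted = True
except TypeError:
    list_key_accepted = False
print(list_key_accepted)  # False
```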

Worked out example:

Converting a sentence into dictionary:

count = {}
sen=raw_input("Enter a sentence : ")
st=sen.split()
print count          # prints {} because the dictionary is still empty
j=0
for s in st:
    count[s]=j       # the word is the key, its position is the value
    j=j+1
print count

OUTPUT:

Enter a sentence : hi this is dictionary program
{}
{'this': 1, 'program': 4, 'is': 2, 'hi': 0, 'dictionary': 3}

Finding the given word in the given sentence using dictionary:

count = {}
sen=raw_input("Enter a sentence : ")
word=raw_input("Enter a word : ")
st=sen.split()
j=0
for s in st:
    count[s]=j
    j=j+1
if count.has_key(word):    # has_key() exists only in Python 2; "word in count" is equivalent
    print "True"
else:
    print "False"

OUTPUT:

Enter a sentence : hi this is dictionary program
Enter a word : hi
True

Solve the problems given below and submit your answers:

1. Sort the given dictionary. Z={'apple':1, 'ant':2, 'bat':3, 'ball':4, 'cat':5}

2. Print all the keys and values from the given dictionary.

3. Take a key and its value as input from the user. Replace the first key and value of
the given dictionary with them. Z={'abc':1, 'bcd':2, 'cde':3}

4. Change dictionary keys to values and values to keys. Ex: Z={'abc':1, 'bcd':2,
'cde':3} changes to Z={1:'abc', 2:'bcd', 3:'cde'}

5. Print True if both dictionaries have the same keys and values. Take two input
dictionaries.

Module 12:

Counting bigrams in a text file

This exercise is a simple extension of the word count demo: in the first part of the exercise, you'll

be counting bigrams, and in the second part of the exercise, you'll be computing bigram relative

frequencies.

Bigrams are simply sequences of two consecutive words.

Ex: "this is this"

The bigrams of this string are: 1. this is 2. is this

Count the bigrams

Take the word count example and extend it to count bigrams. For example, the definition above
("Bigrams are simply sequences of two consecutive words") contains the following bigrams:
"Bigrams are", "are simply", "simply sequences", "sequences of", etc.

Let's see an example to understand:

s="this is Ramu this is raju"      # input string
sa=s.split()                       # splitting the string into a list of words
d={}                               # creating the dictionary
for i in range(len(sa)-1):         # using a for loop to count bigrams
    a=sa[i]+' '+sa[i+1]            # making a bigram
    if a in d:                     # checking whether the bigram is in the dictionary
        d[a]=d[a]+1                # if so, incrementing the bigram's count
    else:
        d[a]=1                     # otherwise adding the new bigram to the dictionary
print d                            # printing the dictionary

Output:

{'is Ramu': 1, 'this is': 2, 'Ramu this': 1, 'is raju': 1}
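
The second part of the exercise asks for bigram relative frequencies, which the text does not define. The sketch below assumes that the relative frequency of a bigram is its count divided by the total number of bigrams; if your course uses another definition (for example, dividing by the count of the first word), change that line accordingly. The function names are our own, and print is written with parentheses so that the code also runs under Python 3.

```python
def bigram_counts(text):
    """Count every pair of consecutive words in text."""
    words = text.split()
    counts = {}
    for i in range(len(words) - 1):
        bigram = words[i] + ' ' + words[i + 1]
        counts[bigram] = counts.get(bigram, 0) + 1
    return counts


def relative_frequencies(counts):
    """Divide each count by the total number of bigrams (assumed definition)."""
    total = sum(counts.values())
    frequencies = {}
    for bigram in counts:
        # float() stops the division from truncating under Python 2
        frequencies[bigram] = counts[bigram] / float(total)
    return frequencies


counts = bigram_counts("this is Ramu this is raju")
freqs = relative_frequencies(counts)
for bigram in sorted(freqs):
    print("%s: %.2f" % (bigram, freqs[bigram]))
```

With this definition the frequencies of all bigrams in a text add up to 1.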

Solve the problems given below and submit your answers:

1. Take an input string and count how many bigrams there are in the string.

2. Take a filename as input and count how many bigrams there are in the file.

Module 13:

Comparing two words with the same length

Worked Example 1 :

S1 = 'RAVI'
S2 = 'ravi'
print cmp(S1,S2)

Output : -1

The result tells us the relation between the two strings: -1 means S1<S2.

Worked Example 2 :

S1 = 'ramu'
S2 = 'raju'
print cmp(S1,S2)

Output : 1

Comparing two words with the same length

We hope that you are familiar with strings and their lengths from the previous modules. Here we take
two strings, compare them, and then print the comparison value.

Before that, let us see the comparison of two integers.

a = 10
b = 12
if a<b:
    print 'True'
else:
    print 'False'

Output : True

Here we already know which number is bigger, whereas the computer checks the integers and then
prints the output. But when we compare two strings, it compares the ASCII values of the characters
and prints the output.

What is ASCII?

American Standard Code for Information Interchange. Pronounced ask-ee, ASCII is a code for

representing English characters as numbers, with each letter assigned a number from 0 to 127. For

example, the ASCII code for uppercase ‘M’ is 77. Most computers use ASCII codes to represent

text, which makes it possible to transfer data from one computer to another.

Character to ASCII value:

>>> ord('a')
97

ASCII value to character:

>>> chr(97)
'a'

Comparing two one-character strings:

s1='a'
s2='b'
if s1<s2:
    print "True"
else:
    print "False"

Output: True

Notice that the program prints its output based on the ASCII values only. We know that the
ASCII value of 'a' is 97 and the ASCII value of 'b' is 98, so 'a'<'b' holds and the output is "True".

Method 1:

Compare two strings:

S1 = 'RAVI'
S2 = 'ravi'
if S1<S2:
    print 'True'
else:
    print 'False'

Output: True

Though both words have the same letters, Python distinguishes upper case from lower case when it
compares them. The ASCII value of 'R' is 82 and the ASCII value of 'r' is 114, so 'RAVI' is less than 'ravi'.

Method 2:

Compare two strings:

Python 2 has a built-in function called cmp() which produces the comparison value.

Output: -1 if s1 < s2
         0 if s1 == s2
         1 if s1 > s2

Syntax : cmp(value1,value2)
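
cmp() is available in Python 2 only; it was removed in Python 3. A function with the same three results can be written with the comparison operators. The sketch below uses the name compare, which is our own choice, and writes print with parentheses so that it also runs under Python 3.

```python
def compare(value1, value2):
    """Return -1, 0 or 1, mirroring the values listed for cmp()."""
    if value1 < value2:
        return -1
    elif value1 == value2:
        return 0
    else:
        return 1


print(compare('RAVI', 'ravi'))   # -1, because 'R' (82) < 'r' (114)
print(compare('ramu', 'raju'))   # 1, because 'm' (109) > 'j' (106)
print(compare(10, 10))           # 0
```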

Solve the problems given below and submit your answers:

1. What is the output when you run the following program?

a = 10
b = 11
print cmp(a,b)

2. What are the ASCII values for 'A' and 'z'?

3. Write a program to print the ASCII values of all lower case letters (a to z).

4. Take two strings from the user and then use the cmp function to compare them.

Module 14:

Compare two different length strings

From the previous module you have learnt how to compare two strings when their lengths are the same.
Now you are going to do the same with strings of different lengths.

Python Tells it Straight: Size Matters

It's true: Python is obsessed with comparing strings, numbers, you name it. It may
not make a lot of sense at first, though, how Python sees the value of different strings (uppercase Z is
less than lowercase a).

To compare the size of strings, we use the < and > comparison operators. A comparison returns True or
False, depending upon whether or not the comparison holds.

The rules for which strings are bigger are as follows:

• Letters at the start of the alphabet are smaller than those at the end
• Capital letters are smaller than lowercase letters
• Numbers are smaller than letters
• Most punctuation marks (such as the space, !, comma and full stop) are smaller than
numbers and letters. The exceptions are : ; < = > ? @, which fall between the numbers and the
capital letters, [ \ ] ^ _ `, which fall between the capital and the lowercase letters, and curly
braces, the pipe character and the tilde, which come after all letters.
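
The rules above can be tried out directly. One thing the list does not spell out is how length enters: Python compares strings character by character from the left, and only when one string is a prefix of the other is the shorter string the smaller one. The sketch below checks each case; print is written with parentheses so that the code also runs under Python 3.

```python
print('apple' < 'zebra')   # True: 'a' comes before 'z'
print('Z' < 'a')           # True: capital letters come before lowercase
print('9' < 'A')           # True: digits come before letters
print('!' < '0')           # True: '!' comes before the digits
print('{' < 'z')           # False: curly braces come after the letters

# Different lengths: comparison stops at the first differing character.
print('abc' < 'abd')       # True
print('b' > 'abc')         # True: 'b' > 'a' decides it, length is ignored
# When one string is a prefix of the other, the shorter one is smaller.
print('abc' < 'abcd')      # True
```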

Besides that, to compare strings we can use some built-in functions like ord(), cmp() etc. As you are
beginners in programming, we do not recommend relying on built-in functions, because you would miss
the complete logic. Here is some explanation:

ord()

Given a string of length one, returns an integer representing the Unicode code point of the character
when the argument is a unicode object, or the value of the byte when the argument is an 8-bit string.
For example, ord('a') returns the integer 97, ord('U') returns 85. This is the inverse of chr() for 8-bit
strings.

>>> ord('s')
115
>>> chr(115)
's'

cmp()

Compare the two objects x and y and return an integer according to the outcome. The return value is
negative if x < y, zero if x == y and strictly positive if x > y.

string1="python"
string2="sython"
print cmp(string1,string2)

It prints -1, which means string1 is less than string2.

Example Programs

Worked out example:

a=raw_input("Enter a value=")
b=raw_input("enter b value=")
sum1=sum2=0
for i in range(len(a)):      # add up the ASCII values of a
    sum1=sum1+ord(a[i])
for i in range(len(b)):      # b may have a different length, so it gets its own loop
    sum2=sum2+ord(b[i])
if sum1<sum2:
    print "-1"
elif sum1==sum2:
    print "0"
else:
    print "1"
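
The worked example compares the sums of the character codes. That does not always agree with the ordering used by the < and > operators: "ab" and "ba" have the same sum but are different strings. As an alternative, the sketch below compares the strings one character at a time and lets the lengths decide only when one string is a prefix of the other. The function name compare_strings is our own, and print is written with parentheses so that the code also runs under Python 3.

```python
def compare_strings(a, b):
    """Return -1, 0 or 1 by comparing a and b one character at a time."""
    shorter = min(len(a), len(b))
    for i in range(shorter):
        if ord(a[i]) < ord(b[i]):
            return -1
        if ord(a[i]) > ord(b[i]):
            return 1
    # Every shared position matched, so the lengths decide.
    if len(a) < len(b):
        return -1
    if len(a) > len(b):
        return 1
    return 0


print(compare_strings("python", "sython"))   # -1
print(compare_strings("ab", "ba"))           # -1, although the sums of codes are equal
print(compare_strings("abcd", "abc"))        # 1, the longer string is bigger
```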

Solve the problems given below and submit your answers:

1. Write a program that takes input from a text file (which contains two strings),
compares them and prints the biggest.

Module 15:

Sorting of three strings

Example Programs

Worked out Example:

Sorting of two strings in ascending order.

Input: s1=rgukt, s2=iiit

Program:

s1=raw_input("Enter 1st string")    # input for 1st string
s2=raw_input("Enter 2nd string")    # input for 2nd string
print "Strings in ascending order \n"
if cmp(s1,s2)>0:                    # comparison of the ASCII values of the characters
    print s2
    print s1
else:
    print s1
    print s2

Output:

Strings in ascending order

iiit
rgukt

Sorting of three strings

In the previous modules, we learned to compare two strings of the same length and of different lengths.
Using the same logic we can sort strings. We will learn sorting of three strings in this module.
Python has built-in ways to sort a given list, namely 'sort' and 'sorted'. We describe
them below.

Sort function:

sort() is a built-in list method that sorts the list. The list may contain characters, strings or
numbers. Whenever we call sort() on a list, the list itself is changed into sorted order.

Example 1:

s=['a','z','e','s','q']
s.sort()     # sorting of the list 's'
print s

Output:

['a', 'e', 'q', 's', 'z']

sort() compares strings character by character, which is not always the order that seems natural to
humans. If the strings contain integers, the digits are also compared as characters, so in the
example below 'Team 11' comes before 'Team 3'.

Example 2:

>>> s=['Team 11', 'Team 3', 'Team 1']
>>> s.sort()     # built-in method for sorting
>>> print s
['Team 1', 'Team 11', 'Team 3']

Sorted function:

The easiest way to sort is with the sorted(list) function, which takes a list and returns a new list with
those elements in sorted order. The original list is not changed. It's most common to pass a list into
the sorted() function, but in fact it can take as input any sort of iterable collection. The older
list.sort() method, described above, is an alternative.

Example 3:

s=['a','z','e','s','q']

print sorted(s) #sorting of the list 's'

Output:

['a', 'e', 'q', 's', 'z']
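
The difference between the two is easiest to see side by side. The sketch below uses variable names of our own and writes print with parentheses so that it also runs under Python 3.

```python
s = ['a', 'z', 'e', 's', 'q']

t = sorted(s)      # returns a new sorted list
print(t)           # ['a', 'e', 'q', 's', 'z']
print(s)           # ['a', 'z', 'e', 's', 'q'], the original is unchanged

result = s.sort()  # sorts s itself and returns None
print(s)           # ['a', 'e', 'q', 's', 'z']
print(result)      # None

# sorted() accepts any iterable, for example a string or a tuple.
print(sorted('zeta'))   # ['a', 'e', 't', 'z']
```

A common mistake is to write s = s.sort(), which replaces the list with None.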

We have learned how to sort a list of strings using built-in functions. It is very easy with the
built-in functions, but using them does not improve our logical skills. So, let us try without using a
built-in sort. To compare two strings we use the cmp() function. You might have compared strings with
the less than, greater than or equal operators. That works fine in Python but not in every other
programming language. So, to carry the habit over to other programming languages, we use a
comparison function rather than the operators; in Python that function is cmp().

Solve the problems given below and submit your answers:

1. Take 3 characters from the user and sort them in ascending order. Note:
Sort the characters without using a built-in function.

2. Take 3 strings from the user and sort them in ascending order. Note: Sort
the strings without using a built-in function. The logic is the same as for characters.

THANK YOU

BETA -6
