CISC3130, Spring 2013, X. Zhang: Working with files


Post on 31-Dec-2015



Outline
  Finish up with awk: pipeline, external commands
  Commands working with files
    tree, ls (-d option, -1 option, -R, -a)
    od (octal dump), stat (show metadata of file)
    touch command, temporary file, file with random bytes
    File checksum, verification
    locate, type, which, find command: finding files

Some useful tips
  Bash stores the command history
    Use the UP/DOWN arrow keys to browse it
    Use "history" to show past commands
  Repeat a previous command
    !<command_no>, e.g., !239
    !<any prefix of a previous command>, e.g., !g++
  Search for a command
    Type Ctrl-r, and then a string
    Bash will search previous commands for a match
  File name autocompletion: "tab" key

Output redirection: to pipeline

pipe_mail.awk:

    #!/bin/awk -f

    BEGIN {
        FS = ":"
        ## generate a temporary file
        mktemp_cmd = "mktemp /tmp/prog.XXXXXXXX"
        mktemp_cmd | getline tmpfile
        print "temp file is: ", tmpfile
        close(mktemp_cmd)    ## close() needs the same command string that opened the pipe
    }

    {   # select username for users using bash
        if ($7 ~ "/bin/bash")
            print $1 >> tmpfile
    }

    END {
        close(tmpfile)       ## flush the output file before reading it back
        while ((getline < tmpfile) > 0) {
            ## send an email to every bash user
            cmd = "mail -s Fellow_BASH_USER " $0
            print "Hello," $0 | cmd
            close(cmd)       ## finish this message before starting the next
        }
        close(tmpfile)
    }

Todo: 1. 2.

Execute external command
  Using the system function (similar to C/C++)
    E.g., system("rm -f tmp") to remove a file
      if (system("rm -f tmp") != 0)
          print "failed to rm tmp"
  A shell is started to run the command line passed as argument
    It inherits the awk program's standard input/output/error
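The return value of system() can be checked as in the following minimal sketch. It assumes the standard true and false utilities are on the PATH; the variable names are illustrative only.

```shell
# Run two external commands from awk and report their exit statuses.
status_report=$(awk 'BEGIN {
    ok  = system("true")     # exit status 0 means success
    bad = system("false")    # a nonzero exit status means failure
    printf "%d %d\n", ok, (bad != 0)
}')
echo "$status_report"
```

The first number is the status of the successful command; the second is 1 when the failing command returned a nonzero status.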

Outline
  Finish up with awk: pipeline, external commands
  Commands working with files
    tree, ls (-d option, -1 option, -R, -a)
    od (octal dump), stat (show metadata of file), cmp, diff
    touch command
    temporary file, file with random bytes
    locate, type, which, find command: finding files

What's in a file?
  Files are organized in a hierarchical directory structure
    Each file has a name, resides under a directory, and is associated with some meta info (permission, owner, timestamps)
  Disk files, virtual file system, device files
    Contents of a disk file: text (ASCII) file (such as your C/C++ source code), executable file (commands), a link to other files, ...
      ln -s /path/to/file1.txt /path/to/file2.txt
    The /proc filesystem exposes system configuration parameters and resides in the kernel's memory
      Numerical subdirectories exist for every process.
    A device file or special file is an interface for a device driver that appears in a file system as if it were an ordinary file
      For example, /dev/stdin, /dev/tty*
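To see how a symbolic link created with ln -s behaves, here is a small sketch. The file names and the scratch directory are hypothetical, chosen only for illustration.

```shell
workdir=$(mktemp -d) || exit 1
printf 'hello\n' > "$workdir/file1.txt"
ln -s "$workdir/file1.txt" "$workdir/file2.txt"      # file2.txt points at file1.txt

type_char=$(ls -l "$workdir/file2.txt" | cut -c1)    # first character of ls -l output
via_link=$(cat "$workdir/file2.txt")                 # reading the link reads the target
[ -L "$workdir/file2.txt" ] && echo "file2.txt is a symbolic link (type $type_char)"
echo "content seen through the link: $via_link"
```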

What's in a file?
  Recall that in ls -l output, the first character indicates the file type:
    d directory, - plain file, b block-type special file, c character-type special file, l symbolic link, s socket
  To check the type of a file: "file filename"
  To view an "octal dump" of a file:
    od [OPTION]... [FILE]...
    od --traditional [FILE] [[+]OFFSET [[+]LABEL]]
  Important options:
    -A: what base to use when displaying addresses (default: base 8)
    -t: specify how to interpret file content
      a: named character; c: ASCII character or backslash representation
      d[size]: signed decimal, size bytes per integer
      o[size]: octal; x[size]: hexadecimal

What's in a file?
  Example of od
    $ echo abc def ghi jkl | od -c
    0000000   a   b   c       d   e   f       g   h   i       j   k   l  \n
    0000020
    [zhang@storm ~]$ echo abc def ghi jkl | od -Ad -c    ## same as -t c
    0000000   a   b   c       d   e   f       g   h   i       j   k   l  \n
    0000016
    $ echo abc def ghi jkl | od -Ad -t d1    ## interpret each byte as a decimal integer
    0000000   97   98   99   32  100  101  102   32  103  104  105   32  106  107  108   10
    0000016
    $ echo abc def ghi jkl | od -Ad -t x1
    0000000 61 62 63 20 64 65 66 20 67 68 69 20 6a 6b 6c 0a
    0000016
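A dump like this is also handy inside scripts. The sketch below captures the bytes of a short string as hexadecimal; it uses -An, an od option not shown above that suppresses the address column, and the variable name is illustrative.

```shell
# Dump "abc" plus newline as hex bytes, then strip the spacing od adds.
hex=$(printf 'abc\n' | od -An -t x1 | tr -d ' \n')
echo "$hex"
```

The four bytes are a (61), b (62), c (63) and newline (0a), matching the values in the -t x1 example.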

Disk space usage
  df: report file system disk space usage
    df [OPTION]... [FILE]...
    Show information about the file system on which each FILE resides, or all file systems by default.
  du: estimate file space usage
    du [OPTION]... [FILE]...
    Summarize disk usage of each FILE, recursively for directories.
  quota: display disk usage and limits

Compare file contents
  Compare files
    cmp file1 file2: finds the first place where two files differ (reported as a byte offset and line number)
    diff file1 file2: reports all lines that are different
  diff's output is carefully designed so that it can be used by other programs. For example, revision control systems use diff to manage the differences between successive versions of files under their management.
  patch command: apply a diff file to an original
    patch [options] [originalfile [patchfile]]
    patch -pnum < patchfile
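The three commands fit together as in this sketch, which assumes diff and patch are installed. The file names are hypothetical, and -u (unified format) is one of several diff output formats that patch accepts.

```shell
workdir=$(mktemp -d) || exit 1
printf 'one\ntwo\nthree\n' > "$workdir/old.txt"
printf 'one\n2\nthree\n'   > "$workdir/new.txt"

# cmp reports only the first difference; it exits nonzero when the files differ.
first_diff=$(cmp "$workdir/old.txt" "$workdir/new.txt") || true
echo "$first_diff"

# diff records every changed line; exit status 1 means "differences found".
diff_status=0
diff -u "$workdir/old.txt" "$workdir/new.txt" > "$workdir/change.diff" || diff_status=$?

# patch applies the recorded change to the original file.
patch "$workdir/old.txt" "$workdir/change.diff" > /dev/null
cmp -s "$workdir/old.txt" "$workdir/new.txt" && echo "patched copy now matches new.txt"
```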

File checksum
  Provides a single number, a signature, that is characteristic of the file (computed from all of the bytes of the file)
  Files with different contents are unlikely to have the same checksum
  Usage: software announcements include checksums of distribution files so that users can tell whether a copy matches the original.
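The verification workflow might look like the following sketch. It assumes GNU coreutils' sha256sum is available (SHA-256 is swapped in here for illustration; md5sum is used the same way), and the file name package.tar and the helper verify are hypothetical.

```shell
workdir=$(mktemp -d) || exit 1
printf 'release contents\n' > "$workdir/package.tar"

# The publisher posts this checksum file next to the download.
sha256sum "$workdir/package.tar" > "$workdir/package.tar.sha256"

# verify: hypothetical helper that recomputes the checksum and compares it.
verify() {
    if sha256sum -c "$1" > /dev/null 2>&1; then echo match; else echo mismatch; fi
}

before=$(verify "$workdir/package.tar.sha256")
printf 'tampered\n' >> "$workdir/package.tar"        # simulate a corrupted copy
after=$(verify "$workdir/package.tar.sha256")
echo "before: $before, after: $after"
```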

openssl
  A cryptography toolkit implementing the Secure Sockets Layer and Transport Layer Security network protocols and related cryptography standards
  openssl program: a command-line tool for using various cryptography functions from the shell
    Creation and management of private keys, public keys and parameters
    Public key cryptographic operations
    Creation of X.509 certificates, CSRs and CRLs
    Calculation of message digests
    Encryption and decryption with ciphers
    SSL/TLS client and server tests
    Handling of S/MIME signed or encrypted mail
    Time stamp requests, generation and verification

Message digest
  openssl dgst [-md5|-md4|-md2|-sha1|-sha|-mdc2|-ripemd160|-dss1] [-c] [-d] [-hex] [-binary] [-out filename] [-sign filename] [-keyform arg] [-passin arg] [-verify filename] [-prverify filename] [-signature filename] [-hmac key] [file...]
  Or: [md5|md4|md2|sha1|sha|mdc2|ripemd160] [-c] [-d] [file...]
  Outputs the message digest of a supplied file or files in hexadecimal form

Example
    $ md5sum /bin/l?
    696a4fa5a98b81b066422a39204ffea4  /bin/ln
    cd6761364e3350d010c834ce11464779  /bin/lp
    351f5eab0baa6eddae391f84d0a6c192  /bin/ls
  Output: 32 hexadecimal digits, i.e., 128 bits.
  The chance of two different files having identical signatures is 1/2^128 (the book: 1/2^64)
  In 2005, researchers were able to create pairs of PostScript documents and X.509 certificates with the same hash. Later that year, MD5's designer Ron Rivest wrote, "md5 and sha1 are both clearly broken (in terms of collision-resistance)."

Public-key cryptography
  Data security by two related keys: a private key, known only to its owner, and a public key, potentially known to anyone
    Examples: RSA, DSA algorithms
  Digital signature: Alice => Bob communication
    If Alice wants to sign an open letter, she uses her private key to encrypt it. Bob uses Alice's public key to decrypt the signed letter, and can then be confident that only Alice could have signed it, provided that she is trusted not to divulge her private key.
  Secrecy:
    If Alice wants to send a letter to Bob that only he can read, she encrypts it with Bob's public key, and he then uses his private key to decrypt it. As long as Bob keeps his private key secret, Alice can be confident that only Bob can read her letter.

Secure Software Distribution
  Many software archives include digital signatures that incorporate information from a file checksum as well as from the signer's private key.
  How to verify such signatures?
    $ ls -l coreutils-5.0.tar*                  ## Show the distribution files
    -rw-rw-r-- 1 jones devel 6020616 Apr 2 2003 coreutils-5.0.tar.gz
    -rw-rw-r-- 1 jones devel      65 Apr 2 2003 coreutils-5.0.tar.gz.sig
    $ gpg coreutils-5.0.tar.gz.sig              ## Try to verify the signature
    gpg: Signature made Wed Apr 2 14:26:58 2003 MST using DSA key ID D333CBA1
    gpg: Can't check signature: public key not found

Verify using public key
  Obtain the public key from public servers
  Add the public key to your key ring
    $ gpg --import temp.key
    gpg: key D333CBA1: public key "Jim Meyering <[email protected]>" imported
    gpg: Total number processed: 1
    gpg: imported: 1
  Verify the signature successfully:
    $ gpg coreutils-5.0.tar.gz.sig              ## Verify the digital signature
  Online resource: The GNU Privacy Handbook

Outline
  Finish up with awk: pipeline, external commands
  Commands working with files
    tree, ls and echo (-d option, -1 option, -R, -a)
    od (octal dump), stat (show metadata of file), cmp, diff
    touch command, mktemp, file with random bytes
    File checksum, verification
    locate, type, which, find command: finding files
  Process-related commands

touch: update modification time
  touch is sometimes used to create empty files: their existence and possibly their timestamps, but not their contents, are significant.
    A lock file to indicate that a program is already running, and that a second instance should not be started.
    To record a file timestamp for later comparison with other files.
  Example:
    $ touch -t 197607040000.00 US-bicentennial
    $ ls -l US-bicentennial                     ## List the file
    -rw-rw-r-- 1 jones devel 0 Jul 4 1976 US-bicentennial
    $ touch -r US-bicentennial birthday         ## Copy timestamp to the new birthday file
    $ ls -l birthday                            ## List the new file
    -rw-rw-r-- 1 jones devel 0 Jul 4 1976 birthday
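The "timestamp for later comparison" use can be sketched as follows. The file names stamp, older and newer are hypothetical, and find's -newer test is used to do the comparison.

```shell
workdir=$(mktemp -d) || exit 1
touch -t 202001010000.00 "$workdir/stamp"    # reference point: 1 Jan 2020
touch -t 201901010000.00 "$workdir/older"    # modified before the reference
touch -t 202101010000.00 "$workdir/newer"    # modified after the reference

# List only the files modified more recently than the stamp file.
newer_files=$(find "$workdir" -type f -newer "$workdir/stamp")
echo "$newer_files"
```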

Temporary files
  So far, we have created temporary files in the current directory
    and removed them after use
  What if multiple scripts use the same file name?
    Or malicious users modify the files?
  Special directories: /tmp (cleared when the system reboots) and /var/tmp
  To avoid filename collisions, append the process ID as a suffix
    ## create a temporary file in shell scripts
    tmpfile=temp.$$    ## $$ (process id)
    echo $tmpfile

mktemp command
  mktemp: takes an optional filename template containing a string of trailing X characters, preferably at least a dozen of them.
  mktemp replaces them with an alphanumeric string derived from random numbers and the process ID, creates the file with no access for group and other, and prints the filename on standard output.
    $ TMPFILE=`mktemp /tmp/myprog.XXXXXXXXXXXX` || exit 1    ## Make unique temporary file
    $ ls -l $TMPFILE                                         ## List the temporary file
    -rw------- 1 jones devel 0 Mar 17 07:30 /tmp/myprog.hJmNZbq25727
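In a script, mktemp is commonly paired with a cleanup trap so the file does not outlive the work. The sketch below is one way to do that, not the only one; it runs the work in a subshell so the EXIT trap stays local, and the date command is just a stand-in for real work.

```shell
report=$(
    tmpfile=$(mktemp /tmp/myprog.XXXXXXXXXXXX) || exit 1
    trap 'rm -f "$tmpfile"' EXIT            # runs when this subshell exits
    date > "$tmpfile"                       # stand-in for real work
    printf '%s %s\n' "$(ls -l "$tmpfile" | cut -c1-10)" "$tmpfile"
)
perms=${report%% *}       # permission string reported by ls -l
tmpname=${report#* }      # name that mktemp generated
if [ -e "$tmpname" ]; then leftover=yes; else leftover=no; fi
echo "permissions were $perms; file still present: $leftover"
```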

Random bytes
  Two random pseudodevices: /dev/random and /dev/urandom.
  These devices serve as never-empty streams of random bytes: such a data source is needed in many cryptographic and security applications.
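A file with random bytes can be produced by reading a fixed amount from one of these devices. This is a minimal sketch; the 16-byte size is an arbitrary choice for illustration.

```shell
randfile=$(mktemp) || exit 1
# Copy exactly one 16-byte block from the random device into the file.
dd if=/dev/urandom of="$randfile" bs=16 count=1 2> /dev/null
size=$(wc -c < "$randfile" | tr -d ' ')
echo "wrote $size random bytes:"
od -An -t x1 "$randfile"                  # show them as hexadecimal bytes
rm -f "$randfile"
```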

Outline
  Finish up with awk: pipeline, external commands
  Commands working with files
    tree, ls and echo (-d option, -1 option, -R, -a)
    od (octal dump), stat (show metadata of file), cmp, diff
    File checksum, verification
    touch command
    temporary file, file with random bytes
    locate, type, which, find command: finding files

Search for files
  locate: find files by name, using a regularly updated database constructed by complete scans of the filesystem
    locate [OPTION]... PATTERN...
    $ locate cksum
  which: display the full pathname of a command, using the PATH variable
    $ which rm
    alias rm='rm'
            /bin/rm
  type: shell built-in command; shows how each name would be interpreted if used as a command name
    -t option: report whether a name is an alias, shell reserved word, function, builtin, or disk file
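The categories reported by type -t can be seen in the sketch below. Since -t is a bash feature, each check is run through bash -c; the function name greet is hypothetical, and the last result assumes ls is an ordinary program on the PATH.

```shell
kind_function=$(bash -c 'greet() { echo hello; }; type -t greet')
kind_builtin=$(bash -c 'type -t cd')      # cd is built into the shell
kind_keyword=$(bash -c 'type -t if')      # if is a shell reserved word
kind_file=$(bash -c 'type -t ls')         # ls is a program on disk
echo "$kind_function $kind_builtin $kind_keyword $kind_file"
```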

find command
  find [ files-or-directories ] [ options ]: find files matching specified name patterns, or having given attributes.
    -atime n: select files with access times of n days (also -ctime, -mtime)
    -ls: produce a listing similar to the ls long form, rather than just filenames.
    -name 'pattern': select files matching the shell wildcard pattern (quoted to protect it from shell interpretation).
    -perm mask: select files matching the specified octal permission mask.
    -prune: do not descend recursively into directory trees.
    -size n: select files of size n.
    -type t: select files of type t, a single letter: d (directory), f (file), or l (symbolic link).

find: basic operations
  find [ files-or-directories ] [ options ]:
    files-or-directories: files and directories to search (directories are (almost) always descended into recursively)
    options: select names for ultimate display or action
  When it finds a file, it first carries out the selection restrictions implied by the options, and if those tests succeed, it hands the name off to an internal action routine.
    Default action: print the name on standard output
    -exec option: provides a command template into which the name is substituted, and the command is then executed.
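The select-then-act sequence can be sketched as below. The file names are hypothetical; {} marks where find substitutes each selected name, and the escaped semicolon ends the command template.

```shell
workdir=$(mktemp -d) || exit 1
touch "$workdir/one.txt" "$workdir/two.txt" "$workdir/skip.log"

# Selection: regular files named *.txt. Action: run basename on each match.
matched=$(find "$workdir" -type f -name '*.txt' -exec basename {} \; | sort | paste -sd ' ' -)
echo "$matched"
```

Here skip.log fails the -name test, so the -exec action never runs for it.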

find usage examples
  find: display all files/directories under the current directory
  find -ls: display files/directories in "ls" style
  find * -prune
  find $HOME/. ! -user $USER
  find -ls -type f -fprint /tmp/mytemp

    $ find -ls -type f -fprint /tmp/mytemp
    23724924 4 drwxr-xr-x 2 zhang staff 4096 Mar 25 22:40 .
    23724925 0 --wx------ 1 zhang staff    0 Mar 25 22:35 ./a
    23724927 0 -rw-r--r-- 1 zhang staff    0 Mar 25 22:35 ./b
    23724928 4 -rw-r--r-- 1 zhang staff   10 Mar 25 22:40 ./tmp
    [zhang@storm testfind]$ more /tmp/mytemp
    ./a
    ./b
    ./tmp

find: examples
  Files that haven't been modified in the last year
    find . -mtime +365
      Unsigned integer: exactly that many days old
      Negative: less than that absolute value
      Positive: more than that value
  Files that the user has write permission on
    find . -perm -200    ## all bits set in the mask must be set in the file's permissions
    The permission mask is given as an octal string
      Unsigned: an exact match on the permissions is required.
      Negative: all of the bits set are required to match.
      Positive: at least one of the bits set must match,
        e.g., +700    ## user can read, or write, or execute (recent GNU find spells this form /700)
  Files that the user does not have read permission on
    find . ! -perm -400
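The "all bits must match" form and its negation can be tried on a scratch directory, as in this sketch. The file names rw and ro are hypothetical.

```shell
workdir=$(mktemp -d) || exit 1
touch "$workdir/rw" "$workdir/ro"
chmod 600 "$workdir/rw"     # owner may read and write
chmod 400 "$workdir/ro"     # owner may only read

# -perm -200: the owner-write bit must be set.
owner_writable=$(find "$workdir" -type f -perm -200)
# ! -perm -200: the owner-write bit must NOT be set.
not_owner_writable=$(find "$workdir" -type f ! -perm -200)
echo "writable: $owner_writable"
echo "read-only: $not_owner_writable"
```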

find: selectors
  Selector options can be combined: all must match for the action to be taken.
    They can be interspersed with the -a (AND) option
  -o (OR) option: at least one selector of the surrounding pair must match.
  Find nonempty files smaller than 10 blocks (5120 bytes)
    $ find . -size +0 -a -size -10
  Find files that are empty or unread in the past year
    $ find . -size 0 -o -atime +365
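Combining selectors can be sketched on a scratch directory as follows. The file names are hypothetical, and -size +10 stands in for -atime +365 so that the result does not depend on access times. The escaped parentheses group the OR, because the implicit AND after -type f would otherwise bind more tightly than -o.

```shell
workdir=$(mktemp -d) || exit 1
: > "$workdir/empty"                                          # 0 bytes
printf 'hello\n' > "$workdir/small"                           # 1 block
dd if=/dev/zero of="$workdir/big" bs=512 count=20 2> /dev/null   # 20 blocks

# AND: nonempty and smaller than 10 blocks.
small_nonempty=$(find "$workdir" -type f -size +0 -a -size -10)
# OR: empty, or larger than 10 blocks.
empty_or_big=$(find "$workdir" -type f \( -size 0 -o -size +10 \) | sort)
echo "$small_nonempty"
echo "$empty_or_big"
```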

Usage of find in shell script
    #!/bin/bash
    …                                        ## go to top level web site directory
    find . -name '*.html' -type f |          ## Find all HTML files
    while IFS= read -r file                  ## Read filename into variable
    do
        echo "$file"                         ## Print progress
        mv "$file" "$file.save"              ## Save a backup copy
        ## Make the change
        sed -f "$HOME/html2xhtml.sed" < "$file.save" > "$file"
    done

html2xhtml.sed
  Converts HTML to XHTML, the standardized XML-based version of HTML: converts tags to lowercase, and changes the <br> tag into the self-closing form, <br/>:
    # Slash delimiter
    s/<H1>/<h1>/g
    s/<H2>/<h2>/g
    s/<H3>/<h3>/g
    s/<H4>/<h4>/g
    s/<H5>/<h5>/g
    s/<H6>/<h6>/g
    # Colon delimiter, slash in data
    s:</H1>:</h1>:g
    s:</H2>:</h2>:g
    ..
    s:<[Hh][Tt][Mm][Ll]>:<html>:g
    s:</[Hh][Tt][Mm][Ll]>:</html>:g
    s:<[Bb][Rr]>:<br/>:g
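The effect of a few of these substitutions can be checked on one line of input, as in this sketch. The sample HTML fragment is made up for illustration.

```shell
# Apply three of the substitutions to a single sample line.
xhtml=$(printf '<H1>Title</H1><BR>\n' |
    sed -e 's/<H1>/<h1>/g' \
        -e 's:</H1>:</h1>:g' \
        -e 's:<[Bb][Rr]>:<br/>:g')
echo "$xhtml"
```

Using a colon as the delimiter avoids having to escape the slash in closing tags.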

Total file size
    $ find -ls | awk '{Sum += $7} END {printf("Total: %.0f bytes\n", Sum)}'
    Total: 23079017 bytes

xargs command
  Supply the list returned by find as arguments to another command
    Via the shell's command substitution feature.
  E.g., searching for the symbol POSIX_OPEN_MAX in system header files:
    $ grep POSIX_OPEN_MAX /dev/null $(find /usr/include -type f | sort)
    /usr/include/limits.h: #define _POSIX_OPEN_MAX 16
    Note: why /dev/null here?
  Potential problem: the command line might exceed the system limit => "argument list too long" error
    $ getconf ARG_MAX    ## get system configuration values
    2097152

xargs command
  xargs: takes a list of arguments from standard input, separated by blanks or newlines, and feeds them in suitably sized groups (determined by ARG_MAX) to another command given as arguments to xargs.
    $ find /usr/include -type f | xargs grep POSIX_OPEN_MAX /dev/null
    /usr/include/bits/posix1_lim.h:#define _POSIX_OPEN_MAX 16
    /usr/include/bits/posix1_lim.h:#define _POSIX_FD_SETSIZE _POSIX_OPEN_MAX
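Because xargs splits its input on whitespace, file names containing spaces need extra care. The sketch below uses find -print0 with xargs -0, which are GNU/BSD extensions rather than POSIX features; the header file names are hypothetical. The extra /dev/null argument means grep always sees more than one file, so it prefixes each match with the file name even when xargs hands it a single name.

```shell
workdir=$(mktemp -d) || exit 1
printf '#define _POSIX_OPEN_MAX 16\n' > "$workdir/posix limits.h"   # name contains a space
printf '#define OTHER 1\n'            > "$workdir/other.h"

# NUL-terminated names survive spaces; xargs -0 reads them back intact.
matches=$(find "$workdir" -type f -print0 | xargs -0 grep POSIX_OPEN_MAX /dev/null)
echo "$matches"
```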

Code studies: filesdirectories

Summary
  Finish up with awk: pipeline, external commands
  Commands working with files
    tree, ls (-d option, -1 option, -R, -a)
    od (octal dump), stat (show metadata of file)
    touch command, temporary file, file with random bytes
    File checksum, verification
    locate, type, which, find command: finding files