4.1
Reading and writing files
4.2
Open a file for reading, and link it to a filehandle:
open(IN, "<EHD.fasta");
Then read lines from the filehandle, exactly as you would from <STDIN>:
my $line = <IN>;
my @inputLines = <IN>;
foreach $line (@inputLines) ...
Every filehandle that is opened should be closed:
close(IN);
Always check that the open didn't fail (e.g. if a file by that name doesn't exist):
open(IN, "<$file") or die "can't open file $file";
Reading files
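The steps above can be sketched as one small script. This is an illustration rather than the course's own code: it uses the three-argument form of open with lexical filehandles ($in, $out), and the file name demo_reads.fasta and its contents are made up so that the example can run on its own.

```perl
use strict;
use warnings;

# Hypothetical file name and contents, created here so the sketch is self-contained
my $file = "demo_reads.fasta";
open(my $out, ">", $file) or die "can't create file $file: $!";
print $out ">seq1\nACGT\n>seq2\nGGCC\n";
close($out);

# Open the file for reading and check that the open succeeded
open(my $in, "<", $file) or die "can't open file $file: $!";

my $firstLine = <$in>;    # scalar context: reads a single line
my @restLines = <$in>;    # list context: reads all the remaining lines
close($in);

# Remove the trailing newline characters
chomp $firstLine;
chomp @restLines;

print "First line: $firstLine\n";
print "Remaining lines: ", scalar(@restLines), "\n";

# Remove the demo file again
unlink($file) or die "can't delete $file: $!";
```

Reading in scalar context returns one line at a time, while reading in list context returns every line that has not been read yet.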
4.3
Open a file for writing, and link it to a filehandle:
open(OUT, ">EHD.analysis") or die...
NOTE: If a file by that name already exists, it will be overwritten!
Or you can add lines at the end of an existing file (append):
open(OUT, ">>EHD.analysis") or die...
Print to a file:
print OUT "The mutation is in exon $exonNumber\n";
Writing to files
Note: there is no comma between the filehandle (OUT) and the string in the print statement.
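The difference between ">" and ">>" can be seen in a short sketch. The file name demo.analysis and the value of $exonNumber are assumptions made up for this illustration.

```perl
use strict;
use warnings;

my $file = "demo.analysis";   # hypothetical output file name
my $exonNumber = 3;           # made-up value for illustration

# ">" creates the file, or overwrites it if it already exists
open(OUT, ">$file") or die "can't open file $file";
print OUT "The mutation is in exon $exonNumber\n";   # no comma after OUT
close(OUT);

# ">>" appends to the end of the existing file
open(OUT, ">>$file") or die "can't open file $file";
print OUT "Analysis finished\n";
close(OUT);

# Read the file back to check how many lines it now holds
open(IN, "<$file") or die "can't open file $file";
my @lines = <IN>;
close(IN);
print "The file has ", scalar(@lines), " lines\n";

# Remove the demo file again
unlink($file) or die "can't delete $file";
```

If the second open used ">" instead of ">>", the first line would be lost and the file would hold only one line.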
4.4
You can ask questions about a file or a directory name (not filehandle):
if (-e $name) { print "The file $name exists!\n"; }
-e $name exists
-r $name is readable
-w $name is writable by you
-z $name has zero size
-s $name has non-zero size (returns size)
-f $name is a file
-d $name is a directory
-l $name is a symbolic link
-T $name is a text file
-B $name is a binary file (opposite of -T)
File Test Operators
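A few of these tests can be tried on a file that the script creates itself. This is only a sketch; the file name demo_empty.txt is an assumption made up for the example.

```perl
use strict;
use warnings;

my $name = "demo_empty.txt";   # hypothetical file name

# Create an empty file so that the tests have something to examine
open(OUT, ">$name") or die "can't open file $name";
close(OUT);

my $exists  = (-e $name) ? 1 : 0;   # the name exists
my $isFile  = (-f $name) ? 1 : 0;   # it is a plain file
my $isDir   = (-d $name) ? 1 : 0;   # it is not a directory
my $isEmpty = (-z $name) ? 1 : 0;   # it has zero size

print "exists=$exists file=$isFile dir=$isDir empty=$isEmpty\n";

# Write something to the file, then ask for its size with -s
open(OUT, ">$name") or die "can't open file $name";
print OUT "ACGT\n";
close(OUT);

my $size = -s $name;   # the size in bytes (a true value when non-zero)
print "The file $name now has $size bytes\n";

# Remove the demo file again
unlink($name) or die "can't delete $name";
```

The tests take a file name, not a filehandle, so they can be used before the file is opened.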
4.5
open(IN, '<D:\workspace\Perl\p53.fasta');
• Always use a full path name; it is safer and clearer to read.
• Remember to write \\ for each backslash inside double quotes:
open(IN, "<D:\\workspace\\Perl\\$name.fasta");
• Usually you can also use / instead:
open(IN, "<D:/workspace/Perl/$name.fasta");
Working with paths
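The three ways of writing a path can be compared directly as strings. In this sketch the value of $name is an assumption chosen so that all three forms describe the same file; no file is actually opened.

```perl
use strict;
use warnings;

my $name = "p53";   # hypothetical base name of the file

# Single quotes: the backslashes stay as typed, but $name would not be interpolated
my $single = 'D:\workspace\Perl\p53.fasta';

# Double quotes: each \\ produces one backslash, and $name is interpolated
my $double = "D:\\workspace\\Perl\\$name.fasta";

# Forward slashes need no escaping at all
my $forward = "D:/workspace/Perl/$name.fasta";

print "$single\n$double\n$forward\n";
print "The first two strings are identical\n" if $single eq $double;
```

A single backslash inside double quotes would start an escape sequence instead (for example, \n is a newline), which is why it has to be doubled there.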
4.6
It is common to pass parameters to a program or script on the command line.
They will be stored in the array @ARGV:
@ARGV contains: ("my","argument","list");
foreach my $arg (@ARGV) {
    print "$arg\n";
}
Command line parameters
> perl -w findProtein.pl my argument list
my
argument
list
4.7
It is common to pass parameters to a program or script on the command line.
They will be stored in the array @ARGV:
@ARGV contains: ("my argument list");
foreach my $arg (@ARGV) {
    print "$arg\n";
}
> perl -w findProtein.pl "my argument list"
Command line parameters
my argument list
4.8
It is common to pass parameters to a program or script on the command line.
They will be stored in the array @ARGV:
my $inFile = $ARGV[0];
my $outFile = $ARGV[1];
Or more simply:
my ($inFile,$outFile) = @ARGV;
Command line parameters
> perl -w findProtein.pl D:\perl\input.fasta D:\perl\output.txt
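A script that expects two parameters can check that it really received them before using them. The sketch below is one possible way to do that; the helper getFileNames and the usage message are assumptions made up for this illustration, and fixed file names are passed in place of @ARGV so that the example runs without any parameters.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the two file names,
# or dies with a usage message if the number of parameters is wrong
sub getFileNames {
    my @args = @_;
    if (@args != 2) {
        die "Usage: perl findProtein.pl <input file> <output file>\n";
    }
    my ($inFile, $outFile) = @args;
    return ($inFile, $outFile);
}

# In a real script this would be called as getFileNames(@ARGV)
my ($inFile, $outFile) = getFileNames("input.fasta", "output.txt");
print "Reading from $inFile, writing to $outFile\n";
```

Without such a check, a missing parameter leaves $outFile undefined and the error only shows up later, when the file is opened.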
4.9
Command line parameters in PerlExpress
4.10
Reminder: the class exercise of 3 days ago.
Reading files - example
Input:
Yossi 6.10,16.50,5.00
Dana 21.00,6.00
Refael 24.00,7.00,8.00
END
Output:
Yossi 27.6
Dana 27
Refael 39
4.11
Reading files: example
$line = <STDIN>;
chomp $line;
# Each loop iteration processes one input line and prints its output
while ($line ne "END") {
    # Separate name and numbers
    @nameAndNums = split(" ", $line);
    $name = $nameAndNums[0];
    @nums = split(",", $nameAndNums[1]);
    $sum = 0;
    # Sum numbers
    foreach $num (@nums) {
        $sum = $sum + $num;
    }
    print "$name $sum\n";
    # Read next line
    $line = <STDIN>;
    chomp $line;
}
Input:
Yossi 6.10,16.50,5.00
Dana 21.00,6.00
Refael 24.00,7.00,8.00
END
Output:
Yossi 27.6
Dana 27
Refael 39
4.12
Reading files: example
my ($inFileName) = @ARGV;
open(IN, "<$inFileName") or die "can't open $inFileName";
$line = <IN>;
chomp $line;
# Each loop iteration processes one input line and prints its output
while ($line ne "END") {
    # Separate name and numbers
    @nameAndNums = split(" ", $line);
    $name = $nameAndNums[0];
    @nums = split(",", $nameAndNums[1]);
    $sum = 0;
    # Sum numbers
    foreach $num (@nums) {
        $sum = $sum + $num;
    }
    print "$name $sum\n";
    # Read next line
    $line = <IN>;
    chomp $line;
}
close(IN);
Input:
Yossi 6.10,16.50,5.00
Dana 21.00,6.00
Refael 24.00,7.00,8.00
END
Output:
Yossi 27.6
Dana 27
Refael 39
4.13
Reading files: example
my ($inFileName, $outFileName) = @ARGV;
open(IN, "<$inFileName") or die "can't open $inFileName";
open(OUT, ">$outFileName") or die "can't open $outFileName";
$line = <IN>;
chomp $line;
# Each loop iteration processes one input line and prints its output
while ($line ne "END") {
    # Separate name and numbers
    @nameAndNums = split(" ", $line);
    $name = $nameAndNums[0];
    @nums = split(",", $nameAndNums[1]);
    $sum = 0;
    # Sum numbers
    foreach $num (@nums) {
        $sum = $sum + $num;
    }
    print OUT "$name $sum\n";
    # Read next line
    $line = <IN>;
    chomp $line;
}
close(IN);
close(OUT);
Input:
Yossi 6.10,16.50,5.00
Dana 21.00,6.00
Refael 24.00,7.00,8.00
END
Output:
Yossi 27.6
Dana 27
Refael 39
4.14
Reading files: example
my ($inFileName, $outFileName) = @ARGV;
open(IN, "<$inFileName") or die "can't open $inFileName";
open(OUT, ">$outFileName") or die "can't open $outFileName";
$line = <IN>;
chomp $line;
# Each loop iteration processes one input line and prints its output
while (defined $line) {
    # Separate name and numbers
    @nameAndNums = split(" ", $line);
    $name = $nameAndNums[0];
    @nums = split(",", $nameAndNums[1]);
    $sum = 0;
    # Sum numbers
    foreach $num (@nums) {
        $sum = $sum + $num;
    }
    print OUT "$name $sum\n";
    # Read next line
    $line = <IN>;
    chomp $line;
}
close(IN);
close(OUT);
Input:
Yossi 6.10,16.50,5.00
Dana 21.00,6.00
Refael 24.00,7.00,8.00
Output:
Yossi 27.6
Dana 27
Refael 39
4.15
Reading files: example
my ($inFileName, $outFileName) = @ARGV;
open(IN, "<$inFileName") or die "can't open $inFileName";
open(OUT, ">$outFileName") or die "can't open $outFileName";
$line = <IN>;
# Each loop iteration processes one input line and prints its output
while (defined $line) {
    chomp $line;
    # Separate name and numbers
    @nameAndNums = split(" ", $line);
    $name = $nameAndNums[0];
    @nums = split(",", $nameAndNums[1]);
    $sum = 0;
    # Sum numbers
    foreach $num (@nums) {
        $sum = $sum + $num;
    }
    print OUT "$name $sum\n";
    # Read next line
    $line = <IN>;
}
close(IN);
close(OUT);
Input:
Yossi 6.10,16.50,5.00
Dana 21.00,6.00
Refael 24.00,7.00,8.00
Output:
Yossi 27.6
Dana 27
Refael 39
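The read loop can also be written more compactly by reading the line inside the while condition; Perl then tests whether the value read is defined, so the loop stops at the end of the file. The sketch below is an alternative form, not the version used on the slides. The file name demo_prices.txt and the @results array are assumptions made up so that the example can run on its own.

```perl
use strict;
use warnings;

# Hypothetical input file, created here so the sketch is self-contained
my $inFileName = "demo_prices.txt";
open(OUT, ">$inFileName") or die "can't open $inFileName";
print OUT "Yossi 6.10,16.50,5.00\nDana 21.00,6.00\n";
close(OUT);

open(IN, "<$inFileName") or die "can't open $inFileName";
my @results;
# The loop ends when <IN> returns undef at the end of the file
while (my $line = <IN>) {
    chomp $line;
    # Separate name and numbers
    my ($name, $numList) = split(" ", $line);
    # Sum numbers
    my $sum = 0;
    foreach my $num (split(",", $numList)) {
        $sum = $sum + $num;
    }
    push(@results, "$name $sum");
}
close(IN);

foreach my $result (@results) {
    print "$result\n";
}

# Remove the demo file again
unlink($inFileName) or die "can't delete $inFileName";
```

Because the line is read and tested in one place, there is no need for a separate read before the loop and another at its end.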
4.16
Class exercise 5
1. Write a script that reads a file containing a Perl script, that is named by the first command-line parameter (from @ARGV). Print out the script without comment lines (lines that begin with #).
2. Now write the results to a file that is named by the second command-line parameter.
3. Now remove all other comments as well (that may not start at the beginning of a line).
4.17
Perl allows easy access to the files in a directory by “globbing”:
The * matches any string of characters. For example, *.fasta matches all file names with the extension ".fasta":
my @files = <D:\\proteins\\*.fasta>;
foreach $fileName (@files) {
    open(IN, $fileName) or die "can't open file $fileName";
    foreach $line (<IN>) {
        do something...
    }
    close(IN);
}
Note: the glob returns a list of the names of the matching files in the directory.
Reading directories
Note: there are no quotation marks around the pattern inside the < >.
4.18
You can interpolate variables in the glob, as in double-quoted strings:
@files = <D:\\proteins\\chr$chromosome*.fasta>;
If $chromosome is 4 then we may get these files in @files:
chr4.fasta
chr4_NT_003827.fasta
chr4_NT_007222.fasta
Reading directories
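The same matching is available through the built-in glob function, which takes the pattern as an ordinary string. The sketch below is an illustration under stated assumptions: it builds a temporary directory with File::Temp (a standard module) and fills it with made-up empty files, so that it does not depend on D:\proteins existing.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Temporary directory with a few empty demo files (the names are made up)
my $dir = tempdir(CLEANUP => 1);
foreach my $fileName ("chr4.fasta", "chr4_NT_003827.fasta", "chr5.fasta") {
    open(OUT, ">$dir/$fileName") or die "can't open file $dir/$fileName";
    close(OUT);
}

# Variables are interpolated into the pattern before the matching is done
my $chromosome = 4;
my @files = sort(glob("$dir/chr$chromosome*.fasta"));

foreach my $fileName (@files) {
    print "$fileName\n";
}
```

Only the two chr4 files are returned; chr5.fasta does not match the pattern.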
4.19
Delete a file:
unlink ("fred.txt") or die "can't delete fred.txt";
Delete all files in a directory whose name matches a certain “pattern”:
unlink <fred\\*.txt> or die "can't delete files in fred";
(Here – all file names that end with “.txt”)
Move/rename files:
rename ("fred.txt", "friends\\bob.txt") or die "can't move fred.txt";
Manipulating files
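Both functions return a true value on success, which is why they can be followed by "or die". A minimal sketch, assuming a temporary directory created with File::Temp so that no real files are touched; the file names are made up for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Work inside a temporary directory that is removed when the script ends
my $dir = tempdir(CLEANUP => 1);
my $oldName = "$dir/fred.txt";
my $newName = "$dir/bob.txt";

# Create the file that will be renamed
open(OUT, ">$oldName") or die "can't open file $oldName";
print OUT "hello\n";
close(OUT);

# Rename the file, then check that only the new name exists
rename($oldName, $newName) or die "can't move $oldName";
my $movedOk = (-e $newName && !(-e $oldName)) ? 1 : 0;

# unlink returns the number of files it deleted
my $deleted = unlink($newName);

print "moved: $movedOk, files deleted: $deleted\n";
```

Since unlink returns a count, it returns 0 (false) when nothing could be deleted.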
4.20
Generally, you can execute any command of the operating system:
$systemReturn = system("del fred.txt");
Or:
$systemReturn = system("copy fred.txt george.txt");
When checking the value returned by a system call, usually 0 means no errors:
if ($systemReturn != 0) { die "can't copy fred.txt"; }
Calling system commands
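The value returned by system holds the exit code of the command in its upper bits, so shifting it right by 8 gives the code itself. The sketch below runs the Perl interpreter (its path is in the special variable $^X) as the external command. This is an assumption made only so that the example behaves the same on any operating system.

```perl
use strict;
use warnings;

# A command that succeeds: system returns 0
my $okReturn = system($^X, "-e", "exit 0");

# A command that ends with exit code 3: system returns a non-zero value
my $failReturn = system($^X, "-e", "exit 3");

# The exit code of the command is the returned value shifted right by 8 bits
my $exitCode = $failReturn >> 8;

print "first command returned $okReturn\n";
if ($failReturn != 0) {
    print "second command failed with exit code $exitCode\n";
}
```

Passing the command and its arguments as a list, as done here, also avoids problems with spaces in file names.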
4.21
Class exercise 6
1. Write a script that prints a list of all Perl files (i.e. files with extension “.pl”) in a given directory, that is named by the first command-line parameter.
2. Change the script from class exercise 5.1 so that it will read all Perl files in a given directory, that is named by the first command-line parameter, and print them out to the screen without the comment lines.
3* Change the script so that each script will be written to a file named as the input file with an added extension “.noComments”,
e.g. input “class_ex.2.2.pl”, output “class_ex.2.2.pl.noComments”