Introducing the LINUX Operating System
Mark Wamalwa, BecA-ILRI Hub, Nairobi, Kenya
http://hub.africabiosciences.org/
http://www.Ilri.org/
[email protected]
Joint BecA-ILRI Hub, SLU and UNESCO Advanced Genomics and Bioinformatics
7th - 17th October 2013
What is UNIX?
• A family of operating systems
IRIX
SOLARIS
AIX
LINUX
Digital UNIX
HP-UX
...
• Multitasking
Runs more than one program at the same time.
A busy system can be running several hundred or even thousands of programs at the same time.
• Multiuser
Many different people can use the system at the same time.
• Networked
It is designed to be linked to other computers and to allow people to work over a network.
The network IS the computer.
What is LINUX?
• A freely available clone of the UNIX operating system for personal computers
• Linux and Unix
– Time-sharing operating systems: allow multiple users to use the system simultaneously
– Unix: developed in 1969 at Bell Labs
– Linux is similar to Unix in some aspects
Linus Torvalds
What does UNIX do?
The Computer
• Disk storage, memory, network adapter, modem, screen, keyboard
The UNIX Kernel
• Controls access to the hardware.
• Prevents programs interfering with each other.
• Provides an easy way for programmers to talk to the electronics.
• Controls data storage and protection.
The Shell (or command line)
• Allows the user to interact directly with the computer by typing commands.
• The shell interprets these and instructs the kernel accordingly.
• Very powerful but can be intimidating.
Console programs
• Run from the shell
• Use one program actively at a time
The X Window System
• Graphical interface (point, click, drag, drop etc.)
• Network enabled
• Can use many programs at once
• Is a separate program
• Easier to use than the shell but less powerful
User Interaction
• Many different users, typically accessing the system from remote machines in different ways.
• Any number of users can use any number of programs and methods to access the system from any number of remote machines at the same time.
Logging in
• Log in from anywhere you have permission.
• Have graphical output sent anywhere you have permission.
You must have a username (login id) to use a unix/linux system
This identifies you to the system so it can manage your work properly.
Every user is a member of one or more groups of users.
This helps the system manage different types of user properly.
Logging in
Connect to the Linux machine using:
• PuTTY
• WinSCP: an open source SFTP (SSH File Transfer Protocol) and SCP (Secure CoPy) client for Windows using SSH (Secure SHell)
• Telnet, Xterm, Secure Shell, Kermit, or other terminal emulators
A login session looks like this:
Connecting to http://hpc.ilri.cgiar.org
Connected.
Welcome to Genotyping by Sequencing (GBS) workshop
Login: username
Password:
The system will be unavailable during Ramadhan.
You have new mail.
username@hpc~>
• Login: UNIX is case sensitive. username is not the same as Username or USERNAME.
• Password: Linux does not show the password on the screen as you type it.
• After logging in you may get some messages from the system administrator.
Accessing HPC from Windows systems
• Two stage process:
– Connecting to the system via secure shell (ssh) login
– Getting a graphical connection that supports X-Windows
• ssh connection:
– Need third party software.
– Local suggestion: use PuTTY
• Process is slightly more awkward than ideal because local PuTTY is configured for the Sun UNIX environment.
• Better: download putty.exe from http://www.chiark.greenend.org.uk/~sgtatham/putty/
– Just runs from your desktop
• Alternative: cygwin, a Linux-like environment for Windows
– www.cygwin.com
Using Local PuTTY - 1
Better choice
This is necessary for all PuTTY installs.
Using Local PuTTY - 2
linux
Using PuTTY-3
PuTTY Terminal Screen
The shell or command line
1. The Prompt
There are several different shells, but they behave more or less the same.
username@hpc/home~>
• username: your username
• hpc: the machine you are logged in to
• /home~: your present location
The prompt is interactive and can be customised to look how you wish.
The shell or command line
2. Commands
username@hpc~> ls -ald *.txt
• The shell breaks the command up into individual words. The boundary between words is a space. For the shell to treat a phrase that includes spaces as a single word, put it in quotes: 'my word' or "my word".
• The first word is a command (here, ls).
• The subsequent words form a list of arguments to the command. Arguments beginning with - are options (here, -ald). Options control how the program runs. '-a -l -d' is equivalent to '-ald'.
• * is a special character. It means 'any group of characters' (including none). The shell finds all the filenames that match anything.txt and adds them to the list of arguments.
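The word-splitting and wildcard rules can be tried out safely in a scratch directory. This is a minimal sketch; the file names (notes.txt, 'my word.txt', data.csv) are made up for illustration, and 'set --' is used only as a convenient way to count the words the shell produces.

```shell
# Work in a throwaway directory so nothing in your home directory is touched.
scratch=$(mktemp -d)
cd "$scratch"
touch notes.txt 'my word.txt' data.csv

# The shell expands *.txt into the matching filenames before the command runs.
# 'set --' stores the resulting words so that we can count them.
set -- *.txt
expanded_count=$#
echo "*.txt expanded to $expanded_count arguments: $*"

# Inside quotes, * loses its special meaning and stays a single literal word.
set -- '*.txt'
quoted_count=$#
echo "'*.txt' in quotes is $quoted_count argument: $1"
```

Note that 'my word.txt' counts as one argument even though it contains a space, because the shell expands the wildcard after it has split the command into words.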
More Special Characters
• * Any group of characters, including none.
• ? Any single character.
• " ' Quotes. Word delineation.
• & Cause the process to run in the background.
• | Pipe. Pass the output of the command on the left as the input to the command on the right.
• > Redirect the command's output, e.g. to a file.
• < Redirect a command's input, e.g. from a file instead of the keyboard.
• `` Backticks (not '). Take the output of the command as an argument.
• $ String or Dollar. Treat the next word as a variable and write out its value.
• \ Backslash. Change the meaning of the next character.
• ; Semicolon. Separate commands typed in together.
Some special characters can lose their special meaning if they are inside quotes.
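Several of these characters can be seen working together in a short session. This is only a sketch: the file names (names.txt, status.txt) and their contents are invented for the example.

```shell
# Work in a throwaway directory.
scratch=$(mktemp -d)
cd "$scratch"

# >  sends output to a file;  ;  separates two commands typed on one line.
printf 'seq1\nseq2\nseq3\n' > names.txt ; echo 'finished' > status.txt

# |  passes the output of cat on as the input of wc (which counts lines).
line_count=$(cat names.txt | wc -l)

# <  makes head read its input from a file instead of the keyboard.
first_name=$(head -n 1 < names.txt)

# Backticks put the output of a command where an argument is expected.
status_message=`cat status.txt`

# $  writes out the value of a variable;  \  switches that meaning off.
echo "first entry: $first_name"
echo "literal text: \$first_name"
echo "lines: $line_count, status: $status_message"
```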
Organisation
"Everything is a file"
• An ordinary file contains data. The data could be an image, a document, a set of instructions (a program) or any fixed information.
• A directory contains other files. This is a folder on Windows. A directory can contain other directories (sub-directories).
• A link is a file that is a shortcut to another file. Files can have more than one name, and be in different directories at the same time.
• There are many other types of file.
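The three file types can be created and told apart from the shell. The sketch below assumes the standard 'ln' command, which the slides do not cover: 'ln -s' makes a shortcut (symbolic link) and plain 'ln' gives an existing file a second name. All file names are hypothetical.

```shell
# Work in a throwaway directory.
scratch=$(mktemp -d)
cd "$scratch"

echo 'ACGT' > seq1                 # an ordinary file containing data
mkdir project                      # a directory
ln -s seq1 shortcut                # a link: a shortcut to seq1
ln seq1 project/second_name        # a second name for seq1, in another directory

# 'test' style checks: -d is true for a directory, -L for a link.
if [ -d project ]; then project_type=directory; fi
if [ -L shortcut ]; then shortcut_type=link; fi

# Both the shortcut and the second name lead to the same data.
via_link=$(cat shortcut)
via_second_name=$(cat project/second_name)
echo "$project_type $shortcut_type $via_link $via_second_name"
```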
Organisation of the file system
The top of the file system is the directory '/', commonly known as the root directory.
/
    bin
    usr
    etc
    home
        username
            project
                seq1
                seq2
                seq3
                seq4
            letter
            prot
• bin, usr, etc and home are several subdirectories under the root directory.
• username is another subdirectory: an example user's home directory with a subdirectory (project) and several files.
Any file in the file system can be uniquely identified by describing the path to it from the root directory.
/home/username/prot
Reading the path from left to right: start at the root directory /, go into home, then into username, and arrive at the file prot.
Organisation of the file system
Any process is located somewhere in the filesystem.
The command 'pwd' (print working directory) will tell you where.
username@hpc ~> pwd
/home/username
'~' is a Linux shortcut for 'your home directory'.
Looking at the file system
'ls' lists the files in a directory or directories.
username@hpc ~> ls
letter project prot
username@hpc ~> ls project
seq1 seq2 seq3 seq4
Without an argument, ls lists all the files that don't start with . in the current directory.
There are many options to ls that allow you to select and control the information it presents.
Moving around the file system
You can move to a different directory with the command 'cd directory'.
'directory' is the directory to which you want to move. The name can be written as the full path (from root) or as the relative path (from your current directory).
Using the full path:
username@hpc ~> cd /home/username/project
username@hpc ~/project> pwd
/home/username/project
'..' means the parent directory. '.' means the current directory.
username@hpc ~/project> cd ..
username@hpc ~> pwd
/home/username
Repeat using the relative path:
username@hpc ~> cd project
username@hpc ~/project> pwd
/home/username/project
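The same moves can be rehearsed without touching the real /home. The sketch below rebuilds the example tree under a temporary directory (an assumption made so the script can run anywhere) and checks that the full path and the relative path end up in the same place.

```shell
# Rebuild the example tree under a throwaway base directory.
base=$(mktemp -d)
mkdir -p "$base/home/username/project"

# Move using the full path.
cd "$base/home/username/project"
full_path_dir=$(pwd)

# '..' is the parent directory.
cd ..
parent_dir=$(pwd)

# Move back down using the relative path from the current directory.
cd project
relative_path_dir=$(pwd)

echo "full path:     $full_path_dir"
echo "parent:        $parent_dir"
echo "relative path: $relative_path_dir"
```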
Changing the file system
You can create a new subdirectory in the current directory with the command 'mkdir directory'.
username@hpc ~> mkdir model
username@hpc ~>
Changing the file system
You can delete an empty subdirectory with the command 'rmdir directory'.
username@hpc ~> rmdir model
username@hpc ~>
You can delete a file with the command 'rm file'.
username@hpc ~> rm prot
You can delete a subdirectory and its contents with the command 'rm -rf directory'.
username@hpc ~> rm -rf directory
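Because 'rm -rf' deletes without asking and there is no undo, it is worth practising in a scratch directory first. The sketch below uses made-up directory and file names and shows that 'rmdir' refuses to remove a directory that still has files in it.

```shell
# Work in a throwaway directory.
scratch=$(mktemp -d)
cd "$scratch"

# An empty directory can be removed with rmdir.
mkdir model
rmdir model

# A directory with contents cannot: rmdir fails and leaves it alone.
mkdir project
touch project/seq1 project/seq2
if rmdir project 2>/dev/null; then
    rmdir_result=removed
else
    rmdir_result=refused
fi
echo "rmdir on a non-empty directory: $rmdir_result"

# rm deletes a single file; rm -rf deletes the directory and everything in it.
rm project/seq1
rm -rf project
```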
More about files: filenames
Filenames can contain any normal text character including spaces and special characters.
Filenames can be almost any length. It is best to stick to a-z, A-Z, _, -, and numbers. It is best to keep them short as it saves typing.
If a filename contains a special character or a space you may need to put quotes around the whole path.
Special characters in filenames can cause problems with some programs.
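The effect of a space in a filename can be seen directly. In this sketch the file name 'my results.txt' is invented; without quotes the shell hands ls two separate words, neither of which exists.

```shell
# Work in a throwaway directory.
scratch=$(mktemp -d)
cd "$scratch"
touch 'my results.txt'

# Unquoted: ls is asked for two files, 'my' and 'results.txt'.
if ls my results.txt > /dev/null 2>&1; then
    unquoted=found
else
    unquoted=missing
fi

# Quoted: ls is asked for the one file that really exists.
if ls 'my results.txt' > /dev/null 2>&1; then
    quoted=found
else
    quoted=missing
fi

echo "without quotes: $unquoted, with quotes: $quoted"
```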
More about files: reading files
You can print the contents of one or more files to the screen with the command: 'cat file1 file2 ...'
cat prints the whole file at once, so a file longer than just a few lines will run off the top of your screen.
You can view the contents of one or more files a page at a time on the screen with the command: ' more file1 file2 ...'
more will let you search through a file, go backwards and forwards and has many other functions.
You can print the first few lines of a file with the command: 'head file1 file2 ...'
The last few lines can be viewed with 'tail'
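A quick way to compare these commands is to run them on a file that is too long for one screen. The sketch below builds a hypothetical 20-line file (numbers.txt) with awk, which is introduced later in this deck; 'more' is left out because it waits for key presses.

```shell
# Work in a throwaway directory and create a 20-line test file.
scratch=$(mktemp -d)
cd "$scratch"
awk 'BEGIN { for (i = 1; i <= 20; i++) print "line " i }' > numbers.txt

# cat prints the whole file at once.
all_lines=$(cat numbers.txt | wc -l)

# head and tail print the first and last 10 lines unless told otherwise.
head_lines=$(head numbers.txt | wc -l)
first_line=$(head -n 1 numbers.txt)
last_line=$(tail -n 1 numbers.txt)

echo "cat shows $all_lines lines, head shows $head_lines"
echo "first: $first_line, last: $last_line"
```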
More about files: editing files
You can change the content of text files and create new files with a text editor.
Text editors edit text. They do not try to format the text like word processors.
PICO A novice friendly basic text editor used as standard on many systems. Start with the command 'pico filename'
EMACS A powerful editing environment which can be programmed. It has many modes for auto layout of program code. Start with the command 'emacs filename'
VI A powerful editor which can be somewhat confusing for newcomers. It is designed for rapid editing of text files and programming. Start with the command 'vi filename'
Others: kedit, gedit, kwrite, etc.
More about files: copying files
You can copy a file with the command 'cp oldfilename newfilename'.
If newfilename is a directory, then the file will be copied to 'newfilename/oldfilename'.
username@hpc ~> ls
letter project
username@hpc ~> cp letter draft
username@hpc ~> ls
draft letter project
The command 'mv oldfilename newfilename' can be used to rename a file.
username@hpc ~> mv oldfilename newfilename
Warning: If a file called newfilename already exists then it will be overwritten.
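The copy, rename and overwrite behaviour can be checked in a scratch directory. This is a sketch with invented file names and contents; the last step shows the warning in action, with the old contents of 'notes' silently replaced.

```shell
# Work in a throwaway directory.
scratch=$(mktemp -d)
cd "$scratch"
echo 'Dear colleague' > letter
mkdir project

# Copy to a new name, and copy into a directory (giving project/letter).
cp letter draft
cp letter project

# Rename draft to final: draft no longer exists afterwards.
mv draft final

# Overwrite: notes already exists, so its old contents are lost.
echo 'old notes' > notes
mv final notes
notes_content=$(cat notes)
echo "notes now contains: $notes_content"
```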
More about files: permissions
• Every file is protected.
• Permissions determine who can read, write, or execute a given file.
• There are three types of user:
– Owner: the user who owns the file.
– Group: other users in the same group as the user who owns the file.
– World: all the other users in the system.
• Files can have read (r), write (w) or execute (x) permission for each of the three types of user.
More about files: permissions
You can view the permissions for a file by listing it in long format with the command 'ls -l filename' (the option is the letter l).
username@hpc ~> ls -l letter
-rwxr--r-- 1 username users 6048 Aug 17 16:07 letter
Reading the output from left to right:
• - : the file type (- ordinary file, d directory, l link or shortcut)
• rwx : permissions for the owner
• r-- : permissions for the owner's group
• r-- : permissions for everyone else
• username : the user who owns the file
• users : the file's group
• 6048 : the file's size in bytes
• Aug 17 16:07 : the date the file was last modified
• letter : the file's name
More about files: permissions
You can change the permissions for a file with the command 'chmod change filename'.
change is the modification you want to make to the file's permissions.
username@hpc ~> ls -l letter
-rwxr--r-- 1 username users 6048 Aug 17 16:07 letter
username@hpc ~> chmod o-r letter
username@hpc ~> ls -l letter
-rwxr----- 1 username users 6048 Aug 17 16:07 letter
The change 'o-r' has three parts:
• For whom you are changing permissions: o - other, g - group, u - user, a - all
• How you are changing permissions: - remove these permissions, + add these permissions, = set permissions to this
• Permissions being changed: r - read permission, w - write permission, x - execute (run) permission
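The before and after listings can be reproduced on a scratch file. In this sketch the file is first given the same starting permissions as the example with an explicit 'chmod u=rwx,g=r,o=r', and 'cut -c1-10' keeps only the type and permission characters of the 'ls -l' output.

```shell
# Work in a throwaway directory with an empty file called letter.
scratch=$(mktemp -d)
cd "$scratch"
touch letter

# Set the starting permissions explicitly: owner rwx, group r, other r.
chmod u=rwx,g=r,o=r letter
before=$(ls -l letter | cut -c1-10)

# Remove read permission for other users.
chmod o-r letter
after=$(ls -l letter | cut -c1-10)

echo "before: $before"
echo "after:  $after"
```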
Introduction to Awk
Awk is a convenient and expressive programming language that can be applied to a wide variety of computing and data manipulation tasks.
Awk
• Works well on record-type data
• Reads input file(s) a line at a time
• Parses each line into fields
• Performs user-defined tests against each line, performs actions on matches
Other Common Uses
• Input validation
– Does every record have the same # of fields?
– Do values make sense (negative time, hourly wage > $100, etc.)?
• Filtering out certain fields
• Searches
– Who got a zero on lab 3?
– Who got the highest grade?
• Many others (it's late)
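Both kinds of input validation fit in one-line awk programs. The sketch below runs on a small invented file (records.txt) in which one record is missing a field and one has a negative value; the expected field count of 3 is an assumption of the example.

```shell
# Work in a throwaway directory with a small, deliberately flawed data file.
scratch=$(mktemp -d)
cd "$scratch"
printf 'Beth 4.00 0\nDan 3.75\nKathy -4.00 10\n' > records.txt

# Does every record have the same number of fields? Print the line numbers that do not.
short_records=$(awk 'NF != 3 { print NR }' records.txt)

# Do the values make sense? Print the names whose second field is negative.
negative_rates=$(awk '$2 < 0 { print $1 }' records.txt)

echo "records with the wrong number of fields: $short_records"
echo "records with a negative value: $negative_rates"
```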
Invocation
• Can write little one-liners on the command line (very handy):
– print the 3rd field of every line:
$ awk '{ print $3 }' input.txt
• Execute an awk script file:
$ awk -f script.awk input.txt
• Or, use this sha-bang as the first line, and give your script execute permissions (the path to awk may differ on your system, e.g. /usr/bin/awk):
#!/bin/awk -f
Form of an AWK program
• AWK programs are entries of the form: pattern { action }
– pattern: some test, looking for a pattern (regular expressions) or C-like conditions
  • if null, actions are applied to every line
– action: a statement or set of statements
  • if not provided, the default action is to print the entire line, much like grep
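The three combinations of pattern and action can be compared side by side. This is a sketch on an invented file (reads.txt) with a name in the first field and a short sequence in the second.

```shell
# Work in a throwaway directory with a small two-column file.
scratch=$(mktemp -d)
cd "$scratch"
printf 'seq1 ACGT\nseq2 GGCC\nseq3 ACGA\n' > reads.txt

# Pattern only: the default action prints the whole matching line, like grep.
pattern_only=$(awk '/^seq2/' reads.txt)

# Action only: with no pattern, the action runs on every line.
action_only=$(awk '{ print $1 }' reads.txt)

# Pattern and action: print the name where the sequence starts with ACG.
pattern_and_action=$(awk '$2 ~ /^ACG/ { print $1 }' reads.txt)

echo "$pattern_only"
echo "$action_only"
echo "$pattern_and_action"
```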
Awk Features
• Patterns can be regular expressions or C-like conditions.
• Each line of the input is matched against the patterns, one after the next. If a match occurs the corresponding action is performed.
• Input lines are parsed and split into fields, which are accessed by $1,…,$NF, where NF is a variable set to the number of fields. The variable $0 contains the entire line, and by default lines are split by white space (blanks, tabs).
Variables
• Not declared, nor typed
• No character type
– Only strings and floats (support for ints)
• $n refers to the nth field (where n is some integer value)
# prints each field on the line
{ for (i = 1; i <= NF; ++i) print $i }
Some Built-in Variables
• FS: the input field separator
• OFS: the output field separator
• NF: # of fields; changes w/each record
• NR: the # of records read (so far). So, the current record #.
• $0: the entire input line
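These variables are easiest to understand by printing them. The sketch below uses an invented colon-separated file (snps.txt); FS is set to ':' so awk splits on colons, and OFS is set to a tab so the printed fields are tab-separated.

```shell
# Work in a throwaway directory with a colon-separated file.
scratch=$(mktemp -d)
cd "$scratch"
printf 'rs1:1:100\nrs2:2:250:extra\n' > snps.txt

# For each record print: record number, number of fields, first field.
summary=$(awk 'BEGIN { FS = ":"; OFS = "\t" } { print NR, NF, $1 }' snps.txt)
echo "$summary"

# $0 is the whole input line, untouched by the field splitting.
second_line=$(awk 'NR == 2 { print $0 }' snps.txt)
echo "$second_line"
```

The second record has four fields rather than three, which shows NF changing from one record to the next.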
Getting help
You can get help on a command by using the command 'man command'.
This will bring up the manual page and show it to you screen by screen.
If you do not know what a command is called, use the option '-k' to get a list of commands that may be relevant 'man -k word'
This will find all manual pages containing word in the short description of the command.
Try using the options '-h', '-help', or '--help' if you can't find the man page.
Exercise: Filter SNPs
Go to http://hpc.ilri.cgiar.org/beca/gbs/ and run these commands in your home directory:
a) mkdir snp_data
b) cd snp_data
c) wget http://hpc.ilri.cgiar.org/beca/gbs/Africa55K_10Pops.bim
d) wget http://hpc.ilri.cgiar.org/beca/gbs/emp.data
e) ls -alh
f) grep '^23\|^25\|^26' Africa55K_10Pops.bim > AfricaAll_Pops_non_autosomal.rsids
g) awk '{if ($1 > 22) print $2}' Africa55K_10Pops.bim > Africa55K_10Pops.xchrsnps
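If the workshop server is not reachable, the two filters can still be tried on a toy file. Everything in the sketch below is an assumption made for illustration: toy.bim and its four rows are invented, with the chromosome number in column 1 and the SNP identifier in column 2 as the exercise commands imply. The grep pattern is written with 'grep -E' and '|', which is an equivalent spelling of the '\|' form used in step f.

```shell
# Work in a throwaway directory with a made-up, tab-separated .bim-like file.
scratch=$(mktemp -d)
cd "$scratch"
printf '1\trs100\t0\t1000\tA\tG\n' >  toy.bim
printf '22\trs200\t0\t2000\tC\tT\n' >> toy.bim
printf '23\trs300\t0\t3000\tA\tC\n' >> toy.bim
printf '26\trs400\t0\t4000\tG\tT\n' >> toy.bim

# Step f: count the lines that start with 23, 25 or 26.
grep_rows=$(grep -cE '^(23|25|26)' toy.bim)

# Step g: print the SNP identifier where the chromosome number is above 22.
awk_ids=$(awk '{ if ($1 > 22) print $2 }' toy.bim)

echo "lines matched by grep: $grep_rows"
echo "identifiers selected by awk:"
echo "$awk_ids"
```

On this toy data both filters select the same two rows, but grep keeps whole lines while awk keeps only the second column.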
Example
$ cat emp.data
Beth 4.00 0
Dan 3.75 0
Kathy 4.00 10
Mark 5.00 20
Mary 5.50 22
Susie 4.25 18
Print those employees who actually worked:
$ awk '$3>0 {print $1, $2*$3}' emp.data
Kathy 40
Mark 100
Mary 121
Susie 76.5
Acknowledgement
• SANBI (David Martin)
• BSK
Adapted from SANBI & Bioinformatics Society of Kenya/BSK
Useful literature
'Learning the UNIX operating system', O'Reilly press.
'UNIX Quickguide', hpc
Questions?