Information Technology in Plant Protection Presentation

Upload: korbin

Post on 23-Feb-2016

Page 1: Information Technology in Plant Protection

Information Technology in Plant Protection

Presentation

Page 2: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

• Prepared by: Dr. János Busznyák

GIS tools for Plant Protection

Page 3: Information Technology in Plant Protection

• Methods of Obtaining Spatial Data
  – Manual
  – Geodesy
  – With the help of Global Positioning
  – Photogrammetry
  – Remote Sensing
  – Manual Map Digitization
  – Scanning Maps
  – From Digital Files

Digital Mapping Tools for Plant Protection

Page 4: Information Technology in Plant Protection

• A digital map is more than the digital form of a map's contents ready to be used with a computer.
• It needs no segmentation; its elements are stored at real size, fit together accurately, carry topology, and are often organised into layers and objects.
• Primary data acquisition methods
  – Measurements (GPS)
  – Existing reports
  Primary methods mostly yield vector data.
• Secondary sources
  – Digitization, followed by automatic or manual vectorization.
  If georeferencing and vectorization follow in a secondary method, the result is also a vector map. If secondary data collection (scanning) is not followed by vectorization, the result is a digital raster map.

Digital Map

Page 5: Information Technology in Plant Protection

• Aim
  – New level of GPS analysis (vector)
  – New publication possibilities
  – Lower storage and transfer capacity needs
• Preparatory steps
  – Digitization of map sheets
  – Georeferencing, eliminating distortions, projection conversion (a lot of work)
• Pre-processing
• Vectorization
  – Of areas
  – Of line-like objects
  – Of objects
• Post-processing

Raster-Vector Transformation

Page 6: Information Technology in Plant Protection

• Vectorization
  – Manual
  – Semi-automatic
  – Automatic

Vectorization II.

Page 7: Information Technology in Plant Protection

• Automatic vectorization of a soil map
  – Single bit
  – Low data density
• Automatic vectorization of a topographic map
  – 8-bit
  – High data density

Black: converted to line. Blue: segmented pixels.

Application of the Automatic Method

Page 8: Information Technology in Plant Protection

• Coordinates of the shape file vertex points
  – site,lat,long,name,HOTLINK
  – 1,38.889,-77.035,Washington Monument,http://www.nps.gov/wamo
  – 2,38.889,-77.050,Lincoln Memorial,c:/ESRI/AEJEE/DATA/WASHDC/linc.jpg
  – 3,38.898,-77.036,White House,c:/ESRI/AEJEE/DATA/WASHDC/whse.txt
  – 4,38.889,-77.009,Capitol,c:/ESRI/AEJEE/DATA/WASHDC/cap.pdf

ESRI Arc Explorer JEE tutorial

Data Input from Text File
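
The delimited text on this slide can be read with any CSV parser before it is turned into point features. The following is a minimal sketch in Python, assuming the rows are available as plain text; the function name `read_points` and the dictionary keys are illustrative choices, not part of the ArcExplorer tutorial.

```python
import csv
import io

# Rows copied from the slide; a real workflow would read them from a file.
SAMPLE = """site,lat,long,name,HOTLINK
1,38.889,-77.035,Washington Monument,http://www.nps.gov/wamo
2,38.889,-77.050,Lincoln Memorial,c:/ESRI/AEJEE/DATA/WASHDC/linc.jpg
3,38.898,-77.036,White House,c:/ESRI/AEJEE/DATA/WASHDC/whse.txt
4,38.889,-77.009,Capitol,c:/ESRI/AEJEE/DATA/WASHDC/cap.pdf
"""


def read_points(text):
    """Parse the delimited text into a list of point dictionaries."""
    points = []
    for row in csv.DictReader(io.StringIO(text)):
        points.append({
            "site": int(row["site"]),
            "lat": float(row["lat"]),    # latitude in decimal degrees
            "lon": float(row["long"]),   # longitude in decimal degrees
            "name": row["name"],
            "hotlink": row["HOTLINK"],   # URL or file path attached to the point
        })
    return points


points = read_points(SAMPLE)
for p in points:
    print(p["site"], p["name"], p["lat"], p["lon"])
```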

Page 9: Information Technology in Plant Protection

• With the help of hybrid systems, raster and vector data can be used together.
  – Vector, raster and attribute data are stored separately, each in the way most suitable for its model.
  – The systems carry out each operation in the model that is most suitable for that operation.
  – The systems apply a wide variety of vector-raster transformations before and after the operations.
  – The Google Maps service is based on a hybrid data model.

Hybrid Data Model, Mashup Map

Page 10: Information Technology in Plant Protection

• Factors that most influence data quality:
  – Origin of data
  – Geometric accuracy
  – Accuracy of attribute data
  – Consistency of attribute data
  – Topological consistency
  – Completeness and validity of data

Data Quality

Page 11: Information Technology in Plant Protection

• Georeferencing is the process of scaling, rotating, translating and deskewing the image to match a particular size and position. The word was originally used to describe the process of referencing a map image to a geographic location. Source: http://wintopo.com/help/html/georef.htm
• Usual ways:
  – World file
  – Header (GeoTIFF, GeoJP2…)

Georeferencing

Page 12: Information Technology in Plant Protection

• Certain image formats include georeferencing information in the header of the image file:
  – img
  – bsq
  – bil
  – bip
  – EXIF
  – ITT
  – GeoTIFF
  – grid

Header

Page 13: Information Technology in Plant Protection

• Georeferencing information is stored in a separate world file:
  – The world file contains the 6 parameters of an affine transformation that connects the image coordinate system with the world coordinate system.
  – The images are stored as raster data, where each cell of the image is identified by a row and column number.
  – The world file must have the same name as the image file and be in the same folder.

World File
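
How the six world file parameters map a pixel to map coordinates can be sketched as follows. The sketch assumes the common ESRI line order (A, D, B, E, C, F), where A and E are the pixel sizes, D and B the rotation terms, and C, F the map coordinates of the centre of the upper-left pixel. The numeric values and function names are hypothetical.

```python
def parse_world_file(text):
    """Return the six parameters in file order: A, D, B, E, C, F."""
    values = [float(token) for token in text.split()]
    if len(values) != 6:
        raise ValueError("a world file holds exactly six numbers")
    return values


def pixel_to_world(params, col, row):
    """Apply the affine transformation to a pixel (column, row)."""
    a, d, b, e, c, f = params
    x = a * col + b * row + c
    y = d * col + e * row + f
    return x, y


# Hypothetical world file: 10 m pixels, no rotation,
# upper-left pixel centre at (500005, 149995) in map units.
WORLD = "10.0\n0.0\n0.0\n-10.0\n500005.0\n149995.0\n"
params = parse_world_file(WORLD)

print(pixel_to_world(params, 0, 0))  # (500005.0, 149995.0)
print(pixel_to_world(params, 3, 2))  # (500035.0, 149975.0)
```

E is negative because image rows grow downwards while map northing grows upwards.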

Page 14: Information Technology in Plant Protection

Georeferencing with the Help of 2 Reference Points
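
Two reference points are enough to fix a transformation made of uniform scaling, rotation and translation (no skew). A minimal sketch of that idea, using complex numbers, is given below. It assumes both coordinate systems have the same handedness (for example, image rows already flipped so that y grows upwards); the function names and sample coordinates are hypothetical and are not taken from any georeferencing software.

```python
def fit_two_point(img1, img2, map1, map2):
    """Return (a, b) such that map = a * image + b in complex form."""
    z1, z2 = complex(*img1), complex(*img2)
    w1, w2 = complex(*map1), complex(*map2)
    if z1 == z2:
        raise ValueError("the two reference points must differ")
    a = (w2 - w1) / (z2 - z1)  # encodes scale and rotation
    b = w1 - a * z1            # encodes translation
    return a, b


def apply_similarity(a, b, point):
    """Transform an image point (x, y) into map coordinates."""
    w = a * complex(*point) + b
    return w.real, w.imag


# Hypothetical reference points: image (0, 0) -> map (100, 200),
# image (10, 0) -> map (100, 220).
a, b = fit_two_point((0, 0), (10, 0), (100, 200), (100, 220))
print(abs(a))                            # scale factor between the systems
print(apply_similarity(a, b, (0, 10)))   # a third image point in map coordinates
```

With three or more reference points a full affine transformation (including skew) can be fitted instead.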

Page 15: Information Technology in Plant Protection

Graphic Georeferencing - Rubber sheeting

Page 16: Information Technology in Plant Protection

• Projection, datum
• Geoid, geoid undulation
• Uniform National Projection (UNP - EOV)
• Transformation
• Base points, base point systems

Projection Systems, Conversion

Page 17: Information Technology in Plant Protection

• Based on the shape of the image surface
  – Cylindrical projection
  – Conic projection
  – Planar (azimuthal) projection
  – Other projection
• Based on the axis of the image surface
  – Polar (normal)
  – Transverse (equatorial)
  – Oblique (neither normal nor transverse)
• Based on the contact of the image surface and the base surface
  – Tangent
  – Secant

Classification of Projection

Page 18: Information Technology in Plant Protection

• Systems without projection
• Dual projection Hungarian systems
• Stereographic projection systems (BUDAPESTI, MAROSVÁSÁRHELYI)
• Oblique Mercator Projection
• HÉR, HKR, HDR
• EOV
• Gauss-Krüger
• UTM (Universal Transverse Mercator)
• GEOREF (World Geographic Reference System)

Important Projection Systems

Page 19: Information Technology in Plant Protection

• Reference ellipsoids approximate an area of the Earth's surface
• The centre of the ellipsoid is that of the Earth
• The axis of rotation is that of the Earth
  – Parameters
    • Major axis (equatorial radius)
    • Flattening (relation between the equatorial and polar radius)
• If the centre of the ellipsoid is moved until it fits the examined area with the least error, we get the geodetic datum.
  – Bessel (stereographic)
  – Krassovsky (Gauss-Krüger)
  – Hayford (UTM)
  – WGS-84 (GPS)
  – IUGG-67 (EOV)

Important Ellipsoids
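
The two ellipsoid parameters are linked by the flattening f = (a - b) / a, where a is the equatorial and b the polar radius. A short sketch with the published WGS-84 constants (a = 6378137 m, 1/f = 298.257223563) illustrates the relation; the function names are illustrative only.

```python
WGS84_A = 6378137.0            # equatorial radius in metres
WGS84_INV_F = 298.257223563    # inverse flattening


def polar_radius(a, inv_f):
    """Polar radius b from the equatorial radius and inverse flattening."""
    return a * (1.0 - 1.0 / inv_f)


def flattening(a, b):
    """Flattening f = (a - b) / a."""
    return (a - b) / a


b = polar_radius(WGS84_A, WGS84_INV_F)
print(round(b, 3))                    # polar radius in metres
print(round(WGS84_A - b, 3))          # difference between the two radii
```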

Page 20: Information Technology in Plant Protection

• Geographic Projection
  – WGS 1984 Datum
• Orthographic Projection
  – SPHERE Datum
• Eckert IV Projection
  – WGS 1984 Datum

Some Interesting Projections

Page 21: Information Technology in Plant Protection

• GPS measurement gives the height above the ellipsoid (h). When calculating the height above sea level (H), geoid undulation has to be taken into consideration.
• Geoid undulation is the separation between the equipotential surface that represents the mean ocean surface and a reference ellipsoid (h = H + N, where N is the geoid undulation at the point).
• Geoid: the surface of the oceans and seas, if connected by small canals under the land (Listing, 1873)

Geoid Undulation
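
The relation h = H + N can be rearranged to obtain the height above sea level from a GPS height. A minimal sketch follows; the numeric values are hypothetical and only illustrate the arithmetic.

```python
def height_above_sea_level(h_ellipsoidal, n_undulation):
    """H = h - N, from h = H + N (all values in metres)."""
    return h_ellipsoidal - n_undulation


# Hypothetical example: GPS height 152.30 m, geoid undulation 44.10 m.
H = height_above_sea_level(152.30, 44.10)
print(round(H, 2))
```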

Page 22: Information Technology in Plant Protection

• The origin of the coordinates has been shifted 200 km to the south and 650 km to the west. Thus the X coordinates (northing) are always lower than 400 km and the Y coordinates (easting) are always higher than 400 km, which makes them easy to distinguish.

Uniform National Projection
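
The 400 km rule makes it possible to tell the two EOV coordinates apart even when their order is unknown. The sketch below assumes the usual EOV convention in which X is the northing (below 400 000 m) and Y the easting (above 400 000 m), as in the survey sample later in this deck (X = 142686.277, Y = 505893.164). The function name is hypothetical.

```python
EOV_LIMIT_M = 400_000.0


def split_eov_pair(c1, c2):
    """Return (x_northing, y_easting) from two EOV values given in any order."""
    low, high = sorted((c1, c2))
    if not (0 <= low < EOV_LIMIT_M < high):
        raise ValueError("values do not look like an EOV pair in metres")
    return low, high


# Coordinates of point 25001 from the survey sample, deliberately swapped.
print(split_eov_pair(505893.164, 142686.277))  # (142686.277, 505893.164)
```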

Page 23: Information Technology in Plant Protection

• The first levelling of Hungary was carried out relative to the Mediterranean base level from 1873 to 1913.
  – Height of the Nadap main base point: 173.8385 m.
• Baltic base level after World War II.
  – Height of the Nadap main base point: 173.1638 m, which is 0.6747 m lower.

Uniform National Elevation Network (EOMA)

Page 24: Information Technology in Plant Protection

• ETRS89 (OGPSH) points are transformed into the Uniform National Projection (EOV) system and back
• The points for the transformation are chosen automatically
• Local transformation based on the common points of the OGPSH and EOV systems
• With 8 common points in Hungary
• With refined geoid undulation data

Etrs89-Eov-Hivatalos-Helyi-Térbeli-Transzformáció (official local spatial transformation between ETRS89 and EOV)

Transformation

Page 25: Information Technology in Plant Protection

• Database of Altitudinal Base Points
• Database of Horizontal Base Points
• Database of OGPS Base Points
• Points of the National GPS Network (Országos GPS Hálózat, OGPSH)

Base Points

Page 26: Information Technology in Plant Protection

• Video
  – Georeferencing (graphical)
• Animation
  – Georeferencing
  – Geoid undulation
  – Shape (create)

Videos and Animations for Chapter 1.

Page 27: Information Technology in Plant Protection

I. question
Identify the value of geoid undulation at the Parliament Building, Budapest, Hungary with the help of EHT (or any other) software.

II. question
Digitize any map sheet with the help of a scanner. Georeference it with 3 reference points with the help of GEOREGARCVIEW software. The necessary coordinates can be obtained from map servers (e.g. Google Maps).

III. question
Digitize another map sheet overlapping the previous one with the help of a scanner. Georeference it with 3 reference points with the help of GEOREGARCVIEW software. Open it together with the georeferenced file of the previous task in ArcExplorer JEE (or any other viewer) and check its accuracy. The necessary coordinates can be obtained from map servers (e.g. Google Maps).

Tasks for Chapter 1.

Page 28: Information Technology in Plant Protection

• Global Positioning
  – The coordinates of 3 satellites at a given time are needed.
  – If time can be measured accurately, the signal propagation speed and the travel time give our distance from the satellite.
  – With 1 satellite, the possible positions form the surface of a sphere.

GNSS Device System

Page 29: Information Technology in Plant Protection

• If there is a connection with 2 satellites, we are on the spheres of both satellites. The intersection of the two spheres is a circle.
• The intersection of the third satellite's sphere and this circle is two points, one of which can always be excluded (e.g. points far from the Earth's surface).

Global Positioning II.
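
The geometric argument of these two slides (sphere, circle, two points) can be sketched in code. The example below assumes exact ranges and ignores receiver clock error and all other error sources; the satellite positions, the target point and the function names are hypothetical. It returns both intersection points of the three spheres, and the caller then excludes the implausible one.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def range_from_travel_time(seconds):
    """Distance to the satellite from the signal travel time."""
    return SPEED_OF_LIGHT * seconds


def _sub(a, b):
    return tuple(p - q for p, q in zip(a, b))


def _add(a, b):
    return tuple(p + q for p, q in zip(a, b))


def _scale(a, k):
    return tuple(p * k for p in a)


def _dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def trilaterate(p1, r1, p2, r2, p3, r3):
    """Return the two points lying on all three spheres (centre p, radius r)."""
    d_vec = _sub(p2, p1)
    d = math.sqrt(_dot(d_vec, d_vec))
    ex = _scale(d_vec, 1.0 / d)                 # unit vector from p1 to p2
    v3 = _sub(p3, p1)
    i = _dot(ex, v3)
    ey_raw = _sub(v3, _scale(ex, i))
    ey = _scale(ey_raw, 1.0 / math.sqrt(_dot(ey_raw, ey_raw)))
    ez = _cross(ex, ey)                         # normal of the satellite plane
    j = _dot(ey, v3)
    x = (r1 ** 2 - r2 ** 2 + d ** 2) / (2 * d)
    y = (r1 ** 2 - r3 ** 2 + i ** 2 + j ** 2) / (2 * j) - (i / j) * x
    z_sq = r1 ** 2 - x ** 2 - y ** 2
    if z_sq < 0:
        raise ValueError("the three spheres do not intersect")
    z = math.sqrt(z_sq)
    base = _add(p1, _add(_scale(ex, x), _scale(ey, y)))
    return _add(base, _scale(ez, z)), _add(base, _scale(ez, -z))


# Hypothetical geometry in arbitrary units; the true position is (3, 4, 5).
sats = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0)]
target = (3.0, 4.0, 5.0)
ranges = [math.dist(s, target) for s in sats]
sol_a, sol_b = trilaterate(sats[0], ranges[0], sats[1], ranges[1], sats[2], ranges[2])
print(sol_a, sol_b)
```

The two solutions are mirror images of each other with respect to the plane through the three satellites, which is why one of them usually lies far from the Earth's surface.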

Page 30: Information Technology in Plant Protection

Differential Correction

Page 31: Information Technology in Plant Protection

• GNSSNet
• NtripCaster IP address, port: 84.206.45.44:2101

Network RTK in Hungary (2010)

Page 32: Information Technology in Plant Protection

• Geotrade GNSS
  – Host: www.geotradegnss.hu
  – Port: 2101

Multi-Base System in Hungary (2010)

Page 33: Information Technology in Plant Protection

• Georgikon RTK coverage
• DGPS for the whole country of Hungary
  – http://gnss.georgikon.hu
  – 193.224.81.88:2101

Single-Base System (2010)( 2009

Page 34: Information Technology in Plant Protection

Trimble European VRS System

Page 35: Information Technology in Plant Protection

• CSD (Circuit Switched Data)
  – Circuit-switched mobile internet - 9.6 kbit/s - 1G
• GPRS (General Packet Radio Service)
  – Packet-switched - 115 kbit/s - 2G
• EDGE (Enhanced Data Rates for GSM Evolution)
  – GPRS enhancement - 236 kbit/s (112-400) - 2.5G
• 3G
  – 3G mobile network, video call - 384 kbit/s - 3G
• HSPA (High-Speed Downlink/Uplink Packet Access)
  – HSDPA theoretical data transfer speed depending on device and coverage: up to 21 Mbit/s - 3.5G
• 4G LTE (Long Term Evolution)
  – 1 Gbit/s - 4G

Mobile Internet

Page 36: Information Technology in Plant Protection

• Video
  – Trimble VRS system
• Animation
  – GNSSNet service
  – Geotrade GNSS
  – Georgikon GNSS Base

Videos and Animations for Chapter 2.

Page 37: Information Technology in Plant Protection

I. question
Find the data of the accessible satellites of the Galileo and BeiDou systems at a given time.

II. question
Find the ground control stations of the Navstar GPS system at a given time.

III. question
Find the worst measurement site on the Earth's surface with regard to the state of the ionosphere at a current time. Use the 'space weather forecast' of Australia (or any other information source): http://www.ips.gov.au/Space_Weather

Tasks for Chapter 2.

Page 38: Information Technology in Plant Protection

• GNSS Measurement
  – Planning (almanac)
  – Execution (with online correction, processing too)
  – Data transfer (exchange formats, RINEX - Receiver Independent Exchange Format)
  – Processing (vectors, transformation, error correction)
  – Network adjustment (OGPSH – National GPS Network)

Field GNSS Measurement and Processing

Page 39: Information Technology in Plant Protection

• Guarantee of integrity
  – GNSS
  – Way of correction
• Guarantee of the needed accuracy
  – Accuracy of the rover device
  – Way of correction
  – Satellite constellation
  – Minimisation of other disturbing factors

Aim of GNSS Measurement Planning

Page 40: Information Technology in Plant Protection

• GNSS satellite data
  – Almanac
    • Trimble Planning
    • Leica Satellite Availability
    • Topcon Occupation Planning
• Receiving correction data
  – Mobile internet
    • GPRS coverage
    • Style, devices, realization

Devices for Planning

Page 41: Information Technology in Plant Protection

• Timing
  – Forward in time
  – Back in time
• General
  – YUMA format, USA Coast Guard Navigation Center
  – The connection between date and GPS week in the GPS calendar
• Trimble
• Leica
• Topcon

Almanac
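
The connection between a calendar date and the GPS week can be sketched with a few lines of date arithmetic. The sketch assumes the GPS epoch of 6 January 1980, counts weeks continuously (no 1024-week rollover) and ignores leap seconds, which do not matter at day resolution. The function name is illustrative.

```python
from datetime import date

GPS_EPOCH = date(1980, 1, 6)  # start of GPS week 0, a Sunday


def gps_week_and_day(d):
    """Return (GPS week number, day of week) with day 0 = Sunday."""
    days = (d - GPS_EPOCH).days
    if days < 0:
        raise ValueError("date is before the GPS epoch")
    return divmod(days, 7)


# Date of the sample survey described later in this deck.
print(gps_week_and_day(date(2008, 12, 21)))  # (1511, 0)
```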

Page 42: Information Technology in Plant Protection

Trimble Planning

Page 43: Information Technology in Plant Protection

• Relative
  – Real time
  – Radio
  – Satellite
  – Internet
• Post-processed
  – Digital data transfer

Channels of Correction Data

Page 44: Information Technology in Plant Protection

• Connection to satellites, controller
• Connection to correction service
• Setting measurement style
• Starting measurement
• Recording data

Realisation of Measurement

Page 45: Information Technology in Plant Protection

• Obtain, check and convert existing spatial data
• Set up a measurement plan
  – Need for accuracy
  – Available devices and services
  – Specialities of the area
  – Select measurement method
• Places of measurement
• Conversion to the format of the field device
• Upload data to the field device

Preparation of Measurement

Page 46: Information Technology in Plant Protection

• Check measurement data
• Inspection
• Delete, edit
• New recording
• Data
• Export in needed formats
• Turn off the field device

End of Measurement

Page 47: Information Technology in Plant Protection

• Load data from the field device
  – Formats
  – Give coordinate system and datum
  – Examine data load mistakes
  – Inspection
  – Delete, edit
• Export to the format used for processing

Processing Data

Page 48: Information Technology in Plant Protection

• Upload data to GIS system
  – Conversions
  – Analyses
  – Interpolations
  – Model building
  – Simulation
  – Statistical analysis
  – Publication
• Online correction
  – Processing
• Offline correction
  – Time of measurement
  – Obtain correction data
  – Correction
  – Check

GIS Processing and Analysis of Data

Page 49: Information Technology in Plant Protection

• EEHHTT software
  – Data input
    • From file
    • Via keyboard
  – Set format of data input
  – Set data conversion direction
  – Give coordinates

Checking Transformation

Page 50: Information Technology in Plant Protection

• Data collector
  – Navigation accuracy
    • ArcPad / palmtop with GPS antenna
  – GPS accuracy
    • GPS Pathfinder Office / Trimble GeoXH
  – Geodetic accuracy
    • Trimble Survey Controller / Trimble 5800
• Data processing
  – GPS Analyst
  – GPS Pathfinder Office
  – Trimble Geomatics Office
  – ArcGIS

Typical Field Device System

Page 51: Information Technology in Plant Protection

Sample
• Aim of survey: automatic data collection for 3D relief model
• Place of survey: the island of Kányavári, Hungary
• Time of survey: 21 December 2008, 09:20-15:30
• Type of survey: RTK; format of message transfer: CMR+
• PDOP mask: 6, elevation cutoff: 10 degrees, antenna: Trimble 5800, hant: 2 m
• Coordinate system: Hungary, zone Hungarian EOV
• Project datum: HD72 (Hungary)
• Vertical datum: geoid model EGM96 (Global)
• Coordinate units: meters; distance units: meters; height units: meters

Name of point   DeltaX        DeltaY       DeltaZ        Slope Distance   RMS
25001           13189.539 m   1880.080 m   11396.001 m   17531.898 m      0.002 m

Name of point   X            Y            H
25001           142686.277   505893.164   109.042

Description of Continuous Topographic GPS Survey
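
The slope distance in the sample is the length of the baseline vector (DeltaX, DeltaY, DeltaZ). A short check of that relationship, using the values of point 25001 from the slide, might look like this; the function name is illustrative.

```python
import math


def slope_distance(dx, dy, dz):
    """Length of the 3D baseline vector in the units of its components."""
    return math.sqrt(dx * dx + dy * dy + dz * dz)


# Baseline components of point 25001 from the sample, in metres.
distance = slope_distance(13189.539, 1880.080, 11396.001)
print(round(distance, 2))
```

The result agrees with the slope distance listed on the slide (17531.898 m) to within a few millimetres.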

Page 52: Information Technology in Plant Protection

• Sampling
• Yield mapping
• Sensors
• Auto pilot system
• Mass flow or sprayer control
• Row control
• Seeder control

Basic GPS Elements of Precision Farming

Page 53: Information Technology in Plant Protection

• 1. GPS survey of field blocks, soil sampling plan
• 2. Take soil samples according to the plan every 3-5 acres
• 3. Soil examination (extended and holistic)
• 4. Make nutrient content maps
• 5. Information, services for professional advice, analyses
• 6. Agrochemical service
• 7. Differentiated fertiliser plan
• 8. Differentiated nutrient output, plant number plan
• 9. Seeding with base station
• 10. Precision herbicide plan (based on Hu, KA, pH map and weed uptake)
• 11. Fertiliser quantity, upload into professional advice system
• 12. Download data from the Internet

Precision Management System (IKR)

Page 54: Information Technology in Plant Protection

• IKR

Precision Management System

Page 55: Information Technology in Plant Protection

• Spreadsheet

Evaluation of Tillage Experiments

Page 56: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

• Spreadsheet

• GIS software (weed density)

• GIS software (weed density)

Evaluation of Tillage Experiments II.

56

Page 57: Information Technology in Plant Protection

3D Model

Page 58: Information Technology in Plant Protection

• Video
  – GNSSNet OGPSH
• Animation

Videos and Animations for Chapter 3.

Page 59: Information Technology in Plant Protection

I. question
Create a forecast for tomorrow 12:00 and 12:15, above a 10 degree elevation cutoff, for the area of the Helikon strand, Keszthely, Hungary (latitude φ = 46 degrees 45 minutes, longitude λ = 17 degrees 15 minutes, h = 150 m).
GDOP =
PDOP =
HDOP =
VDOP =
TDOP =
Number of GPS satellites =
Number of Glonass satellites =
Number of Galileo satellites =
Number of Compass satellites =

II. question
In the IKR precision management system, which service(s) can use GNSS base correction data?

III. question
Is soil sampling in the IKR precision management system carried out with a yield map or a grid?

Tasks for Chapter 3.

Page 60: Information Technology in Plant Protection

• Remote Sensing
  – With the help of remote sensing, objects can be examined that are not in direct contact with the sensor.
  – In a narrow sense, the concept of remote sensing is usually used for aerial and space images. In a wider sense, it also covers e.g. remote measurements or medical applications.
  – Remote sensing is the acquisition of information about an object or phenomenon without making physical contact with the object. In modern usage, the term generally refers to the use of aerial sensor technologies to detect and classify objects on Earth (both on the surface, and in the atmosphere and oceans) by means of propagated signals (e.g. electromagnetic radiation emitted from aircraft or satellites).

Remote Sensing Device System, 3D Modelling

Page 61: Information Technology in Plant Protection

• The measurement does not influence the examined object or change its state.
• It can be used at wavelengths outside the visible range. The result can be examined in the visible spectrum.
• Objective, exact data can be obtained.
• Spatial, multi-dimensional data can be obtained.
• Lots of data can be obtained from large areas in a short time.
• Areas that cannot be reached or examined with other methods can be examined.

Characteristics of Remote Sensing

Page 62: Information Technology in Plant Protection

• Active sensors
  – sense the reflection of their own radiation
• Passive sensors
  – have no emission
• One or more wavelength ranges
• Images with more than one band are called (depending on the number of bands) multispectral or hyperspectral.

Classification of Sensors

Page 63: Information Technology in Plant Protection

• Geometric
  – pixel: the area on the Earth's surface covered by one point of the image, its real extent.
• Spectral
  – the value of radiation from the object
• Radiometric
  – characterises the colour depth of the pixels
• Temporal
  – the time interval between the images

Information from Sensors

Page 64: Information Technology in Plant Protection

• Wavelength, frequency
  – Visible light (0.4 - 0.7 µm)
  – Infrared (above 0.7 µm)
  – Ultraviolet (below 0.4 µm)

Electromagnetic Spectrum

Page 65: Information Technology in Plant Protection

• Scatter - multipath scattering
• Absorption
  – Influencing factors
    • Travelled distance
    • Radiation energy
    • Composition of the atmosphere
    • Size of particles
    • Wavelength

Atmospheric Effects

Page 66: Information Technology in Plant Protection

• Chlorophyll absorbs the energy of wavelengths between 0.45 and 0.67 µm, mostly blue and red, so a healthy plant looks green.
• In an unhealthy plant, the decrease in chlorophyll increases red reflection, which together with the green makes the plant look yellow.
• Reflection in the range between 0.7 and 1.3 µm depends strongly on leaf structure (species-specific), and increases dramatically.
• Effect of stratification, water absorption bands above 1.3 µm.
• Above 1.3 µm, reflection is inversely proportional to the total water content of the leaf.

Visible and Infrared Range

Page 67: Information Technology in Plant Protection

• The reflectance curves of plant species are identifiable.
• Image correction (atmospheric distortion)
• Sample points
• Spectrum

Visible and Infrared Range II.

Page 68: Information Technology in Plant Protection

• TM 1: 0.45 – 0.52 µm (blue), 30 m
• TM 2: 0.52 – 0.60 µm (green), 30 m
• TM 3: 0.63 – 0.69 µm (red), 30 m
• TM 4: 0.76 – 0.90 µm (near infrared), 30 m
• TM 5: 1.55 – 1.75 µm (medium infrared), 30 m
• TM 6: 10.42 – 12.50 µm (thermal infrared), 120 m
• TM 7: 2.08 – 2.35 µm (middle infrared), 30 m

Spectral Bands and Resolution of Landsat TM
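
The band table lends itself to a simple lookup, for example to find which Landsat TM band records a given wavelength. The sketch below copies the band limits and resolutions from the slide; the data structure and function name are illustrative assumptions.

```python
# (band, lower limit in µm, upper limit in µm, description, resolution in m)
TM_BANDS = [
    ("TM 1", 0.45, 0.52, "blue", 30),
    ("TM 2", 0.52, 0.60, "green", 30),
    ("TM 3", 0.63, 0.69, "red", 30),
    ("TM 4", 0.76, 0.90, "near infrared", 30),
    ("TM 5", 1.55, 1.75, "medium infrared", 30),
    ("TM 6", 10.42, 12.50, "thermal infrared", 120),
    ("TM 7", 2.08, 2.35, "middle infrared", 30),
]


def bands_for_wavelength(wavelength_um):
    """Return the names of all TM bands whose range contains the wavelength."""
    return [name for name, low, high, _, _ in TM_BANDS
            if low <= wavelength_um <= high]


print(bands_for_wavelength(0.65))   # red light
print(bands_for_wavelength(0.80))   # near infrared
print(bands_for_wavelength(1.00))   # falls between two bands
```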

Page 69: Information Technology in Plant Protection

• ASPRS (ASPRS satellite database)

Planned Objects of Satellite Sensing

Page 70: Information Technology in Plant Protection

• 2002: DLR DAIS, 79-band system
• 2006: an aerial data collection service using the AISA DUAL hyperspectral camera was launched by the University of Debrecen (Hungary) and the Ministry of Rural Development.
  – Senses in a maximum of 498 bands, at wavelengths of 0.45–2.45 micrometres.

Hyperspectral Imaging in Hungary

Page 71: Information Technology in Plant Protection

• National Aeronautics and Space Administration (NASA) and U.S. Geological Survey (USGS) (1999)
• Images in 7 bands (6 bands at 30 m, thermal infrared at 60 m ground resolution)
• Sun-synchronous orbit (the satellite passes over a given site at the same local time)
• Orbits at an altitude of 705 km
• Can take images of an area of 185x170 km every 16 days

LANDSAT 5 TM

Page 72: Information Technology in Plant Protection

• TM 1 0.45 – 0.52 µm: differentiation of land from plants, mapping of artificial surfaces.
• TM 2 0.52 – 0.60 µm: mapping plant cover, identification of artificial surfaces.
• TM 3 0.63 – 0.69 µm: differentiation of planted surfaces from plantless surfaces, identification of artificial surfaces.
• TM 4 0.76 – 0.90 µm: identification of plant species, definition of green mass, survey of plant vitality, mapping water surfaces, mapping soil water content.
• TM 5 1.55 – 1.75 µm: examination of soil and plant water content, differentiation of cloud from snow cover.
• TM 6 10.42 – 12.50 µm: mapping heat emission (plant stress, heat pollution).
• TM 7 2.08 – 2.35 µm: differentiation between rock types, mapping plant water content.

Application of Landsat Images

Page 73: Information Technology in Plant Protection

• Imaging: central perspective
• Photogrammetry: determines the extent of real objects from the sizes measured on the image
  – The resulting orthophoto (georeferenced image data of the Earth's surface obtained by satellite or aerial data collectors) can be used comprehensively with GPS systems
  – During the planning and realisation of imaging, a GPS device system and adequate relief data are needed.

Orthophoto

Page 74: Information Technology in Plant Protection

• Photogrammetric evaluation is based on stereoscopy, with perspective mapping between aerial and space images taken using central projection.
  – The essence of stereoscopy is that given terrain objects are mapped in different ways in images from different sources. The task of photogrammetry is to measure the difference between parallaxes and calculate spatial coordinates.

Photogrammetry

Page 75: Information Technology in Plant Protection

• Differentiation of types of vegetation
• Cover and yield
• Calculation
• Productivity of biomass
• Vitality and disease of flora
• State of soil
  – IMG files
    • View
    • Select bands
    • Colour bands
  – Erdas ViewFinder 2.1
  – http://rst.gsfc.nasa.gov/Front/overview.html
  – FÖMI tutorial (tutorial of the Institute of Geodesy, Cartography and Remote Sensing, Hungary)

Remote Sensing Data in Agriculture

Page 76: Information Technology in Plant Protection

• Model of objects
• Relief model
• Terrain model
• Elevation model
  – A digital elevation model (DEM) is a topographic representation of the Earth's surface. It is usually used for relief maps, 3D visualisation, water flow modelling, and aerial image correction. It is built from remote sensing data or traditional land surveying data.
  – Raster based elevation model
  – Vector based elevation model

Application of 3D Models

Page 77: Information Technology in Plant Protection

• Raster model: source elevation data form regular grid cells. The cell size is constant within the model. The height of the relevant geographic area can be considered constant within a grid cell.
• Vector model: divides space into non-overlapping triangles.
  – The vertices of every triangle are data points with x, y, z values.
  – The points are connected with lines, which gives Delaunay triangles.
  – A TIN (Triangulated Irregular Network) is a complete graph that keeps the topological connections between its elements (nodes, edges and triangles).
  – Input data fit directly into the model.

Raster and Vector Models
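
The difference between the two models shows up when a height is queried at an arbitrary position. In the sketch below, the raster lookup returns the constant value of the containing cell, while the TIN lookup interpolates linearly inside one triangle using barycentric weights. The grid layout (rows ordered from south to north, origin at the lower-left corner), the sample values and the function names are assumptions made for illustration.

```python
def raster_height(grid, x0, y0, cell, x, y):
    """Height of the grid cell containing (x, y); rows run south to north."""
    col = int((x - x0) // cell)
    row = int((y - y0) // cell)
    if not (0 <= row < len(grid) and 0 <= col < len(grid[0])):
        raise ValueError("point lies outside the grid")
    return grid[row][col]


def tin_height(triangle, x, y):
    """Linear interpolation of z inside one TIN triangle."""
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = triangle
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    if min(w1, w2, w3) < 0:
        raise ValueError("point lies outside the triangle")
    return w1 * z1 + w2 * z2 + w3 * z3


# Hypothetical 2 x 2 grid with 10 m cells and one hypothetical triangle.
grid = [[100.0, 101.0],
        [102.0, 103.0]]
triangle = [(0.0, 0.0, 100.0), (10.0, 0.0, 110.0), (0.0, 10.0, 120.0)]

print(raster_height(grid, 0.0, 0.0, 10.0, 15.0, 5.0))
print(tin_height(triangle, 2.0, 3.0))
```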

Page 78: Information Technology in Plant Protection

• SRTM (Shuttle Radar Topography Mission, 2000) program
  – Digital relief of about 80% of the Earth's surface, with the help of a radar system (Endeavour, 11 days)
  – Radar interferometry, with two receivers 60 m from one another
  – Mapped area: from 60 degrees north to 57 degrees south
  – Resolution: 3 arcsec (USA: 1 arcsec)

Global Relief Model
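
The angular resolution can be turned into an approximate ground distance. The sketch below treats the Earth as a sphere with the WGS-84 equatorial radius (an assumption made for simplicity) and returns the east-west length of an angle at a given latitude; the function name is illustrative.

```python
import math

EARTH_RADIUS_M = 6378137.0  # assumed: WGS-84 equatorial radius, spherical model


def arcsec_to_metres(arcsec, latitude_deg=0.0):
    """Approximate east-west ground length of an angle at the given latitude."""
    angle_rad = math.radians(arcsec / 3600.0)
    return angle_rad * EARTH_RADIUS_M * math.cos(math.radians(latitude_deg))


print(round(arcsec_to_metres(3.0), 1))        # 3 arcsec at the equator
print(round(arcsec_to_metres(1.0), 1))        # 1 arcsec at the equator
print(round(arcsec_to_metres(3.0, 47.0), 1))  # 3 arcsec at 47 degrees latitude
```

So a 3 arcsec cell is roughly 90 m wide at the equator and becomes narrower in the east-west direction towards the poles.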

Page 79: Information Technology in Plant Protection

• TanDEM-X 2010 (TerraSAR-X)
  – Mapping of the whole surface of the Earth
  – Horizontal resolution: 12 m, vertical resolution: 2 m.
  – Two-radar remote sensing satellite with stereo microwave radar device, at an altitude of 514 km
  – Polar Sun-synchronous orbit
  – Radio waves emitted from a satellite using the synthetic-aperture radar (SAR) technique and then reflected from the surface are received with the antenna on the satellite, or the same surface is imaged from two different points.

Global Relief Model II.

Page 80: Information Technology in Plant Protection

• The digital relief model of Hungary, 5 m resolution
  – The 1:10 000 scale EOTR database was used
  – A GRID derived from vectorized contour lines.

3D Relief Model

Page 81: Information Technology in Plant Protection

• Generated from several sources
  – Contour line digitization
  – Digitization of elevation points
  – Import GPS survey points
  – Correction (aerial photo)
  – Model generation
  – Publication
• Generation from direct GNSS measurement

3D Relief Map

Page 82: Information Technology in Plant Protection

• Video
• Animation
  – Elevation Model

Videos and Animations for Chapter 4.

Page 83: Information Technology in Plant Protection

I. question
Find an aerial image of the place where you live from internet sources.

II. question
Find a space image of the place where you live from internet sources.

III. question
Measure the area of Kányavári Island (Kányavári-sziget), Hungary on the photos of 1990, 1992 and 2002. Use Erdas ViewFinder (or any other IMG viewer). The images can be found on the remote sensing tutorial website of FÖMI: http://www.fomi.hu/taverzekeles_oktatoanyag

Tasks for Chapter 4.

Page 84: Information Technology in Plant Protection

• Types of Mapservers
  – Static webmaps
  – Dynamically created webmaps
  – Animated webmaps
  – Personalized webmaps
  – Open, reusable webmaps
  – Interactive webmaps
  – Webmaps suitable for analysis
  – Collaborative webmaps

Spatial Data Databases

Page 85: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

• Static webmaps
  – No animation or interactivity
  – Created only once, infrequently updated
  – Mostly scanned paper-based maps
• Dynamically created webmaps
  – Created on demand, often from dynamic data sources
  – Created by a server (ArcIMS, ArcSDE)
  – WMS protocol
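As a rough illustration of the WMS protocol mentioned above: a dynamically created webmap is usually requested with a GetMap call whose parameters tell the server which layers, area and image size to render. The sketch below only assembles such a request URL and does not contact any server. The server address and the layer name are hypothetical placeholders, not services from this presentation.

```python
from urllib.parse import urlencode


def build_getmap_url(base_url, layers, bbox, width, height,
                     crs="EPSG:4326", image_format="image/png"):
    """Assemble a WMS 1.3.0 GetMap request URL (no network access)."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),
        "STYLES": "",  # empty value asks for the server's default style
        "CRS": crs,
        # In WMS 1.3.0, EPSG:4326 uses latitude, longitude axis order.
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return base_url + "?" + urlencode(params)


# Hypothetical server and layer name, for illustration only.
url = build_getmap_url("https://example.org/wms", ["soil_map"],
                       (45.7, 16.1, 48.6, 22.9), 800, 400)
print(url)
```

Because the map image is rendered for each request, the same URL pattern can serve any extent or layer combination, which is what distinguishes this type from a static webmap.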

Types of Webmaps II.

85

Page 86: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

• Animated webmaps
  – Show changes in the map over time (water currents, wind patterns, traffic info)
  – Real time, data from sensors
  – Updated regularly or on demand
• Personalized webmaps
  – Allow the user to apply their own data filtering and selective content
  – Personal styling and symbolization
  – OGC SLD (Styled Layer Descriptor) as a uniform styling system for WMS

Types of Webmaps III.

86

Page 87: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

• Open, reusable webmaps
  – Complex systems with an open API (Google Maps, Yahoo Maps, Bing Maps)
  – APIs compatible with the standards of the Open Geospatial Consortium and the W3C
• Interactive webmaps
  – Changeable parameters
  – Easy navigation
  – Events, descriptions, DOM manipulations

Types of Webmaps IV.

87

Page 88: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

• Analytic webmaps
  – Offer GIS analysis
    • Geodata uploaded by the user
    • Geodata provided by the server
  – The analysis is carried out by a server-side GIS; the results are displayed by the client.
• Collaborative webmaps
  – Geometric features being edited by one person cannot be changed by anyone else at the same time.
  – A quality check is needed before publication (OpenStreetMap, Google Earth, Wikimapia…).

Types of Webmaps V

88

Page 89: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

• ‘Institute of Geodesy, Cartography and Remote Sensing’, Hungary

• Important databases of the Földmérési és Távérzékelési Intézet (Institute of Geodesy, Cartography and Remote Sensing)

‘FÖMI’

89

Page 90: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

• To continuously inform farmers and experts, to provide professional background knowledge for tenders and developments.

• Its knowledge base is based on professional news, events, articles, studies, publications-published in an organised, updated system.

• A further aim of the site is to prepare for online data services (logbooks, electronic submission of data by farmers working in vulnerable areas), to provide information on data related to agri-environmental management, to publish relevant thematic maps and to provide agricultural forecasts.

Hungarian National Rural Network (AIR)

90

Page 91: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

• 1:200 000 scale genetic soil map of Hungary

• 40 soil types, 80 subtypes, shown with colours and colour shades

• Physical soil types (9 categories), shown with hatching

• Soil-forming rock (28 categories), shown with letter codes

AIR Public Map Library

91

Page 92: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

• Obtain, process and store weather data
• Apply weather data in the agrometeorological model of the Crop Growth Monitoring System (CGMS)
• Process NOAA-AVHRR and SPOT-VEGETATION satellite images using CORINE Land Cover (CLC) data
• Joint Research Centre
  – Statistical analysis of data
  – Quantity forecast
  – Short-term crop yield forecast

MARS (Monitoring Agriculture by Remote Sensing) crop yield forecasting system

92

Page 93: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

• Monitoring Agriculture by Remote Sensing

MARS

93

Page 94: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

‘FÖMI NÖVMON’ (Plant Monitoring)

94

Page 95: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

IKR Precision Map Server

95

Page 96: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Soil Data Publication (Georgikon Mapserver, Hun)

96

Page 97: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

• (Infrastructure for Spatial Information in the European Community-INSPIRE)

• ‘The INSPIRE Geoportal provide the means to search for spatial data sets and spatial data services, and subject to access restrictions, view and download spatial data sets from the EU Member States within the framework of the Infrastructure for Spatial Information in the European Community (INSPIRE) Directive.

• Aims at making available relevant, harmonised and quality geographic information to support formulation, implementation, monitoring and evaluation of policies and activities which have a direct impact on the environment.’

• (www.inspire-geoportal.eu)

INSPIRE Geoportal

97

Page 98: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

• Inspire should be based on the infrastructures for spatial information that are created by the Member States and that are made compatible with common implementing rules and are supplemented with measures at Community level. These measures should ensure that the infrastructures for spatial information created by the Member States are compatible and usable in a Community and transboundary context.

Spatial Data Directive

98

Page 99: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

• Member States shall ensure that metadata are created for the spatial data sets and services corresponding to the themes listed in Annexes I, II and III, and that those metadata are kept up to date.

Inspire2008 metadata

99

Page 100: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

• Online access to a collection of geographic data and services
• Does not store or maintain data
• Metadata and catalogues can be accessed with several search options
• With the help of a map server service, maps and metadata can be searched for and browsed.
• Personal maps can be created from existing data sources.

INSPIRE Geoportal

100

Page 101: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

• INSPIRE Geoportal

INSPIRE Geoportal Viewer

101

Page 102: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

• ArcExplorer JEE: Corine Land Cover mashup map from several sources
• http://vektor.georgikon.hu kvsz
• http://geo.kvvm.hu clc (80% transparency)

Mashup map: a map that includes another map (through an API), made from several internet sources.

Mashup Mapserver Service

102

Page 103: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

WebMap and publication

103

Website

MapServer

Picture

Video

Web service

HTML

Page 104: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

• Steps of realization:
  – 1. Choose a topic
  – 2. Create a map, upload data
    • a. Create a web album, upload photos
    • b. Upload a video
  – 3. Create a website, embed the map
  – 4. Publish the website

Steps of Realization

104

Page 105: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

• Video
  – Institute of Geodesy, Cartography and Remote Sensing
  – Hungarian National Rural Network
  – INSPIRE Geoportal
  – Google Maps service

• Animation

Videos and Animations for Chapter 5.

105

Page 106: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Question I. Measure the length of the Belső-tó ('Inner lake') of Tihany, Hungary with the help of the topographic map service of the Georgikon Mapserver (or any other mapserver).

Question II. Create a Google Maps map on any agricultural topic with at least 5 objects and inserted images, and embed it into a website on the same topic.

Question III. Embed further mapserver services (Bing Maps, Yahoo Maps…) into the website you have created.

Tasks for Chapter 5.

106

Page 107: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

• Prepared by:– Dr. Máté Csák

Plant protection database

107

Page 108: Information Technology in Plant Protection

Plant Protection Information

Plant protection database

Page 109: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

109

Plant protection’s databases: Topics

• Database management theory
  – Information, data
  – Data models, databases
  – Database management systems
• Relational model
  – Theoretical basis
  – Normalized database
  – Catalog, data dictionary
• Plant protection databases
  – Practical problems and their solutions

Plant Protection Information

Page 110: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

110

Database management - Information

• Information is a basic concept of information technology. The word is of Latin origin and means intelligence, news, message, notification.
• Definitions:
  1) In general, information is data or news that we consider relevant and that reduces our lack of knowledge. (Wikipedia)
  2) A gain in knowledge, the growth of knowledge; it means a reduction of uncertainty. (SH Atlas)
  3) Information is new data or news that removes uncertainty and has consequences. (Kalamár-Csák)

Plant Protection Information

Page 111: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

111

Theoretical - Information

Information is as much a physical reality of the universe as matter and energy.

Plant Protection Information

[Diagram: pure information → information processing → meaningful information. Examples: DNA molecule → protein; computer data input → calculation results.]

Page 112: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

112

Theoretical - Information

Manifestations:
• Clearly pronounced – Explicit
  – The information is completely clear to everyone and needs no explanation.
  – For example: the water of Lake Balaton is 28 °C
• Hidden – Implicit
  – The information can only be revealed from the relationships among the data by some method.
  – For example: statistical calculation (average)

Plant Protection Information

Page 113: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

113

Theoretical - Data

• A datum is a specific value (state, realized form) of a variable (property, attribute, characteristic) of an object (anything the data relate to).
  – A datum is therefore fully specified only when we define which object, which variable and which value it concerns. For values expressed as numbers, a unit of measurement always belongs to the value.
• For example: Name: Arvalin LR; Agent: Zinc phosphide; Concentration: 4 %

Plant Protection Information

Page 114: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

114

Theoretical – Data model

• A collection of concepts that clearly describes the structure of a database.
  – The structure includes the data types, their relationships and the constraints on the data.
  – It is the description of the logical structure of the database at the conceptual level.

Plant Protection Information

Page 115: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

115

Basic elements of the Entity-Relationship (ER) data model

Plant Protection Information

ENTITIES

ATTRIBUTES

RELATIONSHIPS

Page 116: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Data model – ER - Entities

• Entities: the principal data objects, distinguishable from all other things, about which information is to be collected.
  – The things and persons concerned, about which we want to store data.
  – For example: Citizens, Workers, Patients, Customers; Plants, Agents, Phenological phases, Pests; Cars, Goods, Accounts ...
  – An entity with specific values is called an occurrence (instance).

Plant Protection Information

116


Page 117: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Data model – ER - Attributes

Attributes:
• The internal structure of the entities.
• Characteristics of entities that provide descriptive detail about them.
  – Characteristics of the entity Plants are, for example: name, Latin name, ...
• The actual values of the properties determine a specific occurrence.
• For example: Peach, Prunus persica, …

Plant Protection Information

117

Page 118: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Data model – ER – Attributes - Key

• If a property, or a group of properties, clearly determines which occurrence of the entity is meant, it is called a key.
  – For example: name in Plants

Plant Protection Information

118

Page 119: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Data model – ER - Relationship

The relationships:
• are the external structure of entities,
• represent real-world associations among one or more entities,
• are described in terms of degree, connectivity and existence.
  – For example: Plants-Pests, Accounts-Goods, ...
• A particular occurrence of a relationship is called a relationship instance.

Plant Protection Information

119

Page 120: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Datamodel – ER – Relationship - Types

The types of relationships:

•Independent connectivity

•1:1 connectivity

•1:N connectivity

•N:M connectivity

Plant Protection Information

120

Page 121: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

121

Data model – ER – Relationship 1.

1. Independent connectivity
  – Two entities are independent of each other if no element of the instance set of one is linked to any element of the instance set of the other.
• For example:
• Agent's Id : Employee's account

Plant Protection Information

Page 122: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Data model – ER – Relationship 2.

2. One-to-one connectivity (1:1):

• Each element of the instance set of one entity is linked to exactly one element of the instance set of the other entity.

– For example: Agent's Id : Agent's name

Plant Protection Information

122

Page 123: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Data model – ER – Relationship - 1:1 Connectivity

Plant Protection Information

Page 124: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Data model – ER – Relationship 3.

3. One-to-many connectivity (1:N):

• Each element of instance set A may be linked to several elements of instance set B, while each element of B is linked to at most one element of A.

– For example: Aetiologies : Diseases

Plant Protection Information

124

Page 125: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Data model – ER – Relationships - 1:N connectivity

Plant Protection Information

Page 126: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Data model – ER – Relationship 4.

4. Many-to-many connectivity (N:M):
• Each element of instance set A may be linked to several elements of instance set B, and vice versa.

– For example: Plants : Diseases
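A many-to-many relationship such as Plants : Diseases is commonly stored with a third, linking table that holds one row per pair. The following is a minimal sketch using Python's built-in `sqlite3` module. The table names, column names and sample rows are illustrative assumptions, not the structure of any database from this presentation.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE plants   (plant_id   INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE diseases (disease_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- Linking table: one row per (plant, disease) pair.
CREATE TABLE plant_disease (
    plant_id   INTEGER REFERENCES plants(plant_id),
    disease_id INTEGER REFERENCES diseases(disease_id),
    PRIMARY KEY (plant_id, disease_id)
);
INSERT INTO plants   VALUES (1, 'Apple'), (2, 'Wheat');
INSERT INTO diseases VALUES (1, 'Apple mosaic'), (2, 'Powdery mildew');
INSERT INTO plant_disease VALUES (1, 1), (1, 2), (2, 2);
""")


def diseases_of(plant):
    """All diseases linked to one plant (the 'N' side seen from a plant)."""
    rows = con.execute("""
        SELECT d.name FROM diseases d
        JOIN plant_disease pd ON pd.disease_id = d.disease_id
        JOIN plants p ON p.plant_id = pd.plant_id
        WHERE p.name = ? ORDER BY d.name""", (plant,))
    return [r[0] for r in rows]


def plants_with(disease):
    """All plants linked to one disease (the 'M' side seen from a disease)."""
    rows = con.execute("""
        SELECT p.name FROM plants p
        JOIN plant_disease pd ON pd.plant_id = p.plant_id
        JOIN diseases d ON d.disease_id = pd.disease_id
        WHERE d.name = ? ORDER BY p.name""", (disease,))
    return [r[0] for r in rows]


print(diseases_of("Apple"))
print(plants_with("Powdery mildew"))
```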

126

Page 127: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Data model – ER – Relationship - N:M connectivity

Plant Protection Information

Page 128: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Data model - ER definition

• The data model is the finite set of entities, the finite set of their properties and the set of their relationships.

Plant Protection Information

128

Page 129: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Data model - Types

Depending on which of the three basic elements are physically stored, the following data models exist.

                          entity   property   connectivity
• network, hierarchical     +         -           +
• relational                +         +           -
• object-oriented           +         +           +

• + object-relational (mixed data model)

Plant Protection Information

129

Page 130: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

130

Databases

• Database: a structured set of data that are related to each other, stored, typically in digital form, so that multiple users can access it.
• A database is a combination, organized according to a data model, of a finite number of entity occurrences, a finite number of their property values and their relationship occurrences.
• Benefit: many users can use it at once. The data are stored only once.

Plant Protection Information

Page 131: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

131

Integrated database

• Contains, linked together, all the data that different users use in different groupings.
• The data are physically stored centrally, free of redundancy or with minimal, controlled redundancy.
• Centrally controlled:
  – data protection,
  – entering of new data, and
  – changing of existing data.

Plant Protection Information

Page 132: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

132

Database Management System (DBMS)

• Software that provides the connection to the database.
• It allows, for databases:
  – creation,
  – querying of the data,
  – modification,
  – maintenance,
  – safe, long-term storage of large amounts of data.

Plant Protection Information

Page 133: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

133

Database Management System (DBMS)

• Grouping
  – By number of users
    • Single-user
    • Multi-user
  – By division of tasks
    • Single-task
    • Client-server
  – By number of storage locations
    • Stored in one place (centralized)
    • Distributed (shared)

Plant Protection Information

Page 134: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

134

Database Management System (DBMS)

• The system components
  – Data Definition Language (DDL)
    • User level
    • Conceptual level
    • Physical storage level
  – Data Manipulation Language (DML)
  – Data Control Language (DCL)
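The difference between the language components can be sketched with SQL statements run through Python's built-in `sqlite3` module. The table and its columns are illustrative assumptions. SQLite has no user management, so the DCL part appears only as a comment.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# DDL: defines the structure of the data.
con.execute("CREATE TABLE plants (name TEXT PRIMARY KEY, latin_name TEXT)")

# DML: inserts, changes and queries the data itself.
con.execute("INSERT INTO plants VALUES (?, ?)", ("Peach", "Prunus persica"))
con.execute("INSERT INTO plants VALUES (?, ?)", ("Apple", None))
con.execute("UPDATE plants SET latin_name = ? WHERE name = ?",
            ("Malus domestica", "Apple"))
rows = con.execute(
    "SELECT name, latin_name FROM plants ORDER BY name").fetchall()
print(rows)

# DCL: controls access rights, e.g. in a multi-user DBMS something like
#   GRANT SELECT ON plants TO some_user;
# (not executed here because SQLite does not support GRANT/REVOKE).
```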

Plant Protection Information

Page 135: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

135

Database Management System (DBMS) – Operating concept

Page 136: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

DBMS – Operating concept - Explanation

1 Request for information from the database (application program)
2 Interpretation and analysis of the request (DBMS: syntax, existence, rights)
3a Executable → to the operating system
3b Cannot be executed → back to the program
4 Access to the external storage (operating system)
5 Transfer of the requested data (OS, from storage into the buffer)
6 Passing of the data and feedback to the program
7 Receipt of the data in the program

Plant Protection Information

Page 137: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

137

Database Management System (DBMS)

• Two types:
  – With its own (autonomous) language
    • Oracle (1977)
    • DB/2 (1983)
    • SyBase (1987)
    • Informix (1981)
    • Ingres (1980)
  – Plug-in type
    • IDMS (1983)
    • SQL (1986)

Plant Protection Information

Page 138: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

138

Relation database model – Theoretical basis

• In 1970 Dr. Edgar F. Codd (IBM) created the relational database model.

• The data model describes the various types of data, their relations and connections, and the procedures for protecting them.

• The collected data are logically divided into separate entity types (tables). We determine how the individual entity occurrences can be clearly identified and what additional features (attributes) they have.

Plant Protection Information

Page 139: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

VirKor program – Relation diagram

The VirKor database has seven tables, shown here with their properties and relations.

Plant Protection Information

139

Page 140: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

VirKor program – Relational mode of representation

• Entities are represented as relations (special tables).

• They describe the different entities of the real world and their properties.

• Plants table

Plant Protection Information

Page 141: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

141

VirKor program - Relational mode of representation

• The connection between the entities can be depicted in relations.

• Data management is carried out with relational operations.

• Plants – Pests relation

Plant Protection Information

Page 142: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Relation model – Benefits and disadvantages

Benefits:
• Based on a mathematical (set-theoretic) model,
• Very close to everyday thinking,
• The most flexibly modifiable,
• The three levels are well separable and can be made independent of each other.
Disadvantages:
• Performance is less efficient.
  – Today this is no longer a serious problem.

Plant Protection Information

Page 143: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Relation model – The properties of relations

• The name of a relation is unique within the database;
• The rows represent the entity occurrences, the columns the properties of the entity;
• Every row has the same number of columns;
• Column names are unique within a relation;
• Every column of a row holds a single value (NULL if there is no value);
• The order of the columns is arbitrary;
• No two rows are identical;
• There is at least one combination of columns that uniquely identifies a row. This is the primary key.
• Any data item can be identified by:
  – relation name
  – + column name
  – + value of the primary key

Plant Protection Information

Page 144: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Relation model – Tables, views

We do not physically store every value of every property of every entity.
• Base relation (table): physically stored.
• Virtual relation (view): contains no data. It is created from tables with relational operations.
• Materialized view: physically stored. It is created from tables with relational operations and changes when the underlying tables change.
• Snapshot: physically stored. It holds the values of tables or views at a certain moment.
• Query (selection) result: not a true relation, it exists only temporarily.
• Temporary table: needed only temporarily, for a given operation or task.
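The difference between a view (no stored data) and a physically stored copy can be sketched with Python's built-in `sqlite3` module. The table, the view and the sample rows are illustrative assumptions. SQLite has no materialized views, so a temporary table stands in here for a snapshot-like stored copy.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- Base relation: physically stored.
CREATE TABLE diseases (name TEXT PRIMARY KEY, aetiology TEXT);
INSERT INTO diseases VALUES ('Apple mosaic', 'virus'), ('Blight', 'fungus');

-- Virtual relation: only the query is stored, not the rows.
CREATE VIEW viral_diseases AS
    SELECT name FROM diseases WHERE aetiology = 'virus';

-- Stored copy of the same query result at this moment.
CREATE TEMP TABLE viral_snapshot AS
    SELECT name FROM diseases WHERE aetiology = 'virus';
""")

# A later change in the base table...
con.execute("INSERT INTO diseases VALUES ('Potato leafroll', 'virus')")

# ...is visible through the view but not in the stored copy.
view_rows = [r[0] for r in con.execute(
    "SELECT name FROM viral_diseases ORDER BY name")]
snapshot_rows = [r[0] for r in con.execute(
    "SELECT name FROM viral_snapshot ORDER BY name")]
print(view_rows, snapshot_rows)
```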

Plant Protection Information

Page 145: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Relation model – Keys

Keys ensure data integrity, consistency and referential integrity.

The system automatically checks:
• The primary key and foreign key relations between entities (e.g. the key of Plants in Diseases of plants)
  – matching
  – cascading change
  – cascading delete

Plant Protection Information

Page 146: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Relation model – Keys

PRIMARY KEY
• Uniquely identifies the rows of a relation.
• The primary key (or any part of it) may not be NULL, and it should not contain unnecessary columns.
• If there are several options, it is important to decide which should be the primary key (e.g. for identifying a person: identity card number, tax identification number, social insurance number).

Plant Protection Information

Page 147: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Relation model – Integrity

Additional integrity options:
• Define a unique index (the same value cannot occur in two rows of these columns)
• Define specific conditions that the field values must satisfy (e.g. a check that restricts a number to the permitted values)
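Both options can be tried out with Python's built-in `sqlite3` module, where they correspond to `UNIQUE` and `CHECK` constraints. This is a minimal sketch; the table, its columns and the permitted range are illustrative assumptions.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
CREATE TABLE agents (
    agent_id          INTEGER PRIMARY KEY,
    name              TEXT NOT NULL UNIQUE,          -- no two rows share a name
    concentration_pct REAL
        CHECK (concentration_pct BETWEEN 0 AND 100)  -- permitted range
)""")
con.execute("INSERT INTO agents VALUES (1, 'Agent A', 4.0)")


def try_insert(row):
    """Return 'ok' if the DBMS accepts the row, 'rejected' otherwise."""
    try:
        con.execute("INSERT INTO agents VALUES (?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


results = [
    try_insert((2, "Agent A", 10.0)),   # duplicate name
    try_insert((3, "Agent B", 150.0)),  # value outside the permitted range
    try_insert((4, "Agent B", 12.5)),   # satisfies both constraints
]
print(results)
```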

Plant Protection Information

Page 148: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Relation model – Indexes

• Indexes speed up direct and sequential access by the indexed column.
  – Maintained automatically,
  – Can be created and deleted at any time,
  – Slow down data modification,
  – Need storage space.
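How an index changes data access can be observed in SQLite through `EXPLAIN QUERY PLAN`. The sketch below uses Python's built-in `sqlite3` module. The table, the index name and the rows are illustrative assumptions, and the exact wording of the plan text differs between SQLite versions.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE diseases (name TEXT PRIMARY KEY, aetiology TEXT)")
con.executemany("INSERT INTO diseases VALUES (?, ?)", [
    ("Apple mosaic", "virus"),
    ("Apple powdery mildew", "fungus"),
    ("Blight", "fungus"),
])

query = "SELECT name FROM diseases WHERE aetiology = 'virus'"


def plan(sql):
    """Return the textual query plan; the detail text is the 4th column."""
    return " ".join(row[3] for row in con.execute("EXPLAIN QUERY PLAN " + sql))


plan_before = plan(query)  # without an index the table is scanned
con.execute("CREATE INDEX idx_diseases_aetiology ON diseases (aetiology)")
plan_after = plan(query)   # with the index the planner can search it

print(plan_before)
print(plan_after)

# An index can be removed at any time without touching the data.
con.execute("DROP INDEX idx_diseases_aetiology")
```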

Plant Protection Information

Page 149: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Relation model – Keys

FOREIGN KEY
• A column (or combination of columns) of a relation whose value is either NULL or equal to one of the primary key values of the referenced table.
• Establishes the 1:N relationships between relations. It must remain valid through all changes, data input and deletion.
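The matching check, cascading change and cascading delete can be tried out with Python's built-in `sqlite3` module. This is a sketch under assumptions: the tables and rows are illustrative, and in SQLite foreign key enforcement has to be switched on per connection with `PRAGMA foreign_keys = ON`.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default
con.executescript("""
CREATE TABLE plants (name TEXT PRIMARY KEY, latin_name TEXT);
CREATE TABLE plant_diseases (
    plant   TEXT NOT NULL REFERENCES plants(name)
                 ON UPDATE CASCADE ON DELETE CASCADE,
    disease TEXT NOT NULL,
    PRIMARY KEY (plant, disease)
);
INSERT INTO plants VALUES ('Apple', 'Malus domestica'),
                          ('Potato', 'Solanum tuberosum');
INSERT INTO plant_diseases VALUES ('Apple', 'Apple mosaic'),
                                  ('Potato', 'Blight');
""")

# Matching: a row that refers to a non-existent plant is rejected.
try:
    con.execute("INSERT INTO plant_diseases VALUES ('Wheat', 'Pitch staining')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Cascading change: renaming the plant also updates the referring rows.
con.execute("UPDATE plants SET name = 'Apple tree' WHERE name = 'Apple'")

# Cascading delete: deleting the plant also deletes the referring rows.
con.execute("DELETE FROM plants WHERE name = 'Potato'")

remaining = con.execute(
    "SELECT plant, disease FROM plant_diseases ORDER BY plant").fetchall()
print(orphan_rejected, remaining)
```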

Plant Protection Information

Page 150: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Relation model – Foreign key relationship

Plant Protection Information

Page 151: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Relation model – Foreign key relationship

Plant Protection Information

Page 152: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

152

Relation model – Normalization

• Normalization is a formal, algorithmic process in which the initial, unfavourable form of the data is transformed, by the consistent application of appropriate rules in succession, into a logically more transparent, better form.

Plant Protection Information

Page 153: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Relation model – Normalization

• The entities produced in the previous design steps are brought into well-manageable standard forms.
• The process can be carried out algorithmically.
• Result:
  – The data need less storage space;
  – Elementary data can be changed faster and with fewer errors;
  – The database becomes logically clearer.

Plant Protection Information

Page 154: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Relation model – Normalization – Functional dependence

• In any relation, the values of attributes may depend on the values of other attributes.
• If, in a relation R, the value of one attribute (X, the independent variable) uniquely determines the value of another attribute (Y, the dependent variable), we say that Y is functionally dependent on X in the relation R.
• Naturally, this holds not only for the current content of R but, independently of time, as a constraint for the whole lifetime of the database.
• Both X and Y can be composite, that is, they may consist of several columns.
• Functional dependence is usually denoted by R.X → R.Y
• It can also be represented by a functional diagram, also called a dependency diagram.
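Whether a set of rows violates a supposed functional dependence can be tested mechanically. The helper below is a small sketch; its name, the column names and the sample rows are illustrative assumptions. It can only show that the current content is consistent with X → Y. It cannot prove that the dependence holds as a lasting constraint.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on `lhs` but differ on `rhs`."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        val = tuple(row[c] for c in rhs)
        # Remember the first value seen for this key; any other value
        # for the same key violates the dependence.
        if seen.setdefault(key, val) != val:
            return False
    return True


rows = [
    {"plant": "Apple", "latin_name": "Malus domestica",
     "disease": "Apple mosaic"},
    {"plant": "Apple", "latin_name": "Malus domestica",
     "disease": "Apple powdery mildew"},
    {"plant": "Potato", "latin_name": "Solanum tuberosum",
     "disease": "Blight"},
]

print(fd_holds(rows, ["plant"], ["latin_name"]))  # plant -> latin_name
print(fd_holds(rows, ["plant"], ["disease"]))     # plant -> disease
```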

Plant Protection Information

Page 155: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Relation model – Representation of functional dependence

Plant Protection Information

The arrow points from the independent attribute to the dependent attribute. In the diagram, Y and Z are functionally dependent on X.

Page 156: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Relation model – Full functional dependence

• In general terms, in a relation R the attribute Y is fully functionally dependent on the (composite) attribute X if and only if it is functionally dependent on X but not on any proper part of X.
  – If X is not composite, functional dependence and full functional dependence are the same.
  – strong
  – weak

Plant Protection Information

Page 157: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Relation model – Full dependence

• Let P, Q ⊆ A and P → Q.
• Q is fully (functionally) dependent on P if and only if Q does not depend on any proper subset of P.
• Otherwise, the dependence is partial.
  – For example:
  – ORDERITEM {order_id, goods_id, piece}
  – REPAYMENTS {deptor_id, month, amount, date}
  – VISIT {visitor_id, date, time, subject, period}

Plant Protection Information

157

Page 158: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Relation model – Transitive dependence

• S depends transitively on P if there exists Q ⊆ A such that P → Q and Q → S, but the reverse dependencies do not hold.
  – Examples: P → Q → S
  – WORKER {perid, name, class_code, class_name}
  – ORDERHEADER {order_id, custcode, custname, custaddres, date, deadline, totalvalue}
  – VISITOR {id, name, firm, firmname, firmaddres, …}

Plant Protection Information

158

Page 159: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

159

Relation model – Normalforms

• The structural state of the entity
  • 0NF
  • 1NF
  • 2NF
  • 3NF
  • 4NF

Plant Protection Information

Plant    Latin name          Diseases                                                       …
Apple    Malus domestica     Apple mosaic, Apple scab, Apple powdery mildew, …
Potato   Solanum tuberosum   Staining virus reticulated, Blight, Black stem rotting, …

Page 160: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Relation model – First normal form

• A relation is in first normal form (1NF) if
  – Each column holds one and only one attribute,
  – Each row is different,
  – The order of the attributes is the same in each row,
  – There are no repeating fields,
  – Each row has (at least) one unique key on which all the other attributes are functionally dependent.
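Removing a repeating field means turning one row with a list of values into several rows with one value each. A minimal sketch, assuming the repeating values are stored as a comma-separated string (the sample rows are illustrative):

```python
# Unnormalized rows: the third field repeats (several diseases per plant).
unnormalized = [
    ("Apple", "Malus domestica", "Apple mosaic, Apple powdery mildew"),
    ("Potato", "Solanum tuberosum", "Blight"),
]

# 1NF: one row per (plant, disease) pair, one value per field.
first_normal_form = [
    (plant, latin_name, disease.strip())
    for plant, latin_name, diseases in unnormalized
    for disease in diseases.split(",")
]

for row in first_normal_form:
    print(row)
```

In the result the pair (plant, disease) identifies each row, at the price of repeating the Latin name of the plant in every row.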

Plant Protection Information

Page 161: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

161

Relation model – Normal forms – 1NF – Example

Plant Protection Information

Plant    Latin name          Disease                      Aetiology   …
Apple    Malus domestica     Apple mosaic                 virus
Apple    Malus domestica     Apple scab                   fungus
Apple    Malus domestica     Apple powdery mildew         fungus
Potato   Solanum tuberosum   Staining virus reticulated   virus
Potato   Solanum tuberosum   Blight                       fungus
Potato   Solanum tuberosum   Black stem rotting           bacterium
Wheat    Triticum vulgare    Pitch staining               fungus

Page 162: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Relation model – Normal forms – 1NF – Anomalies

A lot of redundancy can be seen (e.g. Plant and Latin name).
Hidden error possibilities (update anomalies):
• Deletion anomaly:
  – If we delete the disease Pitch staining, wheat is removed as well.
• Modification anomaly:
  – If a name is changed (e.g. the potato or its disease Blight is renamed), it must be modified in every row where it occurs; otherwise a seemingly "new" plant appears.
• Insertion anomaly:
  – A new disease can be entered only if a plant is already affected by it (no part of the primary key may be NULL).

Plant Protection Information

Page 163: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Relation model – Normalforms – 2NF

• A relation R is in second normal form (2NF) if and only if it is in 1NF and every non-key attribute is fully dependent on the primary key.
  – 1NF relations with an elementary (single-column) primary key are automatically in 2NF. Relations with a composite key, however, have to be brought into 2NF in order to eliminate the update anomalies. (This does not remove all the anomalies, but it can significantly reduce their number.) This is called decomposition of relations.
  – The decomposition is done by producing, by projection from the 1NF relation, 2NF relations whose primary keys are the primary key of the original relation or parts of it, and which contain those and only those columns that are fully dependent on the new primary key.

Plant Protection Information

Page 164: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

164

Relation model – Normalforms – 2NF - Example

Plant Protection Information

Plant    Latin name
Apple    Malus domestica
Potato   Solanum tuberosum
Wheat    Triticum vulgare

Plant    Disease                     Latin name                              Picture
Apple    Apple mosaic virus          Apple mosaic virus
Apple    Apple scab                  Venturia inaequalis
Apple    Apple powdery mildew        Podosphaera leucotricha
Potato   Virus networking staining   Potato leafroll
Potato   Black stem rotting          Erwinia carotovora subsp. atroseptica
Potato   Blight                      Phytophthora infestans
Wheat    Pitch staining              Lidophia graminis

Page 165: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Relation model – Normal forms – Decomposition (1NF → 2NF)

• R(A,B,C,D) before decomposition (1NF)
  – PRIMARY KEY (A,B)
  – R.A → R.D
• After decomposition (2NF)
  – R1(A,D)
  – PRIMARY KEY (A)
• and
  – R2(A,B,C)
  – PRIMARY KEY (A,B)
  – FOREIGN KEY (A), refers to R1
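The scheme above can be checked on a small example: the two projections together hold the same information as R, because joining them on A gives back the original rows. The values below are made-up placeholders chosen so that A → D holds.

```python
# R(A, B, C, D) with primary key (A, B) and the partial dependency A -> D.
R = {
    ("a1", "b1", "c1", "d1"),
    ("a1", "b2", "c2", "d1"),
    ("a2", "b1", "c3", "d2"),
}

# Projections: R1(A, D) with key A, and R2(A, B, C) with key (A, B).
R1 = {(a, d) for (a, b, c, d) in R}
R2 = {(a, b, c) for (a, b, c, d) in R}

# The natural join of R2 and R1 on A restores the original relation.
restored = {(a, b, c, d)
            for (a, b, c) in R2
            for (a1, d) in R1
            if a == a1}

print(len(R1), len(R2), restored == R)
```

The value d1 is stored once in R1 instead of twice in R, which is where the reduction in redundancy comes from.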

Plant Protection Information

Page 166: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Relation model – Normalforms – 3NF

• A relation R is in third normal form (3NF) if and only if it is in 2NF and every non-key attribute is non-transitively dependent on the primary key.

• In other words, 3NF means that functional dependences may start only from the primary key and the alternate keys.

• The 2NF relation Employee is not in 3NF because, for example, the class (CLASS) is neither the primary key nor an alternate key, yet other columns (CLASS-NAME, BOSS) are functionally dependent on it.

Page 167: Information Technology in Plant Protection

Relational model – Normal forms – Decomposition (2NF → 3NF)

• Decomposition works as follows: from the 2NF relation we take the projection that contains only those attributes that depend exclusively on the primary key. Its primary key remains the same. In the other new relation (or relations, if there is more than one such dependency), the primary key is the non-key attribute on which other attributes depend, and the remaining columns are its dependent attributes.

Page 168: Information Technology in Plant Protection

Relational model – 3NF – Example

Plant | Latin name
Apple | Malus domestica
Potato | Solanum tuberosum
Wheat | Triticum vulgare

Plant | Disease | Picture
Apple | Apple mosaic virus |
Apple | Impetigo |
Apple | Apple powdery mildew |
Potato | Virus networking staining |
Potato | Black stem rotting |
Potato | Blight |
Wheat | Pitch staining |

Disease | Latin name | Aetiology | …
Apple mosaic virus | Apple mosaic virus | virus
Impetigo | Venturia inaequalis | fungus
Apple powdery mildew | Podosphaera leucotricha | fungus
Virus networking staining | Potato leafroll virus | virus
Black stem rotting | Erwinia carotovora subsp. atroseptica | bacteria
Blight | Phytophthora infestans | fungus
Pitch staining | Lidophia graminis | fungus

Page 169: Information Technology in Plant Protection


Relational model – Normal forms – Decomposition (2NF → 3NF)

• In general terms, let A, B, C, D be the columns (any of which may be composite) of a 2NF relation:
– R(A,B,C,D)
• PRIMARY KEY (A)
• R.B → R.C

• Decomposition into 3NF produces the following relations:
– R1(B,C)
• PRIMARY KEY (B)
and
– R2(A,B,D)
• PRIMARY KEY (A)
• FOREIGN KEY (B), refers to R1

• The relation R can be restored at any time by joining R1 and R2 on B, provided that the splitting was done according to the principle set out above.
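A minimal sketch of this decomposition and of restoring R by a join on B. The column meanings (A = disease, B = aetiology code, C = aetiology name, D = Latin name) and the codes "V" and "F" are assumptions chosen only for illustration.

```python
# Hypothetical 2NF relation R(A, B, C, D): primary key (A), and R.B -> R.C.
# Assumed meaning: A = disease, B = aetiology code, C = aetiology name, D = Latin name.
R = [
    ("Apple mosaic virus", "V", "virus", "Apple mosaic virus"),
    ("Apple powdery mildew", "F", "fungus", "Podosphaera leucotricha"),
    ("Pitch staining", "F", "fungus", "Lidophia graminis"),
]

# Projections that make up the decomposition.
R1 = sorted({(b, c) for (_a, b, c, _d) in R})  # R1(B, C), primary key (B)
R2 = [(a, b, d) for (a, b, _c, d) in R]        # R2(A, B, D), primary key (A)


def join_on_b(r1, r2):
    """Natural join of R2 and R1 on column B, returned in R's column order."""
    lookup = dict(r1)  # maps B -> C
    return [(a, b, lookup[b], d) for (a, b, d) in r2]


restored = join_on_b(R1, R2)
```

Because B determines C, the join gives back exactly the original rows, so no information is lost by the split.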

Page 170: Information Technology in Plant Protection

Relational model – Normal forms – Decomposition – 3NF

Notes:
• It is not always appropriate to normalise to 3NF (e.g., address and zip code).
• Most database management systems require only 1NF, and do not even require a primary key!

Page 171: Information Technology in Plant Protection

Catalog, data dictionary

• The set of tables and views that record the database's
– definition,
– relationships,
– storage,
– manner of use.
• It is used to carry out system administration tasks.

Page 172: Information Technology in Plant Protection

Plant protection database

• Pesticides Register
• VirKor – assistant educational material

Page 173: Information Technology in Plant Protection

Pesticides Register - Records of pesticides

• The aim of the database

Task: design a register of the pesticides manufactured by ER-chemistry Co.

• The register should include:
– the origin of each pesticide,
– the components needed to produce the pesticide,
– the possible application areas.

• It is assumed that:
– a pesticide may be single- or multi-component,
– a pesticide may be used against several pests,
– one component can come from multiple suppliers.

Page 174: Information Technology in Plant Protection

Pesticides Register - Entity types

• Pesticides (id, name, degree of hazard, price)
• Factories (factory code, name)
• Fields of application of the pesticides (pest, type)
• Pesticide components (component name)
• Transporters (transporter code, date, name, address)

Page 175: Information Technology in Plant Protection

Pesticides Register - Entity type of ER-model

• Pesticides(id, name, hazard, price)
• Factories(id, name)
• Pests(pestname, type)
• Components(name)
• Transporters(id, date, name, address)

Page 176: Information Technology in Plant Protection

Pesticides Register - ER-diagram

Page 177: Information Technology in Plant Protection

Pesticides Register - Relationships

• Where is it produced? Factories:Pesticides (1:N)
– A pesticide is produced by only one factory, but a factory can produce several pesticides.
• What is it applied against? Pesticides:Pests (M:N)
– A pesticide may be used against several pests, and a pest can be destroyed by several pesticides.
• What are the ingredients? Pesticides:Components (M:N)
– A pesticide consists of several components, and a component may also be part of other pesticides.
• Where did it come from? Components:Transporters (M:N)
– A component may be delivered by several suppliers, and a supplier may deliver several components.

Page 178: Information Technology in Plant Protection

Pesticides Register - Relation model

Page 179: Information Technology in Plant Protection

Pesticides Register - Relations

Relations with primary keys only
• Pests(pname, type)
• Factories(fid, name)
• Components(cname)
• Transports(tid, tdate, tname, taddress)

Relations with primary keys and foreign keys
• Pesticides(pid, name, hazard, price, fid)
• Applies(pid, pname, term)
• Elements(pid, cname, volume%)
• Origins(cname, tid, tdate, quantity)
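The relations listed above could be created, for example, in SQLite from Python. This is a sketch under assumptions: the slide does not show which columns form the keys, so the primary and foreign keys below are inferred from the column names, the data types are guesses, `volume%` is renamed to `volume_pct`, and the inserted rows are invented sample data.

```python
import sqlite3

# Assumed keys and types; only the table and column names come from the slide.
SCHEMA = """
CREATE TABLE Pests      (pname TEXT PRIMARY KEY, type TEXT);
CREATE TABLE Factories  (fid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Components (cname TEXT PRIMARY KEY);
CREATE TABLE Transports (tid INTEGER, tdate TEXT, tname TEXT, taddress TEXT,
                         PRIMARY KEY (tid, tdate));
CREATE TABLE Pesticides (pid INTEGER PRIMARY KEY, name TEXT, hazard TEXT, price REAL,
                         fid INTEGER REFERENCES Factories(fid));
CREATE TABLE Applies    (pid INTEGER REFERENCES Pesticides(pid),
                         pname TEXT REFERENCES Pests(pname),
                         term TEXT,
                         PRIMARY KEY (pid, pname));
CREATE TABLE Elements   (pid INTEGER REFERENCES Pesticides(pid),
                         cname TEXT REFERENCES Components(cname),
                         volume_pct REAL,
                         PRIMARY KEY (pid, cname));
CREATE TABLE Origins    (cname TEXT REFERENCES Components(cname),
                         tid INTEGER, tdate TEXT, quantity REAL,
                         PRIMARY KEY (cname, tid, tdate),
                         FOREIGN KEY (tid, tdate) REFERENCES Transports(tid, tdate));
"""


def build_register():
    """Create an empty in-memory pesticide register with foreign keys enforced."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


conn = build_register()
# Invented sample rows, only to show the 1:N and M:N links in use.
conn.execute("INSERT INTO Factories VALUES (1, 'Example factory')")
conn.execute("INSERT INTO Pesticides VALUES (10, 'Example pesticide', 'low', 12.5, 1)")
conn.execute("INSERT INTO Pests VALUES ('Example pest', 'insect')")
conn.execute("INSERT INTO Applies VALUES (10, 'Example pest', 'spring')")

rows = conn.execute(
    "SELECT p.name, a.pname FROM Pesticides p JOIN Applies a ON a.pid = p.pid"
).fetchall()
```

The M:N relationships (Applies, Elements, Origins) become separate linking tables, while the 1:N relationship between factories and pesticides is stored as the `fid` foreign key in Pesticides.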

Page 180: Information Technology in Plant Protection

Pesticides Register - Pest table

Page 181: Information Technology in Plant Protection

Pesticides Register - Pesticide form

Page 182: Information Technology in Plant Protection

Pesticides Register - Pests form

Page 183: Information Technology in Plant Protection

Pesticides Register - Factories form

Page 184: Information Technology in Plant Protection

VirKor program

assistant educational material

Page 185: Information Technology in Plant Protection

VirKor program - assistant educational material

VirKor is an assistant educational program that helps students learn to recognize the diseases of different plants.

• It modernises the demonstration boards.
• It is an educational resource for plant doctor students.

Page 186: Information Technology in Plant Protection

VirKor program – How does it work?

• Stored in the database:
– plants,
– diseases,
– the relations between them.

• The displayed images are stored in a folder (locally or on a server).

Page 187: Information Technology in Plant Protection

VirKor program – Diseases of plant: Apple

Apple proliferation phytoplasma

Podosphaera leucotricha

Page 188: Information Technology in Plant Protection

VirKor program - Diseases of plant: Apple

Apple mosaic virus

Monilinia fructigena

Page 189: Information Technology in Plant Protection

VirKor program – One disease on different Plants: Mosaic virus on cucumber and apple.

Page 190: Information Technology in Plant Protection

VirKor program - Symptoms: Necrosis of tissue

Page 191: Information Technology in Plant Protection

VirKor program – How it’s made?

• The demonstration boards were photographed by hand and the images were then cleaned.

• The data on the boards were recorded in an Excel spreadsheet.

• The relational database model was created.

• The application was developed.

Page 192: Information Technology in Plant Protection

VirKor program – How it’s made? - Data collection - Digitization

Digitization of the demonstration boards

Original | Cleaned

Page 193: Information Technology in Plant Protection

VirKor program – How it’s made? - Data collection - Stored

Store the data in a worksheet (1NF)

Page 194: Information Technology in Plant Protection

VirKor program – How it’s made? - Modify and correct data structure

In this step we supplemented the data with some additional properties, for example the affected plant parts.

Page 195: Information Technology in Plant Protection

VirKor program – How it’s made? - Data collection - Analyze the relationship between data

• Functional dependencies:
– The Latin name of a plant is functionally dependent on its Hungarian name.
– The same applies to the diseases, the symptoms and the aetiology (e.g. virus).
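A functional dependency of this kind can be checked on the worksheet data with a short helper. The sketch below is illustrative only: the column names and the three sample rows are assumptions, and English common names stand in for the Hungarian ones.

```python
def determines(rows, x, y):
    """Return True if column x functionally determines column y in the given rows."""
    seen = {}
    for row in rows:
        # Remember the first y value seen for each x value; any later
        # disagreement means x does not determine y.
        if seen.setdefault(row[x], row[y]) != row[y]:
            return False
    return True


# Invented sample rows in the style of the 1NF worksheet.
worksheet = [
    {"plant": "Apple", "latin": "Malus domestica", "disease": "Apple mosaic virus"},
    {"plant": "Apple", "latin": "Malus domestica", "disease": "Apple powdery mildew"},
    {"plant": "Potato", "latin": "Solanum tuberosum", "disease": "Blight"},
]

name_determines_latin = determines(worksheet, "plant", "latin")
name_determines_disease = determines(worksheet, "plant", "disease")
```

Here the plant name determines the Latin name but not the disease, which is why plants and diseases end up in separate tables.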

Page 196: Information Technology in Plant Protection

VirKor program – How it’s made? – Create a Relation database model

Page 197: Information Technology in Plant Protection

VirKor program - How it’s made? - Table – Entity: Plants

• Apple and its diseases

Page 198: Information Technology in Plant Protection

VirKor program - How it’s made? - Table – Entity: Diseases

Page 199: Information Technology in Plant Protection

VirKor program - How it’s made? - Table – Entity: Plants’ diseases

Page 200: Information Technology in Plant Protection

VirKor program - How it’s made? - Develop tutor program

• Form of Plants
• Setting the properties of each control
• Programming each event
– For example:
• loading an image file into the picture box
• changing the status of checkboxes
• etc.


Page 201: Information Technology in Plant Protection

THE PRESENTATION CAN BE DOWNLOADED: -

Thank you for your kind attention.

Made by: Máté Csák PhD.


Page 203: Information Technology in Plant Protection


• Prepared by:
– Sándor Nagy

Bioinformatics


Page 204: Information Technology in Plant Protection

Information Technology in Plant Protection

Bioinformatics

Bioinformatics - Databases and homology searching

Page 205: Information Technology in Plant Protection


Contents

• What does Bioinformatics mean?
• Structure and operation of DNA
• Bioinformatics databases
• Using databases
• Exercise

Page 206: Information Technology in Plant Protection

Definition

• Bioinformatics derives knowledge from computer analysis of biological data. These can consist of the information stored in the genetic code, but also experimental results from various sources, patient statistics, and scientific literature. Research in bioinformatics includes method development for storage, retrieval, and analysis of the data. Bioinformatics is a rapidly developing branch of biology and is highly interdisciplinary, using techniques and concepts from informatics, statistics, mathematics, chemistry, biochemistry, physics, and linguistics. It has many practical applications in different areas of biology and medicine.

Page 207: Information Technology in Plant Protection

Fields of Bioinformatics

• Supraindividual bioinformatics uses systematic modelling to understand biological systems

• Molecular bioinformatics performs protein and nucleotide analysis and design

• Computational bioinformatics focuses on the utilisation of biological systems

Page 208: Information Technology in Plant Protection

Aim of Bioinformatics

The aim is to decipher the genetically encoded information, which tells us about the following:

• 3D structure,
• Function,
• Evolutionary relations.

DNA → Protein → Function

Page 209: Information Technology in Plant Protection


Questions answered by Bioinformatics

• In which other organisms can the given sequence be found? (→ ortholog searching)

• What variants can occur within a given organism? (→ paralog searching)

• What is the rate of heterogeneity in a given paralog? (→ polymorphism searching)

• Which positions are important in a given sequence? (→ evolutionarily conserved)

Page 210: Information Technology in Plant Protection

Basics: Structure of DNA

• Double helix, in which the nucleotide bases on the two strands are connected by hydrogen bonds: A:T – 2, G:C – 3 H-bonds
• Base pairing: the complementary nucleotide bases within the long polymer are A:T and G:C
• Replication
• The genetic code is not monotonous
• Two helical chains, each coiled round the same axis, with a pitch of 34 Ångströms (3.4 nanometres) and a radius of 10 Ångströms
• The two strands run in opposite directions and are therefore anti-parallel, with 5′ (five prime) and 3′ (three prime) ends
• It contains four bases: adenine (A), cytosine (C), guanine (G), thymine (T)
• Structure of DNA:
– http://www.youtube.com/watch?v=qy8dk5iS1f0&feature=player_embedded
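The base-pairing rule and the anti-parallel orientation can be illustrated with a short sketch. The function names are chosen for this example only.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}  # A:T pairs have 2 H-bonds, G:C pairs have 3


def reverse_complement(strand):
    """Return the complementary strand, read in its own 5'->3' direction."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def hydrogen_bonds(strand):
    """Count the hydrogen bonds between a strand and its complementary strand."""
    return sum(H_BONDS[base] for base in strand)


partner = reverse_complement("ATGC")
bonds = hydrogen_bonds("ATGC")
```

The strand is reversed before complementing because the partner strand runs in the opposite direction.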

Page 211: Information Technology in Plant Protection

Basics: replication of DNA

• It follows from the base-pairing rule: the hydrogen bonds within the double helix can be pulled apart, and each strand serves as a template for the synthesis of a new strand. The result of this process is two identical structures.

• The genetically coded information is carried by the order of the nucleotides

• DNA replication:
– http://www.youtube.com/watch?v=E8NHcQesYl8&feature=related

Page 212: Information Technology in Plant Protection

Historical overview

• Early 1950s – publication of the insulin sequence
• 1953 Watson-Crick: structure of DNA
• From the early 1970s – creation of algorithms for sequence analysis:
– Dot matrix
– Local and global sequence alignment
– BLAST algorithm (1990)

• 1972 first computer-stored databases of protein sequences

• 1979 GenBank prototype

Page 213: Information Technology in Plant Protection

Bioinformatics databases

• GenBank: NCBI (National Center for Biotechnology Information)
– http://www.ncbi.nlm.nih.gov/

• European Molecular Biology Laboratory – European Bioinformatics Institute – EMBL-EBI
– http://www.ebi.ac.uk/

• DNA Data Bank of Japan – DDBJ
– http://www.ddbj.nig.ac.jp/

Page 214: Information Technology in Plant Protection

http://www.ncbi.nlm.nih.gov/

Choosing database

Searching keywords

Page 215: Information Technology in Plant Protection


Navigation in the database - GenBank

• Choosing a database:
– PubMed – database of publications
– Protein – database of proteins
– Nucleotide – database of nucleic acid sequences
– Genome – database of whole genomes
– Gene – database of genes

Page 216: Information Technology in Plant Protection

Navigation in the database - GenBank

• Exercise:
– Look for information on Phytophthora infestans
• Database: Taxonomy, keyword: Phytophthora infestans
• Result of our search (next slide)
– Taxonomic classification
– Database results from GenBank
– References to other resources

• By choosing the following link on the result page we can reach all the sequences in the database: Nucleotide – Direct

• Look for the INF2A gene within the results

Page 217: Information Technology in Plant Protection

Navigation in the database - GenBank

• Result table:

Page 218: Information Technology in Plant Protection

Identifiers

Information on publishing

Information on structure

Gene identification

Page 219: Information Technology in Plant Protection


Amino acid sequence

Nucleotide sequence

Page 220: Information Technology in Plant Protection


Navigation in the database - GenBank

• Data formats:
– Summary – short description, important information
– GenBank – GenBank's own format, detailed data
– FASTA – name + identifiers + sequence; the most commonly used format
– ASN.1 – international format

Page 221: Information Technology in Plant Protection

Navigation in Database- GenBank

• FASTA format:
– Advantages:
• commonly used
• simple
• small
– Disadvantage:
• carries less information
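Because the FASTA format is so simple (a header line starting with ">" followed by sequence lines), it can be read with a few lines of code. The sketch below is a minimal parser written for illustration; the two records in the example are invented.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Invented example records (not real database entries).
example = """>seq1 hypothetical example record
ATGGCC
TTAA
>seq2 another hypothetical record
GGGCCC
"""
records = parse_fasta(example)
```

Sequence lines that belong to one record are simply joined, so line wrapping in the file does not matter.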

Page 222: Information Technology in Plant Protection

Navigation in Database- GenBank

• Accession number:
– AY693804 - Phytophthora infestans INF2A (Inf2A) gene, complete cds
– Accepted international identifier for nucleic acid and protein sequences
• GI (GenInfo Identifier) number:
– GI:51832280 - Phytophthora infestans INF2A (Inf2A) gene, complete cds
– Identifier used only by GenBank

Page 223: Information Technology in Plant Protection

What is it used for?

• To look up the genetic information of a given organism
• To compare and check our results
• As a basis for comparative experiments
• As a collection of papers and publications
• As a basis for research

Page 224: Information Technology in Plant Protection

Exercise:

– Look up the nucleotide sequence and the protein sequence of a chosen important pathogen and check whether its genome is available.

– Save the result as a text file in FASTA format. Keep the saved file; it is required for the next exercise.

Page 225: Information Technology in Plant Protection

Content

• Comparing two sequences
• Searching for homologous sequences – BLAST
• Nucleotide BLAST – BLASTN
• Protein BLAST – BLASTP
• What is it used for?
• Exercise

Page 226: Information Technology in Plant Protection

How to compare two sequences – Dot matrix

• Dot matrix method (Gibbs and McIntyre, 1970): it compares two amino acid or nucleotide sequences by placing one along the horizontal and the other along the vertical axis of a matrix and drawing a dot wherever the two residues match.

• Well suited to the visual demonstration of mutations, deletions and insertions.
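A dot matrix can be produced with a one-line comparison of every residue pair. This is a minimal sketch without any window or threshold filtering; the two short sequences are invented for the example.

```python
def dot_matrix(seq_a, seq_b):
    """Return the dot plot as text lines: rows follow seq_a, columns follow seq_b."""
    return ["".join("*" if a == b else "." for b in seq_b) for a in seq_a]


plot = dot_matrix("GATTA", "GATA")
for line in plot:
    print(line)
```

Matching regions show up as diagonal runs of dots; a shift of the diagonal indicates an insertion or deletion in one of the sequences.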

Page 227: Information Technology in Plant Protection

How to compare two sequences – Global Sequence Alignment

• Another well-known analytical method is global sequence alignment, which uses dynamic programming.

• Essence of the process: the similarity of the sequences is examined over their whole length with the help of a scoring system.
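The dynamic programming idea can be sketched with the classic Needleman-Wunsch recurrence, which is the standard algorithm for global alignment (the slide itself does not name it). The scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for the example.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch score of the best end-to-end alignment of a and b."""
    # prev[j] holds the best score for aligning the processed prefix of a with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


score_identical = global_alignment_score("GATTACA", "GATTACA")
score_one_gap = global_alignment_score("GATTACA", "GATACA")
```

Every position of both sequences must take part in the alignment, so a missing residue is paid for with a gap penalty.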

Page 228: Information Technology in Plant Protection

How to compare two sequences – Local Sequence Alignment

• It also uses dynamic programming.
• Essence of the process: the similarity of the sequences is examined with the help of a scoring system. It looks for the best-scoring alignment between parts of the sequences.
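For local alignment the standard dynamic programming algorithm is Smith-Waterman (not named on the slide). The sketch below differs from a global alignment only in that scores are never allowed to fall below zero and the best cell anywhere in the matrix is reported. The scoring values are assumptions for the example.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-1):
    """Smith-Waterman score of the best-scoring pair of substrings of a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Resetting to 0 lets an alignment start anywhere in the sequences.
            curr.append(max(0, diag, prev[j] + gap, curr[j - 1] + gap))
            best = max(best, curr[j])
        prev = curr
    return best


shared_motif = local_alignment_score("TTTTACGTTTTT", "GGGACGGGG")
nothing_shared = local_alignment_score("AAA", "TTT")
```

In the first example only the common "ACG" motif contributes to the score; the unrelated flanking regions are ignored.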

Page 229: Information Technology in Plant Protection

Pair Sequence Similarity Search

• Basic Local Alignment Search Tool (BLAST): the most effective and most widely used method of similarity searching
– Characteristics:
• Fast
• Good sensitivity
– Types:
• Blastn – for nucleotide sequences
• Blastp – for proteins
• Blastx – for translated nucleotide sequences

• http://www.ncbi.nlm.nih.gov/blast

Page 230: Information Technology in Plant Protection

Pair Sequence Similarity Search

• Types:

BLAST | Query sequence | Database
Blastn | Nucleotide | Nucleotide
Blastp | Protein | Protein
Blastx | 6-frame translated nucleotide | Protein
Tblastn | Protein | 6-frame translated nucleotide
Tblastx | 6-frame translated nucleotide | 6-frame translated nucleotide

Page 231: Information Technology in Plant Protection


Entering sequences - copy - upload

Page 232: Information Technology in Plant Protection


Most important settings: Blastn – Searching Databases

• The most commonly used one is the non-redundant nucleotide database (the one selected)

• The search can be narrowed by entering taxonomic data in the "Organism" section.

Page 233: Information Technology in Plant Protection

Most important settings: Blastn – program optimisation

• Megablast: searches for matches with 95% or higher similarity; very fast.

• Discontiguous megablast: well suited to comparisons between species; a bit slower.

• Blastn: suitable for comparing any sequences; it also detects low similarities; slow.

Page 234: Information Technology in Plant Protection

Most important settings: Blastp – Searching Databases

• The most commonly used one is the non-redundant protein database (the one selected)

• The search can be narrowed by entering taxonomic data in the "Organism" section.

Page 235: Information Technology in Plant Protection

Most important settings: Blastp – program optimisation

• Blastp: simple searching in protein database.

• PSI-BLAST: searching algorithm with position-specific scoring

• PHI-BLAST: searching with pattern-specific scoring system.

Page 236: Information Technology in Plant Protection

BLAST result evaluation

• Look for those nucleotide sequences which are similar to AY693804 - Phytophthora infestans INF2A gene.

Page 237: Information Technology in Plant Protection

BLAST result evaluation

Search parameters

Graphical representation of the result

Page 238: Information Technology in Plant Protection


BLAST result evaluation

• Result summary table
– Max score
• A higher value means higher similarity
– Query coverage
• A higher value means higher similarity

Page 239: Information Technology in Plant Protection

BLAST result evaluation

• Result summary table
– E value – Expect value
• A lower value means a more significant match (fewer hits with this score are expected by chance).
– Max ident – maximal percent identity
• A higher value means higher similarity

Page 240: Information Technology in Plant Protection

BLAST result evaluation

• Detailed sequence alignment:

Page 241: Information Technology in Plant Protection

BLAST in a local environment

• It is possible to run the BLAST program in a local environment.
– It is useful in the following cases:
• Comparing sequences to local databases
• Operations requiring a large number of calculations

• ftp://ftp.ncbi.nih.gov/blast/
– Run from the command line with parameters.
– Supports many operating systems (both 32- and 64-bit architectures)
– Detailed help
– The first step is formatting the database, the next step is the similarity analysis.
– It can produce many output formats

Page 242: Information Technology in Plant Protection

What can we do with it?

• We have an unknown sequence from an unknown source.
– What can the source be?
– Which gene is it similar to?
– What can be the function of the protein encoded by this sequence?
• Use the sequence in FASTA format as the query in the BLAST program. From the result we can answer the questions above.

Page 243: Information Technology in Plant Protection

Exercise

– Using the FASTA-format sequence saved in the previous session, search for related sequences with a similarity analysis (BLAST).

Page 244: Information Technology in Plant Protection

Contents

• Multiple sequence alignments
• Examining protein sequences
• Protein 3D models
• What is it used for?
• Exercise

Page 245: Information Technology in Plant Protection

Multiple Sequence Alignment

• Essence:
– Aligning several sequences at the same time. Differences are scored with penalties.

• Use:
– Finding common features and parameters
– Classifying new sequences (taxonomy)
– Protein structures
– Phylogenetic analysis

Page 246: Information Technology in Plant Protection

Multiple Sequence Alignment - An example

TTGACATG CCGGGG---A AACCG
TTGACATG CCGGTG--GT AAGCC
TTGACATG -CTAGG---A ACGCG
TTGACATG -CTAGGGAAC ACGCG
TTGACATC -CTCTG---A ACGCG
******** ?????????? *****

• What is the consensus sequence?
• In case of differences it is difficult to detect common patterns. That’s why we use alignment software.
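One simple way to answer the consensus question is a column-by-column majority vote. The sketch below uses the five aligned rows from the slide with the spaces between the blocks removed. Treating gaps as ordinary symbols and taking the most frequent symbol per column is an assumption made for this example; real tools use more refined rules.

```python
from collections import Counter

# The five aligned sequences from the slide, with the block separators removed.
ALIGNMENT = [
    "TTGACATGCCGGGG---AAACCG",
    "TTGACATGCCGGTG--GTAAGCC",
    "TTGACATG-CTAGG---AACGCG",
    "TTGACATG-CTAGGGAACACGCG",
    "TTGACATC-CTCTG---AACGCG",
]


def consensus(rows):
    """Most frequent symbol in each alignment column (gaps count as symbols)."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*rows))


def conserved_columns(rows):
    """Indices of the columns in which every row has the same symbol."""
    return [i for i, column in enumerate(zip(*rows)) if len(set(column)) == 1]


consensus_sequence = consensus(ALIGNMENT)
conserved = conserved_columns(ALIGNMENT)
```

The first and last blocks give a clear consensus, while the middle block, marked with question marks on the slide, depends strongly on how the gaps are placed.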

Page 247: Information Technology in Plant Protection

Multiple sequence alignment

• Types of alignment:
– Manual: done by hand; a laborious and long process. Human errors can occur.
– Automatic: faster, but sometimes it does not take biological requirements into account.
– Combined: the best result is obtained by using the manual and the automatic processes together. First use the computer, then refine the result by hand.

Page 248: Information Technology in Plant Protection

Multiple sequence alignment

• Most widely used program for multiple sequence alignment: CLUSTAL W
– Downloadable local version: http://www.clustal.org
– WWW version: http://www.ebi.ac.uk/clustalw
– Uses the progressive alignment method
– Fast application with low memory usage
– Effective for aligning many sequences
– Able to draw simple phylogenetic trees

Page 249: Information Technology in Plant Protection

Multiple sequence alignment

• Exercise: Perform a multiple alignment on the following sequences:
– Sequence file

http://align.bmr.kyushu-u.ac.jp/mafft/online/server/

Page 250: Information Technology in Plant Protection

Multiple sequence alignment

• Example:
– Examine the catalase sequences of the plants below:
• Paprika (Capsicum annuum)
• Tobacco (Nicotiana tabacum)
• Tomato (Solanum lycopersicum)
• Potato (Solanum tuberosum)

• Sequence file
• http://align.genome.jp/

Page 251: Information Technology in Plant Protection

Multiple sequence alignment

Page 252: Information Technology in Plant Protection

Multiple sequence alignment

The result displayed in the Jalview program.

Page 253: Information Technology in Plant Protection


Phylogenetics

• The study of evolutionary relatedness among various groups of organisms through molecular sequencing data and morphological data matrices.

Page 254: Information Technology in Plant Protection

Phylogenetics

• Relation between phylogenetic analysis and sequence alignment.
– Sequence alignment determines the similarities and differences of the aligned sequences.
– In the case of orthologous sequences, the differences between sequences from different species arise from the mutations accumulated during their separate evolution.

– The number of mutations, that is, the degree of difference between the sequences, is related to the evolutionary distance between the two species: the longer ago the two species separated, the greater the sequence difference in their orthologous genes.
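The simplest numerical measure of this difference is the proportion of differing sites between two aligned sequences (often called the p-distance). The sketch below is illustrative; skipping columns that contain a gap is an assumption of this example, and real phylogenetic programs usually apply a substitution model on top of such raw counts.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences, ignoring gap columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(seq_a, seq_b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


identical = p_distance("ACGT", "ACGT")
one_change = p_distance("ACGT", "ACGA")
```

Computing this value for every pair of sequences in an alignment gives a distance matrix, which is a common starting point for building a tree.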

Page 255: Information Technology in Plant Protection

Phylogenetics

• Steps of phylogenetic analysis:
– 1. Sequence alignment
– 2. Definition of the evolutionary model
– 3. Tree building
– 4. Examination of the tree(s)

Page 256: Information Technology in Plant Protection

Phylogenetics

• Example:
– Perform the multiple sequence alignment with the sequence file below and draw the phylogenetic tree.
– Sequence file

http://align.bmr.kyushu-u.ac.jp/mafft/online/server/

Page 257: Information Technology in Plant Protection

Multiple sequence alignment

• 1. Sequence alignment

Page 258: Information Technology in Plant Protection

Multiple sequence alignment


4. Examination of the tree

Page 259: Information Technology in Plant Protection


Phylogenetics

Page 260: Information Technology in Plant Protection

Use for what?

• Function prediction
• Finding relatives
• Identification

Page 261: Information Technology in Plant Protection

Exercise

– Look for resistance genes with known sequences. Perform a multiple sequence alignment on them. Evaluate the similarities and create a phylogenetic tree.

– What conclusions can be drawn from the result?

Information Technology in Plant Protection

Page 262: Information Technology in Plant Protection

Information Technology in Plant Protection

Bioinformatics - multiple sequence alignment

Protein sequence analysis

Page 263: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Content

• Characteristics of the proteins
• Protein sequence analysis
• Protein 3D models
• Practical task

Information Technology in Plant Protection

Page 264: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Characteristics of the proteins

Proteins are biochemical compounds consisting of one or more polypeptides, typically folded into a globular or fibrous form, that carry out a biological function.

Properties of the genetic code:
• 20 amino acids are encoded by triplets (codons)
• Four types of bases give 4^3 = 64 possible triplets, more than enough to encode 20 amino acids
• 61 triplets code for amino acids
• 3 stop signals (stop codons): UAA, UAG, UGA
• 1 codon signals the start of translation (start codon AUG, methionine)
• Each triplet codes for only one amino acid, but the same amino acid can be specified by several triplets (degeneracy)
• Synonymous triplets: triplets coding for the same amino acid
• The gene and the protein chain it encodes are colinear
• The code is non-overlapping
• The genetic code is universal across living organisms, with few exceptions.
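The counts listed above can be checked directly. The sketch below builds the standard genetic code as a lookup table and counts the triplets, stop codons and synonymous codons. The compact table layout (bases in the order U, C, A, G) is a common convention, and the variable names are illustrative.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated with bases in the order U, C, A, G.
# "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
sense_codons = [c for c, aa in CODON_TABLE.items() if aa != "*"]
# Degeneracy: how many synonymous triplets encode each amino acid.
degeneracy = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

print("triplets:", len(CODON_TABLE))
print("coding triplets:", len(sense_codons))
print("stop codons:", stop_codons)
print("amino acids:", len(degeneracy))
print("codons for leucine (L):", degeneracy["L"])
print("codons for methionine (M):", degeneracy["M"])
```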

Information Technology in Plant Protection

Page 265: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Characteristics of the proteins

Information Technology in Plant Protection

DNA nucleotide order

RNA nucleotide order

Protein amino acid order
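The flow from DNA nucleotide order to RNA nucleotide order to protein amino acid order can be sketched as follows. This is a simplified illustration: it assumes the input is the coding strand, starts translating at the first AUG, and ignores splicing and other processing. The DNA fragment is made up.

```python
from itertools import product

# Standard genetic code (bases ordered U, C, A, G; "*" marks a stop codon).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def transcribe(coding_strand: str) -> str:
    """DNA coding strand -> mRNA: same order, with T replaced by U."""
    return coding_strand.upper().replace("T", "U")

def translate(mrna: str) -> str:
    """Translate from the first AUG until a stop codon or the end of the mRNA."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        aa = CODON_TABLE.get(mrna[i:i + 3], "X")  # "X" for unknown codons
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

dna = "CCATGTCTGCTTCTTGGAAATAAGC"   # made-up fragment
mrna = transcribe(dna)
protein = translate(mrna)
print(mrna)
print(protein)
```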

Page 266: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Characteristics of the proteins

Information Technology in Plant Protection

The code table. (Griffiths et al., An Introduction to Genetic Analysis, 8th Ed. Fig. 9-8.)

Page 267: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Groups of proteins by biological activity:
• Enzymes (pepsin)
• Protective proteins (immunoglobulin)
• Transport proteins (hemoglobin, myoglobin, transferrin)
• Hormones (insulin, ACTH)
• Structural proteins (collagen, elastin, keratin)
• Toxins (snake venom)

Information Technology in Plant Protection

Page 268: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Characteristics of protein sequences

• Protein synthesis video:
– http://www.youtube.com/watch?v=NJxobgkPEAo

• Protein structure video:
– http://www.youtube.com/watch?v=lijQ3a8yUYQ

Information Technology in Plant Protection

Page 269: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Characteristics of protein sequences

• Structures:
– Primary structure: the order of amino acids
• „MSASSSSALPPLVPALYRWK”
– Secondary structure: regularly repeating local spatial structures. The most common examples are:
• Alpha helix and beta sheet
– Tertiary structure: the overall shape of a single protein molecule; the spatial relationship of the secondary structures to one another.
– Quaternary structure: several protein molecules (polypeptide chains), usually called protein subunits in this context, which function as a single protein complex.

Information Technology in Plant Protection

Page 270: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Characteristics of protein sequences

Information Technology in Plant Protection

Page 271: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Characteristics of protein sequences

• Protein databases:
– Protein database: UniProt http://www.uniprot.org/
– Protein structure database: Protein Data Bank http://www.rcsb.org/pdb/home/home.do
– Protein interaction database: STRING http://string.embl.de/

Information Technology in Plant Protection

Page 272: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Protein sequence analysis

• Analysis of the primary structure
– First, we examine the distribution and the physicochemical properties of the amino acids.
• Example: HCA (Hydrophobic Cluster Analysis) of an acetyltransferase protein sequence from Fusarium.

– http://mobyle.rpbs.univ-paris-diderot.fr/cgi-bin/portal.py?form=HCA
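A first look at the distribution and physicochemical properties of the amino acids can be sketched as follows. The example counts the amino acid composition and computes hydropathy values with the Kyte-Doolittle scale. This is not the HCA method itself, only a simpler hydrophobicity measure chosen here as an assumption; the peptide is the primary-structure example used earlier in the slides, and the window size is arbitrary.

```python
from collections import Counter

# Kyte-Doolittle hydropathy scale (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def composition(seq: str) -> Counter:
    """Number of occurrences of each amino acid."""
    return Counter(seq.upper())

def gravy(seq: str) -> float:
    """Grand average of hydropathy: mean Kyte-Doolittle value of the sequence."""
    s = seq.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in s) / len(s)

def hydropathy_profile(seq: str, window: int = 5) -> list:
    """Mean hydropathy in a sliding window along the sequence."""
    s = seq.upper()
    return [
        sum(KYTE_DOOLITTLE[aa] for aa in s[i:i + window]) / window
        for i in range(len(s) - window + 1)
    ]

peptide = "MSASSSSALPPLVPALYRWK"
counts = composition(peptide)
profile = hydropathy_profile(peptide)
print(counts.most_common(3))
print(f"GRAVY = {gravy(peptide):.3f}")
print([round(v, 2) for v in profile])
```

Stretches of positive values in the profile point to hydrophobic regions, which is the kind of signal that cluster-based methods such as HCA examine in more detail.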

Information Technology in Plant Protection

Page 273: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Protein sequence analysis

Information Technology in Plant Protection

Page 274: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Protein sequence analysis

Information Technology in Plant Protection

Page 275: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Protein sequence analysis

• For general similarity searches
• We can gain information from general physical and chemical examinations.

• Determining the structure and the function of a protein from the DNA sequence is a major challenge for today’s technology.

• For instance, several diseases could be defeated if this problem were solved.

Information Technology in Plant Protection

Page 276: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Protein sequence analysis

• Prediction according to dimension:
– 1D: amino acid properties that can be written as a 1D string. Examples: sequence, secondary structure, hydrophobicity
– 2D: distances and contacts between amino acid pairs
– 3D: conformation prediction on the basis of all atom coordinates
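The 2D level, distances and contacts between amino acid pairs, can be derived from 3D coordinates. The sketch below computes a distance matrix and a contact map from one representative point per residue (for example the C-alpha atom). The coordinates are made up, and the 8 Å cutoff and the minimum sequence separation of 2 are commonly used values taken here as assumptions.

```python
import math

def distance_matrix(coords):
    """Pairwise Euclidean distances between residue coordinates."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]

def contact_map(coords, cutoff=8.0, min_separation=2):
    """True where two residues are at most `cutoff` apart and at least
    `min_separation` positions apart in the sequence."""
    dm = distance_matrix(coords)
    n = len(coords)
    return [
        [abs(i - j) >= min_separation and dm[i][j] <= cutoff for j in range(n)]
        for i in range(n)
    ]

# Made-up coordinates (in angstroms) for five consecutive residues.
ca_coords = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (7.6, 0.0, 0.0),
    (7.6, 3.8, 0.0),
    (3.8, 3.8, 0.0),
]

contacts = contact_map(ca_coords)
for row in contacts:
    print("".join("#" if c else "." for c in row))
```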

Information Technology in Plant Protection

Page 277: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Protein sequence analysis

• Common analyses of protein structures:
– 3D structure visualization → some important properties become visible.
– 3D structure alignment → similar structure, similar function.
– 3D structure classification → lineage relationships, similar function.
– 3D structure prediction → secondary, tertiary, quaternary structure prediction.
– Small-molecule docking → drug candidate molecule for a molecule of known structure.
– Protein structure behavior → molecular dynamics simulation.

Information Technology in Plant Protection

Page 278: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Protein 3D models

• In order to understand the function of a protein, we have to know its 3D structure and quaternary structure. We can then infer how it may bind to other molecules, proteins and enzymes.

• Let’s see an example for the acetyltransferase protein sequence from Fusarium:

– http://www.rcsb.org/pdb/explore/explore.do?structureId=3FP0

Information Technology in Plant Protection

Page 279: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Information Technology in Plant Protection

Page 280: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Analysis of protein interactions

• Database of Interacting Proteins
– It collects experimentally determined data about proteins that interact (bind) with each other.
– It describes about 11 000 interactions among about 6200 proteins.
– Details recorded for each interaction: the two proteins, the interacting regions, experimental methods, dissociation constant, references.
– Example: we can show how interaction-network graphs are built (the nodes are clickable).
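How an interaction-network graph is built from pairwise records can be sketched as below. The record fields loosely follow the details listed above, but the class, the field names and the protein names are hypothetical and do not reflect the actual data format of the database.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Interaction:
    """One pairwise interaction record (hypothetical field names)."""
    protein_a: str
    protein_b: str
    method: str
    kd_molar: Optional[float] = None   # dissociation constant, if reported

def build_network(interactions):
    """Undirected graph: each protein maps to the set of its partners."""
    graph = defaultdict(set)
    for record in interactions:
        graph[record.protein_a].add(record.protein_b)
        graph[record.protein_b].add(record.protein_a)
    return dict(graph)

# Made-up records, for illustration only.
records = [
    Interaction("ProtA", "ProtB", "two-hybrid"),
    Interaction("ProtA", "ProtC", "co-immunoprecipitation", kd_molar=1e-7),
    Interaction("ProtC", "ProtD", "two-hybrid"),
]

network = build_network(records)
# Proteins ordered by number of partners (highest first), then by name.
hubs = sorted(network, key=lambda p: (-len(network[p]), p))
for protein in hubs:
    print(protein, sorted(network[protein]))
```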

Information Technology in Plant Protection

Page 281: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Analysis of protein interactions

• http://dip.doe-mbi.ucla.edu/

Information Technology in Plant Protection

Page 282: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Exercise

– Look up the protein sequence and (if it exists) the 3D structure of a protein from an important pathogen in a sequence database of your choice. Examine the possible binding sites.

Information Technology in Plant Protection

Page 283: Information Technology in Plant Protection

Information Technology in Plant Protection

Bioinformatics - Genomes

Page 284: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Content

• What is a genome?
• Genome projects
• Genome browser software
• What is it used for?
• Exercise

Information Technology in Plant Protection

Page 285: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Definition of the genome

• The genome: „The genome is the entirety of an organism's hereditary information. It is encoded either in DNA or, for many types of virus, in RNA. The genome includes both the genes and the non-coding sequences of the DNA/RNA.” (Wikipedia)

Information Technology in Plant Protection

Page 286: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Genomics is the study of genomes

• Genomics: encompasses a broader scope of scientific inquiry and associated technologies than when genomics was initially conceived. A genome is the sum total of all an individual organism's genes. Thus, genomics is the study of all the genes of a cell, or tissue, at the DNA (genotype), mRNA (transcriptome), or protein (proteome) levels.
– Functional genomics: attempts to make use of the vast wealth of data produced by genomic projects
– Structural genomics: attempts to determine the structure of every protein encoded by the genome

Information Technology in Plant Protection

Page 287: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Tools of genomics

• Microarray (or chip): a 2D array on a solid substrate (usually a glass slide or silicon thin-film cell) that assays large amounts of biological material using high-throughput screening methods.

• Types:
– DNA microarrays (oligonucleotides or cDNA)
– Protein microarrays
– Cellular microarrays
– Tissue microarrays
– Antibody microarrays

Information Technology in Plant Protection

Page 288: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Tools of genomics

Information Technology in Plant Protection

Page 289: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Genome projects

• In the last decade several genome projects have started worldwide. The aim of these projects is to determine the complete genetic code of more and more organisms. As of August 2010, the genomes of nearly 2500 species were known completely or partially.

• Important projects:

– http://genome.ucsc.edu

Information Technology in Plant Protection

Page 290: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Genome projects

• The first successfully sequenced genome of a free-living organism was that of Haemophilus influenzae in 1995, by Fleischmann et al. The first plant genome was that of Arabidopsis thaliana in 2000. The sequencing of the human genome was completed in 2003.

• Bioinformatics tools play an important role in the analytical work following genome sequencing. This is the stage at which the genetic code is assembled.
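The idea behind assembly, joining short overlapping reads into a longer sequence, can be illustrated with a toy example. The sketch below repeatedly merges the two reads with the longest suffix-prefix overlap. This greedy strategy is an assumption made for illustration; real assemblers must cope with sequencing errors, repeats and both strands, and work very differently at scale. The reads are made up.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(reads, min_len: int = 3):
    """Merge the best-overlapping pair of reads until no overlaps remain."""
    reads = list(dict.fromkeys(reads))   # drop exact duplicates, keep order
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break                        # nothing left to merge
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads)
                 if idx not in (best_i, best_j)] + [merged]
    return reads

# Made-up reads covering a short made-up sequence.
reads = ["ATGGCGTA", "GCGTACGT", "ACGTTAGC"]
contigs = greedy_assemble(reads)
print(contigs)
```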

Information Technology in Plant Protection

Page 291: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Genome browsers

• Through genome browsers we can view genomic data in a clear, organized format.

• In some cases the genome is the starting point of genetic analysis.

• Some genome browsers:
– http://genome.ucsc.edu
– http://www.ensembl.org
– http://ecrbrowser.dcode.org
– http://www.ncbi.nlm.nih.gov/mapview/

Information Technology in Plant Protection

Page 292: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Genome browsers

• Properties of the UCSC Genome Browser
– News and information on the start page
– Genomes grouped by clade and genome
– Sequence assembly selectable by date (traceable)
– Searching by text

Information Technology in Plant Protection

Page 293: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Chromosome number

Position on Chromosome

Base position

Known genes

Mammalian conservation

Information Technology in Plant Protection

Page 294: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Genome browsers

• Options under the graphical interface:
– Gene and gene prediction options
– mRNA and EST options
– Expression and regulation
– Comparison options
– Variations and repeats

Information Technology in Plant Protection

Page 295: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Genome browsers

• NCBI MapViewer properties
– Four-level system
• Start page
• Genome View
• Map View
• Sequence View
– Keyword searching
– Access to the genomes of 800 species

Information Technology in Plant Protection

Page 296: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Genome browsers

• NCBI MapViewer

Information Technology in Plant Protection

Page 297: Information Technology in Plant Protection

TÁMOP-4.1.2.A/2-10/1-2010-0012

Exercise

– With the help of a genome browser, look up the genome of an important pathogen in a sequence database of your choice. Examine which genes occur around the one-millionth nucleotide on the second chromosome of this organism.
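The same kind of question can be asked programmatically once gene annotations have been downloaded. The sketch below lists the genes that overlap a window around a given position. The `Gene` class, the gene names and the coordinates are hypothetical, the 10 000-base flank is an arbitrary choice, and coordinates are assumed to be 1-based and inclusive.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Gene:
    """Minimal gene annotation (hypothetical structure)."""
    name: str
    chrom: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive

def genes_near(genes, chrom, position, flank=10_000):
    """Genes on `chrom` overlapping [position - flank, position + flank]."""
    lo, hi = position - flank, position + flank
    hits = [g for g in genes
            if g.chrom == chrom and g.start <= hi and g.end >= lo]
    return sorted(hits, key=lambda g: g.start)

# Made-up annotations, for illustration only.
annotations = [
    Gene("geneA", "chr2", 985_000, 992_500),
    Gene("geneB", "chr2", 998_200, 1_003_400),
    Gene("geneC", "chr2", 1_050_000, 1_060_000),
    Gene("geneD", "chr1", 999_000, 1_001_000),
]

nearby = genes_near(annotations, "chr2", 1_000_000)
for gene in nearby:
    print(gene.name, gene.start, gene.end)
```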

Information Technology in Plant Protection

Page 298: Information Technology in Plant Protection

PRESENTATION CAN BE DOWNLOADED FROM:-

Georgikon Faculty

Information Technology in Plant Protection

Prepared by: Dr. János Busznyák, Dr. Máté Csák, Sándor Nagy