IBM Netezza Appliance Models

IBM PureData System for Analytics (Formerly known as, IBM Netezza) - Ravi www.etraining.guru [email protected]

Upload: ravikumar-nandigam

Post on 17-Jul-2015


Page 1: IBM Netezza Appliance Models (By, )

IBM PureData System for Analytics

(Formerly known as, IBM Netezza)

- Ravi

[email protected]


IBM Netezza 100 (Skimmer)


Single-chassis, single-host system

Appliance for data warehouse test and development environments

It is not a High Availability (HA) system

Shares the same architecture and software as the IBM Netezza 1000 (TwinFin) model


IBM Netezza 1000 (TwinFin)


Purpose-built, standards-based data warehouse appliance that architecturally integrates database, server, storage, and advanced analytics capabilities into a single, easy-to-manage system

Supports High Availability (HA)

Fast load speeds: more than 4 TB/hour

Fast backup rates: more than 4 TB/hour

1:1:1 ratio between disks, CPUs, and FPGAs


IBM PureData System for Analytics N1001 (TwinFin)


Update to the IBM Netezza 1000 model, with the same architecture

On 9 October 2012, IBM announced the name change from TwinFin (1000) to PureData System for Analytics (N1001)

1:1:1 ratio between disks, CPUs, and FPGAs


IBM Netezza High Capacity Appliances (C1000 model)


C1000 models are similar to N1001 systems, but with more storage per rack

Scales to more than 10 petabytes of user data capacity

Each Netezza C1000 rack has one S-blade chassis that contains 4 S-blades. Each S-blade has 8 CPU/FPGA processors, the same as the N1001

1 rack = 4 storage arrays (or storage groups)
1 storage array = 1 disk RAID controller + 2 disk enclosures
1 disk RAID controller = 12 disks; 1 disk enclosure = 12 disks
So, 1 storage array = 12 + (2 * 12) = 36 disks
So, 1 rack = 4 storage arrays = 4 * 36 = 144 disks, or 144 TB (which implies 1 TB per disk)
Note: each storage array has 2 spare disks

C1000-4: 1 rack, 1 S-blade chassis; 8 CPU, 8 FPGA; 4 storage groups; 144 TB (8 spares)

C1000-8: 2 racks, 2 S-blade chassis; 16 CPU, 16 FPGA; 8 storage groups; 288 TB (16 spares)

Note: There is no 1:1:1 ratio between disks, CPUs, and FPGAs
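The C1000 storage arithmetic can be checked with a short calculation. This is a minimal sketch in Python: the constant and function names are illustrative only, and the 1 TB disk size is an assumption inferred from the slide equating 144 disks with 144 TB.

```python
# Figures come from the slide; names are illustrative, not an IBM API.
DISKS_PER_RAID_CONTROLLER = 12
DISKS_PER_ENCLOSURE = 12
ENCLOSURES_PER_ARRAY = 2
ARRAYS_PER_RACK = 4
SPARES_PER_ARRAY = 2
DISK_SIZE_TB = 1  # assumption: implied by 144 disks = 144 TB


def c1000_rack_summary(racks: int) -> dict:
    """Return array, disk, capacity, and spare counts for a C1000 with `racks` racks."""
    disks_per_array = (
        DISKS_PER_RAID_CONTROLLER + ENCLOSURES_PER_ARRAY * DISKS_PER_ENCLOSURE
    )
    arrays = racks * ARRAYS_PER_RACK
    disks = arrays * disks_per_array
    return {
        "storage_arrays": arrays,
        "disks": disks,
        "raw_tb": disks * DISK_SIZE_TB,
        "spares": arrays * SPARES_PER_ARRAY,
    }


if __name__ == "__main__":
    for model, racks in (("C1000-4", 1), ("C1000-8", 2)):
        print(model, c1000_rack_summary(racks))
```

With one rack this reproduces the 144 TB and 8 spares listed for the C1000-4, and with two racks the 288 TB and 16 spares listed for the C1000-8.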


IBM PureData System for Analytics N2001 (Striper)


3x faster analytics performance and 50% more usable capacity per rack

128 GB/s scan rate

The 1:1:1 ratio between disks, FPGA engines, and CPU cores does not apply

1 rack = 7 S-blades + 288 disks
1 S-blade = 16 CPU cores + 16 FPGA engines
288 disks = 240 active disks + 34 spares + 14 used for swap/log space
Note: each disk in a Striper is 600 GB (user space: 200 GB, mirror: 200 GB, temp: 200 GB)

Why no 1:1:1 ratio? There is a 1:1 ratio between FPGAs and CPU cores. There is no 1:1 ratio between FPGAs and disks because an FPGA can process data much faster than a disk can deliver it.

2 S-blades have 40 disks each; 5 S-blades have 32 disks each (2 * 40 + 5 * 32 = 240 active disks)

Striper configurations: ½ rack, 1 rack, 2 racks, or 4 racks. There is no longer a ¼ rack configuration as there was with the TwinFin
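The N2001 full-rack numbers above are internally consistent, which a few lines of Python can confirm. This is only a sketch: all names are made up for illustration, and the derived values (cores per rack, active disks per core, total user space) are computed from the slide's figures rather than quoted from IBM.

```python
# Full-rack N2001 figures as given on the slide; names are illustrative.
S_BLADES = 7
CORES_PER_BLADE = 16
FPGAS_PER_BLADE = 16
TOTAL_DISKS = 288
ACTIVE_DISKS, SPARE_DISKS, SWAP_LOG_DISKS = 240, 34, 14
DISK_GB = 600
USER_GB, MIRROR_GB, TEMP_GB = 200, 200, 200
BLADE_DISK_LAYOUT = {40: 2, 32: 5}  # disks per S-blade -> number of such S-blades


def n2001_rack_check() -> dict:
    """Verify the slide's arithmetic and return a few derived values."""
    assert ACTIVE_DISKS + SPARE_DISKS + SWAP_LOG_DISKS == TOTAL_DISKS
    assert USER_GB + MIRROR_GB + TEMP_GB == DISK_GB
    assert sum(BLADE_DISK_LAYOUT.values()) == S_BLADES
    assigned = sum(disks * blades for disks, blades in BLADE_DISK_LAYOUT.items())
    assert assigned == ACTIVE_DISKS

    cores = S_BLADES * CORES_PER_BLADE
    return {
        "cpu_cores": cores,
        "fpga_engines": S_BLADES * FPGAS_PER_BLADE,
        "active_disks": assigned,
        # More active disks than cores, hence no 1:1:1 ratio.
        "active_disks_per_core": assigned / cores,
        # Derived, uncompressed: user partition times active disks.
        "user_space_gb": assigned * USER_GB,
    }


if __name__ == "__main__":
    print(n2001_rack_check())
```

The FPGA and CPU core counts come out equal (the 1:1 ratio), while the active disk count exceeds both.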


IBM PureData System for Analytics N2002 (Striper)


N2002 is the first hardware lifecycle refresh of the IBM PureData System for Analytics N200x family of appliances.


What happens when you submit a query?


When the system starts up, 32 or 40 dataslices are assigned to each S-blade.

At no point during operation can one S-blade access the data on a dataslice that has been assigned to another S-blade.

If an S-blade fails, the dataslices that were assigned to it are rebalanced and assigned to the remaining operational S-blades. This is exactly how later releases of NPS worked on the TwinFin architecture as well.

There is no attachment of CPUs to disks. The only attachment is that each dataslice is assigned to one S-blade when the system starts. CPU and FPGA resources are assigned as they become available.

When a query starts, NPS starts 240 processes, one for each dataslice.

The processes start reading data off disk. As each page of data (128 KB) comes off disk, it is assigned to the first available FPGA on that S-blade.

The FPGA decompresses and filters the data and passes the result back to the process.

The Linux CPU scheduler assigns the process to one of the CPUs, which processes the remaining data that came out of the FPGA.

Once that is complete, the next 128 KB page is read off disk, and this continues until all of the data for the table being scanned has been processed.
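The steps above can be sketched as a toy model in Python. Nothing here is NPS code: the function names are hypothetical, the round-robin redistribution is an assumption (the slide only says the dataslices are rebalanced), and the FPGA and CPU stages are stand-in callables.

```python
from itertools import cycle

PAGE_BYTES = 128 * 1024  # page size quoted on the slide


def assign_dataslices(slices_per_blade):
    """At startup, give each S-blade its own block of dataslice IDs."""
    assignment, next_id = {}, 1
    for blade, count in enumerate(slices_per_blade, start=1):
        assignment[blade] = list(range(next_id, next_id + count))
        next_id += count
    return assignment


def rebalance(assignment, failed_blade):
    """Move the failed S-blade's dataslices to the surviving S-blades.

    Round-robin is an assumption made for this sketch.
    """
    survivors = {b: list(s) for b, s in assignment.items() if b != failed_blade}
    if not survivors:
        raise ValueError("no operational S-blades left")
    targets = cycle(sorted(survivors))
    for dataslice in assignment[failed_blade]:
        survivors[next(targets)].append(dataslice)
    return survivors


def pages_to_scan(slice_bytes):
    """Number of 128 KB pages needed to read slice_bytes (rounded up)."""
    return (slice_bytes + PAGE_BYTES - 1) // PAGE_BYTES


def scan_dataslice(pages, fpga_filter, cpu_process):
    """One process per dataslice: each page goes through the FPGA, then a CPU."""
    results = []
    for page in pages:
        filtered = fpga_filter(page)           # decompress + filter stage
        results.append(cpu_process(filtered))  # remaining work on a CPU
    return results


if __name__ == "__main__":
    layout = [40, 40, 32, 32, 32, 32, 32]  # full-rack Striper: 2 x 40 + 5 x 32
    assignment = assign_dataslices(layout)
    print("query processes:", sum(len(s) for s in assignment.values()))
    after = rebalance(assignment, failed_blade=1)
    print({blade: len(slices) for blade, slices in after.items()})
```

The model keeps the one property the slide stresses: every dataslice belongs to exactly one S-blade at any time, both before and after a failure.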


Find Netezza Model


Option 1: nz_get_model

[/export/home/nz]$ nz_get_model
IBM PureData System for Analytics N2001-010

Option 2: select * from _t_environ where name like '%NPS%'

/export/home/nz-> nzsql -c "select * from _t_environ where name like 'NPS%';"
     NAME     |    VAL
--------------+------------
 NPS_PLATFORM | xs
 NPS_MODEL    | P1000X_A_E
 NPS_FAMILY   | Pseries

Note: Pseries is for TwinFin; Qseries is for Striper
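If the nzsql output is captured as text, the family lookup from the note can be automated. The following is a rough sketch in Python: the helper names are hypothetical, it only parses text shaped like the sample output shown here, and it does not run nzsql itself.

```python
# Mapping taken from the note: Pseries is TwinFin, Qseries is Striper.
FAMILY_TO_ARCHITECTURE = {"Pseries": "TwinFin", "Qseries": "Striper"}

# Sample text copied from the slide's nzsql output.
SAMPLE_OUTPUT = """\
     NAME     |    VAL
--------------+------------
 NPS_PLATFORM | xs
 NPS_MODEL    | P1000X_A_E
 NPS_FAMILY   | Pseries
"""


def parse_environ(output: str) -> dict:
    """Turn 'NAME | VAL' rows into a dict, skipping header and separator lines."""
    rows = {}
    for line in output.splitlines():
        if "|" not in line:
            continue  # separator line or blank line
        name, _, val = line.partition("|")
        name, val = name.strip(), val.strip()
        if name and name != "NAME":
            rows[name] = val
    return rows


def architecture(output: str) -> str:
    """Map NPS_FAMILY to the appliance architecture, or 'unknown'."""
    family = parse_environ(output).get("NPS_FAMILY")
    return FAMILY_TO_ARCHITECTURE.get(family, "unknown")


if __name__ == "__main__":
    print(parse_environ(SAMPLE_OUTPUT))
    print(architecture(SAMPLE_OUTPUT))
```

Any family value other than the two named in the note is reported as unknown rather than guessed.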
