MySQL Cluster 7.2 in the cloud

Implementing MySQL Cluster in the cloud

Marco Tusa

MySQL CTL

May 17 2012

© 2012 Pythian 2

Why Pythian

• Recognized Leader:
  • Global industry leader in remote database administration services and consulting for Oracle, Oracle Applications, MySQL and SQL Server
  • Work with over 165 multinational companies such as Forbes.com, Fox Sports, Nordion and Western Union to help manage their complex IT deployments

• Expertise:
  • One of the world’s largest concentrations of dedicated, full-time DBA expertise. Employ 7 Oracle ACEs/ACE Directors
  • Hold 7 Specializations under Oracle Platinum Partner program, including Oracle Exadata, Oracle GoldenGate & Oracle RAC

• Global Reach & Scalability:
  • 24/7/365 global remote support for DBA and consulting, systems administration, special projects or emergency response


Who am I?

• Cluster Technical Leader at Pythian for MySQL technology

• Previously manager of Professional Services, South EMEA, at MySQL/SUN/Oracle

• In MySQL since before SUN acquired it

• Led the team responsible for Oracle & MySQL database services supporting technical systems at the Food and Agriculture Organization of the United Nations (FAO)

• Led developer & system administrator teams at FAO managing the Intranet/Internet infrastructure

• Worked (a lot) in developing countries (Ethiopia, Senegal, Ghana, Egypt …)

• My profile: http://it.linkedin.com/in/marcotusa

• Email [email protected] [email protected]


What we will talk about

Practical aspects related to:
1. Amazon images to use and node identification
2. MySQL Cluster set-up
3. Cluster parameters dimensioning and definition
4. Starting and checking the cluster
5. Distribution Awareness & AQL
6. Taking numbers (doing tests)
7. Results and comments


What we will NOT talk about

• What is MySQL Cluster
• What is a data node/node group/manager
• What is a fragment
• What are Cluster data replicas

I assume you know MySQL Cluster basics.

Other webinars:
• MySQL Cluster for beginners
• Use MySQL Cluster replication
• Use the Java API for MySQL Cluster
• Monitor MySQL Cluster


Amazon image to use

The choice needs to take into account:

• Memory requirements (from the dataset calculation)

• CPU numbers (from the real workload)

• Disk configuration (from transaction modifications/sec)


Amazon image to use

Choose an image ranging from Large (2 cores, 7.5 GB) to High-Memory Quadruple Extra Large (8 cores, 68.4 GB). Wait until you have the information on the data set.

But never go below Large! At least 2 CPUs are needed to manage the kernel blocks efficiently. 7.5 GB of RAM means a data allocation of 4 GB per node.

Cluster is an in-memory database, BUT it flushes to disk a lot; do not use disk-based tables (Table on disk) on EC2.


Amazon image to use

Brief consideration on disk configuration. Cluster flushes its status constantly (unless DISKLESS is defined), so using ephemeral disks is not a good idea:
• In case of a ZONE crash you will lose local data
• Performance is less consistent than with EBS and RAID

I achieved the best stability using:
• 6 EBS volumes (or 4)
• RAID 0

If possible, split the REDO log from DATA:

Datadir=/opt/mysql-cluster/datacluster
FileSystemPath=/opt2/mysql-cluster/datacluster


Amazon image to use

Brief consideration on network configuration

Cluster nodes need to talk to each other, and the addressing needs to be consistent. To avoid issues:

• Create your own VPC
• Associate a network device with a defined IP (10.0.1.138)
• Name the device after the node so it is easy to remember: cluster1_ndbmtd_4 is associated with data node 4 in cluster 1
• Set the IPs in the config to match the internal IPs


Amazon image to use

Dimensioning the cluster for the dataset as it is right now:
• Use a fake/local cluster installation
• Use Sizer from www.severalnines.com
• Estimate the real requirements

Pay particular attention to:
• DataMemory
• IndexMemory

Play with the number of nodes until your configuration matches the requirements. DO NOT CHANGE the number of replicas (never use 1).
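A back-of-the-envelope check can complement Sizer before choosing instance sizes. The sketch below uses the rule of thumb from the MySQL Cluster FAQ (database size x replicas x 1.1, divided by the number of data nodes). The row counts and the ~355 and ~209 byte record sizes are the ones quoted elsewhere in this deck, and the function name is only illustrative. It ignores indexes and per-row overhead, so treat the result as a lower bound rather than as a replacement for Sizer.

```python
def memory_per_data_node(db_bytes, replicas, data_nodes, overhead=1.1):
    """Rule-of-thumb RAM per data node: (size * replicas * overhead) / nodes."""
    return db_bytes * replicas * overhead / data_nodes


# Assumed data set: 4M rows of ~355 bytes plus 400K rows of ~209 bytes.
db_bytes = 4_000_000 * 355 + 400_000 * 209

per_node = memory_per_data_node(db_bytes, replicas=2, data_nodes=4)
print(f"raw data: {db_bytes / 1e9:.2f} GB, per data node: {per_node / 1e9:.2f} GB")
```

Changing data_nodes while keeping replicas=2 is the "play with the number of nodes" step in miniature.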


Amazon image to use

Dimensioning the cluster dataset.

Assuming our requirements are:
• 4 million rows in tbtest1
• 400 thousand rows in tbtest2

We will need:
• 4 data nodes
• 2 node groups
• Allocated DataMemory = ~3GB

For our test we have 2 tables:

CREATE TABLE `tbtest1` (
  `a` int(11) NOT NULL,
  `uuid` char(36) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
  `b` varchar(100) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
  `c` char(200) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
  `counter` bigint(20) DEFAULT NULL,
  `time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `partitonid` int(11) NOT NULL DEFAULT '0',
  `strrecordtype` char(3) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
  PRIMARY KEY (`uuid`),
  KEY `IDX_a` (`a`)
) ENGINE=ndbcluster DEFAULT CHARSET=latin1;

CREATE TABLE `tbtest2` (
  `a` int(11) NOT NULL,
  `stroperation` mediumtext CHARACTER SET utf8 COLLATE utf8_bin,
  `time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`a`)
) ENGINE=ndbcluster DEFAULT CHARSET=latin1;


MySQL cluster set-up
What our cluster architecture will look like:

Node group 1 Node group 2

Management Nodes

MySQL servers

Applications servers


MySQL cluster set-up

Shopping list, what we will need:
• 4 Large instances (at least 2 CPUs) for data nodes
• 2 Small instances for MySQL
• 2 Small instances for management nodes (they can be placed on the MySQL instances)
• 6 x 4 = 24, x 2 = 48 EBS volumes in RAID 0 for data nodes


MySQL cluster set-up
Setting up one instance and then creating your own AMI from it will be faster.

• OS: Red Hat 6
• Install packages (htop; sysstat; oprofile)
• Install EC2 command line tools (http://s3.amazonaws.com/ec2-downloads/ec2-api-tools.zip)
• Bring Sizer with you as well


MySQL cluster set-up
EBS creation and configuration:

1. for x in {1..6}; do ec2-create-volume -s 8 -z us-east-1b; done > ebs.txt

2. (i=0; for vol in $(awk '{print $2}' ebs.txt); do i=$((i+1)); ec2-attach-volume $vol -i <INSTANCENAME> -d /dev/sdc${i}; done)

3. mdadm --verbose --create /dev/md0 --level=0 --chunk=256 --raid-devices=6 /dev/xvdg1 /dev/xvdg2 /dev/xvdg3 /dev/xvdg4 /dev/xvdg5 /dev/xvdg6

4. echo 'DEVICE /dev/xvdg1 /dev/xvdg2 /dev/xvdg3 /dev/xvdg4 /dev/xvdg5 /dev/xvdg6' | tee -a /etc/mdadm.conf

5. sudo mdadm --detail --scan | sudo tee -a /etc/mdadm.conf

6. blockdev --setra 128 /dev/md0
   for dev in /dev/xvdg{1..6}; do blockdev --setra 128 $dev; done

Finally, create an LVM volume based on /dev/md0. Ready to create the AMI now.


Cluster parameters definition
First review the MySQL Cluster configuration and the MySQL configuration.

ALWAYS use IPs, not machine names or DNS tricks.

MySQL (my.cnf):
skip-name-resolve
query_cache_type=0
query_cache_size=0
query_cache_limit=0M
ndb-cluster-connection-pool=4
ndb-use-exact-count=0
ndb-extra-logging=1
ndb-autoincrement-prefetch-sz=1024
engine-condition-pushdown=1
ndb_join_pushdown=1
ndb_optimized_node_selection=3

Data nodes (config.ini):
Datadir=/opt/mysql-cluster/datacluster
LockPagesInMainMemory=1
FileSystemPath=/opt2/mysql-cluster/datacluster

Fragments better to be 1 time Data memory:
FragmentLogFileSize=256M
InitFragmentLogFiles=FULL
NoOfFragmentLogFiles=18

Some buffers are also used to manage regular data:
BackupDataBufferSize=32M
BackupLogBufferSize=32M
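To see how much disk the REDO log settings above imply, here is a small sketch. It assumes the formula given in the MySQL Cluster reference manual, where fragment log files are allocated in sets of 4, so the total is NoOfFragmentLogFiles x 4 x FragmentLogFileSize per data node. The function name is illustrative only.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def redo_log_bytes(no_of_fragment_log_files, fragment_log_file_size):
    """Total REDO log size per data node; log files come in sets of 4."""
    return no_of_fragment_log_files * 4 * fragment_log_file_size


# Values from the config.ini fragment: NoOfFragmentLogFiles=18, FragmentLogFileSize=256M
total = redo_log_bytes(18, 256 * MIB)
print(f"REDO log per data node: {total / GIB:.0f} GiB")
```

That space has to fit on the FileSystemPath volume, and with InitFragmentLogFiles=FULL it is written out completely during an initial start.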


Starting MySQL Cluster
Start ndb_mgmd as usual:
bin/ndb_mgmd -f config.ini --ndb-nodeid=1 --config-dir=`pwd` --initial

Start the data nodes:
bin/ndbmtd -c 10.118.19.9:1186 --ndb-nodeid=6 --initial

Check the log for the connection message, and the node status from the management client:
ndb_mgm> all status

Connected to Management Server at: localhost:1186

2012-05-15 11:27:41 [MgmtSrvr] INFO -- Node 3: Started (mysql-5.5.15 ndb-7.2.2)




…

Node 3: starting (Last completed phase 1) (mysql-5.5.15 ndb-7.2.2)

Node 6: starting (Last completed phase 1) (mysql-5.5.15 ndb-7.2.2)


Starting MySQL Cluster
And iostat:

avg-cpu: %user %nice %system %iowait %steal %idle

0.56 0.00 8.43 86.80 0.28 3.93

Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util

xvdep1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

xvdj 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

xvdk 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

xvdgp1 0.00 4515.00 0.00 514.00 0.00 21.08 84.00 45.60 105.03 1.73 88.70

xvdgp2 0.00 4451.00 0.00 443.00 0.00 15.29 70.68 40.23 76.52 1.53 67.70

xvdgp3 0.00 4592.00 0.00 481.00 0.00 20.63 87.83 106.26 241.28 2.08 100.00

xvdgp4 0.00 4503.00 0.00 357.00 0.00 13.52 77.54 97.62 229.52 2.29 81.80

xvdgp5 0.00 4473.00 0.00 478.00 0.00 17.65 75.63 46.91 94.89 1.55 74.00

xvdgp6 0.00 4513.00 0.00 448.00 0.00 18.17 83.07 67.49 149.76 1.92 86.10

xvdf 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

md127 0.00 0.00 0.00 30023.00 0.00 117.28 8.00 0.00 0.00 0.00 0.00

dm-0 0.00 0.00 0.00 30023.00 0.00 117.28 8.00 4387.88 144.10 0.03 100.00

dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00


Cluster logs to check

• MySQL log:

120515 12:11:15 [Note] NDB: NodeID is 7, management server '10.114.122.44:1186'
120515 12:11:15 [Note] NDB[0]: NodeID: 7, all storage nodes connected
120515 12:11:16 [Note] NDB[1]: NodeID: 8, all storage nodes connected
120515 12:11:17 [Note] NDB[2]: NodeID: 23, all storage nodes connected
120515 12:11:17 [Note] NDB[3]: NodeID: 24, all storage nodes connected

• Cluster general log in: LogDestination=FILE:filename=ndb_1_cluster.log

• Data node log in: Datadir=/opt/mysql-cluster/datacluster


Cluster data nodes Kernel blocks
Check what is going on inside the data node and on our CPUs:

thr: 0 tid: 19836 (main) cpu: 0 OK DBTC(0) DBDIH(0) DBDICT(0) NDBCNTR(0) QMGR(0) NDBFS(0) TRIX(0) DBUTIL(0) DBSPJ(0)

thr: 1 tid: 19837 (rep) cpu: 0 OK BACKUP(0) DBLQH(0) DBACC(0) DBTUP(0) SUMA(0) DBTUX(0) TSMAN(0) LGMAN(0) PGMAN(0) RESTORE(0) DBINFO(0) PGMAN(5)

thr: 2 tid: 19838 (ldm) cpu: 1 OK PGMAN(1) DBACC(1) DBLQH(1) DBTUP(1) BACKUP(1) DBTUX(1) RESTORE(1)

thr: 3 tid: 19829 (recv) CMVMI(0)

2012-05-15 12:30:18 [ndbd] INFO -- Start initiated (mysql-5.5.15 ndb-7.2.2)

NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer

2012-05-15 12:30:18 [ndbd] WARNING -- Ndb kernel thread 1 is stuck in: Unknown place elapsed=9

2012-05-15 12:30:18 [ndbd] INFO -- Watchdog: User time: 3 System time: 47

Locked to CPU ok

Kernel blocks will be allocated as we define in config.ini:
ThreadConfig=ldm={count=1,cpubind=1},main={cpubind=0},rep={cpubind=0},io={count=1,cpubind=1}


Cluster Kernel
So was that good enough?
• Yes for small load/traffic
• No for more complex and demanding use

Why? Because the kernel blocks are, in the end, still one on top of the other.

Better optimization of CPU usage starts with 4 CPUs.

Kernel block description (http://dev.mysql.com/doc/ndbapi/en/ndb-internals-kernel-blocks.html)


Query Cluster’s State
Before moving ahead, review how to access cluster information.

Configuration:
bin/ndb_config --type=ndbd --query=id,host,datamemory,indexmemory,datadir -f ' : ' -r '\n'

3 : 10.118.19.9 : 4258267136 : 532676608 : /opt/mysql-cluster/datacluster
…
6 : 10.83.90.94 : 4258267136 : 532676608 : /opt/mysql-cluster/datacluster

Table status:
bin/ndb_desc -u tbtest1 tbtest2 -d test

Table content:
bin/ndb_select_all -c 127.0.0.1 tbtest1 -d test

a uuid b c counter time partitonid strrecordtype

Table count:
bin/ndb_select_count -c 127.0.0.1 -d test tbtest1 tbtest2

0 records in table tbtest1
0 records in table tbtest2


MySQL Cluster distribution awareness
Cluster distributes the data in fragments, using horizontal partitioning. With 2 replicas (NoOfReplicas=2) and 4 data nodes, the result is this distribution.

Table TBTEST1

Partition 1

Partition 2

Partition 3

Partition 4

Node Group 2

Node Group 1

F2 F4

F4 F2

F1 F3

F3 F1


MySQL distribution awareness

• Cluster has internal partitioning, based on the primary key.

• By default Cluster distributes the data across the data nodes round robin (RR), to ensure equal data distribution.

• Data that could logically be grouped together can reside on different fragments. This results in additional work for the Transaction Coordinator.

• Creating an explicit partition by key guarantees that similar data will reside on the same fragment.

… PRIMARY KEY (`a`, `id`)) ENGINE=ndbcluster PARTITION BY KEY(`a`);

• When fetching the data, Cluster (the TC) will fetch it from ~one single fragment.
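The effect of PARTITION BY KEY can be illustrated with a toy model. The sketch below is not the cluster's real distribution code: it simply hashes a key with MD5 and takes it modulo 4 partitions (one per data node in the example), and the function and variable names are hypothetical. It compares distributing rows on the full primary key (a, id) with distributing on column a only.

```python
import hashlib
from collections import defaultdict

PARTITIONS = 4  # assumed: one partition per data node in the 4-node example


def partition_for(*key_parts):
    """Toy stand-in for the cluster hash: MD5 of the key, modulo partitions."""
    raw = "|".join(str(part) for part in key_parts).encode()
    return int.from_bytes(hashlib.md5(raw).digest()[:4], "big") % PARTITIONS


# 200 rows (id, a) that all share the same value a = 42
rows = [(row_id, 42) for row_id in range(1, 201)]

by_pk = defaultdict(int)  # distribute on the full primary key (a, id)
by_a = defaultdict(int)   # PARTITION BY KEY(a): distribute on a only
for row_id, a in rows:
    by_pk[partition_for(a, row_id)] += 1
    by_a[partition_for(a)] += 1

print("partitions touched, full PK:", len(by_pk))
print("partitions touched, KEY(a): ", len(by_a))
```

With the key restricted to a, every row for a given value lands on one partition, which is what lets the Transaction Coordinator read from a single fragment.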


MySQL AQL: Adaptive Query Localization

Why is it so important?
• Reduces the round trips on the network for data subsets

• Reduces the work on the MySQL nodes

• Improves data collection on the data nodes through parallelism

• Returns only the final data set to the MySQL node

All of these reduce overhead on EC2, improving performance.

Condition pushdown and join pushdown are very relevant in an EC2 environment.


MySQL AQL what to do

What we must make sure to have: Primary Keys
• ALWAYS DEFINE A PRIMARY KEY ON THE TABLE!

• A hidden PRIMARY KEY is added if no PK is specified.

Not using a primary key is BAD. For example, such tables are not replicated between clusters.

So even if you don’t need it, create an ID: `ID` BIGINT AUTO_INCREMENT PRIMARY KEY


MySQL AQL what to do

• Joined columns must be of identical types
• No reference to BLOB or TEXT columns
• No explicit lock (SELECT ... FOR UPDATE)
• Child tables in the join must be accessed using one of ref, eq_ref, or const
• Do not partition by [LINEAR] HASH, LIST, or RANGE
• Avoid 'Using join buffer' in the plan
• If the root of the join is an eq_ref or const, child tables must be joined by eq_ref
• Avoid range

ANALYZE TABLE is not an option, it is a MUST


MySQL AQL what to do
Using our test schema to match the requirements:

CREATE TABLE `tbtest1` (
  `a` int(11) NOT NULL,
  `uuid` char(36) NOT NULL,
  `b` varchar(100) NOT NULL,
  `c` char(200) NOT NULL,
  `counter` bigint(20) DEFAULT NULL,
  `time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `partitonid` int(11) NOT NULL DEFAULT '0',
  `strrecordtype` char(3) DEFAULT NULL,
  PRIMARY KEY (`uuid`),
  KEY `IDX_a` (`a`)
) ENGINE=ndbcluster DEFAULT CHARSET=latin1;

CREATE TABLE `tbtest2` (
  `id` int AUTO_INCREMENT NOT NULL,
  `a` int(11) NOT NULL,
  `stroperation` varchar(200),
  `time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`,`a`)
) ENGINE=ndbcluster PARTITION BY KEY(`a`);

We have to modify:
• The primary key
• Add partitioning
• Change datatype


MySQL load data and test
Our final test environment:
• 4 NDB Data nodes
• 2 NDB MGM
• 2 to 6 MySQL nodes

Our test schema: • 1 Main table each record ~355 bytes • 3 secondary tables, each record ~209 bytes for

Plus indexes


MySQL load data and test
The tests performed were focused on:
• Inserts
  • Check if the implemented platform was managing the load
  • Identify the possible limit on scaling
  • Identify how to go beyond that limit
• Selects
  • Validate the condition pushdown & join pushdown
  • Identify common mistakes in joins
  • Get select numbers

Inserts were run with 2 up to 42 threads pushing to each MySQL server.


MySQL load data and test
Numbers related to the test:

+---------+--------------+------------+------------+------------+-------------+
| node_id | memory_type  | used       | used_pages | total      | total_pages |
+---------+--------------+------------+------------+------------+-------------+
|       3 | Data memory  | 4102160384 |     125188 | 4258267136 |      129952 |
|       3 | Index memory |  101687296 |      12413 |  532938752 |       65056 |
|       4 | Data memory  | 4102193152 |     125189 | 4258267136 |      129952 |
|       4 | Index memory |  101695488 |      12414 |  532938752 |       65056 |
|       5 | Data memory  | 4107534336 |     125352 | 4258267136 |      129952 |
|       5 | Index memory |  102465536 |      12508 |  532938752 |       65056 |
|       6 | Data memory  | 4106977280 |     125335 | 4258267136 |      129952 |
|       6 | Index memory |  102522880 |      12515 |  532938752 |       65056 |
+---------+--------------+------------+------------+------------+-------------+

Rows:
Tbtest1: 6,851,215
Tbtest2: 32,320
Tbtest3-4: 678,720
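The memory numbers are easier to read as percentages. The sketch below copies the values from the table above into a list and derives the page size and the fill ratio for each row; the helper name is illustrative. It shows DataMemory at roughly 96% on every node while IndexMemory sits near 19%, and the byte and page counts work out to 32 KB data pages and 8 KB index pages.

```python
# (node_id, memory_type, used_bytes, used_pages, total_bytes, total_pages)
MEMORY_USAGE = [
    (3, "Data memory", 4102160384, 125188, 4258267136, 129952),
    (3, "Index memory", 101687296, 12413, 532938752, 65056),
    (4, "Data memory", 4102193152, 125189, 4258267136, 129952),
    (4, "Index memory", 101695488, 12414, 532938752, 65056),
    (5, "Data memory", 4107534336, 125352, 4258267136, 129952),
    (5, "Index memory", 102465536, 12508, 532938752, 65056),
    (6, "Data memory", 4106977280, 125335, 4258267136, 129952),
    (6, "Index memory", 102522880, 12515, 532938752, 65056),
]


def summarize(rows):
    """Return (node_id, memory_type, page_size_bytes, percent_used) per row."""
    report = []
    for node_id, mem_type, used, _used_pages, total, total_pages in rows:
        page_size = total // total_pages
        report.append((node_id, mem_type, page_size, round(100 * used / total, 1)))
    return report


for node_id, mem_type, page_size, pct in summarize(MEMORY_USAGE):
    print(f"node {node_id} {mem_type:<12} page={page_size // 1024} KB used={pct}%")
```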


MySQL load data and test
Results for 1 MySQL server

Inserts per second were decent considering the platform and the single server. The best performance was at 14 threads; the limit came from the load on the MySQL node, not from the NDB side.



Inserts per second were much better, as expected.

The best performance was at 18 & 36 threads, given the load on MySQL. I suspected an EBS issue, but repeating the tests confirmed the numbers.



I had a lot of hiccups, with the inserts increasing and decreasing. Again this was mainly due to the MySQL nodes being busier than NDB, but NDB was also starting to suffer, especially on the EBS and CPU side, both expected.


MySQL load data and test
One graph is worth a thousand words:


MySQL load data and test
IO on disks:

SINGLE MYSQL
avg-cpu: %user %nice %system %iowait %steal %idle
0.78 0.00 1.03 7.75 0.00 90.44

Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
xvdep1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
xvdj 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
xvdk 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
xvdgp3 0.00 338.00 0.00 122.00 0.00 1.80 30.16 1.72 14.11 0.66 8.00
xvdgp4 0.00 344.00 0.00 116.00 0.00 1.80 31.72 2.09 18.04 0.75 8.70
xvdgp5 0.00 288.00 0.00 104.00 0.00 1.53 30.15 2.93 28.21 1.61 16.70
xvdgp6 0.00 287.00 0.00 97.00 0.00 1.50 31.67 1.89 19.49 0.85 8.20
xvdf 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
xvdgp1 0.00 346.00 0.00 110.00 0.00 1.78 33.16 1.59 14.47 0.61 6.70
xvdgp2 0.00 346.00 0.00 110.00 0.00 1.78 33.16 2.75 25.01 1.02 11.20
dm-0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
md0 0.00 0.00 0.00 2614.00 0.00 10.19 7.98 0.00 0.00 0.00 0.00
dm-1 0.00 0.00 0.00 2611.00 0.00 10.19 7.99 80.55 30.85 0.07 17.80

TWO MYSQL
avg-cpu: %user %nice %system %iowait %steal %idle
0.78 0.00 1.30 14.03 0.00 83.90

Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
xvdep1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
xvdj 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
xvdk 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
xvdgp3 0.00 432.00 0.00 119.00 0.00 1.59 27.36 1.85 13.59 0.67 8.00
xvdgp4 0.00 457.00 0.00 115.00 0.00 1.78 31.72 2.10 15.93 0.77 8.80
xvdgp5 0.00 457.00 0.00 120.00 0.00 1.80 30.73 10.11 82.11 2.52 30.30
xvdgp6 0.00 452.00 0.00 125.00 0.00 1.80 29.50 2.17 15.22 0.72 9.00
xvdf 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
xvdgp1 0.00 489.00 0.00 139.00 0.00 1.66 24.52 1.41 8.54 0.51 7.10
xvdgp2 0.00 487.00 0.00 120.00 0.00 1.59 27.07 1.91 14.26 0.72 8.70
dm-0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
md0 0.00 0.00 0.00 3648.00 0.00 14.22 7.98 0.00 0.00 0.00 0.00
dm-1 0.00 0.00 0.00 3644.00 0.00 14.22 7.99 104.09 26.79 0.09 31.70


MySQL read data and test
Second, remember to run ANALYZE on your tables.

(root@localhost) [test]select count(tbtest4.a) from tbtest4, tbtest1 where tbtest1.a=tbtest4.a and tbtest1.a=346424503;
+------------------+
| count(tbtest4.a) |
+------------------+
|             1491 |
+------------------+
1 row in set (1.71 sec)

(root@localhost) [test]analyze table tbtest1;
+--------------+---------+----------+----------+
| Table        | Op      | Msg_type | Msg_text |
+--------------+---------+----------+----------+
| test.tbtest1 | analyze | status   | OK       |
+--------------+---------+----------+----------+
1 row in set (32.46 sec)

(root@localhost) [test]select count(tbtest4.a) from tbtest4, tbtest1 where tbtest1.a=tbtest4.a and tbtest1.a=346424503;
+------------------+
| count(tbtest4.a) |
+------------------+
|             1491 |
+------------------+
1 row in set (0.03 sec)


MySQL read data and test
Why do these two queries return the same result, yet the second takes much longer?

(root@localhost) [test]select count(tbtest4.a) from tbtest4, tbtest1 where tbtest1.a=tbtest4.a and tbtest1.a=346424503;
+------------------+
| count(tbtest4.a) |
+------------------+
|             1491 |
+------------------+
1 row in set (0.03 sec)

(root@localhost) [test]select count(tbtest2.a) from tbtest2, tbtest1 where tbtest1.a=tbtest2.a and tbtest1.a=346424503;
+------------------+
| count(tbtest3.a) |
+------------------+
|             1491 |
+------------------+
1 row in set (1.64 sec)


MySQL read data and test
Why do these two queries return the same result, yet the second takes much longer?

Because the second did not match one of the conditions for join pushdown.

+--------------+------------+------+-----+-------------------+-----------------------------+
| Field        | Type       | Null | Key | Default           | Extra                       |
+--------------+------------+------+-----+-------------------+-----------------------------+
| id           | int(11)    | NO   | PRI | NULL              | auto_increment              |
| a            | int(11)    | NO   | PRI | NULL              |                             |
| stroperation | mediumtext | YES  |     | NULL              |                             |
| time         | timestamp  | NO   |     | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
+--------------+------------+------+-----+-------------------+-----------------------------+
4 rows in set (0.00 sec)


MySQL read data and test
Another aspect that we must take into consideration and be careful about:

(root@localhost) [test]explain Select … from test.tbtest1, test.tbtest2 where tbtest1.a = tbtest2.a and tbtest1.a > 822845727 and tbtest1.a <1362834750;
+----+-------------+---------+-------+---------------+------+---------+----------------+------+-----------------------------------+
| id | select_type | table   | type  | possible_keys | key  | key_len | ref            | rows | Extra                             |
+----+-------------+---------+-------+---------------+------+---------+----------------+------+-----------------------------------+
|  1 | SIMPLE      | tbtest2 | range | a             | a    | 4       | NULL           | 1616 | Using where with pushed condition |
|  1 | SIMPLE      | tbtest1 | ref   | a,IDX_a       | a    | 4       | test.tbtest2.a |    1 |                                   |
+----+-------------+---------+-------+---------------+------+---------+----------------+------+-----------------------------------+

(root@localhost) [test]explain Select … from test.tbtest1, test.tbtest4 where tbtest1.a = tbtest4.a and tbtest1.a > 822845727 and tbtest1.a <1362834750;
+----+-------------+---------+-------+---------------+------+---------+----------------+-------+--------------------------------------------------------------+
| id | select_type | table   | type  | possible_keys | key  | key_len | ref            | rows  | Extra                                                        |
+----+-------------+---------+-------+---------------+------+---------+----------------+-------+--------------------------------------------------------------+
|  1 | SIMPLE      | tbtest4 | range | PRIMARY,a     | a    | 4       | NULL           | 33936 | Parent of 2 pushed join@1; Using where with pushed condition |
|  1 | SIMPLE      | tbtest1 | ref   | a,IDX_a       | a    | 4       | test.tbtest4.a |     1 | Child of 'tbtest4' in pushed join@1                          |
+----+-------------+---------+-------+---------------+------+---------+----------------+-------+--------------------------------------------------------------+


MySQL read data and test
In the first query we have the mediumtext column, so condition pushdown applies but join pushdown does not. In the second query a range was used first, then join pushdown. This has a very bad effect on performance, because a range can scan across nodes and take a lot of resources = SLOW! As a matter of fact:

(root@localhost) [test] Select count(tbtest1.a) from test.tbtest1, test.tbtest4 where tbtest1.a = tbtest4.a and tbtest1.a > 822845727 and tbtest1.a <1362834750;
+------------------+
| count(tbtest1.a) |
+------------------+
|           168651 |
+------------------+
1 row in set (8.81 sec)


MySQL read data and test
Just for fun, let us see what happens with subqueries. I know it will take ages:

(root@localhost) [test]explain select count(tbtest1.a) from tbtest1 where tbtest1.a IN (select tbtest4.a from tbtest4 where tbtest4.a > 1362834750)\G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: tbtest1
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 6851215
        Extra: Using where
*************************** 2. row ***************************
           id: 2
  select_type: DEPENDENT SUBQUERY
        table: tbtest4
         type: index_subquery
possible_keys: PRIMARY,a
          key: PRIMARY
      key_len: 4
          ref: func
         rows: 2112
        Extra: Using where

From the process list:
     Id: 275
   User: root
   Host: localhost
     db: test
Command: Query
   Time: 3442
  State: preparing
   Info: select count(tbtest1.a) from tbtest1 where tbtest1.a IN (select tbtest4.a from tbtest4 where tbtest4

And counting …


MySQL read data and test
Rewrite the same query as a join:

(root@localhost) [test]explain select count(tbtest1.a) from tbtest1 LEFT join tbtest4 on tbtest4.a=tbtest1.a where tbtest4.a > 1362834750\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: tbtest4
         type: range
possible_keys: PRIMARY,a
          key: a
      key_len: 4
          ref: NULL
         rows: 67872
        Extra: Parent of 2 pushed join@1; Using where with pushed condition
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: tbtest1
         type: ref
possible_keys: a,IDX_a
          key: a
      key_len: 4
          ref: test.tbtest4.a
         rows: 1
        Extra: Child of 'tbtest4' in pushed join@1
2 rows in set (0.00 sec)

(root@localhost) [test]select count(tbtest1.a) from tbtest1 LEFT join tbtest4 on tbtest4.a=tbtest1.a where tbtest4.a > 1362834750;
+------------------+
| count(tbtest1.a) |
+------------------+
|           193074 |
+------------------+
1 row in set (13.86 sec)

Not excellent, because of the range, but … at least it takes 13 seconds.


MySQL read data and test
My comments on NDBCluster, read side:

• Was/is not good at performing complex reads

• It is now better at performing special joins

• Ranges slow it down a lot

• Subqueries still kill it, really! AVOID!

• Selects by PK are fast, really fast

• Selects using batch IN ( ….) are good

Kudos to the developer team for the join pushdown work, but there is still a long way to go before declaring it ready for complex SQL retrievals.


When to Consider Cluster

• What are the consequences of downtime or failing to meet performance requirements?

• How much effort and $ is spent in developing and managing HA in your applications?

• Are you considering sharding your database to scale write performance? How does that impact your application and developers?

• Do your services need to be real-time?

• Will your services have unpredictable scalability demands, especially for writes ?

• Do you want the flexibility to manage your data with more than just SQL ?


When NOT to Consider Cluster
• Data sets >3TB (unless special HW is in place)
• Replicate cold data to InnoDB
• Long running transactions
• Large rows, without using BLOBs
• Foreign Keys
• Full table scans
• Savepoints
• Geo-Spatial indexes
• InnoDB storage engine would be the right choice
• Complex SQL & Functions


Considerations
MySQL Cluster scales by node, and this is a statement.

Keys to success on EC2 are:

• Set up the right storage (RAID 0 on 6 devices is good)

• Low/medium traffic can work on 2-CPU data nodes

• Medium/high traffic needs at least 4-CPU data nodes

• During the test with 4 data nodes we scaled up to 6 MySQL servers with 42 threads before seeing some slowdown

• Adding node groups adds flexibility, as expected


Use MySQL Cluster on EC2
Does it make sense?

YES!
• Do not expect the same performance as on a blade server

• Do not install it as you would on a blade server

• Do not put nodes in different regions

• Scale your load/data set as usual

You will not get 1 billion updates per minute here!

But I got 1,751,940 per minute, which is not bad at all.
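For comparison with per-second benchmark figures, the per-minute number above converts as follows. This is plain arithmetic on the quoted value, with no assumption about how many MySQL servers or threads produced it.

```python
updates_per_minute = 1_751_940  # figure quoted above
per_second = updates_per_minute / 60
print(f"{per_second:,.0f} per second")  # prints "29,199 per second"
```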


Cluster limits
• The maximum number of data nodes is 48.

• The maximum number of nodes in a MySQL Cluster is 255.

• DataMemory is allocated as 32KB pages.

• The maximum total number of objects per cluster is 20320.

• The maximum number of attributes per table is 128.

• The maximum permitted row size is 14000 bytes.

• The ndb-cluster-connection-pool limit is 63, and each connection in the pool STILL takes one of the 254 available slots.


Cluster online operations
• Fully online, transaction response times unchanged

• Add and remove indexes, add new columns and tables

• No temporary table creation

• No recreation of data or deletion required

• Faster and better performing table maintenance operations

• Less memory and disk requirements


Cluster online operations
• Scale the cluster (add data nodes)

• Repartition tables

• Recover failed nodes

• Upgrade / patch servers & OS

• Upgrade / patch MySQL Cluster

• Back-Up


Thanks to:
I must thank a few people for their work, which makes working with MySQL Cluster still fun.

• SeveralNines (http://www.severalnines.com) for their tool, their competence, and friendship.

• FromDual (http://www.fromdual.com) for their articles and broad vision.

• Mikael Ronstrom (http://mikaelronstrom.blogspot.ca). Each article on his blog is a MILESTONE.

• Oracle developers around the globe (Stockholm especially)


http://www.pythian.com/news/

http://www.facebook.com/pages/The-Pythian-Group/163902527671

@pythian

http://www.linkedin.com/company/pythian

1-877-PYTHIAN

[email protected]

To contact us…

To follow us…

Thank you and Q&A

@pythianjobs
