oracle 12.2 sharding learning more
ORACLE SHARDING LEARNING MORE
Oracle 12.2 Sharded Database Management
ABOUT ME
➤ Kamus@Enmotech 张乐奕
➤ Travel and Starbucks
➤ Travel: Japan
➤ Starbucks: Venti Caramel Macchiato
➤ Games: Blizzard fans, Hearthstone
➤ Channel[K]: http://www.dbform.com
WHAT IS ORACLE SHARDING
Every shard is a part of a logical database
WHAT IS SHARDING EXACTLY
//Query single row
select User_Name from T1 where User_ID=1;
//Query all
create view T as select * from T1 union all select * from T2;
select count(*) from T;
Split Table: Database “DB” → Database “DB”
Table “T” (before)
User_ID  User_Name
1        Kim
2        Tim
3        Jim
4        Sim
Table “T1” (after)
User_ID  User_Name
1        Kim
3        Jim
Table “T2” (after)
User_ID  User_Name
2        Tim
4        Sim
WHAT IS SHARDING EXACTLY
//Query single row
select User_Name from T where User_ID=1;
//Query all
select count(*) from T;
Partition Table: Database “DB” → Database “DB”
Table “T” (before)
User_ID  User_Name
1        Kim
2        Tim
3        Jim
4        Sim
Table “T” (after), Partition “P1”
User_ID  User_Name
1        Kim
3        Jim
Table “T” (after), Partition “P2”
User_ID  User_Name
2        Tim
4        Sim
WHAT IS SHARDING EXACTLY
//Query single row
select User_Name from T where User_ID=1;
//Query all
select count(*) from T;
Shard Table: Database “DB” → Databases “DB1” and “DB2”
Table “T” (before)
User_ID  User_Name
1        Kim
2        Tim
3        Jim
4        Sim
Database “DB1”, Table “T”, Shard “S1”
User_ID  User_Name
1        Kim
3        Jim
Database “DB2”, Table “T”, Shard “S2”
User_ID  User_Name
2        Tim
4        Sim
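As a rough illustration of the three slides above, the sketch below places the four sample rows on two shards and contrasts a single-row lookup, which touches one shard, with a count over all rows, which has to visit every shard. The odd/even placement rule and all function names are assumptions made for this example (they simply reproduce the S1/S2 tables shown); Oracle Sharding applies its own partitioning to the sharding key.

```python
# Toy model of table "T" spread over two shards (not Oracle code).
ROWS = {1: "Kim", 2: "Tim", 3: "Jim", 4: "Sim"}


def shard_for(user_id):
    """Assumed placement rule: odd keys on S1, even keys on S2."""
    return "S1" if user_id % 2 == 1 else "S2"


def build_shards(rows):
    """Distribute the rows of the logical table across the shards."""
    shards = {"S1": {}, "S2": {}}
    for user_id, name in rows.items():
        shards[shard_for(user_id)][user_id] = name
    return shards


def query_single_row(shards, user_id):
    """Single-row query: only the shard that owns the key is visited."""
    return shards[shard_for(user_id)].get(user_id)


def query_count_all(shards):
    """Count over the logical table: every shard contributes a partial count."""
    return sum(len(table) for table in shards.values())


SHARDS = build_shards(ROWS)
print(query_single_row(SHARDS, 1))
print(query_count_all(SHARDS))
```

The application still addresses one logical table "T"; only the placement of the rows changes.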
WHICH SYSTEMS NEED SHARDING?
➤ Greater scalability and fault isolation than is possible with RAC
➤ Large billing systems
➤ Airline ticketing systems
➤ Online financial services
➤ Media companies
➤ Online information services
➤ Social media companies
[Chart: OLTP throughput, axis from 0 to 1,000,000]
APPLICATION DESIGNED FOR SHARDING
➤ Sharding is not application transparent
➤ Application must specify a sharding key for optimal performance
  ➤ e.g. customer_id, account_id, etc.
➤ Primary usage pattern
  ➤ Direct routing to a shard based on the sharding key
  ➤ Single-shard operations for highest performance
➤ Ancillary usage pattern
  ➤ Proxy routing for multi-shard queries (reporting)
  ➤ Must be able to tolerate lower performance than the direct routing used for single-shard operations
HOW LINKEDIN USES ORACLE SHARDING
Copy of Slide from Oracle Corp.
SHARDING ARCHITECTURE: LINKEDIN
Copy of Slide from Oracle Corp.
DEPLOYMENT OF ORACLE SHARDING
Every shard is a part of a logical database
DEPLOYMENT OF A SYSTEM-MANAGED SDB
SIMPLE ENV FOR TESTING
shard director + shard catalog
shard node1, shard node2
SDB DEPLOYMENT OVERVIEW
➤ 1. Oracle Sharding Prerequisites
➤ 2. Installing Oracle Database Software (database)
➤ 3. Installing the Shard Director Software (gsm)
➤ 4. Installing schagent on All Shard Nodes (database)
➤ 5. Creating the Shard Catalog Database (dbca)
➤ 6. Setting Up the Oracle Sharding Management and GDS
➤ 7. Deploying and Managing a System-Managed SDB (gdsctl)
https://oracleblog.org/working-case/deployoracle-sharding-database/
Creating an Oracle sharding database - 小荷 OracleBlog (Heaven to the left, DBA to the right)
REQUIRED MEDIA
➤ database.zip
  ➤ DB software for the shardcat database
  ➤ DB software on every shard node
  ➤ Scheduler Agent on every shard node
➤ gsm.zip
  ➤ GDS framework and GSM service
ORACLE SHARDING PREREQUISITES
➤ 12.2 Enterprise Edition
➤ Non-CDB
➤ Filesystem, no ASM (12.2 Beta)
➤ Every shard node's IP must resolve through every node’s hosts file
➤ A clean machine with no Oracle software preinstalled
WHAT DOES DEPLOY DO?
➤ Creates shards and listeners
  ➤ DBMS_SCHEDULER package (executed on shard catalog) communicates with Scheduler Agents on remote hosts
  ➤ Agents run DBCA and NETCA to create shards and listeners
➤ Creates the Data Guard configuration
  ➤ Primaries are created first, RMAN duplicate is used to create corresponding standbys
  ➤ Redo transport and broker are configured, observers are started on shard director hosts and Fast-Start Failover is enabled
➤ Optionally, deploys GoldenGate bi-directional replication (OGG 12.3)
  ➤ Replication pipelines are configured and replication is started
CENTRALIZED SCHEMA MANAGEMENT
connect to GDS$CATALOG service
alter session enable shard ddl;
create tablespace set …
create tablespace …
create user ...
create sharded table … tablespace set
create duplicated table … tablespace
Shard Director
Shard1 Shard2 Shardn
Shard Catalog
UNDERSTANDING CHUNKS AND TABLESPACE
➤ A chunk is the unit of data movement in a sharded database
➤ Simple form: 1 chunk = 1 tablespace = 1 datafile
➤ The number of chunks is defined when the shard catalog is created
UNDERSTANDING CHUNKS AND TABLESPACE
//Log in to GSM
GDSCTL>config chunks
Chunks
------------------------
Database  From  To
--------  ----  --
sh1       1     6
sh2       7     12
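The output above shows 12 chunks divided evenly between two shards. A minimal sketch of such an even, contiguous assignment follows; the function name is hypothetical, and the assumption that the chunk count divides evenly by the number of shards is made for simplicity. It is not a description of how GDS computes the layout internally.

```python
def assign_chunks(shard_names, total_chunks):
    """Give each shard an equal, contiguous range of chunk numbers (1-based)."""
    per_shard, remainder = divmod(total_chunks, len(shard_names))
    if remainder:
        # Simplifying assumption of this sketch, not a rule stated in the slides.
        raise ValueError("total_chunks must divide evenly across the shards")
    ranges = {}
    first = 1
    for name in shard_names:
        ranges[name] = (first, first + per_shard - 1)
        first += per_shard
    return ranges


for shard, (low, high) in assign_chunks(["sh1", "sh2"], 12).items():
    print(shard, low, high)
```

With two shards and 12 chunks this reproduces the From/To columns reported by `config chunks`.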
UNDERSTANDING CHUNKS AND TABLESPACE
//Log in to shard node database sh1
SQL> select tablespace_name from dba_tablespaces where tablespace_name like '%TSSET%';
TABLESPACE_NAME
--------------------
TSSET1
C001TSSET1
C002TSSET1
C003TSSET1
C004TSSET1
C005TSSET1
C006TSSET1
7 rows selected.
UNDERSTANDING CHUNKS AND TABLESPACE
//Log in to shard node database sh2
SQL> select tablespace_name from dba_tablespaces where tablespace_name like '%TSSET%';
TABLESPACE_NAME
------------------------------
TSSET1
C007TSSET1
C008TSSET1
C009TSSET1
C00ATSSET1
C00BTSSET1
C00CTSSET1
7 rows selected.
UNDERSTANDING CHUNKS AND TABLESPACE
//Log in to catalog database
//Where is the sharded table?
SQL> select table_name from dba_tables where tablespace_name='TSSET1';
no rows selected
//Where is the duplicated table?
SQL> select table_name from dba_tables where tablespace_name='TS1';
TABLE_NAME
--------------------
PRODUCTS
MLOG$_PRODUCTS
UNDERSTANDING CHUNKS AND TABLESPACE AND DATAFILE
//Log in to shard node database sh1
SQL> select partition_name, tablespace_name from dba_tab_partitions where table_name='CUSTOMERS' and tablespace_name like 'C%TSSET%' order by tablespace_name;
PARTITION_NAME       TABLESPACE_NAME
-------------------- --------------------
CUSTOMERS_P1         C001TSSET1
CUSTOMERS_P2         C002TSSET1
CUSTOMERS_P3         C003TSSET1
CUSTOMERS_P4         C004TSSET1
CUSTOMERS_P5         C005TSSET1
CUSTOMERS_P6         C006TSSET1
6 rows selected.
SQL> select table_name from dba_tables where tablespace_name='TS1';
TABLE_NAME
--------------------
PRODUCTS
UNDERSTANDING CHUNKS AND TABLESPACE
//Log in to shard node database sh2
SQL> select partition_name, tablespace_name from dba_tab_partitions where table_name='CUSTOMERS' and tablespace_name like 'C%TSSET%' order by tablespace_name;
PARTITION_NAME       TABLESPACE_NAME
-------------------- --------------------
CUSTOMERS_P7         C007TSSET1
CUSTOMERS_P8         C008TSSET1
CUSTOMERS_P9         C009TSSET1
CUSTOMERS_P10        C00ATSSET1
CUSTOMERS_P11        C00BTSSET1
CUSTOMERS_P12        C00CTSSET1
6 rows selected.
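The listings above suggest a naming pattern: each chunk tablespace is "C", then the chunk number as three upper-case hexadecimal digits, then the tablespace set name, and partition CUSTOMERS_Pn lives in the tablespace of chunk n. The sketch below reproduces that pattern. It is inferred only from the 12 chunks shown here, so treat the format (especially for chunk numbers beyond those listed) and the function names as assumptions.

```python
def chunk_tablespace(chunk_no, tablespace_set="TSSET1"):
    """Tablespace name for a chunk, following the pattern seen in the output."""
    return f"C{chunk_no:03X}{tablespace_set}"


def shard_layout(first_chunk, last_chunk, table="CUSTOMERS", tablespace_set="TSSET1"):
    """List (partition_name, tablespace_name) pairs for a shard's chunk range."""
    return [
        (f"{table}_P{n}", chunk_tablespace(n, tablespace_set))
        for n in range(first_chunk, last_chunk + 1)
    ]


# Chunks 7..12 are the ones that "config chunks" reported for sh2.
for partition, tablespace in shard_layout(7, 12):
    print(partition, tablespace)
```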
SQL> select table_name from dba_tables where tablespace_name='TS1';
TABLE_NAME
--------------------
PRODUCTS
AND … DATAFILES AND TABLESPACE
//Log in to shard node database sh1
SQL> select TABLESPACE_NAME,FILE_NAME from dba_data_files where TABLESPACE_NAME like 'C%TSSET%' order by tablespace_name;
TABLESPACE_NAME      FILE_NAME
-------------------- ----------------------------------------------------------------------
C001TSSET1           /u01/app/oracle/oradata/SH1/datafile/o1_mf_c001tsse_d1rfod3l_.dbf
C002TSSET1           /u01/app/oracle/oradata/SH1/datafile/o1_mf_c002tsse_d1rfofj6_.dbf
C003TSSET1           /u01/app/oracle/oradata/SH1/datafile/o1_mf_c003tsse_d1rfogs5_.dbf
C004TSSET1           /u01/app/oracle/oradata/SH1/datafile/o1_mf_c004tsse_d1rfoht8_.dbf
C005TSSET1           /u01/app/oracle/oradata/SH1/datafile/o1_mf_c005tsse_d1rfojs6_.dbf
C006TSSET1           /u01/app/oracle/oradata/SH1/datafile/o1_mf_c006tsse_d1rfokv6_.dbf
6 rows selected.
RESHARDING
ROUTING OF ORACLE SHARDING
Every shard is a part of a logical database
ROUTING IN AN ORACLE SHARDED ENVIRONMENT
➤ Direct Routing
  ➤ For OLTP workloads that specify sharding_key (e.g. customer_id) during connect
  ➤ Connect string must contain: (SHARD_KEY=...)
  ➤ JDBC: connection.setShardKey(<shard_key>,<shard_group_key>);
  ➤ Support for OCI/OCCI (C++)/ODP.NET
  ➤ Support for PHP, Python, Perl, and Node.js
➤ Proxy Routing
  ➤ Multi-shard queries, e.g. reporting workloads
  ➤ Workloads that cannot specify sharding_key as part of connection
DIRECT ROUTING VIA SHARDING KEY
➤ The connection pool maintains a shard topology cache: a mapping of key ranges to shards
➤ DB requests for a key in a cached range go directly to the shard, bypassing the shard director
➤ Otherwise, a new connection is created by forwarding the request with the sharding key to the shard director
Shard Key Ranges  Chunk Name  Shards
1 -- 10           Chunk1      Shard1, Shard2
10 -- 20          Chunk2      Shard1, Shard2
20 -- 30          Chunk3      Shard3, Shard4
30 -- 40          Chunk4      Shard3, Shard4
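The cache lookup described on this slide can be sketched as a sorted list of key ranges searched with binary search: a hit returns the shards that hold the chunk, a miss falls back to the shard director. The class and function names are hypothetical, the ranges are treated as half-open (low inclusive, high exclusive) because the slide's boundaries overlap, and the director is stood in for by a plain callback. This illustrates the idea only and is not the connection pool's actual implementation.

```python
import bisect


class ShardTopologyCache:
    """Maps sharding-key ranges to (chunk name, shards), as in the slide's table."""

    def __init__(self):
        self._lows = []      # sorted lower bounds, used for binary search
        self._entries = []   # (low, high, chunk, shards), in the same order

    def add_range(self, low, high, chunk, shards):
        idx = bisect.bisect_left(self._lows, low)
        self._lows.insert(idx, low)
        self._entries.insert(idx, (low, high, chunk, tuple(shards)))

    def lookup(self, key):
        """Return (chunk, shards) if the key falls in a cached range, else None."""
        idx = bisect.bisect_right(self._lows, key) - 1
        if idx < 0:
            return None
        low, high, chunk, shards = self._entries[idx]
        return (chunk, shards) if low <= key < high else None


def route(cache, key, ask_director):
    """Go directly to a shard on a cache hit; otherwise ask the shard director."""
    hit = cache.lookup(key)
    if hit is not None:
        return ("direct", hit[1])
    return ("via_director", tuple(ask_director(key)))


cache = ShardTopologyCache()
cache.add_range(1, 10, "Chunk1", ["Shard1", "Shard2"])
cache.add_range(10, 20, "Chunk2", ["Shard1", "Shard2"])
cache.add_range(20, 30, "Chunk3", ["Shard3", "Shard4"])
cache.add_range(30, 40, "Chunk4", ["Shard3", "Shard4"])

print(route(cache, 25, ask_director=lambda key: ["Shard5"]))
print(route(cache, 45, ask_director=lambda key: ["Shard5"]))
```

The first call is served from the cache; the second key is outside every cached range, so the request goes through the (simulated) director.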
PROXY ROUTING VIA COORDINATOR (SHARD CATALOG)
➤ Multi-shard Queries & Non-shard Key Access
➤ Connection is made to the coordinator
➤ Coordinator parses the SQL and will proxy/route the request to the correct shard
➤ SQL statements are rewritten so that as much of the query processing as possible is done on the participating shards and as little as possible on the coordinator
➤ Intended for developer convenience, not for high performance
Coordinator (shard catalog)
Application Server
Shard Directors
App Tier
Routing Tier
Data Tier
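The proxy-routing pattern above is essentially scatter-gather: the coordinator sends a rewritten piece of the query to each participating shard and only merges the partial results. The sketch below models that for a count query. The in-memory shard data and the function names are assumptions for illustration; the real coordinator rewrites SQL and executes it remotely.

```python
# Rows held by each shard (toy data taken from the earlier sample table).
SHARD_DATA = {
    "S1": [(1, "Kim"), (3, "Jim")],
    "S2": [(2, "Tim"), (4, "Sim")],
}


def shard_count(rows, predicate):
    """Work pushed down to a shard: filter and count locally."""
    return sum(1 for row in rows if predicate(row))


def coordinator_count(shard_data, predicate=lambda row: True):
    """Coordinator: scatter the count to every shard, then add up the partials."""
    partial_counts = {
        name: shard_count(rows, predicate) for name, rows in shard_data.items()
    }
    return sum(partial_counts.values()), partial_counts


total, per_shard = coordinator_count(SHARD_DATA)
print(total, per_shard)
```

Only one small number per shard travels back to the coordinator, which is the point of pushing the processing down; every shard still has to be visited, which is why this path is slower than direct routing.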
EXECUTION PLAN
Execution Plan
----------------------------------------------------------
Plan hash value: 2953441084
--------------------------------------------------------------
| Id  | Operation        | Name | Cost (%CPU)| Inst   |IN-OUT|
--------------------------------------------------------------
|   0 | SELECT STATEMENT |      |     0   (0)|        |      |
|   1 |  SHARD ITERATOR  |      |            |        |      |
|   2 |   REMOTE         |      |            | ORA_S~ | R->S |
--------------------------------------------------------------
Remote SQL Information (identified by operation id):
----------------------------------------------------
2 - EXPLAIN PLAN SET STATEMENT_ID='PLUS630005' INTO PLAN_TABLE@! FOR
SELECT "A1"."CUSTID" FROM "CUSTOMERS" "A1" /*
coord_sql_id=0zpg825w625yn */ (accessing
'ORA_SHARD_POOL@ORA_MULTI_TARGET' )