Deleting LOTS of Data

Author: Randy Cunningham, OCP

Post on 18-Jan-2016

TRANSCRIPT

Page 1: Deleting LOTS of Data

Deleting LOTS of Data

Author: Randy Cunningham, OCP

Page 2: Deleting LOTS of Data

Why Use DELETE?

Retention policies; good data hygiene
Data de-duplication
Corporate mergers & acquisitions
Purge and archival processing
Cleaning out temporary & work tables
Reclaim storage space
Obtain better performance?…

Page 3: Deleting LOTS of Data

What Are The Issues?

DELETE never finishes
DELETE takes too long
Locking
Impact on application performance
Desired space not released by DELETE
UNDO space & errors

Page 4: Deleting LOTS of Data

What Happens During a DELETE

[Diagram: initial build of the DELETE illustration; only an X mark survives in the transcript]

Page 5: Deleting LOTS of Data

What Happens During a DELETE

[Diagram: numbered steps 1 through 6 of a DELETE, with the labels Redo, Undo, MV Log, and Insert…; X marks indicate deleted entries]

Page 6: Deleting LOTS of Data

Alternatives to DELETE

TRUNCATE TABLE
CTAS, truncate and reinsert data to keep
Use partitioning:
  Truncate partition
  Drop partition
  Exchange partition
  DELETE confined to one, or a few, partitions

Page 7: Deleting LOTS of Data

TRUNCATE TABLE

Upside: typically many times faster than using an unqualified DELETE operation
Downsides:
  Does not operate in transactional context (it’s DDL)
  You can’t roll back a TRUNCATE
  Cannot truncate the parent table of an enabled referential integrity (foreign key) constraint
  It is DDL; truncating a table in another schema requires the DROP ANY TABLE system privilege

Page 8: Deleting LOTS of Data

Delegating TRUNCATE rights

CREATE PROCEDURE trunc_do (fqtn VARCHAR2) IS
BEGIN
  EXECUTE IMMEDIATE 'TRUNCATE TABLE ' || fqtn;
END trunc_do;
/

GRANT EXECUTE ON trunc_do TO PUBLIC;

Page 10: Deleting LOTS of Data

Delegating TRUNCATE rights

CREATE PROCEDURE trunc_do (fqtn VARCHAR2) IS
BEGIN
  EXECUTE IMMEDIATE 'TRUNCATE TABLE ' || fqtn;
END trunc_do;
/

CREATE PROCEDURE trunc (
  owner      IN VARCHAR2 DEFAULT USER,
  table_name IN VARCHAR2
) AUTHID CURRENT_USER
IS
  fqtn VARCHAR2(80) := owner || '.' || table_name;
BEGIN
  -- Ensure DELETE rights: this no-op DELETE runs with the caller's privileges
  EXECUTE IMMEDIATE 'DELETE FROM ' || fqtn || ' WHERE 1=2';
  trunc_do (fqtn);
END trunc;
/

GRANT EXECUTE ON trunc TO PUBLIC;
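
The procedures above concatenate a caller-supplied name directly into dynamic SQL. As a hedged sketch that is not part of the original slides, trunc_do could check the name with Oracle's DBMS_ASSERT package before executing it. CREATE OR REPLACE is used on the assumption that the earlier version already exists.

```sql
-- Sketch only: a hardened variant of trunc_do (not from the original slides).
CREATE OR REPLACE PROCEDURE trunc_do (fqtn VARCHAR2) IS
BEGIN
  -- DBMS_ASSERT.QUALIFIED_SQL_NAME returns its argument when it is a
  -- syntactically valid (optionally schema-qualified) SQL name and raises
  -- an exception otherwise, so stray SQL text never reaches TRUNCATE.
  EXECUTE IMMEDIATE 'TRUNCATE TABLE ' || DBMS_ASSERT.QUALIFIED_SQL_NAME(fqtn);
END trunc_do;
/

-- Example call from SQL*Plus; WORK_TABLE is a hypothetical table name.
EXEC trunc(table_name => 'WORK_TABLE')
```

The check validates syntax only; whether the caller may touch the table is still decided by the no-op DELETE in trunc.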

Page 11: Deleting LOTS of Data

Create Table as Select…

Use where most of the data are deleted
Good for sidestepping UNDO issues
Provides quick & simple fault resilience
Downside: requires intermediate storage

Page 12: Deleting LOTS of Data

Create Table as Select: Example

CREATE TABLE student_temp AS
  SELECT *
    FROM student
   WHERE status_flag IN ('A','N','T')
      OR enroll_date >= TO_DATE('02/15/2006', 'MM/DD/YYYY');

TRUNCATE TABLE student;

INSERT /*+ APPEND */ INTO student
  SELECT * FROM student_temp;

DROP TABLE student_temp;

Page 13: Deleting LOTS of Data

Create Table as Select: Observations

Run steps manually, or provide thorough error handling
All limitations of TRUNCATE TABLE apply
Ensure there is enough space for the CTAS temporary holding table
Be sure to use APPEND hint in the INSERT
Does not work for tables with LONG data type (use SQL*Plus COPY command instead)

Page 14: Deleting LOTS of Data

Table Partitioning

DROP PARTITION – ideal for temporal data
TRUNCATE PARTITION – for cyclic data
EXCHANGE PARTITION – for archival
DELETE … [PARTITION] – a full scan of the partition is quicker than a full table scan

Page 15: Deleting LOTS of Data

Dropping a Table Partition

Quickly eliminates an entire range or list
Example:

ALTER TABLE enrollment
  DROP PARTITION Year2001
  UPDATE GLOBAL INDEXES;

[Diagram: the remaining partitions 2002, 2003, 2004, 2005, 2006]

Page 16: Deleting LOTS of Data

Truncating a Table Partition

Quick, easy way to manage cyclic partitions
Example:

ALTER TABLE enrollment
  TRUNCATE PARTITION mar_data
  UPDATE GLOBAL INDEXES;

[Diagram: twelve monthly partitions, Jan through Dec, with an X marking the truncated partition]

Page 17: Deleting LOTS of Data

Exchanging a Table Partition

Quick way to excise data from the table, but retain it for archival or backup:

CREATE TABLE enrolled03_2005 AS
  SELECT * FROM enrollment WHERE 1=2;

ALTER TABLE enrollment
  EXCHANGE PARTITION mar_data WITH TABLE enrolled03_2005
  UPDATE GLOBAL INDEXES;

Page 18: Deleting LOTS of Data

DELETE on a Table Partition

If the predicate filters the partition key, then only specific partitions are scanned.
In this case, it is not necessary to specify PARTITION in the DELETE command.
Look for these in the EXPLAIN PLAN:
  PARTITION RANGE SINGLE
  PARTITION RANGE ITERATOR
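
One way to look for those operations is to explain the DELETE and print the plan with DBMS_XPLAN. This is a minimal sketch; it assumes the enrollment table is range-partitioned on a date column, and the column name enroll_date is hypothetical here.

```sql
-- Sketch only: enroll_date is an assumed partition key column.
EXPLAIN PLAN FOR
  DELETE FROM enrollment
   WHERE enroll_date >= DATE '2001-01-01'
     AND enroll_date <  DATE '2002-01-01';

-- Show the plan just explained. A PARTITION RANGE SINGLE or
-- PARTITION RANGE ITERATOR line means partition pruning took place.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```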

Page 19: Deleting LOTS of Data

If DELETE You Must…

Check the EXPLAIN PLAN for the DELETE!
DELETE in batches
Get indexes out of the way
Use ROWID to minimize interference
Use ROWID to implement parallelism

Page 20: Deleting LOTS of Data

Check EXPLAIN PLAN for DELETE

If most rows are being deleted, you will want to see a full table scan.
If most blocks are being visited, you will want to see a full table scan.
For a DELETE driven by a subquery, optimize the subquery so that it performs well as a stand-alone SELECT.

Page 21: Deleting LOTS of Data

DELETE in batches

Delete a million (or so) rows at a time:

DELETE FROM evt
 WHERE status = 'X'
   AND ROWNUM <= 1000000;

Delete an unrelated partition at a time:

DELETE FROM evt PARTITION (div162)
 WHERE status = 'X';
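
A single ROWNUM-limited DELETE removes only the first batch of matching rows, so it has to be repeated until nothing is left. A minimal PL/SQL sketch of that loop follows, assuming the evt table and status predicate from the slide, with a commit after each batch so that the batch's UNDO can be reused. The loop itself is an illustration, not part of the original slides.

```sql
-- Sketch only: repeat the batched DELETE until no matching rows remain.
BEGIN
  LOOP
    DELETE FROM evt
     WHERE status = 'X'
       AND ROWNUM <= 1000000;

    -- SQL%ROWCOUNT is the number of rows removed by the DELETE just run.
    EXIT WHEN SQL%ROWCOUNT = 0;

    COMMIT;  -- end the transaction for this batch
  END LOOP;
  COMMIT;
END;
/
```

Because each batch commits on its own, an interrupted run can simply be started again; the trade-off is that the whole purge can no longer be rolled back as one unit.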

Page 22: Deleting LOTS of Data

Get Indexes Out of the Way

Old way: Drop the indexes and build them from scratch following deletion
New way: Alter the indexes unusable and rebuild them following deletion:

ALTER INDEX u_name_ix UNUSABLE;
DELETE FROM names WHERE …
ALTER INDEX u_name_ix REBUILD;

Page 23: Deleting LOTS of Data

Before You Tamper With Indexes…

Be certain an index isn’t needed to facilitate the access path for your DELETE
Ensure that a significant proportion of the rows in the table are affected
Test to be sure that it is worthwhile
Be sure that other database operations are not relying on the index (best done during a scheduled maintenance window)

Page 24: Deleting LOTS of Data

ROWID is your friend

Isolates query operation from DELETE
Minimizes number of block changes
Is blazingly fast and efficient
Overcomes key preserved table restrictions
Can facilitate DELETE workload mgmt:
  Restartability
  Home-brew parallelization
  Batching

Page 25: Deleting LOTS of Data

ROWID Case Study 1

We have identified extraneous rows:

SELECT * FROM (
  SELECT domain_owner, latest_uce,
         MAX(latest_uce) OVER (PARTITION BY domain_owner) very_latest
    FROM spammers)
 WHERE latest_uce < very_latest

Page 26: Deleting LOTS of Data

ROWID Case Study 1

DELETE FROM (
  SELECT * FROM (
    SELECT domain_owner, latest_uce,
           MAX(latest_uce) OVER (PARTITION BY domain_owner) very_latest
      FROM spammers)
   WHERE latest_uce < very_latest)

Page 27: Deleting LOTS of Data

ROWID Case Study 1

Result:

ORA-01752: cannot delete from view without exactly one key-preserved table

Page 28: Deleting LOTS of Data

ROWID Case Study 1

DELETE FROM spammers
 WHERE ROWID IN (
   SELECT rid FROM (
     -- Project ROWID under an alias so the outer query can select it
     SELECT ROWID AS rid, domain_owner, latest_uce,
            MAX(latest_uce) OVER (PARTITION BY domain_owner) very_latest
       FROM spammers)
    WHERE latest_uce < very_latest)

Page 29: Deleting LOTS of Data

ROWID Case Study 2 - Deduplication

DELETE FROM Customers
 WHERE ROWID IN (
   SELECT DISTINCT rid FROM (
     -- Best is the lowest ROWID within each group of matching names
     SELECT ROWID AS rid,
            MIN(ROWID) OVER (PARTITION BY Cust_Last_Name, Cust_First_name) Best
       FROM Customers)
    WHERE rid <> Best);
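
Before committing a de-duplication like this, it can be worth confirming that no duplicate name pairs remain. The query below is a small sketch that is not part of the original slides; it relies only on the Customers columns already named above and should return no rows once the DELETE has run.

```sql
-- Sketch only: list any name pairs that still occur more than once.
SELECT Cust_Last_Name, Cust_First_name, COUNT(*) AS copies
  FROM Customers
 GROUP BY Cust_Last_Name, Cust_First_name
HAVING COUNT(*) > 1;
```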

Page 30: Deleting LOTS of Data

ROWID Case Study 3 - Staging

CREATE GLOBAL TEMPORARY TABLE Rowid_Tbl (Xrowid ROWID)
  ON COMMIT PRESERVE ROWS;

INSERT INTO Rowid_Tbl
  SELECT DISTINCT ROWID FROM (SELECT ROWID, … FROM Huge_Table …);

DELETE FROM Huge_Table
 WHERE ROWID IN (SELECT Xrowid FROM Rowid_Tbl);

Page 31: Deleting LOTS of Data

ROWID Case Study 4 – Workload

-- Number of buckets desired
DEFINE N=8

CREATE TABLE Delete_Driver (
  rid    ROWID,
  pctile NUMBER,
  PRIMARY KEY (rid));

INSERT /*+ APPEND */ INTO Delete_Driver
  SELECT ROWID rid,
         NTILE(&N) OVER (ORDER BY ROWID) pctile
    FROM Warranty
   WHERE Warranty_Expiration < TO_DATE('01/01/2002','MM/DD/YYYY');

-- A direct-path (APPEND) insert must be committed before the same session can read the table
COMMIT;

Page 32: Deleting LOTS of Data

ROWID Case Study 4 - Workload

Each batch, whether run in serial or parallel, operates this way:

-- Set between 1 and 8
DEFINE Job=4

DELETE FROM
  (SELECT NULL
     FROM Warranty, Delete_Driver
    WHERE Warranty.ROWID = Delete_Driver.rid
      AND pctile = &Job);

COMMIT;

Page 33: Deleting LOTS of Data

ROWID Caveats

Not for use in portable SQL… it is specific to Oracle
Don’t use stale ROWIDs… don’t keep them around permanently
Not for use where rows migrate constantly

Page 34: Deleting LOTS of Data

Questions?
