We Do Streams: Data Comparison with DBMS

The new 11g DBMS_COMPARISON package helps to detect how data of 2 tables differ. The interface is easy to use and quite powerful. Not only, it allows to compare and converge the whole content of 2 tables but you can also work with data samples and with views. This post shows how to use this feature and include flashback query. This can be very useful to use in combination with always changing data, you can only block for a few seconds, of a production system.

Limits and Requirements of DBMS_COMPARISON

Before we start, refer to the limits of DBMS_COMPARISON:

DBMS_COMPARISON and FLASHBACK Query

We'll compare the content of a table with a view that contains a flashback clause. In that case, the view is on the table itself; here is the script to create our example:

create user demo
identified by demo
default tablespace users
temporary tablespace temp;

grant connect, resource to demo;

create table DEMO.T1 (
id number,
field1 varchar2(1000),
field2 varchar2(1000),
constraint T1_PK primary key(id));

insert into demo.T1
(select rownum, to_char(rownum),to_char(rownum)
  from dual
connect by level <=10000);  

commit;  

col scn new_value scn

select dbms_flashback.get_system_change_number scn
  from dual;

insert into demo.T1
  (select 10000+rownum, to_char(rownum),to_char(rownum)
    from dual  connect by level <=10000);

commit;

create view demo.t1v
  as select * from t1 as of scn &&scn;

Once the schema created, we can use create_comparison and compare to find the differences between the table and the view:

begin
dbms_comparison.create_comparison(
 COMPARISON_NAME    => 'mycomparison',
 SCHEMA_NAME        => 'demo',  
 OBJECT_NAME        => 't1v',
 DBLINK_NAME        => null,  
 REMOTE_SCHEMA_NAME => 'demo', 
 REMOTE_OBJECT_NAME => 'T1',
 COMPARISON_MODE    =>
    DBMS_COMPARISON.CMP_COMPARE_MODE_OBJECT,
 COLUMN_LIST        => '*',
 SCAN_MODE          =>
    DBMS_COMPARISON.CMP_SCAN_MODE_FULL);
end;
/

var scanid number;

SET SERVEROUTPUT ON
DECLARE
consistent   BOOLEAN;
scan_info    DBMS_COMPARISON.COMPARISON_TYPE;
BEGIN
consistent := DBMS_COMPARISON.COMPARE(
           comparison_name => 'mycomparison',
           scan_info       => scan_info,
           perform_row_dif => TRUE);
:scanid:=scan_info.scan_id;
DBMS_OUTPUT.PUT_LINE('Scan ID: '||scan_info.scan_id);
IF consistent=TRUE THEN
  DBMS_OUTPUT.PUT_LINE('No differences were found.');
ELSE
  DBMS_OUTPUT.PUT_LINE('Differences were found.');
END IF;
END;
/

COLUMN OWNER HEADING 'Comparison Owner' FORMAT A16
COLUMN COMPARISON_NAME HEADING 'Comparison Name' FORMAT A20
COLUMN SCHEMA_NAME HEADING 'Schema Name' FORMAT A11
COLUMN OBJECT_NAME HEADING 'Object Name' FORMAT A11
COLUMN CURRENT_DIF_COUNT HEADING 'Differences' FORMAT 9999999

SELECT c.OWNER,
c.COMPARISON_NAME,
c.SCHEMA_NAME,
c.OBJECT_NAME,
s.CURRENT_DIF_COUNT
FROM DBA_COMPARISON c, DBA_COMPARISON_SCAN_SUMMARY s
WHERE c.COMPARISON_NAME = s.COMPARISON_NAME AND
 c.OWNER           = s.OWNER AND
 s.SCAN_ID         = :scanid;

Comparison Owner Comparison Name      Schema Name Object Name Differences
---------------- -------------------- ----------- ----------- -----------
SYS   MYCOMPARISON       DEMO   T1V      10000

We can even make the table converge with the view:

SET SERVEROUTPUT ON
DECLARE
scan_info    DBMS_COMPARISON.COMPARISON_TYPE;
BEGIN
DBMS_COMPARISON.CONVERGE(
  comparison_name  => 'mycomparison',
  scan_id          => :scanid,
  scan_info        => scan_info,
  converge_options => DBMS_COMPARISON.CMP_CONVERGE_LOCAL_WINS);
DBMS_OUTPUT.PUT_LINE('Local Rows Merged: '||scan_info.loc_rows_merged);
DBMS_OUTPUT.PUT_LINE('Remote Rows Merged: '||scan_info.rmt_rows_merged);
DBMS_OUTPUT.PUT_LINE('Local Rows Deleted: '||scan_info.loc_rows_deleted);
DBMS_OUTPUT.PUT_LINE('Remote Rows Deleted: '||scan_info.rmt_rows_deleted);
END;
/

Local Rows Merged: 0
Remote Rows Merged: 0
Local Rows Deleted: 0
Remote Rows Deleted: 10000

select count(*) from demo.t1;

COUNT(*)
----------
   10000

select max(id) from demo.t1;

 MAX(ID)
----------
   10000

You'll be able to do much more from this example; to learn more, check:

You can drop the comparison and the demo schema for your next test:

exec dbms_comparison.drop_comparison('mycomparison');
drop user demo cascade;

How does DBMS_COMPARISON work?

The chapter of the Streams Replication Administrator's Guide explains how the comparison is performed and that ora_hash is used. To get more details, I've traced the compare procedure and the main query look like that:

SELECT ll.l_rowid, rr.r_rowid, NVL(ll."ID", rr."ID") idx_val
 FROM
(SELECT    l.rowid l_rowid, l."ID",
           ora_hash(
               NVL(to_char(l."ID"),'ORA$STREAMS$NV'),
               4294967295,
               ora_hash(
                   NVL((l."FIELD1"), 'ORA$STREAMS$NV'),
                   4294967295,
                   ora_hash(
                       NVL((l."FIELD2"), 'ORA$STREAMS$NV'),
                       4294967295,
                       0
                   )
               )
           ) l_hash 
   FROM "DEMO"."T1" l
  WHERE l."ID">=:scan_min1  AND l."ID"<=:scan_max1   ) ll    
FULL OUTER JOIN
(SELECT /*+ NO_MERGE REMOTE_MAPPED */ r.rowid r_rowid, r."ID",
            ora_hash(
               NVL(to_char(r."ID"), 'ORA$STREAMS$NV'),
               4294967295,
               ora_hash(
                  NVL((r."FIELD1"), 'ORA$STREAMS$NV'),
                  4294967295,
                  ora_hash(
                     NVL((r."FIELD2"), 'ORA$STREAMS$NV'),
                     4294967295,
                     0
                  )
               )
            ) r_hash
   FROM "DEMO"."T2" r
  WHERE r."ID">=:scan_min1  
    AND r."ID"<=:scan_max1     ) rr 
ON ll."ID"=rr."ID"
WHERE ll.l_hash IS NULL
   OR rr.r_hash IS NULL
   OR ll.l_hash <> rr.r_hash;

It gives a good idea of what we can expect from the tool, no?

We Do Streams

Thursday, August 27, 2009

Data Comparison with DBMS_COMPARISON

Limits and Requirements of DBMS_COMPARISON

DBMS_COMPARISON and FLASHBACK Query

How does DBMS_COMPARISON work?

No comments:

Post a Comment

Subscribe via Email

Blogger Followers

Search This Blog

Twitter

Twitter Updates

Blog Archive

Labels

We Do Streams

Thursday, August 27, 2009

Data Comparison with DBMS_COMPARISON

Limits and Requirements of DBMS_COMPARISON

DBMS_COMPARISON and FLASHBACK Query

How does DBMS_COMPARISON work?

No comments:

Post a Comment

Subscribe via Reader

Subscribe via Email

Blogger Followers

Search This Blog

Twitter

Twitter Updates

Blog Archive

Labels