
Friday, December 18, 2009

Oracle 11gR2, GoldenGate and Koala Jump Start

It was some time ago now, but the Oracle GoldenGate statement of direction made me sad. Sometimes, I still wish the person who wrote it were somehow as far away as possible from the database server development team. But I know that's not the case. The sentence that really kills me is:
"Given the strategic nature of Oracle GoldenGate, Oracle Streams will continue to be supported, but will not be actively enhanced. Rather, the best elements of Oracle Streams will be evaluated for inclusion with Oracle GoldenGate."

Some Background

Even today, a few weeks after reading the GoldenGate documentation, testing the product and listing GoldenGate's benefits for myself (openness, simplicity, completeness, ...), I still have some trouble swallowing the fact that I will miss some of the unique features of Streams.

Not that I'm intolerant or stuck in the "good old days". I mean, I know there are good and bad things about Streams. Over the past years, I've spent quite some energy(*) on leveraging what is so unique about it, and I believe most of its advantages come from its "built-inside" architecture. To the point where it could be very difficult for Oracle to bring Streams' strengths to GoldenGate, because their architectures are so different.

You may wonder what I am talking about. Well, I'm talking about:
  • Streams' in-memory end-to-end propagation, which provides an extremely scalable framework without the need to log changes in a separate file (trail)
  • The reliability of the staging (the database) and of the propagation (buffered queues)
  • Streams' lightweight footprint, which relies (most of the time) on internal object ids, can easily be offloaded to a separate server without any database copy and doesn't access the original data at all
  • The ability Streams has to treat sets of changes without "deferring" the constraints
  • The flexibility of a framework that runs inside the database, with hooks at every stage that you can completely customize and extend to your needs
  • The openness for developers, who can directly use the APIs to provide advanced features to their applications by leveraging their Oracle knowledge
  • The ubiquity of a solution that is already installed with Oracle 11g, works with RAC without any change, is extended by the Change Data Capture feature and is easy to share with the outside world via AQ or XStream
  • The maturity, ease and simplicity it has gained over years of hard work by the development team, and that you can leverage with a simple SQL script
  • The cost model: it comes for free with Oracle Database Enterprise Edition, and with a limited set of features with 11g Standard Edition
At some point, I was hoping XStream would be the missing link between Streams and GoldenGate, so that GoldenGate could leverage the power of Oracle Database 11g. That would for sure have been my preferred scenario, and the fact that XStream is licensed with GoldenGate made me hope for a while. But reading the statement of direction suggests a very different ending.

Now Oracle GoldenGate is very likely to, if not succeed, do a better job than Streams at bringing real-time data integration to the masses ;-). Nevertheless, I have no clue how Oracle could provide Streams' strengths to it in a timely manner. An interesting challenge for the product managers and the development team... The future will tell!

Koala jump start

As you can guess, the time has come for all of us to learn more about Oracle GoldenGate. Since my laptop runs Karmic Koala, I've given it a try on Ubuntu 9.10. It's not supported, but I haven't faced any issues so far. I've downloaded and installed GoldenGate from Oracle E-Delivery, and I've chosen the following distribution:
  • Select a product Pack: "Oracle Fusion Middleware"
  • Platform: "Linux x86"
  • Description: "Oracle GoldenGate on Oracle Media Pack for Linux x86"
  • Name: "Oracle GoldenGate V10.4.0.x for Oracle 11g on RedHat 5.0"
To perform the installation, I've just unzipped/untarred the file in a directory and set the environment variables so that I can access my Oracle Database:
. oraenv
ORACLE_SID = [WHITE] ? BLACK
The Oracle base for ORACLE_HOME=/u01/app/oracle/product/11.2.0/db_1 is /u01/app/oracle
That done, you should be ready to use Oracle GoldenGate; run the command-line interface as shown below:
./ggsci 

Oracle GoldenGate Command Interpreter for Oracle
Version 10.4.0.19 Build 002
Linux, x86, 32bit (optimized), Oracle 11 on Sep 29 2009 08:50:50

Copyright (C) 1995, 2009, Oracle and/or its affiliates. All rights reserved.


exit

Prepare the database

There are a few settings to check on the database. It must be in archivelog mode, and GoldenGate must be able to access the archived logs and redo logs:
sqlplus / as sysdba

archive log list
Database log mode Archive Mode
Automatic archival Enabled
Archive destination /u01/app/oracle/oradata/BLACK/archivelogs
Oldest online log sequence 1
Next log sequence to archive 3
Current log sequence 3
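If your database is not in archivelog mode yet, you can enable it as sketched below. This is the standard procedure rather than anything GoldenGate-specific, and note that it requires a short outage since the database has to be restarted in mount state:

```sql
-- Enable archivelog mode (requires bouncing the instance)
shutdown immediate
startup mount
alter database archivelog;
alter database open;
```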
I've set NLS_LANG so that it matches my database settings:
sqlplus / as sysdba

select parameter, value
from nls_database_parameters
where parameter in ('NLS_LANGUAGE',
'NLS_TERRITORY',
'NLS_CHARACTERSET',
'NLS_LENGTH_SEMANTICS');

parameter VALUE
-------------------- -------------
NLS_LANGUAGE AMERICAN
NLS_TERRITORY AMERICA
NLS_CHARACTERSET WE8MSWIN1252
NLS_LENGTH_SEMANTICS BYTE


exit

export NLS_LANG=AMERICAN_AMERICA.WE8MSWIN1252
The configuration has to rely on a user with a high level of privileges to perform several operations, like extracting the content of a UDT or a LOB with flashback queries. It should also be able to add supplemental log groups to the tables that are part of the replication:
sqlplus / as sysdba

create user gg
identified by gg
default tablespace users
temporary tablespace temp;

grant create session, resource, dba to gg;
The database must also have the minimal supplemental logging enabled:
alter database add supplemental log data;

alter system switch logfile;
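To double-check that minimal supplemental logging is indeed enabled, you can query v$database; SUPPLEMENTAL_LOG_DATA_MIN should show YES once the ALTER DATABASE above has completed:

```sql
-- Verify that minimal supplemental logging is enabled
select supplemental_log_data_min
  from v$database;
```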
In my case, that was it. Obviously, depending on what you want to do, your database may require more changes. It could require creating some tables to audit DDL or to store checkpoints. To get the complete list of what needs to be done, check GoldenGate's documentation, and more specifically the "Oracle Installation and Setup Guide".

A Demo Schema

I did not try to set up anything advanced. I've just replicated scott.dept into a demo schema. To make the example even simpler, I've assumed nobody was accessing the table and I did not pay any attention to pending transactions or out-of-sync instantiation. I've just created a table named demo.dept and loaded it with scott.dept's data:
create user demo identified by demo
default tablespace users
temporary tablespace temp
quota unlimited on users;

grant connect, resource to demo;

create table demo.dept
( deptno number(2,0),
dname varchar2(14),
loc varchar2(13),
constraint pk_dept primary key(deptno))
tablespace users;

insert into demo.dept
select * from scott.dept;

commit;

Configure GoldenGate Process Manager

Once the database is configured, refer to the "Administration Guide" to continue and get more details about the GoldenGate setup. First, GoldenGate requires some directories to store its configuration, logs and trail files. You can choose an alternate location for them, but that wasn't really my concern here:
./ggsci 

create subdirs

Creating subdirectories under current directory /gg

Parameter files /gg/dirprm: created
Report files /gg/dirrpt: created
Checkpoint files /gg/dirchk: created
Process status files /gg/dirpcs: created
SQL script files /gg/dirsql: created
Database definitions files /gg/dirdef: created
Extract data files /gg/dirdat: created
Temporary files /gg/dirtmp: created
Veridata files /gg/dirver: created
Veridata Lock files /gg/dirver/lock: created
Veridata Out-Of-Sync files /gg/dirver/oos: created
Veridata Out-Of-Sync XML files /gg/dirver/oosxml: created
Veridata Parameter files /gg/dirver/params: created
Veridata Report files /gg/dirver/report: created
Veridata Status files /gg/dirver/status: created
Veridata Trace files /gg/dirver/trace: created
Stdout files /gg/dirout: created
Once done, I've edited the Manager parameter file, named MGR, to set the port parameter, and I've started it:
edit params mgr

view params mgr
port 7809

start manager

status manager
Manager is running (IP port arkzoyd.7809).

Source Table Supplemental Log Group

Like Streams (there is no secret!), GoldenGate needs to be able to identify rows in order to apply the captured changes. It provides some generic commands to enable and check supplemental logging on the tables from ggsci:
dblogin userid gg, password gg
Successfully logged into database.

add trandata scott.dept
Logging of supplemental redo data enabled for table SCOTT.DEPT.

info trandata scott.dept
Logging of supplemental redo log data is enabled for table SCOTT.DEPT
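Under the covers, add trandata simply creates a supplemental log group on the table, so you can also check the result from the data dictionary with a query like the one below:

```sql
-- Supplemental log groups created on SCOTT.DEPT by "add trandata"
select log_group_name, always, generated
  from dba_log_groups
 where owner = 'SCOTT'
   and table_name = 'DEPT';
```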

Parameter Files

I've named the extract scott. It captures changes made to the SCOTT.DEPT table and sends them to the remote trail file which, in my case, is managed by the same manager. I've named the replicat demo. The parameter files for scott and demo look like the ones below:
edit params scott

view params scott

extract scott
userid gg, password gg
rmthost localhost mgrport 7809
rmttrail SC
table SCOTT.DEPT;


edit params demo

view params demo

replicat demo
assumetargetdefs
userid gg, password gg
map SCOTT.DEPT, target DEMO.DEPT;
Note:
With Oracle, you have to use double quotes to manage case-sensitive table names. However, that's not the case with all database engines. As a result, depending on the parameter, GoldenGate may or may not differentiate strings with different cases. To avoid any issue, I use uppercase for parameter values, unless I specifically want a different case.

Extract and Replicat

Once the parameter files are defined, I've added the extract, the replicat and the trail files from ggsci:
add extract scott, tranlog, begin now
EXTRACT added.

add rmttrail SC, extract scott
RMTTRAIL added.

add replicat demo, exttrail SC, nodbcheckpoint, begin now
REPLICAT added.
And I've started them both:
start er *
Sending START request to MANAGER ...
EXTRACT SCOTT starting

Sending START request to MANAGER ...
REPLICAT DEMO starting


info all

Program Status Group Lag Time Since Chkpt
MANAGER RUNNING
EXTRACT RUNNING SCOTT 00:00:00 00:00:02
REPLICAT RUNNING DEMO 00:00:00 00:00:08


exit

Are Changes Replicated?

Once the setup was complete, I've tested the replication with the script below:
sqlplus / as sysdba

update scott.dept
set dname='OPERATIONS2'
where deptno=40;

commit;

select dname from demo.dept
where deptno=40;

DNAME
--------------
OPERATIONS2


update scott.dept
set dname='OPERATIONS'
where deptno=40;

commit;

select dname from demo.dept
where deptno=40;

DNAME
--------------
OPERATIONS


exit
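Rather than only looking at the target table, you can also ask ggsci for processing statistics on both sides of the replication. The sketch below shows the idea; the stats commands print the number of inserts, updates and deletes handled per table since the processes started:

```text
./ggsci

stats extract scott
stats replicat demo
```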

Configuration Cleanup

Obviously, that's just a start. To avoid any issues with my next tests, I've cleaned up my configuration once I was happy with it:
./ggsci

stop er *
Sending STOP request to EXTRACT SCOTT ...
Request processed.

Sending STOP request to REPLICAT DEMO ...
Request processed.


delete er *
Are you sure you want to delete all groups? y
Deleted EXTRACT SCOTT.
Deleted REPLICAT DEMO.


stop manager
Manager process is required by other GGS processes.
Are you sure you want to stop it (y/n)? y

Sending STOP request to MANAGER ...
Request processed.
Manager stopped.


exit
And I've dropped the demo and gg users:
sqlplus / as sysdba

drop user gg cascade;
drop user demo cascade;

exit
(*) I know at least one person who considers me a useless clown and thinks the time I spend on Streams should be treated less mercifully. I admit it easily. I'm aware I'm not close to one percent of his knowledge about what remains an awesome technology. Anyway, good or bad, I've spent some energy digging into Streams internals and I've always tried to listen to positive and negative feedback.


Tuesday, December 8, 2009

Hidden and Undocumented "Cardinality Feedback"

I know this blog is supposed to be about Oracle Streams and AQ only... but I could not resist sharing one of my recent findings about some undocumented and automatic plan re-evaluation behavior I've encountered with my 11.2.0.1 on 32-bit Linux. Hopefully you'll forgive me for this post...

If you've read the Oracle Database 11.1 documentation, you've found a feature called "Intelligent or Adaptive Cursor Sharing". From the documentation, the feature seems related to bind peeking, and you'll find an excellent article about it in the Optimizer Magic blog's "Update on Adaptive Cursor Sharing" post. This story seems both very close and very different.

However, as often with Oracle, it's more subtle than it appears; I've found a case, which you should be able to reproduce, where the query is reparsed even though there isn't any bind variable.

My demo schema

The behavior may be related to some of my laptop's characteristics. My very simple database is running 11.2.0.1 on 32-bit Linux x86 with the following parameters:
sga_target=260M
pga_aggregate_target=180M
filesystemio_options=setall
I create a DEMO user and a set of tables, data and statistics. You can download the script from my website.

The query

Execute the query below; the first time, you should see that it gets a first plan, whose hash value is 1851413986:
-- To change the behavior, set this parameter to none:
-- alter session set "_optimizer_extended_cursor_sharing_rel"=none;
-- Other than that, leave it to simple that is the default value for 11.2.0.1:
-- alter session set "_optimizer_extended_cursor_sharing_rel"=simple;

set timing on
select /* GG */ t.year_id, sum(f.metric1)
from fact f, time t, dim2 d2, dim3 d3, dim4 d4
where f.time_id=t.time_id
and f.dim2_id=d2.dim2_id
and f.dim3_id1=d3.dim3_id1
and f.dim3_id2=d3.dim3_id2
and f.dim4_id=d4.dim4_id
and d2.dim2_lib='Value 5'
and d3.dim3_lib='Value (2,2)'
and d4.dim4_l2='L2.1'
and attr2='ZZ4'
and t.time_id=trunc(t.time_id,'W')
group by t.year_id
order by t.year_id;

YEAR_ID SUM(F.METRIC1)
--------- --------------
01-JAN-09 38490

Elapsed: 00:00:06.10

select *
from table(dbms_xplan.display_cursor(format=>'basic note'));

PLAN_TABLE_OUTPUT
-----------------
EXPLAINED SQL STATEMENT:
------------------------
select /* GG_2 */ t.year_id, sum(f.metric1) from fact f, time t, dim2
d2, dim3 d3, dim4 d4 where f.time_id=t.time_id and
f.dim2_id=d2.dim2_id and f.dim3_id1=d3.dim3_id1 and
f.dim3_id2=d3.dim3_id2 and f.dim4_id=d4.dim4_id and
d2.dim2_lib='Value 5' and d3.dim3_lib='Value (2,2)' and
d4.dim4_l2='L2.1' and attr2='ZZ4' and
t.time_id=trunc(t.time_id,'W') group by t.year_id order by t.year_id

Plan hash value: 1851413986

-----------------------------------------------------------------
| Id | Operation | Name |
-----------------------------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | SORT GROUP BY | |
| 2 | NESTED LOOPS | |
| 3 | NESTED LOOPS | |
| 4 | HASH JOIN | |
| 5 | PART JOIN FILTER CREATE | :BF0000 |
| 6 | NESTED LOOPS | |
| 7 | NESTED LOOPS | |
| 8 | MERGE JOIN CARTESIAN | |
| 9 | PARTITION RANGE ALL | |
| 10 | TABLE ACCESS FULL | TIME |
| 11 | BUFFER SORT | |
| 12 | TABLE ACCESS FULL | DIM3 |
| 13 | PARTITION RANGE ITERATOR | |
| 14 | PARTITION HASH ALL | |
| 15 | BITMAP CONVERSION TO ROWIDS | |
| 16 | BITMAP AND | |
| 17 | BITMAP INDEX SINGLE VALUE | FACT_TIME_IDX |
| 18 | BITMAP INDEX SINGLE VALUE | FACT_DIM3_IDX |
| 19 | TABLE ACCESS BY LOCAL INDEX ROWID| FACT |
| 20 | PARTITION HASH JOIN-FILTER | |
| 21 | TABLE ACCESS FULL | DIM2 |
| 22 | INDEX UNIQUE SCAN | DIM4_PK |
| 23 | TABLE ACCESS BY INDEX ROWID | DIM4 |
-----------------------------------------------------------------
When I executed the exact same query again, I found a second plan with a hash value of 1094455219. Though this plan is better than the first one in my case, I would have expected the cursor to be reused:
set timing on
select /* GG */ t.year_id, sum(f.metric1)
from fact f, time t, dim2 d2, dim3 d3, dim4 d4
where f.time_id=t.time_id
and f.dim2_id=d2.dim2_id
and f.dim3_id1=d3.dim3_id1
and f.dim3_id2=d3.dim3_id2
and f.dim4_id=d4.dim4_id
and d2.dim2_lib='Value 5'
and d3.dim3_lib='Value (2,2)'
and d4.dim4_l2='L2.1'
and attr2='ZZ4'
and t.time_id=trunc(t.time_id,'W')
group by t.year_id
order by t.year_id;

YEAR_ID SUM(F.METRIC1)
--------- --------------
01-JAN-09 38490

Elapsed: 00:00:00.18

select *
from table(dbms_xplan.display_cursor(format=>'basic note'));

PLAN_TABLE_OUTPUT
-----------------
EXPLAINED SQL STATEMENT:
------------------------
select /* GG_2 */ t.year_id, sum(f.metric1) from fact f, time t, dim2
d2, dim3 d3, dim4 d4 where f.time_id=t.time_id and
f.dim2_id=d2.dim2_id and f.dim3_id1=d3.dim3_id1 and
f.dim3_id2=d3.dim3_id2 and f.dim4_id=d4.dim4_id and
d2.dim2_lib='Value 5' and d3.dim3_lib='Value (2,2)' and
d4.dim4_l2='L2.1' and attr2='ZZ4' and
t.time_id=trunc(t.time_id,'W') group by t.year_id order by t.year_id

Plan hash value: 1094455219

--------------------------------------------------------------
| Id | Operation | Name |
--------------------------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | SORT GROUP BY | |
| 2 | HASH JOIN | |
| 3 | NESTED LOOPS | |
| 4 | NESTED LOOPS | |
| 5 | MERGE JOIN CARTESIAN | |
| 6 | MERGE JOIN CARTESIAN | |
| 7 | PARTITION HASH ALL | |
| 8 | TABLE ACCESS FULL | DIM2 |
| 9 | BUFFER SORT | |
| 10 | TABLE ACCESS FULL | DIM3 |
| 11 | BUFFER SORT | |
| 12 | PARTITION RANGE ALL | |
| 13 | TABLE ACCESS FULL | TIME |
| 14 | PARTITION RANGE ITERATOR | |
| 15 | PARTITION HASH ITERATOR | |
| 16 | BITMAP CONVERSION TO ROWIDS | |
| 17 | BITMAP AND | |
| 18 | BITMAP INDEX SINGLE VALUE | FACT_TIME_IDX |
| 19 | BITMAP INDEX SINGLE VALUE | FACT_DIM3_IDX |
| 20 | BITMAP INDEX SINGLE VALUE | FACT_DIM2_IDX |
| 21 | TABLE ACCESS BY LOCAL INDEX ROWID| FACT |
| 22 | TABLE ACCESS FULL | DIM4 |
--------------------------------------------------------------

Note
-----
- cardinality feedback used for this statement
As you can see from the plan note, something has changed in the calculation; if you collect the 10053 trace, you'll see not only that the cursor is re-parsed on the 2nd execution, but also that some opt_estimate hints are used to correct the cardinality estimates. Interesting?

Conclusion

From my tests, the value of _optimizer_extended_cursor_sharing_rel impacts this behavior. However, the data the CBO relies on to decide it would be more clever to perform a plan change doesn't look obvious to me. In addition, the first time the query is parsed, the IS_SHAREABLE column of the cursor is set to 'Y', and if you query V$SQL_SHARED_CURSOR when the 2 cursors are in the shared pool, you'll find no property that explains the plan change. I'm very curious about the how and the why... Any idea?
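To see the two child cursors side by side, you can query v$sql; the query below is just a sketch, filtering on the /* GG */ comment used in the test query, and should return one row per child with its own plan hash value:

```sql
-- The two children of the test query, with their plans and share status
select sql_id, child_number, plan_hash_value,
       executions, is_shareable
  from v$sql
 where sql_text like 'select /* GG */%';
```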
