Simple Reporting Without Materialized Views

A common initial resolution to reporting requirements without a reporting database is to utilize materialized views.  Unfortunately, the cost to refresh these views can be extremely expensive on a database, especially if it’s an OLTP that rarely has a lull in activity.

A lesser known feature of Oracle’s is the Change Data Capture, (a.k.a CDC
http://download.oracle.com/docs/cd/B10500_01/server.920/a96520/cdc.htm )
The CDC process utilizes a materizalized view log on a source table with a destination copy of the source table that can be kept insync with the original in any interval chosen.  The result is often a much lighter footprint than an mview refresh, no matter if you’ve employed complete or fast refreshes. 

I’ve had added success by pulling sections of the CDC underlying code to syncronize destination objects and replicating just the pieces I need to get the results desired.  CDC has a lot of features and can be over-whelming the first time a DBA or developer delves into it.  For anyone who would like to reproduce the simple source-to-reporting object syncronization with an mview log, here are the high level steps:

1.  Create a materialized view log on the source table using a primary key or rowid.
2.  Create a stage table that will hold two columns of information:
            1st Column:  The type of row, Insert, Update or Delete, (‘I’,’U’,’D’).
            2nd Column: The primary key or rowid.
3.  Create two stored procs, either in a package or separately, where the 1st one will populate the stage table and the second will take the info from the stage table and process the rows to the destination table.

In depth, the process and code will appear similar to the following example:

Create the materizlied view log on the source table, (SRC_TBL):
CREATE MATERIALIZED VIEW LOG ON SCOTT.SRC_TBL;

Create the staging table:
CREATE TABLE STG_PK_IDS
(OPT CHAR(1),
PK_ID NUMBER not null)
tablespace USERS;

Create the stored procedure to load the stage table, (STG_TBL):
CREATE OR REPLACE PROCEDURE UPD_PKID_FOR_RPT IS
/*********************************************************************
–Get Records for both insert and update
CURSOR cur IS
SELECT *
FROM (SELECT ‘I’ opt, pk_id
FROM scott.mlog$_src_tbl where DMLTYPE$$=’I’)  –Query the mview log on the source
UNION ALL
(SELECT ‘U’ opt, pk_id
FROM scott.mlog$_src_tbl where DMLTYPE$$=’U’)
UNION ALL
(SELECT ‘D’ opt, pk_id
FROM scott.mlog$_src_tbl where DMLTYPE$$=’D’);


BEGIN


— Insert records for transactions into staging table, all of them
FOR rec IN cur LOOP

IF rec.opt in (‘I’,’U’,’D’) THEN
INSERT INTO scott.STG_TBL (opt, pk_id) VALUES (rec.opt, rec.pk_id);
END IF;
END LOOP;
COMMIT;


–Purge processed rows from the mview log, using the reporting table to tell us which rows were added to it.
DELETE FROM scott.mlog$_src_tbl
WHERE OPT in (‘I’,’U’)
AND PK_ID in (select PK_ID from scott.RPT_TBL);


COMMIT;


END UPD_PKID_FOR_RPT;

Create a second stored prodedure to populate the data to the reporting table, (RPT_TBL)
CREATE OR REPLACE PROCEDURE UPD_RPTTBL_SP AS

/*********************************************************************

CURSOR cur IS
SELECT *
FROM (SELECT stg.opt,
stg.pk_id,
src.user_id,
src.phn_nbr,
src.login_id,
src.usr_type_cd,
src.usr_dt,
src.email_add
FROM scott.SRC_TBL src, scott.STG_TBL stg
WHERE src.pk_id = stg.pk_id
AND stg.opt=’I’

UNION ALL
SELECT stg.opt,
stg.pk_id,
src.user_id,
src.phn_nbr,
src.login_id,
src.usr_type_cd,
src.usr_dt,
src.email_add
FROM scott.SRC_TBL src, scott.STG_TBL stg
WHERE src.pk_id = stg.pk_id
AND stg.opt=’U’

UNION ALL
SELECT stg.opt,
stg.pk_id,
src.user_id,
src.phn_nbr,
src.login_id,
src.usr_type_cd,
src.usr_dt,
src.email_add
FROM scott.SRC_TBL src, scott.STG_TBL stg
WHERE src.pk_id = stg.pk_id
AND stg.opt=’D’
);

v_bgn_cnt number :=0;
v_step_name varchar2(200);
v_rec_ins number := 0;
v_start_dt date := sysdate;
v_error_msg varchar2(500);


BEGIN
v_step_name := ‘Process inserts and updates into scott.RPT_TBL’;


SELECT count(*)
into v_bgn_cnt
FROM scott.STG_TBL;


IF v_bgn_cnt >0 THEN


— Process each change


FOR rec IN cur LOOP


IF rec.opt = ‘I’ THEN
INSERT INTO scott.RPT_TBL
VALUES
( rec.pk_id,
rec.user_id,
rec.phn_nbr,
rec.login_id,
rec.usr_type_cd,
rec.usr_dt,
rec.email_add);
END IF;

IF rec.opt = ‘U’
THEN
UPDATE scott.RPT_TBL
set USER_ID = rec.user_id,
PHN_NBR = rec.phn_nbr,
login_id = rec.login_id,
user_type_cd = rec.user_type_cd,
usr_dt = rec.usr_dt,
email_add = rec.email_add
WHERE pk_ID = rec.pk_ID;
END IF;

IF rec.opt = ‘D’
THEN
DELETE from scott.RPT_TBL
WHERE pk_ID = rec.pk_ID;
END IF;


COMMIT;
END LOOP;
END IF;

–Insert Error handling here, I’m going to skip this in the example…
–Add a lookup table to track the pk_id from previous runs, etc.
–Truncate the stage table for the next run once success had been reported

END UPD_RPTTBL_SP;

We now can use the objects and procedures to maintain a copy of the SRC_TBL.  The DBA or developer simply needs to choose a functional interval to refresh and execute the procedures view a scheduled job or through a shell script.
exec UPD_PKID_FOR_RPT;
exec UPD_RPTTBL_SP;

Ta-da!  A simple process to syncronize data from a transactional table to a reporting copy.  This process can be scaled to load and commit in many different ways, depending on the demand and size of the targets.  It’s simple and can support many levels of loading intervals without the impact that often occurs with a materialized view refresh.

 

Print Friendly
August 30th, 2010 by

facebook comments:

  • orastory

    Simples, but you should be doing the work in BULK with a LIMIT, assuming you're planning to select more than 1 row each time.

  • Kellyn Pedersen

    sigh…figures I grab an example that I wrote for a database that actually didn't work well with bulk collects…

    CURSOR new_data_cur IS

    SELECT pk_id, opt
    FROM STG_TBL;

    type new_data_type IS TABLE OF new_data_cur%ROWTYPE INDEX BY PLS_INTEGER;
    new_data_tab new_data_TYPE;

    type row_id_type IS TABLE OF ROWID INDEX BY PLS_INTEGER;
    row_id_tab row_id_TYPE;

    TYPE rt_update_cols IS RECORD
    (
    new_col1 new_data.new_col1%type,
    new_col2 new_data.new_col1 %type
    );

    TYPE update_cols_type IS TABLE OF rt_update_cols INDEX BY PLS_INTEGER;
    update_cols_tab update_cols_type;

    dml_errors EXCEPTION;
    PRAGMA EXCEPTION_INIT(dml_errors, -24381);

    etc…etc…
    fetch base_table_cur bulk collect into base_tab limit 100;

  • Facebook
  • Google+
  • LinkedIn
  • Twitter