Wednesday, November 16, 2011

Data replication using DBMS_COMPARISON


dbms_compare  can compare objects/schemas/data between two different databases or schema's. You can use it to  help determine if there is some data divergence between two tables. If data divergence is found you can bring the objects to a consistent state.

For example, if I have two schemas in my database (SCOTT and HR in this case) I can compare the EMP tables in both schemas. First I create the comparison:
exec dbms_comparison.create_comparison('hr','hr','emp',null,null,null,'scott','emp');

This tells Oracle that I'm getting ready to compare these two objects. This can be done over a DBLink too by just defining the name of the dblink. Even more, while the local compare side must be Oracle Database 11g, the remote site can be Oracle Database 10g release one or late.

Now I can do the actual compare as seen here:

declare
compare_info dbms_comparison.comparison_type;
compare_return boolean;
begin
compare_return :=
dbms_comparison.compare (
comparison_name=>'hr',
scan_info=>compare_info,
perform_row_dif=>TRUE);

if compare_return=TRUE
then
dbms_output.put_line('No differences exists');
else
dbms_output.put_line('Differences found');
dbms_output.put_line('Check the dba_comparison and dba_comparison_scan_summary views for locate the differences for scan_id:'||compare_info.scan_id);
end if;
end;
/

Here is the result in my case:

Bad news... there is data divergence. Check the dba_comparison and dba_comparison_scan_summary views for locate the
differences for scan_id:13

As we can see here, the results send us to dba_comparison and dba_comparison_scan_summary where we find these results:

select a.owner, a.comparison_name, a.schema_name, a.object_name,
z.current_dif_count difference
from dba_comparison a, dba_comparison_scan_summary z
where a.comparison_name=z.comparison_name
and a.owner=z.owner
and z.scan_id=13;

OWNER COMPARISON_NAME SCHEMA_NAME OBJECT_NAME DIFFERENCE
------------------------------ ------------------------------ ----------
HR           HR                           HR                   EMP            16

There are actually a lot of views you can use here including:

DBA_COMPARISON
DBA_COMPARISON_COLUMNS
DBA_COMPARISON_SCAN
DBA_COMPARISON_SCAN_SUMMARY
DBA_COMPARISON_SCAN_VALUES
DBA_COMPARISON_ROW_DIF

You can also use the dbms_comparison.converge function to "fix" the data, as it were. This procedure will converge our objects, taking care of the data divergence. With this procedure you can either say that the remote or local table is to be the "master" table. Divergences will be sourced from that object. So, in our case we can update the hr.emp table so that it's no longer divergent with the scott.emp table. Here is an example:

declare
compare_info dbms_comparison.comparison_type;
begin
dbms_comparison.converge (
comparison_name=>'hr',
scan_id=>13,
scan_info=>compare_info,
converge_options=>dbms_comparison.cmp_converge_remote_wins);

dbms_output.put_line('--- Results ---');
dbms_output.put_line('Local rows Merged by process: '||compare_info.loc_rows_merged);
dbms_output.put_line('Remote rows Merged by process: '||compare_info.rmt_rows_merged);
dbms_output.put_line('Local rows Deleted by process: '||compare_info.loc_rows_deleted);
dbms_output.put_line('Remote rows Deleted by process: '||compare_info.rmt_rows_deleted);
end;
/

--- Results ---
Local rows Merged by process: 16
Remote rows Merged by process: 0
Local rows Deleted by process: 0
Remote rows Deleted by process: 0

Note that if you run this more than once, you are in no danger of duplicating your data as seen in this subsequent output from a second run:

--- Results ---
Local rows Merged by process: 0
Remote rows Merged by process: 0
Local rows Deleted by process: 0
Remote rows Deleted by process: 0

Something else to be aware of. If the data changes in either table, it can impact the compare/converge operations. Thus for best results, you should quiesce all activity on the tables being compared.

There are some other requirements with this feature that you will want to review (as always), but it is a very powerful new feature in Oracle Database 11g!

No comments: