Oracle Change Data Capture Technology – Evolution and Current Form

Oracle CDC (Change Data Capture) is the technology used to monitor and track changes to a database so that action can be taken based on those changes. In its simplest form, Oracle CDC can be described as a set of software design patterns that manage the identification, capture, and delivery of all changes made to a source database. By enabling real-time data integration and faster data warehouse loading, Oracle CDC also improves the quality and performance of databases.

Oracle CDC is one of the most efficient tools for replicating databases, and it works in an optimized, non-intrusive way. It supports several activities, such as migrating databases from on-premises to the cloud without shutting down the source databases, and offloading analytical queries from databases in use to data warehouses. Businesses can also mine incremental data from different sources and migrate it to a data warehouse.

A critical function of Oracle CDC is capturing and preserving the state of the data within a data warehouse environment. This can be deployed in any database or data repository system. Users configuring Oracle CDC can use application logic, physical storage, or a combination of the two.

The Evolution of the Oracle Change Data Capture Technology

Oracle Corporation first introduced CDC in Oracle 9i as a built-in, out-of-the-box feature. It tracked all changes made to user tables in a database and stored them in change tables for use by ETL (Extract, Transform, Load) applications, which then processed the data and moved it to other databases or data warehouses. This early form of Oracle CDC worked through triggers placed on the source tables.

However, Database Administrators found this trigger-based approach complex and invasive, so with its 10g version Oracle released a modified form of CDC built on Oracle Streams, its built-in replication tool. Instead of using triggers, this version read changes from the redo logs of the source database. It became popular as an effective way to detect changed data in a data repository without impacting the speed and performance of the source system.

Unfortunately, although this form of CDC was user-friendly and well accepted, Oracle deprecated Streams in its 12c version, and Change Data Capture was no longer available as a built-in feature of the Oracle database. Users had to look for another Oracle replication tool or pay for Oracle GoldenGate, which provides Oracle CDC out of the box.

Oracle Change Data Capture Currently in Use

The concept of Oracle CDC rests on a simple premise: when data in the source database changes, another system, the target database, has to take some action based on those changes. The two databases do not need to be different. Oracle CDC works just as well when the source and the target are the same database, since some CDC solutions run entirely within one system.
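The premise above can be illustrated with a minimal Python sketch that replays a stream of captured changes against a target. The `ChangeRecord` type, its field names, and the in-memory dictionary standing in for the target table are all assumptions made for illustration; none of them comes from Oracle.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class ChangeRecord:
    """One captured change; the field names are illustrative, not Oracle's."""
    operation: str         # "I" = insert, "U" = update, "D" = delete
    key: int               # primary key of the affected source row
    row: Optional[dict]    # new column values; None for a delete


def apply_changes(target: Dict[int, dict], changes: Iterable[ChangeRecord]) -> None:
    """Replay captured changes, in order, against an in-memory target table."""
    for change in changes:
        if change.operation in ("I", "U"):
            target[change.key] = dict(change.row)
        elif change.operation == "D":
            target.pop(change.key, None)
        else:
            raise ValueError(f"unknown operation: {change.operation!r}")


# Hypothetical change stream produced at the source.
target_table: Dict[int, dict] = {}
apply_changes(target_table, [
    ChangeRecord("I", 1, {"name": "Ada"}),
    ChangeRecord("I", 2, {"name": "Bob"}),
    ChangeRecord("U", 1, {"name": "Ada L."}),
    ChangeRecord("D", 2, None),
])
print(target_table)
```

Because only the changes travel from source to target, the target never needs a full copy of the source table to stay current.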

Oracle CDC recognizes changes to the data at source through the Oracle Data Integrator. Two modes are available.

The first is the Synchronous Mode, in which triggers placed on the source tables capture changes the moment they are made. Each captured SQL statement performs a Data Manipulation Language (DML) operation, which is classified as an Insert, Update, or Delete. The changed data is captured as part of the same transaction that modified the data at source. This form of Oracle CDC is available in the Oracle Standard and the Oracle Enterprise Editions.
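The trigger-based idea can be sketched with Python's built-in `sqlite3` module. This uses SQLite rather than Oracle, and the `customers` table, the `customers_ct` change table, and the trigger names are hypothetical. The point it shows is that the change row is written inside the same transaction as the DML statement, so a rolled-back change leaves no trace in the change table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);

-- Change table: one row per captured DML operation.
CREATE TABLE customers_ct (
    seq       INTEGER PRIMARY KEY AUTOINCREMENT,
    operation TEXT NOT NULL,   -- 'I', 'U' or 'D'
    id        INTEGER,
    name      TEXT
);

CREATE TRIGGER customers_ins AFTER INSERT ON customers BEGIN
    INSERT INTO customers_ct (operation, id, name) VALUES ('I', NEW.id, NEW.name);
END;

CREATE TRIGGER customers_upd AFTER UPDATE ON customers BEGIN
    INSERT INTO customers_ct (operation, id, name) VALUES ('U', NEW.id, NEW.name);
END;

CREATE TRIGGER customers_del AFTER DELETE ON customers BEGIN
    INSERT INTO customers_ct (operation, id, name) VALUES ('D', OLD.id, OLD.name);
END;
""")

# Committed transaction: both changes are captured.
with conn:
    conn.execute("INSERT INTO customers VALUES (1, 'Ada')")
    conn.execute("UPDATE customers SET name = 'Ada L.' WHERE id = 1")

# Failed transaction: the insert and its change row are rolled back together.
try:
    with conn:
        conn.execute("INSERT INTO customers VALUES (2, 'Bob')")
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass

with conn:
    conn.execute("DELETE FROM customers WHERE id = 1")

changes = conn.execute(
    "SELECT operation, id, name FROM customers_ct ORDER BY seq"
).fetchall()
print(changes)
```

The cost of this design is that every DML statement on the source table also performs an extra insert, which is one reason trigger-based capture is considered invasive.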

The second is the Asynchronous Mode, in which changes are first written to the redo log files and captured from there later, after the SQL statement performing the DML operation has been committed. The transactions that modify the source table are not affected, because the change data is not captured as part of them. Asynchronous Change Data Capture comes in three variants: HotLog, Distributed HotLog, and AutoLog. It works on the principle of the now-discontinued Oracle Streams.
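As a rough illustration of log-based capture, the Python sketch below scans a simulated log and emits only the changes that belong to committed transactions. The log layout, the record types, and the `capture_committed` function are assumptions for illustration; a real Oracle redo log has a different, binary format.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Simulated log entry: (transaction id, record type, payload).
LogEntry = Tuple[int, str, tuple]

redo_log: List[LogEntry] = [
    (101, "DML", ("I", 1, "Ada")),
    (102, "DML", ("I", 2, "Bob")),
    (101, "DML", ("U", 1, "Ada L.")),
    (102, "ROLLBACK", ()),
    (101, "COMMIT", ()),
    (103, "DML", ("D", 1, "Ada L.")),   # transaction 103 is still open
]


def capture_committed(log: List[LogEntry]) -> List[tuple]:
    """Return the DML payloads of committed transactions, in commit order."""
    pending: Dict[int, List[tuple]] = defaultdict(list)
    captured: List[tuple] = []
    for txn_id, kind, payload in log:
        if kind == "DML":
            pending[txn_id].append(payload)      # buffer until the outcome is known
        elif kind == "COMMIT":
            captured.extend(pending.pop(txn_id, []))
        elif kind == "ROLLBACK":
            pending.pop(txn_id, None)            # discard rolled-back work
    return captured


captured_changes = capture_committed(redo_log)
print(captured_changes)
```

The reader works from the log after the fact, so the source transactions do no extra work on behalf of capture.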

Both the Synchronous and the Asynchronous modes of the Oracle Data Integrator are easy to set up and configure, as the process is fully automated.

Benefits of Extracting Databases with Oracle CDC

Organizations gain several benefits by opting for database extraction with Oracle CDC.

To start with, Oracle CDC can extract Insert, Update, and Delete activity in real time, as soon as changes are made to the source tables. If CDC is not in use, database extraction cannot be done for Insert activity and is very complex for Update and Delete activity, because the data needed to identify those changes is missing.

Also, flat files are not required for staging, as Oracle CDC places the change data directly in relational tables. Without CDC, complete tables would have to be moved into flat files.

Finally, a critical benefit of Oracle CDC is the straightforward interface provided through the DBMS_LOGMNR_CDC_PUBLISH and DBMS_LOGMNR_CDC_SUBSCRIBE packages (renamed DBMS_CDC_PUBLISH and DBMS_CDC_SUBSCRIBE from Oracle 10g onward), without which CDC would require extensive manual effort. Although Oracle CDC is now delivered through the paid Oracle GoldenGate product, it remains affordable and optimizes all activities related to database migration and replication. Note, however, that configuring CDC requires several changes to user permissions before it can be initialized and completed.
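The publish and subscribe split can be pictured with a small Python stand-in: a publisher appends rows to a shared change table, and each subscriber looks at a window over that table, extending it to see new changes and purging it once they are processed. The `Subscription` class and its method names are hypothetical. They echo the extend-window and purge-window steps of the subscribe package but are not Oracle's API.

```python
from typing import List, Tuple

ChangeRow = Tuple[int, str, int]   # (sequence number, operation, row key)


class Subscription:
    """Toy subscriber view over a shared, append-only change table."""

    def __init__(self, change_table: List[ChangeRow]) -> None:
        self._change_table = change_table
        self._low = 0    # index of the first row in the current window
        self._high = 0   # index one past the last row in the current window

    def extend_window(self) -> None:
        """Raise the upper bound so newly published changes become visible."""
        self._high = len(self._change_table)

    def view(self) -> List[ChangeRow]:
        """Return the change rows inside the current window."""
        return self._change_table[self._low:self._high]

    def purge_window(self) -> None:
        """Mark the current window as processed so it is no longer returned."""
        self._low = self._high


# The publisher side is simulated by appending to the shared list.
change_table: List[ChangeRow] = [(1, "I", 10), (2, "U", 10)]
sub = Subscription(change_table)

sub.extend_window()
first_batch = sub.view()
sub.purge_window()

change_table.append((3, "D", 10))
before_extend = sub.view()     # the new row is not visible yet
sub.extend_window()
second_batch = sub.view()

print(first_batch, before_extend, second_batch)
```

Keeping a separate window per subscriber lets several consumers read the same change table at their own pace without seeing a row twice.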