Select query on dba_mviews takes too long

Who should tune "select * from dba_mviews" query? 

I’ve moved my blog from https://insanedba.blogspot.com to https://dincosman.com Please update your bookmarks and follow/subscribe at the new address for all the latest updates and content. More up-to-date content of this post may be available there.

In one of our database, whenever "Select * from dba_mviews" statement, with no other filter, gets executed,  it takes 45 seconds to complete.  Actually this is not an acceptable length and requires a detailed investigation.

Inspect slow query on dba_mviews


Basically "dba_mviews" is a data dictionary view. Its query is as shown below, a really complex one, just skip it.

A great number of subquery runs behind the scenes. As this database serves in an air-gapped environment, I do not have any execution plans of the real-time scenario. Below is the execution plan in my test environment. Access paths are similar.

The execution plan has some full table scans such as SYS.SUMDELTA$, SYS.SUMDETAIL$, SYS.SMON_SCN_TIME, SYS.TABCOMPART$ etc. When we exclude the three columns (NUM_STALE_PCT_REGIONS, NUM_FRESH_PCT_REGIONS, STALE_SINCE) from query instead of selecting all columns (*), it runs super fast.

I will inspect why adding these three columns to query costs that much within two sections. 

Section 1 (NUM_STALE_PCT_REGIONS, NUM_FRESH_PCT_REGIONS) 

NUM_FRESH_PCT_REGIONS and NUM_STALE_PCT_REGIONS are added to dba_mviews from 11g onwards. These two columns hold the number of fresh/stale PCT partition regions.

What is Partition Change Tracking?

Partition Change Tracking (PCT) is a cool feature for fast refresh of materialized views when one of the base tables is partitioned. By using this feature, It is possible and advantageous to track freshness to a finer grain than the entire materialized view. The ability to identify which rows in a materialized view are affected by a certain detail table partition, is known as Partition Change Tracking. When one or more of the detail tables are partitioned, it may be possible to identify the specific rows in the materialized view that correspond to a modified detail partition(s); those rows become stale when a partition is modified while all other rows remain fresh.

In a nutshell, PCT helps fast refresh of materialized views and in dba_mviews data dictionary, this information is shown, but it is expensive. 

In the MOS (https://support.oracle.com/epmos/faces/DocumentDisplay?id=1322308.1&displayIndex=1) There is a document similar to our case "Select from DBA_MVIEWs is very slow in 11g. (Doc ID 1322308.1)" According to the document, the cause of the problem and the solution is as stated below.

Cause
     The tables TABPART$, TABSUBPART$ and TABCOMPART$ are queried internally. If the data volume is high in these internal tables, then the query will be very slow." 
        This is expected behaviour with  high volumes of data in these tables.
Solution:
    If there is no specific reason for querying the columns NUM_FRESH_PCT_REGIONS & NUM_STALE_PCT_REGIONS then create an view on DBA_MVIEWS excluding these columns and query the new view.

To be honest, it is not a satisfying answer and there is a lot to say.

First of all;
In 19c Release, The query's execution plan is different than 11g. 

Based on the execution plan at line 38 and line 73, CBO accesses SYS.SUMDELTA$ table with full table scans. When i checked the statistics of this table, statistics were missing and  I thought that I found something worthy and i immediately gathered dictionary stats and fixed object stats, but statistics were still not gathered. I gathered stats of this table explicitly by using dbms_stats.gather_table_stats procedure and ran the query again but the execution plan did not change.  Considering the "Doc ID 1684072.1" Statistics are not gathered for the following objects:

    * IOT overflow segments

    * IOT mapping tables

    * snapshot logs

    * objects in the recycle bin 

    * external tables

    * context DR$/DR# tables

    * sys.sumdelta$:

    * secondary tables of domain indexes

There should be no statistics for the sys.sumdelta$ table and i dropped the statistics. In our ten years old database, SYS.SUMDELTA$ table was storing more than 200.000 rows. 

Actually, In an Oracle database, a materialized view log, associated with the master table of a materialized view and the SYS.SUMDELTA$ are equally responsible in the fast-refresh of materialized views.

When one of the base tables in the mview query undergoes DML changes (such as INSERT, UPDATE, or DELETE), the rows describing those changes are stored in the materialized view log. If these base tables undergo a bulk operation such as a direct-path insert (/*+ APPEND */) , the newly created blocks information is stored in SYS.SUMDELTA$ table and nothing gets recorded in  the materialized view logs.

Strangely, in this database, there was not any materialized view logs. All the materialized views were created with "REFRESH COMPLETE ON DEMAND" syntax and also always refreshed with complete method and atomic refresh set to false flag. As the atomic refresh flag set to false, Oracle optimizes refreshes by using parallel DML and direct path inserts. Nevertheless, sys.sumdelta$ still holds records in case of need for fast refresh method. 

Also, when i checked over the data stored in SYS.SUMDELTA$ table, i discovered that, many of materialized views were dependent to each other and refreshed on daily periods and their refresh times were not designed correctly. That faulty refresh period design was among the factors leading to high volumes of data on SYS.SUMDELTA$ table. Their refresh periods should had been designed according to the dependency to each other. Dba_dependencies dictionary view can be used as a starting point for that purpose. Moreover, lots of materialized views were dropped over time, but the bulk operation records of base tables of these dropped materialized views were not deleted. I truncated SYS.SUMDELTA$ table and refreshed all the materialized views and redesigned refresh periods. After that operation, table size reduced dramatically and full table scans benefitted from reduced table size. 

There is also a tracing event switch to disable direct inserts logging in sumdelta$ table. (Bug 16556250)

I would not recommend using that event switch, as this event is set, direct inserts to base tables wont be recorded anymore and when fast refresh is used, you will end up with wrong result.

Herethefore, we reduced the size of the `sys.sumdelta$` table to an ideal amount, but now the execution time for the query `select * from dba_mviews` takes 37 seconds. There is one more column to inspect.

Section 2 (STALE_SINCE)

When we exclude the `stale_since` column, the query runs fast now. The `stale_since` column stores "the time from when the materialized view became stale". Here is the abbreviated form of the query and the execution plan:

Based on the execution plan at line 3, CBO accesses the `SYS.SMON_SCN_TIME` table with a full table scan. At this stage, basically an SCN to timestamp conversion takes place.  In "Doc Id 365536.1, How To Map SCN To Timestamp Before 10g?", SCN and time conversion are clearly explained. According to this document:

`SYS.SMON_SCN_TIME` will have a maximum of 1440 rows, and each record will be for a 5-minute period. Oracle maintains this information for a maximum of 5 days, after which the records will be recycled.

This means that data is stored 12 times per hour * 24 hours * 5 days = 1440 rows. 
SCN value is stored internally as :

    * SCN_wrap
    * SCN_base

But, if you are using the flashback time travel feature, the data stored in this table will depend on the Flashback Data Archive retention period parameter.

I also suggest DBAs administering databases with long retention period requirements to look into "Jonathan Lewis' Flashback Fail post" to get more on `SMON_SCN_TIME` and to avoid ORA-2475.

For example, for a two-year retention, `SMON_SCN_TIME` will store more than 200,000 rows. As the CBO accesses this table with a full table scan, for each row in `dba_mviews` view, each row of the `SMON_SCN_TIME` table is read in a sequential (serial) order. And that is the problem in a nutshell.

In general, the required steps to avoid a full table scan are "CREATING AN INDEX" or "REWRITING THE QUERY".

I can speed up this query easily by creating an index on `(scn_wrp * 4294967296 + scn_bas)`. In my demo environment, I will create it.

The new execution plan, after the index has been created, is as follows. 

Now the full table scan of `SMON_SCN_TIME` transformed to INDEX RANGE SCAN. 

I also tested this workaround on the snapshot standby database of the real production database, and the query ran within 1 second.

Wait a minute, folks, something is wrong here.
I will not be advising you to create an index on an internal table belonging to the `sys` user. That is up to Oracle developers, not the DBAs. Creating an index on such an internal table may have some negative unintended consequences, as the more indexes created, the more slowly inserts. It should be tested thoroughly.

OK then, What about rewriting the query?

On `SMON_SCN_TIME`, `TIME_MP` column is already indexed and it stores the time, calculated with "scn_wrp * 4294967296 + scn_bas". I still wonder, why this indexed column is not used in this dictionary view query.

I created another view called `dba_mviews2`, just replacing "scn_wrp * 4294967296 + scn_bas" statement with "time_mp". Now "select * from dba_mviews2" runs super fast just like FLASH.

speed up data dictionary query


Hope it helps.

Comments

Popular posts from this blog

Oracle Grid Release Update by using Ansible Playbooks

Oracle Database Release Update by Using Ansible Playbooks

How to Upgrade PostgreSQL, PostGIS and Patroni in Air-Gapped Environments