ORACLE 12C : approx_count_distinct parameter

The APPROX_COUNT_DISTINCT function was added, but not documented, in
Oracle 11g to improve the speed of calculating the number of distinct values
when gathering statistics using the DBMS_STATS package. Oracle database
12c (12.1.0.2) now includes the function in the documentation, so we are free
to use it in our applications as a supported SQL function.
It returns approximate number of rows that contain distinct values of
expression.
It’s alternative to the COUNT (DISTINCT expr) function, which returns the
exact number of rows that contain distinct values of expr.
For processing large amounts of data it’s significantly faster than COUNT,
with negligible deviation of values from the exact result.
The APPROX_COUNT_DISTINCT() function ignores records that contain a null
value for the expression. Plus is performs less work on the sorting and
aggregations. Just run and Explain Plan and you can see the differences.
In some of the material from Oracle the APPROX_COUNT_DISTINCT() function
can be more than 5x times faster. But it depends on the number of distinct
values and the complexity of the SQL query.
Example
Suppose you have a table loaded with 30 million of rows.
Now simple performance test:
set timing on
SELECT
  COUNT(DISTINCT cp1),
  COUNT(DISTINCT cp2),
  COUNT(DISTINCT cp3)
FROM TEST_TABLE;
       CP1        CP2        CP3
———- ———- ———-
      3000       1000        100
Elapsed: 00:00:32.125
SELECT
  APPROX_COUNT_DISTINCT(cp1),
  APPROX_COUNT_DISTINCT(cp2),
  APPROX_COUNT_DISTINCT(cp3)
FROM TEST_TABLE;
      CP1       CP2       CP3
———- ———- ———-
      2962       990        100
Elapsed: 00:00:04.368

You may also like...

Leave a Reply

Your email address will not be published. Required fields are marked *