Monday, March 6, 2017

Hints in oracle and Informatica

Hints for improving the performance of query in Oracle
The performance and the explain plan of a query can be improved by using Hints in the query, here are few of them:
1. /*+ parallel(table_name,8) */ can be used in a select statement.

Example: select /*+ parallel(emp,8) */ * from emp;
This will help in getting the results quickly, this hint will create 8 parallel pipelines to select the records from the emp table in this example.
This hint can also be used with inserts, but will only help without the DBlinks (meaning copying data from the same database).

2. /*+ append */ used with insert statement

Example: insert /*+ append */ into emp select /*+ parallel(emp,8) */  * from xyz;
This should be used for large loads, it bypasses the buffer cache and does a direct path load.

3. /*+ use_hash(table1 table2...) */ used with select statement

Example: select /*+ use_hash(table1 table2...) */ * from table1 table2.. where table1.xyz = table2.yzx;
This is used to improve the explain plan of the query, this hint eliminates the nested loops and uses the hash join instead. This helps in improving the performance of the query.

SELECT /*+ PARALLEL(employees 4) PARALLEL(departments 4) USE_HASH(employees)
ORDERED */
       MAX(salary), AVG(salary)
FROM employees, departments
WHERE employees.department_id = departments.department_id
GROUP BY employees.department_id;


Hints in oracle


·                     ALL_ROWS
One of the hints that 'invokes' the 
Cost based optimizer 
ALL_ROWS is usually used for batch processing or data warehousing systems.

(/*+ ALL_ROWS */) 

·                     FIRST_ROWS
One of the hints that 'invokes' the 
Cost based optimizer 
FIRST_ROWS is usually used for OLTP systems.

(/*+ FIRST_ROWS */) 
SELECT /*+ FIRST_ROWS(10) */ * FROM employees;


·                     CHOOSE
One of the hints that 'invokes' the 
Cost based optimizer 
This hint lets the server choose (between ALL_ROWS and FIRST_ROWS, based on 
statistics gathered.
·                     HASH
Hashes one table (full scan) and creates a hash index for that table. Then hashes other table and uses hash index to find corresponding records. Therefore not suitable for < or > join conditions.

/*+ use_hash */

Hints are most useful to optimize the query performance.


/*+ hint */
/*+ hint(argument) */
/*+ hint(argument-1 argument-2) */
All hints except /*+ rule */ cause the CBO to be used. Therefore, it is good practise to analyze the underlying tables if hints are used (or the query is fully hinted.
There should be no schema names in hints. Hints must use aliases if alias names are used for table names. So the following is wrong:
select /*+ index(scott.emp ix_emp) */ from scott.emp emp_alias
better:
select /*+ index(emp_alias ix_emp) */ ... from scott.emp emp_alias

Why using hints

It is a perfect valid question to ask why hints should be used. Oracle comes with an optimizer that promises to optimize aquery's execution plan. When this optimizer is really doing a good job, no hints should be required at all.
Sometimes, however, the characteristics of the data in the database are changing rapidly, so that the optimizer (or more accuratly, its statistics) are out of date. In this case, a hint could help.
It must also be noted, that Oracle allows to lock the statistics when they look ideal which should make the hints meaningless again.

Hint categories

Hints can be categorized as follows:
  • Hints for Optimization Approaches and Goals,
  • Hints for Access Paths, Hints for Query Transformations,
  • Hints for Join Orders,
  • Hints for Join Operations,
  • Hints for Parallel Execution,
  • Additional Hints


Documented Hints

Hints for Optimization Approaches and Goals

  • ALL_ROWS
One of the hints that 'invokes' the Cost based optimizer 
ALL_ROWS is usually used for batch processing or data warehousing systems.
  • FIRST_ROWS
  • One of the hints that 'invokes' the Cost based optimizer 
    FIRST_ROWS is usually used for OLTP systems.
  • CHOOSE
  • One of the hints that 'invokes' the Cost based optimizer 
    This hint lets the server choose (between ALL_ROWS and FIRST_ROWS, based on statistics gathered.
  • RULE
  • The RULE hint should be considered deprecated as it is dropped from Oracle9i2.
    See also the following initialization parameters: optimizer_mode,optimizer_max_permutations, optimizer_index_cost_adj,optimizer_index_caching and

    Hints for Access Paths

    • CLUSTER
    Performs a nested loop by the cluster index of one of the tables.
  • FULL
  • Performs full table scan.
  • HASH
  • Hashes one table (full scan) and creates a hash index for that table. Then hashes other table and uses hash index to find corresponding records. Therefore not suitable for < or > join conditions.
  • ROWID
  • Retrieves the row by rowid
  • INDEX
  • Specifying that index index_name should be used on table tab_name: 
    /*+ index (tab_name index_name) */ 
    Specifying that the index should be used the the CBO thinks is most suitable. (Not always a good choice). 
    Starting with Oracle 10g, the index hint can be described:
    /*+ index(my_tab my_tab(col_1, col_2)) */. Using the index on my_tab that starts with the columns col_1 and col_2.
  • INDEX_ASC
  • INDEX_COMBINE
  • INDEX_DESC
  • INDEX_FFS
  • INDEX_JOIN
  • NO_INDEX
  • AND_EQUAL
  • The AND_EQUAL hint explicitly chooses an execution plan that uses an access path that merges the scans on several single-column indexes

    Hints for Query Transformations

    • FACT
    The FACT hint is used in the context of the star transformation to indicate to the transformation that the hinted table should be considered as a fact table.
  • MERGE
  • NO_EXPAND
  • NO_EXPAND_GSET_TO_UNION
  • NO_FACT
  • NO_MERGE
  • NOREWRITE
  • REWRITE
  • STAR_TRANSFORMATION
  • USE_CONCAT
  • Hints for Join Operations

    • DRIVING_SITE
    • HASH_AJ
    • HASH_SJ
    • LEADING
    • MERGE_AJ
    • MERGE_SJ
    • NL_AJ
    • NL_SJ
    • USE_HASH
    • USE_MERGE
    • USE_NL

    Hints for Parallel Execution

    • NOPARALLEL
    • PARALLEL
    • NOPARALLEL_INDEX
    • PARALLEL_INDEX
    • PQ_DISTRIBUTE

    Additional Hints

    • ANTIJOIN
    • APPEND
    If a table or an index is specified with nologging, this hint applied with an insert statement produces a direct path insert which reduces generation of redo.
  • BITMAP
  • BUFFER
  • CACHE
  • CARDINALITY
  • CPU_COSTING
  • DYNAMIC_SAMPLING
  • INLINE
  • MATERIALIZE
  • NO_ACCESS
  • NO_BUFFER
  • NO_MONITORING
  • NO_PUSH_PRED
  • NO_PUSH_SUBQ
  • NO_QKN_BUFF
  • NO_SEMIJOIN
  • NOAPPEND
  • NOCACHE
  • OR_EXPAND
  • ORDERED
  • ORDERED_PREDICATES
  • PUSH_PRED
  • PUSH_SUBQ
  • QB_NAME
  • RESULT_CACHE (Oracle 11g)
  • SELECTIVITY
  • SEMIJOIN
  • SEMIJOIN_DRIVER
  • STAR
  • The STAR hint forces a star query plan to be used, if possible. A star plan has the largest table in the query last in the join order and joins it with a nested loops join on a concatenated index. The STAR hint applies when there are at least three tables, the large table's concatenated index has at least three columns, and there are no conflicting access or join method hints. The optimizer also considers different permutations of the small tables.
  • SWAP_JOIN_INPUTS
  • USE_ANTI
  • USE_SEMI
  • Undocumented hints:

    • BYPASS_RECURSIVE_CHECK
    Workaraound for bug 1816154
  • BYPASS_UJVC
  • CACHE_CB
  • CACHE_TEMP_TABLE
  • CIV_GB
  • COLLECTIONS_GET_REFS
  • CUBE_GB
  • CURSOR_SHARING_EXACT
  • DEREF_NO_REWRITE
  • DML_UPDATE
  • DOMAIN_INDEX_NO_SORT
  • DOMAIN_INDEX_SORT
  • DYNAMIC_SAMPLING
  • DYNAMIC_SAMPLING_EST_CDN
  • EXPAND_GSET_TO_UNION
  • FORCE_SAMPLE_BLOCK
  • GBY_CONC_ROLLUP
  • GLOBAL_TABLE_HINTS
  • HWM_BROKERED
  • IGNORE_ON_CLAUSE
  • IGNORE_WHERE_CLAUSE
  • INDEX_RRS
  • INDEX_SS
  • INDEX_SS_ASC
  • INDEX_SS_DESC
  • LIKE_EXPAND
  • LOCAL_INDEXES
  • MV_MERGE
  • NESTED_TABLE_GET_REFS
  • NESTED_TABLE_SET_REFS
  • NESTED_TABLE_SET_SETID
  • NO_FILTERING
  • NO_ORDER_ROLLUPS
  • NO_PRUNE_GSETS
  • NO_STATS_GSETS
  • NO_UNNEST
  • NOCPU_COSTING
  • OVERFLOW_NOMOVE
  • PIV_GB
  • PIV_SSF
  • PQ_MAP
  • PQ_NOMAP
  • REMOTE_MAPPED
  • RESTORE_AS_INTERVALS
  • SAVE_AS_INTERVALS
  • SCN_ASCENDING
  • SKIP_EXT_OPTIMIZER
  • SQLLDR
  • SYS_DL_CURSOR
  • SYS_PARALLEL_TXN
  • SYS_RID_ORDER
  • TIV_GB
  • TIV_SSF
  • UNNEST
  • USE_TTT_FOR_GSETS
  • Thanks


    2 comments:

    1. I personally feel and have also experienced that both Informatica and Oracle are the most useful tools for database and software related problems respectively.

      Informatica Read Soap API



      ReplyDelete

    Data engineering Interview Questions

    1)  What all challenges you have faced and how did you overcome from it? Ans:- Challenges Faced and Overcome As a hypothetical Spark develop...