Google Search

Tuesday, October 15, 2013

Tuning SQL Queries


1.    Optimizers
Oracle optimizes the statement after validating the syntax and verifying the objects.
There are two types of optimizers
a)    Rule Based Optimizer
b)    Cost based optimizer



a)   Rule Based Optimizer (RBO)

It is Widely used in OnLine Transaction Processing (OLTP) applications.
If more than one access paths are available for executing a particular SQL statement, then RBO uses the access path with higher rank (i.e. lesser rank value).
Following table gives the access paths available with RBO and their corresponding ranks-
Rank
Access Path
1
Single row by rowid
2
Single row by cluster join
3
Single row by hash cluster key with unique or primary key
4
Single row by unique or primary key
5
Cluster join
6
Hash cluster key
7
Indexed cluster key
8
Composite key
9
Single-column indexes
10
Bounded range search on indexed columns
11
Unbounded range search on indexed columns
12
Sort-merge join
13
MAX or MIN of indexed column
14
ORDER BY on indexed columns
15
Full table scan
SELECT deptname FROM dept WHERE deptno = 40;
Above statement can be executed using 'Full table scan' of dept  table or using the index available on deptno column. RBO will go for index scan as it is ranked higher. If  above query is going to retrieve 90 rows out of total 100 rows in the dept  table, RBO will go for the index scan. The disk head will be flying all over the place to retrieve every row, which is more time-consuming if many rows are getting selected (as in the above case).
b)   Cost Based Optimizer (CBO)
       It uses a numerical value (termed as Cost) assigned to a SQL statement based on the estimated amount of memory, input-output and CPU time. If there are more than one method available for executing a particular SQL statement, then CBO executes the method with the lowest cost estimate.
In case of the query discussed above (in the RBO para), CBO will first check the 96/4 rule. This rule states that if 4% or less of the rows of a table are returned, an index would probably return the rows faster than a Full table scan.
So, in above case CBO will go for the Full table scan. CBO uses the statistics available for the objects referred in the SQL statement. If stats are not available, oracle on the fly derives the cost. CBO is not generally used in OLTP applications, as the latest statistics are not available for CBO to take a correct decision.
Statistics are generated using ANALYZE command. This command populates the data dictionary tables. For eg. Analyzing a table populates num_rows, blocks, avg_row_len columns of corresponding row in user_tables. Similarly, analyzing an index populates distinct_keys column of user_indexes.
The Syntax of Analyze command is-
ANALYZE TABLE/INDEX/CLUSTER <obj_name> COMPUTE/ESTIMATE/DELETE STATISTICS SAMPLE <number> ROWS/PERCENT;
'Sample rows/percent' can only be used with ESTIMATE clause.
  • For larger tables, use ESTIMATE clause. It is fairly accurate and much faster than the COMPUTE clause.
  • Use dbms_utility.analyze_schema procedure to analyze all tables of a schema. 
There are some features in oracle which works only with CBO-
  1. Partitioned Tables
  2. Star Join
  3. Bitmap Index
  4. Reverse key index
  5. Parallel Query
  6. Index Organized tables etc. etc.
 How to Select the Optimizer?
The Optimizer can be selected at Instance, Session or Statement levels.
At instance level: we can set optimizer modeby defining in pfile.
At Session level: using Alter Session command, we can set optimizer_goal parameter.
At statement level:  we can use Hints to select the optimizer.
 The possible values for these parameters/hints are-
FIRST_ROWS- Uses CBO. Executes the method which would retrieve the first row faster it will give the best response time.
ALL_ROWS-    Uses CBO. Executes the method which would retrieve all the rows faster. It gives the best throughput.
RULE-               Uses RBO.
CHOOSE-        Uses CBO if statistics are available for any of the table referred in the FROM clause, otherwise goes for RBO.

2.    Explain Plan
It is an application performance tuning tool. The purpose of this tool is to indicate the steps in the Plan tree for a particular SQL statement. Oracle formulates a Plan tree before executing the statement, based on the contents of the data dictionary. For each step, a row is inserted into the table plan_table.
To create the plan_table, execute the SQL script utlxplan.sql in “$ORACLE_HOME/rdbms/admin for oracle on UNIX platform 
Information stored in plan_table includes the optimizer, driving table, the table access method, and the estimated cost to execute the statement being explained.
 SQL Trace & tkprof-
It is used for complex, long-running SQL statements. It gives more detailed information.
Following are the steps to get SQL trace.
1. ALTER SYSTEM SET TIMED_STATISTICS = TRUE
2. ALTER SESSION SET SQL_TRACE = TRUE
3. Run the SQL statement. The unformatted trace file (.trc) is created in the path stored in the    parameter user_dump_dest.
4. To create a formatted prf file, give following command at the unix prompt
  tkprof <filename.trc> <output_filename.prf>

Generating Trace in Raptor




Generating Trace in Toad



Generating Trace in Concurrent Programs
          Submit the concurrent program and note down the Concurrent Request id.
          If the concurrent program is a report then the trace file name is derived by the below query
select oracle_process_id From fnd_concurrent_requests where request_id=<conc request id> 
DB_NAME”_ORA_” oracle_process_id”.trc
else the trace file name is the DB_NAME”_ORA_”Request_Id”.trc
          The location of the trace file is provide by below query,
SELECT name,value from v$parameter WHERE name like 'user_dump_dest'
          Trace can also be enabled while submitting the concurrent request by clicking on Debug Options buttons and select  SQL trace check box (See the below snapshot).
          If the Debug option button is greyed out & not updateable set the profile Concurrent: Allow Debugging to Yes

3.    Hints
Hints are extremely flexible and we  can combine them.

The syntax is-
SELECT --+ full(emp)
SELECT /*+ full(emp) */ ename
Cases when hints are ignored-
SELECT /*+ full */ ename FROM emp; -- table name missing
SELECT /*+ full(emp) */ ename FROM emp E; -- table alias E or e should be used
SELECT /*+ full(emp) index(emp btree_deptno) */ ename FROM emp E; -- conflicting hints
  • Hints to use btree index are INDEX, INDEX_ASC, INDEX_DESC and INDEX_FFS. Hints INDEX & INDEX_ASC are similar. Use INDEX_DESC, when you know that the value searched for is near the end. This hint might speed up the query but the cost will be same. INDEX_FFS hint is used when all the columns in SELECT clause can be got from index scan.
These can be used in any of the following 3 formats-
INDEX(<table_name>) -- uses index with lowest cost
INDEX(<table_name> <index1_name>) -- uses index1_name
INDEX(<table_name> <index1_name> <index2_name>) -- uses index with lowest cost
To use both the indexes and merge the index scans, use-
AND_EQUAL(<table_name> <index1_name> <index2_name>) -- AND is used in WHERE clause
USE_CONCAT(<table_name> <index1_name> <index2_name>) -- OR is used in WHERE clause, may use Full table scan also
  • Hint to use bitmap index is-
    INDEX_COMBINE(<table_name> <index1_name>) -- works only with CBO
  • With RBO, you can change the driving table by physically changing the order of tables in the
    FROM clause (if oracle ranks the tables equally). The last table in FROM clause will be the
    driving table. Using ORDERED hint, we can choose the most selective table in the chain as the driving table. The first table in the FROM clause will be the driving table.
  • STAR hint is used when Fact table is joined with many Dimension tables. This hint forces oracle to place the fact table at the end of the join chain. The parameter hash_join_enabled should be true.
  • To select a joining method, use hints USE_MERGE (for sort merge), USE_NL (for nested loop), USE_HASH (for hash join) and CLUSTER (for cluster join). Syntax is-
    USE_MERGE(<table_name>)
  • Use HASH hint to access tables in hash cluster.
  • Use PARALLEL hint if your platform has multiple CPUs.
    PARALLEL(<table_name>,<no_of_processes>)
    Parallel processes can also be used in Create table/index and sqlloader commands.
  • If you have small tables that applications are constantly performing full table scan on and you want the table blocks to remain in memory as long as possible, use CACHE hint.
    CACHE(<table_name>)
4.    Joining Tables

Oracle uses one of the 4 methods to join tables together-

·         Sort merge,
·         Nested Loop,
·         Hash
·         Index Cluster.

Nested Loop seems to be the fastest of the four and the Sort Merge slowest. For oracle to use hash join, the parameter hash_join_enabled must be set to true. The optimizer joins the tables using a cluster join when the tables are in a btree index cluster. Nested loop join work by reading the outer table, using the information gathered to probe the inner table. Whereas in case of Hash and Sort merge joins both the tables are accessed seperately and then the resultant data is joined.

Method
When Used
Sort Merge
No indexes on either tables and no cost hints
Nested Loop
One index on either PK or FK
2. Both PK & FK indexed
3. The hint use_nl
Hash
No indexes on PK or FK columns and no cost hints and    hash_join_enabled=true
2. The hint Hash
Cluster
Tables are in a btree index cluster

5.    Other tuning tips
Below are points which should be considered before writing SQL query:

  • Check the frequency of Commits in a batch program. The more frequently you commit, the slower the batch program executes. But, if you commit rarely, you might get Rollback segment error. The frequency should be optimal.
  • Use truncate command when deleting all the rows from a table. It is much faster than delete command. It drops the storage unlike Delete command and also fires an auto commit. So the Truncate cannot be rolled back. Truncate cannot be used if the table has delete trigger.
  • Use NOT EXISTS instead of NOT IN. NOT EXISTS can use index whereas the NOT IN  cannot.
  • Use Set operators wherever possible.
  • If you use the Connect By and Start with clause, you should probably index the columns in the Connect By clause.
  • Row chaining occurs when one row is updated to new value and the new value is stored in another block. This might slow down the queries. Use proper pctfree setting to prevent row chaining.

No comments:

Post a Comment