When it comes to tuning a badly-performing query, there are many things that need to be checked. There may be poor query design causing the query to run slowly. There could be an issue with the underlying hardware such as CPU or IO which is bringing the performance of the query down. There could be stale statistics or missing indexes on the important columns. In short, there is not just one reason that a query can be performing poorly. Now, we can’t tune everything in just one go. But we can certainly look for one thing that would surely bring the performance of a query down – parsing of the query. This article will help you understand exactly what parsing is and how it impacts a query’s performance.
Before we look into what parsing is, let’s first understand the steps involved in processing a query. From the moment a query is written and submitted by the user to the point of its execution and the eventual return of the results, there are several steps involved. These steps are outlined in the following diagram.
The outline of the above workflow is:
In this article, we won’t look at the details of any step other than parsing.
A SQL statement is composed of various inputs, e.g. different tables, functions and expressions. It is therefore possible that there are multiple ways to execute one query. Of course, the query should run in the optimal way so that it executes in the shortest possible time. Parsing is the process by which this decision is made: for a given query, the database works out the different ways in which the query can run and chooses among them. Every query must be parsed at least once.
The parsing of a query is performed within the database using the Optimizer component. The Optimizer evaluates many different attributes of the given query, e.g. the number of tables involved, whether indexes are available, what kind of expressions are involved, etc. Taking all of these inputs into consideration, the Optimizer decides the best possible way to execute the query. This information is stored within the SGA in the Library Cache – a sub-pool within the Shared Pool.
There are two possible states for a query’s processing information. One, that it can be found in the Library Cache and two, that it may not be found. The memory area within the Library Cache in which the information about a query’s processing is kept is called the Cursor. Thus if a reusable cursor is found within the library cache, it’s just a matter of picking it up and using it to execute the statement. This is called Soft Parsing. If it’s not possible to find a reusable cursor or if the query has never been executed before, query optimization is required. This is called Hard Parsing.
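The difference between the two outcomes can be sketched with a toy model. This is only an illustration and not how the database implements the Library Cache: the class `ToyLibraryCache`, its counters, the placeholder plan and the `emp` table in the sample statements are all hypothetical names introduced here. The sketch also assumes that a cursor is looked up by the exact text of the statement, so the same query written in a different case is treated as a new statement.

```python
import hashlib


class ToyLibraryCache:
    """Toy stand-in for the Library Cache (hypothetical, for illustration only)."""

    def __init__(self):
        self._cursors = {}    # hash of the statement text -> cached "cursor"
        self.soft_parses = 0
        self.hard_parses = 0

    def parse(self, sql_text):
        # Assumption: the lookup key is derived from the exact statement text.
        key = hashlib.sha256(sql_text.encode("utf-8")).hexdigest()
        cursor = self._cursors.get(key)
        if cursor is not None:
            # A reusable cursor was found: soft parse, no optimization needed.
            self.soft_parses += 1
            return cursor
        # No reusable cursor: hard parse, the statement has to be optimized.
        self.hard_parses += 1
        cursor = self._optimize(sql_text)
        self._cursors[key] = cursor
        return cursor

    @staticmethod
    def _optimize(sql_text):
        # Placeholder for the expensive work done by the optimizer.
        return {"sql": sql_text, "plan": "placeholder plan"}


cache = ToyLibraryCache()
cache.parse("select * from emp where empno = 7369")  # first time: hard parse
cache.parse("select * from emp where empno = 7369")  # same text: soft parse
cache.parse("SELECT * FROM emp WHERE empno = 7369")  # different text: hard parse
print(cache.hard_parses, cache.soft_parses)
```

In this sketch only the second call reuses a cursor. The other two calls each pay the full cost of optimization, which is the cost that hard parsing adds to a query.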
Hard parsing means that either the cursor was not found in the library cache or it was found but had been invalidated for some reason. Whatever the reason, hard parsing means that the optimizer has to do the work of finding the optimal execution plan for the query. The optimizer does so by looking into every input given to it by the user. This includes the presence (or absence) of any indexes, any expressions or functions applied to the columns, whether it’s a join query or not, any hints specified, etc. All of this information matters, because the presence or absence of any such input can change the execution plan.
Before the process of finding the best plan for the query starts, some tasks have to be completed. These tasks are repeated every time, even if the same query executes N times in the same session:
Before anything else is done for the query, its syntax must be correct. What’s the point of trying to find the best possible way to execute the query when it can’t be executed in the first place due to a missing keyword? The database checks whether the query is written with the correct syntax and also whether the user executing the query has the proper permissions on the underlying objects. If either of these two checks fails, the execution of the query is terminated.
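The syntax check can be tried out without a database server. The sketch below uses Python’s built-in `sqlite3` module as a stand-in, so it is not the database discussed in this article and its error messages will differ. It also has no user permissions, so only the syntax half of the check is shown. The `emp` table and the helper `try_query` are hypothetical names introduced for this illustration.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("create table emp (empno integer, ename text)")


def try_query(sql_text):
    """Run a statement and report whether the engine accepted it."""
    try:
        conn.execute(sql_text)
        return "ok"
    except sqlite3.OperationalError as exc:
        # SQLite reports a syntax error before any rows are read.
        return f"rejected: {exc}"


result_bad = try_query("select * form emp")   # misspelled FROM keyword
result_good = try_query("select * from emp")  # correct syntax
print(result_bad)
print(result_good)
```

The misspelled statement is rejected as soon as it is checked, before any plan is chosen or any data is touched, while the corrected statement goes through.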
Here is an example where a select statement failed to execute because the wrong syntax was used.