Investigation in Recommendations for Troubleshooting of Ineffectiveness of Queries to Database Tables

One of the ways to increase work output of the applications connected with database is optimization of queries to database tables. Importance of query optimization issue increases with growth of records number to decades and hundreds of thousands tables which the query is referring for. There are various recommendations on troubleshooting of queries performance deceleration. However, there are no results of experimental investigations on effectiveness of applying other recommendations given in scientific literature. The authors have conducted numerical experiments for investigation of recommendations, both existing and proposed by the authors, to determine the quantitative values of effectiveness of applying recommendations. A number of recommendations are illustrated by means of examples of applying recommendations including several alternative variations. The work describes differences in effectiveness of applying recommendations in different systems of database management considered in this article. The work describes relative values of effectiveness of applying recommendations compared with source queries. As a result of article, the authors have developed the algorithm of query text analysis and applying recommendations. Investigation results expand knowledge on practical applying both known and new recommendations for optimization of queries to database tables. The article has practical orientation and it is methodically useful for first time programmers in a field of structured queries language.

of writing such queries for which the DBMS could be able to use the most effective ways for data detection. The feature of queries written in declarative language involves the fact that to achieve the similar result completely different types of queries it can be written consisting of different operators, with various ways of tables splitting, differently using the indexing advantages etc.
There are different recommendations for increasing of queries writing quality [1-10]. However, the authors indicate the quantitative values of effectiveness of these recommendations using. There are many questions. What algorithm programmer should follow to analyze the query for its optimization by the performance time? Are the clauses recommendations equally operating for each DBMS? This article is devoted to answers for all these questions.
One of the key optimization terms is a query plan. Query plan is methods for algorithm of their by means of which the DBMS is able to obtain the desirable result. After lexical and syntax analysis of the query text its optimization is conducted, and then the DBMS chooses the most relevant plan and carries it out.
That is to say that all developers have different approaches for various DBMS for issue of index organization. There are two types of indexes provided for DBMS MS SQL Server: clustered (cluster) and non-clustered (non-cluster). indexes; nevertheless, they support creation of non-clustered indexes [7].
To determine the effectiveness of possible query modifications the tests were made on database. The database subject to analyzing is supported by the database management system Firebird SQL. The amount of information consists of more than 10 gigabytes, and the amount of tables is equal to several hundreds of existence the database has accumulated about million records, therefore the issue of organizing quick data access and transparent logic of SQL-instructions are still relevant.
Original database of the enterprise is supported by Firebird SQL, but for comparative analysis of some recommendations the structure of original database has been created in MS SQL Server.

3-Causes of queries ineffectiveness and troubleshooting recommendations
Let's consider the causes of queries ineffectiveness, troubleshooting recommendations and their effectiveness: 1-Synonyms are not used for the tables.
The synonyms for the tables should be applied in all queries where more than one table is mentioning in a clause FROM. Though using of synonyms in such queries insignificantly accelerates detection operation of SQL by operator by means of DBMS core due to decreasing of queries recursion; nevertheless, it improves readability and relief the code detection by the developer. Effectiveness of this recommendation is 1.42%.

2-Improper order of conditions listing in FROM section.
The order of tables listing in a FROM section is relevant for optimizer when he is searching the way to carry out the query. Optimizer is working as follows: firstly, he considers clauses "WHERE" and gives all tables a determined weight based on a predicate type, then he chooses the table with the least price and makes it the master table. However, there is one nuance which is that if several tables obtain the same value, and it is the least, then the optimizer chooses the table as the last table in the "FROM" clause. Therefore, the last in the list must be the table that returns the smallest number of rows [5]. During the experiment, a query was selected in which two tables containing different number of records according to the conclusions of the DBMS optimizer, has obtained the same price of performance. To find out the truth of this recommendation, the query was performed in two variations: one of two tables, which are equal in value, returning the smallest number of records, was specified before the table that returns the largest number of records and vice versa. The query in which the table returning the smallest number of rows was the last, has been run by 932 ms faster, the effectiveness of this recommendation was 9.89%.
Thus, after the preliminary query plan is formed, it is necessary to look at the prices of the tables participating in the query. In the case that for several tables the price is the same and the query requires faster performance, the table that returns the lowest number entries must be indicated at the end of the "FROM" section.

3-Query consisting of the operators EXIST (IN) and DISTINCT is periodically delaying.
Effectiveness of the operators EXISTS and IN depends on number of rows, returned by each query.
In a query with IN, the manage table is the sub query specified in IN, the master query is repeating for each row. In a query using EXISTS, on the contrary, the manage table is the main query, and the sub query specified in EXISTS is repeated for each row that is selected in the main query.
Thus, if the sub query returns a small amount of rows and the main query returns the big amount of rows it is better to use operator IN. Otherwise, it is necessary to use the operator EXISTS.
However, there are some particular cases when it is necessary to find the first inverted row by the value of any field and you can achieve the same result by means of the operator DISTINCT. In a database such as Firebird SQL or Oracle, the DISTINCT statement will cause the database engine to look through all the records from the required table one by one, in the database organized by MS SQL Server, the clustered index will be viewed, if any, or a full table scanning will be performed if the index does not exist, until the desired value is met. In the case when the search value is in the first rows of the table, the speed of finding the line will be extremely high, but in the case where the search string is near the end of the table, the search will be complicated. And in this case, the use of EXIST or IN operators will be more efficient than using of operator DISTINCT [9].
Recommendation: query must be divided in two parts by means of operator EXIST or IN. In the course of the experiment, the effectiveness of this recommendation was 95.04%.

4-Extra use of operators for the line set operation.
To apply of the Boolean operations for line set (subtraction, merging, addition and so on) it is necessary to represent the structure of the base clearly. When using sub queries, it is necessary to have an idea of the approximate number of records returned by each sub query. In the most cases, with the competent use of operations on sets of lines, you can significantly increase the efficiency. For the tutorial database, the efficiency of excluding one "EXIST" operator as well as revision of the set of line relationships and modification of the query text, led to an increase in efficiency for 97%.

5-Combination of the approved amendments in the table with the absence of necessary indexes.
One of the slowest commands in SQL is an UPDATE command because the most part of approved amendments in the tables requires the complete review of the tables. As a result, these operations are resource intensive and too slow when the tables are very huge. The trouble of low productivity is arising because there is no information used in query from the table to restrict the amount of current rows [7]. If any term of the WHERE clause of the UPDATE operator will not separate the rows set, the UPDATE operator requires a huge amount of processing time. When performing this query, the DBMS core will need to review all records from the table ABONENT1, and, in the case if ID of the subscribers is similar from two tables, the field VAL1 will be refreshed. While among the amount of records from table ABONENT1 even those which clearly do not satisfy the term will be attempted to be updated, this entail slow performance of the query. Therefore, modifications of the original query are possible in which extra lines will be deleted.
The first variation of query modification (Listing 5-2) with a using of "IN", which effectiveness was 24.11%. There is an example of query in the (Listing 6-1) performing the non-correlated amendments [4].  Proper applying of a search argument (terms in section WHERE) is critical for Firebird SQL DBMS, but it is not relevant for MS SQL Server in which the optimizer will choose the index (if it is required) in accordance with its price. If the price of non-clustered index is less than the price of clustered index, then, even upon the improper use of search argument, a query will be processed for tolerable period, because the search will be performed by the nonclustered index, in other words, it is impossible to block index search in DBMS MS SQL Server when the clustered index is required.
Recommendation: reformulate the terms blocking the index search, if the DBMS MS Firebird SQL Oracle is used, but it is not important for DBMS MS SQL Server. 8-Rule 10,15, 20% is neglected.
During the experiments, it was found out that using of indexes in queries is appropriate if the query extracts less than 15 % (10, 20) rows from the table. In all other cases, full review of the table will be operating more rapidly. This rule is appropriate for both MS SQL Server and Firebird SQL.

4-Conclusion
We can make the following summary: 1. There are recommendations from the ones presented above, which do not bring tangible gains in time, nevertheless, they are highly recommended for use. These include, for example, a recommendation using synonyms, recommendations for an approximate calculation of the number of records from each table participating in the query.
2. To apply some recommendations, you should pay attention to the structure of the query, in particular -whether sub queries are used, whether operation operators on sets of rows of the type EXIST, IN, UNION and others are used. In case of detecting such properties, it is necessary to evaluate the efficiency of processing each of the sets, possibly by applying on the sets operations such as packing, subtraction, addition, etc., or at least to have an idea of the number of records that satisfy the terms of the query.
3. Some queries can be a special case, for example, the so-called "non-related" queries, and require competent indexing of the fields of tables participating in the query. 4. It is necessary to use the operator DISTINCT accurately because its applying is not always profitable due to dependence on the position of key row in the table.
5. DBMS MS SQL Server and Firebird SQL have shown the different approaches for organization of indexes in their systems.