Joinutility seperatorLogin utility separator Infobright.com

Navigation

 ·   Wiki Home
 ·   Wiki Help
 ·   Categories
 ·   Title List
 ·   Uncategorized Pages
 ·   Random Page
 ·   File Upload
 ·   Uploaded Files
 ·   Recent Changes
 ·   RSS
 ·   Atom
 ·   What Links Here

Active Members:

Search:

 

Create or Find Page:

 

View Query Execution Log

Infobright currently does not support the MySQL EXPLAIN command. As a columnar database, the Infobright engine generates an execution plan which is not easily convertible to MySQL-like EXPLAIN notions. Moreover, the actual execution plan is generated dynamically, based on Knowledge Nodes and previous execution steps, thus it is not possible to generate it without the actual execution of a query.

Instead, the Query Execution Log is used to track the execution plan, provide rough statistics, and provide insights into performance bottlenecks.

Maintaining the Query Execution Log

The query execution log is located:

  • in bh.err file in your database directory (configurable as log-error value in the --config file provided at installation, /etc/my-ib.cnf by default),
  • on console, if the server is run with --console option

By default, the Query Execution Log contains information on server start and shutdown, as well as values of execution parameters (from the brighthouse.ini file). To enable more detailed query execution logging, follow these steps:

1. Open the brighthouse.ini configuration file, located in your database directory

2. Set the following option:

ControlMessages = 2

3. Restart the Infobright server to execute the change

The execution log for a query may be quite large (several KBs), and the bh.err log may grow substantially. For a server executing a lot of small queries it is recommended either to switch the logging off (ControlMessages = 0), or to implement an automatic backup scheme. A shell script (Unix) which copies the error log to a backup file and re-initializes the bh.err file is available here

Set ControlMessages = 1 to obtain a log in more compact form (without timestamps).

Understanding Query Execution Log

Most of the lines of the log consists of a timestamp, and a session number in a square bracket. The session number is maintained by a querying tool – some tools use different numbers for each subsequent query, some other will keep the number for the next queries. The number in the log is useful to determine which message is generated by which query (running in parallel). For example:

2009-05-11 14:30:10 [6] (t0): 25 all packrows, 25 to open (including 25 full)
2009-05-11 14:30:11 [2] Displaying result: 1000 rows.

These two log messages are connected to two different queries (session 6 and 2) and should not be understood as a sequence of events.

When a query starts, its compiled version is displayed in the log, as a sequence of internal Infobright execution steps:

2009-05-12 09:29:26 [1] T:-1 = TABLE_ALIAS(T:0)
T:-2 = TMP_TABLE(T:-1)
A:-1 = T:-2.ADD_COLUMN(TERM(t:-1 a:0),LIST,"RECORDTYPE","ALL")
...

The query is then executed by the Infobright optimizer, and each step is described in the log. Before the exact calculation, a result of rough execution is presented, including a total number of data packs, and data packs excluded by KNs. The log for a query is completed after displaying the summary of loaded and decompressed (i.e. not present in cache) data packs:

2009-05-12 09:29:26 [1] Displaying result: 20 rows.
2009-05-12 09:29:26 [1] Total data packs actually loaded (approx.): 35
2009-05-12 09:29:26 [1] ----------------------------------------------------------------------------

The query log is a useful tool for identifying bottlenecks and pinpointing errors. The following messages may indicate problems:

Table ./mydb/clients (41) accessed by MySQL engine.

The Infobright table was accessed by MySQL engine, instead of Infobright one. Possible reasons:

  • Unsupported syntax is used.
  • Join with non-Infobright table is performed.
  • MySQL format is used in export, instead of Infobright text format.
  • MySQL is gathering statistics of Infobright tables either for ETL tool, or to optimize query (e.g. outer joins) before sending it back to Infobright engine.

All cases except the last one mean that the query will be very slow.

2009-05-29 07:41:35 [7] Warning: a big intermediate object created (~268 GB)

This message means that a substantial disk space will be used to execute the query. In most cases it is not intended and may indicate a mistake in query formulation – e.g. a full Cartesian join of tables with millions of rows.

2009-11-06 18:52:20 [1] Starting joiner loop (4017000000000 rows).

Joiner loop is the least efficient joining algorithm in Infobright (just a quadratic loop through all pairs of rows). The number in message indicates the number of pairs to be checked. If this number is large (as above), it is probably better to stop query and rethink its formulation, to avoid joins declared in implicit way, e.g.:

...WHERE t1.x + t2.y > 5... > ...WHERE t1.x > 5 - t2.y...

The latter formulation will use much faster joining algorithm.

For some long-lasting operations it is possible to forecast approximate total time basing on log:

2009-11-06 19:05:57 [1] Aggregating: 1000000000 tuples left, 0 gr. found so far
2009-11-06 19:17:59 [1] Aggregating: 864292787 tuples left, 3677031 gr. found so far

Using timestamps we can conclude that 722 sec. was spent on 136 mln. rows (13.6%), which means that the query will probably run for another 1 h 16 minutes.