Next week (July 26-July28) marks the kickoff of OSCON, one of the most important gatherings of open-source enthusiasts. As like in years past, Infobright will be showcasing the best-in-class, open-source analytic database. For those attending, come stop by Booth 106. We will be giving out t-shirts and offering tasty treats.
In the booth this year, we will be discussing:
Plus, we'll be there to answer any other questions as well as get your feedback. If you'll be at the show and would like to have a drink at/after the show, send us a shout. We're at (JavaScript must be enabled to view this email address).
Cheers and see you in Oregon!
Comments (0)
In a previous post, a scenario was identified as being suitable for improvement using rough queries. In this post we will illustrate how one can improve the overall response time of the system by leveraging rough queries.
The challenge is to make sure the reaction time between the temperature increase and system response is as small as possible.
mysql> select max(temp) from sensor where ts > 1000000 and id = 1;
+-----------+
| max(temp) |
+-----------+
| 35 |
+-----------+
1 row in set (3.10 sec)
By examining the execution plan we can see that 951 packs were loaded during this execution.
2011-06-23 09:19:26 [2] [...] Total data packs actually loaded (approx.): 951
We can improve the implementation by having a two-step implementation where the first check if any sensor reports a higher temperature and then, if it does, drill down into the specific sensor information that caused it.
In order to do that, one can issue a query below to detect a temperature increase
mysql> select max(temp) from sensor where ts > 1000000;
+-----------+
| max(temp) |
+-----------+
| 36 |
+-----------+
1 row in set (0.13 sec)
In doing so a relatively small number of packs are loaded as the current engine already uses plan optimization at the Knowledge Grid level based on rough query.
2011-06-23 09:20:02 [2] [...] Total data packs actually loaded (approx.): 6
We can immediately improve this scenario and use the rough query to determine if any sensor reports a higher temperature, such as the query below:
mysql> select roughly temp from sensor where ts > 1000000;
+------+
| temp |
+------+
| 30 |
| 36 |
+------+
2 rows in set (0.00 sec)
The rough query response time is way superior due to the fact no actual data being loaded:
2011-06-23 09:20:16 [2] [...] Total data packs actually loaded (approx.): 0
Further improvements can be done after we examine the behaviour when an incident occurs.
mysql> select roughly temp from sensor where ts > 1000000;
+------+
| temp |
+------+
| 30 |
| 88 |
+------+
2 rows in set (0.00 sec)
Since we now know there is a temperature spike, we need to determine if sensor 1 is the cause of it:
mysql> select max(temp) from sensor where ts > 1000000 and id = 1;
+-----------+
| max(temp) |
+-----------+
| 88 |
+-----------+
1 row in set (3.03 sec)
The response time is due to the fact 498 data packs were loaded during the exact query execution:
2011-06-23 09:52:03 [2] [...] Total data packs actually loaded (approx.): 498
In the next part we will see how we can further improve and significantly reduce the reaction time by using rough queries.
mysql> select roughly ts from sensor where ts > 1000000 and temp > 60;
+----------+
| ts |
+----------+
| 13632490 |
| 13659115 |
+----------+
2 rows in set (0.00 sec)
The rough query response is extremely fast as no data is loaded during evaluation, and we now know the upper and lower bound of the timestamp when the temperature increased above the monitored threshold.
2011-06-23 09:55:18 [2] [...] Total data packs actually loaded (approx.): 0
What the response is giving us is a smaller range for the timestamps where the incident occurs so we can now drill down into the exact response and exclude intervals not relevant to our initial search; by comparing the initial threshold of 1000000 and the lower rough bound of 13632490, we can recompose the query as such:
mysql> select max(temp) from sensor where ts > 13632490 and id = 1;
+-----------+
| max(temp) |
+-----------+
| 88 |
+-----------+
1 row in set (0.07 sec)
The response time is now a fraction of the initial response time because significantly less data packs are loaded:
2011-06-23 09:56:35 [2] [...] Total data packs actually loaded (approx.): 3
Using rough queries we can now trigger an alert from 3 seconds to a near real time response of 0.07 seconds.