Joinutility seperatorLogin utility separator Infobright.com
   
 
Select random rows
Posted: 17 November 2011 12:19 PM   Ignore ]  
Newbie
Rank
Total Posts:  5
Joined  2011-05-02

What is the best (or most optimized) way in ICE 4.0 writing a query to select 10 random rows from a total of around a billion?
In my case the unique key for the rows contains of two columns.

I tried this query
but this only gives me the first 10 records in a random order
select * temp_table from order by rand() limit 10

a way just to get a random value of each one of the key values might be
select FLOOR(RAND()*MAX(key1)), FLOOR(RAND()*MAX(key2)) from temp_table limit 1;

Any other ideas?

Thankyou
/Catarina

[ Edited: 18 November 2011 04:21 AM by Catarina]
Profile
 
Posted: 17 November 2011 12:52 PM   Ignore ]   [ # 1 ]  
Super Duper Member
Avatar
RankRankRankRankRank
Total Posts:  964
Joined  2008-08-18

Hi !

but this only gives me the first 10 records in a random order

Can you elaborate what is the difference between “gives me the first 10 records in a random order” and “select 10 random rows” ?

What is wrong with the query below?

mysqlselect vin from fact_sales order by rand() limit 5;
+-------------------+
vin               |
+-------------------+
5YHMU065GHP378690 |
0LUOB072MXL372693 |
0AJVL329UUV360784 |
3VJKA588FOL684720 |
0MSRW957BLM521731 |
+-------------------+
5 rows in set (5.26 sec)

mysqlselect vin from fact_sales order by rand() limit 5;
+-------------------+
vin               |
+-------------------+
8QEQL005TNZ829200 |
5FICP304LLR181901 |
0PBYP859ZRP489568 |
3XVKQ483BRP117583 |
4UDLJ441VFA342594 |
+-------------------+
5 rows in set (1.78 sec)

mysqlselect vin from fact_sales order by rand() limit 5;
+-------------------+
vin               |
+-------------------+
2XWNI380XRL070325 |
5AHFE739MXT919322 |
6AWRP030QDG016033 |
3QAIR002SPA375190 |
5VAHI605JCC823927 |
+-------------------+
5 rows in set (1.79 sec

Also have a look at http://www.infobright.org/forums/viewthread/509/

[ Edited: 17 November 2011 12:55 PM by Janusz Borkowski]
Profile
 
Posted: 18 November 2011 06:50 AM   Ignore ]   [ # 2 ]  
Newbie
Rank
Total Posts:  5
Joined  2011-05-02

Strange ...
As I can see you get 5 random rows and thats the result I’m looking for.
I only get this result when running ‘order by rand() limit 5’ in myISAM
se examples below ....

Thankyou
/Catarina

CREATE TABLE `temp_table` (
  `id` int(11) NOT NULL DEFAULT ‘0’,
  `idx` int(11) NOT NULL DEFAULT ‘0’,
  `value` double DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1

CREATE TABLE `temp_table_ib` (
  `id` int(11) DEFAULT NULL,
  `idx` int(11) DEFAULT NULL,
  `value` int(11) DEFAULT NULL
) ENGINE=BRIGHTHOUSE DEFAULT CHARSET=latin1

I fill the two tables with index 0 to 220000
ex.
‘10’, ‘0’, ‘2.83’
‘10’, ‘100’, ‘2.52’
‘10’, ‘200’, ‘1.9’
‘10’, ‘300’, ‘1.94’
‘10’, ‘400’, ‘1.9’
‘10’, ‘500’, ‘1.89’
‘10’, ‘600’, ‘1.66’
‘10’, ‘700’, ‘1.44’
‘10’, ‘800’, ‘1.3’
etc.


select idx from temp_table_ib order by rand() limit 5;
‘400’
‘200’
‘0’
‘300’
‘100’

select idx from temp_table_ib order by rand() limit 5;
‘100’
‘200’
‘400’
‘300’
‘0’

etc.

select idx from temp_table order by rand() limit 5;
‘33800’
‘76200’
‘94500’
‘222100’
‘35000’

select idx from temp_table order by rand() limit 5;
‘224700’
‘81700’
‘171500’
‘193700’
‘77300’

Profile
 
Posted: 18 November 2011 07:38 AM   Ignore ]   [ # 3 ]  
Sr. Member
Avatar
RankRankRankRank
Total Posts:  763
Joined  2008-08-18

Hi,

I’m confirming that the latest version of Infobright works as it should:

mysqlselect p_partkey from part order by rand() limit 5;
+-------------+
p_partkey   |
+-------------+
|      
180152 |
|       
73801 |
|       
58305 |
|       
30360 |
|      
138931 |
+-------------+
5 rows in set (0.09 sec)

mysqlselect p_partkey from part order by rand() limit 5;
+-------------+
p_partkey   |
+-------------+
|       
91241 |
|       
36247 |
|      
109482 |
|       
24757 |
|       
61565 |
+-------------+
5 rows in set (0.11 sec

(it’s for TPC-H benchmark database, scale 1, so the ‘part’ table has 200000 rows). The result is for a version identical with the latest released 4.0.4. Maybe there was a bug fixed since your ICE release?

Regards,

Signature 
Profile