It’s not that I haven’t published at any conferences before. I have!!! However, those were conferences on intelligent systems, reasoning under uncertainty, rough sets and soft computing, et cetera. Before 2008, I had never submitted a paper to a database event.
I have studied database literature. But listening to live talks is surely far better. This year I went to SIGMOD and VLDB. I was amazed. I was inspired. I’ll do everything to participate in the major database conferences from now on. I’ll try to go to ICDE too. If anyone has comments on other database events, I’ll be happy to learn something!
Recently, the VLDB 2008 materials became available online (both papers and presentations). In particular, I invite everyone to have a look at Infobright’s first conference publication:
Brighthouse: an analytic data warehouse for ad-hoc queries (Brighthouse is the former name of the Infobright data warehouse software)
There is quite a story behind this one. We wanted to submit it to SIGMOD but couldn’t meet the deadline. Then we tried PODS, but it was rejected. It was really unwise to send this kind of paper there but, on the other hand, we received a lot of valuable comments from the reviewers (thanks a lot!). Those comments helped us put together the final version accepted at VLDB.
But it’s not the end of the story yet…
I had to change the date of my presentation. (I had to run away to another conference.) The VLDB 2008 organizers were kind enough to move me to the Data Mining Session on the very first day. I like data mining (in all its facets). I enjoyed all the talks in the session. On the other hand, I heard later that some people did not notice the changes in the program. I’d like to apologize for this. I remember that all the talks were recorded. I hope those recordings are still available. Actually, I did such a bad job of answering the questions that I’d really like to listen to myself one more time!
Anyway, although the paper was written prior to the ICE Era, the core ideas based on our knowledge grid, columnar storage and data compression remain the same. Actually, the paper focuses mostly on the optimization and execution of SELECT statements, which is the part of the code (and invention!) that is identical in the Infobright Community and Enterprise Editions. Therefore, I hope you will find it interesting and useful. Certainly, I’ll be glad to answer any questions related to the paper!
Enjoy the reading!
Dominik
By the way, I created a new thread in the Infobright forums related to current and future conference publications (http://www.infobright.org/Forums/viewthread/297/).
Hello Fan Min,
Thanks a lot for the link!
Actually, one of my colleagues at Infobright, Piotr Synak, did some research at the intersection of time series, sequential, and temporal data analysis, in combination with other data mining methods.
I also did some research on the internet and it appears that - at the level of SQL - time series analysis can often be (at least partially) expressed by means of correlated subqueries, which is one of Piotr’s major interests within Infobright too.
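To illustrate the correlated-subquery point, here is a minimal sketch of one common time series operation, a trailing moving average, written as a correlated subquery. It is not taken from Infobright or from Piotr’s work: the table `readings` and its columns `day` and `value` are hypothetical, the sample numbers are made up, and SQLite (through Python’s standard `sqlite3` module) is used only because it runs without any setup.

```python
import sqlite3

# Hypothetical table, columns and data, chosen only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (day INTEGER PRIMARY KEY, value REAL)")
conn.executemany(
    "INSERT INTO readings (day, value) VALUES (?, ?)",
    [(1, 10.0), (2, 12.0), (3, 11.0), (4, 15.0), (5, 14.0)],
)

# The inner SELECT is "correlated": it refers to r.day from the outer row,
# so it is evaluated per row and averages that day plus the two days before.
query = """
SELECT r.day,
       r.value,
       (SELECT AVG(p.value)
          FROM readings AS p
         WHERE p.day BETWEEN r.day - 2 AND r.day) AS avg_3day
  FROM readings AS r
 ORDER BY r.day
"""
rows = conn.execute(query).fetchall()

for day, value, avg_3day in rows:
    print(day, value, round(avg_3day, 2))
```

For the first two days the window simply contains fewer rows, so the average is taken over whatever is available. The same pattern (an inner query filtered by a value from the outer row) can express lags, differences and running totals, though it is usually the most expensive way to do so, which is presumably why it is interesting from an optimization point of view.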
I’ll tell Piotr about your research and comments and perhaps you could both continue on the “Data Mining in Warehousing” forum?
http://www.infobright.org/Forums/viewthread/288/
I’m sure further discussion in this area may be very interesting!!
With many thanks and best greetings,
Dominik
Well, I started this research just three months ago. All my resources, including data, reports and programs, are available at http://www.cems.uvm.edu/~fmin/upperyangtze/upperyangtze.html.
Data is a big problem. I obtained some time series runoff data and tried to forecast daily runoff with only daily runoff data at hand (see the report “Time Series Data Mining to River Runoff Forecasting through Sophisticated Dimension Reconstruction”). Unfortunately, the runoff change seems to depend only on the most recent day’s runoff change! Therefore I took a simpler approach (see the report “Time Series Analysis of Runoff Data Using Polynomial Estimation”), which gives the same conclusion! That is, the linear estimation performs best :(
Sure, more datasets will be tested. But I am afraid more advanced methods will have to be employed for this problem.
Hello Fan Min,
Many thanks for the comment!!
I’m curious about your reservoir data mining, if you don’t mind. How would you describe this kind of data? For particular data, some customizations or hybridizations may work better than the original data mining methods… Well, it’s a long discussion…
Anyway, I do like writing papers. When I write a paper about a new idea, I’m forced to present it very clearly. So, after finishing a paper, I usually understand the idea better myself.
Best greetings,
Dominik
Yes, I know you have published many papers.
Publishing papers is important, but designing and implementing something really useful while at the same time incorporating new ideas is definitely more challenging! So, congratulations!
I am working on reservoir data mining, and my focus is how to obtain good results, without considering whether or not the method is new.