infobright.org

Academic Blog

03
Dec

How Rough Sets Started

by Dominik Slezak     Wed, Dec 03, 2008

When writing about rough sets and a new industry book initiative, I should have mentioned Dr. Zdzislaw Pawlak and his seminal monograph titled Rough sets – Theoretical aspects of reasoning about data (Kluwer, 1991).

Dr. Pawlak was a friend of my supervisor, and I had the honor and pleasure of meeting and talking with him from time to time. He published his paper Rough sets in 1982 in the International Journal of Computer and Information Sciences. Well, I didn’t know him then. I was still in primary school, focused on completely different undertakings.

More recently, when attending SIGMOD 2008, I noticed at the registration desk the printouts of Dr. Pawlak’s Information systems theoretical foundations (Information Systems, 1981). I was really touched by the fact that people still remember the foundations introduced so many years ago. Actually, this particular paper is of special importance as a common background for later advances in databases and data mining.

Unfortunately, I’m not able to provide any of the above-mentioned publications in electronic form. However, here is something else! Thanks to the great kindness of the authors, I have attached to our forum (post #7 on Dec 03) a preprint of Zdzislaw Pawlak life and work (1926–2006). I encourage everyone to read this short bio of a true scholar and wonderful human being.

I have also attached there a preprint of an introductory paper, Rudiments of rough sets, published last year. I recommend it to those who haven’t heard about rough sets before and want to learn the basics. I’d like to thank Information Sciences and Elsevier, the journal and publisher of both articles. Their final official versions can be found here and here.

Best greetings,

Dominik

03
Dec

Business and industrial applications of rough sets – the book in progress

by Dominik Slezak     Wed, Dec 03, 2008

Given the 25-year history of rough sets in data mining and knowledge discovery, and the constantly growing interest in rough sets in application areas such as data warehousing, isn’t it high time for a more business-oriented rough set book? Well, that was the question I received from Georg Peters, a great researcher I met for the first time at PReMI 2005 and more recently at RSCTC 2008.

Georg decided to proceed. Together with Pawan Lingras, who is a top expert on embedding rough sets into hybrid solutions, and Yiyu Yao, who has contributed a huge amount of work to rough set foundations, we came up with the call for chapters (available at our forum, post #7 on Dec 03).

I do hope this initiative will be successful. Rough set methods really deserve more recognition in business and industry. As you can see, we have not yet settled all the details with the publisher and so on. We will work them out as we go, in parallel with ongoing discussions with the authors of particular chapters.

Should you have any questions, please send an email to Georg or feel free to comment here.

Best greetings,

Dominik

19
Nov

Roughly on Rough Sets

by Dominik Slezak     Wed, Nov 19, 2008

If you dig deeper into the bios of Infobright’s founders, you’ll find that most of us worked on rough sets. Hence, it’s no surprise that rough sets are present in Infobright’s technology. On the other hand, when you read about rough sets on the web and try to match them with ICE/IEE, the connection is perhaps not as straightforward as expected. Moreover, whenever I have the pleasure of speaking with rough setters, they have some questions too. So let’s put it all together. Let’s refer to our rough set forum:

Rough sets is an approach introduced more than 25 years ago, used so far mainly in data mining and knowledge discovery. As an example, consider automatic detection of some important events in data (like fraud detection or credit risk assessment). In such an application, the rough set approach will attempt to create a decision model that automatically identifies: (1) the cases that support the given event; (2) the cases that do not support the given event; (3) the cases that remain undecided.

Methodologies based on rough sets are, however, applicable to many other domains too.

At Infobright, for example, we follow the rough set approach to identify: (1) the data portions that are fully relevant to the given query execution; (2) the data portions that are fully irrelevant to the given query execution; (3) the data portions that remain undecided. The corresponding data identification mechanism is based on Infobright’s knowledge grid, where compact information about the particular data portions is gathered. The data portions are highly compressed and, for analytical queries, only the undecided data portions need to be accessed.
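The three-way split described above can be sketched in a few lines of Python. This is only an illustration, not Infobright's actual implementation: it assumes that the compact information kept for each data portion is just the minimum and maximum of one numeric column, and the names PackStats, PackStatus and classify_pack are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class PackStatus(Enum):
    RELEVANT = "relevant"      # every row in the portion satisfies the condition
    IRRELEVANT = "irrelevant"  # no row in the portion can satisfy the condition
    UNDECIDED = "undecided"    # the portion must be decompressed and scanned


@dataclass
class PackStats:
    """Hypothetical per-portion metadata: min and max of one numeric column."""
    min_value: float
    max_value: float


def classify_pack(stats: PackStats, threshold: float) -> PackStatus:
    """Classify one data portion against the condition `column > threshold`."""
    if stats.min_value > threshold:
        return PackStatus.RELEVANT
    if stats.max_value <= threshold:
        return PackStatus.IRRELEVANT
    return PackStatus.UNDECIDED


# Invented statistics for four data portions.
packs = [PackStats(10, 20), PackStats(25, 40), PackStats(1, 5), PackStats(18, 30)]
statuses = [classify_pack(p, threshold=20) for p in packs]
to_scan = [i for i, s in enumerate(statuses) if s is PackStatus.UNDECIDED]
print(statuses, to_scan)
```

With these made-up numbers, only the last portion (values between 18 and 30) has to be opened to evaluate the condition; the other three are settled from the metadata alone.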

The progress in rough sets has occurred in two major areas: (a) the usage of variously interpreted lower (full relevance) and upper (possible relevance) approximations in various types of applications (with ICE/IEE as a data warehouse industry example); (b) the support for feature selection and learning classifiers from data (with ICE/IEE hoping to become an example soon with regard to our knowledge grid tuning).
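For readers new to the terminology, here is a minimal sketch of lower and upper approximations in their classical form: objects that share the same values on the chosen attributes are indiscernible, the lower approximation collects the groups lying entirely inside the target set, and the upper approximation collects the groups that overlap it. The toy table, the attribute names and the function approximations are invented for illustration.

```python
from collections import defaultdict


def approximations(table, attributes, target):
    """Return (lower, upper) approximations of `target`, a set of row ids.

    `table` maps row id -> dict of attribute values. Rows with equal values
    on `attributes` are indiscernible and form one equivalence class.
    """
    classes = defaultdict(set)
    for row_id, row in table.items():
        key = tuple(row[a] for a in attributes)
        classes[key].add(row_id)

    lower, upper = set(), set()
    for members in classes.values():
        if members <= target:   # class entirely inside the target: full relevance
            lower |= members
        if members & target:    # class overlaps the target: possible relevance
            upper |= members
    return lower, upper


# Toy data, invented for illustration only.
table = {
    1: {"amount": "high", "country": "A"},
    2: {"amount": "high", "country": "A"},
    3: {"amount": "low", "country": "A"},
    4: {"amount": "low", "country": "B"},
    5: {"amount": "low", "country": "B"},
}
fraud = {1, 2, 4}  # rows known to be fraud cases

lower, upper = approximations(table, ["amount", "country"], fraud)
boundary = upper - lower      # cases that remain undecided
negative = set(table) - upper  # cases that do not support the event
print(lower, boundary, negative)
```

Rows 4 and 5 look identical on the chosen attributes but only one of them is a fraud case, so both land in the boundary; this is the "undecided" region mentioned earlier.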

The rough set community is one of those truly dynamically growing international groups, and we at Infobright are happy to help expand it into previously uncharted territories of applications. For more information, let me refer everyone to the already-mentioned rough set forum, as well as to the homepage of the International Rough Set Society and to the online database of rough set publications. This is surely not the last post on rough sets here!

Best greetings,

Dominik

PS: I’ve just been told that one of my friends, a professor at a university in Poland, is developing Rough ICE: Rough Set Interactive Classification Engine. Isn’t it a nice coincidence?

PS2: I attached a short paper about Rough Sets and Data Warehousing to post #1 on our academic papers forum. As usual, I invite everyone to read it and comment on it!
