What are rough sets?
Posted: 03 September 2008 08:50 AM
Member
Total Posts:  192
Joined  2008-08-18

Can somebody explain in one or two sentences?
One aspect is that rough sets extend set theory: an element can (1) belong to a set, (2) belong to the complement of the set, or (3) belong to the boundary of the set, in which case we cannot say precisely whether it belongs to the set or not.

[ Edited: 03 September 2008 08:55 AM by Piotr Synak]
 
Posted: 15 September 2008 08:45 AM [ # 1 ]
Sr. Member
Total Posts:  488
Joined  2008-08-18

It is interesting to note that rough sets is an approach introduced more than 25 years ago, used so far mainly in data mining and knowledge discovery. As an example, consider automatic detection of important events in data, such as fraud detection or credit risk assessment. In such an application, the rough set approach attempts to create a decision model that automatically identifies: (1) the cases that support the given event; (2) the cases that do not support the given event; (3) the cases that remain undecided.

However, methodologies based on rough sets are applicable to many other domains too.

At Infobright, for example, we follow the rough set approach to identify: (1) the data portions that are fully relevant to the given query execution; (2) the data portions that are fully irrelevant to the given query execution; (3) the data portions that remain undecided. The corresponding data identification mechanism is based on Infobright’s Knowledge Grid, where compact information about the particular data portions is gathered. The data portions are highly compressed and, for analytical queries, only the undecided data portions need to be accessed.
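
The three-way split of data portions described above can be sketched roughly as follows. This is only an illustration under stated assumptions: it supposes that the compact information kept per data portion is the minimum and maximum of a numeric column, and the names `PackStats` and `classify` are hypothetical, not part of any Infobright API.

```python
from dataclasses import dataclass


@dataclass
class PackStats:
    """Hypothetical per-portion summary: min and max of one numeric column."""
    lo: int
    hi: int


def classify(stats: PackStats, threshold: int) -> str:
    """Classify a data portion for the condition `value > threshold`."""
    if stats.lo > threshold:
        return "relevant"    # every row in the portion satisfies the condition
    if stats.hi <= threshold:
        return "irrelevant"  # no row in the portion can satisfy the condition
    return "undecided"       # only such portions would need to be accessed


# Three made-up portions, checked against the condition `value > 25`.
packs = [PackStats(10, 20), PackStats(30, 45), PackStats(18, 33)]
labels = [classify(p, 25) for p in packs]
print(labels)
```

In this sketch only the third portion would have to be decompressed, since the summaries alone settle the other two.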

[ Edited: 15 September 2008 09:31 AM by Dominik Slezak]
 
Posted: 07 November 2008 07:10 AM [ # 2 ]
Sr. Member
Total Posts:  488
Joined  2008-08-18

Just a quick update on rough sets…

The International Rough Set Society (IRSS) has quite an updated homepage:

http://www.roughsets.org/

Also, if someone is interested in scientific aspects of rough sets, there are some materials from the recent rough set conference:

http://www.informatik.uni-trier.de/~ley/db/conf/rsctc/rsctc2008.html

Actually, we presented how ICE/IEE uses the principles of rough sets at that conference.

The revised version of the paper presented there can be found in the following thread (see the attachment to #1):

http://www.infobright.org/Forums/viewthread/297/

Best greetings,

Dominik

[ Edited: 18 November 2008 11:22 AM by Dominik Slezak]
 
Posted: 07 November 2008 07:38 AM [ # 3 ]
Sr. Member
Total Posts:  488
Joined  2008-08-18

By the way…

There is an Infobright VLDB paper available online at:

http://www.informatik.uni-trier.de/~ley/db/journals/pvldb/pvldb1.html

(Brighthouse: an analytic data warehouse for ad-hoc queries. 1337-1345.)

In particular, it includes some intro to rough set math and how we refer to its principles…

Well, it is written in quite an academic style and is slightly outdated, since it predates the ICE/IEE era.

Still, I hope one may find it useful…

Best greetings,

Dominik

 
Posted: 08 November 2008 10:21 AM [ # 4 ]
Sr. Member
Total Posts:  488
Joined  2008-08-18

Okay, my last post here for now, I promise…

I have just recalled that there is one more good source of information about rough set academic research:

http://rsds.univ.rzeszow.pl/

It is an online database gathering the rough set-related academic papers.

Actually, I remember a funny story about Infobright and that database:

A long time ago, just after Infobright had been founded, I was asked whether we were really so unique in using rough sets in database systems. When I answered “yes, we are fully unique!”, the response was: “but there is that Rough Set Database System available online at http://rsds.univ.rzeszow.pl/ !!!”

Actually, it would be nice to see whether the database system with rough set documents can be run using the database engine based on rough set principles. I know the creators of that online system pretty well, so maybe some day we will try? I guess we’d need to better support document search queries, especially at the level of the Knowledge Grid. However, with ICE open, perhaps we may count on some help from the Community in this respect? It would be really, really, really great!!

Best greetings,

Dominik

 
Posted: 12 November 2008 02:30 PM [ # 5 ]
Newbie
Total Posts:  2
Joined  2008-11-10

Rough sets is an extension of traditional set theory.
In traditional set theory, it is clear whether an element x is in a given set A. Rough sets, in contrast, look at the relationship between a given set A and the set that x belongs to according to the indiscernibility relation. The essence is that a higher level, i.e., the power set, is studied. Therefore, it might be more accurate to say “an element’s indiscernible set (1) belongs to a set ...” rather than “an element (1) can belong to a set”.
  Fan Min
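
To make the point about indiscernible sets concrete, here is a minimal sketch of the standard lower and upper approximations, computed from classes of rows that are indiscernible on chosen attributes. The toy data and the function names are made up for illustration only.

```python
from collections import defaultdict


def indiscernibility_classes(rows, attrs):
    """Group row ids that have identical values on the chosen attributes."""
    classes = defaultdict(set)
    for row_id, row in rows.items():
        classes[tuple(row[a] for a in attrs)].add(row_id)
    return list(classes.values())


def approximations(rows, attrs, target):
    """Return the (lower, upper) approximations of the set `target`."""
    lower, upper = set(), set()
    for cls in indiscernibility_classes(rows, attrs):
        if cls <= target:   # class lies entirely inside the target set
            lower |= cls
        if cls & target:    # class overlaps the target set
            upper |= cls
    return lower, upper


# Toy data: rows 1 and 2 cannot be told apart on (color, size).
rows = {
    1: {"color": "red", "size": "big"},
    2: {"color": "red", "size": "big"},
    3: {"color": "blue", "size": "big"},
    4: {"color": "blue", "size": "small"},
}
target = {1, 3}
lower, upper = approximations(rows, ["color", "size"], target)
boundary = upper - lower
print(lower, upper, boundary)
```

Here row 3 certainly belongs to the target set, row 4 certainly does not, and rows 1 and 2 form the boundary because their shared class is only partly inside the target.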

 
Posted: 13 November 2008 06:05 AM [ # 6 ]
Sr. Member
Total Posts:  488
Joined  2008-08-18

Hello Fan Min,

You are right that the theory of rough sets is more about working with sets than with elements. This is why, at the level of foundations, rough sets are sometimes discussed together with mereology (http://en.wikipedia.org/wiki/Mereology) or, as another example, with granular computing (http://en.wikipedia.org/wiki/Granular_computing). What we do at Infobright can actually be interpreted as a kind of granular computing but, if you don’t mind, I’ll keep this topic for another post…

One of the points now is how those sets are defined in particular approaches and applications. In original rough set applications, the sets of interest are the classes of rows with the same values on some subsets of available columns. In rough set extensions, they may gather rows with similar but not necessarily identical values, et cetera. At Infobright, in both the ICE and IEE editions, we also work with sets. The sets can be defined either as the blocks of rows gathered and compressed together or as the sets of rows satisfying conditions defined by SQL statements arriving at the system.

Best greetings,

Dominik

[ Edited: 13 November 2008 06:10 AM by Dominik Slezak]
 
Posted: 03 December 2008 04:53 AM [ # 7 ]
Sr. Member
Total Posts:  488
Joined  2008-08-18

Some new materials related to rough sets…

Have a look at our academic blog for details:

http://www.infobright.org/Open-Source/Blog/Academic

Best greetings,

Dominik

[ Edited: 03 December 2008 06:49 AM by Dominik Slezak]
File Attachments 
call for chapters.pdf  (File Size: 86KB - Downloads: 1006)
rough sets.pdf  (File Size: 361KB - Downloads: 5314)
Zdzislaw Pawlak.pdf  (File Size: 177KB - Downloads: 836)
 
Posted: 06 March 2009 11:47 AM [ # 8 ]
Sr. Member
Total Posts:  488
Joined  2008-08-18

In relation to the blog:

http://www.infobright.org/index.php/Blog/Entry/toronto_2007/

I attach the presentation we did at the 2007 rough set conference in Toronto.

It was more than a year before going ICE. Our product was called Brighthouse then…

Please have a look at the last two slides. They illustrate the foundations of rough sets in two scenarios described previously in this thread. By the way, please let me know in case you’re interested in any of the Toronto 2007 papers.

Best greetings,

Dominik

File Attachments 
BH JRS07.pdf  (File Size: 760KB - Downloads: 1515)
 
Posted: 01 April 2009 06:40 AM [ # 9 ]
Sr. Member
Total Posts:  488
Joined  2008-08-18

When you visit the homepage of IRSS (International Rough Set Society), please have a look at IRSS Resources. In Guides, tutorials etc, there are two presentations. You may say that there’s “too much math” there. However, the whole idea is really straightforward, easy to implement and extend.

The first presentation introduces basic notions. (I suggest reading slides 1-33 and going back to the rest a bit later.) In particular, there’s the framework for removing superfluous attributes (columns) and shortening IF-THEN rules while learning decision (classification, prediction, etc.) models from data. I’ve already mentioned this way of optimization (simplification, clarification) of data-based models in the data mining thread (#3). Such a tendency is visible not only among rough setters but also in many areas of machine learning.
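
The idea of removing superfluous attributes can be sketched as follows. This is a simple greedy illustration, not the procedure from the presentation: an attribute is dropped whenever rows that agree on the remaining attributes still never disagree on the decision. The table and all names are invented for the example.

```python
def is_consistent(table, attrs, decision):
    """True if rows that are equal on `attrs` never disagree on `decision`."""
    seen = {}
    for row in table:
        key = tuple(row[a] for a in attrs)
        if seen.setdefault(key, row[decision]) != row[decision]:
            return False
    return True


def drop_superfluous(table, attrs, decision):
    """Greedily remove attributes whose removal keeps the table consistent."""
    kept = list(attrs)
    for a in attrs:
        trial = [b for b in kept if b != a]
        if is_consistent(table, trial, decision):
            kept = trial
    return kept


# Invented decision table: `flu` is the decision column.
table = [
    {"fever": "yes", "cough": "yes", "age": "young", "flu": "yes"},
    {"fever": "yes", "cough": "no", "age": "old", "flu": "yes"},
    {"fever": "no", "cough": "yes", "age": "young", "flu": "no"},
    {"fever": "no", "cough": "no", "age": "old", "flu": "no"},
]
reduct = drop_superfluous(table, ["fever", "cough", "age"], "flu")
print(reduct)
```

With this table, `cough` and `age` turn out to be superfluous, which corresponds to shortening the rules to the form IF fever = yes THEN flu = yes. The result of a greedy pass can depend on the order in which attributes are tried.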

The second presentation translates the above notions to the case of dealing with preference-related data. I thought about this material while attending the ICDE 2009 tutorial on Preference Queries from OLAP and Data Mining Perspective. I’ll follow up in the ICDE 2009 thread.

Best greetings,

Dominik

[ Edited: 01 April 2009 11:12 AM by Dominik Slezak]
 
Posted: 03 April 2009 10:59 AM [ # 10 ]
Newbie
Total Posts:  1
Joined  2009-04-03

For those of you who may be interested, there are several pages of THE BOOK, i.e. the original book on rough sets by Zdzislaw Pawlak, available at Google Books:
http://books.google.com/books?id=MJPLCqIniGsC&hl=en
There are other RS books there as well, unfortunately only a few of them with full text.

Enjoy

MS:-)

 
Posted: 06 April 2009 12:09 PM [ # 11 ]
Sr. Member
Total Posts:  488
Joined  2008-08-18

Hello MS:

Thanks a lot! This is indeed a very good reference. There were some rough set publications before, but in this book Professor Pawlak was able to put everything together very clearly. It’s a pity that only the first few pages are available in the free preview. Still, they are worth reading. On the one hand, it’s visible that the original way of constructing knowledge granulations based on indiscernibility relations is different from what we do at Infobright. On the other hand, we do rely on the notions of rough approximations in our technology and we’ll surely continue to refer even more to the rough set foundations in the future.

It reminds me of another publication. Sorry if I mentioned it before. I hope I’m not getting boring. When I attended SIGMOD 2008, I noticed some familiar printouts at the conference registration desk. I came closer and… those were copies of an even earlier paper (a decade before the book about rough sets):

Zdzislaw Pawlak: Information systems theoretical foundations. Inf. Syst. 6(3): 205-218 (1981)

It was about the interpretation of data, the information derivable from data, and the querying methodologies related to both. Pawlak’s approach in this respect can be regarded as an alternative to Codd’s framework for relational data models. There are some interesting similarities and dissimilarities.

I’ll try to get a reasonable quality electronic version of this paper.

Best greetings,

Dominik

[ Edited: 07 April 2009 05:15 AM by Dominik Slezak]
 
Posted: 21 April 2009 05:10 AM [ # 12 ]
Sr. Member
Total Posts:  488
Joined  2008-08-18

Hello maison09,

You are very welcome!

I should also have mentioned the rough set page on Wikipedia:

http://en.wikipedia.org/wiki/Rough_set

Many people say that it should be seriously updated. Still, it’s also a kind of reference.

Best greetings,

Dominik
