Hello,
This time I’m going to write about Korea, and more precisely about the Future Generation Information Technology (FGIT 2009) conference that was held on Jeju Island last week. I’ve been to Jeju twice before, but its special charm still strikes me. I realize that this blog should be more about technology than travel, but let me at least refer you to some nice pictures of the island and the venue.
The idea of this conference relates to the concept of Hybrid Information Technology, considered a few years ago together with some of my colleagues in Korea. Nowadays, they are holding their own Hybrid IT related academic events (see also ICHIT and ICCIT) with a high number of paper submissions (in the case of FGIT 2009 it exceeded 1100 articles). It shows a huge practical need for interdisciplinary research in IT.
FGIT is a multi-conference that consists of 10 events, each with its own committee and materials, but unified with respect to organization, keynotes and a special volume containing the most representative papers accepted to the particular events. The special volume is published in the LNCS series (probably quite familiar to academic readers). The remaining papers can be found in 10 CCIS volumes. The CCIS series is a new Springer initiative and a good complement to the more established Lecture Notes in Computer Science, Artificial Intelligence, and Bioinformatics. In my opinion, CCIS may soon become more attractive to industry and practitioners than its older LNCS/LNAI/LNBI siblings.
However, let’s leave all those complicated publishing details behind and focus on the contents. Let’s go back to the keynotes and quote the abstracts of at least two of the nine invited talks:
The Era of Social Computing (Irwin King): The Web has changed the landscape of how humans interact socially. With the advent of Web 2.0, Social Computing has emerged as a new and innovative paradigm that changes the way we communicate, interact, and learn. Social Computing involves the investigation of collective intelligence by using computational techniques such as machine learning, data mining, natural language processing, etc. on social behavioral data collected from blogs, wikis, emails, instant messages, clickthrough data, query logs, social bookmarks, tags, etc. In this talk, I will first introduce Social Computing by outlining some of the unique characteristics and aspects that are found on the various social platforms. Applications in each of the platforms will be presented to further demonstrate the use of these new technologies to enhance and enrich our lives. Lastly, I will conclude with some current challenges and potential future promises of Social Computing.
The Ubiquitous DBMS (Kyu-Young Whang): Recent widespread use of mobile technologies and advancement in computing power prompted strong needs of database systems that can be used in small devices such as sensors, cellular phones, PDA, ultra PCs, and navigators. We call database systems that are customizable from small-scale applications for small devices to large-scale applications such as large-scale search engines ubiquitous database management systems (UDBMSs). In this talk, we first review requirements of UDBMSs. The requirements we identified include selective convergence (or “devicetization”), flash-optimized storage system, data synchronization, supportability of unstructured / semi-structured data, and complex database operations. We then review existing systems and research prototypes. We first review the functionality of UDBMSs including the footprint size, support of standard SQL, supported data types, transactions, concurrency control, indexing, and recovery. We then review the supportability of requirements by those UDBMSs surveyed. We highlight ubiquitous features of a family of Odysseus systems that have been under development at KAIST for over 19 years. Functionalities of Odysseus can be “devicetized” or customized depending on the device types and applications as in Odysseus / Mobile for small devices, Odysseus / XML for unstructured / semistructured data, Odysseus / GIS for map data, and Odysseus / IR for large-scale search engines. We finally present research topics that are related to the UDBMSs.
The first of the above keynotes was particularly interesting to listen to because of the nature of the data underlying the social computing challenges. Indeed, compared with my previous blog post, we face here one more example of Log Analytics, now related to Web 2.0 applications. Some people may actually call it Web Analytics or Social Analytics, given specific tasks such as (this is my favorite one!) Social Marketing, which aims to target the best-connected nodes (the most influential individuals) within a social network. Surprisingly (or not at all), quite analogous tasks can be formulated for other types of complex network-related data sets, for example the data related to the spread of infectious diseases (see another keynote by Peter M.A. Sloot). As a result, one of the future directions may be the abstraction of complex network data models and of the most typical queries against them. However, regardless of whether we use classical relational data models or something more specific to network-related data, query complexity will still be there. Actually, I see most Social Computing tasks (and Social Marketing in particular) as corresponding to quite mixed, ad hoc, complex query workloads.
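To make the "best-connected nodes" idea a bit more concrete, here is a minimal Python sketch that ranks users by the number of distinct people they interacted with. Everything in it is an assumption made for illustration: the log format (pairs of user names), the function name `top_connected`, and the choice of plain degree as the measure of influence. Real Social Marketing tasks would likely use richer measures and much larger logs.

```python
from collections import defaultdict


def top_connected(interactions, k=3):
    """Rank users by degree, i.e. by the number of distinct contacts.

    `interactions` is a hypothetical log: an iterable of (user_a, user_b)
    pairs, one per observed interaction. Repeated pairs count only once.
    """
    neighbours = defaultdict(set)
    for a, b in interactions:
        if a == b:
            continue  # ignore self-interactions
        neighbours[a].add(b)
        neighbours[b].add(a)
    # Sort by descending degree; break ties alphabetically for stable output.
    ranked = sorted(neighbours.items(), key=lambda item: (-len(item[1]), item[0]))
    return [(user, len(contacts)) for user, contacts in ranked[:k]]


# Toy interaction log (made-up data).
log = [
    ("ann", "bob"), ("ann", "cid"), ("ann", "dee"),
    ("bob", "cid"), ("dee", "eve"), ("ann", "bob"),
]
print(top_connected(log, k=2))  # [('ann', 3), ('bob', 2)]
```

Even this toy version is essentially a grouping and a distinct count over a log table followed by a top-k, which hints at why such workloads look like ad hoc analytical queries.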
The second of the above keynotes raises the question of to what degree database engines should be tunable with respect to available resources. Look at all those new storage devices and their specific performance bottlenecks. Is it possible to create a database architecture that can work efficiently with (adapt to) multiple heterogeneous resources at the same time? Is it possible to create an architecture consisting of roughly two layers: one responsible for truly abstracted execution of database operations and another one hiding the mechanisms of adaptation to particular types of resources? It’s surely something more than the traditional meaning of being “oblivious” (e.g. cache-oblivious). I hope it’s doable. But it looks like just the first part of the UDBMS story: gathering all information sources and computational resources to retrieve useful knowledge. The second part is the question of what kind of knowledge is worth retrieving, given potentially limited data access and incompleteness. As an example, one of my colleagues in Italy has conducted very interesting research on approximate reporting on handheld devices. Of course, at least some of you already know that approximate querying is one of my favorite topics, but I’m trying to stay unbiased here. I honestly believe that it is something natural within the UDBMS framework.
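As a small illustration of what approximate reporting could mean on a device with limited data access, the Python sketch below estimates an average from a random sample and reports a rough error margin. It is only an assumption-laden example, not the method from the research mentioned above: the function name `approximate_mean`, the simple random sampling, and the normal-approximation 95% margin are all my own choices for illustration.

```python
import math
import random


def approximate_mean(values, sample_size, seed=0):
    """Estimate the mean of `values` from a simple random sample.

    Returns (estimate, margin), where margin is a rough 95% error bound
    based on the normal approximation (an illustrative assumption).
    """
    if not values:
        raise ValueError("values must not be empty")
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    sample = rng.sample(values, min(sample_size, len(values)))
    n = len(sample)
    mean = sum(sample) / n
    if n < 2:
        return mean, float("inf")  # no variance estimate from one row
    variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
    margin = 1.96 * math.sqrt(variance / n)
    return mean, margin


# Made-up "table column": the integers 0..999, whose true mean is 499.5.
values = list(range(1000))
estimate, margin = approximate_mean(values, sample_size=100)
print(f"approximate mean: {estimate:.1f} +/- {margin:.1f}")
```

The point of the sketch is the trade-off: the device touches only 100 of the 1000 rows, and the report carries an explicit error margin instead of pretending to be exact.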
Last but not least, the conference is not only about keynotes but also about regular paper presentations, so we did our best to contribute and submitted two papers. The first paper, co-authored by Marcin Kowalski, is about organizing data in a way that improves the quality of our Knowledge Grid. I have blogged about it several times before, so let me just refer you to the paper and to Infobright’s Wikipedia page, where you can find some relevant links. The second paper, co-authored by Hiroshi Sakai, is a perfect example of how forum discussions can evolve into scientific cooperation. The paper is about SQL-based mining of non-deterministic data. It’s just preliminary work, but I hope you will like it.
To conclude, I hope that I have convinced you to look more carefully at the idea of Hybrid IT and the place of databases within it. I’ll be looking forward to your comments and… do not forget about FGIT 2010!
Best greetings,
Dominik