****Download now available here****Watch a video that explains them****
In a concentrated effort to improve upon our users' experience in working with Infobright Community Edition, a new open source software download will be released shortly. This download will be released under the MIT license and will be supported through the Community. It will include sample projects written in C++, C#, Java and PHP, with future projects in Ruby and Silverlight. These example projects have been organized by me and coded by interns working with Infobright from the University of Illinois.
The intern program was designed to bring in computer science and computer engineering students and give them an opportunity to work hands on with Community-driven software projects. The transition from classroom to the real world can often be a challenging experience for young developers. Adoption into the organization's community, adapting to the corporate infrastructure and the pressure to deliver quality code are just a few of the challenges that students face.
One of my personal goals in working with interns is to help instill a sense of confidence and humility within their professional manner. The confidence factor is key for allowing developers to make decisions and the humility factor is essential in giving the developer a sense of possibility in making decisions. These two traits combined give young engineers the ability to be very effective in an industry that quickly changes and moves at the speed of research.
All of the projects within this single download are examples of how to connect and work with Infobright. Each of these sample projects has well-written comments and a readme file for each language, and should allow for a quick path from development to testing. One of the goals of this open source release was to provide a boilerplate project for each language that can be utilized as a "starting point" for individualized projects. By assisting in this starting process, we are helping software developers to focus primarily on their business logic and architecture needs.
This software will be released to the public and available for download starting Friday, May 18th from the Contributed Software page on http://www.infobright.org. This release will be updated from time to time with additional projects supporting new architectures and platforms. The next release will contain additional projects written in Ruby as well as Silverlight using various development and design patterns.
Infobright applauds all of the interns that are participating in our intern program and we look forward to supporting the Open Source Community with further contributed software releases.
Doing your first webinar can be a harrowing experience. Although it was exciting, I found that no matter how prepared you are, it is still a very humbling experience. Note to self, stand at your desk next time instead of sitting, and close your eyes and imagine you are in front of the crowd and that everyone is in their underwear (LOL). All kidding aside, I found that it can be very hard to be a polished presenter until you have experienced trying to be a presenter; no matter how well you know the information. No matter how many times you repeat it in your head, unless someone is standing in front of you listening to you, it is different. I have no desire to be a salesman. But I do have the passion and enthusiasm for what I do, and that makes the difference.
How many of you feel like you are doing exactly what you wanted to do in your life with your careers or feel you are exactly where you want to be in your career? This is something that I have been realizing for myself since I took on this role with Infobright. What it comes down to is this: I love what I am doing and the company I work for. Every day I strive my hardest to make a difference in the work that I do, which can be challenging; but for me, it is incredibly rewarding.
When I first took on this role, I had to quickly learn about the technology, which for me, coming from a software engineering background, was exciting and happened fairly quickly. The challenge was taking on the responsibility of the role and understanding how the Community Manager role fits within the company's infrastructure. But that is the kind of challenge I enjoy, and it fits perfectly with where I want to be in my career.
"You have the career you want, so what are you going to do with it? "
I have outlined a few objectives that are guiding posts for everything I am aiming to accomplish this year as Community Manager. Primarily, I want to improve upon your experience in working with ICE. To help achieve this, we (the University of Illinois interns and I) have a multitude of projects centered around improving the overall download process, and have begun engineering a handful of open source software projects centered around working with ICE/IEE. These projects are going to be released to the community and will contain source code and examples of best practices on various topics. They will be released in a variety of programming languages to help support multiple platforms.
Another objective I have is to increase awareness by working with other communities. We already have an incredible community with a lot of great members. I consider that the current community is truly the core team. The support that is present within our community is amazing and very encompassing. By this I mean we support such a variety of topics from different schema and performance query issues to working with different platforms and compilation issues, to topics that cover industry trends. This community in its current state is truly a resource for the rest of the world. One of my goals in talking with people in other communities is underscoring the value of our community as an incredible resource.
I want to end this blog with a pat on the back to all of our community members. You have done a great job and I hope that you all keep up the good work.
I had the chance to attend the MySQL Percona 2012 conference just a few short weeks ago in Santa Clara. The conference and expo were a three-day event that focused on a variety of topics that ranged from performance and tuning to scaling and backup technologies. Although the attendance has been slightly diminishing over the last few years, I still felt that the demographics of the attendees were more defined. Among them were the software architects and engineers, database administrators and chief technology officers that were seeking ways of handling their normal day-to-day issues surrounding Big Data. Many of the attendees commented quite nicely about the presentations and the general tenor of the conference seemed very amicable.
Some of the most notable exhibitors had new open source technologies and frameworks that are starting to become noticeable. Companies like Sphinx, an open source search technology, had quite a few of their engineers at the show and were ecstatic about feedback they got from the attendees. Another growing company, Akiban, was also there to introduce their table grouping framework for speeding up queries. There was a bit of buzz around their booth because they had this Lego contest that attracted quite a few creative types, but even more so due to the up and coming release of their software to the open source world. Most of the MySQL database companies were present at the expo, along with a range of NoSQL and NewSQL technologies like NuoDB. On the hardware side of things, technologies like Fusion-io with their amazing speed and benchmarks were there enticing decision makers to test out their ioMemory and ioDrives. There was another young company, Virident, that looks like it may give Fusion-io a run for their money.
As the Community Manager, I had a goal to talk about the advantages of our open source software and try my best to tell attendees about our architecture. Oftentimes, I had approximately twenty seconds to do so, and I felt like I was successful. Though there were quite a few people I did not get a chance to speak with, of those with whom I did, the response was incredibly positive and even more so, exciting. I was impressed with the amount of knowledge that some people did have about our technology as well as excited to talk to those who knew little to none. Something else that I did find impressive though was the amount of energy in the room. There was certainly a high level of enthusiasm that felt good to be around, which I felt made for a great expo.
Overall, I found the conference to be hugely beneficial to the community because of the networking that it offered and the chance to identify new use cases with more companies. Amongst the hoopla, I was able to get a moment to walk through with my laptop and create a video of a few of the booths and talk with some of the exhibitors, so take a quick look when you get a chance. I certainly had a lot of fun, and I got a chance to carry forward the message about our incredible architecture. Big "Ups" to Infobright for their decisions in bringing this technology to where we are and where we are going. See you guys in Portland at OSCON 2012...
I am new to Infobright but I have been working on the problems we solve for many years. This is the first of several blogs I plan to write on both business and technical issues, including one on how to get reliable results from a proof-of-concept.
Maybe it is the everyman IT story but it is so common that it has become part of the background noise of Big Data, particularly the problem of machine-generated data: it is the problem of finding a solution to our solution.
Most of us continue doing what we have done in the past until it doesn't work anymore. That seems pretty smart. And it is. However, once it stops working things get foggy quickly. Big machine data problems are a classic example. Twenty and even ten years ago we were all doing fine; our existing hardware and row-based solutions were able to scale with the data volumes. We were working in a world where Moore's Law kept us out of trouble on the raw power side and disk manufacturers kept innovating storage systems. We were IT and we were smart and we had money to spend.
For several reasons, such as data retention regulation and Internet access expansion, our money started to run out. More importantly we never considered that our needs would outstrip our growth curve. Ten years ago we were confident that Moore's Law would hold and it has. We never thought that our problems could grow faster than that. Well they did.
Here is where the real opportunity sets in (read big problems). We naturally think in terms of the tools we already have to find solutions to our problems. It is one of my worst habits, to go to my toolbox to see what I might have that fixes my problem, or more accurately that fits the solution I think I need. With Big Data we wanted a solution in the worst way. In many cases this is what we have. Studies show the top responses to big machine data analytics problems are: reduce the amount of data the user has access to, hire more DBAs and buy bigger, faster hardware. In some cases one quarter of our IT budget goes to storage. We did this knowing we were sinking faster than we were bailing. Same toolbox yields same tools.
Next week I am going to discuss other technologies that attempt to add tools to the toolbox and why this is so hard to do. I do want to end with one last "new guy" insight from my beginnings at Infobright. Most people come to Infobright having already seen that applying standard thinking to their problem has failed. They now suspect they have the problem of looking for a solution to a solution they have already spent considerable time and money on. Infobright is what happens when serious problems are tackled by serious thinkers. In the realm of big machine data this means speed and compression go up and cost goes down. Until Infobright, everyone above was right; you were either small or you were powerful but you were not both – and low cost was never an option. As we will cover in the next few blogs we finally have a solution to our solution. Infobright.
In Infobright 4.0 (ICE and IEE), we optimized our lookup functionality. The 10,000 limit recommendation has been removed.
While the removal of the 10,000 limit recommendation allows for larger dimension tables to be flattened into the fact table, one should also consider the ramifications of using 'lookup' columns. As the lookup is stored uncompressed in memory, you're limited by the amount of resources on the system. If you have a very large table with a very large number of distinct values, you may consume significant RAM resources with this lookup. Therefore I recommend you only utilize large lookups on columns which are critical and beneficial to you. Do not use 'lookup' on a column just because you can; consider how often you use that column and compare with resource consumption.
In short, use lookups when you can maintain low cardinality (>= 10:1 ratio of total-to-distinct values). When the total number of distinct values is extremely large, justify the use of RAM before setting the lookup flag.
To wrap up:
· Have a >= 10:1 ratio of total-to-distinct
· Ensure you have enough RAM to hold all uncompressed, distinct values in memory (without causing other processes/queries to suffer)
· Lookups are only applicable to varchar/char fields. Numbers and Dates will be ignored.
· Only consider lookups for commonly used columns in the select, where, and group-by clauses of queries. Rarely used columns only consume RAM.
· Don't use lookups as a general 'surrogate key'; only use lookups when you need them.
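The 10:1 rule of thumb above can be checked before a column is flagged as a lookup. The following is only a minimal sketch: the class and method names are hypothetical, and the two counts are assumed to come from something like a COUNT(*) and a COUNT(DISTINCT ...) on your own data. It covers the ratio test only, not the RAM check.

```java
// Hypothetical helper (not part of Infobright): tests the 10:1
// total-to-distinct rule of thumb for a candidate lookup column.
final class LookupCheck {
    static final long MIN_RATIO = 10;

    // totalRows and distinctValues are assumed to come from
    // COUNT(*) and COUNT(DISTINCT col) on the candidate column.
    static boolean meetsRatio(long totalRows, long distinctValues) {
        if (distinctValues <= 0) {
            return false; // empty column: nothing to gain from a lookup
        }
        // Integer division floors, so this is true exactly when
        // totalRows >= 10 * distinctValues.
        return totalRows / distinctValues >= MIN_RATIO;
    }

    public static void main(String[] args) {
        System.out.println(meetsRatio(1_000_000, 50_000));  // 20:1 ratio
        System.out.println(meetsRatio(1_000_000, 400_000)); // 2.5:1 ratio
    }
}
```

A column that passes this test still needs the RAM check from the list above, since every distinct value is held uncompressed in memory.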
Other things to consider:
· Initial Server Startup Time can be impacted if you have an extremely large number of lookup columns. It's pulling those values off disk and putting them in RAM when you start the service.
· You're automatically taking RAM away from other processes/queries when using Lookups
· You cannot change the DDL to remove or add new lookups. It requires a full data dump, drop table, create table, and re-load in order to add/remove lookup columns. In the future, we hope to change where 'lookups' are defined, but at least for now, it's a risk.
· DomainExpert™ technology (beginning in 4.0) is a great alternative for any column which doesn't fit the lookup paradigm *and* has a repeatable pattern (ex: e-mail addresses).
· If DomainExpert and Lookups do not qualify, adding an md5-hash-equivalent column can help with query times on char/varchar columns. More information on MD5 hashing can be found here: http://www.infobright.org/images/uploads/blogs/how-to/How_To_Efficiently_Search_Strings_in_Infobright.pdf
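As a rough illustration of the hash-column idea in the last bullet, the sketch below computes an MD5 hex digest in application code so that it could be stored in a companion column at load time. This is an assumption about how one might prepare such a column, not the method described in the linked PDF; the class name is hypothetical.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: produces the value for an MD5 companion column.
final class Md5Column {
    // Returns the 32-character lowercase hex MD5 digest of the UTF-8 bytes.
    static String md5Hex(String value) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(value.getBytes(StandardCharsets.UTF_8));
            // %032x left-pads with zeros so the result is always 32 characters.
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(md5Hex("hello")); // 5d41402abc4b2a76b9719d911017c592
    }
}
```

A query would then compare the hash column against the hash of the search string instead of comparing the long string itself.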
We all have witnessed the explosion of data and the challenges it brings. Today organizations are awash with data. When I first started in Information Technology we were talking gigabytes; today it's terabytes, petabytes and even exabytes. Next up: zettabytes.
Healthcare is no different, though its data is more complex, as it's wrapped in layers of stringent safeguards thanks to federal and state regulations. In the US these include HIPAA, the HITECH Act, FISMA and a litany of other alphabet-soup regulations.
The ability to collect Big Data within healthcare is not the problem; organizations are already doing so, particularly when it comes to log management for compliance. It's the ability to process, store and interpret massive amounts of information that is one of today's most important technological drivers within healthcare.
One of the fundamental problems with log management within many organizations is effectively balancing resources with the torrents of HIPAA log data being generated on a daily basis. The high frequency of log generation is further complicated by the length of time HIPAA requires that these files need to be retained in order to guard against non-compliance and audit issues. Add to this the need to perform regular effective analysis of this data. The more you look at log management within healthcare, both the technical and economic challenges of storing and analyzing terabytes of log information become very clear.
The traditional approach to log management is to store these logs within a row-based database. However, these are not well-suited to manage and store the surging data volumes required by these regulations. Doing so puts IT administrators between a rock and a hard place in terms of mitigating the plummeting performance and increasing storage requirements.
The other approaches are to deploy a general-purpose data warehousing solution, an Event Log Management application, or an appliance. While most of these are great solutions, they often pose a very costly proposition in terms of hardware, licensing, and DBA effort. The high DBA effort is evident in how these solutions address the most costly aspects of managing surging data volumes, which are needless I/O operations and high latency, thus requiring some sort of database tuning.
Also many of these solutions are best suited for workloads that consist of a high volume of planned, repetitive reports and queries. This approach fails to address the growing need for a data warehouse designed for the ad hoc, investigative analysis that healthcare organizations require to perform effective analysis of their log data.
Infobright provides a more innovative approach for quickly analyzing the fast-growing volumes of event data: a purpose-built, self-tuning column store analytic database designed to deliver a scalable solution. Infobright is optimized for the complex ad hoc analysis required by healthcare organizations today.
Infobright's architecture addresses the limiting factors of traditional databases by minimizing disk I/O, eliminating the need for database tuning and allowing queries to be run against a single column, thereby limiting the search to relevant data rather than the entire database. Infobright provides better performance and a greater degree of scalability than traditional approaches, allowing organizations to store and analyze more data for a fraction of the cost, a fraction of the time and a fraction of the hardware requirements of other solutions on the market.
Infobright is built upon the MySQL architecture and, although we typically recommend the Infobright Loader (and/or the Distributed Load Processor, DLP, with Infobright Enterprise Edition), frequently questions arise as to what connectivity tools can be used in different circumstances for data access and efficient insertion of data into Infobright. As the MySQL website states:
"MySQL provides connectivity for client applications developed in the Java programming language through a JDBC driver, which is called MySQL Connector/J. MySQL Connector/J is a JDBC Type 4 driver. Different versions are available that are compatible with the JDBC 3.0 and JDBC 4.0 specifications. The Type 4 designation means that the driver is pure-Java implementation of the MySQL protocol and does not rely on the MySQL client libraries."
We recently had a customer who asked us a question about fetching larger sets of data via the MySQL JDBC driver for insertion into Infobright. They told us that this driver worked fine for smaller data sets but their business use case required handling at least one million rows of data.
Their specific inquiry was whether they could specify a larger fetch size of at least one thousand records. Alternatively, they asked us if we could provide a JDBC driver other than the MySQL driver or if we could recommend some other solution.
Initially, we told them that there were a few options: looking at alternate drivers that could be used, such as Connector/J, DataDirect JDBC, and even the Drizzle JDBC driver; streaming the data using JDBC (as opposed to buffering); or moving the Java service to another server to avoid memory contention. We also noted that there are known issues with JDBC and memory, and that there was a good write-up about the JDBC driver from one of our former sales engineers which could be read at the following link:
http://www.infobright.org/Blog/Entry/infobright_mysql_and_jdbc/
However, this customer was emphatic about getting their Java application to connect to Infobright using the JDBC driver; specifically, Connector/J 5.1.18. A couple of weeks after their initial inquiry, they reported that they had tested setting the fetch size to 1000, 100 and 10 (the latter with a very large result set) and, in all three cases, got an "out-of-memory" exception.
They found a workaround to avoid the exceptions. This workaround is based on setting the fetch size to Integer.MIN_VALUE; thereafter, the JDBC driver returns one row at a time from the result set. This works well as long as the result set is small, but it would be a concern if the result set is really large.
In response to this, I did some research on the Integer.MIN_VALUE setting and whether there was a better "streaming" option. After completing my research, I responded with the following:
"...the main options for avoiding memory contention between the database and JDBC connections are to either stream the results to the application rather than buffer, or isolate the Java service onto a separate VM. If you end up having a large result set, simply moving the Java service to another server has worked consistently so make sure to do that."
After further review, the problem was not inherently in the JDBC driver, but in the configuration.
A review of the issue can be found here: http://bugs.mysql.com/bug.php?id=18148
By default, MySQL + the Connector/J JDBC driver for MySQL will return the entire result set at once, filling the JVM memory even before stepping through the result set.
ResultSet rs = stmt.executeQuery(queryString);
The variable "rs" now contains the entire result set at once. According to the JDBC API, setting the fetch size will provide a hint to the driver and database to return the results in blocks based on the fetch size so that you do not need to hold the entire result set in memory at once.
The recommended workaround was setFetchSize(Integer.MIN_VALUE), which forces the results to stream one row at a time. This hurts performance because each fetch incurs network latency; it's better to retrieve the results in reasonably sized chunks.
However, we found that an additional configuration setting is available and, if you use it, the JDBC driver will perform as expected and use the setFetchSize as intended by the JDBC API.
If you have Infobright and the MySQL Connector/J Driver 5.0.1 or higher this will work, but you need to add a setting in the JDBC URL when establishing the DB connection.
jdbc:mysql://SERVER:PORT/DBNAME?useCursorFetch=true&defaultFetchSize=1000
This tells MySQL to use a cursor to step through the results and sets the connection with a default fetch size. After that, you can actually use the setFetchSize and reset it as needed, or simply use the default you specify at the time the connection is made.
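To make the configuration concrete, here is a minimal sketch of building such a URL and reading a result set with it. The class name, the host, port and database values, and the row-counting method are all hypothetical placeholders; only the useCursorFetch and defaultFetchSize parameters and the standard java.sql calls come from the discussion above.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical example class; connection details are placeholders.
final class CursorFetchExample {
    // Builds a Connector/J URL with cursor-based fetching enabled.
    static String buildUrl(String host, int port, String db, int defaultFetchSize) {
        return "jdbc:mysql://" + host + ":" + port + "/" + db
                + "?useCursorFetch=true&defaultFetchSize=" + defaultFetchSize;
    }

    // Steps through a query result in blocks rather than buffering it whole.
    static long countRows(String url, String user, String password, String query)
            throws SQLException {
        long rows = 0;
        try (Connection conn = DriverManager.getConnection(url, user, password);
             Statement stmt = conn.createStatement()) {
            // Optional: overrides the connection's defaultFetchSize for this statement.
            stmt.setFetchSize(1000);
            try (ResultSet rs = stmt.executeQuery(query)) {
                while (rs.next()) {
                    rows++; // real code would read column values here
                }
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Prints the URL only; countRows needs a live server and the driver jar.
        System.out.println(buildUrl("SERVER", 3306, "DBNAME", 1000));
    }
}
```

Setting the fetch size on the statement is only needed when a particular query should use a different block size than the connection default.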
On a similar note, a different customer was experiencing very slow bulk load rates into a specific Infobright table which he described as follows:
"What we basically do is read the data from Oracle, and then load it to MySQL in bulk sets of 50,000 records. We are using 'ojdbc6.jar' to read from Oracle, and 'mysql-connector-java-5.1.18-bin.jar' to write to MySQL. We would appreciate any suggestions you might have to improve the loading time."
Initially, I thought that this customer could benefit from the first customer's experience in setting the default Fetch Size higher. However, this latter customer came back to me and explained that the problem was the insertion speeds into Infobright, not the fetch speeds from Oracle. Again, after researching why loading with the MySQL Connector/J might be slow, I came upon the following three recommendations:
cacheServerConfiguration=true
useLocalSessionState=true
rewriteBatchedStatements=true
The customer tried this last suggestion first and, believe it or not, it made a significant difference in load performance! Here is an explanation of how they tested this recommendation and analyzed the results:
"We were able to use the 'rewriteBatchedStatements=true' configuration with a smaller bulk set of 9000 records (originally we had tried 50,000). With a larger bulk set of greater than 10,000 records, we get the error 'java.sql.SQLException: Prepared statement contains too many placeholders'.
But, with a smaller batch size we see a great improvement when using the 'rewriteBatchedStatements=true' parameter. We saw that we can load 15,800 rows in a second to TABLE1 which is the table that took us the most time to load."
Previously, they were able to load only 2250 rows per second, so they're seeing roughly a 7X performance improvement with this change alone!
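For readers who want to try the same settings, the sketch below shows one way the three URL parameters and a bounded batch size might fit together. It rests on an assumption: that the "too many placeholders" error comes from MySQL's limit of 65,535 placeholders per prepared statement, so the largest safe batch depends on the number of columns per row. The table name, column names and class name are hypothetical.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

// Hypothetical example class; table and column names are placeholders.
final class BatchLoadExample {
    // MySQL allows at most 65,535 placeholders in one prepared statement.
    static final int MAX_PLACEHOLDERS = 65_535;

    // Largest number of rows whose placeholders fit in one rewritten statement.
    static int maxRowsPerBatch(int columnsPerRow) {
        if (columnsPerRow <= 0) {
            throw new IllegalArgumentException("columnsPerRow must be positive");
        }
        return MAX_PLACEHOLDERS / columnsPerRow;
    }

    // Builds a Connector/J URL carrying the three recommended settings.
    static String buildUrl(String host, int port, String db) {
        return "jdbc:mysql://" + host + ":" + port + "/" + db
                + "?rewriteBatchedStatements=true"
                + "&useLocalSessionState=true"
                + "&cacheServerConfiguration=true";
    }

    // Inserts two-column rows, flushing whenever the batch limit is reached.
    static void load(Connection conn, List<String[]> rows) throws SQLException {
        int batchSize = maxRowsPerBatch(2);
        String sql = "INSERT INTO table1 (id, name) VALUES (?, ?)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            int pending = 0;
            for (String[] row : rows) {
                ps.setString(1, row[0]);
                ps.setString(2, row[1]);
                ps.addBatch();
                if (++pending == batchSize) {
                    ps.executeBatch();
                    pending = 0;
                }
            }
            if (pending > 0) {
                ps.executeBatch(); // flush the final partial batch
            }
        }
    }
}
```

Under that assumption, a seven-column table would allow at most 9,362 rows per batch, which would be consistent with 9,000 rows working and more than 10,000 failing, although the customer's actual column count is not stated.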
I hope these recommendations help if you encounter similar situations in your environment.
With Infobright 4.0 we delivered a number of unique “Big Data” features including Rough Query (link to video) and DomainExpert™ (link to video) to support querying large volumes of machine-generated data. At the same time we also introduced a companion product to Infobright Enterprise Edition, the Distributed Load Processor (DLP) with a Hadoop connector, to enable very fast data load and a simple method to extract data from a Hadoop cluster. The ultimate goal is to extract and load data into Infobright as quickly as possible to allow for great data growth and “queryability”. Customers can use these tools to load over 2 terabytes an hour into just one table. In a previous blog post found here (http://www.infobright.org/Blog/Entry/loading_data_in_infobright_on_amazon_ec2_using_dlp/), we illustrated how to install DLP on an Amazon EC2 instance. In this blog, we will discuss how to leverage our Hadoop connector in conjunction with DLP.
If you’re new to Hadoop, take a look at the Apache™ Hadoop project (http://hadoop.apache.org/). Hadoop provides a highly-scalable approach to crunching through insane amounts of data. The main use case for Hadoop is to deal with near-petabyte+ worth of data. Hadoop provides a great resource for storing *all* of your detailed data for great lengths of time. Plus it doesn’t require a schema. So if your data is semi-structured or unstructured, Hadoop is a great asset in your arsenal.
Where Hadoop tends to fail is with respect to fast analytic response. While there are technologies that sit atop Hadoop intended for analytics, they do require a level of overhead. If you’re looking for blazing ad-hoc query speed, you won’t necessarily find it with Hadoop and MapReduce. To that point, Infobright provides that capability. By structuring and moving over a subset of the most highly relevant or even aggregated data into Infobright, you can provide an easy and fast user experience. We are strong advocates of using the right tool for the right job, and the combination of Infobright and Hadoop is a good example. If you are interested, watch the video about LiveRail (link to video on Youtube), which explains how and why they are using both.
Jumping ahead to the specifics, I assume you have Hadoop installed and working properly. Plus, I assume you’ve installed DLP. (You can download an Enterprise Edition free trial along with a DLP trial at http://www.infobright.com/Products/Product-Demo/ if you don't have a DLP license). Java is also required, so ensure the latest JDK is available. To utilize our Hadoop connector, a few pre-requisites are required.
These steps were verified using:
Infobright Database: IB_4.0_r13151_13690 (iee-commercial)
Infobright DLP (this includes the Infobright Hadoop Connector): DLP 1.1.0 64Bit
Hadoop: hadoop-0.20.205
Operating System: CentOS 5
Once you execute the command, you should see your data in the database. If you receive any errors, please consult the documentation for error codes. The official Hadoop connector documentation can be downloaded on the support portal at infobright.com.
If you're wondering how Infobright started, here is a brief history of the company and the technology:
Infobright was founded in 2005 to leverage a mathematical approach, called Rough Set, to solve data management and analytic problems. Rough Set started with a Polish computer scientist Zdzislaw Pawlak in 1981 as a mathematical tool to deal with vague concepts. In his theory, Pawlak describes the lower and upper approximations of a set as crisp or conventional, defining the upper and lower boundaries. This approach is useful for rule induction from incomplete data sets, and in other variations, also helped to form approximating sets known as fuzzy sets. Dominik Slezak, one of Infobright's founders, explains this as "we follow the rough set approach to identify: (1) the data portions that are fully relevant to the given query execution; (2) the data portions that are fully irrelevant to the given query execution; (3) the data portions that remain undecided." This theory when applied to data mining and machine learning was quickly adopted by a variety of industries with different applications.
Infobright's founders realized that Rough Set is a powerful tool to enable fast queries against large data sets without doing all the database administration work that had always been a requirement in the past to achieve fast performance. Instead of requiring indexes, data partitioning and other typical techniques, intelligence in the software could drive performance. Infobright calls this intelligence the Knowledge Grid. Wedding the Knowledge Grid to a columnar database architecture produced a powerful solution that could handle large amounts of data fast and simply, at a very low overall TCO.
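The three-way split that Dominik Slezak describes can be illustrated with a small sketch. This is not Infobright's implementation: it assumes, purely for illustration, that the only metadata kept about a block of column values is its minimum and maximum, and it handles a single "value > threshold" condition. All names are hypothetical.

```java
// Hypothetical illustration of rough-set style classification of data blocks.
final class RoughPackCheck {
    enum Relevance { RELEVANT, IRRELEVANT, SUSPECT }

    // Classifies a block for the condition "value > threshold"
    // using only the block's minimum and maximum values.
    static Relevance classifyGreaterThan(long packMin, long packMax, long threshold) {
        if (packMin > threshold) {
            return Relevance.RELEVANT;   // every value in the block matches
        }
        if (packMax <= threshold) {
            return Relevance.IRRELEVANT; // no value in the block can match
        }
        return Relevance.SUSPECT;        // undecided: the block must be read
    }

    public static void main(String[] args) {
        System.out.println(classifyGreaterThan(10, 20, 5));  // RELEVANT
        System.out.println(classifyGreaterThan(10, 20, 20)); // IRRELEVANT
        System.out.println(classifyGreaterThan(10, 20, 15)); // SUSPECT
    }
}
```

Only the undecided blocks need to be opened to answer the query, which is where the savings over indexes and partitioning would come from in this simplified picture.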
In 2006 Infobright formed a partnership with MySQL, taking advantage of the "storage engine" architecture that MySQL had to encourage other companies to create new databases for different use cases while taking advantage of many MySQL functions. This integration meant that migrating from a row-based MySQL database to the Infobright columnar-based database would be as simple as a command line change.
Infobright first introduced its technology, then known as Brighthouse, at the 2007 Rough Set Conference in Toronto to a very positive reception. A few months later, in 2008, the company released the industry's first commercial open source analytic database software (infobright.org) and started building a strong and growing open source user community, with more than 15,000 downloads in the first year. Within a year, Infobright had more than 40 customers including ISV OEM customers who embedded Infobright in their own software offerings.
Since 2008 there have been a lot of other changes. For one thing, the product is now called Infobright not Brighthouse. There are tools to integrate with major BI partners such as Pentaho, Jaspersoft, Talend, Actuate and Informatica. Users can load data several different ways depending on their needs such as using the Infobright loader, the MySQL loader, Infobright's Distributed Load Processor, or other ETL tools. Customers have reached data load speeds of up to 200,000 records per second. Infobright positioned themselves as the leader in the open source data warehousing community, and soon after, for recognition of their outstanding contributions to the MySQL ecosystem, Infobright was awarded the prestigious MySQL Partner of the Year award by Sun Microsystems in April 2009.
Using built-in intelligence, Infobright's unique way of storing and analyzing machine-generated data has provided the vehicle to near real-time analytics in big data. Machine-generated data has become one of the fastest growing categories of big data, with sources ranging from web, telecom network and call-detail records, to data from online gaming, social networks, sensors, computer logs, satellites, financial transaction feeds and more. This focus on machine-generated data within big data, beginning in 2010, gave rise to the rapid increase in customer momentum. And our latest version 4.0, released last summer, included Hadoop connectivity, as well as the introduction of DomainExpert™ and Rough Query. Developed exclusively by Infobright, DomainExpert™ uses specific intelligence about machine-generated data to automatically optimize how data is stored and how queries are processed. Rough Query leverages our Knowledge Grid to deliver data mining drill down at RAM speed, otherwise known as "Investigative Analytics."
From the beginning our executives and engineers recognized that the looming database challenge was how to analyze and extract actionable knowledge from very large (and growing) data sets. Clearly the market agrees as terms such as "Big Data" and "machine-generated data" become more commonly used. In addition, companies appreciate that Infobright's approach "to work smarter not harder" means that their users can get fast query response, even to ad hoc queries, without a high overhead of database administration or hardware costs.
By 2011, eight of the top ten telecommunications service providers worldwide were using Infobright to mine their big data. Hundreds of customers use Infobright daily and more than 100,000 users have downloaded our community and enterprise editions. Infobright is still leading the industry and paving the way.
Craig Trombly, Infobright Community Manager
Last week, I attended and spoke at POSSCON 2012, one of the largest open-source software conferences on the east coast. Boasting nearly 700 attendees this year, the three-track set of presentations went over very well. When you couple that with the southern hospitality, the event was surely a crowd pleaser.
In the technical track, I spoke about one of my favorite topics: the emerging database landscape. I introduced the basics of row, column, NoSQL and even NewSQL technologies: where they fit, how they fit, and who needs them. In total, we had about ten questions at the end of the talk. Plus, we had representation from Oracle and MariaDB to help answer any vendor-specific questions. At the conclusion of the talk, I provided the link to download our emerging database landscape whitepaper (you can get it here: http://www.infobright.com/land/emerging_database_landscape/). Beyond my talk, other speakers discussed several really cool technologies. For example, Twitter spoke about their extensive usage of open-source technologies. It really does take a lot of open-source to power just one tweet. It also explained why my tweets take a second or two to appear in my feed.
As for one of the keynotes, Scott McNealy provided great context for the survival of open-source. We need a champion, and the champion needs to be a power player. He encouraged the community to find its new champion and herald its momentum. I believe Scott is correct: we can only sustain the momentum with a capable leader behind it.
The organizers of POSSCON should be extremely proud. Todd Lewis did a magnificent job of working with vendors, patrons, and speakers, and his staff did a wonderful job of showing everyone a great time. Honestly, this conference was far better than any other conference I have attended. I really look forward to the opportunity to return next year.
For more information on POSSCON, please visit: http://www.posscon.org/