
Infobright Blog

How To Load Using a Named Pipe

by CarlGelbart     Sat, May 15, 2010

A guide to loading data from a Linux/Solaris named pipe.

How To - Load using a Named Pipe
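As a rough illustration of the idea, the sketch below creates a named pipe (FIFO) on a POSIX system, has one thread write delimited rows into it, and reads them back out on the other end. Everything here is an assumption made for illustration: the function names, the pipe file name, and the comma delimiter are hypothetical, and the reader is only a stand-in for a database bulk loader, which would normally be pointed at the pipe path instead.

```python
import os
import tempfile
import threading


def write_rows(pipe_path, rows):
    # Producer side: opening a FIFO for writing blocks until a reader opens it.
    with open(pipe_path, "w") as pipe:
        for row in rows:
            pipe.write(",".join(str(value) for value in row) + "\n")


def load_via_pipe(rows):
    # Create the FIFO inside a private temporary directory (POSIX only).
    workdir = tempfile.mkdtemp()
    pipe_path = os.path.join(workdir, "load.pipe")  # hypothetical name
    os.mkfifo(pipe_path)

    writer = threading.Thread(target=write_rows, args=(pipe_path, rows))
    writer.start()
    try:
        # Consumer side: a stand-in for the database loader. In a real
        # load, the database would read from pipe_path instead of this code.
        with open(pipe_path) as pipe:
            loaded = [line.rstrip("\n").split(",") for line in pipe]
    finally:
        writer.join()
        os.remove(pipe_path)
        os.rmdir(workdir)
    return loaded


if __name__ == "__main__":
    print(load_via_pipe([(1, "alpha"), (2, "beta")]))
```

The appeal of this arrangement is that the rows never land in an intermediate file on disk: the producer and the loader run at the same time and the pipe streams data between them.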

Top Five 1/2 Things You Should Know About Columnar Databases

by Bob Zurek     Thu, May 13, 2010

The use of columnar databases for analytic purposes continues to increase as information management and database technologists at corporate IT shops, software companies, and SaaS companies see significant benefits over row-based solutions. This is not to say that row-based systems aren't useful to the organization. Row-based systems are ideal for handling heavy transaction processing requirements and have been since their birth in the industry. Columnar databases are frequently used side by side with row-based systems for precisely what they are good at: supporting read-intensive applications like analytics, where fast query response is what is needed most. With this in mind, here is a list of the top five ½ things you should know about columnar databases.


1. Great performance for analytic solutions. Columnar databases are generally much faster than row-based systems for analytic queries. For example, one Infobright customer reports that it took 11 minutes to produce a one-month analytic report with 5 million events in row-based SQL Server, compared to just 10 seconds with Infobright’s columnar database. A complex filter that took 29 minutes in SQL Server took just 8 seconds with Infobright. While these results are very dramatic, they are not unusual.


2. Great at compression. A columnar database is likely to support a high rate of data compression. Compression provides several benefits to an organization, including a smaller overall storage footprint and simplified backup. Less storage means lower cost and extends the life of existing storage. Some professionals believe compressed data is more secure. Note that compression can degrade performance unless the database also has the intelligence to reduce the amount of data that must be decompressed to respond to a query.

3. Minimizes the need to scale out, thus reducing complexity. Columnar databases can support the storage and retrieval of terabytes of data using the intelligence built into the core columnar database software rather than forcing you down the complex path of throwing more hardware at the problem in the form of multi-node deployments. The average range of databases for analytics is 150 GB to 7-10 terabytes in size, depending on how much history is required. This is the majority of the market. Scaling up on multi-core machines without requiring additional hardware investments to support scale-out is something that many database professionals prefer. Well-designed columnar database products can handle these requirements by using intelligence and not hardware to drive query performance, so you are able to scale quite well on low-cost commodity hardware. Moore's Law continues to benefit these approaches. The new processors arriving over the next one to two years will continue to help.

4. Do not require the use of indexes. When challenged with improving query response times, one of the first things that database administrators pursue is an index optimization strategy. Tweaking and tuning indexes is not only a science but also an art that takes considerable expertise and sometimes lots of trial and error. Most databases support indexing, but several columnar databases require no indexing because it is not needed to achieve high performance. No indexing means more time spent on other critical tasks at hand, and much less administrative effort overall.

5. Supported by a growing ecosystem and great interoperability. Several years ago, you would be hard pressed to find tools, applications, reporting solutions, etc. that supported and interfaced with columnar databases. The good news is that this is no longer the case. Many open source and commercial solutions, from application development tools to analytic and data mining solutions, have embraced columnar databases. To help with this, columnar database companies have embraced open APIs in the form of ODBC, JDBC and other forms of native connectivity.

1/2. You might be asking yourself what's with the 1/2? Well, since I don't think The Top 5 1/2 topic has ever surfaced in a web search, I just thought it would be fun to add it to this list to test your curiosity. Did you happen to look at this point last?
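For readers who want to see how points 1, 2, and 4 can fit together, here is a minimal sketch. It stores each column separately as compressed blocks and keeps a min/max summary per block, so a filter can skip or answer most blocks without decompressing them and without any index. This is an assumption-laden toy, not a description of how Infobright or any other product is implemented: the block size, the function names, and the sample table are all hypothetical.

```python
import struct
import zlib

BLOCK_ROWS = 1000  # hypothetical block size, chosen only for illustration


def build_column(values):
    """Store one integer column as compressed blocks plus min/max metadata."""
    blocks = []
    for start in range(0, len(values), BLOCK_ROWS):
        chunk = values[start:start + BLOCK_ROWS]
        packed = struct.pack(f"<{len(chunk)}q", *chunk)
        blocks.append({
            "min": min(chunk),
            "max": max(chunk),
            "count": len(chunk),
            "payload": zlib.compress(packed),
        })
    return blocks


def count_greater_than(blocks, threshold):
    """Count values > threshold, decompressing only blocks that need it."""
    matches = 0
    blocks_decompressed = 0
    for block in blocks:
        if block["max"] <= threshold:
            continue  # no value in this block can match: skip it entirely
        if block["min"] > threshold:
            matches += block["count"]  # every value matches: metadata suffices
            continue
        # Only blocks that straddle the threshold are decompressed and scanned.
        raw = zlib.decompress(block["payload"])
        blocks_decompressed += 1
        chunk = struct.unpack(f"<{block['count']}q", raw)
        matches += sum(1 for value in chunk if value > threshold)
    return matches, blocks_decompressed


# Hypothetical table held column by column: a query on "amount"
# never touches the bytes of "user_id".
table = {
    "user_id": build_column([i % 100 for i in range(10000)]),
    "amount": build_column(list(range(10000))),
}
matches, blocks_decompressed = count_greater_than(table["amount"], 4500)
print(matches, blocks_decompressed)
```

The sample data is sorted, which makes the per-block summaries unusually selective; with unsorted data more blocks would straddle the threshold and need decompression.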

