This area lists files that have been contributed to the Infobright Community by Infobright, Infobright Employees, and community members. Users may have their contributions listed here by executing our Contibution Agreement. These files have not gone through rigorous QA testing at Infobright.
Sample Infobright Database
Infobright has created a sample database, carsales, to allow you to evaluate the Infobright Analytic Database, both ICE and IEE, and execute queries against significant volumes of data. The carsales database represents a fictional automobile dealer network and contains 10,000,000 sales records in its central fact table, fact_sales.
The fact_sales table records information about each sale. As part of a transformation process, the dealer identifier, marketing area, and sales area were appended to each transactional record to form the fact record. Every car transaction falls into a sales area and a market area. The sales and market areas may overlap, so separate tracking of each is required.
The sample download includes:
- documentation
- enhanced logging
- enhanced (optional) MyISAM table objects
- a suite of 110 SQL commands including queries, VIEWs and a stored procedure
NOTE
- The compressed sample database download is about 10X the size of the software download (at ~630MB) and takes up almost 3GB on disk when uncompressed.
- It will consume approximately another 5GB of database disk space if the MyISAM tables are loaded, only about 1GB if only Infobright tables are loaded.
- It will take time to download. Evaluators will need appropriate disk space for the download, the data files, and the loaded database itself.
Query Descriptions
Linux README
Windows README
Linux Database
Windows Database
ICE Breaker for Oracle V1.0 April 28, 2010
For Windows use only
OVERVIEW – ICE Breaker for Oracle is a Windows program that connects to an Oracle database, and creates a file that contains the necessary commands to migrate from Oracle to Infobright.
ICE Breaker for SQL SERVER (VERSION 1.0) April 28, 2010
ICE Breaker for Microsoft SQL Server is a Windows program that connects to a SQL Server database, and creates a file that contains the necessary commands to migrate from SQL Server to Infobright.
Fix Comments in MySQL 5.1.40 Added March 8th, 2010
Versions of Infobright prior to v3.3.1 were bundled with MySQL 5.1.14 and observed the 'double-dash' (--) syntax to represent a COMMENT. With the release of v3.3.1 (and later) based on MySQL 5.1.40, the syntax support for COMMENT was changed to 'double-dash-whitespace' (-- ).
This is a Unix/Linux bash shell script that uses the 'sed' command to change all double-dashes to double-dash-space and any any double-dash-space-spaces back to double-dash-space. This is a very simple script with all assumptions documented in the script header.
Innodb Export - Import to Infobright (VERSION 0.9) Added January 29, 2010
Script to export/import data from InnoDB table in one MySQL instance to an Infobright table in another.
Assumption(s): table to be exported and imported have same name the whole table is exported - edits will need to be made for higher selectivity
Caveats: there is no validation of the "type" of either schema there are very few sanity checks in this script such as file system space and record counts out and in.
Talend ETL Connector (VERSION 3.4) Updated December 4, 2009
The connector shows up under Databases -> MySQL in the palette.
Install note: The path to the component is hardcoded for TOS 3.1.3. If you are installing to a different version of Talend, you will need to update the install path to the component under the plugins directory.
New features with 3.4
New features with 3.3
Connector Core Library
infobright-core-v3 (VERSION 3) Updated December 3rd, 2009
The connector core library provides an API for Java developers to interface directly with the Infobright bulk load capability. It is made available for use with the Talend and Pentaho ETL connectors developed by Infobright. It is also for developers who wish to implement new bulk-output connectors for other Java-based ETL tools.
This package includes: Full source code, the library, and API (Javadoc) documentation.
New as of Dec 3: Now with support for remote loading!
Please note: Windows is only supported with v3.2.1 (or newer) of ICE or IEE
ICE Breaker for Oracle V0.7 November 2, 2009
For Windows use only
OVERVIEW – ICE Breaker for Oracle is a Windows program that connects to an Oracle database, and creates a file that contains the necessary commands to migrate from Oracle to Infobright.
Simple DB Size October 6, 2009
Perl script
A script to report on database disk usage.
Tutorial: Data Transfer from MySQL to Infobright September 22, 2009
Tutorial on how to get data out of mysql into a CSV file, work on the data using awk and finally loading it into Infobright.
The emphasis is on the checking and discovery of the data in the CSV file.
Rename Table September 17, 2009
Usage: rename_table
rename_table will do the same as the alter SQL statement
alter table
rename to
MySQL Proxy / Infobright CTAS Lua script September 2, 2009 (Untested softare by Infobright, submitted by a Infobright contributor)
For MySQL Proxy 0.61 only
Support for Create Table As Select (CTAS) through MySQL Proxy Server. Requires v0.61 of the MySQL Proxy - downloaded separately from MySQL.
New Features with this release
And please note that, because MySQL and infobright has some differences between data types, there is a function in code that does this conversion. For example MySQL has decimal (38,0) but in Infobright its maximum size is 18. So the program must be extensively tested on several different CTAS’es. Probably need to have change this function according to the results.
MyISAM Export August 6, 2009
For Linux/Unix use only
This is a Unix (bash) script that works on a table-by-table basis and exports data from a MyISAM table into a data file and imports (via LOAD) into an Infobright table of the same name and schema. It could be adapted for multiple tables can be purged using the same filter criteria.
LogRotate-IB (VERSION 1.0) July 30, 2009
For Linux/Unix use only
Infobright maintains an error log file named bh.err located, by default, in /usr/local/infobright/data. This log file is not “versioned”, it continues to append to the same file.
This Unix (bash) shell script copies the error log to a backup file and re-initializes the bh.err file. This script can be added to cron or executed manually. It can also be run multiple times per day and will append the incremental logs from the same day to the existing file for that day instead of overwriting it if it already exists. It does not depend on the Infobright/MySQL server to be running as does the option to use FLUSH LOGS in MySQL. It also allows one to maintain a history of log files instead of a single copy of the contents since the last time it was run (in contrast to FLUSH LOGS).
Log files are not automatically compressed or deleted after a specified number of days, although this is a simple modification that can be added to this script by the Infobright administrator to their specifications.
ReOrg (VERSION 1.0) July 30, 2009
reorg.sh documentation:
Infobright is designed as a “load and read” analytical database and storage engine. However, after the execution of DML activities like UPDATE and DELETE (in the Enterprise Edition of Infobright), tables can become fragmented and benefit from reloading which effectively compacts the table by removing all existence of formerly deleted record(s) and their associated delete mask(s).
reorg.sh does just this. It exports all data from a single, given table in binary file format to the file system, performs an Infobright variation of TRUNCATE on the table, reloads the table with the Infobright Loader, and then removes the export file from the file system. See the code extract below:
if [ $# -ne 2 ]; then
echo "Usage: $0
" 1>&2
exit 2
fi
MYSQL=`locate mysql-ib`
$MYSQL << EOF
-- set up environment
USE ${1};
SET @bh_dataformat='binary';
-- export data
SELECT COUNT(*) FROM ${2};
SELECT * FROM ${2} INTO OUTFILE '/tmp/${2}.bin';
-- truncate table
CREATE TABLE X LIKE ${2};
DROP TABLE ${2};
CREATE TABLE ${2} LIKE X;
DROP TABLE X;
-- reload table
LOAD DATA INFILE '/tmp/${2}.bin' INTO TABLE ${2};
SELECT COUNT(*) FROM ${2};
-- remove export file
\! rm -f /tmp/${2}.bin
\q
EOF
exit 0
In its current form, reorg.sh doesn’t perform any sanity checks such as ensuring there is enough available file system space for the export or validating that the amount of data exported matches the amount of data reloaded. (COUNT(*) output is sent to STDOUT during execution for manual validation.)
It also assumes, for example, that the Infobright engine is running, that the MySQL client is named mysql-ib (from the default installation), and that there is only one of them. But it does not assume any particular location of this file.
This is a script. Use it “as is” if you are comfortable with its current form or feel free to use it as a template for your own custom reorg utility.
ICE Breaker for MySQL (VERSION 0.1 BETA) April 22, 2009
ICE Breaker Documentation (MySQL)
ICE Breaker for MySQL is a Unix shell script that connects to a MySQL database, and creates a file that contains the necessary commands to migrate from MySQL to Infobright.
ICE Breaker for SQL SERVER (VERSION 0.9 BETA) Updated April 17, 2009
ICE Breaker Documentation (SQL Server)
ICE Breaker for Microsoft SQL Server is a Windows program that connects to a SQL Server database, and creates a file that contains the necessary commands to migrate from SQL Server to Infobright.
This is a tool used by the Infobright Implementations Team and it is a shell script that that will manage a number of parallel load processes for ICE. A single control file is used to allow simple configuration of the queue.
Kettle Connector (VERSION 2.2) Updated March 16, 2009
The Infobright Kettle connector is a Kettle transformation step that is designed to load data into an ICE table. Please note that this release is for Linux distributions of ICE only.
Talend / Jaspersoft Connector (VERSION 2.4) Updated April 17, 2009
The ICE Talend / Jaspersoft connector is a component that is designed to load data into an ICE table. Please note that this release is for Linux distributions of ICE only.
Truncate Table
truncate.sh.zip Updated May 14, 2009
The SQL function TRUNCATE TABLE tbl_name quickly and efficiently removes all data from a table while retaining all of its other attributes such as indices, constraints and permissions. Infobright does not implement indices nor constraints so those are not issues in this environment. Permissions are associated with a table name and are preserved with this utility.
The MySQL Administrator’s Guide, page 44 of the chapter on Optimization, says:
6.2.18 Speed of DELETE Statements
...
To delete all rows from a table, TRUNCATE TABLE tbl_name is faster than DELETE FROM tbl_name.
However, Infobright implements TRUNCATE TABLE tbl_name with the DELETE operation which does not deliver the normally expected performance, especially on large tables.
Therefore, this script, truncate.sh, performs an effective TRUNCATE TABLE tbl_name by creating an empty copy of the table, dropping the table ( DROP TABLE tbl_name), and then creating a copy of the empty table with the same name (effective the same as ALTER TABLE old_tbl_name RENAME new_tbl_name, which is not yet implemented in Infobright – see rename_table.sh script on the Contributed Software page). This is quick and efficient and works exactly as would be expected via the SQL API – except from the operating system command line.
You can save the script under any name you wish but we suggest truncate.sh. The synax/usage is as follows:
Usage:
where script name is whatever name you choose to save it as.
Make sure to set the appropriate ownership and permissions for your environment (perhaps, mysql mysql rwxr-xr-x (755)) and that the script is either in the user’s path or is executed with the explicit path to where it was saved.
Truncate_Table (Stored Procedure) July 30, 2009
The SQL function TRUNCATE TABLE tbl_name quickly and efficiently removes all data from a table while retaining all of its other attributes such as indices, constraints and permissions. Infobright does not implement indices nor constraints so those are not issues in this environment. Permissions are associated with a table name and are preserved with this utility.
The MySQL Administrator’s Guide, page 44 of the chapter on Optimization, says:
6.2.18 Speed of DELETE Statements
...
To delete all rows from a table, TRUNCATE TABLE tbl_name is faster than DELETE FROM tbl_name.
However, Infobright implements TRUNCATE TABLE tbl_name with the DELETE operation which does not deliver the normally expected performance, especially on large tables.
Therefore, this stored procedure, truncate_table(), performs an effective TRUNCATE TABLE tbl_name by creating an empty TEMP TABLE copy of the table, dropping the table (DROP TABLE tbl_name), and then creating a copy of the empty TEMP TABLE with the same name (effectively the same as ALTER TABLE old_tbl_name RENAME new_tbl_name, which is not yet implemented in Infobright – see rename_table.sh script on the Contributed Software page). This is quick and efficient and works exactly as would be expected via the SQL API – except from the operating system command line.
You must execute the SQL file truncate_sp.sql in any database where you wish to use this stored procedure. The synax/usage is as follows:
Usage:
mysql> USE
mysql> CALL truncate_table(‘
’);
Deploying ICE on the Amazon Cloud
This document details how to install ICE within the Amazon Elastic Compute Cloud (EC2.)
RECENT COMMENTS
[#8612] frankb comments on Commanding Line
[#8609] Jimmy comments on December in India and Granular Computing
[#8608] Jimmy comments on Intrusion Detection – the case of Log Analytics (inspired by one of keynotes at SoCPaR 2009)
[#8607] Jimmy comments on Social Networks, Ubiquitous DBMS, and Future Generation IT
[#8604] Jimmy comments on New White Paper for PHP Developers
[#8601] Jimmy comments on Data Warehouse Database Conversions
[#8595] eonarts comments on Enhanced logging
[#8593] David Lutz comments on Enhanced logging
[#8592] eonarts comments on Enhanced logging
[#8591] msemsel comments on Top Five 1/2 Things You Should Know About Columnar Databases