Having been associated with Exadata in one way or another since its release (see: http://technofunctionalconsulting.blogspot.in/2008/09/exadata-database-processing-moves-into.html ), I have tried to consolidate the key points about Exadata for a session aimed at a technical audience.
Related Posts:
http://technofunctionalconsulting.blogspot.in/2009/10/exadata-v2-worlds-first-oltp-database.html
http://technofunctionalconsulting.blogspot.in/2010/02/hybrid-columnar-compression-hcc.html
http://technofunctionalconsulting.blogspot.in/2012/06/exadata-performance-features.html
Thursday, April 25, 2013
Friday, April 5, 2013
Accelerating Analytics using “Blink” aka “BLU acceleration”
This Friday marks the completion of two years of my second innings with TCS's Technology Excellence Group, and it is time for a technical blog post.
This week IBM announced DB2 10.5 with the new “BLU acceleration”, which claims a 10 to 20 times performance improvement out of the box. (Ref: http://ibmdatamag.com/2013/04/super-analytics-super-easy/ )
This post gives a brief summary of the Blink project, which brought this acceleration to analytic queries.
The Blink technology has two primary components that achieve this acceleration of analytic processing:
1. The compression at the load time
2. The query processing
Compression & Storage:
At load time each column is compressed using “frequency partitioning”, an order-preserving, fixed-length dictionary encoding method. Each partition of the column has its own dictionary, which allows it to use shorter column codes. Because the encoding preserves order, comparison operators and predicates can be applied directly to the encoded values without uncompressing them.
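To make the idea concrete, here is a minimal Python sketch of an order-preserving dictionary encoding with two frequency partitions. It is only an illustration under simplifying assumptions, not Blink's actual implementation: the two-way "hot"/"cold" split, the `hot_size` parameter and all function names are hypothetical.

```python
from bisect import bisect_right
from collections import Counter

def build_partitions(values, hot_size=2):
    """Split distinct values into a frequent ('hot') and a rare ('cold') partition.
    Each partition gets its own sorted dictionary, so codes preserve value order.
    The two-way split is an assumption made for illustration only."""
    by_freq = [v for v, _ in Counter(values).most_common()]
    hot = sorted(by_freq[:hot_size])
    cold = sorted(by_freq[hot_size:])
    return [hot, cold]

def code_width(dictionary):
    """Bits needed for a fixed-length code within one partition."""
    return max(1, (len(dictionary) - 1).bit_length())

def encode(values, partitions):
    """Return one list of integer codes per partition."""
    lookup = [{v: i for i, v in enumerate(d)} for d in partitions]
    encoded = [[] for _ in partitions]
    for v in values:
        for p, table in enumerate(lookup):
            if v in table:
                encoded[p].append(table[v])
                break
    return encoded

def count_le(encoded, partitions, literal):
    """Evaluate `value <= literal` by comparing codes only, never decoding rows."""
    total = 0
    for codes, dictionary in zip(encoded, partitions):
        # Codes below `bound` correspond to values <= literal in this partition.
        bound = bisect_right(dictionary, literal)
        total += sum(1 for c in codes if c < bound)
    return total

countries = ["IN", "US", "IN", "US", "IN", "DE", "FR", "US", "JP"]
parts = build_partitions(countries)
enc = encode(countries, parts)
print([code_width(d) for d in parts], count_le(enc, parts, "IN"))
```

The literal is translated once per partition into a code bound, and the scan then compares small integers; the frequent values also end up with the shorter codes.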
Rows are packed by placing the bit-aligned column codes into byte-aligned banks of 8, 16, 32 or 64 bits for efficient ALU operations. This bank-major storage is combined to form blocks, which are then loaded into memory (or storage). The bank-major layout exploits the SIMD (Single Instruction, Multiple Data) capability of IBM's modern POWER processor chips.
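The following Python sketch shows why packing fixed-width codes into a bank helps: one word-wide bit operation can test every code in the bank at once. It uses a well-known "SIMD within a register" zero-field test on plain integers. The layout (codes packed from the low bits upward, no cross-bank rows) and the function names are assumptions for illustration, not Blink's real storage format.

```python
def broadcast(value, width, count):
    """Repeat a `width`-bit value in `count` adjacent fields of one word."""
    word = 0
    for slot in range(count):
        word |= value << (slot * width)
    return word

def pack(codes, width, bank_bits=64):
    """Pack fixed-width codes into banks of `bank_bits` bits each."""
    per_bank = bank_bits // width
    banks = []
    for start in range(0, len(codes), per_bank):
        word = 0
        for slot, code in enumerate(codes[start:start + per_bank]):
            word |= code << (slot * width)
        banks.append(word)
    return banks, per_bank

def scan_equal(codes, literal_code, width, bank_bits=64):
    """Return row numbers whose code equals `literal_code`.
    Each bank is tested with a handful of word-wide operations."""
    banks, per_bank = pack(codes, width, bank_bits)
    high = broadcast(1 << (width - 1), width, per_bank)       # top bit of each field
    low = broadcast((1 << (width - 1)) - 1, width, per_bank)  # remaining bits
    needle = broadcast(literal_code, width, per_bank)
    rows = []
    for b, bank in enumerate(banks):
        diff = bank ^ needle                      # a field is 0 where codes match
        nonzero = ((diff & low) + low) | diff     # top bit set where field != 0
        hits = ~nonzero & high                    # top bit set where field == 0
        for slot in range(per_bank):
            row = b * per_bank + slot
            # Ignore padding slots in the last, partially filled bank.
            if row < len(codes) and (hits >> (slot * width + width - 1)) & 1:
                rows.append(row)
    return rows

print(scan_equal([7, 19, 7, 0, 31, 7, 0], literal_code=7, width=5, bank_bits=16))
```

On real hardware the same pattern is applied to 128-bit vector registers, so a single instruction covers many rows.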
Query Processing:
Blink has no indexes, no materialized views and no run-time query optimizer, which keeps it simple. However, each query must be compiled to handle the different encoded column lengths in each horizontal partition of the data.
Each SQL query is split into a series of single-table queries (STQs), each of which performs a scan with filtering. All joins are hash joins. On a typical snowflake schema these scans proceed in an outside-in fashion, creating intermediate hybrid STQs.
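As a rough illustration of the outside-in flow, the Python sketch below scans the outermost dimension first, keeps the surviving keys in a hash set, and uses that set as the filter for the next table inward until the fact table is reached. The tables (`regions`, `stores`, `sales`), their columns and the helper `dimension_stq` are all hypothetical and not taken from the Blink paper.

```python
def dimension_stq(rows, key, predicate):
    """One single-table query: scan, filter, and return a hash set of surviving keys."""
    return {row[key] for row in rows if predicate(row)}

# Hypothetical snowflake: sales (fact) -> stores -> regions (outermost dimension).
regions = [
    {"region_id": 1, "name": "APAC"},
    {"region_id": 2, "name": "EMEA"},
]
stores = [
    {"store_id": 10, "region_id": 1},
    {"store_id": 11, "region_id": 2},
    {"store_id": 12, "region_id": 1},
]
sales = [
    {"store_id": 10, "amount": 100},
    {"store_id": 11, "amount": 250},
    {"store_id": 12, "amount": 40},
    {"store_id": 10, "amount": 60},
]

# Outside-in: each scan's result becomes a hash-probe filter for the next scan.
region_keys = dimension_stq(regions, "region_id", lambda r: r["name"] == "APAC")
store_keys = dimension_stq(stores, "store_id", lambda r: r["region_id"] in region_keys)
apac_total = sum(r["amount"] for r in sales if r["store_id"] in store_keys)
print(apac_total)
```

Every step is a plain scan with a filter, and each join reduces to a hash lookup, which is why no index or join-order optimizer is needed in this scheme.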
Blink executes these STQs by assigning blocks to threads, each running on a processor core. As most modern ALUs can operate on 128-bit registers, the operations are bit operations that exploit SIMD, which makes the processing fast.
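The block-per-thread pattern can be sketched in Python as follows. This only shows how work is divided and how partial results are merged; Python threads do not give true per-core parallelism for this kind of loop, and the block layout (lists of `(code, amount)` pairs) and function names are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor

def scan_block(block, threshold):
    """Scan one block: filter on the encoded column and aggregate locally."""
    count = 0
    total = 0
    for code, amount in block:
        if code <= threshold:
            count += 1
            total += amount
    return count, total

def parallel_scan(blocks, threshold, workers=4):
    """Hand each block to a worker thread, then merge the partial aggregates."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda b: scan_block(b, threshold), blocks))
    return sum(c for c, _ in partials), sum(t for _, t in partials)

blocks = [[(1, 10), (5, 20)], [(2, 30), (9, 40)], [(0, 5)]]
print(parallel_scan(blocks, threshold=2))
```

Because every block is scanned independently, no locking is needed until the final merge of the per-block results.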
For more technical details of the Blink project, refer to http://sites.computer.org/debull/A12mar/blink.pdf
Hopefully this will give “Analytics” a boost and bring some competition to Oracle’s Exa- appliances. Views, comments?
Labels: Analytics, Database, in memory, Performance Tuning, Technology