vectorwise — Too much information

The Data Day, A few days: March 20-22 2013

March 22nd, 2013 — Data management

MongoDB goes Enterprise. Riak CS goes open source. And more.

For 451 Research clients: 10gen accelerates NoSQL commercial plans with MongoDB Enterprise bit.ly/ZJoXhl

— Matt Aslett (@maslett) March 20, 2013

For 451 Research clients: JethroData raises funding to develop Hadoop-based analytic database bit.ly/11j5wgF

— Matt Aslett (@maslett) March 21, 2013

For 451 clients: The rise of the ‘predictive business’ – a machine-learning future for analytics M&A? bit.ly/11j5BRe By Krishna Roy

— Matt Aslett (@maslett) March 21, 2013

Concurrent Closes $4 Million in Series A Funding, Appoints Gary Nakamura as CEO mwne.ws/Yr669s

— Matt Aslett (@maslett) March 20, 2013

Riak CS – simple, available cloud storage built on Riak – is now open source. basho.com/riak-cs-is-now…

— Basho Technologies (@basho) March 20, 2013

Cloudera and T-Systems announce strategic partnership to deliver cloud-based data analytics based on Hadoop. bit.ly/16IZGck

— Matt Aslett (@maslett) March 20, 2013

Actian announced the launch of Vectorwise 3.0 analytic database with Hadoop integration. bit.ly/ZRnvZ0

— Matt Aslett (@maslett) March 19, 2013

Jaspersoft added more than 400 new customer deals in 2012. bit.ly/Y3lCfa

— Matt Aslett (@maslett) March 22, 2013

And that’s the data day, today.


The Data Day, Today: November 14 2012

November 14th, 2012 — Data management

Funding for Continuuity and 10gen. Wibi Data launches the Kiji. And more.

For 451 Research clients: IxReveal seeks funding round, highlights uReveal brand and data-harmonization use case bit.ly/PTw7P5

— Matt Aslett (@maslett) November 14, 2012

For 451 clients: Datawatch details semi-structured data analysis strategy and roadmap following Monarch buy bit.ly/PTwclT Krishna Roy

— Matt Aslett (@maslett) November 14, 2012

10gen Announces Strategic Investment from @intelcapital and Red Hat soc.ai/2o9 @redhatnews #MongoDB #NoSQL #Database #Intel

— 10gen (@10gen) November 14, 2012

Continuuity raises $10M Series A round to ignite Big Data app development within the #Hadoop ecosystem bit.ly/Qd0kKb

— Continuuity (@Continuuity) November 14, 2012

SAP positions HANA for transaction, analytics, text and predictive processing. prn.to/T2Z4Zu

— Matt Aslett (@maslett) November 14, 2012

Wibi Data launches the Kiji Project: An open source framework for building big data apps with Apache HBase bit.ly/W5VZKi

— Matt Aslett (@maslett) November 14, 2012

Hadapt and MapR partner to enable Hadapt’s Adaptive Analytical Platform to use MapR’s Distribution for Hadoop. bit.ly/PTyzVE

— Matt Aslett (@maslett) November 14, 2012

NuoDB launches release candidate, pricing and licensing for forthcoming elastic database for the cloud. bit.ly/W5Whkl

— Matt Aslett (@maslett) November 14, 2012

Actian positions Vectorwise for large data warehouse environments via OEM agreement with ScaleMP. bit.ly/W7Gmxa

— Matt Aslett (@maslett) November 14, 2012

Socrata plans open source reference implementation of its open data platform. mwne.ws/PTwnO2

— Matt Aslett (@maslett) November 14, 2012

And that’s the Data Day, today.


The Data Day, Today: Feb 8 2012

February 8th, 2012 — Data management

SAP targets HANA at SMEs. WibiData raises $5m. Zimory acquires Sones devs. And more.

An occasional series of data-related news, views and links posts on Too Much Information. You can also follow the series @thedataday.

* SAP to Arm Small and Midsize Enterprises With Real-Time Analytics Powered by SAP HANA

* Hadoop startup WibiData raises $5M to power web analytics

* Zimory Acquires Database Development Team from Sones

* Oracle Announces Availability of Oracle Advanced Analytics for Big Data

* Kalido Fuels Growth with New Customers, Market Leading Data Governance Capabilities in 2011

* Xeround Announces Free Version of Popular Cloud Database

* Hypertable Inc. Announces New Products and Services for Next Generation Hadoop NoSQL Database Deployments

* Cloudera Connector for Tableau Has Been Released

* Information Builders Launches WebFOCUS Hyperstage to Speed Performance and Delivery of Business Intelligence

* Actian Releases Vectorwise Workgroup Edition, Claims Best in Affordable Big Data Analytics to Mid-Market

* 10gen and Carahsoft Partner to Bring Leading NoSQL Solution to Government Sector

* MySQL progress in a year

* Endeca CEO: We wanted IPO, but Oracle acquisition gave peace of mind

* For 451 Research clients

# Armed with fresh funding, Tidemark looks to churn up performance management waters Impact Report

# Cloudant seizes opportunity for greater involvement with CouchDB Market Development report

# Xeround details cloud database pricing, launches free option Market Development report

And that’s the Data Day, today.


The future of the database is… plaid?

September 2nd, 2009 — Data management

Oracle has introduced a hybrid column-oriented storage option for Exadata with the release of Oracle Database 11g Release 2.

Ever since Mike Stonebraker and fellow researchers at MIT, Brandeis University, the University of Massachusetts and Brown University presented (PDF) C-Store, a column-oriented database at the 31st VLDB Conference, in 2005, the database industry has debated the relative merits of row- and column-store databases.

While row-based databases have dominated the operational database market, column-based databases have made inroads in the analytic database space, with Vertica (based on C-Store) as well as Sybase, Calpont, Infobright, Kickfire, Paraccel and SenSage pushing column-based data warehousing products based on the argument that column-based storage favors the read performance required for query processing.

The debate took a fresh twist recently as former SAP chief executive Hasso Plattner presented a paper (PDF) calling for the use of in-memory column-based storage databases for both analytical and transaction processing.

As interesting as that is in theory, of more immediate interest is the fact that Oracle – so often the target of column-based database vendors – has introduced a hybrid column-oriented storage option with the release of Oracle Database 11g Release 2.

As Curt Monash recently noted, there are a couple of approaches emerging to hybrid row/column stores.

Oracle’s approach, as revealed in a white paper (PDF), has been to add new hybrid columnar compression capabilities in its Exadata Storage servers.

This approach maintains row-based storage in the Oracle Database itself while enabling the use of column storage to improve compression rates in Exadata. Oracle claims a compression ratio of up to 10x without any loss of query performance, and up to 40x for historical data.

As Oracle’s Kevin Closson explains in a blog post: “The technology, available only with Exadata storage, is called Hybrid Columnar Compression. The word hybrid is important. Rows are still used. They are stored in an object called a Compression Unit. Compression Units can span multiple blocks. Like values are stored in the compression unit with metadata that maps back to the rows.”
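To make the idea of a compression unit more concrete, here is a minimal Python sketch. It is not Oracle's on-disk format: the function names and the choice of run-length encoding are assumptions made purely for illustration. It only shows the general shape Closson describes, in which a batch of rows is kept together, like values within the batch are stored next to each other, and enough metadata is retained to map them back to rows.

```python
from itertools import groupby

def build_compression_unit(rows):
    """Pivot a batch of rows into per-column lists of (value, run_length).

    Hypothetical helper: real products use their own encodings.
    """
    columns = zip(*rows)  # column-major view of the row batch
    return [[(value, len(list(run))) for value, run in groupby(col)]
            for col in columns]

def read_rows(unit):
    """Expand each column's runs and zip them back into the original rows."""
    expanded = [[value for value, count in col for _ in range(count)]
                for col in unit]
    return list(zip(*expanded))

rows = [
    ("2009-09-01", "EMEA", "open"),
    ("2009-09-01", "EMEA", "open"),
    ("2009-09-01", "APAC", "open"),
    ("2009-09-02", "APAC", "closed"),
]

unit = build_compression_unit(rows)
restored = read_rows(unit)  # same rows, rebuilt from the columnar runs
```

Because repeated values sit side by side inside the unit, the four date entries collapse to two runs, while the rows themselves remain recoverable.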

Vertica took a different hybrid approach with the release of Vertica Database 3.5, which introduced FlexStore, a new version of the column-store engine, including the ability to group a small number of columns or rows together to reduce input/output bottlenecks. Grouping can be done automatically based on data size (grouped rows can use up to 1MB) to improve query performance of whole rows, or specified based on the nature of the column data (for example, bid, ask and date columns for a financial application) to improve query performance.

Likewise, the Ingres VectorWise project (previously mentioned here) will create a new storage engine for the Ingres Database, positioned as a platform for data-warehouse and analytic workloads, that makes use of vectorized execution, which sees multiple instructions processed simultaneously. The VectorWise architecture makes use of Partition Attributes Across (PAX), which similarly groups multiple rows into blocks to improve processing, while storing the data in columns.
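The PAX idea can be sketched in a few lines of Python. This is an illustrative toy, not the VectorWise storage format: the function names, the page size and the sample bid/ask data are all assumptions. Rows are grouped into pages, and within each page the values of each column are stored together.

```python
def to_pax_pages(rows, rows_per_page):
    """Group rows into pages; inside each page, store one list per column."""
    pages = []
    for start in range(0, len(rows), rows_per_page):
        chunk = rows[start:start + rows_per_page]
        pages.append([list(col) for col in zip(*chunk)])
    return pages

def scan_column(pages, col_index):
    """Read a single column, touching only that column's list in each page."""
    return [value for page in pages for value in page[col_index]]

def fetch_row(pages, row_index, rows_per_page):
    """Rebuild one whole row; all of its values live in the same page."""
    page = pages[row_index // rows_per_page]
    offset = row_index % rows_per_page
    return tuple(column[offset] for column in page)

# Hypothetical (trade_id, bid, ask) rows.
trades = [(1, 10.0, 10.5), (2, 10.1, 10.6), (3, 10.2, 10.7),
          (4, 10.3, 10.8), (5, 10.4, 10.9)]

pages = to_pax_pages(trades, rows_per_page=2)
bids = scan_column(pages, 1)                   # column scan
fourth = fetch_row(pages, 3, rows_per_page=2)  # whole-row lookup
```

The point of the layout is that a column scan reads contiguous values within each page, while a whole-row lookup never has to leave a single page.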

Update – Daniel Abadi has provided an overview of the different approaches to hybrid row-column architectures and suggests something I had suspected, that Oracle is also using the PAX approach, except outside the core database, while Vertica is using what he calls a fine-grained hybrid approach. He also speculates that Microsoft may end up going the third route, fractured mirrors – Update

Perhaps the future of the database may not be row- or column-based, but plaid.


Lowering barriers to data warehousing adoption with open source

August 6th, 2009 — Data management

Since the start of this year I’ve been covering data warehousing as part of The 451 Group’s information management practice, adding to my ongoing coverage of databases, data caching, and CEP, and contributing to the CAOS research practice.

I’ve covered data warehousing before but taking a fresh look at this space in recent months it’s been fascinating to see the variety of technologies and strategies that vendors are applying to the data warehousing problem. It’s also been interesting to compare the role that open source has played in the data warehousing market, compared to the database market.

I’m preparing a major report on the data warehousing sector, for publication in the next couple of months. In preparation for that I’ve published a rough outline of the role open source has played in the sector over on our CAOS Theory blog. Any comments or corrections much appreciated.


Ingres launches project for in-memory, columnar, vectorized database engine

July 29th, 2009 — Data management

Interesting news from Ingres today that it is teaming up with VectorWise, a database engine spin-off from Amsterdam’s Centrum Wiskunde & Informatica (CWI) scientific research establishment, to collaborate on a new database kernel project.

The Ingres VectorWise project will create a new open source storage engine for the Ingres Database that will better enable it to be positioned as a platform for data warehouse and analytic workloads, although Ingres does not have detailed plans for the productization of the technology at this stage. The starting point for the project is the theory that modern multi-core parallel processors now look like, and behave like, symmetric multiprocessing (SMP) servers, and that on-chip memory is taking the place of RAM, but that database software has not been updated to take advantage of processor developments.

In order to do so, Ingres and VectorWise will be collaborating on vectorized execution, which sees multiple instructions processed simultaneously, and in-cache processing, through which the execution occurs within the CPU cache and main memory is effectively treated like disk. The result, according to Ingres, is to reduce the I/O bottleneck for query processing. Additionally, the VectorWise engine enables on-the-fly decompression and operation handling in memory and includes a compressed column store.
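As a rough illustration of what vector-at-a-time processing looks like, here is a small Python sketch. It is not VectorWise or X100 code: the operator names and the batch size of 1,024 values are assumptions. Python also cannot show the CPU cache or instruction-level effects that Ingres describes, so the sketch only shows the control flow, in which each operator hands a small batch of column values to the next instead of a single value.

```python
VECTOR_SIZE = 1024  # assumed batch size, chosen so a batch could stay in cache

def scan(column, vector_size=VECTOR_SIZE):
    """Yield the column in fixed-size vectors rather than value by value."""
    for start in range(0, len(column), vector_size):
        yield column[start:start + vector_size]

def select_greater(vectors, threshold):
    """Filter each vector in one tight loop; emit the surviving values."""
    for vec in vectors:
        yield [value for value in vec if value > threshold]

def total(vectors):
    """Aggregate vector by vector."""
    return sum(sum(vec) for vec in vectors)

prices = list(range(10_000))  # hypothetical column of values

# Roughly: SELECT SUM(price) WHERE price > 9000
result = total(select_greater(scan(prices), 9_000))
```

Each operator is called once per batch rather than once per value, which is what lets the per-call overhead be spread across many values in a real engine.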

It is claimed that the Ingres VectorWise project will deliver 10x performance increases over the current Ingres database.

VectorWise spun off from CWI in 2008 to commercialize the X100 system previously created by its database architecture research group. Development of X100, now also known as VectorWise, has been led by respected research scientists Peter Boncz and Marcin Zukowski.

Ingres maintains that by working with the CWI research scientists it has proven that their theories are technically feasible in a commercial product. Bringing such a commercial product to general availability is the next step, and history has proven that can be easier said than done. With that caveat, we are impressed with the vision and ambition that Ingres is demonstrating.

