2015-08-03

Notes from the MITx Big Data course: new DBMSs

Trends in database management systems: in-memory, column-oriented formats.

Basic knowledge:
ACID: atomicity, consistency, isolation, durability

1. Column store
A column store can be 50-100 times faster than a row store.
The reasons are:
1) a query usually reads only a few columns, but a row store must fetch every column of each row, most of which you don't want;
2) columns are easy to encode and compress;
3) column stores don't store a header for each record;
4) column-wise execution is faster because it enables vector processing.
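A toy sketch (illustrative only, not any real DBMS's layout) of reasons 1 and 2 above: reading one column touches far less data than reading whole rows, and a repetitive column compresses well with run-length encoding.

```python
rows = [  # row store: each record keeps every field together
    ("alice", "US", 30),
    ("bob",   "US", 25),
    ("carol", "US", 41),
    ("dave",  "UK", 25),
]

# Column store: one contiguous array per attribute.
names, countries, ages = (list(col) for col in zip(*rows))

def rle_encode(column):
    """Run-length encode a column: [(value, run_length), ...]."""
    runs = []
    for v in column:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [tuple(r) for r in runs]

# A query like SELECT avg(age) scans only the 'age' column ...
avg_age = sum(ages) / len(ages)
# ... and the 'country' column collapses to 2 runs instead of 4 values.
print(rle_encode(countries))  # [('US', 3), ('UK', 1)]
```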

Native column store systems:
HP/Vertica, SAP/Hana, ParAccel (Amazon), SAP/Sybase IQ
Native row store systems:
Microsoft SQL Server, Oracle (the latest Oracle Database 12c offers a column store option), DB2, Netezza
In transition:
Teradata, Aster Data, Greenplum

Example with Vertica:
Vertica reads records into main memory and keeps them in row-store form; once changes accumulate, it rotates the rows into columns, encodes them, and writes them to disk.
ParAccel and Hana work similarly.
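A hypothetical sketch of that write path: buffer recent writes in row form, then rotate the buffer into columns before flushing. The names (`row_buffer`, `flush`) are illustrative, not Vertica's actual API.

```python
row_buffer = []          # in-memory, row-oriented staging area

def insert(record):
    row_buffer.append(record)

def flush():
    """Rotate buffered rows into columns and hand each column off."""
    columns = [list(col) for col in zip(*row_buffer)]
    row_buffer.clear()
    return columns       # a real system would encode + write each to disk

insert((1, "open"))
insert((2, "open"))
insert((3, "closed"))
print(flush())  # [[1, 2, 3], ['open', 'open', 'closed']]
```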

2. Ways to reduce computation cost (time)

1) To reduce time for isolation:
locking: isolates rows being edited from other transactions
latching: protects shared in-memory data structures used by one thread from other threads
Newer ways to provide isolation without heavy locking:
concurrency control:
  MVCC (NuoDB, Hekaton)
  Timestamp ordering (H-Store, VoltDB)
  Lightweight combinations of timestamp ordering and dynamic locking (Calvin, DORA)
Alternatively, use single-threaded execution, which needs no locking at all.
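A minimal sketch of timestamp-ordering concurrency control, the second approach listed above (illustrative, not any real engine): each transaction gets a timestamp at start, and an operation that arrives "out of order" aborts instead of blocking on a lock.

```python
class Item:
    def __init__(self, value):
        self.value = value
        self.read_ts = 0    # timestamp of the youngest reader so far
        self.write_ts = 0   # timestamp of the youngest writer so far

def ts_read(item, ts):
    if ts < item.write_ts:          # a younger txn already overwrote it
        return None                 # abort: reject the late read
    item.read_ts = max(item.read_ts, ts)
    return item.value

def ts_write(item, ts, value):
    if ts < item.read_ts or ts < item.write_ts:
        return False                # abort: a younger txn got here first
    item.write_ts = ts
    item.value = value
    return True

x = Item(10)
assert ts_write(x, ts=2, value=20)      # txn 2 writes
assert ts_read(x, ts=3) == 20           # txn 3 reads
assert not ts_write(x, ts=1, value=99)  # older txn 1 must abort
```

No thread ever waits: a conflicting operation simply fails and its transaction restarts with a fresh timestamp, which is cheap when conflicts are rare.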

2) To reduce memory loading time:

Transaction-processing databases are much smaller than data warehouses, so they can fit entirely in main memory (the buffer pool).
1 TB of memory is already on the market (~$30k).
Anti-caching (keeping the hot working set in memory and evicting cold records to disk, the inverse of traditional buffer-pool caching) is expected to become popular.

3) To reduce logging time:
i. do command logging (logical logging) instead of logging the data changes themselves (physical logging);
ii. with replicas in big-data systems, there is no need for logging and recovery after ordinary failures: you just fail over to a replica.
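A sketch of the difference in point i (all names here are illustrative): physical logging records the bytes that changed, while command logging records only the operation and replays it deterministically during recovery, so each log record is tiny.

```python
db = {"balance": 100}
command_log = []        # records like ("deposit", 50): the command, not the pages

def deposit(amount):
    command_log.append(("deposit", amount))   # log the logical operation
    db["balance"] += amount                   # then apply it

deposit(50)
deposit(25)

# Recovery: rebuild state by re-executing the logged commands in order.
recovered = {"balance": 100}
for op, amount in command_log:
    if op == "deposit":
        recovered["balance"] += amount

print(recovered["balance"])  # 175
```

The catch is that replay must be deterministic, which is why command logging pairs naturally with the single-threaded, timestamp-ordered engines mentioned above.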

Recommended new OLTP (online transaction processing) systems: MS Hekaton, SAP Hana, VoltDB, MemSQL, SQLFire, ... They are claimed to be 100 times faster than the old systems.
Notes from outside the course: Oracle 12c has dual row- and column-based formats, and both can handle in-memory OLTP. The column format lives only in memory, while the row format can be persisted to disk.

3. NoSQL (MongoDB, Cassandra, ...): comments from Professor Mike Stonebraker

1) NoSQL advocates giving up ACID, which is not good. In fact, NoSQL systems are now working their way back toward ACID.
2) NoSQL advocates 'schema later', which is possible in some cases but not really good practice.

4. Array database systems (such as SciDB)
Good for complex analyses that require matrix computation; adoption depends on how much the market demands such analyses.
One example is data mining that computes the covariance between pairs of stocks' prices over time.
SciDB provides an array-oriented SQL.
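The covariance example above is exactly the kind of array computation an array DBMS targets: prices held as a 2-D array (stock x time), with pairwise covariance over the time axis. A plain-Python sketch with toy data; an array DBMS would run this over large disk-resident arrays in parallel.

```python
prices = {                       # toy daily closing prices
    "AAA": [10.0, 11.0, 12.0, 13.0],
    "BBB": [20.0, 19.0, 18.0, 17.0],
}

def covariance(xs, ys):
    """Population covariance of two equal-length price series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n

cov = covariance(prices["AAA"], prices["BBB"])
print(cov)  # -1.25: the two stocks move in opposite directions
```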

5. Graph database (Neo4J, )
Also good for OLTP.

6. Hadoop, Hive, Pig
MapReduce is only useful for highly parallel computations.
Hive (from Facebook) and Pig (from Yahoo) are layers on top that provide SQL-like queries and manipulations; they are more likely to stay active in the market.
HDFS, the bottom-level file system, is dying.
Facts: Facebook runs ~2500 MapReduce nodes.
A typical SQL aggregate runs about 100x slower on Hadoop than on a column-store DBMS.
A matrix-computation analysis runs about 100x slower on Hadoop than on an array-based DBMS.
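A minimal map/shuffle/reduce word count, the style of embarrassingly parallel job MapReduce suits. For a simple SQL aggregate, a column store scanning one encoded column skips all of this materializing and regrouping, which is much of where the 100x gap comes from. Function names here are illustrative, not Hadoop's API.

```python
from collections import defaultdict

docs = ["big data big systems", "big data"]

def map_phase(doc):
    return [(word, 1) for word in doc.split()]

def shuffle(pairs):
    """Group intermediate (key, value) pairs by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    return {key: sum(values) for key, values in groups.items()}

pairs = [p for doc in docs for p in map_phase(doc)]
counts = reduce_phase(shuffle(pairs))
print(counts)  # {'big': 3, 'data': 2, 'systems': 1}
```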

Cloudera Impala (an execution engine that implements the Hive interface) is becoming more like SAP Hana or Vertica.

Hortonworks and Facebook are doing the same thing: developing Hive-interfaced data warehousing systems that bypass MapReduce.
Other data warehousing systems may well add Hive support too.

Notes from others:
Google Dremel also runs SQL, but on its own execution engine rather than over MapReduce; systems like it usually handle simple queries and don't need per-query fault tolerance.
Google Tenzing, by contrast, is designed for complex queries on top of MapReduce clusters.
Apache Spark runs on the same clusters as MapReduce but keeps working data in memory.
Shark implements an in-memory, column-oriented database on top of Spark (Hive on Spark), so SQL can be used with Spark.
These new systems can partition data in memory, which lets transactions run on each partition independently and in parallel, avoiding concurrency control.
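A sketch of that in-memory hash partitioning (illustrative names, not any real system): each key maps to exactly one partition, so a single-partition transaction can run on its partition's thread without coordinating with the others.

```python
N_PARTITIONS = 4
partitions = [dict() for _ in range(N_PARTITIONS)]

def partition_of(key):
    return hash(key) % N_PARTITIONS

def put(key, value):
    # Touches exactly one partition, so transactions on different
    # partitions never conflict and need no concurrency control.
    partitions[partition_of(key)][key] = value

def get(key):
    return partitions[partition_of(key)].get(key)

put("user:1", {"name": "alice"})
put("user:2", {"name": "bob"})
assert get("user:1")["name"] == "alice"
```

Transactions that span partitions still need coordination, which is why these systems work best when the workload can be partitioned cleanly.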

Conclusion:
A. The data warehouse market is moving to column stores;
B. OLTP will be a main-memory market;
C. Array-based and graph DBMSs may gain traction;
D. NoSQL is currently a poor choice if you need ACID.
