Column Store

openGauss supports hybrid row-column store. Row store stores tables to disk partitions by row, and column store stores tables to disk partitions by column.

Each storage model applies to specific scenarios. Select an appropriate model when creating a table. Generally, openGauss is used for databases in online transaction processing (OLTP) scenarios. By default, row store is used. Column store is used only in online analytical processing (OLAP) scenarios where complex queries are performed and the data volume is large. By default, a row-store table is created. For details about differences between row store and column store, see Figure 1.

Figure 1 Differences between row store and column store

In the preceding figure, the upper left part is a row-store table, and the upper right part shows how the row-store table is stored on a disk; the lower left part is a column-store table, and the lower right part shows how the column-store table is stored on a disk.

Both row-store and column-store models have benefits and drawbacks.

Storage Model

Benefit

Drawback

Row store

Record data is stored together. Data can be easily inserted and updated.

All the columns of a record are read after the SELECT statement is executed even if only certain columns are required.

Column store

  • Only the columns involved in a query are read.
  • Projection is efficient.
  • Any column can serve as an index.
  • The selected columns need to be reconstructed after the SELECT statement is executed.
  • Data cannot be easily inserted or updated.

Generally, if a table contains many columns (called a wide table) and its query involves only a few columns, column store is recommended. Row store is recommended if a table contains only a few columns and a query involves most of the columns.

Storage Model

Application Scenario

Row store

  • Point queries (simple index-based queries that only return a few records)
  • Scenarios requiring frequent addition, deletion, and modification
  • Frequent updates and few insertions.

Column store

  • Statistical analysis queries (requiring a large number of association and grouping operations)
  • Ad hoc queries (using uncertain query conditions and unable to utilize indexes to scan row-store tables)
  • A large amount of data is inserted at a time.
  • If a table contains a large number of columns, you are advised to use a column-store table.
  • If only a small number of columns (less than 50% of the total) is queried each time, use a column-store table.

Syntax

CREATE TABLE table_name 
    (column_name data_type [, ... ])
    [ WITH ( ORIENTATION  = value) ];

Parameter Description

  • table_name

    Specifies the name of the table to be created.

  • column_name

    Specifies the name of a column to be created in the new table.

  • data_type

    Specifies the data type of the column.

  • ORIENTATION

    Specifies the storage mode (row-store, column-store, or ORC) of table data. This parameter cannot be modified once it is set.

    Value range:

    • ROW indicates that table data is stored in rows.

      ROW applies to OLTP services and scenarios with a large number of point queries or addition/deletion operations.

    • COLUMN indicates that the data is stored in columns.

      COLUMN applies to the data warehouse service, which has a large amount of aggregation computing, and involves a few column operations.

Example

If ORIENTATION is not specified, the table is a row-store table by default. For example:

openGauss=# CREATE TABLE customer_test1
(
  state_ID   CHAR(2),
  state_NAME VARCHAR2(40),
  area_ID    NUMBER
);

-- Delete the table.
openGauss=# DROP TABLE customer_test1;

When creating a column-store table, you need to specify the ORIENTATION parameter. For example:

openGauss=# CREATE TABLE customer_test2
(
  state_ID   CHAR(2),
  state_NAME VARCHAR2(40),
  area_ID    NUMBER
)
WITH (ORIENTATION = COLUMN);

-- Delete the table.
openGauss=# DROP TABLE customer_test2;
Feedback
编组 3备份
    openGauss 2024-05-07 00:46:52
    cancel