Supported Data Types
In the big data field, ORC and Parquet are the mainstream file formats for HDFS-based storage, and openGauss supports both. You can import data into HDFS through Hive and store it in ORC or Parquet format, then use openGauss to query and analyze the data in those files. For this to work, the data types supported by the ORC or Parquet file format must match the data types supported by openGauss. The matching relationship is as follows:
Table 1 Data type matching
Hive Table Type When Data Is Imported to an HDFS Foreign Table | ||
---|---|---|
DECIMAL (The maximum precision can reach up to 38 for Hive 0.11.) | ||
NOTICE:
- openGauss HDFS foreign tables support the NULL definition, and Hive data tables support and use the corresponding NULL definition.
- The date and time types of openGauss HDFS foreign tables do not support time zone definitions. Hive does not support time zone definitions either.
- The date type in Hive contains only a date. The date types in openGauss contain both date and time.
- In openGauss, ORC files can be compressed in ZLIB, SNAPPY, LZ4, and NONE mode.
- In openGauss, Parquet files can be compressed in SNAPPY and NONE mode.
- The FLOAT4 type is inherently inexact, and sum operations on it can produce different results in different environments. You are advised to use the DECIMAL type in scenarios that require high precision.
- In Teradata-compatible mode, the HDFS foreign table does not support the DATE type.
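The FLOAT4 note above can be illustrated outside the database. The following is a minimal Python sketch, not openGauss code: it emulates 32-bit float accumulation with the standard `struct` module and compares it with exact decimal accumulation. The helper names (`to_float32`, `sum_float32`, `sum_decimal`) and the sample values are hypothetical and chosen only for illustration. It assumes that one way environments can differ is the order in which rows are added, for example under parallel execution.

```python
import struct
from decimal import Decimal


def to_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit (FLOAT4-sized) value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def sum_float32(values):
    """Accumulate left to right, rounding to 32 bits after every addition."""
    total = 0.0
    for v in values:
        total = to_float32(total + to_float32(v))
    return total


def sum_decimal(values):
    """Accumulate the same inputs exactly using decimal arithmetic."""
    return sum((Decimal(v) for v in values), Decimal(0))


# Hypothetical column values; only the order differs between the two lists.
order_a = ["100000000", "1", "-100000000"]
order_b = ["100000000", "-100000000", "1"]

float_a = sum_float32(float(v) for v in order_a)
float_b = sum_float32(float(v) for v in order_b)
decimal_a = sum_decimal(order_a)
decimal_b = sum_decimal(order_b)

print(float_a, float_b)      # the two float32 sums disagree
print(decimal_a, decimal_b)  # the two decimal sums agree
```

In the first ordering the value 1 is lost because 32-bit floats near 100000000 are spaced 8 apart, so the intermediate sum rounds back to 100000000. The decimal sums are unaffected by ordering, which is why DECIMAL is the safer choice when exact totals matter.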