Netezza: datatype storage CHARACTER v.s. NATIONAL CHARACTER

SUMMARY

There is NO significant storage saving (just 3.7%) when choosing between  NATIONAL CHARACTER (Unicode) and CHARACTER (ASCII) for database table design at Netezza.

I recommend choosing a datatype that pertains to design and scalability of the application.

DETAILS

Compare storage saving using CHARACTER VARYING  instead of NATIONAL CHARACTER VARYING  datatype at Netezza NPS 7.2.1.

NATIONAL CHARACTER
CHARACTER
CREATE TABLE ARCHIVE_TABLE_201405
(
       CMTS_NM     NATIONAL CHARACTER VARYING (30 ),
       INTF_NM     NATIONAL CHARACTER VARYING (30 ),
       MAC_ADDR    NATIONAL CHARACTER VARYING (20 ),
       CM_STA      NATIONAL CHARACTER VARYING (20 ),
       TM_OFST     CHARACTER VARYING (20 ),
       IP_ADDR     NATIONAL CHARACTER VARYING (20 ),
       NODE_KEY           NATIONAL CHARACTER VARYING (50 ),
       BATCH_TS    TIMESTAMP,
       AS_OF_DT    TIMESTAMP,
       DATA_ACQ_ID INTEGER
)
DISTRIBUTE ON (AS_OF_DT, MAC_ADDR);
CREATE TABLE ARCHIVE_TABLE_201405_VARCHAR
(
       CMTS_NM     CHARACTER VARYING (30 ),
       INTF_NM     CHARACTER VARYING (30 ),
       MAC_ADDR    CHARACTER VARYING (20 ),
       CM_STA      CHARACTER VARYING (20 ),
       TM_OFST     CHARACTER VARYING (20 ),
       IP_ADDR     CHARACTER VARYING (20 ),
       NODE_KEY           CHARACTER VARYING (50 ),
       BATCH_TS    TIMESTAMP,
       AS_OF_DT    TIMESTAMP,
       DATA_ACQ_ID INTEGER
)
DISTRIBUTE ON (AS_OF_DT, MAC_ADDR);
ALLOCATED_GB: 38 Gb
USED_GB     : 38 Gb
RECORDS     : 1,369,391,278
ALLOCATED_GB: 36.7 Gb
USED_GB     : 36.7 Gb
RECORDS     : 1,369,391,278


From description below, once could assume that each NVARCHAR (Unicode) character takes 4x larger storage than VARCHAR (ASCII).  The result of the experiment indicates that this is not the case.

VARCHARCHARACTER VARYING, VARCHAR(n), CHARACTER VARYING(n), CHAR VARYING(n)Variable length character string to a maximum length of n. If you do not specify n, the default is an unsized VARCHAR value. There is no blank padding, and the value is stored as entered. The maximum character string size is 64,000.
NVARCHARNATIONAL CHARACTER VARYING, NATIONAL CHAR VARYING, NATIONAL CHAR VARYING(n), NATIONAL CHARACTER VARYING(n), and NVARCHAR(n)Variable length character string to a maximum length of n. If you do not specify n, the default is an unsized NVARCHAR value. The maximum length is 16,000 characters.


Post a Comment

0 Comments