Impala TRUNCATE TABLE Statement with Practical Example
As we know to delete the table completely we use Impala Drop Table Statement. However, we just need to delete some data from an Impala table while leaving the table itself. For that purpose, we use the Impala TRUNCATE TABLE Statement.
So, in this article, we will learn the whole concept of Impala TRUNCATE TABLE Statement. Apart from its introduction, it includes its syntax, type as well as its example, to understand it well.
So, let’s start Impala Truncate Table Statements.
How to use Impala TRUNCATE TABLE Statement
Basically, to delete some data from an Impala table while leaving the table itself we use Impala TRUNCATE TABLE Statement. In other words, to remove all the records from an existing table we use the Truncate Table Statement of Impala.
So, there is one more command we can use DROP TABLE command, but it would remove the complete table structure from the database. Then we would require re-creating this table once again if we wish you store some data.
a. Syntax of Impala Truncate Table Statements
So, the syntax for using Impala TRUNCATE TABLE Statement is-
TRUNCATE TABLE [IF EXISTS] [db_name.]table_name
b. Statement type
Impala TRUNCATE TABLE Statement is of DDL Type.
c. Usage
After the data has been copied to another table for the next stage of processing often used to empty tables that are used during ETL cycles.
In addition, to dropping and recreating the table, or using INSERT OVERWRITE to replace the data during the next ETL cycle, this statement is a low-overhead alternative.
Makes sure it deletes all the data and associated data files in the table. So, we can use it to delete data files from internal tables, external tables, partitioned tables, and tables mapped to HBase or the Amazon Simple Storage Service (S3).
Well, including all partitions of a partitioned table, the data removal applies to the entire table.
However, when the data is removed any statistics produced by the COMPUTE STATS statement are reset.
Although, before truncating a table ensure that you are in the correct database. Either by using a fully qualified name db_name.table_name or by issuing a USE statement first.
There is an optional IF EXISTS clause. It is available in CDH 5.7 / Impala 2.5 and higher. It makes the statement succeed even if the table exists or not. There are two possible conditions, Â Either the table does exist or not exist.
So, if it does exist, it is truncated; or if it does not exist the statement has no effect. To be more specific, in standardized setup scripts that is might be run both before and after some of the tables exist, this is very useful.
Moreover, if we use this statement to remove any HDFS data files, it goes into the  HDFS trashcan. Further, if this operation turns out to be a mistake, we can recover them within a defined time interval.
d. Cancellation
It is not possible to cancel it. That implies it Cannot be canceled.
Amazon S3 Considerations
Here, the TRUNCATE TABLE statement can remove data files from S3, but Impala cannot write new data to a table stored in the Amazon S3 filesystem.
HDFS Permissions
It is must that the Impala user ID has to write permission for all the files and directories those make up the table that the Impalad daemon runs under.
Examples of Truncate Table in Impala
Here, is a table containing some data and with table and column statistics. Now, the data is removed and the statistics are reset after the TRUNCATE TABLE statement.
For Example,
CREATE TABLE truncate_demo (x INT); INSERT INTO truncate_demo VALUES (1), (2), (4), (8); SELECT COUNT(*) FROM truncate_demo;
count(*) |
4 |
COMPUTE STATS truncate_demo;
summary              |
Updated 1 partition(s) and 1 column(s). |
SHOW TABLE STATS truncate_demo;
#Rows | #Files | size | Bytes cached | Cache Replication | Format | Incremental Stats |
4 | 1 | 8B | NOT CACHED | NOT CACHED | TEXT | false |
SHOW COLUMN STATS truncate_demo;
column | Type | #Distinct Values | #Nulls | Max Size | Avg Size |
x | INT | 4 | -1 | 4 | 4 |
-- After this statement, the data and the table/column stats will be gone. TRUNCATE TABLE truncate_demo; SELECT COUNT(*) FROM truncate_demo;
count(*) |
0 |
SHOW TABLE STATS truncate_demo;
#Rows | #Files | Size | Bytes Cached | Cache Replication | Format | Incremental stats |
-1 | 0 | 0B | NOT CACHED | NOT CACHED | TEXT | false |
SHOW COLUMN STATS truncate_demo;
Column | Type | #Nulls | #Distinct Values | Max Size | Avg Size |
X | INT | -1 | -1 | 4 | 4 |
Further, it shows how the IF EXISTS clause allows the TRUNCATE TABLE statement to be run without error whether or not the table exists:
For Example,
CREATE TABLE staging_table1 (x INT, s STRING); Fetched 0 row(s) in 0.33s SHOW TABLES LIKE 'staging*';
NAME |
staging_table1 |
Fetched 1 row(s) in 0.25s -- Our ETL process involves removing all data from several staging tables -- even though some might be already dropped, or not created yet. TRUNCATE TABLE IF EXISTS staging_table1; Fetched 0 row(s) in 5.04s TRUNCATE TABLE IF EXISTS staging_table2; Fetched 0 row(s) in 0.25s TRUNCATE TABLE IF EXISTS staging_table3; Fetched 0 row(s) in 0.25s
So, this was all in Impala Truncate Table Statements. Hope you like our explanation.
Conclusion
As a result, we have seen the whole concept of Impala TRUNCATE TABLE Statement. Still, if any doubt occurs in how to use Impala TRUNCATE TABLE Statement, feel free to ask in the comment section.
Did you like this article? If Yes, please give DataFlair 5 Stars on Google