Load Data From Databend Stages
Before You Begin
- Databend : Make sure Databend is running and accessible, see Deploy Databend With MinIO.
Step 1. Create Stage Object
Execute CREATE STAGE to create a named internal stage.
mysql -h127.0.0.1 -uroot -P3307
CREATE STAGE my_int_stage;
DESC STAGE my_int_stage;
+--------------+------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------+--------------------------------------------------------------------------------------------------------------------+---------+
| name | stage_type | stage_params | copy_options | file_format_options | comment |
+--------------+------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------+--------------------------------------------------------------------------------------------------------------------+---------+
| my_int_stage | Internal | StageParams { storage: S3(StageS3Storage { bucket: "", path: "", credentials_aws_key_id: "", credentials_aws_secret_key: "", encryption_master_key: "" }) } | CopyOptions { on_error: None, size_limit: 0 } | FileFormatOptions { format: Csv, skip_header: 0, field_delimiter: ",", record_delimiter: "\n", compression: None } | |
+--------------+------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------+--------------------------------------------------------------------------------------------------------------------+---------+
Step 2. Stage the Data Files
Download the sample data file(Choose CSV or Parquet), the file contains two records:
Transaction Processing,Jim Gray,1992
Readings in Database Systems,Michael Stonebraker,2004
- CSV
- Parquet
Download books.csv
Download books.parquet
- CSV
- Parquet
Upload books.csv
into stages:
Request /v1/upload_to_stage
curl -H "stage_name:my_int_stage"\
-F "upload=@./books.csv"\
-XPUT http://root:@localhost:8081/v1/upload_to_stage
Response
{"id":"50880048-f397-4d32-994c-ce3d38af430f","stage_name":"my_int_stage","state":"SUCCESS","files":["books.csv"]}
tip
http://127.0.0.1:8081/v1/upload_to_stage
127.0.0.1
ishttp_handler_host
value in your databend-query.toml8081
ishttp_handler_port
value in your databend-query.toml
-F \"upload=@./books.csv\"
- Your books.csv file location
Upload books.parquet
into stages:
Request /v1/upload_to_stage
curl -H "stage_name:my_int_stage"\
-F "upload=@./books.parquet"\
-XPUT http://root:@localhost:8081/v1/upload_to_stage
Response
{"id":"50880048-f397-4d32-994c-ce3d38af430f","stage_name":"my_int_stage","state":"SUCCESS","files":["books.parquet"]}
tip
http://127.0.0.1:8081/v1/upload_to_stage
127.0.0.1
ishttp_handler_host
value in your databend-query.toml8081
ishttp_handler_port
value in your databend-query.toml
-F \"upload=@./books.parquet\"
- Your books.csv file location
Step 3. List the Staged Files (Optional)
mysql -h127.0.0.1 -uroot -P3307
LIST @my_int_stage;
+---------------+------+------+-------------------------------+--------------------+
| name | size | md5 | last_modified | creator |
+---------------+------+------+-------------------------------+--------------------+
| books.csv | 91 | NULL | 2022-06-10 12:01:40.000 +0000 | 'root'@'127.0.0.1' |
| books.parquet | 91 | NULL | 2022-06-10 12:01:40.000 +0000 | 'root'@'127.0.0.1' |
+---------------+------+------+-------------------------------+--------------------+
Step 4. Creating Database and Table
CREATE DATABASE book_db;
USE book_db;
CREATE TABLE books
(
title VARCHAR,
author VARCHAR,
date VARCHAR
);
Step 5. Copy Data into the Target Tables
Execute COPY to load staged files to the target table.
- CSV
- Parquet
COPY INTO books FROM @my_int_stage files=('books.csv') file_format = (type = 'CSV' field_delimiter = ',' record_delimiter = '\n' skip_header = 0);
tip
files = ( 'file_name' [ , 'file_name' ... ] )
Specifies a list of one or more files names (separated by commas) to be loaded.
- file_format
Parameters | Description | Required |
---|---|---|
record_delimiter | One characters that separate records in an input file. Default '\n' | Optional |
field_delimiter | One characters that separate fields in an input file. Default ',' | Optional |
skip_header | Number of lines at the start of the file to skip. Default 0 | Optional |
COPY INTO books FROM @my_int_stage files=('books.parquet') file_format = (type = 'Parquet');
Step 6. Verify the Loaded Data
SELECT * FROM books;
+------------------------------+----------------------+-------+
| title | author | date |
+------------------------------+----------------------+-------+
| Transaction Processing | Jim Gray | 1992 |
| Readings in Database Systems | Michael Stonebraker | 2004 |
+------------------------------+----------------------+-------+
Step 7. Congratulations!
You have successfully completed the tutorial.