How to pass JSON to the AWS CLI

2022-2-10 - Reading time: 4 min

Hmm, the quoting is... argh.

For now, put the JSON in an external file and write the command like this:

aws glue update-table --database-name db001 --table-input file://table.json
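To see why the quoting gets painful, here is a rough Python sketch of what passing the JSON inline (instead of via file://) would look like. The table definition is a trimmed-down stand-in, db001 is the database name used further down in this post, and shlex.join is only there to print a shell-safe command line. Nothing is actually sent to AWS.

```python
import json
import shlex

# Trimmed-down, illustrative table definition (not the full table.json).
table_input = {
    "Name": "new_table",
    "StorageDescriptor": {
        "Columns": [{"Name": "c1", "Type": "int"}],
        "Location": "s3://mybucket/db/new_table",
    },
    "TableType": "EXTERNAL_TABLE",
}

# Inline variant: the whole JSON document becomes one command-line argument.
command = [
    "aws", "glue", "update-table",
    "--database-name", "db001",
    "--table-input", json.dumps(table_input),
]

# shlex.join quotes each argument so a POSIX shell passes it through unchanged.
command_line = shlex.join(command)
print(command_line)
```

The printed line wraps the entire JSON in single quotes, which is exactly the part that is easy to get wrong by hand. With file://table.json there is nothing for the shell to mangle.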

table.json example

{
    "Name": "new_table",
    "StorageDescriptor": {
        "Columns": [
            {"Name": "c1", "Type": "int"},
            {"Name": "c2", "Type": "string"},
            {"Name": "c3", "Type": "string"}
        ],
        "Location": "s3://mybucket/db/new_table",
        "InputFormat": "org.apache.hadoop.mapred.TextInputFormat",
        "OutputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
        "SerdeInfo": {
            "SerializationLibrary": "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe",
            "Parameters": {
                 "separatorChar":","
            }
        }
    },
    "Parameters": {
        "projection.enabled": "true",
        "projection.year.range": "2021,2030",
        "projection.year.type": "integer",
        "projection.month.range": "1,12",
        "projection.month.type": "integer",
        "projection.day.range": "1,31",
        "projection.day.type": "integer"
    },
    "PartitionKeys":[
        {"Name":"year", "Type":"int"},
        {"Name":"month", "Type":"int"},
        {"Name":"day", "Type":"int"}
    ],
    "TableType": "EXTERNAL_TABLE"
}

I don't really get it 🤯


Addendum

First, in Athena:

aws --profile aaa athena start-query-execution --query-string file://SQL.txt

SQL.txt

CREATE EXTERNAL TABLE db001.new_table (
    c1 int,
    c2 string,
    c3 string
)
PARTITIONED BY (year int,month int,day int)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','
  LINES TERMINATED BY '\n'
  STORED AS TEXTFILE
LOCATION 's3://mybucket/db/new_table/';

Like that, and then in Glue:

aws --profile aaa glue update-table --database-name db001 --table-input file://table.json

table.json

{
    "Name": "new_table",
    "StorageDescriptor": {
        "Columns": [
            {"Name": "c1", "Type": "int"},
            {"Name": "c2", "Type": "string"},
            {"Name": "c3", "Type": "string"}
        ],
        "Location": "s3://mybucket/db/new_table/",
        "InputFormat": "org.apache.hadoop.mapred.TextInputFormat",
        "OutputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
        "SerdeInfo": {
            "SerializationLibrary": "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe",
            "Parameters": {
                 "field.delim":",",
                 "line.delim":"\n",
                 "serialization.format":","
            }
        }
    },
    "Parameters": {
        "projection.enabled": "true",
        "projection.year.range": "2021,2030",
        "projection.year.type": "integer",
        "projection.month.range": "1,12",
        "projection.month.type": "integer",
        "projection.day.range": "1,31",
        "projection.day.type": "integer"
    },
    "PartitionKeys":[
        {"Name":"year", "Type":"int"},
        {"Name":"month", "Type":"int"},
        {"Name":"day", "Type":"int"}
    ],
    "TableType": "EXTERNAL_TABLE"
}
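The "Parameters" and "PartitionKeys" sections are repetitive enough that they could be generated rather than typed. A minimal sketch, assuming every partition key is an integer range like the ones in this post; projection_parameters is a hypothetical helper, not part of any AWS library.

```python
import json


def projection_parameters(ranges):
    """Build partition-projection table parameters for integer partition keys.

    ranges maps a partition key name to an inclusive (start, end) tuple.
    All values are strings, as in the table.json above.
    """
    params = {"projection.enabled": "true"}
    for key, (start, end) in ranges.items():
        params[f"projection.{key}.range"] = f"{start},{end}"
        params[f"projection.{key}.type"] = "integer"
    return params


# Same ranges as in the post: year 2021-2030, month 1-12, day 1-31.
ranges = {"year": (2021, 2030), "month": (1, 12), "day": (1, 31)}

table_params = projection_parameters(ranges)
partition_keys = [{"Name": key, "Type": "int"} for key in ranges]

# Print the two sections so they can be pasted into table.json.
print(json.dumps({"Parameters": table_params, "PartitionKeys": partition_keys}, indent=4))
```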

Like this. I think that did it?
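For the record, the only real difference between the first table.json and this one (apart from the trailing slash on Location) is the SerDe "Parameters" block. A quick sketch that diffs the two, with the values copied from the two files in this post. As far as I can tell, separatorChar is an OpenCSVSerde property, while LazySimpleSerDe reads field.delim, but treat that as my assumption.

```python
# SerdeInfo.Parameters from the first attempt.
first_attempt = {"separatorChar": ","}

# SerdeInfo.Parameters from the addendum.
second_attempt = {
    "field.delim": ",",
    "line.delim": "\n",
    "serialization.format": ",",
}

# Keys that disappeared and keys that were introduced between the two versions.
removed = sorted(first_attempt.keys() - second_attempt.keys())
added = sorted(second_attempt.keys() - first_attempt.keys())

print("removed:", removed)
print("added:", added)
```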


That's all for this time.