Dec 16, 2019
Hi,
You are certainly right, something seems to be missing from my example!
The name of the database/schema is specified as a prefix to the table name.
Something like this:
def get_tables():
    dim_tables = ["dim.DimAge", "dim.DimPerson"]
    fact_tables = ["facts.FactPerson"]
    export_tables = dim_tables + fact_tables
    tables = []
    for dim in export_tables:
        cfg = TableConfig(cloud_sql_instance='CLOUD_SQL_INSTANCE_NAME',
                          export_table=dim.split(".")[-1],
                          export_bucket='YOUR_STAGING_BUCKET',
                          export_database=dim.split(".")[0],
                          export_query="SELECT * from {}".format(dim),
                          gcp_project="YOUR_PROJECT_ID",
                          stage_dataset="YOUR_STAGING_DATASET",
                          stage_table=None,
                          stage_final_query=None,
                          bq_location="EU")
        tables.append(cfg)
    return tables

# and filename used like this, for example:
filename = "cloudsql_to_bigquery/{}/{}".format(table_config.params['export_database'],
                                               table_config.params['export_table']) + "_{}"
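To make the prefix convention concrete, here is a small standalone sketch of the same splitting and filename logic, without TableConfig. The helper names split_qualified_name and build_filename are made up for this illustration and are not part of the original code. The trailing "_{}" is deliberately left unfilled, as in the snippet above; what fills it in later is not shown here.

```python
def split_qualified_name(qualified_name):
    """Split a 'database.table' string into (database, table).

    Hypothetical helper: mirrors the dim.split(".")[0] and
    dim.split(".")[-1] calls used in get_tables().
    """
    parts = qualified_name.split(".")
    return parts[0], parts[-1]


def build_filename(database, table):
    """Build the export path template for one table.

    Hypothetical helper: the final "_{}" placeholder is kept as-is so a
    later .format() call can fill it in.
    """
    return "cloudsql_to_bigquery/{}/{}".format(database, table) + "_{}"


if __name__ == "__main__":
    for name in ["dim.DimAge", "dim.DimPerson", "facts.FactPerson"]:
        database, table = split_qualified_name(name)
        print(database, table, build_filename(database, table))
```

So "dim.DimAge" gives the database "dim", the table "DimAge", and the template "cloudsql_to_bigquery/dim/DimAge_{}".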
The connection to the database instance is handled by the proxy, so if you have more than one database instance you'd like to connect to, you will need a proxy for each.
Thanks for the feedback!