Anders Elton
1 min read · Dec 16, 2019


Hi,

You are certainly right, something seems to be missing from my example!

The database/schema name is specified as a prefix to the table name, separated by a dot (for example, dim.DimAge).

Something like this:

def get_tables():
    # Each entry is "<database/schema>.<table>".
    dim_tables = ["dim.DimAge", "dim.DimPerson"]
    fact_tables = ["facts.FactPerson"]
    export_tables = dim_tables + fact_tables
    tables = []
    for table_name in export_tables:
        # The part before the dot is the database, the part after it is the table.
        cfg = TableConfig(cloud_sql_instance='CLOUD_SQL_INSTANCE_NAME',
                          export_table=table_name.split(".")[-1],
                          export_bucket='YOUR_STAGING_BUCKET',
                          export_database=table_name.split(".")[0],
                          export_query="SELECT * from {}".format(table_name),
                          gcp_project="YOUR_PROJECT_ID",
                          stage_dataset="YOUR_STAGING_DATASET",
                          stage_table=None,
                          stage_final_query=None,
                          bq_location="EU")
        tables.append(cfg)
    return tables

# and filename used like this, for example:
filename = "cloudsql_to_bigquery/{}/{}".format(table_config.params['export_database'],
                                               table_config.params['export_table']) + "_{}"
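
To make the prefix convention concrete, here is a minimal standalone sketch of just the name-splitting and filename steps, without TableConfig. The helper names split_qualified_name and staging_filename are made up for illustration and are not part of the original example.

```python
def split_qualified_name(qualified_name):
    """Split "database.table" into (database, table).

    Hypothetical helper: it mirrors the split(".")[0] and split(".")[-1]
    calls used in get_tables().
    """
    parts = qualified_name.split(".")
    return parts[0], parts[-1]


def staging_filename(qualified_name):
    """Build the staging path prefix for one table.

    The trailing "_{}" is deliberately left unformatted, as in the
    snippet above, so it can be filled in later.
    """
    database, table = split_qualified_name(qualified_name)
    return "cloudsql_to_bigquery/{}/{}".format(database, table) + "_{}"


if __name__ == "__main__":
    for name in ["dim.DimAge", "dim.DimPerson", "facts.FactPerson"]:
        print(split_qualified_name(name), staging_filename(name))
```

Note that this only works as intended when each name contains exactly one dot; a name without a prefix would yield the same string for both database and table.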

The connection to the database instance is made through the proxy, so if you'd like to connect to more than one database instance, you will need a separate proxy for each.

Thanks for the feedback!
