Hi,. You are certainly right, something… | by Anders Elton

1 min read · Dec 16, 2019

Hi,

You are certainly right, something seems to be missing from my example!

The name of the database/schema is specified as a prefix to the table name.

Something like this:

def get_tables():

    dim_tables = ["dim.DimAge", "dim.DimPerson"]
    fact_tables = ["facts.FactPerson"]
    export_tables = dim_tables + fact_tables
    tables = []
    for dim in export_tables:
        cfg = TableConfig(cloud_sql_instance='CLOUD_SQL_INSTANCE_NAME',
                          export_table=dim.split(".")[-1],
                          export_bucket='YOUR_STAGING_BUCKET',
                          export_database=dim.split(".")[0],
                          export_query="SELECT * from {}".format(dim),
                          gcp_project="YOUR_PROJECT_ID",
                          stage_dataset="YOUR_STAGING_DATASET",
                          stage_table=None,
                          stage_final_query=None,
                          bq_location="EU")
        tables.append(cfg)
    return tables

# and filename used like this, for example:
filename = "cloudsql_to_bigquery/{}/{}".format(table_config.params['export_database'],
                                               table_config.params['export_table']) + "_{}"
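
To make the prefix convention concrete, here is a minimal standalone sketch of how a name like "dim.DimAge" splits into database and table, and what the resulting filename template looks like. The helper names split_qualified_name and export_filename are hypothetical and only for illustration; they are not part of the original example. The trailing "_{}" placeholder is left unfilled, as in the snippet above.

```python
def split_qualified_name(qualified_name):
    """Split 'database.Table' into (database, table).

    Mirrors the split(".")[0] and split(".")[-1] calls used in get_tables().
    """
    parts = qualified_name.split(".")
    return parts[0], parts[-1]


def export_filename(qualified_name):
    """Build the export filename template for one prefixed table name.

    Hypothetical helper; the trailing '_{}' is kept as an unfilled placeholder.
    """
    database, table = split_qualified_name(qualified_name)
    return "cloudsql_to_bigquery/{}/{}".format(database, table) + "_{}"


if __name__ == "__main__":
    for name in ["dim.DimAge", "dim.DimPerson", "facts.FactPerson"]:
        print(split_qualified_name(name), export_filename(name))
```

Note that a name without a prefix, such as "DimAge", would give the same value for both database and table with this splitting, so every entry in the list should carry its database/schema prefix.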

The connection to the database instance is made in the proxy, so if you'd like to connect to more than one database instance, you will need a proxy for each.

Thanks for the feedback!

Written by Anders Elton
