I have a pandas DataFrame with 3 columns: col1 contains lists, col2 contains dictionaries, and col3 contains NaNs:
import numpy as np
import pandas as pd

dict_ = {'col1': [['abc'], ['def', 'ghi'], []],
         'col2': [{'k1': 'v1', 'k2': 'v2'},
                  {'k1': 'v3', 'k2': 'v4'},
                  {'k1': 'v5', 'k2': 'v6'}],
         'col3': [np.nan, np.nan, np.nan]}
df = pd.DataFrame(dict_)
To upload the DataFrame to BigQuery, I created the following schema for the first and second columns:
from google.cloud import bigquery

schema = [
    bigquery.SchemaField(name="col1", field_type="STRING", mode='REPEATED'),
    bigquery.SchemaField(name="col2", field_type="RECORD", mode='NULLABLE',
                         fields=[bigquery.SchemaField(name="k1", field_type="STRING", mode='NULLABLE'),
                                 bigquery.SchemaField(name="k2", field_type="STRING", mode='NULLABLE')])
]
job_config = bigquery.LoadJobConfig(write_disposition="WRITE_TRUNCATE", schema=schema)
# client is a bigquery.Client and table is the destination table (both defined elsewhere)
job = client.load_table_from_dataframe(df, table, job_config=job_config)
job.result()  # wait for the load job to finish
The DataFrame was uploaded without errors, but col1 is empty in the resulting table.
What should I do to fix this?
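One workaround that may be worth trying is to send the rows as JSON instead of going through the DataFrame serialization, using `client.load_table_from_json`. The sketch below only shows the row preparation step, under some assumptions: the rows are shaped like the output of `df.to_dict(orient="records")`, NaN values should become NULL, and `to_json_rows` is a hypothetical helper name, not part of any library. Whether this fixes the empty col1 in a given environment is not confirmed here, and the schema may also need an entry for col3, since a JSON load with an explicit schema is unlikely to infer the missing column.

```python
import math


def to_json_rows(records):
    """Return copies of the row dicts with float NaN values replaced by None.

    Hypothetical helper: NaN is not valid JSON, so it is mapped to None (NULL).
    """
    cleaned = []
    for row in records:
        cleaned.append({
            key: None if isinstance(value, float) and math.isnan(value) else value
            for key, value in row.items()
        })
    return cleaned


# Rows shaped like df.to_dict(orient="records") for the DataFrame above.
records = [
    {"col1": ["abc"], "col2": {"k1": "v1", "k2": "v2"}, "col3": float("nan")},
    {"col1": ["def", "ghi"], "col2": {"k1": "v3", "k2": "v4"}, "col3": float("nan")},
    {"col1": [], "col2": {"k1": "v5", "k2": "v6"}, "col3": float("nan")},
]
json_rows = to_json_rows(records)

# Assumed usage with the client, table, and job_config from the question:
# job = client.load_table_from_json(json_rows, table, job_config=job_config)
# job.result()
```

The lists in col1 and the dictionaries in col2 are passed through unchanged, so they map directly onto the REPEATED and RECORD fields in the schema.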
question from:
https://stackoverflow.com/questions/66054651/uploading-dataframe-to-bigquery-with-array-structure