
I am using kedro.extras.datasets.pandas.SQLTableDataSet and would like to use the chunk_size argument from pandas. However, when running the pipeline, the table gets treated as a generator instead of a pd.DataFrame.

How would you use the chunk_size within the pipeline?
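For background, the generator behaviour described above is standard pandas, not something Kedro-specific: when a chunk size is passed to `pd.read_sql`, pandas returns an iterator of DataFrames instead of a single DataFrame. A minimal sketch of that behaviour, using an in-memory SQLite database as a stand-in for Redshift (the table name `rs_table_name` is taken from the catalog below; everything else is illustrative):

```python
import sqlite3

import pandas as pd

# In-memory SQLite stands in for the real Redshift connection.
conn = sqlite3.connect(":memory:")
pd.DataFrame({"x": range(5)}).to_sql("rs_table_name", conn, index=False)

# Without chunksize: a single DataFrame.
df = pd.read_sql("SELECT * FROM rs_table_name", conn)

# With chunksize: an iterator that yields DataFrames of up to 2 rows each,
# which is why the dataset appears as a generator inside the pipeline.
chunk_iter = pd.read_sql("SELECT * FROM rs_table_name", conn, chunksize=2)
chunks = list(chunk_iter)
```

Note that the pandas keyword is `chunksize` (one word); any node consuming such a dataset has to iterate over the chunks (or concatenate them with `pd.concat`) rather than treat the input as a plain DataFrame.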

My catalog:

table_name:
  type: pandas.SQLTableDataSet
  credentials: redshift
  table_name: rs_table_name
  layer: output
  save_args:
    if_exists: append
    schema: schema.name
    chunk_size: 1000
Anonymous Asked question May 14, 2021