I am using kedro.extras.datasets.pandas.SQLTableDataSet and would like to use the chunk_size argument from pandas. However, when running the pipeline, the table gets treated as a generator instead of a pd.DataFrame.
How would you use the chunk_size within the pipeline?
My catalog:
```yaml
table_name:
  type: pandas.SQLTableDataSet
  credentials: redshift
  table_name: rs_table_name
  layer: output
  save_args:
    if_exists: append
    schema: schema.name
    chunk_size: 1000
```
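For context on the behaviour described above: when pandas performs a read with a chunksize, it returns an iterator of DataFrame chunks rather than a single DataFrame, which is why the loaded table can show up as a generator inside the pipeline (DataFrame.to_sql, by contrast, only uses chunksize to batch the INSERTs and still expects a single DataFrame). The sketch below is a minimal, hypothetical helper — `combine_chunks` is not part of Kedro or pandas — showing one way a node could collapse such chunks back into a single DataFrame; it assumes the node receives either a plain DataFrame or an iterable of DataFrame chunks.

```python
from typing import Iterable, Union

import pandas as pd


def combine_chunks(data: Union[pd.DataFrame, Iterable[pd.DataFrame]]) -> pd.DataFrame:
    """Collapse a chunked read (an iterator of DataFrames) into one DataFrame.

    Hypothetical helper: pandas yields DataFrame chunks when asked to read
    with a chunksize, so a node receiving such an input needs to concatenate
    the chunks before treating the data as a single table.
    """
    if isinstance(data, pd.DataFrame):
        return data  # already a single DataFrame, nothing to do
    return pd.concat(data, ignore_index=True)  # stitch the chunks together
```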
Asked by Anonymous, May 14, 2021