kodeagent.agents.csv_agent


CSV Analysis Agent. Wraps KodeAgent’s ReActAgent with tools for analysing a pandas DataFrame. The agent reasons about what to investigate, calls tools, observes results, and produces a structured list of narrative-ready findings.

  • assess_column – Assess whether a column is worth analysing for a data story.
  • compare_groups – Compare average values of a numeric column across categories.
  • find_anomalies – Find unusually high or low values in a numeric column.
  • find_correlations – Find correlations between numeric columns.
  • find_trends – Analyse how a numeric column changes over time.
  • get_df_schema – Get the schema of the dataset: column names, types, cardinality, sample values, and missing value counts.
  • get_summary_stats – Get summary statistics for one or more numeric columns.
  • get_value_counts – Get frequency distribution for a categorical column.
  • init_df_for_analysis – Initialize the DataFrame for analysis.
  • main – Example usage of the CSVAnalysisAgent.
  • sample_rows – Get sample rows matching a condition.

  • CSVAnalysisAgent – An agent specializing in discovering patterns and insights from CSV data.

class kodeagent.agents.csv_agent.CSVAnalysisAgent(name: str = 'CSV Analyst', model_name: str = 'gemini/gemini-2.0-flash-lite', **kwargs: Any)

Bases: ReActAgent

An agent specializing in discovering patterns and insights from CSV data.

Examples

Using a local file:

    agent = CSVAnalysisAgent()
    # Pass the file path (or URL) directly as a task file.
    # `async for` must run inside a coroutine (for example, one started with asyncio.run).
    async for response in agent.run(task, files=['/path/to/data.csv']):
        pass

Using a URL:

    agent = CSVAnalysisAgent()
    async for response in agent.run(task, files=['https://example.com/data.csv']):
        pass

Initialize the CSVAnalysisAgent.

Parameters:
  • name – Name of the agent.

  • model_name – The LLM model to use.

  • **kwargs – Additional arguments passed to ReActAgent.

async pre_run() → AsyncIterator[AgentResponse]

Pre-run hook to auto-load CSV files and yield initialization logs.
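
The hook is an async iterator, so callers receive log messages as each file is processed instead of waiting for all files to load. The sketch below illustrates only that pattern. `pre_run_sketch` and `collect` are hypothetical names, plain strings stand in for `AgentResponse` objects, and the real loading logic is not shown in this reference.

```python
import asyncio
from typing import AsyncIterator


async def pre_run_sketch(files: list[str]) -> AsyncIterator[str]:
    """Yield a log line before and after each (simulated) CSV load."""
    for path in files:
        if path.lower().endswith(".csv"):
            yield f"Loading {path}"
            await asyncio.sleep(0)  # placeholder for the real load step
            yield f"Loaded {path}"
        else:
            yield f"Skipping non-CSV file {path}"


async def collect(files: list[str]) -> list[str]:
    """Drain the async iterator into a list so the logs can be inspected."""
    return [message async for message in pre_run_sketch(files)]


logs = asyncio.run(collect(["data.csv", "notes.txt"]))
```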

kodeagent.agents.csv_agent.assess_column(column: str) → str

Assess whether a column is worth analysing for a data story. Returns a judgment and reasoning about the column’s narrative value. Use this when unsure whether to investigate a column further.

Parameters:

column – The column name to assess.

Returns:

A text assessment with judgment and reasoning about the column.

kodeagent.agents.csv_agent.compare_groups(numeric_column: str, category_column: str) → str

Compare average values of a numeric column across categories. Useful for finding which groups are highest, lowest, or most surprising.

Parameters:
  • numeric_column – The numeric column to compare.

  • category_column – The categorical column to group by.

Returns:

A JSON string comparing statistical metrics across groups.
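
To make the idea of a group comparison concrete, here is a dependency-free sketch that computes a per-category mean and count and ranks the groups. It is an illustration only: `compare_groups_sketch` is a hypothetical name, the real tool operates on the loaded pandas DataFrame, and the JSON keys shown are assumptions, not the tool's actual output format.

```python
import json
from collections import defaultdict
from statistics import mean


def compare_groups_sketch(rows, numeric_column, category_column):
    """Group rows by category and report the mean and count per group."""
    groups = defaultdict(list)
    for row in rows:
        value = row.get(numeric_column)
        if value is None:
            continue  # skip rows with a missing numeric value
        groups[row[category_column]].append(float(value))
    summary = {
        name: {"mean": mean(values), "count": len(values)}
        for name, values in groups.items()
    }
    # Rank groups from highest to lowest mean.
    ranked = dict(
        sorted(summary.items(), key=lambda item: item[1]["mean"], reverse=True)
    )
    return json.dumps(ranked)


rows = [
    {"region": "north", "sales": 10},
    {"region": "north", "sales": 20},
    {"region": "south", "sales": 5},
]
result = json.loads(compare_groups_sketch(rows, "sales", "region"))
```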

kodeagent.agents.csv_agent.find_anomalies(column: str) → str

Find unusually high or low values in a numeric column. Returns count of outliers and the most extreme values.

Parameters:

column – The numeric column to check for anomalies.

Returns:

A JSON string or error message detailing detected outliers.
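
This reference does not say which outlier rule the tool applies. One common choice is the interquartile-range (IQR) rule, sketched below with the standard library. `find_anomalies_sketch`, the `k=1.5` multiplier, and the JSON keys are all assumptions made for illustration.

```python
import json
from statistics import quantiles


def find_anomalies_sketch(values, k=1.5, top=3):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] and list the most extreme."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    outliers = [v for v in values if v < low or v > high]
    # Sort by distance beyond the violated fence, most extreme first.
    outliers.sort(key=lambda v: max(low - v, v - high), reverse=True)
    return json.dumps(
        {
            "count": len(outliers),
            "lower_fence": low,
            "upper_fence": high,
            "most_extreme": outliers[:top],
        }
    )


values = [10, 11, 12, 13, 14, 15, 16, 17, 18, 100]
result = json.loads(find_anomalies_sketch(values))
```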

kodeagent.agents.csv_agent.find_correlations(columns: str) → str

Find correlations between numeric columns. Returns pairs with strong positive or negative relationships.

Parameters:

columns – Comma-separated column names to correlate, e.g. “sales,profit,units”.

Returns:

A JSON string listing strong statistical correlations.
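
The sketch below shows how a comma-separated `columns` argument could be split into names and how pairs with a strong Pearson correlation could be selected. It is a minimal illustration under stated assumptions: `pearson`, `find_correlations_sketch`, the `0.7` threshold, and the output keys are hypothetical, and the tool's actual method and cutoff are not documented here.

```python
import json
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # undefined for a constant column; treated as no relationship
    return cov / (spread_x * spread_y)


def find_correlations_sketch(data, columns, threshold=0.7):
    """Return JSON listing column pairs whose |r| meets the threshold."""
    names = [c.strip() for c in columns.split(",") if c.strip()]
    strong = []
    for a, b in combinations(names, 2):
        r = pearson(data[a], data[b])
        if abs(r) >= threshold:
            strong.append({"pair": [a, b], "r": round(r, 3)})
    return json.dumps(strong)


data = {
    "sales": [1, 2, 3, 4, 5],
    "profit": [2, 4, 6, 8, 10],
    "units": [3, 1, 4, 1, 5],
}
result = json.loads(find_correlations_sketch(data, "sales, profit, units"))
```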

kodeagent.agents.csv_agent.find_trends(numeric_column, time_column)

Analyse how a numeric column changes over time. Returns direction, magnitude of change, and whether the trend reversed.

Parameters:
  • numeric_column – The numeric column to analyse.

  • time_column – The time/date column to use as x-axis.

Returns:

A JSON string or error message describing the detected trend.
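
One simple way to derive direction, magnitude, and a reversal flag is to sort by time and compare the first, middle, and last values. The sketch below does exactly that and nothing more. `find_trends_sketch` and its output keys are hypothetical, and the real tool may use a different method such as a fitted slope.

```python
import json


def find_trends_sketch(points):
    """points: list of (time, value) pairs; time values must be sortable."""
    ordered = [value for _, value in sorted(points)]
    first, mid, last = ordered[0], ordered[len(ordered) // 2], ordered[-1]
    change = last - first
    direction = "up" if change > 0 else "down" if change < 0 else "flat"
    percent = change * 100 / first if first else None
    # A reversal here means the two halves move in opposite directions.
    reversed_trend = (mid - first) * (last - mid) < 0
    return json.dumps(
        {
            "direction": direction,
            "absolute_change": change,
            "percent_change": percent,
            "reversed": reversed_trend,
        }
    )


points = [
    ("2024-03", 150),
    ("2024-01", 100),
    ("2024-05", 110),
    ("2024-02", 130),
    ("2024-04", 120),
]
result = json.loads(find_trends_sketch(points))
```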

kodeagent.agents.csv_agent.get_df_schema() → str

Get the schema of the dataset: column names, types, cardinality, sample values, and missing value counts. Always call this first.

Returns:

A JSON string containing the dataset schema and summary.
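
As a rough picture of what such a schema contains, the sketch below derives type, cardinality, missing-value count, and sample values per column from a list of row dictionaries. `schema_sketch` and the JSON layout are assumptions for illustration. The real tool reads these from the pandas DataFrame, and its exact output format is not shown in this reference.

```python
import json


def schema_sketch(rows, n_samples=3):
    """Summarise each column: type, cardinality, missing count, sample values."""
    columns = list(rows[0]) if rows else []
    schema = {}
    for col in columns:
        present = [row[col] for row in rows if row.get(col) is not None]
        unique = list(dict.fromkeys(present))  # de-duplicate, keep first-seen order
        schema[col] = {
            "type": type(present[0]).__name__ if present else "unknown",
            "cardinality": len(unique),
            "missing": len(rows) - len(present),
            "samples": unique[:n_samples],
        }
    return json.dumps({"row_count": len(rows), "columns": schema})


rows = [
    {"city": "Oslo", "temp": 3.5},
    {"city": "Rome", "temp": None},
    {"city": "Oslo", "temp": 12.0},
]
result = json.loads(schema_sketch(rows))
```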

kodeagent.agents.csv_agent.get_summary_stats(columns: str) → str

Get summary statistics for one or more numeric columns. Returns mean, median, std, min, max, and percentiles.

Parameters:

columns – Comma-separated column names, e.g. “price,quantity,age”.

Returns:

A JSON string with summary statistics for the requested columns.
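
The statistics listed above can be reproduced with the standard library, as in the sketch below. `summary_stats_sketch` and the key names are hypothetical, and the choice of the 25th and 75th percentiles is an assumption, since this reference does not say which percentiles the tool reports.

```python
import json
from statistics import mean, median, quantiles, stdev


def summary_stats_sketch(data, columns):
    """Compute basic descriptive statistics for each requested column."""
    names = [c.strip() for c in columns.split(",") if c.strip()]
    out = {}
    for name in names:
        values = data[name]
        # "inclusive" interpolates linearly between the min and max data points.
        q1, _, q3 = quantiles(values, n=4, method="inclusive")
        out[name] = {
            "mean": mean(values),
            "median": median(values),
            "std": stdev(values),  # sample standard deviation (n - 1 denominator)
            "min": min(values),
            "max": max(values),
            "p25": q1,
            "p75": q3,
        }
    return json.dumps(out)


data = {"price": [1, 2, 3, 4, 5], "quantity": [10, 10, 20, 40, 20]}
stats = json.loads(summary_stats_sketch(data, "price, quantity"))
```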

kodeagent.agents.csv_agent.get_value_counts(column: str, top_n: int = 10) → str

Get frequency distribution for a categorical column. Shows the most common values and their counts.

Parameters:
  • column – The column name to analyse.

  • top_n – Number of top values to return (default 10).

Returns:

A JSON string with the frequency distribution.
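
A frequency distribution limited to the `top_n` most common values can be sketched with `collections.Counter`. The function name `value_counts_sketch` and the output keys are hypothetical, and skipping missing values is an assumption about how the tool behaves.

```python
import json
from collections import Counter


def value_counts_sketch(values, top_n=10):
    """Return the top_n most frequent non-missing values with their counts."""
    counts = Counter(v for v in values if v is not None)
    return json.dumps(
        [{"value": value, "count": count} for value, count in counts.most_common(top_n)]
    )


colours = ["red", "blue", "red", "green", "red", "blue", None]
result = json.loads(value_counts_sketch(colours, top_n=2))
```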

kodeagent.agents.csv_agent.init_df_for_analysis(csv_file_path: str) → str

Initialize the DataFrame for analysis. This tool must be called first.

Parameters:

csv_file_path – Path to the CSV file.

Returns:

A string indicating success or failure.

async kodeagent.agents.csv_agent.main() → None

Example usage of the CSVAnalysisAgent.

kodeagent.agents.csv_agent.sample_rows(filter_column: str, filter_value: str, n: int = 5) → str

Get sample rows matching a condition. Useful for investigating specific anomalies or verifying a pattern.

Parameters:
  • filter_column – Column to filter on.

  • filter_value – Value to match (string comparison).

  • n – Number of rows to return (default 5, max 20).

Returns:

A JSON string containing the sample rows.
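
Two details in the parameter list are easy to miss: the match is a string comparison, and `n` is capped at 20. The sketch below illustrates both on a list of row dictionaries. `sample_rows_sketch` is a hypothetical name, and returning the first matches (instead of a random sample) is an assumption.

```python
import json

MAX_ROWS = 20  # upper bound on n stated in the parameter description


def sample_rows_sketch(rows, filter_column, filter_value, n=5):
    """Return up to n rows whose column value, compared as a string, matches."""
    limit = max(0, min(n, MAX_ROWS))
    matches = [
        row for row in rows if str(row.get(filter_column)) == filter_value
    ]
    return json.dumps(matches[:limit])


rows = [{"id": i, "status": 1 if i % 2 else 0} for i in range(100)]
# The integer 1 matches the string "1", and the request for 50 rows is capped at 20.
result = json.loads(sample_rows_sketch(rows, "status", "1", n=50))
```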