kodeagent.agents.csv_agent
CSV Analysis Agent. Wraps KodeAgent’s ReActAgent with tools for analysing a pandas DataFrame. The agent reasons about what to investigate, calls tools, observes results, and produces a structured list of narrative-ready findings.
| Name | Summary |
| --- | --- |
| assess_column | Assess whether a column is worth analysing for a data story. |
| compare_groups | Compare average values of a numeric column across categories. |
| find_anomalies | Find unusually high or low values in a numeric column. |
| find_correlations | Find correlations between numeric columns. |
| find_trends | Analyse how a numeric column changes over time. |
| get_df_schema | Get the schema of the dataset: column names, types, cardinality, sample values, and missing value counts. |
| get_summary_stats | Get summary statistics for one or more numeric columns. |
| get_value_counts | Get frequency distribution for a categorical column. |
| init_df_for_analysis | Initialize the DataFrame for analysis. |
| | Example usage of the CSVAnalysisAgent. |
| sample_rows | Get sample rows matching a condition. |
| CSVAnalysisAgent | An agent specializing in discovering patterns and insights from CSV data. |
- class kodeagent.agents.csv_agent.CSVAnalysisAgent(name: str = 'CSV Analyst', model_name: str = 'gemini/gemini-2.0-flash-lite', **kwargs: Any)
Bases: ReActAgent
An agent specializing in discovering patterns and insights from CSV data.
Examples
- Using a local file:
    agent = CSVAnalysisAgent()
    # Pass the file path (or URL) directly as a task file
    async for response in agent.run(task, files=['/path/to/data.csv']):
        pass
- Using a URL:
    agent = CSVAnalysisAgent()
    async for response in agent.run(task, files=['https://example.com/data.csv']):
        pass
Initialize the CSVAnalysisAgent.
- Parameters:
name – Name of the agent.
model_name – The LLM model to use.
**kwargs – Additional arguments passed to ReActAgent.
- async pre_run() -> AsyncIterator[AgentResponse]
Pre-run hook to auto-load CSV files and yield initialization logs.
- kodeagent.agents.csv_agent.assess_column(column: str) -> str
Assess whether a column is worth analysing for a data story. Returns a judgment and reasoning about the column’s narrative value. Use this when unsure whether to investigate a column further.
- Parameters:
column – The column name to assess.
- Returns:
A text assessment with judgment and reasoning about the column.
- kodeagent.agents.csv_agent.compare_groups(numeric_column: str, category_column: str) -> str
Compare average values of a numeric column across categories. Useful for finding which groups are highest, lowest, or most surprising.
- Parameters:
numeric_column – The numeric column to compare.
category_column – The categorical column to group by.
- Returns:
A JSON string comparing statistical metrics across groups.
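The following is a minimal sketch of the kind of per-group comparison described above, not the library's implementation. It uses only the standard library and takes the data as a list of dicts, whereas the real tool works on the loaded DataFrame. The function name `sketch_compare_groups` and the output keys (`group`, `mean`, `count`) are assumptions made for illustration.

```python
import json
from collections import defaultdict
from statistics import mean


def sketch_compare_groups(rows, numeric_column, category_column):
    """Hypothetical stand-in: mean and row count per category, highest mean first."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[category_column]].append(float(row[numeric_column]))
    summary = [
        {"group": name, "mean": round(mean(values), 2), "count": len(values)}
        for name, values in groups.items()
    ]
    # Sorting by mean makes the highest and lowest groups easy to spot.
    summary.sort(key=lambda item: item["mean"], reverse=True)
    return json.dumps(summary)


rows = [
    {"region": "north", "sales": 100},
    {"region": "north", "sales": 140},
    {"region": "south", "sales": 80},
]
print(sketch_compare_groups(rows, "sales", "region"))
```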
- kodeagent.agents.csv_agent.find_anomalies(column: str) -> str
Find unusually high or low values in a numeric column. Returns count of outliers and the most extreme values.
- Parameters:
column – The numeric column to check for anomalies.
- Returns:
A JSON string or error message detailing detected outliers.
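The reference above does not say which outlier rule the tool applies. As one plausible approach, the sketch below uses the common 1.5 × IQR rule. The function name `sketch_find_anomalies`, the multiplier `k`, and the output keys are assumptions, and the input is a plain list rather than a DataFrame column.

```python
import json
from statistics import quantiles


def sketch_find_anomalies(values, k=1.5):
    """Hypothetical stand-in: flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    if len(values) < 4:
        # Mirrors the "JSON string or error message" return contract.
        return "Error: need at least 4 values to estimate quartiles."
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    outliers = sorted(v for v in values if v < low or v > high)
    return json.dumps({
        "outlier_count": len(outliers),
        "lower_bound": low,
        "upper_bound": high,
        "outliers": outliers,
    })


print(sketch_find_anomalies([10, 12, 11, 13, 12, 11, 95]))
```

Other rules (z-scores, median absolute deviation) would fit the same return shape; the IQR rule is chosen here only because it is robust to the extreme values it is trying to find.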
- kodeagent.agents.csv_agent.find_correlations(columns: str) -> str
Find correlations between numeric columns. Returns pairs with strong positive or negative relationships.
- Parameters:
columns – Comma-separated column names to correlate, e.g. “sales,profit,units”
- Returns:
A JSON string listing strong statistical correlations.
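To illustrate the comma-separated `columns` argument and the idea of reporting only strong pairs, here is a small sketch based on the Pearson coefficient. The threshold of 0.7, the function names, and the output keys are assumptions; the source does not state which coefficient or cutoff the tool uses.

```python
import json
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def sketch_find_correlations(table, columns, threshold=0.7):
    """Hypothetical stand-in: list column pairs with |r| >= threshold."""
    names = [c.strip() for c in columns.split(",")]
    strong = []
    for a, b in combinations(names, 2):
        r = pearson(table[a], table[b])
        if abs(r) >= threshold:
            strong.append({"pair": [a, b], "r": round(r, 3)})
    return json.dumps(strong)


table = {
    "sales": [10, 20, 30, 40],
    "profit": [1, 2, 3, 4],
    "units": [5, 1, 4, 2],
}
print(sketch_find_correlations(table, "sales,profit,units"))
```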
- kodeagent.agents.csv_agent.find_trends(numeric_column: str, time_column: str) -> str
Analyse how a numeric column changes over time. Returns direction, magnitude of change, and whether the trend reversed.
- Parameters:
numeric_column – The numeric column to analyse.
time_column – The time/date column to use as x-axis.
- Returns:
A JSON string or error message describing the detected trend.
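The three outputs named above (direction, magnitude of change, reversal) can be approximated very simply. The sketch below is an assumption-heavy illustration, not the tool's actual logic: it compares the first, middle, and last values after sorting by time, and it expects sortable time keys such as ISO date strings. The function name and output keys are hypothetical.

```python
import json


def sketch_find_trends(points):
    """Hypothetical stand-in. points: list of (time, value) pairs with sortable times."""
    if len(points) < 3:
        return "Error: need at least 3 points to describe a trend."
    ordered = [value for _, value in sorted(points, key=lambda p: p[0])]
    first, mid, last = ordered[0], ordered[len(ordered) // 2], ordered[-1]
    change = last - first
    direction = "up" if change > 0 else "down" if change < 0 else "flat"
    # Percent change is undefined when the starting value is zero.
    percent_change = round(100 * change / first, 1) if first else None
    # A reversal here means the first half and second half moved in opposite directions.
    reversed_trend = (mid - first) * (last - mid) < 0
    return json.dumps({
        "direction": direction,
        "percent_change": percent_change,
        "reversed": reversed_trend,
    })


print(sketch_find_trends([("2024-03", 90), ("2024-01", 100), ("2024-02", 150)]))
```

A midpoint comparison is sensitive to noise; a production version would more likely smooth the series or fit a line before judging direction.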
- kodeagent.agents.csv_agent.get_df_schema() -> str
Get the schema of the dataset: column names, types, cardinality, sample values, and missing value counts. Always call this first.
- Returns:
A JSON string containing the dataset schema and summary.
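As a rough picture of what such a schema payload might contain, the sketch below derives the listed fields (type, cardinality, sample values, missing count) from a list of dicts. The function name `sketch_schema` and the exact JSON layout are assumptions; the real tool reads from the loaded DataFrame and may report different keys.

```python
import json


def sketch_schema(rows):
    """Hypothetical stand-in: per-column type, cardinality, samples, and missing count."""
    columns = {}
    for name in rows[0]:
        # Treat None and empty strings as missing values.
        present = [r[name] for r in rows if r.get(name) not in (None, "")]
        columns[name] = {
            "type": type(present[0]).__name__ if present else "unknown",
            "cardinality": len(set(present)),
            "sample_values": present[:3],
            "missing": len(rows) - len(present),
        }
    return json.dumps({"row_count": len(rows), "columns": columns})


rows = [
    {"city": "Oslo", "temp": 3.5},
    {"city": "Rome", "temp": None},
    {"city": "Oslo", "temp": 9.0},
]
print(sketch_schema(rows))
```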
- kodeagent.agents.csv_agent.get_summary_stats(columns: str) -> str
Get summary statistics for one or more numeric columns. Returns mean, median, std, min, max, and percentiles.
- Parameters:
columns – Comma-separated column names, e.g. “price,quantity,age”
- Returns:
A JSON string with summary statistics for the requested columns.
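The statistics listed above can be sketched with the standard `statistics` module. This is an illustration under assumptions, not the tool's code: the function name, the `p25`/`p75` keys, and the choice of quartiles as the reported percentiles are all hypothetical.

```python
import json
from statistics import mean, median, quantiles, stdev


def sketch_summary_stats(table, columns):
    """Hypothetical stand-in: summary statistics for comma-separated column names."""
    stats = {}
    for name in (c.strip() for c in columns.split(",")):
        values = table[name]
        # statistics.quantiles defaults to the "exclusive" method, so quartiles
        # can differ slightly from other libraries' defaults.
        q1, _, q3 = quantiles(values, n=4)
        stats[name] = {
            "mean": mean(values),
            "median": median(values),
            "std": round(stdev(values), 3),
            "min": min(values),
            "max": max(values),
            "p25": q1,
            "p75": q3,
        }
    return json.dumps(stats)


print(sketch_summary_stats({"price": [2, 4, 4, 6]}, "price"))
```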
- kodeagent.agents.csv_agent.get_value_counts(column: str, top_n: int = 10) -> str
Get frequency distribution for a categorical column. Shows the most common values and their counts.
- Parameters:
column – The column name to analyse.
top_n – Number of top values to return (default 10).
- Returns:
A JSON string with the frequency distribution.
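A frequency distribution limited to the `top_n` most common values can be sketched with `collections.Counter`. The function name `sketch_value_counts` and the `share` field are assumptions added for illustration; only `top_n` and its default of 10 come from the signature above.

```python
import json
from collections import Counter


def sketch_value_counts(values, top_n=10):
    """Hypothetical stand-in: the top_n most common values with counts and shares."""
    counts = Counter(values)
    total = len(values)
    return json.dumps([
        {"value": value, "count": count, "share": round(count / total, 3)}
        for value, count in counts.most_common(top_n)
    ])


print(sketch_value_counts(["a", "b", "a", "c", "a", "b"], top_n=2))
```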
- kodeagent.agents.csv_agent.init_df_for_analysis(csv_file_path: str) -> str
Initialize the DataFrame for analysis. This tool must be called before any other analysis tool.
- Parameters:
csv_file_path – Path to the CSV file.
- Returns:
A string indicating success or failure.
- kodeagent.agents.csv_agent.sample_rows(filter_column: str, filter_value: str, n: int = 5) -> str
Get sample rows matching a condition. Useful for investigating specific anomalies or verifying a pattern.
- Parameters:
filter_column – Column to filter on.
filter_value – Value to match (string comparison).
n – Number of rows to return (default 5, max 20).
- Returns:
A JSON string containing the sample rows.
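The two details worth illustrating here are the string comparison on `filter_value` and the cap of 20 rows. The sketch below shows one way both could behave; the function name, the `MAX_ROWS` constant, and the list-of-dicts input are assumptions rather than the tool's actual implementation.

```python
import json

MAX_ROWS = 20  # cap taken from the parameter description ("max 20")


def sketch_sample_rows(rows, filter_column, filter_value, n=5):
    """Hypothetical stand-in: first n rows whose column value equals filter_value as a string."""
    n = max(0, min(n, MAX_ROWS))
    # Values are converted to str first, so "7" matches the integer 7.
    matches = [row for row in rows if str(row.get(filter_column)) == filter_value]
    return json.dumps(matches[:n])


rows = [{"id": i, "status": "late" if i % 2 else "ok"} for i in range(50)]
print(sketch_sample_rows(rows, "id", "7"))
```

Because the comparison is on strings, numeric filters need to match the stored text form exactly; in this sketch `"7.0"` would not match the integer `7`.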