kodeagent.tools.extract_as_markdown

kodeagent.tools.extract_as_markdown(url_or_path: str, max_length: int = 20000) → str

Extract content from documents (PDF, DOCX, XLSX, PPTX) as Markdown text. Works with both URLs and local file paths.

Supported formats:
  • PDF files (.pdf)
  • Word documents (.docx)
  • Excel spreadsheets (.xlsx)
  • PowerPoint presentations (.pptx)

For reading HTML web pages, use ‘read_webpage’ instead (faster and cleaner).

Examples
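A minimal sketch of calling the tool only for the documented formats. The helper names is_supported_document and extract_document are hypothetical, not part of kodeagent. The sketch assumes that the file extension reflects the actual format and that the tool can be called directly as a plain function with the signature shown above.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions listed under "Supported formats" above.
SUPPORTED_SUFFIXES = {".pdf", ".docx", ".xlsx", ".pptx"}


def is_supported_document(url_or_path: str) -> bool:
    """Return True if the URL or path ends in a documented extension."""
    parsed = urlparse(url_or_path)
    # For http(s) URLs, inspect only the path so query strings are ignored.
    path = parsed.path if parsed.scheme in ("http", "https") else url_or_path
    return PurePosixPath(path).suffix.lower() in SUPPORTED_SUFFIXES


def extract_document(url_or_path: str, max_length: int = 20000) -> str:
    """Hypothetical wrapper: reject unsupported inputs before extracting."""
    if not is_supported_document(url_or_path):
        raise ValueError(
            f"{url_or_path!r} is not a PDF, DOCX, XLSX, or PPTX file; "
            "for HTML pages use read_webpage instead"
        )
    # Imported lazily so the check above works without kodeagent installed.
    from kodeagent.tools import extract_as_markdown

    return extract_as_markdown(url_or_path, max_length=max_length)
```

With kodeagent installed, extract_document("report.pdf") would return the Markdown text of a local file, and passing a URL such as "https://example.com/slides.pptx" would work the same way. Both inputs are placeholders.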

Parameters:
  • url_or_path – URL or file path to a PDF, DOCX, XLSX, or PPTX file.

  • max_length – Optional limit on output length in characters (default 20000). Output from very long documents is truncated to this limit, so content beyond it may be lost.
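Because truncation can drop content without warning, a caller may want to flag results that reach the limit. The following is a heuristic sketch only: this page does not say whether a truncation marker is appended or whether the cut falls exactly at max_length, so the length check is an assumption. The helper extract_with_truncation_flag is hypothetical and takes the extraction function as an argument.

```python
from typing import Callable, Tuple


def extract_with_truncation_flag(
    extract: Callable[..., str],
    url_or_path: str,
    max_length: int = 20000,
) -> Tuple[str, bool]:
    """Run an extractor and report whether the output reached max_length.

    `extract` is expected to accept (url_or_path, max_length=...), like
    kodeagent.tools.extract_as_markdown. The flag is a heuristic: output at
    or above the limit was probably truncated, but this is not guaranteed.
    """
    text = extract(url_or_path, max_length=max_length)
    possibly_truncated = len(text) >= max_length
    return text, possibly_truncated
```

If the flag is set, one option is to call the function again with a larger max_length and compare the results.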

Returns:

Document content as Markdown text, or an error message if extraction fails.