Alteryx Python SDK not calling ii_init and ii_push_record

I'm building an Alteryx Python SDK tool that processes PDFs and extracts tables from them. The plugin consists of an XML configuration file, an HTML user interface, and a Python engine script. Although the class implements the required pi_* and ii_* methods, ii_init and ii_push_record are never called when the plugin runs.

XML Configuration (PDFExtractToolConfig.xml):

<?xml version="1.0"?>
<AlteryxJavaScriptPlugin>
  <EngineSettings EngineDll="Python" EngineDllEntryPoint="PDFExtractToolEngine.py" SDKVersion="10.1" />
  <GuiSettings Html="PDFExtractToolGUI.html" Icon="PDFExtractTool.png" Help="https://your-help-link-here.com" SDKVersion="10.1">
    <InputConnections>
      <Connection Name="Input" AllowMultiple="False" Optional="False" Type="Connection" Label="PDF Input"/>
    </InputConnections>
    <OutputConnections>
      <Connection Name="Output" AllowMultiple="False" Optional="False" Type="Connection" Label="Table Output"/>
      <Connection Name="ErrorOutput" AllowMultiple="False" Optional="False" Type="Connection" Label="Error Output"/>
    </OutputConnections>
  </GuiSettings>
  <Properties>
    <MetaInfo>
      <Name>PDF to Table</Name>
      <Description>Reads PDFs and extracts tables</Description>
      <CategoryName>Data Parsing</CategoryName>
      <SearchTags>pdf, table, parsing</SearchTags>
      <ToolVersion></ToolVersion>
      <Author></Author>
      <Company></Company>
      <Copyright></Copyright>
    </MetaInfo>
  </Properties>
</AlteryxJavaScriptPlugin>

HTML User Interface (PDFExtractToolGUI.html):

<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <title>PDF to Table</title>
  <script type="text/javascript">
    document.write('<link rel="import" href="' + window.Alteryx.LibDir + '2/lib/includes.html">');
  </script>
</head>
<body>
  <h1>PDF to Table</h1>
  <form>
    <fieldset>
      <legend>XMSG("Select Options")</legend>
      <section>
        <label>XMSG("Select a PDF file:")</label>
        <ayx aria-label="data-source-metainfo-filebrowse" data-ui-props='{type:"FileBrowse", widgetId:"dataSourceFilePath", fileTypeFilters: "PDF Data Source|*.pdf|All Files|*.*", placeholder:"XMSG("Select .pdf file...")"}' data-item-props='{dataName: "PdfField", dataType:"SimpleString"}'></ayx>
      </section>
    </fieldset>
  </form>

  <style>
    body {
      font-size: 10pt;
      font-family: Arial, sans-serif;
      margin: 20px;
    }

    legend {
      border: none;
    }

    fieldset {
      border: 2px solid #EA7C7C;
      border-radius: 5px;
    }

    section, label, select, checkbox, input {
      padding: 10px 0;
    }
  </style>
</body>
</html>
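
The FileBrowse widget above stores its value under the data name PdfField. In the SDK's published examples, values set in the HTML GUI are typically delivered to the Python engine inside the configuration XML passed to pi_init, not as a field of the incoming record stream, so that may be worth checking here. A minimal sketch, assuming the configuration arrives as <Configuration><PdfField>...</PdfField></Configuration> (the exact shape is an assumption, and read_pdf_path and sample_xml are hypothetical names, not part of the SDK):

```python
import xml.etree.ElementTree as Et


def read_pdf_path(str_xml: str):
    """Return the PdfField value from a configuration XML string, or None."""
    root = Et.fromstring(str_xml)
    node = root.find("PdfField")  # element name assumed to match the widget's dataName
    if node is None or node.text is None:
        return None
    path = node.text.strip()
    return path or None


# Hypothetical configuration string, shaped like what pi_init is assumed to receive.
sample_xml = "<Configuration><PdfField>C:\\data\\report.pdf</PdfField></Configuration>"
print(read_pdf_path(sample_xml))
print(read_pdf_path("<Configuration />"))
```

If the path does come from the configuration, a lookup like record_info_in.get_field_by_name('PdfField') would only succeed when the upstream tool happens to supply a column with that same name.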

Python Script (PDFExtractToolEngine.py):

import AlteryxPythonSDK as Sdk
import tabula

class AyxPlugin:
    def __init__(self, n_tool_id: int, alteryx_engine: object, output_anchor_mgr: object):
        self.n_tool_id = n_tool_id
        self.alteryx_engine = alteryx_engine
        self.output_anchor_mgr = output_anchor_mgr
        self.input = None

    def pi_init(self, str_xml: str):
        self.input = None

    def pi_add_incoming_connection(self, str_type: str, str_name: str) -> object:
        self.input = self
        return self

    def pi_add_outgoing_connection(self, str_name: str) -> bool:
        return True

    def pi_push_all_records(self, n_record_limit: int) -> bool:
        self.alteryx_engine.output_message(self.n_tool_id, Sdk.EngineMessageType.info, "Processing PDFs...")
        return False

    def pi_close(self, b_has_errors: bool):
        self.alteryx_engine.output_message(self.n_tool_id, Sdk.EngineMessageType.info, "pi_close...")

    def ii_init(self, record_info_in: Sdk.RecordInfo) -> bool:
        self.pdf_field = record_info_in.get_field_by_name('PdfField')
        return True

    def ii_push_record(self, in_record: Sdk.RecordRef) -> bool:
        pdf_path = self.pdf_field.get_as_string(in_record)

        if pdf_path:
            self.alteryx_engine.output_message(self.n_tool_id, Sdk.EngineMessageType.info, f"Processing PDF: {pdf_path}")
            try:
                tables = tabula.read_pdf(pdf_path, pages='all')
                for table_num, table in enumerate(tables):
                    self.alteryx_engine.output_message(self.n_tool_id, Sdk.EngineMessageType.info, f"Extracted Table {table_num + 1}:")
                    self.alteryx_engine.output_message(self.n_tool_id, Sdk.EngineMessageType.info, str(table))
                    # Process and push the table data downstream here
            except Exception as e:
                self.alteryx_engine.output_message(self.n_tool_id, Sdk.EngineMessageType.error, f"Error processing PDF: {e}")
                return False
        else:
            self.alteryx_engine.output_message(self.n_tool_id, Sdk.EngineMessageType.warning, "No PDF path found in input record.")

        return True

    def ii_update_progress(self, d_percent: float):
        pass

    def ii_close(self):
        pass

To debug, I added alteryx_engine.output_message calls to print progress messages. None of the messages from ii_init or ii_push_record are displayed, which indicates that these methods are never called.
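
One way to see exactly which callbacks fire, without editing every method by hand, is a class decorator that logs entry into each pi_* and ii_* method. The sketch below is only an illustration: FakeEngine, trace_callbacks, and DemoPlugin are hypothetical names, FakeEngine merely mimics the output_message(tool_id, message_type, text) call used in the script above, and the string "info" stands in for Sdk.EngineMessageType.info so the example runs outside Designer.

```python
import functools


class FakeEngine:
    """Hypothetical stand-in for alteryx_engine; it only collects messages."""

    def __init__(self):
        self.messages = []

    def output_message(self, tool_id, message_type, text):
        self.messages.append((tool_id, message_type, text))


def _traced(name, func):
    """Wrap one method so that entering it emits an info message first."""
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        # Inside Designer, "info" would be Sdk.EngineMessageType.info.
        self.alteryx_engine.output_message(self.n_tool_id, "info", f"entered {name}")
        return func(self, *args, **kwargs)
    return wrapper


def trace_callbacks(cls):
    """Class decorator: trace every pi_* and ii_* method defined on cls."""
    for name, attr in list(vars(cls).items()):
        if callable(attr) and name.startswith(("pi_", "ii_")):
            setattr(cls, name, _traced(name, attr))
    return cls


@trace_callbacks
class DemoPlugin:
    """Hypothetical cut-down plugin used only to exercise the decorator."""

    def __init__(self, n_tool_id, alteryx_engine, output_anchor_mgr):
        self.n_tool_id = n_tool_id
        self.alteryx_engine = alteryx_engine
        self.output_anchor_mgr = output_anchor_mgr

    def pi_init(self, str_xml):
        pass

    def pi_add_incoming_connection(self, str_type, str_name):
        return self

    def ii_init(self, record_info_in):
        return True


engine = FakeEngine()
plugin = DemoPlugin(1, engine, None)
plugin.pi_init("<Configuration />")
plugin.pi_add_incoming_connection("Input", "")
called = [text for _, _, text in engine.messages]
print(called)
```

Applied to the real AyxPlugin class, the same decorator would show in the Designer messages whether pi_add_incoming_connection is reached at all, which narrows down whether the problem lies in the connection setup or in the ii_* methods themselves.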

I would appreciate any insights or suggestions on why the ii_init and ii_push_record methods might not be called and how I can troubleshoot this issue.
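
For comparison, here is the call order as I understand it from the SDK documentation: the ii_* callbacks are invoked on whatever object pi_add_incoming_connection returns, and only when an upstream tool is actually connected, while pi_push_all_records is the entry point for a tool with no incoming connection. The driver below is a simplified simulation under that assumption, not the real engine; RecordingPlugin and drive are hypothetical names.

```python
class RecordingPlugin:
    """Hypothetical minimal plugin that records which callbacks were invoked."""

    def __init__(self):
        self.calls = []

    def pi_init(self, str_xml):
        self.calls.append("pi_init")

    def pi_add_incoming_connection(self, str_type, str_name):
        self.calls.append("pi_add_incoming_connection")
        return self  # ii_* callbacks are assumed to be invoked on this object

    def pi_push_all_records(self, n_record_limit):
        self.calls.append("pi_push_all_records")
        return True

    def pi_close(self, b_has_errors):
        self.calls.append("pi_close")

    def ii_init(self, record_info_in):
        self.calls.append("ii_init")
        return True

    def ii_push_record(self, in_record):
        self.calls.append("ii_push_record")
        return True

    def ii_close(self):
        self.calls.append("ii_close")


def drive(plugin, records, upstream_connected):
    """Simplified simulation of the assumed engine call order."""
    plugin.pi_init("<Configuration />")
    if upstream_connected:
        incoming = plugin.pi_add_incoming_connection("Input", "")
        if incoming.ii_init(None):
            for record in records:
                if not incoming.ii_push_record(record):
                    break
        incoming.ii_close()
    else:
        # No upstream tool: the ii_* methods are never reached in this branch.
        plugin.pi_push_all_records(0)
    plugin.pi_close(False)
    return plugin.calls


with_input = drive(RecordingPlugin(), ["row1", "row2"], upstream_connected=True)
without_input = drive(RecordingPlugin(), [], upstream_connected=False)
print(with_input)
print(without_input)
```

If the "Processing PDFs..." message from pi_push_all_records shows up in a run, that would be consistent with the second branch, meaning the engine is treating the tool as having no connected input.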

Thank you in advance for your assistance!
