Set an argument for a function down the chain


I'm working with the Confluence loader in LangChain. I need to change what the Confluence API returns by passing a value that expands the response with the fields I want, as explained here.

The Confluence loader has a method called load:

def load(
        self,
        space_key: Optional[str] = None,
        page_ids: Optional[List[str]] = None,
        label: Optional[str] = None,
        cql: Optional[str] = None,
        include_restricted_content: bool = False,
        include_archived_content: bool = False,
        include_attachments: bool = False,
        include_comments: bool = False,
        content_format: ContentFormat = ContentFormat.STORAGE,
        limit: Optional[int] = 50,
        max_pages: Optional[int] = 1000,
        ocr_languages: Optional[str] = None,
        keep_markdown_format: bool = False,
        keep_newlines: bool = False,
    ) -> List[Document]:

and I call it like this:

documents = loader.load(
    space_key="KB",
    include_attachments=False,
    keep_newlines=True,
    keep_markdown_format=True,
)

The load method also contains this if-condition:

        if space_key:
            pages = self.paginate_request(
                self.confluence.get_all_pages_from_space,
                space=space_key,
                limit=limit,
                max_pages=max_pages,
                status="any" if include_archived_content else "current",
                expand=content_format.value,
            )
            docs += self.process_pages(
                pages,
                include_restricted_content,
                include_attachments,
                include_comments,
                content_format,
                ocr_languages=ocr_languages,
                keep_markdown_format=keep_markdown_format,
                keep_newlines=keep_newlines,
            )

The expand argument uses the value of content_format, which comes from this class:

class ContentFormat(str, Enum):
    """Enumerator of the content formats of Confluence page."""

    EDITOR = "body.editor"
    EXPORT_VIEW = "body.export_view"
    ANONYMOUS_EXPORT_VIEW = "body.anonymous_export_view"
    STORAGE = "body.storage"
    VIEW = "body.view"

    def get_content(self, page: dict) -> str:
        return page["body"][self.name.lower()]["value"]
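
To make the two roles of the enum explicit, here is a minimal sketch using a trimmed copy of the class and a made-up page dictionary (fake_page is hypothetical, not a real API response). The member's value is what gets passed as expand, while get_content looks the body up by the member's name:

```python
from enum import Enum


class ContentFormat(str, Enum):
    """Trimmed copy of the enum shown above (two members only)."""

    STORAGE = "body.storage"
    VIEW = "body.view"

    def get_content(self, page: dict) -> str:
        # The lookup key comes from the member *name* ("storage"),
        # not from the member value ("body.storage").
        return page["body"][self.name.lower()]["value"]


# Hypothetical, heavily trimmed page payload, for illustration only.
fake_page = {"body": {"storage": {"value": "<p>Hello</p>"}}}

# What load() forwards as the `expand` argument.
expand_param = ContentFormat.STORAGE.value

# What is read back out of the returned page.
content = ContentFormat.STORAGE.get_content(fake_page)
```

Because the value and the name are used independently, changing a member's value should not affect how get_content finds the body.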

The if statement above passes get_all_pages_from_space to paginate_request, which uses it to call the Confluence API. The function looks like this:

    def get_all_pages_from_space(
        self,
        space,
        start=0,
        limit=50,
        status=None,
        expand=None,
        content_type="page",
    ):
        """
        Get all pages from space

        :param space:
        :param start: OPTIONAL: The start point of the collection to return. Default: None (0).
        :param limit: OPTIONAL: The limit of the number of pages to return, this may be restricted by
                            fixed system limits. Default: 50
        :param status: OPTIONAL: list of statuses the content to be found is in.
                                 Defaults to current is not specified.
                                 If set to 'any', content in 'current' and 'trashed' status will be fetched.
                                 Does not support 'historical' status for now.
        :param expand: OPTIONAL: a comma separated list of properties to expand on the content.
                                 Default value: history,space,version.
        :param content_type: the content type to return. Default value: page. Valid values: page, blogpost.
        :return:
        """
        return self.get_all_pages_from_space_raw(
            space=space, start=start, limit=limit, status=status, expand=expand, content_type=content_type
        ).get("results")

I don't understand how I can set a custom value for the expand argument further down the chain. The initial load function accepts no *args/**kwargs that I could use to pass it through.

Update: Until someone gives me a better solution, I have created a new class mimicking ContentFormat, like this:

class _ContentFormat(str, Enum):
    """Enumerator of the content formats of Confluence page."""

    EDITOR = "body.editor"
    EXPORT_VIEW = "body.export_view"
    ANONYMOUS_EXPORT_VIEW = "body.anonymous_export_view"
    STORAGE = "body.storage,version"
    VIEW = "body.view"

    def get_content(self, page: dict) -> str:
        return page["body"][self.name.lower()]["value"]

It is not pretty, but it works for now.
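
One alternative that might avoid copying the enum is to subclass the loader and override paginate_request, since load routes the expand value through it. The sketch below is untested against the real ConfluenceLoader. It assumes paginate_request takes the retrieval function plus keyword arguments, as the call inside load suggests. StandInLoader, ExpandingLoader, and fake_get_all_pages_from_space are hypothetical names, used so the example runs without a Confluence instance:

```python
from typing import Any, Callable, List, Optional


class StandInLoader:
    """Hypothetical stand-in for ConfluenceLoader, reduced to one hook."""

    def paginate_request(self, retrieval_method: Callable, **kwargs: Any) -> List:
        # The real method pages through results; this one makes a single call.
        return retrieval_method(**kwargs)


class ExpandingLoader(StandInLoader):
    """Appends extra fields to whatever `expand` value is passed down."""

    def __init__(self, extra_expand: str) -> None:
        self.extra_expand = extra_expand

    def paginate_request(self, retrieval_method: Callable, **kwargs: Any) -> List:
        base = kwargs.get("expand")
        if base:
            # Keep the original value first so the body format is still requested.
            kwargs["expand"] = f"{base},{self.extra_expand}"
        return super().paginate_request(retrieval_method, **kwargs)


def fake_get_all_pages_from_space(
    space: str, expand: Optional[str] = None, **_: Any
) -> List[dict]:
    # Hypothetical stand-in for the API call; echoes back what it received.
    return [{"space": space, "expand": expand}]


loader = ExpandingLoader(extra_expand="version")
pages = loader.paginate_request(
    fake_get_all_pages_from_space, space="KB", limit=50, expand="body.storage"
)
```

With the real loader, the subclass would also have to forward the usual constructor arguments (URL, credentials, and so on) to the parent class, and the override would apply to every request that goes through paginate_request, not only the space_key branch.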
