Is there a way to copy the text of an .ods and .pdf file into a libreoffice .odt file?

366 views Asked by At

I'm trying to create a libreoffice basic macro which lets you copy the entire content of a file into a table row. The code below works correctly with text files, such as .ods or .txt, but has some problems with .pdf and .ods files. In particular it crashes on the getText() method. Do you know any other method that I could use in order to solve my problem?

`

REM ***The file Url***
sUrlDoc = "file:///C:/Users/user/Desktop/Test.ods"

REM ***It correctly opens the file***
oDoc = StarDesktop.loadComponentFromURL(sUrlDoc, "_blank", 0, Prop() )

REM ***Correctly inserts a new row in the table***
oTable.Rows.insertByIndex(oTable.getRows().getCount(),1)

REM ***It goes into the right position***
oCell = oTable.getCellByPosition(0,1)

REM ***Should read from file (only works with .odt and .txt)***
oCursor = oDoc.getText(1)
oCell.setString(oCursor.string)

oDoc.close(true)`
1

There are 1 answers

1
JohnSUN On BEST ANSWER

You can get the context of an ODS file in several ways.

The slowest of these is to iterate over all the data in the workbook sheet by sheet and cell by cell, taking out the text content of each cell.

I suggest using the method that Andrew Pitonyak shows in chapter 5.23. Manipulating the clipboard (keep this book close at hand and you will not have to write many macros for solving everyday tasks - you will just take ready-made code)

Function getContentODS(sDocName As String) As String 
Dim oDoc As Variant         ' Spreadsheet as object
Dim bDisposable As Boolean  ' Can be closed
Dim oSheets As Variant      ' All sheets of oDoc
Dim oSheet As Variant       ' Single sheet
Dim i As Long           
Dim oCurrentController As Variant
Dim oCursor As Variant      ' Get Used Area
Dim oTransferable As Variant    ' Content of selection
Dim oTransferDataFlavors As Variant
Dim oConverter As Variant   ' Util
Dim j As Integer, iTextLocation As Integer
Dim oData As Variant
Dim sResult As String       ' All content as very long string
    GlobalScope.BasicLibraries.loadLibrary("Tools")
    If Not FileExists(sDocName) Then Exit Function 
    oDoc = OpenDocument(ConvertToURL(sDocName), Array(), bDisposable)
    sResult = FileNameoutofPath(sDocName) & ": "
    oCurrentController = oDoc.getCurrentController()
    oSheets = oDoc.getSheets()
    oConverter = createUnoService("com.sun.star.script.Converter")
    For i = 0 to oSheets.getCount()-1
        oSheet = oSheets.getByIndex(i)
        oCursor = oSheet.createCursor()
        oCursor.gotoEndOfUsedArea(True)
        oCurrentController.select(oCursor)
        oTransferable = oCurrentController.getTransferable()
        oTransferDataFlavors = oTransferable.getTransferDataFlavors()
        iTextLocation = -1
        For j = LBound(oTransferDataFlavors) To UBound(oTransferDataFlavors)
            If oTransferDataFlavors(j).MimeType = "text/plain;charset=utf-16" Then
                iTextLocation = j
                Exit For
            End If
        Next
        If (iTextLocation >= 0) Then
            oData = oTransferable.getTransferData(oTransferDataFlavors(iTextLocation))
            sResult = sResult & oSheet.getName() & "=" & _
                oConverter.convertToSimpleType(oData, com.sun.star.uno.TypeClass.STRING) & "; "
        End If
    Next i
    If bDisposable Then oDoc.close(True)
    getContentODS = sResult
End Function

This function will open the spreadsheet, the path and name of which it will receive in the parameter, iterate over all the sheets, taking out the text content and concatenating it into one long string variable, and at the end closes the document.

You can test this code using this procedure:

Sub tst
    MsgBox getContentODS("C:\Users\user\Desktop\Test.ods")
End Sub

So the function will return a string for you. Think about what to do with this line (or look in chapter 7. Writer Macros)

To obtain the text part of a PDF document, you can use a similar technique (copying content to the clipboard from AcrobatReader and taking out only the text part of the copied one) or open it in Draw and iterate over all graphic elements in order to take fragments of text from them.