Get a SharePoint file programmatically (I know the URL from a web browser)


I use SharePoint from a web browser.

I am viewing a file (an Excel file) that has the following URL:

https://mysite.sharepoint.com/:x:/r/_layouts/15/Doc.aspx?sourcedoc=%7Bxxxxxx-xxxxx-xxxx%7D&file=filename.xlsx&action=default&mobileredirect=true

If I right-click on the document and click on 'Copy URL', I get a URL of this type:

https://mysite.sharepoint.com/:x:/g/XXXXXXXXX-sXXXXXXXX?e=yyyyy

My question is: how can I get this document programmatically?

I tried using Office365-REST-Python-Client (version 2.5.2) and managed to get a client context (authenticating with the same username and password that I use in my browser) with the following snippet:

import os

from office365.runtime.auth.authentication_context import AuthenticationContext
from office365.sharepoint.client_context import ClientContext

def get_ctx():
    sharepoint_url = os.environ["SP_URL"]
    username = os.environ["SP_USER"]
    password = os.environ["SP_PASSWORD"]
    auth_ctx = AuthenticationContext(url=sharepoint_url)
    auth_ctx.acquire_token_for_user(username, password)
    ctx = ClientContext(sharepoint_url, auth_ctx)
    return ctx

I managed to list all existing document libraries and the items in each of them:

def print_progress(items):
    # Stand-in for the progress callback, which was not shown;
    # assumed to receive the items loaded so far.
    print("Items read: {0}".format(len(items)))

def get_doc_libraries(ctx):
    web = ctx.web
    ctx.load(web)
    ctx.execute_query()

    lists = web.lists
    ctx.load(lists)
    ctx.execute_query()
    for sp_list in lists:
        props = sp_list.properties
        if props['BaseTemplate'] == 101:  # 101 = document library
            library_name = props["Title"]
            doc_library = ctx.web.lists.get_by_title(library_name)
            ctx.load(doc_library)
            ctx.execute_query()

            items = doc_library.get_items()
            ctx.load(items)
            ctx.execute_query()

            # Read the library's items in pages of 500
            paged_items = doc_library.items.paged(500, page_loaded=print_progress).get().execute_query()
            for item in paged_items:
                pass  # do something with item

I get a few thousand items, which might be realistic for this SharePoint URL, but most of the items don't have titles, and I don't know how to find out whether any of them refers to the document that I have a URL for.

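My attempts to download the file with the following function fail:

from office365.sharepoint.files.file import File

def get_file_with_rel_url(ctx, url, sharepoint_url):
    # Strip the site URL to obtain a (supposedly) server-relative URL
    sharepoint_url = sharepoint_url.rstrip("/")
    rel_url = url.replace(sharepoint_url, "")

    response = File.open_binary(ctx, rel_url)

    with open("bla.xls", 'wb') as output_file:
        output_file.write(response.content)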

I get an error message like this:

{"error":{"code":"-2130575338, Microsoft.SharePoint.SPException","message":{"lang":"fr-FR","value":"Le fichier /:x:/g/XXXXXX-XXXXXX n'existe pas."}}}

In English, this means: "The file /:x:/g/XXXXXX-XXXXXX doesn't exist."

I think the URLs given by the web browser aren't the ones I should use with the API.
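
For what it's worth, Microsoft Graph (a different API from the SharePoint REST API used above) accepts sharing URLs once they are encoded as a share ID: the URL is base64-encoded, converted to unpadded base64url, and prefixed with `u!`. Below is a minimal sketch of that encoding only. The function names are hypothetical, and actually calling the resulting endpoint would require a Graph access token, which is not shown here.

```python
import base64


def sharing_url_to_share_id(sharing_url: str) -> str:
    """Encode a sharing URL as a Graph share ID: "u!" + unpadded base64url."""
    # urlsafe_b64encode already maps "+" to "-" and "/" to "_"
    encoded = base64.urlsafe_b64encode(sharing_url.encode("utf-8")).decode("ascii")
    # Graph expects the trailing "=" padding to be removed
    return "u!" + encoded.rstrip("=")


def shared_drive_item_url(sharing_url: str) -> str:
    """Build the Graph URL that resolves a sharing URL to its driveItem (hypothetical helper)."""
    share_id = sharing_url_to_share_id(sharing_url)
    return "https://graph.microsoft.com/v1.0/shares/" + share_id + "/driveItem"
```

With the 'Copy URL' link as input, `shared_drive_item_url(...)` gives the address to query for the item's metadata. Whether the account is allowed to use Graph on this tenant is a separate question.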

But I don't know how to determine the right URL, or how to get something like a unique ID for the document that I can use to fetch it.

The file I am looking at is not one that I own. It was shared by somebody else, but I have read and write permissions to it (in my browser).
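
One candidate for such an ID is the `sourcedoc` query parameter of the Doc.aspx URL, which carries a GUID wrapped in URL-encoded braces (`%7B` ... `%7D`). Assuming this GUID is the file's unique ID (an assumption that is not confirmed here), it could be extracted and plugged into the SharePoint REST endpoint `GetFileById`. The helper names and the sample GUID in the usage note are made up for illustration.

```python
from urllib.parse import parse_qs, urlparse


def extract_sourcedoc_guid(doc_url: str) -> str:
    """Return the GUID from the 'sourcedoc' query parameter, without braces."""
    query = parse_qs(urlparse(doc_url).query)
    raw = query["sourcedoc"][0]  # parse_qs already decodes %7B and %7D to { and }
    return raw.strip("{}")


def file_by_id_value_url(site_url: str, guid: str) -> str:
    """Build the REST URL that returns the file's binary content (hypothetical helper)."""
    return site_url.rstrip("/") + "/_api/web/GetFileById('" + guid + "')/$value"
```

For example, a URL containing `sourcedoc=%7B01234567-89ab-cdef-0123-456789abcdef%7D` yields the GUID `01234567-89ab-cdef-0123-456789abcdef`. The request would have to be sent with the same authentication as the client context, against the site that actually hosts the file.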
