Getting HTML source using WebKit2

I use a small Python 3 program that I would like to switch from WebKit to WebKit2. In short, I open a website in a Gtk window and would like to store the HTML code in a variable. With WebKit it's easy:

import sys
import gi
gi.require_version('WebKit', '3.0') 
from gi.repository import WebKit
from gi.repository import Gtk, Gdk

starturl = 'https://www.google.de'

def printHTML(webview, frame):
    html = frame.get_data_source().get_data().str
    print(html)

webview = WebKit.WebView()
webview.open(starturl)
webview.connect("load-finished", printHTML)

win = Gtk.Window()
win.add(webview)
win.connect("delete-event", Gtk.main_quit)
win.set_default_size(800,600)
win.show_all()

Gtk.main()

With WebKit2 everything works except one thing: I can't get the HTML code.

import sys
import gi
gi.require_version('WebKit2', '4.0') 
from gi.repository import WebKit2
from gi.repository import Gtk, Gdk

starturl = 'https://www.google.de'

def printHTML(webview, event):
    html = ???
    print(html)

webview = WebKit2.WebView()
webview.load_uri(starturl)
webview.connect("load-changed", printHTML)

win = Gtk.Window()
win.add(webview)
win.connect("delete-event", Gtk.main_quit)
win.set_default_size(800,600)
win.show_all()

Gtk.main()

Is there an equivalent of 'frame' in WebKit2? Maybe someone knows a solution.

Regards and thank you. Till

There is 1 answer

Answer by Arjen Balfoort

You need to call WebView.get_main_resource(), request the data from that resource, and supply a callback function that receives the HTML source. The request is asynchronous, so the data only becomes available inside the callback.

This example code was used in a class:

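def on_load_changed(self, webview, event):
    # load-changed fires for every load stage; wait until the page is complete.
    if event == WebKit2.LoadEvent.FINISHED:
        resource = webview.get_main_resource()
        # Asynchronous request: the callback is invoked once the data is ready.
        resource.get_data(None, self._get_response_data_finish, None)

def _get_response_data_finish(self, resource, result, user_data=None):
    # Complete the asynchronous call and store the page source.
    self.html_response = resource.get_data_finish(result)
    print(self.html_response)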
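
For completeness, here is a rough sketch of how the two callbacks might be fitted into the standalone script from the question, without a class. The names bytes_to_html, run, on_data_ready and on_load_changed are made up for this illustration, and the sketch assumes the page is UTF-8 encoded and that get_data_finish() hands back the raw bytes of the page, which is why they are decoded before printing. Treat it as a starting point rather than a tested drop-in.

```python
def bytes_to_html(data, encoding="utf-8"):
    """Decode the raw bytes of a page into text (hypothetical helper)."""
    # The encoding is an assumption, so undecodable bytes are replaced
    # instead of raising an exception.
    return bytes(data).decode(encoding, errors="replace")


def run(url):
    """Open url in a Gtk window and print its HTML once loading has finished."""
    # Imported lazily so the helper above can be used without GTK installed.
    import gi
    gi.require_version('Gtk', '3.0')
    gi.require_version('WebKit2', '4.0')
    from gi.repository import Gtk, WebKit2

    def on_data_ready(resource, result, user_data=None):
        # Completes the asynchronous get_data() call started below.
        print(bytes_to_html(resource.get_data_finish(result)))

    def on_load_changed(webview, event):
        # load-changed fires several times per page load; only act at the end.
        if event == WebKit2.LoadEvent.FINISHED:
            resource = webview.get_main_resource()
            if resource is not None:
                resource.get_data(None, on_data_ready, None)

    webview = WebKit2.WebView()
    webview.connect("load-changed", on_load_changed)
    webview.load_uri(url)

    win = Gtk.Window()
    win.add(webview)
    win.connect("delete-event", Gtk.main_quit)
    win.set_default_size(800, 600)
    win.show_all()
    Gtk.main()
```

Calling run('https://www.google.de') should open the window and print the source once the page has loaded. Nothing runs at import time, so the decoding helper can be tried on its own.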