Loading RDF data into a PostgreSQL table through RDFLib-SQLAlchemy


I have a large RDF dataset (the Geonames dataset, 18 GB) in N-Triples format. I would like to load it into a PostgreSQL relational table using rdflib_sqlalchemy.SQLAlchemy. I know it is possible to run SPARQL queries against RDF data stored in a relational database, but I am not sure how to set it up. Could you please provide an example?

My next goal is to write a SPARQL query from Python using RDFLib, which I already know how to do. Thanks in advance for your help.


There are 2 answers

F1refly On

Install these Python libraries:

pip install rdflib
pip install rdflib-sqlalchemy
pip install psycopg2

Run the following Python code:

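from rdflib import plugin
from rdflib.graph import Graph
from rdflib.store import Store
from rdflib_sqlalchemy import registerplugins

# Register the SQLAlchemy store plugin with RDFLib.
registerplugins()

# Replace user, password, hostname, port, and databasename with your own values.
SQLALCHEMY_URL = "postgresql+psycopg2://user:password@hostname:port/databasename"

# Create the store and a graph backed by it; create=True creates the tables if needed.
store = plugin.get("SQLAlchemy", Store)(identifier="my_store")
graph = Graph(store, identifier="my_graph")
graph.open(SQLALCHEMY_URL, create=True)

# Import the N-Triples file into the database.
graph.parse("demo.nt", format="nt")

# Run a SPARQL query against the stored triples.
result = graph.query("select * where {?s ?p ?o} limit 10")

for subject, predicate, object_ in result:
    print(subject, predicate, object_)

graph.close()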

'demo.nt' is the N-Triples file to import. I used this for testing:

<http://example.org/a> <http://example.org/b> <http://example.org/c> .

After a successful import, your database contains five tables (e.g., kb_[some_id]_asserted_statements) populated with the triples, and the script prints at most ten triples to the console.

Tested on Windows 10, PostgreSQL 10.5, and Python 3.5.4 (all 64-bit) with rdflib-4.2.2, rdflib-sqlalchemy-0.3.8, and psycopg2-2.7.5.
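
For a file as large as the 18 GB dump mentioned in the question, one option is to feed the file to the store in pieces instead of with a single graph.parse call. The following is only a sketch of that idea, not something I have run against the full Geonames data. It relies on N-Triples being line-based (one triple per line). The helper names iter_nt_chunks and load_in_chunks and the default chunk size are made up for illustration, and it assumes the file contains no blank nodes that must stay identical across chunks, since each chunk is parsed separately.

```python
from itertools import islice


def iter_nt_chunks(path, chunk_size=100000):
    """Yield strings that each hold at most chunk_size lines of an N-Triples file."""
    with open(path, encoding="utf-8") as handle:
        while True:
            lines = list(islice(handle, chunk_size))
            if not lines:
                break
            yield "".join(lines)


def load_in_chunks(graph, path, chunk_size=100000):
    """Parse the file chunk by chunk and commit after each chunk.

    graph is expected to be an open rdflib Graph (for example the one
    created above); only its parse() and commit() methods are used.
    Returns the number of chunks that were loaded.
    """
    chunks = 0
    for chunk in iter_nt_chunks(path, chunk_size):
        graph.parse(data=chunk, format="nt")
        graph.commit()
        chunks += 1
    return chunks
```

With the graph opened as in the code above, load_in_chunks(graph, "demo.nt") would take the place of the single graph.parse("demo.nt", format="nt") call. Committing per chunk keeps each transaction small, and the returned count makes it easy to log progress.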

Jim Jones On

Slightly off topic:

If the RDF data is available via a SPARQL endpoint, you might be interested in the rdf_fdw project (PostgreSQL Foreign Data Wrapper for RDF Triplestores). It is fairly new and still being tested. With it, you can access the data directly using simple SQL statements, e.g.

(DBpedia) Living politicians in DBpedia who are affiliated with a party. The party name must have a German translation.

Create a SERVER with the SPARQL endpoint's address

CREATE SERVER dbpedia
FOREIGN DATA WRAPPER rdf_fdw 
OPTIONS (endpoint 'https://dbpedia.org/sparql');

Create a FOREIGN TABLE with the just created SERVER.

CREATE FOREIGN TABLE politicians (
  uri text        OPTIONS (variable '?person', nodetype 'iri'),
  name text       OPTIONS (variable '?personname', nodetype 'literal', literaltype 'xsd:string'),
  birthdate date  OPTIONS (variable '?birthdate', nodetype 'literal', literaltype 'xsd:date'),
  party text      OPTIONS (variable '?partyname', nodetype 'literal', literaltype 'xsd:string'),
  country text    OPTIONS (variable '?country', nodetype 'literal', language 'en')
)
SERVER dbpedia OPTIONS (
  sparql '
    PREFIX dbp: <http://dbpedia.org/property/>
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

    SELECT *
    WHERE {
      ?person
          a dbo:Politician;
          dbo:birthDate ?birthdate;
          dbp:name ?personname;
          dbo:party ?party .
      ?party
          dbp:country ?country;
          rdfs:label ?partyname .
      FILTER NOT EXISTS {?person dbo:deathDate ?died}
      FILTER(LANG(?partyname) = "de")
    }
');

... and then you're able to query the table using simple SQL.

Select the 5 youngest politicians from Germany and France who were born after Dec 31st 1995.

SELECT name, birthdate, party
FROM politicians
WHERE 
  country IN ('Germany','France') AND 
  birthdate > '1995-12-31' AND
  party <> ''
ORDER BY birthdate DESC, party ASC
FETCH FIRST 5 ROWS ONLY;

        name        | birthdate  |                  party                  
--------------------+------------+-----------------------------------------
 Louis Boyard       | 2000-08-26 | La France insoumise
 Klara Schedlich    | 2000-01-04 | Bündnis 90/Die Grünen
 Pierrick Berteloot | 1999-01-11 | Rassemblement National
 Niklas Wagener     | 1998-04-16 | Bündnis 90/Die Grünen
 Jakob Blankenburg  | 1997-08-05 | Sozialdemokratische Partei Deutschlands
(5 rows)
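
Since the question is about working from Python, the same SQL can also be sent through a DB-API connection such as one from psycopg2. This is a rough sketch only: the function name youngest_politicians is invented for illustration, it assumes the politicians foreign table above already exists, and it uses = ANY(%s) with a Python list and LIMIT in place of IN (...) and FETCH FIRST so that all values can be passed as query parameters.

```python
def youngest_politicians(conn, countries, born_after, limit=5):
    """Return (name, birthdate, party) rows from the politicians foreign table.

    conn is an open DB-API connection, e.g. from psycopg2.connect(...).
    countries is a list of country names; psycopg2 adapts a Python list
    to a PostgreSQL array, which is why the query uses = ANY(%s).
    """
    sql = (
        "SELECT name, birthdate, party "
        "FROM politicians "
        "WHERE country = ANY(%s) AND birthdate > %s AND party <> '' "
        "ORDER BY birthdate DESC, party ASC "
        "LIMIT %s"
    )
    with conn.cursor() as cur:
        cur.execute(sql, (list(countries), born_after, limit))
        return cur.fetchall()


# Hypothetical usage (connection values are placeholders):
#   import psycopg2
#   conn = psycopg2.connect("dbname=databasename user=user password=password host=hostname")
#   for name, birthdate, party in youngest_politicians(conn, ["Germany", "France"], "1995-12-31"):
#       print(name, birthdate, party)
#   conn.close()
```

Passing the values as parameters instead of formatting them into the SQL string leaves quoting to the driver.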