Is there a way to use JSON Schema to flag or classify an attribute as PII?


Suppose I have a JSON Schema file that has an email attribute:

{ "from": { "title": "From email address", "type": "email" } }

Is there any way, using standard JSON Schema keywords, to identify the "from" property as a PII property? It would be great to be able to define my own keyword:

{ "from": { "title": "From email address", "type": "email", "pii": true } }

I browsed through the JSON Schema definitions and docs at https://cswr.github.io/JsonSchema/spec/definitions_references/ and could not find anything that would work.

1 Answer

Answer by Jason Desrosiers

You can do this with JSON Schema annotations. Annotations are a way for a schema to attach metadata to locations in a JSON instance. You can then write code that processes the instance however you wish based on those annotations.

An annotation can be as simple as a custom keyword that you choose to use in your schema. Your pii keyword is sufficient.
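To show how little is needed, here is a minimal sketch in plain JavaScript that reads the custom keyword straight from the schema, with no validator involved. findPiiPointers is a hypothetical helper, not part of any library, and the "subject" and "sender" properties are made up for the example. It assumes the schema nests only through "properties"; "$ref", "items", "allOf" and friends are ignored, so treat it as an illustration rather than a real annotation collector.

```javascript
// Hypothetical helper: walk a schema and return JSON Pointers (into the
// instance) for every property whose subschema carries "pii": true.
// Assumption: only nested "properties" are followed; "$ref", "items",
// "allOf", etc. are ignored.
function findPiiPointers(schema, pointer = "") {
  const found = [];
  for (const [name, subschema] of Object.entries(schema.properties ?? {})) {
    // Escape "~" and "/" as required by JSON Pointer (RFC 6901).
    const escaped = name.replace(/~/g, "~0").replace(/\//g, "~1");
    const childPointer = `${pointer}/${escaped}`;
    if (subschema.pii === true) {
      found.push(childPointer);
    }
    // Recurse so that PII flags on nested objects are found too.
    found.push(...findPiiPointers(subschema, childPointer));
  }
  return found;
}

const schema = {
  type: "object",
  properties: {
    from: { title: "From email address", type: "string", format: "email", pii: true },
    subject: { type: "string" },
    sender: {
      type: "object",
      properties: {
        phone: { type: "string", pii: true }
      }
    }
  }
};

console.log(findPiiPointers(schema)); // [ '/from', '/sender/phone' ]
```

A static walk like this breaks down as soon as the schema uses references or conditionals, which is where annotation output from a real implementation becomes worthwhile.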

Few JSON Schema implementations expose annotation data, and almost all of those that do just hand you the raw output and leave you to parse it yourself to get what you need. I've written some JavaScript tooling that makes working with annotations a little easier, but there's nothing like it in other languages as far as I'm aware.
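For a sense of what "parse it yourself" can look like, here is a rough sketch. It assumes the implementation returns something shaped like the "basic" output format described in the JSON Schema 2020-12 specification, where each annotation unit has keywordLocation, instanceLocation, and annotation fields. The sample object is hand-written for illustration, not captured from a real validator, and piiLocations is a hypothetical helper. Implementations vary in which output formats they support and whether they report unknown keywords at all.

```javascript
// Hand-written sample shaped like the JSON Schema 2020-12 "basic" output
// format. Illustrative only; not output captured from a real validator.
const output = {
  valid: true,
  annotations: [
    { keywordLocation: "/properties", instanceLocation: "", annotation: ["name", "email"] },
    { keywordLocation: "/properties/email/format", instanceLocation: "/email", annotation: "email" },
    { keywordLocation: "/properties/email/pii", instanceLocation: "/email", annotation: true }
  ]
};

// Hypothetical helper: return the instance locations (JSON Pointers) whose
// annotation comes from a "pii" keyword with the value true.
function piiLocations(output) {
  return (output.annotations ?? [])
    .filter((unit) => unit.keywordLocation.split("/").pop() === "pii")
    .filter((unit) => unit.annotation === true)
    .map((unit) => unit.instanceLocation);
}

console.log(piiLocations(output)); // [ '/email' ]
```

The resulting pointers can then be used to look up and transform the matching values in the instance.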

import { annotate } from "@hyperjump/json-schema/annotations/experimental";
import * as Instance from "@hyperjump/json-schema/annotated-instance/experimental";
import { addSchema } from "@hyperjump/json-schema/draft-2020-12";
import * as JsonPointer from "@hyperjump/json-pointer";

// Placeholder that returns its input unchanged; swap in real encryption.
const encrypt = (value) => value;

(async function () {
  const schemaId = "https://example.com/foo";
  const dialectId = "https://json-schema.org/draft/2020-12/schema";

  // Register the schema; "pii" is a custom keyword used as an annotation.
  addSchema({
    "type": "object",
    "properties": {
      "name": { "type": "string" },
      "email": {
        "type": "string",
        "format": "email",
        "pii": true
      }
    }
  }, schemaId, dialectId);

  const data = {
    name: "Jason",
    email: "jason@example.com"
  };
  // Evaluate the data against the schema and collect the annotations.
  const instance = await annotate(schemaId, data);

  // Replace every value annotated with "pii" by its encrypted form.
  let encrypted = data;
  for (const pii of Instance.annotatedWith(instance, "pii", dialectId)) {
    encrypted = JsonPointer.set(pii.pointer, encrypted, encrypt(Instance.value(pii)));
  }

  console.log(encrypted);
}());

Output

{
  "name": "Jason",
  "email": "jason@example.com"
}

(Insert your own encrypt function that actually encrypts.)