Internet-Draft | Designing APIs with REST API Linked Data | October 2024 |
Polli | Expires 26 April 2025 | [Page] |
This document provides guidance for designing schemas using REST API Linked Data keywords.¶
This note is to be removed before publishing as an RFC.¶
Status information for this document may be found at https://datatracker.ietf.org/doc/draft-polli-design-process/.¶
Further information can be found at https://github.com/ioggstream/draft-polli-restapi-ld-keywords.¶
Source for this draft and an issue tracker can be found at https://github.com/ioggstream/draft-polli-restapi-ld-keywords/issues.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 26 April 2025.¶
Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
This document provides guidance and examples for JSON Schema modeling using REST API Linked Data keywords.¶
Since REST API Linked Data keywords only support JSON-LD compact notation, this document focuses on JSON objects.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here. These words may also appear in this document in lower case as plain English words, absent their normative meanings.¶
All JSON samples are represented in YAML format for readability and conciseness.¶
The terms "primitive types" and "structured types" are from Section 1 of [JSON].¶
The terms "JSON document", "JSON object", "JSON array" are from [JSON].¶
The terms "schema", "schema instance", "keyword" are from [JSONSCHEMA]; the term "schema instance" refers to a JSON document that conforms to a JSON Schema. Schema instances are conveyed in the example JSON Schema keyword.¶
Examples can be viewed by opening this link in the schema editor.¶
To ensure that the API is extensible and that the data can be easily enriched with additional information, it is a best practice to only convey JSON objects, with other types (primitive or array) being used as references (i.e., using the $ref keyword).¶
Moreover:¶
*  Non-object values cannot be directly mapped to an RDF triple, because a triple also requires a subject and a predicate.¶
*  JSON-LD only supports JSON objects (compact form) and JSON arrays (expanded form).¶
For this reason, [LD-KEYWORDS] does not support adding x-jsonld-type and x-jsonld-context to non-object schemas.¶
For example, the schema instance associated with the following schema is the string Diego Maria.¶
    GivenName:
      type: string
      maxLength: 64
      example: Diego Maria
A modeling strategy for non-object values is to create reusable syntax blocks (i.e., constraining the length or the character set according to the specifications), and to defer the semantics to the containing JSON object.¶
This approach ensures that the same schema can be used in different contexts.¶
    RegistryString:
      type: string
      maxLength: 64
      description: >-
        A string that can be used to represent
        a givenName, a familyName, a patronymicName,
        or any other string associated with a naming property
        registry information.
    Person:
      x-jsonld-type: Person
      x-jsonld-context:
        "@vocab": "https://schema.org/"
      type: object
      properties:
        givenName:
          $ref: "#/components/schemas/RegistryString"
        familyName:
          $ref: "#/components/schemas/RegistryString"
      example:
        givenName: Diego Maria
        familyName: De La Peña
This allows focusing on the semantics of the JSON object, while isolating the efforts of properly constraining the syntax according to the requirement of the specific API.¶
It is possible, in fact, that syntax constraints may change over time, while the semantics of the object remain the same.¶
A specific service might require, for example, specific string constraints such as latinized uppercase.¶
    RegistryStringL:
      type: string
      maxLength: 64
      description: >-
        A latinized string.
      pattern: "^[A-Z ]+$"
    PersonL:
      x-jsonld-type: Person
      x-jsonld-context:
        "@vocab": "https://schema.org/"
      type: object
      properties:
        givenName:
          $ref: "#/components/schemas/RegistryStringL"
        familyName:
          $ref: "#/components/schemas/RegistryStringL"
      example:
        givenName: DIEGO MARIA
        familyName: DE LA PENHA
The RDF graph resulting from the Person example above is:¶

    @prefix schema: <https://schema.org/> .

    _:b0 a schema:Person ;
      schema:familyName "De La Peña" ;
      schema:givenName "Diego Maria" .
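The split between reusable syntax blocks and object-level semantics can be illustrated with a minimal Python sketch. It is not part of [LD-KEYWORDS]: the two dictionaries are hand-written, hypothetical mirrors of the RegistryString and RegistryStringL schemas above, and the conforms helper only checks the maxLength and pattern keywords rather than performing full JSON Schema validation.

```python
import re

# Hypothetical, hand-written mirrors of the two syntax-only schemas above.
REGISTRY_STRING = {"maxLength": 64}
REGISTRY_STRING_L = {"maxLength": 64, "pattern": r"^[A-Z ]+$"}


def conforms(value: str, schema: dict) -> bool:
    """Check a string against the maxLength and pattern keywords only."""
    if len(value) > schema.get("maxLength", len(value)):
        return False
    pattern = schema.get("pattern")
    # JSON Schema patterns are not implicitly anchored, hence re.search.
    return pattern is None or re.search(pattern, value) is not None


print(conforms("De La Peña", REGISTRY_STRING))     # True
print(conforms("De La Peña", REGISTRY_STRING_L))   # False: not latinized uppercase
print(conforms("DE LA PENHA", REGISTRY_STRING_L))  # True
```

Swapping one syntax block for the other changes which strings are accepted, while the x-jsonld-type and x-jsonld-context of the containing object stay untouched.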
Identifiers are a special case of text-based entries, and isolating syntax from semantics can make things more readable.¶
    NumericTaxCode:
      description: >-
        Legal persons have an 11-digit tax code.
      type: string
      pattern: "^[0-9]{11}$"
      example: "12345678901"
    StringTaxCode:
      description: >-
        Natural persons have a 16-character tax code.
      type: string
      pattern: "^[A-Z]{6}[0-9]{2}[A-Z][0-9]{2}[A-Z][0-9]{3}[A-Z]$"
      example: RSSMRO99A04H501A
    TaxCode:
      description: >-
        This is a purely syntactic definition,
        and can be used in different semantic contexts.
      oneOf:
        - $ref: "#/components/schemas/NumericTaxCode"
        - $ref: "#/components/schemas/StringTaxCode"
    PersonID:
      description: >-
        The Person identifier is a 16-digit string.
      type: string
      pattern: "^[0-9]{16}$"
      example: "1234567890123456"
These schemas can be reused whether their values serve as identifiers or as simple property values.¶
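As a rough illustration of the purely syntactic nature of TaxCode, the following Python sketch emulates its oneOf keyword with the two patterns copied from the schemas above. The is_tax_code helper is hypothetical and covers only the pattern keyword.

```python
import re

# Patterns copied from the NumericTaxCode and StringTaxCode schemas above.
NUMERIC_TAX_CODE = r"^[0-9]{11}$"
STRING_TAX_CODE = r"^[A-Z]{6}[0-9]{2}[A-Z][0-9]{2}[A-Z][0-9]{3}[A-Z]$"


def is_tax_code(value: str) -> bool:
    """Emulate oneOf: the value must match exactly one of the two branches."""
    branches = (NUMERIC_TAX_CODE, STRING_TAX_CODE)
    matched = sum(1 for pattern in branches if re.search(pattern, value))
    return matched == 1


print(is_tax_code("12345678901"))       # True: legal person
print(is_tax_code("RSSMRO99A04H501A"))  # True: natural person
print(is_tax_code("1234567890123456"))  # False: this is a PersonID
```

The check says nothing about whether the value names a node or is a plain literal; that decision is left to the x-jsonld-context of the object that uses it.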
The following schema uses a tax code as an identifier.¶
    Person:
      x-jsonld-type: Person
      x-jsonld-context:
        "@vocab": "https://schema.org/"
        tax_code: "@id"
        "@base": "urn:example:tax:it:"
        given_name: givenName
        family_name: familyName
      type: object
      required: [tax_code]
      properties:
        tax_code:
          $ref: "#/components/schemas/TaxCode"
        given_name:
          $ref: "#/components/schemas/RegistryString"
        family_name:
          $ref: "#/components/schemas/RegistryString"
        children:
          type: array
          items:
            $ref: "#/components/schemas/Person"
      example:
        given_name: Mario
        family_name: Rossi
        tax_code: RSSMRO99A04H501A
        children:
          - tax_code: RSSLCC99A04H501A
The associated RDF graph is:¶
    @prefix schema: <https://schema.org/> .

    <urn:example:tax:it:RSSMRO99A04H501A> a schema:Person ;
      schema:givenName "Mario" ;
      schema:familyName "Rossi" ;
      schema:children <urn:example:tax:it:RSSLCC99A04H501A> .
This other schema uses the person_id property as the identifier, while tax_code is a simple property value.¶
    RegisteredPerson:
      x-jsonld-type: RegisteredResidentPerson
      x-jsonld-context:
        "@vocab": "https://schema.org/"
        "@base": "urn:example:anpr.it:"
        person_id: "@id"
        tax_code: taxCode
        given_name: givenName
        family_name: familyName
      type: object
      required: [tax_code]
      properties:
        person_id:
          $ref: "#/components/schemas/PersonID"
        tax_code:
          $ref: "#/components/schemas/TaxCode"
        given_name:
          $ref: "#/components/schemas/RegistryString"
        family_name:
          $ref: "#/components/schemas/RegistryString"
        children:
          type: array
          items:
            $ref: "#/components/schemas/RegisteredPerson"
      example:
        given_name: Mario
        family_name: Rossi
        person_id: "1234567890123456"
        tax_code: RSSMRO99A04H501A
        children:
          - person_id: "2234567890123457"
            tax_code: RSSLCC99A04H501A
The resulting RDF graph consists of two linked nodes, each identified by the value of its person_id property prefixed with the @base declared in the schema.¶

    @prefix schema: <https://schema.org/> .

    <urn:example:anpr.it:1234567890123456> a schema:RegisteredResidentPerson ;
      schema:givenName "Mario" ;
      schema:familyName "Rossi" ;
      schema:taxCode "RSSMRO99A04H501A" ;
      schema:children <urn:example:anpr.it:2234567890123457> .

    <urn:example:anpr.it:2234567890123457> a schema:RegisteredResidentPerson ;
      schema:taxCode "RSSLCC99A04H501A" .
Note that the changes to the schema instances were minimal: just the addition of the person_id JSON Schema property.¶
There are different ways to model a vocabulary-based entry, e.g., a list of countries or a list of currencies.¶
Normally, you would use a JSON Schema (e.g., with an enum keyword):¶
    CountryCode:
      type: string
      enum: [ "ITA", "FRA", "DEU" ]
      example: ITA
The resulting schema instance is a simple string (e.g., ITA). To be able to represent the entry in JSON-LD, an enumerated entry can be modeled using a specific property for the identifier, and a JSON-LD context.¶
    Country:
      type: object
      properties:
        identifier:
          $ref: "#/components/schemas/CountryCode"
        name:
          type: string
      example:
        identifier: ITA
        name: Italy
Linked Data keywords provide a context. Different contexts can lead to different RDF representations for the same schema instances (i.e., the actual data).¶
A "property-to-property" representation preserves the mapping between JSON object members and RDF properties; the only addition is the @type keyword, if x-jsonld-type is present.¶
The example instance of the following schema¶
    CountryBlankNode:
      x-jsonld-type: Country
      x-jsonld-context:
        "@vocab": "https://schema.org/"
      type: object
      properties:
        identifier:
          "$ref": "#/components/schemas/CountryCode"
        name:
          type: string
      example:
        identifier: ITA
        name: Italy
results in this RDF graph with a blank node:¶
    @prefix schema: <https://schema.org/> .

    _:b0 a schema:Country ;
      schema:identifier "ITA" ;
      schema:name "Italy" .
A non-isomorphic representation maps one property to the node name.¶
By associating a property with the @id keyword and a @base prefix, we state that the corresponding value is the name of the node. This schema¶
    CountryURI:
      x-jsonld-type: Country
      x-jsonld-context:
        "@vocab": "https://schema.org/"
        identifier: "@id"
        "@base": "https://en.wikipedia.org/wiki/ISO_3166-1_alpha-3#"
      type: object
      properties:
        identifier:
          $ref: "#/components/schemas/CountryCode"
        name:
          type: string
      example:
        identifier: ITA
        name: Italy
results in the following RDF graph using a named node:¶
    @prefix schema: <https://schema.org/> .
    @prefix iso_3166_3: <https://en.wikipedia.org/wiki/ISO_3166-1_alpha-3#> .

    iso_3166_3:ITA a schema:Country ;
      schema:name "Italy" .
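The difference between the blank-node and the named-node representation can be sketched with a toy Python converter. This is only an illustration under stated assumptions: to_triples is a hypothetical helper, it handles flat instances only, and it builds the node name by concatenating @base and the value, which mirrors the graphs shown in this document. A real JSON-LD processor resolves relative IRIs according to RFC 3986 and may give different results.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def to_triples(instance: dict, context: dict, jsonld_type: str) -> list:
    """Toy conversion of a flat instance into (subject, predicate, object) tuples."""
    vocab = context["@vocab"]
    # The term aliased to @id, if any, names the node; otherwise use a blank node.
    id_term = next((t for t, m in context.items() if m == "@id"), None)
    if id_term in instance:
        # Assumption: plain concatenation instead of RFC 3986 resolution.
        subject = context.get("@base", "") + instance[id_term]
    else:
        subject = "_:b0"
    triples = [(subject, RDF_TYPE, vocab + jsonld_type)]
    for name, value in instance.items():
        if name != id_term:
            triples.append((subject, vocab + name, value))
    return triples


country = {"identifier": "ITA", "name": "Italy"}
blank_ctx = {"@vocab": "https://schema.org/"}
named_ctx = {
    "@vocab": "https://schema.org/",
    "identifier": "@id",
    "@base": "https://en.wikipedia.org/wiki/ISO_3166-1_alpha-3#",
}

for triple in to_triples(country, blank_ctx, "Country"):
    print(triple)  # three triples on the blank node _:b0
for triple in to_triples(country, named_ctx, "Country"):
    print(triple)  # two triples on the named node; identifier became the name
```

The same instance yields a schema:identifier literal in the first case and no such triple in the second, because the value has been consumed as the node name.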
When modeling an object with references, the parent's context will normally provide the context for the child.¶
The following example models a Person object with a nationality property referencing the CountryCode schema. The x-jsonld-context ensures that the nationality property is resolved to a URI, though there is no room in the schema instance to provide a name for the country.¶
    Person:
      x-jsonld-type: Person
      x-jsonld-context:
        "@vocab": "https://schema.org/"
        nationality:
          "@type": "@id"
          "@context":
            "@base": "https://en.wikipedia.org/wiki/ISO_3166-1_alpha-3#"
      type: object
      properties:
        givenName:
          type: string
        familyName:
          type: string
        nationality:
          $ref: "#/components/schemas/CountryCode"
      example:
        givenName: John
        familyName: Doe
        nationality: ITA
The example instance results in the following RDF graph:¶

    @prefix schema: <https://schema.org/> .
    @prefix iso_3166_3: <https://en.wikipedia.org/wiki/ISO_3166-1_alpha-3#> .

    _:b0 a schema:Person ;
      schema:familyName "Doe" ;
      schema:givenName "John" ;
      schema:nationality iso_3166_3:ITA .
To provide a label or other properties for the country, we can use a nested object.¶
    NestedPerson:
      x-jsonld-type: Person
      x-jsonld-context:
        "@vocab": "https://schema.org/"
      type: object
      properties:
        givenName:
          type: string
        familyName:
          type: string
        nationality:
          $ref: "#/components/schemas/CountryURI"
      example:
        givenName: John
        familyName: Doe
        nationality:
          identifier: ITA
          name: Italy
An implementation supporting context composition will check that the value of NestedPerson/x-jsonld-context/nationality/@context is undefined, and will then integrate the information present in CountryURI/x-jsonld-context into the instance context.¶
The NestedPerson example then results in the following RDF graph:¶

    @prefix schema: <https://schema.org/> .
    @prefix iso_3166_3: <https://en.wikipedia.org/wiki/ISO_3166-1_alpha-3#> .

    _:b0 a schema:Person ;
      schema:familyName "Doe" ;
      schema:givenName "John" ;
      schema:nationality iso_3166_3:ITA .

    iso_3166_3:ITA schema:name "Italy" .
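The composition step described above (check that the property has no @context of its own, then integrate the referenced schema's context) might look like the following Python sketch. The compose_context function is a hypothetical helper written for this illustration; it handles a single property and is not the normative algorithm of [LD-KEYWORDS].

```python
import copy


def compose_context(parent: dict, prop: str, child: dict) -> dict:
    """Scope `child` under `prop` unless the parent already decided otherwise."""
    composed = copy.deepcopy(parent)
    if prop in composed and composed[prop] is None:
        return composed  # the parent explicitly detached this property
    term = composed.get(prop, {})
    if isinstance(term, str):
        term = {"@id": term}  # expand the shorthand term definition
    if "@context" not in term:
        term["@context"] = copy.deepcopy(child)
    composed[prop] = term
    return composed


nested_person_ctx = {"@vocab": "https://schema.org/"}
country_uri_ctx = {
    "@vocab": "https://schema.org/",
    "identifier": "@id",
    "@base": "https://en.wikipedia.org/wiki/ISO_3166-1_alpha-3#",
}

instance_ctx = compose_context(nested_person_ctx, "nationality", country_uri_ctx)
print(instance_ctx["nationality"]["@context"] == country_uri_ctx)  # True
```

When the parent already defines a @context for the property, as in the earlier Person schema with a string-valued nationality, the sketch leaves that definition untouched.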
To minimize context information, a common practice is to name JSON Schema properties after the corresponding RDF predicates.¶
    Place:
      x-jsonld-type: Place
      ...
    Occupation:
      x-jsonld-type: Occupation
      ...
    Person:
      x-jsonld-type: Person
      x-jsonld-context:
        "@vocab": "https://schema.org/"
      type: object
      properties:
        familyName:
          type: string
        givenName:
          type: string
        birthPlace:
          $ref: "#/Place"
        hasOccupation:
          $ref: "#/Occupation"
As the above schema shows, this practice can lead to inheriting non-uniform naming conventions from the RDF vocabulary: for example, birthPlace and hasOccupation both target objects, while only hasOccupation starts with a verb (i.e., has).¶
Another issue is related to the schema instance size when using very long property or class names such as https://schema.org/isAccessibleForFree and https://schema.org/IPTCDigitalSourceEnumeration.¶
Mapping JSON Schema properties to RDF predicates in x-jsonld-context can reduce semantic risks when an ontology changes, or when there is a need to switch to a different ontology. Using different names for the property and the predicate makes it clear that the property may be mapped to a different predicate over time, as shown in the following example.¶
Instead of using a generic surname, this schema uses the more specific patronymicName, named after the corresponding RDF predicate.¶
    Person:
      x-jsonld-context:
        "@vocab": "http://w3.org/ns/person#"
      properties:
        patronymicName:
          type: string
      example:
        patronymicName: "Ericsson"
      x-rdf: >-
        _:b0 :patronymicName "Ericsson" .
If the service evolves to be more generic (e.g., moving to foaf:), the property name might be mapped to the foaf:familyName predicate, but the schema instance will remain the same, thus retaining the information of a legacy ontology.¶
A more flexible design would have used a generic surname property name, mapping it to either http://w3.org/ns/person#patronymicName or foaf:familyName in the context.¶
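A small Python sketch can make the benefit of a generic property name concrete: the payload stays the same while the context decides which predicate it maps to. The expand helper and the two context dictionaries are hypothetical; the person namespace is copied from the text above, and the FOAF namespace IRI (http://xmlns.com/foaf/0.1/) is an assumption added for the example.

```python
PERSON_NS = "http://w3.org/ns/person#"   # as written in this document
FOAF_NS = "http://xmlns.com/foaf/0.1/"   # assumption: the usual FOAF namespace

LEGACY_CONTEXT = {"surname": PERSON_NS + "patronymicName"}
GENERIC_CONTEXT = {"surname": FOAF_NS + "familyName"}


def expand(instance: dict, context: dict) -> dict:
    """Rename each mapped property to the predicate IRI chosen by the context."""
    return {context[name]: value for name, value in instance.items() if name in context}


instance = {"surname": "Ericsson"}  # the payload never changes

print(expand(instance, LEGACY_CONTEXT))
print(expand(instance, GENERIC_CONTEXT))
```

Only the context changes between the two calls, so clients that consume the plain JSON are not affected by the switch of ontology.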
Always prefer explicit context information over implicit context composition. Different implementations of context composition may lead to different results, especially over large schemas with many nested objects.¶
While composition is useful in the schema design phase, bundling and validating the composed context in the final schema definition reduces the risk of interoperability issues.¶
The following example shows a Person JSON Schema with semantic information provided by the x-jsonld-type and x-jsonld-context.¶
    Person:
      "x-jsonld-type": "https://schema.org/Person"
      "x-jsonld-context":
        "@vocab": "https://schema.org/"
        custom_id: null  # detach this property from the @vocab
        country:
          "@id": addressCountry
          "@language": en
      type: object
      required:
        - givenName
        - familyName
      properties:
        familyName: { type: string, maxLength: 255 }
        givenName: { type: string, maxLength: 255 }
        country: { type: string, maxLength: 3, minLength: 3 }
        custom_id: { type: string, maxLength: 255 }
      example:
        familyName: "Doe"
        givenName: "John"
        country: "FRA"
        custom_id: "12345"
The example object is assembled as a JSON-LD object as follows.¶

    {
      "@context": {
        "@vocab": "https://schema.org/",
        "custom_id": null,
        "country": {
          "@id": "addressCountry",
          "@language": "en"
        }
      },
      "@type": "https://schema.org/Person",
      "familyName": "Doe",
      "givenName": "John",
      "country": "FRA",
      "custom_id": "12345"
    }

The above JSON-LD can be represented as text/turtle as follows.¶

    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix schema: <https://schema.org/> .

    _:b0 rdf:type schema:Person ;
      schema:addressCountry "FRA"@en ;
      schema:familyName "Doe" ;
      schema:givenName "John" .
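The assembly step can be sketched in a few lines of Python. This is a minimal illustration, not the normative procedure: to_jsonld is a hypothetical helper, and PERSON_SCHEMA is a hand-written dictionary reproducing only the semantic keywords and the example of the Person schema above.

```python
import json

# Hand-written excerpt of the Person schema above (validation keywords omitted).
PERSON_SCHEMA = {
    "x-jsonld-type": "https://schema.org/Person",
    "x-jsonld-context": {
        "@vocab": "https://schema.org/",
        "custom_id": None,  # detach this property from the @vocab
        "country": {"@id": "addressCountry", "@language": "en"},
    },
    "example": {
        "familyName": "Doe",
        "givenName": "John",
        "country": "FRA",
        "custom_id": "12345",
    },
}


def to_jsonld(instance: dict, schema: dict) -> dict:
    """Attach the schema's semantic keywords to a plain JSON instance."""
    document = {}
    if "x-jsonld-context" in schema:
        document["@context"] = schema["x-jsonld-context"]
    if "x-jsonld-type" in schema:
        document["@type"] = schema["x-jsonld-type"]
    document.update(instance)
    return document


document = to_jsonld(PERSON_SCHEMA["example"], PERSON_SCHEMA)
print(json.dumps(document, indent=2))
```

The instance members are copied unchanged; only the @context and @type members are added, so the original payload can always be recovered by dropping them.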
The following example shows a "Person" schema with semantic information provided by the x-jsonld-type and x-jsonld-context.¶
    Person:
      "x-jsonld-type": "https://schema.org/Person"
      "x-jsonld-context":
        "@vocab": "https://schema.org/"
        email: "@id"
        custom_id: null  # detach this property from the @vocab
        country:
          "@id": addressCountry
          "@type": "@id"
          "@context":
            "@base": "http://publications.europa.eu/resource/authority/country/"
      type: object
      required:
        - email
        - givenName
        - familyName
      properties:
        email: { type: string, maxLength: 255 }
        familyName: { type: string, maxLength: 255 }
        givenName: { type: string, maxLength: 255 }
        country: { type: string, maxLength: 3, minLength: 3 }
        custom_id: { type: string, maxLength: 255 }
      example:
        familyName: "Doe"
        givenName: "John"
        email: "mailto:jon@doe.example"
        country: "FRA"
        custom_id: "12345"
The resulting RDF graph is¶
    @prefix schema: <https://schema.org/> .
    @prefix country: <http://publications.europa.eu/resource/authority/country/> .

    <mailto:jon@doe.example> a schema:Person ;
      schema:familyName "Doe" ;
      schema:givenName "John" ;
      schema:addressCountry country:FRA .
The following schema contains a cyclic reference.¶
    Person:
      description: Simple cyclic example.
      x-jsonld-type: Person
      x-jsonld-context:
        "email": "@id"
        "@vocab": "https://w3.org/ns/person#"
        children:
          "@container": "@set"
      type: object
      properties:
        email: { type: string }
        children:
          type: array
          items:
            $ref: '#/Person'
      example:
        email: "mailto:a@example"
        children:
          - email: "mailto:dough@example"
          - email: "mailto:son@example"
The example schema instance contained in the above schema results in the following JSON-LD document.¶

    {
      "email": "mailto:a@example",
      "children": [
        {
          "email": "mailto:dough@example",
          "@type": "Person"
        },
        {
          "email": "mailto:son@example",
          "@type": "Person"
        }
      ],
      "@type": "Person",
      "@context": {
        "email": "@id",
        "@vocab": "https://w3.org/ns/person#",
        "children": {
          "@container": "@set"
        }
      }
    }
If the workflow described in Section 3.2 had been applied by simply copying the x-jsonld-context recursively, the instance context would have been more complex.¶

    {
      ...
      "@context": {
        "email": "@id",
        "@vocab": "https://w3.org/ns/person#",
        "children": {
          "@container": "@set",
          "@context": {
            "email": "@id",
            "@vocab": "https://w3.org/ns/person#",
            "children": {
              "@container": "@set"
            }
          }
        }
      }
    }
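One way to avoid the redundant nesting is to remember which schemas are already being visited while composing the context. The following Python sketch illustrates the idea under stated assumptions: instance_context is a hypothetical helper, schemas are plain dictionaries keyed by name, and references have the form "#/Name". It is not the procedure mandated by [LD-KEYWORDS].

```python
import copy


def instance_context(schemas: dict, name: str, seen: frozenset = frozenset()) -> dict:
    """Compose the context of schema `name`, skipping schemas already being visited."""
    schema = schemas[name]
    context = copy.deepcopy(schema.get("x-jsonld-context", {}))
    seen = seen | {name}
    for prop, subschema in schema.get("properties", {}).items():
        # A reference may sit on the property itself or on the array items.
        ref = subschema.get("$ref") or subschema.get("items", {}).get("$ref")
        if ref is None:
            continue
        target = ref.rsplit("/", 1)[-1]
        if target in seen or target not in schemas:
            continue  # cyclic (or unknown) reference: do not nest another copy
        term = context.get(prop, {})
        if isinstance(term, str):
            term = {"@id": term}
        if term is None or "@context" in term:
            continue  # detached term, or the parent already provides a context
        child = instance_context(schemas, target, seen)
        if child:
            context[prop] = {**term, "@context": child}
    return context


SCHEMAS = {
    "Person": {
        "x-jsonld-context": {
            "email": "@id",
            "@vocab": "https://w3.org/ns/person#",
            "children": {"@container": "@set"},
        },
        "properties": {
            "email": {"type": "string"},
            "children": {"type": "array", "items": {"$ref": "#/Person"}},
        },
    }
}

context = instance_context(SCHEMAS, "Person")
print(context == SCHEMAS["Person"]["x-jsonld-context"])  # True: no nested copy
```

Skipping the revisited schema is safe here because the context declared at the top of the document also applies to the nested children objects.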
In the following schema document, the "Citizen" schema references the "BirthPlace" schema.¶
    BirthPlace:
      x-jsonld-type: https://w3id.org/italia/onto/CLV/Feature
      x-jsonld-context:
        "@vocab": "https://w3id.org/italia/onto/CLV/"
        country:
          "@id": "hasCountry"
          "@type": "@id"
          "@context":
            "@base": "http://publications.europa.eu/resource/authority/country/"
        province:
          "@id": "hasProvince"
          "@type": "@id"
          "@context":
            "@base": "https://w3id.org/italia/data/identifiers/provinces-identifiers/vehicle-code/"
      type: object
      required:
        - province
        - country
      properties:
        province:
          description: The province where the person was born.
          type: string
        country:
          description: The iso alpha-3 code of the country where the person was born.
          type: string
      example:
        province: RM
        country: ITA
    Citizen:
      x-jsonld-type: Person
      x-jsonld-context:
        "email": "@id"
        "@vocab": "https://w3.org/ns/person#"
      type: object
      properties:
        email: { type: string }
        birthplace:
          $ref: "#/BirthPlace"
      example:
        email: "mailto:a@example"
        givenName: Roberto
        familyName: Polli
        birthplace:
          province: LT
          country: ITA
The example schema instance contained in the "Citizen" schema results in the following JSON-LD document. The instance context contains information from both "Citizen" and "BirthPlace" semantic keywords.¶

    {
      "email": "mailto:a@example",
      "givenName": "Roberto",
      "familyName": "Polli",
      "birthplace": {
        "province": "LT",
        "country": "ITA",
        "@type": "https://w3id.org/italia/onto/CLV/Feature"
      },
      "@type": "Person",
      "@context": {
        "email": "@id",
        "@vocab": "https://w3.org/ns/person#",
        "birthplace": {
          "@context": {
            "@vocab": "https://w3id.org/italia/onto/CLV/",
            "country": {
              "@id": "hasCountry",
              "@type": "@id",
              "@context": {
                "@base": "http://publications.europa.eu/resource/authority/country/"
              }
            },
            "province": {
              "@id": "hasProvince",
              "@type": "@id",
              "@context": {
                "@base": "https://w3id.org/italia/data/identifiers/provinces-identifiers/vehicle-code/"
              }
            }
          }
        }
      }
    }

That can be serialized as text/turtle as follows.¶

    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix eu: <https://w3.org/ns/person#> .
    @prefix itl: <https://w3id.org/italia/onto/CLV/> .

    <mailto:a@example> rdf:type eu:Person ;
      eu:birthplace _:b0 ;
      eu:familyName "Polli" ;
      eu:givenName "Roberto" .

    _:b0 rdf:type itl:Feature ;
      itl:hasCountry <http://publications.europa.eu/resource/authority/country/ITA> ;
      itl:hasProvince <https://w3id.org/italia/data/identifiers/provinces-identifiers/vehicle-code/LT> .
Thanks to Giorgia Lodi, Matteo Fortini and Saverio Pulizzi for being the initial contributors of this work.¶
In addition to the people above, this document owes a lot to the extensive discussion inside and outside the workgroup. The following contributors have helped improve this specification by opening pull requests, reporting bugs, asking smart questions, drafting or reviewing text, and evaluating open issues:¶
Pierre-Antoine Champin, and Vladimir Alexiev.¶
This section is to be removed before publishing as an RFC.¶
TBD¶