I came across "Do we need a new kind of schema language?" on Tim Bray's blog. After reading it, I couldn't resist asking: why not use a subset of Standard ML for this purpose?
The subset would include:
- int, real, bool, char, string as the primitive types.
- type for type abbreviations, useful when defining "bigger" schemas.
- datatype for algebraic data types.
- structures to group related types.
With these, we have every type expression except function types (IMO, it is not a big list!). Annotations for bindings could be specified in SML comments [this needs more thought].
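A minimal sketch of how these constructs might fit together in one schema. The structure `Geo`, its types, and the comment-based unit annotation are all hypothetical names chosen for illustration, not part of any existing proposal:

```sml
(* Hypothetical schema; none of these names come from an existing spec. *)
structure Geo =
struct
  (* type abbreviation: names a record built from primitive types *)
  type point = { lat : real, lon : real }

  (* algebraic data type: a shape is exactly one of two alternatives.
     Annotation kept in a comment (assumed convention): radius is in metres. *)
  datatype shape =
      Circle of { center : point, radius : real }
    | Polygon of point list
end
```

Nothing here needs function types: records, lists, abbreviations, datatypes, and a structure are enough to describe the data.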
For example, a record schema such as
{ url: String, width: Integer?, height: Integer?, title: String }
would become the following in SML:
{ url: string, width: int Maybe, height: int Maybe, title: string }
where Maybe is
datatype 'a Maybe = Just of 'a | Nothing
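Put together, the example could be written as the following self-contained declarations. Note that SML applies type constructors postfix, so the optional integer is written `int Maybe`. The type name `image` and the sample value are assumptions made up for this sketch:

```sml
datatype 'a Maybe = Just of 'a | Nothing

(* `image` is a hypothetical name for the record schema above *)
type image =
  { url : string, width : int Maybe, height : int Maybe, title : string }

(* A sample value; the URL and title are invented. *)
val sample : image =
  { url = "http://example.com/a.png",
    width = Just 640,
    height = Nothing,
    title = "A picture" }
```

The SML Basis Library already provides an equivalent type, `'a option` with constructors `SOME` and `NONE`, so a schema could use `int option` and skip the `Maybe` definition entirely.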
Pros:
- We get parametric types and a module system, which may be useful for defining larger, generic schemas.
- Tuples, lists, and records could be mapped to parameterized classes (as in Scala, Java, etc.) or to native data types in scripting languages.
- Sum-of-product types can be translated to classes (like case classes in Scala).
- The type expressions are a proper subset of one particular language, so at least a few people would feel at home. (JSON is a proper subset of JavaScript object literals; functional folks could have their turn.) BTW, I am okay with Haskell as well. Personally, I've just played with SML a little bit more.
- If we want the schema to specify default values for various elements of the data, we can include a proper subset of SML value definitions.
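As a rough sketch of the first and last points, here is a generic schema with a parametric type, a sum-of-product type, and a default expressed as a plain value definition. The names `Paging`, `page`, `status`, `defaultPageSize`, and `userPage` are hypothetical:

```sml
(* Hypothetical generic schema; all names are illustrative only. *)
structure Paging =
struct
  (* parametric type: a page of items of any type 'a *)
  type 'a page = { items : 'a list, next : string option }

  (* sum-of-product type: one constructor carries data, one does not *)
  datatype status = Active | Suspended of string

  (* schema-specified default, written as an ordinary value definition *)
  val defaultPageSize : int = 20
end

(* Instantiating the generic schema for a concrete record type *)
type userPage = { name : string, status : Paging.status } Paging.page
```

A code generator could map `'a page` to a generic class such as `Page<T>` in Java or Scala, and `status` to a small class hierarchy, while a scripting language could use its native lists and maps.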