What do you think about Microsoft Oslo MGraph?

MGraph is a great textual data format brought by Microsoft "Oslo".

Do you think it has a chance of becoming as widely used as XML is today?

Example (Google Geocode):

{  
  name = "waltrop, lehmstr 1d",  
  Status {  
    code = 200,  
    request = "geocode"  
  },  
  Placemark [  
    {  
      id = "p1",  
      address = "Lehmstraße, 45731 Waltrop, Deutschland",  
      AddressDetails {  
        Country {  
          CountryNameCode = "DE",  
          CountryName = "Deutschland",  
          AdministrativeArea {  
            AdministrativeAreaName = "Nordrhein-Westfalen",  
            SubAdministrativeArea = {  
              SubAdministrativeAreaName = "Recklinghausen",  
              Locality {  
                LocalityName = "Waltrop",  
                Thoroughfare { ThoroughfareName = "Lehmstraße" },  
                PostalCode = { PostalCodeNumber = "45731" }  
              }  
            }  
          }  
        },  
        Accuracy = 6  
      },  
      ExtendedData {  
        LatLonBox {  
          north = 51.6244226,  
          south = 51.6181274,  
          east = 7.4046111,  
          west = 7.3983159  
        }  
      },  
      Point {  
        coordinates [ 7.4013350, 51.6212620, 0 ]  
      }  
    }  
  ]  
}

More information here: Microsoft "Oslo" MGraph - the next XML?

There are 5 answers

0
Jason S On BEST ANSWER

...and what does it do that JSON doesn't do?

0
AudioBubble On

In response to James Clark's thoughts on M:

I also see some things missing from M and Oslo, but not quite the same things.

It would be nice to have some guarantee that M would preserve the order of entities within collections. However, how you want to order elements is an implementation detail. If you have an ordered collection in M and you persist that to a database, how do you maintain their order there? The only way would be to make some assumptions about the shape of the data, to add some column to a table that you didn't specify, and in that case it makes more sense to be in full control of your data structure's shape.
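To make the ordering point concrete, here is a minimal Python sketch using the standard `sqlite3` module. It is not M or the Oslo repository; the `placemark` table, the `position` column, and the sample addresses are all names I made up for illustration. The point is only that the order survives because I declared the column that carries it.

```python
import sqlite3

def save_ordered(conn, items):
    """Persist a list, storing each item's index in an explicit column."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS placemark ("
        "position INTEGER PRIMARY KEY, address TEXT NOT NULL)"
    )
    # enumerate() yields (index, item) pairs, which become (position, address) rows.
    conn.executemany(
        "INSERT INTO placemark (position, address) VALUES (?, ?)",
        list(enumerate(items)),
    )

def load_ordered(conn):
    """Read the items back in their original order via ORDER BY position."""
    rows = conn.execute("SELECT address FROM placemark ORDER BY position")
    return [address for (address,) in rows]

# Hypothetical sample data, deliberately not in alphabetical order.
conn = sqlite3.connect(":memory:")
addresses = ["Lehmstraße 1d", "Lehmstraße 3", "Lehmstraße 2"]
save_ordered(conn, addresses)
restored = load_ordered(conn)
```

Without the `ORDER BY position` clause the database is free to return the rows in any order, which is why I would rather own that column than have a tool invent it for me.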

The same goes for identity. The reason we have object identity in memory is because each object allocates a different place in memory, and has that memory address to uniquely identify it. When saved to a database, however, this information is no longer relevant, and you need some column or combination of columns to uniquely identify that record, to serve as its primary key. If you don't specify it, then M has to invent a column for you and you won't have a reference to it, except perhaps through some kind of trick that may be difficult to discover. In other words, there is no "inherent identity"; there's always some data that explicitly identifies it.
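A small Python sketch of the same argument, assuming a hypothetical `Placemark` class and using a JSON round trip as a stand-in for saving to and loading from a database: the in-memory identity is gone after the round trip, while the explicit `id` field still identifies the record.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class Placemark:
    id: str        # explicit key, like id = "p1" in the question's example
    address: str

original = Placemark(id="p1", address="Lehmstraße, 45731 Waltrop, Deutschland")

# Stand-in for persistence: serialize to text, then rebuild a new object from it.
stored = json.dumps(asdict(original))
reloaded = Placemark(**json.loads(stored))

same_object = reloaded is original        # memory identity does not survive
same_record = reloaded.id == original.id  # the explicit key does
```

Here `same_object` is `False` and `same_record` is `True`: the only thing that ties the reloaded value back to the original is data I chose to store.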

Documents and data aren't two different things. XML doesn't handle documents per se; it just represents hierarchical data, and documents are composed from this. As long as the data is structured, it can be represented in M, in the same way that you can write classes for the various parts of the hierarchy and reference one type from another to compose them into arbitrarily-complex trees. Admittedly, this is easier to throw together in XML because it's free-form text and there's no real validation unless you write an XSD schema, but in those cases, you're doing the same kind of work as defining types and relations in code classes.

So ultimately, M handles documents that you define the structure for, and that structure doesn't really have any limitations. The question is how easy it is to do so. The idea you have for a tool to pull apart an XML document and generate M schema is a pretty good one. I imagine it wouldn't be too difficult to write one, or for Microsoft to include with their tool chain once it matures a bit more. As far as the structure "getting ugly" goes, if your data structure is really that complex, it is what it is. Schematizing it has great advantages, same in XSD or M or C# classes, but if your goal is to store it in a SQL Server database (or the Oslo Repository specifically), then it's necessary and worthwhile.
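As a rough idea of how little code the first step of such a tool might need, here is a Python sketch that walks an XML tree and prints it in the brace notation used in the question's example. This is an assumption-heavy toy: `to_mgraph` is a name I invented, the output format is copied from the question rather than checked against the M specification, it converts instance data instead of inferring a schema, every leaf value is emitted as a quoted string, and attributes, mixed text, and quote escaping are ignored.

```python
import xml.etree.ElementTree as ET

def to_mgraph(elem, indent=0):
    """Render an XML element as MGraph-style text (hypothetical helper)."""
    pad = "  " * indent
    children = list(elem)
    if not children:
        # Leaf element: emit name = "text". Values are not typed or escaped.
        text = (elem.text or "").strip()
        return f'{pad}{elem.tag} = "{text}"'
    # Element with children: emit name { child, child, ... } recursively.
    body = ",\n".join(to_mgraph(child, indent + 1) for child in children)
    return f"{pad}{elem.tag} {{\n{body}\n{pad}}}"

xml_doc = "<Status><code>200</code><request>geocode</request></Status>"
print(to_mgraph(ET.fromstring(xml_doc)))
```

Document-like XML with mixed text is exactly where a converter this simple falls over, which is the "getting ugly" case mentioned above.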

I'm pretty confident that M and the supporting tool chain will evolve into something pretty amazing and useful. There's obviously a lot missing right now. Personally, I'm more concerned with the fact that M is currently targeted at modeling at the relational, physical database level instead of the conceptual level (like Entity Framework), where it feels most natural for a developer to begin modeling. After all, when writing classes to instantiate objects from MGraphs (the purpose and output for a DSL), your classes may be defined quite differently from how they are persisted, especially if you use inheritance in your models.

I agree with you on standardization. That would be nice. However, I think it's less important because the goal is to store this data in the Oslo repository. Especially once SQL Data Services is mature enough to host the repository, we're going to have all sorts of different protocols and formats for querying and manipulating this data. Clients will be able to query and update via ADO.NET Data Services, formatting messages with JSON, POX, SOAP, MGraph, and so on. All MGraph data needs is an MGraph connector to get it into the database, from which it can be accessed in any way imaginable.

You can find more information about Oslo in my article here: http://dvanderboom.wordpress.com/2009/01/17/why-oslo-is-important/

4
Kev On

I can't help it, but I kinda feel Oslo is a solution looking for a really excellent concrete problem to solve. I truly hope they find it.

I also got the feeling that they needed something fun to pad out this year's PDC with.

2
Jakob On

I wonder why MGraph is always compared to XML instead of YAML which looks much more similar. Is it ignorance or blindness why we regularly reinvent wheels?

P.S.: This is what YAML can look like (without custom data types and references to the node 'p1', which YAML provides in addition to JSON):

{  
  name: "waltrop, lehmstr 1d",  
  Status: {  
    code: 200,  
    request: "geocode"  
  },  
  &p1 Placemark: [  
    {
      address: "Lehmstraße, 45731 Waltrop, Deutschland", 
      ExtendedData: { LatLonBox: {  
          north: 51.6244226,  
          south: 51.6181274,  
          east: 7.4046111,  
          west: 7.3983159
      }},  
      Point: { coordinates: [ 7.4013350, 51.6212620, 0.0 ] }
    }  
  ]  
}

0
Dimitre Novatchev On

Here is part of James Clark's thoughts on M:

" I see several major things missing in M, whose absence might be acceptable for a database application of M, but which would be a significant barrier for other applications of M. Most fundamental is order. M has two types of compound value, collections and entities, and they are both unordered. In XML, unordered is the poor relation of ordered. Attributes are unordered, but attributes cannot have structured values. Elements have structure but there's no way in the instance to say that the order of child elements is not significant. The lack of support for unordered data is clearly a weakness of XML for many applications. On the other hand, order is equally crucial for other applications. Obviously, you can fake order in M by having index fields in entities and such like. But it's still faking it. A good modeling language needs to support both ordered and unordered data in a first class way. This issue is perhaps the most fundamental because it affects the data model.

Another area where M seems weak is identity. In the abstract data model, entities have identity independently of the values of their fields. But the type system forces me to talk about identity in an SQL-like way by creating artificial fields that duplicate the inherent identity of the entity. Worse, scopes for identity are extents, which are flat tables. Related to this is support for hierarchy. A graph is a more general data model than a tree, so I am happy to have graphs rather than trees. But when I am dealing with trees, I want to be able to say that the graph is a tree (which amounts to specifying constraints on the identity of nodes in the graph), and I want to be able to operate on it as a tree, in particular I want hierarchical paths.

One of the strengths of XML is that it handles both documents and data. This is important because the world doesn't neatly divide into documents and data. You have data that contains documents and documents that contain data. The key thing you need to model documents cleanly is mixed text. How are you going to support documents in M? The lack of support for order is a major problem here, because ordered is the norm for documents.

A related issue is how M and XML fit together. I believe there's a canonical way to represent an M value as an XML document. But if you have data that's in XML how do you express it in M? In many cases, you will want to translate your XML structure into an M structure that cleanly models your data. But you might not always want to take the time to do that, and if your XML has document-like content, it is going to get ugly. You might be better off representing chunks of XML as simple values in M (just as in the JSON world, you often get strings containing chunks of HTML). M should make this easy. You could solve this elegantly with RELAX NG (I know this isn't going to happen given Microsoft's commitment to XSD, but it's an interesting thought experiment): provide a function that allows you to constrain a simple value to match a RELAX NG pattern expressed in the compact syntax (with the compact syntax perhaps tweaked to harmonize with the rest of M's syntax) and use M's repertoire of simple types as a RELAX NG datatype library.

Finally, there's the issue of standardization. The achievement of XML in my mind isn't primarily a technical one. It's a social one: getting a huge range of communities to agree to use a common format. Standardization was the critical factor in getting that agreement. XML would not have gone anywhere as a single vendor format. It was striking that the talks about Oslo at the PDC made several mentions of open source, and how Microsoft was putting the spec under its Open Specification Promise so as to enable open source implementations, but no mentions of standardization. I can understand this: if I was Microsoft, I certainly wouldn't be keen to repeat the XSD or OOXML experience. But open source is not a substitute for standardization.

"

Read here James Clark's blog article on the Oslo Modelling language.