Wednesday, December 12, 2007

Social Graph Portability

There's a lot of talk these days about social graph portability so I guess it's time that I explore the xri-based idea that has been running around my head for the last couple of years. This post probably isn't going to go into any more depth than I have already thought about, but I hope it will inspire me to go think some more....

The basic idea is this... I am =andy, you are =steve, and I can create any number of directed, typed relationships to you by using the extensibility of the =andy namespace....

=andy*(=steve) establishes a generic relationship.

=andy*(+friend)*(=steve) establishes a friend relationship.

=andy*(+trusted)*(=steve) establishes an actionable relationship.

Let me point out the things that I think are cool about this...

  • These entries MUST have been added and/or removed by the entity that controls the =andy name space.
  • Typed relationships can take advantage of the 'dictionary' space (xris that start with '+') and therefore solve a lot of the semantic mapping issues.
  • By creating this entry in my name space this relationship has its own i-number, I have very literally reified the relationship. The relationship can have metadata, services and any other quality of a top level entity.
  • The target of the relationship can USE this identity; =steve can assert =andy*(+trusted)*(=steve) as their identity... the fact that xri resolution for this succeeds 'proves' that =andy established the relationship... You can xri resolve =steve for whatever flavor of authN service you are interested in so the 'user' can 'prove' they are =steve... I AM =steve who has a (+trusted) relationship with =andy.
  • Group management IS relationship management =andy*(+family)*(=richard).
  • Relationships are STRONGLY directed. An assertion about a relationship with =steve means only as much as you decide to put on it. Did you know that me and Bill Gates are best buddies and that I'm married to Angelina Jolie?
A couple of problems with this as it stands:
  • It is ALL public. Maybe once I have finished reading the ID-WSF Service Discovery Spec I'll have a better idea how to mix and match the public and private parts of this in a more privacy protecting way.
  • There is no native xri way to query the graph. Even if I wanted it to be public there's no (currently spec'd) way to get all of =andy's +trusted people.
Despite the obvious problems, I think the strengths still make this something worth exploring and the problems something worth trying to solve. Part of why I like this approach is that using some simple wildcards lets me address and permission based on the graph in the same syntax (there's a rough sketch of the matching right after the examples below)...

  • When I sign up for the genealogical service it is understood that write rights are granted to =andy and =andy*(+family)*($children) and read rights are given to =andy*(+family)*($descendants)
  • I can send a message to =andy*(+trusted)*($all)
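To make the wildcard idea concrete, here's a rough Python sketch of matching asserted relationships against a wildcarded pattern. The parsing is deliberately naive and $children/$descendants would really need graph traversal, not string matching; treat it all as illustrative:

```python
# Illustrative sketch only: naive parsing of the relationship syntax
# described above. Real XRI parsing is more involved, and $children /
# $descendants would require walking the graph.

def segments(xri):
    """Split '=andy*(+trusted)*(=steve)' into ['=andy', '+trusted', '=steve']."""
    head, *rest = xri.split("*")
    return [head] + [s.strip("()") for s in rest]

def matches(pattern, assertion):
    """True if an asserted relationship satisfies a wildcarded pattern."""
    p, a = segments(pattern), segments(assertion)
    if len(p) != len(a):
        return False
    return all(ps == "$all" or ps == seg for ps, seg in zip(p, a))

assertions = [
    "=andy*(+trusted)*(=steve)",
    "=andy*(+family)*(=richard)",
]

# Everyone =andy trusts:
print([a for a in assertions if matches("=andy*(+trusted)*($all)", a)])
```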
I guess that what I'm trying to say is this.... I don't see the Identity Layer and the Social Graph as 2 separate things. I think it's well accepted that any meaningful abstract identity system MUST reify relationships as top level objects. It must be an Identity AND Relationship layer....

We must not get confused between the world and the map of the world. It's great that with today's technology we can create these interactive maps that let you:
  • show all the houses and the roads
  • now turn off the roads and show the electric grid
  • now only show the sewer lines
But that isn't what exists... the connections between the house and the grid are real and solid. When we talk about the Social Graph we need to be clear if we are talking about the map of the graph or the actual reality it is meant to represent.

Monday, December 10, 2007

Option 5

I was reading this post in Phil's blog, as I do, and had to get this thought out of my head...

Phil says:

There are basically four options for deployment, as far as I can tell:

  1. Sell software that gets installed on customer hardware
  2. Package your code onto a hardware appliance and sell the box
  3. Package your code onto a virtual appliance and sell the appliance
  4. Sell a hosted solution
All of these have advantages and disadvantages and each is appropriate in different circumstances.
Phil goes on to describe some of the pros and cons of each of these options... It's a good read. There is, however, another solution; well it's not really another solution, it's really just a variation on 4... But it doesn't have the problems described in Phil's post....

5. Have Wingaa host it.

Now this also has pros and cons like any of the other solutions... but it is an option. Wingaa is all about High Availability, High Security and Non-Intrusive Identity Service Hosting by specialists in Privacy and Trust. That is our business... We want to take the support calls and keep the geographically dispersed secure data centers in hot failover mode. We want to persist all that PII and take on that liability; that's the challenge that we have set ourselves... Don't get me wrong, we plan to charge you for it. Security and availability and lots of liability insurance don't come cheap... but if you have a business that needs those qualities we can probably do it as inexpensively as you can do it yourself, without you having to do the work.

There are businesses for whom this fundamentally isn't an option, but there are also businesses for whom this is the perfect solution. You have the killer business idea, THE next social app; we have the capability to not only run it for you but to run it in an environment that can be trusted to protect the privacy of each individual above ANY business imperative. You can deploy quickly and cheaply into high production value mode without the burden of the upfront cost. You can look instantly credible to enterprise partners.... they'll say... oh, you use the Wingaa network.... GREAT!

Wow... what started as a simple observation that maybe we could help, in some cases, just became my marketing guy's worst nightmare.... Andy runs wild with words. (I think maybe the fever has come back)

Sunday, December 09, 2007

What Wingaa Does....

So the message on the Wingaa site still isn't simple enough. I'm going to try to fix that this week.

What Wingaa does is host services for other companies. There is more and more demand on internet companies to offer their customers more and more services. We help companies satisfy that demand with minimal cost and effort on their part.

That's a start... let's see what ends up on the site.

Wednesday, December 05, 2007

Looking good...

The Wingaa website has been launched. Check it out at www.wingaa.com

Sunday, December 02, 2007

XRI utils

One of the things that I look for with new technologies is tools and utilities being developed that will make the technology easy to use. This week 2 such utilities have appeared...

Markus Sabadello of Parity and @freeXRI has developed an XRI Resolution Client for the iPhone... You can learn more about it and download it from here.

I also shared with a couple of people... and now with you, this Mac Dashboard Widget that lets you look into an XRDS document without having to remember the proxy resolver syntax.

Wednesday, November 21, 2007

If only it were that simple...

Both Kim and Paul picked up this post by Francis Shanahan about the fragmentation of our online information. The centerpiece of his post is his diagram representing our information spheres... here it is:

[Diagram: Francis Shanahan's information spheres]
I like the diagram inasmuch as it STARTS to show the problem we face. I dislike it because it implies a structure and solution that WAY oversimplifies the problem.

Consider these 2 questions and you'll see what I mean...

Think about the next level out beyond the blue boxes... the attributes. You'll notice that there is massive duplication of information all around the circle. This diagram totally fails to represent the interconnectedness of the data. The DataWeb is NOT a set of nicely ordered hierarchies, and diagrams that lock us into that way of thinking do us, I think, a disservice.

While this diagram neatly implies that the blue boxes can be canonically categorized, it is simply not true. My guess is that if we gave each of you the job of categorizing the blue boxes you would come up with not only different groupings but different semantics for those groupings. Don't get me wrong; Francis's diagram is as valid a projection of order onto the mess as any. My complaint is that we, the people trying to solve these problems, must not get lulled into only seeing one dimension of this problem.

I think about the problem like this... Each of you, do your version of Francis's diagram, but include all the lines between the blue boxes that have data duplication. Also, don't limit yourself to putting each blue box as a child of only one green box... embrace the fact that World of Warcraft is a community, a social network and a gaming site. Once I have gathered all of your diagrams I make them all semi-transparent and put them on top of each other. That diagram is a fair representation of the DataWeb.

An interesting thing to notice is that the lines that go around the outer rim of the diagram are not, on the whole, subjective. We can build a 'rule' that says if two data points have the same value and the same update rules then they should be linked. In other words the lines at the third level should juxtapose fairly well from one person's diagram to the next. Notice that the linking rule is based on values not on labels, as the semantic issues in looking at the labels add another level of complexity.

So.... to the point.... What Francis described is exactly what I have been talking about for the last 3 years. If you take Francis's diagram, with my radial additions, and put it into a linear form instead of radial, you get exactly the graphs that I have been drawing for years. Three levels, lines going up and down the levels and lines going across the levels. This is no coincidence; there is a fundamental 'truth' about that representation that is much like, in my mind, Kim's laws. This truth is not me, or anyone else, saying 'you must do it this way'; it is us trying to point out "this is the nature of the DataWeb". What we have done, and I would be happy to spend time with any of you showing you this in detail, is specified a syntax for describing, precisely, the relationships in that web. Relational databases NEEDED ERD in order to get wide adoption; if people couldn't simply communicate, capture and represent the data models they were working with, how could they ever build large complex systems? We need not just an abstract data model but a clear way to graphically represent that model.

Thursday, November 01, 2007

XDI Update

What a year... I just looked back and saw that the last time I posted something that was really about XDI, on this XDI blog, was in March… That’s crazy!! Now in my defense I have posted quite a bit on XRI and XRDS and these are necessary building blocks to the realization of the XDI DataWeb.

So, here’s some of the news and my current thinking…

First and foremost… we have cut the 1.0 Version of our DataWeb Server!! This is the server that we have deployed as part of the Kintera Project (that you can read about in earlier posts). This feat is doubly amazing because of the magnitude of the problem that we are trying to solve and the fact that this year Steve Churchill has been working solo on this project. Steve has performed a Herculean task in building, deploying, supporting and documenting this project… He is a one-man team of 20. THANKS STEVE!!

NEXT…

We implemented a plugin framework in our DataWeb server that lets anyone build plugins to access legacy data stores. It works great BUT it is something we made up. We are looking at replacing our plugin framework with Higgins IDAS (Identity Attribute Service). IDAS provides a ‘standards based’ interface definition for ‘Context Providers’… plugins to access legacy systems.

I have started thinking about the qualities that are 'lacking' in IDAS in order for it to be able to replace our plugin framework.... not that it doesn’t do what it does well... just that there are other things that it 'could' be made to do... that it doesn't now.

With the assumption that IDAS implementations sit 'close' to the underlying systems, on the same LAN, caching should not be needed, at least for the classic network latency optimization considerations. Caching could be used for fault tolerance and system failure scenarios but that's a whole other issue. Caching can reduce IO but the problems of keeping that cache in synch far outweigh that consideration if we solve the other problems that I talk about here. In theory, the data is 'right there' so duplicating it SHOULD not be necessary.

What we do need is the ability to 'find' stuff.... Find all of the Digital Subjects whose home city is 'Oakland'. What you DON'T want to have to do is:

1) Traverse all Contexts to see which expose 'Home City' attributes about their subjects

2) Traverse all Subjects in the identified Contexts to query and test the Home City attributes

While caching would help with this problem it is far from a good solution... we don't want to be doing mass traversals, ever, at query time.

What we want to do is pre-determine which attributes are going to be 'search criteria' .... yes.... you MIGHT want to search on any criteria, in which case you have to take the hit of searching without an index... they haven't even solved this in RDBMS world... you can build SQL queries that take days to run and then add a couple of indexes and run them in minutes. Once you have determined the use cases… add the indexes. (compound keys and simple ‘set’ math across multiple indexes can give you significant flexibility and power)
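To illustrate that parenthetical, here's a toy sketch (not IDAS, not our DataWeb Server): one inverted index per searchable attribute, and compound queries become set math:

```python
# Toy inverted indexes: attribute value -> set of subject pointers.
# Purely illustrative; the values and pointers are made up.
home_city_index = {
    "Oakland":  {"subj-001", "subj-007", "subj-042"},
    "Berkeley": {"subj-003"},
}
status_index = {
    "active":   {"subj-001", "subj-003", "subj-042"},
    "inactive": {"subj-007"},
}

# "home city is Oakland AND status is active" is just set intersection:
result = home_city_index["Oakland"] & status_index["active"]
print(sorted(result))  # ['subj-001', 'subj-042']

# A 'startsWith'-style match scans the index keys, never the contexts:
oakland_ish = set().union(
    *(ptrs for city, ptrs in home_city_index.items() if city.startswith("Oak")))
```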

Executing a search against an index results in a list of pointers to Subjects that meet the search criteria. It should NOT result in a list of pointers to the attributes themselves… remember you are unlikely to query… get me all of the home cities for all of the people whose home city is Oakland… You probably want to query something like; get me the email addresses and names of everyone whose home city is Oakland. (We do NEED to support ‘complex’ matching logic… startsWith, endsWith, greaterThan, beforeDate, etc…)

The next problem is optimizing the data access… The ‘easy’ way to process the results is to iterate over the list dereferencing the pointers… our experience has shown that this royally pisses off the DBAs…. What I mean is, if the ‘Context’ is an RDBMS then the iteration approach results in executing “SELECT email, fullname FROM people WHERE userID = ‘XXXX’” as many times as there are results in the set. This is slow and, as I said, not popular with the DBAs. You need to be able to package your query into “SELECT email, fullname FROM people WHERE userID IN (‘XXX’,‘YYY’,‘ZZZ’,‘ABC’)” and then parse the results back in your ‘client code’. I put ‘client code’ in quotes because I don’t mean that this is done by the application coder; it should be done by the IDAS implementation. As an application developer I want to be able to say to IDAS… “Get me all of the emails for people that live in Oakland” and get back a list of emails, and never have to care that half of the emails were in Oracle and half were in PeopleSoft. BUT, I want to know that only 2 calls were made across the network (I have had to PROVE this to our customers in order for them to accept our DataWeb Server, they really care about this).
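Here's the shape of the batching, sketched with sqlite3 standing in for a real context; the table and column names are just the ones from the example above:

```python
# Sketch of the batching idea; in production each context might be
# Oracle, PeopleSoft, etc. rather than an in-memory sqlite database.
import sqlite3

def fetch_emails(conn, user_ids):
    """One round trip for the whole batch, not one SELECT per subject."""
    placeholders = ",".join("?" for _ in user_ids)
    sql = ("SELECT userID, email, fullname FROM people "
           f"WHERE userID IN ({placeholders})")
    return conn.execute(sql, list(user_ids)).fetchall()

# Stand-in for one legacy context:
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (userID TEXT, email TEXT, fullname TEXT)")
conn.executemany("INSERT INTO people VALUES (?, ?, ?)", [
    ("XXX", "a@example.com", "Ann"),
    ("YYY", "b@example.com", "Bob"),
])

# The index search handed back subject pointers; batch them into ONE query:
print(fetch_emails(conn, ["XXX", "YYY"]))
```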

That’s my first pass at ‘what’ we need to do…. Next we have to work out ‘how’ within the existing IDAS spec :-)

Ohhh… and robust distributed transactional management. I will add others as I think of them.

Saturday, October 20, 2007

Wingaa takes flight

After months of background work Wingaa is finally born. Patrick Audley and John Bradley, recently of Cogneto and the ooTao team, are glad to announce that Wingaa is finally launching.

Wingaa means 'My name is...' in the Central Alaskan Yup'ik language.

Wingaa will continue the work of ooTao in bringing together strong authentication and user-centric services, making them readily accessible to everyone.

ooTao will continue to develop core technology and provide professional services in the identity space.

Sunday, October 07, 2007

OpenID and you

John Bradley posted this great post about how to use your i-name at OpenID sites that don't have i-name support yet... It's a great tech tip for those of us that like to tinker. For the rest of you, don't worry; the new OpenID specs have full i-name support and will be deployed very soon.

Thursday, September 27, 2007

Adopting Evolution

In my bitchier moments I have been heard to say… “OpenID; brought to you by people who didn’t want to read the SAML spec”. I truly believe that the process of enhancing OpenID from supporting its original use cases to supporting a wide range of internet scale activities of varying values will eventually see OpenID evolve to be a fully compliant SAML definable profile.

So I have been asking myself: why has OpenID grabbed so much popularity while SAML, a much more mature, academically respected, ‘robust’ specification has been largely ignored by the cutting edge web 2.0 community…. An image came to me that I think might be profound, at least for me, and this blog seems like as good a place as any for me to try to get it out of my head.

I'm imagining a perfectly planned city… you bring together the best minds in social and urban planning and have them design and build the perfect city. Then you ask people to move to it… it’s big and empty and impersonal, its very perfection is off-putting and intimidating. Meanwhile, just down the street there is a collection of mud huts with lots of people milling about, drinking beer and having fun. People are flocking to the village and it’s growing rapidly. The urban planners that built the city say: but don’t you see, you will need all the infrastructure that we have built in order to continue to thrive as a community, you’ll need police, medical and fire services, you’ll need schools and water pumping stations. But still people flock to the village to be part of growing something new and exciting. The villagers say: if we need police, someone will step up and become a policeman; if there’s a fire we’ll get together and put it out. The inevitable outcome of the growth of the village seems to be a less well planned version of the planned city. It will, by inevitability, have many of the same features, some less well executed and some surprisingly better than the planned city.

To me, and maybe it’s just me, I know I would much rather be part of the village than move into the city. I might not want to re-invent the wheel but internet identity is a large, complex and subtle system; like a city, it isn’t a wheel. Internet identity is going to have very organic qualities… I’m wondering if the growth, the evolution of the organic system, isn’t the magic sauce that will actually humanize internet identity… I think it might be necessary that we start with simple organisms that can evolve and branch, and each branch succeed or fail based on its efficacy in an ever changing environment. If two teams of engineers looked out over an early version of Earth's ecosystem, and one designed ‘the perfect organism’ while the other designed an amoeba capable of rapid reproduction and innovation, which would you bet on for long-term survival?

I’m not saying that the SAML community isn’t also receptive, open, innovating and evolving, they are. I repeat my original statement that I do think that SAML is more mature and ‘robust’ than OpenID… I’m simply trying to understand the juju that OpenID has (at least in my opinion).

now you're talking...

Just been sent this link... a must see if you are very serious about social networking

Monday, September 17, 2007

OpenID 2.0 discovery (with 1.0 compatibility)

This is how we are doing it in our client libraries... does it look right to you?
[Flowchart: OpenID 2.0 discovery with 1.0 fallback]
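Since the flowchart itself hasn't survived, here's the order of operations in rough Python. The service type URIs and the X-XRDS-Location header come from the published specs; the rest is a sketch, not our actual library code:

```python
# Sketch of OpenID 2.0 discovery with 1.x fallback (not our real library).
import re
import urllib.request
from xml.etree import ElementTree

OPENID2 = "http://specs.openid.net/auth/2.0/signon"

def discover(identifier):
    """Return which OpenID generation an identifier supports, if any."""
    if identifier[0] in "=@+$!":   # XRI: resolve via a public proxy resolver
        url = "https://xri.net/" + identifier + "?_xrd_r=application/xrds+xml"
    else:                          # URL: Yadis discovery
        url = identifier
    req = urllib.request.Request(url, headers={"Accept": "application/xrds+xml"})
    resp = urllib.request.urlopen(req)
    body = resp.read().decode("utf-8", "replace")

    # Yadis may hand back the XRDS location in a header instead.
    loc = resp.headers.get("X-XRDS-Location")
    if loc:
        body = urllib.request.urlopen(loc).read().decode("utf-8", "replace")

    if loc or "xrds" in resp.headers.get("Content-Type", ""):
        types = [t.text for t in ElementTree.fromstring(body)
                 .iter("{xri://$xrd*($v*2.0)}Type")]
        if OPENID2 in types:
            return "OpenID 2.0"
        if any(t and t.startswith("http://openid.net/signon/1.") for t in types):
            return "OpenID 1.x"

    # HTML fallback: look for the 2.0 link rel first, then the 1.x one.
    if re.search(r'rel=["\'][^"\']*openid2\.provider', body):
        return "OpenID 2.0 (HTML discovery)"
    if re.search(r'rel=["\'][^"\']*openid\.server', body):
        return "OpenID 1.x (HTML discovery)"
    return None
```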

Thursday, July 05, 2007

New RPs

Well the first of the Geffen Artist web sites have gone live as OpenID 2.0 (WD 11) relying parties. If you have a little time go login and play around… we would love to hear your feedback about the integration and user experience. The first 2 sites are:


http://www.rooney-band.com

http://www.evefans.com

Monday, June 25, 2007

More news...

I don't think I shared this with you-all...

GEFFEN RECORDS OFFERS SIMPLE SIGN ON FOR MUSIC LOVERS

ooTao, a Leader in Open Identity Management, Partners with Record Label

June 12, 2007 (Berkeley/Los Angeles) – Geffen Records has partnered with ooTao, a leader in Open Identity (Open ID) management, to make it easier for music lovers to log on and connect with all their favorite artists, such as Snoop Dogg, Mary J. Blige, the Cure, Nelly Furtado, Trevor Hall, and Ashlee Simpson. The new “Single Sign-On” launches June 18 and will be expanded to include other Geffen artists as well as new services for fans.

Lee Hammond, Director of New Media for Geffen Records, has been working with ooTao President Andy Dale for almost a year to make it easier and more secure for fans to log on with a single user identification, or i-name. Says Hammond, “We love the promise of OpenID to reduce registration barriers for newcomers to our properties as well as make it easier for current audiences to crossover to new artists’ content.”

Artists love the new feature, too, said ooTao’s Dale, because music lovers no longer have to sign off and sign in again every time they want to see what’s happening with an artist. Using the same i-name, says Dale, fans can sign into the Geffen site and read about and listen to the music for their favorite artists without leaving the site.

More is in the works, adds Dale. “After this initial phase goes live on June 18, we expect to continue working with Geffen to leverage these identity standards and bring music lovers truly innovative services that allow them to get closer to the music and the artists.”

ooTao (ooTao.com) is an engineering development company specializing in distributed data sharing and identity management infrastructures. Its founding partners are leaders in developing standards for Open Identity management.

#####

Wednesday, June 13, 2007

Expediting XPP

I am hoping to quickly and painlessly work up a basic 1.0 spec for XPP (XRDS Provisioning Protocol). This should be simple and easy to implement. I am going to drive the process and try to get the spec done in the next month… then people can use it, or not.

If you want to be a part of this open process add your name to the people section on the front page of the xpp wiki at http://xpp.seedwiki.com. I will organize a conference call for the middle of next week for anyone that wants to play with me.

Friday, June 08, 2007

Validating i-name claims.

There will be many ways in which people will assert claims over the internet and depending on the nature of the claim and the identity of the claimant we will have to do different types of validation. I’m talking about claim validation in a dynamic trust environment, in a distributed identity network. There is no assumption of a prior relationship between the asserting party (AP) and the Relying Party (RP).

When I talk about validating a claim what I’m really talking about is: is there a way for the RP to be sure that the AP can be trusted for this specific claim. When I say ‘this specific claim’ I mean 2 things:

1) that the AP is trusted by the community to make this claim type (just because they are trusted for one claim doesn’t mean they should be trusted for all claims)

2) that the identifier about which the claim is being made has indicated that this is their designated AP for this claim. (I think this is only an issue in some claim types)

Specifically I am starting with i-name claims and I will explore this from an InfoCard perspective. This is a totally valid way to authenticate ‘ownership’ of an i-name although the mechanism is totally different from an http redirect authentication protocol like OpenID, BBAuth, Google’s SAML implementation, etc… In the http redirect case the i-name is the identifier and therefore validation/authentication of ownership is the point of the entire interaction. In the InfoCard case the i-name is just another piece of metadata associated with the PPID. Self asserted claims about i-name ownership should never be accepted.

So here’s what an RP should do when they get an i-name claim via InfoCards…

Perform SEP resolution to find the designated ‘InfoCardService’ published by the owner of the i-name. The only AP that should be acceptable for claims about that i-name should be that entity. This should be enough validation. (Obviously you are checking the crypto to make sure that the AP is who they say they are.)
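In toy Python, the acceptance rule looks something like the following. This is an illustration only: 'InfoCardService' is my shorthand, not a registered service type URI, and the SEP list is a hard-coded stand-in for real resolution results:

```python
# Illustrative check only; type and provider values are made up.
def acceptable_ap(resolved_seps, asserting_party):
    """The only AP we accept for an i-name claim is the token service
    the i-name owner designated in their own XRDS."""
    designated = [sep["provider"] for sep in resolved_seps
                  if sep["type"] == "InfoCardService"]
    return asserting_party in designated

# Stand-in for the SEPs that XRI resolution returned for the i-name:
seps = [{"type": "InfoCardService", "provider": "https://sts.example.com"}]
print(acceptable_ap(seps, "https://sts.example.com"))   # True
print(acceptable_ap(seps, "https://evil.example.com"))  # False
```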

In order to defeat this validation a would-be spoofer would have to have subverted the XRI resolution and injected their own SEP; if the RP is using the http proxy resolver this can be achieved by subverting the DNS layer. Once I start to assume that DNS is compromised any validation starts to fall apart… so either use a local resolver in trusted resolution or some other mechanism of trusting your resolution infrastructure if you deem it necessary… personally I’m not sure that the value of the i-name claim in this context requires a particularly high level of paranoia. In the InfoCard use cases it’s the PPID that is the identifying key, and any additional services derived from the i-name should be separately validated anyway.

Tuesday, June 05, 2007

Final word on XRDS… for now

So while posting a comment to Phil’s blog on this thread I finally hit upon the thing that I have been trying to say in a clear concise way… Sorry you have had to watch me formulate this idea in real time.

The simple statement is this:

SEPs in XRDS must be considered self asserted claims and as such should not be trusted on their face. Service Providers should publish the mechanisms by which SEP claims should be validated to be about a specific subject (authenticated identifier). (ooo… I feel another spec coming).

Monday, June 04, 2007

Even More on XRDS.

Phil Windley picked up the XRDS conversation with a great post. I just want to reiterate my concern about misunderstanding of XRDS usage. You may have all already grokked this, in which case I apologize for the redundancy, but I want to make sure that this is really clear.

The problem is this:

Let's say that I have 2 services listed in my XRDS: an OpenID authentication service and a photo service. I go to a dating service and log in using my OpenID. The dating service now looks for a photo service in my XRDS, finds it, and presents the photos found as photos of me. The danger here is that, because I logged in using one service listed in my XRDS, there is the impression that there is some validation that the photos really are of me; this simply isn’t true.

All you can derive from the fact that multiple services are listed in a specific XRDS is that one of the entities that has access to edit that XRDS wants to associate that service with that identifier. Service endpoints in XRDS documents should be treated like any other self asserted claim.

At this point I want to give a big thanks to Steven Churchill, ooTao’s CTO. I recognized this vulnerability of XRDS and discussed it with Steve; he then went ahead and solved the problem, at least as it pertains to i-name resolution and services in XRDS. His solution is now a part of the XRI resolution specification; look up Canonical ID verification.

Some things that do work:

If you know that services use the Canonical ID as their ‘key’ then you can use canonical ID verification to ensure that the same i-number is the subject of both your authentication request and your service access request. Assuming the service has authenticated and validated the user correctly when the service was provisioned you can use this mechanism to create trusted service bindings. The trouble here is knowing which service providers to trust, both in intent and implementation.
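Here's a minimal sketch of the equality check. It leans on the public proxy resolver and only compares CanonicalIDs; the normative verification algorithm in the resolution spec does more than this:

```python
# Illustrative only: compare the CanonicalIDs behind two identifiers.
# The real Canonical ID verification in the XRI resolution spec also
# verifies the authority chain, not just i-number equality.
import urllib.request
from xml.etree import ElementTree

XRD_NS = "{xri://$xrd*($v*2.0)}"

def canonical_id(iname):
    url = f"https://xri.net/{iname}?_xrd_r=application/xrds+xml"
    xrds = ElementTree.fromstring(urllib.request.urlopen(url).read())
    cid = xrds.find(f".//{XRD_NS}CanonicalID")
    return cid.text if cid is not None else None

def same_subject(iname_a, iname_b):
    """True if both i-names resolve to the same i-number."""
    a, b = canonical_id(iname_a), canonical_id(iname_b)
    return a is not None and a == b
```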

A simpler case that does work is the case like EZIBroker’s Claim Services. In this case the semantics of the service itself assert the relationship between the service and the identifier. This works equally well with URLs or i-names (I think). A user authenticates (demonstrates their access to the credentials for a given identifier) using a specific OpenID identifier. The relying party then looks up the ‘claims service’ (could be WS-*, SAML, AX, XDI, etc…). The assertions that are generated by the claims service specifically assert the claim and the related identifier. For example: signed {=andy is over 21}. If you find this claim from a service in =andy’s XRDS then it makes a lot of sense… If you find it from =jim’s XRDS you know there may be a problem. (of course if =andy and =jim resolve to the same canonical ID then it’s all good). A service that presents the claim signed{this person is over 21} is obviously not useful in the context of an XRDS unless there is an additional authentication step that lets the user assure the Relying Party that they are ‘this person’.

NOTE: The EZIB claims service will be providing one-time opaque identifiers that can be used with claims, so the relying party only needs to know that !2003!1928.2746 is over 21 and !2003!1928.2746 is the person logged in. This will satisfy some of the privacy concerns but not all; we are not claiming that we are building a zero knowledge system.

Wednesday, May 30, 2007

More on XRDS

In trying to answer Saronson01’s question I seem to fall into a total flow of consciousness that might make no sense to anyone but me… sorry. I will try to frame this more clearly soon. Keep the questions coming as that is what fuels the fire...

Saronson01 asked

When "adding" services to an XRDS document how are the services used, viewed, etc.? It seems that there is a missing piece regarding how a service is actually "consumed." How would the i-name @images*andy utilize a flickr feed?

Now I think I understand the question.. but if I’m answering the wrong question restate and I’ll try again…

For the sake of this answer I am going to refer to i-names but this is mainly true for any identifier that can be resolved to an XRDS. The reason I say ‘mainly’ is that i-names assume an abstraction between the human friendly i-name and the persistent i-number (canonical ID). The i-name resolution infrastructure supports both trusted resolution and CanonicalID Verification. These are qualities not shared by URLs that resolve to XRDSs. Once we start to use the richness of the XRDS for discovering services other than OpenID for URLs we will have to explore the security implications of these differences.

The simple pattern for XRDS usage is this:

  • I go to some application… it may be a web app or it may be a thick app… doesn’t matter.
  • I enter my i-name into the app.
  • The app knows what service it is looking for so it performs Service EndPoint (SEP) resolution for the service it is looking for and gets back the needed information about that service, where it can be found and how it can be accessed.

The most common current usage of this pattern, today, is to find authentication services, i.e. to enable SSO. In that case the ‘type’ that is looked for is http://openid.net/signon/1.0

So to answer your question… why would I put my flickr feed into my XRDS. Here’s my answer…

I create a new Web Application that creates Flash photo albums for use on MySpace… It uses OpenID to authenticate people. A person comes and logs into my new service using their i-name. Once they have authenticated I need to know where to find their photos… I could ask them, but if they have configured a service of type http://photo.feed/1.0 then I don’t have to ask them, I just know… In fact if I know someone’s i-name I can look up their photo feed (just like I can look up their authN service) for whatever purpose I want.
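Concretely, that "I just know" step is a single typed SEP lookup. A rough sketch (the _xrd_t service-type filter is the proxy resolver query parameter as I remember it from the resolution spec, and http://photo.feed/1.0 is of course the made-up type from above):

```python
# Sketch: look up a typed Service End Point for an i-name via the
# public proxy resolver. The photo feed type is hypothetical.
import urllib.parse
import urllib.request
from xml.etree import ElementTree

XRD_NS = "{xri://$xrd*($v*2.0)}"

def find_sep_uri(iname, service_type):
    qs = urllib.parse.urlencode(
        {"_xrd_r": "application/xrds+xml", "_xrd_t": service_type})
    url = f"https://xri.net/{iname}?{qs}"
    xrds = ElementTree.fromstring(urllib.request.urlopen(url).read())
    for svc in xrds.iter(f"{XRD_NS}Service"):
        types = [t.text for t in svc.findall(f"{XRD_NS}Type")]
        if service_type in types:
            uri = svc.find(f"{XRD_NS}URI")
            return uri.text if uri is not None else None
    return None

# Once someone logs in with their i-name, just look up their feed:
# find_sep_uri("=example.user", "http://photo.feed/1.0")
```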

Yes, XRDS is public, so you only want to put in services that you are happy for people to know about in relation to your publicly known identifier.

Putting stuff in the XRDS rather than in AX (or XDI) makes sense when you are happy with the information being public and you want the optimization of using a very lightweight ‘resolve’ protocol on top of high availability backbone infrastructure.

The SEP schema (part of an OASIS spec) is specifically designed to describe this type of data. Sometimes there is goodness in using well defined, domain specific, schemas rather than abstract ‘can describe anything’ schemas.

Although resolution is designed to be public, we have devised several mechanisms to terminate public resolution in private realms for privacy and security reasons. These are useful for specific use cases.

Using i-names with CID validation and trusted resolution you can authenticate a user via OpenID and then access a service that they have in their XRDS (with the CID as the primary key) with a very high level of confidence that the service is truly related to, or providing information about the entity that authenticated. (Assuming that you trust the service, but that’s a whole other issue).

EZIBroker is in the process of building and rolling out an Age Claims Service that can be associated with an i-name… Any relying party who can provide the right credentials to the age claims service may have access (under the users control) to age claims about that i-name owner. The XRDS provides the glue between finding the authN service for the user to know they are who they claim to be (ok.. have access to the credentials for that account) and the Age Claims Service that provides necessary claims for the i-name user to buy their beer on-line.

Monday, May 28, 2007

Making use of the XRDS

The XRDS (eXtensible Resource Descriptor Sequence) document is an XML document that you will find behind every OpenID 2.0 identifier, both urls and i-names. The XRDS contains a list of ‘Service End Points’ (SEPs) that describe the services associated with the identifier, where they can be found and how they can be accessed. Notably the most important SEP from the OpenID 2.0 (Yadis) standpoint is the authentication endpoint that tells the relying party where the OpenID service can be found.
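To make that concrete, here's a hand-made minimal XRDS (not anyone's real document) and the few lines of Python it takes to read the SEPs out of it:

```python
# A hand-made minimal XRDS and the code needed to list its SEPs.
from xml.etree import ElementTree

SAMPLE_XRDS = """\
<xrds:XRDS xmlns:xrds="xri://$xrds" xmlns="xri://$xrd*($v*2.0)">
  <XRD>
    <CanonicalID>=!F83.62B1.44F.2813</CanonicalID>
    <Service priority="10">
      <Type>http://openid.net/signon/1.0</Type>
      <URI>https://op.example.com/openid</URI>
    </Service>
  </XRD>
</xrds:XRDS>
"""

NS = "{xri://$xrd*($v*2.0)}"
root = ElementTree.fromstring(SAMPLE_XRDS)
for svc in root.iter(f"{NS}Service"):
    print(svc.find(f"{NS}Type").text, "->", svc.find(f"{NS}URI").text)
```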

Remember that XRDS was originally brought into OpenID as part of Yadis; a mechanism designed to provide interoperability between OpenID and LID, 2 http redirect authentication protocols that both use URL identifiers. Yadis, and therefore XRDS, provided a way to describe which authentication protocol was associated with a particular url. Once we know that a specific URL can be resolved to an XRDS we can associate any number of services with that URL… SAML authN, XDI, Higgins Context Provider Factory Class, Flickr feed, reputation service, age claims, etc… All of this is a given for i-names but OpenID urls have the capability as well.

The problem is this: XRDS documents are XML documents, not particularly complex ones, but XML nonetheless. Imagine my mother… I bought an i-name for her… I believe she can remember that Gillian.dale is her name (shameless i-name plug: no, I don’t know that she could deal with any url form of her name reliably). So, she has her i-name and uses it to log into services that accept OpenID 2.0; she now only has to remember one username and password and I get a lot fewer support calls.

What happens when someone wants to sell her a new service? Let's say that someone launches a better authentication service (and I know a bunch of people working on that). They do not want to tell my mother to go edit her XRDS… if her OP even gives her access. So years back, I spec’d (with help from others of course) an XRDS provisioning protocol. It’s a very simple http redirect protocol… Mum goes to a new service and wants to get it… she clicks on the ‘get this service’ link… the would-be service provider looks into her XRDS for the provisioning endpoint… and redirects her to it together with the SEP details for this new service… Mum now sees a dialog, from her own OP (with all the same phishing controls that she is used to at her OP for logging in) that says… “Service X wants to become your new service provider, do you want to continue?” … this makes total sense to her as she got this message as a result of saying “get this new service”. Assuming she tells her OP to go ahead and add the service it can now add the SEP to her XRDS and she has a new service (probably something to do with grandkids).
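For flavor, here's a sketch of the redirect the would-be service provider might build. Since the spec was never finished, every parameter name below is invented for illustration and won't match any draft:

```python
# Hypothetical illustration of the XPP redirect; the parameter names
# are made up, not from any draft of the spec.
import urllib.parse

def build_provisioning_redirect(xpp_endpoint, sep_type, sep_uri, return_to):
    params = {
        "xpp.mode": "add_service",   # invented parameter names
        "xpp.sep_type": sep_type,
        "xpp.sep_uri": sep_uri,
        "xpp.return_to": return_to,
    }
    return f"{xpp_endpoint}?{urllib.parse.urlencode(params)}"

# The service provider finds the provisioning endpoint in Mum's XRDS,
# then sends her browser here; her OP asks for consent and writes the SEP.
print(build_provisioning_redirect(
    "https://op.example.com/xpp",
    "http://grandkids.example/photos/1.0",
    "https://newservice.example/feed",
    "https://newservice.example/welcome"))
```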

Now I never completed the XPP (eXtensible Provisioning Protocol) spec as no one seemed to care enough about it... So here is that first draft, if anyone out there wants to work with me on finishing it I would love it. I wrote this originally for i-brokers but it would be trivial to generalize it to any OP.

Thursday, May 24, 2007

More to read...


This has very little to do with XDI except that it is a new channel through which I am trying to spread the meme of User Centrism and Digital Identity technology awareness to a less technical audience. RealitySandwich.com has only been live for a couple of weeks but it is getting an average of 1000 unique visitors a day. I have a regular column on the site that I am hoping will raise awareness and understanding of the work that ‘we’, the people who read this blog, are involved in.


Here is the original launch announcement:

REALITY SANDWICH LAUNCHES

EVOLVING CONSCIOUSNESS, BITE BY BITE

A new web magazine for these times of intense transition, Reality Sandwich launches today at http://www.realitysandwich.com.

Reality Sandwich covers topics from sustainability to shamanism, alternate realities to alternative energy, remixing media to re-imagining community, holistic healing techniques to the promise and perils of new technologies. It hopes to spark debate and engagement by offering a forum for voices ranging from the ecologically pragmatic to the wildly visionary. Reality Sandwich includes news, reflective essays, arts, interviews, podcasts, and forums. Counteracting the doom-and-gloom of the daily news, the site is a platform for perspectives conveying a different vision of the transformations we face.

Among the more than 40 participating contributors are: Daniel Pinchbeck, Melinda Wenner, The Yes Men, Paul D. Miller aka DJ Spooky, David Rothenberg, Douglas Rushkoff, John Jay Harper, Kaliya Hamlin, Henri Poole, Andrew Boyd, Aline Duriaud, Ken Jordan, Jonathan Phillips, Elizabeth Thompson, Andy Dale and Michael Brownstein.

Reality Sandwich is built with Drupal, the popular open source online publishing system, by the free and open source software consultants CivicActions.


Wednesday, May 23, 2007

CAPEC

If you build software that is meant to be secure you might find the CAPEC site as informative and useful as I do.

Wednesday, May 02, 2007

Distributed Computing

I came across The Eight Fallacies of Distributed Computing today attributed to Peter Deutsch. The list is:
1. The network is reliable
2. Latency is zero
3. Bandwidth is infinite
4. The network is secure
5. Topology doesn't change
6. There is one administrator
7. Transport cost is zero
8. The network is homogeneous

I'm glad to say that these are all issues that we have explicitly addressed in our xdi work, except, maybe, number 7, which I don't really understand.

Thursday, April 05, 2007

People who like this blog will also like:

Mike Jones at Microsoft has started blogging. Mike is one of those really nice, quiet, brilliant people that I have the pleasure of working with from time to time. He is deeply insightful about technology in general and digital identity technology specifically. I will be watching his blog and know I will learn from it.

Saturday, March 31, 2007

The quality of data is not strained

“The quality of data is not strained; It droppeth as the gentle rain from heaven Upon the place beneath. It is twice blessed: It blesseth him that gives, and him that takes.” – A bastardization of William Shakespeare.

I am often faced with the question; “Why don’t we just do this with our Web Services?” Generally when I’m asked that question it’s in relation to what I call Dataweb technologies. When I’m asked it in other contexts it makes even less sense.

There are many answers to this question and different ones tend to resonate with different people. One of the main qualities of the Dataweb that I strive for is the richness of interaction that one gets when accessing data, through ANSI SQL, that is in a well designed schema. This is a quality that you only grok if you have spent time writing database reports or very data intensive apps. Those of us that have been there know that extracting information from a well written schema is a joy. In fact, given a little imagination and a reporting tool you can learn stuff from a well built data set that you didn’t know you knew. This phenomenon fueled a whole industry starting back in the mid 80’s when ODBC first hit our radar. We still build big data warehouses that we troll and derive new information and stats from, but only inside closed systems.

Back in the early 80’s all the data was locked up on the mainframes and we started writing PC apps that needed to access that data. Each time we wrote an app, we wrote a data driver to access the data we needed off the mainframe. There was very little reusability and no appreciation of the ‘value of data’. Then, along came ODBC, the first widely adopted manifestation of ANSI SQL, and everything changed. Now, you built an ODBC driver that could access your mainframe and went to town; you never had to write another custom driver again. This was the inflection point where we discovered that using a fluid, abstract, data access mechanism let us learn new things from the data we had already collected. The difference between those custom data drivers and the ODBC data access paradigm was that the drivers tightly bound the purpose of the access of the data to the mechanism for accessing it, while ODBC (SQL) provided an abstract mechanism that didn’t care what the data was or how it was going to be used. These qualities were inherent in the way we thought about those custom data drivers; when we designed and built them we built interface definitions getUser(), getInvoice(), etc… We used method invocation to access the data we needed. SQL provided us a way to query any schema in any way and ‘try new stuff’ without having to re-program our data access layer.

Given my example of getUser() and getInvoice(), what happened if I wanted to find out if there was any correlation between geographic region and annual total purchases… I was basically stuck waiting for the mainframe guys. With SQL in place I could slice and dice my 2 table schema (users and invoices) any way I wanted. I could look for patterns and play to my heart’s content… but it wasn’t really play, it was the birth of business intelligence. Now that I could work out the profile of my best customers, I could target other people with that profile to become my new customers. How’s that for an unexpected outcome from a higher level of data access abstraction?
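For anyone who wants to see it, that slice-and-dice is a single query. A runnable sketch against sqlite3, with invented column names:

```python
# The region-vs-purchases question as one SQL query over the two-table
# schema; the column names and sample rows are invented for the example.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users    (userID TEXT, region TEXT);
    CREATE TABLE invoices (userID TEXT, total REAL, year INTEGER);
    INSERT INTO users VALUES ('u1', 'West'), ('u2', 'East');
    INSERT INTO invoices VALUES ('u1', 120.0, 1985), ('u1', 80.0, 1985),
                               ('u2', 40.0, 1985);
""")

# No new data driver, no waiting for the mainframe guys:
for row in conn.execute("""
        SELECT u.region, SUM(i.total) AS annual_total
        FROM users u JOIN invoices i ON i.userID = u.userID
        WHERE i.year = 1985
        GROUP BY u.region"""):
    print(row)   # ('East', 40.0) then ('West', 200.0)
```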

The way that we conventionally use Web Services today is not just akin to those old data drivers, it is the same thing. We know this; it’s inherent in the names of the protocols that we use: XML-RPC, Remote Procedure Calls; method invocation. getUser() and getInvoice() would be very reasonable methods to see defined in a WSDL.

Now sometimes you need the quality of RPC; you don’t want people to be trolling through your data and deriving all sorts of stuff, you want to keep them on a tight leash, so use conventional Web Services. I call this integration pattern ‘application integration’, not data integration.

The protocols that support the Dataweb, XRI, Higgins, XDI, SAML, OpenID, WS-*, etc… provide mechanisms to access a distributed network of data with the same richness as if you were accessing a single data source via SQL, but with more control. Imagine doing a database join between two tables, now imagine doing a join between two heterogeneous, distributed systems… wouldn’t it be cool?

The qualities of an abstract data layer are; a well defined query language that can be used to access a well defined abstract data model that in turn returns a persistence-schema agnostic data representation. These qualities are shared by SQL, XDI and Higgins.

When contemplating a data abstraction for a distributed data network there are some other things that we have to add to the mix; trust frameworks, finer grain security, social and legal agreements, network optimization, fault tolerance, to name but a few… And that is what I spend a lot of my time thinking about.

So I hope that this describes somewhat why Dataweb technology is different from conventional Web Services implementations, although they run on the exact same infrastructure.

It is interesting to note, and I may be way off line here so if you know better please correct me, from what I’ve seen: SalesForce agrees with me. What I mean by that is that their new generation Web Services are some of the most abstract interfaces you are likely to see in a system that derives so much of its value from its programmatic interfaces. (Along with Kintera who we are working with). The only downside with the SalesForce approach is that it’s proprietary, which is a shame, when there are open standards that, appear, on the face of it, to satisfy their requirements. (SalesForce, I’d love to hear from you if you want to talk about this.)

Wednesday, March 14, 2007

Higgins IdAS and XDI

The more I look at the Higgins IdAS the more I recognize that it is the part of the puzzle in the Higgins world that maps fairly closely to what I call the XDI Engine. They both present abstract data interfaces that are meant to be put in front of legacy persistence. I have been telling Paul for a while that I think that IdAS is going to need indexing capability to be really useful. I realized, not long ago, that we need to replace the ooTao specific ‘plugin’ engine with an IdAS implementation. I am seeing more and more that once xdi takes into account the Higgins IdAS use cases, and Higgins IdAS consumes the xdi use cases as subsets of the complete ‘dataweb’ use cases, an xdi engine and an IdAS implementation are going to end up being pretty much identical. I watch the Higgins-dev list and learn and hope to contribute where I can.

In that light I am going to start putting more Higgins musings on this blog as well as xdi stuff.

Here is a thought provoked by a current discussion on the Higgins list:

The way that we are dealing with systemic and semantic mapping in xdi is by introducing an xri abstraction into the mix... attribute types are xris, generally in the '+' namespace, known as the 'dictionary space', like +email, or +first.name. Unlike the '=' and '@' namespaces the '+' namespace is not a rooted space, but I'll get back to that.

So in xdi land, any attribute name is resolvable in dictionary space to a dictionary entry, a dictionary entry may include a bunch of different stuff, including:

  • synonyms (both semantic (street and rue) and systemic(phone_number and phoneNumber))
  • schematic constraints (+address = must link to 1 or 2 streets, 1 city, 1 state and 1 zip... I KNOW that +address is a bad example because it's not a global construct)
  • validations (validation lists, regular expressions (masks), executable validation scripts (different implementations in different languages))
  • UI implementations (for building rich UIs for arbitrary attribute types; +eye.color may provide a color picker that limits color choices to natural human color range as dcom, .class, xul, etc...)

So, in xdi land, as we build indexes of the various contexts (one of the primary 'qualities' of xdi is indexing the contexts it knows about so that you don't have to go trolling 200 contexts to find the attribute that you need about a given subject), rather than indexing the attribute type '+email' we index the canonicalized i-number that the i-name resolves to… +!3215.2154.1254.

example:

xri://=andy/+email, 'andy's email address', points to a specific attribute in a specific context but what we persist in the index is xri://=andy/+!3215.2154.1254

Now when anyone wants to do a get against the index they can search for xri://=andy/+email or xri://=andy/+e_mail or xri://=andy/+Email or xri://=andy/+doar.hashmali (transliteration from Hebrew) and get back the desired record because the type is always resolved back to the i-number. On set operations the xdi engine checks the validations and schema constraints of the type before parsing the operation back into the 'context provider' to persist the new data.
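A toy sketch of that set/get path (the synonym table, the validation, and the i-number are hard-coded here; a real implementation would resolve the dictionary entries over the wire):

```python
# Toy sketch of dictionary-based canonicalization; everything here is
# hard-coded stand-in data, not a real dictionary service.
EMAIL_INUMBER = "+!3215.2154.1254"

DICTIONARY = {   # i-name synonym -> canonical i-number
    "+email": EMAIL_INUMBER, "+e_mail": EMAIL_INUMBER,
    "+Email": EMAIL_INUMBER, "+doar.hashmali": EMAIL_INUMBER,
}

VALIDATIONS = {EMAIL_INUMBER: lambda v: "@" in v}   # stand-in mask check

index = {}   # (subject, canonical type i-number) -> value

def set_attr(subject, type_iname, value):
    inumber = DICTIONARY[type_iname]               # canonicalize first
    if not VALIDATIONS.get(inumber, lambda v: True)(value):
        raise ValueError(f"{value!r} fails validation for {inumber}")
    index[(subject, inumber)] = value

def get_attr(subject, type_iname):
    return index.get((subject, DICTIONARY[type_iname]))

set_attr("=andy", "+email", "andy@example.com")
print(get_attr("=andy", "+doar.hashmali"))   # same attribute, Hebrew synonym
```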

I said that the '+' space is not rooted, so how does it resolve? Well, just like with English, you can look up a word in whatever dictionary you want, you might prefer Webster’s, personally I like the Oxford Standard. This quality lends itself to supporting a seamless continuum of global, community and personal dictionaries so you can be as precise or as vague with any given term as you like. A person can specify the intended dictionary for a given type: @ootao*(+email) would be ootao’s definition of +email and IS resolvable in the global @ namespace.

The early dictionary implementations that we are working with use a folksonomy approach to building the communal knowledge… anyone can edit the dictionary. So if your system uses a field name for an attribute that hasn’t been mapped yet, you just add it to the dictionary. Once one person has added the ldap schema and one person has added the vcard schema the world now knows that +cn is the same as +fn and they are both instances of +!3211.5485.3656, which is also +full_name, +person.name, etc…

I’m not saying we have all of the problems solved. Off the top of my head I don’t know how we would express the transformation between givenName, sn and cn… But I could propose a few suggestions if anyone was interested.

Tuesday, March 13, 2007

More on CardSpace and XRI

I like CardSpace. I finally got it installed on my XP machine at home and have used it to log into Kim's blog. Installing it wasn’t as easy as I would have liked; it was a big download, a long install and then I had to get ‘special’ tech support in order to get it to work (by special I mean I had to call someone I know over at MS, in that department, to help me). Now it turned out that it was an ‘obvious’ problem but on-line help was not easy to find and the error messages were not helpful. All I had to do was install IE7… another big download and install… BUT… that’s the price we pay for security :-)

I have seen CardSpace demos for years now, and have pondered the paradigm shift in the user login experience and have always liked it… Now that I’ve tried it I like it even more!! As a user experience this makes a lot of sense to me… and with some xri and xdi integration this thing could really rock :-0

There are 3 places that I would like to see xri and xdi integration into the CardSpace world. These opinions are based on a deep knowledge of xri and xdi and a pitiful understanding of anything beyond the basic mechanisms of WS-* that make up the CardSpace. I will try to explain the use cases and the properties of the interactions that I am looking for as I talk about these integration points and if there are alternate (better?) ways of solving the same problems I would love to hear about it.

Integration 1: Portability
One problem I still have with CardSpace is that my cards seem to be bound to a specific machine. If I create self issued cards at home and at the office and log in to Kim’s blog from both places; how do I get recognized as the same person?

In the Higgins project HBX card selector the Card Store is not on the client machine, it is ‘in the cloud’. I think that using i-names to bootstrap authenticating me and finding my card store would make CardSpace better. I want to walk up to any machine that is CardSpace enabled, enter my i-name, authenticate (using the multi-factor mechanism of MY choice) and have trusted resolution (not spoofable like DNS resolution) find my Card Store and let me use my cards. Now, I only have to log in once using my i-name, after that I just pick cards. Because I only have to log in once I’m fine jumping through a few multi-factor hoops to make sure that the authentication is solid. That would be cool!!

Integration 2: I-Name Authentication
OpenID is a great way to authenticate an i-name… but not the only way; I really like the ease of picking a card to login. BUT just because a card says “my i-name is =andy” does NOT mean it should be trusted. This is just the same as on Kim’s blog; my card asserted my email address but I still had to go through an email validation… you can’t trust self asserted claims!

So who should be able to make claims about i-name ownership… whoever the i-name owner wants… whoever the relying party is willing to trust… and here’s how that can work:

EZIBroker (an ooTao business) is about to start offering managed cards with i-name assertions. We hope that through our XDI.org accreditation and our general reputation we will become a trusted provider of assertions. But that isn’t enough… The owner of the i-name needs to ‘show’ that they have selected EZIBroker as their token service. They can do that by adding a Service Block of type ‘managed card’ to their i-name record (XRDS). So, a relying party, on receipt of an assertion that the ‘bearer’ of this card is the rightful user of the i-name ‘=andy’, should do 2 validation checks… 1) They should check that the asserting party is who they say they are and that they are trusted by the RP to make the claim and 2) They should perform xri resolution to check that the XRDS for that i-name does, indeed, designate that Token Service as the claims provider for that i-name (the theory being that only the i-name ‘holder’ can change the XRDS). XRI resolution should be performed by the RP anyway to persist the i-number as well as, or instead of, the i-name.

Integration 3: Pointers as Data
This is close to the heart of my real passion… distributed data management. When an RP asks for an email address I want to be able to return either an email address OR a pointer to an email address. Today if an RP asked for an email address and got back an xri (or uri) I would expect it to be upset… and that’s why we need integration. There are use cases where you want to push the data to the RP, but there are also use cases where having the RP be able to pull data on demand can be very useful (like current temp in your location so we know how much beer to deliver). In XDI land the response to any request can be one of 2 things… data or a pointer to data. In CardSpace land the response can only be data (as I understand it). If the response is a pointer to data then the RP has to know how to dereference the pointer… of course you want the protocols that support the pointer to protect privacy, have fine grained security, have link contracts, have pull and push cache synchronization… be xdi :-)
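A rough sketch of the idea (the is-it-an-xri test and the dereference step are simplistic placeholders, not real XDI):

```python
# Rough sketch of "data or a pointer to data"; the xri test and the
# dereference step are simplistic placeholders.
def looks_like_xri(value):
    return isinstance(value, str) and value[:1] in "=@!+$"

LIVE_DATA = {"=andy/+current.temp": "17C"}   # stand-in for remote data

def dereference(xri):
    # Placeholder: a real client would make an authorized XDI call here,
    # subject to the link contract on the data.
    return LIVE_DATA[xri]

def resolve_claim(value):
    """An RP that understands pointers can pull fresh data on demand."""
    return dereference(value) if looks_like_xri(value) else value

print(resolve_claim("andy@example.com"))      # plain data, used as-is
print(resolve_claim("=andy/+current.temp"))   # pointer, dereferenced
```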

So at a REALLY high level those are the 3 points of integration that I am interested in seeing between XRI, XDI and CardSpace. (The 4th one that I have talked about on the TC calls is really an integration with the Higgins IdAS service, not CardSpace, so that will go in a different post).

I will dig into these more as time lets me… I’ll let you know when you can get i-name cards at EZIBroker i-brokers.

Thursday, January 25, 2007

I wrote this missive in an email thread about using CardSpace with i-names I thought I should share it with you too:

IMHO the use cases that i-names support are a superset of the use cases that cardSpace supports... In all of the digital identity use cases where someone else wants to refer to me, they need to use a globally unique identifier to identify me. Now I could keep giving people (services) my email but I'd much rather give my i-name. When a party has my i-name they can bootstrap ANY functionality that I provide. This is very different from cardSpace.

CardSpace is REALLY good at doing authentication (on Vista clients). Here's where I'm going to go out on that limb... I-names aren't bound to any specific authentication mechanism, they can be used in SAML they can be used in OpenID, but they can be used in any number of other schemes as well. A managed i-card with a signed assertion from the i-broker that this i-name has been validated as belonging to this card holder seems to me to be just as valid a mechanism to authenticate an i-name as any.

Use case:

I go to Evite (the i-name enabled Evite) and say I want to invite =joe to my party. NOTE: CardSpace has no mechanism for me to identify 'Joe'; I HAVE to know a global identifier for him. Now Evite can look up =joe's invitation SEP (could be his contact service, an email or other). Later =joe wants to look at his invitation at Evite so he goes to the site and logs in with the convenience of the cardSpace paradigm. I don't think that using cardSpace authentication diminishes the value of the i-name in doing what it is good at doing.

So once cardSpace/higgins is broadly available we are going to need to define an attribute type so that an RP can ask for an i-name (or should it ask for an xri?). We are also going to have to provide the list of parties that should be trusted to assert i-name ownership (self asserted 'this is my i-name' should NOT be trusted); presumably XDI.org could publish that list.

So in summary... I think that people NEED i-names; they are just too useful in too many use cases. I DON'T think that authentication mechanism is a good place to focus on the value of i-names; I would go as far as to say that this is one of the biggest mistakes that we in the i-name community have made. Once you have authenticated that the principal is the valid user of the i-name, that's where the value starts, not stops. So authenticate by whatever means the RP wants, and then look at all the cool services that can happen.

Wednesday, January 17, 2007

XDiggins

What is XDiggins? It’s what you get when you smash XDI and Higgins together at high speed. Last week Drummond Reed, Paul Trevithick and I had the opportunity to get together and explore this synergy in some detail, with specific client use cases to explore.

The use cases that we were looking at were those presented by the establishment of ‘Wiser Commons’. The Wiser Commons is a group of nonprofit organizations who are willing to share information in order that all of them can provide better services and be more effective in their missions. The commons, led by NCI and Jon Ramer of Intera, provided the excuse that Drummond, Paul and I have been looking for, for years, to come together to try to ‘rationalize’ our various standards into a working solution.

Paul, Drummond and I spent 3 days together. The first day we spent with a small group of ‘techies’ from the commons discussing requirements. The second day we secluded ourselves and talked tech and the third day we presented a proposed initial architectural approach. Over the next couple of weeks I will dive into the details of the approach we are proposing, here on my blog, for all to see. Meanwhile, there is one main thing that is worth mentioning that I think is very exciting…. We did it!

The 3 of us finally achieved a level of understanding of each other's work such that we were able to build a single proposal that all three of us could agree to, endorse and understand. The single most startling thing that I came to understand is how similar our work is. Many of the ‘problems’ that we were having in understanding how Higgins and XDI should work together were not because they were incompatible but rather because they are so similar. The 2 efforts are deeply, profoundly, validatingly similar in their underlying models. There are certainly nuances that differentiate the 2 bodies of work but now that we are able to appreciate the big picture similarities we simply have to plug the best-of-both together to get a working solution that transcends both and removes the necessity to choose between them.

Watch this space for upcoming details…