Saturday, March 15, 2025

Recognition, Identity, and AI: Building Trust in Digital Agents

My first love was self-sovereign distributed data, where each person owns and controls their data, hosting it wherever they choose and permissioning it under their own terms. But I got lost in the complexity of building a robust distributed identity infrastructure. How can you give permission to someone if you can’t 'name' them in a way that is verifiable and resistant to subversion? There's no point in saying "only John can access this" if Tom can show up and convincingly say, "I'm John."

This issue isn’t theoretical—many modern digital problems stem from weak identity foundations. Take email, for example. SMTP, the core protocol, evolved without a strong sense of identity. Had we designed email with a robust identity layer—and maybe a little reputation—spam might have been less rampant. Instead, we've had to patch identity onto email systems, mostly at the DNS layer. Could better choices early on have changed the landscape of digital trust?

As we enter the era of AI and Personal AI, this challenge resurfaces. We will increasingly rely on agents to interact, assist, and even make decisions on our behalf. But how can we trust these agents? How do we know they are who they claim to be, and whose interests they truly serve? When I ask my AI how to unwind after a long day, it might suggest a refreshing Diet Coke. But is that suggestion rooted in understanding my preferences, or is it influenced by unseen commercial incentives?

Recognition and Identity in AI

In the animal world, intelligence is often measured by the ability to recognize oneself and others. The mirror test is a classic example—when an animal identifies itself in a reflection, it demonstrates a form of self-awareness. Similarly, recognizing specific others—distinguishing one individual from another—marks advanced cognitive development.

AI, in contrast, remains limited in this capacity. While AI excels at pattern recognition, it lacks the ability to form a persistent sense of identity, either of itself or others. This limitation restricts its ability to build trust and context in interactions. Without a foundation for recognizing specific entities, AI systems risk becoming tools of confusion or exploitation.

Embedding Identity Systems into AI

One solution is to deeply embed identity frameworks into AI architectures from the outset. Decentralized Identifiers (DIDs), Verifiable Credentials (VCs), and similar systems could provide AI with a structured way to "recognize" and differentiate entities.

  • Persistent Identity Chains: AI could track verifiable chains of identity, ensuring that when it reports information—like "Brad says buy this stock"—it can verify that it truly came from the Brad you trust.

  • Verification of Origin: By leveraging cryptographically verifiable credentials, AI can ensure that information hasn’t been tampered with and originates from a trusted source.

  • Reputation Frameworks: Identity systems could incorporate reputation mechanisms, helping AI prioritize information from sources that consistently meet a trust threshold.

  • Chain of Custody: AI could provide transparency on how information was received and processed, ensuring that its recommendations are based on data with verifiable origins.
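
The ideas above can be sketched in a few lines. This is a toy illustration, not a real implementation: the issuer IDs and keys are invented, and an HMAC stands in for the asymmetric signature a real system would resolve from a DID document (e.g. an Ed25519 key).

```python
import hashlib
import hmac
import json

# Hypothetical key registry: DID -> shared secret (a real agent would
# resolve a public key from the DID document instead).
TRUSTED_KEYS = {
    "did:example:brad": b"brad-secret-key",
}

def sign_claim(claim: dict, key: bytes) -> str:
    """Sign a claim deterministically (sort_keys gives a stable payload)."""
    payload = json.dumps(claim, sort_keys=True).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()

def verify_claim(claim: dict, signature: str) -> bool:
    """Accept a claim only if its issuer is known and the signature checks out."""
    key = TRUSTED_KEYS.get(claim.get("issuer"))
    if key is None:
        return False  # unknown issuer: no basis for trust
    return hmac.compare_digest(sign_claim(claim, key), signature)

claim = {"issuer": "did:example:brad", "statement": "buy this stock"}
sig = sign_claim(claim, TRUSTED_KEYS["did:example:brad"])
assert verify_claim(claim, sig)                                    # the Brad you trust
assert not verify_claim({**claim, "issuer": "did:example:tom"}, sig)  # Tom claiming to be Brad
```

The point of the sketch is the lookup step: the AI doesn't evaluate the statement, it evaluates whether the statement verifiably came from an identity it already recognizes.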

The Path to Trusted AI

Trustworthy AI isn’t about making machines socially aware; it’s about ensuring that humans can trust the chain of custody behind AI-generated insights. When AI states that "Brad recommends this action," it should be able to prove that the recommendation came from the right "Brad"—the person you trust, not an imposter or manipulated data source.

The real question is: How do we create systems where AI is not just technically accurate but verifiably trustworthy? In an era where decisions increasingly rely on AI advice, embedding identity systems at the core isn’t just beneficial—it’s fundamental.

Tuesday, January 28, 2025

Take 1... Solid Pods and DIDs

My first attempt at building a decentralized app in this day and age will use Solid Pods and DIDs. The goal? A super simple “BooksWeLike” app—a place where I can review books and see what my friends are reading and enjoying.


What makes this app different is how it handles data. Unlike traditional apps where data lives in a centralized database, my app will let users store their own data in Solid Pods. Think of a Pod as your own personal data vault—you control who can access it and how it’s used. And instead of relying on centralized logins like Google or Facebook, I’ll use Decentralized Identifiers (DIDs), which allow users to prove their identity on their own terms.
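
For readers who haven't seen one, a DID resolves to a DID document that lists the keys a user can prove control of. Here's a rough sketch of the shape, with field names following the W3C DID Core vocabulary; the identifier and key material are made up for illustration.

```python
# A hypothetical DID document for a made-up user. "z6Mk..." is a
# placeholder for real multibase-encoded key material.
did_document = {
    "id": "did:example:alice",
    "verificationMethod": [{
        "id": "did:example:alice#key-1",
        "type": "Ed25519VerificationKey2020",
        "controller": "did:example:alice",
        "publicKeyMultibase": "z6Mk...",
    }],
    "authentication": ["did:example:alice#key-1"],
}

def find_auth_key(doc: dict):
    """Return the verification method the document designates for authentication."""
    wanted = set(doc.get("authentication", []))
    for method in doc.get("verificationMethod", []):
        if method["id"] in wanted:
            return method
    return None

key = find_auth_key(did_document)
assert key is not None and key["type"] == "Ed25519VerificationKey2020"
```

An app like BooksWeLike would use that authentication key to challenge the user at login, rather than deferring to Google or Facebook.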


The plan for the app is straightforward:

If you already have a DID or a Solid Pod, you can sign in using your existing accounts.

If you don’t, the app will help you create them when you sign up.


Of course, part of this journey is figuring out how practical and possible all of this really is. Beyond building the app, I'll also evaluate the tools, SDKs, client libraries, and documentation available to Solid and DID developers. How well is the developer community being enabled? I'll compare my experience with other distributed ecosystems as I attempt to replicate this app in different environments in the future. Once the app exists across multiple ecosystems, I can explore broader topics like ecosystem interoperability and federation.


These technologies are still evolving, and I’m excited to explore what’s possible—and what needs improvement.


So, what about you? Have you already taken the plunge into the world of DIDs or Solid Pods? Or is this your first time hearing about them? Let’s find out together as I document this journey.


In my next post, I’ll dive into the nitty-gritty of authentication—getting users to log in with their DIDs and connecting them to their Pods. I suspect it’s trickier than it sounds, but that’s all part of the adventure.

Sunday, January 12, 2025

Is anybody out there?

Is blogging still a thing?

If you’re reading this, please comment or like it so I know.


I am, at heart, still the XDI Guy. My passion for robust, secure, and scalable distributed data management hasn’t waned. Building one of the first XDI implementations (shout-out to Markus Sabadello, who might have built the only other one), I learned a lot about the challenges and opportunities of distributed data at scale. Over the years, I’ve reflected on qualities essential for data ecosystems, qualities that are often overshadowed by content-driven ecosystems. For example:

Caching semantics: Apps need governance and management of caches to respect data ownership while maintaining local operational efficiency.

Transactionality: Mature data protocols depend on it for consistency and reliability.

Request batching: Optimizing network requests is vital for performance and scalability.
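
To make the batching point concrete, here's a minimal sketch of the pattern: queue many small reads, then resolve them in one round trip. The `fetch_many` transport is a made-up placeholder for whatever a given protocol actually provides.

```python
def fetch_many(keys):
    # Placeholder transport: one simulated "round trip" serves every key.
    return {k: f"value-of-{k}" for k in keys}

class BatchingClient:
    """Coalesce individual requests into a single network call."""

    def __init__(self):
        self.pending = []

    def request(self, key):
        self.pending.append(key)  # queue, don't send yet

    def flush(self):
        results = fetch_many(self.pending)  # one round trip instead of N
        self.pending = []
        return results

client = BatchingClient()
for key in ["title", "author", "rating"]:
    client.request(key)
assert client.flush() == {
    "title": "value-of-title",
    "author": "value-of-author",
    "rating": "value-of-rating",
}
```

Whether a protocol offers this natively (rather than forcing N sequential fetches) is exactly the kind of operational quality I'll be evaluating.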


After years away, I’m ready to dive back in. There are a couple of apps I want to build, and I want to build them on a distributed data platform. My first idea is a fully distributed implementation of Brad deGraf’s BooksWeLike concept—a way to share and discover books that resonate with you. (Brad, if you’re reading this and don’t like the idea, let me know!)


To make this happen, I’ve started scanning the distributed protocol space to see what’s out there. Here’s my initial list of protocols to evaluate:

AT Protocol

Nostr

Solid

ActivityPub

Matrix

IPFS (InterPlanetary File System)

SSB (Secure Scuttlebutt)

DID (Decentralized Identifiers)

Libp2p

Hypercore

Waku

Zero-Knowledge Identity Protocols (ZK Protocols)


What am I missing?

Are there protocols on this list that don’t belong? If so, why? Are there others I should consider? I haven’t started my evaluations yet, so I’m open to your insights. If you’ve built apps on these protocols or have opinions about them, I’d love to hear from you.


I’ll be capturing my journey of discovery here—sharing what I learn, where I stumble, and how I (hopefully) succeed. Let’s make distributed data a reality, by and for the people.

Thursday, June 06, 2019

Introducing PURDAH

So I'm reading Neal Stephenson's latest novel 'Fall'... In Chapter 11 he introduces PURDAH... Personal Unseperable Registered Designator for Anonymous Holography.


He explains that Holography is from the original meaning of the term:

"A holograph is a document written entirely in the handwriting of the person whose signature it bears. Some countries (e.g., France) or local jurisdictions within certain countries (e.g., some U.S. states) give legal standing to specific types of holographic documents, generally waiving requirements that they be witnessed. One of the most important types of such documents are holographic last wills." - https://en.wikipedia.org/wiki/Holograph

"So it's just an anonymous ID, with a fancy name?", Corvallis asks...

No, PURDAHs are all registered in distributed ledger technology so their veracity can be verified at any time. Unseperable means that no one can take it away from you, as long as you take reasonable precautions...

At least Neal seems to be paying attention!

Wednesday, November 08, 2017

Trust vs Confidence

Over the years, in my own mind, I have built specific semantics around the terms 'Trust' and 'Confidence'. These are closely related to the validity of 'Proof'... I think that the use of these terms in the vernacular is often too fuzzy to be of use in identity system discussions. I would posit:

Trust:

Security and its many mechanisms are used to establish trust; once trust is established, you just trust. My canonical use-case for this is access to the school blog. I can grant or revoke write access to my kids' school blog. I give access to people who I trust to post only age-appropriate material. I could use manual or automated mechanisms to check posts before they are published, but the effort or cost outweighs the risks. I choose to trust. Trust is a human, emotional, social construct that implies a loosening of control. Trust can be abused, and it is; knowing the risks, the rewards, and the remediations for abuse of trust is important (systems of accountability: reputation? legal?). So trust needs to be bounded: "I trust XX to do YY".

Concretely: I trust an entity with my money, like Coinbase, who holds my bitcoin wallet. I have taken a leap. Coinbase could steal my money despite all of the controls of the blockchain and distributed ledger technology. I could use a different wallet technology, but then I am still choosing to trust the software that enables that wallet, or the hardware that the software runs on. At some point the cost of not trusting outweighs the risk and the expense of trusting.

So on some level this poses the question: if my relationship with Coinbase is purely one of trust that they will hold my money and return it based on the current value of bitcoin, what difference does the underlying blockchain technology actually make to me? I could use bitcoin in a way that doesn't require a trusted third party (at the extreme: build my own hardware and software), but I don't, and most people don't.

I think it is incumbent on people talking about identity systems to really understand where security ends and trust starts. Do most people understand how misplaced their trust in their mobile hardware could be?

So, to me, Trust is what happens beyond the bounds of control. Or, to put it another way: Trust is what happens within 'pipes' or 'bubbles' of control established using security mechanisms.

Confidence:

Confidence is, when I'm trying to use it precisely, a measure of certainty in a claim. In terms of an identity system a claim might be:

  • an authentication claim (I am the person identified by ID XX) 
  • an authorization claim (I should have access to this resource)
  • an attribute claim (I am over 18 years of age)
These claims often get delivered in terms of, or together with, 'Proofs', where a 'Proof' is a mechanism (hopefully standardized) to deliver a claim with metadata that increases confidence (in the absence of trust). Some examples:

In an authentication claim, the claim of the ID may be accompanied by claims of who established the ID (signed by a private key of a trusted party); the claim may also include details of how the user was authenticated (password, multi-factor, smart card, etc...). The associated metadata establishes a level of confidence in the claim. Step-up authentication models (you can view your balance if you logged in with a password, but you have to use multi-factor to initiate a transfer) are a direct result of your levels of confidence in various authentication claims.
In an attribute claim, one would again expect the claim to be signed by a trusted party, trusted to make that specific claim, and the claim may include metadata about how the attribute was validated. An over-18 claim that was self-asserted (they checked a checkbox that says "I'm over 18") may be enough to satisfy COPPA compliance requirements in the US but would be insufficient to provide legal access to porn in the UK.
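
The step-up model described above can be reduced to a toy rule: each authentication method yields a confidence score, and each action demands a minimum. The scores and thresholds here are invented for illustration.

```python
# Hypothetical confidence scores per authentication factor, and the
# minimum confidence each action requires (both made up for this sketch).
FACTOR_CONFIDENCE = {"password": 1, "multi_factor": 2, "smart_card": 3}
REQUIRED = {"view_balance": 1, "initiate_transfer": 2}

def allowed(action: str, auth_method: str) -> bool:
    """Permit an action only if the login method meets its confidence bar."""
    return FACTOR_CONFIDENCE.get(auth_method, 0) >= REQUIRED[action]

assert allowed("view_balance", "password")          # low-stakes: password is enough
assert not allowed("initiate_transfer", "password") # high-stakes: step up required
assert allowed("initiate_transfer", "multi_factor")
```

The interesting part is that the policy consumes the claim's metadata (how you authenticated), not just the claim itself (who you are).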

Bringing Confidence and Trust together:

So, with a claim that is signed by a party trusted to make age claims: the signature gives me confidence that the claim is from the trusted party, and then I trust the age claim; rarely do I require proof of the mechanics of acquiring the validation. Even if I require details of how the claim was established (self-asserted or a credit-system check), I could, but I don't, make the third party 'prove' it.

That is establishing a Trust space using a security mechanism (in this case; PKI and standardized claim semantics) and then... trusting the information that is provided in that secured context.

Alternatively:

Is there a source (glossary) that you use to define these fuzzy terms when you reference them in specs? I know that there were efforts to normalize identity terminology back in the day... did any survive the test of time?

Wednesday, November 01, 2017

Eight years and counting

Well, it has been 8 years since I last posted here and 12 years since I started this blog, and I have to ask... what has changed, what has been achieved in all that time? I've been out of touch with this space for a while, and I'm going to go on a little personal voyage of discovery to see what I can learn and see if any of the fundamental problems have been solved.

My first step is going to be attempting to articulate in abstract terms what I consider to be 'the fundamental problems'.

My primary point of interest since this all started has been to give people access to and appropriate control over data about themselves and their transactions. It is well known that the likes of Google, Facebook, Experian, Equifax and many others make their money trading in data generated by or about us. These companies provide important and valuable services but they do not adequately respect individual privacy nor do they fairly include the individuals that are their currency in the value chain.

I have spoken with business stakeholders in large organizations that use these services, despite knowing that they are 'unclean', because the value they add is very real. The increase in ROI on targeted marketing based on these services is phenomenal. As we, as a community, worked on alternate models, they were eager: give us a viable alternative, they would say, and we will use it. As far as I can tell there is still no viable alternative... I will try to unpack why.

I will try to discover if we have a technology problem, a communications problem, a legal problem, an education problem or a business problem; presumably we have a little of each.

The lack of a viable alternative is closely related to scale. The aforementioned companies have huge user populations which is what makes them so appealing and so valuable. Users do not adopt a technology based on the elegance of the standards or even, unfortunately, based on the strength of the privacy. Users primarily adopt technology because it makes it easier for them to do something they want to do (including playing games and consuming porn). With that said I do believe that there is a growing number of people dissatisfied with the status quo. People who would be willing to engage in an alternative system even if it costs them a little. How do we provide them a viable alternative?

So a fundamental question I have is: Do we have the building blocks to build a viable alternative and we just haven't found the right constellation of services and apps to provide or, are there still gaps in the technology stack to build a viable alternative? I hope to find out.

In my upcoming posts I will start to dig into what I believe are the important qualities of systems that might address this need. I will undoubtedly build on old classics like Kim Cameron's Laws of Identity but will also add some flavor of my own in terms of business and legal frameworks that I believe need to be in place. I will also address the qualities that I believe are necessary for a distributed data network to actually work, at scale, as a data network (Spoiler: Link based systems like "Linked Data" work great for unstructured content, documents, but fail rapidly to satisfy operational requirements for structured data).

I'm excited to dig in and learn blockchain and blockchain alternatives... Please let me know about stuff that you think is worth reviewing and including as stops on my voyage of discovery!

Monday, August 24, 2009

SXSW

If you have a chance, check out this proposed session for SXSW: http://bit.ly/vuPu5. Have you noticed that when you search the internet you probably don't see results from the stuff that you pay for (subscriptions, stuff available through your local library, etc...)? This panel will discuss how we could fix that... If you think that would be useful, go give it the thumbs up.