Thursday, October 16, 2008

Resolution Revolution

So I learned a little this week about sockets, and it has given me pause to think about the realities of 'success' in regards to the MASSIVE adoption of the protocols that I tend to talk about on this blog.

They say a little knowledge is a dangerous thing... well, here I go... head first:

DNS resolution has been under attack recently (over the last 6 months) from a new set of poisoning attacks. One of the main reasons the attacks work is that DNS uses UDP instead of TCP. The basic fix that has been implemented is Source Port Randomization, but even that has been brute-force attacked... so people speculate as to what else could be done. One idea was to make every request twice and require that the answers MUST match (this is known as debouncing). Another proposed option is to just use TCP instead of UDP.

So here's what I find interesting... The debounce option was rejected because it would double the amount of traffic on the DNS system; we would go from 2 packets on the wire per lookup to 4. It has been determined that the current DNS infrastructure is running at over 50% capacity, so instantly doubling the load is simply not an option. SO... why not use TCP? Well, if you use TCP you have the 3-way handshake, then the query, then the response, and then the FIN and the FIN-ACK... 7 packets on the wire (and larger packets at that). I find all of this fascinating in a purely academic way; this stuff is all new to me. (Now I have a basis on which to go understand DNSSEC; that'll be next week's reading.)
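To make the arithmetic above concrete, here is a rough sketch in Python. The packet counts are the ones quoted in this post; the function names and the flat 50% starting utilization are my own assumptions for illustration, and the sketch counts packets only, ignoring the fact that TCP packets are also larger.

```python
# Packets on the wire per DNS lookup, using the counts quoted in the post.
PACKETS_PER_LOOKUP = {
    "udp": 2,            # one query + one response
    "udp_debounced": 4,  # every request made twice: two queries + two responses
    "tcp": 3 + 2 + 2,    # 3-way handshake + query/response + FIN/FIN-ACK
}


def load_multiplier(strategy, baseline="udp"):
    """How many times more packets `strategy` needs than `baseline`."""
    return PACKETS_PER_LOOKUP[strategy] / PACKETS_PER_LOOKUP[baseline]


def projected_utilization(strategy, current_utilization=0.5):
    """Scale today's utilization (assumed 50% here) by the packet multiplier.

    A result above 1.0 means the strategy would need more capacity than exists.
    """
    return current_utilization * load_multiplier(strategy)


if __name__ == "__main__":
    for name in PACKETS_PER_LOOKUP:
        print(name, load_multiplier(name), projected_utilization(name))
```

By this crude measure debouncing lands right at the capacity ceiling, and TCP lands well past it, which is the point of the comparison.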

Then I wonder... is anyone doing the math? IF OpenID became ubiquitous, or InfoCards did, what would that look like at a packets-on-the-wire level? Is there so much spare bandwidth and processing power now available that we don't have to worry about this?

Wednesday, October 15, 2008

Is this reputed to be a reputation?

There's a great thread going on about reputation on one of the lists I read. I tried to respond to the thread, which is something I NEVER do, but apparently it has been too long since I was active, so it wouldn't let me... So I'm weighing in here for anyone to check out if they like.

Another definition of reputation:

Reputation is the result of running an evaluation algorithm over a set of input data.

Some sample input data:

a) Number of sale transactions and number of complaints
b) Number of IM connection requests and number of IM spam reports
c) eBay reputation, credit score and number of points on my driver's license.
d) How much 100 people, selected at random, like Diet Coke

The evaluation algorithm can be very simple or very complex... eBay's is arguably very simple, and Fair Isaac's is very complex.
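As an illustration of the "very simple" end of that range, here is a sketch of an evaluation algorithm over input data (a), sale transactions and complaints. The function name and the formula are hypothetical choices of mine, not how eBay or anyone else actually computes a score.

```python
def complaint_free_ratio(transactions, complaints):
    """Hypothetical evaluation algorithm: share of sales that drew no complaint.

    Returns None when there is no input data, since no data means no reputation.
    """
    if transactions <= 0:
        return None
    # Clamp at 0.0 in case complaints outnumber transactions in the input.
    return max(0.0, 1.0 - complaints / transactions)


if __name__ == "__main__":
    print(complaint_free_ratio(4, 1))  # one complaint in four sales
    print(complaint_free_ratio(0, 0))  # no history at all
```

Same input data, different algorithm, different reputation: swapping the body of this function changes the answer without touching the data.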

Arguably the reputation of a reputation could be measured based on the quality of its input data and the quality of the evaluation algorithm.

Reputation system attacks tend to attack the data input stream, or depend on a delay between input and output. (I've written on this in the past.)

As identity providers, I think our first line of responsibility to reputation systems is the CONTROLLED delivery of quality input data that is surrounded by enough metadata about collection/storage/retention and "whatever else" that anyone can run reputation evaluations against that data and reach meaningful conclusions. I can then feed that (anonymized?) data into the reputation service of my choice, which will likely depend on the context of my current activity.

If I want an agent at my SMTP gateway to 'decide' if a piece of information should be delivered to my inbox, I don't care what the sender says about themselves, and I don't want to go query a bunch of reputation services to see if they know anything about this sender (which ones would I trust?). I want to have access to a set of data, signed by a reputable source (how long has the account existed, how many mails have been sent, how many complaints have there been, registration info made available for bootstrapping), that I can put into my personalized reputation algorithm.
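Here is a minimal sketch of what such a personalized algorithm might look like at the gateway. Everything in it is hypothetical: the SenderFacts fields mirror the data listed above, the thresholds are made-up personal preferences, and the sketch assumes the signature from the reputable source has already been verified elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SenderFacts:
    """Signed facts about a sender (signature check assumed done elsewhere)."""
    account_age_days: int
    mails_sent: int
    complaints: int


def should_deliver(facts, min_age_days=30, max_complaint_rate=0.01):
    """My personal evaluation algorithm; the thresholds are arbitrary examples."""
    if facts.account_age_days < min_age_days:
        return False  # account is too new for me to trust
    if facts.mails_sent == 0:
        return False  # no sending history to evaluate
    return facts.complaints / facts.mails_sent <= max_complaint_rate


if __name__ == "__main__":
    print(should_deliver(SenderFacts(400, 10000, 20)))  # old account, few complaints
    print(should_deliver(SenderFacts(2, 50, 0)))        # brand-new account
```

Someone else could take exactly the same signed facts and plug in stricter or looser thresholds, which is the whole appeal of getting the data rather than a precomputed score.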