Code Highlighting

Tuesday, December 16, 2014

Ladies and gentlemen: salt your passwords

A few weeks ago I recovered the database of an old website from backup. The customer wanted the old contact info and had lost its own backup.
The old database hashed its passwords (SHA-1), but did not salt them. Looking at the old users table, I wondered how well the rainbow table approach would work - using Google as a big rainbow table. Just feed the hash into the search field and see what comes up.

Here are the results:
PasswordClear textCount
7C4A8D09CA3762AF61E59520943DC26494F8941B12345636
9C2028963DC9F7FBB4CB30140428A210C61DBB2Cwachtwoord24
9CF95DACD226DCF43DA376CDB6CBBA7035218921azerty22
62E839476B23E579EFB96B47391599FFCA4CFA94?18
7110EDA4D09E062AA5E4A390B0A572AC0D2C0220123418
F001F2E438738807D3079BFFAF66519B9D0F26C7azqswx16
F7C3BC1D808E04732ADF679965CCC34CA7AE344112345678916
D18C9E9DCCE902D6A3E21E72BADA443AC294CB5FPaswoord15
FDA48D6A63F351DC46D411336DE4BA33F77B66F5voetbal14
D22BC6BAD61129B636AFFA2511B3CE522CD74BB3brecht13
EE8D8728F435FD550F83852AABAB5234CE1DA528iloveyou12
48F285F9A1E15CB6240506182A3C08AEDD639F26dansen11

The count indicates how many users chose that password. Users spoke dutch, used an azerty keyboard. Of the top 12, I found all but 1 using a Google search. This is why you salt your passwords.

Mind you, it's also a reason to implement a decent password policy. One that does not allow '1234'.

I really wonder about that 1 that I wasn't able to find though. Must be some dutch word.

Friday, September 5, 2014

OutputCache on User Controls - VaryByControl

A question on stackoverflow prompted me to research this. Initially a code sample was included, showing how this user was trying to set up cache variation by property value. The VaryByControl parameter was set to the name of this property. This appeared to work for property values entered in markup, but failed if the property was set to a variable: the control bound to the property only ever showed the first value is was bound to.

I set up a test project to attempt to reproduce the problem, and saw identical behavior. So far so good.

<!-- the aspx page, the Number property is set to a random number -->
<%@ Register Src="~/UserControls/Test.ascx" TagPrefix="uc1" TagName="Test" %>

<asp:Content runat="server" ID="BodyContent" ContentPlaceHolderID="MainContent">
    <uc1:Test runat="server" ID="Test1" Number="1" />
    <uc1:Test runat="server" ID="Test2" Number="2" />
    <uc1:Test runat="server" ID="Test3" Number="<%# Number %>" />

    <asp:Label runat="server" Text="<%# Number %>"></asp:Label>
</asp:Content>

<!-- the user control -->
<%@ OutputCache Duration="180" VaryByControl="Number" VaryByParam="None" Shared="true" %>

<h1><%# Number  %></h1>


Detective hats everyone! Let's investigate how outputcache for user controls works!

First the code generation for user controls. An attribute is applied to the generated class that contains the caching parameters:

    [System.Web.UI.PartialCachingAttribute(180, null, "Number", null, null, true, ProviderName = null)]
    public class usercontrols_test_ascx : global::TestOutputCache.UserControls.Test

The parser for the page checks user controls for the caching attribute and - if present - creates a PartialCachingControl, passes it the caching parameters, control ID and a control builder delegate:

    System.Web.UI.StaticPartialCachingControl.BuildCachedControl(@__ctrl, "Test1", "50114342", 180, null, "Number", null, null, new System.Web.UI.BuildMethod(this.@__BuildControlTest1), null);

Inside the PartialCachingControl, a hash is calculated based on the variation parameters, and the cache is checked for the hash. A cache miss results in the delegate being called, and the results getting stored in cache.
Here is the relevant portion of the hash calculation routine:

        if (cachedVary._varyByControls != null) {

            // Prepend them with a prefix to make them fully qualified
            string prefix;
            if (NamingContainer == Page) {
                // No prefix if it's the page
                prefix = String.Empty;
            }
            else {
                prefix = NamingContainer.UniqueID;
                Debug.Assert(!String.IsNullOrEmpty(prefix));
                prefix += IdSeparator;
            }

            prefix += _ctrlID + IdSeparator;

            // Add all the relative vary params and their values to the hash code
            foreach (string varyByParam in cachedVary._varyByControls) {

                string temp = prefix + varyByParam.Trim();
                combinedHashCode.AddCaseInsensitiveString(temp);
                string val = reqValCollection[temp];
                if (val != null)
                    combinedHashCode.AddObject(reqValCollection[temp]);
            }
        }


Variable "reqValCollection" is a dictionary of POST values (if there are any) or GET values.
And the first question presents itself, because the VaryByControl only varies by POST values corresponding with the controls specified. The property value cannot matter: it is never read.

In fact; when deciding to server the cached markup or not, the control has not been created yet, and may not ever be. That's the entire point of the output caching. Obviously variation by Property value doesn't make sense.

So why does it ever work?

Back to the test project to confirm hypothesis by setting VaryByControl to something else.
<%@ OutputCache Duration="180" VaryByControl="PorkPies" VaryByParam="None" Shared="true" %>

<h1><%# Number  %></h1>

Yup. Everything still works identically to how it worked before. The value of VaryByControl does not matter, as long as there is a value. Dropping the VaryByControls entirely shows normal behavior: one single cached version being served for each request.

Smells like a bug! Lets see if we can find it!

In fact, if you look up to the code I posted, the bug is right there. The "temp" variable contains the unique id of the user control. I imagine this field is added to prevent a cache collision for two controls - both specified in VaryByControl - with identical values. But it is added to the combinedHashCode even before it is verified to have a value in reqValCollection. It still works because the HashCodeCombiner class generates a different hash for the same objects added in a different order - it does not simply xor everything and calls it a day.
That means User Control OutPutCache has a built-in variation for:

  • Control ID, and
  • NamingContainer ID
You can activate it by setting VaryByControl to any value whatsoever. I would argue that the correct code looks like this:

            // Add all the relative vary params and their values to the hash code
            foreach (string varyByParam in cachedVary._varyByControls) {

                string temp = prefix + varyByParam.Trim();
                string val = reqValCollection[temp];
                if (val != null){
                    combinedHashCode.AddCaseInsensitiveString(varyByParam.Trim());
                    combinedHashCode.AddObject(reqValCollection[temp]);
                }
            }

Perhaps there is a good reason to include the ID and naming container in the cache key, but it does not make sense to me.

I don't expect Microsoft will ever fix this bug; it would break a multitude of sites in very ugly ways. I'm guessing they've found it before, and decided to leave well enough alone.

Thursday, July 3, 2014

What if events in C# were, you know, cooler?

In the beginning events in C# were awesome. Some of us came from Java, and used to have to implement interfaces named IButtonClickListener on some inner classes whatever nonsense just to react to some button being clicked, but not in C#. No, in C# you just did the following:

    public partial class Form1 : Form
    {
        Button newButton;

        public Form1()
        {
            InitializeComponent();

            newButton = new Button()
            {
                Text = "Click me!",
                Left = 20,
                Top = 20
            };

            newButton.Click += new EventHandler(HandleClick);

            this.Controls.Add(newButton);
        }

        private void HandleClick(object sender, EventArgs e)
        {
            MessageBox.Show("Hi there!");

            newButton.Click -= HandleClick;
        }
    }

No obnoxious interface nonsense! Delegates! The future is now!

But over 10 years later we have gotten spoiled with anonymous delegates and lambdas, and all of a sudden the C# event model does not seem so cool anymore. Because when you hook up an event handler like this:

newButton.Click += (sender, e) => MessageBox.Show("Hi there!");

do you imagine you can remove it like this:

// fugetaboutit!
newButton.Click -= (sender, e) => MessageBox.Show("Hi there!");

No way! You would have to save the handler to a variable to have a reference handy to be able to remove
the handler later:
EventHandler handler = (sender, e) => MessageBox.Show("Hi there!");

Worse! Microsoft added lots of nice generic delegate types Action<T>, Action<T1, T2>, Action<T1, T2, T3> etc that we can't use because we're stuck with the EventHandler delegate types for Events. You would need a wrapper around the Invoke function to make that work.

Would you like to check out what is currently hooked up to your event? Tough!

Clear the whole thing and start over? Get outta here!

But despair no more! I have written a half-assed solution that I think might have been marginally better if it had worked properly!

Presenting: the new and improved Event:


    public class Event<TEventArgs,TEventHandler>
        where TEventArgs : EventArgs
        where TEventHandler : class
    {
        // the main store for handlers
        private Dictionary<int, Action<object, TEventArgs>> handlers =
            new Dictionary<int, Action<object, TEventArgs>>();

        // a dictionary that maps the original handlers to the generated wrappers
        private Dictionary<object, int> operatorAddedHandlers =
            new Dictionary<object, int>();

        public Event()
        {
        }

        static Event()
        {
            // check if my TEventHandler type parameter is a delegate type
            // where TEventHandler : Delegate is not supported
            if (!typeof(TEventHandler).IsSubclassOf(typeof(Delegate)))
            {
                throw new InvalidOperationException(typeof(TEventHandler).Name + " in generic type parameter TEventHandler is not a delegate type.");
            }
        }

        /// <summary>
        /// Maps events of the same name on the container object to the Event field that hides it
        /// </summary>
        /// <param name="eventGenerator">the object that contains the Event field (typically this)</param>
        public void CaptureBaseEvents(object eventGenerator)
        {
            var fieldNames = new List<String>();
            var classType = eventGenerator.GetType();
            var fields = classType.GetFields();
            
            // check which field the current instance has been assigned to, store the name
            foreach (var field in fields)
            {
                var val = field.GetValue(eventGenerator);
                if (Object.Equals(val, this))
                {
                    fieldNames.Add(field.Name);
                }
            }

            // get a MethodInfo for this class's Invoke method
            var invokeHandler =
                typeof(Event<TEventArgs, TEventHandler>).GetMethod("Invoke");

            // check if any base events exist that have the same name
            // if so, hook up our Invoke method
            foreach (var fieldName in fieldNames)
            {
                var classEvent = classType.GetEvent(fieldName);

                if (classEvent != null)
                {
                    var tDelegate = classEvent.EventHandlerType;
                    var d = Delegate.CreateDelegate(tDelegate, this, invokeHandler);

                    var addHandler = classEvent.GetAddMethod();
                    Object[] addHandlerArgs = { d };
                    addHandler.Invoke(eventGenerator, addHandlerArgs);
                }
            }
        }

        /// <summary>
        /// Invokes the event handlers
        /// </summary>
        /// <param name="sender">The object that triggered the event</param>
        /// <param name="e">The event parameters</param>
        public void Invoke(object sender, TEventArgs e)
        {
            // just in case a handler modifies the handlers
            var handlerDelegates = handlers.Select(hs => hs.Value).ToList();

            foreach (var handler in handlerDelegates)
            {
                handler.Invoke(sender, e);
            }
        }

        /// <summary>
        /// Removes all handlers
        /// </summary>
        public void ClearHandlers()
        {
            handlers.Clear();
        }

        /// <summary>
        /// Gets a list of all handlers and their removal token
        /// </summary>
        /// <returns></returns>
        public IEnumerable<EventHandlerSet<TEventArgs>> GetAllHandlers()
        {
            // store in list in case someone loops through to find a handler to remove
            // that seems kind of likely
            var handlerSets = handlers.Select(hs => new EventHandlerSet<TEventArgs>()
            {
                Handler = hs.Value,
                RemovalToken = hs.Key
            }).ToList();

            return handlerSets;
        }

        /// <summary>
        /// Adds an event handler
        /// </summary>
        /// <param name="newHandler">The new handler to add</param>
        /// <returns>The removal token for the newly added handler</returns>
        public int AddHandler(Action<object, TEventArgs> newHandler)
        {
            // we don't hook up the same handler twice,
            // just return its current removal token
            if (handlers.Any(ks => ks.Value.Equals(newHandler)))
            {
                return handlers.Where(ks => ks.Value.Equals(newHandler)).Single().Key;
            }

            // use a new guid cut off at int32 length for key generator
            var newGuid = Guid.NewGuid();
            int newKey = BitConverter.ToInt32(newGuid.ToByteArray(), 0);

            // it's a lot more likely at 32 bits
            while (handlers.ContainsKey(newKey))
            {
                newGuid = Guid.NewGuid();
                newKey = BitConverter.ToInt32(newGuid.ToByteArray(), 0);
            }

            handlers.Add(newKey, newHandler);

            return newKey;
        }

        /// <summary>
        /// Adds a new handler to the Event field
        /// </summary>
        /// <param name="thisEvent">The event field</param>
        /// <param name="newHandler">The handler to add</param>
        /// <returns></returns>
        public static Event<TEventArgs, TEventHandler> operator +(Event<TEventArgs, TEventHandler> thisEvent, TEventHandler newHandler)
        {
            var handlerDelegate = newHandler as Delegate;

            if (handlerDelegate == null)
                throw new InvalidOperationException("Handler is not a delegate type!");

            // if this handler has already been added, do nothing
            if(thisEvent.operatorAddedHandlers.ContainsKey(newHandler)){
                return thisEvent;
            }

            var removalToken = thisEvent.AddHandler(delegate(object sender, TEventArgs e)
            {
                handlerDelegate.DynamicInvoke(new object[] { sender, e });
            });

            thisEvent.operatorAddedHandlers.Add(newHandler, removalToken);

            return thisEvent;
        }

        /// <summary>
        /// Removes the specified handler
        /// </summary>
        /// <param name="handler">The handler to remove</param>
        public void RemoveHandler(Action<object, TEventArgs> handler)
        {
            if (handlers.Any(ks => ks.Value.Equals(handler)))
            {
                var removalToken = handlers.Where(ks => ks.Value.Equals(handler)).Single().Key;

                handlers.Remove(removalToken);
            }
        }

        /// <summary>
        /// Removes the specified handler
        /// </summary>
        /// <param name="removalToken">The handler's removal token</param>
        public void RemoveHandler(int removalToken)
        {
            if (handlers.ContainsKey(removalToken))
                handlers.Remove(removalToken);
        }

        /// <summary>
        /// Removes the specified handler
        /// </summary>
        /// <param name="thisEvent">The event field</param>
        /// <param name="handler">The handler to remove</param>
        /// <returns></returns>
        public static Event<TEventArgs, TEventHandler> operator -(Event<TEventArgs, TEventHandler> thisEvent, TEventHandler handler)
        {
            if (!thisEvent.operatorAddedHandlers.ContainsKey(handler))
                return thisEvent;

            var removalToken = thisEvent.operatorAddedHandlers[handler];

            thisEvent.RemoveHandler(removalToken);

            return thisEvent;
        }
    }

    public class EventHandlerSet<TEventArgs>
    {
        internal EventHandlerSet() { }
        public int RemovalToken { get; set; }
        public Action<object, TEventArgs> Handler { get; set; }
    }

I've made it so that it's mostly a drop-in replacement for pre-existing events. Obviously you need to inherit from the class implementing the events. So let's take that boring old button we had, and jazz up the click event:


    public class NewAndImprovedButton : System.Windows.Forms.Button
    {
        public NewAndImprovedButton()
            : base()
        {
            Click.CaptureBaseEvents(this);
        }

        public new Menno.Event.Event<EventArgs, EventHandler> Click = new Menno.Event.Event<EventArgs, EventHandler>();
    }

Turns out you can hide an event with a field. Who knew? Also, the CaptureBaseEvents in the constructor only works because fields are initialized before the constructor runs. Otherwise I'd have had to do something odd and confusing. More so.

If we take the first piece of code and replace Button by NewAndImprovedButton, like so:
    public partial class Form1 : Form
    {
        NewAndImprovedButton newButton;

        public Form1()
        {
            InitializeComponent();

            newButton = new NewAndImprovedButton()
            {
                Text = "Click me!",
                Left = 20,
                Top = 20
            };

            newButton.Click += new EventHandler(HandleClick);

            this.Controls.Add(newButton);
        }

        private void HandleClick(object sender, EventArgs e)
        {
            MessageBox.Show("Hi there!");

            newButton.Click -= HandleClick;
        }
    }

Everything keeps working exactly the way it did. I implemented the + and - operators to mirror the current Event syntax, and the CaptureBaseEvents method hooks up the event in the base class that the field hides to the Invoke method. Magic!
But we did not make this to keep working the same way. Here's what's new:

  • an AddHandler method that takes an Action<object, TEventArgs> and returns an int that functions as a token.
  • (obviously) a RemoveHandler that an int token.
  • for good measure: a RemoveHandler that takes an Action<object, TEventArgs>
  • a ClearHandlers method
  • a GetAllHandlers method that returns all handler delegates and their tokens, even the ones added through reflection.
Of course there are some rough edges too:
  • Field cannot be readonly if you use the operators. Anything could happen.
  • Operators generate wrappers through reflection. Performance is bound to be questionable.
  • Probably hundreds of corner cases I haven't considered.
Still, now you can do this:

            int removalToken = 0;
            removalToken = newButton.Click.AddHandler((sender, e) =>
            {
                MessageBox.Show("Hi there!");
                newButton.Click.RemoveHandler(removalToken);
            });

For some folks, that's worth a few rough edges.

Sunday, August 18, 2013

What if performance was more important than accuracy?

In 2000 Interplay released a game called "Messiah". It wasn't a particularly good game, and the only reason I mention it is it had this automatic system for scaling detail levels. Rather than letting people set detail level manually, it would detect when it could add some more polygons, and when detail levels needed to be dialled down. That way the game could fairly consistently hit the target frames per second.

That got me wondering: what if you find yourself programming something for which being fast is more important than being exactly right?
This is not an entirely unfamiliar trade-off: everytime you use a double instead of a decimal you decide that fast calculations are more important than exact decimal representation of your numbers. And the UDP protocol foregoes transmission verification for speed.

In the past we would write asynchonous function calls somewhat like this:
(Now with async/await, things are different again. Well, they look different.)

// Start reading from that slow file system
var fileStream = new System.IO.FileStream(@"C:\blablah", System.IO.FileMode.Open, System.IO.FileAccess.Read);
var asyncResult = fileStream.BeginRead(blahblah ...);

// Do lots of stuff in between to maximize your efficiency

// And now you just need your data, so for the rest of the 
// time you're synchronously waiting for it to arrive
fileStream.EndRead(asyncResult);

That's fine and dandy if you're just reading data. But what if you're calculating something on another thread, and you've already found your result, but the caller hasn't asked for it yet? Maybe you could spend your time improving your result rather than twiddling your thumbs.
So we need a method that iteratively improves on its previous results, and returns each so that we have something to return. Obviously I chose pi (using Microsoft.SolverFoundation.Common):
(using BBP formula)

static IEnumerable<Rational> GetPi()
{
    Rational bigPi = new Rational();
    Rational bigMinusOne = -1;
    Rational bigOne = 1;
    Rational bigTwo = 2;
    Rational bigFour = 4;
    Rational bigFive = 5;
    Rational bigSix = 6;
    Rational bigEight = 8;
    Rational bigSixteen = 16;
    for (int i = 0; i < 10000; i++)
    {
        Rational bigI = i;
        Rational powI;
        Rational.Power(16, i, out powI);
        var bigS = (bigOne / powI) * (bigFour / ((bigEight * bigI) + bigOne) - (bigTwo / ((bigEight * bigI) + bigFour)) - (bigOne / ((bigEight * bigI) + bigFive)) - (bigOne / ((bigEight * bigI) + bigSix)));
        bigPi += bigS;
        // We don't want to return anything less than 7 hexadecimal digits
        if (i > 7)
            yield return bigPi;
    }
}

Sooo, up to ten thousand hexadecimal digits (hexits?) of pi generated as a decimal fraction. Now to write a wrapper class that runs this in a loop in another thread:

public sealed class IncreasinglyAccurate<T>
{
    private Func<IEnumerable<T>> _generator;
    private Thread _generatorThread;
    private bool _hasValue = false;
    private T _currentValue;
    public IncreasinglyAccurate(Func<IEnumerable<T>> generator)
    {
        if (generator == null)
            throw new ArgumentNullException("generator");
        _generator = generator;
        // returns as soon as the generator thread is started
        GetValues();
    }
    private void GetValues()
    {
        // Spinning up a new thread is wasteful, but I can't set priority
        // on work items added to the threadpool
        _generatorThread = new Thread(delegate()
        {
            foreach(T value in _generator()){
                _currentValue = value;
                _hasValue = true;
            }
        });
        _generatorThread.Priority = ThreadPriority.BelowNormal;
        _generatorThread.Start();
    }
    public T GetValue()
    {
        // wait for a value
        while (!_hasValue)
        {
            Thread.Sleep(1);
        }
        // delegates will need to expect a ThreadAbortException
        _generatorThread.Abort();
        _generatorThread.Join();
        return _currentValue;
    }
}
/// <summary>
/// Little wrapper to get generic type inference
/// </summary>
public static class IncreasingAccuracy
{
    public static IncreasinglyAccurate<S> FromDelegate<S>(Func<IEnumerable<S>> generator)
    {
        return new IncreasinglyAccurate<S>(generator);
    }
}

That looks about right. Now we should be able to use this as follows:

var piGenerator = IncreasingAccuracy.FromDelegate(GetPi);
// Do some work!
System.Threading.Thread.Sleep(50);
Console.Write(piGenerator.GetValue().ToString());

On my slow old computer that shows a very big number divided by another big number (probably a power of 16!). Increasing the sleep time increases the size of the number, so that looks right.

I'll grant you that pi digits are not that useful; I'm not sure when you would ever need more than, oh, 10 digits max. But I can think of some situations where it's useful to have a sufficient answer fast, but to keep thinking about a better one: AI, voice recognition, anything that will not be 100% perfect in a hurry.

Thursday, June 13, 2013

Something silly

I've been spending some of my free time working on something silly. People who know me expect me to bring up the OMGWTF2 contest now. That's a valid point, so let me start over.

I've been spending some of my free time working on two silly somethings. Here's a link to one of those: http://demo.tabeoka.be/Contact%20%20Tabeoka.htm . Click that link and fill out the form and send it up.
Go ahead, nothing bad will happen. I'll wait right here.






Did you click that link? No? You need to have clicked that link to make sense of the rest of this post. No really! Go back and click.






Right. So now that that is done, I have a second link for you: click right here.
Now there are three possibilities:
1) you're pretty gullible, and thought that was amazing,
2) you thought that was still pretty neat, or
3) that link did nothing because you're using some browser I haven't tested, which are many.

For those of you who chose option 3 and missed all the excitement: that second link should have replayed exactly what you did on that page the first time you were there. I had the idea after going through some Google Analytics graphs and wondering: "But why do they click that particular link?". You could potentially use this type of script to do usability testing on the cheap.
The code is for the most part pretty straightforward, but there are a few things which I thought were interesting to point out:

It generates unique selectors for elements. If you click something, it needs to log which element. When you're just dealing with javascript, you would keep a reference to the DOM element, but we're logging to json. Generating a jquery selector enables getting an easy reference to the clicked element. For now I'm generating a selector by going through parent elements ( td>tr ) and logging the index (:nth-child) until I hit an element with an id (#content) or the document root (html>body>). Amazingly I had to write this myself; I couldn't find a pre-written script on Google.

In the interest of keeping bandwidth use and file size down, I log  frame-by-frame. Every 40 ms (for a frame rate of 25) the data collector comes by, compiles and documentEvents object and adds it to the array. which brings me to the next point: this script is completely modular. Currently it logs input fields, mouse location, scroll position and clicks. If you want to add another thing to log though, say actions in a Flash object, you just write another set of methods to log and replay those, and point the main logger file to it. Remember that every additional logger adds extra bytes to the data that needs to be sent to the server.

Everybody knows that the onunload (or onbeforeunload) can not be trusted. That's why I ping the latest frames to the server every second. You could set it to 2 seconds or 10 seconds or whatever as well. At the server end I have a .NET handler that appends the extra json to the end of a file. The first request generates a new file and sends back the id for subsequent requests (currently a Guid). This would be trivial to implement in another server language.

Current problems include:
In Chrome onchange on an input field is only fired after the field loses focus. IE fires it on keypress. That means that in IE you see the text field value being typed up, but in Chrome it just appears fully written. Chrome is better for bandwidth, but IE is nicer to see replayed. I haven't made up my mind yet which is preferable.

Replaying the logged event mutates the initial page state, but moving back the timeline does not revert to previous page state. I could clear text fields and such, but simulated clicks that fired event handlers might be impossible to undo. Like that cookie notice.

Lastly, this is just some demo script code, don't use it for anything serious. I haven't figured out yet if I'm going to actually release this code as open source at e.g. GitHub. Don't copy/paste and run with it. I'm sure it's illegal or something.

Tuesday, May 7, 2013

Short: belgian rijksregisternummer validation in javascript

It's been really busy, so no time for blog posts. In the meantime, here is a short function to validate id card numbers (rijksregisternummer) in Belgium, according to the official guidelines. I could not find a proper implementation of this in javascript.

        function IsRRNoValid(n) {
            // RR numbers need to be 11 chars long
            if (n.length != 11)
                return false;

            var checkDigit = n.substr(n.length - 2, 2);
            var modFunction = function(nr) { return 97 - (nr % 97); };
            var nrToCheck = parseInt(n.substr(0, 9));
            
            // first check without 2
            if (modFunction(nrToCheck) == checkDigit)
                return true;

            // then check with 2 appended for y2k+ births
            nrToCheck = parseInt('2' + n.substr(0, 9));

            return (modFunction(nrToCheck) == checkDigit);
        }

Easy as pie. You can get the gender and date of birth out of there as well, but I don't need that info for my purpose. A slightly more advanced validation would also check to see if the first 6 digits represent a valid date according to the yyMMdd format (determining the right century using the current validation).

Monday, March 4, 2013

Good news! People are smart! Kind of!

For customer HAROL we wrote a little memory game. There are 16 cards (8 pairs), and you get 3 attempts to try and find a match. If you do find a match you might just win an iPad! If you don't - well, you still get a discount coupon. The game was played on iPads set up at Batibouw. There is no online link for the game, or I'd link you there *.

Before we sent the game off to Batibouw, I sat down to calculate the likelihood of finding a match within your three attempts: it shouldn't be too hard or Harol would never get rid of its iPads. For calculations, I would assume a 100% perfect memory. The most interesting part here is how close people would get to that 100% perfect memory score. Are people smart? Kind of? Or are they really dumb?

I haven' t had a lot of formal statistics training, so I've largely relied on The Big Three: "simple common sense", "blatant disregard for other people's opinions" and "shameless guessing". Readers with a background in statistics: feel free to set me on the right path in the comments. I'll feel free to blatantly disregard you.
Anyway, here's what I figured:

Three turns, six cards. Rather than calculate the 'winning' percentage, I'll calculate the 'losing' percentage, and deduct from 100 %. Losing is the one route you take through the game where you don't find a match, winning is any of 3 distinct events - maybe even 5.
  • Card 1: nothing to do here
  • Card 2: Lucky guess chance to win. Chances to lose: 14/15
  • Card 3: May be a match for one of the 2 previous ones. 12/14 it ain't though.
  • Card 4: And if it ain't, there's still a lucky guess chance: 12/13
  • Card 5: A match for any of the previous ones is starting to be likely, but not quite: 8/12
  • Card 6: Last chance to hit a match: 10/11
So where does that put us? Far as I can see, we just multiply all of those fractions, and come up with the losing percentage: 0,4475524475524476. Say 45%. You're actually 55% likely to win if you play it smart. (If you just guess blindly, your winning chances are obviously (1-pow(14/15, 3)) = 18.6 % = not looking good).

So how did people do?




**** drumroll ****




49%
Only 6% away from a perfect memory.

Good news! People are smart! Kind of!



* For those of you who care: it was an MVC app, some nice ajax, css transitions that I could afford to rely on, as it only needed to work on iPad. Who knows maybe it will be recycled for another promotion that will run online.