Menno@Tabeoka: December 2012

Friday, December 28, 2012

SELECTing the comedy horror genre

Whenever we get a new intern, I like to poke and prod him*, to see what he's made of (bones and gooey bits, usually). To test SQL proficiency, I use the following problem. I like it because it's something that comes up occasionally in projects, and is surprisingly complex, though it seems like it shouldn't be:

Suppose you've got a movie database. You've got a table with movies T_MOVIES:

CREATE TABLE [dbo].[T_MOVIES](
 [movie_key] [int] NOT NULL,
 [movie_name] [nvarchar](500) NOT NULL,
 CONSTRAINT [PK_T_MOVIES] PRIMARY KEY CLUSTERED 
(
 [movie_key] ASC
)WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]

and there's also a table for movie genres T_GENRES:

CREATE TABLE [dbo].[T_GENRES](
 [genre_key] [int] NOT NULL,
 [genre_name] [nvarchar](500) NOT NULL,
 CONSTRAINT [PK_T_GENRES] PRIMARY KEY CLUSTERED 
(
 [genre_key] ASC
)WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]

To link both tables there is a third table T_MOVIE_GENRES:

CREATE TABLE [dbo].[T_MOVIE_GENRES](
 [movie_key] [int] NOT NULL,
 [genre_key] [int] NOT NULL,
 CONSTRAINT [PK_T_MOVIE_GENRES] PRIMARY KEY CLUSTERED 
(
 [movie_key] ASC,
 [genre_key] ASC
)WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]

Assume the proper foreign key constraints have been applied, and you have a pretty sensible lay-out. To select all movies within the horror genre you only have to join T_MOVIES to T_MOVIE_GENRES and filter down your genre key in the WHERE clause to whatever the horror genre is, suppose it's 5:

SELECT
 T_MOVIES.movie_key,
 T_MOVIES.movie_name
FROM
 T_MOVIES INNER JOIN
 T_MOVIE_GENRES ON
 T_MOVIES.movie_key = T_MOVIE_GENRES.movie_key
WHERE
 T_MOVIE_GENRES.genre_key = 5

The question now is: What if I want to filter by two genres? How do I find movies that are not only horror, but also comedy? What query will yield "Shaun of the Dead"?
More generally, how do we find those records in a table that have more than one matching record in another table, where those matching records have a field set to a set of specific values?

Obviously WHERE T_MOVIE_GENRES.genre_key IN (1,5) is going to yield movies that belong to either genre.
At this point usually the intern scratches his chin, gives it a bit of thought, and comes up with this:

SELECT
 T_MOVIES.movie_key,
 T_MOVIES.movie_name
FROM
 T_MOVIES INNER JOIN
 (
 SELECT  T_MOVIE_GENRES.movie_key
 FROM T_MOVIE_GENRES
 WHERE T_MOVIE_GENRES.genre_key = 5
 ) AS Q_HORROR
 ON T_MOVIES.movie_key = Q_HORROR.movie_key
  INNER JOIN
 (
 SELECT  T_MOVIE_GENRES.movie_key
 FROM T_MOVIE_GENRES
 WHERE T_MOVIE_GENRES.genre_key = 1
 ) AS Q_COMEDY
 ON T_MOVIES.movie_key = Q_COMEDY.movie_key

(Or some other solution involving a subquery)
Okay, fine, you found "Shaun of the Dead". But what if I'm actually in the mood for Zombieland, a comedy / horror / road movie? In fact, I want a stored procedure that can take an arbitrary number of genres, and filter by it. You can go the dynamic sql route, and build a string with an arbitrary number of subqueries. It'd be ugly and hard to maintain, but it would work.
The following shows what I think is the best solution though:

SELECT
 T_MOVIES.movie_key,
 T_MOVIES.movie_name
FROM
 T_MOVIES INNER JOIN
 T_MOVIE_GENRES
 ON T_MOVIES.movie_key = T_MOVIE_GENRES.movie_key
WHERE
 T_MOVIE_GENRES.genre_key in (1,5)
GROUP BY
 T_MOVIES.movie_key,
 T_MOVIES.movie_name
HAVING
 COUNT(*) = 2

To refactor this into forementioned stored procedure you need to add a table-typed variable, and use a little bit of dynamic sql - because SQL does not have anything like arrays. Something like this:

CREATE PROCEDURE GetMoviesByGenres
 @genre_keys nvarchar(2000)
AS
BEGIN
 DECLARE @genre_keys_table table(genre_key int)
 DECLARE @genre_count int
 
 INSERT INTO @genre_keys_table
   EXEC(N'SELECT DISTINCT genre_key FROM T_GENRES WHERE genre_key IN (' + @genre_keys + N')')
 
 SELECT @genre_count = COUNT(*) FROM @genre_keys_table
 
 SELECT
  T_MOVIES.movie_key,
  T_MOVIES.movie_name
 FROM
  T_MOVIES INNER JOIN
  T_MOVIE_GENRES
  ON T_MOVIES.movie_key = T_MOVIE_GENRES.movie_key
 WHERE
  T_MOVIE_GENRES.genre_key IN (
   SELECT genre_key FROM @genre_keys_table
  )
 GROUP BY
  T_MOVIES.movie_key,
  T_MOVIES.movie_name
 HAVING
  COUNT(*) = @genre_count
END

If it wasn't clear from the code; @genre_keys takes a comma-delimited list of keys. That leaves an obvious injection vulnerability, so I would advise to have your calling code take an array of int, and not a string.
So that pretty much concludes that. If any of you ever end up as an intern at Tabeoka, you can now impress me with your mad SQL skillz.

Menno

* "him" because we've only ever gotten male interns. I'm prety sure I could get in trouble for poking and prodding a girl intern anyway. Both with the law, and my wife.

Monday, December 17, 2012

Html and Svg: handling events to and fro

I've been working on a website where I embed a couple of svg's in an html page. Actually, I embed the same svg in three places on one page. The svg in question has some script-driven animation, so I needed to use an object tag, rather than a simple img.
First challenge was this: I needed to trigger the svg animation for all svg's one after the other, which meant I had to call into the svg's script:

        function bounceBall(ball) {
            var svgView = getSvgView(ball);
            
            if(svgView)
                svgView.startBounce();
        }

        function getSvgView(ball) {
            var svgDoc;
            try {
                if (ball.getSVGDocument)
                    svgDoc = ball.getSVGDocument();
                else if (ball.contentDocument)
                    svgDoc = ball.contentDocument;

                if (svgDoc) {
                    return svgDoc.defaultView;
                }
            } catch (e) { }
            return null;
        }

When I initially call this function, the SVG dom might not be loaded yet. That's why I include the try/catch, and return null if an error occurs. Elsewhere in the code I set a setTimeout to retry 200 milliseconds later. The onload event is not quite reliable enough.

Next up was the click handler. If you simply add an onclick to the object tag, nothing happens when you click the svg. That makes perfect sense: the onclick is registered and handled within the svg, and never makes it to the html dom. I needed to have a click on the image open a little div in html though. From svg I could call the html javascript functions using top.someFunctionName(). I had the same svg image three times though, and it needed to do something different each time. Here's what I came up with:

Html:

<object type="image/svg+xml" data="/Content/Images/ball.svg" class="ball" style="left: 130px; top: 160px;" onclick="showPopup('homepopup2');"></object>

"But wait!" you say, "You just told me that doesn't work!". And it doesn't, but it would be pretty convenient if it did:

Html dom javascript:

        function setClickEvent(ball) {
            if (ball.onclick) {
                var svgwin = getSvgView(ball);
                if (svgwin) {
                    svgwin.eventHandler = ball.onclick;
                } else {
                    // if the svg view is not available,
                    // try again in 200 ms.
                    window.setTimeout(function() {
                        setClickEvent(ball);
                    }, 200);
                }
            }
        }

There you go. On load I simply funnel the onclick handler into the svg dom. The svg implementation is trivial:

    var eventHandler = null;

    function handleClick(){
      if(eventHandler)
        eventHandler();
    }

<circle cx="15" cy="15" r="5" id="ball" onclick="handleClick();" />

If you need this sort of thing more often - or for more events - you could work out a neat wrapper with registerEventHandler(eventName, eventHandler) and triggerEvent(eventName) methods to reuse (and a html-side script that automatically hooks it up). I don't currently foresee a need for it myself though.

Menno

Tuesday, December 4, 2012

Anything but restful

Last week I was asked to integrate some "webservices" into a .NET application. I say "webservices", but that word can mean anything from SOAP to custom format XML to some icky CSV that uses the asterisk for a delimiter.
I was lucky though: the url made obvious that this webservice was WCF-based, some methods returning json, and equivalent methods returning xml. All request parameters are to be sent as json through the querystring.
I'm an optimistic fool, so I simply use the "Add Service Reference" to add a link to the service, and all seems well. Until I try to actually use the service: my application throws up the following:

Could not find default endpoint element that references contract 'IServiceInterface' in the ServiceModel client configuration section. This might be because no configuration file was found for your application, or because no endpoint element matching this contract could be found in the client element.

Sure enough, when I check in my configuration file, the following has been helpfully added:

<configuration></configuration>

Well thank you very much, Visual Studio. You're a great help.
Google turns out to be more of a help, and points me to the following article telling me: what you're trying to do does not work and will not work. Because they're so RESTful. Instead I should be using the WebChannelFactory class to generate a channel and use that to call the webservice.
No problem.
I re-use the proxy classes generated in my non-working service reference (but delete the wrapper classes), and point the WebChannelFactory to the generated service interface. Does that work now?
No of course it doesn't work: "405 http method POST is not supported by this url".
Sure enough, Fiddler shows that the service is being called using a POST of the request paremeter serialized as xml. "Stupid boy, " Google tells me again, "you should be adding the WebGet attribute to your method, and if you want json, you need to add a behavior to your endpoint that selects the JsonQueryStringConverter". "And while you're on it, " Google continues, "don't forget to define a UriTemplate for your WebGet attribute".
Done, done and done.
Now does it work? No of course it bloody well doesn't work. No more errors, but no deserialized data either. The json serialized data appears to be wrapped in a single key 'd' for all requests. The class structure doesn't match up to the json hierarchy, so nothing gets deserialized. So I wrap all my return types in a small generic class:

  public class JsonWrapper<T>
    private T _d;
    public T d {
      get { return _d; }
      set { _d = value; }
    }
  }

Finally I start seeing data. WCF and the one side, WCF on the other side and nothing works by itself. Some of the articles I was reading had the gall to tell me this was because WCF has great extensibility. Is "extensible" newspeak for "does nothing useful out of the box"?
Sure, it is very extensible, but in the time it took me to figure out how to get it to call a simple json webservice, I could just as easily have used the WebClient class to call the url, and run the return data through NewtonSoft's Json.NET. In fact, I could have done it three times.
If the service reference is able to properly generate proxy classes, why can't the service reference also send the necessary metadata to configure all that stuff I had to do manually? Where was the added value in all that WCF stuff? Is this a useful abstraction, or needless obfuscation?
Now that I've got a bit more of a handle on how it works, I'll give it another chance. But I'm not sold.

Menno