A collection of learnings and opinions.

Tuesday, September 30, 2008

Podcasts

I listen to a lot of podcasts. What can I say, I bore easily. When I have to do something that's not social and takes more than a few minutes I often pop the earbuds in and start listening. My player, the Zen V 2Gb, is far too small to contain any music, and podcasts are a great way of getting some intelectual topping in the daily drudgery.

I thought I'd post the list of podcasts I enjoy regularly:
  • CBC Radio Quirks and Quarks
    A great show about the stories breaking in science. Definitely a Canadian slant, and they've been running theme shows about Canada's carbon-footprint and such lately (which doesn't interest me that much), but still one of the greatest.

  • FLOSS Weekly
    This show died for a while, but they're back with a vengeance. In-depth interviews with important people in the open-source world. Could it be better?

  • In Our Time With Melvyn Bragg
    Melvyn is a delightfully pretentious British chap interviewing academics on interesting themes. Always interesting, and a really eclectic selection of themes.

  • IT Conversations
    One of the really early podcasts, this is excellent quality. I listen to the entire feed, which I think syndicates a lot of other feeds. Great content from biomedical/genetics to philosophy and CTO-grade stuff. There's always something here you haven't heard about.

  • Math Mutation
    This is a new one for me. Erik Seligman talks very briefly (3-5 minutes) about some interesting topics from the world of Math and Numbers. Good content!

  • StarStuff with Stuart Gary
    I'm a sucker for the starry nights, and this recent discovery seems right up my alley. I've only listened to two yet, but I think it's a keeper!

  • The Java Posse
    My favourite! The posse have weekly commentaries on the tech-world with a heavy java-slant. All the members are world-class developers with strong opinions and the arguments to back them up. It makes my day every time I see a new one pop into my playlist.

  • The Skeptic's Guide to the Universe
    Tackling the idiocy and numb-thinking out there head on, the Skeptical Rogues are not afraid to delve deeply into any theme they find. It's so refreshing to see that there are intelligent people out there, not just nuts.

  • This Week in Science - The Kickass Science Podcast
    Another recent discovery on my part. I'm not sure if this is a keeper yet. The themes seem interesting, but they're just not catching me yet. I'll keep them around for a few weeks, and see if they're worth it.

  • This Week in Tech
    I've a feeling this is the big one. Leo Laporte is a great host, and he gets a lot of interesting punters. Whenever they talk about something I know they always manage to get it dead-wrong though, so I take their opinions with a grain of salt. Still, the show is a gem.
Definitely a Tech/Science slant, eh? Any others I should know about, or comments are much appreciated :)

Thursday, September 25, 2008

CountDistinct - How About SumDistinct - Or Just AggregateDistinct?

Making a report in Microsoft SQL Server 2005 Reporting Services today I needed to sum up several levels in a table. However, the numbers where available only at an intermediate aggregated level, so I needed to sum up only the distinct sublevel-items. Got that? Maybe it's easier if I show an example:


Let's concoct an example here. I'm extracting data in the following groups: Firm, ProductType, Product, Account, Item. Say I'm summing up one measure, BalanceAmount, on products, but my dataset returns other measures on the lowest level, Item.


Here's an example DataSet






















































































Firm ProductType Product Account Item BalanceAmount ItemMeasure
Acme Explosive Nitroglyserin C-001 Bottle $200 $2.14
Acme Explosive Nitroglyserin C-001 Mug $200 $4.87
Acme Explosive TNT C-001 Stick $200 $0.79
Acme Explosive Plastique C-001 Wad $200 $7.64
Acme Prop Fake rock C-001 Crate $200 $4.55
Acme Explosive TNT C-124 Stick $89 $0.79
Acme Explosive TNT C-124 Half-stick $89 $0.40
Acme Explosive Nitroglyserin C-089 Stick $40 $0.79


In the report I need to show this as a drill-down table, with the measures down on the lowest level (Item) summed up on every level, and the aggregated measure (BalanceAmount) summed up on every level where it makes sense.


The problem is that if you use the common SUM function the report would sum up all BalanceAmounts on the Account level resulting in a BalanceAmount of $1000 for Account C-001, not $200 which is the correct value (that would be $200, Einstein).


What you really need is something like CountDistinct, but just for Sum. In fact, this is a specific case of a more general problem where you need to conduct some aggregation on some scope. How do I do something like that?


It's not really hard. What I need is some way to make just the first instance of a value in whatever scope needed be the counting one. To get this I enter the
following into the Code tab of you Report properties:



    Dim scopes As System.Collections.Hashtable
Function getDistinct(ByVal checkMe As Object, ByVal scope As String) As Boolean
Dim firstTimeSeen As Boolean

If (scopes Is Nothing) Then
scopes = New System.Collections.Hashtable
End If

If (Not scopes.Contains(scope)) Then
scopes.Add(scope, New System.Collections.ArrayList)
End If

If (Not CType(scopes(scope), System.Collections.ArrayList).Contains(checkMe)) Then
firstTimeSeen = True
CType(scopes(scope), System.Collections.ArrayList).Add(checkMe)
Else
firstTimeSeen = False
End If

Return firstTimeSeen
End Function

The usage pattern to get SumDistinct in this case is:


=SUM(IIF(Code.getDistinct(Fields!account.Value, "scopeId"), Fields!BalanceAmount.Value, 0))

The "scopeId" string identifies the current scope, so it will change from group to group. E.g. in the Firm group you'd enter "firm", in the ProductType group you'd enter "prodType", etc.


So what am I doing here? We're remembering the values of the checkMe parameter in the context of the scope, and returning True in only the first instance. Thus we can just sum the first of every checkMe value. You could use this to do your own CountDistinct by entering the following expression (but this would be a bit roundabout for that use-case):


=COUNT(IIF(Code.getDistinct(Fields!account.Value, "scopeId"), Fields!BalanceAmount.Value, 0))

I hope you get some use out of this. This kind of code is easy to crank out, but even easier to just copy-paste.