A collection of learnings and opinions.

Showing posts with label reporting. Show all posts
Showing posts with label reporting. Show all posts

Thursday, September 25, 2008

CountDistinct - How About SumDistinct - Or Just AggregateDistinct?

Making a report in Microsoft SQL Server 2005 Reporting Services today I needed to sum up several levels in a table. However, the numbers where available only at an intermediate aggregated level, so I needed to sum up only the distinct sublevel-items. Got that? Maybe it's easier if I show an example:


Let's concoct an example here. I'm extracting data in the following groups: Firm, ProductType, Product, Account, Item. Say I'm summing up one measure, BalanceAmount, on products, but my dataset returns other measures on the lowest level, Item.


Here's an example DataSet






















































































Firm ProductType Product Account Item BalanceAmount ItemMeasure
Acme Explosive Nitroglyserin C-001 Bottle $200 $2.14
Acme Explosive Nitroglyserin C-001 Mug $200 $4.87
Acme Explosive TNT C-001 Stick $200 $0.79
Acme Explosive Plastique C-001 Wad $200 $7.64
Acme Prop Fake rock C-001 Crate $200 $4.55
Acme Explosive TNT C-124 Stick $89 $0.79
Acme Explosive TNT C-124 Half-stick $89 $0.40
Acme Explosive Nitroglyserin C-089 Stick $40 $0.79


In the report I need to show this as a drill-down table, with the measures down on the lowest level (Item) summed up on every level, and the aggregated measure (BalanceAmount) summed up on every level where it makes sense.


The problem is that if you use the common SUM function the report would sum up all BalanceAmounts on the Account level resulting in a BalanceAmount of $1000 for Account C-001, not $200 which is the correct value (that would be $200, Einstein).


What you really need is something like CountDistinct, but just for Sum. In fact, this is a specific case of a more general problem where you need to conduct some aggregation on some scope. How do I do something like that?


It's not really hard. What I need is some way to make just the first instance of a value in whatever scope needed be the counting one. To get this I enter the
following into the Code tab of you Report properties:



    Dim scopes As System.Collections.Hashtable
Function getDistinct(ByVal checkMe As Object, ByVal scope As String) As Boolean
Dim firstTimeSeen As Boolean

If (scopes Is Nothing) Then
scopes = New System.Collections.Hashtable
End If

If (Not scopes.Contains(scope)) Then
scopes.Add(scope, New System.Collections.ArrayList)
End If

If (Not CType(scopes(scope), System.Collections.ArrayList).Contains(checkMe)) Then
firstTimeSeen = True
CType(scopes(scope), System.Collections.ArrayList).Add(checkMe)
Else
firstTimeSeen = False
End If

Return firstTimeSeen
End Function

The usage pattern to get SumDistinct in this case is:


=SUM(IIF(Code.getDistinct(Fields!account.Value, "scopeId"), Fields!BalanceAmount.Value, 0))

The "scopeId" string identifies the current scope, so it will change from group to group. E.g. in the Firm group you'd enter "firm", in the ProductType group you'd enter "prodType", etc.


So what am I doing here? We're remembering the values of the checkMe parameter in the context of the scope, and returning True in only the first instance. Thus we can just sum the first of every checkMe value. You could use this to do your own CountDistinct by entering the following expression (but this would be a bit roundabout for that use-case):


=COUNT(IIF(Code.getDistinct(Fields!account.Value, "scopeId"), Fields!BalanceAmount.Value, 0))

I hope you get some use out of this. This kind of code is easy to crank out, but even easier to just copy-paste.

Wednesday, February 28, 2007

Parametrized woe

Let's set the stage. I'm developing an application in VB.NET that connects to a Microsoft Reporting Service using a Web Service.

My application (let's call it Pluto) will simply connect to the reporting server (let's call that one Uranus), get a list of reports, let the user choose one of them, let the user define the parameters of the report and render it to a file on disk. Simple, right? So I thought, or rather I thought it'd be simpler.

Now, the idea behind web services is that you can expose functionality on your platform to platform-independent apps through a well-defined protocol. That is really cool. It turns out, however, it's not always easy.

Uranus is already up and running (and in semi-production, no less) so I can't really mess with it. I am at the mercy of the setup that's already alive. I kind of like this, it's the way web services are supposed to work. So I connect to Uranus and fetch the list of reports. All is well. I let the user choose one and get the parameters to it. All is well, I get a list of parameters. I try to populate a UI with the parameters and their default values, to give the user some starting-point, and I discover that all is not well.

Uranus is exposed locally through a proxy class, so I'm working with a local object that "stands in" for Uranus. This is not really important, but kind of neat, so I thought I'd mention it. Anyway, I call methods on the proxy (let's call it Miranda) to get to Uranus. To be specific, Uranus is a ReportingService class, which delegates to Uranus.

The method Pluto is calling to get the parameters of a method is the aptly named ReportingService.GetReportParameters -method. Now this little method takes 5 parameters. That's pretty extreme to my eyes. Furthermore two of the parameters are actually arrays of objects from the same namespace. This is where my head starts to hurt.

Let's go through the parameters as they're described in msdn, my comments are in purple:
Report
The full path name of the report.
Pretty easy to understand, except it's not really the name, it's the path to the report. So, if your report is named "Bob", but it's on the path "http://Uranus/path/to/Bob" what you want to insert here is "/path/to/Bob". Naturally!
ForRendering
A Boolean expression that indicates how the parameter values are to be used. If set to a value of true, parameter properties that are returned are based on the parameter data that was used during the execution of the report.
I was a bit stumped at this one, for rendering? I'm not going to render yet, I thought, I want to get the parameters and show them to the user. Is that rendering? I thought it wasn't. Oh, did this cause me trouble...
HistoryID
The ID of the report history snapshot. Set the ForRendering parameter to a value of true in order to retrieve parameter properties for a report history snapshot. Set the value to null (Nothing in Visual Basic) if you are retrieving parameters for a report that is not a report history snapshot.
Aha! So that's what the ForRendering is for! You need it if you want a historical snapshot for some reason. I am now pretty sure I was right in leaving the ForRendering off. I just go ahead and pass a Nothing (NULL in c-like lingo) here.
ParameterValues
The parameter values (ParameterValue[] objects) that can be validated against the parameters of the report that are managed by the report server.
Now I like the kitchen sink as much as anybody, but is this really what a GetParameters should be used for? What this does as far as I can tell is that it lets you pass a set of values to this method for validation. In my meager mind some kind of ValidateParameter(ParameterValue[] p) or something would be better. It might even return a boolean, wouldn't that be nice?
Credentials
The data source credentials (DataSourceCredential[] objects) that can be used to validate query parameters.
Credentials are important. I wholeheartedly agree. I don't pass it anything yet, but I plan on later. Really, I do.
As you can probably see I was a bit confused at this method. So, I passed it the path of the report the user wanted (Name), a false (ForRendering), a Nothing (HistoryID), a Nothing (ParameterValues, I don't want to validate - I want to know what the parameters are) and a Nothing on the credentials too (hey, I'm debugging, I haven't gotten to that yet).

Success! I get a set of parameters out, and I'm smiling happily at the world. So I try to show them to the user, but they all come out as an undefined mess! They don't have default values so in effect they're just names for the parameters! That's no good for me, I have to give my users an inkling on what to enter. Hell, I even have to validate their input, how can I do that with no metadata? I go hunting for a solution.

Later, much later, I realize what my astute readers probably saw right away. Setting the ForRendering to True gives me parameters with types and default values and everything! It seems I foolishly assumed that rendering was somehow connected to what the service calls rendering in its other method. That is - the actual running of the report and output to some kind of format.

I am but a neophyte in the dark arts of web services, but I do wish they'd at least be internally consistent in their naming.

Next I'll delve into the joys of time-formats.