I have an interesting problem and I'm not sure how best to solve it.
Background:
We have a file that contains each request and how long that request takes. The problem is that this file gets quite big so any group by queries take an age. To get round this I have another file that pre-calculates the mean and standard deviation for request times for ten minute periods. This works fine and the group bys are much much quicker.
The problem:
What I want now is to be able to do an SQL group by query and call an aggregate function to combine the means and standard deviations. I've already done the maths and found the equations I need to calculate this.
I want to be able to say something like this
The solution?
I've just spent a bit of time looking at UDF's but you can only do scalar functions it seems. In other words I can't pass a field to my customer function that isn't included in the select and group statements. Bummer!
So it's now looking like perhaps I need to use native I/O or muck about and create a third file that maybe performs a bit better. I don't really want to have to do complicated groupings using native I/O and neither do I really want to be querying over that large file.
Any other ideas guys?
Thanks
Background:
We have a file that contains each request and how long that request takes. The problem is that this file gets quite big so any group by queries take an age. To get round this I have another file that pre-calculates the mean and standard deviation for request times for ten minute periods. This works fine and the group bys are much much quicker.
The problem:
What I want now is to be able to do an SQL group by query and call an aggregate function to combine the means and standard deviations. I've already done the maths and found the equations I need to calculate this.
I want to be able to say something like this
PHP Code:
select date, combinedMean(mean), combinedStdDev(stddev) from requests
group by date
order by date
I've just spent a bit of time looking at UDF's but you can only do scalar functions it seems. In other words I can't pass a field to my customer function that isn't included in the select and group statements. Bummer!
So it's now looking like perhaps I need to use native I/O or muck about and create a third file that maybe performs a bit better. I don't really want to have to do complicated groupings using native I/O and neither do I really want to be querying over that large file.
Any other ideas guys?
Thanks
Comment