Login | Register   
LinkedIn
Google+
Twitter
RSS Feed
Download our iPhone app
TODAY'S HEADLINES  |   ARTICLE ARCHIVE  |   FORUMS  |   TIP BANK
Browse DevX
Sign up for e-mail newsletters from DevX


Tip of the Day
Language: SQL Server
Expertise: Beginner
Mar 24, 1997

Calculating Average Number of Employees in Each Department

Question:
[Joe Celko's Personnel Problem Puzzle]

Daren Race was trying to aggregate the results from an aggregate result set using Gupta's SQLBase and could not think of any way other than using a temporary table or a VIEW. This is an example of what he was doing:

    Personnel:
    name  deptid 
    ===============
    Daren  Acct
    Joe    Acct
    Lisa   DP
    Helen  DP
    Fonda  DP
Then he viewed the data as a aggregate by deptid:
    SELECT deptid, COUNT(*) 
        FROM Personnel
        GROUP BY deptid;
The results will be:
    Result
    deptid COUNT(*) 
    ===============
    Acct    2 
    DP      3
Then he wanted to find the average department size! The way he did this was to use a VIEW:
    CREATE VIEW DeptView (deptid, tally) 
        AS SELECT deptid, COUNT(*) 
            FROM Personnel 
            GROUP BY deptid;
Then:
    SELECT AVG(tally) FROM DeptView;
He asked if anyone on the Gupta Forum on CompuServe could think of a way of doing this without using temporary tables (or views)? The two answers he got were:
    SELECT AVG(DISTINCT deptid) 
        FROM Personnel;
and
    SELECT COUNT(*) / COUNT(DISTINCT deptid) 
        FROM Personnel;
Your problem is to tell me what is wrong with each of them.

Answer:
The first answer will grab the department numbers, throw away NULLs (there should not be any in this case), then throw away duplicates and average what is left. This has nothing to do with the number of people in each department; we will get (1+2/2) = 1.5 for an answer.

The second answer is really much better and will give us the right results for this data. We have a COUNT(*) = 5, and COUNT(DISTINCT deptid) = 2, so the answer we get is (2.5) just as we wished.

But now we hire three new employees, Larry, Moe and Curly, who are not yet assigned to a department and our tables looks like this:

    Personnel:
    name  deptid 
    ===============
    Daren  Acct
    Joe    Acct
    Lisa   DP
    Helen  DP
    Fonda  DP
    Larry  NULL 
    Moe    NULL
    Curly  NULL  
We now have a COUNT(*) = 8, but COUNT(DISTINCT deptid) = 2 because it drops NULLs, so the answer we get is (4). The real answer is that we cannot determine an exact value, but we know that it is between (8/5 = 1.60) and (8/3 = 2.66) depending what we do with Larry, Moe and Curly.

If Mr. Race had stuck to his original method, we would have gotten:

    Result
    deptid COUNT(*) 
    ===============
    Acct    2 
    DP      3
    NULL    3
and a final result of 1.5 as before because the NULLs would form a group by themselves in the VIEW, but then been dropped out by the average in the final query.

Puzzle provided courtesy of:
Joe Celko
71062.1056@compuserve.com

DevX Pro
 
Comment and Contribute

 

 

 

 

 


(Maximum characters: 1200). You have 1200 characters left.

 

 

Sitemap
Thanks for your registration, follow us on our social networks to keep up-to-date