How do I show variable values with zero entries in PROC SQL?
I need to perform a number of imputations for missing values for modes of transport. There are 12 different modes of transport. As a preliminary step I calculate the number of people using each mode of transport and the corresponding quota:
My code looks like this:
PROC SQL; CREATE TABLE hjälptabell_transport_SE as SELECT R_fard_h,antal,antal/sum(antal) as kvot FROM (SELECT R_fard_h,count(R_fard_h) as antal FROM data_resor14_2 WHERE R_fard_h is not missing and R_res_land_grp="Sverige" GROUP BY R_fard_h); quit;
And my output ideally looks something like this:
However, in some case zero observations uses a certain mode of transport, making the output look like this instead:
Note that the number of people using the eleventh mode of transport is omitted. It’s actually very important that every table is precisely 12 rows long, and if precisely zero people uses a certain mode of transport I still want that row to be included, but with a count of zero.
How can I make this happen?
Solution – 1
Simple solution would be to create a table (or improvise a view) with numbers 1 to 12 and then add left join to table data_resor14_2
Solution – 2
First, create a template table with the numbers 1-12:
data template; do r_fard_h = 1 to 12; output; end; run;
Then perform a right join with it on your data. Use
coalesce to fill in any missing values with 0.
PROC SQL; CREATE TABLE hjälptabell_transport_SE as SELECT t2.R_fard_h , antal , coalesce(antal/sum(antal), 0) as kvot FROM (SELECT R_fard_h, count(R_fard_h) as antal FROM data_resor14_2 WHERE R_fard_h is not missing AND R_res_land_grp="Sverige" GROUP BY R_fard_h ) as t1 RIGHT JOIN template as t2 ON t1.r_fard_h = t2.r_fard_h ; quit;