I'm looking a way to impute using proc iml in sas the average of the sales of the next two months. As you can see sometimes I dont have the sales of 201901 and sometimes is missing on 201902 For example for the first barcode I want to find the sales[1]= mean(sales[2],sales[3]) and I want to do this for each unique barcode.
The "table A" is like this:
Obs. | Barcode | date | sales | Position
---------------------------------------------------------------
1 |21220000000| 201901 | . | 1
2 |21220000000| 201902| 311 | 2
3 |21220000000| 201903| 349 | 3
4 |21220000000| 201904| 360 | 4
5 |21220000000| 201905| 380 | 5
6 |21220000000| 201906| 440 | 6
7 |21220000000| 201907| 360 | 7
8 |21220000000| 201908| 390 | 8
9 |21220000000| 201909| 410 | 9
10 |21220000000| 201910| 520 | 10
11 |21220000000| 201911| 410 | 11
12 |21220000000| 201912| 390 | 12
13 |31350000000| 201901| 360 | 1
14 |31350000000| 201902| . | 2
.etc.
24 |31350000000| 201912| . | 12
25 |45480000000| 201901| 310 | 1
26 |45480000000| 201902| . | 2
.etc.
I tried something like this but it doesnt work:
proc iml;
t_a= TableCreateFromDataSet("work","table_a");
call TablePrint(t_a);
do i =1 to nrow(t_a);
if t_a[i,4]=. and t_a[i,5]=1 then t_a[1,4]= mean(t_a[i+1,4],t_a[i+2,4]) ;
i=i+1;
end;
run;
Is there a way to do it using matrices or lists in proc iml or would you recommend any other ways? Thank you in advance!
I don't think this is a good solution in
PROC IMLto your problem, but I can tell you where you're going wrong in your particular approach. Hopefully Rick or someone else can stop by to show the right IML way to solve this using Matrix operations, or you can browse the Do Loop as I'm fairly sure Rick has articles on imputation there.That said, your issue here is that SAS IML doesn't really have very much support for tables as a data structure. They were recently added, and mostly added just to make it easier to import datasets into IML from SAS without a lot of trouble. However, you can't treat them like Pandas data frames or similar; they're really just data storage devices that you need to extract things from. You need to move data into matrices to actually use them.
Here's how I would directly translate your nonfunctional code into functional code. Again, please remember this is probably not a good way to do this - matrices have a lot of features that make them good at this sort of thing, if you use them right, and you probably don't need to use a DO loop to iterate here - you should rather use matrix multiplication to do what you want. That's really the point of using IML; if you're just iterating, then use base SAS to do this, it's much easier to write the same program in base SAS (or, even better, use the imputation procedures if you have them licensed).
What I do first is extract the Barcode and Sales columns into matrices. Barcode is to check to make sure we're imputing from the same ID. Then, we check to see if sales is missing for that iteration, and further make sure we're not on the last two iterations (or it'll give an out-of-range error). Lastly, we compare the barcode with the next two barcodes and make sure they're the same. (The way I do that is pretty silly, honestly, but it's the quickest way I can think of.) If those all pass, then we calculate the mean.
Finally, we add the matrix back on the t_a table, and you can export it at your leisure back to a SAS dataset, or do whatever it is you want to do with it!
Again - this isn't really a good way to do this, it's more a direct answer of "what is wrong with your code". Find a better solution to imputation than this!