Add-Innovation Home

About Add-Innovation

Contact Us

  SAS Institute logo.   Use arrays to reduce space requirements and processing time.


When data summaries are required for data which is not sorted, space and time can be saved by creating the summary in a data step rather than reading, sorting, and summarising the data using a data view, sort and Proc Summary. This is especially useful when working with large input files such as the master file.

The first example below produces counts of certain products ;

Proc Format;
    InValue Prods ‘A12345’=1
                ‘A23456’=2
              ‘A34567’=3
              ‘A45678’=4
                 Other=0;

Data _Null_;
    Array PrdCnt(4) 4;
    Infile Invents EOF=Report;
    Retain Tab ‘05’x;
    Input @15 ProdId Prods.@;
    If ProdId Then PrdCnt(ProdId) + 1;
    Return;
Report: File Output;
    Put ‘Product counts for ‘ /
    ‘A12345’ Tab ‘A23456’ Tab ‘A34567’ Tab ‘A45678’ /
    PrdCnt(1) Tab PrdCnt(2) Tab PrdCnt(3) Tab PrdCnt(4);


This next (more complex) example accumulates total minutes for ceased SoHo Residential PSTN & ISDN calls, split by Month (for all of 1999), Template, Locality, and Rate band, from the Master file processed sequentially.
Because of the number of records which would be involved, this saves a massive amount of disk space, the need to sort the data and run against Proc Summary. Using a Class statement rather than By would save sorting the data, but this process has the advantage of knowing how many values of Month, Template, Locality, and Rate band exist, and hence how much memory is needed to run the step.
 

Proc Format;
    InValue Temps ‘01’=1
                ‘02’=2
              Other=0;

    InValue Locs ‘01’=1
               ‘02’=2
             ‘03’=3
             ‘04’=4
             ‘05’=5
             ‘06’=5
              Other=6;

    Invalue Rat  ‘1’=1
             ‘2’=2
             ‘3’=3
            Other=0;

Data _Null_;
   Array Mins(12,2,6,3) 8;  /* Month, Template, Locality, Rate */
   infile Master Vsam Eof=Report;                                        
   Input @116 LivFinal $1.@;
   If LivFinal=’F’;
   Input @114 AccType $1.@;
   If AccType=’8’;
   Input @178 revstart pib2.                                    
   @180 revmths  pib2.
         @225 ghost      $5.@;
   if revmths ne 0 and ghost ne 'GHOST' then do;                
offset = 224 + revstart + 1;                              
do i = 1 to revmths;                                      
   input @offset       billdate $char4.                   
         @offset+11    withcall    pd3.                   
         @offset+14    plans       pd2.                   
         @offset+16    nocall      pd2.@;                 
         offset + 18;                                           
                                                                      
   If BillDate=:"1999" Then Do;                       
MonRef=Intck('MONTH',"01JAN99"D,
       Input(BillDate||'01',yymmdd8.));
      do j = 1 to withcall;                                
               Input @offset     Template    Temps.@;
               If not(Template) Then Continue;
               Input @offset+2   Locality    Locs.
                     @offset+6   RateBand    Rat.
                     @offset+17  Minutes     pd6.2@;      
         Mins(MonRef,Template,Locality,Rate) + Minutes;
         offset + 23;                                     
      end;             /* Of Do j=1 to withcall */         
      offset + (plans*31) + (NoCall*12);                   
   end;             /* If BillDate=:'1999' */             
         Else Offset + withcall*23 + plans*31 + nocall*12;      
      end; /* Of do i=1 to revmths ... */
   end; /* Of if revmths ne 0 ... */                            
   input ;                                                      
   end;  /* of do while (chkfld = keyfld) */ 
   Return;
Report:
   File Output;
   Tab=’05’x;
   do month=’JAN 99’,’FEB 99’,’MAR 99’,’APR 99’,’MAY 99’,’JUN 99’,
          ‘JUL 99’,’AUG 99’,’SEP 99’,’OCT 99’,’NOV 99’,’DEC 99’;
      do Tempso='PSTN','ISDN';                                        
          do Locso= 'Local        ', 'Regional', 'National',
                    'International', 'Mobile', 'Other';
             do Rato = 'Day    ','Evening','Weekend';
            put month tempso locso rato ‘Minutes’ Tab @;
         end;
        end;
      end;
   end;
   put ;
   do monref=1 to 12;
      do template=1 to 2;
         do locality=1 to 6;
            do rate=1 to 3;
               put mins(monref, template, locality, rate) tab @;
        end;
        end;
     end;
   end;
Run;

This example could easily be modified to produce totals per account by ;

•    Outputting the titles in a previous Data _Null_ step.
•    Changing the statement Mins(Monref, Template, Locality, Rate) + Minutes to
Mins(Monref,Template,Locality,Rate)=Sum(Mins(Monref, Template, Locality, Rate), Minutes). This takes the implied retain off the Mins variable, whilst ensuring that missing values are not created by adding the variable minutes (if this value is missing) by using the sum function.
•    Delete the Return statement. This will cause the totals for Mins to be output after each account.