has 3 variables "B29" for month, of type "Number", Length 8, Format Best12., Informat F12., which shows value in the form of "1" through "12", representing month: January through December.

"B29_B" for day, of type "Number", Length 8, Format Best12., Informat F12., which shows value in the form of "1" through "31", representing days.

"B29_C" for year, of type "Number", Length 8, Format Best12., Informat F12., which shows value in the form of "1990" through "2016", representing years.

At time point 2 (later date)

has just 1 variable "ENDDATE", type "Text", Length 48, Format $F48., Informat $48., which shows value in the form of "20160125", for example.

Question 1

So to combine the 3 variables at time point 1 to be 1 variable in the same form of the variable "ENDDATE", which SAS function should be the best? Or should they (both time point 1 and time point 2 variables) be both converted to a third format?

Will the codes below be good? Or needs to be modified or changed to a better approach?

data want;

set have;

date=input(catx("",month,day,year),mmddyy10.);

format date mmddyy10.;

run;

Question 2

To calculate difference of two columns of dates after conversion in the same format, which SAS function should be the best? I found DATDIF and INTCK, but don't know which or if there's a better way to proceed, based on the ideal date format to be chosen in my question 1 above.

If I try to adapt the codes below, should the changes be just the variable names?

Re: calculate difference of two dates in different format

SAS dates are stored in numeric variables as the count of days since 01jan1960. The format applied to this numeric variable then determines how this number gets printed (it doesn't change the internal values).

So first thing you want to do, is to convert text strings representing a date to a SAS date numeric value stored in a numeric variable.

You can use an input statement with the appropriate informat for this, or there is also an mdy() function you can use.

Once you've got both dates represented as SAS dates (so number of days since 1/1/1960) you can actually simply substract the values from each other to get the days. You don't need any function. Having said that: Both functions you've used in your sample code would work so no harm to take this approach.

Re: calculate difference of two dates in different format

I used some similar codes (see below) which work fine for the cases when B29,B29_B,B29_C are all non-missing. However, when only year is available (case 1), or only year and month are available (case 2), the codes will just skip these and do not calculate the difference in days.

My goal actually is to classify the difference in days into 2 groups (Yes and No): Yes for 0 =< diffdays <=30No for diffdays >30

So I modified the codes by adding the lines in bold to find the difference in year, but that will not resolve the whole issue for both cases, and needs to be further investigated.

Also, if I just want to do a count for Case 1 (the cases when only year is available), and also a count for Case 2 (the cases when only year and month are available), how do I do that? proc freq, table, where=? Based on the counts/frequency, we can access and decide how to treat the cases with the missing days/months.