I am trying to extract a substring from a field that has up to 2000 characters. I need to capture whatever is between "Destination(s):" and "." The following works, but is there a more elegant way to do this? I feel like I'm missing something obvious.
Case
When FIND(t1.Comment_ID,"Destination(s):") > 1 then STRIP(SUBSTR(t1.Comment_ID,FIND(t1.Comment_ID,"Destination(s):")+16,FIND(SUBSTR(t1.Comment_ID,FIND(t1.Comment_ID,"Destination(s):")+16,200),".")-1))
Else ""
End
Thanks,
Lisa
Another alternative is to use PRXPARSE. Here is an example.
data parse_data;
retain re;
text = 'this is some text. Destination(s): more text.';
if(_N_ = 1) then re = PRXPARSE("/(?<=Destination\(s\):)(.*)(?=\.)/io");
if(prxmatch(re, text) ) then new_text = strip(prxposn(re, 1, text) );
drop re;
run;
proc print;
run;
Another alternative is to use PRXPARSE. Here is an example.
data parse_data;
retain re;
text = 'this is some text. Destination(s): more text.';
if(_N_ = 1) then re = PRXPARSE("/(?<=Destination\(s\):)(.*)(?=\.)/io");
if(prxmatch(re, text) ) then new_text = strip(prxposn(re, 1, text) );
drop re;
run;
proc print;
run;
Thanks, this is along the lines of what I was, it just totally slipped my mind. Appreciate it!
I would like to mention that if you have data like this:
text = 'this is some text. Destination(s): more text.xxxxx.';
And you could get this :
It that what you are looking for?
If I were you ,I would prefer to use Tom 's code .
One simple thing to make it easier is to use a data step instead of SQL code.
If SQL you could store the location into another variable and use the CALCULATED keyword to reference it.
If you want everything up to a period then use SCAN().
....
find(t1.Comment_ID,"Destination(s):") as loc
, case when (calculated loc > 0)
then left(scan(substr(t1.Comment_ID,calculated loc+16),1,'.'))
else ' '
end as WANT length=2000
options parmcards=x;
filename x temp;
parmcards;
whatever Destination(s): and and . The following and works, but is there a mor Destination(s): or or . The following
whatever Destination(s): and and . The following and works, but is there a mor Destination(s): or or . The following
;
data want;
infile x length=len;
input x $varying2000. len;
n+1;
pid=prxparse('/Destination\(s\):.+?\./i');
s=1;e=length(x);
call prxnext(pid,s,e,x,p,l);
do while(p>0);
want=scan(substr(x,p,l),2,':.');
output;
call prxnext(pid,s,e,x,p,l);
end;
keep n want;
run;
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.