Extract Numbers from a String using RPGLE

By NickLitten

October 12, 2016

RPG, CHAR, convert, extract, ILE, modernization, performance, snippet, XLATE

RPG EXAMPLE to read NUMERIC from ALPHAMERIC

AKA: how to select only the number from an address in a field, or file data

A little while ago, I wanted to extract out the numerics from a string of data:

For example, (a) if I feed in a phone number like “(540)-123 – 1234”, I just want to see “5401231234”; or (b) if I feed in “111 High Street, 5th District, Charleston, SC 29466”, I want to get “111529466” returned.

Obviously there are several ways of doing this, and a few techniques sprang to mind, so let's do some timings using common variable names. These code examples use two variables: a long varying-length variable with an address in it, and a variable defined as 15,0 that holds the returned numeric value. To do the timings, I read 10,000 address fields from a customer master file and fed this data into each of the following routines. This gave me an execution time long enough to do a rough timing with.

Assuming we have the long variable called longVariable with the address in it, and we want to extract the numbers and return them in the value called rtnValue, let's look at some examples:

METHOD 1 – Read through the address string, position by position to find the numeric values:

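// rtnValue must be a character field for this method (%Trim and + work on strings)
For X = 1 to %len(longVariable);
  StringChar = %subst(longVariable : X : 1);
  // keep the character only if it is a digit
  If StringChar >= '0' and StringChar <= '9';
    rtnValue = %Trim(rtnValue) + StringChar;
  Endif;
EndFor;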

10K results, each correctly returning “111529466”

Runtime – 2.15 seconds

METHOD 2 – Use %XLATE to remove all non-numerics then use %DEC to convert to numeric

// rmvThis lists the characters to strip; Blanks supplies a blank for each of them
stringClean = %xlate(rmvThis:Blanks:longVariable);
If stringClean <> *blanks;
  rtnValue = %dec(stringClean:30:0);
else;
  rtnValue = 0;
endif;

10K results, each correctly returning “111529466”

Runtime – 2.09 seconds

NOTE: I have a slight hesitation here because of the possibility of receiving unknown HEX data in the input string. %XLATE only replaces the characters listed in rmvThis, so it would leave an unexpected value in place and the extraction could fail.

METHOD 3 – use the ‘C’ library ATOF function to parse the string and return numerics

‘atof’ is a function in the C programming language that converts a string into a floating-point numerical representation. atof stands for ASCII to float.

rtnValue = %dech(atof(%trim(longVariable)):15:0);

10K results, each returning “111” **ERROR** It seems that ATOF only returns the first set of numbers it can find.

Runtime – 0.29 seconds
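
The “111” result is consistent with how atof is specified in standard C: it converts the longest leading part of the string that looks like a number and stops at the first character that does not fit. The sketch below illustrates that, assuming the ILE C atof follows the standard behaviour. The helper name leading_number is hypothetical and not part of the program timed above. It uses strtod, which atof is defined in terms of, so that the stopping position is visible.

```c
#include <stdlib.h>

/* Hypothetical helper: returns the leading number found in text and
   stores in *stoppedAt the offset at which parsing stopped.
   atof(text) returns the same value but does not report the offset. */
double leading_number(const char *text, size_t *stoppedAt)
{
    char *end = NULL;
    double value = strtod(text, &end);  /* parses only the numeric prefix */

    *stoppedAt = (size_t)(end - text);  /* 0 means no number at the start */
    return value;
}
```

Called with the sample address, this returns 111 and reports that parsing stopped at offset 3, the blank after “111”. With a string such as the phone number, which starts with “(”, it returns 0 because there is no leading number to convert.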

I could have written this into an array with a lookup, but I know that would be a slower solution than option (1). *IGNORED*

METHOD 4 – use SQL

Exec SQL Set :rtnValue = regexp_replace(trim(cast(:longVariable as varchar(29) ccsid 37)), '[^0-9]', '', 1, 0, 'i');

10K results, each correctly returning “111529466”

Runtime – 2.95 seconds

*SLOWEST*

METHOD 5 – Use IBM i (MI) CVTEFN (Convert external form to numeric)

The IBM i Operating System (the new version of the old AS/400, iSeries) has something that is at a lower level than the C language, and that is the Machine Interface (MI). In effect, this is IBM i Assembly language. While I don’t want to get into the virtues of MI programming, I do advocate using a bit of MI now and then, especially today, since you can call an MI instruction directly from within RPG IV.

The MI instruction CVTEFN (Convert external form to numeric) allows you to convert from character to numeric. The resulting value can be virtually any numeric data type and length. The documentation on CVTEFN states the following:

“Scans a character source for a valid decimal number in display format, removes the display character, and places the result in receiver.”

The syntax diagram for CVTEFN is as follows:

void _CVTEFN (_SPCPTR receiver, _DPA_Template_T *rcvr_descr, _SPCPTRCN source, unsigned int *src_length, _SPCPTR mask);

I had high hopes for this, but it failed repeatedly. Then I read in the IBM documentation that “results could not be trusted”. *SIGH*

This option is ignored.

And that's about it… I hope it helps somebody out!

  • Hey Nick, in Method 1,
    Eval rtnValue = %Trim(rtnValue) + StringChar;

    Why do you use Eval? Does it actually serve a purpose?

    Thanks Rob

  • Very interesting one to go through, Nick!
    I’m trying to do something like this and was wondering why we could not have another BIF like %SCAN, without the shortcoming of checking individual bytes. The first method crossed my mind, but I thought of checking for other good methods. I was not aware of methods 3, 4, and 5. Thanks for sharing it.

  • I made a slight change to the logic and am getting good results.
    I’m checking individual characters and extracting only the numerics into an array, in reverse, to get the correct numeric value when converted.
    This may not be the most efficient way, but where performance is not paramount, it can be tried.

    Dcl-Ds CharTelNum;
      CharTel char(1) Dim(20);
    End-Ds;
    Dcl-s contactPhone char(20) Inz(' (123) 456 7891');
    Dcl-s byte char(1) Inz(X'40');
    Dcl-c digits '1234567890';
    Dcl-s #i Int(10) Inz(0);
    Dcl-s #j Int(10) Inz(0);
    Dcl-s Pos int(10);
    Dcl-s PhoneNbr packed(20:0);

    #j = 20;
    clear CharTelNum;
    // reading in reverse to get correct value not right aligned value when converted to numeric.
    // %trimr (not %trim) keeps any leading blank, so #i is a true position in contactPhone
    for #i = %len(%trimr(contactPhone)) by 1 DOWNTO 1;
      byte = %subst(contactPhone: #i:1);
      Pos = %check(digits :byte);   // 0 means byte is one of the digits
      If Pos=0;
        CharTel(#j) = %subst(contactPhone: #i:1);
        #j = #j-1;
      endIf;
    endFor;

    // Value gets into the array
    CHARTEL OF CHARTELNUM(11) = '1'
    CHARTEL OF CHARTELNUM(12) = '2'
    CHARTEL OF CHARTELNUM(13) = '3'
    CHARTEL OF CHARTELNUM(14) = '4'
    CHARTEL OF CHARTELNUM(15) = '5'
    CHARTEL OF CHARTELNUM(16) = '6'
    CHARTEL OF CHARTELNUM(17) = '7'
    CHARTEL OF CHARTELNUM(18) = '8'
    CHARTEL OF CHARTELNUM(19) = '9'
    CHARTEL OF CHARTELNUM(20) = '1'

    If CharTelNum <> *Blanks;
      PhoneNbr = %Dec(CharTelNum:20:0);
    endIf;
