ibmi-brunch-learn

find the number of occurrences of a string in a text


  • find the number of occurrences of a string in a text

    Hi,

    I have used REGEXP_COUNT to get the count of a string in a sentence, as shown below, and it was working fine in V7R2:

    select REGEXP_COUNT (trim('mastiff is stiffany '),trim(' iff '))
    from SYSIBM/SYSDUMMY1
    with NC

    The result will be 2.

    However, this SQL statement won't work on older releases.

    So I am replacing this function with similar RPGLE logic. Can you please help with logic to get the count of a particular word in a string?
    The search words are set up in a PF, and the string is entered interactively by the user in a display file. The search words need to be taken one by one from the PF and searched for in the string entered on the screen, to check whether each one is present. So basically I am looking to find, using RPGLE code, how many times a particular word stored in a PF occurs in the string.
    Below is my attempt to replace REGEXP_COUNT with RPG logic, but the code is not working as expected. Can you please help with logic to get the count?

    D W1string S 50a inz('mastiff is stiffany ')
    D wkpos s 3p 0
    D W1Tiff s 50a inz('iff ')
    D W2Tiff s 4a
    D W2string S 50a
    D W3string S 50a
    D Wkcount S 2P 0

    D Wkupper C const('ABCDEFGHIJKLMNOPQRSTUVWXYZ')
    D wklower C const('abcdefghijklmnopqrstuvwxyz')
    /free
    eval w3string = %xlate(wklower:wkupper:w1string);
    eval w2Tiff = %xlate(wklower:wkupper:w1Tiff);
    eval wkpos = %scan(w2Tiff:W3String);

    Dow wkpos <> *Zero;
    Eval wkcount = wkcount + 1;
    eval wkpos = wkpos + %len(w2Tiff);
    eval wkpos = %scan(w2Tiff:W3String:wkpos);
    Enddo;

    dsply wkcount;
    *inlr = *on;

    This is my sample code which is not working as expected.



  • #2
    Your code advances the search position by the entire length of the searchWord after each hit, so overlapping matches are skipped. A searchWord such as 'iii' will therefore not return the expected count when searching the string 'iiiiiiiiii'.

    If you change the line eval wkpos = wkpos + %len(w2Tiff); to eval wkpos = wkpos + 1; it should work.
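
    To see the effect of the step size, here is a minimal sketch (not taken from the original program; the variable names are made up for illustration). It counts 'iii' in 'iiiiiiiiii' twice: once restarting one character after each hit, which counts overlapping matches, and once restarting after the end of each hit, which counts only non-overlapping matches. The checks before each %scan assume that a start position beyond the end of the source string would be an error, so they stop the loop first.

    ```rpgle
           dcl-s text          varchar(50) inz('iiiiiiiiii');
           dcl-s word          varchar(10) inz('iii');
           dcl-s pos           int(10);
           dcl-s countOverlap  int(10) inz(0);
           dcl-s countDistinct int(10) inz(0);

           // Overlapping matches: resume one character after each hit
           pos = %scan(word: text);
           dow pos > 0;
             countOverlap += 1;
             if pos >= %len(text);
               leave;              // nothing left to scan
             endif;
             pos = %scan(word: text: pos + 1);
           enddo;

           // Non-overlapping matches: resume after the end of each hit
           pos = %scan(word: text);
           dow pos > 0;
             countDistinct += 1;
             if pos + %len(word) > %len(text);
               leave;              // the hit ended at the last character
             endif;
             pos = %scan(word: text: pos + %len(word));
           enddo;

           dsply ('Overlapping: ' + %char(countOverlap));
           dsply ('Non-overlapping: ' + %char(countDistinct));
           *inlr = *on;
    ```

    On this input the first loop counts 8 and the second counts 3. For 'iff' in 'mastiff is stiffany' both loops give 2, because the matches there do not overlap.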


    Or, you can try something like this...

    Code:
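           dcl-s searchString       char(50) inz('mastiff is stiffany ');
           dcl-s searchStringLen    int(5)   inz(*ZERO);
           dcl-s searchWord         char(50) inz('iff');
           dcl-s searchWordLen      int(5)   inz(*ZERO);
    
           dcl-s count              int(5)   inz(*ZERO);
           dcl-s i                  int(5)   inz(1);
    
           dcl-c UPPER_CHARS        'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
           dcl-c LOWER_CHARS        'abcdefghijklmnopqrstuvwxyz';
    
           // Significant lengths, ignoring the trailing blank padding
           searchStringLen = %len(%trimr(searchString));
           searchWordLen   = %len(%trimr(searchWord));
    
           // Upper case the strings (%xlate with these constants covers A-Z only)
           searchString = %xlate(LOWER_CHARS: UPPER_CHARS: searchString);
           searchWord   = %xlate(LOWER_CHARS: UPPER_CHARS: searchWord);
    
           // Test every start position where the word still fits,
           // including the last one (hence <= rather than <)
           dow searchWordLen > 0
               and i - 1 + searchWordLen <= searchStringLen;
    
             // The shorter operand is blank padded for the comparison
             if %subst(searchString: i: searchWordLen) = searchWord;
    
               count += 1;
             endif;
    
             i += 1;
           enddo;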
    Walt



    • #3
      Here's a modified version that seems to work. I used varying fields, as that makes it easier and saves worrying about non-significant trailing spaces. I'd normally trim the fields on loading them, but from your search criteria it appeared that a trailing space was significant. Because Dsply is used for input, this code will not allow a trailing space in the search string. Note that I reduced the field lengths to 30 to fit Dsply.

      Code:
           D wkpos           s              3p 0
      
           D W1Tiff          s             30a   varying
           D W2Tiff          s             30a   varying
      
           D W1string        S             30a   varying
           D W2string        S             30a   varying
      
           D Wkcount         S              2P 0
      
           D startPos        S              5i 0 inz(1)
      
           D Wkupper         C                   'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
           D wklower         C                   'abcdefghijklmnopqrstuvwxyz'
      
              Dsply ('Pattern?') ' ' W1Tiff;
              Dsply ('Input?') ' ' W1String;
      
              w2string = %xlate( wklower: wkupper: w1string );
              w2Tiff = %xlate( wklower: wkupper: w1Tiff );
      
             Dou ( wkpos = *Zero )
                Or   ( startPos - 1 + %Len( W2Tiff ) ) > %Len( W2string ) ;
      
                wkpos = %scan( w2Tiff: W2String: startPos );
                If wkpos <> 0;
                   startPos = wkpos + %len( w2Tiff );
                   wkcount += 1;
                Endif;
             Enddo;
      
             dsply ( 'Found ' + %Char(wkcount) + ' in ' + W1string);
             *inlr = *on;
      Have you looked at using RegEx directly within RPG? Since you appear to be familiar with it, it would probably work well for you.



      • #4
        And this version is just because using fixed-form D-specs bugs me <grin>

        Code:
               dcl-s  wkpos  int(5);
        
               dcl-s  W1Tiff  varchar(30);
               dcl-s  W2Tiff  varchar(30);
        
               dcl-s  W1string  varchar(30);
               dcl-s  W2string  varchar(30);
        
               dcl-s  Wkcount  int(5);
               dcl-s  startPos int(5) inz(1);
        
               dcl-c  Wkupper  'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
               dcl-c  wklower  'abcdefghijklmnopqrstuvwxyz';
        
               Dsply ('Pattern?') ' ' W1Tiff;
        
               Dsply ('Input?') ' ' W1String;
        
               w2string = %xlate( wklower: wkupper: w1string );
               w2Tiff = %xlate( wklower: wkupper: w1Tiff );
        
               Dou ( wkpos = *Zero )
                  Or  ( startPos - 1 + %Len( W2Tiff ) ) > %Len( W2string );
        
                  wkpos = %scan( w2Tiff: W2String: startPos );
                  If wkpos <> 0;
                     startPos = wkpos + %len( w2Tiff );
                     wkcount += 1;
                  Endif;
               Enddo;
        
               dsply ( 'Found ' + %Char(wkcount) + ' in ' + W1string);
               *inlr = *on;
        I also changed the packed counts to integers - just more efficient.
        Last edited by JonBoy; August 29, 2018, 05:12 PM.



        • #5
          And an alternative to using XLATE for upper case conversion:

          Code:
          exec sql set :w2string = upper(:w1string);
          exec sql set :w2Tiff = upper(:w1Tiff);
          This way it would also convert less common case-sensitive characters, such as accented characters (à á â etc.).
          I think this would work for OS versions at least as old as V5R4.



          • #6
            Squeeze out the target string and do some math.

            Code:
            D BigString       s            100a   Varying  
            D FindStr         s             10a   Varying  
            D BeginLen        s             10i 0          
            D EndLen          s             10i 0          
            D Count           s             10i 0          
            
            BigString = '1abc23abc456abc-xxxx-abcd' ;                       
            BeginLen = %Len(BigString) ;                                    
            FindStr = 'abc' ;                                               
            Exec SQL Set : BigString = upper(: BigString) ;                 
            Exec SQL Set : FindStr = upper(: FindStr) ;                     
            
            Exec SQL Set : BigString = Replace(: BigString, : FindStr ,'') ;
            EndLen = %Len(BigString) ;                                      
            // (25 - 13) / 3 = 12 / 3 = 4
            Count = (BeginLen-EndLen)/%Len(FindStr) ;
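
            The same squeeze-and-divide idea can also be sketched without embedded SQL. This is a minimal sketch, assuming a release that has the %SCANRPL built-in and that %SCANRPL accepts an empty replacement string (which, as far as I know, removes every occurrence of the scan string). The variable names are made up for illustration, and unlike the SQL version above there is no upper casing, so the match is case sensitive.

            ```rpgle
                   dcl-s bigString varchar(100) inz('1abc23abc456abc-xxxx-abcd');
                   dcl-s findStr   varchar(10)  inz('abc');
                   dcl-s squeezed  varchar(100);
                   dcl-s count     int(10) inz(0);

                   // Guard against an empty search string (avoids dividing by zero)
                   if %len(findStr) > 0;
                     // Remove every occurrence of findStr from a copy of the string
                     squeezed = %scanrpl(findStr: '': bigString);
                     // Characters removed divided by the length of one occurrence
                     count = (%len(bigString) - %len(squeezed)) / %len(findStr);
                   endif;

                   dsply ('Count: ' + %char(count));
                   *inlr = *on;
            ```

            With the sample values the lengths are 25 before and 13 after, so the count is 4. Like the Replace approach, this counts non-overlapping occurrences only.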
            Ringer

