Email Validation in RPG

  • Email Validation in RPG

    It's not easy. A pretty thorough search of the internet turned up some code that would not compile. In theory, if it could compile, we are still looking at 200 lines of code or so to call a C++ function, with all of its header specifications, to execute a regular expression. A regular expression is a preferred way to analyze what is in a string of characters. In this one case, the goal is to validate an email address.

    In other languages like JavaScript, it is so much simpler. Copy the code below into a text editor. Save it with an .html extension. Run it in a browser. If you get a warning about running active content, ignore it.

    <html>
    <head>
    <title>regextest.html regular expression test for email addresses</title>
    <script>
    // Returns true or false
    function isValidEmail(sText) {
      var reEmail = /^(?:\w+\.?)*\w+@(?:\w+\.)*\w+$/;
      return reEmail.test(sText);
    }
    </script>
    </head>
    <body>
    <script>
    // Each alert shows the address followed by the validation result
    var sText = "arnold@arnold.com";
    alert(sText + " | " + isValidEmail(sText));
    sText = "arnold@arnold.com.de";
    alert(sText + " | " + isValidEmail(sText));
    sText = "arnold@arnold..com.de";
    alert(sText + " | " + isValidEmail(sText));
    </script>
    <br>
    <p> Done </p>
    </body>
    </html>

    You should get a succession of alert boxes that say "true", "true" and "false". The value of "true" indicates a valid email address.

    There is nothing regular about the regular expression var reEmail = /^(?:\w+\.?)*\w+@(?:\w+\.)*\w+$/;

    Everything between the forward slashes "/" is the expression. It's arcane, but it says the following: \w matches letters of either case, digits and the underscore, so upper and lower case are both accepted; there may be multiple word groups followed by periods on either side of the at sign "@"; the address cannot begin or end with a period "."; and there cannot be more than one period in a row.
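To see those rules in action, here is a small sketch that can be run in Node.js or pasted into a browser console. The sample addresses are made up for illustration; only the pattern comes from the post.

```javascript
// Same pattern as in the post above.
const reEmail = /^(?:\w+\.?)*\w+@(?:\w+\.)*\w+$/;

// Hypothetical sample addresses, one per rule described above.
const samples = [
  "Arnold@Arnold.COM",           // mixed case: \w covers upper and lower case
  "first.last@mail.example.com", // several word groups separated by periods
  ".arnold@arnold.com",          // begins with a period
  "arnold@arnold.com.",          // ends with a period
  "arnold..x@arnold.com",        // two periods in a row
  "arnold.arnold.com",           // no @ sign at all
];

for (const s of samples) {
  console.log(s + " | " + reEmail.test(s));
}
```

The first two samples should report true and the remaining four false.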

    For more information, visit http://www.regular-expressions.info/email.html

    Regular expression support has been added to DB2 400 structured query language (SQL) as of release 7.1. SQL code can be embedded in an SQL RPG program as shown below:

         H option(*nodebugio:*srcstmt)

          * Parameters
         d this            PR                  ExtPgm('VLDEML')
         d                               80a
         d                               10i 0

         d this            PI
         d emailIn                       80a
         d theCount                      10i 0

         d theLength       s             10i 0
         d theEmail        s             80a

          /free
           exec sql set option commit=*none, datfmt=*iso;

           theCount = 0;
           theEmail = emailIn;

           // REGEXP_COUNT returns 1 when the trimmed address matches
           exec sql
             SELECT REGEXP_COUNT(
                      trim(:theEmail),
                      '^(?:\w+\.?)*\w+@(?:\w+\.)*\w+$')
               INTO :theCount
               FROM sysibm/sysdummy1;

           return;
          /end-free

    Save this code as VLDEML with a member type of SQLRPGLE. Pass the program an email address; it returns theCount as 0 or 1. A value of 1 means the email address is valid, and 0 means it is not.

    Here is another example of why we may want to modernize our RPG programs. Few if any new enhancements are coming for RPG (or COBOL for that matter). SQL is a modern language and has been updated constantly. Plus, SQL skills on the DB2 400 platform translate nicely to other databases like MS Access, Oracle, MySQL and so on.

    If the operating system for the IBM i is not 7.1 or greater, you will have to "roll your own" validation routine. Some basics would include:

    One @ sign
    At least one period after the @ sign
    Cannot begin or end with a period
    Cannot have two periods in a row
    No special characters like double quotes, forward slash, backward slash and so on.
    The international rules for email address syntax actually allow question marks "?", single quotes and other special characters. In reality, the rules for many domains like gmail.com or yahoo.com will not allow special characters, with the exception of the underscore character "_" and perhaps some others.
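As a rough sketch of what such a hand-rolled routine could look like, here are the basic checks listed above, written in JavaScript for brevity. The function name isPlausibleEmail and the exact set of rejected characters are assumptions made for illustration. It implements only the listed basics and nothing more; the same sequence of tests could be coded in RPG with %SCAN and %SUBST.

```javascript
// Minimal sketch of the hand-rolled checks listed above.
// isPlausibleEmail is a hypothetical name, not from any library.
function isPlausibleEmail(raw) {
  const addr = raw.trim();

  // One @ sign, with something on both sides of it.
  const parts = addr.split("@");
  if (parts.length !== 2) return false;
  const [local, domain] = parts;
  if (local.length === 0 || domain.length === 0) return false;

  // At least one period after the @ sign.
  if (!domain.includes(".")) return false;

  // Cannot begin or end with a period.
  if (addr.startsWith(".") || addr.endsWith(".")) return false;

  // Cannot have two periods in a row.
  if (addr.includes("..")) return false;

  // No double quotes, forward or backward slashes, or embedded blanks
  // (this character set is an assumption; adjust it to your own rules).
  if (/["\/\\\s]/.test(addr)) return false;

  return true;
}

console.log(isPlausibleEmail("harry.chen@funnythings.com.cn"));
console.log(isPlausibleEmail("arnold@arnold"));
```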

    In theory, you could ping the domain name after the @ sign, but that assumes the network is always up, and email addresses for customers, vendors or employees may be entered offline. A ping may also take several seconds to complete.

    Finally, be aware that new top-level domains are being added as time goes on. In addition to .com, .net and .org, you also have .tv and .info, as well as country codes that might follow, as in: harry.chen@funnythings.com.cn. The code posted here will handle those eventualities.


  • #2
    Two jobs ago, the Accounts Receivable director told me that they were having trouble with email addresses that her staff entered. Mostly it was stupid keying errors. I wrote an RPG routine that did some minimal checking and that took care of most of the problems. Anyone who wants it can get it from the IT Jungle web site.

    Error-Checking Email Addresses, for Intelligent People



    • #3
      I use regular expressions to check mail addresses. My program is based on Scott Klement's MAILCHK, and the regular expression is from Markus Sipila:
      http://www.markussipila.info/pub/ema...ction=validate

      My actual expression is:

      pattern = '^[A-Z0-9_\+-]+(\.[A-Z0-9_\+-]+)*' +
      '@' +
      '[A-Z0-9-]+(\.[A-Z0-9-]+)*\.([A-Z]{2,13})$';

      The expression has been adapted to handle the new top-level domains, such as .international.

      The check is only formal. I don't check if the mail-address exists.
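For readers who want to try the pattern outside RPG, here is an approximate JavaScript rendering. It is a sketch, not a byte-for-byte port: the `i` flag stands in for REG_ICASE, and JavaScript treats `\+` inside brackets as a plain plus sign. The sample addresses are made up.

```javascript
// Approximate JavaScript rendering of the pattern above.
// The "i" flag plays the role of REG_ICASE in the RPG version.
const mailPattern = new RegExp(
  "^[A-Z0-9_\\+-]+(\\.[A-Z0-9_\\+-]+)*" +
  "@" +
  "[A-Z0-9-]+(\\.[A-Z0-9-]+)*\\.([A-Z]{2,13})$",
  "i"
);

// Hypothetical sample addresses.
const samples = [
  "jan@example.ch",
  "first-name+tag@my-domain.example.international", // 13-letter top-level domain
  "jan@example",                                    // no top-level domain
  "jan..x@example.com",                             // two periods in a row
];

for (const s of samples) {
  console.log(s + " | " + mailPattern.test(s));
}
```

The first two samples should pass and the last two should fail. The {2,13} quantifier is what bounds the length of the top-level domain.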

      Please note that the correct setting of the system value QLOCALE is essential. I live in Switzerland and use /QSYS.LIB/DE_CH.LOCALE.

      Best regards

      Jan



      • stevenkontos
        Jan, this is very good. I could not get MAILCHK to compile. Perhaps you can post how you did it with a copy of the code. Also, can you explain how QLOCALE is essential?

    • #4
      What problems did you have getting it to compile?



      • stevenkontos
        Msg id Sv Number Seq Message text
        *RNF7030 30 259 002700 The name or indicator PATTERN is not defined.
        *RNF7030 30 242 001000 The name or indicator REGPATTERN is not defined.
        *RNF7030 30 243 001100 The name or indicator REGSTRING is not defined.
        *RNF7030 30 242 001000 The name or indicator STDSTR is not defined.

      • Scott Klement
        stevenkontos wrote:

        Msg id Sv Number Seq Message text
        *RNF7030 30 259 002700 The name or indicator PATTERN is not defined.
        *RNF7030 30 242 001000 The name or indicator REGPATTERN is not defined.
        *RNF7030 30 243 001100 The name or indicator REGSTRING is not defined.
        *RNF7030 30 242 001000 The name or indicator STDSTR is not defined.

        In MAILCHK, the definition of "pattern" is very plain:

        D pattern s 50A varying


        The other fields you list here do not exist in MAILCHK, they must've been added to your copy by someone else after MAILCHK was published. There are no references to fields with those names in the original MAILCHK program.

    • #5
      Hello Steven
      Scott Klement had an article on checking mail addresses with regular expressions:

      http://iprodeveloper.com/rpg-program...lar-expression

      It ends with a note on problems with CCSIDs and regular expressions. After contacting IBM on this subject, I learned that the problem can be solved by setting the system value QLOCALE correctly.

      Here comes my subprocedure:


      *
      ********************************************************************
      *
      * Check Mail-Address
      * ------------------
      * Scott Klement, July 13, 2006
      * This program is based on work of Scott Klement (MAILCHK)
      *
      * The Regular Expression has been created by Markus Sipilä
      * http://www.markussipila.info/pub/ema...ction=validate
      *
      *
      * To reset program (clear variables, free compiled copy
      * of regular expression, etc.) call with no parameters.
      *
      *
      ********************************************************************

           P GISA_isMailAddressValid...
           P                 B                   EXPORT
           D                 PI              n
           D p0EmailAddr                   60a   const options(*nopass)

            * compiled and reg are STATIC so that the compiled expression
            * survives between calls; locals in a subprocedure are
            * otherwise re-initialized on every call.
           D compiled        s               N   inz(*OFF) static
           D valid           s               N   inz(*OFF)
           D pattern         s            150A   varying
           D EmailAddr       s             60A
           D reg             ds                  likeds(regex_t) static
           D match           ds                  likeds(regmatch_t)
           D rc              s             10I 0

            /free

             // --------------------------------------------------
             // If called with no parameters, clean everything
             // up and exit the program.
             // --------------------------------------------------

             if (%parms = 0);
               regfree(reg);
               compiled = *Off;
               *inlr = *on;
               return *off;
             endif;

             // --------------------------------------------------
             // Compile the regular expression
             // (This is only done once, on the first call.)
             // --------------------------------------------------

             if (not Compiled);
               pattern = '^[A-Z0-9_\+-]+(\.[A-Z0-9_\+-]+)*' +
                         '@' +
                         '[A-Z0-9-]+(\.[A-Z0-9-]+)*\.([A-Z]{2,13})$';

               rc = regcomp( reg
                           : %trim(pattern)
                           : REG_EXTENDED + REG_ICASE + REG_NOSUB);
               if rc <> 0;
                 GISA_FatalError(rc:reg);
               endif;

               compiled = *on;
             endif;

             // --------------------------------------------------
             // Check the e-mail address against the regular
             // expression.
             // --------------------------------------------------

             EmailAddr = p0EmailAddr;

             if (regexec( reg
                        : %trim(EmailAddr)
                        : 0
                        : match
                        : 0 ) = 0);
               valid = *on;
             else;
               valid = *off;
             endif;

             if valid;
               return *on;
             else;
               return *off;
             endif;

            /end-free
           P GISA_isMailAddressValid...
           P                 E

      Please excuse the poor format.

      You can find regex_h here in case you don't have it already:

      http://www.scottklement.com/rpg/copy...ex_h.rpgle.txt

      Best regards

      Jan



      • #6
        Jan, I hate to be dense, but where is regex_h in your sample? Is this created as a module or as a program? What would the CALL or CALLP look like? I know you modified Scott Klement's original program, but I can't find his original programming on his site or via Google search. Do you have that URL?
        Last edited by stevenkontos; March 7, 2017, 11:58 AM.



        • #7
          Hello Steven,

          My subprocedure is in a service program. I use a /copy regex_h at the beginning of the service program to pull the copybook in.

          Invoke the subprocedure as a function, without CALL or CALLP. I recommend writing a prototype and putting it in a copybook. If you use the subprocedure internally in a program and you are on 7.1 or later, you don't need a prototype.

          An example of using the subprocedure:

          if GISA_isMailAddressValid('myMailAddress@myDomain.com');
          // place code for valid mail addresses here
          else;
          // place code for invalid mail addresses here
          endif;

          Another use of the subprocedure could be:

          Define a variable named valid as an indicator (boolean):

          valid = GISA_isMailAddressValid('myMailAddress@myDomain.com');
          if valid;
          // place code for valid mail addresses here
          else;
          // place code for invalid mail addresses here
          endif;


          Please find the original MAILCHK program from Scott Klement attached.

          Best regards

          Jan

          52826_80_MailChk.zip
          Attached Files



          • #8
            I know this is an old discussion but this simple variation might help someone:


            exec sql set :myCount = regexp_count( :myEmail, '^(?:\w+\.?)*\w+@(?:\w+\.)*\w+$' );

            if myCount = 1;
            // email is good
            else;
            // email is malformed
            endif;

            EDITED to mention that I use varchar wherever possible. If you are using fixed-length fields, you will need to wrap the host variable in the SQL TRIM function, as in TRIM(:myEmail), to remove trailing blanks.

            dcl-s myEmail varchar(254); // longest length of an email address
            dcl-s myCount uns(10); // the count doodah
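The effect of the trailing blanks is easy to reproduce outside RPG. In this small JavaScript sketch, padEnd simulates a fixed-length 80-character field (the field length is an assumption borrowed from the first post):

```javascript
const reEmail = /^(?:\w+\.?)*\w+@(?:\w+\.)*\w+$/;

// Simulate an 80-byte fixed-length field: the value is padded with blanks.
const fixedLength = "arnold@arnold.com".padEnd(80, " ");

// The trailing blanks sit between the address and the $ anchor, so the match fails.
console.log(reEmail.test(fixedLength));        // false

// Trimming first removes the padding and the match succeeds.
console.log(reEmail.test(fixedLength.trim())); // true
```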

            Last edited by NickLitten; June 14, 2018, 08:19 AM.



            • WilliamTasker
              p.s. Tried the OP's code, which is exactly the same except with an added TRIM() and that works. So you need the TRIM()

            • WilliamTasker
              p.p.s Doing a wider test over a larger number of emails I find only one fault with the original expression: it seems to object to dashes in email addresses, but they are quite legitimate. For instance, "src-americas@schneider-electric.com" is a real email address with dashes both before and after the "@".
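The dash problem comes from \w, which covers letters, digits and the underscore but not "-". One possible adjustment, shown here as a JavaScript sketch, is to swap each \w for [\w-]. This variation is an assumption for illustration, not something posted in the thread, and it is looser than it looks: it also accepts a dash as the first or last character.

```javascript
const original = /^(?:\w+\.?)*\w+@(?:\w+\.)*\w+$/;

// Hypothetical variation: [\w-] accepts a dash wherever \w was accepted.
const withDashes = /^(?:[\w-]+\.?)*[\w-]+@(?:[\w-]+\.)*[\w-]+$/;

const addr = "src-americas@schneider-electric.com";

console.log(original.test(addr));   // false: "-" is not matched by \w
console.log(withDashes.test(addr)); // true
```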

            • NickLitten
              Note that this example uses a very simple regex. The regular expression '^(?:\w+\.?)*\w+@(?:\w+\.)*\w+$' says: check that the input string has something before an @ sign, followed by something dot something. But you can use any regex you like to enforce the rules. I blogged about a much stricter regex (official standard, RFC 5322) and it looks like this:

              General Email Regex (RFC 5322 Official Standard)

              (?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])

          • #9
            So I'm trying to implement the General Email Regex above. What is the best/easiest way to store that string for use in SQLRPGLE?

            OK, so I figured out how to set up the General Email Regex above (I copied it from http://emailregex.com/).

            In case anyone else wants to use this... the above does not like upper-case letters. So my code (adapted from NickLitten) looks like this:

            Code:
            
                   dcl-proc  IF_ValidEmail    export;
            
                   dcl-pi IF_ValidEmail       ind;
                     inEmail                  varchar(256) const;
                   end-pi;
            
                   dcl-s myEmailRegex    varchar(512);
                   dcl-s myCount         uns(10);
            
                   dcl-s q               char(1) inz('''');
            
                  // --------------------------------------------------------------------
            
            
                   myEmailRegex =
                    '(?:[a-z0-9!#$%&'+q+'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'+q+'*+/=?^_`{|' +
                    '}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\' +
                    'x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])' +
                    '?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[0' +
                    '1]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-' +
                    '9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\' +
                    'x01-\x09\x0b\x0c\x0e-\x7f])+)\])';
            
                   Exec Sql
                     Set :myCount = regexp_count(lower(:inEmail), :myEmailRegex);
            
                   If myCount = 1;
                     return *on;
                   else;
                     return *off;
                   endif;
            
                   end-proc;
            Last edited by gwilburn; April 3, 2019, 11:39 AM.
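One thing that may be worth checking with this approach: the pattern has no ^ and $ anchors, and REGEXP_COUNT counts matches found anywhere in the string. A count of 1 therefore seems to mean "one address-like piece was found", not "the whole value is an address". The JavaScript sketch below illustrates the difference with a much simpler stand-in pattern; both the stand-in and the countMatches helper are assumptions for illustration, not the RFC 5322 expression.

```javascript
// Counts non-overlapping matches, roughly what REGEXP_COUNT reports.
// countMatches is a hypothetical helper for this illustration.
function countMatches(pattern, text) {
  return [...text.matchAll(new RegExp(pattern, "g"))].length;
}

const core = "[a-z0-9.]+@[a-z0-9.]+"; // stand-in pattern, illustration only
const input = "not an address: bob@example.com !!";

console.log(countMatches(core, input));             // 1: found inside the text
console.log(countMatches("^" + core + "$", input)); // 0: whole string must match
```

If whole-string validation is the goal, wrapping the expression as '^(?:' ... ')$' before passing it to regexp_count would be one way to get it.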

