ibmi-brunch-learn

Announcement

Collapse
No announcement yet.

Check if a string has certain characters

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • Check if a string has certain characters

    Hello,

    I have a string that has text in it like 'Description' and I want to check it doesn't contain the following 3 characters; . ' "

    After googling and testing myself, as far I can see the only options I have are;

    1. Use xlate (which would involve putting every allowed value into a string to compare... not feasible)
    2. Use SCAN on each character

    Surely there's a way of doing this on 1 line?

    Cheers,
    Ryan

  • #2
    Use bif %check()

    Comment


    • #3
      SCANRPL can be used to find certain characters in your string and replace them with something else, even '' (nothing), i.e. remove them.

      Comment


      • #4
        A separate %scan for each would probably be the smallest and most efficient way.

        %check would require checking for every allowed character.



        As an alternative, you can use embedded SQL and Regex. It uses one statement (technically), but as it's SQL it's a multi line statement. It scales well though, so the more characters you need to check for, the better this is.


        Are you familiar with Regex (Regular Expressions)? If not - it's a syntax for text pattern matching. This is a good site for testing Regex expressions: https://regex101.com/

        iSeries supports it in SQL via the REGEXP_LIKE() function.

        Like many SQL functions, it can be used in embedded RPGLE. However this function returns SQL true or false, so you need to wrap it in a case statement so it returns a value you can store in an RPGLE host variable.

        I had a string that was supposed to be a character representation of a hex value, with each pair of characters separated by a hyphen. E.g 'A1-DF-5C-B8-...'
        Where the length was variable, but would include at least 1 hex value (one pair of A-Z 0-9 characters). I used this method in RPGLE to validate it:


        Code:
        Dcl-S token varchar(4096); // The hex string
        Dcl-S validToken ind;
        
          validToken = *off;
        
          exec sql set :validToken =
            case when REGEXP_LIKE(:token,
                 '^([0-9A-F]{2})(-[0-9A-F]{2})*$')
                 then '1' else '0' end;
        
          If not validToken;
            // token does not match pattern
          EndIf;

        The SQL to check for characters . ' and " would probably be this (note, untested)

        Code:
        Dcl-S str char(100); // Your string
        Dcl-S found ind;
        
          found = *off;
        
          exec sql set :found =
            case when REGEXP_LIKE(trim(:str),'[.''"]')
                 then '1' else '0' end;
        
          If found 
            // . or ' or " found
          EndIf;
        The regex pattern to match any of those three characters would be [.'"]
        But because it is defined within a string literal, the apostrophe had to be doubled '[.''"]'

        Note that in my experience, regex patterns in REGEXP_LIKE() must always be hard-coded, they don't work reliably if you try and store them in a host variable

        Comment


        • #5
          Post Deleted.
          Last edited by wegrace; June 11, 2018, 09:12 AM.

          Comment


          • #6
            Thanks for the replies.
            I tried %Check initially but after testing I realised it's looking for an exact match.
            SCANRPL is a good idea, but doesn't it work the same as scan??? I.e. it looks for an exact match, so I'd need 3 lines anyway?


            I use regex a lot, yeah - I did see a post on here a while ago, where someone suggested SQL & Regex but when I tried it out to see if it would work, it couldn't find the function so I guess we don't have the PTF. I've checked our new development machine (V7R3) and it does have it but the client machines don't, so I guess if I did this and sent it across it would fail because the function doesn't exist?

            I'll save that post though because I'm sure it will come in handy in the future.

            Thanks again,
            Ryan

            Comment


            • #7
              I think REGEXP_LIKE() was introduced in i7.1 in a PTF


              I have had another idea, this could be a single line

              Use %xlate to replace every disallowed character with the same character, then you can scan for that single character.
              I.e. your disallowed characters are . ' " so replace ' and " with . and then scan for .

              Code:
              if %scan('.':%xlate('''"':'..':str)) > 0;
                // Disallowed character found
              endif;
              A separate %scan for each character would still be the most efficient in terms of processing, but this is the one line solution you were after.

              Comment


              • #8
                Good idea, as always.

                The only issue with that is if they have entered a '.' in the string it would fail - I suppose the fix would be to convert 2 of the 3, to the 3rd and scan for that, i.e convert , and ' to " then scan for " i.e;

                if %scan('"':%xlate('''',':'"':str)) > 0; // Disallowed character found endif; But you say this is less efficient? How come?

                Comment


                • RDKells
                  RDKells commented
                  Editing a comment
                  Forgot code tags and can't edit;
                  Code:
                  if %scan('.':%xlate('''"':'..':str)) > 0;
                    // Disallowed character found
                  endif;

                • RDKells
                  RDKells commented
                  Editing a comment
                  3rd time's a charm?

                  Code:
                  if %scan('"':%xlate('''',':'"':str)) > 0; 
                    // Disallowed character found 
                  endif;

              • #9
                But I thought you said that '.' was one of the characters you wanted to check for?
                and I want to check it doesn't contain the following 3 characters; . ' "
                That's the only reason I said convert to '.'

                You are right, the fundamentals of the idea is if you have multiple characters, lets call them A, B, and C, then use xlate to convert all instances of B and C into A. Then all you have to do is scan for A



                The reason I think this method would be less efficient than three separate %scan()s is it seems to me that the three scans each only have to find the first instance of each character, whereas the xlate method has to find and replace every instance of 2 of them, which seems to me to be more work

                3 scans:
                • Find first instance of char 1
                • return index
                • Find first instance of char 2
                • return index
                • Find first instance of char 3
                • return index
                scan xlate:
                • Find every instance of char 2
                • Replace all with char 1
                • Find every instance of char 3
                • Replace all with char 1
                • Find first instance of char 1
                • return index

                Comment


                • RDKells
                  RDKells commented
                  Editing a comment
                  Sorry, not sure why it's a . I meant a ,!

                  scan xlate - I see your logic there, that makes sense.

                  Looks like I'll just keep the 3 scans then, I find it odd that there isn't a function for this though but I guess it's never really been required?

                • Vectorspace
                  Vectorspace commented
                  Editing a comment
                  If it's a comma (,) rather than a full stop (.), then the code for my two methods would be:

                  Code:
                  // Scan xlate
                  If %scan(',':%xlate('''"':',,':str)) > 0;
                    // Disallowed character found
                  EndIf;
                  
                  //regex:
                  found = *off;
                  exec sql set :found =
                    case when REGEXP_LIKE(trim(:str),'[,''"]')
                         then '1' else '0' end;
                  If found 
                    // Disallowed character found
                  EndIf;
                  You could probably argue that REGEXP_LIKE is the function for this problem (and more complex variants of it), but it's a recent enough addition that not all systems have the OS version or PTF yet.

                • RDKells
                  RDKells commented
                  Editing a comment
                  You could argue that - however you could probably do something similar with SQL, so technically it's not a specific function to check a string for a list of characters it's more enhanced string manipulation functionality that allows you to achieve what you want, after a fair amount of lines! Lol.

              • #10
                Originally posted by RDKells View Post
                Thanks for the replies.
                I tried %Check initially but after testing I realised it's looking for an exact match...
                To clarify, the %Check function does not look for an exact match. It looks for any characters that are not in the list of characters given, which is the opposite of what you want.

                Comment


                • RDKells
                  RDKells commented
                  Editing a comment
                  Sorry you're right - I recall now, it kept returning 1 when I was expecting 0 because it found the first character.

              • #11
                I'm confused. The title basically says "contains certain characters" then the 1st sentence says "DOESN'T contain certain characters". Which is it? Then the example says 3 characters but ; . ' " are 4 characters.

                Comment


                • RDKells
                  RDKells commented
                  Editing a comment
                  Probably could've been worded better but don't worry about it now, the others understood it and it's been answered.

                  Cheers,
                  Ryan
              Working...
              X