ibmi-brunch-learn

Announcement

Collapse
No announcement yet.

junk characters

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • junk characters

    Hi,

    How to remove junk characters in flat file using sql.
    The Secret Of Achieving Your Goals Is Revealed Before Your Eyes.
    System Engineer

    Sudha...

  • #2
    Re: junk characters

    Hi Sudha,

    "Junk" characters is a pretty broad term. One man's junk is another man's treasure!

    You'll need to be more specific as to which characters define junk in your case.

    There are lots of ways to deal with junk in SQL. Some options:
    • REPLACE function
    • TRANSLATE function
    • REGEXP_REPLACE function to handle more complex pattern matching / replacement needs


    All the above functions can work with hex values if needed.

    The HEX and LOCATE functions are useful to locate junk.

    Mike

    Comment


    • #3
      Re: junk characters

      Any characters other than keyboard considered as junk characters. I should be in position to validate in ifs .
      The Secret Of Achieving Your Goals Is Revealed Before Your Eyes.
      System Engineer

      Sudha...

      Comment


      • #4
        Re: junk characters

        If you need the data in the standard QSYS.LIB file system at some point, you probably should start with copying the data there using a command like CPYFRMSTMF or CPYFRMIMPF, to copy the IFS data to a QSYS.LIB standard file system table, and then perform your SQL over the QSYS.LIB table. This is likely the better choice if you're not on V7R2, since it is easier easier than using SQL to directly access IFS data on systems before V7R2.

        To use SQL against IFS data directly (more effort if not on V7R2), you'll likely want to use an SQL function to access IFS data from SQL. Here are some ways to do so:

        Once you either have the data in the QSYS.LIB file system, or have used one of the above methods to make the IFS data accessible to SQL, you can use the regular expression function REGEXP_REPLACE to remove all characters not on a keyboard, using something like this:

        Code:
        [FONT=Lucida Console]select regexp_replace( ' abcdefghijklmnopqrstuvwxyz01234567890`~!@#$%^&*()-_=+[{]}\|;:''",<.>/?'  --replace this parameter with your source data to be converted 
                              ,'[[^a-z]&&[^0-9]&&[^\s`~!@#\$%\^&\*\(\)-_=\+\[\{\]\}\\\|;:''",<\.>/\?]]'
                              ,'',  1, 0, 'i' )   
        from ( values( 1 ) ) as T[/FONT]
        Notes:
        • Regular expression support in SQL started in V7R1 via a PTF. You may need PTFs installed to use them.
        • Characters on a keyboard can vary from country to country. The regular expression pattern string provided may need to be adjusted to fine tune the definition of junk.
        • The \s in the pattern string will match any white space character. White space is defined as [\t\n\f\r\p{Z}] (see regular expression matacharacters documentation). White space characters will be preserved (not removed) using the code provided.
        • The pattern string and the replacement value of an empty string, will remove any byte that is not a letter, and not a number, and not a white space, and not one of the special characters listed.
        • Regular expressions are extremely useful. All developers should know how to use them to a basic degree.
        • Regular expressions do have their limits, and at times developers try to use them to handle parsing tasks that regular expressions don't handle well. For example, some try to use them to parse XML or XHTML. An XML parser will work much better for that task.
        • Java, Python, and other languages on IBM i have support for regular expressions as well, but the syntax varies from the SQL functions.


        Mike
        Last edited by Michael Jones; February 9, 2016, 10:29 AM.

        Comment

        Working...
        X