ibmi-brunch-learn

Announcement

Collapse
No announcement yet.

Rows to Columns

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • Rows to Columns

    I'm at a loss how to accomplish this. We have a legacy (control) file that contains 4 columns. Category, Value, Sequence, and Data. I'm trying to create an SQL View that will contain the necessary data in ONE row with multiple columns.

    For a given Category, Value and Sequence, the Data field (40 characters) contains information that I need to place in separate columns using SUBSTR.

    The challenge occurs when I have multiple rows (i.e. sequence 1, 2, and 3) and I need to substring Data into multiple columns from each of them.

    WWCAT WWSLC WWSQ2 WWDTA
    0026 AAA 1 001 AMANDA GRASSER
    0026 AAA 2 NON001


    In the example data above, I need to create ONE row containing
    WWSLC as column1
    substr(WWDTA, 5, 25) when WWSQ2=1 as column 2
    substr(WWDTA, 6, 3) when WWSQ2=2 as column 3
    substr(WWDTA, 9, 3) when WWSQ2 =2 as column 4

    I've tried subselects, but I'm hoping there is a more "elegant" method.

    Thx,
    Greg

  • #2
    The only way is to use substrings, but as soon you have wrapped your SELECT statement in a view, complexity disappears.


    Code:
    Create Or Replace View YourSchema.YourView
    as (Select WWSCL as Column1
               Case When WWSQ2 = 1 Then Substr(WWDTA, 5, 25) else '' End as Column2,
               Case When WWSQ2 = 2 Then Substr(WWDTA, 6,  3) else '' End as Column3,
               Case When WWSQ2 = 2 Then Substr(WWDTA, 9,  3) else '' End as Column4
               ...
           From YourSchema.YourTable
           ...)

    Birgitta

    Comment


    • #3
      My preferred method is not using a case statement as they have to be evaluated for each row.

      Select
      WWSLC,
      Max(COL2) As COL2,
      Max(COL3) As COL3,
      Max(COL4) As COL4
      From
      WWDATA
      Cross Join Lateral
      (Values
      (1,Substr(WWDTA,5,25),'',''),
      (2,'',Substr(WWDTA,6,3),'')
      (2,'','',Substr(WWDTA,9,3)) As A
      (VAL,COL2,COL3,COL4)
      On WWSQ2 = VAL
      Group By
      WWSLC

      Jim

      Comment


      • #4
        Thanks for the replies... However, my view is only useful if the result is ONE row for each unique WWSLC.

        Jim - I have tried your suggestion as it seems to meet that requirement - this is very close.
        However, I'm getting an error in run sql scripts (Keyword AS not expected). I inserted a missing comma and used the real field/file name.

        Select
        rfSLC,
        Max(COL2) As COL2,
        Max(COL3) As COL3,
        Max(COL4) As COL4
        From
        astdta.referrf
        Cross Join Lateral
        (Values
        (1,Substr(rfDTA,5,25),'',''),
        (2,'',Substr(rfDTA,6,3),''),
        (2,'','',Substr(rfDTA,9,3)) As A
        (VAL,COL2,COL3,COL4)
        On rfSQ2 = VAL
        Group By
        rfSLC;

        Comment


        • #5
          OK... Between Jim's post and a current view I have using "Lateral" values, I may have this partially solved. I removed some of the statement:

          -- This ALMOST works
          Select
          rfSLC,
          Max(COL2) As COL2,
          Max(COL3) As COL3,
          Max(COL4) As COL4
          From
          astdta.referrf,
          Lateral(Values
          (1,Substr(rfDTA,5,25),'',''),
          (2,'',Substr(rfDTA,6,3),''),
          (2,'','',Substr(rfDTA,9,3))) As A
          (VAL,COL2,COL3,COL4)
          where rfcat='0026'
          -- On rfSQ2 = VAL
          Group By
          rfSLC;

          -- OUTPUT looks like this:
          Code:
          RFSLC           COL2                     COL3 COL4
          A                A  ***                  A  ***
          AA              SGM SALES                NON 012
          AAA             AMANDA GRASSER           NON 001
          ...
          The trouble is that for RFSLC = A, the COL2 should be blanks. However It is picking up values from RFSQ2=2

          Code:
          RFSLC             RFSQ2  RFDTA          
          A                    1                  
          A                    2        A  ***
          It's almost like I need the CASE statements based on the sequence number (RFSQ2)
          Last edited by gwilburn; April 29, 2019, 07:44 AM.

          Comment


          • #6
            Greg,

            Give us a few sample rows of table astdta.referrf.

            Jim

            Comment


            • #7
              Here is some sample data:

              Code:
              RFCAT  RFSLC             RFSQ2  RFDTA                                  
              0026   A                    1                                          
              0026   A                    2        A  ***                            
              0026   AA                   1   012 SGM SALES                          
              0026   AA                   2        NON012                            
              0026   AAA                  1   001 AMANDA GRASSER                      
              0026   AAA                  2        NON001                            
              0026   AAB                  1   042 PSMA                                
              0026   AAB                  2        NON042                            
              0026   AAC                  1   021 EVOLUTION MANAGEMENT-IND            
              0026   AAC                  2        NON021                            
              0026   AAD                  1   035 GOETZ                              
              0026   AAD                  2        NON035                            
              0026   AAE                  1   035 KNACK                              
              0026   AAE                  2        NON035                            
              0026   AAF                  1   013 TABLE ONE                          
              0026   AAF                  2        NON013                            
              0026   AAG                  1   009 THE ERIN GRIFFIN GROUP    1040322400
              0026   AAG                  2   0    NON009
              This is an old "control file". The structure of the data for each RFCAT number is determined by an RPG data structure (that chagnes for each category). The RFSQ2 can be up to 20. This is for category 0026.

              In RPG, I have a service program procedure for getting data from this file. The parameters are RFCAT and RFSLC. It basically reads the rows of data by RFSQ2 and concatenates the RFDTA into one long string.

              Once I have one long string of data, I can use SUBSTR or put it into a pre-defined data structure.

              I just don't know if there is a way to do that with SQL - if so, that would solve A LOT of issues for me.
              Last edited by gwilburn; April 29, 2019, 08:26 AM.

              Comment


              • #8
                If you only care about RSQ2 = 1 or 2, and not any of the others, then you could do:
                Code:
                select * from astdta.referrf a
                   left join astdta.referrf b
                    on a.RFCAT = b.RFCAT and a.RFSLC = b.RFSLC and a.RFSQ2 = 1 and b.RFSQ2 = 2
                This can be your subselect. This will return 1 row containing RFDTA for both RFSQ2 = 1 (a.RFDTA) and RFSQ2 = 2 (b.RFDTA)
                Then in the outer query you can stitch them together, or extract from one or the other or both as needed. Just remember that b.RFDTA will be null if there is no RFSQ2 = 2 record.

                Comment


                • #9
                  So I found the LISTAGG function... This may work for the entire file! Any reasons to not use this?

                  This gets me ONE row for each RFCAT and RFSLC, with column concatenated (just like my RPG procedure does)

                  -- Use aggregate function to concatenate RFDTA
                  select rfcat, rfslc, listagg(rfdta) within group(order by rfcat, rfslc, rfsq2) as rfdata
                  from astdta.referrf
                  where rfcat='0026'
                  group by rfcat, rfslc;

                  Code:
                  RFCAT    RFSLC    RFDATA
                  0026    A                                                 A  ***
                  0026    AA    012 SGM SALES                                NON012
                  0026    AAA    001 AMANDA GRASSER                           NON001
                  0026    AAB    042 PSMA                                     NON042
                  0026    AAC    021 EVOLUTION MANAGEMENT-IND                 NON021
                  0026    AAD    035 GOETZ                                    NON035
                  0026    AAE    035 KNACK                                    NON035
                  0026    AAF    013 TABLE ONE                                NON013
                  0026    AAG    009 THE ERIN GRIFFIN GROUP    10403224000    NON009
                  0026    AAH    024 BDC GROUP                                NON024
                  0026    AAI    024 HIGH FIVE                                NON024
                  So then created it using it as a CTE.
                  -- Use a CTE
                  with refer as
                  (select rfcat, rfslc, listagg(rfdta) within group(order by rfcat, rfslc, rfsq2) as rfdata
                  from astdta.referrf
                  group by rfcat, rfslc)

                  select rfcat, rfslc, substr(rfdata,5,25) as arr_name, substr(rfdata,46,3) as slr_nbr, substr(rfdata,49,3) as com_nbr
                  from refer
                  where rfcat='0026';

                  Code:
                  RFCAT    RFSLC    ARR_NAME    SLR_NBR    COM_NBR
                  0026    A        A    ***
                  0026    AA    SGM SALES    NON    012
                  0026    AAA    AMANDA GRASSER    NON    001
                  0026    AAB    PSMA    NON    042
                  0026    AAC    EVOLUTION MANAGEMENT-IND    NON    021
                  So would it be more efficient to have the WHERE in my CTE or in the result?

                  Or should I simply create a separate view over the entire file (representing the CTE), then simply have separate views selecting from that view?

                  Comment


                  • #10
                    I forgot about ListAgg

                    I think ListAgg would only work if you can be sure you don't have omitted values for RFSQ2. You said it goes up to 20, you didn't say if there are ever omitted values

                    E.g.
                    Code:
                    RFCAT  RFSLC             RFSQ2  RFDTA                                  
                    0026   A                    1                                              
                    0026   A                    3        SOMETHING ELSE                          
                    0026   AA                   1   012 SGM SALES                          
                    0026   AA                   2        NON012                  
                    0026   AA                   3        SOMETHING ELSE
                    Your aggregated output for A would be 1+3, for AA it would be 1+2+3. So the positions change if there are omitted values.

                    Comment


                    • #11
                      Originally posted by Vectorspace View Post
                      I forgot about ListAgg

                      I think ListAgg would only work if you can be sure you don't have omitted values for RFSQ2. You said it goes up to 20, you didn't say if there are ever omitted values

                      E.g.
                      Code:
                      RFCAT RFSLC RFSQ2 RFDTA
                      0026 A 1
                      0026 A 3 SOMETHING ELSE
                      0026 AA 1 012 SGM SALES
                      0026 AA 2 NON012
                      0026 AA 3 SOMETHING ELSE
                      Your aggregated output for A would be 1+3, for AA it would be 1+2+3. So the positions change if there are omitted values.
                      You are exactly correct. However, that situation should never occur. This is basically a control file for our ERP system - I have seen RFSQ2 number skipped, but it was consistent within the given RFSLC.

                      That would mean the substring positions would not exactly match the RPG data structures, but I think I can live with that.

                      Comment

                      Working...
                      X