ibmi-brunch-learn

Announcement

Collapse
No announcement yet.

Safe to pointer overlay a varchar onto a clob?

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • Safe to pointer overlay a varchar onto a clob?

    This:
    Code:
    Dcl-S myClob SqlType(CLOB:3000);
    is converted by the SQL precompiler to this:
    Code:
    Dcl-Ds myClob;
      myClob_LEN  UNS(10);
      myClob_DATA CHAR(3000);
    End-Ds;
    I had a realisation. UNS(10) is a 4 byte unsigned integer. A varchar has a 2 or 4 byte prefix to store the numeric number of characters, just like the myClob_LEN component of SqlType(CLOB).
    Which means, the memory layout for these two definitions:
    Code:
    Dcl-S myClob SqlType(CLOB:3000);
    Dcl-S myVar  Varchar(3000:4);
    should be the same - 4 byte unsigned integer for number of characters, followed by 3000 bytes of characters.

    Which means, it should be safe to use pointers to overlay the two variables on top of each other? We have programs that use varchars larger than the maximum 32kb supported by SQL, and being able to directly overlay a larger varchar with a CLOB would make them simpler to deal with if we ever needed to process that variable in SQL. Example:
    Code:
    Dcl-S myDataC  SqlType(CLOB:3000000);
    Dcl-S myDataV  Varchar(3000000:4) based(@myDataV);
    Dcl-S @myDataV Pointer inz(%addr(myDataC));
    
    myDataV = 'some long data';
    exec sql set :myDataC = upper(:myDataC);
    // myDataV now equals 'SOME LONG DATA'
    So RPG code can use myDataV and SQL can use myDataC, and both are manipulating the same data.

    Is this safe? Or are there reasons to not do this?


  • #2
    Why to use pointers? Just modify your data with SQL if you are working with SQLTYPE datatypes.

    Is it save modifying the original data directly (or data structure) directly or through a pointer data within pure RPG?

    Birgitta

    Comment


    • #3
      We have cases where huge varchars are passed in to programs as parameters.

      Comment


      • #4
        The answer to your question is yes - it should be perfectly safe. Actually I'm surprised that the SQL pre-compiler hasn't been updated to generate a VarChar rather than the old approach. Originally it would have had to do it this way since RPG varchar was (until V6) limited to 32K but now that the limit no longer exists it really should change. I'm meeting the IBMers responsible for the pre-compiler in a week or two and will try to remember to ask them about this. Of course you could always write an RFE requesting the change <grin>.

        Comment


        • #5
          I imagine they can't change the precompiler now because existing programs will be referencing data structure myCLOB, and subfields myCLOB_LEN and myCLOB_DATA?

          Unless they add a control spec keyword to tell the precompiler to make it a varchar instead?

          Or how about a new SQLTYPE that still means Clob, but generates a varchar rather than a DS? (presumeably a Blob/VarGraph version too?)

          Or maybe simply allow large varchars to directly interact with SQL, by the precompiler recognising they are varchars larger than 32kb so treat them the same as a clob data structure?
          I can see from the *LISTING code that when you reference a clob host variable in embedded SQL, it works by pointer using %addr. As clobs always have 4 byte lengths whereas varchars have 2 bytes or 4, the precompiler would need to be smart enough to recognise if the varchar has a 2 or 4 byte prefix and adjust accordingly.

          Comment


          • #6
            Originally posted by JonBoy View Post
            Originally it would have had to do it this way since RPG varchar was (until V6) limited to 32K
            Actually, the RPG limit was 64k.

            And, the approach with a data structure was also limited to 64k since RPG's CHAR as a subfield in the DS was limited to 64k. So the 64k limit was not the reason it used a data structure.

            The original implementation was actually before RPG had a varchar or varying string -- that is why it is a DS with a LEN and DATA subfield. They couldn't change it thereafter because programs were now referring to the _LEN and _DATA subfields. (Though, they probably could've had some sort of keyword to indicate the use of VARCHAR/VARYING, etc, I guess...)

            It certainly would be more elegant to work with a varchar/varying field than the DS, though.

            Comment


            • #7
              Right - it was 64K not 32K.

              The pre-compiler has been changed in the past and the change could be easily done compatibly - all they need to do is add the varchar to the DS as an overlay.

              Comment


              • #8
                That would mean the varchar has a different name to the clob itself, since the ds has the name of the clob. Kind of like my pointer example. In an ideal world, you'd use the same variable name in both rpg and sql

                Comment


                • #9
                  Yeah, I agree... adding an overlaid varchar in the DS would be awkward. (Though, still better than having to set both the length and data separately.)

                  I'd prefer to see something like this:

                  Code:
                  dcl-s myLobData SQLTYPE(CLOB: 1000000: RPGVARCHAR);
                  Then the precompiler could convert it to a VARCHAR rather than a DS when that extra keyword is given.

                  Comment


                  • #10
                    The SQL pre-compiler folks have asked me to write up an RFE for this ability. Any and all suggestions on approach are welcome. Will post the RFE number here when it is done.

                    Comment


                    • #11
                      I think there are two different possible changes here. Making CLOBs work like large varchars, and/or making embedded SQL treat large varchars as CLOBs.

                      #1, Precompiler converts SQLTYLE(CLOB:X) to VARCHAR(X:4) instead of to a DS.
                      This would let the variable be used in regular RPG. This should require no changes to actual SQL processing, as it is referenced by address and the memory layout (4 byte unsigned int for number of chars, followed by X bytes of character space) is identical. This could be accomplished using a global control-spec keyword to set a program-wide default, and/or a d-spec keyword or SQLTYPE keyword parameter to allow each individual SQLTYPE definition to be individually specified.


                      #2, Allow embedded SQL to directly reference large varchars (>32kb).
                      This would let existing large varchars be used in embedded SQL without having to first copy them to CLOBs (increased memory usage, which is especially an issue here because these are large variables) or use pointer/DS overlays.
                      This could be done with a precompiler-only change that would not work for some varchars (#2.1) or a precompiler and SQL Processing change to support all varchars (#2.2)

                      #2.1 Only allow this for large varchars defined with 4 byte prefixes.
                      Varchars with no explicitly specified prefix have a 2 byte prefix if their number of characters is 65565 or less. 65536 or more get a 4 byte prefix. You can explicitly specify a 4 byte prefix for an X <= 65535 varchar with VARCHAR(X:4).
                      It is the 4 byte varchars that are identical in memory layout to the CLOBs, therefore with a precompiler-only change, varchars with an implicit or explicit 4 byte prefix could be directly referenced in exactly the same way as a CLOB (via %addr). This means varchars above 64kb will work (they always have a 4 byte prefix), varchars above 32kb and below 64kb will work only if they have an explicit 4 byte prefix, but varchars above 32kb and below 64kb with only a 2 byte prefix will not work. (varchars < 32kb are of course directly referenced already)

                      #2.2 Allow this for large varchars defined with 4 byte prefixes.
                      Option #2.1 being a precompiler-only change works because a varchar with a 4 byte prefix has identical memory layout to a CLOB. The SQL Processing is referencing the CLOB only by %addr which means that it must be assuming the number of characters portion is 4 bytes. It would therefore need amending to recognise when a varchar has a 2 byte prefix instead of 4, in addition to the precompiler changes. But this would mean that ALL varchars are directly reference-able in embedded SQL operations, irrespective of length. This to my mind is the ideal solution.


                      Personally I would like to see both options #1 and #2.2 done, but I recognise #2.2 is much more complex piece of work because it is more than just the precompiler.

                      Comment


                      • Vectorspace
                        Vectorspace commented
                        Editing a comment
                        Use anything you want. Though full disclaimer, #1 is just a fleshed out explanation of Scott's idea.

                        I feel I should also mention, when I was experimenting with this earlier, I found that using a larger than supported varchar with embedded sql was causing the precompiler to crash. I never narrowed down the exact circumstances as work had other demands for my time, but felt it worth mentioning.

                      • Scott Klement
                        Scott Klement commented
                        Editing a comment
                        In order to use my idea, I'll need $1 million transferred to my private bank account in the Caymans

                      • JonBoy
                        JonBoy commented
                        Editing a comment
                        However will you get it back out of their tiny tummies Scott?

                    • #12
                      JonBoy, I was wondering if you are still planning to raise RFE's for this?

                      Comment


                      • #13
                        Originally posted by Vectorspace View Post
                        JonBoy, I was wondering if you are still planning to raise RFE's for this?
                        Yes - but I've been insanely busy for the last couple of weeks and it is not getting better any time soon.

                        Please feel free to write it up and I'll gladly vote for it.

                        Comment


                        • #14
                          I would write three RFE's: one for #1, one for #2.1, and one for #2.2 marking it as dependent on #2.1. I figure this way it's more likely that one of the ideas will get delivered. Otherwise if they reject the RFE because of one of the ideas, we would have to re-raise. Do these look good to everyone:

                          ================================================== ==========================

                          SQLRPGLE/CLOB support improvement - define sqlType(CLOB) as VARCHAR

                          Dcl-S myVar sqlType(CLOB:10000);
                          is converted by the SQL Precompiler into:
                          Dcl-Ds myVar;
                          myVar_LEN UNS(10);
                          myvar_DATA CHAR(10000);
                          End-Ds;
                          Embedded SQL functions referencing myVar do so by address:
                          SQL0001 = %addr(myVar);
                          This layout (4 byte unsigned int followed by character bytes) is identical in memory layout to a 4-byte-prefix VARCHAR
                          Dcl-S myVar VARCHAR(10000:4);
                          Therefore, SQL processing should be able to treat host variables VARCHAR(x:4) exactly the same as sqlType(CLOB:x) up to the max 16MB size of RPGLE variables

                          The SQL precompiler should be updated to optionally convert sqlType(CLOB) to a VARCHAR instead of a DS. This would have several benefits:
                          * CLOBs could be used as-is with regular RPGLE BIFs such as %trim, %scan, %xlate, etc.
                          * Programmers would not need to explicitly set separate length and data variables.
                          * All of this serves to simplify the use of CLOB columns in SQLRPGLE processing

                          This could be accomplished in a backwards-compatible way by the use of a Definition-Spec keyword to control which type the CLOB is converted to
                          Dcl-S myVar1 sqlType(CLOB:10000) lobSubType(*VARCHAR); // This one will be a VARCHAR
                          Dcl-S myVar2 sqlType(CLOB:10000) lobSubType(*DS); // This one will be a DS
                          Dcl-S myVar2 sqlType(CLOB:10000); // This one will be a DS because DS is the default

                          A Control-Spec keyword could be used to override the global default
                          Dcl-C lobSubType(*VARCHAR)
                          Dcl-S myVar1 sqlType(CLOB:10000) lobSubType(*VARCHAR); // This one will be a VARCHAR
                          Dcl-S myVar2 sqlType(CLOB:10000) lobSubType(*DS); // This one will be a DS
                          Dcl-S myVar2 sqlType(CLOB:10000); // This one will be a VARCHAR because the default has been overridden to VARCHAR

                          This could also apply to sqlType(BLOB) and VARGRAPHIC/VARBINARY

                          ================================================== ==========================

                          SQLRPGLE Embedded SQL support for large VARCHARs level 1

                          See linked RFE (SQLRPGLE Embedded SQL support for large VARCHARs level 2) for a proposed further enhancement on top of this change

                          Dcl-S myVar sqlType(CLOB:10000);
                          is converted by the SQL Precompiler into:
                          Dcl-Ds myVar;
                          myVar_LEN uns(10);
                          myvar_DATA char(10000);
                          End-Ds;
                          Embedded SQL functions referencing myVar do so by address:
                          SQL0001 = %addr(myVar);
                          This layout (4 byte unsigned int followed by character bytes) is identical in memory layout to a 4-byte-prefix VARCHAR
                          Dcl-S myVar VARCHAR(10000:4);
                          Therefore, SQL processing should be able to treat host variables VARCHAR(x:4) exactly the same as sqlType(CLOB:x) up to the max 16MB size of RPGLE variables

                          The SQL Precompiler should be updated to recognise that VARCHAR(x:4) is the same as a converted sqlType(CLOB:x), and allow them to be referenced as host variables using the same method (%addr()).
                          This is in effect an implicit conversion between SQL CLOBs and RPGLE VARCHAR(x:4)
                          This would have several benefits:
                          * Programs with existing large 4-byte-prefix VARCHARs would be instantly interoperable with embedded SQL, without needing to reassign to/pointer overlay with an sqlType(CLOB)
                          * CLOBs could be selected to regular VARCHARs to then be used with regular RPGLE BIFs such as %trim, %scan, %xlate, etc.
                          * All of this serves to simplify the use of CLOB columns in SQLRPGLE processing

                          This would not work with VARCHAR lengths 32705 to 65535 unless they had an explicit 4 byte prefix (by default these will only have a 2 byte prefix, making them incompatible without SQL processing changes), or unless SQL processing is updated to account for it.

                          The linked level 2 RFE, which is dependent on the RFE, proposes amending SQL processing too so it can handle 2 byte prefix VARCHARs as well as 4 byte, completing support. It was kept as a separate RFE to enable partial/staged delivery

                          This could also apply to sqlType(BLOB) and VARGRAPHIC/VARBINARY

                          ================================================== ==========================

                          SQLRPGLE Embedded SQL support for large VARCHARs level 2

                          This change is dependent on the linked RFE (SQLRPGLE Embedded SQL support for large VARCHARs level 1) being delivered

                          Dcl-S myVar sqlType(CLOB:10000);
                          is converted by the SQL Precompiler into:
                          Dcl-Ds myVar;
                          myVar_LEN uns(10);
                          myvar_DATA char(10000);
                          End-Ds;
                          Embedded SQL functions referencing myVar do so by address:
                          SQL0001 = %addr(myVar);
                          This layout (4 byte unsigned int followed by character bytes) is identical in memory layout to a 4-byte-prefix VARCHAR
                          Dcl-S myVar VARCHAR(10000:4);
                          Therefore, SQL processing should be able to treat host variables VARCHAR(x:4) exactly the same as sqlType(CLOB:x) up to the max 16MB size of RPGLE variables

                          VARCHARs of lengths up to 32704 can be handled by SQL processing as VARCHARs. But lengths 32705-65535 are too long for an SQL VARCHAR, but will by default have a 2 byte prefix making them incompatible with sqlType(CLOB) (unless the developer adds an explicit 4 byte prefix)

                          The level 1 RFE proposes amending the SQL precompiler to recognise that VARCHAR(x:4) is the same as a converted sqlType(CLOB:x), and allow them to be referenced as host variables using the same method (%addr())
                          The SQL precompiler and SQL processing should be further amended to be able to handle 2 byte prefix VARCHARs. They could be handled in the same way (via %addr()) but SQL Processing must be made aware of and account for the length component being only 2 bytes instead of 4.

                          This would have several benefits:
                          * Programs with existing large VARCHARs of any size/prefix would be instantly interoperable with embedded SQL, without needing to reassign to/pointer overlay with an sqlType(CLOB)
                          * CLOBs could be selected to regular VARCHARs to then be used with regular RPGLE BIFs such as %trim, %scan, %xlate, etc.
                          * All of this serves to simplify the use of CLOB columns in SQLRPGLE processing

                          This could also apply to sqlType(BLOB) and VARGRAPHIC/VARBINARY

                          Comment

                          Working...
                          X