sponsored links



No announcement yet.

Safe to pointer overlay a varchar onto a clob?

  • Filter
  • Time
  • Show
Clear All
new posts

  • Safe to pointer overlay a varchar onto a clob?

    Dcl-S myClob SqlType(CLOB:3000);
    is converted by the SQL precompiler to this:
    Dcl-Ds myClob;
      myClob_LEN  UNS(10);
      myClob_DATA CHAR(3000);
    I had a realisation. UNS(10) is a 4 byte unsigned integer. A varchar has a 2 or 4 byte prefix to store the numeric number of characters, just like the myClob_LEN component of SqlType(CLOB).
    Which means, the memory layout for these two definitions:
    Dcl-S myClob SqlType(CLOB:3000);
    Dcl-S myVar  Varchar(3000:4);
    should be the same - 4 byte unsigned integer for number of characters, followed by 3000 bytes of characters.

    Which means, it should be safe to use pointers to overlay the two variables on top of each other? We have programs that use varchars larger than the maximum 32kb supported by SQL, and being able to directly overlay a larger varchar with a CLOB would make them simpler to deal with if we ever needed to process that variable in SQL. Example:
    Dcl-S myDataC  SqlType(CLOB:3000000);
    Dcl-S myDataV  Varchar(3000000:4) based(@myDataV);
    Dcl-S @myDataV Pointer inz(%addr(myDataC));
    myDataV = 'some long data';
    exec sql set :myDataC = upper(:myDataC);
    // myDataV now equals 'SOME LONG DATA'
    So RPG code can use myDataV and SQL can use myDataC, and both are manipulating the same data.

    Is this safe? Or are there reasons to not do this?

  • #2
    Why to use pointers? Just modify your data with SQL if you are working with SQLTYPE datatypes.

    Is it save modifying the original data directly (or data structure) directly or through a pointer data within pure RPG?



    • #3
      We have cases where huge varchars are passed in to programs as parameters.


      • #4
        The answer to your question is yes - it should be perfectly safe. Actually I'm surprised that the SQL pre-compiler hasn't been updated to generate a VarChar rather than the old approach. Originally it would have had to do it this way since RPG varchar was (until V6) limited to 32K but now that the limit no longer exists it really should change. I'm meeting the IBMers responsible for the pre-compiler in a week or two and will try to remember to ask them about this. Of course you could always write an RFE requesting the change <grin>.


        • #5
          I imagine they can't change the precompiler now because existing programs will be referencing data structure myCLOB, and subfields myCLOB_LEN and myCLOB_DATA?

          Unless they add a control spec keyword to tell the precompiler to make it a varchar instead?

          Or how about a new SQLTYPE that still means Clob, but generates a varchar rather than a DS? (presumeably a Blob/VarGraph version too?)

          Or maybe simply allow large varchars to directly interact with SQL, by the precompiler recognising they are varchars larger than 32kb so treat them the same as a clob data structure?
          I can see from the *LISTING code that when you reference a clob host variable in embedded SQL, it works by pointer using %addr. As clobs always have 4 byte lengths whereas varchars have 2 bytes or 4, the precompiler would need to be smart enough to recognise if the varchar has a 2 or 4 byte prefix and adjust accordingly.


          • #6
            Originally posted by JonBoy View Post
            Originally it would have had to do it this way since RPG varchar was (until V6) limited to 32K
            Actually, the RPG limit was 64k.

            And, the approach with a data structure was also limited to 64k since RPG's CHAR as a subfield in the DS was limited to 64k. So the 64k limit was not the reason it used a data structure.

            The original implementation was actually before RPG had a varchar or varying string -- that is why it is a DS with a LEN and DATA subfield. They couldn't change it thereafter because programs were now referring to the _LEN and _DATA subfields. (Though, they probably could've had some sort of keyword to indicate the use of VARCHAR/VARYING, etc, I guess...)

            It certainly would be more elegant to work with a varchar/varying field than the DS, though.


            • #7
              Right - it was 64K not 32K.

              The pre-compiler has been changed in the past and the change could be easily done compatibly - all they need to do is add the varchar to the DS as an overlay.


              • #8
                That would mean the varchar has a different name to the clob itself, since the ds has the name of the clob. Kind of like my pointer example. In an ideal world, you'd use the same variable name in both rpg and sql


                • #9
                  Yeah, I agree... adding an overlaid varchar in the DS would be awkward. (Though, still better than having to set both the length and data separately.)

                  I'd prefer to see something like this:

                  dcl-s myLobData SQLTYPE(CLOB: 1000000: RPGVARCHAR);
                  Then the precompiler could convert it to a VARCHAR rather than a DS when that extra keyword is given.


                  • #10
                    The SQL pre-compiler folks have asked me to write up an RFE for this ability. Any and all suggestions on approach are welcome. Will post the RFE number here when it is done.


                    • #11
                      I think there are two different possible changes here. Making CLOBs work like large varchars, and/or making embedded SQL treat large varchars as CLOBs.

                      #1, Precompiler converts SQLTYLE(CLOB:X) to VARCHAR(X:4) instead of to a DS.
                      This would let the variable be used in regular RPG. This should require no changes to actual SQL processing, as it is referenced by address and the memory layout (4 byte unsigned int for number of chars, followed by X bytes of character space) is identical. This could be accomplished using a global control-spec keyword to set a program-wide default, and/or a d-spec keyword or SQLTYPE keyword parameter to allow each individual SQLTYPE definition to be individually specified.

                      #2, Allow embedded SQL to directly reference large varchars (>32kb).
                      This would let existing large varchars be used in embedded SQL without having to first copy them to CLOBs (increased memory usage, which is especially an issue here because these are large variables) or use pointer/DS overlays.
                      This could be done with a precompiler-only change that would not work for some varchars (#2.1) or a precompiler and SQL Processing change to support all varchars (#2.2)

                      #2.1 Only allow this for large varchars defined with 4 byte prefixes.
                      Varchars with no explicitly specified prefix have a 2 byte prefix if their number of characters is 65565 or less. 65536 or more get a 4 byte prefix. You can explicitly specify a 4 byte prefix for an X <= 65535 varchar with VARCHAR(X:4).
                      It is the 4 byte varchars that are identical in memory layout to the CLOBs, therefore with a precompiler-only change, varchars with an implicit or explicit 4 byte prefix could be directly referenced in exactly the same way as a CLOB (via %addr). This means varchars above 64kb will work (they always have a 4 byte prefix), varchars above 32kb and below 64kb will work only if they have an explicit 4 byte prefix, but varchars above 32kb and below 64kb with only a 2 byte prefix will not work. (varchars < 32kb are of course directly referenced already)

                      #2.2 Allow this for large varchars defined with 4 byte prefixes.
                      Option #2.1 being a precompiler-only change works because a varchar with a 4 byte prefix has identical memory layout to a CLOB. The SQL Processing is referencing the CLOB only by %addr which means that it must be assuming the number of characters portion is 4 bytes. It would therefore need amending to recognise when a varchar has a 2 byte prefix instead of 4, in addition to the precompiler changes. But this would mean that ALL varchars are directly reference-able in embedded SQL operations, irrespective of length. This to my mind is the ideal solution.

                      Personally I would like to see both options #1 and #2.2 done, but I recognise #2.2 is much more complex piece of work because it is more than just the precompiler.


                      • Vectorspace
                        Vectorspace commented
                        Editing a comment
                        Use anything you want. Though full disclaimer, #1 is just a fleshed out explanation of Scott's idea.

                        I feel I should also mention, when I was experimenting with this earlier, I found that using a larger than supported varchar with embedded sql was causing the precompiler to crash. I never narrowed down the exact circumstances as work had other demands for my time, but felt it worth mentioning.

                      • Scott Klement
                        Scott Klement commented
                        Editing a comment
                        In order to use my idea, I'll need $1 million transferred to my private bank account in the Caymans

                      • JonBoy
                        JonBoy commented
                        Editing a comment
                        However will you get it back out of their tiny tummies Scott?