**Scott Klement** · October 19, 2016, 03:31 PM

Two of your examples are dangerous. You have told the system that your field is 4 bytes longer than it actually is, so it can overwrite the memory that comes after your variable. That is unsafe. Your last example, at least, is not dangerous, though it writes data into the length portion of the field, not just the data portion, so it won't work the way you want.

The proper approach requires 3 steps:

1) Set the field to its maximum length. If you don't do this, you'll only get as much data as fits in the variable at the length it had before the read() call.
2) Call read() with %addr(stmfData:*data) and %len(stmfData:*MAX).
3) Use the length returned from read() with the %len() BIF to set the new length.

Code:

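// 1) Grow the field to its maximum so read() can fill the whole buffer
%len(stmfData) = %len(stmfData:*MAX);
// 2) Read into the data portion, skipping the length prefix
len = read(fd: %addr(stmfData:*data): %len(stmfData:*MAX));
// read() returns a negative value on error; treat that as "no data"
if len < 0;
   len = 0;
endif;
// 3) Shrink the field to the number of bytes actually read
%len(stmfData) = len;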

I'd wrap that logic in a subprocedure so you don't have to repeat it every time you read the file.
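A wrapper along those lines might look like the following. This is only a minimal sketch, not code from the thread: the name readStmf, the 1000000-character maximum, and the choice to return the raw read() result (so the caller can still tell an error from an empty read) are all assumptions made for illustration. Because the VARCHAR is passed by reference, the caller's variable would need to be declared with the same maximum length and prefix size as the parameter.

```rpgle
        ctl-opt nomain;

        // Prototype for the Unix-type read() API
        dcl-pr read int(10) extproc('read');
           fildes int(10) value;
           buf    pointer value;
           nbytes uns(10) value;
        end-pr;

        // Hypothetical wrapper: reads from an open descriptor into a
        // VARCHAR and leaves the field at the length actually read.
        // Returns whatever read() returned (negative means error).
        dcl-proc readStmf export;
           dcl-pi *n int(10);
              fd       int(10) value;
              stmfData varchar(1000000);   // assumed size, adjust to suit
           end-pi;
           dcl-s len int(10);

           // 1) Expose the whole buffer to read()
           %len(stmfData) = %len(stmfData:*MAX);

           // 2) Read into the data portion, past the length prefix
           len = read(fd: %addr(stmfData:*data): %len(stmfData:*MAX));

           // 3) Set the field to what was actually read (0 on error)
           if len < 0;
              %len(stmfData) = 0;
           else;
              %len(stmfData) = len;
           endif;

           return len;
        end-proc;
```

A caller would then declare its own varchar(1000000) field and write something like `rc = readStmf(fd: myData);`, checking rc for a negative value before using myData.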

FYI: On a VARCHAR field with a 4-byte length prefix, %addr(stmfData:*data) is equivalent to %addr(stmfData)+4, and %len(stmfData:*MAX) is equivalent to %size(stmfData)-4. If the length prefix is 2 bytes, it's the same thing, but +/- 2. On other data types like UCS-2 or Graphic, %size() counts bytes while %len() counts characters, so %size() is twice the maximum length plus the length prefix. Using %addr(stmfData:*data) and %len(stmfData:*MAX) figures that out for you, making your code more understandable and less prone to mistakes. In code written before those features existed, you'll see examples where people manually add or subtract the length prefix.
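To make that equivalence concrete, here is a small sketch that compares the manual arithmetic with the *DATA and *MAX forms. The variable names and sizes are made up for illustration; the program only reports whether the two styles agree.

```rpgle
        ctl-opt dftactgrp(*no);

        dcl-s c4 varchar(100000: 4);   // 4-byte length prefix
        dcl-s c2 varchar(100: 2);      // 2-byte length prefix
        dcl-s u2 varucs2(100: 2);      // 2 bytes per character
        dcl-s ok ind inz(*on);

        // VARCHAR, 4-byte prefix: data starts 4 bytes in
        if %addr(c4:*data) <> %addr(c4) + 4
           or %len(c4:*MAX) <> %size(c4) - 4;
           ok = *off;
        endif;

        // VARCHAR, 2-byte prefix: same idea, but 2 bytes
        if %addr(c2:*data) <> %addr(c2) + 2
           or %len(c2:*MAX) <> %size(c2) - 2;
           ok = *off;
        endif;

        // UCS-2: %size is in bytes, %len is in characters
        if %addr(u2:*data) <> %addr(u2) + 2
           or %len(u2:*MAX) <> (%size(u2) - 2) / 2;
           ok = *off;
        endif;

        if ok;
           dsply 'Manual offsets match *DATA and *MAX';
        else;
           dsply 'Mismatch found';
        endif;

        *inlr = *on;
        return;
```

The manual versions only stay correct as long as nobody changes the field's type or prefix size, which is the main argument for the built-in forms.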

**gwilburn** · October 20, 2016, 06:34 AM

Thanks... I knew one of my examples was likely writing to memory that it shouldn't (so I didn't do it other than to debug). Right now it's working with a fixed-length field of 1 MB. The file I'm reading should never approach that size.

So my question is whether I should use VARCHAR at all? If I set the field length to its max before using it, do I still get the performance gain of VARCHAR versus CHAR?

**Barbara Morris** · October 20, 2016, 04:05 PM

If you don't use VARCHAR, then you'll get basically the same performance by using %SUBST every time you use the CHAR version of the field. But having to use %SUBST all the time would be inconvenient and possibly error-prone.
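As a rough illustration of that trade-off, the sketch below performs the same operations on a CHAR buffer with a separately tracked length and on a VARCHAR. The field names and the sample value are hypothetical.

```rpgle
        ctl-opt dftactgrp(*no);

        dcl-s fixedBuf char(1000);
        dcl-s fixedLen int(10);
        dcl-s varBuf   varchar(1000);
        dcl-s pos      int(10);

        fixedBuf = 'Hello, IFS';
        fixedLen = 10;               // length must be tracked by hand
        varBuf   = 'Hello, IFS';     // length is stored with the field

        // CHAR: every use needs %SUBST, otherwise the trailing
        // blanks are treated as part of the data
        pos = %scan('IFS': %subst(fixedBuf: 1: fixedLen));
        dsply ('CHAR: ' + %subst(fixedBuf: 1: fixedLen) + '|');

        // VARCHAR: the field can be used directly
        pos = %scan('IFS': varBuf);
        dsply ('VARCHAR: ' + varBuf + '|');

        *inlr = *on;
        return;
```

Forgetting a single %SUBST in the CHAR version silently operates on all 1000 bytes, which is the kind of mistake the VARCHAR version rules out.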

So before changing to use CHAR, I would do a bit of performance testing to see if setting the full VARCHAR to blanks is really an issue.

Try the program below. Call it with increasing values, say 1000, 10000, 100000, until it reports that it took at least one second.

On my machine, calling this program with 10000 iterations took a bit less than a second, meaning it took about 0.0001 seconds for each setting of the VARCHAR field to all blanks. For me, that cost would be too small to justify making my code ugly and error-prone by using %SUBST everywhere, although I guess it would depend on how many times the value was actually going to be used.

Code:

        ctl-opt dftactgrp(*no);
        // packed(15:5) matches how CALL passes a numeric literal
        dcl-pi *n;
           iters_parm packed(15:5) const;
        end-pi;
        dcl-s iters int(10);
        dcl-s i int(10);
        dcl-s fld varchar(2000000);
        dcl-s t_start timestamp;
        dcl-s t_end timestamp;
        dcl-s seconds packed(5 : 2);
        iters = iters_parm;
        t_start = %timestamp();
        for i = 1 to iters;
           // Assigning *blanks fills the VARCHAR to its maximum length
           fld = *blanks;
        endfor;
        t_end = %timestamp();
        // *ms is microseconds, so divide to get seconds
        seconds = %diff(t_end : t_start : *ms) / 1000000;
        dsply ('That took ' + %char(seconds) + ' seconds');
        return;

**Scott Klement** · October 20, 2016, 04:20 PM

Originally posted by gwilburn

So my question is whether I should use VARCHAR at all? If I set the field length to its max before using it, do I still get the performance gain of VARCHAR versus CHAR?

Well, you don't keep it at max, do you? In my post, I set it to max before the read, and set it to the proper length after the read so that it ended up with the proper length.

Assuming your program utilizes the string after the part where you set the length to the proper length, you should get a performance benefit.

An even bigger advantage to using varchar is just that it makes your code so much simpler and more elegant, not worrying about all the %TRIMs and other junk that goes with fixed-length strings.

**gwilburn** · October 21, 2016, 06:23 AM

Good info!

No, I do not keep it at the max length. I later pass it (as a pointer) to other subprocedures that translate it to ASCII and calculate the MD5 hash.

I really appreciate the explanation.

Using the read() API with VARCHAR
