ibmi-brunch-learn

rgzpfm vs cpyf clrpfm cpyf

  • rgzpfm vs cpyf clrpfm cpyf

    We have an older file that needs its deleted records removed and its arrival sequence changed to match its keyed sequence. It is a large file (50 million records, 10 million deleted records) and it has quite a few logicals and join logicals associated with it. I was just going to do a RGZPFM FILE(FILE) KEYFILE(*FILE), but my supervisor said the last time they did that, it brought the system to a crawl. He wants me to do a CPYF to a temp file, a CLRPFM on the original, and then a CPYF back. To me this seems like it would take longer and consume more resources, but I don't want to be the one who disobeys and brings the system down.

    Just looking for experience on whether it makes much difference, or whether there are any gotchas associated with the CPYF/CLRPFM/CPYF solution.

  • #2
    If you go the CPYF route, then one tip would be to change the indexes to rebuild (CHGLF MAINT(*REBLD)) before you copy the data back to the production file. After the copy is completed, change them back to immediate (CHGLF MAINT(*IMMED)) or whatever the original value was.
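
A minimal CL sketch of the sequence described above. It assumes hypothetical names (MYLIB/BIGPF for the production file, MYLIB/BIGLF1 and MYLIB/BIGLF2 for its logicals, WORKLIB/BIGPFTMP for the work file) and that the original maintenance value was *IMMED. It has no error handling and assumes no other job has the files open. MAINT(*REBLD) is not allowed on access paths that require unique keys, so such logicals would have to be left as they are; it is worth checking the CHGLF help for any other restrictions (join logicals, for example) before relying on this.

```cl
PGM
   /* All object names below are hypothetical placeholders.          */

   /* 1. Copy to a work file. With no FROMRCD/FROMKEY given, CPYF    */
   /*    reads a keyed from-file in key order, and the default       */
   /*    COMPRESS(*YES) leaves deleted records behind.               */
   CPYF FROMFILE(MYLIB/BIGPF) TOFILE(WORKLIB/BIGPFTMP) +
        MBROPT(*REPLACE) CRTFILE(*YES)

   /* 2. Defer index maintenance on each dependent logical.          */
   CHGLF FILE(MYLIB/BIGLF1) MAINT(*REBLD)
   CHGLF FILE(MYLIB/BIGLF2) MAINT(*REBLD)

   /* 3. Empty the production file, then copy back. FROMRCD(1)       */
   /*    makes CPYF read the work file in arrival sequence, which    */
   /*    is already the keyed order produced by step 1.              */
   CLRPFM FILE(MYLIB/BIGPF)
   CPYF FROMFILE(WORKLIB/BIGPFTMP) TOFILE(MYLIB/BIGPF) +
        MBROPT(*ADD) FROMRCD(1)

   /* 4. Restore the original maintenance value (assumed *IMMED).    */
   CHGLF FILE(MYLIB/BIGLF1) MAINT(*IMMED)
   CHGLF FILE(MYLIB/BIGLF2) MAINT(*IMMED)
ENDPGM
```

In a real job, the list of logicals would come from DSPDBR on the physical file, and each one's current MAINT value would be recorded before it is changed.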

    • #3
      I see in other sources that, after changing back, you should do an OPNDBF on each of the logicals to force the access path rebuild. I suppose that will take a while, and those logicals won't be available until the rebuild is done, but that sounds like the preferred method for large files.

      Thanks for the info.

      • #4
        Actually, just doing the CHGLF MAINT(*IMMED) will rebuild the access path. The rebuild will be offloaded to one of the system jobs (I forget the name of it), so it won't affect interactive performance. If you do an OPNDBF from your interactive job, then the access path rebuild will be done interactively which may affect interactive performance depending on the size of your machine.

        • #5
          I was going to submit the whole thing to batch so none of it would run interactively. Now, though, he wants to just do a read/write program using input primary files because of record blocking. A little old school, but I'm not sure I'm going to argue.

          Thanks again.

          • #6
            Another way to get blocking and still use the CPYF command is to run an OVRDBF FILE(filename) SEQONLY(*YES xxxx) command for each file before the CPYF, replacing xxxx with the number of records to read in a block.
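
A short sketch of that suggestion, assuming hypothetical file names and an arbitrary block size of 1000 records. The SEQONLY value that actually helps depends on record length and would need testing on the real file.

```cl
/* Hypothetical names; 1000 is an arbitrary illustrative block size. */
OVRDBF FILE(BIGPF)    SEQONLY(*YES 1000)
OVRDBF FILE(BIGPFTMP) SEQONLY(*YES 1000)

CPYF FROMFILE(MYLIB/BIGPF) TOFILE(WORKLIB/BIGPFTMP) +
     MBROPT(*REPLACE) CRTFILE(*YES)

/* Remove the overrides so later opens in this job are unaffected.   */
DLTOVR FILE(BIGPF)
DLTOVR FILE(BIGPFTMP)
```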

            • #7
              Brian Rusch's comment is appropriate. If you do CHGLF {file-name} MAINT(*IMMED) to change back from MAINT(*REBLD), you should see one or more QDBSRVnn jobs with 'Function' IDX-{file-name} under the system jobs portion of WRKACTJOB. If you change a bunch of LFs at once, you should see them cycle through at least a couple QDBSRVnn jobs at a time.

              There are a few more technical details that others might cover, but some comments can be added to the general thread even if they aren't really needed after the above. (And others might correct any details below.)

              Originally posted by jj_dahlheimer:
              "I was going to submit the whole thing to batch so none of it will be run interactively."

              Were the various OPNDBFs all going to run in the same batch job (i.e., one after the other)? Or were they going to be individually submitted through a multi-threaded *JOBQ?

              "Though now he wants to just do a read/write using input primary files because of record blocking, little old school..."

              In terms of OS/400, it's just about as old-school as OPNDBF. Regardless, "record blocking", input primary and read/write are mostly unrelated to index rebuilding, except that an 'open' of the file must happen before any of them does anything. There's no need for any of those. OPNDBF mostly accomplishes the 'open' without adding much of what comes with the rest.

              This could possibly be done with one CL program that receives a name from a *DTAQ, puts it into OPNDBF, then ends. Load the *DTAQ and submit a bunch of calls to your program (to a multi-threaded *JOBQ would probably be better).

              In short, just use the CHGLF MAINT() parameter to switch back and forth, and be done with it. Do this when user jobs aren't accessing affected LFs. (If you're recreating the LFs and they have related sequencing, you might want to create them in an order that allows efficient access-path sharing.)
              Tom

              There are only two hard things in Computer Science: cache invalidation, naming things and off-by-one errors.

              Why is it that all of the instruments seeking intelligent life in the universe are pointed away from Earth?

              • #8
                Original plan was just to do a reorg. I'm not sure why, but they said that last time it caused a lot of system performance issues. Around here, the system performance issue that causes big problems is our website reporting dropped database connections, even though none of them use this file.

                Second plan was to do the cpyf clrpfm cpyf, and I had it coded to do the chglf commands one right after another, all running in one batch job, figuring this would prevent any kind of system performance issue. But again I was told that the last time we tried something like this, the index rebuild caused system performance issues when it kicked off. I think, though, that last time it was happening interactively because the chglf commands weren't run. But it's hard to know.

                So, third iteration: I now have a simple CL that runs a program to read the original file as input primary and write to a temporary file, clears the original file, and then reads the temp file as input primary and writes to the original file.

                FWIW, I created two CL programs and ran them interactively: one just did the cpyf clrpfm cpyf with no chglf commands, and the second was the read primary CL. I ran both on a test file with a million records in it, with sequence numbers and deleted records from the original file. The cpyf took only 25 seconds while the read primary took 39 seconds. The test file had only one logical attached to it, so I know the time will get a lot longer on the real file, but the test seems to show that the cpyf is a lot faster regardless.

                I am going to present my findings and see if they want to give the cpyf method another try.

                • #9
                  Do you know what may have caused the performance issues? Could it be the index rebuilds? If that's the case, then I don't see how CPYF/CLRPFM/CPYF will solve the issue, as all processes that want to access the logicals will have to wait until the indexes are rebuilt. I'm not sure why the CPYF approach would be better than a RGZPFM.
                  Have you considered the RBDACCPTH parameter on the RGZPFM? The default is *YES, which causes the indexes to be rebuilt at the end of the reorg. You could specify *NO so they are maintained during the reorg process; maybe that would help resolve the performance issues. As it stands, without knowing what caused the performance issues, you are stabbing in the dark.

                  • #10
                    I don't know exactly what caused the issues; I am only going off what I have been told happened. So yes, I am kind of stabbing in the dark. I'm just trying to gather as much info as possible, present it, and see how they want me to do it.

                    • #11
                      Maybe someone with more intimate knowledge of the mechanics of the reorg process can help more, but I don't really see much difference between a RGZPFM and a copy/clear/copy. When a reorg is run with ALWCANCEL(*NO), the system creates a temporary file, and I believe (rightly or wrongly) that it does basically the same sort of thing.
                      The performance issue you mention is confusing. Is that the performance of your overall system or of your application? I wouldn't have expected a file reorg to significantly impact system performance, so if it's the app, does that mean your application is using the file while you're doing the reorg? How will it cope if you suddenly clear the file?
                      You also mention that you have some join LFs. Are you reorging multiple files at the same time? Could two PFs with a shared LF be reorged at the same time? The latter could be problematic, as it may result in the LF indexes not being rebuilt until they are first touched.
                      Last edited by john.sev99; February 21, 2017, 03:26 PM.

                      • #12
                        I agree, there shouldn't be a big difference between reorg, and the copy/clear/copy.

                        I was just told it caused performance issues: the website getting dropped db calls and interactive sessions going slow. It has been a long time since they attempted it, so things may be different now, and perhaps it was caused by an unrelated event. I think that is why they are leaning towards just doing a read/write program.

                        I will only be doing one PF, but I don't know about last time; maybe they were doing two.
