ibmi-brunch-learn

Missing records at 4:30 AM


  • Missing records at 4:30 AM

    Hi,
    I got a real quirky issue:

    A program that we use all day is first used at 4:30 well after all backups and other maintenance jobs.
    At this time of the morning when the user runs the program, the user reports that records are missing; expecting 28 but only gets 21.
    (Review of his job-log shows no errors of any kind yet he is missing records.)

    When I come in at 8:00, I tell him to exit the program and I delete all the records from the file they were written to.
    Now when the user tries again he gets the expected record count of 28.

    I go into my test system using the same criteria and I get 28 records.
    All day long the correct record counts are written but the next day at 4:30, this user comes up short again.

    I thought it was the user profile but we've eliminated that as his profile was recreated numerous times and lastly copied from another user that does not have issues.

    I maintain that something else running at 4:30 is causing interference even though I know that if that were the case, there would be errors.

    Can anyone offer any other possibilities?

    Red.

    Everyday's a school day, what grade are you in?

  • #2
    Can you just check the journal?



    • #3
      What journal?


      • jtaylor___
        jtaylor___ commented
        Sorry. I sometimes forget the realities of the typical IBMi shop.

    • #4
      Usually tables on an iSeries are Journalled (STRJRNPF). Depending on how journalling is configured (assuming you are using journalling), all add/update/delete activity is logged to a Journal/Journal Receiver, with the timestamp and the name of the program & job. You can view the Journal using DSPJRN or (my preference) SQL function DISPLAY_JOURNAL() to see if the relevant input records existed at the time this program was run, or if they were created after (by finding the journal records where the records were added).
      This all assumes that your tables are journalled. If you use commitment control, then they must be journalled.
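
The check described above comes down to comparing the time each input record was added with the time the 4:30 job ran. Here is a minimal Python sketch of that comparison, assuming the journal entries have already been exported into simple tuples. The tuple layout, keys, and timestamps are hypothetical and do not reflect the actual output format of DSPJRN or DISPLAY_JOURNAL().

```python
from datetime import datetime

def added_after_run(entries, run_time):
    """Return keys whose 'add' entry is stamped later than run_time.

    entries: iterable of (timestamp, entry_type, record_key) tuples,
    a hypothetical, simplified stand-in for exported journal entries.
    """
    return sorted(
        key for stamp, entry_type, key in entries
        if entry_type == "add" and stamp > run_time
    )

# Hypothetical data: two records added before the run, one after it.
run_time = datetime(2024, 1, 15, 4, 30)
entries = [
    (datetime(2024, 1, 15, 2, 10), "add", "ITEM-001"),
    (datetime(2024, 1, 15, 3, 55), "add", "ITEM-002"),
    (datetime(2024, 1, 15, 4, 42), "add", "ITEM-003"),
]
late_keys = added_after_run(entries, run_time)
```

Any key this returns was not yet in the file when the job ran, which would explain a short count without any error in the job log.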



      • Vectorspace
        Vectorspace commented
        P.S. You can tell whether a table is journalled, and which Journal object it is journalled to, using DSPFD.

    • #5
      You don't tell us what the source of the 4:30 user's data is.
      If a program generates the data and has not finished when the user runs the report, then the cause
      could be buffering. When the generating program writes to the file, the records are
      held in a buffer and are not written to disk until the buffer is full or the file is closed. This has often caused
      some confusion.
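
The buffering effect can be illustrated with ordinary file I/O. The sketch below is a general Python analogy, not IBM i record blocking itself: a writer with an 8 KB buffer writes 28 small records, and a second reader counts what is on disk before and after the writer closes the file. The function name, record content, and buffer size are assumptions chosen for the demonstration.

```python
import os
import tempfile

def visible_record_counts(total=28, record=b"ITEM\n"):
    """Return (records visible before close, records visible after close)."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # The writer keeps small writes in memory until the buffer
        # fills or the file is closed.
        writer = open(path, "wb", buffering=8192)
        for _ in range(total):
            writer.write(record)
        with open(path, "rb") as reader:
            before = len(reader.read().splitlines())
        writer.close()  # closing flushes the buffer to disk
        with open(path, "rb") as reader:
            after = len(reader.read().splitlines())
        return before, after
    finally:
        os.remove(path)

before_close, after_close = visible_record_counts()
```

If the reader's count changes only once the writer closes the file, buffering is the likely explanation. If the count is the same before and after, as it would be for a file that is already complete, the cause lies elsewhere.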

      regards
      Peder



      • #6
        Strange idea, could it be it is an SQL defined table and you try to write invalid numeric values (for example blanks) with native I/O, because the data structures are not properly initialized?

        Birgitta



        • #7
          Maybe running under commitment control and commit has not happened yet?



          • #8
            All,
            Very interesting ideas have been presented; however, there are several data files involved: client master, machine master, client machine master, machine types, etc. None of those input files are being used (according to my manager) at 4:30 in the morning, and if they were, I would surely see some errors in the job log, no?

            My program requires some basic data input: client# & date. Then it retrieves data from the master files, writes it to a work file, and displays those same records in a subfile. The results correspond to items being scanned as physically selected by the user (taken from a drawer or bin and placed in a cart) so as to ensure no items are missed. When they scan an item, the screen is updated, and once all items have been scanned, the process continues to the next phase, like packaging and then shipping, etc. Certain processes require a set number of items. Sometimes there are missing items, thus my initial posting.

            The records written to the work file do have to pass various selection criteria, but their values do not change. So if they were not selected when the user runs the job, they would not be selected when I or anyone else runs the job either, yet they are.

            Very strange.
            Red.

            PS: None of the involved files are journaled, nor is this process under commitment control.



            • #9
              If you don't have any journal, I'd add before-delete triggers to the files in question and log when, by whom, and by which program each row is deleted, as well as the row being deleted.
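
The idea of a before-delete trigger that copies each doomed row into a log can be tried out with SQLite from Python. This is only an analogy under stated assumptions: the table and column names are hypothetical, the syntax is SQLite's rather than Db2 for i's, and SQLite can record only the row and the time, not the user or program.

```python
import sqlite3

def demo_delete_audit():
    """Delete rows from a table and return (rows remaining, item_ids logged)."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE work_items (item_id INTEGER PRIMARY KEY, client_no INTEGER);
        CREATE TABLE delete_log (
            item_id    INTEGER,
            client_no  INTEGER,
            deleted_at TEXT DEFAULT CURRENT_TIMESTAMP
        );
        -- Copy every row into the log just before it is deleted.
        CREATE TRIGGER log_before_delete BEFORE DELETE ON work_items
        BEGIN
            INSERT INTO delete_log (item_id, client_no)
            VALUES (OLD.item_id, OLD.client_no);
        END;
    """)
    con.executemany(
        "INSERT INTO work_items VALUES (?, ?)",
        [(item_id, 100) for item_id in range(1, 29)],
    )
    con.execute("DELETE FROM work_items WHERE item_id > 21")
    remaining = con.execute("SELECT COUNT(*) FROM work_items").fetchone()[0]
    logged = [row[0] for row in
              con.execute("SELECT item_id FROM delete_log ORDER BY item_id")]
    con.close()
    return remaining, logged

remaining, logged = demo_delete_audit()
```

An empty log after the next short count would rule out deletions and point back toward the selection logic or the input data.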

              A long time ago we had a situation where a customer insisted that in our software "orders would disappear". I added a before-delete trigger to the order header and order detail files. Two days later he complained again, and when looking at the job log I found out that the missing orders were deleted manually (with STRSQL) ... by the guy who complained!!!

              Birgitta



              • #10
                I reread your first post.
                You said that when arriving at 8 you ask the user to exit the program then you delete the records in the file.

                How many records do you see in the file using DSPFD BEFORE the user exits the program?
                And how many AFTER the user exits the program?

                If there is a difference then you have a case of buffering.
                When the program closes the files, the buffers are written to the file.

                If this is not the case then I would focus on the transaction files assuming that the master data are static and do not change.
                Or look at the way the user uses the application. There might be a difference between how the user uses it at 4:30 and how
                you use it at 8:00, for example pressing or not pressing a function key, or scrolling at the wrong moment.
                Not knowing the application I can of course only speak in general terms.

                Regards
                Peder



                • #11
                  Peter,
                  The user has a control sheet generated from another process. On that sheet is the grand total of items that should be accounted for. For the incident that led me to create the original post, the control sheet said there should have been 28 items. The user got 21 and stopped knowing something was wrong. He did not exit the program. When I came in and looked at the work file, there were 21 records in it. I had the user exit the program and 21 records still remained. I deleted the 21 and had the user redo the request and then he got 28 - the required amount.

                  The rest is history and a mystery. (no pun intended)
                  Red.


                  • #12
                    Thank you for the "like".

                    Assuming that the program generating the control sheet also generates the file used by
                    the program the user runs (the one giving 21 items instead of 28), I would focus on that.
                    1. What records are missing?
                      Any special things about them?
                      Is it the first records or the last records?
                      Perhaps a specific customer number ?
                    2. Could it be an invalid record that messes up everything?
                      Maybe a negative customer number - a user could by accident have pressed Field- ( I have experienced this ).
                      And be aware that edit codes can hide the sign !
                    3. Is it the same programs running both at 4:30 and 8:00 ?
                    4. By the way is this a new incident?
                      When did it start?
                      What was changed on your system then?
                    5. Perhaps there is a bug in one of the programs that is activated when a specific sequence occurs.
                      I once found a bug that came after 200.000 records were read from a file. Only at that time a specific sequence occurred.
                      Also check for missing handling of record not found. This causes the previous record to be used ( old values ).
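
The last point, a missing record-not-found check that leaves old values in place, is easy to reproduce in any language. Below is a minimal Python sketch of the pattern. The master data, item numbers, and function name are hypothetical, and the dictionary lookup merely stands in for a keyed read against a master file.

```python
def describe_items(item_ids, master, check_found):
    """Look up each item; optionally handle the 'record not found' case."""
    description = ""  # work field reused across iterations
    results = []
    for item_id in item_ids:
        if item_id in master:
            description = master[item_id]
        elif check_found:
            description = "<not found>"
        # Without the check, a failed lookup leaves the previous
        # item's description in the work field.
        results.append((item_id, description))
    return results

# Hypothetical master data: item 2 has no master record.
master = {1: "bolt", 3: "nut"}
without_check = describe_items([1, 2, 3], master, check_found=False)
with_check = describe_items([1, 2, 3], master, check_found=True)
```

Here `without_check` reports item 2 as "bolt", the stale value from item 1, so a later selection test could accept or reject the record on data that belongs to a different item.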
                    Regards
                    Peder



                    • #13
                      Peder,
                      First; you're welcome.
                      Second; my apologies for misspelling your name. I tried to edit my post but was unable.
                      Lastly; this issue just came up mid last week, but no programming changes were made prior, in fact for weeks prior. The same program ran both times. As for the rest of your suggestions: thought of and researched, but to no reasonable conclusion or even additional thoughts. It's happened 3 times in the past 8 days. It's a program that is used every day, several times a day. (Once each phase is completed, the next phase is automatically generated from the previous completed records. This issue only happens at the beginning of the process, never downstream, and never during the day at the beginning of the process.)

                      Red.


                      • #14
                        Looks like 1 of 2 things.
                        An error in the data.
                        Or a bug in a program.

                        If you have a colleague, have him/her sit next to you and start scrutinizing the code from the beginning.
                        You explain what should happen in the program.
                        If you have a fresh set of eyes it might help.

                        30 years ago I was very new and inexperienced and one of my very experienced colleagues asked me to
                        take a look at some code he was debugging. He couldn't figure out what was wrong.
                        He told me what the program was doing:

                        "..... and here 1 is added to X" he said.

                        No, I said it is not an X it is a Y.

                        The bug was finally found.


                        Regards
                        Peder



                        • #15
                          It's hard to give any real definitive assistance with just an outline of the process. One thing you may want to look at is how you are retrieving and processing the records. Are you assuming the records are being presented in e.g. arrival sequence but not specifying the record order? My guess is there's a program bug that some recent change (e.g. PTFs) has exposed.
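
To illustrate the kind of ordering assumption mentioned above, here is a small Python sketch of a read loop that stops at the first key change. The records and client numbers are hypothetical and the loop is not taken from the program under discussion. It simply shows how the same data can yield fewer records when the read order is not what the code assumes.

```python
def collect_contiguous(records, client_no):
    """Collect a client's records, stopping at the first key change.

    This mirrors a read loop that assumes all records for one key arrive
    together, which only holds if the input is ordered by that key.
    """
    collected = []
    for key, item in records:
        if key == client_no:
            collected.append(item)
        elif collected:
            break  # key changed after a match: the loop assumes it is done
    return collected

# Hypothetical records as (client_no, item) pairs in arrival order.
arrival_order = [(100, "A"), (100, "B"), (200, "X"), (100, "C")]

unordered_result = collect_contiguous(arrival_order, 100)
ordered_result = collect_contiguous(sorted(arrival_order), 100)
```

With the records in arrival order the loop returns two items and silently drops "C". With an explicit ordering it returns all three. No error is raised either way, which fits a clean job log.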

