ibmi-brunch-learn


Journal of file lock ?


  • Journal of file lock ?

    Hello,

    We have a multimember physical file that is used for Excel extractions. The program begins by allocating the file so that it can grant the user authority on the file, and it also creates a member for the user. The file is deallocated once this is done. Then a calculation is performed on a file in QTEMP, after which, if the user chooses to extract the results, the data is copied from QTEMP to the member that was created. The user retrieves the data by running a data transfer (the green arrow icon, image trf.png) to an Excel file.



    We are having issues with the program crashing because it says the file is locked. However, when this happens, we use WRKOBJLCK and see that no one is locking the file. The file has a 30-second wait time before timing out.

    We have tried to figure out why we keep getting these crashes but have found no answer. Is there any journal or history of file locks? What we need is a history of QSYS2.OBJECT_LOCK_INFO, or something equivalent. Is the file locked while the data transfer is taking place? Is it possible that the crashes occur when one user is extracting a very large file, taking over 30 seconds, and a second user happens to call the program at that time?

    Thanks.

  • #2
    After WRKOBJLCK, do you see anything with F6?

    F6=Work with member locks



    • #3
      No, nothing. The file is unlocked by the time we are able to run WRKOBJLCK. So we don't know if the file just happened to be unlocked right before we checked or if something else is going on.



      • #4
        You could query OBJECT_LOCK_INFO result into another file at the beginning of the program, and if the count > 0 then skip over the next steps and look into the result lock info.
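
        A minimal CL sketch of that idea, taking the snapshot at the moment an allocation times out rather than at every start. The log table MYLIB.LOCKLOG and its columns are hypothetical and would have to be created beforehand; OURFILE is the file name used later in this thread. The same RUNSQL could be placed before whichever step actually fails.

        ```cl
        PGM
          /* Try the exclusive allocation; wait up to 30 seconds */
          ALCOBJ     OBJ((OURFILE *FILE *EXCL)) WAIT(30)
          /* CPF1002 = the object could not be allocated */
          MONMSG     MSGID(CPF1002) EXEC(DO)
            /* Record who holds locks right now in the            */
            /* (hypothetical) log table MYLIB.LOCKLOG             */
            RUNSQL     SQL('INSERT INTO MYLIB.LOCKLOG +
                         (LOGGED_AT, JOB_NAME, LOCK_STATE, +
                         LOCK_STATUS, LOCK_SCOPE) +
                         SELECT CURRENT TIMESTAMP, JOB_NAME, +
                         LOCK_STATE, LOCK_STATUS, LOCK_SCOPE +
                         FROM QSYS2.OBJECT_LOCK_INFO +
                         WHERE SYSTEM_OBJECT_NAME = ''OURFILE'' +
                         AND OBJECT_TYPE = ''*FILE''') +
                       COMMIT(*NONE) NAMING(*SQL)
            RETURN
          ENDDO
          /* normal processing would continue here */
        ENDPGM
        ```

        Because the snapshot is written while the conflicting lock is most likely still held, the JOB_NAME column should point at the job to investigate.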



        • #5
          You don't mention what step is causing the lock failure. Is it when the process tries to grant authority or the copy step? When you allocate the object, what lock state are you using?

          In my mind, the change authority step seems a strange thing to do.
          What is the point of changing the authorities? Given it seems the file can be used by multiple people, why aren't they authorised to the file anyway? Why grant them authority later? How is the authority granted? Do you change the ownership of the file or grant them private authority? Do you revoke that later?
          If you keep granting people private authority, then it seems sensible to set up the file with the needed authorities and remove that first step. This could be done via a group profile, for instance.
          If you revoke the authority so that only the person who created the member can download it, then this isn't a good method: when someone else runs this process and the authority is changed, they'll be able to see the member created by someone else. I'm confused as to the point of that step.
          I also don't know why you are allocating the object before granting authority as the OS will handle that.

          How are you copying the data? Copy operations are likely to put a shared lock on the file. Multiple CPYF commands for instance can copy to different members of the same file so my guess is the ALCOBJ is where the issue resides. I think if the authority on this file was sorted and that step removed, this will resolve the issue but I don't know why you are doing that in the first place.



          • #6
            Originally posted by MFisher
            You could query OBJECT_LOCK_INFO result into another file at the beginning of the program, and if the count > 0 then skip over the next steps and look into the result lock info.
            Yes, this is an option. We'll look into creating a program that runs this query and stores the result in a table.

            Originally posted by john.sev99
            You don't mention what step is causing the lock failure. Is it when the process tries to grant authority or the copy step? When you allocate the object, what lock state are you using?

            In my mind, the change authority step seems a strange thing to do.
            What is the point of changing the authorities? Given it seems the file can be used by multiple people, why aren't they authorised to the file anyway? Why grant them authority later? How is the authority granted? Do you change the ownership of the file or grant them private authority? Do you revoke that later?
            If you keep granting people private authority, then it seems sensible to set up the file with the needed authorities and remove that first step. This could be done via a group profile, for instance.
            If you revoke the authority so that only the person who created the member can download it, then this isn't a good method: when someone else runs this process and the authority is changed, they'll be able to see the member created by someone else. I'm confused as to the point of that step.
            I also don't know why you are allocating the object before granting authority as the OS will handle that.

            How are you copying the data? Copy operations are likely to put a shared lock on the file. Multiple CPYF commands for instance can copy to different members of the same file so my guess is the ALCOBJ is where the issue resides. I think if the authority on this file was sorted and that step removed, this will resolve the issue but I don't know why you are doing that in the first place.
            So basically, we have a CLP program and two RPGLE programs. Let's call them CLP1, RPGLE1 and RPGLE2.
            CLP1 starts by allocating the file:
            Code:
            ALCOBJ     OBJ((OURFILE *FILE *EXCL))
            It then creates a member:
            Code:
            ADDPFM     FILE(OURFILE) MBR(&MBR) TEXT(&MBRTXT)
            grants authority:
            Code:
            GRTOBJAUT  OBJ(OURFILE) OBJTYPE(*FILE) USER(&USER) AUT(*USE)
            deallocates the file:
            Code:
            DLCOBJ     OBJ((OURFILE *FILE *EXCL))
            and finally performs an OVRDBF:
            Code:
            OVRDBF     FILE(OURFILE) MBR(&MBR) SHARE(*YES)
            Then it calls RPGLE1, which calculates the results in QTEMP. After RPGLE1 ends, CLP1 calls RPGLE2, which copies the data to our multimember file. Because the user can choose between a spool file and an Excel extraction, RPGLE2 reads the QTEMP file from the beginning (SETLL *START QTEMPFILE), and if the user chose extraction, each READ on the QTEMP file is followed by a WRITE of that record into the multimember file.

            The lock seems to be in place when RPGLE2 is called. The program is unable to open the file because it is locked, and it ends up crashing. So this means that another job is locking the file by the time RPGLE2 is called. Does the system lock a file (or a member) when it is being transferred using the green arrow? Could that be the explanation?

            I guess we grant authority individually so that only those who need it get it, but we can try giving *USE to everyone, which would eliminate the need to change authority. We do not remove authority, since GRTOBJAUT does not fail if the user already has the authority being granted.

            You are right, we don't need to allocate the object: we could just try to add the member and change the authority, and leave the program if either step fails. We may change this and see whether it helps.
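
            A rough sketch of what CLP1 might look like without the ALCOBJ/DLCOBJ pair. Passing &MBR, &MBRTXT and &USER as parameters and their lengths are assumptions made only to keep the example self-contained. MONMSG with the generic CPF0000 catches any CPF escape message, so a real program would probably want to tell "member already exists" apart from a lock failure.

            ```cl
            PGM        PARM(&MBR &MBRTXT &USER)
              DCL        VAR(&MBR)    TYPE(*CHAR) LEN(10)
              DCL        VAR(&MBRTXT) TYPE(*CHAR) LEN(50)
              DCL        VAR(&USER)   TYPE(*CHAR) LEN(10)

              /* No ALCOBJ: each command obtains the lock it needs */
              ADDPFM     FILE(OURFILE) MBR(&MBR) TEXT(&MBRTXT)
              /* Leave the program if the member could not be added */
              MONMSG     MSGID(CPF0000) EXEC(RETURN)

              GRTOBJAUT  OBJ(OURFILE) OBJTYPE(*FILE) USER(&USER) AUT(*USE)
              /* Leave the program if the authority could not be granted */
              MONMSG     MSGID(CPF0000) EXEC(RETURN)

              OVRDBF     FILE(OURFILE) MBR(&MBR) SHARE(*YES)
              /* the calls to RPGLE1 and RPGLE2 would follow here */
            ENDPGM
            ```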


            We will try several things. The crashes don't happen often, only once every so often, so it's not a problem that is easily reproducible.
            Last edited by JustPassing; May 29, 2024, 01:09 AM.



            • #7
              Hard to know the true cause but I believe the EXCL lock is likely where the issue lies. Maybe it's occurring during heavy use, hard to know.
              A few things on what you've mentioned of the process:

              The copy/download will likely place a SHRRD/SHRUPD lock on the file/member.
              According to the help text, ADDPFM places an EXCLRD lock on the file while it runs. An EXCLRD lock is compatible with SHRRD locks held by other jobs, but not with SHRUPD locks.
              Granting authority requires an EXCL lock on the file (this may be the original reason for the ALCOBJ). An EXCL lock is not compatible with any other lock, so the request waits until all other locks are released or it times out.

              The file seems to be effectively public: anyone can use it, because authority is granted and never revoked, so I'd have to assume the data is not sensitive. Granting *PUBLIC *USE therefore seems sensible. This would remove the EXCL lock requirement.
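
              As a one-time setup step, that could be as simple as the command below, run while no job is using the file. It mirrors the GRTOBJAUT already used in CLP1, with *PUBLIC in place of the individual user; whether *USE is enough for your transfer setup is an assumption to check.

              ```cl
              /* One-time setup: give every user *USE authority to the file */
              GRTOBJAUT  OBJ(OURFILE) OBJTYPE(*FILE) USER(*PUBLIC) AUT(*USE)
              ```

              After that, the GRTOBJAUT step can be dropped from CLP1 altogether.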



              • #8
                I haven't had time to try the things recommended here. Today, we had the crash once more. Apparently, it happens because the user submits the calculation and then changes to his alternate session while the calculation is being done. This was seemingly done after the file had been allocated but before it was released. Changing sessions apparently suspends the first session (the programs stop executing), and it's not until the user comes back that the session continues execution.

                I've searched but could not find any information. When you change to an alternate session, does this suspend the execution of programs on the first session? If so, how can we prevent this?


                EDIT: So yes, switching to an alternate session suspends the first:
                Transfer Secondary Job (TFRSECJOB)
                Where allowed to run: Interactive environments (*INTERACT *IPGM *IREXX *EXEC)
                Threadsafe: No
                The Transfer Secondary Job (TFRSECJOB) command creates a secondary interactive job at your work station, then transfers control between the primary and secondary jobs. The first time you issue this command, you receive the sign-on prompt for the secondary job. Once you sign on, a secondary job is created, allowing you to receive the basic working display of the new job. Your primary job remains suspended as long as you remain in your secondary job. The next time you issue the TFRSECJOB command, your current job is suspended, and you return to the first job at the point at which you left it. When you sign off either job, you are automatically returned to the remaining job. There are no parameters for this command.
                Now the question is what to do about it.
                Last edited by JustPassing; June 18, 2024, 03:39 AM.



                • #9
                  Block them from using SYSREQ; instead, have them start a separate emulator session.
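
                  One possible way to do this, sketched under assumptions: option 1 of the System Request menu runs TFRSECJOB, so excluding a user from that command should stop the transfer to an alternate job while leaving the rest of the menu available. USERA is a hypothetical user profile, and the behaviour should be checked against the security reference for your release before relying on it.

                  ```cl
                  /* Hypothetical user USERA: deny use of Transfer Secondary Job */
                  GRTOBJAUT  OBJ(QSYS/TFRSECJOB) OBJTYPE(*CMD) +
                               USER(USERA) AUT(*EXCLUDE)
                  ```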
                  Last edited by MFisher; June 20, 2024, 03:07 PM.

