ibmi-brunch-learn

HTTP job won't end

  • HTTP job won't end

    I can't remember having this issue over the past 10 years of working with the HTTP server, but I've now run into it 3-4 times since January. Possibly coincidentally we upgraded to 7.3 in mid-January.

    The first job in the listing below is hogging CPU, and I can't end it unless I issue the ENDJOBABN command. We have a few different HTTP instances running, and this is the only instance where we've seen it happen.

    It's a little odd there is no function listed for it. There is no call stack either.

    Code:
    Opt  Subsystem/Job  User        Type  CPU %  Function        Status   
           XXXXXXXXXX   QTMHHTP1    BCI    47.4                   THDW    
           XXXXXXXXXX   QTMHHTTP    BCH      .0  PGM-QZHBMAIN     SIGW    
           XXXXXXXXXX   QTMHHTTP    BCI      .0  PGM-QZSRLOG      SIGW    
           XXXXXXXXXX   QTMHHTTP    BCI      .0  PGM-QZSRHTTP     SIGW    
           XXXXXXXXXX   QTMHHTP1    BCI      .0  PGM-jvmStartPa   TIMW    
           XXXXXXXXXX   QTMHHTP1    BCI      .0  PGM-QZSRCGI      TIMW    
           XXXXXXXXXX   QTMHHTTP    BCI      .0  PGM-QZSRHTTP     DEQW    
           XXXXXXXXXX   QTMHHTP1    BCI      .0  PGM-QZSRCGI      TIMW    
           XXXXXXXXXX   QTMHHTP1    BCI      .0  PGM-QZSRCGI      TIMW

  • #2
    They typically end nicely if you end with these options:

    Code:
    ENDJOB JOB(the/job/here) OPTION(*CNTRLD) DELAY(1)
    This sends the proper signals to the HTTP server to shut the job down properly.

    Frustratingly, people try to end them with OPTION(*IMMED). This is where the problems occur, since it tries to kill the job instantly without sending it signals or giving it time to react. This means that the parent job is not notified, and the job must remain active to prevent problems in the parent job, so it doesn't end and gets stuck there. Never use *IMMED with an HTTP server job!


    • #3
      I'll keep that in mind about *IMMED with HTTP server jobs. I failed to mention a key point: the 3-4 times I've tried to end these jobs over the past few months, the job had been running at 50% CPU for hours, if not days. A fellow programmer will do a WRKACTJOB and notice it. Then I attempt to end it.

      I guess it's possible the job was ended with *IMMED before I even looked at it and that's what started the whole mess?



      • #4
        Originally posted by Scott Klement
        ...
        Never use *IMMED with an HTTP server job!
        Exactly! I learned this the hard way. *IMMED always ended the job eventually, but it took several minutes.


        • #5
          With the jobs I've been having issues with recently, we're talking hours to end. I just tried ENDJOBABN and got this gem, even though we tried to end the job hours ago.

          Code:
          Cause . . . . . : The End Job Abnormal (ENDJOBABN) command is not allowed 
           until ten minutes after the job has entered immediate ending.


          • #6
            Originally posted by mjhaston
            I guess it's possible the job was ended with *IMMED before I even looked at it and that's what started the whole mess?
            Possible? But seems more likely that a program got stuck in a loop or something like that.


            • #7
              Wouldn't I see a program in a loop in the call stack though?


              • #8
                Scott - I have a lot of "not okay!" logs generated from YAJL in the DO_GENVALUE procedure. Is there any chance a bunch of these would lock up a job? My job log is 155 pages. IBM took a look at the job log and that was what they pointed out first. I'm sure it's something for me to look at to clean up some data, but I'm not sure that would create a runaway or stalled job.

                PS - not suggesting it's a YAJL issue. I'll have to see what's wrong with my data.
                Last edited by mjhaston; June 12, 2019, 04:08 PM.


                • #9
                  I've not heard of do_genvalue generating any logs, this is all new to me.


                  • #10
                    Looking at the code inside YAJLR4 I see this:
                    Code:
                            if rc <> yajl_gen_status_ok;
                               log('not okay!');
                            endif;
                    The idea here was that this should "never" happen. So I stuck that little bit of diagnostic code in there just to be sure (should've used a better message, though). I've never seen it actually happen before.

                    For some reason the YAJL generator is returning an error code. I will modify YAJLR4 so that it prints what the code is to help diagnose this further.
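For anyone trying to read a numeric generator status once it is printed, here is a minimal lookup sketch in Python. The names and values are the ones I recall from the `yajl_gen_status` enum in the YAJL 2.x C header (`yajl_gen.h`); treat them as an assumption and check them against the header shipped with your copy. The helper `describe_status` is hypothetical and is not part of YAJL or YAJLR4.

```python
# Assumed to mirror the yajl_gen_status enum in YAJL 2.x's yajl_gen.h.
# Verify against the header for the YAJL version you actually run.
YAJL_GEN_STATUS = {
    0: "yajl_gen_status_ok",
    1: "yajl_gen_keys_must_be_strings",
    2: "yajl_max_depth_exceeded",
    3: "yajl_gen_in_error_state",
    4: "yajl_gen_generation_complete",
    5: "yajl_gen_invalid_number",
    6: "yajl_gen_no_buf",
    7: "yajl_gen_invalid_string",
}


def describe_status(rc: int) -> str:
    """Hypothetical helper: map a numeric generator status to its enum name."""
    return YAJL_GEN_STATUS.get(rc, f"unknown generator status {rc}")


if __name__ == "__main__":
    for rc in (0, 4, 99):
        print(rc, describe_status(rc))
```

Anything other than 0 here corresponds to the `rc <> yajl_gen_status_ok` branch in the snippet above.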


                    • #11
                      Just as a note, I started getting this a couple weeks ago too on one client's system. It builds job logs that are 300+ pages long with just this error over and over:

                      CPF9897 Diagnostic 40 08/08/19 15:16:53.215008 YAJLR4 YAJL *STMT YAJLR4 YAJL *STMT
                      From module . . . . . . . . : YAJLR4
                      From procedure . . . . . . : LOG
                      Statement . . . . . . . . . : 4624
                      To module . . . . . . . . . : YAJLR4
                      To procedure . . . . . . . : DO_GENVALUE
                      Statement . . . . . . . . . : 3952
                      Message . . . . : not okay!
                      Cause . . . . . : No additional online help information is available.

                      I am going to see if they have an old version or something.

                      Edit: very old version (2015). Also V5R4 still (smh). They do have a new system ready to implement that they're testing.

                      It's odd this has worked for years as well (we always hear that, don't we haha).

                      JSON still seems to be created fine. Nothing has really changed. We did have one error where a ] was replaced with a }, which is why I went to look. No idea how that happened only once in over 4 years of this running.
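When a job log runs to hundreds of pages, tallying the message text can confirm whether it really is one diagnostic repeated over and over. The sketch below is in Python and assumes the job log has already been copied to a plain text file in the layout shown above; the function name `tally_messages` and the sample text are made up for illustration. It only reads the first line of each `Message . . . . :` entry, so text that wraps onto a second line is not joined.

```python
import re
from collections import Counter

# Matches lines such as "Message . . . . : not okay!" and captures the text.
MESSAGE_LINE = re.compile(r"^\s*Message\s*(?:\.\s*)+:\s*(.*\S)\s*$")


def tally_messages(joblog_text: str) -> Counter:
    """Count how often each first-line message text appears in a job log dump."""
    counts = Counter()
    for line in joblog_text.splitlines():
        match = MESSAGE_LINE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Illustrative sample only, shaped like the entries quoted in this thread.
    sample = """\
From procedure . . . . . . : LOG
Message . . . . : not okay!
Cause . . . . . : No additional online help information is available.
To procedure . . . . . . . : DO_GENVALUE
Message . . . . : not okay!
"""
    for text, count in tally_messages(sample).most_common():
        print(count, text)
```

If one message accounts for nearly every entry, the size of the log by itself says little beyond "the same call failed many times".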

                      Last edited by bvstonebvstools; August 8, 2019, 02:42 PM.


                      • #12
                        Ok, so I loaded the latest YAJL on their new V7R3 system. Got a bunch of 500+ page spooled files on this system, but a different error (looks like Scott updated the description)

                        MSGID TYPE SEV DATE TIME FROM PGM LIBRARY INST TO PGM LIBRARY INST
                        CPF9897 Diagnostic 40 08/09/19 08:17:35.134693 YAJLR4 YAJL *STMT YAJLR4 YAJL *STMT
                        From module . . . . . . . . : YAJLR4
                        From procedure . . . . . . : GENERATOR_ERROR
                        Statement . . . . . . . . . : 5009
                        To module . . . . . . . . . : YAJLR4
                        To procedure . . . . . . . : DO_GENVALUE
                        Statement . . . . . . . . . : 4244
                        Message . . . . : do_genValue: received YAJL generator status 4 for type code 1
                        Cause . . . . . : No additional online help information is available.

                        The file we're trying to do in this case is about 140k records.


                        • #13
                          Thanks for the heads up. I'll install the latest YAJL tonight.
                          Last edited by bvstonebvstools; August 9, 2019, 03:54 PM.


                          • #14
                            Yeah, I thought that message might be more useful than just saying "not okay", haha


                            • #15
                              FWIW, status code 4 means that you tried to generate more JSON after the document was complete. This isn't allowed by the JSON spec, so the generator will send that error and refuse to create the element.
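To make the "document was complete" condition concrete, here is a toy model in Python. It is not YAJL's API or its implementation; the class `TinyJsonGenerator` and its methods are invented for illustration, and only the value 4 is taken from the message discussed here. The idea is simply that once the top-level value has been closed, any further call is refused with a status instead of appending output.

```python
class TinyJsonGenerator:
    """Toy generator (not YAJL) that refuses output after the document is complete."""

    OK = 0
    GENERATION_COMPLETE = 4  # same numeric value as the status in this thread

    def __init__(self):
        self.parts = []
        self.depth = 0           # number of arrays currently open
        self.need_comma = False  # True when the next element needs a "," first
        self.complete = False    # True once the top-level value is finished

    def _emit(self, text):
        if self.complete:
            return self.GENERATION_COMPLETE
        if self.need_comma:
            self.parts.append(",")
        self.parts.append(text)
        return self.OK

    def begin_array(self):
        rc = self._emit("[")
        if rc == self.OK:
            self.depth += 1
            self.need_comma = False
        return rc

    def add_number(self, value):
        rc = self._emit(str(value))
        if rc == self.OK:
            self.need_comma = True
            if self.depth == 0:
                self.complete = True  # a bare top-level value ends the document
        return rc

    def end_array(self):
        if self.complete:
            return self.GENERATION_COMPLETE
        if self.depth == 0:
            raise ValueError("end_array called with no open array")
        self.parts.append("]")
        self.depth -= 1
        self.need_comma = True
        if self.depth == 0:
            self.complete = True  # closing the outermost array ends the document
        return self.OK

    def text(self):
        return "".join(self.parts)


if __name__ == "__main__":
    gen = TinyJsonGenerator()
    gen.begin_array()
    gen.add_number(1)
    gen.add_number(2)
    gen.end_array()
    print(gen.text())
    print(gen.add_number(3))  # refused: the document is already complete
```

In this model, an extra "end" call or a value added after the outermost container was closed is what triggers the refusal, so an unbalanced begin/end sequence in the calling program is one place to look.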
