Mailing List Archive

Running asyncio.run() more than once
We have an application that involves submitting hundreds to thousands of jobs to a shared computing resource, and we're using asyncio to do so because it is far less overhead than threading or multiprocessing for the bookkeeping required to keep track of all these jobs. It makes extensive use of asyncio.create_subprocess_exec(). This was developed mostly in Python 3.9.7.

Normally we know ahead of time all the jobs that need to be run, and this can be accommodated by a single call to asyncio.run(). However, in this new case we need to submit a few hundred jobs, review their results, and compose many more based on what we find. That means a separate call to asyncio.run() is necessary.
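As a minimal sketch of the two-phase pattern I mean (asyncio.sleep() standing in for the real asyncio.create_subprocess_exec() calls; run_job and run_batch are hypothetical names):

```python
import asyncio

async def run_job(job_id: int) -> int:
    # Placeholder for asyncio.create_subprocess_exec() + await proc.wait().
    await asyncio.sleep(0)
    return job_id * 2

async def run_batch(job_ids):
    # Run all jobs in the batch concurrently and collect their results.
    return await asyncio.gather(*(run_job(j) for j in job_ids))

# First batch: submit the jobs we know about up front.
first = asyncio.run(run_batch(range(3)))

# Review the results synchronously, then compose and submit a second batch.
followup = [r for r in first if r > 0]
second = asyncio.run(run_batch(followup))
```

Each asyncio.run() call here creates and tears down its own event loop; the question is whether anything process-global survives between them.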

I have tried calling our app twice, and during the second iteration things hang indefinitely. Processes get launched, but eventually it stops reporting job completions.

I have added debug=True to the asyncio.run() keyword args, but I'm not sure what I should be looking for that might tell me what's wrong. It may be something I'm doing, but the docs are ambiguous on this point, so it could also be a fundamental limitation of asyncio.

Is what I'm trying to do going to be impossible to accomplish? I would hate to have to rig up some sort of crazy async/sync queue system to feed jobs dynamically, all because of this problem with asyncio.run().

Thanks,

-Clint
--
https://mail.python.org/mailman/listinfo/python-list
Re: Running asyncio.run() more than once
On Monday, March 13, 2023 at 11:55:22 PM UTC-7, gst wrote:
> On Tuesday, March 14, 2023 at 02:32:23 UTC-4, Clint Olsen wrote:
> I'm not an asyncio expert or even an advanced user, but wouldn't it work to use a simple list to hold the jobs to execute, and to refill it as necessary after gathering results?
>
> ```
> async def execute_jobs(jobs: list["Job"]) -> None:
>     while jobs:
>         # launch_job(s)
>         # gather_job(s)_result(s)
>         # append_jobs_if_desired
> ```

The problem with this implementation is that most/all of the code calling this app is not async code. So, we'd need a method (and thread) to communicate between the sync and async worlds.

A possible implementation is here: https://stackoverflow.com/questions/59650243/communication-between-async-tasks-and-synchronous-threads-in-python

So, while this is certainly possible, it would be much more straightforward to just call asyncio.run() more than once.
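For the record, the bridge itself can be fairly small. A hedged sketch using only stdlib calls: asyncio.run_coroutine_threadsafe() is the documented way for synchronous code to submit work to an event loop running in another thread (run_job is a hypothetical stand-in for the real subprocess jobs):

```python
import asyncio
import threading

# A background thread owns the event loop; sync code submits coroutines to it.
loop = asyncio.new_event_loop()
t = threading.Thread(target=loop.run_forever, daemon=True)
t.start()

async def run_job(job_id: int) -> int:
    await asyncio.sleep(0)  # stand-in for create_subprocess_exec() + wait()
    return job_id * 2

# Synchronous caller: block on the concurrent.futures.Future returned here.
fut = asyncio.run_coroutine_threadsafe(run_job(21), loop)
result = fut.result(timeout=5)

# Shut the loop down cleanly when no more jobs are coming.
loop.call_soon_threadsafe(loop.stop)
t.join()
```

Still, agreed that a single long-lived loop plus a thread is more machinery than simply calling asyncio.run() again.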

Thanks,

-Clint
Re: Running asyncio.run() more than once
On Monday, March 13, 2023 at 11:32:23 PM UTC-7, Clint Olsen wrote:
> We have an application that involves submitting hundreds to thousands of jobs to a shared computing resource, and we're using asyncio to do so because it is far less overhead than threading or multiprocessing for the bookkeeping required to keep track of all these jobs. It makes extensive use of asyncio.create_subprocess_exec(). This was developed mostly in Python 3.9.7.

Good news! I did end up finding the source of the problem. I kept looking at uses of global data as a potential source of bugs and didn't really find anything there. I did fix some potential signal handler problems. However, I had this particular piece of code that I needed in order to keep track of my child processes:

```
watcher = asyncio.FastChildWatcher()
watcher.attach_loop(asyncio.get_event_loop())
asyncio.set_child_watcher(watcher)
```

Since the child watcher is process-global state that persists across successive asyncio.run() calls, doing this again is bad. I just needed to ensure I set it only once.

Thanks,

-Clint