Monday, 11 September 2017

Laravel queues and Supervisor: PHP runs out of memory and parallel processing issues

I am working on data import functionality for a website. I am using Laravel 5.1 on Homestead and have given the machine 4 GB of RAM and two virtual CPUs. To import the data I use the Laravel Excel package with its chunking mechanism, which breaks the data set into chunks and adds the processing of each chunk to the queue. The workflow is as follows:

  • the user uploads the excel file;
  • a job is dispatched to handle the excel file;
  • this job then chunks the file and dispatches multiple other jobs to process the chunked data;
  • each of these chunk jobs, once it is done, dispatches another job to run some background logic on the inserted data.

I have set up a queue named 'import' in Supervisor for these processes. Alongside it, I have a default queue (for sending mail and other, less intensive work) and a low-priority queue for jobs dispatched from other jobs. Here is my Supervisor configuration:

[program:laravel-worker]
process_name=%(program_name)s_%(process_num)02d
command=/usr/bin/php /home/vagrant/Code/iris/artisan queue:work --tries=1 --daemon --queue=default,import,background1
autostart=true
autorestart=true
user=vagrant
numprocs=8
redirect_stderr=true
stdout_logfile=/home/vagrant/Code/iris/storage/logs/worker.log

With smaller files this works well, but when I try to import a ~30k-row document, the PHP processes spawned by Supervisor run out of memory toward the end, and in the Laravel log I start seeing InvalidArgumentException: No handler registered for command [__PHP_Incomplete_Class] in Illuminate\Bus\Dispatcher. This happens especially when I run two imports in parallel or try to download something through websockets, and I am really confused as to why. As far as I can tell from memory_get_usage(), no single process exceeds the 512 MB limit. Is this a garbage collector issue? Should I invoke it manually?

Since I mentioned websockets: I was also trying to create a separate, higher-priority queue for handling websocket requests. I tried several Supervisor configurations (a dedicated worker in the Supervisor configuration files, adding the queue to the --queue= option, and so on), but to no avail. I implemented report downloading through websockets, and it works fine by itself, but when there are other items in the queue, the socket request is only handled after several items in the lower-priority 'import' queue have finished, which leads me to believe that I do not understand queue priorities very well. Is there a way to have a separate queue for socket requests that responds to them immediately?
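
One way to picture the dedicated-worker setup mentioned above is a second Supervisor program that listens only on the socket queue, so that its processes are never occupied by long-running import jobs. This is a minimal sketch under stated assumptions, not a confirmed fix: the program name laravel-socket-worker, the queue name 'sockets', the log file name and numprocs=2 are hypothetical; the paths and remaining options are copied from the configuration shown earlier.

```ini
; Hypothetical dedicated worker: it serves only the assumed 'sockets' queue,
; so import chunks can never occupy these processes.
[program:laravel-socket-worker]
process_name=%(program_name)s_%(process_num)02d
command=/usr/bin/php /home/vagrant/Code/iris/artisan queue:work --tries=1 --daemon --queue=sockets
autostart=true
autorestart=true
user=vagrant
; Assumed process count; tune to the expected number of concurrent requests.
numprocs=2
redirect_stderr=true
; Assumed log file name, kept separate from worker.log.
stdout_logfile=/home/vagrant/Code/iris/storage/logs/socket-worker.log
```

For this to have any effect, the socket jobs would have to be pushed onto that queue, and the other workers would not list it. Within a single worker, the order given in --queue= only decides which queue is checked first when the worker picks its next job. A job that is already running is not interrupted, so if all eight workers are busy with import chunks, a higher-priority job still waits until one of them finishes.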



via Chebli Mohamed
