Here's the guts of the program using Parallel::ForkManager. It seems to stop at around 200 processes, sometimes around 30, depending on the size of the PostgreSQL query that collects the URLs to send to Mojo::UserAgent. There seem to be hard limits somewhere. Is there a better way to write this so that I don't run into them? The machine it runs on has 16 CPUs and 128 GB of memory, so it can certainly handle more than 200 processes that die after the Mojo::UserAgent timeout, which is generally 2 seconds.
use Parallel::ForkManager;
use Mojo::Base -strict;
use Mojo::UserAgent;
use Mojo::Pg;
use Math::Random::Secure qw(rand irand);
use POSIX qw(strftime);
use Socket;
use GeoIP2::Database::Reader;
use File::Spec::Functions qw(:ALL);
use File::Basename qw(dirname);
use feature 'say';
my $max_kids = 500;
my $timeout  = 2;    # Mojo::UserAgent request timeout in seconds
my @url;             # URLs collected by do_auth()
sub do_auth {
    ...
    push( @url, $authurl );
}
do_auth();
my $pm = Parallel::ForkManager->new($max_kids);
LINKS:
foreach my $linkarray (@url) {
    $pm->start and next LINKS;    # fork a child for this URL
    my $ua = Mojo::UserAgent->new( max_redirects => 5, request_timeout => $timeout );
    $ua->get($linkarray);         # fetch the URL in the child
    $pm->finish;                  # child exits here
}
$pm->wait_all_children;
Most likely you are running into an operating system limit on processes rather than anything in Perl itself; on Linux the usual suspect is the per-user process limit reported by ulimit -u. The quick and dirty fix is to raise that limit, which is usually configurable. That said, rewriting the code so it doesn't fork hundreds of short-lived processes is the more scalable solution.
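Since Mojo::UserAgent can make non-blocking requests on a single event loop, one option is to drop the forking entirely and cap the number of concurrent connections instead. Here is a minimal sketch of that pattern, assuming the @url list is filled from your existing do_auth() and assuming a concurrency cap of 50 and the 2-second request timeout you mention; treat those names and numbers as placeholders rather than a drop-in replacement:

use Mojo::Base -strict;
use Mojo::UserAgent;
use Mojo::IOLoop;

my @url      = ();    # assumed: fill this with the URLs from do_auth()
my $max_conn = 50;    # assumed cap on simultaneous connections; tune as needed

my $ua = Mojo::UserAgent->new(
    max_redirects   => 5,
    request_timeout => 2,    # the ~2 second timeout from the question
);

my $active = 0;

sub fetch_next {
    # Start new requests until we reach the concurrency cap or run out of URLs
    while ( $active < $max_conn and @url ) {
        my $next = shift @url;
        $active++;
        $ua->get( $next => sub {
            my ( $ua, $tx ) = @_;
            $active--;
            fetch_next();    # refill the slot this request just freed
            Mojo::IOLoop->stop unless $active or @url;    # all requests done
        } );
    }
}

fetch_next();
Mojo::IOLoop->start if $active and not Mojo::IOLoop->is_running;

A single process then multiplexes all the requests, so the limits that matter are the $max_conn value and the number of open file descriptors, not the kernel's process or thread limits.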