Trigger Apache Nutch Crawl Programmatically


I'm trying to create an ASP.NET Web API endpoint that triggers a crawl. I can't get Cygwin to process any of the commands I give it; the most I can do is get it to open a terminal. Once the terminal is open, I'd have to change the working directory to another location and then run the command I want.

Process p = new Process();
ProcessStartInfo info = new ProcessStartInfo();
info.CreateNoWindow = false;
info.RedirectStandardInput = true;
info.UseShellExecute = false;
info.FileName = "C:\\cygwin64\\bin\\mintty.exe";

p.StartInfo = info;
p.Start();
StreamWriter sw = p.StandardInput;
if (sw.BaseStream.CanWrite)
{
    sw.WriteLine(@"cd C:\Users\UName\Desktop\apache-nutch-2.3-mongodb\runtime\local\");
    sw.WriteLine("bin/autoCrawl");
}
sw.Close();
p.WaitForExit();

I've tried many approaches; this is the latest one, but it does nothing. Is there a way to launch this crawl from my .NET application? I've looked into the NutchApi and creating a new job with a type of crawl, but I'm not sure whether that applies here.
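
For comparison, here is a rough sketch of the same launch going through bash.exe instead of mintty.exe. mintty is a terminal emulator, so text written to its redirected standard input likely never reaches a shell, whereas bash accepts a command line through -c. The sketch assumes a default Cygwin install at C:\cygwin64 with the usual /cygdrive prefix for Windows drives; the class and method names (CygwinCrawl, BuildStartInfo, Run) are made up for illustration, and bin/autoCrawl is simply the script from the snippet above.

```csharp
using System;
using System.Diagnostics;

public static class CygwinCrawl
{
    // Builds the start info for running one command line in a Cygwin login shell.
    // cygwinDir must be a Cygwin-style path (assumption: default /cygdrive prefix).
    public static ProcessStartInfo BuildStartInfo(string bashPath, string cygwinDir, string command)
    {
        return new ProcessStartInfo
        {
            FileName = bashPath,
            // --login sets up the Cygwin environment; -c runs the quoted command line and exits.
            Arguments = $"--login -c \"cd '{cygwinDir}' && {command}\"",
            UseShellExecute = false,
            RedirectStandardOutput = true,
            RedirectStandardError = true,
            CreateNoWindow = true
        };
    }

    // Starts the process, echoes its output, and returns the exit code.
    public static int Run(ProcessStartInfo info)
    {
        using (var p = new Process { StartInfo = info })
        {
            p.OutputDataReceived += (s, e) => { if (e.Data != null) Console.WriteLine(e.Data); };
            p.ErrorDataReceived += (s, e) => { if (e.Data != null) Console.Error.WriteLine(e.Data); };
            p.Start();
            // Read both streams asynchronously so a full pipe buffer cannot block the child.
            p.BeginOutputReadLine();
            p.BeginErrorReadLine();
            p.WaitForExit();
            return p.ExitCode;
        }
    }
}
```

It would be called along the lines of CygwinCrawl.Run(CygwinCrawl.BuildStartInfo(@"C:\cygwin64\bin\bash.exe", "/cygdrive/c/Users/UName/Desktop/apache-nutch-2.3-mongodb/runtime/local", "bin/autoCrawl")). The account the web application runs under would also need access to that directory, which this sketch does not check.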


There is 1 answer

itsNino91 (best answer)

I ended up figuring out how to use the NutchApi to answer my question.
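
The answer does not include the code, so the following is only a sketch of what a call to the Nutch REST service from .NET might look like, not the code that was actually used. It assumes the Nutch 2.x REST server is already running (it is typically started with bin/nutch nutchserver) and reachable at http://localhost:8081/, and that jobs are created by POSTing JSON with crawlId, type, confId and args fields to job/create. Check those details against your Nutch version. NutchJobClient and its methods are hypothetical names, and the sketch uses System.Text.Json, so it needs .NET Core 3.0 or later (or the equivalent NuGet package).

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

public static class NutchJobClient
{
    // One shared HttpClient for the lifetime of the application.
    private static readonly HttpClient Http = new HttpClient();

    // Builds the JSON body for a job-creation request.
    // The field names are an assumption about the Nutch 2.x REST API.
    public static string BuildJobJson(string crawlId, string jobType, string confId,
                                      IDictionary<string, string> args)
    {
        var body = new Dictionary<string, object>
        {
            ["crawlId"] = crawlId,
            ["type"] = jobType,
            ["confId"] = confId,
            ["args"] = args
        };
        return JsonSerializer.Serialize(body);
    }

    // POSTs the JSON to <serverBase>job/create and returns the raw response body.
    // serverBase must end with a slash, e.g. http://localhost:8081/ (assumed default).
    public static async Task<string> CreateJobAsync(Uri serverBase, string json)
    {
        using (var content = new StringContent(json, Encoding.UTF8, "application/json"))
        using (var response = await Http.PostAsync(new Uri(serverBase, "job/create"), content))
        {
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
    }
}
```

As far as I can tell there is no single "crawl" job type; a crawl is a sequence of jobs (inject, generate, fetch, parse, update), so a Web API action would await CreateJobAsync once per step with the matching type value. The keys each job type expects in args are left to the caller here because they differ between Nutch versions.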