I am using data scrapers: Import.io & Portia.
They both allow you to define a regular expression for the crawler to abide by. For example, given the URL https://weedmaps.com/dispensaries/pdi-medical
how would I account for the ending "pdi-medical"?
I've looked all over and understand how to use regex in a JS environment, but I'm not sure exactly what to put in the pattern input on Portia/Import.io.
Something like this? https://weedmaps.com/dispensaries//^[a-zA-Z0-9-_]+$/
For Portia, if you want your crawler to follow any URLs starting with https://weedmaps.com/dispensaries/, you can just add a crawling rule with the following regex:
^https?://weedmaps\.com/dispensaries/
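
To sanity-check a pattern before pasting it into the crawler, you can try it against a few URLs in plain Python. This is only a sketch: it assumes the tool applies the pattern as an unanchored search, the way Python's `re.search` does, and that the pattern is entered bare, without JavaScript-style `/.../` delimiters. `should_follow` and `SLUG_RULE` are hypothetical names for illustration, not part of Portia or Import.io. `SLUG_RULE` is a stricter variant that only accepts a single segment such as "pdi-medical" after /dispensaries/.

```python
import re

# Prefix rule with the dots escaped so they match a literal "." only.
FOLLOW_RULE = re.compile(r"^https?://weedmaps\.com/dispensaries/")

# Hypothetical stricter rule: exactly one slug segment (letters, digits,
# underscore, hyphen) after /dispensaries/, with an optional trailing slash.
SLUG_RULE = re.compile(r"^https?://weedmaps\.com/dispensaries/[A-Za-z0-9_-]+/?$")


def should_follow(url, rule=FOLLOW_RULE):
    """Return True if the URL matches the given crawl rule."""
    # Assumption: the crawler searches the URL for the pattern rather than
    # requiring a full match, so the ^ and $ anchors do the constraining.
    return rule.search(url) is not None


if __name__ == "__main__":
    for url in (
        "https://weedmaps.com/dispensaries/pdi-medical",
        "https://weedmaps.com/dispensaries/pdi-medical/reviews",
        "https://weedmaps.com/deliveries/some-service",
    ):
        print(url, should_follow(url), should_follow(url, SLUG_RULE))
```

The prefix rule follows everything under /dispensaries/, including sub-pages. The stricter rule is the one to use if you only want the dispensary pages themselves.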