I'm reading the source code of an online shop, and on each product page I need to find a JSON string that lists the product's SKUs and their quantities.
Here are 2 samples:
'{"sku-SV023435_B_M":7,"sku-SV023435_BL_M":10,"sku-SV023435_PU_M":11}'
The sample above shows 3 SKUs.
'{"sku-11430_B_S":"20","sku-11430_B_M":"17","sku-11430_B_L":"30","sku-11430_B_XS":"13","sku-11430_BL_S":"7","sku-11430_BL_M":"17","sku-11430_BL_L":"4","sku-11430_BL_XS":"16","sku-11430_O_S":"8","sku-11430_O_M":"6","sku-11430_O_L":"22","sku-11430_O_XS":"20","sku-11430_LBL_S":"27","sku-11430_LBL_M":"25","sku-11430_LBL_L":"22","sku-11430_LBL_XS":"10","sku-11430_Y_S":"24","sku-11430_Y_M":36,"sku-11430_Y_L":"20","sku-11430_Y_XS":"6","sku-11430_RR_S":"4","sku-11430_RR_M":"35","sku-11430_RR_L":"47","sku-11430_RR_XS":"6"}',
The sample above shows many more SKUs.
The number of SKUs in the JSON string can be anything from one upward.
Now I need a regex pattern to extract this JSON string from each page. Once I have it, I can easily parse it with json_decode().
Update: I found another problem; sorry that my question was incomplete. The page contains another, similar JSON string whose keys also start with sku-. If you look at the source code of the link below, you will see that the only difference is that its values are alphanumeric, while the values in the one I need are numeric. Also note that the final goal is to extract the SKUs with their quantities, so maybe you have a more straightforward solution.
@chris85
Second update:
Here is another strange issue, which is a bit off topic.
When I fetch the URL's content using the code below, there is no JSON string in the source!
$html = file_get_contents("http://www.dresslink.com/womens-candy-color-basic-coat-slim-suit-jacket-blazer-p-8131.html");
But when I open the URL in my browser, the JSON is there! I'm really confused about this :(
Trying to extract specific data from JSON directly with a regexp is almost always a bad idea because of the way JSON is encoded. The better approach is to match the whole JSON string with a regexp and then decode it with the PHP function json_decode.
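A minimal sketch of that approach, assuming the JSON appears in the page without whitespace, as in the two samples from the question. The function name extractSkuQuantities is hypothetical, and the pattern only accepts objects whose keys start with sku- and whose values are purely numeric (quoted or not), so the similar string with alphanumeric values is skipped:

```php
<?php
// Hypothetical helper: finds the first JSON object whose keys all start with
// "sku-" and whose values are all numeric, then decodes it with json_decode.
function extractSkuQuantities(string $html): ?array
{
    // One "sku-...":<digits> pair, followed by zero or more comma-separated pairs.
    // Values may be quoted ("20") or bare (36), as in the samples.
    $pattern = '/\{"sku-[^"]+":"?\d+"?(?:,"sku-[^"]+":"?\d+"?)*\}/';

    if (!preg_match($pattern, $html, $match)) {
        return null; // no matching JSON string in this page
    }

    $decoded = json_decode($match[0], true);
    if (!is_array($decoded)) {
        return null; // matched text was not valid JSON after all
    }

    // Normalise quoted and unquoted quantities to integers; keys are preserved.
    return array_map('intval', $decoded);
}
```

Calling extractSkuQuantities($html) on a page's source should then return an array keyed by SKU, or null when nothing matching is found.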
The issue with the missing data is due to a missing required cookie. See my comments in the code below.
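As a rough illustration of the cookie point, file_get_contents can send a Cookie header through a stream context. The helper name buildCookieContext and the cookie name and value in the usage comment are placeholders; the cookie the site actually requires has to be read from the browser's request headers:

```php
<?php
// Hypothetical helper: builds an HTTP stream context that sends the given
// cookies (name => value) with the request.
function buildCookieContext(array $cookies)
{
    $pairs = [];
    foreach ($cookies as $name => $value) {
        $pairs[] = $name . '=' . rawurlencode($value);
    }

    return stream_context_create([
        'http' => [
            'method' => 'GET',
            'header' => 'Cookie: ' . implode('; ', $pairs) . "\r\n",
        ],
    ]);
}

// Usage (placeholder cookie; replace with the one the site expects):
// $context = buildCookieContext(['example_cookie' => 'example_value']);
// $html = file_get_contents($url, false, $context);
```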
Special thanks to @chris85 for making me read the question again. Sorry but I couldn't undo my downvote.