Sometimes I come across an image that I can't scrape so that it can be saved. An example of this is:
https://s3.amazonaws.com/plumdistrict.com-production/perks/12321/image/original.?1325898487
When I hit the url from Internet Explorer I see the image but when I try to get it from the code below I get the following error message "System.Net.WebException The remote server returned an error: (403) Forbidden" error with GetResponse:
string url = "https://s3.amazonaws.com/plumdistrict.com-production/perks/12321/image/original.?1325898487";
WebRequest request = WebRequest.Create(url);
WebResponse response = request.GetResponse();
Any ideas on how to get this image?
Edit:
I am able to get to save images that do have extensions. For example I can scrape the following image just fine:
https://s3.amazonaws.com/plumdistrict.com-production/perks/12659/image/original.jpg?1326828951
Well, it looks like it's being generated from a script (possibly being retrieved from a database). The server should be sending a file/content type to go along with that... but it doesn't seem to be, which I believe is a violation of standards.
My Linux box knows full well that that's a JPEG image once it's on my hard drive, because it examines file headers rather than relying on extensions. Perhaps there is a tool to do the same in Windows?
Edit: Actually, on further contemplation, it seems odd that you'd get a 403 for that. Perhaps the server is actually blocking you from retrieving the file in that manner.