MATLAB urlread Won't Work for Specific Webpage

Asked by dwm8 on 12 November 2015 at 21:47 (974 views)

I am attempting to scrape a webpage using the urlread() function in MATLAB, though I've run into a problem that I haven't seen before. When I run the code

X = urlread('http://espn.go.com/mens-college-basketball/schedule/_/date/20141114');

I get the error

Error using urlreadwrite (line 92)
The server did not find a resource to match this request.

Error in urlread (line 36)
[s,status] = urlreadwrite(mfilename,catchErrors,url,varargin{:});

When I visit the same link in my browser (http://espn.go.com/mens-college-basketball/schedule/_/date/20141114), the page loads without any problems. Does anyone have a solution to this problem?

There are 3 answers

ASH On 14 November 2015 at 17:21

That didn't work for me, but this does.

URL = 'http://espn.go.com/mens-college-basketball/schedule/_/date/20141114';
str = urlread(URL,'Get',{'term','urlread'});

ASH On 14 November 2015 at 17:26

That said, I think R and Python are much better suited to web scraping.

Here's an R script that works great.

library(rvest)
rawhtml <- read_html("http://espn.go.com/mens-college-basketball/schedule/_/date/20141114")
rvested <- rawhtml %>% 
    html_nodes("table") %>%
    html_table(fill = TRUE) %>%
    .[[1]]

zelanix On 12 November 2015 at 22:55 BEST ANSWER

It appears that the site is blocking requests that carry MATLAB's default user agent (of the form "MATLAB Rxxxxx") in the HTTP request headers.

Faking the user-agent seems to work around the limitation:

x = urlread('http://espn.go.com/mens-college-basketball/schedule/_/date/20141114', 'UserAgent', 'Mozilla/5.0');
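The same idea can be sketched with webread and weboptions, assuming MATLAB R2014b or later (these functions are not mentioned in the original answer). The timeout value and the variable names opts and html are illustrative choices, and the try/catch is there only because a URL from 2015 may no longer resolve:

```matlab
% Sketch only: assumes MATLAB R2014b or later, where webread/weboptions exist.
url = 'http://espn.go.com/mens-college-basketball/schedule/_/date/20141114';

% Send a browser-like User-Agent instead of MATLAB's default one.
% 'ContentType','text' asks webread to return the raw page as a char array.
opts = weboptions('UserAgent', 'Mozilla/5.0', ...
                  'Timeout', 15, ...
                  'ContentType', 'text');

try
    html = webread(url, opts);
    fprintf('Fetched %d characters.\n', numel(html));
catch err
    % The server may still reject the request, or the page may have moved.
    fprintf('Request failed: %s\n', err.message);
end
```

If the request still fails with a browser-like user agent, the server is likely filtering on something else, and the error message printed in the catch branch is the place to start.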
