This document says that that by default, newspaper caches all previously extracted articles and eliminates any article which it has already extracted.
>>> cbs_paper = newspaper.build('http://cbs.com')
>>> cbs_paper.size()
1030
>>> cbs_paper = newspaper.build('http://cbs.com')
>>> cbs_paper.size()
2
Okay, but it says nothing if I build once a website how I can retrieve the cashed articles?
newspaper3k uses memoize to cache the articles for a source
setting memoize to false would stop the caching mechanism
however, if you still want the caching and want to access the cached articles you can find .newspaper_scraper dir inside the temp folder (Path in windows machine)
For Linux-based OSes, try looking in
For macOS, look in the directory specified by
$TMPDIR
. This may be any or none of the following: