I want to find the important links on a site using the Jsoup library. Suppose we have the following HTML:
<h1><a href="http://example.com">This is important </a></h1>
While parsing, how can we tell that the a tag is inside the h1 tag?
You can do it this way:
File input = new File("/tmp/input.html");
Document doc = Jsoup.parse(input, "UTF-8", "http://example.com/");
Elements headlinesCat1 = doc.getElementsByTag("h1");
for (Element headline : headlinesCat1) {
    Elements importantLinks = headline.getElementsByTag("a");
    for (Element link : importantLinks) {
        String linkHref = link.attr("href");
        String linkText = link.text();
        System.out.println(linkHref);
    }
}
Taken from the JSoup Cookbook.
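Use a CSS selector. Note that "h1 > a" matches only links that are direct children of an h1; use "h1 a" to match links nested at any depth inside it: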
Elements elements = doc.select("h1 > a");
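For a quick way to try the selector approach without a file on disk, here is a minimal sketch that parses an HTML string and collects the href of every link inside an h1. The class name HeadingLinks and the method importantLinks are made up for illustration, and the second link in the sample markup is an assumption added to show that links outside a heading are skipped. It uses the descendant form "h1 a[href]" rather than "h1 > a", so links wrapped in another tag inside the heading are still found.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of jsoup.
public class HeadingLinks {

    // Returns the raw href value of every <a href> nested anywhere inside an <h1>.
    static List<String> importantLinks(String html) {
        Document doc = Jsoup.parse(html);
        List<String> hrefs = new ArrayList<>();
        // "h1 a[href]" = any <a> with an href attribute that has an <h1> ancestor.
        for (Element link : doc.select("h1 a[href]")) {
            hrefs.add(link.attr("href"));
        }
        return hrefs;
    }

    public static void main(String[] args) {
        // Sample markup: the first link is from the question, the second is an
        // assumed extra link that sits outside any heading.
        String html = "<h1><a href=\"http://example.com\">This is important </a></h1>"
                    + "<p><a href=\"http://example.com/other\">Not in a heading</a></p>";
        for (String href : importantLinks(html)) {
            System.out.println(href);
        }
    }
}
```

If the page uses relative links, link.absUrl("href") can be used instead of link.attr("href"), provided a base URI is passed to Jsoup.parse.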