Removing with ->outertext is not removing


I have the following code:

<div id="coursename">
  <h1>My Golf Club<br>
  <span class="courseregion">My Region</span></h1>
</div>

What I want to do is get the course name and its region, separately. Because the region is inside the #coursename element, I first want to get .courseregion, then remove it so I don't end up with "My Golf ClubMy Region".

This is what I'm trying but it still returns both together:

$course_region = $html->find('.courseregion', 0);

$region_to_use = $course_region; // stored

$course_region->outertext = ""; // get rid of course region

$course_name = $html->find('#coursename', 0);

echo $course_name->plaintext; // returns -> My Golf ClubMy Region

Where am I going wrong? Any ideas?

UPDATE: I cannot modify the HTML; it is what it is.


There are 3 answers

Mike 'Pomax' Kamermans (BEST ANSWER)

Just work with the strings, don't try to modify the HTML. In this case:

// the element you're showing has an id, so there is only ever one;
// passing index 0 makes find() return that element instead of an array
$cn = $html->find('#coursename', 0);
$h1 = $cn->find('h1', 0);

// get both the "full" text, and the "text we don't want":
$a = $h1->plaintext;
$b = $h1->find('span', 0)->plaintext;

// now we just remove $b from $a.
// We don't need to edit the HTML to achieve that:
$actual_text = trim(str_replace($b, '', $a));
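
The string-only idea can be tried without any parser at all. The following is a minimal sketch, assuming the two plaintext values have already been read from the elements; remove_fragment() is a hypothetical helper name, not part of Simple HTML DOM, and the sample strings only stand in for what the library might return:

```php
<?php
// Hypothetical helper (not part of Simple HTML DOM): removes one known
// substring from a text and tidies up the whitespace left behind.
function remove_fragment(string $full, string $unwanted): string
{
    $without = str_replace($unwanted, '', $full);
    // Collapse runs of whitespace left over from the markup, then trim.
    return trim(preg_replace('/\s+/', ' ', $without));
}

// Sample values standing in for the h1's and the span's plaintext.
$full   = "My Golf Club   My Region ";
$region = "My Region";

echo remove_fragment($full, $region), "\n"; // prints "My Golf Club"
```

Note that str_replace removes every occurrence of the region text, so this would also strip the region if it happened to appear inside the course name itself.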
pguardiario

It's because Simple HTML DOM isn't updating plaintext after outertext is changed (it's a bug):

$html = <<<EOF
<div id="coursename">
  <h1>My Golf Club<br>
  <span class="courseregion">My Region</span></h1>
</div>
EOF;

$doc = str_get_html($html);
$doc->find('.courseregion', 0)->outertext = "";

echo $doc->find('#coursename', 0)->plaintext . "\n";
//    My Golf Club    My Region   

$doc = str_get_html((string)$doc); // reload $doc (or switch to http://sourceforge.net/projects/advancedhtmldom/?source=directory)

echo $doc->find('#coursename', 0)->plaintext . "\n";
//    My Golf Club  
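
If switching parsers is an option, the same "read the region, remove it, then read the name" flow might look like this with PHP's built-in DOM extension instead of Simple HTML DOM. This is only a sketch of an alternative, assuming the dom extension is enabled; the XPath expressions are written for the markup in the question:

```php
<?php
$html = <<<EOF
<div id="coursename">
  <h1>My Golf Club<br>
  <span class="courseregion">My Region</span></h1>
</div>
EOF;

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Grab the region first and keep its text.
$span = $xpath->query('//div[@id="coursename"]//span[@class="courseregion"]')->item(0);
$region = trim($span->textContent);

// Detach the span from the tree; the removal takes effect immediately.
$span->parentNode->removeChild($span);

// The h1 text no longer contains the region.
$h1 = $xpath->query('//div[@id="coursename"]/h1')->item(0);
$name = trim($h1->textContent);

echo $name, "\n";   // prints "My Golf Club"
echo $region, "\n"; // prints "My Region"
```

Only the in-memory tree is changed here; the source HTML itself stays untouched.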
Sari Rahal

You can use str_replace to remove the region text from $course_name->plaintext:

// $region_to_use is the element from the question, so use its ->plaintext;
// this removes the region text from $course_name->plaintext
echo trim(str_replace($region_to_use->plaintext, '', $course_name->plaintext));