
Two-Step Block Parsing with Regex in PHP

When working with regex, do not try to solve a complex task with a single pattern. It is usually better to apply several simpler regexes in sequence, each handling one step of the task.

Let's look at an example. Suppose we have the following code:

<p> --- </p> <main class="header"> <p> +++ </p> <p> +++ </p> </main>

Suppose we need to extract all paragraphs located inside the main tag, while skipping the paragraph outside it. Let's do this in two stages: first, get the content of the main tag, and then search for paragraphs inside that content only.

So, the first stage. Let the text of the entire page be stored in the variable $str1. Let's get the content of the main tag:

<?php preg_match('#<main[^>]*>(.+?)</main>#su', $str1, $match1); ?>

Let's check that we caught the correct text:

<?php $str2 = $match1[1]; var_dump($str2); ?>
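If the page contains no main tag, preg_match returns 0 and $match1[1] is not set, so it can help to check the return value before reading the group. Below is a minimal sketch, assuming the sample markup from this lesson is stored in $str1. The fallback to an empty string is an illustrative choice, not part of the original lesson.

```php
<?php
// Sample page from the example above (assumed to be stored in $str1).
$str1 = '<p> --- </p> <main class="header"> <p> +++ </p> <p> +++ </p> </main>';

// preg_match returns 1 on a match, 0 on no match, and false on error.
if (preg_match('#<main[^>]*>(.+?)</main>#su', $str1, $match1) === 1) {
    $str2 = $match1[1];
} else {
    $str2 = ''; // assumption: treat a missing main tag as empty content
}

var_dump($str2);
?>
```

With this check in place, the second stage simply finds no paragraphs when the main tag is absent, instead of raising a warning about an undefined index.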

Now, in the obtained text, let's find all paragraphs:

<?php preg_match_all('#<p\b[^>]*>(.+?)</p>#su', $str2, $match2, PREG_PATTERN_ORDER); ?>

Let's check that we found the texts of our paragraphs:

<?php var_dump($match2[1]); ?>
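The two stages can also be combined into one small function. The sketch below is one possible arrangement, not the only one: the name parseMainParagraphs is hypothetical, and the \b after p and the trimming of each paragraph are additions for illustration (the \b keeps tags such as pre from being matched).

```php
<?php
/**
 * Hypothetical helper: returns the text of every paragraph
 * found inside the first main tag of the given HTML string.
 */
function parseMainParagraphs(string $html): array
{
    // Stage 1: isolate the content of the main tag.
    if (preg_match('#<main[^>]*>(.+?)</main>#su', $html, $block) !== 1) {
        return []; // no main tag found (or a regex error occurred)
    }

    // Stage 2: search for paragraphs only inside that content.
    preg_match_all('#<p\b[^>]*>(.+?)</p>#su', $block[1], $paragraphs);

    // Group 1 holds the paragraph texts; trim the surrounding spaces.
    return array_map('trim', $paragraphs[1] ?? []);
}

$str1 = '<p> --- </p> <main class="header"> <p> +++ </p> <p> +++ </p> </main>';
$result = parseMainParagraphs($str1);
var_dump($result);
?>
```

The paragraph outside the main tag never reaches the second regex, which is the point of splitting the task into two stages.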

As an exercise, use the same two-step approach to parse all h2 tags from the aside tag in the following code:

<main> <h2>---</h2> </main> <aside> <h2>+++</h2> <p> text </p> <h2>+++</h2> <p> text </p> <h2>+++</h2> <p> text </p> </aside>