Re: Get text between A and B?

From: Justin Koivisto (spam_at_koivi.com)
Date: 11/17/03


Date: Mon, 17 Nov 2003 19:52:09 GMT

Philipp Lenssen wrote:
> I want to read out several strings from an HTML (text) file, like
> everything between "<h2>" and "</h2>" to create a table-of-contents
> (also, things other than tags).
> I do have a function but it's slow, and sometimes doesn't finish for
> larger files (600K, not much really!).
> Now what would be a nice function to do this job? I suppose some regex
> with preg_match_all?
>
> It should have a parameter telling which occurrence of the string
> should be used, e.g. the second, third and so on.
>
> ------------
>
> Like:
>
> function getTextBetween($allText, $textBefore, $textAfter, $offset = 0)
> {
> // ?
> }
>
> Then I could say:
>
> $s = getTextBetween("<h2>foo</h2><p>Hello World</p><h2>bar</h2>",
> "<h2>", "</h2>", 1);
> echo $s; // ... would be "bar"
>
> ------------
>
> Any help greatly appreciated!
>

Kinda like this then...

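function getTextBetween($allText, $textBefore, $textAfter, $offset = 0) {
     // Escape the delimiters so regex metacharacters (and '#') match literally.
     // 's' lets '.' span newlines; '.*?' stops at the first $textAfter.
     $pattern = '#' . preg_quote($textBefore, '#') . '(.*?)' . preg_quote($textAfter, '#') . '#is';
     preg_match_all($pattern, $allText, $matches);
     // $offset is 0-based; return null if that occurrence does not exist.
     return isset($matches[1][$offset]) ? $matches[1][$offset] : null;
}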
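
If the regex route is still too slow on the bigger files, plain string searching might be worth a try, since the delimiters are fixed strings anyway. A rough sketch, assuming PHP 5 or later for stripos(); getTextBetweenStr is just a name I made up here, not a built-in, and I have not timed it against the regex version:

```php
<?php
// Hypothetical helper (not a PHP built-in): returns the text between the
// $offset-th (0-based) occurrence of $textBefore and the next $textAfter,
// or null if there is no such occurrence.
function getTextBetweenStr($allText, $textBefore, $textAfter, $offset = 0)
{
    // Reject inputs that cannot produce a meaningful match.
    if ($offset < 0 || $textBefore === '' || $textAfter === '') {
        return null;
    }
    $pos = 0;
    for ($i = 0; ; $i++) {
        // Case-insensitive search, like the 'i' modifier in the regex version.
        $start = stripos($allText, $textBefore, $pos);
        if ($start === false) {
            return null;
        }
        $start += strlen($textBefore);
        $end = stripos($allText, $textAfter, $start);
        if ($end === false) {
            return null;
        }
        if ($i === $offset) {
            return substr($allText, $start, $end - $start);
        }
        // Continue searching after the closing delimiter just found.
        $pos = $end + strlen($textAfter);
    }
}

$html = "<h2>foo</h2><p>Hello World</p><h2>bar</h2>";
echo getTextBetweenStr($html, "<h2>", "</h2>", 1); // bar
```

It stops scanning as soon as it reaches the occurrence you asked for, instead of collecting every match in the file first.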

-- 
Justin Koivisto - spam@koivi.com
PHP POSTERS: Please use comp.lang.php for PHP related questions,
              alt.php* groups are not recommended.

