Why Are These Functions Causing MASSIVE Memory Problems? Please Help!

Posted on 16th Feb 2014 by admin

Hi,

I have a script with a few text-rewriting options.

I use regexes to replace patterns in strings, but I seem to be using them incorrectly, because they very quickly blow past PHP's memory_limit (by several orders of magnitude).

This is strange, because I'm dealing with maybe 10 simultaneous strings of 500 words max.

I'm clearly triggering some kind of runaway, almost recursive replacement, but I can't see how...
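For reference, I can watch it climb with something like this (just a sketch, not my real script; memory_get_peak_usage() is the standard PHP call, and wp_synonymize is defined in the code below):

Code:

// rough harness to watch the memory climb
$string = str_repeat("lorem ipsum dolor sit amet ", 100); // roughly 500 words

echo "before: " . memory_get_peak_usage(true) . " bytes\n";
$string = wp_synonymize($string);
echo "after:  " . memory_get_peak_usage(true) . " bytes\n";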

Any help you could give me to program this better (or tell me where I'm going wrong) would be much appreciated.

Code:

$string = "this would be about 500 words long";
$parts = $string; // $parts would normally be a substring of $string;

wp_wordmash($parts);
wp_synonymize($string);
wp_keyword2url($string);

//html stuff follows here...


function wp_wordmash($parts) {

    // dictionary.txt is a single comma-separated word list
    $wordlist   = file_get_contents('dictionary.txt', true);
    $dictionary = explode(",", $wordlist);

    // pre-build the entity-encoded form of each word in four capitalisation variants
    $htmldictionary         = array();
    $htmldictionary_u       = array();
    $htmldictionary_u1      = array();
    $htmldictionary_ucwords = array();
    foreach ($dictionary as $dicword) {
        $htmldictionary[]         = wp_htmlcode($dicword);
        $htmldictionary_u[]       = wp_htmlcode(strtoupper($dicword));
        $htmldictionary_u1[]      = wp_htmlcode(ucfirst($dicword));
        $htmldictionary_ucwords[] = wp_htmlcode(ucwords($dicword));
    }

    // swap each dictionary word (whole words only) for its encoded form
    for ($i = 0; $i < count($dictionary); $i++) {
        $parts = preg_replace("/\b$dictionary[$i]\b/", $htmldictionary[$i], $parts);
        $parts = preg_replace("/\b" . strtoupper($dictionary[$i]) . "\b/", $htmldictionary_u[$i], $parts);
        $parts = preg_replace("/\b" . ucfirst($dictionary[$i]) . "\b/", $htmldictionary_u1[$i], $parts);
        $parts = preg_replace("/\b" . ucwords($dictionary[$i]) . "\b/", $htmldictionary_ucwords[$i], $parts);
    }
    return $parts;
}

function wp_htmlcode($string) {

    // encode every character as a numeric HTML entity, e.g. "a" -> "&#97;"
    $buffer = '';
    for ($i = 0; $i < strlen($string); $i++) {
        $buffer .= "&#" . ord($string[$i]) . ";";
    }
    return $buffer;
}

function wp_synonymize($string) {

    $buffer = $string;

    // synonyms.txt holds one "oldword,synonym" pair per line
    $synonymfile = file_get_contents('synonyms.txt', true);
    $synonyms    = explode("\n", $synonymfile);

    for ($i = 0; $i < count($synonyms); $i++) {
        $synonymlist = explode(",", $synonyms[$i]);
        $oldword = $synonymlist[0];
        $synonym = str_replace("\r", '', $synonymlist[1]); // strip Windows line endings

        // replace whole words only, preserving four capitalisation variants
        $buffer = preg_replace("/\b$oldword\b/", $synonym, $buffer);
        $buffer = preg_replace("/\b" . strtoupper($oldword) . "\b/", strtoupper($synonym), $buffer);
        $buffer = preg_replace("/\b" . ucfirst($oldword) . "\b/", ucfirst($synonym), $buffer);
        $buffer = preg_replace("/\b" . ucwords($oldword) . "\b/", ucwords($synonym), $buffer);
    }
    return $buffer;
}

function wp_keyword2url($string) {

    $buffer = $string;

    // keyword2url.txt holds one "keyword,url" pair per line
    $keyword2urlfile = file_get_contents('keyword2url.txt', true);
    $keywords        = explode("\n", $keyword2urlfile);

    for ($i = 0; $i < count($keywords); $i++) {
        $keywordlist = explode(",", $keywords[$i]);
        $keyword = $keywordlist[0];
        $url     = str_replace("\r", '', $keywordlist[1]); // strip Windows line endings

        // wrap whole-word matches in a link, preserving capitalisation variants
        $buffer = preg_replace("/\b$keyword\b/", '<a href="' . $url . '">' . $keyword . '</a>', $buffer);
        $buffer = preg_replace("/\b" . strtoupper($keyword) . "\b/", '<a href="' . $url . '">' . strtoupper($keyword) . '</a>', $buffer);
        $buffer = preg_replace("/\b" . ucfirst($keyword) . "\b/", '<a href="' . $url . '">' . ucfirst($keyword) . '</a>', $buffer);
        $buffer = preg_replace("/\b" . ucwords($keyword) . "\b/", '<a href="' . $url . '">' . ucwords($keyword) . '</a>', $buffer);
    }
    return $buffer;
}
As I say, the string passed to these functions is typically < 500 words.

I've also included the comparison files (dictionary.txt, synonyms.txt and keyword2url.txt)... HERE
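In case that link dies, the shape the code expects is simple (these are made-up sample entries, not my real lists):

Code:

dictionary.txt  - a single comma-separated word list:
  apple,banana,cherry

synonyms.txt    - one "oldword,synonym" pair per line:
  big,large
  small,tiny

keyword2url.txt - one "keyword,url" pair per line:
  widgets,http://example.com/widgets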


I hope you can help... I'm 99% certain I'm using preg_replace() wrong, because if I substitute it with str_replace() then my memory issues disappear.
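By "substitute" I just mean swapping each call like-for-like, e.g. this line from wp_synonymize (sketch):

Code:

// str_replace version: no word boundaries, but memory stays flat
$buffer = str_replace($oldword, $synonym, $buffer);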

Problem is, I like preg_replace() because it gives me the word-boundary (\b) functionality.
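For what it's worth, this is the shape of pattern I think I should be building; preg_quote() is the standard PHP escaping function, and the rest is a sketch assuming each list entry is a single word:

Code:

// escape the word before dropping it into the pattern, so a stray
// regex metacharacter (or an empty entry) can't produce a runaway pattern
$pattern = '/\b' . preg_quote($oldword, '/') . '\b/';
$buffer  = preg_replace($pattern, $synonym, $buffer);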

I'm just obviously doing it wrong!

Any thoughts?
