RSS Feeds are broken in sNews, fix it with this hack


Wow, OK, shame on me for not realizing this sooner (maybe I should subscribe to my own feeds), but I was just made aware that my articles were being displayed in full in the RSS feed.

That is not how feeds are generally intended to work. They should contain a linked title and a snippet of the article; I like 300 or 400 characters.

OK, so how to fix it? Easy. BACK UP your snews.php and work off of a copy, then find function strip and insert the highlighted text (the if (strlen($output) > 400) block):

// PREPARING ARTICLE FOR XML
function strip($text) {
    $search = array('/\[include\](.*?)\[\/include\]/', '/\[func\](.*?)\[\/func\]/', '/\[break\]/', '/</', '/>/');
    $replace = array('', '', '', '<', '>');
    $output = preg_replace($search, $replace, $text);
    $output = stripslashes(strip_tags($output, '<a><img><h1><h2><h3><h4><h5><ul><li><ol><p><hr><br><b><i><strong><em><blockquote>'));
    if (strlen($output) > 400) { 
        $output = substr($output,0,400);
        $pos = strrpos($output, ". ");
        if ($pos !== false && $pos > 300) { // there's a sentence end close by, let's cut it there
            $output = substr($output,0,$pos).'.';
        } else {
            $output = $output.'...';
        }
    }
    return $output;
}

You may also want to strip all of the tags from the preview, so that a cut can't leave unmatched tags behind. To do that, change:

$output = stripslashes(strip_tags($output, '<a><img><h1><h2><h3><h4><h5><ul><li><ol><p><hr><br><b><i><strong><em><blockquote>'));

to

$output = stripslashes(strip_tags($output));

That's it. That will cut the text preview down to 400 characters max, ending at the last full sentence if one finishes after the 300-character mark, and adding a ... otherwise. Thanks to Poppoll for the heads-up on this one!
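
To try the cut without touching snews.php, here is a minimal sketch that pulls the same logic into a standalone function. The name truncate_snippet and the $max and $min parameters are made up for illustration; they are not part of sNews.

```php
<?php
// Hypothetical helper: the same cut-off logic as in strip(), on its own.
function truncate_snippet($output, $max = 400, $min = 300) {
    if (strlen($output) > $max) {
        $output = substr($output, 0, $max);
        $pos = strrpos($output, '. ');
        if ($pos !== false && $pos > $min) {
            // a sentence ends between $min and $max: cut there, keep the full stop
            $output = substr($output, 0, $pos) . '.';
        } else {
            // no sentence end nearby: keep the hard cut and mark it
            $output .= '...';
        }
    }
    return $output;
}

// No full stop anywhere: hard cut at 400 bytes plus '...'
$no_stop = str_repeat('word ', 100);             // 500 bytes
echo strlen(truncate_snippet($no_stop)), "\n";   // 403

// A sentence ends at byte 350: the cut lands there
$with_stop = str_repeat('a', 350) . '. ' . str_repeat('b', 200);
echo strlen(truncate_snippet($with_stop)), "\n"; // 351
```

Note that strlen and substr count bytes, so this only behaves as described for single-byte text.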

Comments


Made a couple of small changes just now... the function will now chop things more cleanly at the end of a sentence, if one falls within a reasonable distance of the end of the snippet.


I recommend updating.


A fast MOD, Matt... and a good one. Implemented it right away. Thanks again, sNoozer.


Nice one, Matt. Thanks!
Now, if we only had a "summary" or "snippet" to tie this in with we could do something with the category index page as well... lol (look for post on forum about that little nugget in a little while...)


Hey Fred,

Cool, I can't wait to see what you do.

I've had so many ideas regarding that... i.e., a special summary field for viewing on the index pages, and a summary thumbnail image, etc.

--Matt


Just found this, and again, good job on all your hacks/mods, I really like the way you do stuff.


Hey.

I'm trying to strip the bbcodes I made together with asundrus out of my RSS feed. Say I want to strip [i ]text[/i ]: I found a solution, but it isn't the best, as I have to add each bbcode separately, like:
for [i ] I made '/\[i\]/'
and for [/i ] I made '/\[\/i\]/'

I searched around trying to find a simpler solution, but maybe I'm just looking in the wrong places.

Any thoughts?

PS: spaces added in the bbcodes in case your article picks up the code.


Hey there, if you're trying to strip the bbcodes out from around the text, you could do a preg_replace on anything matched inside bracketed pairs, something like;


preg_replace('/\[(\w+)[^\]]*\](.*?)\[\/\1\]/s', '$2', $text);
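
A minimal sketch of that idea as a reusable function, assuming the bbcodes always come in matched [tag]...[/tag] pairs. The name strip_bbcode is made up for illustration. It uses no /e modifier (that modifier was removed in PHP 7) and loops so that nested pairs are unwrapped as well.

```php
<?php
// Hypothetical helper: removes paired bbcode-style tags such as [i]...[/i]
// but keeps the text between them.
function strip_bbcode($text) {
    // \1 forces the closing tag to carry the same name as the opening tag;
    // .*? is non-greedy, so each pair is matched on its own
    $pattern = '/\[(\w+)[^\]]*\](.*?)\[\/\1\]/s';
    // repeat until nothing changes, so [b][i]x[/i][/b] is fully unwrapped
    do {
        $text = preg_replace($pattern, '$2', $text, -1, $count);
    } while ($count > 0);
    return $text;
}

echo strip_bbcode('Some [i]italic[/i] and [b][i]nested[/i][/b] text'), "\n";
// prints: Some italic and nested text
```

Unpaired codes such as [break] are left alone, so they still need their own entry in the $search array.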


Thanks for the reply. I just realized I should have been able to sort this out myself, as I wrote the bbcode function myself, but thanks again for enlightening me. Next time I've got to think before asking, hehe.


The function strlen($output) does not work correctly for UTF-8, only for ASCII... :(
What would you advise for UTF-8?


Hi Different, try utf8_decode

if (strlen(utf8_decode($output)) > 400) {


Hi Matt,
Thanks for your reply, but it did not help.
Maybe an error occurs even after the characters are counted, in the substr function...


A new utf8_substr function helped:

// MAKE SUBSTR FUNCTION FOR UTF-8
function utf8_substr($str,$from,$len){
  return preg_replace('#^(?:[\x00-\x7F]|[\xC0-\xFF][\x80-\xBF]+){0,'.$from.'}'.
                       '((?:[\x00-\x7F]|[\xC0-\xFF][\x80-\xBF]+){0,'.$len.'}).*#s',
                       '$1',$str);
}



I replaced a couple of lines in the mod:

if (strlen(utf8_decode($output)) > 400) { 
        $output = utf8_substr($output,0,400);


I tried to use the Multibyte String Functions, but did not succeed...
Sorry for the frequent comments. The final version looks like this:

if (strlen(utf8_decode($output)) > 400) { 
        $output = utf8_substr($output,0,400);
        $pos = strrpos(utf8_decode($output), '. ');
        if ($pos !== false && $pos > 300) { // there's a sentence end close by, let's cut it there
            $output = utf8_substr($output,0,$pos).'.';


You say multi-byte functions didn't work for you, did you try
mb_strlen($output, 'UTF-8')

and then again with mb_substr?

After looking at it for a minute here, the easiest thing to do should be to use utf8_decode on the output only once, then do everything against that value.
    $output = preg_replace($search, $replace, $text);
    $output = stripslashes(strip_tags($output, '<a><img><h1><h2><h3><h4><h5><ul><li><ol><p><hr><br><b><i><strong><em><blockquote>'));
    $output = utf8_decode($output);
    if (strlen($output) > 400) {


Sorry, Matt
*LOL* I forgot to add php_mbstring.dll for my local server...
Now everything works as expected! See below:


if (mb_strlen($output,'UTF-8') > 400) {
    $output = mb_substr($output,0,400,'UTF-8');
    $pos = mb_strrpos($output, '. ', 0, 'UTF-8');
    if ($pos !== false && $pos > 300) {
        $output = mb_substr($output,0,$pos,'UTF-8').'.';

The same should be done for comments:

$ncom = mb_strlen($ncom,'UTF-8') > $stringlen ? mb_substr($ncom, 0, $stringlen - 3,'UTF-8').'...' : $ncom;
$ncom.= mb_strlen($name,'UTF-8') < $stringlen ? ')' : '';
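
Put together, the multibyte version of the cut might look like the sketch below. It assumes the mbstring extension is loaded and that the text is UTF-8; the name mb_truncate_snippet and its parameters are made up for illustration. In current PHP versions mb_strrpos takes the offset as its third argument and the encoding as its fourth.

```php
<?php
// Hypothetical helper: the snippet cut done with mbstring, so lengths and
// positions are counted in characters rather than bytes.
function mb_truncate_snippet($output, $max = 400, $min = 300) {
    if (mb_strlen($output, 'UTF-8') > $max) {
        $output = mb_substr($output, 0, $max, 'UTF-8');
        $pos = mb_strrpos($output, '. ', 0, 'UTF-8');
        if ($pos !== false && $pos > $min) {
            // sentence end close to the limit: cut there
            $output = mb_substr($output, 0, $pos, 'UTF-8') . '.';
        } else {
            $output .= '...';
        }
    }
    return $output;
}

// "\xD0\xB6" is one Cyrillic letter stored as two bytes in UTF-8
$text = str_repeat("\xD0\xB6", 500);
echo strlen($text), "\n";                                    // 1000 (bytes)
echo mb_strlen($text, 'UTF-8'), "\n";                        // 500 (characters)
echo mb_strlen(mb_truncate_snippet($text), 'UTF-8'), "\n";   // 403
```

The byte count and the character count differ here by a factor of two, which is why the plain strlen/substr version cuts such text too early and can split a character in half.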


Hi.
How do I add the article's category to the RSS feed?
I cannot find a solution to this problem.
Would you be so kind as to see what can be done about it?

----------------------------------------
Translation from Polish: translate.google.pl


That shouldn't be too difficult toolman, I will see if I can explain it later today.
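
For reference, RSS 2.0 allows a <category> element inside each <item>. A minimal sketch of building such an item, assuming the category name is already available in a variable; how sNews exposes it inside rss_contents is not shown in this thread, so the function name rss_item_xml and all of its parameters are hypothetical.

```php
<?php
// Hypothetical helper, not part of sNews: builds one RSS 2.0 <item> and
// adds a <category> element when a category name is given.
function rss_item_xml($title, $link, $description, $category = '') {
    $safe_link = htmlspecialchars($link, ENT_QUOTES, 'UTF-8');
    $xml  = "<item>\n";
    $xml .= '<title><![CDATA[' . $title . "]]></title>\n";
    $xml .= '<description><![CDATA[' . $description . "]]></description>\n";
    if ($category !== '') {
        // escape &, < and > so the category name is valid XML text
        $xml .= '<category>' . htmlspecialchars($category, ENT_QUOTES, 'UTF-8') . "</category>\n";
    }
    $xml .= '<link>' . $safe_link . "</link>\n";
    $xml .= '<guid>' . $safe_link . "</guid>\n";
    $xml .= "</item>\n";
    return $xml;
}

$xml = rss_item_xml('Hello', 'http://example.com/hello/', 'Short snippet...', 'News & Updates');
echo $xml;
```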


That shouldn't be too difficult to add Sven, I'll have a look when I get a free moment.


Sven,

Inside the function rss_contents, find the following and add the bits I've added (the if block that sets $jump, and the .$jump after strip($text));
if ($rss_item == "rss-articles") {
        $jump = '<br /><a href="'.$link.'#jump">read more</a>';
    }
    $item  =
    '<item>
    <title><![CDATA['.strip(htmlspecialchars_decode($title,ENT_QUOTES)).']]></title>
    <description>
        <![CDATA[
        '.strip($text).$jump.'
        ]]>
    </description>
    <pubDate>'.$date.'</pubDate>
    <link>'.$link.'</link>
    <guid>'.$link.'</guid>
    </item>';
    echo $item;


That should add a "read more" link to rss-articles RSS feeds.


Splendid! Bravo! Thanks a lot Matt.


Oh! Thinking...
I don't know if you have your feed published on other sites, but there might be a strong SEO improvement here:
The idea is to replace the "read more" anchor text in the feed with the break title created by the mod from "An easy mod to create custom break titles for your sNews articles".
What do you think of it?


Shouldn't be too hard Sven, we'd just need to extract the break title from the text if it exists. Use this bit for the first part. (not tested, but should work)


if ($rss_item == "rss-articles") {
    if (preg_match("/\[break title=\"(.*?)\"\]/i",$r['text'],$matches)) {
        $readmore = $matches[1];
    } else {
        $readmore = l('read_more');
    }
    $jump = '<br /><a href="'.$link.'#jump">'.$readmore.'</a>';
}
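
The extraction step can be tried on its own with the sketch below. The function name read_more_label and the plain-string fallback are made up for illustration; in sNews the fallback would come from l('read_more') as in the snippet above.

```php
<?php
// Hypothetical helper: returns the custom break title if the article text
// contains [break title="..."], otherwise the fallback label.
function read_more_label($text, $fallback = 'read more') {
    // (.*?) is non-greedy, so the match stops at the first closing "]
    if (preg_match('/\[break title="(.*?)"\]/i', $text, $matches)) {
        return $matches[1];
    }
    return $fallback;
}

echo read_more_label('Intro [break title="Full review"] rest'), "\n"; // Full review
echo read_more_label('Intro [break] rest'), "\n";                      // read more
```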


Yeah! It works almost perfectly, but there's one issue:
Cutting the text works, but if the cut falls inside a link, the link is lost entirely.

I tried to fix it by cutting the text without breaking words, but as you know PHP is not my cup of tea (anyway I don't care, I only drink alcohol, even for breakfast), so I didn't get any results using this:

<?php
//substring without words breaking

$str = "aa bb ccc ddd ee fff gg hhh iii";
$len = 12; // maximum snippet length in characters

echo substr(($str=wordwrap($str,$len,'$$')),0,strpos($str,'$$'));
?>


That nice piece of code was found here: http://www.php.net/manual/en/function.subs...
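
One way to keep the cut from landing inside a link is to strip the tags first and only then cut at a word boundary. A minimal sketch under that assumption; the name cut_at_word is made up for illustration, and the approach trades away links in the snippet in exchange for markup that is never left broken.

```php
<?php
// Hypothetical helper: cut plain text at the last space that fits in $max bytes.
function cut_at_word($text, $max) {
    // drop tags first, so the cut can never fall inside <a ...> or leave it unclosed
    $text = trim(strip_tags($text));
    if (strlen($text) <= $max) {
        return $text;
    }
    // look one byte past the limit: a space there means the last word fits whole
    $cut = substr($text, 0, $max + 1);
    $pos = strrpos($cut, ' ');
    if ($pos === false) {
        // a single very long word: fall back to a hard cut
        return substr($text, 0, $max) . '...';
    }
    return rtrim(substr($cut, 0, $pos)) . '...';
}

echo cut_at_word('aa bb ccc ddd ee fff gg hhh iii', 12), "\n";
// prints: aa bb ccc...
echo cut_at_word('See <a href="http://example.com/">the manual</a> for details', 10), "\n";
// prints: See the...
```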




Copyleft 2002 - 2014 Matt Jones