PHP Regex to Make Twitter Links Clickable

This is just a quickie post, not one of my usual long, rambling diatribes. This week is madness, even by my own absurd standards, but I didn’t want to miss jotting this down in case it might be helpful to others.

I’ve had varying degrees of success trying to find a series of preg_replace statements that would correctly replace output generated by Twitter’s RSS feeds (which do not contain any linking HTML) to autolink hyperlinks, @replies and hashtags, so I finally sat down and sorted it out myself.

The code below should correctly autolink all of the autolinkables in your PHP script:

  • links @username to the user’s Twitter profile page
  • links regular links to wherever they should link to
  • links hashtags to a Twitter search on that hashtag

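[sourcecode language='php']function twitterify($ret) {
    // Anchor markup reconstructed from the behaviour described above; the exact URLs are assumptions.
    // Link URLs that include a scheme, e.g. http://example.com
    $ret = preg_replace("#(^|[\n ])([\w]+?://[\w]+[^ \"\n\r\t< ]*)#", "\\1<a href=\"\\2\">\\2</a>", $ret);
    // Link bare www. and ftp. hosts, adding a scheme to the href
    $ret = preg_replace("#(^|[\n ])((www|ftp)\.[^ \"\t\n\r< ]*)#", "\\1<a href=\"http://\\2\">\\2</a>", $ret);
    // Link @username to the user's Twitter profile page
    $ret = preg_replace("/@(\w+)/", "<a href=\"http://www.twitter.com/\\1\">@\\1</a>", $ret);
    // Link #hashtag to a Twitter search on that hashtag
    $ret = preg_replace("/#(\w+)/", "<a href=\"http://search.twitter.com/search?q=%23\\1\">#\\1</a>", $ret);
    return $ret;
}[/sourcecode]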

Someday I’ll try to find the time to write a regex primer/tutorial. Regex is another one of the things, like SVN, that seems scary and incomprehensible to many people, but eventually it clicks and makes sense. In the meantime, if you’re interested in learning more about using Regex in PHP, check out the following resources:

More Regexy Goodness:

Make sure you leave yourself some time to actually try out some examples, and dissect the examples they give so that you really grok what’s happening. Once it clicks, it seems so simple, you won’t believe you ever let it beat you up and take your lunch money.

Image by xkcd, via their awesome web store. The comic rocks my geeky face off. I buy my clothes there. So should you.

PS – still really, really hate posting code in WordPress. Even in HTML mode, it keeps converting my fscking special characters, which then get double/triple/etc converted.

  • this is awesome, i made one modification – your first two replacements forgot to close the link.

  • I’ll be damned, you’re right… lol – how the hell did I do that? Thanks for the heads up – I’ll fix it.

  • Okay – weirdly, I didn’t forget the closing anchor tags – WordPress keeps stripping them out, no matter how many times I re-add them. WTF?

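  • Not sure if this matters, but I was getting “bad escape” warnings from Zend Studio. You can see the changes I made below.

    $string = preg_replace("#(^|[\n ])([\\w]+?://[\\w]+[^ \"\n\r\t< ]*)#", "\\1<a href=\"\\2\">\\2</a>", $string);
    $string = preg_replace("#(^|[\n ])((www|ftp)\\.[^ \"\t\n\r< ]*)#", "\\1<a href=\"http://\\2\">\\2</a>", $string);
    $string = preg_replace("/@(\\w+)/", "<a href=\"http://www.twitter.com/\\1\">@\\1</a>", $string);
    $string = preg_replace("/#(\\w+)/", "<a href=\"http://search.twitter.com/search?q=%23\\1\">#\\1</a>", $string);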

  • That’s a great one, I really was thinking about using regex to make links clickable, but here you are =]
    can’t live without your code 😉 haha

  • Wait, I caught a bug.
    If you parse a message like this “blah..blah..blah http://www.google.com?“,
    the question mark will be included in the url.
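
One possible way to handle the trailing question mark reported above is to trim sentence punctuation off the end of each matched URL before building the anchor. This is only a sketch: `linkify_urls` is a hypothetical helper, not part of the original function, and the set of characters it trims is an assumption.

```php
<?php
// Hypothetical helper, not from the original post: links URLs that include a
// scheme, but leaves trailing sentence punctuation outside the anchor.
function linkify_urls($text) {
    return preg_replace_callback(
        '#(^|[\n ])(\w+?://\w+[^ "\n\r\t<]*)#',
        function ($m) {
            // Strip a trailing run of punctuation such as "?" or "." from the URL...
            $url = rtrim($m[2], '.,;:!?');
            // ...and keep whatever was stripped so it can follow the closing tag.
            $tail = substr($m[2], strlen($url));
            return $m[1] . '<a href="' . $url . '">' . $url . '</a>' . $tail;
        },
        $text
    );
}

echo linkify_urls('blah..blah..blah http://www.google.com?');
```

A question mark in the middle of a URL, as in a query string, is left alone because only the end of the match is trimmed.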

  • Thanks a lot for this nice function!

  • Those are very cool tips. Thanks for your post.

  • Bill K

    Awesome! Thank you so much. I've been trying for a week to do this and being new to Regex was getting strange results (if any).

  • You can't use this on HTML code without risking breaking your HTML. Example:

    Don't break!

    (… let's hope this thing doesn't post my link as link but as HTML code 😉 )

  • Twitter doesn't allow HTML code, so that's not really an issue.

  • Ninja Developer

    Thanks buddy, that made my day 🙂

  • amidar

    So how do you apply this function in a tweet or rss?
    Sorry to sound dumb but would appreciate an example

  • You wouldn't use it in a tweet or RSS, you would use it in a PHP script that parses tweets or tweet RSS feeds.

    echo twitterify($tweet_content);
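
For anyone wanting a fuller picture of that one-liner, here is a minimal sketch of a script that runs each tweet through the function. The function is repeated so the example runs on its own; the anchor markup in this copy is an assumption based on the behaviour the post describes, and the `$tweets` array is a hypothetical stand-in for text pulled from a feed.

```php
<?php
// The function from the post, repeated here so the example is self-contained.
// The link targets are assumptions based on the behaviour the post describes.
function twitterify($ret) {
    $ret = preg_replace("#(^|[\n ])([\w]+?://[\w]+[^ \"\n\r\t< ]*)#", "\\1<a href=\"\\2\">\\2</a>", $ret);
    $ret = preg_replace("#(^|[\n ])((www|ftp)\.[^ \"\t\n\r< ]*)#", "\\1<a href=\"http://\\2\">\\2</a>", $ret);
    $ret = preg_replace("/@(\w+)/", "<a href=\"http://www.twitter.com/\\1\">@\\1</a>", $ret);
    $ret = preg_replace("/#(\w+)/", "<a href=\"http://search.twitter.com/search?q=%23\\1\">#\\1</a>", $ret);
    return $ret;
}

// Hypothetical tweet text; a real script would read this from the RSS feed.
$tweets = array(
    'Reading http://example.com with @example_user #php',
);

// Print each tweet as a list item with its links made clickable.
foreach ($tweets as $tweet_content) {
    echo '<li>' . twitterify($tweet_content) . "</li>\n";
}
```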

  • Please let me know how to implement this on my twitter. Thank you

  • You don't implement it on your Twitter – Twitter already does auto-linking for you. You would use this in an application or web page that is pulling directly from your Twitter feed and embedding it in the page.

  • This is beautiful.. no seriously, almost made me cry when I saw it. I was about to put “twitter-ifying” my twitter RSS feed on the back-burner, but this is drop-in beauty. Thanks.

  • LOL It's good to know you're not prone to hyperbole or anything. 😀 I'm glad it helped!

  • Joel R

    This totally kicks ass and saved me a good hour+ of work, and I'm really good with regexes.

    Thanks!!

  • Excellent – glad to hear it!

  • Stace

    This is great! I'm just learning preg replace and regex and had already spent some time trying to figure this out with no result in sight. Thank you very much.

    One quickie, though: email addresses. Some tweets contain 'you@gmail.com' with the predictable result of only '@gmail' linking to http://www.twitter.com/gmail. Any thoughts on a fix?

    Thanks again!
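
One way to approach the email address problem is a negative lookbehind, so that an "@" directly following a word character or a dot is skipped. This is a sketch rather than a tested fix: `link_mentions` is a hypothetical name, and the profile URL is assumed from the one mentioned in the comment above.

```php
<?php
// Hypothetical variant of the @username step. The lookbehind (?<![\w.])
// rejects any "@" that directly follows a word character or a dot, which is
// what happens inside an email address such as you@gmail.com.
function link_mentions($text) {
    return preg_replace(
        '/(?<![\w.])@(\w+)/',
        '<a href="http://www.twitter.com/$1">@$1</a>',
        $text
    );
}

echo link_mentions('mail you@gmail.com or ping @example_user');
```

The email address passes through untouched, while a mention at the start of the text or after a space is still linked.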

  • Very nice. I remember when Regex seemed like magic, but now it is one of my most often used tools.

  • I think that's the exact same set of regular expressions I'm using to parse links on my website home page.

  • I use this on Media UK – http://www.mediauk.com – to power a few things, including the ingest of the media tweets on the front page.

    You might like to add
    $ret = html_entity_decode($ret);
    at the start of the function – otherwise you see odd things going on when Twitter (or some client) HTML-encodes things like pound signs.
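
To illustrate why decoding first can matter, here is a small sketch. It assumes, purely as an example, that a pound sign arrives as the numeric entity `&#163;`; in that form the hashtag pattern from the post would see `#163` as a hashtag.

```php
<?php
// Assumed example input: a pound sign encoded as a numeric HTML entity.
$raw = 'Only &#163;5 today';

// Before decoding, the hashtag pattern matches "#163" inside the entity.
$matches_before = preg_match('/#(\w+)/', $raw);

// Decode entities to real characters (UTF-8 output) before any linking.
$decoded = html_entity_decode($raw, ENT_QUOTES, 'UTF-8');

// After decoding there is no "#" left for the pattern to match.
$matches_after = preg_match('/#(\w+)/', $decoded);

echo $decoded . "\n";
```

Note that decoding also turns `&lt;` and `&gt;` back into angle brackets, so whether this is appropriate depends on where the text comes from.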

  • This is awesome, thanks! I started to write my own with a bunch of if's, else's and substr_replace(), and substr() but it quickly became a mess. I really need to learn regex!

  • Hehe – yeah, regex rocks, but it can be a little confusing. There are some great tutorials online though. Stick with it!

  • Luboš

    Thanks a lot!!! There is a problem with some characters in the parsed text. For example, #Nestlé is not parsed correctly.

  • It looks like Twitter doesn't really acknowledge the character, and just converts it to a regular, unaccented 'e': https://twitter.com/#search?q=%23Nestl%C3%A9
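
On the PHP side, the hashtag is cut short because `\w` without the `u` modifier typically matches only ASCII word characters. A possible sketch, assuming the input is UTF-8, is to match Unicode letters and digits explicitly. `link_hashtags_unicode` is a hypothetical name, and the search URL is an assumption; how Twitter's own search treats the accented form is a separate question.

```php
<?php
// Hypothetical variant of the hashtag step for UTF-8 text. The "u" modifier
// makes PCRE read the subject as UTF-8, and \p{L} / \p{N} match letters and
// digits from any script, so "é" stays part of the tag.
function link_hashtags_unicode($text) {
    return preg_replace_callback(
        '/#([\p{L}\p{N}_]+)/u',
        function ($m) {
            // Percent-encode the "#" and any non-ASCII bytes for the query string.
            $query = rawurlencode('#' . $m[1]);
            return '<a href="http://search.twitter.com/search?q=' . $query . '">#' . $m[1] . '</a>';
        },
        $text
    );
}

echo link_hashtags_unicode('Coffee by #Nestlé today');
```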

  • evilunix

    I converted this function to javascript 🙂

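    function twitterify(ret) {
        // The g flag replaces every match, as preg_replace does in PHP
        ret = ret.replace(/(^|[\n ])([\w]+?:\/\/[\w]+[^ "\n\r\t< ]*)/g, '$1<a href="$2">$2</a>');
        ret = ret.replace(/(^|[\n ])((www|ftp)\.[^ "\t\n\r< ]*)/g, '$1<a href="http://$2">$2</a>');
        ret = ret.replace(/@(\w+)/g, '<a href="http://www.twitter.com/$1">@$1</a>');
        ret = ret.replace(/#(\w+)/g, '<a href="http://search.twitter.com/search?q=%23$1">#$1</a>');
        return ret;
    }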

  • Nice work, thanks!

  • evilunix

    The HTML tags seem to have gone missing from the replace parts of those statements! Anyway, the basic idea is to replace \\1 with $1. The hash delimiters should be replaced with forward slashes, which means any other forward slashes need escaping with backslashes!

  • Bop

    Just excellent!

  • Nice function. But I need some help.
    The URL “http://test.com/test.htm#test” doesn't link correctly. Twitter parses this as a complete URL (no hashtag). Your function parses a hashtag, which is not correct.

    Can you fix this?

  • I have modified your function. This code fixes the bug I wrote about in my last comment.
    It also fixes the “you@gmail.com” bug. I hope it is helpful!

    function twitterify($ret) {
    $ret = preg_replace("#(^|[\n ])([\w]+?://[\w]+[^ \"\n\r\t< ]*)#", "\\1<a href=\"\\2\">\\2</a>", $ret);
    $ret = preg_replace("#(^|[\n ])((www|ftp)\.[^ \"\t\n\r< ]*)#", "\\1<a href=\"http://\\2\">\\2</a>", $ret);
    $ret = preg_replace("/(^| )@(\w+)/", "<a href=\"http://www.twitter.com/\\1\">@\\1</a>", $ret);
    $ret = preg_replace("/(^| )#(\w+)/", "\\1<a href=\"http://search.twitter.com/search?q=%23\\2\">#\\2</a>", $ret);
    return $ret;
    }

  • Damn! There was a mistake in my posted code, sorry!
    This code should work:

    function twitterify($ret) {
    $ret = preg_replace("#(^|[\n ])([\w]+?://[\w]+[^ \"\n\r\t< ]*)#", "\\1<a href=\"\\2\">\\2</a>", $ret);
    $ret = preg_replace("#(^|[\n ])((www|ftp)\.[^ \"\t\n\r< ]*)#", "\\1<a href=\"http://\\2\">\\2</a>", $ret);
    $ret = preg_replace("/(^| )@(\w+)/", "\\1<a href=\"http://www.twitter.com/\\2\">@\\2</a>", $ret);
    $ret = preg_replace("/(^| )#(\w+)/", "\\1<a href=\"http://search.twitter.com/search?q=%23\\2\">#\\2</a>", $ret);
    return $ret;
    }

  • Ecenica

    Thanks for sharing this snippet of code! Now using it with SimplePie and Smarty to bring our Twitter into our customer hosting panel.

  • VanishDesign

    This is really cool, but it looks like a 'smart' apostrophe (’) will break it

  • I want you to know that i LOVE you! Finally I found something that works!!!

  • Yes, of course. Smart quotes are not recognized as quote marks.

  • Deratrius

    Nice work. I noticed that if there is a regular link between parentheses, the link isn’t parsed.
    eg: http://www.link.com is parsed but (www.link.com) isn’t.

  • Nic

    Thanks, this is what i’m looking for.

  • Thanks, the function works in my code

  • Lucian

    Awesome! Thank you so much for writing this! It’s perfect!

  • Blocki

    Thanks man! This really helped me a lot, I’m gonna use it in my own twitter app on facebook!

  • Thanks so much for this! I’ve been working on a script for collecting tweets and I was missing functionality to get Twitter handles and hash tags clickable to where they need to go. This helps a ton!

  • Benny

    This helped a lot! Thanks!

  • Biketrails.nl

    Great stuff, really helped me out recreating my timeline using the twitter API and php/mysql.

    Check it out: http://ww.biketrails.nl

  • Very nice, tyvm

  • Hi, thanks a lot for this code snippet.
    It seems that the search subdomain for hashtags is not needed anymore, I had to remove the subdomain in the regexp to fix the link.
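
The commenter did not post the changed line, so the following is only a guess at what the hashtag step might look like without the search subdomain. `link_hashtags` is a hypothetical name and the `https://twitter.com/search?q=` URL form is an assumption.

```php
<?php
// Hypothetical hashtag step that links to twitter.com directly instead of
// the search subdomain. %23 is the percent-encoded "#" for the query string.
function link_hashtags($text) {
    return preg_replace(
        '/#(\w+)/',
        '<a href="https://twitter.com/search?q=%23$1">#$1</a>',
        $text
    );
}

echo link_hashtags('Learning #regex');
```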

  • jkns

    Thanks. Big help. Are the first couple of regexes missing a closing anchor tag (</a>)?

  • Ter

    Hi… this is great code and an easy, simple solution for making entities clickable in a Twitter plugin. My only concern is that when tweets end with URLs, it seems to break the URL. I don’t know if anyone else experiences this in their plugins, but it is certainly a problem in mine.

    • Aron

      Yes I’ve just implemented this and noticed that too, I guess I will have to look at an alternative unless the author can update it!

  • Stephan Fischer

    I use this one:

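    function twitterify($ret)
    {
    $ret = preg_replace("#(^|[\n ])([\w]+?://[\w]+[^ \"\n\r\t< ]*)#", "\\1<a href=\"\\2\">\\2</a>", $ret);
    $ret = preg_replace("#(^|[\n ])((www|ftp)\.[^ \"\t\n\r< ]*)#", "\\1<a href=\"http://\\2\">\\2</a>", $ret);
    $ret = preg_replace("/@(\w+)/", "<a href=\"http://www.twitter.com/\\1\">@\\1</a>", $ret);
    $ret = preg_replace("/#(\w+)/", "<a href=\"http://search.twitter.com/search?q=%23\\1\">#\\1</a>", $ret);
    return $ret;
    }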

  • Baljka Gan

    Just Awesome!

  • Jame

    OK, this is good but old and doesn’t always get round those edge cases. I spent quite a bit of time looking for the definitive answer to this and behold it comes from twitter themselves. *phew* https://dev.twitter.com/docs/tco-url-wrapper/how-twitter-wrap-urls

  • Phil Kershaw

    I was looking into this for a client and you don’t need Regex at all to make the links in a twitter feed clickable. If you want to know more check my blog post about it and let me know what you think: http://blog.philkershaw.me/2013/11/clickable-links-in-your-twitter-feed.html
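
The linked post is not reproduced here, so the following is not necessarily its method. One regex-free approach, assuming tweets are fetched as JSON from an API that returns an `entities` object with character indices (as Twitter's REST API v1.1 did), is to splice links in at those positions. `link_from_entities` and the sample `$tweet` array are hypothetical.

```php
<?php
// Hypothetical sketch: builds links from a tweet's "entities" rather than
// from regular expressions. $tweet mimics the decoded JSON of a tweet object;
// the field names are assumed from Twitter's REST API v1.1.
function link_from_entities(array $tweet) {
    // Entity indices count Unicode code points, so split the text that way.
    $chars = preg_split('//u', $tweet['text'], -1, PREG_SPLIT_NO_EMPTY);
    $spans = array(); // start index => array(end index, replacement HTML)

    foreach ($tweet['entities']['urls'] as $u) {
        $spans[$u['indices'][0]] = array($u['indices'][1],
            '<a href="' . htmlspecialchars($u['expanded_url']) . '">'
            . htmlspecialchars($u['display_url']) . '</a>');
    }
    foreach ($tweet['entities']['user_mentions'] as $m) {
        $name = htmlspecialchars($m['screen_name']);
        $spans[$m['indices'][0]] = array($m['indices'][1],
            '<a href="https://twitter.com/' . $name . '">@' . $name . '</a>');
    }
    foreach ($tweet['entities']['hashtags'] as $h) {
        $spans[$h['indices'][0]] = array($h['indices'][1],
            '<a href="https://twitter.com/search?q=' . rawurlencode('#' . $h['text'])
            . '">#' . htmlspecialchars($h['text']) . '</a>');
    }

    // Replace from the end of the text so earlier indices stay valid.
    krsort($spans);
    foreach ($spans as $start => $span) {
        array_splice($chars, $start, $span[0] - $start, array($span[1]));
    }
    return implode('', $chars);
}

// Made-up sample data in the assumed shape.
$tweet = array(
    'text' => 'Docs at http://t.co/abc via @example_user #php',
    'entities' => array(
        'urls' => array(array(
            'url' => 'http://t.co/abc',
            'expanded_url' => 'http://example.com/docs',
            'display_url' => 'example.com/docs',
            'indices' => array(8, 23),
        )),
        'user_mentions' => array(array('screen_name' => 'example_user', 'indices' => array(28, 41))),
        'hashtags' => array(array('text' => 'php', 'indices' => array(42, 46))),
    ),
);

echo link_from_entities($tweet);
```

Because the positions come from the API, edge cases such as trailing punctuation, email addresses and URL fragments never reach a pattern match at all.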