Regular expressions and all things nice
By Jonathan Street
First we create two arrays. $bbcode_regex contains all the regular expressions we're looking for, and $bbcode_replace contains what we want to replace them with. Next we apply ksort to both arrays so that their elements sit in the same key order. preg_replace pairs each pattern with its replacement by position in the array rather than by key, so this only matters if the elements were added out of order, but I think it's good practice. Using preg_replace we then look for text matching each regular expression and swap in its replacement.
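As a rough sketch of that setup, something like the following would do. The two patterns and their replacements are illustrative assumptions, not the full list used in this article.

```php
<?php
// Illustrative patterns only (assumed for this sketch).
$bbcode_regex = [
    0 => '/\[b\](.+?)\[\/b\]/s',
    1 => '/\[i\](.+?)\[\/i\]/s',
];
$bbcode_replace = [
    0 => '<b>$1</b>',
    1 => '<i>$1</i>',
];

// preg_replace pairs patterns and replacements by array order, not by key,
// so sorting both arrays by key keeps each pattern next to its replacement.
ksort($bbcode_regex);
ksort($bbcode_replace);

$post = 'Some [b]bold[/b] and [i]italic[/i] text.';
$html = preg_replace($bbcode_regex, $bbcode_replace, $post);

echo $html, "\n"; // Some <b>bold</b> and <i>italic</i> text.
```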
Now on to regular expressions: incredibly useful but at first glance incomprehensible. As an example we'll take number 4 (one of the more complex expressions) and break it down.
There are certain special characters available for use in regular expressions, which is really where their power lies. However, if we just want the literal meaning of a character we can 'escape' its special meaning with a backslash. If we strip out these backslashes we get the following.
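As an aside, here is a small sketch of why the backslashes matter, using a [b] tag as an assumed example. Unescaped square brackets form a character class, so the pattern matches a single letter b rather than the tag itself.

```php
<?php
// Unescaped: [b] is a character class that matches one letter "b".
$unescaped_hits_letter = preg_match('/[b]/', 'b');      // 1

// Escaped: \[b\] matches the literal text "[b]" and nothing else.
$escaped_hits_letter   = preg_match('/\[b\]/', 'b');    // 0
$escaped_hits_tag      = preg_match('/\[b\]/', '[b]');  // 1

var_dump($unescaped_hits_letter, $escaped_hits_letter, $escaped_hits_tag);
```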
We also need to define the start and end of the regular expression. This is simply done by putting the same character, in this case the forward slash, at the beginning and end of the expression. We can also add certain letters (modifiers) after the closing slash which change how the expression is applied. Let's strip all that away.
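To illustrate the delimiters, here is a sketch with an assumed [b] tag written twice, once with forward slashes and once with hash signs. Any matching pair works. The only practical difference is that the slash inside the closing tag has to be escaped when the slash is also the delimiter.

```php
<?php
$text = '[b]bold[/b]';

// Forward slash as delimiter: the "/" in [/b] must be escaped.
$with_slash = '/\[b\](.+?)\[\/b\]/s';

// Hash as delimiter: the "/" in [/b] can be left alone.
$with_hash = '#\[b\](.+?)\[/b\]#s';

$a = preg_replace($with_slash, '<b>$1</b>', $text);
$b = preg_replace($with_hash, '<b>$1</b>', $text);

echo $a, "\n"; // <b>bold</b>
echo $b, "\n"; // <b>bold</b>
```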
This is looking a lot like the bbcode we are familiar with. The only remaining differences are the two occurrences of (.+?). The period just means any character and the plus sign means one or more. This will therefore match whatever the user puts between the tags. Why the question mark? Ordinarily the plus sign is greedy, meaning it matches as much text as it possibly can. If the user uses a certain bbcode tag twice, a greedy match runs from the first opening tag all the way to the last closing tag, swallowing the first closing tag and the second opening tag along the way. It grabs more of the post than we want it to grab. Adding the question mark makes the match lazy, so it stops at the first closing tag it finds and each pair of tags is treated separately.
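The difference is easiest to see side by side. This is a minimal sketch using an assumed [b] tag, not one of the expressions from the article.

```php
<?php
$post = '[b]one[/b] and [b]two[/b]';

// Greedy: (.+) runs on to the LAST closing tag in the post.
$greedy = preg_replace('/\[b\](.+)\[\/b\]/s', '<b>$1</b>', $post);

// Lazy: (.+?) stops at the FIRST closing tag it can find.
$lazy = preg_replace('/\[b\](.+?)\[\/b\]/s', '<b>$1</b>', $post);

echo $greedy, "\n"; // <b>one[/b] and [b]two</b>
echo $lazy, "\n";   // <b>one</b> and <b>two</b>
```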
I mentioned before about adding letters onto the end of the regular expression. Here I use an s, which lets the period match new line characters as well as everything else. This makes sure that tags encompassing multiple lines of text don't present a problem.
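A quick sketch of what the s changes, again with an assumed [b] tag. Without it the period refuses to cross a new line, so a tag spanning two lines is left untouched.

```php
<?php
$post = "[b]line one\nline two[/b]";

// Without "s" the period does not match "\n", so nothing is replaced.
$without_s = preg_replace('/\[b\](.+?)\[\/b\]/', '<b>$1</b>', $post);

// With "s" the period matches "\n" too, so the whole tag is replaced.
$with_s = preg_replace('/\[b\](.+?)\[\/b\]/s', '<b>$1</b>', $post);

var_dump($without_s === $post); // bool(true)
echo $with_s, "\n";
```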
Hopefully that explains everything except for how we get the text between the tags out. Going back to (.+?) we see it is enclosed in brackets. Brackets signify that we want this text available later. Text matched inside brackets is captured into numbered variables that can be used in the replacement. The replacement for the above regular expression is as follows.
This is just standard html apart from $1 and $2. These are the variables where our text is stored. $0 contains the entire matched text, tags included. $1 contains the text from the first set of brackets and $2 contains the text from the second set of brackets.
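To make the numbering concrete, here is a sketch with two sets of brackets. The [url=...] tag and the link it produces are assumptions chosen for illustration, not necessarily the expression discussed above. preg_match fills an array whose indexes line up with $0, $1 and $2.

```php
<?php
// Hypothetical two-capture pattern (assumed for this sketch).
$pattern = '/\[url=(.+?)\](.+?)\[\/url\]/s';
$post    = 'Visit [url=http://example.com]my site[/url] today.';

// $m[0] is the whole match, $m[1] and $m[2] are the bracketed parts.
preg_match($pattern, $post, $m);
print_r($m);

// In the replacement string the same parts are written $1 and $2.
$html = preg_replace($pattern, '<a href="$1">$2</a>', $post);
echo $html, "\n"; // Visit <a href="http://example.com">my site</a> today.
```

Note the single quotes around the replacement. In double quotes PHP would try to interpret $1 itself before preg_replace ever saw it.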
That's it for replacing the bbCode tags. Apart from creating the regular expressions there isn't much to it.
The final thing that needs to be done is replace the new lines with br tags and then echo the post to the browser.
nl2br is a nice little function that does almost exactly what it says it will do. It inserts a br tag before each new line in the text, leaving the new lines themselves in place.
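A minimal sketch of that last step, with a made-up two-line post standing in for the real one:

```php
<?php
// Stand-in for a post that has already had its bbCode replaced.
$post = "First line\nSecond line";

// nl2br adds "<br />" in front of each new line and keeps the new line.
$output = nl2br($post);

echo $output; // First line<br />
              // Second line
```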
Hopefully you should now be able to easily handle bbCode submitted by your visitors as well as making it easy for them to submit bbCode in the first place. In most cases the bbCode replacements label the different aspects of the post using classes. This should make it easy for you to match the output of these elements with the design of your site using cascading style sheets. Good luck and if you have any questions feel free to ask on the forum.
© Copyright 2021 Ryan Auclair/IceTeks, All rights reserved