
First sentence of body text to be different color with regex


JoyceKeller


I need to change the text color within the body text; specifically, I need the first sentence of the body text to be a different color. I'm using a regular expression to find the first sentence.

 

 
return TaggedTextFromRaw(Field("body text")).replace(/(\&lt\(\S+)(\&gt\;)/g, '<color name="purple">????</color>');

 

I don't know what to write between the color tags (represented by question marks). The code validates, and shows (in the validation window) the first sentence separated from the rest of the paragraph by the color tag, but I don't know where to take it from here.

 

Should end up looking something like this:

 

The quick brown fox jumps over the lazy dog. Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Nam cursus. Morbi ut mi. Nullam enim leo, egestas id, condimentum at, laoreet mattis, massa. Sed eleifend nonummy diam. Praesent mauris ante, elementum et, bibendum at, posuere sit amet, nibh. Duis tincidunt lectus quis dui viverra vestibulum. Suspendisse vulputate aliquam dui.

 

Nulla elementum dui ut augue. Aliquam vehicula mi at mauris. Maecenas placerat, nisl at consequat rhoncus, sem nunc gravida justo, quis eleifend arcu velit quis lacus. Morbi magna magna, tincidunt a, mattis non, imperdiet vitae, tellus.


I figured out most of my answer. Instead of changing the color of the first sentence, I change the color of the body text after it, which I find by looking for the first ".", "!", or "?" in the body text.

 

The only problem I have now is that the punctuation that ends the first sentence disappears.

 

Is anybody out there good with regex?

 

 
return TaggedTextFromRaw(Field("maintxt")).replace(/([.!?]|<p[^>]*>)(\s| |<br\s*\/?>)([^\.])*/,'<color name="Black-66">');
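A likely reason the punctuation disappears: replace() swaps out everything the pattern matched, including the punctuation captured by the first group, and the replacement string never writes it back. Here is a minimal sketch in plain JavaScript, assuming a simplified pattern and a sample string in place of the tagged field value (neither is the rule above):

```javascript
// Sample text standing in for the tagged field value (an assumption for illustration).
var text = "Is this first? Yes it is. And more.";

// Simplified pattern: group 1 is the sentence-ending mark, group 2 the whitespace after it.
var pattern = /([.!?])(\s)/;

// The replacement ignores the groups, so the "? " that was matched is dropped.
var lost = text.replace(pattern, '<color name="Black-66">');

// Writing $1$2 back first keeps the punctuation and the space ahead of the color tag.
var kept = text.replace(pattern, '$1$2<color name="Black-66">');
```

Here lost runs "first" straight into the color tag, while kept still reads "first? " before it.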


Would this work the way you want it to?
return TaggedTextFromRaw(Field("maintxt")).replace(/(^(\b.*?)(?:\.|\?\!))/gi,'<color name="CHS-4014">$1</color>');

That doesn't seem to work if the first sentence ends with a question mark or exclamation point.
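To see why, note that the alternation (?:\.|\?\!) accepts either a period or the two-character sequence "?!", not a lone "?" or "!". A small sketch in plain JavaScript, using a sample string and square brackets in place of the color tags (both are just for illustration), compares it with a character set:

```javascript
// Sample text where the first sentence ends with a question mark.
var sample = "Is it here? Yes. Done.";

// Pattern from the suggestion above (flags omitted): ends on "." or on "?!".
var suggested = /(^(\b.*?)(?:\.|\?\!))/;

// Variant using a character set, so any one of . ? ! ends the match.
var withSet = /(^(\b.*?)[.?!])/;

// Brackets mark what each pattern treats as the first sentence.
var suggestedResult = sample.replace(suggested, "[$1]");
var withSetResult = sample.replace(withSet, "[$1]");
```

With the suggested pattern the match runs past the question mark to the first period; with the character set it stops at the question mark.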

 

Here's my solution:

var s = TaggedTextFromRaw(Field("maintxt"));
return s.replace(/([^\.\?\!]+.)/,'<color name="CHS-4014">$&</color>');

Breaking down my solution from the inside out, you can see how it works:

 

  • [^\.\?\!] The square brackets denote a set of characters. The characters in the set are the period, question mark, and exclamation mark. The period and question mark have special meanings elsewhere in a regular expression, so they are escaped here with backslashes to make them literal; inside a set the escapes are not strictly required, but they are harmless. The caret ^ at the start of the set negates the character set. So the meaning of this expression is, "any character other than period, question mark, or exclamation point."
  • [^\.\?\!]+ The plus sign means, "one or more of the character set (of any character other than period, question mark, or exclamation point)."
  • [^\.\?\!]+. The period at the end here means, "any character" (other than a line break). So now the whole thing means, "one or more of the character set (of any character other than period, question mark, or exclamation point), followed by any character." In other words, a sentence. In this case, the "any character" has to be a period, question mark, or exclamation point; if it were anything else, it would have been grabbed up by the set [^\.\?\!]+ .
  • ([^\.\?\!]+.) The parentheses denote a pattern to match and "capture," to be referenced later in the replacement string with $1, $2, etc. (The $& notation refers to the entire match, whether or not parentheses are used.)
  • /([^\.\?\!]+.)/ The slashes denote a literal Regular Expression.
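The breakdown above can be tried outside FusionPro with a short sketch in plain JavaScript. The sample string is an assumption standing in for TaggedTextFromRaw(Field("maintxt")), which is only available inside a FusionPro rule:

```javascript
// Sample text standing in for the tagged field value (an assumption for illustration).
var s = "Is this the first sentence? Yes. It is!";

// Same pattern and replacement string as the solution above.
var result = s.replace(/([^\.\?\!]+.)/, '<color name="CHS-4014">$&</color>');

// The set grabs "Is this the first sentence", and the trailing "." grabs the "?".
var expected = '<color name="CHS-4014">Is this the first sentence?</color> Yes. It is!';
```

Only the first sentence is wrapped, and its question mark stays inside the color tags.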

Note that we don't need to add the "gi" flags to the Regular Expression literal. The "g" (global) flag is unnecessary, as we only care about the first match (the first sentence). The "i" (case insensitive) flag is unnecessary, as we're not trying to match any alphabetical characters where case would matter anyway, just punctuation.
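To illustrate the point about the "g" flag, here is a small sketch in plain JavaScript. The sample string and the square brackets (in place of the color tags) are just for illustration:

```javascript
// Sample text with three short sentences.
var s = "One. Two! Three?";

// Without "g", only the first match (the first sentence) is replaced.
var firstOnly = s.replace(/([^\.\?\!]+.)/, "[$&]");

// With "g", every sentence is matched and wrapped in turn.
var everySentence = s.replace(/([^\.\?\!]+.)/g, "[$&]");
```

firstOnly wraps just "One.", while everySentence wraps all three sentences, which is not what is wanted here.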

 

Finally, we use the "$&" notation in the replacement string to represent the matched substring. You could use "$1" instead, but there's only one string to match here. Voila!
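As a quick check of that equivalence, this sketch in plain JavaScript (with an assumed sample string) runs the replacement three ways: with "$&", with "$1", and with "$&" but no parentheses in the pattern:

```javascript
// Sample text standing in for the tagged field value (an assumption for illustration).
var s = "Hello there! More text.";

// "$&" inserts the whole match.
var viaWholeMatch = s.replace(/([^\.\?\!]+.)/, '<color name="CHS-4014">$&</color>');

// "$1" inserts the first captured group, which here spans the whole match.
var viaGroup = s.replace(/([^\.\?\!]+.)/, '<color name="CHS-4014">$1</color>');

// "$&" does not depend on the parentheses, so they can be dropped.
var noParens = s.replace(/[^\.\?\!]+./, '<color name="CHS-4014">$&</color>');
```

All three produce the same string, with "Hello there!" wrapped in the color tags.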

 

Remember that the JavaScript Reference is your friend, specifically:

https://developer.mozilla.org/en-US/docs/JavaScript/Guide/Regular_Expressions

and:

https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Global_Objects/String/replace

 

I'm not sure why you have other things like (\s| |<br\s*\/?>) in your examples. It seems to me that it doesn't really matter what else is in the string, including any markup. We only need to look for a punctuation character that ends a sentence (a period, question mark, or exclamation point).

 

The only problem with this approach arises when the first sentence contains one of those punctuation marks somewhere it doesn't actually end the sentence. For instance, an abbreviation such as "Mr.", a quoted sentence, or anything else that uses one of those marks will cut the match short. But there's only so much you can do in a computerized algorithm to parse written text, which has rules that are not always consistent.
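A short sketch in plain JavaScript shows the abbreviation case, using an assumed sample string in place of the tagged field value:

```javascript
// Sample text whose first sentence starts with an abbreviation.
var s = "Mr. Fox jumps over the lazy dog. Lorem ipsum.";

// Same pattern and replacement string as the solution above.
var result = s.replace(/([^\.\?\!]+.)/, '<color name="CHS-4014">$&</color>');

// The match stops at the period in "Mr.", so only the abbreviation is colored.
var expected = '<color name="CHS-4014">Mr.</color> Fox jumps over the lazy dog. Lorem ipsum.';
```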

 

Finally, putting on my moderator hat, I'll note that you can always edit your own threads and posts, so if you make a mistake, you can correct it without making another post.


Thanks for y'all's help. And Step... happy 30th anniversary to your company :)

 

I was getting hung up on a notation to represent the matched substring, and not paying enough attention to the substring itself. Thanks Dan, also for the bit of regex guidance. I've bookmarked the links as well.

 

Now I have another problem: after uploading the template to MCC and trying to use the Rich Text Editor, it works until I add a line break. But I will post that in the MCC Forum.
