Smart Quotes in Rich Text Editor and Plain Text Editor


jl_arnold

I am having trouble perfecting this smart quote JavaScript. I have a template that uses both plain text and rich text fields. I have gotten the smart quotes to face the correct direction most of the time.

 

One issue is when an apostrophe is used as the first character in the field, both in a plain text and rich text field.

 

The second issue is when a double quote is the first character, which happens in the rich text editor only.

 

 

My code is:

 

Rich Text Rule

function replaceFunction(field)
{
   // replace quotes at the beginning or end of the entire string first
   field = field.replace(/^\"(?=[\w|$][^<>]*<)/, "&ldquor;");
   field = field.replace(/^\'(?=[\w|$][^<>]*<)/, "&lsquor;");
   field = field.replace(/\"$(?=[^<>]*<)/, "&rdquor;");
   field = field.replace(/\'$(?=[^<>]*<)/, "&rsquor;");

   // now replace quotes before or inside words
   field = field.replace(/\s"(?=[\w|$][^<>]*<)/g, " &ldquor;");
   field = field.replace(/\s'(?=[\w|$][^<>]*<)/g, " &lsquor;");

   // finally, replace all other quotes (after words)
   field = field.replace(/\"(?=[^<>]*<)/g, "&rdquor;");
   field = field.replace(/\'(?=[^<>]*<)/g, "&rsquor;");

   return field;
}

if (Field("Body") != "")
   return replaceFunction(Field("Body"));
else
   return "";

 

 

Plain Text Rule

function replaceFunction(field)
{
   // replace quotes at the beginning or end of the entire string first
   field = field.replace(/^\"/, "&ldquor;");
   field = field.replace(/^\'/, "&lsquor;");
   field = field.replace(/\"$/, "&rdquor;");
   field = field.replace(/\'$/, "&rsquor;");

   // now replace quotes before or inside words
   field = field.replace(/\s"(?=[\w|$])/g, " &ldquor;");
   field = field.replace(/\s'(?=[\w|$])/g, " &lsquor;");

   // finally, replace all other quotes (after words)
   field = field.replace(/\"/g, "&rdquor;");
   field = field.replace(/\'/g, "&rsquor;");

   return field;
}

if (Field("Headline") != "")
   return replaceFunction(Field("Headline"))
   else
   return "";

 

 

 

I tested modifying the code a few ways but don't really know how to write the correct fix. I added a line to the "now replace quotes before or inside words" section but wasn't able to get it exactly right.

 

During testing, I could add the following line to the replace quotes before or inside words to fix the double quotes:

   field = field.replace(/\"(?=[\w|$][^<>]*<)/g, "&ldquor;");

However, I can't do this for the single quote, because it changes an apostrophe (&rsquor;) into a left single quote.

 

 

 

 

Can someone help me as I am not very good at creating javascript code?

 

Thanks!


How can I write that I only want to change an apostrophe if it's surrounded by other letters? I'm guessing the any character code is $ but I'm not sure.

 

Would it be: field = field.replace(/\$'$(?=[\w|$])/g, "&rsquor;");

There's no need to guess. Everything you ever wanted to know about JavaScript Regular Expressions and more can be found here:

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions

 

The "any character" character in a Regular Expression is actually the dot, or period, character.

 

However, in this case, I think the distinction you want to make is whether the apostrophe is next to a "word" character, not any character. For most purposes, a word character is matched by \w in a Regular Expression.
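
To illustrate the difference, here is a minimal sketch. The sample string and variable names are made up for this example and are not taken from the template. It compares the dot, which matches any character, with \w, which matches only word characters, on either side of a straight apostrophe:

```javascript
// Hypothetical sample text: one apostrophe inside a word, one quoted word.
var sample = "it's a 'test'";

// Dot: any character may sit on either side, so the quote after the space matches too.
var withDot = sample.replace(/(.)'(.)/g, "$1\u2019$2");

// \w: only an apostrophe between two word characters matches.
var withWord = sample.replace(/(\w)'(\w)/g, "$1\u2019$2");
```

With the dot, the opening quote of 'test' is also turned into a right single quote, which is probably not what you want. With \w, only the apostrophe in "it's" is changed.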

 

Also, besides the Mozilla reference, the other great resources that you have at your fingertips for help with JavaScript are search engines such as Google, and programming knowledge and Q&A sites such as Stack Overflow.

 

A Google search for "JavaScript replace with smart quotes" turns up lots of hits, including this one:

http://codereview.stackexchange.com/questions/75699/regex-for-curly-quotes-and-apostrophes

 

Cribbing from the answer there:

http://codereview.stackexchange.com/a/77177

 

We get this:

field = field.replace(/"([^"]*)"/g, "“$1”").replace(/(\w)'(\w)/g, "$1’$2").replace(/'([^']*)'/g, "‘$1’");

Which I believe meets the requirements.
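
For anyone who wants to try the line above outside FusionPro, here is a small sketch that wraps the same three replacements in a function. The name curlyQuotes and the sample strings are made up for illustration, and the curly characters are written as \u escapes only so the code is safe to paste anywhere:

```javascript
// Same three passes as the one-liner, one per line.
function curlyQuotes(text) {
    return text
        .replace(/"([^"]*)"/g, "\u201C$1\u201D") // paired double quotes
        .replace(/(\w)'(\w)/g, "$1\u2019$2")     // apostrophe between word characters
        .replace(/'([^']*)'/g, "\u2018$1\u2019"); // paired single quotes
}

var mixed = curlyQuotes("\"Don't\" 'stop'");
var nested = curlyQuotes("'Don't stop'");
var unpaired = curlyQuotes("'Tis");
```

The order matters: the apostrophe pass runs before the paired single quote pass, so the apostrophe in "Don't" is already curly and cannot be mistaken for a closing quote. An unpaired leading apostrophe, as in 'Tis, is left straight by this approach.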

 

Note that I'm simply using actual curly quote characters here instead of markup entities such as "&rsquor;". If you use entities, you need to check the "Treat returned strings as tagged text" box for the rule, but then you also need to call functions such as TaggedTextFromRaw on the parts of the data that you aren't replacing, to make sure that things like literal ampersands appear correctly in the output. You could call TaggedTextFromRaw on the whole thing first, then replace the &apos; and &quot; entities that result, but that gets even more complicated when you need to figure out whether an apostrophe is in the middle of a word or not. It's better to leave the format of the data (either tagged or non-tagged "raw" text) alone.


Thanks Dan!

 

With this single line of code, I replaced my original 8 lines for the rule associated to the plain text editor.

 

I have found this doesn't work 100% with the rich text editor. The quotes are converting to the curly smart quotes, but it's not applying the font. The font type seems to be unicode character.

 

The font I'm using is very much a rounded curly quote, which appears when entering text in a plain text field. But in the Rich Text Editor, the quotes appear blocky and not rounded.

 

I tried editing the code by using the markup entities, such as "&rsquor;". I used RawTextfromTagged("rsquor;") + "$1" + RawTextfromTagged("&rsquor;") but that did not fix the issue with the font style of the quotes.

 

Can you also provide this code for the Rich Text Editor?

 

Thanks!


With this single line of code, I replaced my original 8 lines for the rule associated to the plain text editor.

Great!

I'm not sure I understand what is not appearing the way you want. A picture might be worth a thousand words here.

Can you also provide this code for the Rich Text Editor?

To what editor specifically are you referring? Do you mean an online editor in a web app such as MarcomCentral?


The image below shows the quotes correctly where the green background is, but the text has incorrect quotes. I also just noticed the font is not Proxima Nova, which is used in the FusionPro template.

 

1abk4uuy.pdf

 

 

 

As for the Rich Text Editor, this is a feature in MarcomCentral that allows users to choose to bold certain text, or italicize, as well as center and left align. I have two smart quote rules, one when a plain text form field is used in MarcomCentral, and another for the rich text form fields. The JavaScript Regular Expression for the Rich Text Editor seems to ignore any content within a tag <>.

 

That behavior comes from an extra lookahead, such as (?=[^<>]*<), in the expressions I was previously using.

 

Does this make sense and help determine the code needed?
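
For reference, here is a rough sketch of what that lookahead does. The sample markup is made up for illustration and is not what MarcomCentral actually generates. The lookahead only lets a quote match when the next angle bracket after it is an opening "<", which means the quote sits in text between tags rather than inside a tag:

```javascript
// A straight double quote matches only if the next angle bracket after it is "<".
var outsideTag = /"(?=[^<>]*<)/g;

// Hypothetical tagged text: quotes inside a tag, between tags, and after the last tag.
var sample = '<span font="Arial">say "hi"</span> and "bye"';

// Mark every matching quote with # to see which ones are picked up.
var marked = sample.replace(outsideTag, "#");
```

Here marked comes out as <span font="Arial">say #hi#</span> and "bye". The quotes inside the tag are skipped, as intended, but so are the quotes after the last tag, because no "<" follows them.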

Okay, thanks for the attachment. But I still don't see what you mean. All of the text in there seems to have the correct curly quotes applied. I've copied it from the PDF and pasted it here:

“Please’s” ‘Work’s’

‘Please’s’ “Work’s”

“Please’s” ‘Work’s’ “Please’s”

‘Please’s’ “Work’s” ‘Please’s’

What specifically is not right about it?

 

Or how about this: What is the expected result, and exactly how is it different than the actual result?

Does this make sense and help determine the code needed?

Not really, sorry. If this part of the question is specific to MarcomCentral, it's generally more appropriate to post it in the sub-forum specific to the MarcomCentral web app. That said, when you use the RTE in MarcomCentral, it's generating tagged markup instead of "raw" flat-file data like is generally used in FP Desktop when designing templates. So, unlike your flat-file data, the data generated by Marcom probably already has entities such as &apos; in it. The best way to figure this out is to determine exactly what is in that MarcomCentral-generated data file. That's something that your BRM should be able to help with, or you can try asking in the MarcomCentral sub-forum.


Dan,

 

The PDF does contain smart quotes in both sections, where the white text is 100% correct. This white text uses a plain text form field.

 

The black text is incorrect.

 

Downloading the PDF, you will see the text in white has rounded curly quotes. When you zoom in to the black text, it's squared smart quotes. The text color is also black, when it should be 80% black and the font type is Helvetica when it should be Proxima Nova A Light. This black text form field uses a third party "Rich Text Editor" within MarcomCentral to allow users in the storefront to make the font changes.

 

I'll post the Rich Text Editor question in the other forum you mentioned, but if you know what can be added to the code you originally provided to ignore any text found between an open tag (<) and a closing tag (>), that would be extremely helpful.

field = field.replace(/"([^"]*)"/g, "“$1”").replace(/(\w)'(\w)/g, "$1’$2").replace(/'([^']*)'/g, "‘$1’");

It sounds to me like the rule to replace straight quotes with "smart" quotes is working fine (and therefore the original issue/question in this thread is solved/answered), and that you're now describing a separate issue with the formatting of the text. Specifically, one font has "rounded curly quotes" while another has "squared smart quotes." Therefore, I think that if we can get all of the text into the same font (or at least into the same font family), then all the quotes will look basically the same.

 

It's very hard to determine much about what's going on with the job just from the output, but I suspect that FusionPro is failing to find the requested font at composition time. This could be due to a mismatch between the font in the Rich Text Editor and the fonts collected with the FusionPro job in Acrobat.

 

The best way to determine for sure what's going on is to look at the composition log (.msg) file from the MarcomCentral composition. All you need to do is find the URL to the output PDF you already downloaded, and replace the ".pdf" at the end of that URL with ".msg" and paste that back into your browser's address bar and press Enter or Go and you should see the log file. There's probably a message in there about the font.


I know the font is there, because the Green bar background color in the one composition is the 1st page of a 4 page template. There is a field in the form where the user selects Green, Teal, Purple or Pink. I only edited the Green template with the new code which loses the font. The 2nd PDF with the Teal is simply selecting that page of the template which uses my original rule. Nothing in the template was changed other than the addition of the second smart quote rule.

 

But the message is found below:

 

Job started 04:21:28 - 1469100088.

Creator: FusionPro VDP Producer (API) 9.3.36

Computer Name: SDPFI04

Current working folder: D:\US\ImageServer\bin

Temporary files folder: C:\Users\Public\Documents\PTI\FusionPro\TEMP_1524\

Template File: \\sdfsc02.dc.pti.com\tempvol\tmp\vd3j4hiy.dif

Input File: \\sdfsc02.dc.pti.com\tempvol\tmp\vd3j4hiy.xml

Job Config File: \\sdfsc02.dc.pti.com\tempvol\tmp\vd3j4hiy.cfg

Unknown Tag /p ignored.

Composing record #1, input record 1

Para style <“”> not found

Unknown Tag /p ignored.

Job ended 04:21:29 - 1469100089.

Total Job Time: 1s

 

 

 

I compared the message above to the message created with the Teal bar, and the only difference seems to be that the Green bar had a message with: "Para style <“”> not found". (see below)

 

Teal composition message:

 

Job started 04:23:53 - 1469100233.

Creator: FusionPro VDP Producer (API) 9.3.36

Computer Name: SDPFI04

Current working folder: D:\US\ImageServer\bin

Temporary files folder: C:\Users\Public\Documents\PTI\FusionPro\TEMP_10880\

Template File: \\sdfsc02.dc.pti.com\tempvol\tmp\dtmwb01p.dif

Input File: \\sdfsc02.dc.pti.com\tempvol\tmp\dtmwb01p.xml

Job Config File: \\sdfsc02.dc.pti.com\tempvol\tmp\dtmwb01p.cfg

Unknown Tag /p ignored.

Composing record #1, input record 1

Unknown Tag /p ignored.

Job ended 04:23:55 - 1469100235.

Total Job Time: 2s


My guess is that the rule is replacing the quotes in your tags with smart quotes as well and FP is having difficulty parsing them. You can add this to your code to change any smart quotes in your tags back to normal quotes:

var field = Field("Your Field");
field = field.replace(/"([^"]*)"/g, "“$1”").replace(/(\w)'(\w)/g, "$1’$2").replace(/'([^']*)'/g, "‘$1’");

// Added: change any smart quotes inside tags back to straight quotes
(field.match(/<[^>]+>/g) || [])
   .filter(function(tag) {
       return /[“”‘’]/.test(tag);
   })
   .forEach(function(find) {
       var replace = find.replace(/[‘’]/g, "'").replace(/[“”]/g,'"');
       field = field.replace(find, replace);
   });

return field;
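
An alternative, sketched here only as an assumption about what could work rather than something tested in FusionPro or MarcomCentral, is to never touch the tags in the first place: split the text into tag and non-tag pieces and run the replacements on the non-tag pieces only. The function name smartQuotesOutsideTags and the sample markup are made up for this example:

```javascript
function smartQuotesOutsideTags(text) {
    // The capturing group makes split() keep the tags as separate pieces.
    var pieces = text.split(/(<[^>]+>)/);
    return pieces.map(function (piece) {
        // Leave anything that is a whole tag untouched.
        if (/^<[^>]+>$/.test(piece)) {
            return piece;
        }
        return piece
            .replace(/"([^"]*)"/g, "\u201C$1\u201D")
            .replace(/(\w)'(\w)/g, "$1\u2019$2")
            .replace(/'([^']*)'/g, "\u2018$1\u2019");
    }).join("");
}

var tagged = smartQuotesOutsideTags('<p style="x">"It\'s"</p>');
```

One trade-off: because each piece is processed on its own, a quoted phrase that has a tag in the middle of it (for example bold text inside the quotes) will not be paired up, so the restore-after approach above may suit that kind of data better.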


Thanks Ste!

 

Testing this code, I found it works for both Rich Text and plain text form fields, which is very convenient.

 

There is only one thing I found that does not convert correctly to smart quotes. It's a single quote after a word. This would be used for possessives. For instance:

 

The Arnolds' house.

I like Texas' weather.

 

 

To add this to the existing code, would I add the following:

.replace(/(\w)'/g, "$1’")

 

 

I noticed that the code doesn't convert quote characters with spaces around them, but that shouldn't be an issue since a quote shouldn't be used by itself.

 

Please let me know about fixing a single quote at the end of a word. Thanks!


To add this to the existing code, would I add the following:

.replace(/(\w)'/g, "$1’")

Sure, you could do it that way. But consider the following:

var field = "The Arnolds' house is Texas-travelers' \"most frequented\" establishment.";
return field.replace(/"([^"]*)"/g, "“$1”").replace(/(\w)'(\w)/g, "$1’$2").replace(/'([^']*)'/g, "‘$1’").replace(/(\w)'/g, "$1’");

The Arnolds‘ house is Texas-travelers’ “most frequented” establishment.

In the case of two plural possessive words in the same sentence, the replacement for a pair of single quotes runs first and treats the two apostrophes as an opening and a closing quote. So you'd want to make sure you perform the possessive replacement before the replacement for a pair of single quotes.
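
Here is a small sketch of that ordering point, using the same sample sentence. The two function names are made up for illustration; the only difference between them is where the possessive replacement sits in the chain:

```javascript
var sentence = "The Arnolds' house is Texas-travelers' \"most frequented\" establishment.";

// Possessive replacement appended last, as in the line above.
function possessiveLast(text) {
    return text
        .replace(/"([^"]*)"/g, "\u201C$1\u201D")
        .replace(/(\w)'(\w)/g, "$1\u2019$2")
        .replace(/'([^']*)'/g, "\u2018$1\u2019")
        .replace(/(\w)'/g, "$1\u2019");
}

// Possessive replacement moved ahead of the paired single quote replacement.
function possessiveFirst(text) {
    return text
        .replace(/"([^"]*)"/g, "\u201C$1\u201D")
        .replace(/(\w)'(\w)/g, "$1\u2019$2")
        .replace(/(\w)'/g, "$1\u2019")
        .replace(/'([^']*)'/g, "\u2018$1\u2019");
}

var last = possessiveLast(sentence);
var first = possessiveFirst(sentence);
```

With possessiveLast, the apostrophe after "Arnolds" comes out as a left single quote. With possessiveFirst, both possessives get a right single quote.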

 

Or you could modify your existing regexp to include:

field = field.replace(/"([^"]*)"/g, "“$1”").replace(/(\w)'(\w*)/g, "$1’$2").replace(/'([^']*)'/g, "‘$1’");

Previously, that particular regexp was looking for a word character (\w) followed by an apostrophe (') followed by a word character in order to consider it a replaceable match. Adding the asterisk (*) modifies the search pattern to be: a word character followed by an apostrophe followed by 0 or more word characters.
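
As a quick illustration of the asterisk, this sketch runs both versions of that one pattern on a few made-up strings. The variable names are only for this example:

```javascript
var strict = /(\w)'(\w)/g;   // apostrophe must be followed by a word character
var relaxed = /(\w)'(\w*)/g; // apostrophe may be followed by zero or more word characters

var strictResult = "The Arnolds' house".replace(strict, "$1\u2019$2");
var relaxedResult = "The Arnolds' house".replace(relaxed, "$1\u2019$2");

// Side effect: the closing quote of a quoted word also follows a word character.
var quotedWord = "'test'".replace(relaxed, "$1\u2019$2");
```

The strict pattern leaves "The Arnolds' house" unchanged, while the relaxed one curls the apostrophe. Note the side effect in the last line: the closing quote of 'test' is converted as well, which leaves the opening quote without a partner for the later paired single quote replacement.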

I noticed that the code doesn't convert quote characters with spaces around them, but that shouldn't be an issue since a quote shouldn't be used by itself.

Not sure what you mean by that. The code matches single/double quotes followed by anything that's not a single/double quote and replaces it with smart single/double quotes. Spaces are included in the list of characters that aren't single/double quotes so I wouldn't think there would be any issue converting something like the following:

" stephen "

“ stephen ”


Score! I got it!

 

I modified your code slightly because I couldn't get it working for every quote I was testing. I basically broke the rules apart to replace quotes before and after words separately.

 

 

Here is my code in its entirety for future reference. As you mentioned before, there are likely instances not covered by this rule, but it converted the quotes correctly for the text that follows:

 

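function replaceFunction(field)
{
   field = field
      .replace(/(\w+)"/g, "$1”")        // double quote after a word: closing
      .replace(/"(\w*)/g, "“$1")        // any remaining double quote: opening
      .replace(/(\w)'(\w+)/g, "$1’$2")  // apostrophe inside a word
      .replace(/'(\w+)/g, "‘$1")        // single quote before a word: opening
      .replace(/(\w*)'/g, "$1’");       // any remaining single quote: closing

   // change any smart quotes inside tags back to straight quotes
   (field.match(/<[^>]+>/g) || [])
      .filter(function(tag) {
          return /[“”‘’]/.test(tag);
      })
      .forEach(function(find) {
          var replace = find.replace(/[‘’]/g, "'").replace(/[“”]/g,'"');
          field = field.replace(find, replace);
      });

   return field;
}

if (Field("Body") != "")
   return replaceFunction(Rule("BodyFont"));
else
   return "";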

 

 

"test" 'test'

'Please's' "Tests" 'Test'

"Please's" 'Test' "Tests"

"Test" 'Test' "Test"

Test's Tests'. This is 5'6"

'Test of this good stuff.'

Test "says this IS now right.

Correct 'direction for this quote'? = YES!

 

 

Thanks for the help!! I would never have gotten this without you two!

 

- Jason


  • 1 year later...

One of the instances not covered in this JavaScript rule is class abbreviations. For example:

 

John “Smitty” Smith, class of ’81

 

The rule above would incorrectly return:

 

John “Smitty” Smith, class of ‘81

 

So I added one more .replace() call, the last one in the chain below, to handle class abbreviations:

 

function SmartQuotes(field)
{
   field = field
      .replace(/(\w+)"/g, "$1”")
      .replace(/"(\w*)/g, "“$1")
      .replace(/(\w)'(\w+)/g, "$1’$2")
      .replace(/'(\w+)/g, "‘$1")
      .replace(/(\w*)'/g, "$1’")
      // added: flip the opening single quote in front of a two-digit class year
      .replace(/(\u2018)([0-9]{2}[^\u2019]*)(\u2018([^0-9]|$)|$|\u2019[a-z])/ig, '\u2019$2$3');

   (field.match(/<[^>]+>/g) || [])
      .filter(function(tag) {
          return /[“”‘’]/.test(tag);
      })
      .forEach(function(find) {
          var replace = find.replace(/[‘’]/g, "'").replace(/[“”]/g,'"');
          field = field.replace(find, replace);
      });

   return field;
}

 

But it only catches the first instance of a class abbreviation, not the second, third, and so on. For example:

 

John “Smitty” Smith, class of ’81 and Jane Doe, class of ‘82

 

Any ideas as to what I'm doing wrong?

 

Thank you.


I believe the issue is the [^\u2019]* part of this expression:

.replace(/(\u2018)([0-9]{2}[^\u2019]*)(\u2018([^0-9]|$)|$|\u2019[a-z])/ig, '\u2019$2$3');

That's capturing (as $2) two digits followed by 0 or more characters that aren't closing single quotes. Because the * is greedy, it ends up capturing the rest of the line, including any later class abbreviations, so only the first one gets replaced.

 

If you're looking for classes specifically, why not make it a little easier on yourself and just add a little context to your regexp:

return SmartQuotes('John "Smitty" Smith, class of \'81, Jane Doe, class of \'82')
   .replace(/(class of )\u2018(\d{2})\b/gi, '$1\u2019$2');


Thank you Step. That would be easier. Unfortunately, the data has instances where "class of " is not present, only the 2-digit year.
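
For data without the "class of " context, one possible heuristic (a sketch only, not something confirmed in this thread) is to flip a left single quote whenever it is immediately followed by exactly two digits. The function name fixYearApostrophes is made up for this example, and it is meant to run on the output of SmartQuotes:

```javascript
// Assumption: a left single quote directly before exactly two digits is a year abbreviation.
function fixYearApostrophes(text) {
    // (?!\d) keeps longer numbers such as 1981 from matching.
    return text.replace(/\u2018(\d{2})(?!\d)/g, "\u2019$1");
}

var years = fixYearApostrophes("John \u201CSmitty\u201D Smith, \u201881 and Jane Doe, class of \u201882");
var decade = fixYearApostrophes("the \u201880s");
var quoted = fixYearApostrophes("\u2018Test\u2019");
```

Because there is no greedy capture between the digits and the end of the match, every abbreviation on the line is handled, not just the first. The known weakness is a genuinely quoted phrase that starts with a two-digit number, which would have its opening quote flipped as well.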
