Edit XML files


scotts

I used to be able to open the XML files with FP8 in BBEdit on the Mac. But now with FP9, the XML files are 16-bit, and I cannot find a way to edit them with BBEdit. I've tried different settings with BBEdit, but still cannot figure it out. Very frustrating.

 

Does anyone know of a workaround, or another editor of some sort? Again, I'm on a Mac.

 

Scott

FP 9.1.3 on Mac 10.9.5 with Acrobat 10.1.10


Scott, I had a similar issue. I'm not working with XML files, but the change to 16-bit broke some scripts that I run on my FusionPro .msg logs after composition. In order to compensate for that, I wrote a bash script that will convert the file(s) (in my case I'm batch processing data) back to ASCII format to conform to my workflow. This may work for you as well:

  1. Open Terminal.app
  2. Paste this line of code and press enter:
    conv_ascii() { iconv -c -f "$(file -I "$1" | sed -e 's/.*charset=\(.*\)/\1/')" -t us-ascii//TRANSLIT "$1" > "$1.tmp" && mv -f "$1.tmp" "$1"; }


  3. Type "conv_ascii " and then drop the file you want to convert onto the Terminal window and press enter:
    conv_ascii /path/to/file.xml


You can run the following command on your file afterwards to verify that it was converted to ASCII:

file -I /path/to/file.xml

Expected output:

/path/to/file.xml: application/xml; charset=us-ascii

 

Less nerdy solution:

Depending on how many files you're actually editing, you might find it easier to open the file in a text editor like Sublime Text and click File > Save with encoding > UTF-8.

Edited by step
Added simpler solution

I used to be able to open the XML files with FP8 in BBEdit on the Mac.

What do you mean by "the XML files?" The XML log file that FusionPro generates?

But now with FP9, the XML files are 16-bit

Yes, every text file generated by FusionPro is now 16-bit UTF-16 (actually UCS-2). This allows for full support for Japanese, Chinese, or any other language for anything FusionPro uses, including the names of colors, fonts, variables, rules, etc. (It was actually something we should have done a long time ago; being bought by a Japanese company has a way of hurrying these kinds of things along.)

and I cannot find a way to edit them with BBEdit. I've tried different settings with BBEdit, but still cannot figure it out. Very frustrating.

You must have a really old version of BBEdit, because BBEdit opens FusionPro's 16-bit output files just fine for me. So does TextWrangler, the free version. And TextEdit. And Xcode. And Word. They all detect the UTF-16 byte order mark. I would see about upgrading BBEdit.

Does anyone know of a workaround, or another editor of some sort?

TextEdit? Xcode? TextWrangler? Any editor from this century? Seriously, almost every Mac app supports Unicode these days.

Scott, I had a similar issue. I'm not working with XML files, but the change to 16-bit broke some scripts that I run on my FusionPro .msg logs after composition.

What kind of scripts?

In order to compensate for that, I wrote a bash script that will convert the file(s) (in my case I'm batch processing data) back to ASCII format to conform to my workflow.

Okay, but you're not really converting "back" to ASCII. The file never was ASCII. It was always UTF-16. If the file actually had any 16-bit characters in it, such as a Japanese or Chinese character, or a curly quote or something, then that won't be preserved correctly in ASCII. (With FusionPro 9.2, anything can have such a 16-bit character in it, including color and variable names.) UTF-8 would preserve everything, but that's still Unicode.
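For anyone scripting the conversion instead of using an editor, the difference between the two targets can be sketched in Python. This is only an illustration, not anything FusionPro ships: the function names are made up, and it assumes the input starts with a UTF-16 byte order mark, as described above. Converting to UTF-8 keeps every character, while a strict conversion to ASCII fails loudly rather than quietly mangling a Japanese character or a curly quote.

```python
import os


def reencode_utf16_bytes(data, target="utf-8", errors="strict"):
    """Decode BOM-prefixed UTF-16 bytes and re-encode them as `target`."""
    # The "utf-16" codec reads the byte order mark to choose LE or BE, then drops it.
    text = data.decode("utf-16")
    # With errors="strict", a character the target cannot represent raises
    # UnicodeEncodeError instead of being silently dropped or replaced.
    return text.encode(target, errors)


def reencode_utf16_file(path, target="utf-8", errors="strict"):
    """Rewrite one UTF-16 file in place; the file is untouched if conversion fails."""
    with open(path, "rb") as src:
        converted = reencode_utf16_bytes(src.read(), target, errors)
    # Write to a temporary file, then swap it in, so a crash cannot truncate the original.
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as dst:
        dst.write(converted)
    os.replace(tmp_path, path)
```

Calling `reencode_utf16_file("/path/to/file.xml")` would rewrite the file as UTF-8. Passing `target="ascii", errors="replace"` gives a lossy result with `?` in place of anything non-ASCII. Note that the sketch does not touch an XML declaration inside the file, so a declaration that still names UTF-16 would need to be updated separately.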

Depending on how many files you're actually editing, you might find it easier to open the file in a text editor like Sublime Text and click File > Save with encoding > UTF-8.

Or again, just about any other editor out there.


What kind of scripts?

The script that I had to revise checked for composition time errors and parsed out information that I printed to the .msg log from my FP templates at the end of batch compositions. Information like: total pieces to print, total number of press sheets, job number, etc. This is all information that our planning department requires and since a composition can produce upwards of 80-something files, I scripted that in lieu of manually checking every log file.

 

Okay, but you're not really converting "back" to ASCII. The file never was ASCII. It was always UTF-16. If the file actually had any 16-bit characters in it, such as a Japanese or Chinese character, or a curly quote or something, then that won't be preserved correctly in ASCII. (With FusionPro 9.2, anything can have such a 16-bit character in it, including color and variable names.) UTF-8 would preserve everything, but that's still Unicode.

Yeah, that may be the case. But, I think (it's been a while since I worked on this) that the file command in Bash recognized the logs created on an FP8 server as ASCII so that's what I converted the logs from an FP9 server to.

 

And just for full disclosure, I have very little idea what I'm talking about when it comes to text encoding.


What do you mean by "the XML files?" The XML log file that FusionPro generates?

I realized after my last post that this is in the "XML Dialog Boxes" sub-forum, which is about the XML Template Rules.

 

So yes, in FusionPro 9.2, the built-in XML Templates are now UTF-16, because, as you can see if you open one up, FusionPro now supports Japanese and Chinese names, descriptions, comments, etc., as well as the older 8-bit languages. Thus, Unicode support is required.

 

However, even though the rules that come with FusionPro are 16-bit, you can still create your own XML Template Rule files as ASCII or any other 8-bit format, and put them in the "/Library/Application Support/PTI/FusionPro/Plug-ins/Template XML" folder (or the corresponding Program Files subfolder on Windows), and FusionPro will still read them just fine.

 

So, I'm now puzzled as to why you need to open up the built-in XML Templates, which come installed with FusionPro, at all. We neither recommend nor support modifying the XML Template rules which come with FusionPro. You can always make copies of them, but please don't modify the original files. And if you have saved off the older 8-bit XML Template rules from a previous version of FusionPro, you can still make copies of those, edit them as 8-bit, and add them to that Plug-ins subfolder. So I don't really see why you now suddenly need a Unicode-aware editor to do that.

The script that I had to revise checked for composition time errors and parsed out information that I printed to the .msg log from my FP templates at the end of batch compositions. Information like: total pieces to print, total number of press sheets, job number, etc. This is all information that our planning department requires and since a composition can produce upwards of 80-something files, I scripted that in lieu of manually checking every log file.

I would use the XML Log File for that. In your OnJobStart rule, you can just do this:

FusionPro.Composition.CreateXMLLogFile();

And it will automatically generate an XML file with most of that information in it. You can specify a file name in there (between the parentheses) if you want, or it will just append ".xml" to the main log file name for the XML log. (Or you can add the "XMLLogFile" setting to the CFG file in an FP Server composition.)

 

You will need to add a couple of lines for some of your custom information, like the job number. Something like this, maybe in OnRecordStart:

FusionPro.Composition.LogXMLMetadata("Job Number", Field("Job Number"));

This will generate a file that, unlike the main .msg log file, is in a well-defined, known format, specifically XML, which is a lot easier for an automated process to parse. (In fact, our own FusionPro VDP Producer API Web Service uses the XML log file to obtain information about what was composed.)
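As a rough idea of what parsing such a file might look like on the scripting side, here is a small Python sketch using the standard library's ElementTree. The element names in the sample are invented purely for illustration and are NOT FusionPro's actual XML log schema, so check them against a real log before relying on them. The only assumption taken from this thread is that the file is UTF-16 with a byte order mark.

```python
import xml.etree.ElementTree as ET


def leaf_values(utf16_bytes):
    """Return (tag, attributes, text) for each leaf element in UTF-16 XML bytes."""
    # Decode first; the byte order mark picks LE or BE and is dropped.
    root = ET.fromstring(utf16_bytes.decode("utf-16"))
    rows = []
    for element in root.iter():
        text = (element.text or "").strip()
        # Keep only elements with no children and some text content.
        if len(element) == 0 and text:
            rows.append((element.tag, dict(element.attrib), text))
    return rows


# Hypothetical sample only: these element names are NOT FusionPro's real schema.
SAMPLE = (
    '<?xml version="1.0" encoding="UTF-16"?>'
    "<log><records>250</records>"
    '<metadata name="Job Number">A-1001</metadata></log>'
).encode("utf-16")

for row in leaf_values(SAMPLE):
    print(row)
```

Walking every leaf element keeps the sketch independent of the real layout. Once the actual element names are known, a targeted lookup such as `root.find(...)` would be tidier than collecting everything.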

Yeah, that may be the case. But, I think (it's been a while since I worked on this) that the file command in Bash recognized the logs created on an FP8 server as ASCII so that's what I converted the logs from an FP9 server to.

I can do "file -I", followed by one of the files generated by FusionPro, say, the XML Log File, and it's very happy to report "application/xml; charset=utf-16le". I'm on Mac OS X 10.9 (Mavericks), but I think that the BSD file utility in OS X has handled UTF-16 files for a long time. OS X has supported Japanese for a lot longer than FusionPro has, and Unix had Unicode support even longer ago than that.

And just for full disclosure, I have very little idea what I'm talking about when it comes to text encoding.

It can be complicated. But in many ways, UTF-16 is the simplest format for (UCS-2) Unicode, as you don't have to deal with UTF-8's variable-length multi-byte sequences. And we always put out a UTF-16 byte order mark, which almost any app or utility from this century will recognize. But you shouldn't really have to know much about all that; things should just work, if your tools are anywhere close to up-to-date.
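To make the byte order mark idea concrete, here is a minimal Python sketch of how a script might recognize one. It uses the BOM constants from the standard `codecs` module. The function names and the Latin-1 fallback for unmarked files are assumptions for illustration, not a description of how any particular editor or FusionPro itself does it.

```python
import codecs

# Longest marks first: the UTF-32LE mark begins with the same two bytes as UTF-16LE.
_BOMS = (
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)


def sniff_bom(data):
    """Return (encoding, bom_length) from a leading byte order mark, or (None, 0)."""
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            return encoding, len(bom)
    return None, 0


def decode_with_bom(data, fallback="latin-1"):
    """Decode bytes using their byte order mark; unmarked data uses `fallback`."""
    encoding, skip = sniff_bom(data)
    # Skip the mark itself so it does not show up as a stray character.
    return data[skip:].decode(encoding or fallback)
```

A log-parsing script could call `decode_with_bom(open(path, "rb").read())` and handle both marked UTF-16 files and older unmarked 8-bit files with the same code path.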

Edited by Dan Korn

I would use the XML Log File for that. In your OnJobStart rule, you can just do this:

FusionPro.Composition.CreateXMLLogFile();

And it will automatically generate an XML file with most of that information in it. You can specify a file name in there (between the parentheses) if you want, or it will just append ".xml" to the main log file name for the XML log. (Or you can add the "XMLLogFile" setting to the CFG file in an FP Server composition.)

 

You will need to add a couple of lines for some of your custom information, like the job number. Something like this, maybe in OnRecordStart:

FusionPro.Composition.LogXMLMetadata("Job Number", Field("Job Number"));

This will generate a file that, unlike the main .msg log file, is in a well-defined, known format, specifically XML, which is a lot easier for an automated process to parse. (In fact, our own FusionPro VDP Producer API Web Service uses the XML log file to obtain information about what was composed.)

You know, I actually came across that little gem a few weeks ago and thought that would be a much cleaner method of doing things but have yet to get around to re-writing everything. The log parse is part of a larger workflow of scripts that produces a job so I'm hesitant to make edits (no matter how seemingly minor) without having enough time to properly test it. And unfortunately at the moment I'm too busy. This forum isn't going to read itself!

 

I can do "file -I", followed by one of the files generated by FusionPro, say, the XML Log File, and it's very happy to report "application/xml; charset=utf-16le". I'm on Mac OS X 10.9 (Mavericks), but I think that the BSD file utility in OS X has handled UTF-16 files for a long time. OS X has supported Japanese for a lot longer than FusionPro has, and Unix had Unicode support even longer ago than that.

That's for files generated in FP9 I'm assuming? I was referring to files generated by FP8 being ASCII format according to the "file" command. We have an FP9 server and an FP8 server that has yet to be upgraded and the workflow has to run on both (I know this is another argument for me to just use the XML log). But I digress.


I realized after my last post that this is in the "XML Dialog Boxes" sub-forum, which is about the XML Template Rules.

 

So yes, in FusionPro 9.2, the built-in XML Templates are now UTF-16, because, as you can see if you open one up, FusionPro now supports Japanese and Chinese names, descriptions, comments, etc., as well as the older 8-bit languages. Thus, Unicode support is required.

 

I do have BBEdit 10, and I can open the files, which it sees as UTF-16, but the text is all in some Asian dialect that I cannot read. I have looked at these files before as a reference, so that I did not have to open the PDF. And now, I want to get field names and such, so I can build my own XML templates. But this makes it quite time-consuming and painful.

 

Before I could just open the XML in BBEdit and read it.

 

I do not plan on editing the base templates, I just want to create my own.


I do have BBEdit 10, and I can open the files, which it sees as UTF-16, but the text is all in some Asian dialect that I cannot read.

So if you look at the same XML file in TextEdit, does it look right?

 

If it looks right in TextEdit, but wrong in BBEdit, then there's some kind of problem with BBEdit or its configuration, which means that your question is really about BBEdit, not about FusionPro, and therefore is outside the scope of this forum.

 

If the XML file also looks wrong in TextEdit, then please post it here so I can take a look.

And now, I want to get field names and such, so I can build my own XML templates.

I'm still not sure why you can't just take one of the older 8-bit XML templates that you built previously and build off of that. At any rate, though, even if BBEdit is not working for some reason, you should be able to use TextEdit, or just about any other editor.


And here is the XML file I'm trying to open.

Okay, thanks for attaching that. The attachment is literally worth a thousand words.

 

Yes, that XML file appears to be corrupted. It looks like an 8-bit file which erroneously has a UTF-16LE byte order mark in front of it. If I open up the file in a binary editor and delete the first two bytes (the UTF-16 marker), it then becomes a valid 8-bit (Latin-1/ANSI) XML file. I've attached it here.

 

Now that I can actually see the file you're talking about, I see that it's not actually a Template XML Rule, as I had assumed from the sub-forum you posted in. It's actually an HTML Form Definition file, generated from the Web DataCollect dialog.

 

You could have saved us all quite a bit of time if you had made it clear from the start that you were talking specifically about the XML generated by the HTML Form / Web DataCollect dialog, or if you had bothered to answer the question right at the start of my first post in this thread:

What do you mean by "the XML files?" The XML log file that FusionPro generates?

Without you answering that question, and without actually having the file to look at, I was left to guess as to what you meant by "the XML files" in your original post, because there are lots of different XML files that can be used or generated by FusionPro, including:

  • a tagged markup input data file
  • the Data Definition (.def) file
  • the XML Log File
  • an XML Template Rule
  • an FP Expression log file
  • a JDF file
  • an HTML Form Definition file

Among others. So please try to be more specific in the future.

 

Anyway, it seems that you have uncovered a subtle bug which wasn't caught in our QA testing. This is probably because, when you click the Preview Form button on that Web DataCollect dialog, the XML file, while malformed, still is (somehow) successfully processed by the XSL transformation which converts it to the HTML page, which does appear properly in the web browser. Since that all seems to work, our QA team never thought to actually open up the XML file to see if it shows up properly in an editor.

 

So this really IS a bug in FusionPro. But it's not a general problem with how FusionPro generates 16-bit files; it's very specifically a problem with how that Web DataCollect dialog creates the XML file that represents the HTML Form Definition, on Mac. (On Windows, the file is generated correctly.)

 

Now that I know what the actual bug is, I can enter a specific bug report about it. But I can't tell you when it will be fixed.

 

As a workaround, I'm able to get the file back into a human-readable state by opening it up in TextWrangler or BBEdit, then from the menu, selecting File -> Reopen Using Encoding -> Western (ISO Latin 1), then deleting the first two characters (which are really a UTF-16LE byte order mark). So you should be able to use that workaround to be able to edit the file for now.
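If there are a lot of these files, the same repair could be scripted. The Python sketch below is only a rough equivalent of the manual steps above, with made-up function names. The check for a mislabeled file is a heuristic and an assumption on my part: genuine UTF-16 XML contains NUL bytes, because every ASCII-range character such as `<` is stored as two bytes, while 8-bit text does not.

```python
import codecs


def has_bogus_utf16_bom(data):
    """Heuristic: a UTF-16LE mark followed by bytes that contain no NUL at all."""
    # Real UTF-16 markup always has NUL bytes; plain 8-bit text has none.
    return data.startswith(codecs.BOM_UTF16_LE) and b"\x00" not in data[2:]


def repair_mislabeled_file(data):
    """Drop a stray UTF-16LE mark and decode the remainder as Latin-1 text."""
    if has_bogus_utf16_bom(data):
        data = data[len(codecs.BOM_UTF16_LE):]
    # Latin-1 maps every byte to a character, so this decode cannot fail.
    return data.decode("latin-1")
```

Genuine UTF-16 files should be left alone, so a batch script would call `has_bogus_utf16_bom` first and only rewrite the files for which it returns true.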

Shine_DH1_2014.xml.txt


Sorry about that, Dan. Next time I will include the attachments at the beginning.

 

My end goal is to create some Template XMLs, but wanted to glean information from some of the XMLs from previous jobs.

 

Since I found a bug, is all forgiven?

 

And thank you very much for the workaround. That will be very helpful in the future.


Sorry about that, Dan. Next time I will include the attachments at the beginning.

Thanks. Like I said, it's usually worth a thousand words.

My end goal is to create some Template XMLs, but wanted to glean information from some of the XMLs from previous jobs.

Okay, but now I'm confused again, because you say you're trying to create XML Template rules, but you're using an HTML Form Definition file. While the files for the XML Template rules and the HTML Form Definition are similar, they're not the same, and they're not interchangeable.

  • The XML Template rules are for rules inside a FusionPro job, which allow the end user to do things like select from a list of fields or fonts at template design time (and they have JavaScript code in them to do something based on the selected fields).
  • The HTML Form Definition is for a web form which lets the user enter data or upload graphics with specific constraints to build the data for a composition (where the client-side JavaScript for validation comes from a separate file).

Anyway, if you want to create an XML Template rule, then you should start with an XML Template rule file, not with an HTML Form Definition file.

 

Or, if you're trying to create HTML Form Definitions, then that doesn't really make sense either, because different jobs are going to have different fields that need data entered. It's better to just create the HTML Form Definition using the Web DataCollect dialog for each job.

 

Maybe if you can give me a more specific example of what you're actually trying to accomplish, I can offer a more specific suggestion.

Since I found a bug, is all forgiven?

There's nothing to forgive. I'm sorry you ran into this bug, and I appreciate you finding it.

And thank you very much for the workaround. That will be very helpful in the future.

Sure. Though, like I said, I think you're using the wrong file for what you're trying to do.
