Convert ISO-8859-1 to UTF-8 - Mirth Community

  #1  
09-17-2015, 02:54 PM
brlmcguire
OBX.2 Kenobi

Join Date: Dec 2012
Location: Prince George, BC, Canada
Posts: 84
Convert ISO-8859-1 to UTF-8

Hello.

I am trying to convert an incoming message that is encoded as ISO-8859-1 to UTF-8.

I am grabbing the incoming raw message, using Java's String class to convert it to UTF-8, and then turning it into XML so that my subsequent steps can do further processing.

Code:
var rawdata = java.lang.String(connectorMessage.getRawData()).getBytes();
var utf8data = java.lang.String(rawdata, "ISO-8859-1").getBytes("UTF-8");
var utfstr = java.lang.String(utf8data);

channelMap.put('utf8msg',utfstr);
var utfxml = SerializerFactory.getSerializer('HL7V2').toXML(utfstr);

channelMap.put('utfxml', utfxml);
However, the resulting XML is different than the XML I would have gotten from the 'msg' channel variable.

This is causing my subsequent steps to fail, as they were built using the 'msg' variable.

So, my question is: what method is used to take the HL7V2 message and 'transform' it? My method above gives a different output.

Thanks,

Bruce.

Last edited by brlmcguire; 09-17-2015 at 03:22 PM.
  #2  
09-17-2015, 06:51 PM
narupley
Mirth Employee

Join Date: Oct 2010
Posts: 7,126

Strings are not encoded in any charset. They are merely a list of Unicode code points. Only the byte arrays themselves are encoded with a charset.

If you're receiving the data via, say, a File Reader or TCP Listener, then you should set the incoming charset encoding appropriately on the source settings. By default it will use the JVM default encoding (depends on your JRE/OS). If you're receiving it from an HTTP Listener, then the source connector will take the charset from the incoming Content-Type header (and if there is no such header, shame on the client sending to you).

Once you have the correct charset configured on the source connector, that's it, you have your string. And a string is a string; there is no such thing as a "string encoded in ISO-8859-1", or a "string encoded in UTF-8", unless you mean the raw byte representation that is used when you encode the string and then store the resulting byte array to disk somewhere.
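
To illustrate the point outside Mirth, here is a minimal sketch. It assumes a plain Node.js runtime (not the Mirth transformer environment) and uses Node's `Buffer` for the byte handling; the variable names are illustrative only. It shows that one string yields different byte arrays depending on the charset, and that decoding bytes with the wrong charset is what corrupts the text.

```javascript
// Sketch only: runs under Node.js, not inside a Mirth transformer.

// One string: a sequence of Unicode code points, with no charset attached.
const text = 'Leuc\u00F3citos'; // "Leucócitos"

// Encoding is what produces bytes, and the bytes depend on the charset.
const latin1Bytes = Buffer.from(text, 'latin1'); // ISO-8859-1: ó becomes F3
const utf8Bytes = Buffer.from(text, 'utf8');     // UTF-8: ó becomes C3 B3

// Decoding with the charset the bytes were written in recovers the string.
const fromLatin1 = latin1Bytes.toString('latin1');
const fromUtf8 = utf8Bytes.toString('utf8');

// Decoding UTF-8 bytes as ISO-8859-1 gives a different, corrupted string.
const mojibake = utf8Bytes.toString('latin1');

console.log(latin1Bytes.length, utf8Bytes.length); // byte counts differ
console.log(fromLatin1 === fromUtf8);              // same string either way
console.log(mojibake);                             // corrupted text
```

The same reasoning suggests why a round trip through Java's `getBytes()` and `String(bytes)` without an explicit charset can alter a message: both fall back to the platform default charset, which may not match the charset the bytes were actually written in.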
__________________
Step 1: JAVA CACHE...DID YOU CLEAR ...wait, ding dong the witch is dead?

Nicholas Rupley
Work: 949-237-6069
Always include what Mirth Connect version you're working with. Also include (if applicable) the code you're using and full stacktraces for errors (use CODE tags). Posting your entire channel is helpful as well; make sure to scrub any PHI/passwords first.


- How do I foo?
- You just bar.

Last edited by narupley; 09-18-2015 at 04:22 PM.
  #3  
09-18-2015, 08:21 AM
brlmcguire
OBX.2 Kenobi

Join Date: Dec 2012
Location: Prince George, BC, Canada
Posts: 84
Ummm. Doh!

Colour me chagrined. You probably can't see it from there, but my face is a bright, flaming red right now.

Is there any way to delete this thread?

Thanks,

Bruce.
  #4  
09-29-2015, 06:38 AM
Turanga_Fry
Mirth Newb

Join Date: Sep 2015
Posts: 10

Hi there, maybe you could help me with something regarding this subject also.

I have an HL7 message that is encoded in ISO 8859-15 and want to transform it so it can be stored in a database as UTF-8.

Is it possible to search through all the fields of the message and transform them one by one?

Example:

I have this message with several OBX segments:

Code:
OBX|9|NM|S8^Leuc\XF3\citos|9|9,29|x 10E3/uL|3.8-10.6|N|||F|||20150114112600

OBX|10|NM|S9^Granul\XF3\citos Neutr\XF3\filos|10|65,0|%|||||F|||20150114112600

OBX|11|NM|S9^Granul\XF3\citos Neutr\XF3\filos|11|6,04||1.3-8.8|N|||F|||20150114112600
Where there is Leuc\XF3\citos, for example, I want to replace the "\XF3\" with "ó". The code F3 in ISO 8859-15 represents the "ó" that I want in UTF-8.

I want to be able to do this for all the segments in the HL7 message where this occurs.

Thanks

Last edited by Turanga_Fry; 09-29-2015 at 06:49 AM.
  #5  
09-29-2015, 06:49 AM
narupley
Mirth Employee

Join Date: Oct 2010
Posts: 7,126

You can use this code template: http://www.mirthcorp.com/community/f...3206#post43206

Then just do this in your transformer:

Code:
for each (obx in msg.OBX) {
	unescapeXSequences(obx['OBX.3'], 'ISO-8859-15');
}
Or if you want to do it for the entire message:

Code:
unescapeXSequences(msg, 'ISO-8859-15');
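
The linked code template is not reproduced in this thread. As a rough idea of what unescaping `\X..\` sequences involves, here is a minimal sketch in plain JavaScript that works on a string rather than on the `msg` object. The names `unescapeHexSequences`, `decodeByte`, and `ISO_8859_15_OVERRIDES` are hypothetical and are not the template's actual code; the sketch only handles the single-byte charsets ISO-8859-1 and ISO-8859-15.

```javascript
// Hypothetical sketch, not the code template referenced above.
// The eight positions where ISO-8859-15 differs from ISO-8859-1.
const ISO_8859_15_OVERRIDES = {
  0xA4: 0x20AC, // euro sign
  0xA6: 0x0160, // S with caron
  0xA8: 0x0161, // s with caron
  0xB4: 0x017D, // Z with caron
  0xB8: 0x017E, // z with caron
  0xBC: 0x0152, // OE ligature
  0xBD: 0x0153, // oe ligature
  0xBE: 0x0178  // Y with diaeresis
};

// Map one byte value to a Unicode code point for the given charset.
function decodeByte(byteValue, charset) {
  if (charset === 'ISO-8859-15' &&
      Object.prototype.hasOwnProperty.call(ISO_8859_15_OVERRIDES, byteValue)) {
    return ISO_8859_15_OVERRIDES[byteValue];
  }
  // ISO-8859-1 byte values equal their Unicode code points.
  return byteValue;
}

// Replace every \Xhh..\ escape (one or more hex byte pairs) with its characters.
function unescapeHexSequences(text, charset) {
  return text.replace(/\\X((?:[0-9A-Fa-f]{2})+)\\/g, function (match, hex) {
    let decoded = '';
    for (let i = 0; i < hex.length; i += 2) {
      const byteValue = parseInt(hex.slice(i, i + 2), 16);
      decoded += String.fromCharCode(decodeByte(byteValue, charset));
    }
    return decoded;
  });
}

// Example with the OBX-3 value from the message above.
const field = 'S9^Granul\\XF3\\citos Neutr\\XF3\\filos';
console.log(unescapeHexSequences(field, 'ISO-8859-15'));
```

A lookup table is used here only because the two charsets differ in just eight positions; inside Mirth, Java's own charset support would be the more natural way to decode the bytes.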
  #6  
09-29-2015, 07:58 AM
Turanga_Fry
Mirth Newb

Join Date: Sep 2015
Posts: 10

I put it into the source transformer, and it says that the function is not defined.

Code:
unescapeXSequences(msg, 'ISO-8859-15');
ReferenceError: "unescapeXSequences" is not defined.

What am I missing?
  #7  
09-29-2015, 08:17 AM
Turanga_Fry
Mirth Newb

Join Date: Sep 2015
Posts: 10
Got it to work.

I already got it to work; it was so simple that I am embarrassed.

I had to put the code template in a source transformer rule (JavaScript): https://www.mirthcorp.com/community/...?t=6902&page=4

and then put the code you gave in another source transformer rule:
Code:
unescapeXSequences(msg, 'ISO-8859-15');
Thanks for the heads up, narupley.


P.S. Does your name have something to do with "naruto"?
  #8  
09-29-2015, 08:19 AM
narupley
Mirth Employee

Join Date: Oct 2010
Posts: 7,126

Cool, glad you got it to work. FYI you can import that as a code template if you want, rather than copying the code into a transformer.

My initials are N.A.R., last name being Rupley, hence "narupley". Though I am a huge anime fan.
  #9  
10-02-2015, 12:37 AM
Turanga_Fry
Mirth Newb

Join Date: Sep 2015
Posts: 10

Nice one! Me too!
Keep up the good work!

Regards
All times are GMT -8.