Bug #407
Support #705: How to UTF-8 characters in FDSN StationXML
8bits ascii char in dataless
Description
Dear stationxml-converter-1.0.2.jar maintainers,
We (in France) have dataless files that contain 8-bit characters in textual blockette fields, for instance B050F09.
rdseed doesn't complain about these files and decodes the text, but stationxml-converter-1.0.2.jar fails on them.
Thanks for your help.
Catherine (Catherine.Pequegnat@ujf-grenoble.fr)
Example (the dataless file contains an "à"; without this character, stationxml-converter-1.0.2.jar runs fine):
java -jar stationxml-converter-1.0.2.jar -x -l resp -se RESIF -so RESIF -o ESCA.sta.xml ./ESCA.dataless
ERROR with Integer field value for fieldNum=10 and fieldIndex=0: forcing value of zero
blktype=0, value=e~0
java.lang.NullPointerException
at edu.iris.converter.StationBuilder.buildNetwork(StationBuilder.java:84)
at edu.iris.converter.StationPrinter.process(StationPrinter.java:120)
at edu.iris.converter.Main.<init>(Main.java:197)
at edu.iris.converter.Main.main(Main.java:216)
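A quick way to locate the offending bytes before running the converter is to scan the file for anything outside 7-bit ASCII. The following is a minimal sketch, assuming a dataless volume is expected to contain only 7-bit ASCII text; the function name `find_high_bytes` and the sample string are hypothetical and are not part of rdseed or stationxml-converter.

```python
def find_high_bytes(data: bytes) -> list:
    """Return (offset, byte value) pairs for every byte outside 7-bit ASCII."""
    return [(offset, value) for offset, value in enumerate(data) if value >= 0x80]


# Hypothetical stand-in for a blockette text field; "à" is two bytes in UTF-8.
sample = "Station à ESCA~".encode("utf-8")

for offset, value in find_high_bytes(sample):
    print(f"offset {offset}: 0x{value:02X}")
```

To check a real file, the same function can be applied to its raw contents, for example `find_high_bytes(open("ESCA.dataless", "rb").read())`.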
History
#1 Updated by Chad Trabant over 8 years ago
- Assignee set to Yazan Suleiman
#2 Updated by Yazan Suleiman over 8 years ago
- Status changed from New to Feedback
I have a workaround; version 1.0.3 should deal with that. At this point this is an issue in the network description field. The file in question would be very helpful.
#3 Updated by Catherine Pequegnat over 8 years ago
- File ESCA.dataless added
Here is the file; see B050F10.
#4 Updated by Catherine Pequegnat over 8 years ago
With stationxml-converter-1.0.3.jar:
java -jar /home/sysop/resifMdi/stationxml-converter-1.0.3.jar -x -l resp -se RESIF -so RESIF -o ESCA.sta.xml ./ESCA.dataless
ERROR with Integer field value for fieldNum=10 and fieldIndex=0: forcing value of zero
blktype=0, value=e~0
#5 Updated by Celso Reyes over 8 years ago
Yes, it looks like neither program looks into the extended character set at all. Notice that the values "wrap", so at a minimum the error message display is incorrect.
It looks like it's probably wrapping within the range of -128 to +127. For example, the radical symbol √ (checkmark) is code point 8730 = 512 * 17 + 26, which wraps to 26, while ® is extended-ASCII value 174, which wraps to -82 (174 - 256).
I wonder if this same sort of error is propagated elsewhere in the system. It should probably be looking at unsigned integers instead of signed char or equivalent.
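The wrap-around described here can be reproduced with a few lines of arithmetic. This is only a sketch of the suspected behaviour, assuming the programs keep the low 8 bits of each character and read them as a signed value (as a narrowing cast to a signed byte would); the helper name `wrap_to_signed_byte` is hypothetical.

```python
def wrap_to_signed_byte(code_point: int) -> int:
    """Keep the low 8 bits and reinterpret them as a signed value in -128..127."""
    low = code_point & 0xFF
    return low - 256 if low >= 128 else low


# The three characters named in the warnings quoted in this comment.
for ch in "√†®":
    print(f"{ch} U+{ord(ch):04X} ({ord(ch)}) -> {wrap_to_signed_byte(ord(ch))}")
```

For √, † and ® this yields 26, 32 and -82, the same values that appear in the warnings. Treating the value as unsigned (`code_point & 0xFF`) would give 174 for ® instead of -82, but would still lose the high bits of √ and †.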
When I try it with PDCC 3.8 I get:
WARNING: character '√' (value=26) is not acceptable for Blockette 50 field 9
Expected one of: UpperCase LowerCase Digit Punctuation Space
CORRECTION: substituting with character '.'
WARNING: character '†' (value=32) is not acceptable for Blockette 50 field 9
Expected one of: UpperCase LowerCase Digit Punctuation Space
CORRECTION: substituting with character '.'
WARNING: character '®' (value=-82) is not acceptable for Blockette 50 field 9
Expected one of: UpperCase LowerCase Digit Punctuation Space
CORRECTION: substituting with character '.'
And when I try with stationxml-converter 1.0.3, I get equivalent messages:
java -jar stationxml_latest.jar ~/DATA/ESCA.dataless -x -l resp -se RESIF -so RESIF -o ESCA.sta.xml
WARNING: character '?' (value=26) is not acceptable for Blockette 50 field 9
Expected one of: UpperCase LowerCase Digit Punctuation Space
CORRECTION: substituting with character '.'
WARNING: character '?' (value=32) is not acceptable for Blockette 50 field 9
Expected one of: UpperCase LowerCase Digit Punctuation Space
CORRECTION: substituting with character '.'
WARNING: character '?' (value=-82) is not acceptable for Blockette 50 field 9
Expected one of: UpperCase LowerCase Digit Punctuation Space
CORRECTION: substituting with character '.'
skookum:stationxml celsoreyes$ java -version
java version "1.6.0_65"
Java(TM) SE Runtime Environment (build 1.6.0_65-b14-462-11M4609)
Java HotSpot(TM) 64-Bit Server VM (build 20.65-b04-462, mixed mode)
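The correction reported in these warnings (replacing each unacceptable character with '.') can be approximated as below. This is an illustrative sketch only, not the code used by PDCC or the converter: it assumes that "UpperCase LowerCase Digit Punctuation Space" corresponds to ASCII letters, digits, ASCII punctuation and the space character, and the names `ALLOWED` and `substitute_disallowed` are hypothetical.

```python
import string

# Assumed reading of "UpperCase LowerCase Digit Punctuation Space".
ALLOWED = set(string.ascii_letters + string.digits + string.punctuation + " ")


def substitute_disallowed(text: str, replacement: str = ".") -> str:
    """Replace every character outside the allowed set with the replacement."""
    return "".join(ch if ch in ALLOWED else replacement for ch in text)


print(substitute_disallowed("Station à ESCA"))
```

Note that the substitution is applied per decoded character, so a multi-byte character that has been decoded as several separate characters would produce several dots.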
#6 Updated by Chad Trabant about 7 years ago
- Parent task set to #705
- Status changed from Feedback to Closed
No feedback.
Related to the general issue of mapping UTF-8 (even though it is disallowed in StationXML); see #705.