Bug #407

Support #705: How to UTF-8 characters in FDSN StationXML

8bits ascii char in dataless

Added by Catherine Pequegnat over 10 years ago. Updated almost 9 years ago.

Status: Closed
Priority: Normal
Start date: 12/10/2013
Due date:
% Done: 0%
Estimated time:
Resolution:

Description

Dear stationxml-converter-1.0.2.jar maintainers,
We (in France) have dataless files that contain 8-bit characters in textual blockette fields, for instance B050F09.
rdseed doesn't complain about these files and decodes the text, but stationxml-converter-1.0.2.jar does not handle them.
Thanks for your help,
Catherine

Example (the dataless file contains an "à"; without this character, stationxml-converter-1.0.2.jar runs well):

java -jar stationxml-converter-1.0.2.jar -x -l resp -se RESIF -so RESIF -o ESCA.sta.xml ./ESCA.dataless
ERROR with Integer field value for fieldNum=10 and fieldIndex=0: forcing value of zero
blktype=0, value=e~0
java.lang.NullPointerException
at edu.iris.converter.StationBuilder.buildNetwork(StationBuilder.java:84)
at edu.iris.converter.StationPrinter.process(StationPrinter.java:120)
at edu.iris.converter.Main.<init>(Main.java:197)
at edu.iris.converter.Main.main(Main.java:216)
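
To find out which bytes in a dataless file fall outside 7-bit ASCII before running the converter, a small scan like the one below may help. It is only a sketch: the function name `find_non_ascii` and the sample string are hypothetical, and it reports raw byte offsets without parsing any blockette structure.

```python
def find_non_ascii(data: bytes):
    """Return (offset, byte value) pairs for every byte above 0x7F."""
    return [(offset, value) for offset, value in enumerate(data) if value > 0x7F]


# Hypothetical sample: "a à b" encoded as UTF-8, where "à" becomes C3 A0.
sample = "a \u00e0 b".encode("utf-8")

for offset, value in find_non_ascii(sample):
    print(f"offset {offset}: 0x{value:02X} ({value})")
```

For a real file, the bytes could be read with `open(path, "rb").read()` and passed to the same function.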

ESCA.dataless (368 KB) Catherine Pequegnat, 12/11/2013 04:47 PM

History

#1 Updated by Chad Trabant over 10 years ago

  • Assignee set to Yazan Suleiman

#2 Updated by Yazan Suleiman over 10 years ago

  • Status changed from New to Feedback

I have a workaround; version 1.0.3 should deal with that. At this point, this is an issue in the network description field. Having the file in question would be very helpful.

#3 Updated by Catherine Pequegnat over 10 years ago

Here is the file; see B050F10.

#4 Updated by Catherine Pequegnat over 10 years ago

With stationxml-converter-1.0.3.jar:

java -jar /home/sysop/resifMdi/stationxml-converter-1.0.3.jar -x -l resp -se RESIF -so RESIF -o ESCA.sta.xml ./ESCA.dataless
ERROR with Integer field value for fieldNum=10 and fieldIndex=0: forcing value of zero
blktype=0, value=e~0

#5 Updated by Celso Reyes over 10 years ago

Yes, it looks like neither program is looking into the extended character set at all. Notice that the values "wrap", so at a minimum the error message display is incorrect.

It looks like the values are probably wrapping into the range -128 to +127. For example, the radical symbol (checkmark) is 512 * 17 + 26 = 8730 and is reported as 26, while the (R) is extended-ASCII value 174 (128 + 46) and is reported as -82 (174 - 256).
I wonder if this same sort of error is propagated elsewhere in the system. It should probably be working with unsigned integers instead of signed char or equivalent.
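
The wrap-around described here can be checked with a few lines of arithmetic. The sketch below assumes that the tools keep only the low 8 bits of each character code and interpret them as a signed byte; that is an assumption about their internals, not something confirmed in this thread. The helper names are made up for illustration, and Python is used only to do the arithmetic.

```python
def low_byte(code_point: int) -> int:
    """Keep only the low 8 bits, as an unsigned value in 0..255."""
    return code_point & 0xFF


def as_signed_byte(code_point: int) -> int:
    """Reinterpret the low 8 bits as a signed value in -128..127."""
    low = low_byte(code_point)
    return low - 256 if low > 127 else low


# Code points of the three characters shown in the warnings.
RADICAL = 0x221A      # radical sign, 8730 = 512 * 17 + 26
DAGGER = 0x2020       # dagger, 8224 = 256 * 32 + 32
REGISTERED = 0x00AE   # registered sign, 174

for name, cp in [("radical", RADICAL), ("dagger", DAGGER), ("registered", REGISTERED)]:
    print(name, cp, as_signed_byte(cp), low_byte(cp))
```

Under that assumption the signed values come out as 26, 32 and -82, matching the `value=` fields in the warnings, while the unsigned low byte of the registered sign stays at 174.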

When I try it with PDCC 3.8 I get:

WARNING: character '√' (value=26) is not acceptable for Blockette 50 field 9
Expected one of:
    UpperCase
    LowerCase
    Digit
    Punctuation
    Space

CORRECTION: substituting with character '.'
WARNING: character '†' (value=32) is not acceptable for Blockette 50 field 9
Expected one of:
    UpperCase
    LowerCase
    Digit
    Punctuation
    Space

CORRECTION: substituting with character '.'
WARNING: character '®' (value=-82) is not acceptable for Blockette 50 field 9
Expected one of:
    UpperCase
    LowerCase
    Digit
    Punctuation
    Space

CORRECTION: substituting with character '.'

And when I try with the StationXML converter 1.0.3, I get equivalent sorts of messages:

java -jar stationxml_latest.jar ~/DATA/ESCA.dataless -x -l resp -se RESIF -so RESIF -o ESCA.sta.xml
WARNING: character '?' (value=26) is not acceptable for Blockette 50 field 9
Expected one of:
    UpperCase
    LowerCase
    Digit
    Punctuation
    Space

CORRECTION: substituting with character '.'
WARNING: character '?' (value=32) is not acceptable for Blockette 50 field 9
Expected one of:
    UpperCase
    LowerCase
    Digit
    Punctuation
    Space

CORRECTION: substituting with character '.'
WARNING: character '?' (value=-82) is not acceptable for Blockette 50 field 9
Expected one of:
    UpperCase
    LowerCase
    Digit
    Punctuation
    Space

CORRECTION: substituting with character '.'
skookum:stationxml celsoreyes$ java -version
java version "1.6.0_65" 
Java(TM) SE Runtime Environment (build 1.6.0_65-b14-462-11M4609)
Java HotSpot(TM) 64-Bit Server VM (build 20.65-b04-462, mixed mode)
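
One possible explanation for the particular characters in these warnings, offered as a hypothesis and not confirmed anywhere in this thread, is that the field holds UTF-8 bytes that are decoded with a single-byte Mac charset and then narrowed to a signed byte. The sketch below uses Python's `mac_roman` codec to mimic that. The helper names are hypothetical, and "è" is a guess; the report only mentions "à".

```python
def misread(text: str, wrong_codec: str = "mac_roman") -> str:
    """Encode text as UTF-8, then decode the bytes with the wrong codec."""
    return text.encode("utf-8").decode(wrong_codec)


def as_signed_byte(ch: str) -> int:
    """Low 8 bits of the character code, reinterpreted as a signed byte."""
    low = ord(ch) & 0xFF
    return low - 256 if low > 127 else low


for original in ["\u00e0", "\u00e8"]:  # "à" (reported) and "è" (a guess)
    garbled = misread(original)
    print(repr(original), "->", repr(garbled), [as_signed_byte(c) for c in garbled])
```

If this hypothesis holds, "à" would surface as a radical sign and a dagger (values 26 and 32), and "è" as a radical sign and a registered sign (values 26 and -82), which is consistent with the three warnings above.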

#6 Updated by Chad Trabant almost 9 years ago

  • Parent task set to #705
  • Status changed from Feedback to Closed

No feedback.

Related to the general issue of mapping UTF-8 (even though it is disallowed in StationXML); see #705.
