Webserviceshell 2.2 User Documentation¶
25 Aug 2016
- Table of contents
- Webserviceshell 2.2 User Documentation
- Introduction
- WSS Concepts
- Conventions and Configuration Concepts
- WSS Configuration
- Param configuration file (servicename-param.cfg)
- Log4j properties file (servicename-log4j.properties)
- Additional CmdProcessor Features
- Changes from version 1.x
- Parameter File Examples
Introduction¶
The Web Service Shell (WSS) is a web service that can be configured via simple properties files to utilize external resources (either command-line programs or Java classes) to fulfill web service requests.
WSS core features:¶
- manage HTTP connection with the client
- validate request parameters
- log service requests, state, and errors
- route one or more endpoint queries to execution of respective command-line programs or Java classes
- handle, via the servlet container, HTTP authentication
WSS 2.x change overview, see the Issues page for more details:¶
- Version 2.x
- enable queries to multiple endpoints
- add more flexibility for service naming convention
- add selected new configuration options, standardize naming convention
- generalize the ability to proxy additional content
- add more flexibility in content handling from Java endpoints
- Version 2.1
- Add new RabbitMQ usage logging option
- Version 2.2 to 2.2.3
- Fix issues related to initial testing and production operations
WSS is written in Java and is delivered as a deployable web application in the form of a .war file. When the war file is installed into a Tomcat or other servlet container, WSS reads its respective configuration files and is ready to handle client requests. Java programming is not needed to configure and operate a web service using WSS.
- Fix issues related to initial testing and production operations
WSS Concepts¶
The primary objectives of WSS is to enable internet access to data. WSS handles the HTTP communication with clients while executing task specific programs to retrieve the desired data, thus, effectively separating the HTTP component of a web service from the data extraction work.
While the WSS can be extended by writing custom Java processor code, WSS is designed to bypass Java coding and, by configuration, use existing data access command-line scripts for data access. Two general use processors are provided with WSS to accomplish this:
- edu.iris.wss.endpoints.CmdProcessor
- edu.iris.wss.endpoints.ProxyResource
The CmdProcessor processor executes command-line programs and streams requested data back to a client. The ProxyResource processor returns data as defined by a couple of configuration parameters, one for the URL of a respective data source and another to designate the output media type.
Both CmdProcessor and ProxyResource can be configured multiple times in one WSS installation to implement multiple endpoints and deliver a suite of data products to HTTP clients.
Conventions and Configuration Concepts¶
A few conventions are needed to deploy a WSS web service in a service container like Tomcat. These conventions control the name of the service, its respective URL as well as enable each WSS service to load its respective configuration information.
Terms¶
For example, to create a service with a URL of this form:
http://service.iris.edu/fdsnws/event/1/query?minmagnitude=8.5&format=geocsv
we need to refer to specific part of this URL, so we will use the following names on the left, in bold, to refer to respective parts of a URL, in this case:
- protocol - http
- host - service.iris.edu
- service (or application) - fdsnws/event/1
- endpoint - query
- client interface (i.e. query) parameters - minmagnitude and format (i.e. each term after the '?' or '&' character and to the left side of the '=' character)
- when deploying the WSS war file, change its name to reflect the desired service name
- create configuration files whose names match the deployed WSS war file name
- put the appropriate entries in the respective config files to get the desired endpoint name(s)
- put the appropriate entries in the respective config file to get the desired client interface parameter(s) names
Note: Some references external to this document define "endpoint" as the full URL up to the '?' character (e.g. http://service.iris.edu/fdsnws/event/1/query). However, in this document, “endpoint” refers only to the portion of the URL between the service ( i.e. "fdsnws/event/1") and the client interface parameters (i.e. “minmagnitude=8.5&format=geocsv”), so in this case, the endpoint is "query".
The host and protocol are usually set for a given site and are not discussed further here.
Service Name¶
service naming convention -. The service name is derived from the deployed war file name. For example, to get a service name of " fdsnws/event/1", the WSS war file must be renamed to fdsnws#event#1.war, which when deployed, results in the following URL:
http://service.iris.edu/fdsnws/event/1/
where fdsnws/event/1, is determined by replacing any hash characters ‘#’ in the WSS war file name with slash characters ‘/’. Note: using '#" is not required if your naming conventions do not need a service name with '/' in it.
Endpoint Names¶
endpoints - Endpoints are defined in a service.cfg file. An endpoint name may include slashes or dots. For example, endpoints named "query", "v2/swagger" and "application.wadl" result in these URLs, respectively:
http://service.iris.edu/fdsnws/event/1/query
http://service.iris.edu/fdsnws/event/1/v2/swagger
http://service.iris.edu/fdsnws/event/1/application.wadl
Client Interface Parameters¶
Interface parameter convention - Client interface parameters for a given endpoint are defined in a param.cfg file (discussed in more detail below). For example, to define the parameters "minmagnitude" and "format" for the "query" endpoint, add the following lines to the param.cfg file:
query.minmagnitude=NUMBER
query.format=TEXT
WSS will then be configured to accept a client request using this URL:
http://service.iris.edu/fdsnws/event/1/query?minmagnitude=8.5&format=geocsv
In summary -- A war file name may be chosen as needed to get a desired service name, but respective configuration file names must correspond to that war file name.
- Endpoint names may be chosen as needed, but the contents of respective WSS configuration files must contain the respective parameters for that endpoint name.
WSS Configuration¶
Each WSS service uses a respective set of configuration files to define endpoints and interface parameters, as well as manage operation of WSS. The following naming convention is used so that multiple web services can be deployed in the same container.
configuration file name convention - WSS tries to open these configuration files when starting. These file names are derived from the deployed .war file name and are determined by replacing any hash character ‘#’ with a dot character ‘.’ and appending the respective file type (e.g. -service.cfg, -param.cfg, -log4j.properties). For the fdsnws#event#1.war file, the required configuration files are:- fdsnws.event.1-service.cfg - contains operational and endpoint parameters
- fdsnws.event.1-param.cfg - contains client interface parameters for each endpoint
- fdsnws.event.1-log4j.properties - contains logging configuration
Note: When deployed, WSS will try to startup even when there are errors in the configuration files, for new web services, the respective log files should be reviewed to see that the desired endpoints and client parameters are fully loaded.
Built in endpoints¶
These endpoints are built in to WSS and are available to a client without configuration. The wssstatus endpoint is only allowed to clients on the local network. It is assumed that the status may contain site sensitive information and there for should not be generally available on the internet.
Endpoint | Default Value | Comments |
---|---|---|
/ | Internally generated information page | Use parameter rootServiceDoc in service.cfg to specify a URL with this service's specific documentation. |
wssstatus | status information about this WSS and its configuration | Blocked to external request, it is only available on the local network. |
wssversion | current version of Web Service Shell | The wssversion value is set in code when WSS is released. |
whoami | remote address of requestor | The address returned may be affected by network topology. |
version | "notversioned" | Use parameter "version" in service.cfg to set a number or other identifier as needed for this service. |
Service configuration file (servicename-service.cfg)¶
The service.cfg file is composed of two types of parameters: global and endpoint parameters. Global parameters apply to the application as a whole while endpoint parameters can be configured multiple times.
Global Parameters¶
Parameter Name | Default Value | Description |
---|---|---|
appName | "notnamed" | It can be different from the war file name - it can indicate this service or data application name - Note, this value is also used in output to "usage" log files as field "application". |
version | "notversioned" | Indicates the version of this web service. The response to a "version" endpoint request. |
corsEnabled | true | when true, the HTTP response header includes Access-Control-Allow-Origin: * |
rootServiceDoc | null | a URL providing information about the application, documentation on usage, etc., if a value is not provided, a generic html page will be generated. |
loggingMethod | "LOG4J" | choices are LOG4J or JMS or RABBIT_ASYNC. |
loggingConfig | null | a file name or URL referencing an IRIS RabbitMQ configuration file. Only required if loggingMethod is set to RABBIT_ASYNC |
sigkillDelay | 30 | time in seconds before a handler process is sent a SIGKILL, see section "Command-Line Process Time Limits", time starts after handlerTimeout (see Endpoint Parameter table) |
singletonClassName | null | optional - A user provided Java class that will be instantiated once when the service starts. A service can use this class to provide capability that may be needed by individual endpoints in the application. |
Setting up Endpoints¶
The flexibility to set multiple arbitrary endpoints within the service.cfg file is a primary benefit to the 2.x version of the Web Service Shell. An endpoint is defined and configured within the service.cfg file by prepending a string in front of the parameters described in this section.
For example, to specify a "query" endpoint to execute programABC:
query.handlerProgram=/usr/folder1/programABC
query.handlerTimeout=45
plus other endpoint parameters as needed (defined in the following table).
to add another endpoint "answer" which is to execute programDEF, do the following:
answer.handlerProgram=/usr/folder1/programDEF
answer.handlerTimeout=160
etc.
Endpoint Parameters¶
The term "formatType" refers to the name of a media type that may be provided by an endpoint. Media type is a case insensitive string and one or more may be defined in the formatTypes parameter.
The term "${UTC}" can be used where noted to insert a UTC time string in ISO 8601 format.
The term "${appName}" can be used to insert the appName parameter string.
Parameter Name | Default Value | Description |
---|---|---|
endpointClassName | edu.iris.wss.endpoints.CmdProcessor | a Java class that extends the IrisProcessor class. The class will be instantiated for each request on this endpoint. Two IrisProcessor classes are provided with Web Service shell, edu.iris.wss.endpoints.CmdProcessor and edu.iris.wss.endpoints.ProxyResource. CmdProcessor executes a handlerProgram and ProxyResource provides the capability to deliver the contents specified in the proxyURL parameter. |
handlerProgram | "nonespecified" | a fully qualified executable file name. It must be specified when endpointClassName is set to edu.iris.wss.endpoints.CmdProcessor |
handlerTimeout | 30 | inactivity time in seconds before WSS sends a handlerProgram process a SIGTERM. see section "Command-Line Process Time Limits". |
handlerWorkingDirectory | /tmp | if specified, it must be a valid folder with write access. It is required to start a handlerProgram process and the handlerProgram may use the folder to write content into. |
usageLog | true | when true, log information is written to the usageLogger logger defined in the log4j.properties file, or JMS, or RabbitMQ, depending on the value of loggingMethod |
formatTypes | BINARY: application/octet-stream | a list of (formatType, media type) pairs that are provided by this endpoint. When provided, the first pair in the list becomes the default. The BINARY formatType pair is retained in the list and is alway available for selection. Note: A client selects a media type by using the "format" interface parameter and specifying one of the defined formatTypes. |
formatDispositions | empty string | a list of (formatType, HTTP Content-Disposition value) pairs that will override the Content-Disposition header set by WSS default or addHeader parameter. A Content-Disposition, filename value may have ${UTC} and/or ${appName} appended/inserted so as to avoid name conflicts when files are downloaded by a client. |
addHeaders | empty string | a list of HTTP headers that will be added to the default headers, or override any default header with the same name. ${UTC} and/or ${appName} may be used in the filename value. |
postEnabled | false | when false, POST request are ignored, when true, the POST body is passed to the respective handlerProgram or Java processor endpoint. |
logMiniseedExtents | false | when true, additional Miniseed channel information is collected and written to the usage log, only applies to edu.iris.wss.endpoints.CmdProcessor |
use404For204 | false | when true and a response has no data, return an HTTP code 404 instead of 204. |
proxyURL | "noproxyURL" | when used, it must point to a valid URL. It is required when the endpointClassName is set to edu.iris.wss.endpoints.ProxyResource |
Managing HTTP headers for a client response¶
HTTP headers can be set both by endpoint parameter and by a command-line handler. WSS sets headers in the following order:- WSS default headers
- When corsEnabled is true, return this header
access-control-allow-origin: * - When formatType is MSEED, MINISEED, or BINARY, return this header
content-disposition: attachment; filename=service${UTC}.${formatType} (note: no file type for BINARY)
otherwise, return this header
content-disposition: inline; filename=service${UTC}.${formatType}
- When corsEnabled is true, return this header
- addHeaders - adds new headers, or overrides default headers.
- formatDispositions - overrides content-disposition header, but only for the respective formatType.
- command-line handler - can add to, or override previous headers - see section "Additional CmdProcessor Features".
Param configuration file (servicename-param.cfg)¶
The contents of the param.cfg file specify the names and types of client interface parameters that a given endpoint supports. Web Service Shell will perform type checking of parameters from client requests before executing respective handlers or Java classes. This enables a quick response to a client for simple parameter format errors in a request URL.
The data types recognized by Web Service Shell are NUMBER, TEXT, BOOLEAN, DATE, and NONE. If the type "NONE" is used for a parameter type, WSS will return a "No value permitted" exception for that parameter.
Reserved Parameters¶
A few interface parameters have special meaning in Web Service Shell and can only be used as defined here. The reserved interface parameters are:- format
- aliases
- username
- stdin
- nodata
- output
WSS uses the "format" parameter to determine the response media type. The value of format must be one of the formatType names defined in formatTypes parameter. By default, the format parameter does not need to be defined and the response will be the default media type. Add the format parameter to allow a client to explictly select media type.
The "aliases" parameter can be used to specify one or more short names for interface parameters on a given endpoint. The aliases parameter itself is not accessible by a client, see a usage example below.
"username" is added by WSS to a handlerProgram argument list when a user has been successfully authenticated.
"stdin" is added by WSS to a handlerProgram argument list when a post request has been made. It indicates to the handler that the handler should read stdin to get the post data.
"nodata" is predefined in WSS and can be used on any query to indicate to WSS to return an HTTP code 404 instead of 204. The accepted values are "204" or "404". When this parameter is used, it overrides the configuration parameter "use404For204"
"output" is a deprecated parameter and should not be used.
param.cfg parameters¶
Client interface parameter names and types are defined as needed for respective endpoints in the param.cfg file. Each parameter must be prepended with an endpoint name. The endpoint name must be defined in the service.cfg file or the parameter will be ignored.
For example, to specify interface parameters "format", "minlongitude", "maxlongitude" for endpoint "query" enter the following lines in a param.cfg file:
query.format=TEXT
query.minlongitude=NUMBER
query.maxlongitude=NUMBER
The "aliases" parameter can be used to define shorter names for a respective interface parameter name. Each endpoint may have an "aliases" list that relates a defined parameter to one or more alternate parameter names. For example, to define mnlg and minlong as short names for minlongitude; and mxlong for maxlongitude, add the following:
query.aliases = \
minlongitude: (mnlg, minlong), \
maxlongitude: mxlong
Log4j properties file (servicename-log4j.properties)¶
WSS uses log4j to log two types of messages: operation messages and data usage messages. Use the log4j properties file to specify the name and location of the respective log files. Note: the usage log file is only used when loggingMethod is set to LOG4J, otherwise usage log messages are sent to their respective JMS or RabbitMQ destination.
If a respective log4j properties files is not found when WSS starts, it will try to write log messages to files wss.log and wss_usage.log in the container log folder.
Additional CmdProcessor Features¶
The Java class, edu.iris.wss.endpoints.CmdProcessor, provided with WSS has additional features that may be useful when developing services.
Command-Line Process Time Limits¶
Once a command-line process starts, CmdProcessor will wait for handlerTimeout seconds for a process to return data or error messages. Once handlerTimeout is exceeded, CmdProcessor sends the process a SIGTERM. A process may catch the SIGTERM and do cleanup as needed such as close databases, write messages to standard err, etc., and terminate. WSS will wait sigkillDelay seconds for the process to end, if the process does not end, WSS will send a SIGKILL.
CmdProcessor to Command-Line Interface¶
CmdProcessor sends request parameters to the command-line process's standard input. The interface parameters and values are written to the process in this form:
--interfaceParameter value
If the client request is a POST, a parameter, --STDIN (with no value) along with the POST body is written to the process.
If an error occurs, CmdProcessor checks standard error and includes any information read from standard error in an error response to the client.
Once CmdProcessor detects any data from the process on standard output, it returns an HTTP 200 to the client and hands off the data stream to the web service container, which then streams the data to the client.
Setting HTTP Headers From a Command-Line Process¶
CmdProcessor has the ability to set response HTTP headers. The CmdProcessor checks for header values when a process starts writing and then returns any headers found to WSS. WSS adds or overrides any previously set headers before returning an HTTP response, see section "Managing HTTP headers for a client response".
The following restrictions apply:
- A command-line process that needs to set headers must write the header information before writing any other output
- The header information must be text in the following form:
- starts with "HTTP_HEADERS_START"
- followed by each header in regular HTTP format, terminated with linefeed or carriage return linefeed
- ends with "HTTP_HEADERS_END"
- The marker strings must not be followed by linefeed or carriage return linefeed.
For example, to provide headers "Content-Disposition", "Access-Control-Allow-Origin", and "Test-Header", the output looks like this (\n indicates linefeed):
HTTP_HEADERS_STARTContent-Disposition : inline\nAccess-Control-Allow-Origin : http://host.example\nTest-Header : value-for-test-hdr\nHTTP_HEADERS_END
Environmental Information for Command-Line Processes¶
When CmdProcessor starts a command-line process, the following environmental values are set.
REQUESTURL - the client request URL
USERAGENT - value from request HTTP header User-Agent
IPADDRESS - client IP, may be affected by network topology
APPNAME - value from appName configuration parameter
VERSION - value from version configuration parameter
CLIENTNAME - dummy string, not currently in use
HOSTNAME - host name of WSS server
If an authenticated user is present, then also AUTHENTICATEDUSERNAME - authenticated user name
Changes from version 1.x¶
Web Service Shell has been upgraded to:
- enable more flexibility in naming a WSS application
- support more than one user defined endpoint per WSS application
- generalize proxying capability
- enable setting HTTP headers from a command-line process
- add detailed error description as needed
Additionally, unused parameters have been removed and a few parameter names have been changed.
Converting an existing Web Service Shell application to 2.x¶
The general steps needed to change a Web Service Shell 1.x configuration to 2.x are:
- service.cfg file
- change web service part of config file name as needed
- create global parameters
- create endpoint for "query"
- create endpoints for "application.wadl" or "v2/swagger" as needed and set respective proxyURL
- param.cfg file
- change web service part of config file name as needed
- change parameter names by prepending endpoint name
- log4j.properties
- change web service part of config file name as needed
- change output log file names as needed
Changes for endpoints provided by Web Service Shell¶
Previous Endpoint | New Endpoint | Comment |
---|---|---|
/ | same as 1.x | |
status | wssstatus | endpoint rename |
application.wadl | builtin implementation removed | A general solution is now provided. Add an endpoint configuration with endpointClassName set to edu.iris.wss.endpoints.ProxyResource, add proxyURL and set the proxy content, and add formatTypes and set the default media type. |
wssversion | same as 1.x | |
whoami | same as 1.x | |
version | same as 1.x | |
catalogs | removed | Removed from WSS code and will be re-implemented as endpoint in Event service |
contributors | removed | same as catalogs |
counts | removed | same as catalogs |
service.cfg related changes - global parameters¶
Previous Parameter | New Parameter | Comment |
---|---|---|
appName | same as 1.x | |
version | same as 1.x | |
corsEnabled | same as 1.x | |
rootServiceDoc | same as 1.x | |
loggingMethod | same as 1.x | with addition of RABBIT_ASYNC option |
- | loggingConfig | new with 2.1, only applies to loggingMethod RABBIT_ASYNC |
sigkillDelay | same as 1.x | |
singletonClassName | same as 1.x | |
rootServicePath | removed | not needed |
jndiUrl | removed | not needed |
connectionFactory | removed | not needed |
topicDestination | removed | not needed |
argPreProcessorClassName | removed | not needed |
service.cfg related changes - endpoint parameters¶
Previous Parameter | New Parameter | Comment |
---|---|---|
streamingOutputClassName | endpointClassName | parameter renamed |
handlerProgram | same as 1.x | |
handlerTimeout | same as 1.x | |
handlerWorkingDirectory | same as 1.x | |
usageLog | same as 1.x | |
outputTypes | formatTypes | parameter renamed |
- | formatDispositions | new parameter |
- | addHeaders | new parameter |
postEnabled | same as 1.x | |
- | logMiniseedExtents | new parameter - extra processing and usage logging for Miniseed channel information is no longer performed by CmdProcessor unless this parameter is added |
use404For204 | same as 1.x | |
- | proxyURL | new parameter, must be used when endpointClassName is set to edu.iris.wss.endpoints.ProxyResource |
param.cfg configuration changes¶
Configuring interface parameters has not changed except to add an endpoint name. Each parameter must now be prepended with an endpoint name. The endpoint name must be defined in the service.cfg file or the parameter will be ignored.
For example, to change parameters for endpoint "query":
original form | is now |
---|---|
format=TEXT | query.format=TEXT |
minlongitude=NUMBER | query.minlongitude=NUMBER |
maxlongitude=NUMBER | query.maxlongitude=NUMBER |
The "aliases" parameter is handled like other parameters, i.e. prepend the endpoint name in front of "aliases".
answer.aliases = \
minlongitude: (mnlg, minlong), \
maxlongitude: mxlong
Additional endpoints are specified in the same way, for example, to add parameters for an endpoint "answer":
answer.format=TEXT
answer.minlongitude=NUMBER
answer.maxlongitude=NUMBER
Parameter File Examples¶
service.cfg¶
# Service Documentation and context path baselines # ---------------- globals appName=my-service-app-name version=1.0.0 # CORS is enabled by default, set to false to disable CORS processing ##corsEnabled=false # a URL providing information about the application, documentation on usage, etc. #rootServiceDoc=http://service/fdsnws/dataselect/docs/1/root/ # LOG4J or JMS or RABBIT_ASYNC loggingMethod=LOG4J ##loggingConfig=local_file_system or URL for rabbitconfig-publisher.properties file # time is seconds sigkillDelay=30 # -------- endpoint -------- query.endpointClassName=edu.iris.wss.endpoints.CmdProcessor query.handlerProgram=/user/home/tomcat/bin/extractdata-handler # time in seconds for command line processes query.handlerTimeout=300 query.handlerWorkingDirectory=/tmp # usageLog is true by default, set this to false to disable usage logging #query.usageLog=false # formatTypes - specifies a list of "formatType: mediaType" pairs query.formatTypes = \ mseed:application/vnd.fdsn.mseed, \ miniseed:application/vnd.fdsn.mseed, \ text: text/plain,\ csv: text/csv,\ json: application/json, \ xml: application/xml # Content-Disposition overrides for respective media types "text" and "miniseed" query.formatDispositions= \ text: inline; filename="mypart_${appName}_${UTC}.txt", \ miniseed: attachment; filename="a_miniseed_file.mseed" # Disable or remove this to disable POST processing query.postEnabled=true # false by default, enables additional miniseed processing and logging ##query.logMiniseedExtents=true # Enable this to return HTTP 404 in lieu of 204, NO CONTENT query.use404For204=true # -------- endpoint -------- application.wadl.endpointClassName=edu.iris.wss.endpoints.ProxyResource # required when endpointClassName is set to edu.iris.wss.endpoints.ProxyResource application.wadl.proxyURL=http://service/fdsnws/dataselect/docs/1/wadl application.wadl.formatTypes = \ xml: application/xml
param.cfg¶
query.format=TEXT query.type=TEXT query.minlongitude=NUMBER query.maxlongitude=NUMBER query.minlatitude=NUMBER query.maxlatitude=NUMBER
log4j.properties¶
log4j.rootLogger=INFO, ShellAppender log4j.appender.ShellAppender=org.apache.log4j.RollingFileAppender log4j.appender.ShellAppender.File=${catalina.base}/logs/my_app_1.log log4j.appender.ShellAppender.MaxFileSize=10MB log4j.appender.ShellAppender.MaxBackupIndex=5 log4j.appender.ShellAppender.layout=org.apache.log4j.PatternLayout log4j.appender.ShellAppender.layout.ConversionPattern=%d %-5p [%t]: %c{1} %x - %m%n log4j.category.UsageLogger=INFO, UsageAppender log4j.additivity.UsageLogger=false log4j.appender.UsageAppender=org.apache.log4j.DailyRollingFileAppender log4j.appender.UsageAppender.File=${catalina.base}/logs/my_app_usage_1.log log4j.appender.UsageAppender.DatePattern='_'yyyy-MM-dd log4j.appender.UsageAppender.layout=edu.iris.wss.utils.WssLog4JLayout log4j.appender.UsageAppender.layout.ConversionPattern=%m%n
rabbitconfig-publisher.properties¶
# Host that's the broker. This will normally be the load balancer broker=broker1,broker2 # The virtual host within the broker virtualhost=test # Internal buffer size for the async publishers buffersize=10000 # Persistent or not default_persistence=true # The exchange name that recieves these messages exchange=ws_logging # Credentials user=hellorabbit password=hellopw # Probably never normnally reconnect reconnect_interval=-1 # How often to wait between failed connection attempts in msec retry_interval=4000