Extending open source tools with functionality for defining and executing message transformations from EDIFACT to other standards and back requires a good understanding of the UN/EDIFACT standard.
Below I briefly explain the UN/EDIFACT standard and cover the following topics:
- important character sets
- the different elements of the structure
- a description of the structure
- the position, status and repetition factor of the structure elements
- the compression rules
important character sets
- Segment terminator = apostrophe '
- Segment tag and data element separator = plus sign +
- Component data element separator = colon :
- Release character = question mark ? When it immediately precedes one of the characters ' + : ? it restores that character's normal meaning, so the character is read as data and not as a separator.
Example: 10?+10=20 means 10+10=20
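The effect of the release character on parsing can be sketched in a few lines of Python. This is only a minimal illustration that assumes the default service characters listed above. The function names split_escaped and unescape are my own and do not come from any library.

```python
def split_escaped(text, separator, release="?"):
    """Split text on separator, ignoring separators protected by the release character."""
    parts, current, i = [], "", 0
    while i < len(text):
        if text[i] == release and i + 1 < len(text):
            # Keep the release character and the protected character together,
            # so that a later split on another separator still sees the escape.
            current += text[i:i + 2]
            i += 2
        elif text[i] == separator:
            parts.append(current)
            current = ""
            i += 1
        else:
            current += text[i]
            i += 1
    parts.append(current)
    return parts


def unescape(text, release="?"):
    """Remove release characters and keep the characters they protect."""
    out, i = "", 0
    while i < len(text):
        if text[i] == release and i + 1 < len(text):
            i += 1  # skip the release character itself
        out += text[i]
        i += 1
    return out


print(split_escaped("10?+10=20+5", "+"))  # ['10?+10=20', '5']
print(unescape("10?+10=20"))              # 10+10=20
```

Splitting first and removing the release characters afterwards matters: if the release characters were removed first, the protected plus sign could no longer be told apart from a real separator.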
the structure elements
The structure of an EDIFACT message is fixed and contains a number of mandatory and conditional elements. The most important building block of an EDIFACT message is the segment and hence the structure is defined in segment tables. These tables show which segments are used in the message and the order in which the segments must occur.
A simple example of such a segment table looks as follows:
Segments that belong together can be grouped in a segment group. A segment group always contains a trigger segment and at least one more segment or segment group. The trigger segment is a mandatory segment and must appear exactly once per occurrence of the group. The trigger segment is always the first segment in the group.
A segment is uniquely identified by its segment tag and its position in the message.
In the above example the segment with the segment tag RFF is a trigger segment of the segment group 1 and also of the segment group 3.
A segment is a collection of logically related stand-alone or composite data elements in a fixed, defined sequence. The segment BGM, see example below, contains two composite data elements and two stand-alone data elements. A composite data element contains a combination of several data elements.
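To make the relation between segment, data element and composite data element concrete, here is a minimal parsing sketch. It assumes the default separators and a segment whose terminator has already been removed. The helper names and the DTM and FTX sample values are illustrative assumptions and are not taken from a real message.

```python
def split_on(text, separator, release="?"):
    """Split on separator, leaving pairs of release character + protected character intact."""
    parts, current, i = [], "", 0
    while i < len(text):
        if text[i] == release and i + 1 < len(text):
            current += text[i:i + 2]
            i += 2
        elif text[i] == separator:
            parts.append(current)
            current = ""
            i += 1
        else:
            current += text[i]
            i += 1
    parts.append(current)
    return parts


def unescape(text, release="?"):
    """Drop release characters, keeping the characters they protect."""
    out, i = "", 0
    while i < len(text):
        if text[i] == release and i + 1 < len(text):
            i += 1
        out += text[i]
        i += 1
    return out


def parse_segment(segment):
    """Return (tag, data elements); every data element is a list of components."""
    elements = split_on(segment, "+")
    tag = elements[0]
    data = [[unescape(c) for c in split_on(e, ":")] for e in elements[1:]]
    return tag, data


# A composite data element with three components.
print(parse_segment("DTM+137:20111126:102"))
# ('DTM', [['137', '20111126', '102']])

# Empty data elements and released separators inside free text.
print(parse_segment("FTX+AAI+++Price?: 10?+10=20"))
# ('FTX', [['AAI'], [''], [''], ['Price: 10+10=20']])
```

In this sketch a stand-alone data element is simply a list with one component, which keeps the result uniform at the cost of losing the distinction between the two kinds of element.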
description of the structure
The structure of an EDIFACT message consists of three group levels. These levels are often called envelopes. The start and end of an envelope are identified by service segments. The segment tags of service segments start with the letters UN.
The three levels of the EDIFACT structure are:
- Interchange Envelope (mandatory)
- Functional Group Envelope (conditional)
- Message Envelope (mandatory)
The Interchange Envelope consists of the segments UNA, UNB and UNZ. The segment UNA Service String Advice is an optional segment and provides the capability to specify the use of different separators from the standard.
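A parser has to look at the UNA segment before anything else, because it decides which separators apply to the rest of the interchange. The sketch below assumes the usual layout of UNA: the tag followed by six characters, namely component separator, data element separator, decimal mark, release character, a reserved position and the segment terminator. The names Separators and read_separators are my own, and the sample strings are illustrative.

```python
from typing import NamedTuple


class Separators(NamedTuple):
    component: str = ":"
    element: str = "+"
    decimal: str = "."
    release: str = "?"
    segment: str = "'"


def read_separators(interchange):
    """Return the separators announced by UNA, or the defaults when UNA is absent."""
    if interchange.startswith("UNA") and len(interchange) >= 9:
        chars = interchange[3:9]
        # chars[4] is the reserved position (normally a space) and is ignored here.
        return Separators(chars[0], chars[1], chars[2], chars[3], chars[5])
    return Separators()


print(read_separators("UNA:+.? 'UNB+UNOA:1"))
# Separators(component=':', element='+', decimal='.', release='?', segment="'")
print(read_separators("UNA|^,\\ ~UNB^UNOA|1").element)  # ^
print(read_separators("UNB+UNOA:1").segment)            # '
```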
The segment UNB Interchange Header identifies the sender and receiver of the interchange. All the data within the Interchange Envelope is intended for one receiver. The segment UNZ Interchange Trailer closes the envelope.
The electronic document, the actual EDI message, contains the segments from UNH to UNT. A message always starts with a Message Header, the UNH segment, and ends with a Message Trailer, the UNT segment.
The segment UNH contains the Composite Data Element UNH-S009 Message Identifier with the message type, version number and the identification of the organization responsible for the message specification. The Data Element 0065 Message Type contains the message type. The message type is identified by a six-letter code, for example ORDERS or INVOIC.
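Reading the message type from a UNH segment can be sketched as follows. The example assumes the default separators and a UNH segment without release characters. The function name and the dictionary keys are my own choice, and the sample segment is illustrative.

```python
def message_identifier(unh):
    """Extract the message reference and the parts of S009 from a UNH segment."""
    elements = unh.rstrip("'").split("+")
    if elements[0] != "UNH" or len(elements) < 3:
        raise ValueError("not a complete UNH segment")
    components = elements[2].split(":")
    components += [""] * (4 - len(components))  # pad truncated composites
    return {
        "reference": elements[1],
        "message_type": components[0],  # data element 0065
        "version": components[1],
        "release": components[2],
        "agency": components[3],
    }


info = message_identifier("UNH+1+ORDERS:D:96A:UN'")
print(info["message_type"])  # ORDERS
print(info["agency"])        # UN
```

A transformation tool would typically use the message type and version found here to select the matching segment table before it parses the rest of the message.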
The service segment UNS Section Separator can also appear in the message part to make a clear distinction between the header, detail and summary sections. For the separation between header and detail sections the segment contains the value D and between the detail and the summary sections the value S.
The Functional Group Envelope is defined by the segments UNG and UNE. The use of the functional group envelope is mandatory for communication to and from North America. The functional group envelope makes it possible to place messages of different types in separate functional groups.
position, status and repetition factor
Segment groups, segments, composite data elements and data elements have a fixed position in the structure of an EDIFACT message. The position is represented by the position number. The position of segments and segment groups can be different for each message type.
The field Status (S) indicates whether an element in the structure is Mandatory (M) or Conditional (C). A mandatory segment must occur at least once.
The field Repetition (R) indicates the maximum number of times an element in the structure may repeat.
A data element is considered available when it contains a value of at least one character. A composite data element is considered available if at least one of its data elements is present. A segment is available when the segment tag is present and a segment group is available when the trigger segment is present.
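How position, status and repetition work together can be illustrated with a small check of a list of segment tags against a segment table. This is a deliberately simplified sketch: it handles only a flat table without segment groups, and the table used in the example is hypothetical and not taken from a real message specification.

```python
def validate(segments, table):
    """Check segment tags (in message order) against (tag, status, repetition) rows."""
    errors = []
    pos = 0
    for tag, status, repetition in table:
        count = 0
        # Consume consecutive occurrences of this tag, up to the repetition limit.
        while pos < len(segments) and segments[pos] == tag and count < repetition:
            pos += 1
            count += 1
        if status == "M" and count == 0:
            errors.append(f"mandatory segment {tag} is missing")
    if pos < len(segments):
        errors.append(f"unexpected segment {segments[pos]} at index {pos}")
    return errors


# Hypothetical table: position is the row order, then status and repetition.
table = [("UNH", "M", 1), ("BGM", "M", 1), ("DTM", "C", 5), ("UNT", "M", 1)]

print(validate(["UNH", "BGM", "DTM", "DTM", "UNT"], table))  # []
print(validate(["UNH", "DTM", "UNT"], table))
# ['mandatory segment BGM is missing']
```

Because a segment is identified by tag and position together, the check walks through the table in order and does not simply count tags.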
Compression rules
The UN/EDIFACT standard was created when bandwidth was limited and therefore several compression rules were defined to ensure that messages are as compact as possible.
These compression rules are one of the things that make working with EDIFACT messages difficult, as will become clear later on.
* exclusion of conditional segments by omission
Conditional segments containing no data shall be fully omitted.
* exclusion of conditional data elements in a segment by omission
If a conditional data element is omitted and is followed by another data element, its position shall be indicated by retention of its data element separator.
* exclusion of conditional data element in a segment by truncation
If one or more conditional data elements at the end of a segment are omitted, the segment may be truncated by the segment terminator. Contiguous trailing data element separators are not required to be transmitted.
The last two data elements from the previous example are omitted because these do not contain a value.
These rules also apply to data elements in a composite data element, except that the component data element separator is used instead of the data element separator.
* exclusion of conditional data elements in a composite data element by omission
* exclusion of conditional data elements in a composite data element by truncation
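The compression rules above can be sketched as a small function that writes a segment. It assumes the default service characters and represents an omitted value as an empty string. The function names, the tag TAG and the values A, C and X are placeholders of my own.

```python
def escape(value):
    """Put the release character before every service character in a value."""
    for ch in "?+:'":  # the release character itself must be handled first
        value = value.replace(ch, "?" + ch)
    return value


def build_segment(tag, elements):
    """Render a segment; each element is a string or a list of component strings."""
    rendered = []
    for element in elements:
        if isinstance(element, (list, tuple)):
            components = [escape(c) for c in element]
            # Truncation inside a composite: drop trailing empty components.
            while components and components[-1] == "":
                components.pop()
            rendered.append(":".join(components))
        else:
            rendered.append(escape(element))
    # Truncation inside the segment: drop trailing empty data elements.
    while rendered and rendered[-1] == "":
        rendered.pop()
    if not rendered:
        return ""  # a segment without data is omitted completely
    return "+".join([tag] + rendered) + "'"


print(build_segment("TAG", ["A", "", "C", "", ""]))                     # TAG+A++C'
print(build_segment("TAG", [["A", "", "C", "", ""], "", ["X", ""]]))    # TAG+A::C++X'
print(repr(build_segment("TAG", ["", ["", ""]])))                       # ''
```

Omitted values in the middle keep their separator so that the position of the following values stays unambiguous, while omitted values at the end simply disappear. A parser therefore cannot rely on the number of separators and always has to map values back to positions in the segment definition.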
The full overview of the UN/EDIFACT Syntax Rules can be found on the website of the UN/ECE on the webpage UN/EDIFACT Syntax Rules.
Last update: 26-11-2011