Removing a UTF BOM header
Files in some UTF encodings start with a special sequence of bytes called a BOM (Byte Order Mark).
Example of a UTF-8 file with BOM viewed in an ISO-8859-1 editor:
ï»¿John;Doe
Jane;Jackson
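To see where the stray leading characters come from, here is a small sketch that decodes the three UTF-8 BOM bytes (EF BB BF) as ISO-8859-1. The helper name decodeLatin1 is made up for this illustration and is not part of Semarchy xDI.

```javascript
// The UTF-8 BOM is the byte sequence EF BB BF.
const UTF8_BOM = [0xEF, 0xBB, 0xBF];

// Hypothetical helper: ISO-8859-1 maps each byte value to the Unicode
// code point with the same number, so one fromCharCode call per byte is enough.
function decodeLatin1(bytes) {
  return bytes.map(function (b) { return String.fromCharCode(b); }).join("");
}

const bomAsLatin1 = decodeLatin1(UTF8_BOM);

// Prints the first data line as an ISO-8859-1 editor would show it.
console.log(bomAsLatin1 + "John;Doe");
```

The three characters produced are U+00EF, U+00BB and U+00BF, which is why the BOM appears as three extra characters in front of the first field.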
When such files are read, these first characters may be treated as part of the data.
To ignore them in Semarchy xDI, a simple trick is to add a transformation script to the File metadata, such as:
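var i = 0; // last value read from the input, -1 at end of input
var j = 0; // number of reads done so far
var ch;
do
{
    i = __in__.read();
    j = j + 1;
    /* return data only after the third character (the UTF-8 BOM is three bytes long: EF BB BF) */
    /* note: the first three characters are dropped even if the file has no BOM */
    if (j > 3)
    {
        if (i > -1)
        {
            ch = String.fromCharCode(i);
            i = ch.charCodeAt();
            __out__.write(i);
        }
    }
} while (i > -1);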
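
The script above drops the first three values unconditionally, so a file without a BOM would lose its first three characters. A possible variant, shown here only as a sketch, checks that the leading bytes really are EF BB BF before discarding them. The function copyWithoutUtf8Bom and its readByte/writeByte parameters are hypothetical names for this example. It is an assumption that, inside xDI, they could be thin wrappers around __in__.read() and __out__.write(), and that __in__ returns raw byte values.

```javascript
// Hypothetical helper: copies everything from readByte() to writeByte(),
// dropping a leading UTF-8 BOM (EF BB BF) only if one is actually present.
// readByte must return the next byte value, or -1 at end of input.
function copyWithoutUtf8Bom(readByte, writeByte) {
  var BOM = [0xEF, 0xBB, 0xBF];
  var head = [];
  var b;

  // Buffer up to three leading bytes.
  while (head.length < BOM.length && (b = readByte()) > -1) {
    head.push(b);
  }

  var hasBom = head.length === BOM.length &&
    head.every(function (value, index) { return value === BOM[index]; });

  // Not a BOM: the buffered bytes are real data and must be written out.
  if (!hasBom) {
    head.forEach(function (value) { writeByte(value); });
  }

  // Copy the rest of the input unchanged.
  while ((b = readByte()) > -1) {
    writeByte(b);
  }
}

// Stand-alone usage with in-memory arrays instead of xDI streams.
function run(input) {
  var pos = 0;
  var out = [];
  copyWithoutUtf8Bom(
    function () { return pos < input.length ? input[pos++] : -1; },
    function (value) { out.push(value); }
  );
  return out;
}

console.log(run([0xEF, 0xBB, 0xBF, 0x4A])); // BOM present: it is dropped
console.log(run([0x4A, 0x6F, 0x68, 0x6E])); // no BOM: data is left unchanged
```

Buffering the first three bytes costs almost nothing and lets the same script be attached to files that may or may not carry a BOM.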