Hi zarzyk

FsLex currently only accepts 8-bit inputs. This is not ideal, and Unicode lexing has long been on our TODO list. If you can work with an encoding where Polish characters fall in the sub-256 character range (ISO-8859-2 or Windows-1250, for example), then you may be able to make things work very smoothly: just convert the string to bytes in that encoding using one of the System.Text encoding objects, and write your lexer using the byte values for the characters you want.
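As a rough sketch of the conversion step, something like the following should work. It assumes ISO-8859-2 is the encoding you pick; `toLexerInput` is a made-up helper name, and how you hand the bytes to your lexer depends on your FsLex version, so that part is left out.

```fsharp
open System.Text

// Assumption: on .NET Core / .NET 5+ the legacy code pages must be registered
// before GetEncoding can find them. On the classic .NET Framework, remove this line.
Encoding.RegisterProvider(CodePagesEncodingProvider.Instance)

// ISO-8859-2 (Latin-2) maps every Polish letter to a single byte.
let latin2 = Encoding.GetEncoding("iso-8859-2")

// Hypothetical helper: turn source text into the 8-bit stream the lexer will see.
let toLexerInput (text: string) : byte[] = latin2.GetBytes text

// Each character becomes exactly one byte, e.g. 'ż' is 0xBF and 'ó' is 0xF3,
// so the lexer rules can match on those byte values.
let input = toLexerInput "żółw"
printfn "%A" input
```

The lexer rules then refer to the Polish letters by their byte values in the chosen encoding rather than by the characters themselves.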

For F# we implement non-ASCII lexing by accepting an approximation of UTF-8 encodings: take a look at the UTF-8 encoded lex rules in lex.mll in the F# source. You may be able to do this as well. However, it's a fair bit of work, and you should make sure to get from bytes to Unicode strings as soon as possible.
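To illustrate what the UTF-8 route involves, here is a small sketch showing that a Polish letter occupies more than one byte in UTF-8, and how a matched lexeme can be decoded back to a string straight away. `lexemeToString` is a hypothetical helper, not something from the F# source.

```fsharp
open System.Text

// 'ł' (U+0142) takes two bytes in UTF-8: 0xC5 0x82.
// A byte-level lex rule has to match that whole sequence.
let utf8Bytes = Encoding.UTF8.GetBytes "ł"

// Hypothetical helper: decode the raw bytes of a matched lexeme
// back into a Unicode string as early as possible.
let lexemeToString (lexeme: byte[]) : string = Encoding.UTF8.GetString lexeme

let roundTripped = lexemeToString utf8Bytes
printfn "%A -> %s" utf8Bytes roundTripped
```

Decoding in the lexer actions keeps the byte-level detail out of the parser and the rest of the program.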

Unicode lexing isn't actually too hard to implement for us. I'll try to take a look at it for the next release.

Kind regards

Don

Posted on 6/26/2007 7:12 PM