The problem is almost certainly that "open_file" should be using "open_in_bin" instead of "open_in", since you are doing binary input.

In particular, the lexer created using "from_channel" ultimately just calls "Pervasives.input". For files opened using "open_in" this reads bytes directly from the underlying stream, as indicated by the following documentation comment:

/// Attempt to input the given number of bytes from the channel, writing them into the
/// buffer at the given start position. Does not block if the bytes are not available.
///
/// The use of this function with a channel performing byte-to-character translation (e.g. one
/// created with open_in, open_in_utf8 or open_in_encoded, or one
/// built from a StreamReader or TextReader) is not recommended.
/// Instead, open the channel using open_in_bin or binary_reader_to_in_channel.
///
/// If used with a StreamReader channel, i.e. one created using
/// open_in, open_in_utf8 or open_in_encoded, or one built from
/// a StreamReader, this function reads bytes directly from the underlying
/// BaseStream.  This may not be appropriate if any other input techniques are being
/// used on the channel.
///
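
As a minimal OCaml-side sketch of what "binary input" means here: open the channel with open_in_bin and pull raw bytes out with input. The helper name read_bytes_binary and the temporary file are made up for illustration, the code assumes OCaml 4.08 or later (for Fun.protect), and it says nothing about how the F# library implements its channels.

```ocaml
(* Read up to [count] raw bytes from [path] through a binary-mode channel. *)
let read_bytes_binary path count =
  let ic = open_in_bin path in
  Fun.protect
    ~finally:(fun () -> close_in ic)
    (fun () ->
      let buf = Bytes.create count in
      (* input may return fewer bytes than requested, so keep only what arrived *)
      let n = input ic buf 0 count in
      Bytes.sub_string buf 0 n)

let () =
  (* Write seven bytes, including CR LF and a byte above 0x7F, then read them back. *)
  let path = Filename.temp_file "binary_demo" ".bin" in
  let oc = open_out_bin path in
  output_string oc "xof \r\n\xff";
  close_out oc;
  let s = read_bytes_binary path 7 in
  Sys.remove path;
  assert (s = "xof \r\n\xff")
```

In stock OCaml the difference between open_in and open_in_bin only shows up as newline translation on Windows; on .NET the difference is the byte-to-character decoding described in the comment above.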

Please feel free to post your entire sample.

Don

By on 9/12/2006 10:05 AM ()

Hi Don,

Thanks - that does solve the problem, though I don't understand why. The source file I am parsing is not binary. I guess your comment "you are doing binary input" means fslex is doing binary input?

I should point out that I actually have two versions of the "read_x_file_header" function - one for ocaml and one for f#. Do you have any comments on how I could unify these? Apart from a minor incompatibility in fslex (*) this is the only difference I had to contend with.

ocaml:

let read_x_file_header file = 
  let s = String.create 16 in
  for j=0 to 15 do
    s.[j] <- input_char file;
  done;
  if (not (s = "xof 0303txt 0032")) && (not (s = "xof 0303txt 0064")) then failwith "Dont recognise header."

f#:

let read_x_file_header file =
  let rec make n = 
    if n = 0 then ""
    else (String.of_char (input_char file)) ^ (make (n-1)) in
  let s = make 16 in
  if (not (s = "xof 0303txt 0032")) && (not (s = "xof 0303txt 0064")) then failwith "Dont recognise header."

Cheers,

Andy

(*)

| ['A'-'Z' 'a'-'z'] ['A'-'Z' 'a'-'z' '0'-'9' '_'] * as id { ... }

needs to be redone as

| ['A'-'Z' 'a'-'z'] ['A'-'Z' 'a'-'z' '0'-'9' '_'] *  { let id = lexeme lexbuf in ... }
By on 9/12/2006 2:53 PM ()

Hi Andy,

Yes, fslex does binary input, in the sense that it reads bytes directly, rather than applying a Unicode interpretation (i.e. encoding/decoding) to the bytes.

On .NET and with F# the text/binary input distinction really comes down to whether or not you are applying a Unicode interpretation to the bytes. Text mode functions such as "input_line" and "input_char" only really make sense in conjunction with a Unicode interpretation (or a default interpretation is chosen for you) - for example, what Unicode character would bytes in the range 0x80 - 0xFF represent? This is not specified by OCaml, since there characters are just bytes, rather than being Unicode characters. This means you can't really mix and match character input using "input_char" with byte input using "input".
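
To make the "characters are just bytes" point concrete on the OCaml side, here is a small sketch (byte_codes is a helper name introduced for this example). The UTF-8 encoding of "é" is the two bytes 0xC3 0xA9, and OCaml simply sees a string of two chars, with no Unicode interpretation applied.

```ocaml
(* "é" encoded as UTF-8: two bytes, hence two OCaml chars. *)
let utf8_e_acute = "\xc3\xa9"

(* List the numeric value of every char (i.e. byte) in a string. *)
let byte_codes s = List.init (String.length s) (fun i -> Char.code s.[i])

let () =
  assert (String.length utf8_e_acute = 2);
  assert (byte_codes utf8_e_acute = [ 0xC3; 0xA9 ])
```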

The upshot is that when writing I/O code that cross compiles with OCaml you should use binary mode input more often, since from the .NET perspective OCaml is really doing binary mode input most of the time, as it is reading input without applying a Unicode encoding.

To write binary-mode functions like read_x_file_header that work correctly with both OCaml and F# you should develop an OCaml module that makes OCaml compatible with F#. In particular you should add a Bytearray module that is compatible with the F# Bytearray module (see Microsoft.FSharp.MLLib.Bytearray). This F# module exists precisely to allow the development of cross-compiling byte-oriented I/O functionality. Here are some sample functions:


(*IF-OCAML*)
(* This module is an OCaml implementation of an F#-compatible Bytearray module.
 * When using OCaml bytearrays are strings, and when using F# they are of type "byte[]"
 * You can easily implement other functions in the F# bytearray module in this way. *)
module Bytearray = struct
    type bytearray = string
    let create n x = String.make n x
    let zero_create n = String.make n '\000'
    let ascii_to_string (b:bytearray) : string = b
end
(*ENDIF-OCAML*)

let read_x_file_header file =
    let buf = Bytearray.zero_create 16 in 
    let n = input file buf 0 16 in
    if n <> 16 ||
       (let text = Bytearray.ascii_to_string buf in
        text <> "xof 0303txt 0032" && text <> "xof 0303txt 0064")
    then failwith "Dont recognise header."
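
For readers on a current OCaml, where strings are immutable and input takes a bytes buffer, the same header check can be sketched without the Bytearray shim. This is an OCaml-only variant (it is not meant to cross-compile with F#), it assumes OCaml 4.02 or later for really_input_string, and known_headers is a name introduced here for illustration.

```ocaml
(* The two header strings accepted in the thread's original function. *)
let known_headers = [ "xof 0303txt 0032"; "xof 0303txt 0064" ]

let read_x_file_header file =
  (* really_input_string raises End_of_file if fewer than 16 bytes remain;
     treat a short file the same way as an unrecognised header. *)
  let text =
    try really_input_string file 16
    with End_of_file -> failwith "Dont recognise header."
  in
  if not (List.mem text known_headers) then failwith "Dont recognise header."
```

The channel passed in should still be one opened with open_in_bin, for the reasons given above.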
By on 9/13/2006 4:55 AM ()

Hi Don,

I think I am beginning to understand now. My usual interpretation of what text/binary mode means is totally wrong in this case. So text mode means Unicode. Cool.

I have implemented the code you suggested and now it cross compiles between ocaml and f# without a hitch which is nice.

I have also been playing around trying to access the F# code from C# (blimey, you don't half write complex type systems in ML with barely a thought). One issue I have is that I can't seem to access a function from one of my modules (other functions in that module work) - it complains that "no overload for method 'write_xfile_data' takes '1' argument". I beg to differ...

let write_xfile_data d = 

  let rec write_simple_type i data =
    match data with
    | X_int s -> os (i ^ "  i " ^ s ^ "\n")
    | X_real s -> os (i ^ "  r " ^ s ^ "\n")
    | X_string s -> os (i ^ "  s " ^ s ^ "\n")
    | X_compound d -> 
      os (i ^ "  <c\n");
      iter (write_simple_type i) d;
      os (i ^ "  c>\n")
    | X_array (T_string_index(s), d) 
    | X_array (T_int_index(s), d) ->  
      os (i ^ "  <a " ^ s ^ "\n");
      iter (write_simple_type i) d;
      os (i ^ "  a>\n")
  in

  let rec write_xfile_data1 i d = 
    let tname, oname, data, hier = d in 
    os (i ^ tname ^ " " ^ oname ^ " {\n");
    iter (fun data ->
      match data with
      | X_data d -> write_simple_type i d
      | _ -> ()
    ) data;
    iter (fun hier -> 
      match hier with
      | X_object o -> write_xfile_data1 (i ^ "  ") o
      | _ -> ()
    ) hier;
    os (i ^ "}\n")
  in
  write_xfile_data1 "" d

And the c# code:

using Microsoft.FSharp;
using System;

using X_templates = Microsoft.FSharp.List<Xtypes.xfile_el_t>;

using X_data = Microsoft.FSharp.List<Microsoft.FSharp.Tuple<string,string,
          Microsoft.FSharp.List<Xtypes.x_object_el_t>,
          Microsoft.FSharp.List<Xtypes.x_object_el_t>>>;

using X_object = Microsoft.FSharp.Tuple<Microsoft.FSharp.List<Xtypes.xfile_el_t>,
          Microsoft.FSharp.List<Microsoft.FSharp.Tuple<string,string,
                      Microsoft.FSharp.List<Xtypes.x_object_el_t>,
                      Microsoft.FSharp.List<Xtypes.x_object_el_t>>>>;
                  
class Xtest {
 
 static X_templates GetTemplates(X_object x) { return x.Item1; }
 static X_data GetData(X_object x) { return x.Item2; }

 static void Main() { 
  X_templates xtemplates = GetTemplates(Xread.load_x_file(X_templates.Nil, "templates.x"));
  X_data xobj = GetData(Xread.load_x_file(xtemplates, "airplane.x"));
  
  Xwrite.write_xfile_template(xtemplates.Head);
  Xwrite.write_xfile_data(xobj.Head);
 }
}

It's more than possible that I've done something stupid given my level of expertise with these tools.

Thanks for the code formatting suggestion. I have altered my account settings so that I can see the formatting buttons in the tool bar.

Cheers,

Andy

By on 9/13/2006 1:51 PM ()

The representation of public F# functions in the .NET IL is type-directed. In particular, functions taking tupled arguments are compiled as if they take multiple arguments, regardless of whether the functions are written

let f p = ...

or

let f (x,y) = ...

So write_xfile_data takes a tuple (here a 4-tuple: tname, oname, data, hier), hence will be compiled as a .NET method that takes 4 arguments rather than 1. That's just the way things work.

Given you're writing code you want to consume from C#, and you're using tuples for types that are important across the interop boundary, you may wish to use named data types in your F# code to replace tuples. This is really easy, for example,


type x_data = Data of string * string *  x_object_el_t list * x_data list

let write_xfile_data d = 

    ...

    let rec write_xfile_data1 i (Data(tname, oname, data, hier)) = 

        ...

 
    write_xfile_data1 "" d
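
To show the shape of the named-type approach end to end, here is a self-contained toy version in OCaml. The element type is a hypothetical, cut-down stand-in for the real x_object_el_t, and writing into a Buffer.t passed as an argument is an assumption made so the example can be checked; the thread's own "os" function and full types are not shown in this discussion. The four tuple components become the fields of a single Data constructor, so the function takes one value of one named type.

```ocaml
(* Hypothetical, simplified element type standing in for the thread's x_object_el_t. *)
type x_object_el_t =
  | X_int of string
  | X_string of string

(* Named type replacing the (tname, oname, data, hier) tuple. *)
type x_data = Data of string * string * x_object_el_t list * x_data list

let write_xfile_data (buf : Buffer.t) (d : x_data) : unit =
  let os s = Buffer.add_string buf s in
  let write_el i = function
    | X_int s -> os (i ^ "  i " ^ s ^ "\n")
    | X_string s -> os (i ^ "  s " ^ s ^ "\n")
  in
  (* Destructure the named constructor directly in the parameter position. *)
  let rec write_xfile_data1 i (Data (tname, oname, data, hier)) =
    os (i ^ tname ^ " " ^ oname ^ " {\n");
    List.iter (write_el i) data;
    List.iter (write_xfile_data1 (i ^ "  ")) hier;
    os (i ^ "}\n")
  in
  write_xfile_data1 "" d
```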

Don

By on 9/13/2006 4:24 PM ()

Thanks - that explains that then. So if I had a function which was

let blah (x,y) z = ...

then that would expand to a C# function of 3 arguments.

I'm really starting to get a feel for just how powerful this whole .NET malarkey is. I've just had a play with the DbgCLR tool and I am seriously impressed. OCaml's days may be numbered (and I really like OCaml).

Cheers,

Andy

By on 9/13/2006 5:25 PM ()

BTW the nice formatting for the pasted code above comes from pasting "raw text" into the "Design" tab and then marking it with the "F#" code highlighter marker next to the "Insert Smiley" box (again in the "Design" tab).

Pasting in code directly from Visual Studio does not look great. Instead paste into Notepad, and copy from Notepad into the text editor in the "Design" tab.

By on 9/13/2006 5:00 AM ()