It's also possible to use .NET to parse an XML document into a .NET object that represents it (presuming you create a .NET class of a matching structure) using XmlSerializer ([link:msdn2.microsoft.com]). XmlSerializer is itself very C#-ish, so I'd have to think more about a natural way to do this in F#, but it is a quick and easy way to decompose XML into an AST (albeit a sort of weird one, if you're used to the OCaml/F# approach you showed).
Just a thought. :)
You might like to consider using active patterns to dissect XML documents. This provides the benefits of F#'s powerful pattern matching facilities applied directly to the DOM tree, i.e. without having to copy the data.
This technique was described in one of the earlier F#.NET Journal articles.
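For readers who have not seen the idea, here is a minimal sketch of what active patterns over the DOM might look like. It is not the code from the journal article; the pattern names `Elem`, `TextNode` and `Other` and the `render` function are invented for illustration, and only two tags of the toy markup are handled.

```fsharp
open System.Xml

// Hypothetical active pattern: classifies a DOM node in place, without copying it into an AST.
let (|Elem|TextNode|Other|) (node: XmlNode) =
    match node.NodeType with
    | XmlNodeType.Element -> Elem (node.Name, List.ofSeq (Seq.cast<XmlNode> node.ChildNodes))
    | XmlNodeType.Text -> TextNode node.Value
    | _ -> Other

// Render a tiny subset of the toy markup straight from the DOM tree.
let rec render (node: XmlNode) : string =
    match node with
    | Elem ("bold", kids) -> "<b>" + renderAll kids + "</b>"
    | Elem ("italic", kids) -> "<i>" + renderAll kids + "</i>"
    | Elem (_, kids) -> renderAll kids   // unknown tags: keep the content, drop the tag
    | TextNode s -> s
    | Other -> ""

and renderAll (nodes: XmlNode list) : string =
    nodes |> List.map render |> String.concat ""

let doc = XmlDocument()
doc.LoadXml("<para>The first <bold>paragraph</bold> in the <italic>first</italic> section</para>")
let html = render (doc.DocumentElement :> XmlNode)
```

The match in `render` reads much like a match over a union type, but the data stays in the `XmlDocument`. Whether this is terser or faster than a hand-built AST probably depends on how often the tree is traversed.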
Could you maybe provide a small example demonstrating what the code would look like? Would the whole approach, in your opinion, be easier, terser or more efficient than the code posted above?
I've developed a small example script which parses a toy XML document description language into an abstract syntax tree using the XmlReader class from .NET. The script also contains a printer function that produces HTML output from the syntax tree. I hope it is of use to other F# users.
XmlReader is an XML reader that provides fast, non-cached, forward-only access to XML data. Although it is normally a bit less convenient to use than the DOM-based XmlDocument, it seems to be the proper choice for a parser which builds its own document tree. Due to the expressive power of F#, the direct use of its interface is confined to a few lines.
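To make the forward-only model concrete, here is a small sketch that is separate from the script below. The function name `listNodes` and the sample input are made up for illustration. It walks a document once with `Read()` and records the depth of each element and text node, which is the same `Depth` information the parser relies on.

```fsharp
open System.IO
open System.Xml

// Collect (depth, description) for every element and text node, in document order.
// The reader only ever moves forward; nothing is cached or revisited.
let listNodes (xml: string) : (int * string) list =
    use reader = XmlReader.Create(new StringReader(xml))
    [ while reader.Read() do
        match reader.NodeType with
        | XmlNodeType.Element -> yield (reader.Depth, "element " + reader.Name)
        | XmlNodeType.Text -> yield (reader.Depth, "text " + reader.Value)
        | _ -> () ]   // end tags, whitespace etc. are skipped in this sketch

let nodes = listNodes "<section id=\"s1\"><para>Hello <bold>world</bold></para></section>"
```

Because a node is gone once the reader has moved past it, a parser built on XmlReader has to construct its tree while reading.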
The parser is combined from smaller parser functions that call each other recursively. The pattern matching and function currying capabilities of F# make the definition of the parser particularly easy. I hope the approach is general enough to be easily extended to cope with more complex XML markup.
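As a rough illustration of element parsers calling each other recursively, here is a reduced sketch. It is not the script's table-driven design: the functions `readChildren`, `readPhrase` and `parsePhrases` and the three-case `Phrase` type are assumptions made for this example. It uses `let rec ... and` so that the functions can refer to each other directly, which means no mutable reference cells are needed, at the cost of hard-coding the dispatch in a `match`.

```fsharp
open System.IO
open System.Xml

type Phrase =
    | Bold of Phrase list
    | Italic of Phrase list
    | Text of string

// Read the children of the current element and leave the reader just past its end tag.
let rec readChildren (reader: XmlReader) : Phrase list =
    if reader.IsEmptyElement then
        reader.Read() |> ignore
        []
    else
        reader.ReadStartElement()
        let items =
            [ while reader.NodeType <> XmlNodeType.EndElement do
                yield readPhrase reader ]
        reader.ReadEndElement()
        items

// Parse one phrase at the current position; dispatch is a plain pattern match.
and readPhrase (reader: XmlReader) : Phrase =
    match reader.NodeType, reader.Name with
    | XmlNodeType.Element, "bold" -> Bold (readChildren reader)
    | XmlNodeType.Element, "italic" -> Italic (readChildren reader)
    | (XmlNodeType.Text | XmlNodeType.Whitespace | XmlNodeType.SignificantWhitespace), _ ->
        let s = reader.Value
        reader.Read() |> ignore
        Text s
    | _ -> failwithf "unexpected node %A %s" reader.NodeType reader.Name

let parsePhrases (xml: string) : Phrase list =
    use reader = XmlReader.Create(new StringReader(xml))
    reader.MoveToContent() |> ignore   // position on the root element
    readChildren reader

let phrases = parsePhrases "<para>a <bold>b <italic>c</italic></bold></para>"
```

The trade-off is that adding a tag means editing the `match` rather than adding an entry to a map, so the map-based version may still be preferable when the tag set is large.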
HashMultiMaps or System.Collections.Generic.Dictionary might be more appropriate data structures for what I'm currently using Map for, but swapping out the data structure should be trivial.
Any comment or criticism on the code is very welcome. I'd especially appreciate any suggestion on how I could conveniently get rid of the impure use of refs in parseDocument [I'm using the refs to circumvent F# restrictions on recursive data definitions].
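A minimal sketch of what a Dictionary-based dispatch table could look like, assuming a simplified handler type (`string -> string`) rather than the reader-consuming functions used in the script. The names `handlers` and `renderTag` are hypothetical.

```fsharp
open System.Collections.Generic

// Hypothetical dispatch table: tag name -> function wrapping already-rendered content.
let handlers = Dictionary<string, (string -> string)>()
handlers.["bold"] <- (fun s -> "<b>" + s + "</b>")
handlers.["italic"] <- (fun s -> "<i>" + s + "</i>")

// TryGetValue comes back as a (found, value) pair in F#,
// so unknown tags can be handled without catching an exception.
let renderTag (tag: string) (content: string) : string option =
    match handlers.TryGetValue(tag) with
    | true, f -> Some (f content)
    | _ -> None
```

Unlike Map, a Dictionary is mutable, so entries can be added after the handlers that refer to the table have been created.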
Stephan
// We want to read XML documents similar to the following snippet...
let xmldoc = "<document>
  <section id=\"Section 1\">
    <para>The first <bold>paragraph</bold> in <underline>the</underline> <italic>first</italic> section</para>
    <para><anchor id=\"SecondPara\"/>The second paragraph in the first section</para>
    <orderedlist>
      <listitem>First list item in ordered list.</listitem>
      <listitem>Second list item in ordered list.</listitem>
      <unorderedlist>
        <listitem><anchor id=\"NestedList\"/>First item in nested <italic>unordered</italic> list.</listitem>
        <listitem>Second item in nested <bold>unordered</bold> list.</listitem>
      </unorderedlist>
    </orderedlist>
  </section>
  <section id=\"Section 2\">
    <para>Introduction to second section.</para>
    <section id=\"Section 2.1\">
      <para>Links to second <link ref=\"SecondPara\">paragraph</link> and <link ref=\"NestedList\">nested list</link></para>
      <heading level=\"1\">Heading 1</heading>
      <heading level=\"2\">Heading <underline>2</underline></heading>
      <heading level=\"3\">Heading <italic>3</italic></heading>
      <heading level=\"3\">Another heading 3</heading>
    </section>
  </section>
</document>"

//==============================================================================
// ... into an abstract syntax tree with the following data types

type phrase =
    | Italic of phrase list
    | Bold of phrase list
    | Underline of phrase list
    | Anchor of string
    | Link of string * (phrase list)
    | Text of string

type listitem =
    | ListItem of phrase list
    | NestedOrderedList of listitem list
    | NestedUnorderedList of listitem list

type block =
    | Section of string * block list
    | Heading of int * (phrase list)
    | OrderedList of listitem list
    | UnorderedList of listitem list
    | Paragraph of phrase list

type document = Document of block list

//==============================================================================
// the parser is based on the following functions that
// pull XML elements from the passed System.Xml.XmlReader object

/// report error message (and position if 2nd argument supports IXmlLineInfo)
let error msg optReader =
    match box(optReader) with
    | :? System.Xml.IXmlLineInfo as li ->
        failwith (sprintf "%s at Line: %d, Column: %d" msg li.LineNumber li.LinePosition)
    | _ -> failwith msg

/// read attribute of XML element
let readXmlAttrib (attribName : string) attribConv (reader :> System.Xml.XmlReader) =
    let s =
        match reader.get_Item(attribName) with // F# can't cope with reader.["attribName"]
        | null -> error ("could not find attribute " ^ attribName) reader
        | s -> s
    attribConv s

type NodeType = System.Xml.XmlNodeType

/// read sub elements of XML element and pull next element
let readXmlSubs (subMapR : Map<string, 'a -> 'b> ref) (reader :> System.Xml.XmlReader) : 'b list =
    let subMap = !subMapR
    let d = reader.Depth
    let mutable lst = [] // we use a mutable list to collect the result list
    try
        let isEmpty = reader.IsEmptyElement
        reader.ReadStartElement()
        while reader.Depth = d + 1 do
            // the nested "sub elements" are parsed by dispatching to the functions in subMap
            match reader.NodeType with
            | NodeType.Element -> lst <- (subMap.[reader.Name] reader) :: lst
            | NodeType.Text | NodeType.Whitespace when subMap.ContainsKey("TEXT") ->
                lst <- (subMap.["TEXT"] reader) :: lst
            | _ -> ignore (reader.Read());
        if not isEmpty then reader.ReadEndElement() // pull next element
    with
    | Not_found ->
        let elem =
            if reader.NodeType = NodeType.Element then "XML Element " ^ reader.Name
            else "XML Textnode"
        error ("unexpected " ^ elem) reader
    | :? System.Xml.XmlException as e -> error ("XMLReader error [" ^ e.Message ^ "]") ()
    List.rev lst

/// read XML text node and pull next element
let readXmlText (reader :> System.Xml.XmlReader) =
    let s = reader.Value
    let _ = reader.Read() // pull next element
    Text(s)

/// read XML element with one attribute (but no sub elements) and pull next element
let readXmlElemA constr attribName attribConv reader =
    let a = readXmlAttrib attribName attribConv reader
    let isEmpty = reader.IsEmptyElement
    reader.ReadStartElement()
    try
        if not isEmpty then reader.ReadEndElement()
    with :? System.Xml.XmlException as e -> error "unexpected xml node" reader
    constr(a)

/// read XML element with sub elements (but no attributes) and pull next element
let readXmlElemS constr subMapR reader =
    constr(readXmlSubs subMapR reader)

/// read XML element with attribute and sub elements and pull next element
let readXmlElemAS constr attribName attribConv subMapR reader =
    let a = readXmlAttrib attribName attribConv reader
    constr(a, readXmlSubs subMapR reader)

//==============================================================================
/// the main parser function, constructed by combining recursive element parsers
let parseDocument =
    // s2x are helper functions converting attribute strings to the respective values
    let s2s (x : string) = x
    let s2i (x : string) = Int32.of_string x

    // name of tag    // function for  // data type    // attrib  // s2x  // map of
    //                // reading xml   // constructor  // name    // s2x  // nested elements
    let phraseMapR = ref Map.empty
    let phraseMap =
        ["italic",    readXmlElemS  Italic                   phraseMapR;
         "bold",      readXmlElemS  Bold                     phraseMapR;
         "underline", readXmlElemS  Underline                phraseMapR;
         "anchor",    readXmlElemA  Anchor    "id"  s2s                ;
         "link",      readXmlElemAS Link      "ref" s2s      phraseMapR;
         "TEXT",      readXmlText                                      ;
        ] |> Map.of_list
    // the reference assignment in the following line lets all the map refs in
    phraseMapR := phraseMap; // the curried functions point back to phraseMap

    let listitemMapR = ref Map.empty
    let listitemMap =
        ["listitem",      readXmlElemS ListItem            phraseMapR;
         // nested lists map to the constructor of the same kind
         "orderedlist",   readXmlElemS NestedOrderedList   listitemMapR;
         "unorderedlist", readXmlElemS NestedUnorderedList listitemMapR;
        ] |> Map.of_list
    listitemMapR := listitemMap;

    let blockMapR = ref Map.empty
    let blockMap =
        ["section",       readXmlElemAS Section       "id"    s2s blockMapR;
         "heading",       readXmlElemAS Heading       "level" s2i phraseMapR;
         "orderedlist",   readXmlElemS  OrderedList               listitemMapR;
         "unorderedlist", readXmlElemS  UnorderedList             listitemMapR;
         "para",          readXmlElemS  Paragraph                 phraseMapR;
        ] |> Map.of_list
    blockMapR := blockMap;

    readXmlElemS Document blockMapR // return a curried function

//==============================================================================
/// convert Document to HTML string
let toHtmlString (Document blocks) =
    // a StringBuilder is used for efficiently building a large string from many small ones
    let sb = new System.Text.StringBuilder();
    let print (s : string) = ignore (sb.Append(s))
    let sectionLevel = ref 1 // the current section level
    let i2s i = Int32.to_string i
    let printSym tag f arg =
        print ("<" ^ tag ^ ">"); f arg; print ("</" ^ tag ^ ">")

    let rec printPhrases pList =
        for p in pList do
            match p with
            | Italic(lst)    -> printSym "i" printPhrases lst
            | Bold(lst)      -> printSym "b" printPhrases lst
            | Underline(lst) -> printSym "u" printPhrases lst
            | Anchor(id)     -> print ("<a name=\"" ^ id ^ "\"></a>")
            | Link(id, lst)  ->
                print ("<a href=\"#" ^ id ^ "\">")
                printPhrases lst
                print "</a>"
            | Text(s)        -> print s;

    let rec printListItems liList =
        for li in liList do
            match li with
            | ListItem(lst)            -> printSym "li" printPhrases lst
            | NestedOrderedList(lst)   -> printSym "ol" printListItems lst
            | NestedUnorderedList(lst) -> printSym "ul" printListItems lst

    let rec printBlocks bList =
        for b in bList do
            match b with
            | Section(id, lst) ->
                let slev = i2s !sectionLevel
                sectionLevel := !sectionLevel + 1
                print ""
                printSym ("h" ^ slev) print id
                printBlocks lst
                print ""
                sectionLevel := !sectionLevel - 1
            | Heading(lev, lst)  -> printSym ("h" ^ i2s lev) printPhrases lst
            | OrderedList(lst)   -> printSym "ol" printListItems lst
            | UnorderedList(lst) -> printSym "ul" printListItems lst
            | Paragraph(lst)     -> printSym "p" printPhrases lst

    print "<html><body>"
    printBlocks blocks // print document into StringBuilder sb
    print "</body></html>"
    sb.ToString()

//==============================================================================
let test xmldoc =
    // initialize reader and scan to beginning of document
    let reader = System.Xml.XmlReader.Create(new System.IO.StringReader(xmldoc))
    if (not (reader.Read())) || reader.Name <> "document" then invalid_arg "invalid start tag"
    let st = parseDocument reader // parse document into syntax tree
    // print_any st               // print representation of syntax tree
    let html = toHtmlString st
    print_string (html ^ "\n")    // print document as html

test xmldoc