Book contents
- Frontmatter
- Contents
- Preface
- 1 Getting started
- 2 Values, operators, expressions and functions
- 3 Tuples, records and tagged values
- 4 Lists
- 5 Collections: Lists, maps and sets
- 6 Finite trees
- 7 Modules
- 8 Imperative features
- 9 Efficiency
- 10 Text processing programs
- 11 Sequences
- 12 Computation expressions
- 13 Asynchronous and parallel computations
- Appendix A Programs from the keyword example
- Appendix B The TextProcessing library
- Appendix C The dialogue program from Chapter 13
- References
- Index
10 - Text processing programs
Published online by Cambridge University Press: 05 May 2013
Summary
Processing text files containing structured data is a common problem in programming: think of analysing any kind of textual data generated by electronic equipment or retrieved from the web.
In this chapter we show how such programs can be written in a systematic and elegant way using F# and the .NET library. Data are extracted from text files using functions from the RegularExpressions library. The extracted data are then processed with a systematic use of the F# collection types list<'a>, Map<'a,'b> and Set<'a>. Easy access from F# programs to the extensive text processing features of the .NET library is provided by a special TextProcessing library that can be copied from the home page of the book. The chapter centers on a real-world example illustrating the techniques.
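As a rough illustration of the approach described above, the following minimal sketch extracts data from text lines with a regular expression and stores the results in F# collections. It calls the .NET System.Text.RegularExpressions classes directly rather than the book's TextProcessing library, and the line format ("name integer") and the names lineRegex, parseLine, addReading and collect are assumptions made up for this example.

```fsharp
open System.Text.RegularExpressions

// Assumed (hypothetical) line format: "<name> <integer>", e.g. "sensorA 42".
let lineRegex = Regex(@"^\s*(\w+)\s+(-?\d+)\s*$")

// Return Some (name, value) when the line matches the format, otherwise None.
let parseLine (line: string) : (string * int) option =
    let m = lineRegex.Match line
    if m.Success then
        Some (m.Groups.[1].Value, int m.Groups.[2].Value)
    else
        None

// Add one reading to the set of values already recorded under its name.
let addReading (acc: Map<string, Set<int>>) (name, value) =
    let old = defaultArg (Map.tryFind name acc) Set.empty
    Map.add name (Set.add value old) acc

// Parse all lines, skip the ones that do not match, and collect the rest.
let collect (lines: string list) : Map<string, Set<int>> =
    lines
    |> List.choose parseLine
    |> List.fold addReading Map.empty
```

For instance, collect ["a 1"; "b 2"; "a 3"; "garbage"] maps "a" to the set containing 1 and 3 and "b" to the set containing 2; the non-matching line is skipped.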
Time performance of programs is always a problem, even with today's very fast computers. Poor performance of text processing programs is often caused by operations on very long strings. The method in this chapter uses three strategies to avoid using very long strings:
1. Text input is in most cases read and processed in small pieces (one or a few lines).
2. Text is generated and written in small pieces.
3. Large amounts of internal program data are stored in many small pieces in F# collections like list, set or map.
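The three strategies can be sketched roughly as follows. This is only an illustration under stated assumptions, not the chapter's own program: the word-counting task and the names bump, countWords and writeCounts are hypothetical, and the code uses only standard System.IO calls.

```fsharp
open System
open System.IO

// Strategy 3: keep program data in many small pieces in a Map,
// here one counter per word, instead of in one long string.
let bump (acc: Map<string, int>) (word: string) : Map<string, int> =
    Map.add word (1 + defaultArg (Map.tryFind word acc) 0) acc

// Strategy 1: File.ReadLines yields the input lazily, one line at a time,
// so the whole file is never held in a single string.
let countWords (path: string) : Map<string, int> =
    File.ReadLines path
    |> Seq.collect (fun line ->
        line.Split([| ' '; '\t' |], StringSplitOptions.RemoveEmptyEntries))
    |> Seq.fold bump Map.empty

// Strategy 2: write the output one short line at a time.
let writeCounts (path: string) (counts: Map<string, int>) : unit =
    use writer = new StreamWriter(path)
    for KeyValue (word, n) in counts do
        writer.WriteLine(sprintf "%s %d" word n)
```

The output is never assembled as one large string; each line is formatted and handed to the writer separately, and the Map is traversed in key order.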
- Type: Chapter
- Information: Functional Programming Using F#, pp. 219-250. Publisher: Cambridge University Press. Print publication year: 2013