For one reason or another, we find ourselves writing scanners and parsers quite often. Sometimes we parse a specific file format, or an expression language, or just a file name that conforms to a certain naming scheme.

One approach is to use the Scanner class from Foundation (it used to be called NSScanner). A Scanner instance stores the scanned String and a scan location, which is the position in the string. For example, scanning a single character just returns the character at the current scan location and increases the scan location.
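
As a point of comparison, here is a minimal sketch of the Scanner approach. It assumes the Swift-native scanning methods (scanString and scanCharacter), which are only available on newer OS versions (macOS 10.15 / iOS 13 and later) and in recent versions of Foundation on Linux. The input string is just an illustration.

```swift
import Foundation

let scanner = Scanner(string: "value: 3")
// By default, Scanner skips whitespace. We turn that off so that
// every character in the input has to be scanned explicitly.
scanner.charactersToBeSkipped = nil

// Each successful scan moves the scanner's position forward.
let label = scanner.scanString("value: ")
let digit = scanner.scanCharacter()

if let label = label, let digit = digit {
    print("matched \(label) then \(digit); at end: \(scanner.isAtEnd)")
}
```

Note that the scanner is a reference type with mutable state: the position lives inside the Scanner object rather than in a value we own.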

In pure Swift, there’s another type that stores a String and an offset into that String: Substring. Instead of using a scanner, we could write mutating methods on Substring. As an illustration, here are three such methods. The first scans a single character that satisfies a condition, the second scans exactly count characters, and the last scans a specific prefix:

extension Substring {
    mutating func scan(_ condition: (Element) -> Bool) -> Element? {
        guard let f = first, condition(f) else { return nil }
        return removeFirst()
    }

    mutating func scan(count: Int) -> Substring? {
        let result = prefix(count)
        guard result.count == count else { return nil }
        removeFirst(count)
        return result
    }

    mutating func scan<C>(prefix: C) -> Bool where C: Collection, C.Element == Character {
        guard starts(with: prefix) else { return false }
        removeFirst(prefix.count)
        return true
    }
}

To use this with strings, we first need to make a mutable Substring out of a String (subscripting with the unbounded range [...] gives us a Substring covering the whole string), and then we can call the scan methods:

var remainder = "value: 3"[...]
if remainder.scan(prefix: "value: "),
   let firstDigit = remainder.scan({ "0123456789".contains($0) }) {
  print(firstDigit)
}
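
To show the methods working together, here is a small sketch that parses a file name following a naming scheme. The scheme ("IMG_", four digits, ".jpg") and the function name photoNumber are made up for this example. The two scan methods it needs are repeated so that the snippet compiles on its own.

```swift
extension Substring {
    // Scans exactly `count` characters, or returns nil without consuming anything.
    mutating func scan(count: Int) -> Substring? {
        let result = prefix(count)
        guard result.count == count else { return nil }
        removeFirst(count)
        return result
    }

    // Scans the given prefix, or returns false without consuming anything.
    mutating func scan<C>(prefix: C) -> Bool where C: Collection, C.Element == Character {
        guard starts(with: prefix) else { return false }
        removeFirst(prefix.count)
        return true
    }
}

// Hypothetical naming scheme: "IMG_" + four digits + ".jpg".
func photoNumber(fromFileName name: String) -> Int? {
    var remainder = name[...]
    guard remainder.scan(prefix: "IMG_"),
          let digits = remainder.scan(count: 4),
          digits.allSatisfy({ "0123456789".contains($0) }),
          remainder.scan(prefix: ".jpg"),
          remainder.isEmpty // nothing may follow the extension
    else { return nil }
    return Int(digits)
}

print(photoNumber(fromFileName: "IMG_0042.jpg") as Any)
```

Because remainder is a local copy, a failed parse leaves the original String untouched. We simply return nil and throw the half-consumed Substring away.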

You can write a whole bunch of these scanning methods without needing an extra Scanner type. You can even write “higher-order” scanners, like this:

extension Substring {
  mutating func many<A>(until end: Character, _ f: (inout Substring) throws -> A, separator: (inout Substring) throws -> Bool) throws -> [A] {
    // ... left as an exercise
  }
}
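
If you want to compare notes on the exercise, here is one possible sketch. The semantics are our own assumptions, not a canonical answer: elements are parsed with f, separated by whatever separator accepts, until the end character shows up, and that end character is consumed as well. ScanError and parseDigitList are hypothetical names introduced for this example.

```swift
// Hypothetical error type for this sketch.
struct ScanError: Error {}

extension Substring {
    mutating func many<A>(until end: Character,
                          _ f: (inout Substring) throws -> A,
                          separator: (inout Substring) throws -> Bool) throws -> [A] {
        var result: [A] = []
        // Assumption: an immediate terminator means an empty list.
        if first == end { removeFirst(); return result }
        while true {
            result.append(try f(&self))
            // Assumption: the terminator is consumed as part of the list.
            if first == end { removeFirst(); return result }
            // Anything other than the terminator has to be a separator.
            guard try separator(&self) else { throw ScanError() }
        }
    }
}

// Usage: parse a bracketed list of single digits such as "[1,2,3]".
func parseDigitList(_ text: String) throws -> [Int] {
    var input = text[...]
    guard input.first == "[" else { throw ScanError() }
    input.removeFirst()
    return try input.many(until: "]", { (s: inout Substring) -> Int in
        guard let n = s.first?.wholeNumberValue else { throw ScanError() }
        s.removeFirst()
        return n
    }, separator: { (s: inout Substring) -> Bool in
        guard s.first == "," else { return false }
        s.removeFirst()
        return true
    })
}

print((try? parseDigitList("[1,2,3]")) as Any)
```

Other designs are just as reasonable, for example leaving the end character in place for the caller to scan, or returning nil instead of throwing.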

So far, we could have done similar things with a Scanner. However, one of the fun things about Swift is that the code we write can be far more generic! Instead of defining these methods on Substring, we can define them on any Collection that supports removeFirst. Looking at the definition of removeFirst, we learn that it exists on any collection that is its own SubSequence. This means we only have to change the extension declarations and method signatures, but not the method bodies:

extension Collection where SubSequence == Self {
    mutating func scan(_ condition: (Element) -> Bool) -> Element? {
        guard let f = first, condition(f) else { return nil }
        return removeFirst()
    }

    mutating func scan(count: Int) -> Self? {
        let result = prefix(count)
        guard result.count == count else { return nil }
        removeFirst(count)
        return result
    }
}

extension Collection where SubSequence == Self, Element: Equatable {
    mutating func scan<C>(prefix: C) -> Bool where C: Collection, C.Element == Element {
        guard starts(with: prefix) else { return false }
        removeFirst(prefix.count)
        return true
    }
}

Now we can use our scan method on many other types as well, most notably ArraySlice and Data. For example, we can use it to parse the beginning of a GIF header:

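import Foundation

// Downloads the file synchronously; fine for a quick experiment.
var t = try! Data(contentsOf: URL(string: "https://media.giphy.com/media/gw3IWyGkC0rsazTi/giphy.gif")!)[...]
guard t.scan(prefix: [71,73,70]), // "GIF" in ASCII
    let version = t.scan(count: 3), // 87a or 89a
    let width = t.scan(count: 2), // two raw bytes, little-endian
    let height = t.scan(count: 2) // two raw bytes, little-endian
    else {
    fatalError()
}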

print(version, width, height)
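
The same methods work on ArraySlice, which lets us try the idea without a network request. The following is a sketch under a few assumptions: the header bytes are hand-written for the example (they spell "GIF89a" followed by a width of 500 and a height of 281), and parseGIFHeader and littleEndianUInt16 are hypothetical helper names. GIF stores its 16-bit values least significant byte first, so the helper also turns the raw bytes into numbers.

```swift
extension Collection where SubSequence == Self {
    mutating func scan(count: Int) -> Self? {
        let result = prefix(count)
        guard result.count == count else { return nil }
        removeFirst(count)
        return result
    }
}

extension Collection where SubSequence == Self, Element: Equatable {
    mutating func scan<C>(prefix: C) -> Bool where C: Collection, C.Element == Element {
        guard starts(with: prefix) else { return false }
        removeFirst(prefix.count)
        return true
    }
}

// Combines bytes into a number, treating the first byte as the least significant.
func littleEndianUInt16(_ bytes: ArraySlice<UInt8>) -> UInt16 {
    return bytes.reversed().reduce(UInt16(0)) { ($0 << 8) | UInt16($1) }
}

func parseGIFHeader(_ header: [UInt8]) -> (version: String, width: UInt16, height: UInt16)? {
    var bytes = header[...] // ArraySlice<UInt8> is its own SubSequence
    guard bytes.scan(prefix: [71, 73, 70]),    // "GIF" in ASCII
          let version = bytes.scan(count: 3),  // "87a" or "89a"
          let width = bytes.scan(count: 2),
          let height = bytes.scan(count: 2)
    else { return nil }
    return (String(decoding: version, as: UTF8.self),
            littleEndianUInt16(width),
            littleEndianUInt16(height))
}

// Made-up sample bytes: "GIF89a", width 500 (0x01F4), height 281 (0x0119).
let sampleHeader: [UInt8] = [71, 73, 70, 56, 57, 97, 0xF4, 0x01, 0x19, 0x01]
if let header = parseGIFHeader(sampleHeader) {
    print(header.version, header.width, header.height)
}
```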

For further inspiration, see this gist by Michael Ilseman.

In Swift Talk Episode 78 (a public episode), we show how to work with Swift’s String and Substring types by writing a simple CSV parser.

To support our work you can subscribe, or give someone a gift.


