Generic

The fs2-data-csv-generic module provides automatic (Scala 2-only) and semi-automatic derivation for RowDecoder and CsvRowDecoder.

It makes it easier to support custom row types. On Scala 2, it is based on shapeless, which can have a significant impact on compilation time. On Scala 3, it relies on a mix of hand-written derivation on top of scala.deriving.Mirror and the more lightweight shapeless-3, so compile times shouldn't be as problematic as on Scala 2. Note that automatic derivation is not yet supported on Scala 3, and neither are default constructor arguments of case classes (for background, see dotty#11667).

To demonstrate how it works, let's work again with the CSV data from the core module documentation.

import fs2._
import fs2.data.csv._

val input = """i,s,j
              |1,test,2
              |,other,-3
              |""".stripMargin
// input: String = """i,s,j
// 1,test,2
// ,other,-3
// """

val stream = Stream.emit(input).covary[Fallible]
// stream: Stream[Fallible, String] = Stream(..)

Derivation of CellDecoder & CellEncoder

Cell types (Int, String, ...) can be decoded and encoded by providing implicit instances of CellDecoder/CellEncoder. Instances for primitives and common types are defined already. You can easily define your own or use generic derivation for coproducts:

import fs2.data.csv.generic._
import fs2.data.csv.generic.semiauto._

sealed trait State
object State {
  case object On extends State
  case object Off extends State
}

implicit val stateDecoder: CellDecoder[State] = deriveCellDecoder[State]
// stateDecoder: CellDecoder[State] = fs2.data.csv.generic.internal.DerivedCellDecoder$$anonfun$coproductDecoder$3@65aca825
// use stateDecoder to derive decoders for rows...or just test:
stateDecoder("On")
// res1: DecoderResult[State] = Right(value = On)
stateDecoder("Off")
// res2: DecoderResult[State] = Right(value = Off)

// same goes for the encoder
implicit val stateEncoder: CellEncoder[State] = deriveCellEncoder[State]
// stateEncoder: CellEncoder[State] = fs2.data.csv.generic.internal.DerivedCellEncoder$$anonfun$coproductEncoder$3@6c022db9
stateEncoder(State.On)
// res3: String = "On"

The generic derivation for cell decoders also supports renaming and deriving instances for unary product types (case classes with one field):

import fs2.data.csv.generic.semiauto._

sealed trait Advanced
object Advanced {
  @CsvValue("Active") case object On extends Advanced
  case class Unknown(name: String) extends Advanced
}

// works as we have an implicit CellDecoder[String]
implicit val unknownDecoder: CellDecoder[Advanced.Unknown] = deriveCellDecoder[Advanced.Unknown]
// unknownDecoder: CellDecoder[Advanced.Unknown] = fs2.data.csv.generic.internal.DerivedCellDecoder$$anonfun$unaryProductDecoder$3@5b064f54
implicit val advancedDecoder: CellDecoder[Advanced] = deriveCellDecoder[Advanced]
// advancedDecoder: CellDecoder[Advanced] = fs2.data.csv.generic.internal.DerivedCellDecoder$$anonfun$coproductDecoder$3@1bcd2a51

advancedDecoder("Active")
// res4: DecoderResult[Advanced] = Right(value = On)
advancedDecoder("Off")
// res5: DecoderResult[Advanced] = Right(value = Unknown(name = "Off"))

implicit val unknownEncoder: CellEncoder[Advanced.Unknown] = deriveCellEncoder[Advanced.Unknown]
// unknownEncoder: CellEncoder[Advanced.Unknown] = fs2.data.csv.generic.internal.DerivedCellEncoder$$anonfun$unaryProductEncoder$3@424b0f7d
implicit val advancedEncoder: CellEncoder[Advanced] = deriveCellEncoder[Advanced]
// advancedEncoder: CellEncoder[Advanced] = fs2.data.csv.generic.internal.DerivedCellEncoder$$anonfun$coproductEncoder$3@6e0572f3

advancedEncoder(Advanced.On)
// res6: String = "Active"
advancedEncoder(Advanced.Unknown("Off"))
// res7: String = "Off"

Derivation of RowDecoder & RowEncoder

One can automatically derive an instance for a shapeless HList if there are instances for all cell types. The example previously written manually now looks like:

import shapeless._
import fs2.data.csv.generic.hlist._

// decodeSkippingHeaders drops the header line
val hlists = stream.through(decodeSkippingHeaders[Option[Int] :: String :: Int :: HNil]())
// hlists: Stream[[x]Fallible[x], Option[Int] :: String :: Int :: HNil] = Stream(..)
hlists.compile.toList
// res8: Either[Throwable, List[Option[Int] :: String :: Int :: HNil]] = Right(
//   value = List(
//     Some(value = 1) :: "test" :: 2 :: HNil,
//     None :: "other" :: -3 :: HNil
//   )
// )

Derivation of CsvRowDecoder

Let's say you want to decode the CSV row to the following case class:

case class MyRow(i: Option[Int], j: Int, s: String)

You can get an automatically derived CsvRowDecoder (and a matching CsvRowEncoder) for every case class by importing fs2.data.csv.generic.auto._

import fs2.data.csv.generic.auto._

val roundtrip = stream.through(decodeUsingHeaders[MyRow]())
  // and back - note that the types and corresponding instances are all inferred
  .through(encodeUsingFirstHeaders(fullRows = true))
// roundtrip: Stream[[x]Fallible[x], String] = Stream(..)
roundtrip.compile.string
// res9: Either[Throwable, String] = Right(
//   value = """i,j,s
// 1,2,test
// ,-3,other
// """
// )

Automatic derivation can be quite slow at compile time, so you might want to opt for semi-automatic derivation. In this case, you need to explicitly define the implicit instance in scope.

import fs2.data.csv.generic.semiauto._

implicit val MyRowDecoder: CsvRowDecoder[MyRow, String] = deriveCsvRowDecoder[MyRow]
// MyRowDecoder: CsvRowDecoder[MyRow, String] = fs2.data.csv.generic.internal.DerivedCsvRowDecoder$$anon$1@6f41cf1a

val decoded = stream.through(decodeUsingHeaders[MyRow]())
// decoded: Stream[[x]Fallible[x], MyRow] = Stream(..)
decoded.compile.toList
// res10: Either[Throwable, List[MyRow]] = Right(
//   value = List(
//     MyRow(i = Some(value = 1), j = 2, s = "test"),
//     MyRow(i = None, j = -3, s = "other")
//   )
// )

On Scala 2, both automatic and semi-automatic decoders also support default values when decoding, so instead of an Option[Int] for i, you can define this class:

import fs2.data.csv.generic.auto._

case class MyRowDefault(i: Int = 42, j: Int, s: String)

val decoded = stream.through(decodeUsingHeaders[MyRowDefault]())
// decoded: Stream[[x]Fallible[x], MyRowDefault] = Stream(..)
decoded.compile.toList
// res11: Either[Throwable, List[MyRowDefault]] = Right(
//   value = List(
//     MyRowDefault(i = 1, j = 2, s = "test"),
//     MyRowDefault(i = 42, j = -3, s = "other")
//   )
// )

It's important to note that, due to the limitations of the CSV file format, there's no clear notion of when default values should apply. fs2-data-csv-generic treats a value as missing if there's no column with the expected name or if the value is empty. This implies that cells with an empty value won't be parsed if there's a default present, even if the corresponding CellDecoder instance could handle empty input, like CellDecoder[String]. If you need to handle empty inputs explicitly, refrain from defining a (non-empty) default or define the CsvRowDecoder instance manually.
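
To make the rule above concrete, here is a small model of it in plain Scala, using only the standard library. It is a sketch of the described behaviour, not the library's implementation: the names DefaultRuleSketch, resolveCell and parseInt are hypothetical, and the handling of cells without a default (an empty cell is still passed to the decoder, an absent column is an error) is an assumption made for illustration.

```scala
import scala.util.Try

// Simplified model of the rule described above, NOT the library's implementation.
// All names (DefaultRuleSketch, resolveCell, parseInt) are hypothetical.
object DefaultRuleSketch {
  type Result[A] = Either[String, A]

  def parseInt(s: String): Result[Int] =
    Try(s.toInt).toOption.toRight(s"not an Int: '$s'")

  def resolveCell[A](row: Map[String, String], column: String, default: Option[A])(
      parse: String => Result[A]): Result[A] =
    row.get(column) match {
      // Present and non-empty: the cell is decoded as usual.
      case Some(value) if value.nonEmpty => parse(value)
      // Absent column or empty cell: treated as missing.
      case cell =>
        default match {
          // A default wins and `parse` is never called.
          case Some(d) => Right(d)
          // Assumption: without a default, an empty cell still reaches the
          // decoder, and an absent column is reported as an error.
          case None => cell.map(parse).getOrElse(Left(s"unknown column '$column'"))
        }
    }

  // The second data row of the running example: i and s are empty here.
  val row = Map("i" -> "", "s" -> "", "j" -> "-3")

  val emptyIntWithDefault = resolveCell(row, "i", Some(42))(parseInt)
  val emptyStringWithDefault = resolveCell(row, "s", Some("n/a"))(s => Right(s))
  val emptyStringNoDefault = resolveCell[String](row, "s", None)(s => Right(s))
  val absentWithDefault = resolveCell(row, "k", Some(0))(parseInt)
}
```

In this model, emptyStringWithDefault yields the default "n/a" even though a String parser would happily accept the empty string, whereas emptyStringNoDefault yields the empty string itself. This is the trade-off to keep in mind when choosing between a default value and explicit handling of empty input.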