Generic
The fs2-data-csv-generic module provides automatic (Scala 2-only) and semi-automatic derivation for RowDecoder and CsvRowDecoder.
It makes it easier to support custom row types but is based on shapeless, which can have a significant impact on compilation time on Scala 2. On Scala 3, it relies on a mix of hand-written derivation on top of scala.deriving.Mirror and the more lightweight shapeless-3, so compile times shouldn't be as problematic as on Scala 2. Note that automatic derivation is not yet supported on Scala 3, and neither are default constructor arguments of case classes (for background see dotty#11667).
To demonstrate how it works, let's work again with the CSV data from the core module documentation.
import fs2._
import fs2.data.csv._
val input = """i,s,j
|1,test,2
|,other,-3
|""".stripMargin
// input: String = """i,s,j
// 1,test,2
// ,other,-3
// """
val stream = Stream.emit(input).covary[Fallible]
// stream: Stream[Fallible, String] = Stream(..)
Derivation of CellDecoder & CellEncoder
Cell types (Int, String, ...) can be decoded and encoded by providing implicit instances of CellDecoder/CellEncoder. Instances for primitives and common types are already defined. You can easily define your own or use generic derivation for coproducts:
import fs2.data.csv.generic._
import fs2.data.csv.generic.semiauto._
sealed trait State
object State {
case object On extends State
case object Off extends State
}
implicit val stateDecoder: CellDecoder[State] = deriveCellDecoder[State]
// stateDecoder: CellDecoder[State] = fs2.data.csv.generic.internal.DerivedCellDecoder$$anonfun$coproductDecoder$3@380105ee
// use stateDecoder to derive decoders for rows...or just test:
stateDecoder("On")
// res1: DecoderResult[State] = Right(value = On)
stateDecoder("Off")
// res2: DecoderResult[State] = Right(value = Off)
// same goes for the encoder
implicit val stateEncoder: CellEncoder[State] = deriveCellEncoder[State]
// stateEncoder: CellEncoder[State] = fs2.data.csv.generic.internal.DerivedCellEncoder$$anonfun$coproductEncoder$3@4166e481
stateEncoder(State.On)
// res3: String = "On"
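For comparison, the "define your own" route could look roughly like the hand-written sketch below. It repeats the State definition so that it stands on its own. It assumes that CellDecoder and CellEncoder each have a single abstract apply method (as the calls above suggest) and that DecoderError can be constructed from a message string. The object name ManualStateCodecs and its members are hypothetical.

```scala
import fs2.data.csv._

sealed trait State
object State {
  case object On extends State
  case object Off extends State
}

object ManualStateCodecs {
  // Hand-written counterpart to deriveCellDecoder[State]:
  // match the cell text against the case object names.
  val decoder: CellDecoder[State] = new CellDecoder[State] {
    def apply(cell: String): DecoderResult[State] = cell match {
      case "On"  => Right(State.On)
      case "Off" => Right(State.Off)
      // Assumption: DecoderError takes the message as its first argument.
      case other => Left(new DecoderError(s"Unknown State value: '$other'"))
    }
  }

  // Hand-written counterpart to deriveCellEncoder[State].
  val encoder: CellEncoder[State] = new CellEncoder[State] {
    def apply(state: State): String = state match {
      case State.On  => "On"
      case State.Off => "Off"
    }
  }
}
```

Each new case object needs a new branch in both instances, which is the bookkeeping that the derived versions take care of.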
The generic derivation for cell decoders also supports renaming and deriving instances for unary product types (case classes with one field):
import fs2.data.csv.generic.semiauto._
sealed trait Advanced
object Advanced {
@CsvValue("Active") case object On extends Advanced
case class Unknown(name: String) extends Advanced
}
// works as we have an implicit CellDecoder[String]
implicit val unknownDecoder: CellDecoder[Advanced.Unknown] = deriveCellDecoder[Advanced.Unknown]
// unknownDecoder: CellDecoder[Advanced.Unknown] = fs2.data.csv.generic.internal.DerivedCellDecoder$$anonfun$unaryProductDecoder$3@3bc7f8b5
implicit val advancedDecoder: CellDecoder[Advanced] = deriveCellDecoder[Advanced]
// advancedDecoder: CellDecoder[Advanced] = fs2.data.csv.generic.internal.DerivedCellDecoder$$anonfun$coproductDecoder$3@4a98019c
advancedDecoder("Active")
// res4: DecoderResult[Advanced] = Right(value = On)
advancedDecoder("Off")
// res5: DecoderResult[Advanced] = Right(value = Unknown(name = "Off"))
implicit val unknownEncoder: CellEncoder[Advanced.Unknown] = deriveCellEncoder[Advanced.Unknown]
// unknownEncoder: CellEncoder[Advanced.Unknown] = fs2.data.csv.generic.internal.DerivedCellEncoder$$anonfun$unaryProductEncoder$3@6b386898
implicit val advancedEncoder: CellEncoder[Advanced] = deriveCellEncoder[Advanced]
// advancedEncoder: CellEncoder[Advanced] = fs2.data.csv.generic.internal.DerivedCellEncoder$$anonfun$coproductEncoder$3@64a1f881
advancedEncoder(Advanced.On)
// res6: String = "Active"
advancedEncoder(Advanced.Unknown("Off"))
// res7: String = "Off"
Derivation of RowDecoder & RowEncoder
One can automatically derive an instance for a shapeless HList if there are instances for all cell types. The example previously written manually now looks like:
import shapeless._
import fs2.data.csv.generic.hlist._
// decodeSkippingHeaders drops the header line
val hlists = stream.through(decodeSkippingHeaders[Option[Int] :: String :: Int :: HNil]())
// hlists: Stream[[x]Fallible[x], Option[Int] :: String :: Int :: HNil] = Stream(..)
hlists.compile.toList
// res8: Either[Throwable, List[Option[Int] :: String :: Int :: HNil]] = Right(
// value = List(
// Some(value = 1) :: "test" :: 2 :: HNil,
// None :: "other" :: -3 :: HNil
// )
// )
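Decoded HList rows can be taken apart with ordinary pattern matching. The sketch below assumes shapeless 2 (so Scala 2) and uses a hypothetical helper object, HListRows, to turn one decoded row into a tuple.

```scala
import shapeless._

object HListRows {
  // The row type used in the decoding example above.
  type Decoded = Option[Int] :: String :: Int :: HNil

  // Destructure a row with the `::` extractor that shapeless provides.
  def toTuple(row: Decoded): (Option[Int], String, Int) = row match {
    case i :: s :: j :: HNil => (i, s, j)
  }
}
```

With this helper, hlists.map(HListRows.toTuple) would yield a stream of tuples instead of HLists.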
Derivation of CsvRowDecoder
Let's say you want to decode the CSV row to the following case class:
case class MyRow(i: Option[Int], j: Int, s: String)
You can get an automatically derived CsvRowDecoder (and a matching CsvRowEncoder) for every case class by importing fs2.data.csv.generic.auto._:
import fs2.data.csv.generic.auto._
val roundtrip = stream.through(decodeUsingHeaders[MyRow]())
// and back - note that the types and the corresponding instances are all inferred
.through(encodeUsingFirstHeaders(fullRows = true))
// roundtrip: Stream[[x]Fallible[x], String] = Stream(..)
roundtrip.compile.string
// res9: Either[Throwable, String] = Right(
// value = """i,j,s
// 1,2,test
// ,-3,other
// """
// )
Automatic derivation can be quite slow at compile time, so you might want to opt for semi-automatic derivation. In this case, you need to explicitly define the implicit instance in scope.
import fs2.data.csv.generic.semiauto._
implicit val MyRowDecoder: CsvRowDecoder[MyRow, String] = deriveCsvRowDecoder[MyRow]
// MyRowDecoder: CsvRowDecoder[MyRow, String] = fs2.data.csv.generic.internal.DerivedCsvRowDecoder$$anon$1@51c5194d
val decoded = stream.through(decodeUsingHeaders[MyRow]())
// decoded: Stream[[x]Fallible[x], MyRow] = Stream(..)
decoded.compile.toList
// res10: Either[Throwable, List[MyRow]] = Right(
// value = List(
// MyRow(i = Some(value = 1), j = 2, s = "test"),
// MyRow(i = None, j = -3, s = "other")
// )
// )
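To make clear what the derivation replaces, here is a hand-written decoder for the same case class, as a sketch only. It assumes that CsvRow offers an as[T](header) method returning a DecoderResult[T] and that a CellDecoder[Option[Int]] is available (which the HList example above relies on as well). The object name ManualMyRow is hypothetical, and MyRow is repeated so that the block stands on its own.

```scala
import fs2.data.csv._

case class MyRow(i: Option[Int], j: Int, s: String)

object ManualMyRow {
  // Look up each field by header name and decode it with its CellDecoder.
  // Assumption: CsvRow exposes `as[T](header)` returning a DecoderResult[T].
  val decoder: CsvRowDecoder[MyRow, String] = new CsvRowDecoder[MyRow, String] {
    def apply(row: CsvRow[String]): DecoderResult[MyRow] =
      for {
        i <- row.as[Option[Int]]("i")
        j <- row.as[Int]("j")
        s <- row.as[String]("s")
      } yield MyRow(i, j, s)
  }
}
```

Because fields are looked up by name, the order of the case class fields does not have to match the column order of the CSV input, just as with the derived instance.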
Both automatic and semi-automatic decoders also support default values when decoding, so instead of an Option[Int] for i, you can define this class:
import fs2.data.csv.generic.auto._
case class MyRowDefault(i: Int = 42, j: Int, s: String)
val decoded = stream.through(decodeUsingHeaders[MyRowDefault]())
// decoded: Stream[[x]Fallible[x], MyRowDefault] = Stream(..)
decoded.compile.toList
// res11: Either[Throwable, List[MyRowDefault]] = Right(
// value = List(
// MyRowDefault(i = 1, j = 2, s = "test"),
// MyRowDefault(i = 42, j = -3, s = "other")
// )
// )
It's important to note that, due to the limitations of the CSV file format, there's no clear notion of when default values should apply. fs2-data-csv-generic treats a value as missing if there's no column with the expected name or if the cell is empty. This implies that cells with an empty value won't be parsed if there's a default present, even if the corresponding CellDecoder instance could handle empty input, like CellDecoder[String]. If you need to handle empty inputs explicitly, refrain from defining a (non-empty) default or define the CsvRowDecoder instance manually.
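The rule can be illustrated with a small toy model in plain Scala. This is not the library's implementation: DefaultSemantics and resolve are hypothetical names, a row is modelled as a Map from column name to cell text, and the error for a missing column without a default is an assumption.

```scala
object DefaultSemantics {
  // Toy model of the rule described above, NOT the library's code:
  // a value counts as missing when its column is absent or its cell is empty.
  def resolve[A](row: Map[String, String], column: String, default: Option[A])(
      decode: String => Either[String, A]): Either[String, A] =
    (row.get(column), default) match {
      // Non-empty cell: always handed to the decoder.
      case (Some(cell), _) if cell.nonEmpty => decode(cell)
      // Absent column or empty cell with a default: the default wins.
      case (_, Some(d)) => Right(d)
      // Empty cell without a default: the decoder sees the empty string.
      case (Some(empty), None) => decode(empty)
      // Absent column without a default (assumed to be an error).
      case (None, None) => Left(s"missing column: $column")
    }
}

object DefaultSemanticsDemo {
  import DefaultSemantics._

  // A decoder that accepts any input, including the empty string.
  val asString: String => Either[String, String] = Right(_)

  // The default replaces the empty cell although the decoder could handle it.
  val withDefault = resolve[String](Map("s" -> ""), "s", Some("n/a"))(asString)
  // withDefault == Right("n/a")

  // Without a default, the empty string reaches the decoder.
  val withoutDefault = resolve[String](Map("s" -> ""), "s", None)(asString)
  // withoutDefault == Right("")
}
```

The two results show why dropping the default is the way to let a decoder see empty cells.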