❄️ The Icicle Streaming Query Language ❄️

Refutation of Facts#

This document describes how Icicle resovles fact conflicts, e.g. from duplicate facts.

Keys#

Each fact may be defined with a refutation expression, which acts as a nubBy key on the facts stream.

The key values are assumed to be monotonic.

Priorities and Keys#

Facts are nubbed as first-write-wins, since we assume:

  • Facts with higher priorities will be ingested first.
  • No fact can have the same priority, key, and value(s).

Examples#

No refutation#

This is the default. Facts are consumed as-is. Duplicate ingestion will result in duplicate facts.

Time#

[fact.keyed_by_time]
  key = "time"
  encoding="int"
  
bulbasaur|keyed_by_time|1|2014-01-01
bulbasaur|keyed_by_time|3|2014-01-01
bulbasaur|keyed_by_time|5|2016-01-01
charmander|keyed_by_time|0|2014-01-01
charmander|keyed_by_time|2|2014-01-01
> feature keyed_by_time ~> count value
[bulbasaur, 2,charmander, 1]

> feature keyed_by_time ~> newest value
[bulbasaur, 5,charmander, 0]

> feature keyed_by_time ~> group (year_of time) ~> newest value
[bulbasaur, [(2014,1),(2016,5)],charmander, [(2014,0)]]

Expression: year_of time#

[fact.keyed_by_year_of_time]
  key = "year_of time"
  encoding="string"

blatoise|keyed_by_year_of_time|foo14|2014-01-01
blatoise|keyed_by_year_of_time|foo15|2015-01-01
blatoise|keyed_by_year_of_time|foo16|2016-01-01
blatoise|keyed_by_year_of_time|foo160302|2016-03-02
blatoise|keyed_by_year_of_time|foo160307|2016-03-07
> feature keyed_by_year_of_time ~> count value
[blatoise, 3]

> feature keyed_by_year_of_time ~> newest value
[blatoise, "foo16"]

> feature keyed_by_year_of_time ~> group (year_of time) ~> newest value
[blatoise, [(2014,"foo14"),(2015,"foo15"),(2016,"foo16")]]

Struct Field#

[fact.keyed_by_field]
  key = "transaction_id"
  encoding="(transaction_id:int,dollarydoos:int)"

zubat|keyed_by_field|{"transaction_id":0,"dollarydoos":2}|2015-01-01
zubat|keyed_by_field|{"transaction_id":3,"dollarydoos":1}|2016-01-01
zubat|keyed_by_field|{"transaction_id":3,"dollarydoos":9}|2016-01-05
> feature keyed_by_field ~> count dollarydoos
[zubat, 2]

> feature keyed_by_field ~> newest dollarydoos
[zubat, 1]

> feature keyed_by_field ~> group (year_of time) ~> count dollarydoos
[zubat, [(2015,1),(2016,1)]]

> feature keyed_by_field ~> group (year_of time) ~> newest dollarydoos
[zubat, [(2015,2),(2016,1)]]

Struct Field + Time#

[fact.keyed_by_field_time]
  key = "(transaction_id, time)"
  encoding="(transaction_id:int,dollarydoos:int)"

zubat|keyed_by_field_time|{"transaction_id":5,"dollarydoos":2}|2015-01-01
zubat|keyed_by_field_time|{"transaction_id":5,"dollarydoos":5}|2015-01-01
zubat|keyed_by_field_time|{"transaction_id":5,"dollarydoos":7}|2015-01-03
> feature keyed_by_field_time ~> count dollarydoos
[zubat, 2]

> feature keyed_by_field_time ~> newest dollarydoos
[zubat, 7]

> feature keyed_by_field_time ~> group (year_of time) ~> newest dollarydoos
[zubat, [(2015,7)]]

> feature keyed_by_field_time ~> group (day_of time) ~> newest dollarydoos
[zubat, [(1,2),(3,7)]]