❄️ The Icicle Streaming Query Language ❄️
Refutation of Facts#
This document describes how Icicle resolves fact conflicts, e.g. those arising from duplicate facts.
Keys#
Each fact may be defined with a refutation expression, which acts as a nubBy key on the fact stream: of the facts that share a key value, only one is kept.
The key values are assumed to be monotonic.
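As a rough illustration of nubbing a stream by key, here is a minimal Python sketch. It is not Icicle's implementation; `nub_by_key` and the `(value, time)` tuple layout are hypothetical names chosen for this example.

```python
from typing import Callable, Hashable, Iterable, List, TypeVar

T = TypeVar("T")

def nub_by_key(facts: Iterable[T], key: Callable[[T], Hashable]) -> List[T]:
    """Keep only the first fact seen for each key value."""
    seen = set()
    kept = []
    for fact in facts:
        k = key(fact)
        if k not in seen:
            seen.add(k)
            kept.append(fact)
    return kept

# Hypothetical facts as (value, time) pairs, keyed by time.
facts = [(1, "2014-01-01"), (3, "2014-01-01"), (5, "2016-01-01")]
deduped = nub_by_key(facts, key=lambda f: f[1])
print(deduped)  # [(1, '2014-01-01'), (5, '2016-01-01')]
```

When key values are monotonic, facts with equal keys arrive next to each other, so remembering only the most recent key would be enough. The set is used here only to keep the sketch general.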
Priorities and Keys#
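Facts are nubbed first-write-wins: the first fact ingested for a key is kept, and later facts with the same key are dropped. This works because we assume:
- Facts with higher priorities will be ingested first.
- No two facts have the same priority, key, and value(s).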
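The interaction between priority and key can be sketched in Python as follows. This is an illustration under stated assumptions, not Icicle's ingestion code: the `Fact` type, the numeric `priority` field (larger meaning higher priority), and the `ingest` function are all hypothetical, and the sort merely stands in for the assumed ingestion order.

```python
from typing import List, NamedTuple

class Fact(NamedTuple):
    key: str       # value of the refutation expression
    value: int
    priority: int  # hypothetical: larger number = higher priority

def ingest(facts: List[Fact]) -> List[Fact]:
    # Stand-in for the assumed ingestion order: keys ascending,
    # and within a key, higher priorities first.
    ordered = sorted(facts, key=lambda f: (f.key, -f.priority))
    seen = set()
    kept = []
    for fact in ordered:
        if fact.key not in seen:  # first write for this key wins
            seen.add(fact.key)
            kept.append(fact)
    return kept

facts = [
    Fact(key="2014-01-01", value=3, priority=1),
    Fact(key="2014-01-01", value=1, priority=2),
    Fact(key="2016-01-01", value=5, priority=1),
]
winners = ingest(facts)
print([f.value for f in winners])  # [1, 5]
```

Because the higher-priority fact for a key is seen first, first-write-wins keeps it and drops the lower-priority duplicate.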
Examples#
No refutation#
This is the default. Facts are consumed as-is. Duplicate ingestion will result in duplicate facts.
Time#
[fact.keyed_by_time]
key = "time"
encoding = "int"
bulbasaur|keyed_by_time|1|2014-01-01
bulbasaur|keyed_by_time|3|2014-01-01
bulbasaur|keyed_by_time|5|2016-01-01
charmander|keyed_by_time|0|2014-01-01
charmander|keyed_by_time|2|2014-01-01
> feature keyed_by_time ~> count value
[bulbasaur, 2,charmander, 1]
> feature keyed_by_time ~> newest value
[bulbasaur, 5,charmander, 0]
> feature keyed_by_time ~> group (year_of time) ~> newest value
[bulbasaur, [(2014,1),(2016,5)],charmander, [(2014,0)]]
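The results above can be reproduced outside Icicle with a small Python simulation. This is a sketch under assumptions: facts are deduplicated per entity on the `time` column, `newest` is taken to mean the last surviving fact in stream order, and the helper name `dedupe` is hypothetical.

```python
from collections import defaultdict

# Facts from the example, in the form entity|attribute|value|time.
lines = [
    "bulbasaur|keyed_by_time|1|2014-01-01",
    "bulbasaur|keyed_by_time|3|2014-01-01",
    "bulbasaur|keyed_by_time|5|2016-01-01",
    "charmander|keyed_by_time|0|2014-01-01",
    "charmander|keyed_by_time|2|2014-01-01",
]

def dedupe(rows):
    """First-write-wins per (entity, time)."""
    seen = set()
    kept = defaultdict(list)
    for row in rows:
        entity, _attribute, value, time = row.split("|")
        if (entity, time) not in seen:
            seen.add((entity, time))
            kept[entity].append((int(value), time))
    return dict(kept)

kept = dedupe(lines)
counts = {e: len(fs) for e, fs in kept.items()}
newest = {e: fs[-1][0] for e, fs in kept.items()}
# Later facts overwrite earlier ones, leaving the newest value per year.
by_year = {e: {int(t[:4]): v for v, t in fs} for e, fs in kept.items()}

print(counts)   # {'bulbasaur': 2, 'charmander': 1}
print(newest)   # {'bulbasaur': 5, 'charmander': 0}
print(by_year)  # {'bulbasaur': {2014: 1, 2016: 5}, 'charmander': {2014: 0}}
```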
Expression: year_of time#
[fact.keyed_by_year_of_time]
key = "year_of time"
encoding = "string"
blatoise|keyed_by_year_of_time|foo14|2014-01-01
blatoise|keyed_by_year_of_time|foo15|2015-01-01
blatoise|keyed_by_year_of_time|foo16|2016-01-01
blatoise|keyed_by_year_of_time|foo160302|2016-03-02
blatoise|keyed_by_year_of_time|foo160307|2016-03-07
> feature keyed_by_year_of_time ~> count value
[blatoise, 3]
> feature keyed_by_year_of_time ~> newest value
[blatoise, "foo16"]
> feature keyed_by_year_of_time ~> group (year_of time) ~> newest value
[blatoise, [(2014,"foo14"),(2015,"foo15"),(2016,"foo16")]]
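When the key is a derived expression, facts with different times can still collide. The Python sketch below illustrates this for the example above. It assumes `year_of` extracts the calendar year from an ISO date; the Python function of the same name is a hypothetical stand-in for the Icicle expression.

```python
from datetime import date

def year_of(time: str) -> int:
    # Hypothetical stand-in for Icicle's year_of.
    return date.fromisoformat(time).year

# (value, time) pairs for the single entity in the example.
facts = [
    ("foo14", "2014-01-01"),
    ("foo15", "2015-01-01"),
    ("foo16", "2016-01-01"),
    ("foo160302", "2016-03-02"),
    ("foo160307", "2016-03-07"),
]

seen_years = set()
kept = []
for value, time in facts:
    year = year_of(time)
    if year not in seen_years:  # first fact for each year wins
        seen_years.add(year)
        kept.append((value, time))

print(len(kept))    # 3
print(kept[-1][0])  # foo16
```

The two March 2016 facts have distinct times, but they map to the same key as the January 2016 fact and are therefore dropped.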
Struct Field#
[fact.keyed_by_field]
key = "transaction_id"
encoding = "(transaction_id:int,dollarydoos:int)"
zubat|keyed_by_field|{"transaction_id":0,"dollarydoos":2}|2015-01-01
zubat|keyed_by_field|{"transaction_id":3,"dollarydoos":1}|2016-01-01
zubat|keyed_by_field|{"transaction_id":3,"dollarydoos":9}|2016-01-05
> feature keyed_by_field ~> count dollarydoos
[zubat, 2]
> feature keyed_by_field ~> newest dollarydoos
[zubat, 1]
> feature keyed_by_field ~> group (year_of time) ~> count dollarydoos
[zubat, [(2015,1),(2016,1)]]
> feature keyed_by_field ~> group (year_of time) ~> newest dollarydoos
[zubat, [(2015,2),(2016,1)]]
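For struct-encoded facts, the key can be read from a field of the payload. The following Python sketch replays the example under that assumption, parsing each payload as JSON. The variable names are illustrative only and do not come from Icicle.

```python
import json
from collections import Counter

lines = [
    'zubat|keyed_by_field|{"transaction_id":0,"dollarydoos":2}|2015-01-01',
    'zubat|keyed_by_field|{"transaction_id":3,"dollarydoos":1}|2016-01-01',
    'zubat|keyed_by_field|{"transaction_id":3,"dollarydoos":9}|2016-01-05',
]

seen_ids = set()
kept = []  # (dollarydoos, year) for each surviving fact
for line in lines:
    _entity, _attribute, payload, time = line.split("|")
    fact = json.loads(payload)
    if fact["transaction_id"] not in seen_ids:  # first write wins
        seen_ids.add(fact["transaction_id"])
        kept.append((fact["dollarydoos"], int(time[:4])))

count = len(kept)
newest = kept[-1][0]
count_by_year = dict(Counter(year for _, year in kept))
newest_by_year = {year: amount for amount, year in kept}

print(count, newest)   # 2 1
print(count_by_year)   # {2015: 1, 2016: 1}
print(newest_by_year)  # {2015: 2, 2016: 1}
```

The third fact repeats `transaction_id` 3, so its value of 9 never reaches any query, even though it is the most recent fact by time.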
Struct Field + Time#
[fact.keyed_by_field_time]
key = "(transaction_id, time)"
encoding = "(transaction_id:int,dollarydoos:int)"
zubat|keyed_by_field_time|{"transaction_id":5,"dollarydoos":2}|2015-01-01
zubat|keyed_by_field_time|{"transaction_id":5,"dollarydoos":5}|2015-01-01
zubat|keyed_by_field_time|{"transaction_id":5,"dollarydoos":7}|2015-01-03
> feature keyed_by_field_time ~> count dollarydoos
[zubat, 2]
> feature keyed_by_field_time ~> newest dollarydoos
[zubat, 7]
> feature keyed_by_field_time ~> group (year_of time) ~> newest dollarydoos
[zubat, [(2015,7)]]
> feature keyed_by_field_time ~> group (day_of time) ~> newest dollarydoos
[zubat, [(1,2),(3,7)]]
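A composite key behaves like a tuple: two facts collide only when every component matches. The Python sketch below replays the example under that reading, and additionally assumes `day_of` yields the day of the month, which is consistent with the output shown. The tuple layout is hypothetical.

```python
from datetime import date

# (transaction_id, dollarydoos, time) for the single entity in the example.
facts = [
    (5, 2, "2015-01-01"),
    (5, 5, "2015-01-01"),
    (5, 7, "2015-01-03"),
]

seen_keys = set()
kept = []
for transaction_id, dollarydoos, time in facts:
    key = (transaction_id, time)  # composite refutation key
    if key not in seen_keys:      # first write wins
        seen_keys.add(key)
        kept.append((dollarydoos, date.fromisoformat(time)))

count = len(kept)
newest = kept[-1][0]
# Later facts overwrite earlier ones, leaving the newest value per group.
newest_by_year = {d.year: amount for amount, d in kept}
newest_by_day = {d.day: amount for amount, d in kept}

print(count, newest)   # 2 7
print(newest_by_year)  # {2015: 7}
print(newest_by_day)   # {1: 2, 3: 7}
```

Keying on `transaction_id` alone would keep only the first fact. Adding `time` to the key lets the fact from 2015-01-03 survive while still refuting the same-day duplicate.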