Why do people use C* in addition to ES? It seems like in this case most of the data could directly be piped into ES?
I understand that ES can lose data, or have some data storage problems, but one could just as well store all the incoming data in something like Hadoop, without having to bother with C*, no?
>For funnel analysis, it’s not feasible to use this data model for getting back a summary of the funnel steps and the sessions matching it, since there’s no option in Solr to run a recursive query, which would allow to go over each session and check if it’s a match for the funnel.
I don't think this approach scales, even in an environment that supports recursive queries like PostgreSQL.
The more scalable approach would be to use either a commercial database system with explicit support for pattern matching, or to encode each conversion path as a string (e.g. "top page -> product page with SKU=1337 -> Purchase" becomes "T_SKU1337_P") and use REGEX/GROUP BY.
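To make the string-encoding idea concrete, here is a rough Python sketch. It is only an illustration under my own assumptions: the step codes (T, SKU<id>, P), the helper names, and the sample sessions are all made up, and in a real database the same thing would be a regex match plus a GROUP BY rather than application code.

```python
import re
from collections import Counter

# Hypothetical step codes: T = top page, SKU<id> = product page, P = purchase.
FUNNEL = ["T", "SKU1337", "P"]

def encode(events):
    """Join a session's ordered step codes, delimited on both ends."""
    return "_" + "_".join(events) + "_"

def prefix_pattern(steps):
    """Regex for the steps occurring in order, with any other steps in between."""
    head = "_" + re.escape(steps[0]) + "_"
    rest = "".join("(?:.*_)?" + re.escape(s) + "_" for s in steps[1:])
    return re.compile(head + rest)

def depth(path, funnel):
    """Number of leading funnel steps the encoded path completes in order."""
    for k in range(len(funnel), 0, -1):
        if prefix_pattern(funnel[:k]).search(path):
            return k
    return 0

def funnel_counts(sessions, funnel):
    """Sessions reaching at least step 1, step 2, ... (the GROUP BY part)."""
    by_depth = Counter(depth(encode(s), funnel) for s in sessions)
    return [sum(n for d, n in by_depth.items() if d >= k)
            for k in range(1, len(funnel) + 1)]

# Made-up sample sessions, each an ordered list of step codes.
sessions = [
    ["T", "SKU1337", "P"],
    ["T", "SKU1337"],
    ["T", "SKU42", "P"],
    ["SKU1337", "P"],
    ["T", "SEARCH", "SKU1337", "CART", "P"],
]

print(funnel_counts(sessions, FUNNEL))
```

The leading and trailing delimiters are there so that "SKU1337" cannot accidentally match inside a longer code such as "SKU13370". Each session is scanned once per funnel step, with no recursion over sessions.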
In all cases, this sounds like a suboptimal use case for either Solr or Elasticsearch.