- Ability to source data in near real time from transactional/OLTP systems, either by tailing logs or consuming events
- Augmenting traditional ETL data pipes with stream processing
- Enabling greater model development agility and optimization via machine learning
- Ability to feed resulting insights (e.g. segmentation changes) back to OLTP systems supporting customer interactions
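The capabilities above can be sketched end to end: consume change events from an OLTP system's log, fold them into a streaming aggregate, and derive a segmentation signal that could be pushed back to the transactional side. This is a minimal illustration only; the event shape, field names, and the `high_value` threshold are all hypothetical.

```python
import json
from collections import defaultdict

def process_events(events):
    """Fold a stream of purchase-change events into per-customer totals
    and derive a segmentation signal from them."""
    totals = defaultdict(float)
    segments = {}
    for raw in events:
        event = json.loads(raw)
        cid = event["customer_id"]
        totals[cid] += event["amount"]
        # Recompute the segment on every event -- this is the "insight"
        # that would be fed back to the OLTP systems handling customers.
        segments[cid] = "high_value" if totals[cid] >= 100 else "standard"
    return dict(segments)

# Stand-in for a tailed log / consumed event stream:
stream = [
    '{"customer_id": "c1", "amount": 60.0}',
    '{"customer_id": "c2", "amount": 20.0}',
    '{"customer_id": "c1", "amount": 50.0}',
]
print(process_events(stream))  # c1 crosses the threshold -> "high_value"
```

In a production pipe the `stream` list would be replaced by a consumer on a log or message bus, and the segment updates would be written back rather than printed.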
Saturday, September 10, 2016
Saturday, September 3, 2016
This can serve as a nice litmus test for when technology governance is in danger of devolving into a bureaucracy. This is why it’s so important to focus on tangible value in areas such as governance and technology architecture.
Wednesday, August 3, 2016
It is a truism that every engineering decision involves trade-offs – selecting certain strengths at the price of accepting certain limitations at a given level of investment (and not just in software/systems engineering, but in all engineering disciplines). Thus, every instrument has a sweet spot for its application.
CQRS is an interesting example to consider. It builds on the strengths of a more vanilla version of EDA (event-driven architecture) – decoupling domains from one another – and adds asymmetric read/write models and event sourcing.
By enabling updates at a highly granular, attribute level, coupled with arbitrarily denormalized aggregate reads, CQRS creates opportunities for independently scaling the read and write/update stores – far more so than adding read replicas to traditional, symmetric-model domain implementations.
These strengths come at the price of additional complexity – managing highly granular event models, ensuring interoperability with non-CQRS-based domains, dealing with log compaction in event sourcing, etc.
The domains that would benefit most from this approach appear to be those where multiple actors attempt simultaneous updates to different attributes of the same aggregate – i.e. domains demanding high-concurrency writes.
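The asymmetry can be made concrete with a small sketch (all class and field names hypothetical): the write side records attribute-level change events in an append-only log, and the read side replays that log into a denormalized per-aggregate view. Because the two sides share nothing but the event stream, each store can be scaled and shaped independently.

```python
from collections import defaultdict

class WriteStore:
    """Command side: an append-only event log is the source of truth."""

    def __init__(self):
        self.events = []

    def update_attribute(self, aggregate_id, attr, value):
        # Attribute-level granularity lets concurrent writers touch
        # different fields of the same aggregate without contention.
        self.events.append((aggregate_id, attr, value))

class ReadModel:
    """Query side: a denormalized view projected from the event log."""

    def __init__(self):
        self.views = defaultdict(dict)

    def project(self, events):
        for aggregate_id, attr, value in events:
            self.views[aggregate_id][attr] = value

write = WriteStore()
write.update_attribute("order-1", "status", "shipped")
write.update_attribute("order-1", "carrier", "UPS")

read = ReadModel()
read.project(write.events)
print(read.views["order-1"])  # {'status': 'shipped', 'carrier': 'UPS'}
```

The complexity noted above shows up even in this toy: the event log grows without bound unless compacted, and any non-CQRS consumer needs a translation layer in front of the event stream.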
Friday, May 13, 2016
The challenge is that simply having more data, or combining all of the business’ data into a common pool or ‘lake’ isn’t by itself going to unlock insights, as if by magic.
Rigor is required in managing the data sources and the meaning of the various data elements, and equal rigor is required in applying proper mathematical techniques to the analysis of the data and in avoiding misleading conclusions.
Things like the curse of dimensionality (applicable to sampling and anomaly detection, among other things), misuse of p-values, and implicit assumptions about the shape of probability distributions come to mind as some of the most common omissions.
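The curse of dimensionality can be seen directly in a few lines (an illustration of the general phenomenon, not an example from any particular dataset): as dimension grows, distances between uniformly random points concentrate, so the "nearest" and "farthest" neighbors of a query point become nearly indistinguishable – which is exactly what undermines distance-based sampling and anomaly detection.

```python
import math
import random

def distance_spread(dim, n_points=200, seed=0):
    """Relative contrast (max - min) / min of distances from a random
    query point to n_points uniform random points in [0, 1]^dim."""
    rng = random.Random(seed)
    points = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    query = [rng.random() for _ in range(dim)]
    dists = [math.dist(query, p) for p in points]
    return (max(dists) - min(dists)) / min(dists)

# The relative contrast shrinks as the dimension grows:
for d in (2, 10, 100, 1000):
    print(d, round(distance_spread(d), 3))
```

When the contrast approaches zero, a distance threshold that flags anomalies in 2 dimensions flags either everything or nothing in 1000.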
Wednesday, January 6, 2016
It is normally implicitly understood that people at more senior managerial or executive levels have to operate at (and be comfortable with) progressively higher levels of uncertainty. Often, one has to make choices being cognizant of a superposition of multiple future states that is yet to be resolved into one or the other (not unlike a quantum one).
It is also true that a key role of management is to enable people to be as productive as possible. Ergo, the manager's job has to include meaningfully reducing the uncertainty that is passed through to subordinates.