Latency has become an important metric for network monitoring since the emergence of latency-sensitive applications (e.g., algorithmic trading and high-performance computing). To meet this need, researchers have proposed new architectures such as LDA and RLI that can provide fine-grained latency measurements. However, these architectures are fundamentally ossified in their design, as they provide only a specific pre-configured aggregate measurement: either average latency across all packets (LDA) or per-flow latency measurements (RLI). Network operators, in contrast, need latency measurements at both finer (e.g., per-packet) and more flexible (e.g., flow subsets) levels of granularity.