Yan Zhou

Ithaca, New York, United States
571 followers 500+ connections

Experience

  • Cedar Graphic
  • San Francisco, California, United States
  • Ithaca, New York Area
  • Greater Boston Area
  • Tarrytown, NY

Publications

  • Adaptive load shedding via fuzzy control in data stream management systems

    5th IEEE International Conference on Service-Oriented Computing and Applications

    Data stream management systems (DSMS) aim to process massive data streams in a timely fashion to support important applications, e.g., financial market analysis. However, a DSMS can be overloaded by large bursts in data stream arrivals and data-dependent query executions. To avoid overloads, we design a new load shedding scheme by applying distributed fuzzy logic control, which is very effective at dealing with uncertainties in highly dynamic systems such as DSMS, based on the per-stream backlog and selectivity of each query operator. We have implemented our approach by extending an open source distributed DSMS. The performance evaluation using high-rate Internet traces shows that our approach closely supports a specified backlog bound for each data stream queue, while improving the query processing delay, with little overhead.

  • A Federated Approach for Increasing the Timely Throughput of Real-Time Data Services

    2012 IEEE 18th Real-Time and Embedded Technology and Applications Symposium

    As the demand for real-time data services (e.g., e-commerce or online auctions) increases, it is desirable for a real-time database to increase the timely throughput, i.e., the amount of data processed in a timely manner. As the timely throughput of a centralized real-time database is limited, it is desirable to federate a set of real-time databases to increase the timely throughput. However, related work on distributed real-time databases is scarce. Most existing approaches are highly complex, incurring non-trivial overheads. Nor are they implemented in a real database system. To address the problem, we design a new system architecture for federated real-time data services and develop efficient approaches for load sharing among a set of clustered databases. To support the desired data service delay even in the presence of dynamic workloads, each individual database employs a single-input single-output (SISO) feedback admission control scheme. Based on the admission control signals collected from the individual databases, cluster-wide load sharing is performed to enhance the total timely throughput by fully utilizing the federated databases without overloading them. We have implemented and evaluated the performance of our approach by extending the Oracle Berkeley DB. Our system significantly enhances the timely data throughput compared to a single centralized system, while effectively dealing with emulated partial unavailability of a set of federated databases.

  • Estimating and Enhancing Real-Time Data Service Delays: Control Theoretic Approaches

    IEEE Transactions on Knowledge and Data Engineering

    It is essential to process real-time data service requests such as stock quotes and trade transactions in a timely manner using fresh data, which represent the current real-world phenomena such as the stock market status. Users may simply leave when the database service delay is excessive. Also, temporally inconsistent data may give an outdated view of the real-world status. However, supporting the desired timeliness and freshness is challenging due to dynamic workloads. To address the problem, we present new approaches for 1) database backlog estimation, 2) fine-grained closed-loop admission control based on the backlog model, and 3) incoming load smoothing. Our backlog estimation and control-theoretic approaches aim to support the desired service delay bound without degrading the data freshness, critical for real-time data services. Specifically, we design, implement, and evaluate two feedback controllers based on linear control theory and fuzzy logic control theory, to meet the desired service delay. Workload smoothing, under overload, helps the database admit and process more transactions in a timely fashion by probabilistically reducing the burstiness of incoming data service requests. In terms of the data service delay and throughput, our closed-loop admission control and probabilistic load smoothing schemes considerably outperform several baselines in the experiments undertaken in a stock trading database testbed.

  • Deadline Assignment and Tardiness Control for Real-Time Data Services

    22nd Euromicro Conference on Real-Time Systems

    It is challenging to support the timeliness of real-time data service requests in data-intensive real-time applications such as online auctions or stock trading, while maintaining the freshness of temporal data that capture the current real-world status. Although deadline-aware real-time scheduling would significantly enhance the timeliness of data services, it is not clear how to assign explicit feasible deadlines to data service requests in an open environment. To address the problem, we design a new deadline assignment scheme to derive feasible deadlines for real-time data service requests considering their individual data needs. Further, we develop a systematic closed-loop approach to supporting the desired tardiness (the ratio of the actual service delay to the deadline) of real-time data services even in the presence of dynamic workloads. We choose the tardiness metric due to its expressiveness compared to the deadline miss ratio and utilization, which saturate at 0 and 1 when the system is underutilized or overloaded, respectively. The performance evaluation results acquired in our real-time stock trading testbed show that the desired average/transient tardiness is closely supported. Consequently, the deadline miss ratio is significantly reduced compared to a state-of-the-art database system with a real-time scheduling extension.

  • Integrating Proactive and Reactive Approaches for Robust Real-Time Data Services

    30th IEEE Real-Time Systems Symposium

    Real-time data services are needed in data-intensive real-time applications such as e-commerce or traffic control. However, it is challenging to support real-time data services if workloads change dynamically based on the market or traffic status. To enhance the quality of real-time data services even in the presence of dynamic workloads, feedback control theory has been applied. However, a major drawback of feedback control is that it only reacts to performance errors. To improve the robustness of real-time data services, we develop a statistical feed-forward approach that proactively adapts the incoming load, if necessary, to support the desired real-time data service delay. Further, we integrate it with a feedback controller to compensate for potential prediction errors and adjust the system behavior in a reactive manner for timely data services. Performance evaluation results acquired in our real-time data service testbed show that our integrated approach considerably reduces the average delay and transient delay fluctuations, while improving throughput compared to the tested baselines including the feed-forward-only and feedback-only approaches.

  • Backlog Estimation and Management for Real-Time Data Services

    20th Euromicro Conference on Real-Time Systems

    Real-time data services can benefit data-intensive real-time applications, e.g., e-commerce, via timely transaction processing using fresh data, e.g., the current stock prices. To enhance the real-time data service quality, we present several novel techniques for (1) database backlog estimation, (2) fine-grained closed-loop admission control based on the backlog model, and (3) hint-based incoming load smoothing. Our backlog estimation and feedback control aim to support the desired service delay bound without degrading the data freshness, which is critical for real-time data services. Workload smoothing, under overload, helps the database admit and process more transactions in a timely manner by probabilistically reducing the burstiness of incoming data service requests. In terms of the data service delay and throughput, our feedback-based admission control and probabilistic load smoothing considerably outperform the baselines, which represent the current state of the art, in the experiments performed in a stock trading database testbed.
